From patchwork Thu Aug 17 03:26:55 2017
X-Patchwork-Submitter: "Wang, Wei W" <wei.w.wang@intel.com>
X-Patchwork-Id: 9904879
From: Wei Wang <wei.w.wang@intel.com>
To: virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org,
	qemu-devel@nongnu.org, virtualization@lists.linux-foundation.org,
	kvm@vger.kernel.org, linux-mm@kvack.org, mst@redhat.com,
	mhocko@kernel.org, akpm@linux-foundation.org, mawilcox@microsoft.com
Cc: david@redhat.com, cornelia.huck@de.ibm.com, mgorman@techsingularity.net,
	aarcange@redhat.com, amit.shah@redhat.com, pbonzini@redhat.com,
	willy@infradead.org, wei.w.wang@intel.com, liliang.opensource@gmail.com,
	yang.zhang.wz@gmail.com, quan.xu@aliyun.com
Subject: [PATCH v14 4/5] mm: support reporting free page blocks
Date: Thu, 17 Aug 2017 11:26:55 +0800
Message-Id: <1502940416-42944-5-git-send-email-wei.w.wang@intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1502940416-42944-1-git-send-email-wei.w.wang@intel.com>
References: <1502940416-42944-1-git-send-email-wei.w.wang@intel.com>

This patch adds support to walk through the free page blocks in the
system and report them via a callback function. Some page blocks may
leave the free list after zone->lock is released, so it is the caller's
responsibility to either detect or prevent the use of such pages.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Liang Li <liliang.opensource@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/mm.h |  6 ++++++
 mm/page_alloc.c    | 44 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 46b9ac5..cd29b9f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1835,6 +1835,12 @@ extern void free_area_init_node(int nid, unsigned long * zones_size,
 		unsigned long zone_start_pfn, unsigned long *zholes_size);
 extern void free_initmem(void);
 
+extern void walk_free_mem_block(void *opaque1,
+				unsigned int min_order,
+				void (*visit)(void *opaque2,
+					      unsigned long pfn,
+					      unsigned long nr_pages));
+
 /*
  * Free reserved pages within range [PAGE_ALIGN(start), end & PAGE_MASK)
  * into the buddy system. The freed pages will be poisoned with pattern
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6d00f74..a721a35 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4762,6 +4762,50 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 	show_swap_cache_info();
 }
 
+/**
+ * walk_free_mem_block - Walk through the free page blocks in the system
+ * @opaque1: the context passed from the caller
+ * @min_order: the minimum order of free lists to check
+ * @visit: the callback function given by the caller
+ *
+ * The function is used to walk through the free page blocks in the system,
+ * and each free page block is reported to the caller via the @visit callback.
+ * Please note:
+ * 1) The function is used to report hints of free pages, so the caller should
+ * not use those reported pages after the callback returns.
+ * 2) The callback is invoked with the zone->lock being held, so it should not
+ * block and should finish as soon as possible.
+ */
+void walk_free_mem_block(void *opaque1,
+			 unsigned int min_order,
+			 void (*visit)(void *opaque2,
+				       unsigned long pfn,
+				       unsigned long nr_pages))
+{
+	struct zone *zone;
+	struct page *page;
+	struct list_head *list;
+	unsigned int order;
+	enum migratetype mt;
+	unsigned long pfn, flags;
+
+	for_each_populated_zone(zone) {
+		for (order = MAX_ORDER - 1;
+		     order < MAX_ORDER && order >= min_order; order--) {
+			for (mt = 0; mt < MIGRATE_TYPES; mt++) {
+				spin_lock_irqsave(&zone->lock, flags);
+				list = &zone->free_area[order].free_list[mt];
+				list_for_each_entry(page, list, lru) {
+					pfn = page_to_pfn(page);
+					visit(opaque1, pfn, 1 << order);
+				}
+				spin_unlock_irqrestore(&zone->lock, flags);
+			}
+		}
+	}
+}
+EXPORT_SYMBOL_GPL(walk_free_mem_block);
+
 static void zoneref_set_zone(struct zone *zone, struct zoneref *zoneref)
 {
 	zoneref->zone = zone;
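
For illustration only (not part of the patch): a minimal sketch of how a
caller might consume this interface, assuming the walk_free_mem_block()
signature added above. The names free_page_stats, count_free_cb and
report_free_pages are hypothetical; the callback deliberately does nothing
but count, since it runs with zone->lock held and the reported pages are
only hints that may be reallocated once the walk returns.

	#include <linux/mm.h>
	#include <linux/printk.h>

	/* Hypothetical context passed through the opaque pointer. */
	struct free_page_stats {
		unsigned long nr_blocks;
		unsigned long nr_pages;
	};

	/* Runs under zone->lock: must not sleep, allocate, or take long. */
	static void count_free_cb(void *opaque, unsigned long pfn,
				  unsigned long nr_pages)
	{
		struct free_page_stats *stats = opaque;

		stats->nr_blocks++;
		stats->nr_pages += nr_pages;
	}

	static void report_free_pages(void)
	{
		struct free_page_stats stats = { 0, 0 };

		/* Only walk the largest blocks (order MAX_ORDER - 1). */
		walk_free_mem_block(&stats, MAX_ORDER - 1, count_free_cb);

		pr_info("free hint: %lu blocks, %lu pages\n",
			stats.nr_blocks, stats.nr_pages);
	}

A real user (e.g. a balloon driver) would record the (pfn, nr_pages) pairs
in a preallocated buffer inside the callback and hand them to the host
afterwards, rather than printing a summary.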