From patchwork Thu May 2 05:55:27 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10926039
Subject: [PATCH v7 01/12] mm/sparsemem: Introduce struct mem_section_usage
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:55:27 -0700
Message-ID: <155677652762.2336373.6522945152928524695.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Towards enabling memory hotplug to track partial population of a
section, introduce 'struct mem_section_usage'.

A pointer to a 'struct mem_section_usage' instance replaces the
existing pointer to a 'pageblock_flags' bitmap. Effectively it adds one
more 'unsigned long' beyond the 'pageblock_flags' (usemap) allocation
to house a new 'map_active' bitmap.
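
For a concrete sense of the arithmetic, here is a standalone userspace
sketch (not kernel code; SECTION_SIZE_BITS = 27 and 64-bit longs are
assumed x86_64 defaults, and the 8-long usemap length is made up):

/* Sketch: model the new allocation to show it grows by exactly one
 * 'unsigned long' over the bare pageblock_flags (usemap) allocation. */
#include <stdio.h>
#include <stdlib.h>

#define SECTION_SIZE_BITS 27    /* assumed: x86_64's 128M sections */
#define BITS_PER_LONG     64
/* Bytes of section covered by one 'map_active' bit: 2M here. */
#define SECTION_ACTIVE_SIZE ((1UL << SECTION_SIZE_BITS) / BITS_PER_LONG)

struct mem_section_usage {
        unsigned long map_active;        /* new sub-section bitmap */
        unsigned long pageblock_flags[]; /* usemap, as before */
};

int main(void)
{
        size_t usemap_bytes = 8 * sizeof(unsigned long); /* illustrative */
        struct mem_section_usage *usage =
                calloc(1, sizeof(*usage) + usemap_bytes);

        if (!usage)
                return 1;
        printf("one map_active bit covers %lu bytes\n", SECTION_ACTIVE_SIZE);
        printf("allocation grows %zu -> %zu bytes\n",
               usemap_bytes, sizeof(*usage) + usemap_bytes);
        free(usage);
        return 0;
}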
The new bitmap enables the memory hot{plug,remove} implementation to
act on incremental sub-divisions of a section. The primary motivation
for this functionality is to support platforms that mix "System RAM"
and "Persistent Memory" within a single section, or multiple PMEM
ranges with different mapping lifetimes within a single section. The
section restriction for hotplug has caused an ongoing saga of hacks and
bugs for devm_memremap_pages() users.

Beyond the fixups to teach existing paths how to retrieve the 'usemap'
from a section, and updates to the usemap allocation path, there are no
expected behavior changes.

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
---
 include/linux/mmzone.h |   23 ++++++++++++--
 mm/memory_hotplug.c    |   18 ++++++-----
 mm/page_alloc.c        |    2 +
 mm/sparse.c            |   81 ++++++++++++++++++++++++------------------------
 4 files changed, 71 insertions(+), 53 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 70394cabaf4e..f0bbd85dc19a 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1160,6 +1160,19 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
 #define SECTION_ALIGN_UP(pfn)	(((pfn) + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK)
 #define SECTION_ALIGN_DOWN(pfn)	((pfn) & PAGE_SECTION_MASK)
 
+#define SECTION_ACTIVE_SIZE ((1UL << SECTION_SIZE_BITS) / BITS_PER_LONG)
+#define SECTION_ACTIVE_MASK (~(SECTION_ACTIVE_SIZE - 1))
+
+struct mem_section_usage {
+	/*
+	 * SECTION_ACTIVE_SIZE portions of the section that are populated in
+	 * the memmap
+	 */
+	unsigned long map_active;
+	/* See declaration of similar field in struct zone */
+	unsigned long pageblock_flags[0];
+};
+
 struct page;
 struct page_ext;
 struct mem_section {
@@ -1177,8 +1190,7 @@ struct mem_section {
 	 */
 	unsigned long section_mem_map;
 
-	/* See declaration of similar field in struct zone */
-	unsigned long *pageblock_flags;
+	struct mem_section_usage *usage;
 #ifdef CONFIG_PAGE_EXTENSION
 	/*
 	 * If SPARSEMEM, pgdat doesn't have page_ext pointer. We use
@@ -1209,6 +1221,11 @@ extern struct mem_section **mem_section;
 extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
 #endif
 
+static inline unsigned long *section_to_usemap(struct mem_section *ms)
+{
+	return ms->usage->pageblock_flags;
+}
+
 static inline struct mem_section *__nr_to_section(unsigned long nr)
 {
 #ifdef CONFIG_SPARSEMEM_EXTREME
@@ -1220,7 +1237,7 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
 	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
 }
 extern int __section_nr(struct mem_section* ms);
-extern unsigned long usemap_size(void);
+extern size_t mem_section_usage_size(void);
 
 /*
  * We use the lower bits of the mem_map pointer to store
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 328878b6799d..a76fc6a6e9fe 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -165,9 +165,10 @@ void put_page_bootmem(struct page *page)
 #ifndef CONFIG_SPARSEMEM_VMEMMAP
 static void register_page_bootmem_info_section(unsigned long start_pfn)
 {
-	unsigned long *usemap, mapsize, section_nr, i;
+	unsigned long mapsize, section_nr, i;
 	struct mem_section *ms;
 	struct page *page, *memmap;
+	struct mem_section_usage *usage;
 
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);
@@ -187,10 +188,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, SECTION_INFO);
 
-	usemap = ms->pageblock_flags;
-	page = virt_to_page(usemap);
+	usage = ms->usage;
+	page = virt_to_page(usage);
 
-	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+	mapsize = PAGE_ALIGN(mem_section_usage_size()) >> PAGE_SHIFT;
 
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
@@ -199,9 +200,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 #else /* CONFIG_SPARSEMEM_VMEMMAP */
 static void register_page_bootmem_info_section(unsigned long start_pfn)
 {
-	unsigned long *usemap, mapsize, section_nr, i;
+	unsigned long mapsize, section_nr, i;
 	struct mem_section *ms;
 	struct page *page, *memmap;
+	struct mem_section_usage *usage;
 
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);
@@ -210,10 +212,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 
 	register_page_bootmem_memmap(section_nr, memmap, PAGES_PER_SECTION);
 
-	usemap = ms->pageblock_flags;
-	page = virt_to_page(usemap);
+	usage = ms->usage;
+	page = virt_to_page(usage);
 
-	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+	mapsize = PAGE_ALIGN(mem_section_usage_size()) >> PAGE_SHIFT;
 
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1f99db76b1ff..61c2b54a5b61 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -403,7 +403,7 @@ static inline unsigned long *get_pageblock_bitmap(struct page *page,
 							unsigned long pfn)
 {
 #ifdef CONFIG_SPARSEMEM
-	return __pfn_to_section(pfn)->pageblock_flags;
+	return section_to_usemap(__pfn_to_section(pfn));
 #else
 	return page_zone(page)->pageblock_flags;
 #endif /* CONFIG_SPARSEMEM */
diff --git a/mm/sparse.c b/mm/sparse.c
index fd13166949b5..f87de7ad32c8 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -288,33 +288,31 @@ struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pn
 
 static void __meminit sparse_init_one_section(struct mem_section *ms,
 		unsigned long pnum, struct page *mem_map,
-		unsigned long *pageblock_bitmap)
+		struct mem_section_usage *usage)
 {
 	ms->section_mem_map &= ~SECTION_MAP_MASK;
 	ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum) |
 							SECTION_HAS_MEM_MAP;
-	ms->pageblock_flags = pageblock_bitmap;
+	ms->usage = usage;
 }
 
-unsigned long usemap_size(void)
+static unsigned long usemap_size(void)
 {
 	return BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS) * sizeof(unsigned long);
 }
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-static unsigned long *__kmalloc_section_usemap(void)
+size_t mem_section_usage_size(void)
 {
-	return kmalloc(usemap_size(), GFP_KERNEL);
+	return sizeof(struct mem_section_usage) + usemap_size();
 }
-#endif /* CONFIG_MEMORY_HOTPLUG */
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-static unsigned long * __init
+static struct mem_section_usage * __init
 sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 					 unsigned long size)
 {
+	struct mem_section_usage *usage;
 	unsigned long goal, limit;
-	unsigned long *p;
 	int nid;
 	/*
 	 * A page may contain usemaps for other sections preventing the
@@ -330,15 +328,16 @@ sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 	limit = goal + (1UL << PA_SECTION_SHIFT);
 	nid = early_pfn_to_nid(goal >> PAGE_SHIFT);
 again:
-	p = memblock_alloc_try_nid(size, SMP_CACHE_BYTES, goal, limit, nid);
-	if (!p && limit) {
+	usage = memblock_alloc_try_nid(size, SMP_CACHE_BYTES, goal, limit, nid);
+	if (!usage && limit) {
 		limit = 0;
 		goto again;
 	}
-	return p;
+	return usage;
 }
 
-static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
+static void __init check_usemap_section_nr(int nid,
+		struct mem_section_usage *usage)
 {
 	unsigned long usemap_snr, pgdat_snr;
 	static unsigned long old_usemap_snr;
@@ -352,7 +351,7 @@ static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
 		old_pgdat_snr = NR_MEM_SECTIONS;
 	}
 
-	usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
+	usemap_snr = pfn_to_section_nr(__pa(usage) >> PAGE_SHIFT);
 	pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
 	if (usemap_snr == pgdat_snr)
 		return;
@@ -380,14 +379,15 @@ static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
 		usemap_snr, pgdat_snr, nid);
 }
 #else
-static unsigned long * __init
+static struct mem_section_usage * __init
 sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 					 unsigned long size)
 {
 	return memblock_alloc_node(size, SMP_CACHE_BYTES, pgdat->node_id);
 }
 
-static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
+static void __init check_usemap_section_nr(int nid,
+		struct mem_section_usage *usage)
 {
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
@@ -474,14 +474,13 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 				   unsigned long pnum_end,
 				   unsigned long map_count)
 {
-	unsigned long pnum, usemap_longs, *usemap;
+	struct mem_section_usage *usage;
+	unsigned long pnum;
 	struct page *map;
 
-	usemap_longs = BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS);
-	usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
-							  usemap_size() *
-							  map_count);
-	if (!usemap) {
+	usage = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
+			mem_section_usage_size() * map_count);
+	if (!usage) {
 		pr_err("%s: node[%d] usemap allocation failed", __func__, nid);
 		goto failed;
 	}
@@ -497,9 +496,9 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 			pnum_begin = pnum;
 			goto failed;
 		}
-		check_usemap_section_nr(nid, usemap);
-		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usemap);
-		usemap += usemap_longs;
+		check_usemap_section_nr(nid, usage);
+		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage);
+		usage = (void *) usage + mem_section_usage_size();
 	}
 	sparse_buffer_fini();
 	return;
@@ -701,9 +700,9 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 				     struct vmem_altmap *altmap)
 {
 	unsigned long section_nr = pfn_to_section_nr(start_pfn);
+	struct mem_section_usage *usage;
 	struct mem_section *ms;
 	struct page *memmap;
-	unsigned long *usemap;
 	int ret;
 
 	/*
@@ -717,8 +716,8 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	memmap = kmalloc_section_memmap(section_nr, nid, altmap);
 	if (!memmap)
 		return -ENOMEM;
-	usemap = __kmalloc_section_usemap();
-	if (!usemap) {
+	usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
+	if (!usage) {
 		__kfree_section_memmap(memmap, altmap);
 		return -ENOMEM;
 	}
@@ -736,11 +735,11 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION);
 	section_mark_present(ms);
-	sparse_init_one_section(ms, section_nr, memmap, usemap);
+	sparse_init_one_section(ms, section_nr, memmap, usage);
 
 out:
 	if (ret < 0) {
-		kfree(usemap);
+		kfree(usage);
 		__kfree_section_memmap(memmap, altmap);
 	}
 	return ret;
@@ -777,20 +776,20 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 }
 #endif
 
-static void free_section_usemap(struct page *memmap, unsigned long *usemap,
-		struct vmem_altmap *altmap)
+static void free_section_usage(struct page *memmap,
+		struct mem_section_usage *usage, struct vmem_altmap *altmap)
 {
-	struct page *usemap_page;
+	struct page *usage_page;
 
-	if (!usemap)
+	if (!usage)
 		return;
 
-	usemap_page = virt_to_page(usemap);
+	usage_page = virt_to_page(usage);
 	/*
 	 * Check to see if allocation came from hot-plug-add
 	 */
-	if (PageSlab(usemap_page) || PageCompound(usemap_page)) {
-		kfree(usemap);
+	if (PageSlab(usage_page) || PageCompound(usage_page)) {
+		kfree(usage);
 		if (memmap)
 			__kfree_section_memmap(memmap, altmap);
 		return;
@@ -809,19 +808,19 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
 		unsigned long map_offset, struct vmem_altmap *altmap)
 {
 	struct page *memmap = NULL;
-	unsigned long *usemap = NULL;
+	struct mem_section_usage *usage = NULL;
 
 	if (ms->section_mem_map) {
-		usemap = ms->pageblock_flags;
+		usage = ms->usage;
 		memmap = sparse_decode_mem_map(ms->section_mem_map,
 						__section_nr(ms));
 		ms->section_mem_map = 0;
-		ms->pageblock_flags = NULL;
+		ms->usage = NULL;
 	}
 
 	clear_hwpoisoned_pages(memmap + map_offset,
 			PAGES_PER_SECTION - map_offset);
-	free_section_usemap(memmap, usemap, altmap);
+	free_section_usage(memmap, usage, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 #endif /* CONFIG_MEMORY_HOTPLUG */

From patchwork Thu May 2 05:55:32 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10926043
Subject: [PATCH v7 02/12] mm/sparsemem: Introduce common definitions for
 the size and mask of a section
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Jérôme Glisse, Logan Gunthorpe,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:55:32 -0700
Message-ID: <155677653274.2336373.11220321059915670288.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Up-level the local section size and mask from kernel/memremap.c to
global definitions. These will be used by the new sub-section hotplug
support.
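
For illustration, a userspace sketch of the alignment arithmetic these
definitions serve in devm_memremap_pages() (the resource start/size are
hypothetical; PA_SECTION_SHIFT = 27 is the assumed x86_64 default):

/* Sketch: expand an unaligned resource to section boundaries, the way
 * devm_memremap_pages() does with PA_SECTION_SIZE/PA_SECTION_MASK. */
#include <stdio.h>

#define PA_SECTION_SHIFT 27
#define PA_SECTION_SIZE  (1UL << PA_SECTION_SHIFT)
#define PA_SECTION_MASK  (~(PA_SECTION_SIZE - 1))
#define ALIGN(x, a)      (((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
        /* hypothetical 96M PMEM resource, not section aligned */
        unsigned long start = 0x14a000000UL, size = 0x6000000UL;
        unsigned long align_start = start & PA_SECTION_MASK;
        unsigned long align_size =
                ALIGN(start + size, PA_SECTION_SIZE) - align_start;

        /* prints: aligned range 0x148000000 + 0x8000000 (one full section) */
        printf("aligned range %#lx + %#lx\n", align_start, align_size);
        return 0;
}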
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Jérôme Glisse
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
---
 include/linux/mmzone.h |    2 ++
 kernel/memremap.c      |   10 ++++------
 mm/hmm.c               |    2 --
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f0bbd85dc19a..6726fc175b51 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1134,6 +1134,8 @@ static inline unsigned long early_pfn_to_nid(unsigned long pfn)
  * PFN_SECTION_SHIFT		pfn to/from section number
  */
 #define PA_SECTION_SHIFT	(SECTION_SIZE_BITS)
+#define PA_SECTION_SIZE		(1UL << PA_SECTION_SHIFT)
+#define PA_SECTION_MASK		(~(PA_SECTION_SIZE-1))
 #define PFN_SECTION_SHIFT	(SECTION_SIZE_BITS - PAGE_SHIFT)
 
 #define NR_MEM_SECTIONS		(1UL << SECTIONS_SHIFT)
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 4e59d29245f4..f355586ea54a 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -14,8 +14,6 @@
 #include
 
 static DEFINE_XARRAY(pgmap_array);
-#define SECTION_MASK ~((1UL << PA_SECTION_SHIFT) - 1)
-#define SECTION_SIZE (1UL << PA_SECTION_SHIFT)
 
 #if IS_ENABLED(CONFIG_DEVICE_PRIVATE)
 vm_fault_t device_private_entry_fault(struct vm_area_struct *vma,
@@ -98,8 +96,8 @@ static void devm_memremap_pages_release(void *data)
 		put_page(pfn_to_page(pfn));
 
 	/* pages are dead and unused, undo the arch mapping */
-	align_start = res->start & ~(SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
+	align_start = res->start & ~(PA_SECTION_SIZE - 1);
+	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
 		- align_start;
 
 	nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT));
@@ -160,8 +158,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	if (!pgmap->ref || !pgmap->kill)
 		return ERR_PTR(-EINVAL);
 
-	align_start = res->start & ~(SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
+	align_start = res->start & ~(PA_SECTION_SIZE - 1);
+	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
 		- align_start;
 	align_end = align_start + align_size - 1;
 
diff --git a/mm/hmm.c b/mm/hmm.c
index 0db8491090b8..a7e7f8e33c5f 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -34,8 +34,6 @@
 #include
 #include
 
-#define PA_SECTION_SIZE (1UL << PA_SECTION_SHIFT)
-
 #if IS_ENABLED(CONFIG_HMM_MIRROR)
 static const struct mmu_notifier_ops hmm_mmu_notifier_ops;

From patchwork Thu May 2 05:55:37 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10926047
Subject: [PATCH v7 03/12] mm/sparsemem: Add helpers to track active portions
 of a section at boot
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Jane Chu,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:55:37 -0700
Message-ID: <155677653785.2336373.11131100812252340469.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Prepare for hot{plug,remove} of sub-ranges of a section by tracking a
section active bitmask, each bit representing 2MB (SECTION_SIZE (128M) /
map_active bitmask length (64)). If it turns out that 2MB is too large
an active-tracking granularity, it is trivial to increase the size of
the map_active bitmap.

The implication of a partially populated section is that pfn_valid()
needs to go beyond a valid_section() check and also read the
sub-section active ranges from the bitmask.
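
As a sketch of the resulting check (standalone userspace code;
constants assume x86_64's 128M sections, 4K pages and 64-bit longs, and
the example bitmap is made up):

/* Sketch: a pfn in a present section is only valid if the bit for its
 * 2M sub-section is set in 'map_active'. */
#include <stdbool.h>
#include <stdio.h>

#define PAGE_SHIFT        12
#define SECTION_SIZE_BITS 27
#define BITS_PER_LONG     64
#define PA_SECTION_MASK   (~((1UL << SECTION_SIZE_BITS) - 1))
#define SECTION_ACTIVE_SIZE ((1UL << SECTION_SIZE_BITS) / BITS_PER_LONG)

static int section_active_index(unsigned long phys)
{
        return (phys & ~PA_SECTION_MASK) / SECTION_ACTIVE_SIZE;
}

static bool pfn_section_valid(unsigned long map_active, unsigned long pfn)
{
        return map_active & (1UL << section_active_index(pfn << PAGE_SHIFT));
}

int main(void)
{
        /* Only the first 64M of the section populated: low 32 bits set. */
        unsigned long map_active = (1UL << 32) - 1;

        /* pfn 0x1000 (offset 16M): valid; pfn 0x5000 (offset 80M): not. */
        printf("%d %d\n", pfn_section_valid(map_active, 0x1000),
               pfn_section_valid(map_active, 0x5000));
        return 0;
}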
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Tested-by: Jane Chu
Signed-off-by: Dan Williams
---
 include/linux/mmzone.h |   29 ++++++++++++++++++++++++++++-
 mm/page_alloc.c        |    4 +++-
 mm/sparse.c            |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6726fc175b51..cffde898e345 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1175,6 +1175,8 @@ struct mem_section_usage {
 	unsigned long pageblock_flags[0];
 };
 
+void section_active_init(unsigned long pfn, unsigned long nr_pages);
+
 struct page;
 struct page_ext;
 struct mem_section {
@@ -1312,12 +1314,36 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn)
 
 extern int __highest_present_section_nr;
 
+static inline int section_active_index(phys_addr_t phys)
+{
+	return (phys & ~(PA_SECTION_MASK)) / SECTION_ACTIVE_SIZE;
+}
+
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
+{
+	int idx = section_active_index(PFN_PHYS(pfn));
+
+	return !!(ms->usage->map_active & (1UL << idx));
+}
+#else
+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
+{
+	return 1;
+}
+#endif
+
 #ifndef CONFIG_HAVE_ARCH_PFN_VALID
 static inline int pfn_valid(unsigned long pfn)
 {
+	struct mem_section *ms;
+
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
-	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
+	ms = __nr_to_section(pfn_to_section_nr(pfn));
+	if (!valid_section(ms))
+		return 0;
+	return pfn_section_valid(ms, pfn);
 }
 #endif
 
@@ -1349,6 +1375,7 @@ void sparse_init(void);
 #define sparse_init()	do {} while (0)
 #define sparse_index_init(_sec, _nid)  do {} while (0)
 #define pfn_present pfn_valid
+#define section_active_init(_pfn, _nr_pages) do {} while (0)
 #endif /* CONFIG_SPARSEMEM */
 
 /*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 61c2b54a5b61..a68735c79609 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7291,10 +7291,12 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 
 	/* Print out the early node map */
 	pr_info("Early memory node ranges\n");
-	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
+	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
 		pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
 			(u64)start_pfn << PAGE_SHIFT,
 			((u64)end_pfn << PAGE_SHIFT) - 1);
+		section_active_init(start_pfn, end_pfn - start_pfn);
+	}
 
 	/* Initialise every node */
 	mminit_verify_pageflags_layout();
diff --git a/mm/sparse.c b/mm/sparse.c
index f87de7ad32c8..8d4f28e2c25e 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -210,6 +210,54 @@ static inline unsigned long first_present_section_nr(void)
 	return next_present_section_nr(-1);
 }
 
+static unsigned long section_active_mask(unsigned long pfn,
+		unsigned long nr_pages)
+{
+	int idx_start, idx_size;
+	phys_addr_t start, size;
+
+	if (!nr_pages)
+		return 0;
+
+	start = PFN_PHYS(pfn);
+	size = PFN_PHYS(min(nr_pages, PAGES_PER_SECTION
+				- (pfn & ~PAGE_SECTION_MASK)));
+	size = ALIGN(size, SECTION_ACTIVE_SIZE);
+
+	idx_start = section_active_index(start);
+	idx_size = section_active_index(size);
+
+	if (idx_size == 0)
+		return -1;
+	return ((1UL << idx_size) - 1) << idx_start;
+}
+
+void section_active_init(unsigned long pfn, unsigned long nr_pages)
+{
+	int end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
+	int i, start_sec = pfn_to_section_nr(pfn);
+
+	if (!nr_pages)
+		return;
+
+	for (i = start_sec; i <= end_sec; i++) {
+		struct mem_section *ms;
+		unsigned long mask;
+		unsigned long pfns;
+
+		pfns = min(nr_pages, PAGES_PER_SECTION
+				- (pfn & ~PAGE_SECTION_MASK));
+		mask = section_active_mask(pfn, pfns);
+
+		ms = __nr_to_section(i);
+		ms->usage->map_active |= mask;
+		pr_debug("%s: sec: %d mask: %#018lx\n", __func__, i, ms->usage->map_active);
+
+		pfn += pfns;
+		nr_pages -= pfns;
+	}
+}
+
 /* Record a memory area against a node. */
 void __init memory_present(int nid, unsigned long start, unsigned long end)
 {

From patchwork Thu May 2 05:55:43 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10926051
Subject: [PATCH v7 04/12] mm/hotplug: Prepare shrink_{zone,pgdat}_span for
 sub-section removal
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Oscar Salvador,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:55:43 -0700
Message-ID: <155677654297.2336373.3779112213402789415.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f
Sub-section hotplug support reduces the unit of operation of hotplug
from section-sized units (PAGES_PER_SECTION) to sub-section-sized units
(PAGES_PER_SUB_SECTION). Teach shrink_{zone,pgdat}_span() to consider
PAGES_PER_SUB_SECTION boundaries as the points where pfn_valid(), not
valid_section(), can toggle.

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Reviewed-by: Oscar Salvador
Signed-off-by: Dan Williams
---
 include/linux/mmzone.h |    2 ++
 mm/memory_hotplug.c    |   29 ++++++++---------------------
 2 files changed, 10 insertions(+), 21 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index cffde898e345..b13f0cddf75e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1164,6 +1164,8 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
 
 #define SECTION_ACTIVE_SIZE ((1UL << SECTION_SIZE_BITS) / BITS_PER_LONG)
 #define SECTION_ACTIVE_MASK (~(SECTION_ACTIVE_SIZE - 1))
+#define PAGES_PER_SUB_SECTION (SECTION_ACTIVE_SIZE / PAGE_SIZE)
+#define PAGE_SUB_SECTION_MASK (~(PAGES_PER_SUB_SECTION-1))
 
 struct mem_section_usage {
 	/*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a76fc6a6e9fe..0d379da0f1a8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -325,12 +325,8 @@ static unsigned long find_smallest_section_pfn(int nid, struct zone *zone,
 				     unsigned long start_pfn,
 				     unsigned long end_pfn)
 {
-	struct mem_section *ms;
-
-	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(start_pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SUB_SECTION) {
+		if (unlikely(!pfn_valid(start_pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(start_pfn) != nid))
@@ -350,15 +346,12 @@ static unsigned long find_biggest_section_pfn(int nid, struct zone *zone,
 				    unsigned long start_pfn,
 				    unsigned long end_pfn)
 {
-	struct mem_section *ms;
 	unsigned long pfn;
 
 	/* pfn is the end pfn of a memory section. */
 	pfn = end_pfn - 1;
-	for (; pfn >= start_pfn; pfn -= PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; pfn >= start_pfn; pfn -= PAGES_PER_SUB_SECTION) {
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(pfn) != nid))
@@ -380,7 +373,6 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 	unsigned long z = zone_end_pfn(zone); /* zone_end_pfn namespace clash */
 	unsigned long zone_end_pfn = z;
 	unsigned long pfn;
-	struct mem_section *ms;
 	int nid = zone_to_nid(zone);
 
 	zone_span_writelock(zone);
@@ -417,10 +409,8 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 	 * it check the zone has only hole or not.
 	 */
 	pfn = zone_start_pfn;
-	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SUB_SECTION) {
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (page_zone(pfn_to_page(pfn)) != zone)
@@ -448,7 +438,6 @@ static void shrink_pgdat_span(struct pglist_data *pgdat,
 	unsigned long p = pgdat_end_pfn(pgdat); /* pgdat_end_pfn namespace clash */
 	unsigned long pgdat_end_pfn = p;
 	unsigned long pfn;
-	struct mem_section *ms;
 	int nid = pgdat->node_id;
 
 	if (pgdat_start_pfn == start_pfn) {
@@ -485,10 +474,8 @@ static void shrink_pgdat_span(struct pglist_data *pgdat,
 	 * has only hole or not.
 	 */
 	pfn = pgdat_start_pfn;
-	for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SUB_SECTION) {
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (pfn_to_nid(pfn) != nid)

From patchwork Thu May 2 05:55:48 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10926055
Subject: [PATCH v7 05/12] mm/sparsemem: Convert kmalloc_section_memmap() to
 populate_section_memmap()
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, David Hildenbrand, Logan Gunthorpe,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:55:48 -0700
Message-ID: <155677654842.2336373.17000900051843592636.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Allow sub-section sized ranges to be added to the memmap.
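
Concretely, a userspace sketch of the rounding that
__populate_section_memmap() below applies to a requested range (the
pfn/nr_pages values are hypothetical; constants assume x86_64 defaults):

/* Sketch: an arbitrary pfn range is expanded to the 2M sub-section
 * boundaries tracked by 'map_active' before the memmap is populated. */
#include <stdio.h>

#define PAGE_SHIFT        12
#define SECTION_SIZE_BITS 27
#define BITS_PER_LONG     64
#define SECTION_ACTIVE_SIZE ((1UL << SECTION_SIZE_BITS) / BITS_PER_LONG)
#define SECTION_ACTIVE_MASK (~(SECTION_ACTIVE_SIZE - 1))
#define PHYS_PFN(x)       ((x) >> PAGE_SHIFT)
#define ALIGN(x, a)       (((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
        unsigned long pfn = 0x1100, nr_pages = 0x300; /* hypothetical */
        unsigned long end = ALIGN(pfn + nr_pages, PHYS_PFN(SECTION_ACTIVE_SIZE));

        pfn &= PHYS_PFN(SECTION_ACTIVE_MASK);
        /* prints: populate pfns [0x1000, 0x1400) -- 2M aligned on both ends */
        printf("populate pfns [%#lx, %#lx)\n", pfn, end);
        return 0;
}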
populate_section_memmap() takes an explicit pfn range rather than
assuming a full section, and those parameters are plumbed all the way
through to vmemmap_populate(). There should be no sub-section usage in
current deployments. New warnings are added to clarify which memmap
allocation paths are sub-section capable.

Cc: Michal Hocko
Cc: David Hildenbrand
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
---
 arch/x86/mm/init_64.c |    4 ++-
 include/linux/mm.h    |    4 ++-
 mm/sparse-vmemmap.c   |   21 +++++++++++------
 mm/sparse.c           |   61 +++++++++++++++++++++++++++++++------------------
 4 files changed, 57 insertions(+), 33 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 20d14254b686..bb018d09d2dc 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1457,7 +1457,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 {
         int err;

-        if (boot_cpu_has(X86_FEATURE_PSE))
+        if (end - start < PAGES_PER_SECTION * sizeof(struct page))
+                err = vmemmap_populate_basepages(start, end, node);
+        else if (boot_cpu_has(X86_FEATURE_PSE))
                 err = vmemmap_populate_hugepages(start, end, node, altmap);
         else if (altmap) {
                 pr_err_once("%s: no cpu support for altmap allocations\n",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0e8834ac32b7..5360a0e4051d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2748,8 +2748,8 @@ const char * arch_vma_name(struct vm_area_struct *vma);
 void print_vma_addr(char *prefix, unsigned long rip);

 void *sparse_buffer_alloc(unsigned long size);
-struct page *sparse_mem_map_populate(unsigned long pnum, int nid,
-                struct vmem_altmap *altmap);
+struct page * __populate_section_memmap(unsigned long pfn,
+                unsigned long nr_pages, int nid, struct vmem_altmap *altmap);
 pgd_t *vmemmap_pgd_populate(unsigned long addr, int node);
 p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node);
 pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 7fec05796796..dcb023aa23d1 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -245,19 +245,26 @@ int __meminit vmemmap_populate_basepages(unsigned long start,
         return 0;
 }

-struct page * __meminit sparse_mem_map_populate(unsigned long pnum, int nid,
-                struct vmem_altmap *altmap)
+struct page * __meminit __populate_section_memmap(unsigned long pfn,
+                unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
         unsigned long start;
         unsigned long end;
-        struct page *map;

-        map = pfn_to_page(pnum * PAGES_PER_SECTION);
-        start = (unsigned long)map;
-        end = (unsigned long)(map + PAGES_PER_SECTION);
+        /*
+         * The minimum granularity of memmap extensions is
+         * SECTION_ACTIVE_SIZE as allocations are tracked in the
+         * 'map_active' bitmap of the section.
+         */
+        end = ALIGN(pfn + nr_pages, PHYS_PFN(SECTION_ACTIVE_SIZE));
+        pfn &= PHYS_PFN(SECTION_ACTIVE_MASK);
+        nr_pages = end - pfn;
+
+        start = (unsigned long) pfn_to_page(pfn);
+        end = start + nr_pages * sizeof(struct page);
         if (vmemmap_populate(start, end, nid, altmap))
                 return NULL;

-        return map;
+        return pfn_to_page(pfn);
 }
diff --git a/mm/sparse.c b/mm/sparse.c
index 8d4f28e2c25e..ed26761327bf 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -452,8 +452,8 @@ static unsigned long __init section_map_size(void)
         return PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION);
 }

-struct page __init *sparse_mem_map_populate(unsigned long pnum, int nid,
-                struct vmem_altmap *altmap)
+struct page __init *__populate_section_memmap(unsigned long pfn,
+                unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
         unsigned long size = section_map_size();
         struct page *map = sparse_buffer_alloc(size);
@@ -534,10 +534,13 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
         }
         sparse_buffer_init(map_count * section_map_size(), nid);
         for_each_present_section_nr(pnum_begin, pnum) {
+                unsigned long pfn = section_nr_to_pfn(pnum);
+
                 if (pnum >= pnum_end)
                         break;
-                map = sparse_mem_map_populate(pnum, nid, NULL);
+                map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
+                                nid, NULL);
                 if (!map) {
                         pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.",
                                __func__, nid);
@@ -637,17 +640,17 @@ void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
 #endif

 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid,
-                struct vmem_altmap *altmap)
+static struct page *populate_section_memmap(unsigned long pfn,
+                unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
-        /* This will make the necessary allocations eventually. */
-        return sparse_mem_map_populate(pnum, nid, altmap);
+        return __populate_section_memmap(pfn, nr_pages, nid, altmap);
 }
-static void __kfree_section_memmap(struct page *memmap,
+
+static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
                 struct vmem_altmap *altmap)
 {
-        unsigned long start = (unsigned long)memmap;
-        unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION);
+        unsigned long start = (unsigned long) pfn_to_page(pfn);
+        unsigned long end = start + nr_pages * sizeof(struct page);

         vmemmap_free(start, end, altmap);
 }
@@ -661,11 +664,18 @@ static void free_map_bootmem(struct page *memmap)
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 #else
-static struct page *__kmalloc_section_memmap(void)
+struct page *populate_section_memmap(unsigned long pfn,
+                unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
         struct page *page, *ret;
         unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION;

+        if ((pfn & ~PAGE_SECTION_MASK) || nr_pages != PAGES_PER_SECTION) {
+                WARN(1, "%s: called with section unaligned parameters\n",
+                                __func__);
+                return NULL;
+        }
+
         page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
         if (page)
                 goto got_map_page;
@@ -682,15 +692,17 @@ static struct page *__kmalloc_section_memmap(void)
         return ret;
 }

-static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid,
+static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
                 struct vmem_altmap *altmap)
 {
-        return __kmalloc_section_memmap();
-}
+        struct page *memmap = pfn_to_page(pfn);
+
+        if ((pfn & ~PAGE_SECTION_MASK) || nr_pages != PAGES_PER_SECTION) {
+                WARN(1, "%s: called with section unaligned parameters\n",
+                                __func__);
+                return;
+        }

-static void __kfree_section_memmap(struct page *memmap,
-                struct vmem_altmap *altmap)
-{
         if (is_vmalloc_addr(memmap))
                 vfree(memmap);
         else
@@ -761,12 +773,13 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
         if (ret < 0 && ret != -EEXIST)
                 return ret;
         ret = 0;
-        memmap = kmalloc_section_memmap(section_nr, nid, altmap);
+        memmap = populate_section_memmap(start_pfn, PAGES_PER_SECTION, nid,
+                        altmap);
         if (!memmap)
                 return -ENOMEM;
         usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
         if (!usage) {
-                __kfree_section_memmap(memmap, altmap);
+                depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap);
                 return -ENOMEM;
         }
@@ -788,7 +801,7 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 out:
         if (ret < 0) {
                 kfree(usage);
-                __kfree_section_memmap(memmap, altmap);
+                depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap);
         }
         return ret;
 }
@@ -825,7 +838,8 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 #endif

 static void free_section_usage(struct page *memmap,
-                struct mem_section_usage *usage, struct vmem_altmap *altmap)
+                struct mem_section_usage *usage, unsigned long pfn,
+                unsigned long nr_pages, struct vmem_altmap *altmap)
 {
         struct page *usage_page;

@@ -839,7 +853,7 @@ static void free_section_usage(struct page *memmap,
         if (PageSlab(usage_page) || PageCompound(usage_page)) {
                 kfree(usage);
                 if (memmap)
-                        __kfree_section_memmap(memmap, altmap);
+                        depopulate_section_memmap(pfn, nr_pages, altmap);
                 return;
         }
@@ -868,7 +882,8 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
         clear_hwpoisoned_pages(memmap + map_offset,
                         PAGES_PER_SECTION - map_offset);
-        free_section_usage(memmap, usage, altmap);
+        free_section_usage(memmap, usage, section_nr_to_pfn(__section_nr(ms)),
+                        PAGES_PER_SECTION, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 #endif /* CONFIG_MEMORY_HOTPLUG */
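For readers following the arithmetic, here is a minimal userspace sketch of
the range clamping that __populate_section_memmap() performs above. The 2MB
SECTION_ACTIVE_SIZE value is the x86_64 figure quoted later in this series;
the macro names below are local stand-ins, not kernel API, and the kernel
masks with PHYS_PFN(SECTION_ACTIVE_MASK) rather than the equivalent
~(SUBSECTION_PAGES - 1) used here:

#include <stdio.h>

/* Illustrative x86_64 geometry: 4KB pages, 2MB sub-sections */
#define PAGE_SHIFT              12
#define SECTION_ACTIVE_SIZE     (2UL << 20)
#define PHYS_PFN(x)             ((x) >> PAGE_SHIFT)
#define SUBSECTION_PAGES        PHYS_PFN(SECTION_ACTIVE_SIZE)   /* 512 pfns */
#define ALIGN_UP(x, a)          (((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
        /* An arbitrary, unaligned request... */
        unsigned long pfn = 0x21234, nr_pages = 100;

        /* ...is expanded outward to whole 2MB sub-sections, mirroring
         * the ALIGN()/mask steps in __populate_section_memmap() */
        unsigned long end = ALIGN_UP(pfn + nr_pages, SUBSECTION_PAGES);

        pfn &= ~(SUBSECTION_PAGES - 1);
        nr_pages = end - pfn;

        /* prints: populate pfn 0x21200, 512 pages */
        printf("populate pfn %#lx, %lu pages\n", pfn, nr_pages);
        return 0;
}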
From patchwork Thu May 2 05:55:53 2019
Subject: [PATCH v7 06/12] mm/hotplug: Kill is_dev_zone() usage in __remove_pages()
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Logan Gunthorpe, David Hildenbrand, linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:55:53 -0700
Message-ID: <155677655373.2336373.15845721823034005000.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

The zone type check was a leftover from the cleanup that plumbed altmap
through the memory hotplug path, i.e. commit da024512a1fa "mm: pass the
vmem_altmap to arch_remove_memory and __remove_pages".
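As context for the hunk below: whether or not the range is ZONE_DEVICE, an
altmap means the head of the range is reserved to hold its own memmap, so
remove must skip those pfns. A hedged sketch of that bookkeeping, with a
simplified struct whose field values are illustrative (the real definition
lives in include/linux/memremap.h):

/* Simplified model of the vmem_altmap accounting */
struct vmem_altmap_sketch {
        unsigned long base_pfn; /* first pfn of the hot-added range */
        unsigned long reserve;  /* pfns the device reserves up front */
        unsigned long free;     /* pfns donated to memmap storage */
};

/* Mirrors what vmem_altmap_offset() computes: the pages backing the
 * page structures themselves are never onlined or offlined, so the
 * remove path starts map_offset pfns into the range. */
static unsigned long offset_sketch(const struct vmem_altmap_sketch *altmap)
{
        return altmap->reserve + altmap->free;
}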
Cc: Michal Hocko
Cc: Logan Gunthorpe
Cc: David Hildenbrand
Signed-off-by: Dan Williams
Reviewed-by: David Hildenbrand
Reviewed-by: Oscar Salvador
---
 mm/memory_hotplug.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0d379da0f1a8..108380e20d8f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -544,11 +544,8 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
         unsigned long map_offset = 0;
         int sections_to_remove;

-        /* In the ZONE_DEVICE case device driver owns the memory region */
-        if (is_dev_zone(zone)) {
-                if (altmap)
-                        map_offset = vmem_altmap_offset(altmap);
-        }
+        if (altmap)
+                map_offset = vmem_altmap_offset(altmap);

         clear_zone_contiguous(zone);

From patchwork Thu May 2 05:55:59 2019
Subject: [PATCH v7 07/12] mm: Kill is_dev_zone() helper
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Logan Gunthorpe, David Hildenbrand, Oscar Salvador, linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:55:59 -0700
Message-ID: <155677655941.2336373.17601391574483353034.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Given there are no more usages of is_dev_zone() outside of 'ifdef
CONFIG_ZONE_DEVICE' protection, kill off the compilation helper.

Cc: Michal Hocko
Cc: Logan Gunthorpe
Acked-by: David Hildenbrand
Reviewed-by: Oscar Salvador
Signed-off-by: Dan Williams
---
 include/linux/mmzone.h |   12 ------------
 mm/page_alloc.c        |    2 +-
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b13f0cddf75e..3237c5e456df 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -855,18 +855,6 @@ static inline int local_memory_node(int node_id) { return node_id; };
  */
 #define zone_idx(zone)          ((zone) - (zone)->zone_pgdat->node_zones)

-#ifdef CONFIG_ZONE_DEVICE
-static inline bool is_dev_zone(const struct zone *zone)
-{
-        return zone_idx(zone) == ZONE_DEVICE;
-}
-#else
-static inline bool is_dev_zone(const struct zone *zone)
-{
-        return false;
-}
-#endif
-
 /*
  * Returns true if a zone has pages managed by the buddy allocator.
 * All the reclaim decisions have to use this function rather than
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a68735c79609..be309d6a79de 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5864,7 +5864,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
         unsigned long start = jiffies;
         int nid = pgdat->node_id;

-        if (WARN_ON_ONCE(!pgmap || !is_dev_zone(zone)))
+        if (WARN_ON_ONCE(!pgmap || zone_idx(zone) != ZONE_DEVICE))
                 return;

         /*

From patchwork Thu May 2 05:56:05 2019
Subject: [PATCH v7 08/12] mm/sparsemem: Prepare for sub-section ranges
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:56:05 -0700
Message-ID: <155677656509.2336373.4432941742094481750.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Prepare
the memory hot-{add,remove} paths for handling sub-section ranges by
plumbing the starting page frame and number of pages being handled
through arch_{add,remove}_memory() to sparse_{add,remove}_one_section().
This is simply plumbing, small cleanups, and some identifier renames. No
intended functional changes.

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
---
 include/linux/memory_hotplug.h |    7 +-
 mm/memory_hotplug.c            |  118 +++++++++++++++++++++++++---------------
 mm/sparse.c                    |    7 +-
 3 files changed, 83 insertions(+), 49 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index ae892eef8b82..835a94650ee3 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -354,9 +354,10 @@ extern int add_memory_resource(int nid, struct resource *resource);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
                 unsigned long nr_pages, struct vmem_altmap *altmap);
 extern bool is_memblock_offlined(struct memory_block *mem);
-extern int sparse_add_one_section(int nid, unsigned long start_pfn,
-                struct vmem_altmap *altmap);
-extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
+extern int sparse_add_section(int nid, unsigned long pfn,
+                unsigned long nr_pages, struct vmem_altmap *altmap);
+extern void sparse_remove_section(struct zone *zone, struct mem_section *ms,
+                unsigned long pfn, unsigned long nr_pages,
                 unsigned long map_offset, struct vmem_altmap *altmap);
 extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
                                           unsigned long pnum);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 108380e20d8f..9f73332af910 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -251,22 +251,44 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat)
 }
 #endif /* CONFIG_HAVE_BOOTMEM_INFO_NODE */

-static int __meminit __add_section(int nid, unsigned long phys_start_pfn,
-                struct vmem_altmap *altmap, bool want_memblock)
+static int __meminit __add_section(int nid, unsigned long pfn,
+                unsigned long nr_pages, struct vmem_altmap *altmap,
+                bool want_memblock)
 {
         int ret;

-        if (pfn_valid(phys_start_pfn))
+        if (pfn_valid(pfn))
                 return -EEXIST;

-        ret = sparse_add_one_section(nid, phys_start_pfn, altmap);
+        ret = sparse_add_section(nid, pfn, nr_pages, altmap);
         if (ret < 0)
                 return ret;

         if (!want_memblock)
                 return 0;

-        return hotplug_memory_register(nid, __pfn_to_section(phys_start_pfn));
+        return hotplug_memory_register(nid, __pfn_to_section(pfn));
+}
+
+static int subsection_check(unsigned long pfn, unsigned long nr_pages,
+                unsigned long flags, const char *reason)
+{
+        /*
+         * Only allow partial section hotplug for !memblock ranges,
+         * since register_new_memory() requires section alignment, and
+         * CONFIG_SPARSEMEM_VMEMMAP=n requires sections to be fully
+         * populated.
+         */
+        if ((!IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP)
+                                || (flags & MHP_MEMBLOCK_API))
+                        && ((pfn & ~PAGE_SECTION_MASK)
+                                || (nr_pages & ~PAGE_SECTION_MASK))) {
+                WARN(1, "Sub-section hot-%s incompatible with %s\n", reason,
+                                (flags & MHP_MEMBLOCK_API)
+                                ? "memblock api" : "!CONFIG_SPARSEMEM_VMEMMAP");
+                return -EINVAL;
+        }
+        return 0;
 }

 /*
@@ -275,34 +297,40 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn,
  * call this function after deciding the zone to which to
  * add the new pages.
  */
-int __ref __add_pages(int nid, unsigned long phys_start_pfn,
-                unsigned long nr_pages, struct mhp_restrictions *restrictions)
+int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
+                struct mhp_restrictions *restrictions)
 {
         unsigned long i;
-        int err = 0;
-        int start_sec, end_sec;
+        int start_sec, end_sec, err;
         struct vmem_altmap *altmap = restrictions->altmap;

-        /* during initialize mem_map, align hot-added range to section */
-        start_sec = pfn_to_section_nr(phys_start_pfn);
-        end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
-
         if (altmap) {
                 /*
                  * Validate altmap is within bounds of the total request
                  */
-                if (altmap->base_pfn != phys_start_pfn
+                if (altmap->base_pfn != pfn
                                 || vmem_altmap_offset(altmap) > nr_pages) {
                         pr_warn_once("memory add fail, invalid altmap\n");
-                        err = -EINVAL;
-                        goto out;
+                        return -EINVAL;
                 }
                 altmap->alloc = 0;
         }

+        err = subsection_check(pfn, nr_pages, restrictions->flags, "add");
+        if (err)
+                return err;
+
+        start_sec = pfn_to_section_nr(pfn);
+        end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
         for (i = start_sec; i <= end_sec; i++) {
-                err = __add_section(nid, section_nr_to_pfn(i), altmap,
+                unsigned long pfns;
+
+                pfns = min(nr_pages, PAGES_PER_SECTION
+                                - (pfn & ~PAGE_SECTION_MASK));
+                err = __add_section(nid, pfn, pfns, altmap,
                                 restrictions->flags & MHP_MEMBLOCK_API);
+                pfn += pfns;
+                nr_pages -= pfns;

                 /*
                  * EEXIST is finally dealt with by ioresource collision
@@ -315,7 +343,6 @@ int __ref __add_pages(int nid, unsigned long phys_start_pfn,
                 cond_resched();
         }
         vmemmap_populate_print_last();
-out:
         return err;
 }

@@ -494,10 +521,10 @@ static void shrink_pgdat_span(struct pglist_data *pgdat,
         pgdat->node_spanned_pages = 0;
 }

-static void __remove_zone(struct zone *zone, unsigned long start_pfn)
+static void __remove_zone(struct zone *zone, unsigned long start_pfn,
+                unsigned long nr_pages)
 {
         struct pglist_data *pgdat = zone->zone_pgdat;
-        int nr_pages = PAGES_PER_SECTION;
         unsigned long flags;

         pgdat_resize_lock(zone->zone_pgdat, &flags);
@@ -506,29 +533,26 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn)
         pgdat_resize_unlock(zone->zone_pgdat, &flags);
 }

-static void __remove_section(struct zone *zone, struct mem_section *ms,
-                unsigned long map_offset,
-                struct vmem_altmap *altmap)
+static void __remove_section(struct zone *zone, unsigned long pfn,
+                unsigned long nr_pages, unsigned long map_offset,
+                struct vmem_altmap *altmap)
 {
-        unsigned long start_pfn;
-        int scn_nr;
+        struct mem_section *ms = __nr_to_section(pfn_to_section_nr(pfn));

         if (WARN_ON_ONCE(!valid_section(ms)))
                 return;

         unregister_memory_section(ms);

-        scn_nr = __section_nr(ms);
-        start_pfn = section_nr_to_pfn((unsigned long)scn_nr);
-        __remove_zone(zone, start_pfn);
+        __remove_zone(zone, pfn, nr_pages);

-        sparse_remove_one_section(zone, ms, map_offset, altmap);
+        sparse_remove_section(zone, ms, pfn, nr_pages, map_offset, altmap);
 }

 /**
  * __remove_pages() - remove sections of pages from a zone
  * @zone: zone from which pages need to be removed
- * @phys_start_pfn: starting pageframe (must be aligned to start of a section)
+ * @pfn: starting pageframe (must be aligned to start of a section)
  * @nr_pages: number of pages to remove (must be multiple of section size)
  * @altmap: alternative device page map or %NULL if default memmap is used
  *
@@ -537,31 +561,39 @@ static void __remove_section(struct zone *zone, struct mem_section *ms,
  * sure that pages are marked reserved and zones are adjust properly by
  * calling offline_pages().
  */
-void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
+void __remove_pages(struct zone *zone, unsigned long pfn,
                 unsigned long nr_pages, struct vmem_altmap *altmap)
 {
-        unsigned long i;
         unsigned long map_offset = 0;
-        int sections_to_remove;
+        int i, start_sec, end_sec;
+        struct memory_block *mem;
+        unsigned long flags = 0;

         if (altmap)
                 map_offset = vmem_altmap_offset(altmap);

         clear_zone_contiguous(zone);

-        /*
-         * We can only remove entire sections
-         */
-        BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
-        BUG_ON(nr_pages % PAGES_PER_SECTION);
+        mem = find_memory_block(__pfn_to_section(pfn));
+        if (mem) {
+                flags |= MHP_MEMBLOCK_API;
+                put_device(&mem->dev);
+        }

-        sections_to_remove = nr_pages / PAGES_PER_SECTION;
-        for (i = 0; i < sections_to_remove; i++) {
-                unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
+        if (subsection_check(pfn, nr_pages, flags, "remove"))
+                return;
+
+        start_sec = pfn_to_section_nr(pfn);
+        end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
+        for (i = start_sec; i <= end_sec; i++) {
+                unsigned long pfns;

                 cond_resched();
-                __remove_section(zone, __pfn_to_section(pfn), map_offset,
-                                altmap);
+                pfns = min(nr_pages, PAGES_PER_SECTION
+                                - (pfn & ~PAGE_SECTION_MASK));
+                __remove_section(zone, pfn, pfns, map_offset, altmap);
+                pfn += pfns;
+                nr_pages -= pfns;
                 map_offset = 0;
         }
diff --git a/mm/sparse.c b/mm/sparse.c
index ed26761327bf..198371e5fc87 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -756,8 +756,8 @@ static void free_map_bootmem(struct page *memmap)
  * * -EEXIST	- Section has been present.
  * * -ENOMEM	- Out of memory.
  */
-int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
-                struct vmem_altmap *altmap)
+int __meminit sparse_add_section(int nid, unsigned long start_pfn,
+                unsigned long nr_pages, struct vmem_altmap *altmap)
 {
         unsigned long section_nr = pfn_to_section_nr(start_pfn);
         struct mem_section_usage *usage;
@@ -866,7 +866,8 @@ static void free_section_usage(struct page *memmap,
                 free_map_bootmem(memmap);
 }

-void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
+void sparse_remove_section(struct zone *zone, struct mem_section *ms,
+                unsigned long pfn, unsigned long nr_pages,
                 unsigned long map_offset, struct vmem_altmap *altmap)
 {
         struct page *memmap = NULL;
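The per-section loop split that __add_pages() and __remove_pages() now share
is easy to check standalone. Here is a self-contained sketch of that
arithmetic; the section geometry constants are illustrative x86_64 values
defined locally, not pulled from kernel headers:

#include <stdio.h>

#define SECTION_SHIFT_PAGES     15              /* 128MB of 4KB pages */
#define PAGES_PER_SECTION       (1UL << SECTION_SHIFT_PAGES)
#define PAGE_SECTION_MASK       (~(PAGES_PER_SECTION - 1))
#define MIN(a, b)               ((a) < (b) ? (a) : (b))

int main(void)
{
        /* A range that starts and ends mid-section */
        unsigned long pfn = PAGES_PER_SECTION - 512;    /* last 2MB of section 0 */
        unsigned long nr_pages = PAGES_PER_SECTION + 1024;

        while (nr_pages) {
                /* Clamp each step to the remainder of the current section,
                 * exactly like the min() expression in the patch */
                unsigned long pfns = MIN(nr_pages,
                                PAGES_PER_SECTION - (pfn & ~PAGE_SECTION_MASK));

                printf("section %lu: pfn %#lx, %lu pages\n",
                                pfn >> SECTION_SHIFT_PAGES, pfn, pfns);
                pfn += pfns;
                nr_pages -= pfns;
        }
        return 0;       /* visits sections 0, 1, and 2 */
}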
From patchwork Thu May 2 05:56:10 2019
Subject: [PATCH v7 09/12] mm/sparsemem: Support sub-section hotplug
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:56:10 -0700
Message-ID: <155677657023.2336373.4452495266651002382.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

The libnvdimm sub-system has suffered a series of hacks and broken
workarounds for the memory-hotplug implementation's awkward
section-aligned (128MB) granularity. For example the following backtrace
is emitted when attempting arch_add_memory() with physical address
ranges that intersect 'System RAM' (RAM) with 'Persistent Memory' (PMEM)
within a given section:

 WARNING: CPU: 0 PID: 558 at kernel/memremap.c:300
          devm_memremap_pages+0x3b5/0x4c0
 devm_memremap_pages attempted on mixed region [mem 0x200000000-0x2fbffffff flags 0x200]
 [..]
 Call Trace:
   dump_stack+0x86/0xc3
   __warn+0xcb/0xf0
   warn_slowpath_fmt+0x5f/0x80
   devm_memremap_pages+0x3b5/0x4c0
   __wrap_devm_memremap_pages+0x58/0x70 [nfit_test_iomap]
   pmem_attach_disk+0x19a/0x440 [nd_pmem]

Recently it was discovered that the problem goes beyond RAM vs PMEM
collisions as some platforms produce PMEM vs PMEM collisions within a
given section. The libnvdimm workaround for that case revealed that the
libnvdimm section-alignment-padding implementation has been broken for a
long while. A fix for that long-standing breakage introduces as many
problems as it solves as it would require a backward-incompatible change
to the namespace metadata interpretation. Instead of that dubious route
[1], address the root problem in the memory-hotplug implementation.

[1]: https://lore.kernel.org/r/155000671719.348031.2347363160141119237.stgit@dwillia2-desk3.amr.corp.intel.com
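To make the sub-section bookkeeping concrete before the diff: each 128MB
section is carved into 2MB sub-sections, and a per-section 'map_active'
bitmap records which of them are populated. The sketch below is a hedged
model of what section_active_mask() computes; the constants and helper name
are illustrative assumptions patterned on this series, not the final kernel
interface:

#include <stdio.h>

#define PAGE_SHIFT              12
#define SECTION_SIZE            (128UL << 20)   /* 128MB per section */
#define SUBSECTION_SIZE         (2UL << 20)     /* 2MB tracking granularity */
#define SUBSECTIONS_PER_SECTION (SECTION_SIZE / SUBSECTION_SIZE)  /* 64 bits */
#define PAGES_PER_SUBSECTION    (SUBSECTION_SIZE >> PAGE_SHIFT)
#define PAGES_PER_SECTION       (SECTION_SIZE >> PAGE_SHIFT)

/* Which sub-section bits does a pfn range within one section cover? */
static unsigned long active_mask_sketch(unsigned long pfn, unsigned long nr_pages)
{
        unsigned long start = (pfn % PAGES_PER_SECTION) / PAGES_PER_SUBSECTION;
        unsigned long end = start + nr_pages / PAGES_PER_SUBSECTION;
        unsigned long mask = 0;

        for (unsigned long i = start; i < end && i < SUBSECTIONS_PER_SECTION; i++)
                mask |= 1UL << i;
        return mask;
}

int main(void)
{
        /* A 4MB PMEM range starting 2MB into a section sets bits 1-2,
         * printing "mask: 0x6" */
        printf("mask: %#lx\n", active_mask_sketch(512, 1024));
        return 0;
}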
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
---
 mm/sparse.c |  223 ++++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 150 insertions(+), 73 deletions(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index 198371e5fc87..419a3620af6e 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -83,8 +83,15 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
         unsigned long root = SECTION_NR_TO_ROOT(section_nr);
         struct mem_section *section;

+        /*
+         * An existing section is possible in the sub-section hotplug
+         * case. First hot-add instantiates, follow-on hot-add reuses
+         * the existing section.
+         *
+         * The mem_hotplug_lock resolves the apparent race below.
+         */
         if (mem_section[root])
-                return -EEXIST;
+                return 0;

         section = sparse_index_alloc(nid);
         if (!section)
@@ -338,6 +345,15 @@ static void __meminit sparse_init_one_section(struct mem_section *ms,
                 unsigned long pnum, struct page *mem_map,
                 struct mem_section_usage *usage)
 {
+        /*
+         * Given that SPARSEMEM_VMEMMAP=y supports sub-section hotplug,
+         * ->section_mem_map can not be guaranteed to point to a full
+         * section's worth of memory. The field is only valid / used
+         * in the SPARSEMEM_VMEMMAP=n case.
+         */
+        if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
+                mem_map = NULL;
+
         ms->section_mem_map &= ~SECTION_MAP_MASK;
         ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum)
                 | SECTION_HAS_MEM_MAP;
@@ -743,10 +759,130 @@ static void free_map_bootmem(struct page *memmap)
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */

+#ifndef CONFIG_MEMORY_HOTREMOVE
+static void free_map_bootmem(struct page *memmap)
+{
+}
+#endif
+
+static bool is_early_section(struct mem_section *ms)
+{
+        struct page *usage_page;
+
+        usage_page = virt_to_page(ms->usage);
+        if (PageSlab(usage_page) || PageCompound(usage_page))
+                return false;
+        else
+                return true;
+}
+
+static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
+                int nid, struct vmem_altmap *altmap)
+{
+        unsigned long mask = section_active_mask(pfn, nr_pages);
+        struct mem_section *ms = __pfn_to_section(pfn);
+        bool early_section = is_early_section(ms);
+        struct page *memmap = NULL;
+
+        if (WARN(!ms->usage || (ms->usage->map_active & mask) != mask,
+                        "section already deactivated: active: %#lx mask: %#lx\n",
+                        ms->usage ? ms->usage->map_active : 0, mask))
+                return;
+
+        if (WARN(!IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP)
+                                && nr_pages < PAGES_PER_SECTION,
+                        "partial memory section removal not supported\n"))
+                return;
+
+        /*
+         * There are 3 cases to handle across two configurations
+         * (SPARSEMEM_VMEMMAP={y,n}):
+         *
+         * 1/ deactivation of a partial hot-added section (only possible
+         * in the SPARSEMEM_VMEMMAP=y case).
+         *    a/ section was present at memory init
+         *    b/ section was hot-added post memory init
+         * 2/ deactivation of a complete hot-added section
+         * 3/ deactivation of a complete section from memory init
+         *
+         * For 1/, when map_active does not go to zero we will not be
+         * freeing the usage map, but still need to free the vmemmap
+         * range.
+         *
+         * For 2/ and 3/ the SPARSEMEM_VMEMMAP={y,n} cases are unified
+         */
+        ms->usage->map_active ^= mask;
+        if (ms->usage->map_active == 0) {
+                unsigned long section_nr = pfn_to_section_nr(pfn);
+
+                if (!early_section) {
+                        kfree(ms->usage);
+                        ms->usage = NULL;
+                }
+                memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
+                ms->section_mem_map = sparse_encode_mem_map(NULL, section_nr);
+        }
+
+        if (early_section && memmap)
+                free_map_bootmem(memmap);
+        else
+                depopulate_section_memmap(pfn, nr_pages, altmap);
+}
+
+static struct page * __meminit section_activate(int nid, unsigned long pfn,
+                unsigned long nr_pages, struct vmem_altmap *altmap)
+{
+        unsigned long mask = section_active_mask(pfn, nr_pages);
+        struct mem_section *ms = __pfn_to_section(pfn);
+        struct mem_section_usage *usage = NULL;
+        struct page *memmap;
+        int rc = 0;
+
+        if (!ms->usage) {
+                usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
+                if (!usage)
+                        return ERR_PTR(-ENOMEM);
+                ms->usage = usage;
+        }
+
+        if (!mask)
+                rc = -EINVAL;
+        else if (mask & ms->usage->map_active)
+                rc = -EEXIST;
+        else
+                ms->usage->map_active |= mask;
+
+        if (rc) {
+                if (usage)
+                        ms->usage = NULL;
+                kfree(usage);
+                return ERR_PTR(rc);
+        }
+
+        /*
+         * The early init code does not consider partially populated
+         * initial sections, it simply assumes that memory will never be
+         * referenced. If we hot-add memory into such a section then we
+         * do not need to populate the memmap and can simply reuse what
+         * is already there.
+         */
+        if (nr_pages < PAGES_PER_SECTION && is_early_section(ms))
+                return pfn_to_page(pfn);
+
+        memmap = populate_section_memmap(pfn, nr_pages, nid, altmap);
+        if (!memmap) {
+                section_deactivate(pfn, nr_pages, nid, altmap);
+                return ERR_PTR(-ENOMEM);
+        }
+
+        return memmap;
+}
+
 /**
- * sparse_add_one_section - add a memory section
+ * sparse_add_section - add a memory section, or populate an existing one
  * @nid: The node to add section on
  * @start_pfn: start pfn of the memory range
+ * @nr_pages: number of pfns to add in the section
  * @altmap: device page map
  *
  * This is only intended for hotplug.
@@ -760,49 +896,31 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
                 unsigned long nr_pages, struct vmem_altmap *altmap)
 {
         unsigned long section_nr = pfn_to_section_nr(start_pfn);
-        struct mem_section_usage *usage;
         struct mem_section *ms;
         struct page *memmap;
         int ret;

-        /*
-         * no locking for this, because it does its own
-         * plus, it does a kmalloc
-         */
         ret = sparse_index_init(section_nr, nid);
         if (ret < 0 && ret != -EEXIST)
                 return ret;
-        ret = 0;
-        memmap = populate_section_memmap(start_pfn, PAGES_PER_SECTION, nid,
-                        altmap);
-        if (!memmap)
-                return -ENOMEM;
-        usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
-        if (!usage) {
-                depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap);
-                return -ENOMEM;
-        }
-        ms = __pfn_to_section(start_pfn);
-        if (ms->section_mem_map & SECTION_MARKED_PRESENT) {
-                ret = -EEXIST;
-                goto out;
-        }

+        memmap = section_activate(nid, start_pfn, nr_pages, altmap);
+        if (IS_ERR(memmap))
+                return PTR_ERR(memmap);
+        ret = 0;

         /*
          * Poison uninitialized struct pages in order to catch invalid flags
          * combinations.
          */
-        page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION);
+        page_init_poison(pfn_to_page(start_pfn), sizeof(struct page) * nr_pages);

+        ms = __pfn_to_section(start_pfn);
         section_mark_present(ms);
-        sparse_init_one_section(ms, section_nr, memmap, usage);
+        sparse_init_one_section(ms, section_nr, memmap, ms->usage);

-out:
-        if (ret < 0) {
-                kfree(usage);
-                depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap);
-        }
+        if (ret < 0)
+                section_deactivate(start_pfn, nr_pages, nid, altmap);
         return ret;
 }
@@ -837,54 +955,13 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 }
 #endif

-static void free_section_usage(struct page *memmap,
-                struct mem_section_usage *usage, unsigned long pfn,
-                unsigned long nr_pages, struct vmem_altmap *altmap)
-{
-        struct page *usage_page;
-
-        if (!usage)
-                return;
-
-        usage_page = virt_to_page(usage);
-        /*
-         * Check to see if allocation came from hot-plug-add
-         */
-        if (PageSlab(usage_page) || PageCompound(usage_page)) {
-                kfree(usage);
-                if (memmap)
-                        depopulate_section_memmap(pfn, nr_pages, altmap);
-                return;
-        }
-
-        /*
-         * The usemap came from bootmem. This is packed with other usemaps
-         * on the section which has pgdat at boot time. Just keep it as is now.
-         */
-
-        if (memmap)
-                free_map_bootmem(memmap);
-}
-
 void sparse_remove_section(struct zone *zone, struct mem_section *ms,
                 unsigned long pfn, unsigned long nr_pages,
                 unsigned long map_offset, struct vmem_altmap *altmap)
 {
-        struct page *memmap = NULL;
-        struct mem_section_usage *usage = NULL;
-
-        if (ms->section_mem_map) {
-                usage = ms->usage;
-                memmap = sparse_decode_mem_map(ms->section_mem_map,
-                                __section_nr(ms));
-                ms->section_mem_map = 0;
-                ms->usage = NULL;
-        }
-
-        clear_hwpoisoned_pages(memmap + map_offset,
-                        PAGES_PER_SECTION - map_offset);
-        free_section_usage(memmap, usage, section_nr_to_pfn(__section_nr(ms)),
-                        PAGES_PER_SECTION, altmap);
+        clear_hwpoisoned_pages(pfn_to_page(pfn) + map_offset,
+                        nr_pages - map_offset);
+        section_deactivate(pfn, nr_pages, zone_to_nid(zone), altmap);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 #endif /* CONFIG_MEMORY_HOTPLUG */
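To illustrate the activate/deactivate pairing in the patch above, here is a
hedged userspace model of the map_active lifecycle across two sub-section
hot-adds and removes of the same 128MB section. The helpers loosely mirror
section_activate()/section_deactivate() but are deliberately simplified:
there is one global word instead of per-section state, and the real code's
serialization comes from mem_hotplug_lock:

#include <stdio.h>

static unsigned long map_active;        /* one 64-bit word per section */

/* Returns 0 on success, -1 if any requested bit is already active
 * (-EEXIST in the patch) */
static int activate(unsigned long mask)
{
        if (map_active & mask)
                return -1;
        map_active |= mask;
        return 0;
}

/* Clears the bits; section-wide resources (usage map, memmap) are
 * only torn down when the last sub-section goes away. */
static int deactivate(unsigned long mask)
{
        map_active ^= mask;
        return map_active == 0; /* 1 => free usage/memmap now */
}

int main(void)
{
        activate(0x0000ffffUL);                 /* namespace A: first 32MB */
        activate(0xffffUL << 48);               /* namespace B: last 32MB */

        printf("B gone, section still live: %d\n", !deactivate(0xffffUL << 48));
        printf("A gone, section torn down: %d\n", deactivate(0x0000ffffUL));
        return 0;
}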
From patchwork Thu May 2 05:56:15 2019
Subject: [PATCH v7 10/12] mm/devm_memremap_pages: Enable sub-section remap
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Toshi Kani, Jérôme Glisse, Logan Gunthorpe,
 linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:56:15 -0700
Message-ID: <155677657576.2336373.1598502251563862624.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Teach devm_memremap_pages() about the new sub-section capabilities of
arch_{add,remove}_memory(). Effectively, just replace all usage of
align_start, align_end, and align_size with res->start, res->end, and
resource_size(res). The existing sanity check will still make sure that
the two separate remap attempts do not collide within a sub-section
(2MB on x86).
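For illustration only, not part of the patch: a standalone sketch of
the section-rounding arithmetic this change deletes. The locals mirror
the removed align_start/align_size, while PA_SECTION_SIZE and the
sample resource are assumed x86_64 values.

/*
 * Sketch (not kernel code): contrast the old section-aligned remap
 * bounds with the new exact-resource bounds.
 */
#include <stdio.h>

#define PA_SECTION_SIZE	(128ULL << 20)	/* assumed x86_64 section span */
#define ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	unsigned long long start = 0x148200000ULL; /* 2MB aligned, not 128MB aligned */
	unsigned long long size = 6ULL << 20;      /* hypothetical 6MB pmem resource */

	/* old: round out to whole 128MB sections */
	unsigned long long align_start = start & ~(PA_SECTION_SIZE - 1);
	unsigned long long align_size =
		ALIGN(start + size, PA_SECTION_SIZE) - align_start;

	/* new: remap exactly res->start..res->end at sub-section granularity */
	printf("old remap: [%#llx, +%#llx)\n", align_start, align_size); /* [0x140000000, +0x10000000) */
	printf("new remap: [%#llx, +%#llx)\n", start, size);             /* [0x148200000, +0x600000) */
	return 0;
}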
Cc: Michal Hocko
Cc: Toshi Kani
Cc: Jérôme Glisse
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
---
 kernel/memremap.c |   61 +++++++++++++++++++++--------------------------------
 1 file changed, 24 insertions(+), 37 deletions(-)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index f355586ea54a..425904858d97 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -59,7 +59,7 @@ static unsigned long pfn_first(struct dev_pagemap *pgmap)
 	struct vmem_altmap *altmap = &pgmap->altmap;
 	unsigned long pfn;
 
-	pfn = res->start >> PAGE_SHIFT;
+	pfn = PHYS_PFN(res->start);
 	if (pgmap->altmap_valid)
 		pfn += vmem_altmap_offset(altmap);
 	return pfn;
@@ -87,7 +87,6 @@ static void devm_memremap_pages_release(void *data)
 	struct dev_pagemap *pgmap = data;
 	struct device *dev = pgmap->dev;
 	struct resource *res = &pgmap->res;
-	resource_size_t align_start, align_size;
 	unsigned long pfn;
 	int nid;
 
@@ -96,25 +95,21 @@ static void devm_memremap_pages_release(void *data)
 		put_page(pfn_to_page(pfn));
 
 	/* pages are dead and unused, undo the arch mapping */
-	align_start = res->start & ~(PA_SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
-		- align_start;
-
-	nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT));
+	nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start)));
 
 	mem_hotplug_begin();
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
-		pfn = align_start >> PAGE_SHIFT;
+		pfn = PHYS_PFN(res->start);
 		__remove_pages(page_zone(pfn_to_page(pfn)), pfn,
-				align_size >> PAGE_SHIFT, NULL);
+				PHYS_PFN(resource_size(res)), NULL);
 	} else {
-		arch_remove_memory(nid, align_start, align_size,
+		arch_remove_memory(nid, res->start, resource_size(res),
 				pgmap->altmap_valid ? &pgmap->altmap : NULL);
-		kasan_remove_zero_shadow(__va(align_start), align_size);
+		kasan_remove_zero_shadow(__va(res->start), resource_size(res));
 	}
 	mem_hotplug_done();
 
-	untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
+	untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res));
 	pgmap_array_delete(res);
 	dev_WARN_ONCE(dev, pgmap->altmap.alloc,
 		      "%s: failed to free all reserved pages\n", __func__);
@@ -141,16 +136,13 @@ static void devm_memremap_pages_release(void *data)
  */
 void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 {
-	resource_size_t align_start, align_size, align_end;
-	struct vmem_altmap *altmap = pgmap->altmap_valid ?
-			&pgmap->altmap : NULL;
 	struct resource *res = &pgmap->res;
 	struct dev_pagemap *conflict_pgmap;
 	struct mhp_restrictions restrictions = {
 		/*
 		 * We do not want any optional features only our own memmap
 		 */
-		.altmap = altmap,
+		.altmap = pgmap->altmap_valid ? &pgmap->altmap : NULL,
 	};
 	pgprot_t pgprot = PAGE_KERNEL;
 	int error, nid, is_ram;
@@ -158,26 +150,21 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	if (!pgmap->ref || !pgmap->kill)
 		return ERR_PTR(-EINVAL);
 
-	align_start = res->start & ~(PA_SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
-		- align_start;
-	align_end = align_start + align_size - 1;
-
-	conflict_pgmap = get_dev_pagemap(PHYS_PFN(align_start), NULL);
+	conflict_pgmap = get_dev_pagemap(PHYS_PFN(res->start), NULL);
 	if (conflict_pgmap) {
 		dev_WARN(dev, "Conflicting mapping in same section\n");
 		put_dev_pagemap(conflict_pgmap);
 		return ERR_PTR(-ENOMEM);
 	}
 
-	conflict_pgmap = get_dev_pagemap(PHYS_PFN(align_end), NULL);
+	conflict_pgmap = get_dev_pagemap(PHYS_PFN(res->end), NULL);
 	if (conflict_pgmap) {
 		dev_WARN(dev, "Conflicting mapping in same section\n");
 		put_dev_pagemap(conflict_pgmap);
 		return ERR_PTR(-ENOMEM);
 	}
 
-	is_ram = region_intersects(align_start, align_size,
+	is_ram = region_intersects(res->start, resource_size(res),
 		IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE);
 
 	if (is_ram != REGION_DISJOINT) {
@@ -198,8 +185,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	if (nid < 0)
 		nid = numa_mem_id();
 
-	error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(align_start), 0,
-			align_size);
+	error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
+			resource_size(res));
 	if (error)
 		goto err_pfn_remap;
 
@@ -217,25 +204,25 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	 * arch_add_memory().
 	 */
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
-		error = add_pages(nid, align_start >> PAGE_SHIFT,
-				align_size >> PAGE_SHIFT, &restrictions);
+		error = add_pages(nid, PHYS_PFN(res->start),
+				PHYS_PFN(resource_size(res)), &restrictions);
 	} else {
-		error = kasan_add_zero_shadow(__va(align_start), align_size);
+		error = kasan_add_zero_shadow(__va(res->start), resource_size(res));
 		if (error) {
 			mem_hotplug_done();
 			goto err_kasan;
 		}
 
-		error = arch_add_memory(nid, align_start, align_size,
-				&restrictions);
+		error = arch_add_memory(nid, res->start, resource_size(res),
+				&restrictions);
 	}
 
 	if (!error) {
 		struct zone *zone;
 
 		zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE];
-		move_pfn_range_to_zone(zone, align_start >> PAGE_SHIFT,
-				align_size >> PAGE_SHIFT, altmap);
+		move_pfn_range_to_zone(zone, PHYS_PFN(res->start),
+				PHYS_PFN(resource_size(res)), restrictions.altmap);
 	}
 
 	mem_hotplug_done();
@@ -247,8 +234,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	 * to allow us to do the work while not holding the hotplug lock.
 	 */
 	memmap_init_zone_device(&NODE_DATA(nid)->node_zones[ZONE_DEVICE],
-				align_start >> PAGE_SHIFT,
-				align_size >> PAGE_SHIFT, pgmap);
+				PHYS_PFN(res->start),
+				PHYS_PFN(resource_size(res)), pgmap);
 
 	percpu_ref_get_many(pgmap->ref, pfn_end(pgmap) - pfn_first(pgmap));
 
 	error = devm_add_action_or_reset(dev, devm_memremap_pages_release,
@@ -259,9 +246,9 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 
 	return __va(res->start);
 
 err_add_memory:
-	kasan_remove_zero_shadow(__va(align_start), align_size);
+	kasan_remove_zero_shadow(__va(res->start), resource_size(res));
 err_kasan:
-	untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
+	untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res));
 err_pfn_remap:
 	pgmap_array_delete(res);
 err_array:

From patchwork Thu May 2 05:56:21 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 10926077
Subject: [PATCH v7 11/12] libnvdimm/pfn: Fix fsdax-mode namespace info-block zero-fields
From: Dan Williams
To: akpm@linux-foundation.org
Cc: stable@vger.kernel.org, linux-nvdimm@lists.01.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:56:21 -0700
Message-ID: <155677658129.2336373.14466297657474148349.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

At namespace creation time there is the potential for the "expected to
be zero" fields of a 'pfn' info-block to be filled with indeterminate
data. While the kernel buffer is zeroed on allocation, it is
immediately overwritten by nd_pfn_validate() filling it with the
current contents of the on-media info-block location. For fields like
'flags' and 'padding' this potentially means that future
implementations cannot rely on those fields being zero.

In preparation to stop using the 'start_pad' and 'end_trunc' fields for
section alignment, arrange for fields that are not explicitly
initialized to be guaranteed zero. Bump the minor version to indicate
it is safe to assume the 'padding' and 'flags' are zero. Otherwise,
this corruption is expected to be benign since all other critical
fields are explicitly initialized.
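To make the ordering concrete, here is a simplified userspace model of
the fix; the struct and helpers are stand-ins for the driver types, and
the -ENODEV return is assumed to mark the "no info block" case as in
nd_pfn_init():

/*
 * Simplified model (not the libnvdimm code): allocate without zeroing,
 * let validation overwrite the buffer with on-media contents, and zero
 * explicitly only on the "no info block, do init" path so unwritten
 * fields are guaranteed zero.
 */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct sb_model {
	unsigned int flags;
	unsigned char padding[16];
};

/* stand-in for nd_pfn_validate(): assume no info block on media */
static int validate(struct sb_model *sb)
{
	return -ENODEV;
}

static int sb_init(struct sb_model *sb)
{
	int rc = validate(sb);	/* may scribble media contents into *sb */

	if (rc != -ENODEV)
		return rc;	/* valid info block found (or hard error) */
	/* no info block: zero first so 'flags' and 'padding' stay zero */
	memset(sb, 0, sizeof(*sb));
	/* ... explicit field initialization, minor version bumped to 3 ... */
	return 0;
}

int main(void)
{
	struct sb_model *sb = malloc(sizeof(*sb)); /* was kzalloc, now kmalloc */

	if (!sb)
		return 1;
	sb_init(sb);
	free(sb);
	return 0;
}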
Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem")
Cc: 
Signed-off-by: Dan Williams 
---
 drivers/nvdimm/dax_devs.c |    2 +-
 drivers/nvdimm/pfn.h      |    1 +
 drivers/nvdimm/pfn_devs.c |   18 +++++++++++++++---
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/dax_devs.c b/drivers/nvdimm/dax_devs.c
index 0453f49dc708..326f02ffca81 100644
--- a/drivers/nvdimm/dax_devs.c
+++ b/drivers/nvdimm/dax_devs.c
@@ -126,7 +126,7 @@ int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns)
 	nvdimm_bus_unlock(&ndns->dev);
 	if (!dax_dev)
 		return -ENOMEM;
-	pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
+	pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
 	nd_pfn->pfn_sb = pfn_sb;
 	rc = nd_pfn_validate(nd_pfn, DAX_SIG);
 	dev_dbg(dev, "dax: %s\n", rc == 0 ? dev_name(dax_dev) : "");
diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h
index dde9853453d3..e901e3a3b04c 100644
--- a/drivers/nvdimm/pfn.h
+++ b/drivers/nvdimm/pfn.h
@@ -36,6 +36,7 @@ struct nd_pfn_sb {
 	__le32 end_trunc;
 	/* minor-version-2 record the base alignment of the mapping */
 	__le32 align;
+	/* minor-version-3 guarantee the padding and flags are zero */
 	u8 padding[4000];
 	__le64 checksum;
 };
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 01f40672507f..a2406253eb70 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -420,6 +420,15 @@ static int nd_pfn_clear_memmap_errors(struct nd_pfn *nd_pfn)
 	return 0;
 }
 
+/**
+ * nd_pfn_validate - read and validate info-block
+ * @nd_pfn: fsdax namespace runtime state / properties
+ * @sig: 'devdax' or 'fsdax' signature
+ *
+ * Upon return the info-block buffer contents (->pfn_sb) are
+ * indeterminate when validation fails, and a coherent info-block
+ * otherwise.
+ */
 int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
 {
 	u64 checksum, offset;
@@ -565,7 +574,7 @@ int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns)
 	nvdimm_bus_unlock(&ndns->dev);
 	if (!pfn_dev)
 		return -ENOMEM;
-	pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
+	pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
 	nd_pfn = to_nd_pfn(pfn_dev);
 	nd_pfn->pfn_sb = pfn_sb;
 	rc = nd_pfn_validate(nd_pfn, PFN_SIG);
@@ -702,7 +711,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 	u64 checksum;
 	int rc;
 
-	pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
+	pfn_sb = devm_kmalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
 	if (!pfn_sb)
 		return -ENOMEM;
 
@@ -711,11 +720,14 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 		sig = DAX_SIG;
 	else
 		sig = PFN_SIG;
+
 	rc = nd_pfn_validate(nd_pfn, sig);
 	if (rc != -ENODEV)
 		return rc;
 
 	/* no info block, do init */;
+	memset(pfn_sb, 0, sizeof(*pfn_sb));
+
 	nd_region = to_nd_region(nd_pfn->dev.parent);
 	if (nd_region->ro) {
 		dev_info(&nd_pfn->dev,
@@ -768,7 +780,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 	memcpy(pfn_sb->uuid, nd_pfn->uuid, 16);
 	memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16);
 	pfn_sb->version_major = cpu_to_le16(1);
-	pfn_sb->version_minor = cpu_to_le16(2);
+	pfn_sb->version_minor = cpu_to_le16(3);
 	pfn_sb->start_pad = cpu_to_le32(start_pad);
 	pfn_sb->end_trunc = cpu_to_le32(end_trunc);
 	pfn_sb->align = cpu_to_le32(nd_pfn->align);

From patchwork Thu May 2 05:56:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 10926083
Subject: [PATCH v7 12/12] libnvdimm/pfn: Stop padding pmem namespaces to section alignment
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Jeff Moyer, linux-nvdimm@lists.01.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Wed, 01 May 2019 22:56:26 -0700
Message-ID: <155677658661.2336373.9934181067409522929.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155677652226.2336373.8700273400832001094.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Now that the mm core supports section-unaligned hotplug of ZONE_DEVICE
memory, we no longer need to add padding at pfn/dax device creation
time. The kernel will still honor padding established by older
kernels.
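As a standalone illustration of the new granularity, the sketch below
exercises the sub-section rounding helpers from the mmzone.h hunk in
this patch, with assumed x86_64 constants (4K pages, 2MB sub-sections,
so 512 pfns per sub-section; the old code rounded to 32768-pfn
sections):

/* Sketch (not kernel code): SUB_SECTION_ALIGN_{UP,DOWN} rounding. */
#include <stdio.h>

#define PAGES_PER_SUB_SECTION	512UL	/* SECTION_ACTIVE_SIZE / PAGE_SIZE */
#define PAGE_SUB_SECTION_MASK	(~(PAGES_PER_SUB_SECTION - 1))

#define SUB_SECTION_ALIGN_UP(pfn)	(((pfn) + PAGES_PER_SUB_SECTION - 1) \
					 & PAGE_SUB_SECTION_MASK)
#define SUB_SECTION_ALIGN_DOWN(pfn)	((pfn) & PAGE_SUB_SECTION_MASK)

int main(void)
{
	unsigned long pfn = 0x148123;	/* hypothetical namespace base pfn */

	printf("down: %#lx\n", SUB_SECTION_ALIGN_DOWN(pfn));	/* 0x148000 */
	printf("up:   %#lx\n", SUB_SECTION_ALIGN_UP(pfn));	/* 0x148200 */
	return 0;
}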
Reported-by: Jeff Moyer 
Signed-off-by: Dan Williams 
---
 drivers/nvdimm/pfn.h      |   11 ++-----
 drivers/nvdimm/pfn_devs.c |   75 +++++++--------------------------------------
 include/linux/mmzone.h    |    4 ++
 3 files changed, 19 insertions(+), 71 deletions(-)

diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h
index e901e3a3b04c..ae589cc528f2 100644
--- a/drivers/nvdimm/pfn.h
+++ b/drivers/nvdimm/pfn.h
@@ -41,18 +41,13 @@ struct nd_pfn_sb {
 	__le64 checksum;
 };
 
-#ifdef CONFIG_SPARSEMEM
-#define PFN_SECTION_ALIGN_DOWN(x) SECTION_ALIGN_DOWN(x)
-#define PFN_SECTION_ALIGN_UP(x) SECTION_ALIGN_UP(x)
-#else
 /*
  * In this case ZONE_DEVICE=n and we will disable 'pfn' device support,
  * but we still want pmem to compile.
  */
-#define PFN_SECTION_ALIGN_DOWN(x) (x)
-#define PFN_SECTION_ALIGN_UP(x) (x)
+#ifndef SUB_SECTION_ALIGN_DOWN
+#define SUB_SECTION_ALIGN_DOWN(x) (x)
+#define SUB_SECTION_ALIGN_UP(x) (x)
 #endif
-#define PHYS_SECTION_ALIGN_DOWN(x) PFN_PHYS(PFN_SECTION_ALIGN_DOWN(PHYS_PFN(x)))
-#define PHYS_SECTION_ALIGN_UP(x) PFN_PHYS(PFN_SECTION_ALIGN_UP(PHYS_PFN(x)))
 #endif /* __NVDIMM_PFN_H */
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index a2406253eb70..7bdaaf3dc77e 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -595,14 +595,14 @@ static u32 info_block_reserve(void)
 }
 
 /*
- * We hotplug memory at section granularity, pad the reserved area from
- * the previous section base to the namespace base address.
+ * We hotplug memory at sub-section granularity, pad the reserved area
+ * from the previous section base to the namespace base address.
  */
 static unsigned long init_altmap_base(resource_size_t base)
 {
 	unsigned long base_pfn = PHYS_PFN(base);
 
-	return PFN_SECTION_ALIGN_DOWN(base_pfn);
+	return SUB_SECTION_ALIGN_DOWN(base_pfn);
 }
 
 static unsigned long init_altmap_reserve(resource_size_t base)
@@ -610,7 +610,7 @@ static unsigned long init_altmap_reserve(resource_size_t base)
 	unsigned long reserve = info_block_reserve() >> PAGE_SHIFT;
 	unsigned long base_pfn = PHYS_PFN(base);
 
-	reserve += base_pfn - PFN_SECTION_ALIGN_DOWN(base_pfn);
+	reserve += base_pfn - SUB_SECTION_ALIGN_DOWN(base_pfn);
 	return reserve;
 }
 
@@ -641,8 +641,7 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap)
 		nd_pfn->npfns = le64_to_cpu(pfn_sb->npfns);
 		pgmap->altmap_valid = false;
 	} else if (nd_pfn->mode == PFN_MODE_PMEM) {
-		nd_pfn->npfns = PFN_SECTION_ALIGN_UP((resource_size(res)
-					- offset) / PAGE_SIZE);
+		nd_pfn->npfns = PHYS_PFN((resource_size(res) - offset));
 		if (le64_to_cpu(nd_pfn->pfn_sb->npfns) > nd_pfn->npfns)
 			dev_info(&nd_pfn->dev,
 					"number of pfns truncated from %lld to %ld\n",
@@ -658,50 +657,10 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap)
 	return 0;
 }
 
-static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys)
-{
-	return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys),
-			ALIGN_DOWN(phys, nd_pfn->align));
-}
-
-/*
- * Check if pmem collides with 'System RAM', or other regions when
- * section aligned. Trim it accordingly.
- */
-static void trim_pfn_device(struct nd_pfn *nd_pfn, u32 *start_pad, u32 *end_trunc)
-{
-	struct nd_namespace_common *ndns = nd_pfn->ndns;
-	struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
-	struct nd_region *nd_region = to_nd_region(nd_pfn->dev.parent);
-	const resource_size_t start = nsio->res.start;
-	const resource_size_t end = start + resource_size(&nsio->res);
-	resource_size_t adjust, size;
-
-	*start_pad = 0;
-	*end_trunc = 0;
-
-	adjust = start - PHYS_SECTION_ALIGN_DOWN(start);
-	size = resource_size(&nsio->res) + adjust;
-	if (region_intersects(start - adjust, size, IORESOURCE_SYSTEM_RAM,
-				IORES_DESC_NONE) == REGION_MIXED
-			|| nd_region_conflict(nd_region, start - adjust, size))
-		*start_pad = PHYS_SECTION_ALIGN_UP(start) - start;
-
-	/* Now check that end of the range does not collide. */
-	adjust = PHYS_SECTION_ALIGN_UP(end) - end;
-	size = resource_size(&nsio->res) + adjust;
-	if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
-				IORES_DESC_NONE) == REGION_MIXED
-			|| !IS_ALIGNED(end, nd_pfn->align)
-			|| nd_region_conflict(nd_region, start, size))
-		*end_trunc = end - phys_pmem_align_down(nd_pfn, end);
-}
-
 static int nd_pfn_init(struct nd_pfn *nd_pfn)
 {
 	struct nd_namespace_common *ndns = nd_pfn->ndns;
 	struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
-	u32 start_pad, end_trunc, reserve = info_block_reserve();
 	resource_size_t start, size;
 	struct nd_region *nd_region;
 	struct nd_pfn_sb *pfn_sb;
@@ -736,43 +695,35 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 		return -ENXIO;
 	}
 
-	memset(pfn_sb, 0, sizeof(*pfn_sb));
-
-	trim_pfn_device(nd_pfn, &start_pad, &end_trunc);
-	if (start_pad + end_trunc)
-		dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n",
-				dev_name(&ndns->dev), start_pad + end_trunc);
-
 	/*
 	 * Note, we use 64 here for the standard size of struct page,
 	 * debugging options may cause it to be larger in which case the
 	 * implementation will limit the pfns advertised through
 	 * ->direct_access() to those that are included in the memmap.
 	 */
-	start = nsio->res.start + start_pad;
+	start = nsio->res.start;
 	size = resource_size(&nsio->res);
-	npfns = PFN_SECTION_ALIGN_UP((size - start_pad - end_trunc - reserve)
-			/ PAGE_SIZE);
+	npfns = PHYS_PFN(size - SZ_8K);
 	if (nd_pfn->mode == PFN_MODE_PMEM) {
 		/*
 		 * The altmap should be padded out to the block size used
 		 * when populating the vmemmap. This *should* be equal to
 		 * PMD_SIZE for most architectures.
 		 */
-		offset = ALIGN(start + reserve + 64 * npfns,
-				max(nd_pfn->align, PMD_SIZE)) - start;
+		offset = ALIGN(start + SZ_8K + 64 * npfns,
+				max(nd_pfn->align, SECTION_ACTIVE_SIZE)) - start;
 	} else if (nd_pfn->mode == PFN_MODE_RAM)
-		offset = ALIGN(start + reserve, nd_pfn->align) - start;
+		offset = ALIGN(start + SZ_8K, nd_pfn->align) - start;
 	else
 		return -ENXIO;
 
-	if (offset + start_pad + end_trunc >= size) {
+	if (offset >= size) {
 		dev_err(&nd_pfn->dev, "%s unable to satisfy requested alignment\n",
 				dev_name(&ndns->dev));
 		return -ENXIO;
 	}
 
-	npfns = (size - offset - start_pad - end_trunc) / SZ_4K;
+	npfns = PHYS_PFN(size - offset);
 	pfn_sb->mode = cpu_to_le32(nd_pfn->mode);
 	pfn_sb->dataoff = cpu_to_le64(offset);
 	pfn_sb->npfns = cpu_to_le64(npfns);
@@ -781,8 +732,6 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 	memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16);
 	pfn_sb->version_major = cpu_to_le16(1);
 	pfn_sb->version_minor = cpu_to_le16(3);
-	pfn_sb->start_pad = cpu_to_le32(start_pad);
-	pfn_sb->end_trunc = cpu_to_le32(end_trunc);
 	pfn_sb->align = cpu_to_le32(nd_pfn->align);
 	checksum = nd_sb_checksum((struct nd_gen_sb *) pfn_sb);
 	pfn_sb->checksum = cpu_to_le64(checksum);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 3237c5e456df..d2445c483ad4 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1155,6 +1155,10 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
 #define PAGES_PER_SUB_SECTION (SECTION_ACTIVE_SIZE / PAGE_SIZE)
 #define PAGE_SUB_SECTION_MASK (~(PAGES_PER_SUB_SECTION-1))
 
+#define SUB_SECTION_ALIGN_UP(pfn) (((pfn) + PAGES_PER_SUB_SECTION - 1) \
+	& PAGE_SUB_SECTION_MASK)
+#define SUB_SECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SUB_SECTION_MASK)
+
 struct mem_section_usage {
 	/*
 	 * SECTION_ACTIVE_SIZE portions of the section that are populated in