From patchwork Wed Jun 19 05:51:38 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 11003455
Subject: [PATCH v10 01/13] mm/sparsemem: Introduce struct mem_section_usage
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Pavel Tatashin,
 Oscar Salvador, Wei Yang, linux-mm@kvack.org, linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:51:38 -0700
Message-ID: <156092349845.979959.73333291612799019.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Towards enabling memory hotplug to track partial population of a
section, introduce 'struct mem_section_usage'.

A pointer to a 'struct mem_section_usage' instance replaces the existing
pointer to a 'pageblock_flags' bitmap. Effectively it adds one more
'unsigned long' beyond the 'pageblock_flags' (usemap) allocation to
house a new 'subsection_map' bitmap. The new bitmap enables the memory
hot{plug,remove} implementation to act on incremental sub-divisions of a
section.

SUBSECTION_SHIFT is defined as a global constant, rather than a
per-architecture value like SECTION_SIZE_BITS, to allow cross-arch
compatibility of subsection users. Specifically, a common subsection
size allows persistent memory namespace configurations to be made
compatible across architectures.

The primary motivation for this functionality is to support platforms
that mix "System RAM" and "Persistent Memory" within a single section,
or multiple PMEM ranges with different mapping lifetimes within a single
section. The section restriction for hotplug has caused an ongoing saga
of hacks and bugs for devm_memremap_pages() users.

Beyond the fixups to teach existing paths how to retrieve the 'usemap'
from a section, and updates to the usemap allocation path, there are no
expected behavior changes.

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Cc: Pavel Tatashin
Reviewed-by: Oscar Salvador
Reviewed-by: Wei Yang
Signed-off-by: Dan Williams
---
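Illustration only, not part of the patch: a standalone sketch that
mirrors the new macros with x86_64's assumed constants (PAGE_SHIFT=12,
SECTION_SIZE_BITS=27) to show how the 'subsection_map' bitmap is sized.
At these values it costs exactly one 'unsigned long' per section.

#include <stdio.h>

#define PAGE_SHIFT              12
#define SECTION_SIZE_BITS       27      /* 128M sections (x86_64 default) */
#define SUBSECTION_SHIFT        21      /* 2M subsections, cross-arch constant */
#define PFN_SUBSECTION_SHIFT    (SUBSECTION_SHIFT - PAGE_SHIFT)
#define PAGES_PER_SUBSECTION    (1UL << PFN_SUBSECTION_SHIFT)
#define SUBSECTIONS_PER_SECTION (1UL << (SECTION_SIZE_BITS - SUBSECTION_SHIFT))

int main(void)
{
        printf("pages per subsection:    %lu\n", PAGES_PER_SUBSECTION);    /* 512 */
        printf("subsections per section: %lu\n", SUBSECTIONS_PER_SECTION); /* 64 */
        /* 64 bits of bitmap round up to a single 8-byte unsigned long */
        printf("subsection_map bytes:    %lu\n",
               (SUBSECTIONS_PER_SECTION + 7) / 8);                         /* 8 */
        return 0;
}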
 include/linux/mmzone.h |   28 +++++++++++++++--
 mm/memory_hotplug.c    |   18 ++++++-----
 mm/page_alloc.c        |    2 +
 mm/sparse.c            |   81 ++++++++++++++++++++++++------------------------
 4 files changed, 76 insertions(+), 53 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 427b79c39b3c..179680c94262 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1161,6 +1161,24 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
 #define SECTION_ALIGN_UP(pfn)	(((pfn) + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK)
 #define SECTION_ALIGN_DOWN(pfn)	((pfn) & PAGE_SECTION_MASK)
 
+#define SUBSECTION_SHIFT 21
+
+#define PFN_SUBSECTION_SHIFT (SUBSECTION_SHIFT - PAGE_SHIFT)
+#define PAGES_PER_SUBSECTION (1UL << PFN_SUBSECTION_SHIFT)
+#define PAGE_SUBSECTION_MASK (~(PAGES_PER_SUBSECTION-1))
+
+#if SUBSECTION_SHIFT > SECTION_SIZE_BITS
+#error Subsection size exceeds section size
+#else
+#define SUBSECTIONS_PER_SECTION (1UL << (SECTION_SIZE_BITS - SUBSECTION_SHIFT))
+#endif
+
+struct mem_section_usage {
+	DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION);
+	/* See declaration of similar field in struct zone */
+	unsigned long pageblock_flags[0];
+};
+
 struct page;
 struct page_ext;
 struct mem_section {
@@ -1178,8 +1196,7 @@ struct mem_section {
 	 */
 	unsigned long section_mem_map;
 
-	/* See declaration of similar field in struct zone */
-	unsigned long *pageblock_flags;
+	struct mem_section_usage *usage;
 #ifdef CONFIG_PAGE_EXTENSION
 	/*
 	 * If SPARSEMEM, pgdat doesn't have page_ext pointer. We use
@@ -1210,6 +1227,11 @@ extern struct mem_section **mem_section;
 extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
 #endif
 
+static inline unsigned long *section_to_usemap(struct mem_section *ms)
+{
+	return ms->usage->pageblock_flags;
+}
+
 static inline struct mem_section *__nr_to_section(unsigned long nr)
 {
 #ifdef CONFIG_SPARSEMEM_EXTREME
@@ -1221,7 +1243,7 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
 	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
 }
 extern int __section_nr(struct mem_section* ms);
-extern unsigned long usemap_size(void);
+extern size_t mem_section_usage_size(void);
 
 /*
  * We use the lower bits of the mem_map pointer to store
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a88c5f334e5a..7b963c2d3a0d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -166,9 +166,10 @@ void put_page_bootmem(struct page *page)
 #ifndef CONFIG_SPARSEMEM_VMEMMAP
 static void register_page_bootmem_info_section(unsigned long start_pfn)
 {
-	unsigned long *usemap, mapsize, section_nr, i;
+	unsigned long mapsize, section_nr, i;
 	struct mem_section *ms;
 	struct page *page, *memmap;
+	struct mem_section_usage *usage;
 
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);
@@ -188,10 +189,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, SECTION_INFO);
 
-	usemap = ms->pageblock_flags;
-	page = virt_to_page(usemap);
+	usage = ms->usage;
+	page = virt_to_page(usage);
 
-	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+	mapsize = PAGE_ALIGN(mem_section_usage_size()) >> PAGE_SHIFT;
 
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
@@ -200,9 +201,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 #else /* CONFIG_SPARSEMEM_VMEMMAP */
 static void register_page_bootmem_info_section(unsigned long start_pfn)
 {
-	unsigned long *usemap, mapsize, section_nr, i;
+	unsigned long mapsize, section_nr, i;
 	struct mem_section *ms;
 	struct page *page, *memmap;
+	struct mem_section_usage *usage;
 
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);
@@ -211,10 +213,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 
 	register_page_bootmem_memmap(section_nr, memmap, PAGES_PER_SECTION);
 
-	usemap = ms->pageblock_flags;
-	page = virt_to_page(usemap);
+	usage = ms->usage;
+	page = virt_to_page(usage);
 
-	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+	mapsize = PAGE_ALIGN(mem_section_usage_size()) >> PAGE_SHIFT;
 
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f4651a09948c..8cc091e87200 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -403,7 +403,7 @@ static inline unsigned long *get_pageblock_bitmap(struct page *page,
 							unsigned long pfn)
 {
 #ifdef CONFIG_SPARSEMEM
-	return __pfn_to_section(pfn)->pageblock_flags;
+	return section_to_usemap(__pfn_to_section(pfn));
 #else
 	return page_zone(page)->pageblock_flags;
 #endif /* CONFIG_SPARSEMEM */
diff --git a/mm/sparse.c b/mm/sparse.c
index 1552c855d62a..71da15cc7432 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -288,33 +288,31 @@ struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum)
 
 static void __meminit sparse_init_one_section(struct mem_section *ms,
 		unsigned long pnum, struct page *mem_map,
-		unsigned long *pageblock_bitmap)
+		struct mem_section_usage *usage)
 {
 	ms->section_mem_map &= ~SECTION_MAP_MASK;
 	ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum) |
 							SECTION_HAS_MEM_MAP;
-	ms->pageblock_flags = pageblock_bitmap;
+	ms->usage = usage;
 }
 
-unsigned long usemap_size(void)
+static unsigned long usemap_size(void)
 {
 	return BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS) * sizeof(unsigned long);
 }
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-static unsigned long *__kmalloc_section_usemap(void)
+size_t mem_section_usage_size(void)
 {
-	return kmalloc(usemap_size(), GFP_KERNEL);
+	return sizeof(struct mem_section_usage) + usemap_size();
 }
-#endif /* CONFIG_MEMORY_HOTPLUG */
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-static unsigned long * __init
+static struct mem_section_usage * __init
 sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 					 unsigned long size)
 {
+	struct mem_section_usage *usage;
 	unsigned long goal, limit;
-	unsigned long *p;
 	int nid;
 	/*
 	 * A page may contain usemaps for other sections preventing the
@@ -330,15 +328,16 @@ sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 	limit = goal + (1UL << PA_SECTION_SHIFT);
 	nid = early_pfn_to_nid(goal >> PAGE_SHIFT);
 again:
-	p = memblock_alloc_try_nid(size, SMP_CACHE_BYTES, goal, limit, nid);
-	if (!p && limit) {
+	usage = memblock_alloc_try_nid(size, SMP_CACHE_BYTES, goal, limit, nid);
+	if (!usage && limit) {
 		limit = 0;
 		goto again;
 	}
-	return p;
+	return usage;
 }
 
-static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
+static void __init check_usemap_section_nr(int nid,
+		struct mem_section_usage *usage)
 {
 	unsigned long usemap_snr, pgdat_snr;
 	static unsigned long old_usemap_snr;
@@ -352,7 +351,7 @@ static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
 		old_pgdat_snr = NR_MEM_SECTIONS;
 	}
 
-	usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
+	usemap_snr = pfn_to_section_nr(__pa(usage) >> PAGE_SHIFT);
 	pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
 	if (usemap_snr == pgdat_snr)
 		return;
@@ -380,14 +379,15 @@ static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
 		usemap_snr, pgdat_snr, nid);
 }
 #else
-static unsigned long * __init
+static struct mem_section_usage * __init
 sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 					 unsigned long size)
 {
 	return memblock_alloc_node(size, SMP_CACHE_BYTES, pgdat->node_id);
 }
 
-static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
+static void __init check_usemap_section_nr(int nid,
+		struct mem_section_usage *usage)
 {
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
@@ -474,14 +474,13 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 				   unsigned long pnum_end,
 				   unsigned long map_count)
 {
-	unsigned long pnum, usemap_longs, *usemap;
+	struct mem_section_usage *usage;
+	unsigned long pnum;
 	struct page *map;
 
-	usemap_longs = BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS);
-	usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
-							  usemap_size() *
-							  map_count);
-	if (!usemap) {
+	usage = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
+			mem_section_usage_size() * map_count);
+	if (!usage) {
 		pr_err("%s: node[%d] usemap allocation failed", __func__, nid);
 		goto failed;
 	}
@@ -497,9 +496,9 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 			pnum_begin = pnum;
 			goto failed;
 		}
-		check_usemap_section_nr(nid, usemap);
-		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usemap);
-		usemap += usemap_longs;
+		check_usemap_section_nr(nid, usage);
+		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage);
+		usage = (void *) usage + mem_section_usage_size();
 	}
 	sparse_buffer_fini();
 	return;
@@ -697,9 +696,9 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 				     struct vmem_altmap *altmap)
 {
 	unsigned long section_nr = pfn_to_section_nr(start_pfn);
+	struct mem_section_usage *usage;
 	struct mem_section *ms;
 	struct page *memmap;
-	unsigned long *usemap;
 	int ret;
 
 	/*
@@ -713,8 +712,8 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	memmap = kmalloc_section_memmap(section_nr, nid, altmap);
 	if (!memmap)
 		return -ENOMEM;
-	usemap = __kmalloc_section_usemap();
-	if (!usemap) {
+	usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
+	if (!usage) {
 		__kfree_section_memmap(memmap, altmap);
 		return -ENOMEM;
 	}
@@ -732,11 +731,11 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION);
 	section_mark_present(ms);
-	sparse_init_one_section(ms, section_nr, memmap, usemap);
+	sparse_init_one_section(ms, section_nr, memmap, usage);
 
 out:
 	if (ret < 0) {
-		kfree(usemap);
+		kfree(usage);
 		__kfree_section_memmap(memmap, altmap);
 	}
 	return ret;
@@ -772,20 +771,20 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 }
 #endif
 
-static void free_section_usemap(struct page *memmap, unsigned long *usemap,
-		struct vmem_altmap *altmap)
+static void free_section_usage(struct page *memmap,
+		struct mem_section_usage *usage, struct vmem_altmap *altmap)
 {
-	struct page *usemap_page;
+	struct page *usage_page;
 
-	if (!usemap)
+	if (!usage)
 		return;
 
-	usemap_page = virt_to_page(usemap);
+	usage_page = virt_to_page(usage);
 	/*
 	 * Check to see if allocation came from hot-plug-add
 	 */
-	if (PageSlab(usemap_page) || PageCompound(usemap_page)) {
-		kfree(usemap);
+	if (PageSlab(usage_page) || PageCompound(usage_page)) {
+		kfree(usage);
 		if (memmap)
 			__kfree_section_memmap(memmap, altmap);
 		return;
@@ -804,18 +803,18 @@ void sparse_remove_one_section(struct mem_section *ms, unsigned long map_offset,
 			       struct vmem_altmap *altmap)
 {
 	struct page *memmap = NULL;
-	unsigned long *usemap = NULL;
+	struct mem_section_usage *usage = NULL;
 
 	if (ms->section_mem_map) {
-		usemap = ms->pageblock_flags;
+		usage = ms->usage;
 		memmap = sparse_decode_mem_map(ms->section_mem_map,
 					       __section_nr(ms));
 		ms->section_mem_map = 0;
-		ms->pageblock_flags = NULL;
+		ms->usage = NULL;
 	}
 
 	clear_hwpoisoned_pages(memmap + map_offset,
 			       PAGES_PER_SECTION - map_offset);
-	free_section_usemap(memmap, usemap, altmap);
+	free_section_usage(memmap, usage, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
From patchwork Wed Jun 19 05:51:43 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 11003459
Subject: [PATCH v10 02/13] mm/sparsemem: Introduce a SECTION_IS_EARLY flag
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Qian Cai, Michal Hocko, Logan Gunthorpe, David Hildenbrand,
 Oscar Salvador, Pavel Tatashin, linux-mm@kvack.org,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:51:43 -0700
Message-ID: <156092350358.979959.5817209875548072819.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

In preparation for sub-section hotplug, track whether a given section
was created during early memory initialization, or later via memory
hotplug. This distinction is needed to maintain the coarse expectation
that pfn_valid() returns true for any pfn within a given section even if
that section has pages that are reserved from the page allocator.

For example, one of the goals of subsection hotplug is to support cases
where the system physical memory layout places "System RAM" and PMEM
within the same section. Several pfn_valid() users expect to just check
if a section is valid, but they are not careful to check if the given
pfn is within a "System RAM" boundary and instead expect pgdat
information to further validate the pfn.

Rather than unwind those paths to make their pfn_valid() queries more
precise, a follow-on patch uses the SECTION_IS_EARLY flag to maintain
the traditional expectation that pfn_valid() returns true for all early
sections.

Link: https://lore.kernel.org/lkml/1560366952-10660-1-git-send-email-cai@lca.pw/
Reported-by: Qian Cai
Cc: Michal Hocko
Cc: Logan Gunthorpe
Cc: David Hildenbrand
Cc: Oscar Salvador
Cc: Pavel Tatashin
Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
---
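Illustration only, not part of the patch: a standalone sketch of the
flag encoding, with the bit values copied from the hunk below. State
flags live in the low bits of 'section_mem_map' underneath the encoded
mem_map pointer, so SECTION_IS_EARLY claims the next free bit and widens
SECTION_MAP_MASK by one; the pointer value in main() is made up for the
demonstration.

#include <stdio.h>

#define SECTION_MARKED_PRESENT  (1UL<<0)
#define SECTION_HAS_MEM_MAP     (1UL<<1)
#define SECTION_IS_ONLINE       (1UL<<2)
#define SECTION_IS_EARLY        (1UL<<3)
#define SECTION_MAP_LAST_BIT    (1UL<<4)
#define SECTION_MAP_MASK        (~(SECTION_MAP_LAST_BIT-1))

int main(void)
{
        /* A boot-created section: encoded mem_map pointer plus flag bits */
        unsigned long section_mem_map = 0xffff888000000000UL
                | SECTION_MARKED_PRESENT | SECTION_HAS_MEM_MAP
                | SECTION_IS_EARLY;

        printf("early: %d\n", !!(section_mem_map & SECTION_IS_EARLY)); /* 1 */
        printf("mem_map bits: %#lx\n", section_mem_map & SECTION_MAP_MASK);
        return 0;
}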
 include/linux/mmzone.h |    8 +++++++-
 mm/sparse.c            |   20 +++++++++-----------
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 179680c94262..d081c9a1d25d 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1261,7 +1261,8 @@ extern size_t mem_section_usage_size(void);
 #define SECTION_MARKED_PRESENT	(1UL<<0)
 #define SECTION_HAS_MEM_MAP	(1UL<<1)
 #define SECTION_IS_ONLINE	(1UL<<2)
-#define SECTION_MAP_LAST_BIT	(1UL<<3)
+#define SECTION_IS_EARLY	(1UL<<3)
+#define SECTION_MAP_LAST_BIT	(1UL<<4)
 #define SECTION_MAP_MASK	(~(SECTION_MAP_LAST_BIT-1))
 #define SECTION_NID_SHIFT	3
 
@@ -1287,6 +1288,11 @@ static inline int valid_section(struct mem_section *section)
 	return (section && (section->section_mem_map & SECTION_HAS_MEM_MAP));
 }
 
+static inline int early_section(struct mem_section *section)
+{
+	return (section && (section->section_mem_map & SECTION_IS_EARLY));
+}
+
 static inline int valid_section_nr(unsigned long nr)
 {
 	return valid_section(__nr_to_section(nr));
diff --git a/mm/sparse.c b/mm/sparse.c
index 71da15cc7432..2031a0694f35 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -288,11 +288,11 @@ struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum)
 
 static void __meminit sparse_init_one_section(struct mem_section *ms,
 		unsigned long pnum, struct page *mem_map,
-		struct mem_section_usage *usage)
+		struct mem_section_usage *usage, unsigned long flags)
 {
 	ms->section_mem_map &= ~SECTION_MAP_MASK;
-	ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum) |
-							SECTION_HAS_MEM_MAP;
+	ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum)
+		| SECTION_HAS_MEM_MAP | flags;
 	ms->usage = usage;
 }
 
@@ -497,7 +497,8 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 			goto failed;
 		}
 		check_usemap_section_nr(nid, usage);
-		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage);
+		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage,
+				SECTION_IS_EARLY);
 		usage = (void *) usage + mem_section_usage_size();
 	}
 	sparse_buffer_fini();
@@ -731,7 +732,7 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION);
 	section_mark_present(ms);
 
-	sparse_init_one_section(ms, section_nr, memmap, usage);
+	sparse_init_one_section(ms, section_nr, memmap, usage, 0);
 
 out:
 	if (ret < 0) {
@@ -771,19 +772,16 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 }
 #endif
 
-static void free_section_usage(struct page *memmap,
+static void free_section_usage(struct mem_section *ms, struct page *memmap,
 		struct mem_section_usage *usage, struct vmem_altmap *altmap)
 {
-	struct page *usage_page;
-
 	if (!usage)
 		return;
 
-	usage_page = virt_to_page(usage);
 	/*
 	 * Check to see if allocation came from hot-plug-add
 	 */
-	if (PageSlab(usage_page) || PageCompound(usage_page)) {
+	if (!early_section(ms)) {
 		kfree(usage);
 		if (memmap)
 			__kfree_section_memmap(memmap, altmap);
@@ -815,6 +813,6 @@ void sparse_remove_one_section(struct mem_section *ms, unsigned long map_offset,
 
 	clear_hwpoisoned_pages(memmap + map_offset,
 			       PAGES_PER_SECTION - map_offset);
-	free_section_usage(memmap, usage, altmap);
+	free_section_usage(ms, memmap, usage, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */

From patchwork Wed Jun 19 05:51:48 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 11003463
Subject: [PATCH v10 03/13] mm/sparsemem: Add helpers to track active portions of a section at boot
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Oscar Salvador,
 Pavel Tatashin, Qian Cai, Jane Chu, linux-mm@kvack.org,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:51:48 -0700
Message-ID: <156092350874.979959.18185938451405518285.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Prepare for hot{plug,remove} of sub-ranges of a section by tracking a
sub-section active bitmask, each bit representing a PMD_SIZE span of the
architecture's memory hotplug section size.

The implication of a partially populated section is that pfn_valid()
needs to go beyond a valid_section() check and either determine that the
section is an "early section", or read the sub-section active ranges
from the bitmask. The expectation is that the bitmask (subsection_map)
fits in the same cacheline as the valid_section() / early_section()
data, so the incremental performance overhead to pfn_valid() should be
negligible.

The rationale for using early_section() to short-circuit the
subsection_map check is that there are legacy code paths that use
pfn_valid() at section granularity before validating the pfn against
pgdat data. So, the early_section() check allows those traditional
assumptions to persist while also permitting subsection_map to tell the
truth for purposes of populating the unused portions of early sections
with PMEM and other ZONE_DEVICE mappings.

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Cc: Oscar Salvador
Cc: Pavel Tatashin
Reported-by: Qian Cai
Tested-by: Jane Chu
Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
---
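Illustration only, not part of the patch: a standalone sketch of the
subsection lookup arithmetic, reusing the assumed x86_64 constants from
the note on patch 1 (4K pages, 128M sections, 2M subsections). The pfn
value is arbitrary, chosen to land on the second subsection bit.

#include <stdio.h>

#define PAGE_SHIFT              12
#define SECTION_SIZE_BITS       27
#define SUBSECTION_SHIFT        21
#define PFN_SECTION_SHIFT       (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION       (1UL << PFN_SECTION_SHIFT)
#define PAGE_SECTION_MASK       (~(PAGES_PER_SECTION-1))
#define PAGES_PER_SUBSECTION    (1UL << (SUBSECTION_SHIFT - PAGE_SHIFT))

/* Same arithmetic as the new subsection_map_index() helper */
static int subsection_map_index(unsigned long pfn)
{
        return (pfn & ~(PAGE_SECTION_MASK)) / PAGES_PER_SUBSECTION;
}

int main(void)
{
        /* pfn 0x8200: section 1, in-section offset 512 pages -> bit 1 */
        printf("section: %lu, subsection bit: %d\n",
               0x8200UL >> PFN_SECTION_SHIFT, subsection_map_index(0x8200UL));
        return 0;
}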
 include/linux/mmzone.h |   33 ++++++++++++++++++++++++++++++++-
 mm/page_alloc.c        |   10 ++++++++--
 mm/sparse.c            |   35 +++++++++++++++++++++++++++++++++++
 3 files changed, 75 insertions(+), 3 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d081c9a1d25d..c4e8843e283c 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1179,6 +1179,8 @@ struct mem_section_usage {
 	unsigned long pageblock_flags[0];
 };
 
+void subsection_map_init(unsigned long pfn, unsigned long nr_pages);
+
 struct page;
 struct page_ext;
 struct mem_section {
@@ -1322,12 +1324,40 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn)
 
 extern int __highest_present_section_nr;
 
+static inline int subsection_map_index(unsigned long pfn)
+{
+	return (pfn & ~(PAGE_SECTION_MASK)) / PAGES_PER_SUBSECTION;
+}
+
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
+{
+	int idx = subsection_map_index(pfn);
+
+	return test_bit(idx, ms->usage->subsection_map);
+}
+#else
+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
+{
+	return 1;
+}
+#endif
+
 #ifndef CONFIG_HAVE_ARCH_PFN_VALID
 static inline int pfn_valid(unsigned long pfn)
 {
+	struct mem_section *ms;
+
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
-	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
+	ms = __nr_to_section(pfn_to_section_nr(pfn));
+	if (!valid_section(ms))
+		return 0;
+	/*
+	 * Traditionally early sections always returned pfn_valid() for
+	 * the entire section-sized span.
+	 */
+	return early_section(ms) || pfn_section_valid(ms, pfn);
 }
 #endif
 
@@ -1359,6 +1389,7 @@ void sparse_init(void);
 #define sparse_init()	do {} while (0)
 #define sparse_index_init(_sec, _nid)  do {} while (0)
 #define pfn_present pfn_valid
+#define subsection_map_init(_pfn, _nr_pages) do {} while (0)
 #endif /* CONFIG_SPARSEMEM */
 
 /*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8cc091e87200..8e7215fb6976 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7306,12 +7306,18 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 			(u64)zone_movable_pfn[i] << PAGE_SHIFT);
 	}
 
-	/* Print out the early node map */
+	/*
+	 * Print out the early node map, and initialize the
+	 * subsection-map relative to active online memory ranges to
+	 * enable future "sub-section" extensions of the memory map.
+	 */
 	pr_info("Early memory node ranges\n");
-	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
+	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
 		pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
 			(u64)start_pfn << PAGE_SHIFT,
 			((u64)end_pfn << PAGE_SHIFT) - 1);
+		subsection_map_init(start_pfn, end_pfn - start_pfn);
+	}
 
 	/* Initialise every node */
 	mminit_verify_pageflags_layout();
diff --git a/mm/sparse.c b/mm/sparse.c
index 2031a0694f35..e9fec3c2f7ec 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -210,6 +210,41 @@ static inline unsigned long first_present_section_nr(void)
 	return next_present_section_nr(-1);
 }
 
+void subsection_mask_set(unsigned long *map, unsigned long pfn,
+		unsigned long nr_pages)
+{
+	int idx = subsection_map_index(pfn);
+	int end = subsection_map_index(pfn + nr_pages - 1);
+
+	bitmap_set(map, idx, end - idx + 1);
+}
+
+void __init subsection_map_init(unsigned long pfn, unsigned long nr_pages)
+{
+	int end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
+	int i, start_sec = pfn_to_section_nr(pfn);
+
+	if (!nr_pages)
+		return;
+
+	for (i = start_sec; i <= end_sec; i++) {
+		struct mem_section *ms;
+		unsigned long pfns;
+
+		pfns = min(nr_pages, PAGES_PER_SECTION
+				- (pfn & ~PAGE_SECTION_MASK));
+		ms = __nr_to_section(i);
+		subsection_mask_set(ms->usage->subsection_map, pfn, pfns);
+
+		pr_debug("%s: sec: %d pfns: %ld set(%d, %d)\n", __func__, i,
+				pfns, subsection_map_index(pfn),
+				subsection_map_index(pfn + pfns - 1));
+
+		pfn += pfns;
+		nr_pages -= pfns;
+	}
+}
+
 /* Record a memory area against a node. */
 void __init memory_present(int nid, unsigned long start, unsigned long end)
 {

From patchwork Wed Jun 19 05:51:55 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 11003467
Subject: [PATCH v10 04/13] mm/hotplug: Prepare shrink_{zone,pgdat}_span for sub-section removal
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Pavel Tatashin,
 Oscar Salvador, linux-mm@kvack.org, linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:51:55 -0700
Message-ID: <156092351496.979959.12703722803097017492.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Sub-section hotplug support reduces the unit of operation of hotplug
from section-sized units (PAGES_PER_SECTION) to sub-section-sized units
(PAGES_PER_SUBSECTION). Teach shrink_{zone,pgdat}_span() to consider
PAGES_PER_SUBSECTION boundaries as the points where pfn_valid(), not
valid_section(), can toggle.

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Reviewed-by: Pavel Tatashin
Reviewed-by: Oscar Salvador
Signed-off-by: Dan Williams
---
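Illustration only, not part of the patch: a standalone model of the new
walk granularity. Probing validity every PAGES_PER_SUBSECTION pfns finds
the first backed 2M span in a candidate hole, where the old
section-stepped walk could only see 128M units. Both helper names and
the pretend validity cutoff are hypothetical.

#include <stdio.h>

#define PAGES_PER_SUBSECTION    512UL   /* 2M / 4K, as in the sketches above */

/* Stand-in for pfn_valid(): pretend only pfns >= 0x8200 are backed. */
static int model_pfn_valid(unsigned long pfn)
{
        return pfn >= 0x8200;
}

static unsigned long first_backed_pfn(unsigned long start, unsigned long end)
{
        unsigned long pfn;

        for (pfn = start; pfn < end; pfn += PAGES_PER_SUBSECTION)
                if (model_pfn_valid(pfn))
                        return pfn;
        return end;     /* the whole range is a hole */
}

int main(void)
{
        printf("first backed pfn: %#lx\n",
               first_backed_pfn(0x8000, 0x10000));      /* 0x8200 */
        return 0;
}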
 mm/memory_hotplug.c |   29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7b963c2d3a0d..647859a1d119 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -318,12 +318,8 @@ static unsigned long find_smallest_section_pfn(int nid, struct zone *zone,
 				     unsigned long start_pfn,
 				     unsigned long end_pfn)
 {
-	struct mem_section *ms;
-
-	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(start_pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SUBSECTION) {
+		if (unlikely(!pfn_valid(start_pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(start_pfn) != nid))
@@ -343,15 +339,12 @@ static unsigned long find_biggest_section_pfn(int nid, struct zone *zone,
 				    unsigned long start_pfn,
 				    unsigned long end_pfn)
 {
-	struct mem_section *ms;
 	unsigned long pfn;
 
 	/* pfn is the end pfn of a memory section. */
 	pfn = end_pfn - 1;
-	for (; pfn >= start_pfn; pfn -= PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; pfn >= start_pfn; pfn -= PAGES_PER_SUBSECTION) {
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(pfn) != nid))
@@ -373,7 +366,6 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 	unsigned long z = zone_end_pfn(zone); /* zone_end_pfn namespace clash */
 	unsigned long zone_end_pfn = z;
 	unsigned long pfn;
-	struct mem_section *ms;
 	int nid = zone_to_nid(zone);
 
 	zone_span_writelock(zone);
@@ -410,10 +402,8 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 	 * it check the zone has only hole or not.
 	 */
 	pfn = zone_start_pfn;
-	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SUBSECTION) {
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (page_zone(pfn_to_page(pfn)) != zone)
@@ -441,7 +431,6 @@ static void shrink_pgdat_span(struct pglist_data *pgdat,
 	unsigned long p = pgdat_end_pfn(pgdat); /* pgdat_end_pfn namespace clash */
 	unsigned long pgdat_end_pfn = p;
 	unsigned long pfn;
-	struct mem_section *ms;
 	int nid = pgdat->node_id;
 
 	if (pgdat_start_pfn == start_pfn) {
@@ -478,10 +467,8 @@ static void shrink_pgdat_span(struct pglist_data *pgdat,
 	 * has only hole or not.
 	 */
 	pfn = pgdat_start_pfn;
-	for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SUBSECTION) {
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (pfn_to_nid(pfn) != nid)

From patchwork Wed Jun 19 05:52:00 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 11003471
with SMTP id s22so9231848plp.5 for ; Tue, 18 Jun 2019 23:06:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:subject:from :to:cc:date:message-id:in-reply-to:references:user-agent :mime-version:content-transfer-encoding; bh=yxskwsT7F/CvvQDzs8B0qeKCdtu7eZAf48CoDQUIXNc=; b=EftMepFxAIB9lGrV8PdSF5vr4V2LH/w0A7CrfuZyH9BeARC5KXsI9HXbWuXwuAdXpn pa8V0OQoTW0RJn4sFX5A2MFrpQdZeWTiUjcq79uhQtH8RKUyuDqr/mEnMjvuDZpJj0QC YV9TMo9tpWjAfv5cwlgIQC2sjQo7mw6LNCzS2gib8Du6ntQO0q/6AMehkhma0AOWHJpV qZ9N9/W/S3+t/IMBKUq6aIjVS2CJ/rTiOZggqgPaGWBVhpb6DXQwW1nYp92TTYtV9uQs mjeo77H7iXnNI2FGPS9BaE/398ztc2eZIABpjCBHGS/yLi9etheejR90NHPpR8dSTqb0 KtNQ== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of dan.j.williams@intel.com designates 192.55.52.93 as permitted sender) smtp.mailfrom=dan.j.williams@intel.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Gm-Message-State: APjAAAWz2BTJhMPBRUTRDOCgZwxOPJMWLeszNryxl7nX8wTQvE/OMmXd cNTH//CrjYqcyFkFKHCWx3eGimACTrlKE/AaUmUhJknk3Cs6/PSwHk40z7p9+Cd5UVJVoNHKfpg vpLdwE3YavzrnlyYsLVylT41Qh6IAsGMYwDn1soa0kjkXbyw6+dQf8M74Kb1Rleui2w== X-Received: by 2002:a17:902:7883:: with SMTP id q3mr116201963pll.89.1560924379194; Tue, 18 Jun 2019 23:06:19 -0700 (PDT) X-Google-Smtp-Source: APXvYqxU0ONqp7FOlCb8q9ZASL6ENrThM/Di3TFQMHtyLiFawKo6JNc5H4f6p1bot3Gb+tL9i7KO X-Received: by 2002:a17:902:7883:: with SMTP id q3mr116201902pll.89.1560924378316; Tue, 18 Jun 2019 23:06:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560924378; cv=none; d=google.com; s=arc-20160816; b=HGECL9qYmYkGAMI4Vq48MNawiQ1UPWKMN1H6W+4KJAkNt2lgKwBQERFJWazNgDfTxT /hYy/hS9tlCwM+asOx1jklOJg4rTuKo0xWB4cXAgUa3LCPgTSbG0umXfqYslZHspJPvk aCdsBDowR5j/n2t5bB0SOek7OhCUnuhafyvc3pe3dKpVmmwf2Fywp4LFKUE5BZoIyUCQ 9/7ujcEEe1nVsXAN/ooqzT0JbR635WqCEsuGu4i/1uWTjAJZ1+5weYv238VD5n+waaGa LFte7VYXaKoYjqJsoCiSBT6zmaDbABWlwpqCDrc/6Z32NP78bxWY8TyDK8uw367zbe2y r4XQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:user-agent:references :in-reply-to:message-id:date:cc:to:from:subject; bh=yxskwsT7F/CvvQDzs8B0qeKCdtu7eZAf48CoDQUIXNc=; b=dvRxm5brPu/oAKp98Hz2WRz8l1yibO8JWOdaLesuWt4vFPn/FcYhzXs/jGENpKmVVH wydoR2Zjp5QUaTBMIUM2yJkagFOntOTYDRIU6xYMlcjLTAgfUQk56I2DM5oz6e8jnodS /7ZKrWt5M2i48MNCz1CozJQTqkwTYwPDvsPaonyaOcEp79KR4GyG05r/KqvUSSmPSWJK qO8mGsSFNrQGYuMBVGbD4ZENsTwejufdpD7d6hIZIkx9e66FtM43r+Uh9r8ETQYmrHS4 nekasUU6k4JFun/zNIoD/rEYtSXXxUjygllxS56d61s9p4KYBvhZLzvRdAqESexQTtxa cNyQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of dan.j.williams@intel.com designates 192.55.52.93 as permitted sender) smtp.mailfrom=dan.j.williams@intel.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from mga11.intel.com (mga11.intel.com. 
From patchwork Wed Jun 19 05:52:00 2019
Subject: [PATCH v10 05/13] mm/sparsemem: Convert kmalloc_section_memmap() to populate_section_memmap()
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, David Hildenbrand, Logan Gunthorpe, Oscar Salvador,
 Pavel Tatashin, linux-mm@kvack.org, linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:52:00 -0700
Message-ID: <156092352058.979959.6551283472062305149.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>

Allow sub-section sized ranges to be added to the memmap.
populate_section_memmap() takes an explicit pfn range rather than
assuming a full section, and those parameters are plumbed all the way
through to vmemmap_populate(). There should be no sub-section usage in
current deployments. New warnings are added to clarify which memmap
allocation paths are sub-section capable.
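[To make the alignment step in __populate_section_memmap() concrete,
here is a minimal userspace sketch of the same arithmetic, illustration
only; the constants mirror the definitions introduced earlier in this
series for x86 (2MB sub-sections of 4KB pages):

    #include <stdio.h>

    #define PAGES_PER_SUBSECTION 512UL
    #define PAGE_SUBSECTION_MASK (~(PAGES_PER_SUBSECTION - 1))
    #define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

    int main(void)
    {
            unsigned long pfn = 1000, nr_pages = 100;

            /* round the end up and the start down to sub-section boundaries */
            unsigned long end = ALIGN(pfn + nr_pages, PAGES_PER_SUBSECTION);

            pfn &= PAGE_SUBSECTION_MASK;
            nr_pages = end - pfn;

            /* prints: pfn=512 nr_pages=1024 */
            printf("pfn=%lu nr_pages=%lu\n", pfn, nr_pages);
            return 0;
    }

Widening to sub-section granularity is required because allocations are
tracked in the per-section 'subsection_map' bitmap.]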
Cc: Michal Hocko
Cc: David Hildenbrand
Cc: Logan Gunthorpe
Cc: Oscar Salvador
Reviewed-by: Pavel Tatashin
Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
---
 arch/x86/mm/init_64.c |    4 +++-
 include/linux/mm.h    |    4 ++--
 mm/sparse-vmemmap.c   |   21 ++++++++++++++-------
 mm/sparse.c           |   50 ++++++++++++++++++++++++++-----------------------
 4 files changed, 46 insertions(+), 33 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 8335ac6e1112..688fb0687e55 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1520,7 +1520,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 {
 	int err;
 
-	if (boot_cpu_has(X86_FEATURE_PSE))
+	if (end - start < PAGES_PER_SECTION * sizeof(struct page))
+		err = vmemmap_populate_basepages(start, end, node);
+	else if (boot_cpu_has(X86_FEATURE_PSE))
 		err = vmemmap_populate_hugepages(start, end, node, altmap);
 	else if (altmap) {
 		pr_err_once("%s: no cpu support for altmap allocations\n",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c6ae9eba645d..f7616518124e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2752,8 +2752,8 @@ const char * arch_vma_name(struct vm_area_struct *vma);
 void print_vma_addr(char *prefix, unsigned long rip);
 
 void *sparse_buffer_alloc(unsigned long size);
-struct page *sparse_mem_map_populate(unsigned long pnum, int nid,
-		struct vmem_altmap *altmap);
+struct page * __populate_section_memmap(unsigned long pfn,
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap);
 pgd_t *vmemmap_pgd_populate(unsigned long addr, int node);
 p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node);
 pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 7fec05796796..200aef686722 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -245,19 +245,26 @@ int __meminit vmemmap_populate_basepages(unsigned long start,
 	return 0;
 }
 
-struct page * __meminit sparse_mem_map_populate(unsigned long pnum, int nid,
-		struct vmem_altmap *altmap)
+struct page * __meminit __populate_section_memmap(unsigned long pfn,
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
 	unsigned long start;
 	unsigned long end;
-	struct page *map;
 
-	map = pfn_to_page(pnum * PAGES_PER_SECTION);
-	start = (unsigned long)map;
-	end = (unsigned long)(map + PAGES_PER_SECTION);
+	/*
+	 * The minimum granularity of memmap extensions is
+	 * PAGES_PER_SUBSECTION as allocations are tracked in the
+	 * 'subsection_map' bitmap of the section.
+	 */
+	end = ALIGN(pfn + nr_pages, PAGES_PER_SUBSECTION);
+	pfn &= PAGE_SUBSECTION_MASK;
+	nr_pages = end - pfn;
+
+	start = (unsigned long) pfn_to_page(pfn);
+	end = start + nr_pages * sizeof(struct page);
 	if (vmemmap_populate(start, end, nid, altmap))
 		return NULL;
 
-	return map;
+	return pfn_to_page(pfn);
 }
diff --git a/mm/sparse.c b/mm/sparse.c
index e9fec3c2f7ec..49f0c03d15a3 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -439,8 +439,8 @@ static unsigned long __init section_map_size(void)
 	return PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION);
 }
 
-struct page __init *sparse_mem_map_populate(unsigned long pnum, int nid,
-		struct vmem_altmap *altmap)
+struct page __init *__populate_section_memmap(unsigned long pfn,
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
 	unsigned long size = section_map_size();
 	struct page *map = sparse_buffer_alloc(size);
@@ -521,10 +521,13 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 	}
 	sparse_buffer_init(map_count * section_map_size(), nid);
 	for_each_present_section_nr(pnum_begin, pnum) {
+		unsigned long pfn = section_nr_to_pfn(pnum);
+
 		if (pnum >= pnum_end)
 			break;
 
-		map = sparse_mem_map_populate(pnum, nid, NULL);
+		map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
+				nid, NULL);
 		if (!map) {
 			pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.",
 			       __func__, nid);
@@ -625,17 +628,17 @@ void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
 #endif
 
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid,
-		struct vmem_altmap *altmap)
+static struct page *populate_section_memmap(unsigned long pfn,
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
-	/* This will make the necessary allocations eventually. */
-	return sparse_mem_map_populate(pnum, nid, altmap);
+	return __populate_section_memmap(pfn, nr_pages, nid, altmap);
 }
-static void __kfree_section_memmap(struct page *memmap,
+
+static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
 		struct vmem_altmap *altmap)
 {
-	unsigned long start = (unsigned long)memmap;
-	unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION);
+	unsigned long start = (unsigned long) pfn_to_page(pfn);
+	unsigned long end = start + nr_pages * sizeof(struct page);
 
 	vmemmap_free(start, end, altmap);
 }
@@ -647,7 +650,8 @@ static void free_map_bootmem(struct page *memmap)
 	vmemmap_free(start, end, NULL);
 }
 #else
-static struct page *__kmalloc_section_memmap(void)
+struct page *populate_section_memmap(unsigned long pfn,
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
 	struct page *page, *ret;
 	unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION;
@@ -668,15 +672,11 @@ static struct page *__kmalloc_section_memmap(void)
 	return ret;
 }
 
-static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid,
+static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
 		struct vmem_altmap *altmap)
 {
-	return __kmalloc_section_memmap();
-}
+	struct page *memmap = pfn_to_page(pfn);
 
-static void __kfree_section_memmap(struct page *memmap,
-		struct vmem_altmap *altmap)
-{
 	if (is_vmalloc_addr(memmap))
 		vfree(memmap);
 	else
@@ -745,12 +745,13 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	if (ret < 0 && ret != -EEXIST)
 		return ret;
 	ret = 0;
-	memmap = kmalloc_section_memmap(section_nr, nid, altmap);
+	memmap = populate_section_memmap(start_pfn, PAGES_PER_SECTION, nid,
+			altmap);
 	if (!memmap)
 		return -ENOMEM;
 	usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
 	if (!usage) {
-		__kfree_section_memmap(memmap, altmap);
+		depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap);
 		return -ENOMEM;
 	}
 
@@ -772,7 +773,7 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 out:
 	if (ret < 0) {
 		kfree(usage);
-		__kfree_section_memmap(memmap, altmap);
+		depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap);
 	}
 	return ret;
 }
@@ -808,7 +809,8 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 #endif
 
 static void free_section_usage(struct mem_section *ms, struct page *memmap,
-		struct mem_section_usage *usage, struct vmem_altmap *altmap)
+		struct mem_section_usage *usage, unsigned long pfn,
+		unsigned long nr_pages, struct vmem_altmap *altmap)
 {
 	if (!usage)
 		return;
@@ -819,7 +821,7 @@ static void free_section_usage(struct mem_section *ms, struct page *memmap,
 	if (!early_section(ms)) {
 		kfree(usage);
 		if (memmap)
-			__kfree_section_memmap(memmap, altmap);
+			depopulate_section_memmap(pfn, nr_pages, altmap);
 		return;
 	}
 
@@ -848,6 +850,8 @@ void sparse_remove_one_section(struct mem_section *ms, unsigned long map_offset,
 	clear_hwpoisoned_pages(memmap + map_offset,
 			PAGES_PER_SECTION - map_offset);
-	free_section_usage(ms, memmap, usage, altmap);
+	free_section_usage(ms, memmap, usage,
+			section_nr_to_pfn(__section_nr(ms)),
+			PAGES_PER_SECTION, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */

From patchwork Wed Jun 19 05:52:06 2019
Subject: [PATCH v10 06/13] mm/hotplug: Kill is_dev_zone() usage in __remove_pages()
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Logan Gunthorpe, Pavel Tatashin, David Hildenbrand,
 Oscar Salvador, linux-mm@kvack.org, linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:52:06 -0700
Message-ID: <156092352642.979959.6664333788149363039.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>

The zone type check was a leftover from the cleanup that plumbed altmap
through the memory hotplug path, i.e. commit da024512a1fa "mm: pass the
vmem_altmap to arch_remove_memory and __remove_pages".
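[For readers outside the nvdimm space, a short illustration of what the
altmap offset represents; this is my sketch, not part of the patch, and
the struct is a simplified stand-in for the kernel's 'struct
vmem_altmap', whose vmem_altmap_offset() helper returns the number of
reserved pfns at the base of the range:

    #include <stdio.h>

    /* simplified model of struct vmem_altmap (assumption) */
    struct vmem_altmap_model {
            unsigned long base_pfn; /* first pfn of the hot-added range */
            unsigned long reserve;  /* pfns at the base that hold the memmap */
    };

    /* analogous to vmem_altmap_offset(): pfns to skip on removal */
    static unsigned long altmap_offset(const struct vmem_altmap_model *altmap)
    {
            return altmap->reserve;
    }

    int main(void)
    {
            struct vmem_altmap_model altmap = {
                    .base_pfn = 0x100000,
                    .reserve = 128,
            };

            /* the first 128 pfns back the page array itself, so removal
             * starts operating at base_pfn + 128 */
            printf("map_offset = %lu\n", altmap_offset(&altmap));
            return 0;
    }

The patch below drops the zone check precisely because the offset only
depends on whether an altmap was supplied, not on the zone type.]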
Cc: Michal Hocko
Cc: Logan Gunthorpe
Cc: Pavel Tatashin
Reviewed-by: David Hildenbrand
Reviewed-by: Oscar Salvador
Signed-off-by: Dan Williams
---
 mm/memory_hotplug.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 647859a1d119..4b882c57781a 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -535,11 +535,8 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
 	unsigned long map_offset = 0;
 	int sections_to_remove;
 
-	/* In the ZONE_DEVICE case device driver owns the memory region */
-	if (is_dev_zone(zone)) {
-		if (altmap)
-			map_offset = vmem_altmap_offset(altmap);
-	}
+	if (altmap)
+		map_offset = vmem_altmap_offset(altmap);
 
 	clear_zone_contiguous(zone);

From patchwork Wed Jun 19 05:52:12 2019
Subject: [PATCH v10 07/13] mm: Kill is_dev_zone() helper
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Logan Gunthorpe, David Hildenbrand, Oscar Salvador,
 Pavel Tatashin, Wei Yang, linux-mm@kvack.org,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:52:12 -0700
Message-ID: <156092353211.979959.1489004866360828964.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>

Given there are no more usages of is_dev_zone() outside of 'ifdef
CONFIG_ZONE_DEVICE' protection, kill off the compilation helper.

Cc: Michal Hocko
Cc: Logan Gunthorpe
Acked-by: David Hildenbrand
Reviewed-by: Oscar Salvador
Reviewed-by: Pavel Tatashin
Reviewed-by: Wei Yang
Signed-off-by: Dan Williams
---
 include/linux/mmzone.h |   12 ------------
 mm/page_alloc.c        |    2 +-
 2 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index c4e8843e283c..e976faf57292 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -855,18 +855,6 @@ static inline int local_memory_node(int node_id) { return node_id; };
  */
 #define zone_idx(zone) ((zone) - (zone)->zone_pgdat->node_zones)
 
-#ifdef CONFIG_ZONE_DEVICE
-static inline bool is_dev_zone(const struct zone *zone)
-{
-	return zone_idx(zone) == ZONE_DEVICE;
-}
-#else
-static inline bool is_dev_zone(const struct zone *zone)
-{
-	return false;
-}
-#endif
-
 /*
  * Returns true if a zone has pages managed by the buddy allocator.
 * All the reclaim decisions have to use this function rather than
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8e7215fb6976..12b2afd3a529 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5881,7 +5881,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
 	unsigned long start = jiffies;
 	int nid = pgdat->node_id;
 
-	if (WARN_ON_ONCE(!pgmap || !is_dev_zone(zone)))
+	if (WARN_ON_ONCE(!pgmap || zone_idx(zone) != ZONE_DEVICE))
 		return;
 
 	/*

From patchwork Wed Jun 19 05:52:17 2019
Subject: [PATCH v10 08/13] mm/sparsemem: Prepare for sub-section ranges
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Oscar Salvador,
 Pavel Tatashin, linux-mm@kvack.org, linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:52:17 -0700
Message-ID: <156092353780.979959.9713046515562743194.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
Prepare the memory hot-{add,remove} paths for handling sub-section
ranges by plumbing the starting page frame and number of pages being
handled through arch_{add,remove}_memory() to
sparse_{add,remove}_one_section(). This is simply plumbing, small
cleanups, and some identifier renames. No intended functional changes.

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Cc: Oscar Salvador
Reviewed-by: Pavel Tatashin
Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
---
 include/linux/memory_hotplug.h |    5 +-
 mm/memory_hotplug.c            |  114 +++++++++++++++++++++++++---------------
 mm/sparse.c                    |   16 ++----
 3 files changed, 81 insertions(+), 54 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 79e0add6a597..3ab0282b4fe5 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -348,9 +348,10 @@ extern int add_memory_resource(int nid, struct resource *resource);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap);
 extern bool is_memblock_offlined(struct memory_block *mem);
-extern int sparse_add_one_section(int nid, unsigned long start_pfn,
-		struct vmem_altmap *altmap);
+extern int sparse_add_section(int nid, unsigned long pfn,
+		unsigned long nr_pages, struct vmem_altmap *altmap);
 extern void sparse_remove_one_section(struct mem_section *ms,
+		unsigned long pfn, unsigned long nr_pages,
 		unsigned long map_offset, struct vmem_altmap *altmap);
 extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
 					  unsigned long pnum);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 4b882c57781a..399bf78bccc5 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -252,51 +252,84 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat)
 }
 #endif /* CONFIG_HAVE_BOOTMEM_INFO_NODE */
 
-static int __meminit __add_section(int nid, unsigned long phys_start_pfn,
-		struct vmem_altmap *altmap)
+static int __meminit __add_section(int nid, unsigned long pfn,
+		unsigned long nr_pages, struct vmem_altmap *altmap)
 {
 	int ret;
 
-	if (pfn_valid(phys_start_pfn))
+	if (pfn_valid(pfn))
 		return -EEXIST;
 
-	ret = sparse_add_one_section(nid, phys_start_pfn, altmap);
+	ret = sparse_add_section(nid, pfn, nr_pages, altmap);
 	return ret < 0 ? ret : 0;
 }
 
+static int check_pfn_span(unsigned long pfn, unsigned long nr_pages,
+		const char *reason)
+{
+	/*
+	 * Disallow all operations smaller than a sub-section and only
+	 * allow operations smaller than a section for
+	 * SPARSEMEM_VMEMMAP. Note that check_hotplug_memory_range()
+	 * enforces a larger memory_block_size_bytes() granularity for
+	 * memory that will be marked online, so this check should only
+	 * fire for direct arch_{add,remove}_memory() users outside of
+	 * add_memory_resource().
+	 */
+	unsigned long min_align;
+
+	if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
+		min_align = PAGES_PER_SUBSECTION;
+	else
+		min_align = PAGES_PER_SECTION;
+	if (!IS_ALIGNED(pfn, min_align)
+			|| !IS_ALIGNED(nr_pages, min_align)) {
+		WARN(1, "Misaligned __%s_pages start: %#lx end: #%lx\n",
+				reason, pfn, pfn + nr_pages - 1);
+		return -EINVAL;
+	}
+	return 0;
+}
+
 /*
  * Reasonably generic function for adding memory. It is
  * expected that archs that support memory hotplug will
  * call this function after deciding the zone to which to
  * add the new pages.
  */
-int __ref __add_pages(int nid, unsigned long phys_start_pfn,
-		unsigned long nr_pages, struct mhp_restrictions *restrictions)
+int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
+		struct mhp_restrictions *restrictions)
 {
 	unsigned long i;
-	int err = 0;
-	int start_sec, end_sec;
+	int start_sec, end_sec, err;
 	struct vmem_altmap *altmap = restrictions->altmap;
 
-	/* during initialize mem_map, align hot-added range to section */
-	start_sec = pfn_to_section_nr(phys_start_pfn);
-	end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
-
 	if (altmap) {
 		/*
 		 * Validate altmap is within bounds of the total request
 		 */
-		if (altmap->base_pfn != phys_start_pfn
+		if (altmap->base_pfn != pfn
 				|| vmem_altmap_offset(altmap) > nr_pages) {
 			pr_warn_once("memory add fail, invalid altmap\n");
-			err = -EINVAL;
-			goto out;
+			return -EINVAL;
 		}
 		altmap->alloc = 0;
 	}
 
+	err = check_pfn_span(pfn, nr_pages, "add");
+	if (err)
+		return err;
+
+	start_sec = pfn_to_section_nr(pfn);
+	end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
 	for (i = start_sec; i <= end_sec; i++) {
-		err = __add_section(nid, section_nr_to_pfn(i), altmap);
+		unsigned long pfns;
+
+		pfns = min(nr_pages, PAGES_PER_SECTION
+				- (pfn & ~PAGE_SECTION_MASK));
+		err = __add_section(nid, pfn, pfns, altmap);
+		pfn += pfns;
+		nr_pages -= pfns;
 
 		/*
 		 * EEXIST is finally dealt with by ioresource collision
@@ -309,7 +342,6 @@ int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
 		cond_resched();
 	}
 	vmemmap_populate_print_last();
-out:
 	return err;
 }
 
@@ -487,10 +519,10 @@ static void shrink_pgdat_span(struct pglist_data *pgdat,
 	pgdat->node_spanned_pages = 0;
 }
 
-static void __remove_zone(struct zone *zone, unsigned long start_pfn)
+static void __remove_zone(struct zone *zone, unsigned long start_pfn,
+		unsigned long nr_pages)
 {
 	struct pglist_data *pgdat = zone->zone_pgdat;
-	int nr_pages = PAGES_PER_SECTION;
 	unsigned long flags;
 
 	pgdat_resize_lock(zone->zone_pgdat, &flags);
@@ -499,27 +531,23 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn)
 	pgdat_resize_unlock(zone->zone_pgdat, &flags);
 }
 
-static void __remove_section(struct zone *zone, struct mem_section *ms,
-		unsigned long map_offset,
-		struct vmem_altmap *altmap)
+static void __remove_section(struct zone *zone, unsigned long pfn,
+		unsigned long nr_pages, unsigned long map_offset,
+		struct vmem_altmap *altmap)
 {
-	unsigned long start_pfn;
-	int scn_nr;
+	struct mem_section *ms = __nr_to_section(pfn_to_section_nr(pfn));
 
 	if (WARN_ON_ONCE(!valid_section(ms)))
 		return;
 
-	scn_nr = __section_nr(ms);
-	start_pfn = section_nr_to_pfn((unsigned long)scn_nr);
-	__remove_zone(zone, start_pfn);
-
-	sparse_remove_one_section(ms, map_offset, altmap);
+	__remove_zone(zone, pfn, nr_pages);
+	sparse_remove_one_section(ms, pfn, nr_pages, map_offset, altmap);
 }
 
 /**
  * __remove_pages() - remove sections of pages from a zone
  * @zone: zone from which pages need to be removed
- * @phys_start_pfn: starting pageframe (must be aligned to start of a section)
+ * @pfn: starting pageframe (must be aligned to start of a section)
  * @nr_pages: number of pages to remove (must be multiple of section size)
  * @altmap: alternative device page map or %NULL if default memmap is used
  *
@@ -528,31 +556,31 @@ static void __remove_section(struct zone *zone, struct mem_section *ms,
 * sure that pages are marked reserved and zones are adjust properly by
 * calling offline_pages().
 */
-void __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
+void __remove_pages(struct zone *zone, unsigned long pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap)
 {
-	unsigned long i;
 	unsigned long map_offset = 0;
-	int sections_to_remove;
+	int i, start_sec, end_sec;
 
 	if (altmap)
 		map_offset = vmem_altmap_offset(altmap);
 
 	clear_zone_contiguous(zone);
 
-	/*
-	 * We can only remove entire sections
-	 */
-	BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
-	BUG_ON(nr_pages % PAGES_PER_SECTION);
+	if (check_pfn_span(pfn, nr_pages, "remove"))
+		return;
 
-	sections_to_remove = nr_pages / PAGES_PER_SECTION;
-	for (i = 0; i < sections_to_remove; i++) {
-		unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION;
+	start_sec = pfn_to_section_nr(pfn);
+	end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
+	for (i = start_sec; i <= end_sec; i++) {
+		unsigned long pfns;
 
 		cond_resched();
-		__remove_section(zone, __pfn_to_section(pfn), map_offset,
-				altmap);
+		pfns = min(nr_pages, PAGES_PER_SECTION
+				- (pfn & ~PAGE_SECTION_MASK));
+		__remove_section(zone, pfn, pfns, map_offset, altmap);
+		pfn += pfns;
+		nr_pages -= pfns;
 		map_offset = 0;
 	}
diff --git a/mm/sparse.c b/mm/sparse.c
index 49f0c03d15a3..ad47e25c8f94 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -728,8 +728,8 @@ static void free_map_bootmem(struct page *memmap)
 * * -EEXIST	- Section has been present.
 * * -ENOMEM	- Out of memory.
 */
-int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
-		struct vmem_altmap *altmap)
+int __meminit sparse_add_section(int nid, unsigned long start_pfn,
+		unsigned long nr_pages, struct vmem_altmap *altmap)
 {
 	unsigned long section_nr = pfn_to_section_nr(start_pfn);
 	struct mem_section_usage *usage;
@@ -834,8 +834,9 @@ static void free_map_bootmem(struct page *memmap)
 }
 
-void sparse_remove_one_section(struct mem_section *ms, unsigned long map_offset,
-		struct vmem_altmap *altmap)
+void sparse_remove_one_section(struct mem_section *ms, unsigned long pfn,
+		unsigned long nr_pages, unsigned long map_offset,
+		struct vmem_altmap *altmap)
 {
 	struct page *memmap = NULL;
 	struct mem_section_usage *usage = NULL;
@@ -848,10 +849,7 @@ void sparse_remove_one_section(struct mem_section *ms, unsigned long map_offset,
 		ms->usage = NULL;
 	}
 
-	clear_hwpoisoned_pages(memmap + map_offset,
-			PAGES_PER_SECTION - map_offset);
-	free_section_usage(ms, memmap, usage,
-			section_nr_to_pfn(__section_nr(ms)),
-			PAGES_PER_SECTION, altmap);
+	clear_hwpoisoned_pages(memmap + map_offset, nr_pages - map_offset);
+	free_section_usage(ms, memmap, usage, pfn, nr_pages, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
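[This patch's two new building blocks are easy to model in isolation:
check_pfn_span() rejects spans that are not sub-section aligned, and
the __add_pages()/__remove_pages() loops then walk an accepted span one
section-bounded chunk at a time. A self-contained userspace sketch of
both, illustration only and not part of the patch; x86 constants
assumed (2MB sub-sections, 128MB sections, 4KB pages):

    #include <stdbool.h>
    #include <stdio.h>

    #define PAGES_PER_SUBSECTION 512UL    /* 2MB / 4KB */
    #define PAGES_PER_SECTION 32768UL     /* 128MB / 4KB */
    #define PAGE_SECTION_MASK (~(PAGES_PER_SECTION - 1))
    #define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

    /* model of check_pfn_span() for the SPARSEMEM_VMEMMAP=y case */
    static bool span_ok(unsigned long pfn, unsigned long nr_pages)
    {
            unsigned long min_align = PAGES_PER_SUBSECTION;

            return IS_ALIGNED(pfn, min_align) && IS_ALIGNED(nr_pages, min_align);
    }

    /* model of the per-section walk in __add_pages()/__remove_pages() */
    static void walk_span(unsigned long pfn, unsigned long nr_pages)
    {
            if (!span_ok(pfn, nr_pages)) {
                    printf("misaligned span: %#lx + %#lx\n", pfn, nr_pages);
                    return;
            }
            while (nr_pages) {
                    /* pages to the end of pfn's section, capped at nr_pages */
                    unsigned long pfns = PAGES_PER_SECTION
                            - (pfn & ~PAGE_SECTION_MASK);

                    if (pfns > nr_pages)
                            pfns = nr_pages;
                    printf("section op on [%#lx, %#lx)\n", pfn, pfn + pfns);
                    pfn += pfns;
                    nr_pages -= pfns;
            }
    }

    int main(void)
    {
            walk_span(0x8000200, 2 * PAGES_PER_SECTION); /* starts mid-section */
            walk_span(0x8000000, 100);                   /* rejected */
            return 0;
    }

The first call splits into three chunks (a partial leading section, a
full section, and a partial trailing section), which is exactly the
shape the sub-section hotplug patch that follows has to handle.]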
From patchwork Wed Jun 19 05:52:23 2019
Subject: [PATCH v10 09/13] mm/sparsemem: Support sub-section hotplug
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Oscar Salvador,
 Pavel Tatashin, linux-mm@kvack.org, linux-nvdimm@lists.01.org,
 linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:52:23 -0700
Message-ID: <156092354368.979959.6232443923440952359.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com>

The libnvdimm sub-system has suffered a series of hacks and broken
workarounds for the memory-hotplug implementation's awkward
section-aligned (128MB) granularity. For example the following backtrace
is emitted when attempting arch_add_memory() with physical address ranges
that intersect 'System RAM' (RAM) with 'Persistent Memory' (PMEM) within
a given section:

 # cat /proc/iomem | grep -A1 -B1 Persistent\ Memory
 100000000-1ffffffff : System RAM
 200000000-303ffffff : Persistent Memory (legacy)
 304000000-43fffffff : System RAM
 440000000-23ffffffff : Persistent Memory
 2400000000-43bfffffff : Persistent Memory
   2400000000-43bfffffff : namespace2.0

 WARNING: CPU: 38 PID: 928 at arch/x86/mm/init_64.c:850 add_pages+0x5c/0x60
 [..]
 RIP: 0010:add_pages+0x5c/0x60
 [..]
 Call Trace:
  devm_memremap_pages+0x460/0x6e0
  pmem_attach_disk+0x29e/0x680 [nd_pmem]
  ? nd_dax_probe+0xfc/0x120 [libnvdimm]
  nvdimm_bus_probe+0x66/0x160 [libnvdimm]

It was discovered that the problem goes beyond RAM vs PMEM collisions as
some platforms produce PMEM vs PMEM collisions within a given section.
The libnvdimm workaround for that case revealed that the libnvdimm
section-alignment-padding implementation has been broken for a long
while.
A fix for that long-standing breakage introduces as many problems as it
solves as it would require a backward-incompatible change to the
namespace metadata interpretation. Instead of that dubious route [1],
address the root problem in the memory-hotplug implementation.

Note that EEXIST is no longer treated as success as that is how
sparse_add_section() reports subsection collisions; it was also obviated
by recent changes to perform the request_region() for 'System RAM'
before arch_add_memory() in the add_memory() sequence.

[1]: https://lore.kernel.org/r/155000671719.348031.2347363160141119237.stgit@dwillia2-desk3.amr.corp.intel.com

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Cc: Oscar Salvador
Cc: Pavel Tatashin
Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
---
 include/linux/memory_hotplug.h |    2 
 mm/memory_hotplug.c            |   27 +----
 mm/page_alloc.c                |    2 
 mm/sparse.c                    |  205 ++++++++++++++++++++++++++--------------
 4 files changed, 140 insertions(+), 96 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 3ab0282b4fe5..0b8a5e5ef2da 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -350,7 +350,7 @@ extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 extern bool is_memblock_offlined(struct memory_block *mem);
 extern int sparse_add_section(int nid, unsigned long pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap);
-extern void sparse_remove_one_section(struct mem_section *ms,
+extern void sparse_remove_section(struct mem_section *ms,
 		unsigned long pfn, unsigned long nr_pages,
 		unsigned long map_offset, struct vmem_altmap *altmap);
 extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 399bf78bccc5..4e8e65954f31 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -252,18 +252,6 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat)
 }
 #endif /* CONFIG_HAVE_BOOTMEM_INFO_NODE */
 
-static int __meminit __add_section(int nid, unsigned long pfn,
-	unsigned long nr_pages, struct vmem_altmap *altmap)
-{
-	int ret;
-
-	if (pfn_valid(pfn))
-		return -EEXIST;
-
-	ret = sparse_add_section(nid, pfn, nr_pages, altmap);
-	return ret < 0 ? ret : 0;
-}
-
 static int check_pfn_span(unsigned long pfn, unsigned long nr_pages,
 		const char *reason)
 {
@@ -327,18 +315,11 @@ int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
 
 		pfns = min(nr_pages, PAGES_PER_SECTION
 				- (pfn & ~PAGE_SECTION_MASK));
-		err = __add_section(nid, pfn, pfns, altmap);
+		err = sparse_add_section(nid, pfn, pfns, altmap);
+		if (err)
+			break;
 		pfn += pfns;
 		nr_pages -= pfns;
-
-		/*
-		 * EEXIST is finally dealt with by ioresource collision
-		 * check. see add_memory() => register_memory_resource()
-		 * Warning will be printed if there is collision.
-		 */
-		if (err && (err != -EEXIST))
-			break;
-		err = 0;
 		cond_resched();
 	}
 	vmemmap_populate_print_last();
@@ -541,7 +522,7 @@ static void __remove_section(struct zone *zone, unsigned long pfn,
 		return;
 
 	__remove_zone(zone, pfn, nr_pages);
-	sparse_remove_one_section(ms, pfn, nr_pages, map_offset, altmap);
+	sparse_remove_section(ms, pfn, nr_pages, map_offset, altmap);
 }
 
 /**
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 12b2afd3a529..5b3266d63521 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5931,7 +5931,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
 	 * pfn out of zone.
 	 *
 	 * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
-	 * because this is done early in sparse_add_one_section
+	 * because this is done early in section_activate()
 	 */
 	if (!(pfn & (pageblock_nr_pages - 1))) {
 		set_pageblock_migratetype(page, MIGRATE_MOVABLE);
diff --git a/mm/sparse.c b/mm/sparse.c
index ad47e25c8f94..b77ca21a27a4 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -83,8 +83,15 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid)
 	unsigned long root = SECTION_NR_TO_ROOT(section_nr);
 	struct mem_section *section;
 
+	/*
+	 * An existing section is possible in the sub-section hotplug
+	 * case. First hot-add instantiates, follow-on hot-add reuses
+	 * the existing section.
+	 *
+	 * The mem_hotplug_lock resolves the apparent race below.
+	 */
 	if (mem_section[root])
-		return -EEXIST;
+		return 0;
 
 	section = sparse_index_alloc(nid);
 	if (!section)
@@ -715,10 +722,119 @@ static void free_map_bootmem(struct page *memmap)
 }
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
 
+static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
+		struct vmem_altmap *altmap)
+{
+	DECLARE_BITMAP(map, SUBSECTIONS_PER_SECTION) = { 0 };
+	DECLARE_BITMAP(tmp, SUBSECTIONS_PER_SECTION) = { 0 };
+	struct mem_section *ms = __pfn_to_section(pfn);
+	struct page *memmap = NULL;
+	unsigned long *subsection_map = ms->usage
+		? &ms->usage->subsection_map[0] : NULL;
+
+	subsection_mask_set(map, pfn, nr_pages);
+	if (subsection_map)
+		bitmap_and(tmp, map, subsection_map, SUBSECTIONS_PER_SECTION);
+
+	if (WARN(!subsection_map || !bitmap_equal(tmp, map, SUBSECTIONS_PER_SECTION),
+			"section already deactivated (%#lx + %ld)\n",
+			pfn, nr_pages))
+		return;
+
+	/*
+	 * There are 3 cases to handle across two configurations
+	 * (SPARSEMEM_VMEMMAP={y,n}):
+	 *
+	 * 1/ deactivation of a partial hot-added section (only possible
+	 * in the SPARSEMEM_VMEMMAP=y case).
+	 *    a/ section was present at memory init
+	 *    b/ section was hot-added post memory init
+	 * 2/ deactivation of a complete hot-added section
+	 * 3/ deactivation of a complete section from memory init
+	 *
+	 * For 1/, when subsection_map does not empty we will not be
+	 * freeing the usage map, but still need to free the vmemmap
+	 * range.
+	 *
+	 * For 2/ and 3/ the SPARSEMEM_VMEMMAP={y,n} cases are unified
+	 */
+	bitmap_xor(subsection_map, map, subsection_map, SUBSECTIONS_PER_SECTION);
+	if (bitmap_empty(subsection_map, SUBSECTIONS_PER_SECTION)) {
+		unsigned long section_nr = pfn_to_section_nr(pfn);
+
+		if (!early_section(ms)) {
+			kfree(ms->usage);
+			ms->usage = NULL;
+		}
+		memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
+		ms->section_mem_map = sparse_encode_mem_map(NULL, section_nr);
+	}
+
+	if (early_section(ms) && memmap)
+		free_map_bootmem(memmap);
+	else
+		depopulate_section_memmap(pfn, nr_pages, altmap);
+}
+
+static struct page * __meminit section_activate(int nid, unsigned long pfn,
+		unsigned long nr_pages, struct vmem_altmap *altmap)
+{
+	DECLARE_BITMAP(map, SUBSECTIONS_PER_SECTION) = { 0 };
+	struct mem_section *ms = __pfn_to_section(pfn);
+	struct mem_section_usage *usage = NULL;
+	unsigned long *subsection_map;
+	struct page *memmap;
+	int rc = 0;
+
+	subsection_mask_set(map, pfn, nr_pages);
+
+	if (!ms->usage) {
+		usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
+		if (!usage)
+			return ERR_PTR(-ENOMEM);
+		ms->usage = usage;
+	}
+	subsection_map = &ms->usage->subsection_map[0];
+
+	if (bitmap_empty(map, SUBSECTIONS_PER_SECTION))
+		rc = -EINVAL;
+	else if (bitmap_intersects(map, subsection_map, SUBSECTIONS_PER_SECTION))
+		rc = -EEXIST;
+	else
+		bitmap_or(subsection_map, map, subsection_map,
+				SUBSECTIONS_PER_SECTION);
+
+	if (rc) {
+		if (usage)
+			ms->usage = NULL;
+		kfree(usage);
+		return ERR_PTR(rc);
+	}
+
+	/*
+	 * The early init code does not consider partially populated
+	 * initial sections, it simply assumes that memory will never be
+	 * referenced. If we hot-add memory into such a section then we
+	 * do not need to populate the memmap and can simply reuse what
+	 * is already there.
+	 */
+	if (nr_pages < PAGES_PER_SECTION && early_section(ms))
+		return pfn_to_page(pfn);
+
+	memmap = populate_section_memmap(pfn, nr_pages, nid, altmap);
+	if (!memmap) {
+		section_deactivate(pfn, nr_pages, altmap);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	return memmap;
+}
+
 /**
- * sparse_add_one_section - add a memory section
+ * sparse_add_section - add a memory section, or populate an existing one
  * @nid: The node to add section on
  * @start_pfn: start pfn of the memory range
+ * @nr_pages: number of pfns to add in the section
  * @altmap: device page map
  *
  * This is only intended for hotplug.
@@ -732,50 +848,33 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap)
 {
 	unsigned long section_nr = pfn_to_section_nr(start_pfn);
-	struct mem_section_usage *usage;
 	struct mem_section *ms;
 	struct page *memmap;
 	int ret;
 
-	/*
-	 * no locking for this, because it does its own
-	 * plus, it does a kmalloc
-	 */
 	ret = sparse_index_init(section_nr, nid);
-	if (ret < 0 && ret != -EEXIST)
+	if (ret < 0)
 		return ret;
-	ret = 0;
-	memmap = populate_section_memmap(start_pfn, PAGES_PER_SECTION, nid,
-			altmap);
-	if (!memmap)
-		return -ENOMEM;
-	usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
-	if (!usage) {
-		depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap);
-		return -ENOMEM;
-	}
-	ms = __pfn_to_section(start_pfn);
-	if (ms->section_mem_map & SECTION_MARKED_PRESENT) {
-		ret = -EEXIST;
-		goto out;
-	}
+	memmap = section_activate(nid, start_pfn, nr_pages, altmap);
+	if (IS_ERR(memmap))
+		return PTR_ERR(memmap);
 
 	/*
 	 * Poison uninitialized struct pages in order to catch invalid flags
 	 * combinations.
*/ - page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION); + page_init_poison(pfn_to_page(start_pfn), sizeof(struct page) * nr_pages); + ms = __pfn_to_section(start_pfn); section_mark_present(ms); - sparse_init_one_section(ms, section_nr, memmap, usage, 0); -out: - if (ret < 0) { - kfree(usage); - depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap); - } - return ret; + /* Align memmap to section boundary in the subsection case */ + if (section_nr_to_pfn(section_nr) != start_pfn) + memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr)); + sparse_init_one_section(ms, section_nr, memmap, ms->usage, 0); + + return 0; } #ifdef CONFIG_MEMORY_FAILURE @@ -808,48 +907,12 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages) } #endif -static void free_section_usage(struct mem_section *ms, struct page *memmap, - struct mem_section_usage *usage, unsigned long pfn, - unsigned long nr_pages, struct vmem_altmap *altmap) -{ - if (!usage) - return; - - /* - * Check to see if allocation came from hot-plug-add - */ - if (!early_section(ms)) { - kfree(usage); - if (memmap) - depopulate_section_memmap(pfn, nr_pages, altmap); - return; - } - - /* - * The usemap came from bootmem. This is packed with other usemaps - * on the section which has pgdat at boot time. Just keep it as is now. - */ - - if (memmap) - free_map_bootmem(memmap); -} - -void sparse_remove_one_section(struct mem_section *ms, unsigned long pfn, +void sparse_remove_section(struct mem_section *ms, unsigned long pfn, unsigned long nr_pages, unsigned long map_offset, struct vmem_altmap *altmap) { - struct page *memmap = NULL; - struct mem_section_usage *usage = NULL; - - if (ms->section_mem_map) { - usage = ms->usage; - memmap = sparse_decode_mem_map(ms->section_mem_map, - __section_nr(ms)); - ms->section_mem_map = 0; - ms->usage = NULL; - } - - clear_hwpoisoned_pages(memmap + map_offset, nr_pages - map_offset); - free_section_usage(ms, memmap, usage, pfn, nr_pages, altmap); + clear_hwpoisoned_pages(pfn_to_page(pfn) + map_offset, + nr_pages - map_offset); + section_deactivate(pfn, nr_pages, altmap); } #endif /* CONFIG_MEMORY_HOTPLUG */ From patchwork Wed Jun 19 05:52:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 11003491 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 333BD76 for ; Wed, 19 Jun 2019 06:06:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2042A28B45 for ; Wed, 19 Jun 2019 06:06:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 14E7528B4B; Wed, 19 Jun 2019 06:06:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=unavailable version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 957D528B45 for ; Wed, 19 Jun 2019 06:06:50 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 92CBB8E0001; Wed, 19 Jun 2019 02:06:49 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 8DE4E6B0269; Wed, 19 Jun 2019 02:06:49 -0400 (EDT) X-Original-To: 
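For readers following the subsection bookkeeping in section_activate() / section_deactivate() above, the standalone model below shows how a pfn range is translated into bits of a per-section subsection_map, mirroring the subsection_mask_set() calls in the patch. The constants assume x86_64 (128MB sections, 2MB subsections, 4KB pages) and the helper bodies are simplified stand-ins, not kernel code.

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT              12
#define SUBSECTION_SHIFT        21      /* 2MB subsections (assumed) */
#define SECTION_SIZE_BITS       27      /* 128MB sections, x86_64 (assumed) */
#define PFN_SUBSECTION_SHIFT    (SUBSECTION_SHIFT - PAGE_SHIFT)
#define SUBSECTIONS_PER_SECTION (1UL << (SECTION_SIZE_BITS - SUBSECTION_SHIFT))

/* Which subsection, within its section, holds this pfn? */
static unsigned int subsection_index(unsigned long pfn)
{
	return (pfn >> PFN_SUBSECTION_SHIFT) & (SUBSECTIONS_PER_SECTION - 1);
}

/* Model of subsection_mask_set(): one bit per 2MB subsection touched. */
static void subsection_mask_set(uint64_t *map, unsigned long pfn,
		unsigned long nr_pages)
{
	unsigned int idx = subsection_index(pfn);
	unsigned int end = subsection_index(pfn + nr_pages - 1);

	while (idx <= end)
		*map |= 1ULL << idx++;
}

int main(void)
{
	uint64_t map = 0;

	/* Activate 4MB (1024 pages) starting at pfn 0x200 (the 2MB mark). */
	subsection_mask_set(&map, 0x200, 1024);
	printf("subsection_map = %#llx\n", (unsigned long long)map);
	/* Prints 0x6: subsections 1 and 2 of section 0 are now active. */
	return 0;
}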
From patchwork Wed Jun 19 05:52:29 2019
Subject: [PATCH v10 10/13] mm: Document ZONE_DEVICE memory-model implications
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Jonathan Corbet, Mike Rapoport, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:52:29 -0700
Message-ID: <156092354985.979959.15763234410543451710.stgit@dwillia2-desk3.amr.corp.intel.com>

Explain the general mechanisms of 'ZONE_DEVICE' pages and list the
users of 'devm_memremap_pages()'.

Cc: Jonathan Corbet
Reported-by: Mike Rapoport
Signed-off-by: Dan Williams
Reviewed-by: Mike Rapoport
---
 Documentation/vm/memory-model.rst | 39 +++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/vm/memory-model.rst b/Documentation/vm/memory-model.rst
index 382f72ace1fc..e0af47e02e78 100644
--- a/Documentation/vm/memory-model.rst
+++ b/Documentation/vm/memory-model.rst
@@ -181,3 +181,42 @@ that is eventually passed to vmemmap_populate() through a long chain
 of function calls. The vmemmap_populate() implementation may use the
 `vmem_altmap` along with :c:func:`altmap_alloc_block_buf` helper to
 allocate memory map on the persistent memory device.
+
+ZONE_DEVICE
+===========
+The `ZONE_DEVICE` facility builds upon `SPARSEMEM_VMEMMAP` to offer
+`struct page` `mem_map` services for device driver identified physical
+address ranges. The "device" aspect of `ZONE_DEVICE` relates to the fact
+that the page objects for these address ranges are never marked online,
+and that a reference must be taken against the device, not just the
+page, to keep the memory pinned for active use. `ZONE_DEVICE`, via
+:c:func:`devm_memremap_pages`, performs just enough memory hotplug to
+turn on :c:func:`pfn_to_page`, :c:func:`page_to_pfn`, and
+:c:func:`get_user_pages` service for the given range of pfns. Since the
Since the +page reference count never drops below 1 the page is never tracked as +free memory and the page's `struct list_head lru` space is repurposed +for back referencing to the host device / driver that mapped the memory. + +While `SPARSEMEM` presents memory as a collection of sections, +optionally collected into memory blocks, `ZONE_DEVICE` users have a need +for smaller granularity of populating the `mem_map`. Given that +`ZONE_DEVICE` memory is never marked online it is subsequently never +subject to its memory ranges being exposed through the sysfs memory +hotplug api on memory block boundaries. The implementation relies on +this lack of user-api constraint to allow sub-section sized memory +ranges to be specified to :c:func:`arch_add_memory`, the top-half of +memory hotplug. Sub-section support allows for `PMD_SIZE` as the minimum +alignment granularity for :c:func:`devm_memremap_pages`. + +The users of `ZONE_DEVICE` are: +* pmem: Map platform persistent memory to be used as a direct-I/O target + via DAX mappings. + +* hmm: Extend `ZONE_DEVICE` with `->page_fault()` and `->page_free()` + event callbacks to allow a device-driver to coordinate memory management + events related to device-memory, typically GPU memory. See + Documentation/vm/hmm.rst. + +* p2pdma: Create `struct page` objects to allow peer devices in a + PCI/-E topology to coordinate direct-DMA operations between themselves, + i.e. bypass host memory. From patchwork Wed Jun 19 05:52:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 11003495 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AFB5076 for ; Wed, 19 Jun 2019 06:06:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9EBD128B45 for ; Wed, 19 Jun 2019 06:06:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9296628B4B; Wed, 19 Jun 2019 06:06:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C922A28B45 for ; Wed, 19 Jun 2019 06:06:55 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 891068E0002; Wed, 19 Jun 2019 02:06:54 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 81D046B0269; Wed, 19 Jun 2019 02:06:54 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 70BD58E0002; Wed, 19 Jun 2019 02:06:54 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pl1-f197.google.com (mail-pl1-f197.google.com [209.85.214.197]) by kanga.kvack.org (Postfix) with ESMTP id 383EA6B0266 for ; Wed, 19 Jun 2019 02:06:54 -0400 (EDT) Received: by mail-pl1-f197.google.com with SMTP id 59so9219191plb.14 for ; Tue, 18 Jun 2019 23:06:54 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:subject:from 
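To make the devm_memremap_pages() flow documented above concrete, here is a minimal, hypothetical caller sketch. It is written under assumptions, not taken from a real driver: the probe wiring, the address range, and the percpu_ref lifetime handling are invented, only the struct dev_pagemap fields this series exercises (res, ref, cleanup, altmap_valid) are shown, and any other hooks a given kernel requires (e.g. a kill callback) are elided.

/* Hypothetical sketch of a devm_memremap_pages() caller; not a real driver. */
static void example_pgmap_cleanup(struct percpu_ref *ref)
{
	/* Placeholder: wait for the last page reference to drop. */
}

static int example_map_device_memory(struct device *dev,
		struct percpu_ref *ref, phys_addr_t base, size_t size)
{
	struct dev_pagemap *pgmap;
	void *addr;

	pgmap = devm_kzalloc(dev, sizeof(*pgmap), GFP_KERNEL);
	if (!pgmap)
		return -ENOMEM;

	/* With sub-section support, PMD_SIZE (2MB) alignment suffices. */
	pgmap->res.start = base;
	pgmap->res.end = base + size - 1;
	pgmap->res.flags = IORESOURCE_MEM;
	pgmap->ref = ref;			/* pins pages in active use */
	pgmap->cleanup = example_pgmap_cleanup;
	pgmap->altmap_valid = false;		/* memmap from regular RAM */

	addr = devm_memremap_pages(dev, pgmap);
	if (IS_ERR(addr))
		return PTR_ERR(addr);

	/* pfn_to_page()/get_user_pages() now work for this range. */
	return 0;
}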
From patchwork Wed Jun 19 05:52:35 2019
Subject: [PATCH v10 11/13] mm/devm_memremap_pages: Enable sub-section remap
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Toshi Kani, Jérôme Glisse, Logan Gunthorpe, Oscar Salvador, Pavel Tatashin, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:52:35 -0700
Message-ID: <156092355542.979959.10060071713397030576.stgit@dwillia2-desk3.amr.corp.intel.com>

Teach devm_memremap_pages() about the new sub-section capabilities of
arch_{add,remove}_memory(). Effectively, just replace all usage of
align_start, align_end, and align_size with res->start, res->end, and
resource_size(res). The existing sanity check will still make sure that
the two separate remap attempts do not collide within a sub-section
(2MB on x86).
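The effect of dropping the align_* arithmetic can be seen in a standalone model: under the old scheme an arbitrary range was expanded outward to 128MB section boundaries before hotplug, while the rewritten code below passes res->start and resource_size(res) through unchanged. The constants and the sample range are x86_64 assumptions chosen for illustration.

#include <stdio.h>

#define PA_SECTION_SIZE	(1UL << 27)	/* 128MB sections (assumed) */
#define ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	unsigned long start = 0x1f0200000UL;	/* 2MB-aligned, not 128MB-aligned */
	unsigned long size  = 0x00400000UL;	/* 4MB */

	/* Old scheme: pad out to section boundaries before hotplug. */
	unsigned long align_start = start & ~(PA_SECTION_SIZE - 1);
	unsigned long align_size  = ALIGN(start + size, PA_SECTION_SIZE) - align_start;

	printf("requested:        %#lx + %#lx\n", start, size);
	printf("old hotplug span: %#lx + %#lx\n", align_start, align_size);
	/* New scheme: hotplug exactly [start, start + size), 2MB granularity. */
	return 0;
}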
Cc: Michal Hocko
Cc: Toshi Kani
Cc: Jérôme Glisse
Cc: Logan Gunthorpe
Cc: Oscar Salvador
Cc: Pavel Tatashin
Signed-off-by: Dan Williams
---
 kernel/memremap.c | 61 +++++++++++++++++++++--------------------------
 1 file changed, 24 insertions(+), 37 deletions(-)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index 57980ed4e571..a0e5f6b91b04 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -58,7 +58,7 @@ static unsigned long pfn_first(struct dev_pagemap *pgmap)
 	struct vmem_altmap *altmap = &pgmap->altmap;
 	unsigned long pfn;
 
-	pfn = res->start >> PAGE_SHIFT;
+	pfn = PHYS_PFN(res->start);
 	if (pgmap->altmap_valid)
 		pfn += vmem_altmap_offset(altmap);
 	return pfn;
@@ -86,7 +86,6 @@ static void devm_memremap_pages_release(void *data)
 	struct dev_pagemap *pgmap = data;
 	struct device *dev = pgmap->dev;
 	struct resource *res = &pgmap->res;
-	resource_size_t align_start, align_size;
 	unsigned long pfn;
 	int nid;
 
@@ -96,25 +95,21 @@ static void devm_memremap_pages_release(void *data)
 	pgmap->cleanup(pgmap->ref);
 
 	/* pages are dead and unused, undo the arch mapping */
-	align_start = res->start & ~(PA_SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
-		- align_start;
-
-	nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT));
+	nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start)));
 
 	mem_hotplug_begin();
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
-		pfn = align_start >> PAGE_SHIFT;
+		pfn = PHYS_PFN(res->start);
 		__remove_pages(page_zone(pfn_to_page(pfn)), pfn,
-				align_size >> PAGE_SHIFT, NULL);
+				PHYS_PFN(resource_size(res)), NULL);
 	} else {
-		arch_remove_memory(nid, align_start, align_size,
+		arch_remove_memory(nid, res->start, resource_size(res),
 				pgmap->altmap_valid ? &pgmap->altmap : NULL);
-		kasan_remove_zero_shadow(__va(align_start), align_size);
+		kasan_remove_zero_shadow(__va(res->start), resource_size(res));
 	}
 	mem_hotplug_done();
 
-	untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
+	untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res));
 	pgmap_array_delete(res);
 	dev_WARN_ONCE(dev, pgmap->altmap.alloc,
 		      "%s: failed to free all reserved pages\n", __func__);
@@ -141,16 +136,13 @@ static void devm_memremap_pages_release(void *data)
  */
 void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 {
-	resource_size_t align_start, align_size, align_end;
-	struct vmem_altmap *altmap = pgmap->altmap_valid ?
-			&pgmap->altmap : NULL;
 	struct resource *res = &pgmap->res;
 	struct dev_pagemap *conflict_pgmap;
 	struct mhp_restrictions restrictions = {
 		/*
 		 * We do not want any optional features only our own memmap
 		 */
-		.altmap = altmap,
+		.altmap = pgmap->altmap_valid ?
			&pgmap->altmap : NULL,
 	};
 	pgprot_t pgprot = PAGE_KERNEL;
 	int error, nid, is_ram;
@@ -160,12 +152,7 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 		return ERR_PTR(-EINVAL);
 	}
 
-	align_start = res->start & ~(PA_SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
-		- align_start;
-	align_end = align_start + align_size - 1;
-
-	conflict_pgmap = get_dev_pagemap(PHYS_PFN(align_start), NULL);
+	conflict_pgmap = get_dev_pagemap(PHYS_PFN(res->start), NULL);
 	if (conflict_pgmap) {
 		dev_WARN(dev, "Conflicting mapping in same section\n");
 		put_dev_pagemap(conflict_pgmap);
@@ -173,7 +160,7 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 		goto err_array;
 	}
 
-	conflict_pgmap = get_dev_pagemap(PHYS_PFN(align_end), NULL);
+	conflict_pgmap = get_dev_pagemap(PHYS_PFN(res->end), NULL);
 	if (conflict_pgmap) {
 		dev_WARN(dev, "Conflicting mapping in same section\n");
 		put_dev_pagemap(conflict_pgmap);
@@ -181,7 +168,7 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 		goto err_array;
 	}
 
-	is_ram = region_intersects(align_start, align_size,
+	is_ram = region_intersects(res->start, resource_size(res),
 		IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE);
 
 	if (is_ram != REGION_DISJOINT) {
@@ -202,8 +189,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	if (nid < 0)
 		nid = numa_mem_id();
 
-	error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(align_start), 0,
-			align_size);
+	error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
+			resource_size(res));
 	if (error)
 		goto err_pfn_remap;
 
@@ -221,25 +208,25 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	 * arch_add_memory().
 	 */
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
-		error = add_pages(nid, align_start >> PAGE_SHIFT,
-				align_size >> PAGE_SHIFT, &restrictions);
+		error = add_pages(nid, PHYS_PFN(res->start),
+				PHYS_PFN(resource_size(res)), &restrictions);
 	} else {
-		error = kasan_add_zero_shadow(__va(align_start), align_size);
+		error = kasan_add_zero_shadow(__va(res->start), resource_size(res));
 		if (error) {
 			mem_hotplug_done();
 			goto err_kasan;
 		}
 
-		error = arch_add_memory(nid, align_start, align_size,
-				&restrictions);
+		error = arch_add_memory(nid, res->start, resource_size(res),
+				&restrictions);
 	}
 
 	if (!error) {
 		struct zone *zone;
 
 		zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE];
-		move_pfn_range_to_zone(zone, align_start >> PAGE_SHIFT,
-				align_size >> PAGE_SHIFT, altmap);
+		move_pfn_range_to_zone(zone, PHYS_PFN(res->start),
+				PHYS_PFN(resource_size(res)), restrictions.altmap);
 	}
 
 	mem_hotplug_done();
 
@@ -251,8 +238,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	 * to allow us to do the work while not holding the hotplug lock.
 	 */
 	memmap_init_zone_device(&NODE_DATA(nid)->node_zones[ZONE_DEVICE],
-				align_start >> PAGE_SHIFT,
-				align_size >> PAGE_SHIFT, pgmap);
+				PHYS_PFN(res->start),
+				PHYS_PFN(resource_size(res)), pgmap);
 	percpu_ref_get_many(pgmap->ref, pfn_end(pgmap) - pfn_first(pgmap));
 
 	error = devm_add_action_or_reset(dev, devm_memremap_pages_release,
@@ -263,9 +250,9 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	return __va(res->start);
 
  err_add_memory:
-	kasan_remove_zero_shadow(__va(align_start), align_size);
+	kasan_remove_zero_shadow(__va(res->start), resource_size(res));
 err_kasan:
-	untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
+	untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res));
 err_pfn_remap:
	pgmap_array_delete(res);
 err_array:
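Two helpers carry most of the rewrite above, so as a reminder of their semantics here is a standalone model. PHYS_PFN() and resource_size() are re-derived from their kernel definitions (a pfn is a physical address shifted by PAGE_SHIFT; a resource's end is inclusive); the sample range is arbitrary.

#include <stdio.h>

#define PAGE_SHIFT	12
#define PHYS_PFN(x)	((unsigned long)((x) >> PAGE_SHIFT))

struct resource { unsigned long start, end; };

static unsigned long resource_size(const struct resource *res)
{
	return res->end - res->start + 1;	/* end is inclusive */
}

int main(void)
{
	struct resource res = { .start = 0x1f0200000UL, .end = 0x1f05fffffUL };

	printf("first pfn %#lx, nr pages %lu\n",
			PHYS_PFN(res.start), PHYS_PFN(resource_size(&res)));
	return 0;
}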
From patchwork Wed Jun 19 05:52:40 2019
Subject: [PATCH v10 12/13] libnvdimm/pfn: Fix fsdax-mode namespace info-block zero-fields
From: Dan Williams
To: akpm@linux-foundation.org
Cc: stable@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:52:40 -0700
Message-ID: <156092356065.979959.6681003754765958296.stgit@dwillia2-desk3.amr.corp.intel.com>

At namespace creation time there is the potential for the "expected to
be zero" fields of a 'pfn' info-block to be filled with indeterminate
data. While the kernel buffer is zeroed on allocation, it is
immediately overwritten by nd_pfn_validate() filling it with the
current contents of the on-media info-block location. For fields like
'flags' and the 'padding' it potentially means that future
implementations can not rely on those fields being zero.

In preparation for stopping use of the 'start_pad' and 'end_trunc'
fields for section alignment, arrange for fields that are not
explicitly initialized to be guaranteed zero. Bump the minor version to
indicate it is safe to assume the 'padding' and 'flags' are zero.
Otherwise, this corruption is expected to be benign since all other
critical fields are explicitly initialized.

Note: the cc: stable is about spreading this new policy to as many
kernels as possible, not about fixing an issue in those kernels. It is
not until the change titled "libnvdimm/pfn: Stop padding pmem
namespaces to section alignment" that this improper initialization
becomes a problem. So if someone decides to backport "libnvdimm/pfn:
Stop padding pmem namespaces to section alignment" (which is not tagged
for stable), make sure this pre-requisite is flagged.
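The initialization hazard the patch closes can be modeled in userspace: a buffer zeroed at allocation time is later overwritten by an on-media read during validation, so any "expected to be zero" field must be re-zeroed before a fresh info-block is written. The struct layout and the validate() stub below are illustrative stand-ins, not the nvdimm code.

#include <string.h>
#include <stdio.h>

struct info_block {
	unsigned int flags;	/* must be zero per minor-version 3 */
	unsigned int align;
	char padding[24];	/* stands in for the 4000-byte pad */
};

/* Model of nd_pfn_validate(): fills the buffer from "media". */
static int validate(struct info_block *ib, const struct info_block *media)
{
	memcpy(ib, media, sizeof(*ib));
	return -1;		/* say: no valid info-block found */
}

int main(void)
{
	struct info_block media = { .flags = 0xdead };	/* stale media bytes */
	struct info_block ib = { 0 };			/* kzalloc-style */

	if (validate(&ib, &media) < 0) {
		/* The fix: re-zero before init, not only at allocation. */
		memset(&ib, 0, sizeof(ib));
		ib.align = 1U << 21;
	}
	printf("flags = %#x (guaranteed zero)\n", ib.flags);
	return 0;
}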
Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams
---
 drivers/nvdimm/dax_devs.c |  2 +-
 drivers/nvdimm/pfn.h      |  1 +
 drivers/nvdimm/pfn_devs.c | 18 +++++++++++++++---
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/dax_devs.c b/drivers/nvdimm/dax_devs.c
index 49fc18ee0565..6d22b0f83b3b 100644
--- a/drivers/nvdimm/dax_devs.c
+++ b/drivers/nvdimm/dax_devs.c
@@ -118,7 +118,7 @@ int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns)
 	nvdimm_bus_unlock(&ndns->dev);
 	if (!dax_dev)
 		return -ENOMEM;
-	pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
+	pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
 	nd_pfn->pfn_sb = pfn_sb;
 	rc = nd_pfn_validate(nd_pfn, DAX_SIG);
 	dev_dbg(dev, "dax: %s\n", rc == 0 ? dev_name(dax_dev) : "<none>");

diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h
index f58b849e455b..dfb2bcda8f5a 100644
--- a/drivers/nvdimm/pfn.h
+++ b/drivers/nvdimm/pfn.h
@@ -28,6 +28,7 @@ struct nd_pfn_sb {
 	__le32 end_trunc;
 	/* minor-version-2 record the base alignment of the mapping */
 	__le32 align;
+	/* minor-version-3 guarantee the padding and flags are zero */
 	u8 padding[4000];
 	__le64 checksum;
 };

diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 0f81fc56bbfd..4977424693b0 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -412,6 +412,15 @@ static int nd_pfn_clear_memmap_errors(struct nd_pfn *nd_pfn)
 	return 0;
 }
 
+/**
+ * nd_pfn_validate - read and validate info-block
+ * @nd_pfn: fsdax namespace runtime state / properties
+ * @sig: 'devdax' or 'fsdax' signature
+ *
+ * Upon return the info-block buffer contents (->pfn_sb) are
+ * indeterminate when validation fails, and a coherent info-block
+ * otherwise.
+ */
 int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
 {
 	u64 checksum, offset;
@@ -557,7 +566,7 @@ int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns)
 	nvdimm_bus_unlock(&ndns->dev);
 	if (!pfn_dev)
 		return -ENOMEM;
-	pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
+	pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
 	nd_pfn = to_nd_pfn(pfn_dev);
 	nd_pfn->pfn_sb = pfn_sb;
 	rc = nd_pfn_validate(nd_pfn, PFN_SIG);
@@ -694,7 +703,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 	u64 checksum;
 	int rc;
 
-	pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
+	pfn_sb = devm_kmalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
 	if (!pfn_sb)
 		return -ENOMEM;
 
@@ -703,11 +712,14 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 		sig = DAX_SIG;
 	else
 		sig = PFN_SIG;
+
 	rc = nd_pfn_validate(nd_pfn, sig);
 	if (rc != -ENODEV)
 		return rc;
 
 	/* no info block, do init */;
+	memset(pfn_sb, 0, sizeof(*pfn_sb));
+
 	nd_region = to_nd_region(nd_pfn->dev.parent);
 	if (nd_region->ro) {
 		dev_info(&nd_pfn->dev,
@@ -760,7 +772,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 	memcpy(pfn_sb->uuid, nd_pfn->uuid, 16);
 	memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16);
 	pfn_sb->version_major = cpu_to_le16(1);
-	pfn_sb->version_minor = cpu_to_le16(2);
+	pfn_sb->version_minor = cpu_to_le16(3);
 	pfn_sb->start_pad = cpu_to_le32(start_pad);
 	pfn_sb->end_trunc = cpu_to_le32(end_trunc);
 	pfn_sb->align = cpu_to_le32(nd_pfn->align);
From patchwork Wed Jun 19 05:52:45 2019
Subject: [PATCH v10 13/13] libnvdimm/pfn: Stop padding pmem namespaces to section alignment
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Jeff Moyer, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org
Date: Tue, 18 Jun 2019 22:52:45 -0700
Message-ID: <156092356588.979959.6793371748950931916.stgit@dwillia2-desk3.amr.corp.intel.com>

Now that the mm core supports section-unaligned hotplug of ZONE_DEVICE
memory, we no longer need to add padding at pfn/dax device creation
time. The kernel will still honor padding established by older kernels.
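The SUBSECTION_ALIGN_{UP,DOWN}() helpers this patch leans on (added to include/linux/mmzone.h in the diff that follows) reduce the alignment quantum from a 128MB section to a 2MB subsection. A standalone model, with x86_64 constants (2MB subsections, 4KB pages) assumed for illustration:

#include <stdio.h>

#define PAGE_SHIFT		12
#define SUBSECTION_SHIFT	21
#define PFN_SUBSECTION_SHIFT	(SUBSECTION_SHIFT - PAGE_SHIFT)
#define PAGES_PER_SUBSECTION	(1UL << PFN_SUBSECTION_SHIFT)
#define PAGE_SUBSECTION_MASK	(~(PAGES_PER_SUBSECTION - 1))
#define ALIGN(x, a)		(((x) + (a) - 1) & ~((a) - 1))
#define SUBSECTION_ALIGN_UP(pfn)	ALIGN((pfn), PAGES_PER_SUBSECTION)
#define SUBSECTION_ALIGN_DOWN(pfn)	((pfn) & PAGE_SUBSECTION_MASK)

int main(void)
{
	unsigned long pfn = 0x1f0233UL;	/* arbitrary, unaligned pfn */

	/* 512-page (2MB) quantum instead of 32768-page (128MB). */
	printf("down: %#lx  up: %#lx\n",
			SUBSECTION_ALIGN_DOWN(pfn), SUBSECTION_ALIGN_UP(pfn));
	return 0;
}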
Reported-by: Jeff Moyer
Signed-off-by: Dan Williams
---
 drivers/nvdimm/pfn.h      | 14 --------
 drivers/nvdimm/pfn_devs.c | 77 ++++++++-------------------------------------
 include/linux/mmzone.h    |  3 ++
 3 files changed, 16 insertions(+), 78 deletions(-)

diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h
index dfb2bcda8f5a..7381673b7b70 100644
--- a/drivers/nvdimm/pfn.h
+++ b/drivers/nvdimm/pfn.h
@@ -33,18 +33,4 @@ struct nd_pfn_sb {
 	__le64 checksum;
 };
 
-#ifdef CONFIG_SPARSEMEM
-#define PFN_SECTION_ALIGN_DOWN(x) SECTION_ALIGN_DOWN(x)
-#define PFN_SECTION_ALIGN_UP(x) SECTION_ALIGN_UP(x)
-#else
-/*
- * In this case ZONE_DEVICE=n and we will disable 'pfn' device support,
- * but we still want pmem to compile.
- */
-#define PFN_SECTION_ALIGN_DOWN(x) (x)
-#define PFN_SECTION_ALIGN_UP(x) (x)
-#endif
-
-#define PHYS_SECTION_ALIGN_DOWN(x) PFN_PHYS(PFN_SECTION_ALIGN_DOWN(PHYS_PFN(x)))
-#define PHYS_SECTION_ALIGN_UP(x) PFN_PHYS(PFN_SECTION_ALIGN_UP(PHYS_PFN(x)))
 #endif /* __NVDIMM_PFN_H */

diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 4977424693b0..2537aa338bd0 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -587,14 +587,14 @@ static u32 info_block_reserve(void)
 }
 
 /*
- * We hotplug memory at section granularity, pad the reserved area from
- * the previous section base to the namespace base address.
+ * We hotplug memory at sub-section granularity, pad the reserved area
+ * from the previous section base to the namespace base address.
  */
 static unsigned long init_altmap_base(resource_size_t base)
 {
 	unsigned long base_pfn = PHYS_PFN(base);
 
-	return PFN_SECTION_ALIGN_DOWN(base_pfn);
+	return SUBSECTION_ALIGN_DOWN(base_pfn);
 }
 
 static unsigned long init_altmap_reserve(resource_size_t base)
@@ -602,7 +602,7 @@ static unsigned long init_altmap_reserve(resource_size_t base)
 	unsigned long reserve = info_block_reserve() >> PAGE_SHIFT;
 	unsigned long base_pfn = PHYS_PFN(base);
 
-	reserve += base_pfn - PFN_SECTION_ALIGN_DOWN(base_pfn);
+	reserve += base_pfn - SUBSECTION_ALIGN_DOWN(base_pfn);
 	return reserve;
 }
 
@@ -633,8 +633,7 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap)
 		nd_pfn->npfns = le64_to_cpu(pfn_sb->npfns);
 		pgmap->altmap_valid = false;
 	} else if (nd_pfn->mode == PFN_MODE_PMEM) {
-		nd_pfn->npfns = PFN_SECTION_ALIGN_UP((resource_size(res)
-					- offset) / PAGE_SIZE);
+		nd_pfn->npfns = PHYS_PFN((resource_size(res) - offset));
 		if (le64_to_cpu(nd_pfn->pfn_sb->npfns) > nd_pfn->npfns)
 			dev_info(&nd_pfn->dev,
 					"number of pfns truncated from %lld to %ld\n",
@@ -650,54 +649,14 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap)
 	return 0;
 }
 
-static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys)
-{
-	return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys),
-			ALIGN_DOWN(phys, nd_pfn->align));
-}
-
-/*
- * Check if pmem collides with 'System RAM', or other regions when
- * section aligned. Trim it accordingly.
- */
-static void trim_pfn_device(struct nd_pfn *nd_pfn, u32 *start_pad, u32 *end_trunc)
-{
-	struct nd_namespace_common *ndns = nd_pfn->ndns;
-	struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
-	struct nd_region *nd_region = to_nd_region(nd_pfn->dev.parent);
-	const resource_size_t start = nsio->res.start;
-	const resource_size_t end = start + resource_size(&nsio->res);
-	resource_size_t adjust, size;
-
-	*start_pad = 0;
-	*end_trunc = 0;
-
-	adjust = start - PHYS_SECTION_ALIGN_DOWN(start);
-	size = resource_size(&nsio->res) + adjust;
-	if (region_intersects(start - adjust, size, IORESOURCE_SYSTEM_RAM,
-				IORES_DESC_NONE) == REGION_MIXED
-			|| nd_region_conflict(nd_region, start - adjust, size))
-		*start_pad = PHYS_SECTION_ALIGN_UP(start) - start;
-
-	/* Now check that end of the range does not collide. */
-	adjust = PHYS_SECTION_ALIGN_UP(end) - end;
-	size = resource_size(&nsio->res) + adjust;
-	if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
-				IORES_DESC_NONE) == REGION_MIXED
-			|| !IS_ALIGNED(end, nd_pfn->align)
-			|| nd_region_conflict(nd_region, start, size))
-		*end_trunc = end - phys_pmem_align_down(nd_pfn, end);
-}
-
 static int nd_pfn_init(struct nd_pfn *nd_pfn)
 {
 	struct nd_namespace_common *ndns = nd_pfn->ndns;
 	struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
-	u32 start_pad, end_trunc, reserve = info_block_reserve();
 	resource_size_t start, size;
 	struct nd_region *nd_region;
+	unsigned long npfns, align;
 	struct nd_pfn_sb *pfn_sb;
-	unsigned long npfns;
 	phys_addr_t offset;
 	const char *sig;
 	u64 checksum;
@@ -728,43 +687,35 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 		return -ENXIO;
 	}
 
-	memset(pfn_sb, 0, sizeof(*pfn_sb));
-
-	trim_pfn_device(nd_pfn, &start_pad, &end_trunc);
-	if (start_pad + end_trunc)
-		dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n",
-				dev_name(&ndns->dev), start_pad + end_trunc);
-
 	/*
 	 * Note, we use 64 here for the standard size of struct page,
 	 * debugging options may cause it to be larger in which case the
 	 * implementation will limit the pfns advertised through
 	 * ->direct_access() to those that are included in the memmap.
 	 */
-	start = nsio->res.start + start_pad;
+	start = nsio->res.start;
 	size = resource_size(&nsio->res);
-	npfns = PFN_SECTION_ALIGN_UP((size - start_pad - end_trunc - reserve)
-			/ PAGE_SIZE);
+	npfns = PHYS_PFN(size - SZ_8K);
+	align = max(nd_pfn->align, (1UL << SUBSECTION_SHIFT));
 	if (nd_pfn->mode == PFN_MODE_PMEM) {
 		/*
 		 * The altmap should be padded out to the block size used
 		 * when populating the vmemmap. This *should* be equal to
 		 * PMD_SIZE for most architectures.
 		 */
-		offset = ALIGN(start + reserve + 64 * npfns,
-				max(nd_pfn->align, PMD_SIZE)) - start;
+		offset = ALIGN(start + SZ_8K + 64 * npfns, align) - start;
 	} else if (nd_pfn->mode == PFN_MODE_RAM)
-		offset = ALIGN(start + reserve, nd_pfn->align) - start;
+		offset = ALIGN(start + SZ_8K, align) - start;
 	else
 		return -ENXIO;
 
-	if (offset + start_pad + end_trunc >= size) {
+	if (offset >= size) {
 		dev_err(&nd_pfn->dev,
 				"%s unable to satisfy requested alignment\n",
				dev_name(&ndns->dev));
 		return -ENXIO;
 	}
 
-	npfns = (size - offset - start_pad - end_trunc) / SZ_4K;
+	npfns = PHYS_PFN(size - offset);
 	pfn_sb->mode = cpu_to_le32(nd_pfn->mode);
 	pfn_sb->dataoff = cpu_to_le64(offset);
 	pfn_sb->npfns = cpu_to_le64(npfns);
@@ -773,8 +724,6 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
 	memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16);
 	pfn_sb->version_major = cpu_to_le16(1);
 	pfn_sb->version_minor = cpu_to_le16(3);
-	pfn_sb->start_pad = cpu_to_le32(start_pad);
-	pfn_sb->end_trunc = cpu_to_le32(end_trunc);
 	pfn_sb->align = cpu_to_le32(nd_pfn->align);
 	checksum = nd_sb_checksum((struct nd_gen_sb *) pfn_sb);
 	pfn_sb->checksum = cpu_to_le64(checksum);

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e976faf57292..350a24e48a1b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1161,6 +1161,9 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
 #define SUBSECTIONS_PER_SECTION (1UL << (SECTION_SIZE_BITS - SUBSECTION_SHIFT))
 #endif
 
+#define SUBSECTION_ALIGN_UP(pfn) ALIGN((pfn), PAGES_PER_SUBSECTION)
+#define SUBSECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SUBSECTION_MASK)
+
 struct mem_section_usage {
 	DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION);
 	/* See declaration of similar field in struct zone */