From patchwork Mon May 6 23:39:31 2019

Subject: [PATCH v8 01/12] mm/sparsemem: Introduce struct mem_section_usage
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Oscar Salvador,
    Pavel Tatashin, Benjamin Herrenschmidt, Paul Mackerras,
    Michael Ellerman, linux-nvdimm@lists.01.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Mon, 06 May 2019 16:39:31 -0700
Message-ID: <155718597192.130019.7128788290111464258.stgit@dwillia2-desk3.amr.corp.intel.com>

Towards enabling memory hotplug to track partial population of a
section, introduce 'struct mem_section_usage'.

A pointer to a 'struct mem_section_usage' instance replaces the
existing pointer to a 'pageblock_flags' bitmap. Effectively it adds one
more 'unsigned long' beyond the 'pageblock_flags' (usemap) allocation
to house a new 'subsection_map' bitmap. The new bitmap enables the
memory hot{plug,remove} implementation to act on incremental
sub-divisions of a section.

The default SUBSECTION_SHIFT is chosen to keep the 'subsection_map' no
larger than a single 'unsigned long' on the major architectures.
Alternatively, an architecture can define ARCH_SUBSECTION_SHIFT to
override the PMD_SHIFT default. Note that PowerPC needs
ARCH_SUBSECTION_SHIFT to work around PMD_SHIFT being a non-constant
expression on that architecture.

The primary motivation for this functionality is to support platforms
that mix "System RAM" and "Persistent Memory" within a single section,
or multiple PMEM ranges with different mapping lifetimes within a
single section. The section restriction for hotplug has caused an
ongoing saga of hacks and bugs for devm_memremap_pages() users.

Beyond the fixups to teach existing paths how to retrieve the 'usemap'
from a section, and updates to the usemap allocation path, there are no
expected behavior changes.
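The sizing works out as follows on x86-64, where SECTION_SIZE_BITS is
27 and PMD_SHIFT is 21: a 128MB section carries 64 subsections of 2MB
each, so 'subsection_map' fits in one 'unsigned long'. A minimal
userspace sketch of that geometry (the values are illustrative, the pfn
is arbitrary, and the index calculation mirrors the
subsection_map_index() helper added later in this series):

#include <stdio.h>

#define PAGE_SHIFT		12
#define SECTION_SIZE_BITS	27	/* 128MB sections (x86-64) */
#define SUBSECTION_SHIFT	21	/* PMD_SHIFT: 2MB subsections */

#define PAGES_PER_SECTION	(1UL << (SECTION_SIZE_BITS - PAGE_SHIFT))
#define PAGE_SECTION_MASK	(~(PAGES_PER_SECTION - 1))
#define PAGES_PER_SUBSECTION	(1UL << (SUBSECTION_SHIFT - PAGE_SHIFT))
#define SUBSECTIONS_PER_SECTION	(1UL << (SECTION_SIZE_BITS - SUBSECTION_SHIFT))

int main(void)
{
	unsigned long pfn = 0x48200;	/* arbitrary example pfn */
	/* offset within the section, in units of subsections */
	int idx = (pfn & ~PAGE_SECTION_MASK) / PAGES_PER_SUBSECTION;

	printf("subsections per section: %lu\n", SUBSECTIONS_PER_SECTION);
	printf("pfn %#lx -> subsection bit %d\n", pfn, idx);
	return 0;
}

This prints 64 subsections per section, and bit 1 for the example pfn.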
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Cc: Oscar Salvador
Cc: Pavel Tatashin
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 arch/powerpc/include/asm/sparsemem.h |    3 +
 include/linux/mmzone.h               |   48 +++++++++++++++++++-
 mm/memory_hotplug.c                  |   18 ++++----
 mm/page_alloc.c                      |    2 -
 mm/sparse.c                          |   81 +++++++++++++++++-----------------
 5 files changed, 99 insertions(+), 53 deletions(-)

diff --git a/arch/powerpc/include/asm/sparsemem.h b/arch/powerpc/include/asm/sparsemem.h
index 3192d454a733..1aa3c9303bf8 100644
--- a/arch/powerpc/include/asm/sparsemem.h
+++ b/arch/powerpc/include/asm/sparsemem.h
@@ -10,6 +10,9 @@
  */
 #define SECTION_SIZE_BITS       24
 
+/* Reflect the largest possible PMD-size as the subsection-size constant */
+#define ARCH_SUBSECTION_SHIFT 24
+
 #endif /* CONFIG_SPARSEMEM */
 
 #ifdef CONFIG_MEMORY_HOTPLUG
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 70394cabaf4e..ef8d878079f9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1160,6 +1160,44 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
 #define SECTION_ALIGN_UP(pfn)	(((pfn) + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK)
 #define SECTION_ALIGN_DOWN(pfn)	((pfn) & PAGE_SECTION_MASK)
 
+/*
+ * SUBSECTION_SHIFT must be constant since it is used to declare
+ * subsection_map and related bitmaps without triggering the generation
+ * of variable-length arrays. The most natural size for a subsection is
+ * a PMD-page. For architectures that do not have a constant PMD-size
+ * ARCH_SUBSECTION_SHIFT can be set to a constant max size, or otherwise
+ * fallback to 2MB.
+ */
+#if defined(ARCH_SUBSECTION_SHIFT)
+#define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
+#elif defined(PMD_SHIFT)
+#define SUBSECTION_SHIFT (PMD_SHIFT)
+#else
+/*
+ * Memory hotplug enabled platforms avoid this default because they
+ * either define ARCH_SUBSECTION_SHIFT, or PMD_SHIFT is a constant, but
+ * this is kept as a backstop to allow compilation on
+ * !ARCH_ENABLE_MEMORY_HOTPLUG archs.
+ */
+#define SUBSECTION_SHIFT 21
+#endif
+
+#define PFN_SUBSECTION_SHIFT (SUBSECTION_SHIFT - PAGE_SHIFT)
+#define PAGES_PER_SUBSECTION (1UL << PFN_SUBSECTION_SHIFT)
+#define PAGE_SUBSECTION_MASK ((~(PAGES_PER_SUBSECTION-1)))
+
+#if SUBSECTION_SHIFT > SECTION_SIZE_BITS
+#error Subsection size exceeds section size
+#else
+#define SUBSECTIONS_PER_SECTION (1UL << (SECTION_SIZE_BITS - SUBSECTION_SHIFT))
+#endif
+
+struct mem_section_usage {
+	DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION);
+	/* See declaration of similar field in struct zone */
+	unsigned long pageblock_flags[0];
+};
+
 struct page;
 struct page_ext;
 struct mem_section {
@@ -1177,8 +1215,7 @@ struct mem_section {
 	 */
 	unsigned long section_mem_map;
 
-	/* See declaration of similar field in struct zone */
-	unsigned long *pageblock_flags;
+	struct mem_section_usage *usage;
 #ifdef CONFIG_PAGE_EXTENSION
 	/*
 	 * If SPARSEMEM, pgdat doesn't have page_ext pointer. We use
@@ -1209,6 +1246,11 @@ extern struct mem_section **mem_section;
 extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
 #endif
 
+static inline unsigned long *section_to_usemap(struct mem_section *ms)
+{
+	return ms->usage->pageblock_flags;
+}
+
 static inline struct mem_section *__nr_to_section(unsigned long nr)
 {
 #ifdef CONFIG_SPARSEMEM_EXTREME
@@ -1220,7 +1262,7 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
 	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
 }
 extern int __section_nr(struct mem_section* ms);
-extern unsigned long usemap_size(void);
+extern size_t mem_section_usage_size(void);
 
 /*
  * We use the lower bits of the mem_map pointer to store
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 328878b6799d..a76fc6a6e9fe 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -165,9 +165,10 @@ void put_page_bootmem(struct page *page)
 #ifndef CONFIG_SPARSEMEM_VMEMMAP
 static void register_page_bootmem_info_section(unsigned long start_pfn)
 {
-	unsigned long *usemap, mapsize, section_nr, i;
+	unsigned long mapsize, section_nr, i;
 	struct mem_section *ms;
 	struct page *page, *memmap;
+	struct mem_section_usage *usage;
 
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);
@@ -187,10 +188,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, SECTION_INFO);
 
-	usemap = ms->pageblock_flags;
-	page = virt_to_page(usemap);
+	usage = ms->usage;
+	page = virt_to_page(usage);
 
-	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+	mapsize = PAGE_ALIGN(mem_section_usage_size()) >> PAGE_SHIFT;
 
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
@@ -199,9 +200,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 #else /* CONFIG_SPARSEMEM_VMEMMAP */
 static void register_page_bootmem_info_section(unsigned long start_pfn)
 {
-	unsigned long *usemap, mapsize, section_nr, i;
+	unsigned long mapsize, section_nr, i;
 	struct mem_section *ms;
 	struct page *page, *memmap;
+	struct mem_section_usage *usage;
 
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);
@@ -210,10 +212,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 
 	register_page_bootmem_memmap(section_nr, memmap, PAGES_PER_SECTION);
 
-	usemap = ms->pageblock_flags;
-	page = virt_to_page(usemap);
+	usage = ms->usage;
+	page = virt_to_page(usage);
 
-	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+	mapsize = PAGE_ALIGN(mem_section_usage_size()) >> PAGE_SHIFT;
 
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1f99db76b1ff..61c2b54a5b61 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -403,7 +403,7 @@ static inline unsigned long *get_pageblock_bitmap(struct page *page,
 							unsigned long pfn)
 {
 #ifdef CONFIG_SPARSEMEM
-	return __pfn_to_section(pfn)->pageblock_flags;
+	return section_to_usemap(__pfn_to_section(pfn));
 #else
 	return page_zone(page)->pageblock_flags;
 #endif /* CONFIG_SPARSEMEM */
diff --git a/mm/sparse.c b/mm/sparse.c
index fd13166949b5..f87de7ad32c8 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -288,33 +288,31 @@ struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum)
 
 static void __meminit sparse_init_one_section(struct mem_section *ms,
 		unsigned long pnum, struct page *mem_map,
-		unsigned long *pageblock_bitmap)
+		struct mem_section_usage *usage)
 {
 	ms->section_mem_map &= ~SECTION_MAP_MASK;
 	ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum) |
 							SECTION_HAS_MEM_MAP;
-	ms->pageblock_flags = pageblock_bitmap;
+	ms->usage = usage;
 }
 
-unsigned long usemap_size(void)
+static unsigned long usemap_size(void)
 {
 	return BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS) * sizeof(unsigned long);
 }
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-static unsigned long *__kmalloc_section_usemap(void)
+size_t mem_section_usage_size(void)
 {
-	return kmalloc(usemap_size(), GFP_KERNEL);
+	return sizeof(struct mem_section_usage) + usemap_size();
 }
-#endif /* CONFIG_MEMORY_HOTPLUG */
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-static unsigned long * __init
+static struct mem_section_usage * __init
 sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 					 unsigned long size)
 {
+	struct mem_section_usage *usage;
 	unsigned long goal, limit;
-	unsigned long *p;
 	int nid;
 	/*
 	 * A page may contain usemaps for other sections preventing the
@@ -330,15 +328,16 @@ sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 	limit = goal + (1UL << PA_SECTION_SHIFT);
 	nid = early_pfn_to_nid(goal >> PAGE_SHIFT);
again:
-	p = memblock_alloc_try_nid(size, SMP_CACHE_BYTES, goal, limit, nid);
-	if (!p && limit) {
+	usage = memblock_alloc_try_nid(size, SMP_CACHE_BYTES, goal, limit, nid);
+	if (!usage && limit) {
 		limit = 0;
 		goto again;
 	}
-	return p;
+	return usage;
 }
 
-static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
+static void __init check_usemap_section_nr(int nid,
+		struct mem_section_usage *usage)
 {
 	unsigned long usemap_snr, pgdat_snr;
 	static unsigned long old_usemap_snr;
@@ -352,7 +351,7 @@ static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
 		old_pgdat_snr = NR_MEM_SECTIONS;
 	}
 
-	usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
+	usemap_snr = pfn_to_section_nr(__pa(usage) >> PAGE_SHIFT);
 	pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
 	if (usemap_snr == pgdat_snr)
 		return;
@@ -380,14 +379,15 @@ static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
 		usemap_snr, pgdat_snr, nid);
 }
 #else
-static unsigned long * __init
+static struct mem_section_usage * __init
 sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 					 unsigned long size)
 {
 	return memblock_alloc_node(size, SMP_CACHE_BYTES, pgdat->node_id);
 }
 
-static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
+static void __init check_usemap_section_nr(int nid,
+		struct mem_section_usage *usage)
 {
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
@@ -474,14 +474,13 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 				   unsigned long pnum_end,
 				   unsigned long map_count)
 {
-	unsigned long pnum, usemap_longs, *usemap;
+	struct mem_section_usage *usage;
+	unsigned long pnum;
 	struct page *map;
 
-	usemap_longs = BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS);
-	usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
-							  usemap_size() *
-							  map_count);
-	if (!usemap) {
+	usage = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
+			mem_section_usage_size() * map_count);
+	if (!usage) {
 		pr_err("%s: node[%d] usemap allocation failed", __func__, nid);
 		goto failed;
 	}
@@ -497,9 +496,9 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 			pnum_begin = pnum;
 			goto failed;
 		}
-		check_usemap_section_nr(nid, usemap);
-		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usemap);
-		usemap += usemap_longs;
+		check_usemap_section_nr(nid, usage);
+		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage);
+		usage = (void *) usage + mem_section_usage_size();
 	}
 	sparse_buffer_fini();
 	return;
@@ -701,9 +700,9 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 				     struct vmem_altmap *altmap)
 {
 	unsigned long section_nr = pfn_to_section_nr(start_pfn);
+	struct mem_section_usage *usage;
 	struct mem_section *ms;
 	struct page *memmap;
-	unsigned long *usemap;
 	int ret;
 
 	/*
@@ -717,8 +716,8 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	memmap = kmalloc_section_memmap(section_nr, nid, altmap);
 	if (!memmap)
 		return -ENOMEM;
-	usemap = __kmalloc_section_usemap();
-	if (!usemap) {
+	usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
+	if (!usage) {
 		__kfree_section_memmap(memmap, altmap);
 		return -ENOMEM;
 	}
@@ -736,11 +735,11 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION);
 	section_mark_present(ms);
-	sparse_init_one_section(ms, section_nr, memmap, usemap);
+	sparse_init_one_section(ms, section_nr, memmap, usage);
 
 out:
 	if (ret < 0) {
-		kfree(usemap);
+		kfree(usage);
 		__kfree_section_memmap(memmap, altmap);
 	}
 	return ret;
@@ -777,20 +776,20 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 }
 #endif
 
-static void free_section_usemap(struct page *memmap, unsigned long *usemap,
-		struct vmem_altmap *altmap)
+static void free_section_usage(struct page *memmap,
+		struct mem_section_usage *usage, struct vmem_altmap *altmap)
 {
-	struct page *usemap_page;
+	struct page *usage_page;
 
-	if (!usemap)
+	if (!usage)
 		return;
 
-	usemap_page = virt_to_page(usemap);
+	usage_page = virt_to_page(usage);
 	/*
 	 * Check to see if allocation came from hot-plug-add
 	 */
-	if (PageSlab(usemap_page) || PageCompound(usemap_page)) {
-		kfree(usemap);
+	if (PageSlab(usage_page) || PageCompound(usage_page)) {
+		kfree(usage);
 		if (memmap)
 			__kfree_section_memmap(memmap, altmap);
 		return;
@@ -809,19 +808,19 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
 		unsigned long map_offset, struct vmem_altmap *altmap)
 {
 	struct page *memmap = NULL;
-	unsigned long *usemap = NULL;
+	struct mem_section_usage *usage = NULL;
 
 	if (ms->section_mem_map) {
-		usemap = ms->pageblock_flags;
+		usage = ms->usage;
 		memmap = sparse_decode_mem_map(ms->section_mem_map,
 						__section_nr(ms));
 		ms->section_mem_map = 0;
-		ms->pageblock_flags = NULL;
+		ms->usage = NULL;
 	}
 
 	clear_hwpoisoned_pages(memmap + map_offset,
 			PAGES_PER_SECTION - map_offset);
-	free_section_usemap(memmap, usemap, altmap);
+	free_section_usage(memmap, usage, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 #endif /* CONFIG_MEMORY_HOTPLUG */

From patchwork Mon May 6 23:39:37 2019
Subject: [PATCH v8 02/12] mm/memremap: Rename and consolidate SECTION_SIZE
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, David Hildenbrand, Oscar Salvador, Pavel Tatashin,
    Robin Murphy, Anshuman Khandual, linux-nvdimm@lists.01.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de,
    mhocko@suse.com
Date: Mon, 06 May 2019 16:39:37 -0700
Message-ID: <155718597703.130019.5955560833756434949.stgit@dwillia2-desk3.amr.corp.intel.com>

From: Robin Murphy

Trying to activate ZONE_DEVICE for arm64 reveals that memremap's
internal helpers for sparsemem sections conflict with arm64's
definitions for hugepages, which inherit the name of "sections" from
earlier versions of the ARM architecture.
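The rename is mechanical; the arithmetic consuming the constant is
unchanged. A standalone sketch of the section-alignment math performed
in devm_memremap_pages(), assuming PA_SECTION_SHIFT = 27 (the x86-64
value) and hypothetical resource values:

#include <stdio.h>

#define PA_SECTION_SHIFT	27
#define PA_SECTION_SIZE		(1UL << PA_SECTION_SHIFT)
#define ALIGN(x, a)		(((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	unsigned long start = 0x140200000UL;	/* hypothetical res->start */
	unsigned long size  = 0x10000000UL;	/* hypothetical resource size */
	unsigned long align_start = start & ~(PA_SECTION_SIZE - 1);
	unsigned long align_size = ALIGN(start + size, PA_SECTION_SIZE)
			- align_start;

	/* the unaligned range is expanded to 128MB section boundaries */
	printf("map [%#lx, %#lx)\n", align_start, align_start + align_size);
	return 0;
}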
Disambiguate memremap (and now HMM too) by propagating sparsemem's PA_
prefix, to clarify that these values are in terms of addresses rather
than PFNs (and because it's a heck of a lot easier than changing all
the arch code). SECTION_MASK is unused, so it can just go.

[anshuman: Consolidated mm/hmm.c instance and updated the commit message]

Acked-by: Michal Hocko
Reviewed-by: David Hildenbrand
Cc: Oscar Salvador
Cc: Pavel Tatashin
Signed-off-by: Robin Murphy
Signed-off-by: Anshuman Khandual
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 include/linux/mmzone.h |    1 +
 kernel/memremap.c      |   10 ++++------
 mm/hmm.c               |    2 --
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index ef8d878079f9..ac163f2f274f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1134,6 +1134,7 @@ static inline unsigned long early_pfn_to_nid(unsigned long pfn)
  * PFN_SECTION_SHIFT		pfn to/from section number
  */
 #define PA_SECTION_SHIFT	(SECTION_SIZE_BITS)
+#define PA_SECTION_SIZE		(1UL << PA_SECTION_SHIFT)
 #define PFN_SECTION_SHIFT	(SECTION_SIZE_BITS - PAGE_SHIFT)
 
 #define NR_MEM_SECTIONS		(1UL << SECTIONS_SHIFT)
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 4e59d29245f4..f355586ea54a 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -14,8 +14,6 @@
 #include
 
 static DEFINE_XARRAY(pgmap_array);
-#define SECTION_MASK ~((1UL << PA_SECTION_SHIFT) - 1)
-#define SECTION_SIZE (1UL << PA_SECTION_SHIFT)
 
 #if IS_ENABLED(CONFIG_DEVICE_PRIVATE)
 vm_fault_t device_private_entry_fault(struct vm_area_struct *vma,
@@ -98,8 +96,8 @@ static void devm_memremap_pages_release(void *data)
 		put_page(pfn_to_page(pfn));
 
 	/* pages are dead and unused, undo the arch mapping */
-	align_start = res->start & ~(SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
+	align_start = res->start & ~(PA_SECTION_SIZE - 1);
+	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
 		- align_start;
 
 	nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT));
@@ -160,8 +158,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	if (!pgmap->ref || !pgmap->kill)
 		return ERR_PTR(-EINVAL);
 
-	align_start = res->start & ~(SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
+	align_start = res->start & ~(PA_SECTION_SIZE - 1);
+	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
 		- align_start;
 	align_end = align_start + align_size - 1;
diff --git a/mm/hmm.c b/mm/hmm.c
index 0db8491090b8..a7e7f8e33c5f 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -34,8 +34,6 @@
 #include
 #include
 
-#define PA_SECTION_SIZE (1UL << PA_SECTION_SHIFT)
-
 #if IS_ENABLED(CONFIG_HMM_MIRROR)
 static const struct mmu_notifier_ops hmm_mmu_notifier_ops;

From patchwork Mon May 6 23:39:42 2019
Subject: [PATCH v8 03/12] mm/sparsemem: Add helpers track active portions of a section at boot
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Oscar Salvador,
    Pavel Tatashin, Jane Chu, linux-nvdimm@lists.01.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de,
    mhocko@suse.com
Date: Mon, 06 May 2019 16:39:42 -0700
Message-ID: <155718598213.130019.10989541248734713186.stgit@dwillia2-desk3.amr.corp.intel.com>

Prepare for hot{plug,remove} of sub-ranges of a section by tracking a
sub-section active bitmask, each bit representing a PMD_SIZE span of
the architecture's memory hotplug section size.

The implication of a partially populated section is that pfn_valid()
needs to go beyond a valid_section() check and read the sub-section
active ranges from the bitmask. The expectation is that the bitmask
(subsection_map) fits in the same cacheline as the valid_section()
data, so the incremental performance overhead to pfn_valid() should be
negligible.
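A userspace simulation of the subsection_map_init() loop added below,
using an assumed 128MB-section / 2MB-subsection geometry and an example
pfn range that straddles two sections (not kernel code; the kernel
version uses bitmap_set() on ms->usage->subsection_map):

#include <stdio.h>

#define PAGES_PER_SECTION	(1UL << 15)	/* 128MB / 4KB */
#define PAGES_PER_SUBSECTION	(1UL << 9)	/* 2MB / 4KB */

int main(void)
{
	unsigned long map[2] = { 0, 0 };	/* one bitmap per section */
	unsigned long pfn = 0x7c00, nr_pages = 0x1000;

	while (nr_pages) {
		unsigned long sec = pfn / PAGES_PER_SECTION;
		unsigned long offset = pfn & (PAGES_PER_SECTION - 1);
		unsigned long pfns = nr_pages;

		/* clamp the chunk to the end of the current section */
		if (pfns > PAGES_PER_SECTION - offset)
			pfns = PAGES_PER_SECTION - offset;

		int idx = offset / PAGES_PER_SUBSECTION;
		int end = ((pfn + pfns - 1) & (PAGES_PER_SECTION - 1)) /
				PAGES_PER_SUBSECTION;
		for (int i = idx; i <= end; i++)
			map[sec] |= 1UL << i;

		pfn += pfns;
		nr_pages -= pfns;
	}
	/* expect bits 62-63 set in section 0 and bits 0-5 in section 1 */
	printf("section 0: %#lx\nsection 1: %#lx\n", map[0], map[1]);
	return 0;
}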
Cc: Michal Hocko Cc: Vlastimil Babka Cc: Logan Gunthorpe Cc: Oscar Salvador Cc: Pavel Tatashin Tested-by: Jane Chu Signed-off-by: Dan Williams Reviewed-by: Oscar Salvador --- include/linux/mmzone.h | 29 ++++++++++++++++++++++++++++- mm/page_alloc.c | 4 +++- mm/sparse.c | 29 +++++++++++++++++++++++++++++ 3 files changed, 60 insertions(+), 2 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index ac163f2f274f..6dd52d544857 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -1199,6 +1199,8 @@ struct mem_section_usage { unsigned long pageblock_flags[0]; }; +void subsection_map_init(unsigned long pfn, unsigned long nr_pages); + struct page; struct page_ext; struct mem_section { @@ -1336,12 +1338,36 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn) extern int __highest_present_section_nr; +static inline int subsection_map_index(unsigned long pfn) +{ + return (pfn & ~(PAGE_SECTION_MASK)) / PAGES_PER_SUBSECTION; +} + +#ifdef CONFIG_SPARSEMEM_VMEMMAP +static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn) +{ + int idx = subsection_map_index(pfn); + + return test_bit(idx, ms->usage->subsection_map); +} +#else +static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn) +{ + return 1; +} +#endif + #ifndef CONFIG_HAVE_ARCH_PFN_VALID static inline int pfn_valid(unsigned long pfn) { + struct mem_section *ms; + if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS) return 0; - return valid_section(__nr_to_section(pfn_to_section_nr(pfn))); + ms = __nr_to_section(pfn_to_section_nr(pfn)); + if (!valid_section(ms)) + return 0; + return pfn_section_valid(ms, pfn); } #endif @@ -1373,6 +1399,7 @@ void sparse_init(void); #define sparse_init() do {} while (0) #define sparse_index_init(_sec, _nid) do {} while (0) #define pfn_present pfn_valid +#define subsection_map_init(_pfn, _nr_pages) do {} while (0) #endif /* CONFIG_SPARSEMEM */ /* diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 61c2b54a5b61..13816c5a51eb 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -7291,10 +7291,12 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn) /* Print out the early node map */ pr_info("Early memory node ranges\n"); - for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) + for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) { pr_info(" node %3d: [mem %#018Lx-%#018Lx]\n", nid, (u64)start_pfn << PAGE_SHIFT, ((u64)end_pfn << PAGE_SHIFT) - 1); + subsection_map_init(start_pfn, end_pfn - start_pfn); + } /* Initialise every node */ mminit_verify_pageflags_layout(); diff --git a/mm/sparse.c b/mm/sparse.c index f87de7ad32c8..ac47a48050c7 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -210,6 +210,35 @@ static inline unsigned long first_present_section_nr(void) return next_present_section_nr(-1); } +void subsection_map_init(unsigned long pfn, unsigned long nr_pages) +{ + int end_sec = pfn_to_section_nr(pfn + nr_pages - 1); + int i, start_sec = pfn_to_section_nr(pfn); + + if (!nr_pages) + return; + + for (i = start_sec; i <= end_sec; i++) { + int idx, end; + unsigned long pfns; + struct mem_section *ms; + + idx = subsection_map_index(pfn); + pfns = min(nr_pages, PAGES_PER_SECTION + - (pfn & ~PAGE_SECTION_MASK)); + end = subsection_map_index(pfn + pfns - 1); + + ms = __nr_to_section(i); + bitmap_set(ms->usage->subsection_map, idx, end - idx + 1); + + pr_debug("%s: sec: %d pfns: %ld set(%d, %d)\n", __func__, i, + pfns, idx, end - idx + 1); + + pfn += pfns; + nr_pages -= pfns; + } +} + /* Record a 
 /* Record a memory area against a node. */
 void __init memory_present(int nid, unsigned long start, unsigned long end)
 {

From patchwork Mon May 6 23:39:47 2019
Subject: [PATCH v8 04/12] mm/hotplug: Prepare shrink_{zone, pgdat}_span for sub-section removal
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Pavel Tatashin,
    Oscar Salvador, linux-nvdimm@lists.01.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Mon, 06 May 2019 16:39:47 -0700
Message-ID: <155718598766.130019.2843092676507694047.stgit@dwillia2-desk3.amr.corp.intel.com>

Sub-section hotplug support reduces the unit of operation of hotplug
from section-sized-units (PAGES_PER_SECTION) to sub-section-sized units
(PAGES_PER_SUBSECTION). Teach shrink_{zone,pgdat}_span() to consider
PAGES_PER_SUBSECTION boundaries as the points where pfn_valid(), not
valid_section(), can toggle.
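The effect of the stride change can be seen with a toy probe loop: with
only one 2MB subsection populated, a PAGES_PER_SECTION stride never
lands on a valid pfn, while a PAGES_PER_SUBSECTION stride finds it
(mock_pfn_valid() is an illustrative stand-in, not the kernel helper):

#include <stdbool.h>
#include <stdio.h>

#define PAGES_PER_SECTION	(1UL << 15)	/* 128MB / 4KB */
#define PAGES_PER_SUBSECTION	(1UL << 9)	/* 2MB / 4KB */

/* pretend only the second 2MB subsection of the section is populated */
static bool mock_pfn_valid(unsigned long pfn)
{
	return pfn / PAGES_PER_SUBSECTION == 1;
}

static unsigned long find_first(unsigned long stride)
{
	for (unsigned long pfn = 0; pfn < PAGES_PER_SECTION; pfn += stride)
		if (mock_pfn_valid(pfn))
			return pfn;
	return ~0UL;	/* not found */
}

int main(void)
{
	printf("section stride:    %#lx\n", find_first(PAGES_PER_SECTION));
	printf("subsection stride: %#lx\n", find_first(PAGES_PER_SUBSECTION));
	return 0;
}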
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Reviewed-by: Pavel Tatashin
Reviewed-by: Oscar Salvador
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 mm/memory_hotplug.c |   29 ++++++++---------------------
 1 file changed, 8 insertions(+), 21 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a76fc6a6e9fe..393ab2b9c3f7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -325,12 +325,8 @@ static unsigned long find_smallest_section_pfn(int nid, struct zone *zone,
 				     unsigned long start_pfn,
 				     unsigned long end_pfn)
 {
-	struct mem_section *ms;
-
-	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(start_pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SUBSECTION) {
+		if (unlikely(!pfn_valid(start_pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(start_pfn) != nid))
@@ -350,15 +346,12 @@ static unsigned long find_biggest_section_pfn(int nid, struct zone *zone,
 				    unsigned long start_pfn,
 				    unsigned long end_pfn)
 {
-	struct mem_section *ms;
 	unsigned long pfn;
 
 	/* pfn is the end pfn of a memory section. */
 	pfn = end_pfn - 1;
-	for (; pfn >= start_pfn; pfn -= PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; pfn >= start_pfn; pfn -= PAGES_PER_SUBSECTION) {
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(pfn) != nid))
@@ -380,7 +373,6 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 	unsigned long z = zone_end_pfn(zone); /* zone_end_pfn namespace clash */
 	unsigned long zone_end_pfn = z;
 	unsigned long pfn;
-	struct mem_section *ms;
 	int nid = zone_to_nid(zone);
 
 	zone_span_writelock(zone);
@@ -417,10 +409,8 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 	 * it check the zone has only hole or not.
 	 */
 	pfn = zone_start_pfn;
-	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SUBSECTION) {
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (page_zone(pfn_to_page(pfn)) != zone)
@@ -448,7 +438,6 @@ static void shrink_pgdat_span(struct pglist_data *pgdat,
 	unsigned long p = pgdat_end_pfn(pgdat); /* pgdat_end_pfn namespace clash */
 	unsigned long pgdat_end_pfn = p;
 	unsigned long pfn;
-	struct mem_section *ms;
 	int nid = pgdat->node_id;
 
 	if (pgdat_start_pfn == start_pfn) {
@@ -485,10 +474,8 @@ static void shrink_pgdat_span(struct pglist_data *pgdat,
 	 * has only hole or not.
 	 */
 	pfn = pgdat_start_pfn;
-	for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SECTION) {
-		ms = __pfn_to_section(pfn);
-
-		if (unlikely(!valid_section(ms)))
+	for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SUBSECTION) {
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (pfn_to_nid(pfn) != nid)

From patchwork Mon May 6 23:39:53 2019
Subject: [PATCH v8 05/12] mm/sparsemem: Convert kmalloc_section_memmap() to populate_section_memmap()
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, David Hildenbrand, Logan Gunthorpe, Oscar Salvador,
    Pavel Tatashin, linux-nvdimm@lists.01.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Mon, 06 May 2019 16:39:53 -0700
Message-ID: <155718599340.130019.4645499183822580585.stgit@dwillia2-desk3.amr.corp.intel.com>

Allow sub-section sized ranges to be added to the memmap.
populate_section_memmap() takes an explicit pfn range rather than
assuming a full section, and those parameters are plumbed all the way
through to vmemmap_populate(). There should be no sub-section usage in
current deployments. New warnings are added to clarify which memmap
allocation paths are sub-section capable.
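A sketch of the subsection rounding that __populate_section_memmap()
performs below, with assumed 4KB pages, 2MB subsections, and a
hypothetical unaligned pfn range:

#include <stdio.h>

#define PAGES_PER_SUBSECTION	(1UL << 9)	/* 2MB / 4KB */
#define PAGE_SUBSECTION_MASK	(~(PAGES_PER_SUBSECTION - 1))
#define ALIGN(x, a)		(((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	unsigned long pfn = 0x1234, nr_pages = 0x100; /* hypothetical */

	/* expand the request to subsection boundaries before populating */
	unsigned long end = ALIGN(pfn + nr_pages, PAGES_PER_SUBSECTION);
	pfn &= PAGE_SUBSECTION_MASK;
	nr_pages = end - pfn;

	printf("populate pfns [%#lx, %#lx) (%lu pages)\n", pfn, end, nr_pages);
	return 0;
}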
ranges to be added to the memmap. populate_section_memmap() takes an explicit pfn range rather than assuming a full section, and those parameters are plumbed all the way through to vmemmap_populate(). There should be no sub-section usage in current deployments. New warnings are added to clarify which memmap allocation paths are sub-section capable. Cc: Michal Hocko Cc: David Hildenbrand Cc: Logan Gunthorpe Cc: Oscar Salvador Reviewed-by: Pavel Tatashin Signed-off-by: Dan Williams --- arch/x86/mm/init_64.c | 4 ++- include/linux/mm.h | 4 ++- mm/sparse-vmemmap.c | 21 +++++++++++------ mm/sparse.c | 61 +++++++++++++++++++++++++++++++------------------ 4 files changed, 57 insertions(+), 33 deletions(-) diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 20d14254b686..bb018d09d2dc 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1457,7 +1457,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, { int err; - if (boot_cpu_has(X86_FEATURE_PSE)) + if (end - start < PAGES_PER_SECTION * sizeof(struct page)) + err = vmemmap_populate_basepages(start, end, node); + else if (boot_cpu_has(X86_FEATURE_PSE)) err = vmemmap_populate_hugepages(start, end, node, altmap); else if (altmap) { pr_err_once("%s: no cpu support for altmap allocations\n", diff --git a/include/linux/mm.h b/include/linux/mm.h index 0e8834ac32b7..5360a0e4051d 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2748,8 +2748,8 @@ const char * arch_vma_name(struct vm_area_struct *vma); void print_vma_addr(char *prefix, unsigned long rip); void *sparse_buffer_alloc(unsigned long size); -struct page *sparse_mem_map_populate(unsigned long pnum, int nid, - struct vmem_altmap *altmap); +struct page * __populate_section_memmap(unsigned long pfn, + unsigned long nr_pages, int nid, struct vmem_altmap *altmap); pgd_t *vmemmap_pgd_populate(unsigned long addr, int node); p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node); pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node); diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 7fec05796796..200aef686722 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -245,19 +245,26 @@ int __meminit vmemmap_populate_basepages(unsigned long start, return 0; } -struct page * __meminit sparse_mem_map_populate(unsigned long pnum, int nid, - struct vmem_altmap *altmap) +struct page * __meminit __populate_section_memmap(unsigned long pfn, + unsigned long nr_pages, int nid, struct vmem_altmap *altmap) { unsigned long start; unsigned long end; - struct page *map; - map = pfn_to_page(pnum * PAGES_PER_SECTION); - start = (unsigned long)map; - end = (unsigned long)(map + PAGES_PER_SECTION); + /* + * The minimum granularity of memmap extensions is + * PAGES_PER_SUBSECTION as allocations are tracked in the + * 'subsection_map' bitmap of the section.
+ */ + end = ALIGN(pfn + nr_pages, PAGES_PER_SUBSECTION); + pfn &= PAGE_SUBSECTION_MASK; + nr_pages = end - pfn; + + start = (unsigned long) pfn_to_page(pfn); + end = start + nr_pages * sizeof(struct page); if (vmemmap_populate(start, end, nid, altmap)) return NULL; - return map; + return pfn_to_page(pfn); } diff --git a/mm/sparse.c b/mm/sparse.c index ac47a48050c7..d613f108cf34 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -433,8 +433,8 @@ static unsigned long __init section_map_size(void) return PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION); } -struct page __init *sparse_mem_map_populate(unsigned long pnum, int nid, - struct vmem_altmap *altmap) +struct page __init *__populate_section_memmap(unsigned long pfn, + unsigned long nr_pages, int nid, struct vmem_altmap *altmap) { unsigned long size = section_map_size(); struct page *map = sparse_buffer_alloc(size); @@ -515,10 +515,13 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin, } sparse_buffer_init(map_count * section_map_size(), nid); for_each_present_section_nr(pnum_begin, pnum) { + unsigned long pfn = section_nr_to_pfn(pnum); + if (pnum >= pnum_end) break; - map = sparse_mem_map_populate(pnum, nid, NULL); + map = __populate_section_memmap(pfn, PAGES_PER_SECTION, + nid, NULL); if (!map) { pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.", __func__, nid); @@ -618,17 +621,17 @@ void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn) #endif #ifdef CONFIG_SPARSEMEM_VMEMMAP -static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid, - struct vmem_altmap *altmap) +static struct page *populate_section_memmap(unsigned long pfn, + unsigned long nr_pages, int nid, struct vmem_altmap *altmap) { - /* This will make the necessary allocations eventually. 
*/ - return sparse_mem_map_populate(pnum, nid, altmap); + return __populate_section_memmap(pfn, nr_pages, nid, altmap); } -static void __kfree_section_memmap(struct page *memmap, + +static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages, struct vmem_altmap *altmap) { - unsigned long start = (unsigned long)memmap; - unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION); + unsigned long start = (unsigned long) pfn_to_page(pfn); + unsigned long end = start + nr_pages * sizeof(struct page); vmemmap_free(start, end, altmap); } @@ -642,11 +645,18 @@ static void free_map_bootmem(struct page *memmap) } #endif /* CONFIG_MEMORY_HOTREMOVE */ #else -static struct page *__kmalloc_section_memmap(void) +struct page *populate_section_memmap(unsigned long pfn, + unsigned long nr_pages, int nid, struct vmem_altmap *altmap) { struct page *page, *ret; unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION; + if ((pfn & ~PAGE_SECTION_MASK) || nr_pages != PAGES_PER_SECTION) { + WARN(1, "%s: called with section unaligned parameters\n", + __func__); + return NULL; + } + page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size)); if (page) goto got_map_page; @@ -663,15 +673,17 @@ static struct page *__kmalloc_section_memmap(void) return ret; } -static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid, +static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages, struct vmem_altmap *altmap) { - return __kmalloc_section_memmap(); -} + struct page *memmap = pfn_to_page(pfn); + + if ((pfn & ~PAGE_SECTION_MASK) || nr_pages != PAGES_PER_SECTION) { + WARN(1, "%s: called with section unaligned parameters\n", + __func__); + return; + } -static void __kfree_section_memmap(struct page *memmap, - struct vmem_altmap *altmap) -{ if (is_vmalloc_addr(memmap)) vfree(memmap); else @@ -742,12 +754,13 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn, if (ret < 0 && ret != -EEXIST) return ret; ret = 0; - memmap = kmalloc_section_memmap(section_nr, nid, altmap); + memmap = populate_section_memmap(start_pfn, PAGES_PER_SECTION, nid, + altmap); if (!memmap) return -ENOMEM; usage = kzalloc(mem_section_usage_size(), GFP_KERNEL); if (!usage) { - __kfree_section_memmap(memmap, altmap); + depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap); return -ENOMEM; } @@ -769,7 +782,7 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn, out: if (ret < 0) { kfree(usage); - __kfree_section_memmap(memmap, altmap); + depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap); } return ret; } @@ -806,7 +819,8 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages) #endif static void free_section_usage(struct page *memmap, - struct mem_section_usage *usage, struct vmem_altmap *altmap) + struct mem_section_usage *usage, unsigned long pfn, + unsigned long nr_pages, struct vmem_altmap *altmap) { struct page *usage_page; @@ -820,7 +834,7 @@ static void free_section_usage(struct page *memmap, if (PageSlab(usage_page) || PageCompound(usage_page)) { kfree(usage); if (memmap) - __kfree_section_memmap(memmap, altmap); + depopulate_section_memmap(pfn, nr_pages, altmap); return; } @@ -849,7 +863,8 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms, clear_hwpoisoned_pages(memmap + map_offset, PAGES_PER_SECTION - map_offset); - free_section_usage(memmap, usage, altmap); + free_section_usage(memmap, usage, section_nr_to_pfn(__section_nr(ms)), + PAGES_PER_SECTION, 
altmap); } #endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_MEMORY_HOTPLUG */
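For reference, the conversion above expands any (pfn, nr_pages) request outward to sub-section boundaries before populating the vmemmap, exactly as the new __populate_section_memmap() does. A minimal sketch of that alignment math (not part of the patch; constants assume x86-64 with 4KB pages):

#include <stdio.h>

#define PAGES_PER_SUBSECTION 512UL /* 2MB / 4KB pages, x86-64 */
#define PAGE_SUBSECTION_MASK (~(PAGES_PER_SUBSECTION - 1))
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
        /* an arbitrary, unaligned request */
        unsigned long pfn = 0x12345, nr_pages = 100, end;

        /* expand outward to sub-section boundaries */
        end = ALIGN_UP(pfn + nr_pages, PAGES_PER_SUBSECTION);
        pfn &= PAGE_SUBSECTION_MASK;
        nr_pages = end - pfn;

        printf("populate pfn %#lx for %lu pages\n", pfn, nr_pages);
        return 0;
}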
From patchwork Mon May 6 23:39:58 2019
Subject: [PATCH v8 06/12] mm/hotplug: Kill is_dev_zone() usage in __remove_pages()
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Logan Gunthorpe, Pavel Tatashin, David Hildenbrand, Oscar Salvador, linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Mon, 06 May 2019 16:39:58 -0700
Message-ID: <155718599876.130019.1344795832811586975.stgit@dwillia2-desk3.amr.corp.intel.com>

The zone type check was a leftover from the cleanup that plumbed altmap through the memory hotplug path, i.e. commit da024512a1fa "mm: pass the vmem_altmap to arch_remove_memory and __remove_pages".
Cc: Michal Hocko Cc: Logan Gunthorpe Cc: Pavel Tatashin Reviewed-by: David Hildenbrand Reviewed-by: Oscar Salvador Signed-off-by: Dan Williams --- mm/memory_hotplug.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 393ab2b9c3f7..cb9e68729ea3 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -544,11 +544,8 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, unsigned long map_offset = 0; int sections_to_remove; - /* In the ZONE_DEVICE case device driver owns the memory region */ - if (is_dev_zone(zone)) { - if (altmap) - map_offset = vmem_altmap_offset(altmap); - } + if (altmap) + map_offset = vmem_altmap_offset(altmap); clear_zone_contiguous(zone);
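As context for the simplification: a vmem_altmap describes pages at the base of the hot-added range that back the memmap itself, and vmem_altmap_offset() reports how many pfns to skip; with the zone check gone, that offset is honored regardless of zone. A toy model (an editorial sketch under the assumption that the helper simply returns the reservation; field names mirror the kernel's struct vmem_altmap but this is not the kernel code):

#include <stdio.h>

/* simplified stand-in for the kernel's struct vmem_altmap */
struct vmem_altmap {
        unsigned long base_pfn; /* first pfn of the remapped range */
        unsigned long reserve;  /* pfns reserved ahead of the memmap */
};

/* assumed behavior: the offset is the reserved pfn count */
static unsigned long vmem_altmap_offset(struct vmem_altmap *altmap)
{
        return altmap->reserve;
}

int main(void)
{
        struct vmem_altmap altmap = { .base_pfn = 0x100000, .reserve = 512 };

        printf("skip %lu pfns at the start of the range\n",
               vmem_altmap_offset(&altmap));
        return 0;
}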
From patchwork Mon May 6 23:40:03 2019
Subject: [PATCH v8 07/12] mm: Kill is_dev_zone() helper
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Logan Gunthorpe, David Hildenbrand, Oscar Salvador, Pavel Tatashin, linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Mon, 06 May 2019 16:40:03 -0700
Message-ID: <155718600386.130019.2834681306356516509.stgit@dwillia2-desk3.amr.corp.intel.com>

Given there are no more usages of is_dev_zone() outside of 'ifdef CONFIG_ZONE_DEVICE' protection, kill off the compilation helper. Cc: Michal Hocko Cc: Logan Gunthorpe Acked-by: David Hildenbrand Reviewed-by: Oscar Salvador Reviewed-by: Pavel Tatashin Signed-off-by: Dan Williams --- include/linux/mmzone.h | 12 ------------ mm/page_alloc.c | 2 +- 2 files changed, 1 insertion(+), 13 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 6dd52d544857..49e7fb452dfd 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -855,18 +855,6 @@ static inline int local_memory_node(int node_id) { return node_id; }; */ #define zone_idx(zone) ((zone) - (zone)->zone_pgdat->node_zones) -#ifdef CONFIG_ZONE_DEVICE -static inline bool is_dev_zone(const struct zone *zone) -{ - return zone_idx(zone) == ZONE_DEVICE; -} -#else -static inline bool is_dev_zone(const struct zone *zone) -{ - return false; -} -#endif - /* * Returns true if a zone has pages managed by the buddy allocator.
* All the reclaim decisions have to use this function rather than diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 13816c5a51eb..2a5c5cbfb5fc 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5864,7 +5864,7 @@ void __ref memmap_init_zone_device(struct zone *zone, unsigned long start = jiffies; int nid = pgdat->node_id; - if (WARN_ON_ONCE(!pgmap || !is_dev_zone(zone))) + if (WARN_ON_ONCE(!pgmap || zone_idx(zone) != ZONE_DEVICE)) return; /*
From patchwork Mon May 6 23:40:09 2019
Subject: [PATCH v8 08/12] mm/sparsemem: Prepare for sub-section ranges
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Oscar Salvador, Pavel Tatashin, linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Mon, 06 May 2019 16:40:09 -0700
Message-ID: <155718600896.130019.3565988182718346388.stgit@dwillia2-desk3.amr.corp.intel.com>
Prepare the memory hot-{add,remove} paths for handling sub-section ranges by plumbing the starting page frame and number of pages being handled through arch_{add,remove}_memory() to sparse_{add,remove}_one_section(). This is simply plumbing, small cleanups, and some identifier renames. No intended functional changes. Cc: Michal Hocko Cc: Vlastimil Babka Cc: Logan Gunthorpe Cc: Oscar Salvador Reviewed-by: Pavel Tatashin Signed-off-by: Dan Williams Reviewed-by: Oscar Salvador --- include/linux/memory_hotplug.h | 7 +- mm/memory_hotplug.c | 118 +++++++++++++++++++++++++--------------- mm/sparse.c | 7 +- 3 files changed, 83 insertions(+), 49 deletions(-) diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index ae892eef8b82..835a94650ee3 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -354,9 +354,10 @@ extern int add_memory_resource(int nid, struct resource *resource); extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap); extern bool is_memblock_offlined(struct memory_block *mem); -extern int sparse_add_one_section(int nid, unsigned long start_pfn, - struct vmem_altmap *altmap); -extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms, +extern int sparse_add_section(int nid, unsigned long pfn, + unsigned long nr_pages, struct vmem_altmap *altmap); +extern void sparse_remove_section(struct zone *zone, struct mem_section *ms, + unsigned long pfn, unsigned long nr_pages, unsigned long map_offset, struct vmem_altmap *altmap); extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index cb9e68729ea3..41b544f63816 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -251,22 +251,44 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat) } #endif /* CONFIG_HAVE_BOOTMEM_INFO_NODE */ -static int __meminit __add_section(int nid, unsigned long phys_start_pfn, - struct vmem_altmap *altmap, bool want_memblock) +static int __meminit __add_section(int nid, unsigned long pfn, + unsigned long nr_pages, struct vmem_altmap *altmap, + bool want_memblock) { int ret; - if (pfn_valid(phys_start_pfn)) + if (pfn_valid(pfn)) return -EEXIST; - ret = sparse_add_one_section(nid, phys_start_pfn, altmap); + ret = sparse_add_section(nid, pfn, nr_pages, altmap); if (ret < 0) return ret; if (!want_memblock) return 0; - return hotplug_memory_register(nid, __pfn_to_section(phys_start_pfn)); + return hotplug_memory_register(nid, __pfn_to_section(pfn)); +} + +static int subsection_check(unsigned long pfn, unsigned long nr_pages, + unsigned long flags, const char *reason) +{ + /* + * Only allow partial section hotplug for !memblock ranges, + * since register_new_memory() requires section alignment, and + * CONFIG_SPARSEMEM_VMEMMAP=n requires sections to be fully + * populated. + */ + if ((!IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) + || (flags & MHP_MEMBLOCK_API)) + && ((pfn & ~PAGE_SECTION_MASK) + || (nr_pages & ~PAGE_SECTION_MASK))) { + WARN(1, "Sub-section hot-%s incompatible with %s\n", reason, + (flags & MHP_MEMBLOCK_API) + ? "memblock api" : "!CONFIG_SPARSEMEM_VMEMMAP"); + return -EINVAL; + } + return 0; } /* @@ -275,34 +297,40 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn, * call this function after deciding the zone to which to * add the new pages.
*/ -int __ref __add_pages(int nid, unsigned long phys_start_pfn, - unsigned long nr_pages, struct mhp_restrictions *restrictions) +int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages, + struct mhp_restrictions *restrictions) { unsigned long i; - int err = 0; - int start_sec, end_sec; + int start_sec, end_sec, err; struct vmem_altmap *altmap = restrictions->altmap; - /* during initialize mem_map, align hot-added range to section */ - start_sec = pfn_to_section_nr(phys_start_pfn); - end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1); - if (altmap) { /* * Validate altmap is within bounds of the total request */ - if (altmap->base_pfn != phys_start_pfn + if (altmap->base_pfn != pfn || vmem_altmap_offset(altmap) > nr_pages) { pr_warn_once("memory add fail, invalid altmap\n"); - err = -EINVAL; - goto out; + return -EINVAL; } altmap->alloc = 0; } + err = subsection_check(pfn, nr_pages, restrictions->flags, "add"); + if (err) + return err; + + start_sec = pfn_to_section_nr(pfn); + end_sec = pfn_to_section_nr(pfn + nr_pages - 1); for (i = start_sec; i <= end_sec; i++) { - err = __add_section(nid, section_nr_to_pfn(i), altmap, + unsigned long pfns; + + pfns = min(nr_pages, PAGES_PER_SECTION + - (pfn & ~PAGE_SECTION_MASK)); + err = __add_section(nid, pfn, pfns, altmap, restrictions->flags & MHP_MEMBLOCK_API); + pfn += pfns; + nr_pages -= pfns; /* * EEXIST is finally dealt with by ioresource collision @@ -315,7 +343,6 @@ int __ref __add_pages(int nid, unsigned long phys_start_pfn, cond_resched(); } vmemmap_populate_print_last(); -out: return err; } @@ -494,10 +521,10 @@ static void shrink_pgdat_span(struct pglist_data *pgdat, pgdat->node_spanned_pages = 0; } -static void __remove_zone(struct zone *zone, unsigned long start_pfn) +static void __remove_zone(struct zone *zone, unsigned long start_pfn, + unsigned long nr_pages) { struct pglist_data *pgdat = zone->zone_pgdat; - int nr_pages = PAGES_PER_SECTION; unsigned long flags; pgdat_resize_lock(zone->zone_pgdat, &flags); @@ -506,29 +533,26 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn) pgdat_resize_unlock(zone->zone_pgdat, &flags); } -static void __remove_section(struct zone *zone, struct mem_section *ms, - unsigned long map_offset, - struct vmem_altmap *altmap) +static void __remove_section(struct zone *zone, unsigned long pfn, + unsigned long nr_pages, unsigned long map_offset, + struct vmem_altmap *altmap) { - unsigned long start_pfn; - int scn_nr; + struct mem_section *ms = __nr_to_section(pfn_to_section_nr(pfn)); if (WARN_ON_ONCE(!valid_section(ms))) return; unregister_memory_section(ms); - scn_nr = __section_nr(ms); - start_pfn = section_nr_to_pfn((unsigned long)scn_nr); - __remove_zone(zone, start_pfn); + __remove_zone(zone, pfn, nr_pages); - sparse_remove_one_section(zone, ms, map_offset, altmap); + sparse_remove_section(zone, ms, pfn, nr_pages, map_offset, altmap); } /** * __remove_pages() - remove sections of pages from a zone * @zone: zone from which pages need to be removed - * @phys_start_pfn: starting pageframe (must be aligned to start of a section) + * @pfn: starting pageframe (must be aligned to start of a section) * @nr_pages: number of pages to remove (must be multiple of section size) * @altmap: alternative device page map or %NULL if default memmap is used * @@ -537,31 +561,39 @@ static void __remove_section(struct zone *zone, struct mem_section *ms, * sure that pages are marked reserved and zones are adjust properly by * calling offline_pages(). 
*/ -void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, +void __remove_pages(struct zone *zone, unsigned long pfn, unsigned long nr_pages, struct vmem_altmap *altmap) { - unsigned long i; unsigned long map_offset = 0; - int sections_to_remove; + int i, start_sec, end_sec; + struct memory_block *mem; + unsigned long flags = 0; if (altmap) map_offset = vmem_altmap_offset(altmap); clear_zone_contiguous(zone); - /* - * We can only remove entire sections - */ - BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK); - BUG_ON(nr_pages % PAGES_PER_SECTION); + mem = find_memory_block(__pfn_to_section(pfn)); + if (mem) { + flags |= MHP_MEMBLOCK_API; + put_device(&mem->dev); + } - sections_to_remove = nr_pages / PAGES_PER_SECTION; - for (i = 0; i < sections_to_remove; i++) { - unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION; + if (subsection_check(pfn, nr_pages, flags, "remove")) + return; + + start_sec = pfn_to_section_nr(pfn); + end_sec = pfn_to_section_nr(pfn + nr_pages - 1); + for (i = start_sec; i <= end_sec; i++) { + unsigned long pfns; cond_resched(); - __remove_section(zone, __pfn_to_section(pfn), map_offset, - altmap); + pfns = min(nr_pages, PAGES_PER_SECTION + - (pfn & ~PAGE_SECTION_MASK)); + __remove_section(zone, pfn, pfns, map_offset, altmap); + pfn += pfns; + nr_pages -= pfns; map_offset = 0; } diff --git a/mm/sparse.c b/mm/sparse.c index d613f108cf34..8867f8901ee2 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -737,8 +737,8 @@ static void free_map_bootmem(struct page *memmap) * * -EEXIST - Section has been present. * * -ENOMEM - Out of memory. */ -int __meminit sparse_add_one_section(int nid, unsigned long start_pfn, - struct vmem_altmap *altmap) +int __meminit sparse_add_section(int nid, unsigned long start_pfn, + unsigned long nr_pages, struct vmem_altmap *altmap) { unsigned long section_nr = pfn_to_section_nr(start_pfn); struct mem_section_usage *usage; @@ -847,7 +847,8 @@ static void free_section_usage(struct page *memmap, free_map_bootmem(memmap); } -void sparse_remove_one_section(struct zone *zone, struct mem_section *ms, +void sparse_remove_section(struct zone *zone, struct mem_section *ms, + unsigned long pfn, unsigned long nr_pages, unsigned long map_offset, struct vmem_altmap *altmap) { struct page *memmap = NULL;
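The new loops in __add_pages()/__remove_pages() decompose an arbitrary span into per-section calls by clamping each step to the remainder of the current section. A standalone sketch of that idiom (an illustration, not the patch's code; PAGES_PER_SECTION assumes x86-64, 128MB sections of 4KB pages):

#include <stdio.h>

#define PAGES_PER_SECTION 32768UL /* 128MB / 4KB pages, x86-64 */

static void walk_sections(unsigned long pfn, unsigned long nr_pages)
{
        while (nr_pages) {
                /* clamp this step to the remainder of the current section */
                unsigned long pfns = PAGES_PER_SECTION -
                        (pfn & (PAGES_PER_SECTION - 1));

                if (pfns > nr_pages)
                        pfns = nr_pages;
                printf("section %lu: pfn %#lx, %lu pages\n",
                       pfn / PAGES_PER_SECTION, pfn, pfns);
                pfn += pfns;
                nr_pages -= pfns;
        }
}

int main(void)
{
        /* a range that starts mid-section and spans two boundaries */
        walk_sections(0x7e00, 0x9000);
        return 0;
}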
From patchwork Mon May 6 23:40:14 2019
Subject: [PATCH v8 09/12] mm/sparsemem: Support sub-section hotplug
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, Oscar Salvador, Pavel Tatashin, linux-nvdimm@lists.01.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Mon, 06 May 2019 16:40:14 -0700
Message-ID: <155718601407.130019.14248061058774128227.stgit@dwillia2-desk3.amr.corp.intel.com>

The libnvdimm sub-system has suffered a series of hacks and broken workarounds for the memory-hotplug implementation's awkward section-aligned (128MB) granularity. For example the following backtrace is emitted when attempting arch_add_memory() with physical address ranges that intersect 'System RAM' (RAM) with 'Persistent Memory' (PMEM) within a given section:

 WARNING: CPU: 0 PID: 558 at kernel/memremap.c:300 devm_memremap_pages+0x3b5/0x4c0
 devm_memremap_pages attempted on mixed region [mem 0x200000000-0x2fbffffff flags 0x200]
 [..]
 Call Trace:
  dump_stack+0x86/0xc3
  __warn+0xcb/0xf0
  warn_slowpath_fmt+0x5f/0x80
  devm_memremap_pages+0x3b5/0x4c0
  __wrap_devm_memremap_pages+0x58/0x70 [nfit_test_iomap]
  pmem_attach_disk+0x19a/0x440 [nd_pmem]

Recently it was discovered that the problem goes beyond RAM vs PMEM collisions: some platforms produce PMEM vs PMEM collisions within a given section. The libnvdimm workaround for that case revealed that the libnvdimm section-alignment-padding implementation has been broken for a long while. A fix for that long-standing breakage would introduce as many problems as it solves, since it would require a backward-incompatible change to the namespace metadata interpretation. Instead of that dubious route [1], address the root problem in the memory-hotplug implementation.
[1]: https://lore.kernel.org/r/155000671719.348031.2347363160141119237.stgit@dwillia2-desk3.amr.corp.intel.com Cc: Michal Hocko Cc: Vlastimil Babka Cc: Logan Gunthorpe Cc: Oscar Salvador Cc: Pavel Tatashin Signed-off-by: Dan Williams --- mm/sparse.c | 255 ++++++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 175 insertions(+), 80 deletions(-) diff --git a/mm/sparse.c b/mm/sparse.c index 8867f8901ee2..34f322d14e62 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -83,8 +83,15 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid) unsigned long root = SECTION_NR_TO_ROOT(section_nr); struct mem_section *section; + /* + * An existing section is possible in the sub-section hotplug + * case. First hot-add instantiates, follow-on hot-add reuses + * the existing section. + * + * The mem_hotplug_lock resolves the apparent race below. + */ if (mem_section[root]) - return -EEXIST; + return 0; section = sparse_index_alloc(nid); if (!section) @@ -210,6 +217,15 @@ static inline unsigned long first_present_section_nr(void) return next_present_section_nr(-1); } +void subsection_mask_set(unsigned long *map, unsigned long pfn, + unsigned long nr_pages) +{ + int idx = subsection_map_index(pfn); + int end = subsection_map_index(pfn + nr_pages - 1); + + bitmap_set(map, idx, end - idx + 1); +} + void subsection_map_init(unsigned long pfn, unsigned long nr_pages) { int end_sec = pfn_to_section_nr(pfn + nr_pages - 1); @@ -219,20 +235,17 @@ void subsection_map_init(unsigned long pfn, unsigned long nr_pages) return; for (i = start_sec; i <= end_sec; i++) { - int idx, end; - unsigned long pfns; struct mem_section *ms; + unsigned long pfns; - idx = subsection_map_index(pfn); pfns = min(nr_pages, PAGES_PER_SECTION - (pfn & ~PAGE_SECTION_MASK)); - end = subsection_map_index(pfn + pfns - 1); - ms = __nr_to_section(i); - bitmap_set(ms->usage->subsection_map, idx, end - idx + 1); + subsection_mask_set(ms->usage->subsection_map, pfn, pfns); pr_debug("%s: sec: %d pfns: %ld set(%d, %d)\n", __func__, i, - pfns, idx, end - idx + 1); + pfns, subsection_map_index(pfn), + subsection_map_index(pfn + pfns - 1)); pfn += pfns; nr_pages -= pfns; @@ -319,6 +332,15 @@ static void __meminit sparse_init_one_section(struct mem_section *ms, unsigned long pnum, struct page *mem_map, struct mem_section_usage *usage) { + /* + * Given that SPARSEMEM_VMEMMAP=y supports sub-section hotplug, + * ->section_mem_map can not be guaranteed to point to a full + * section's worth of memory. The field is only valid / used + * in the SPARSEMEM_VMEMMAP=n case. 
+ */ + if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP)) + mem_map = NULL; + ms->section_mem_map &= ~SECTION_MAP_MASK; ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum) | SECTION_HAS_MEM_MAP; @@ -724,10 +746,142 @@ static void free_map_bootmem(struct page *memmap) #endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_SPARSEMEM_VMEMMAP */ +#ifndef CONFIG_MEMORY_HOTREMOVE +static void free_map_bootmem(struct page *memmap) +{ +} +#endif + +static bool is_early_section(struct mem_section *ms) +{ + struct page *usage_page; + + usage_page = virt_to_page(ms->usage); + if (PageSlab(usage_page) || PageCompound(usage_page)) + return false; + else + return true; +} + +static void section_deactivate(unsigned long pfn, unsigned long nr_pages, + int nid, struct vmem_altmap *altmap) +{ + DECLARE_BITMAP(map, SUBSECTIONS_PER_SECTION) = { 0 }; + DECLARE_BITMAP(tmp, SUBSECTIONS_PER_SECTION) = { 0 }; + struct mem_section *ms = __pfn_to_section(pfn); + bool early_section = is_early_section(ms); + struct page *memmap = NULL; + unsigned long *subsection_map = ms->usage + ? &ms->usage->subsection_map[0] : NULL; + + subsection_mask_set(map, pfn, nr_pages); + if (subsection_map) + bitmap_and(tmp, map, subsection_map, SUBSECTIONS_PER_SECTION); + + if (WARN(!subsection_map || !bitmap_equal(tmp, map, SUBSECTIONS_PER_SECTION), + "section already deactivated (%#lx + %ld)\n", + pfn, nr_pages)) + return; + + if (WARN(!IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) + && nr_pages < PAGES_PER_SECTION, + "partial memory section removal not supported\n")) + return; + + /* + * There are 3 cases to handle across two configurations + * (SPARSEMEM_VMEMMAP={y,n}): + * + * 1/ deactivation of a partial hot-added section (only possible + * in the SPARSEMEM_VMEMMAP=y case). + * a/ section was present at memory init + * b/ section was hot-added post memory init + * 2/ deactivation of a complete hot-added section + * 3/ deactivation of a complete section from memory init + * + * For 1/, when subsection_map does not empty we will not be + * freeing the usage map, but still need to free the vmemmap + * range. 
+ * + * For 2/ and 3/ the SPARSEMEM_VMEMMAP={y,n} cases are unified + */ + bitmap_xor(subsection_map, map, subsection_map, SUBSECTIONS_PER_SECTION); + if (bitmap_empty(subsection_map, SUBSECTIONS_PER_SECTION)) { + unsigned long section_nr = pfn_to_section_nr(pfn); + + if (!early_section) { + kfree(ms->usage); + ms->usage = NULL; + } + memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr); + ms->section_mem_map = sparse_encode_mem_map(NULL, section_nr); + } + + if (early_section && memmap) + free_map_bootmem(memmap); + else + depopulate_section_memmap(pfn, nr_pages, altmap); +} + +static struct page * __meminit section_activate(int nid, unsigned long pfn, + unsigned long nr_pages, struct vmem_altmap *altmap) +{ + DECLARE_BITMAP(map, SUBSECTIONS_PER_SECTION) = { 0 }; + struct mem_section *ms = __pfn_to_section(pfn); + struct mem_section_usage *usage = NULL; + unsigned long *subsection_map; + struct page *memmap; + int rc = 0; + + subsection_mask_set(map, pfn, nr_pages); + + if (!ms->usage) { + usage = kzalloc(mem_section_usage_size(), GFP_KERNEL); + if (!usage) + return ERR_PTR(-ENOMEM); + ms->usage = usage; + } + subsection_map = &ms->usage->subsection_map[0]; + + if (bitmap_empty(map, SUBSECTIONS_PER_SECTION)) + rc = -EINVAL; + else if (bitmap_intersects(map, subsection_map, SUBSECTIONS_PER_SECTION)) + rc = -EEXIST; + else + bitmap_or(subsection_map, map, subsection_map, + SUBSECTIONS_PER_SECTION); + + if (rc) { + if (usage) + ms->usage = NULL; + kfree(usage); + return ERR_PTR(rc); + } + + /* + * The early init code does not consider partially populated + * initial sections, it simply assumes that memory will never be + * referenced. If we hot-add memory into such a section then we + * do not need to populate the memmap and can simply reuse what + * is already there. + */ + if (nr_pages < PAGES_PER_SECTION && is_early_section(ms)) + return pfn_to_page(pfn); + + memmap = populate_section_memmap(pfn, nr_pages, nid, altmap); + if (!memmap) { + section_deactivate(pfn, nr_pages, nid, altmap); + return ERR_PTR(-ENOMEM); + } + + return memmap; +} + /** - * sparse_add_one_section - add a memory section + * sparse_add_section - add a memory section, or populate an existing one * @nid: The node to add section on * @start_pfn: start pfn of the memory range + * @nr_pages: number of pfns to add in the section * @altmap: device page map * * This is only intended for hotplug. @@ -741,49 +895,31 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap) { unsigned long section_nr = pfn_to_section_nr(start_pfn); - struct mem_section_usage *usage; struct mem_section *ms; struct page *memmap; int ret; - /* - * no locking for this, because it does its own - * plus, it does a kmalloc - */ ret = sparse_index_init(section_nr, nid); if (ret < 0 && ret != -EEXIST) return ret; - ret = 0; - memmap = populate_section_memmap(start_pfn, PAGES_PER_SECTION, nid, - altmap); - if (!memmap) - return -ENOMEM; - usage = kzalloc(mem_section_usage_size(), GFP_KERNEL); - if (!usage) { - depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap); - return -ENOMEM; - } - ms = __pfn_to_section(start_pfn); - if (ms->section_mem_map & SECTION_MARKED_PRESENT) { - ret = -EEXIST; - goto out; - } + memmap = section_activate(nid, start_pfn, nr_pages, altmap); + if (IS_ERR(memmap)) + return PTR_ERR(memmap); + ret = 0; /* * Poison uninitialized struct pages in order to catch invalid flags * combinations. 
*/ - page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION); + page_init_poison(pfn_to_page(start_pfn), sizeof(struct page) * nr_pages); + ms = __pfn_to_section(start_pfn); section_mark_present(ms); - sparse_init_one_section(ms, section_nr, memmap, usage); + sparse_init_one_section(ms, section_nr, memmap, ms->usage); -out: - if (ret < 0) { - kfree(usage); - depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap); - } + if (ret < 0) + section_deactivate(start_pfn, nr_pages, nid, altmap); return ret; } @@ -818,54 +954,13 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages) } #endif -static void free_section_usage(struct page *memmap, - struct mem_section_usage *usage, unsigned long pfn, - unsigned long nr_pages, struct vmem_altmap *altmap) -{ - struct page *usage_page; - - if (!usage) - return; - - usage_page = virt_to_page(usage); - /* - * Check to see if allocation came from hot-plug-add - */ - if (PageSlab(usage_page) || PageCompound(usage_page)) { - kfree(usage); - if (memmap) - depopulate_section_memmap(pfn, nr_pages, altmap); - return; - } - - /* - * The usemap came from bootmem. This is packed with other usemaps - * on the section which has pgdat at boot time. Just keep it as is now. - */ - - if (memmap) - free_map_bootmem(memmap); -} - void sparse_remove_section(struct zone *zone, struct mem_section *ms, unsigned long pfn, unsigned long nr_pages, unsigned long map_offset, struct vmem_altmap *altmap) { - struct page *memmap = NULL; - struct mem_section_usage *usage = NULL; - - if (ms->section_mem_map) { - usage = ms->usage; - memmap = sparse_decode_mem_map(ms->section_mem_map, - __section_nr(ms)); - ms->section_mem_map = 0; - ms->usage = NULL; - } - - clear_hwpoisoned_pages(memmap + map_offset, - PAGES_PER_SECTION - map_offset); - free_section_usage(memmap, usage, section_nr_to_pfn(__section_nr(ms)), - PAGES_PER_SECTION, altmap); + clear_hwpoisoned_pages(pfn_to_page(pfn) + map_offset, + nr_pages - map_offset); + section_deactivate(pfn, nr_pages, zone_to_nid(zone), altmap); } #endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_MEMORY_HOTPLUG */
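The section_activate()/section_deactivate() pair above keys everything off the section's subsection_map: a request is rejected if it intersects already-populated sub-sections, and the section is only torn down once the map goes empty. A toy model of that accounting (an editorial sketch, not the kernel bitmap API; assumes 64 sub-sections of 512 pages per section and a range confined to a single section):

#include <stdio.h>
#include <stdint.h>

#define PAGES_PER_SUBSECTION 512UL      /* 2MB / 4KB pages */
#define SUBSECTIONS_PER_SECTION 64UL    /* 128MB / 2MB */

/* mask of sub-sections covered by (pfn, nr_pages) within one section */
static uint64_t subsection_mask(unsigned long pfn, unsigned long nr_pages)
{
        unsigned long idx = (pfn / PAGES_PER_SUBSECTION) % SUBSECTIONS_PER_SECTION;
        unsigned long end = ((pfn + nr_pages - 1) / PAGES_PER_SUBSECTION) %
                SUBSECTIONS_PER_SECTION;
        uint64_t mask = 0;

        for (unsigned long i = idx; i <= end; i++)
                mask |= 1ULL << i;
        return mask;
}

int main(void)
{
        uint64_t map = 0;
        uint64_t req = subsection_mask(0, 2 * PAGES_PER_SUBSECTION);

        if (map & req)
                return 1;       /* would be -EEXIST: already populated */
        map |= req;             /* activate the sub-sections */
        map ^= req;             /* deactivate them again */
        printf("section is %s\n", map ? "still populated" : "empty");
        return 0;
}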
From patchwork Mon May 6 23:40:19 2019
Subject: [PATCH v8 10/12] mm/devm_memremap_pages: Enable sub-section remap
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Toshi Kani, Jérôme Glisse, Logan Gunthorpe,
 Oscar Salvador, Pavel Tatashin, linux-nvdimm@lists.01.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, osalvador@suse.de,
 mhocko@suse.com
Date: Mon, 06 May 2019 16:40:19 -0700
Message-ID: <155718601917.130019.30099990750225408.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155718596657.130019.17139634728875079809.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155718596657.130019.17139634728875079809.stgit@dwillia2-desk3.amr.corp.intel.com>

Teach devm_memremap_pages() about the new sub-section capabilities of
arch_{add,remove}_memory(). Effectively, just replace all usage of
align_start, align_end, and align_size with res->start, res->end, and
resource_size(res). The existing sanity check will still make sure that
the two separate remap attempts do not collide within a sub-section
(2MB on x86).
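To make the before/after concrete, a standalone sketch (not kernel code;
PA_SECTION_SIZE assumed to be 128MB as on x86_64, and a hypothetical
namespace that is 2MB aligned but not section aligned):

#include <stdio.h>

#define PA_SECTION_SIZE (1ULL << 27)    /* 128MB, x86_64 */
#define ALIGN(x, a)     (((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
        unsigned long long start = 0x1e0200000ULL;      /* hypothetical base */
        unsigned long long size  = 0x3fe00000ULL;       /* hypothetical size */

        /* before this patch: pad every remap out to section boundaries */
        unsigned long long align_start = start & ~(PA_SECTION_SIZE - 1);
        unsigned long long align_size  = ALIGN(start + size, PA_SECTION_SIZE) - align_start;

        /* after this patch: remap exactly res->start / resource_size(res) */
        printf("padded: %#llx + %#llx\n", align_start, align_size);
        printf("exact:  %#llx + %#llx\n", start, size);
        return 0;
}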
Cc: Michal Hocko
Cc: Toshi Kani
Cc: Jérôme Glisse
Cc: Logan Gunthorpe
Cc: Oscar Salvador
Cc: Pavel Tatashin
Signed-off-by: Dan Williams
---
 kernel/memremap.c | 61 +++++++++++++++++++++--------------------------------
 1 file changed, 24 insertions(+), 37 deletions(-)

diff --git a/kernel/memremap.c b/kernel/memremap.c
index f355586ea54a..425904858d97 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -59,7 +59,7 @@ static unsigned long pfn_first(struct dev_pagemap *pgmap)
         struct vmem_altmap *altmap = &pgmap->altmap;
         unsigned long pfn;
 
-        pfn = res->start >> PAGE_SHIFT;
+        pfn = PHYS_PFN(res->start);
         if (pgmap->altmap_valid)
                 pfn += vmem_altmap_offset(altmap);
         return pfn;
@@ -87,7 +87,6 @@ static void devm_memremap_pages_release(void *data)
         struct dev_pagemap *pgmap = data;
         struct device *dev = pgmap->dev;
         struct resource *res = &pgmap->res;
-        resource_size_t align_start, align_size;
         unsigned long pfn;
         int nid;
 
@@ -96,25 +95,21 @@ static void devm_memremap_pages_release(void *data)
                 put_page(pfn_to_page(pfn));
 
         /* pages are dead and unused, undo the arch mapping */
-        align_start = res->start & ~(PA_SECTION_SIZE - 1);
-        align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
-                - align_start;
-
-        nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT));
+        nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start)));
 
         mem_hotplug_begin();
         if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
-                pfn = align_start >> PAGE_SHIFT;
+                pfn = PHYS_PFN(res->start);
                 __remove_pages(page_zone(pfn_to_page(pfn)), pfn,
-                                align_size >> PAGE_SHIFT, NULL);
+                                PHYS_PFN(resource_size(res)), NULL);
         } else {
-                arch_remove_memory(nid, align_start, align_size,
+                arch_remove_memory(nid, res->start, resource_size(res),
                                 pgmap->altmap_valid ? &pgmap->altmap : NULL);
-                kasan_remove_zero_shadow(__va(align_start), align_size);
+                kasan_remove_zero_shadow(__va(res->start), resource_size(res));
         }
         mem_hotplug_done();
 
-        untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
+        untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res));
         pgmap_array_delete(res);
         dev_WARN_ONCE(dev, pgmap->altmap.alloc,
                       "%s: failed to free all reserved pages\n", __func__);
@@ -141,16 +136,13 @@ static void devm_memremap_pages_release(void *data)
  */
 void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 {
-        resource_size_t align_start, align_size, align_end;
-        struct vmem_altmap *altmap = pgmap->altmap_valid ?
-                        &pgmap->altmap : NULL;
         struct resource *res = &pgmap->res;
         struct dev_pagemap *conflict_pgmap;
         struct mhp_restrictions restrictions = {
                 /*
                  * We do not want any optional features only our own memmap
                  */
-                .altmap = altmap,
+                .altmap = pgmap->altmap_valid ?
+                                &pgmap->altmap : NULL,
         };
         pgprot_t pgprot = PAGE_KERNEL;
         int error, nid, is_ram;
 
@@ -158,26 +150,21 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
         if (!pgmap->ref || !pgmap->kill)
                 return ERR_PTR(-EINVAL);
 
-        align_start = res->start & ~(PA_SECTION_SIZE - 1);
-        align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
-                - align_start;
-        align_end = align_start + align_size - 1;
-
-        conflict_pgmap = get_dev_pagemap(PHYS_PFN(align_start), NULL);
+        conflict_pgmap = get_dev_pagemap(PHYS_PFN(res->start), NULL);
         if (conflict_pgmap) {
                 dev_WARN(dev, "Conflicting mapping in same section\n");
                 put_dev_pagemap(conflict_pgmap);
                 return ERR_PTR(-ENOMEM);
         }
 
-        conflict_pgmap = get_dev_pagemap(PHYS_PFN(align_end), NULL);
+        conflict_pgmap = get_dev_pagemap(PHYS_PFN(res->end), NULL);
         if (conflict_pgmap) {
                 dev_WARN(dev, "Conflicting mapping in same section\n");
                 put_dev_pagemap(conflict_pgmap);
                 return ERR_PTR(-ENOMEM);
         }
 
-        is_ram = region_intersects(align_start, align_size,
+        is_ram = region_intersects(res->start, resource_size(res),
                 IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE);
 
         if (is_ram != REGION_DISJOINT) {
@@ -198,8 +185,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
         if (nid < 0)
                 nid = numa_mem_id();
 
-        error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(align_start), 0,
-                        align_size);
+        error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
+                        resource_size(res));
         if (error)
                 goto err_pfn_remap;
 
@@ -217,25 +204,25 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
          * arch_add_memory().
          */
         if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
-                error = add_pages(nid, align_start >> PAGE_SHIFT,
-                                align_size >> PAGE_SHIFT, &restrictions);
+                error = add_pages(nid, PHYS_PFN(res->start),
+                                PHYS_PFN(resource_size(res)), &restrictions);
         } else {
-                error = kasan_add_zero_shadow(__va(align_start), align_size);
+                error = kasan_add_zero_shadow(__va(res->start), resource_size(res));
                 if (error) {
                         mem_hotplug_done();
                         goto err_kasan;
                 }
 
-                error = arch_add_memory(nid, align_start, align_size,
-                                &restrictions);
+                error = arch_add_memory(nid, res->start, resource_size(res),
+                                &restrictions);
         }
 
         if (!error) {
                 struct zone *zone;
 
                 zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE];
-                move_pfn_range_to_zone(zone, align_start >> PAGE_SHIFT,
-                                align_size >> PAGE_SHIFT, altmap);
+                move_pfn_range_to_zone(zone, PHYS_PFN(res->start),
+                                PHYS_PFN(resource_size(res)), restrictions.altmap);
         }
 
         mem_hotplug_done();
@@ -247,8 +234,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
          * to allow us to do the work while not holding the hotplug lock.
          */
         memmap_init_zone_device(&NODE_DATA(nid)->node_zones[ZONE_DEVICE],
-                                align_start >> PAGE_SHIFT,
-                                align_size >> PAGE_SHIFT, pgmap);
+                                PHYS_PFN(res->start),
+                                PHYS_PFN(resource_size(res)), pgmap);
         percpu_ref_get_many(pgmap->ref, pfn_end(pgmap) - pfn_first(pgmap));
 
         error = devm_add_action_or_reset(dev, devm_memremap_pages_release,
@@ -259,9 +246,9 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
         return __va(res->start);
 
 err_add_memory:
-        kasan_remove_zero_shadow(__va(align_start), align_size);
+        kasan_remove_zero_shadow(__va(res->start), resource_size(res));
 err_kasan:
-        untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
+        untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res));
 err_pfn_remap:
         pgmap_array_delete(res);
 err_array:
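The PHYS_PFN() conversions the patch substitutes for the open-coded shifts
are exactly equivalent; a one-line sketch (4KB pages assumed):

#include <stdio.h>

#define PAGE_SHIFT      12      /* 4KB pages assumed */
#define PHYS_PFN(x)     ((unsigned long)((x) >> PAGE_SHIFT))

int main(void)
{
        unsigned long long res_start = 0x1e0200000ULL;  /* hypothetical base */

        /* PHYS_PFN(res->start) replaces res->start >> PAGE_SHIFT */
        printf("pfn %#lx\n", PHYS_PFN(res_start));
        return 0;
}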
From patchwork Mon May 6 23:40:24 2019
Subject: [PATCH v8 11/12] libnvdimm/pfn: Fix fsdax-mode namespace info-block zero-fields
From: Dan Williams
To: akpm@linux-foundation.org
Cc: stable@vger.kernel.org, linux-nvdimm@lists.01.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Mon, 06 May 2019 16:40:24 -0700
Message-ID: <155718602469.130019.1073417828141766553.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155718596657.130019.17139634728875079809.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155718596657.130019.17139634728875079809.stgit@dwillia2-desk3.amr.corp.intel.com>

At namespace creation time there is the potential for the "expected to
be zero" fields of a 'pfn' info-block to be filled with indeterminate
data. While the kernel buffer is zeroed on allocation, it is
immediately overwritten by nd_pfn_validate() filling it with the
current contents of the on-media info-block location. For fields like
'flags' and the 'padding' it potentially means that future
implementations can not rely on those fields being zero.

In preparation to stop using the 'start_pad' and 'end_trunc' fields for
section alignment, arrange for fields that are not explicitly
initialized to be guaranteed zero. Bump the minor version to indicate
it is safe to assume the 'padding' and 'flags' are zero. Otherwise,
this corruption is expected to be benign since all other critical
fields are explicitly initialized.

Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem")
Cc: stable@vger.kernel.org
Signed-off-by: Dan Williams
---
 drivers/nvdimm/dax_devs.c |  2 +-
 drivers/nvdimm/pfn.h      |  1 +
 drivers/nvdimm/pfn_devs.c | 18 +++++++++++++++---
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/dax_devs.c b/drivers/nvdimm/dax_devs.c
index 0453f49dc708..326f02ffca81 100644
--- a/drivers/nvdimm/dax_devs.c
+++ b/drivers/nvdimm/dax_devs.c
@@ -126,7 +126,7 @@ int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns)
         nvdimm_bus_unlock(&ndns->dev);
         if (!dax_dev)
                 return -ENOMEM;
-        pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
+        pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
         nd_pfn->pfn_sb = pfn_sb;
         rc = nd_pfn_validate(nd_pfn, DAX_SIG);
         dev_dbg(dev, "dax: %s\n", rc == 0 ? dev_name(dax_dev) : "");

diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h
index dde9853453d3..e901e3a3b04c 100644
--- a/drivers/nvdimm/pfn.h
+++ b/drivers/nvdimm/pfn.h
@@ -36,6 +36,7 @@ struct nd_pfn_sb {
         __le32 end_trunc;
         /* minor-version-2 record the base alignment of the mapping */
         __le32 align;
+        /* minor-version-3 guarantee the padding and flags are zero */
         u8 padding[4000];
         __le64 checksum;
 };

diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 01f40672507f..a2406253eb70 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -420,6 +420,15 @@ static int nd_pfn_clear_memmap_errors(struct nd_pfn *nd_pfn)
         return 0;
 }
 
+/**
+ * nd_pfn_validate - read and validate info-block
+ * @nd_pfn: fsdax namespace runtime state / properties
+ * @sig: 'devdax' or 'fsdax' signature
+ *
+ * Upon return the info-block buffer contents (->pfn_sb) are
+ * indeterminate when validation fails, and a coherent info-block
+ * otherwise.
+ */
 int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
 {
         u64 checksum, offset;
@@ -565,7 +574,7 @@ int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns)
         nvdimm_bus_unlock(&ndns->dev);
         if (!pfn_dev)
                 return -ENOMEM;
-        pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
+        pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
         nd_pfn = to_nd_pfn(pfn_dev);
         nd_pfn->pfn_sb = pfn_sb;
         rc = nd_pfn_validate(nd_pfn, PFN_SIG);
@@ -702,7 +711,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
         u64 checksum;
         int rc;
 
-        pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
+        pfn_sb = devm_kmalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
         if (!pfn_sb)
                 return -ENOMEM;
 
@@ -711,11 +720,14 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
                 sig = DAX_SIG;
         else
                 sig = PFN_SIG;
+
         rc = nd_pfn_validate(nd_pfn, sig);
         if (rc != -ENODEV)
                 return rc;
 
         /* no info block, do init */;
+        memset(pfn_sb, 0, sizeof(*pfn_sb));
+
         nd_region = to_nd_region(nd_pfn->dev.parent);
         if (nd_region->ro) {
                 dev_info(&nd_pfn->dev,
@@ -768,7 +780,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
         memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16);
         pfn_sb->version_major = cpu_to_le16(1);
-        pfn_sb->version_minor = cpu_to_le16(2);
+        pfn_sb->version_minor = cpu_to_le16(3);
         pfn_sb->start_pad = cpu_to_le32(start_pad);
         pfn_sb->end_trunc = cpu_to_le32(end_trunc);
         pfn_sb->align = cpu_to_le32(nd_pfn->align);
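The ordering that makes the zero-guarantee work is easiest to see in a
userspace sketch (read_info_block() is a hypothetical stand-in for
nd_pfn_validate(), not the driver's API): the buffer must be zeroed after a
failed validation, because validation overwrites it with on-media bytes.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct sb { unsigned int flags; unsigned char padding[4000]; };

/* pretend the media holds stale bytes and no valid info-block */
static int read_info_block(struct sb *sb)
{
        memset(sb, 0xa5, sizeof(*sb));  /* validation overwrites the buffer */
        return -ENODEV;
}

int main(void)
{
        struct sb *sb = malloc(sizeof(*sb));    /* like devm_kmalloc(): not zeroed */

        if (!sb)
                return 1;
        if (read_info_block(sb) == -ENODEV) {
                /* no info block: zero after validation, so fields the
                 * writer never sets land on media as zero and minor
                 * version 3 can promise it */
                memset(sb, 0, sizeof(*sb));
        }
        printf("flags=%u\n", sb->flags);        /* prints 0, not stale 0xa5a5a5a5 */
        free(sb);
        return 0;
}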
From patchwork Mon May 6 23:40:30 2019
Subject: [PATCH v8 12/12] libnvdimm/pfn: Stop padding pmem namespaces to section alignment
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Jeff Moyer, linux-nvdimm@lists.01.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, osalvador@suse.de, mhocko@suse.com
Date: Mon, 06 May 2019 16:40:30 -0700
Message-ID: <155718603019.130019.12886712685677568035.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155718596657.130019.17139634728875079809.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155718596657.130019.17139634728875079809.stgit@dwillia2-desk3.amr.corp.intel.com>

Now that the mm core supports section-unaligned hotplug of ZONE_DEVICE
memory, we no longer need to add padding at pfn/dax device creation
time. The kernel will still honor padding established by older
kernels.

Reported-by: Jeff Moyer
Signed-off-by: Dan Williams
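For a sense of scale, a sketch of the reservation arithmetic that replaces
the old section padding (standalone, not driver code; assumes a hypothetical
128GB namespace, 4KB pages, 2MB sub-section alignment, and 64 bytes per
struct page, matching the constants used in nd_pfn_init() below):

#include <stdio.h>

#define SZ_8K           8192ULL
#define PAGE_SHIFT      12
#define SUBSECTION_SIZE (1ULL << 21)    /* 2MB sub-sections */
#define ALIGN(x, a)     (((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
        unsigned long long start = 0x2000000000ULL;     /* hypothetical base */
        unsigned long long size = 128ULL << 30;         /* 128GB namespace */
        unsigned long long align = SUBSECTION_SIZE;     /* max(nd_pfn->align, 2MB) */
        unsigned long long npfns = (size - SZ_8K) >> PAGE_SHIFT;

        /* PFN_MODE_PMEM: reserve the info block plus one struct page per pfn */
        unsigned long long offset = ALIGN(start + SZ_8K + 64 * npfns, align) - start;

        /* pfns actually advertised once the reservation is carved out */
        npfns = (size - offset) >> PAGE_SHIFT;
        printf("memmap reserve: %llu MB, npfns: %llu\n", offset >> 20, npfns);
        return 0;
}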
---
 drivers/nvdimm/pfn.h      | 14 --------
 drivers/nvdimm/pfn_devs.c | 77 ++++++++-------------------------------------
 include/linux/mmzone.h    |  3 ++
 3 files changed, 16 insertions(+), 78 deletions(-)

diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h
index e901e3a3b04c..cc042a98758f 100644
--- a/drivers/nvdimm/pfn.h
+++ b/drivers/nvdimm/pfn.h
@@ -41,18 +41,4 @@ struct nd_pfn_sb {
         __le64 checksum;
 };
 
-#ifdef CONFIG_SPARSEMEM
-#define PFN_SECTION_ALIGN_DOWN(x) SECTION_ALIGN_DOWN(x)
-#define PFN_SECTION_ALIGN_UP(x) SECTION_ALIGN_UP(x)
-#else
-/*
- * In this case ZONE_DEVICE=n and we will disable 'pfn' device support,
- * but we still want pmem to compile.
- */
-#define PFN_SECTION_ALIGN_DOWN(x) (x)
-#define PFN_SECTION_ALIGN_UP(x) (x)
-#endif
-
-#define PHYS_SECTION_ALIGN_DOWN(x) PFN_PHYS(PFN_SECTION_ALIGN_DOWN(PHYS_PFN(x)))
-#define PHYS_SECTION_ALIGN_UP(x) PFN_PHYS(PFN_SECTION_ALIGN_UP(PHYS_PFN(x)))
 #endif /* __NVDIMM_PFN_H */

diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index a2406253eb70..7f54374b082f 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -595,14 +595,14 @@ static u32 info_block_reserve(void)
 }
 
 /*
- * We hotplug memory at section granularity, pad the reserved area from
- * the previous section base to the namespace base address.
+ * We hotplug memory at sub-section granularity, pad the reserved area
+ * from the previous section base to the namespace base address.
  */
 static unsigned long init_altmap_base(resource_size_t base)
 {
         unsigned long base_pfn = PHYS_PFN(base);
 
-        return PFN_SECTION_ALIGN_DOWN(base_pfn);
+        return SUBSECTION_ALIGN_DOWN(base_pfn);
 }
 
 static unsigned long init_altmap_reserve(resource_size_t base)
@@ -610,7 +610,7 @@ static unsigned long init_altmap_reserve(resource_size_t base)
         unsigned long reserve = info_block_reserve() >> PAGE_SHIFT;
         unsigned long base_pfn = PHYS_PFN(base);
 
-        reserve += base_pfn - PFN_SECTION_ALIGN_DOWN(base_pfn);
+        reserve += base_pfn - SUBSECTION_ALIGN_DOWN(base_pfn);
         return reserve;
 }
 
@@ -641,8 +641,7 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap)
                 nd_pfn->npfns = le64_to_cpu(pfn_sb->npfns);
                 pgmap->altmap_valid = false;
         } else if (nd_pfn->mode == PFN_MODE_PMEM) {
-                nd_pfn->npfns = PFN_SECTION_ALIGN_UP((resource_size(res)
-                                - offset) / PAGE_SIZE);
+                nd_pfn->npfns = PHYS_PFN((resource_size(res) - offset));
                 if (le64_to_cpu(nd_pfn->pfn_sb->npfns) > nd_pfn->npfns)
                         dev_info(&nd_pfn->dev,
                                         "number of pfns truncated from %lld to %ld\n",
@@ -658,54 +657,14 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap)
         return 0;
 }
 
-static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys)
-{
-        return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys),
-                        ALIGN_DOWN(phys, nd_pfn->align));
-}
-
-/*
- * Check if pmem collides with 'System RAM', or other regions when
- * section aligned. Trim it accordingly.
- */
-static void trim_pfn_device(struct nd_pfn *nd_pfn, u32 *start_pad, u32 *end_trunc)
-{
-        struct nd_namespace_common *ndns = nd_pfn->ndns;
-        struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
-        struct nd_region *nd_region = to_nd_region(nd_pfn->dev.parent);
-        const resource_size_t start = nsio->res.start;
-        const resource_size_t end = start + resource_size(&nsio->res);
-        resource_size_t adjust, size;
-
-        *start_pad = 0;
-        *end_trunc = 0;
-
-        adjust = start - PHYS_SECTION_ALIGN_DOWN(start);
-        size = resource_size(&nsio->res) + adjust;
-        if (region_intersects(start - adjust, size, IORESOURCE_SYSTEM_RAM,
-                                IORES_DESC_NONE) == REGION_MIXED
-                        || nd_region_conflict(nd_region, start - adjust, size))
-                *start_pad = PHYS_SECTION_ALIGN_UP(start) - start;
-
-        /* Now check that end of the range does not collide.
-         */
-        adjust = PHYS_SECTION_ALIGN_UP(end) - end;
-        size = resource_size(&nsio->res) + adjust;
-        if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
-                                IORES_DESC_NONE) == REGION_MIXED
-                        || !IS_ALIGNED(end, nd_pfn->align)
-                        || nd_region_conflict(nd_region, start, size))
-                *end_trunc = end - phys_pmem_align_down(nd_pfn, end);
-}
-
 static int nd_pfn_init(struct nd_pfn *nd_pfn)
 {
         struct nd_namespace_common *ndns = nd_pfn->ndns;
         struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev);
-        u32 start_pad, end_trunc, reserve = info_block_reserve();
         resource_size_t start, size;
         struct nd_region *nd_region;
+        unsigned long npfns, align;
         struct nd_pfn_sb *pfn_sb;
-        unsigned long npfns;
         phys_addr_t offset;
         const char *sig;
         u64 checksum;
@@ -736,43 +695,35 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
                 return -ENXIO;
         }
 
-        memset(pfn_sb, 0, sizeof(*pfn_sb));
-
-        trim_pfn_device(nd_pfn, &start_pad, &end_trunc);
-        if (start_pad + end_trunc)
-                dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n",
-                                dev_name(&ndns->dev), start_pad + end_trunc);
-
         /*
          * Note, we use 64 here for the standard size of struct page,
          * debugging options may cause it to be larger in which case the
          * implementation will limit the pfns advertised through
          * ->direct_access() to those that are included in the memmap.
          */
-        start = nsio->res.start + start_pad;
+        start = nsio->res.start;
         size = resource_size(&nsio->res);
-        npfns = PFN_SECTION_ALIGN_UP((size - start_pad - end_trunc - reserve)
-                        / PAGE_SIZE);
+        npfns = PHYS_PFN(size - SZ_8K);
+        align = max(nd_pfn->align, (1UL << SUBSECTION_SHIFT));
         if (nd_pfn->mode == PFN_MODE_PMEM) {
                 /*
                  * The altmap should be padded out to the block size used
                  * when populating the vmemmap. This *should* be equal to
                  * PMD_SIZE for most architectures.
                  */
-                offset = ALIGN(start + reserve + 64 * npfns,
-                                max(nd_pfn->align, PMD_SIZE)) - start;
+                offset = ALIGN(start + SZ_8K + 64 * npfns, align) - start;
         } else if (nd_pfn->mode == PFN_MODE_RAM)
-                offset = ALIGN(start + reserve, nd_pfn->align) - start;
+                offset = ALIGN(start + SZ_8K, align) - start;
         else
                 return -ENXIO;
 
-        if (offset + start_pad + end_trunc >= size) {
+        if (offset >= size) {
                 dev_err(&nd_pfn->dev,
                                 "%s unable to satisfy requested alignment\n",
                                 dev_name(&ndns->dev));
                 return -ENXIO;
         }
 
-        npfns = (size - offset - start_pad - end_trunc) / SZ_4K;
+        npfns = PHYS_PFN(size - offset);
         pfn_sb->mode = cpu_to_le32(nd_pfn->mode);
         pfn_sb->dataoff = cpu_to_le64(offset);
         pfn_sb->npfns = cpu_to_le64(npfns);
@@ -781,8 +732,6 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
         memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16);
         pfn_sb->version_major = cpu_to_le16(1);
         pfn_sb->version_minor = cpu_to_le16(3);
-        pfn_sb->start_pad = cpu_to_le32(start_pad);
-        pfn_sb->end_trunc = cpu_to_le32(end_trunc);
         pfn_sb->align = cpu_to_le32(nd_pfn->align);
         checksum = nd_sb_checksum((struct nd_gen_sb *) pfn_sb);
         pfn_sb->checksum = cpu_to_le64(checksum);

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 49e7fb452dfd..15e07f007ba2 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1181,6 +1181,9 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
 #define SUBSECTIONS_PER_SECTION (1UL << (SECTION_SIZE_BITS - SUBSECTION_SHIFT))
 #endif
 
+#define SUBSECTION_ALIGN_UP(pfn) ALIGN((pfn), PAGES_PER_SUBSECTION)
+#define SUBSECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SUBSECTION_MASK)
+
 struct mem_section_usage {
         DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION);
         /* See declaration of similar field in struct zone */