From patchwork Wed Apr 17 18:39:00 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10905853
Subject: [PATCH v6 01/12] mm/sparsemem: Introduce struct mem_section_usage
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, linux-mm@kvack.org,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com,
 david@redhat.com
Date: Wed, 17 Apr 2019 11:39:00 -0700
Message-ID: <155552634075.2015392.3371070426600230054.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Towards enabling memory hotplug to track partial population of a
section, introduce 'struct mem_section_usage'.

A pointer to a 'struct mem_section_usage' instance replaces the
existing pointer to a 'pageblock_flags' bitmap. Effectively it adds one
more 'unsigned long' beyond the 'pageblock_flags' (usemap) allocation
to house a new 'map_active' bitmap.
The new bitmap enables the memory hot{plug,remove} implementation to
act on incremental sub-divisions of a section.

The primary motivation for this functionality is to support platforms
that mix "System RAM" and "Persistent Memory" within a single section,
or multiple PMEM ranges with different mapping lifetimes within a
single section. The section restriction for hotplug has caused an
ongoing saga of hacks and bugs for devm_memremap_pages() users.

Beyond the fixups to teach existing paths how to retrieve the 'usemap'
from a section, and updates to the usemap allocation path, there are no
expected behavior changes.

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
Reviewed-by: Pavel Tatashin
---
 include/linux/mmzone.h |   23 ++++++++++++--
 mm/memory_hotplug.c    |   18 ++++++-----
 mm/page_alloc.c        |    2 +
 mm/sparse.c            |   81 ++++++++++++++++++++++++------------------------
 4 files changed, 71 insertions(+), 53 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 70394cabaf4e..f0bbd85dc19a 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1160,6 +1160,19 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
 #define SECTION_ALIGN_UP(pfn)	(((pfn) + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK)
 #define SECTION_ALIGN_DOWN(pfn)	((pfn) & PAGE_SECTION_MASK)
 
+#define SECTION_ACTIVE_SIZE ((1UL << SECTION_SIZE_BITS) / BITS_PER_LONG)
+#define SECTION_ACTIVE_MASK (~(SECTION_ACTIVE_SIZE - 1))
+
+struct mem_section_usage {
+	/*
+	 * SECTION_ACTIVE_SIZE portions of the section that are populated in
+	 * the memmap
+	 */
+	unsigned long map_active;
+	/* See declaration of similar field in struct zone */
+	unsigned long pageblock_flags[0];
+};
+
 struct page;
 struct page_ext;
 struct mem_section {
@@ -1177,8 +1190,7 @@ struct mem_section {
 	 */
 	unsigned long section_mem_map;
 
-	/* See declaration of similar field in struct zone */
-	unsigned long *pageblock_flags;
+	struct mem_section_usage *usage;
 #ifdef CONFIG_PAGE_EXTENSION
 	/*
 	 * If SPARSEMEM, pgdat doesn't have page_ext pointer. We use
@@ -1209,6 +1221,11 @@ extern struct mem_section **mem_section;
 extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
 #endif
 
+static inline unsigned long *section_to_usemap(struct mem_section *ms)
+{
+	return ms->usage->pageblock_flags;
+}
+
 static inline struct mem_section *__nr_to_section(unsigned long nr)
 {
 #ifdef CONFIG_SPARSEMEM_EXTREME
@@ -1220,7 +1237,7 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
 	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
 }
 extern int __section_nr(struct mem_section* ms);
-extern unsigned long usemap_size(void);
+extern size_t mem_section_usage_size(void);
 
 /*
  * We use the lower bits of the mem_map pointer to store
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 52fef4a81e4c..8b7415736d21 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -165,9 +165,10 @@ void put_page_bootmem(struct page *page)
 #ifndef CONFIG_SPARSEMEM_VMEMMAP
 static void register_page_bootmem_info_section(unsigned long start_pfn)
 {
-	unsigned long *usemap, mapsize, section_nr, i;
+	unsigned long mapsize, section_nr, i;
 	struct mem_section *ms;
 	struct page *page, *memmap;
+	struct mem_section_usage *usage;
 
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);
@@ -187,10 +188,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, SECTION_INFO);
 
-	usemap = ms->pageblock_flags;
-	page = virt_to_page(usemap);
+	usage = ms->usage;
+	page = virt_to_page(usage);
 
-	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+	mapsize = PAGE_ALIGN(mem_section_usage_size()) >> PAGE_SHIFT;
 
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
@@ -199,9 +200,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 #else /* CONFIG_SPARSEMEM_VMEMMAP */
 static void register_page_bootmem_info_section(unsigned long start_pfn)
 {
-	unsigned long *usemap, mapsize, section_nr, i;
+	unsigned long mapsize, section_nr, i;
 	struct mem_section *ms;
 	struct page *page, *memmap;
+	struct mem_section_usage *usage;
 
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);
@@ -210,10 +212,10 @@ static void register_page_bootmem_info_section(unsigned long start_pfn)
 
 	register_page_bootmem_memmap(section_nr, memmap, PAGES_PER_SECTION);
 
-	usemap = ms->pageblock_flags;
-	page = virt_to_page(usemap);
+	usage = ms->usage;
+	page = virt_to_page(usage);
 
-	mapsize = PAGE_ALIGN(usemap_size()) >> PAGE_SHIFT;
+	mapsize = PAGE_ALIGN(mem_section_usage_size()) >> PAGE_SHIFT;
 
 	for (i = 0; i < mapsize; i++, page++)
 		get_page_bootmem(section_nr, page, MIX_SECTION_INFO);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index deea16489e2b..f671401a7c0b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -390,7 +390,7 @@ static inline unsigned long *get_pageblock_bitmap(struct page *page,
 					unsigned long pfn)
 {
 #ifdef CONFIG_SPARSEMEM
-	return __pfn_to_section(pfn)->pageblock_flags;
+	return section_to_usemap(__pfn_to_section(pfn));
 #else
 	return page_zone(page)->pageblock_flags;
 #endif /* CONFIG_SPARSEMEM */
diff --git a/mm/sparse.c b/mm/sparse.c
index fd13166949b5..f87de7ad32c8 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -288,33 +288,31 @@ struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum)
 static void __meminit sparse_init_one_section(struct mem_section *ms,
 		unsigned long pnum, struct page *mem_map,
-		unsigned long *pageblock_bitmap)
+		struct mem_section_usage *usage)
 {
 	ms->section_mem_map &= ~SECTION_MAP_MASK;
 	ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum) |
 							SECTION_HAS_MEM_MAP;
-	ms->pageblock_flags = pageblock_bitmap;
+	ms->usage = usage;
 }
 
-unsigned long usemap_size(void)
+static unsigned long usemap_size(void)
 {
 	return BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS) * sizeof(unsigned long);
 }
 
-#ifdef CONFIG_MEMORY_HOTPLUG
-static unsigned long *__kmalloc_section_usemap(void)
+size_t mem_section_usage_size(void)
 {
-	return kmalloc(usemap_size(), GFP_KERNEL);
+	return sizeof(struct mem_section_usage) + usemap_size();
 }
-#endif /* CONFIG_MEMORY_HOTPLUG */
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-static unsigned long * __init
+static struct mem_section_usage * __init
 sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 					 unsigned long size)
 {
+	struct mem_section_usage *usage;
 	unsigned long goal, limit;
-	unsigned long *p;
 	int nid;
 	/*
 	 * A page may contain usemaps for other sections preventing the
@@ -330,15 +328,16 @@ sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 	limit = goal + (1UL << PA_SECTION_SHIFT);
 	nid = early_pfn_to_nid(goal >> PAGE_SHIFT);
 again:
-	p = memblock_alloc_try_nid(size, SMP_CACHE_BYTES, goal, limit, nid);
-	if (!p && limit) {
+	usage = memblock_alloc_try_nid(size, SMP_CACHE_BYTES, goal, limit, nid);
+	if (!usage && limit) {
 		limit = 0;
 		goto again;
 	}
-	return p;
+	return usage;
 }
 
-static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
+static void __init check_usemap_section_nr(int nid,
+		struct mem_section_usage *usage)
 {
 	unsigned long usemap_snr, pgdat_snr;
 	static unsigned long old_usemap_snr;
@@ -352,7 +351,7 @@ static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
 		old_pgdat_snr = NR_MEM_SECTIONS;
 	}
 
-	usemap_snr = pfn_to_section_nr(__pa(usemap) >> PAGE_SHIFT);
+	usemap_snr = pfn_to_section_nr(__pa(usage) >> PAGE_SHIFT);
 	pgdat_snr = pfn_to_section_nr(__pa(pgdat) >> PAGE_SHIFT);
 	if (usemap_snr == pgdat_snr)
 		return;
@@ -380,14 +379,15 @@ static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
 		usemap_snr, pgdat_snr, nid);
 }
 #else
-static unsigned long * __init
+static struct mem_section_usage * __init
 sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
 					 unsigned long size)
 {
 	return memblock_alloc_node(size, SMP_CACHE_BYTES, pgdat->node_id);
 }
 
-static void __init check_usemap_section_nr(int nid, unsigned long *usemap)
+static void __init check_usemap_section_nr(int nid,
+		struct mem_section_usage *usage)
 {
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
@@ -474,14 +474,13 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 				   unsigned long pnum_end,
 				   unsigned long map_count)
 {
-	unsigned long pnum, usemap_longs, *usemap;
+	struct mem_section_usage *usage;
+	unsigned long pnum;
 	struct page *map;
 
-	usemap_longs = BITS_TO_LONGS(SECTION_BLOCKFLAGS_BITS);
-	usemap = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
-							  usemap_size() *
-							  map_count);
-	if (!usemap) {
+	usage = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
+			mem_section_usage_size() * map_count);
+	if (!usage) {
 		pr_err("%s: node[%d] usemap allocation failed", __func__, nid);
 		goto failed;
 	}
@@ -497,9 +496,9 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 			pnum_begin = pnum;
 			goto failed;
 		}
-		check_usemap_section_nr(nid, usemap);
-		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usemap);
-		usemap += usemap_longs;
+		check_usemap_section_nr(nid, usage);
+		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage);
+		usage = (void *) usage + mem_section_usage_size();
 	}
 	sparse_buffer_fini();
 	return;
@@ -701,9 +700,9 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 		struct vmem_altmap *altmap)
 {
 	unsigned long section_nr = pfn_to_section_nr(start_pfn);
+	struct mem_section_usage *usage;
 	struct mem_section *ms;
 	struct page *memmap;
-	unsigned long *usemap;
 	int ret;
 
 	/*
@@ -717,8 +716,8 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	memmap = kmalloc_section_memmap(section_nr, nid, altmap);
 	if (!memmap)
 		return -ENOMEM;
-	usemap = __kmalloc_section_usemap();
-	if (!usemap) {
+	usage = kzalloc(mem_section_usage_size(), GFP_KERNEL);
+	if (!usage) {
 		__kfree_section_memmap(memmap, altmap);
 		return -ENOMEM;
 	}
@@ -736,11 +735,11 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn,
 	page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION);
 	section_mark_present(ms);
-	sparse_init_one_section(ms, section_nr, memmap, usemap);
+	sparse_init_one_section(ms, section_nr, memmap, usage);
 
 out:
 	if (ret < 0) {
-		kfree(usemap);
+		kfree(usage);
 		__kfree_section_memmap(memmap, altmap);
 	}
 	return ret;
@@ -777,20 +776,20 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages)
 }
 #endif
 
-static void free_section_usemap(struct page *memmap, unsigned long *usemap,
-		struct vmem_altmap *altmap)
+static void free_section_usage(struct page *memmap,
+		struct mem_section_usage *usage, struct vmem_altmap *altmap)
 {
-	struct page *usemap_page;
+	struct page *usage_page;
 
-	if (!usemap)
+	if (!usage)
 		return;
 
-	usemap_page = virt_to_page(usemap);
+	usage_page = virt_to_page(usage);
 	/*
 	 * Check to see if allocation came from hot-plug-add
 	 */
-	if (PageSlab(usemap_page) || PageCompound(usemap_page)) {
-		kfree(usemap);
+	if (PageSlab(usage_page) || PageCompound(usage_page)) {
+		kfree(usage);
 		if (memmap)
 			__kfree_section_memmap(memmap, altmap);
 		return;
 	}
@@ -809,19 +808,19 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms,
 		unsigned long map_offset, struct vmem_altmap *altmap)
 {
 	struct page *memmap = NULL;
-	unsigned long *usemap = NULL;
+	struct mem_section_usage *usage = NULL;
 
 	if (ms->section_mem_map) {
-		usemap = ms->pageblock_flags;
+		usage = ms->usage;
 		memmap = sparse_decode_mem_map(ms->section_mem_map,
 						__section_nr(ms));
 		ms->section_mem_map = 0;
-		ms->pageblock_flags = NULL;
+		ms->usage = NULL;
 	}
 
 	clear_hwpoisoned_pages(memmap + map_offset,
 			PAGES_PER_SECTION - map_offset);
-	free_section_usemap(memmap, usemap, altmap);
+	free_section_usage(memmap, usage, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 #endif /* CONFIG_MEMORY_HOTPLUG */
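
[Aside for readers of the archive: a minimal standalone sketch, not part
of the patch, of the allocation layout 'struct mem_section_usage'
implies: one 'map_active' word followed in the same allocation by the
pageblock usemap. It assumes the common 128MB section
(SECTION_SIZE_BITS == 27) and 64-bit longs; usemap_size() itself depends
on SECTION_BLOCKFLAGS_BITS and is not reproduced here.]

#include <stdio.h>

#define SECTION_SIZE_BITS	27	/* assumed x86_64 default */
#define BITS_PER_LONG		64
#define SECTION_ACTIVE_SIZE	((1UL << SECTION_SIZE_BITS) / BITS_PER_LONG)

struct mem_section_usage {
	unsigned long map_active;	/* 1 bit per SECTION_ACTIVE_SIZE chunk */
	unsigned long pageblock_flags[];	/* usemap follows in the same allocation */
};

int main(void)
{
	/* each map_active bit covers SECTION_ACTIVE_SIZE bytes: 128M/64 = 2M */
	printf("bytes per map_active bit: %lu\n", SECTION_ACTIVE_SIZE);
	/* the struct adds only one word ahead of the usemap */
	printf("overhead: %zu bytes\n", sizeof(struct mem_section_usage));
	return 0;
}

[Under those assumptions each map_active bit covers 2MB, which is where
the sub-section granularity in the later patches comes from.]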
From patchwork Wed Apr 17 18:39:05 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10905857
Subject: [PATCH v6 02/12] mm/sparsemem: Introduce common definitions for the
 size and mask of a section
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Jérôme Glisse, Logan Gunthorpe,
 linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org,
 mhocko@suse.com, david@redhat.com
Date: Wed, 17 Apr 2019 11:39:05 -0700
Message-ID: <155552634586.2015392.2662168839054356692.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Up-level the local section size and mask from kernel/memremap.c to
global definitions. These will be used by the new sub-section hotplug
support.
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Jérôme Glisse
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
Reviewed-by: Pavel Tatashin
---
 include/linux/mmzone.h |    2 ++
 kernel/memremap.c      |   10 ++++------
 mm/hmm.c               |    2 --
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index f0bbd85dc19a..6726fc175b51 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1134,6 +1134,8 @@ static inline unsigned long early_pfn_to_nid(unsigned long pfn)
  * PFN_SECTION_SHIFT		pfn to/from section number
  */
 #define PA_SECTION_SHIFT	(SECTION_SIZE_BITS)
+#define PA_SECTION_SIZE		(1UL << PA_SECTION_SHIFT)
+#define PA_SECTION_MASK		(~(PA_SECTION_SIZE-1))
 #define PFN_SECTION_SHIFT	(SECTION_SIZE_BITS - PAGE_SHIFT)
 
 #define NR_MEM_SECTIONS		(1UL << SECTIONS_SHIFT)
diff --git a/kernel/memremap.c b/kernel/memremap.c
index 4e59d29245f4..f355586ea54a 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -14,8 +14,6 @@
 #include
 
 static DEFINE_XARRAY(pgmap_array);
-#define SECTION_MASK ~((1UL << PA_SECTION_SHIFT) - 1)
-#define SECTION_SIZE (1UL << PA_SECTION_SHIFT)
 
 #if IS_ENABLED(CONFIG_DEVICE_PRIVATE)
 vm_fault_t device_private_entry_fault(struct vm_area_struct *vma,
@@ -98,8 +96,8 @@ static void devm_memremap_pages_release(void *data)
 		put_page(pfn_to_page(pfn));
 
 	/* pages are dead and unused, undo the arch mapping */
-	align_start = res->start & ~(SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
+	align_start = res->start & ~(PA_SECTION_SIZE - 1);
+	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
 		- align_start;
 
 	nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT));
@@ -160,8 +158,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap)
 	if (!pgmap->ref || !pgmap->kill)
 		return ERR_PTR(-EINVAL);
 
-	align_start = res->start & ~(SECTION_SIZE - 1);
-	align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE)
+	align_start = res->start & ~(PA_SECTION_SIZE - 1);
+	align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE)
 		- align_start;
 	align_end = align_start + align_size - 1;
diff --git a/mm/hmm.c b/mm/hmm.c
index ecd16718285e..def451a56c3e 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -34,8 +34,6 @@
 #include
 #include
 
-#define PA_SECTION_SIZE	(1UL << PA_SECTION_SHIFT)
-
 #if IS_ENABLED(CONFIG_HMM_MIRROR)
 static const struct mmu_notifier_ops hmm_mmu_notifier_ops;
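
[Aside, not from the series: the arithmetic these definitions capture is
what makes the section restriction painful for devm_memremap_pages().
A standalone sketch, assuming SECTION_SIZE_BITS == 27 and a hypothetical
resource range, showing how an unaligned 96MB range rounds out to a full
128MB section footprint:]

#include <stdio.h>

#define PA_SECTION_SHIFT	27	/* assumed SECTION_SIZE_BITS */
#define PA_SECTION_SIZE		(1UL << PA_SECTION_SHIFT)
#define PA_SECTION_MASK		(~(PA_SECTION_SIZE - 1))
#define ALIGN_UP(x, a)		(((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	unsigned long res_start = 0x148080000UL;	/* hypothetical PMEM range */
	unsigned long res_size  = 0x6000000UL;		/* 96M */

	/* the same rounding devm_memremap_pages() performs */
	unsigned long align_start = res_start & PA_SECTION_MASK;
	unsigned long align_size  = ALIGN_UP(res_start + res_size,
					     PA_SECTION_SIZE) - align_start;

	/* prints a full 128M section span for the 96M input */
	printf("mapped: [%#lx, %#lx)\n", align_start, align_start + align_size);
	return 0;
}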
From patchwork Wed Apr 17 18:39:11 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10905861
Subject: [PATCH v6 03/12] mm/sparsemem: Add helpers to track active portions
 of a section at boot
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, linux-mm@kvack.org,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com,
 david@redhat.com
Date: Wed, 17 Apr 2019 11:39:11 -0700
Message-ID: <155552635098.2015392.5460028594173939000.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Prepare for hot{plug,remove} of sub-ranges of a section by tracking a
section active bitmask, with each bit representing 2MB (SECTION_SIZE
(128M) / map_active bitmask length (64)). If it turns out that 2MB is
too large an active-tracking granularity, it is trivial to increase the
size of the map_active bitmap.

The implication of a partially populated section is that pfn_valid()
needs to go beyond a valid_section() check and read the sub-section
active ranges from the bitmask.
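
[To make the arithmetic concrete, a standalone sketch, not part of the
patch, mirroring the section_active_index() helper added in the diff
below, with assumed defaults of 128MB sections and 64-bit longs:]

#include <stdio.h>

#define SECTION_SIZE_BITS	27	/* assumed kernel default */
#define PA_SECTION_SIZE		(1UL << SECTION_SIZE_BITS)
#define PA_SECTION_MASK		(~(PA_SECTION_SIZE - 1))
#define BITS_PER_LONG		64
#define SECTION_ACTIVE_SIZE	(PA_SECTION_SIZE / BITS_PER_LONG)

/* mirrors the section_active_index() added below */
static int section_active_index(unsigned long phys)
{
	return (phys & ~PA_SECTION_MASK) / SECTION_ACTIVE_SIZE;
}

int main(void)
{
	/* a 2M chunk is one bit: offsets 0, 2M, 6M map to bits 0, 1, 3 */
	printf("bit for offset 0M: %d\n", section_active_index(0));
	printf("bit for offset 2M: %d\n", section_active_index(2UL << 20));
	printf("bit for offset 6M: %d\n", section_active_index(6UL << 20));
	return 0;
}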
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
---
 include/linux/mmzone.h |   29 ++++++++++++++++++++++++++++-
 mm/page_alloc.c        |    4 +++-
 mm/sparse.c            |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6726fc175b51..cffde898e345 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1175,6 +1175,8 @@ struct mem_section_usage {
 	unsigned long pageblock_flags[0];
 };
 
+void section_active_init(unsigned long pfn, unsigned long nr_pages);
+
 struct page;
 struct page_ext;
 struct mem_section {
@@ -1312,12 +1314,36 @@ static inline struct mem_section *__pfn_to_section(unsigned long pfn)
 
 extern int __highest_present_section_nr;
 
+static inline int section_active_index(phys_addr_t phys)
+{
+	return (phys & ~(PA_SECTION_MASK)) / SECTION_ACTIVE_SIZE;
+}
+
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
+{
+	int idx = section_active_index(PFN_PHYS(pfn));
+
+	return !!(ms->usage->map_active & (1UL << idx));
+}
+#else
+static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
+{
+	return 1;
+}
+#endif
+
 #ifndef CONFIG_HAVE_ARCH_PFN_VALID
 static inline int pfn_valid(unsigned long pfn)
 {
+	struct mem_section *ms;
+
 	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
 		return 0;
-	return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
+	ms = __nr_to_section(pfn_to_section_nr(pfn));
+	if (!valid_section(ms))
+		return 0;
+	return pfn_section_valid(ms, pfn);
 }
 #endif
 
@@ -1349,6 +1375,7 @@ void sparse_init(void);
 #define sparse_init()	do {} while (0)
 #define sparse_index_init(_sec, _nid)  do {} while (0)
 #define pfn_present pfn_valid
+#define section_active_init(_pfn, _nr_pages) do {} while (0)
 #endif /* CONFIG_SPARSEMEM */
 
 /*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f671401a7c0b..c9ad28a78018 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7273,10 +7273,12 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 
 	/* Print out the early node map */
 	pr_info("Early memory node ranges\n");
-	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
+	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
 		pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
 			(u64)start_pfn << PAGE_SHIFT,
 			((u64)end_pfn << PAGE_SHIFT) - 1);
+		section_active_init(start_pfn, end_pfn - start_pfn);
+	}
 
 	/* Initialise every node */
 	mminit_verify_pageflags_layout();
diff --git a/mm/sparse.c b/mm/sparse.c
index f87de7ad32c8..5ef2f884c4e1 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -210,6 +210,54 @@ static inline unsigned long first_present_section_nr(void)
 	return next_present_section_nr(-1);
 }
 
+static unsigned long section_active_mask(unsigned long pfn,
+		unsigned long nr_pages)
+{
+	int idx_start, idx_size;
+	phys_addr_t start, size;
+
+	if (!nr_pages)
+		return 0;
+
+	start = PFN_PHYS(pfn);
+	size = PFN_PHYS(min(nr_pages, PAGES_PER_SECTION
+				- (pfn & ~PAGE_SECTION_MASK)));
+	size = ALIGN(size, SECTION_ACTIVE_SIZE);
+
+	idx_start = section_active_index(start);
+	idx_size = section_active_index(size);
+
+	if (idx_size == 0)
+		return -1;
+	return ((1UL << idx_size) - 1) << idx_start;
+}
+
+void section_active_init(unsigned long pfn, unsigned long nr_pages)
+{
+	int end_sec = pfn_to_section_nr(pfn + nr_pages - 1);
+	int i, start_sec = pfn_to_section_nr(pfn);
+
+	if (!nr_pages)
+		return;
+
+	for (i = start_sec; i <= end_sec; i++) {
+		struct mem_section *ms;
+		unsigned long mask;
+		unsigned long pfns;
+
+		pfns = min(nr_pages, PAGES_PER_SECTION
+				- (pfn & ~PAGE_SECTION_MASK));
+		mask = section_active_mask(pfn, pfns);
+
+		ms = __nr_to_section(i);
+		pr_debug("%s: sec: %d mask: %#018lx\n", __func__, i, mask);
+		ms->usage->map_active = mask;
+
+		pfn += pfns;
+		nr_pages -= pfns;
+	}
+}
+
 /* Record a memory area against a node. */
 void __init memory_present(int nid, unsigned long start, unsigned long end)
 {
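
[Aside, not part of the series: a simplified standalone restatement of
the mask computation above, assuming 128MB sections, 4KB pages, and
64-bit longs, showing the map_active value section_active_init() would
record for a section whose first 64MB is populated:]

#include <stdio.h>

#define SECTION_SIZE_BITS	27	/* assumed */
#define PAGE_SHIFT		12	/* assumed 4K pages */
#define PA_SECTION_SIZE		(1UL << SECTION_SIZE_BITS)
#define PA_SECTION_MASK		(~(PA_SECTION_SIZE - 1))
#define BITS_PER_LONG		64
#define SECTION_ACTIVE_SIZE	(PA_SECTION_SIZE / BITS_PER_LONG)
#define ALIGN_UP(x, a)		(((x) + (a) - 1) & ~((a) - 1))

static int section_active_index(unsigned long phys)
{
	return (phys & ~PA_SECTION_MASK) / SECTION_ACTIVE_SIZE;
}

/* simplified single-section restatement of section_active_mask() */
static unsigned long active_mask(unsigned long pfn, unsigned long nr_pages)
{
	unsigned long start = pfn << PAGE_SHIFT;
	unsigned long size = ALIGN_UP(nr_pages << PAGE_SHIFT,
				      SECTION_ACTIVE_SIZE);
	int idx_start = section_active_index(start);
	int idx_size = section_active_index(size);

	if (idx_size == 0)
		return ~0UL;	/* range covers the whole section */
	return ((1UL << idx_size) - 1) << idx_start;
}

int main(void)
{
	/* first 64M of a section populated: low 32 of 64 bits set */
	printf("64M:  %#018lx\n", active_mask(0, (64UL << 20) >> PAGE_SHIFT));
	/* a fully populated 128M section: all bits set */
	printf("128M: %#018lx\n", active_mask(0, (128UL << 20) >> PAGE_SHIFT));
	return 0;
}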
From patchwork Wed Apr 17 18:39:16 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10905863
Subject: [PATCH v6 04/12] mm/hotplug: Prepare shrink_{zone, pgdat}_span for
 sub-section removal
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, linux-mm@kvack.org,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com,
 david@redhat.com
Date: Wed, 17 Apr 2019 11:39:16 -0700
Message-ID: <155552635609.2015392.6246305135559796835.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Sub-section hotplug support reduces the unit of operation of hotplug
from section-sized units (PAGES_PER_SECTION) to sub-section-sized units
(PAGES_PER_SUB_SECTION). Teach shrink_{zone,pgdat}_span() to consider
PAGES_PER_SUB_SECTION boundaries as the points where pfn_valid(), not
valid_section(), can toggle.

Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
Reviewed-by: Oscar Salvador
Reviewed-by: Pavel Tatashin
---
 include/linux/mmzone.h |    2 ++
 mm/memory_hotplug.c    |   16 ++++++++--------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index cffde898e345..b13f0cddf75e 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1164,6 +1164,8 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
 
 #define SECTION_ACTIVE_SIZE ((1UL << SECTION_SIZE_BITS) / BITS_PER_LONG)
 #define SECTION_ACTIVE_MASK (~(SECTION_ACTIVE_SIZE - 1))
+#define PAGES_PER_SUB_SECTION (SECTION_ACTIVE_SIZE / PAGE_SIZE)
+#define PAGE_SUB_SECTION_MASK (~(PAGES_PER_SUB_SECTION-1))
 
 struct mem_section_usage {
 	/*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 8b7415736d21..d5874f9d4043 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -327,10 +327,10 @@ static unsigned long find_smallest_section_pfn(int nid, struct zone *zone,
 {
 	struct mem_section *ms;
 
-	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SECTION) {
+	for (; start_pfn < end_pfn; start_pfn += PAGES_PER_SUB_SECTION) {
 		ms = __pfn_to_section(start_pfn);
 
-		if (unlikely(!valid_section(ms)))
+		if (unlikely(!pfn_valid(start_pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(start_pfn) != nid))
@@ -355,10 +355,10 @@ static unsigned long find_biggest_section_pfn(int nid, struct zone *zone,
 	/* pfn is the end pfn of a memory section. */
 	pfn = end_pfn - 1;
-	for (; pfn >= start_pfn; pfn -= PAGES_PER_SECTION) {
+	for (; pfn >= start_pfn; pfn -= PAGES_PER_SUB_SECTION) {
 		ms = __pfn_to_section(pfn);
 
-		if (unlikely(!valid_section(ms)))
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (unlikely(pfn_to_nid(pfn) != nid))
@@ -417,10 +417,10 @@ static void shrink_zone_span(struct zone *zone, unsigned long start_pfn,
 	 * it check the zone has only hole or not.
 	 */
 	pfn = zone_start_pfn;
-	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SECTION) {
+	for (; pfn < zone_end_pfn; pfn += PAGES_PER_SUB_SECTION) {
 		ms = __pfn_to_section(pfn);
 
-		if (unlikely(!valid_section(ms)))
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (page_zone(pfn_to_page(pfn)) != zone)
@@ -485,10 +485,10 @@ static void shrink_pgdat_span(struct pglist_data *pgdat,
 	 * has only hole or not.
 	 */
 	pfn = pgdat_start_pfn;
-	for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SECTION) {
+	for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SUB_SECTION) {
 		ms = __pfn_to_section(pfn);
 
-		if (unlikely(!valid_section(ms)))
+		if (unlikely(!pfn_valid(pfn)))
 			continue;
 
 		if (pfn_to_nid(pfn) != nid)
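
[Aside, not kernel code: a sketch of the stride change under assumed
values. The walkers above now probe every 2MB chunk via pfn_valid()
rather than every 128MB section via valid_section(). The is_populated()
helper below is a hypothetical stand-in for pfn_valid():]

#include <stdio.h>
#include <stdbool.h>

#define SECTION_SIZE_BITS	27	/* assumed */
#define PAGE_SHIFT		12	/* assumed 4K pages */
#define BITS_PER_LONG		64
#define SECTION_ACTIVE_SIZE	((1UL << SECTION_SIZE_BITS) / BITS_PER_LONG)
#define PAGES_PER_SUB_SECTION	(SECTION_ACTIVE_SIZE >> PAGE_SHIFT)

/* hypothetical stand-in for pfn_valid() on a partially populated span */
static bool is_populated(unsigned long pfn)
{
	return pfn >= 512;	/* pretend the first 2M chunk is a hole */
}

/* find the first live pfn, probing at sub-section granularity */
static unsigned long find_smallest_pfn(unsigned long start, unsigned long end)
{
	for (; start < end; start += PAGES_PER_SUB_SECTION)
		if (is_populated(start))
			return start;
	return end;
}

int main(void)
{
	/* 2M / 4K = 512 pfns per probe step */
	printf("stride: %lu pfns\n", PAGES_PER_SUB_SECTION);
	printf("first live pfn: %lu\n", find_smallest_pfn(0, 32768));
	return 0;
}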
From patchwork Wed Apr 17 18:39:21 2019
X-Patchwork-Submitter: Dan Williams <dan.j.williams@intel.com>
X-Patchwork-Id: 10905869
Subject: [PATCH v6 05/12] mm/sparsemem: Convert kmalloc_section_memmap() to
 populate_section_memmap()
From: Dan Williams <dan.j.williams@intel.com>
To: akpm@linux-foundation.org
Cc: Michal Hocko, David Hildenbrand, Logan Gunthorpe, linux-mm@kvack.org,
 linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com,
 david@redhat.com
Date: Wed, 17 Apr 2019 11:39:21 -0700
Message-ID: <155552636181.2015392.6062894291885124658.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

Allow sub-section sized ranges to be added to the memmap.
populate_section_memmap() takes an explicit pfn range rather than
assuming a full section, and those parameters are plumbed all the way
through to vmemmap_populate(). There should be no sub-section usage in
current deployments. New warnings are added to clarify which memmap
allocation paths are sub-section capable.

Cc: Michal Hocko
Cc: David Hildenbrand
Cc: Logan Gunthorpe
Signed-off-by: Dan Williams
---
 arch/x86/mm/init_64.c |    4 ++-
 include/linux/mm.h    |    4 ++-
 mm/sparse-vmemmap.c   |   21 +++++++++++------
 mm/sparse.c           |   61 +++++++++++++++++++++++++++++++------------------
 4 files changed, 57 insertions(+), 33 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 20d14254b686..bb018d09d2dc 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1457,7 +1457,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 {
 	int err;
 
-	if (boot_cpu_has(X86_FEATURE_PSE))
+	if (end - start < PAGES_PER_SECTION * sizeof(struct page))
+		err = vmemmap_populate_basepages(start, end, node);
+	else if (boot_cpu_has(X86_FEATURE_PSE))
 		err = vmemmap_populate_hugepages(start, end, node, altmap);
 	else if (altmap) {
 		pr_err_once("%s: no cpu support for altmap allocations\n",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 91a19229452b..3cc599fd3ae0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2750,8 +2750,8 @@ const char * arch_vma_name(struct vm_area_struct *vma);
 void print_vma_addr(char *prefix, unsigned long rip);
 
 void *sparse_buffer_alloc(unsigned long size);
-struct page *sparse_mem_map_populate(unsigned long pnum, int nid,
-		struct vmem_altmap *altmap);
+struct page * __populate_section_memmap(unsigned long pfn,
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap);
 pgd_t *vmemmap_pgd_populate(unsigned long addr, int node);
 p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node);
 pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 7fec05796796..dcb023aa23d1 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -245,19 +245,26 @@ int __meminit vmemmap_populate_basepages(unsigned long start,
 	return 0;
 }
 
-struct page * __meminit sparse_mem_map_populate(unsigned long pnum, int nid,
-		struct vmem_altmap *altmap)
+struct page * __meminit __populate_section_memmap(unsigned long pfn,
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
 	unsigned long start;
 	unsigned long end;
-	struct page *map;
 
-	map = pfn_to_page(pnum * PAGES_PER_SECTION);
-	start = (unsigned long)map;
-	end = (unsigned long)(map + PAGES_PER_SECTION);
+	/*
+	 * The minimum granularity of memmap extensions is
+	 * SECTION_ACTIVE_SIZE as allocations are tracked in the
+	 * 'map_active' bitmap of the section.
+ */ + end = ALIGN(pfn + nr_pages, PHYS_PFN(SECTION_ACTIVE_SIZE)); + pfn &= PHYS_PFN(SECTION_ACTIVE_MASK); + nr_pages = end - pfn; + + start = (unsigned long) pfn_to_page(pfn); + end = start + nr_pages * sizeof(struct page); if (vmemmap_populate(start, end, nid, altmap)) return NULL; - return map; + return pfn_to_page(pfn); } diff --git a/mm/sparse.c b/mm/sparse.c index 5ef2f884c4e1..98408c0da060 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -452,8 +452,8 @@ static unsigned long __init section_map_size(void) return PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION); } -struct page __init *sparse_mem_map_populate(unsigned long pnum, int nid, - struct vmem_altmap *altmap) +struct page __init *__populate_section_memmap(unsigned long pfn, + unsigned long nr_pages, int nid, struct vmem_altmap *altmap) { unsigned long size = section_map_size(); struct page *map = sparse_buffer_alloc(size); @@ -534,10 +534,13 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin, } sparse_buffer_init(map_count * section_map_size(), nid); for_each_present_section_nr(pnum_begin, pnum) { + unsigned long pfn = section_nr_to_pfn(pnum); + if (pnum >= pnum_end) break; - map = sparse_mem_map_populate(pnum, nid, NULL); + map = __populate_section_memmap(pfn, PAGES_PER_SECTION, + nid, NULL); if (!map) { pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.", __func__, nid); @@ -637,17 +640,17 @@ void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn) #endif #ifdef CONFIG_SPARSEMEM_VMEMMAP -static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid, - struct vmem_altmap *altmap) +static struct page *populate_section_memmap(unsigned long pfn, + unsigned long nr_pages, int nid, struct vmem_altmap *altmap) { - /* This will make the necessary allocations eventually. 
*/ - return sparse_mem_map_populate(pnum, nid, altmap); + return __populate_section_memmap(pfn, nr_pages, nid, altmap); } -static void __kfree_section_memmap(struct page *memmap, + +static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages, struct vmem_altmap *altmap) { - unsigned long start = (unsigned long)memmap; - unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION); + unsigned long start = (unsigned long) pfn_to_page(pfn); + unsigned long end = start + nr_pages * sizeof(struct page); vmemmap_free(start, end, altmap); } @@ -661,11 +664,18 @@ static void free_map_bootmem(struct page *memmap) } #endif /* CONFIG_MEMORY_HOTREMOVE */ #else -static struct page *__kmalloc_section_memmap(void) +struct page *populate_section_memmap(unsigned long pfn, + unsigned long nr_pages, int nid, struct vmem_altmap *altmap) { struct page *page, *ret; unsigned long memmap_size = sizeof(struct page) * PAGES_PER_SECTION; + if ((pfn & ~PAGE_SECTION_MASK) || nr_pages != PAGES_PER_SECTION) { + WARN(1, "%s: called with section unaligned parameters\n", + __func__); + return NULL; + } + page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size)); if (page) goto got_map_page; @@ -682,15 +692,17 @@ static struct page *__kmalloc_section_memmap(void) return ret; } -static inline struct page *kmalloc_section_memmap(unsigned long pnum, int nid, +static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages, struct vmem_altmap *altmap) { - return __kmalloc_section_memmap(); -} + struct page *memmap = pfn_to_page(pfn); + + if ((pfn & ~PAGE_SECTION_MASK) || nr_pages != PAGES_PER_SECTION) { + WARN(1, "%s: called with section unaligned parameters\n", + __func__); + return; + } -static void __kfree_section_memmap(struct page *memmap, - struct vmem_altmap *altmap) -{ if (is_vmalloc_addr(memmap)) vfree(memmap); else @@ -761,12 +773,13 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn, if (ret < 0 && ret != -EEXIST) return ret; ret = 0; - memmap = kmalloc_section_memmap(section_nr, nid, altmap); + memmap = populate_section_memmap(start_pfn, PAGES_PER_SECTION, nid, + altmap); if (!memmap) return -ENOMEM; usage = kzalloc(mem_section_usage_size(), GFP_KERNEL); if (!usage) { - __kfree_section_memmap(memmap, altmap); + depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap); return -ENOMEM; } @@ -788,7 +801,7 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn, out: if (ret < 0) { kfree(usage); - __kfree_section_memmap(memmap, altmap); + depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap); } return ret; } @@ -825,7 +838,8 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages) #endif static void free_section_usage(struct page *memmap, - struct mem_section_usage *usage, struct vmem_altmap *altmap) + struct mem_section_usage *usage, unsigned long pfn, + unsigned long nr_pages, struct vmem_altmap *altmap) { struct page *usage_page; @@ -839,7 +853,7 @@ static void free_section_usage(struct page *memmap, if (PageSlab(usage_page) || PageCompound(usage_page)) { kfree(usage); if (memmap) - __kfree_section_memmap(memmap, altmap); + depopulate_section_memmap(pfn, nr_pages, altmap); return; } @@ -868,7 +882,8 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms, clear_hwpoisoned_pages(memmap + map_offset, PAGES_PER_SECTION - map_offset); - free_section_usage(memmap, usage, altmap); + free_section_usage(memmap, usage, section_nr_to_pfn(__section_nr(ms)), + PAGES_PER_SECTION, 
altmap); } #endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_MEMORY_HOTPLUG */
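To make the sub-section rounding in __populate_section_memmap() above concrete, here is a minimal standalone sketch of the same arithmetic, assuming 4K pages and a 2MB SECTION_ACTIVE_SIZE; the constant and helper names below are illustrative stand-ins, not the kernel's:

#include <stdio.h>

#define PAGE_SHIFT 12 /* assumed 4K pages */
#define SECTION_ACTIVE_SIZE (1UL << 21) /* assumed 2MB sub-sections */
#define PFNS_PER_SUBSECTION (SECTION_ACTIVE_SIZE >> PAGE_SHIFT) /* 512 */

/* model of the pfn/nr_pages expansion at the top of __populate_section_memmap() */
static void expand_to_subsection(unsigned long *pfn, unsigned long *nr_pages)
{
	unsigned long end = *pfn + *nr_pages;

	/* round 'end' up, and 'pfn' down, to a sub-section boundary */
	end = (end + PFNS_PER_SUBSECTION - 1) & ~(PFNS_PER_SUBSECTION - 1);
	*pfn &= ~(PFNS_PER_SUBSECTION - 1);
	*nr_pages = end - *pfn;
}

int main(void)
{
	unsigned long pfn = 4352, nr_pages = 100; /* arbitrary unaligned request */

	expand_to_subsection(&pfn, &nr_pages);
	printf("populate pfn %lu..%lu (%lu pages)\n", pfn, pfn + nr_pages, nr_pages);
	return 0;
}

Under these assumptions the unaligned request pfn=4352, nr_pages=100 expands to the fully covered range 4096..4608, which is the granularity at which the section's 'map_active' bitmap tracks allocations.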
From patchwork Wed Apr 17 18:39:27 2019
Subject: [PATCH v6 06/12] mm/hotplug: Add mem-hotplug restrictions for remove_memory()
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Logan Gunthorpe, David Hildenbrand, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com, david@redhat.com
Date: Wed, 17 Apr 2019 11:39:27 -0700
Message-ID: <155552636696.2015392.12612320706815016081.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>

Teach the arch_remove_memory() path to consult the same 'struct mhp_restrictions' context as was specified at arch_add_memory() time. No functional change; this is a preparation step for teaching __remove_pages() about how and when to allow sub-section hot-remove, and a cleanup for an unnecessary "is_dev_zone()" special case.
Cc: Michal Hocko Cc: Logan Gunthorpe Cc: David Hildenbrand Signed-off-by: Dan Williams --- arch/ia64/mm/init.c | 4 ++-- arch/powerpc/mm/mem.c | 5 +++-- arch/s390/mm/init.c | 2 +- arch/sh/mm/init.c | 4 ++-- arch/x86/mm/init_32.c | 4 ++-- arch/x86/mm/init_64.c | 5 +++-- include/linux/memory_hotplug.h | 5 +++-- kernel/memremap.c | 14 ++++++++------ mm/memory_hotplug.c | 17 ++++++++--------- 9 files changed, 32 insertions(+), 28 deletions(-) diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c index d28e29103bdb..86c69c87e7e8 100644 --- a/arch/ia64/mm/init.c +++ b/arch/ia64/mm/init.c @@ -683,14 +683,14 @@ int arch_add_memory(int nid, u64 start, u64 size, #ifdef CONFIG_MEMORY_HOTREMOVE void arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap) + struct mhp_restrictions *restrictions) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; struct zone *zone; zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(zone, start_pfn, nr_pages, restrictions); } #endif #endif diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index cc9425fb9056..ccab989f397d 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -132,10 +132,11 @@ int __meminit arch_add_memory(int nid, u64 start, u64 size, #ifdef CONFIG_MEMORY_HOTREMOVE void __meminit arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap) + struct mhp_restrictions *restrictions) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; + struct vmem_altmap *altmap = restrictions->altmap; struct page *page; int ret; @@ -147,7 +148,7 @@ void __meminit arch_remove_memory(int nid, u64 start, u64 size, if (altmap) page += vmem_altmap_offset(altmap); - __remove_pages(page_zone(page), start_pfn, nr_pages, altmap); + __remove_pages(page_zone(page), start_pfn, nr_pages, restrictions); /* Remove htab bolted mappings for this section of memory */ start = (unsigned long)__va(start); diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index 31b1071315d7..3af7b99af1b1 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -235,7 +235,7 @@ int arch_add_memory(int nid, u64 start, u64 size, #ifdef CONFIG_MEMORY_HOTREMOVE void arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap) + struct mhp_restrictions *restrictions) { /* * There is no hardware or firmware interface which could trigger a diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c index 5aeb4d7099a1..3cff7e4723e6 100644 --- a/arch/sh/mm/init.c +++ b/arch/sh/mm/init.c @@ -430,14 +430,14 @@ EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid); #ifdef CONFIG_MEMORY_HOTREMOVE void arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap) + struct mhp_restrictions *restrictions) { unsigned long start_pfn = PFN_DOWN(start); unsigned long nr_pages = size >> PAGE_SHIFT; struct zone *zone; zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(zone, start_pfn, nr_pages, restrictions); } #endif #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 075e568098f2..ba888fd38f5d 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -861,14 +861,14 @@ int arch_add_memory(int nid, u64 start, u64 size, #ifdef CONFIG_MEMORY_HOTREMOVE void arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap) + struct mhp_restrictions *restrictions) { unsigned 
long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; struct zone *zone; zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(zone, start_pfn, nr_pages, restrictions); } #endif #endif diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index bb018d09d2dc..4071632be007 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1142,8 +1142,9 @@ kernel_physical_mapping_remove(unsigned long start, unsigned long end) } void __ref arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap) + struct mhp_restrictions *restrictions) { + struct vmem_altmap *altmap = restrictions->altmap; unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; struct page *page = pfn_to_page(start_pfn); @@ -1153,7 +1154,7 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size, if (altmap) page += vmem_altmap_offset(altmap); zone = page_zone(page); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(zone, start_pfn, nr_pages, restrictions); kernel_physical_mapping_remove(start, start + size); } #endif diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index ae892eef8b82..31b768bd1268 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -125,9 +125,10 @@ static inline bool movable_node_is_enabled(void) #ifdef CONFIG_MEMORY_HOTREMOVE extern void arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap); + struct mhp_restrictions *restrictions); extern void __remove_pages(struct zone *zone, unsigned long start_pfn, - unsigned long nr_pages, struct vmem_altmap *altmap); + unsigned long nr_pages, + struct mhp_restrictions *restrictions); #endif /* CONFIG_MEMORY_HOTREMOVE */ /* diff --git a/kernel/memremap.c b/kernel/memremap.c index f355586ea54a..33475e211568 100644 --- a/kernel/memremap.c +++ b/kernel/memremap.c @@ -108,8 +108,11 @@ static void devm_memremap_pages_release(void *data) __remove_pages(page_zone(pfn_to_page(pfn)), pfn, align_size >> PAGE_SHIFT, NULL); } else { - arch_remove_memory(nid, align_start, align_size, - pgmap->altmap_valid ? &pgmap->altmap : NULL); + struct mhp_restrictions restrictions = { + .altmap = pgmap->altmap_valid ? &pgmap->altmap : NULL, + }; + + arch_remove_memory(nid, align_start, align_size, &restrictions); kasan_remove_zero_shadow(__va(align_start), align_size); } mem_hotplug_done(); @@ -142,15 +145,14 @@ static void devm_memremap_pages_release(void *data) void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap) { resource_size_t align_start, align_size, align_end; - struct vmem_altmap *altmap = pgmap->altmap_valid ? - &pgmap->altmap : NULL; struct resource *res = &pgmap->res; struct dev_pagemap *conflict_pgmap; struct mhp_restrictions restrictions = { /* * We do not want any optional features only our own memmap */ - .altmap = altmap, + + .altmap = pgmap->altmap_valid ? 
&pgmap->altmap : NULL, }; pgprot_t pgprot = PAGE_KERNEL; int error, nid, is_ram; @@ -235,7 +237,7 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap) zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE]; move_pfn_range_to_zone(zone, align_start >> PAGE_SHIFT, - align_size >> PAGE_SHIFT, altmap); + align_size >> PAGE_SHIFT, restrictions.altmap); } mem_hotplug_done(); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index d5874f9d4043..055cea62be6e 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -543,7 +543,7 @@ static void __remove_section(struct zone *zone, struct mem_section *ms, * @zone: zone from which pages need to be removed * @phys_start_pfn: starting pageframe (must be aligned to start of a section) * @nr_pages: number of pages to remove (must be multiple of section size) - * @altmap: alternative device page map or %NULL if default memmap is used + * @restrictions: optional alternative device page map and other features * * Generic helper function to remove section mappings and sysfs entries * for the section of the memory we are removing. Caller needs to make @@ -551,17 +551,15 @@ static void __remove_section(struct zone *zone, struct mem_section *ms, * calling offline_pages(). */ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, - unsigned long nr_pages, struct vmem_altmap *altmap) + unsigned long nr_pages, struct mhp_restrictions *restrictions) { unsigned long i; - unsigned long map_offset = 0; int sections_to_remove; + unsigned long map_offset = 0; + struct vmem_altmap *altmap = restrictions->altmap; - /* In the ZONE_DEVICE case device driver owns the memory region */ - if (is_dev_zone(zone)) { - if (altmap) - map_offset = vmem_altmap_offset(altmap); - } + if (altmap) + map_offset = vmem_altmap_offset(altmap); clear_zone_contiguous(zone); @@ -1832,6 +1830,7 @@ static void __release_memory_resource(u64 start, u64 size) */ void __ref __remove_memory(int nid, u64 start, u64 size) { + struct mhp_restrictions restrictions = { 0 }; int ret; BUG_ON(check_hotplug_memory_range(start, size)); @@ -1853,7 +1852,7 @@ void __ref __remove_memory(int nid, u64 start, u64 size) memblock_free(start, size); memblock_remove(start, size); - arch_remove_memory(nid, start, size, NULL); + arch_remove_memory(nid, start, size, &restrictions); __release_memory_resource(start, size); try_offline_node(nid);
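The shape of the new calling convention can be modeled in a few lines of standalone C; the types here are toy stand-ins for the kernel's, meant only to show that add and remove now consume the same context object:

#include <stdio.h>

/* toy stand-ins for the kernel structures */
struct vmem_altmap { unsigned long base_pfn; };
struct mhp_restrictions { unsigned long flags; struct vmem_altmap *altmap; };

/* model of the converted arch_remove_memory() signature */
static void arch_remove_memory_model(int nid, unsigned long start,
		unsigned long size, struct mhp_restrictions *restrictions)
{
	/* the remove path derives the altmap from the shared context */
	struct vmem_altmap *altmap = restrictions->altmap;

	printf("remove nid=%d start=%#lx size=%#lx altmap=%s\n",
			nid, start, size, altmap ? "present" : "none");
}

int main(void)
{
	struct vmem_altmap altmap = { .base_pfn = 0x100000 };
	/* the same restrictions shape that arch_add_memory() was given */
	struct mhp_restrictions restrictions = { .altmap = &altmap };

	arch_remove_memory_model(0, 0x40000000UL, 0x8000000UL, &restrictions);
	return 0;
}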
From patchwork Wed Apr 17 18:39:32 2019
Subject: [PATCH v6 07/12] mm: Kill is_dev_zone() helper
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, David Hildenbrand, Logan Gunthorpe, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com, david@redhat.com
Date: Wed, 17 Apr 2019 11:39:32 -0700
Message-ID: <155552637207.2015392.16917498971420465931.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>

Given there are no more usages of is_dev_zone() outside of 'ifdef CONFIG_ZONE_DEVICE' protection, kill off the compilation helper. Cc: Michal Hocko Cc: David Hildenbrand Cc: Logan Gunthorpe Signed-off-by: Dan Williams Acked-by: David Hildenbrand Reviewed-by: Oscar Salvador Reviewed-by: Pavel Tatashin --- include/linux/mmzone.h | 12 ------------ mm/page_alloc.c | 2 +- 2 files changed, 1 insertion(+), 13 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index b13f0cddf75e..3237c5e456df 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -855,18 +855,6 @@ static inline int local_memory_node(int node_id) { return node_id; }; */ #define zone_idx(zone) ((zone) - (zone)->zone_pgdat->node_zones) -#ifdef CONFIG_ZONE_DEVICE -static inline bool is_dev_zone(const struct zone *zone) -{ - return zone_idx(zone) == ZONE_DEVICE; -} -#else -static inline bool is_dev_zone(const struct zone *zone) -{ - return false; -} -#endif - /* * Returns true if a zone has pages managed by the buddy allocator.
* All the reclaim decisions have to use this function rather than diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c9ad28a78018..fd455bd742d5 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5844,7 +5844,7 @@ void __ref memmap_init_zone_device(struct zone *zone, unsigned long start = jiffies; int nid = pgdat->node_id; - if (WARN_ON_ONCE(!pgmap || !is_dev_zone(zone))) + if (WARN_ON_ONCE(!pgmap || zone_idx(zone) != ZONE_DEVICE)) return; /*
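The open-coded replacement is exactly the page_alloc.c hunk above; schematically, for any remaining caller (handle_device_zone() below is a hypothetical example, and note that the open-coded test only compiles inside 'ifdef CONFIG_ZONE_DEVICE' protection, which is the point of the cleanup):

/* before: conditionally-compiled helper */
if (is_dev_zone(zone))
	handle_device_zone(zone);

/* after: open-coded, only valid under CONFIG_ZONE_DEVICE */
if (zone_idx(zone) == ZONE_DEVICE)
	handle_device_zone(zone);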
From patchwork Wed Apr 17 18:39:37 2019
Subject: [PATCH v6 08/12] mm/sparsemem: Prepare for sub-section ranges
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com, david@redhat.com
Date: Wed, 17 Apr 2019 11:39:37 -0700
Message-ID: <155552637717.2015392.6818206043460116960.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>

Prepare the memory hot-{add,remove} paths for handling sub-section ranges by plumbing the starting page frame and number of pages being handled through arch_{add,remove}_memory() to sparse_{add,remove}_one_section(). This is simply plumbing, small cleanups, and some identifier renames. No intended functional changes. Cc: Michal Hocko Cc: Vlastimil Babka Cc: Logan Gunthorpe Signed-off-by: Dan Williams Reviewed-by: Pavel Tatashin --- include/linux/memory_hotplug.h | 7 ++- mm/memory_hotplug.c | 103 ++++++++++++++++++++++++---------------- mm/sparse.c | 7 ++- 3 files changed, 71 insertions(+), 46 deletions(-) diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 31b768bd1268..70dd3b4d9ceb 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -355,9 +355,10 @@ extern int add_memory_resource(int nid, struct resource *resource); extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap); extern bool is_memblock_offlined(struct memory_block *mem); -extern int sparse_add_one_section(int nid, unsigned long start_pfn, - struct vmem_altmap *altmap); +extern int sparse_add_section(int nid, unsigned long pfn, + unsigned long nr_pages, struct vmem_altmap *altmap); -extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms, +extern void sparse_remove_section(struct zone *zone, struct mem_section *ms, + unsigned long pfn, unsigned long nr_pages, unsigned long map_offset, struct vmem_altmap *altmap); extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map, unsigned long pnum); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 055cea62be6e..6622c4d06ac3 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -251,22 +251,44 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat) } #endif /* CONFIG_HAVE_BOOTMEM_INFO_NODE */ -static int __meminit __add_section(int nid, unsigned long phys_start_pfn, - struct vmem_altmap *altmap, bool want_memblock) +static int __meminit __add_section(int nid, unsigned long pfn, + unsigned long nr_pages, struct vmem_altmap *altmap, + bool want_memblock) { int ret; - if (pfn_valid(phys_start_pfn)) + if (pfn_valid(pfn)) return -EEXIST; - ret = sparse_add_one_section(nid, phys_start_pfn, altmap); + ret = sparse_add_section(nid, pfn, nr_pages, altmap); if (ret < 0) return ret; if (!want_memblock) return 0; - return hotplug_memory_register(nid, __pfn_to_section(phys_start_pfn)); + return hotplug_memory_register(nid, __pfn_to_section(pfn)); +} + +static int subsection_check(unsigned long pfn, unsigned long nr_pages, + struct mhp_restrictions *restrictions, const char *reason) +{ + /* + * Only allow partial section hotplug for !memblock ranges, + * since register_new_memory() requires section alignment, and + * CONFIG_SPARSEMEM_VMEMMAP=n requires sections to be fully + * populated. + */ + if ((!IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) + || (restrictions->flags & MHP_MEMBLOCK_API)) + && ((pfn & ~PAGE_SECTION_MASK) + || (nr_pages & ~PAGE_SECTION_MASK))) { + WARN(1, "Sub-section hot-%s incompatible with %s\n", reason, + (restrictions->flags & MHP_MEMBLOCK_API) + ? "memblock api" : "!CONFIG_SPARSEMEM_VMEMMAP"); + return -EINVAL; + } + return 0; } /* @@ -275,23 +297,19 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn, * call this function after deciding the zone to which to * add the new pages.
*/ -int __ref __add_pages(int nid, unsigned long phys_start_pfn, - unsigned long nr_pages, struct mhp_restrictions *restrictions) +int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages, + struct mhp_restrictions *restrictions) { unsigned long i; int err = 0; int start_sec, end_sec; struct vmem_altmap *altmap = restrictions->altmap; - /* during initialize mem_map, align hot-added range to section */ - start_sec = pfn_to_section_nr(phys_start_pfn); - end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1); - if (altmap) { /* * Validate altmap is within bounds of the total request */ - if (altmap->base_pfn != phys_start_pfn + if (altmap->base_pfn != pfn || vmem_altmap_offset(altmap) > nr_pages) { pr_warn_once("memory add fail, invalid altmap\n"); err = -EINVAL; @@ -300,9 +318,17 @@ int __ref __add_pages(int nid, unsigned long phys_start_pfn, altmap->alloc = 0; } + start_sec = pfn_to_section_nr(pfn); + end_sec = pfn_to_section_nr(pfn + nr_pages - 1); for (i = start_sec; i <= end_sec; i++) { - err = __add_section(nid, section_nr_to_pfn(i), altmap, + unsigned long pfns; + + pfns = min(nr_pages, PAGES_PER_SECTION + - (pfn & ~PAGE_SECTION_MASK)); + err = __add_section(nid, pfn, pfns, altmap, restrictions->flags & MHP_MEMBLOCK_API); + pfn += pfns; + nr_pages -= pfns; /* * EEXIST is finally dealt with by ioresource collision @@ -507,10 +533,10 @@ static void shrink_pgdat_span(struct pglist_data *pgdat, pgdat->node_spanned_pages = 0; } -static void __remove_zone(struct zone *zone, unsigned long start_pfn) +static void __remove_zone(struct zone *zone, unsigned long start_pfn, + unsigned long nr_pages) { struct pglist_data *pgdat = zone->zone_pgdat; - int nr_pages = PAGES_PER_SECTION; unsigned long flags; pgdat_resize_lock(zone->zone_pgdat, &flags); @@ -519,29 +545,26 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn) pgdat_resize_unlock(zone->zone_pgdat, &flags); } -static void __remove_section(struct zone *zone, struct mem_section *ms, - unsigned long map_offset, - struct vmem_altmap *altmap) +static void __remove_section(struct zone *zone, unsigned long pfn, + unsigned long nr_pages, unsigned long map_offset, + struct vmem_altmap *altmap) { - unsigned long start_pfn; - int scn_nr; + struct mem_section *ms = __nr_to_section(pfn_to_section_nr(pfn)); if (WARN_ON_ONCE(!valid_section(ms))) return; unregister_memory_section(ms); - scn_nr = __section_nr(ms); - start_pfn = section_nr_to_pfn((unsigned long)scn_nr); - __remove_zone(zone, start_pfn); + __remove_zone(zone, pfn, nr_pages); - sparse_remove_one_section(zone, ms, map_offset, altmap); + sparse_remove_section(zone, ms, pfn, nr_pages, map_offset, altmap); } /** * __remove_pages() - remove sections of pages from a zone * @zone: zone from which pages need to be removed - * @phys_start_pfn: starting pageframe (must be aligned to start of a section) + * @pfn: starting pageframe (must be aligned to start of a section) * @nr_pages: number of pages to remove (must be multiple of section size) * @restrictions: optional alternative device page map and other features * @@ -550,11 +573,10 @@ static void __remove_section(struct zone *zone, struct mem_section *ms, * sure that pages are marked reserved and zones are adjust properly by * calling offline_pages(). 
*/ -void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, - unsigned long nr_pages, struct mhp_restrictions *restrictions) +void __remove_pages(struct zone *zone, unsigned long pfn, + unsigned long nr_pages, struct mhp_restrictions *restrictions) { - unsigned long i; - int sections_to_remove; + int i, start_sec, end_sec; unsigned long map_offset = 0; struct vmem_altmap *altmap = restrictions->altmap; @@ -563,19 +585,20 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, clear_zone_contiguous(zone); - /* - * We can only remove entire sections - */ - BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK); - BUG_ON(nr_pages % PAGES_PER_SECTION); + if (subsection_check(pfn, nr_pages, restrictions, "remove")) + return; - sections_to_remove = nr_pages / PAGES_PER_SECTION; - for (i = 0; i < sections_to_remove; i++) { - unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION; + start_sec = pfn_to_section_nr(pfn); + end_sec = pfn_to_section_nr(pfn + nr_pages - 1); + for (i = start_sec; i <= end_sec; i++) { + unsigned long pfns; cond_resched(); - __remove_section(zone, __pfn_to_section(pfn), map_offset, - altmap); + pfns = min(nr_pages, PAGES_PER_SECTION + - (pfn & ~PAGE_SECTION_MASK)); + __remove_section(zone, pfn, pfns, map_offset, altmap); + pfn += pfns; + nr_pages -= pfns; map_offset = 0; } @@ -1830,7 +1853,7 @@ static void __release_memory_resource(u64 start, u64 size) */ void __ref __remove_memory(int nid, u64 start, u64 size) { - struct mhp_restrictions restrictions = { 0 }; + struct mhp_restrictions restrictions = { .flags = MHP_MEMBLOCK_API }; int ret; BUG_ON(check_hotplug_memory_range(start, size)); diff --git a/mm/sparse.c b/mm/sparse.c index 98408c0da060..bd45bff78ca1 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -756,8 +756,8 @@ static void free_map_bootmem(struct page *memmap) * * -EEXIST - Section has been present. * * -ENOMEM - Out of memory. 
*/ -int __meminit sparse_add_one_section(int nid, unsigned long start_pfn, - struct vmem_altmap *altmap) +int __meminit sparse_add_section(int nid, unsigned long start_pfn, + unsigned long nr_pages, struct vmem_altmap *altmap) { unsigned long section_nr = pfn_to_section_nr(start_pfn); struct mem_section_usage *usage; @@ -866,7 +866,8 @@ static void free_section_usage(struct page *memmap, free_map_bootmem(memmap); } -void sparse_remove_one_section(struct zone *zone, struct mem_section *ms, +void sparse_remove_section(struct zone *zone, struct mem_section *ms, + unsigned long pfn, unsigned long nr_pages, unsigned long map_offset, struct vmem_altmap *altmap) { struct page *memmap = NULL;
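The chunking idiom that __add_pages() and __remove_pages() now share can be demonstrated standalone; the sketch below assumes the common x86-64 geometry of 128MB sections holding 32768 4K pages (constants illustrative):

#include <stdio.h>

#define PAGES_PER_SECTION 32768UL /* assumed: 128MB sections of 4K pages */
#define PAGE_SECTION_MASK (~(PAGES_PER_SECTION - 1))

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

int main(void)
{
	/* an arbitrary range that straddles a section boundary */
	unsigned long pfn = 32000, nr_pages = 40000;

	while (nr_pages) {
		/* clamp each step to the end of the current section */
		unsigned long pfns = min_ul(nr_pages,
				PAGES_PER_SECTION - (pfn & ~PAGE_SECTION_MASK));

		printf("section %lu: pfn %lu, %lu pages\n",
				pfn / PAGES_PER_SECTION, pfn, pfns);
		pfn += pfns;
		nr_pages -= pfns;
	}
	return 0;
}

A straddling range decomposes into a partial leading span, whole sections, and a partial trailing span: precisely the shapes that subsection_check() has to vet against the memblock and !CONFIG_SPARSEMEM_VMEMMAP restrictions.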
From patchwork Wed Apr 17 18:39:42 2019
Subject: [PATCH v6 09/12] mm/sparsemem: Support sub-section hotplug
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Vlastimil Babka, Logan Gunthorpe, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com, david@redhat.com
Date: Wed, 17 Apr 2019 11:39:42 -0700
Message-ID: <155552638228.2015392.2866282581991830795.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>

The libnvdimm sub-system has suffered a series of hacks and broken workarounds for the memory-hotplug implementation's awkward section-aligned (128MB) granularity. For example, the following backtrace is emitted when attempting arch_add_memory() with physical address ranges that intersect 'System RAM' (RAM) with 'Persistent Memory' (PMEM) within a given section:

WARNING: CPU: 0 PID: 558 at kernel/memremap.c:300 devm_memremap_pages+0x3b5/0x4c0
devm_memremap_pages attempted on mixed region [mem 0x200000000-0x2fbffffff flags 0x200]
[..]
Call Trace:
 dump_stack+0x86/0xc3
 __warn+0xcb/0xf0
 warn_slowpath_fmt+0x5f/0x80
 devm_memremap_pages+0x3b5/0x4c0
 __wrap_devm_memremap_pages+0x58/0x70 [nfit_test_iomap]
 pmem_attach_disk+0x19a/0x440 [nd_pmem]

Recently it was discovered that the problem goes beyond RAM vs PMEM collisions as some platforms produce PMEM vs PMEM collisions within a given section. The libnvdimm workaround for that case revealed that the libnvdimm section-alignment-padding implementation has been broken for a long while. A fix for that long-standing breakage introduces as many problems as it solves as it would require a backward-incompatible change to the namespace metadata interpretation. Instead of that dubious route [1], address the root problem in the memory-hotplug implementation.
[1]: https://lore.kernel.org/r/155000671719.348031.2347363160141119237.stgit@dwillia2-desk3.amr.corp.intel.com Cc: Michal Hocko Cc: Vlastimil Babka Cc: Logan Gunthorpe Signed-off-by: Dan Williams --- mm/sparse.c | 224 ++++++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 150 insertions(+), 74 deletions(-) diff --git a/mm/sparse.c b/mm/sparse.c index bd45bff78ca1..3411321998b1 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -83,8 +83,15 @@ static int __meminit sparse_index_init(unsigned long section_nr, int nid) unsigned long root = SECTION_NR_TO_ROOT(section_nr); struct mem_section *section; + /* + * An existing section is possible in the sub-section hotplug + * case. First hot-add instantiates, follow-on hot-add reuses + * the existing section. + * + * The mem_hotplug_lock resolves the apparent race below. + */ if (mem_section[root]) - return -EEXIST; + return 0; section = sparse_index_alloc(nid); if (!section) @@ -338,6 +345,15 @@ static void __meminit sparse_init_one_section(struct mem_section *ms, unsigned long pnum, struct page *mem_map, struct mem_section_usage *usage) { + /* + * Given that SPARSEMEM_VMEMMAP=y supports sub-section hotplug, + * ->section_mem_map can not be guaranteed to point to a full + * section's worth of memory. The field is only valid / used + * in the SPARSEMEM_VMEMMAP=n case. + */ + if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP)) + mem_map = NULL; + ms->section_mem_map &= ~SECTION_MAP_MASK; ms->section_mem_map |= sparse_encode_mem_map(mem_map, pnum) | SECTION_HAS_MEM_MAP; @@ -743,10 +759,130 @@ static void free_map_bootmem(struct page *memmap) #endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_SPARSEMEM_VMEMMAP */ +#ifndef CONFIG_MEMORY_HOTREMOVE +static void free_map_bootmem(struct page *memmap) +{ +} +#endif + +static bool is_early_section(struct mem_section *ms) +{ + struct page *usage_page; + + usage_page = virt_to_page(ms->usage); + if (PageSlab(usage_page) || PageCompound(usage_page)) + return false; + else + return true; +} + +static void section_deactivate(unsigned long pfn, unsigned long nr_pages, + int nid, struct vmem_altmap *altmap) +{ + unsigned long mask = section_active_mask(pfn, nr_pages); + struct mem_section *ms = __pfn_to_section(pfn); + bool early_section = is_early_section(ms); + struct page *memmap = NULL; + + if (WARN(!ms->usage || (ms->usage->map_active & mask) != mask, + "section already deactivated: active: %#lx mask: %#lx\n", + ms->usage ? ms->usage->map_active : 0, mask)) + return; + + if (WARN(!IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) + && nr_pages < PAGES_PER_SECTION, + "partial memory section removal not supported\n")) + return; + + /* + * There are 3 cases to handle across two configurations + * (SPARSEMEM_VMEMMAP={y,n}): + * + * 1/ deactivation of a partial hot-added section (only possible + * in the SPARSEMEM_VMEMMAP=y case). + * a/ section was present at memory init + * b/ section was hot-added post memory init + * 2/ deactivation of a complete hot-added section + * 3/ deactivation of a complete section from memory init + * + * For 1/, when map_active does not go to zero we will not be + * freeing the usage map, but still need to free the vmemmap + * range. 
+ * + * For 2/ and 3/ the SPARSEMEM_VMEMMAP={y,n} cases are unified + */ + ms->usage->map_active ^= mask; + if (ms->usage->map_active == 0) { + unsigned long section_nr = pfn_to_section_nr(pfn); + + if (!early_section) { + kfree(ms->usage); + ms->usage = NULL; + } + memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr); + ms->section_mem_map = sparse_encode_mem_map(NULL, section_nr); + } + + if (early_section && memmap) + free_map_bootmem(memmap); + else + depopulate_section_memmap(pfn, nr_pages, altmap); +} + +static struct page * __meminit section_activate(int nid, unsigned long pfn, + unsigned long nr_pages, struct vmem_altmap *altmap) +{ + unsigned long mask = section_active_mask(pfn, nr_pages); + struct mem_section *ms = __pfn_to_section(pfn); + struct mem_section_usage *usage = NULL; + struct page *memmap; + int rc = 0; + + if (!ms->usage) { + usage = kzalloc(mem_section_usage_size(), GFP_KERNEL); + if (!usage) + return ERR_PTR(-ENOMEM); + ms->usage = usage; + } + + if (!mask) + rc = -EINVAL; + else if (mask & ms->usage->map_active) + rc = -EEXIST; + else + ms->usage->map_active |= mask; + + if (rc) { + if (usage) + ms->usage = NULL; + kfree(usage); + return ERR_PTR(rc); + } + + /* + * The early init code does not consider partially populated + * initial sections, it simply assumes that memory will never be + * referenced. If we hot-add memory into such a section then we + * do not need to populate the memmap and can simply reuse what + * is already there. + */ + if (nr_pages < PAGES_PER_SECTION && is_early_section(ms)) + return pfn_to_page(pfn); + + memmap = populate_section_memmap(pfn, nr_pages, nid, altmap); + if (!memmap) { + section_deactivate(pfn, nr_pages, nid, altmap); + return ERR_PTR(-ENOMEM); + } + + return memmap; +} + /** - * sparse_add_one_section - add a memory section + * sparse_add_section - add a memory section, or populate an existing one * @nid: The node to add section on * @start_pfn: start pfn of the memory range + * @nr_pages: number of pfns to add in the section * @altmap: device page map * * This is only intended for hotplug. @@ -760,49 +896,30 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap) { unsigned long section_nr = pfn_to_section_nr(start_pfn); - struct mem_section_usage *usage; - struct mem_section *ms; + struct mem_section *ms = __pfn_to_section(start_pfn); struct page *memmap; int ret; - /* - * no locking for this, because it does its own - * plus, it does a kmalloc - */ ret = sparse_index_init(section_nr, nid); if (ret < 0 && ret != -EEXIST) return ret; - ret = 0; - memmap = populate_section_memmap(start_pfn, PAGES_PER_SECTION, nid, - altmap); - if (!memmap) - return -ENOMEM; - usage = kzalloc(mem_section_usage_size(), GFP_KERNEL); - if (!usage) { - depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap); - return -ENOMEM; - } - ms = __pfn_to_section(start_pfn); - if (ms->section_mem_map & SECTION_MARKED_PRESENT) { - ret = -EEXIST; - goto out; - } + memmap = section_activate(nid, start_pfn, nr_pages, altmap); + if (IS_ERR(memmap)) + return PTR_ERR(memmap); + ret = 0; /* * Poison uninitialized struct pages in order to catch invalid flags * combinations. 
*/ - page_init_poison(memmap, sizeof(struct page) * PAGES_PER_SECTION); + page_init_poison(pfn_to_page(start_pfn), sizeof(struct page) * nr_pages); section_mark_present(ms); - sparse_init_one_section(ms, section_nr, memmap, usage); + sparse_init_one_section(ms, section_nr, memmap, ms->usage); -out: - if (ret < 0) { - kfree(usage); - depopulate_section_memmap(start_pfn, PAGES_PER_SECTION, altmap); - } + if (ret < 0) + section_deactivate(start_pfn, nr_pages, nid, altmap); return ret; } @@ -837,54 +954,13 @@ static inline void clear_hwpoisoned_pages(struct page *memmap, int nr_pages) } #endif -static void free_section_usage(struct page *memmap, - struct mem_section_usage *usage, unsigned long pfn, - unsigned long nr_pages, struct vmem_altmap *altmap) -{ - struct page *usage_page; - - if (!usage) - return; - - usage_page = virt_to_page(usage); - /* - * Check to see if allocation came from hot-plug-add - */ - if (PageSlab(usage_page) || PageCompound(usage_page)) { - kfree(usage); - if (memmap) - depopulate_section_memmap(pfn, nr_pages, altmap); - return; - } - - /* - * The usemap came from bootmem. This is packed with other usemaps - * on the section which has pgdat at boot time. Just keep it as is now. - */ - - if (memmap) - free_map_bootmem(memmap); -} - void sparse_remove_section(struct zone *zone, struct mem_section *ms, unsigned long pfn, unsigned long nr_pages, unsigned long map_offset, struct vmem_altmap *altmap) { - struct page *memmap = NULL; - struct mem_section_usage *usage = NULL; - - if (ms->section_mem_map) { - usage = ms->usage; - memmap = sparse_decode_mem_map(ms->section_mem_map, - __section_nr(ms)); - ms->section_mem_map = 0; - ms->usage = NULL; - } - - clear_hwpoisoned_pages(memmap + map_offset, - PAGES_PER_SECTION - map_offset); - free_section_usage(memmap, usage, section_nr_to_pfn(__section_nr(ms)), - PAGES_PER_SECTION, altmap); + clear_hwpoisoned_pages(pfn_to_page(pfn) + map_offset, + nr_pages - map_offset); + section_deactivate(pfn, nr_pages, zone_to_nid(zone), altmap); } #endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_MEMORY_HOTPLUG */
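To make the map_active bookkeeping in section_activate()/section_deactivate() above concrete, here is a minimal userspace sketch of the same idea. It is not kernel code: the section and sub-section sizes are hard-coded to x86_64-like values (128MB sections, 2MB sub-sections), and toy_activate(), toy_deactivate(), and subsection_mask() are invented stand-ins for section_activate(), section_deactivate(), and section_active_mask(). It only demonstrates how a per-section bitmask decides when the usage map and memmap can finally be torn down.

/*
 * Minimal model of per-section sub-section tracking (illustrative only,
 * not the kernel implementation). Assumes 128MB sections split into
 * 64 x 2MB sub-sections, as on x86_64.
 */
#include <stdint.h>
#include <stdio.h>

#define SUBSECTION_SHIFT 21	/* 2MB sub-sections */
#define SECTION_SHIFT 27	/* 128MB sections */
#define SUBSECTIONS_PER_SECTION (1UL << (SECTION_SHIFT - SUBSECTION_SHIFT))

struct toy_section {
	uint64_t map_active;	/* one bit per live sub-section */
};

/* Bitmask of the sub-sections covered by [phys, phys + size). */
static uint64_t subsection_mask(uint64_t phys, uint64_t size)
{
	uint64_t first = (phys >> SUBSECTION_SHIFT) % SUBSECTIONS_PER_SECTION;
	uint64_t nr = size >> SUBSECTION_SHIFT;
	uint64_t bits = nr >= 64 ? ~0ULL : (1ULL << nr) - 1;

	return bits << first;
}

/* Mirrors the map_active collision checks in section_activate(). */
static int toy_activate(struct toy_section *ms, uint64_t phys, uint64_t size)
{
	uint64_t mask = subsection_mask(phys, size);

	if (!mask || (mask & ms->map_active))
		return -1;	/* empty range, or sub-section already live */
	ms->map_active |= mask;
	return 0;
}

/* Mirrors section_deactivate(): teardown happens only when the last
 * active sub-section goes away. */
static void toy_deactivate(struct toy_section *ms, uint64_t phys, uint64_t size)
{
	ms->map_active ^= subsection_mask(phys, size);
	if (!ms->map_active)
		printf("last sub-section gone: free usage map and memmap\n");
}

int main(void)
{
	struct toy_section ms = { 0 };

	toy_activate(&ms, 0x000000, 0x400000);	/* sub-sections 0 and 1 */
	toy_activate(&ms, 0x400000, 0x200000);	/* sub-section 2 */
	printf("map_active: %#llx\n", (unsigned long long)ms.map_active);
	toy_deactivate(&ms, 0x000000, 0x400000);
	toy_deactivate(&ms, 0x400000, 0x200000);
	return 0;
}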
From patchwork Wed Apr 17 18:39:47 2019
Subject: [PATCH v6 10/12] mm/devm_memremap_pages: Enable sub-section remap
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Michal Hocko, Toshi Kani, Jérôme Glisse, Logan Gunthorpe, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com, david@redhat.com
Date: Wed, 17 Apr 2019 11:39:47 -0700
Message-ID: <155552638740.2015392.3474262226224495313.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>

Teach devm_memremap_pages() about the new sub-section capabilities of arch_{add,remove}_memory(). Effectively, just replace all usage of align_start, align_end, and align_size with res->start, res->end, and resource_size(res). The existing sanity check will still make sure that the two separate remap attempts do not collide within a sub-section (2MB on x86).
Cc: Michal Hocko Cc: Toshi Kani Cc: Jérôme Glisse Cc: Logan Gunthorpe Signed-off-by: Dan Williams --- kernel/memremap.c | 58 ++++++++++++++++++++++------------------------------- 1 file changed, 24 insertions(+), 34 deletions(-) diff --git a/kernel/memremap.c b/kernel/memremap.c index 33475e211568..ca74a006371a 100644 --- a/kernel/memremap.c +++ b/kernel/memremap.c @@ -59,7 +59,7 @@ static unsigned long pfn_first(struct dev_pagemap *pgmap) struct vmem_altmap *altmap = &pgmap->altmap; unsigned long pfn; - pfn = res->start >> PAGE_SHIFT; + pfn = PHYS_PFN(res->start); if (pgmap->altmap_valid) pfn += vmem_altmap_offset(altmap); return pfn; @@ -87,7 +87,6 @@ static void devm_memremap_pages_release(void *data) struct dev_pagemap *pgmap = data; struct device *dev = pgmap->dev; struct resource *res = &pgmap->res; - resource_size_t align_start, align_size; unsigned long pfn; int nid; @@ -96,28 +95,25 @@ static void devm_memremap_pages_release(void *data) put_page(pfn_to_page(pfn)); /* pages are dead and unused, undo the arch mapping */ - align_start = res->start & ~(PA_SECTION_SIZE - 1); - align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE) - - align_start; - - nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT)); + nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start))); mem_hotplug_begin(); if (pgmap->type == MEMORY_DEVICE_PRIVATE) { - pfn = align_start >> PAGE_SHIFT; + pfn = PHYS_PFN(res->start); __remove_pages(page_zone(pfn_to_page(pfn)), pfn, - align_size >> PAGE_SHIFT, NULL); + PHYS_PFN(resource_size(res)), NULL); } else { struct mhp_restrictions restrictions = { .altmap = pgmap->altmap_valid ? &pgmap->altmap : NULL, }; - arch_remove_memory(nid, align_start, align_size, &restrictions); - kasan_remove_zero_shadow(__va(align_start), align_size); + arch_remove_memory(nid, res->start, resource_size(res), + &restrictions); + kasan_remove_zero_shadow(__va(res->start), resource_size(res)); } mem_hotplug_done(); - untrack_pfn(NULL, PHYS_PFN(align_start), align_size); + untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res)); pgmap_array_delete(res); dev_WARN_ONCE(dev, pgmap->altmap.alloc, "%s: failed to free all reserved pages\n", __func__); @@ -144,7 +140,6 @@ static void devm_memremap_pages_release(void *data) */ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap) { - resource_size_t align_start, align_size, align_end; struct resource *res = &pgmap->res; struct dev_pagemap *conflict_pgmap; struct mhp_restrictions restrictions = { @@ -160,26 +155,21 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap) if (!pgmap->ref || !pgmap->kill) return ERR_PTR(-EINVAL); - align_start = res->start & ~(PA_SECTION_SIZE - 1); - align_size = ALIGN(res->start + resource_size(res), PA_SECTION_SIZE) - - align_start; - align_end = align_start + align_size - 1; - - conflict_pgmap = get_dev_pagemap(PHYS_PFN(align_start), NULL); + conflict_pgmap = get_dev_pagemap(PHYS_PFN(res->start), NULL); if (conflict_pgmap) { dev_WARN(dev, "Conflicting mapping in same section\n"); put_dev_pagemap(conflict_pgmap); return ERR_PTR(-ENOMEM); } - conflict_pgmap = get_dev_pagemap(PHYS_PFN(align_end), NULL); + conflict_pgmap = get_dev_pagemap(PHYS_PFN(res->end), NULL); if (conflict_pgmap) { dev_WARN(dev, "Conflicting mapping in same section\n"); put_dev_pagemap(conflict_pgmap); return ERR_PTR(-ENOMEM); } - is_ram = region_intersects(align_start, align_size, + is_ram = region_intersects(res->start, resource_size(res), IORESOURCE_SYSTEM_RAM, IORES_DESC_NONE); if (is_ram != 
REGION_DISJOINT) { @@ -200,8 +190,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap) if (nid < 0) nid = numa_mem_id(); - error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(align_start), 0, - align_size); + error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0, + resource_size(res)); if (error) goto err_pfn_remap; @@ -219,25 +209,25 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap) * arch_add_memory(). */ if (pgmap->type == MEMORY_DEVICE_PRIVATE) { - error = add_pages(nid, align_start >> PAGE_SHIFT, - align_size >> PAGE_SHIFT, &restrictions); + error = add_pages(nid, PHYS_PFN(res->start), + PHYS_PFN(resource_size(res)), &restrictions); } else { - error = kasan_add_zero_shadow(__va(align_start), align_size); + error = kasan_add_zero_shadow(__va(res->start), resource_size(res)); if (error) { mem_hotplug_done(); goto err_kasan; } - error = arch_add_memory(nid, align_start, align_size, - &restrictions); + error = arch_add_memory(nid, res->start, resource_size(res), + &restrictions); } if (!error) { struct zone *zone; zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE]; - move_pfn_range_to_zone(zone, align_start >> PAGE_SHIFT, - align_size >> PAGE_SHIFT, restrictions.altmap); + move_pfn_range_to_zone(zone, PHYS_PFN(res->start), + PHYS_PFN(resource_size(res)), restrictions.altmap); } mem_hotplug_done(); @@ -249,8 +239,8 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap) * to allow us to do the work while not holding the hotplug lock. */ memmap_init_zone_device(&NODE_DATA(nid)->node_zones[ZONE_DEVICE], - align_start >> PAGE_SHIFT, - align_size >> PAGE_SHIFT, pgmap); + PHYS_PFN(res->start), + PHYS_PFN(resource_size(res)), pgmap); percpu_ref_get_many(pgmap->ref, pfn_end(pgmap) - pfn_first(pgmap)); error = devm_add_action_or_reset(dev, devm_memremap_pages_release, @@ -261,9 +251,9 @@ void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap) return __va(res->start); err_add_memory: - kasan_remove_zero_shadow(__va(align_start), align_size); + kasan_remove_zero_shadow(__va(res->start), resource_size(res)); err_kasan: - untrack_pfn(NULL, PHYS_PFN(align_start), align_size); + untrack_pfn(NULL, PHYS_PFN(res->start), resource_size(res)); err_pfn_remap: pgmap_array_delete(res); err_array:
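The before/after arithmetic in this patch is easy to see in isolation. The sketch below is hypothetical userspace code, not the kernel's: it reimplements PHYS_PFN() and ALIGN() locally, uses x86_64's 128MB section size, and invents a 16MB example resource, to show how much extra address space the old section-rounded bounds dragged in compared to passing the resource through untouched.

/*
 * Illustration only: contrast the old section-aligned bounds with the
 * direct use of the resource. The resource below is invented; 128MB
 * is the x86_64 section size.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PHYS_PFN(x) ((x) >> PAGE_SHIFT)
#define PA_SECTION_SIZE (1ULL << 27)	/* 128MB */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((uint64_t)(a) - 1))

struct resource {
	uint64_t start, end;
};

static uint64_t resource_size(const struct resource *res)
{
	return res->end - res->start + 1;
}

int main(void)
{
	/* A 16MB pmem range, 2MB-aligned but not section-aligned. */
	struct resource res = { .start = 0x148200000ULL, .end = 0x1491fffffULL };

	/* Old scheme: round out to section boundaries before hotplug. */
	uint64_t align_start = res.start & ~(PA_SECTION_SIZE - 1);
	uint64_t align_size = ALIGN(res.start + resource_size(&res),
				    PA_SECTION_SIZE) - align_start;

	printf("old: start %#" PRIx64 ", size %" PRIu64 "MB\n",
	       align_start, align_size >> 20);
	/* New scheme: the unmodified range goes to arch_add_memory(). */
	printf("new: pfn %#" PRIx64 ", nr_pages %" PRIu64 "\n",
	       PHYS_PFN(res.start), PHYS_PFN(resource_size(&res)));
	return 0;
}

Run against this made-up resource, the old rounding maps a full 128MB section for a 16MB namespace, which is exactly the collision surface the sanity checks above guard against.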
From patchwork Wed Apr 17 18:39:52 2019
Subject: [PATCH v6 11/12] libnvdimm/pfn: Fix fsdax-mode namespace info-block zero-fields
From: Dan Williams
To: akpm@linux-foundation.org
Cc: stable@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com, david@redhat.com
Date: Wed, 17 Apr 2019 11:39:52 -0700
Message-ID: <155552639290.2015392.17304211251966796338.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>

At namespace creation time there is the potential for the "expected to be zero" fields of a 'pfn' info-block to be filled with indeterminate data. While the kernel buffer is zeroed on allocation, it is immediately overwritten by nd_pfn_validate(), which fills it with the current contents of the on-media info-block location. For fields like 'flags' and 'padding' this means that future implementations cannot rely on those fields being zero.

In preparation for dropping the use of the 'start_pad' and 'end_trunc' fields for section alignment, arrange for fields that are not explicitly initialized to be guaranteed zero. Bump the minor version to indicate it is safe to assume the 'padding' and 'flags' are zero. Otherwise, this corruption is expected to be benign since all other critical fields are explicitly initialized.

Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem")
Cc: stable@vger.kernel.org
Signed-off-by: Dan Williams
---
 drivers/nvdimm/dax_devs.c | 2 +-
 drivers/nvdimm/pfn.h | 1 +
 drivers/nvdimm/pfn_devs.c | 18 +++++++++++++++---
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/dax_devs.c b/drivers/nvdimm/dax_devs.c index 0453f49dc708..326f02ffca81 100644 --- a/drivers/nvdimm/dax_devs.c +++ b/drivers/nvdimm/dax_devs.c @@ -126,7 +126,7 @@ int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns) nvdimm_bus_unlock(&ndns->dev); if (!dax_dev) return -ENOMEM; - pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); + pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); nd_pfn->pfn_sb = pfn_sb; rc = nd_pfn_validate(nd_pfn, DAX_SIG); dev_dbg(dev, "dax: %s\n", rc == 0 ?
dev_name(dax_dev) : ""); diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h index dde9853453d3..e901e3a3b04c 100644 --- a/drivers/nvdimm/pfn.h +++ b/drivers/nvdimm/pfn.h @@ -36,6 +36,7 @@ struct nd_pfn_sb { __le32 end_trunc; /* minor-version-2 record the base alignment of the mapping */ __le32 align; + /* minor-version-3 guarantee the padding and flags are zero */ u8 padding[4000]; __le64 checksum; }; diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c index 01f40672507f..a2406253eb70 100644 --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -420,6 +420,15 @@ static int nd_pfn_clear_memmap_errors(struct nd_pfn *nd_pfn) return 0; } +/** + * nd_pfn_validate - read and validate info-block + * @nd_pfn: fsdax namespace runtime state / properties + * @sig: 'devdax' or 'fsdax' signature + * + * Upon return the info-block buffer contents (->pfn_sb) are + * indeterminate when validation fails, and a coherent info-block + * otherwise. + */ int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig) { u64 checksum, offset; @@ -565,7 +574,7 @@ int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns) nvdimm_bus_unlock(&ndns->dev); if (!pfn_dev) return -ENOMEM; - pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); + pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL); nd_pfn = to_nd_pfn(pfn_dev); nd_pfn->pfn_sb = pfn_sb; rc = nd_pfn_validate(nd_pfn, PFN_SIG); @@ -702,7 +711,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) u64 checksum; int rc; - pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL); + pfn_sb = devm_kmalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL); if (!pfn_sb) return -ENOMEM; @@ -711,11 +720,14 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) sig = DAX_SIG; else sig = PFN_SIG; + rc = nd_pfn_validate(nd_pfn, sig); if (rc != -ENODEV) return rc; /* no info block, do init */; + memset(pfn_sb, 0, sizeof(*pfn_sb)); + nd_region = to_nd_region(nd_pfn->dev.parent); if (nd_region->ro) { dev_info(&nd_pfn->dev, @@ -768,7 +780,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) memcpy(pfn_sb->uuid, nd_pfn->uuid, 16); memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16); pfn_sb->version_major = cpu_to_le16(1); - pfn_sb->version_minor = cpu_to_le16(2); + pfn_sb->version_minor = cpu_to_le16(3); pfn_sb->start_pad = cpu_to_le32(start_pad); pfn_sb->end_trunc = cpu_to_le32(end_trunc); pfn_sb->align = cpu_to_le32(nd_pfn->align); From patchwork Wed Apr 17 18:39:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 10905897 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 48B1E161F for ; Wed, 17 Apr 2019 18:53:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2887228B7D for ; Wed, 17 Apr 2019 18:53:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1C8F828B83; Wed, 17 Apr 2019 18:53:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=unavailable version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 61DDF28B7D for ; Wed, 17 Apr 2019 18:53:50 
From patchwork Wed Apr 17 18:39:58 2019
Subject: [PATCH v6 12/12] libnvdimm/pfn: Stop padding pmem namespaces to section alignment
From: Dan Williams
To: akpm@linux-foundation.org
Cc: Jeff Moyer, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, mhocko@suse.com, david@redhat.com
Date: Wed, 17 Apr 2019 11:39:58 -0700
Message-ID: <155552639846.2015392.14832585852837457293.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <155552633539.2015392.2477781120122237934.stgit@dwillia2-desk3.amr.corp.intel.com>

Now that the mm core supports section-unaligned hotplug of ZONE_DEVICE memory, we no longer need to add padding at pfn/dax device creation time. The kernel will still honor padding established by older kernels.

Reported-by: Jeff Moyer
Signed-off-by: Dan Williams
---
 drivers/nvdimm/pfn.h | 11 ++-----
 drivers/nvdimm/pfn_devs.c | 75 +++++++--------------------------------------
 include/linux/mmzone.h | 4 ++
 3 files changed, 19 insertions(+), 71 deletions(-)
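Before the diff, a quick numeric illustration of why the padding can go away. The program below is a hypothetical standalone rendering of the new SUB_SECTION_ALIGN_{UP,DOWN} macros (see the mmzone.h hunk at the end of this patch) with x86_64-like constants folded in; the base address is invented. A typical 2MB-aligned pmem namespace needs zero pad pages under sub-section alignment, where section alignment used to demand rounding across a full 2MB of pages.

/*
 * Standalone rendering of SUB_SECTION_ALIGN_{UP,DOWN} with x86_64-like
 * constants (4K pages, 2MB sub-sections, 128MB sections). Illustrative
 * only; the example base address is invented.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PHYS_PFN(x) ((x) >> PAGE_SHIFT)

#define SECTION_ACTIVE_SIZE (1ULL << 21)	/* 2MB sub-sections */
#define PAGES_PER_SUB_SECTION (SECTION_ACTIVE_SIZE / (1ULL << PAGE_SHIFT))
#define PAGE_SUB_SECTION_MASK (~(PAGES_PER_SUB_SECTION - 1))
#define SUB_SECTION_ALIGN_UP(pfn) \
	(((pfn) + PAGES_PER_SUB_SECTION - 1) & PAGE_SUB_SECTION_MASK)
#define SUB_SECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SUB_SECTION_MASK)

#define PAGES_PER_SECTION (1ULL << (27 - PAGE_SHIFT))	/* 128MB sections */

int main(void)
{
	/* A 2MB-aligned namespace base, typical for pmem. */
	uint64_t base_pfn = PHYS_PFN(0x148200000ULL);

	printf("sub-section pad: %" PRIu64 " pages\n",
	       base_pfn - SUB_SECTION_ALIGN_DOWN(base_pfn));
	printf("section pad:     %" PRIu64 " pages\n",
	       base_pfn & (PAGES_PER_SECTION - 1));
	return 0;
}

For this base address the sub-section pad is 0 pages while the section pad is 512 pages (2MB), which is exactly the start_pad/end_trunc bookkeeping the diff below deletes.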
diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h index e901e3a3b04c..ae589cc528f2 100644 --- a/drivers/nvdimm/pfn.h +++ b/drivers/nvdimm/pfn.h @@ -41,18 +41,13 @@ struct nd_pfn_sb { __le64 checksum; }; -#ifdef CONFIG_SPARSEMEM -#define PFN_SECTION_ALIGN_DOWN(x) SECTION_ALIGN_DOWN(x) -#define PFN_SECTION_ALIGN_UP(x) SECTION_ALIGN_UP(x) -#else /* * In this case ZONE_DEVICE=n and we will disable 'pfn' device support, * but we still want pmem to compile.
*/ -#define PFN_SECTION_ALIGN_DOWN(x) (x) -#define PFN_SECTION_ALIGN_UP(x) (x) +#ifndef SUB_SECTION_ALIGN_DOWN +#define SUB_SECTION_ALIGN_DOWN(x) (x) +#define SUB_SECTION_ALIGN_UP(x) (x) #endif -#define PHYS_SECTION_ALIGN_DOWN(x) PFN_PHYS(PFN_SECTION_ALIGN_DOWN(PHYS_PFN(x))) -#define PHYS_SECTION_ALIGN_UP(x) PFN_PHYS(PFN_SECTION_ALIGN_UP(PHYS_PFN(x))) #endif /* __NVDIMM_PFN_H */ diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c index a2406253eb70..7bdaaf3dc77e 100644 --- a/drivers/nvdimm/pfn_devs.c +++ b/drivers/nvdimm/pfn_devs.c @@ -595,14 +595,14 @@ static u32 info_block_reserve(void) } /* - * We hotplug memory at section granularity, pad the reserved area from - * the previous section base to the namespace base address. + * We hotplug memory at sub-section granularity, pad the reserved area + * from the previous section base to the namespace base address. */ static unsigned long init_altmap_base(resource_size_t base) { unsigned long base_pfn = PHYS_PFN(base); - return PFN_SECTION_ALIGN_DOWN(base_pfn); + return SUB_SECTION_ALIGN_DOWN(base_pfn); } static unsigned long init_altmap_reserve(resource_size_t base) @@ -610,7 +610,7 @@ static unsigned long init_altmap_reserve(resource_size_t base) unsigned long reserve = info_block_reserve() >> PAGE_SHIFT; unsigned long base_pfn = PHYS_PFN(base); - reserve += base_pfn - PFN_SECTION_ALIGN_DOWN(base_pfn); + reserve += base_pfn - SUB_SECTION_ALIGN_DOWN(base_pfn); return reserve; } @@ -641,8 +641,7 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap) nd_pfn->npfns = le64_to_cpu(pfn_sb->npfns); pgmap->altmap_valid = false; } else if (nd_pfn->mode == PFN_MODE_PMEM) { - nd_pfn->npfns = PFN_SECTION_ALIGN_UP((resource_size(res) - - offset) / PAGE_SIZE); + nd_pfn->npfns = PHYS_PFN((resource_size(res) - offset)); if (le64_to_cpu(nd_pfn->pfn_sb->npfns) > nd_pfn->npfns) dev_info(&nd_pfn->dev, "number of pfns truncated from %lld to %ld\n", @@ -658,50 +657,10 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap) return 0; } -static u64 phys_pmem_align_down(struct nd_pfn *nd_pfn, u64 phys) -{ - return min_t(u64, PHYS_SECTION_ALIGN_DOWN(phys), - ALIGN_DOWN(phys, nd_pfn->align)); -} - -/* - * Check if pmem collides with 'System RAM', or other regions when - * section aligned. Trim it accordingly. - */ -static void trim_pfn_device(struct nd_pfn *nd_pfn, u32 *start_pad, u32 *end_trunc) -{ - struct nd_namespace_common *ndns = nd_pfn->ndns; - struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev); - struct nd_region *nd_region = to_nd_region(nd_pfn->dev.parent); - const resource_size_t start = nsio->res.start; - const resource_size_t end = start + resource_size(&nsio->res); - resource_size_t adjust, size; - - *start_pad = 0; - *end_trunc = 0; - - adjust = start - PHYS_SECTION_ALIGN_DOWN(start); - size = resource_size(&nsio->res) + adjust; - if (region_intersects(start - adjust, size, IORESOURCE_SYSTEM_RAM, - IORES_DESC_NONE) == REGION_MIXED - || nd_region_conflict(nd_region, start - adjust, size)) - *start_pad = PHYS_SECTION_ALIGN_UP(start) - start; - - /* Now check that end of the range does not collide.
*/ - adjust = PHYS_SECTION_ALIGN_UP(end) - end; - size = resource_size(&nsio->res) + adjust; - if (region_intersects(start, size, IORESOURCE_SYSTEM_RAM, - IORES_DESC_NONE) == REGION_MIXED - || !IS_ALIGNED(end, nd_pfn->align) - || nd_region_conflict(nd_region, start, size)) - *end_trunc = end - phys_pmem_align_down(nd_pfn, end); -} - static int nd_pfn_init(struct nd_pfn *nd_pfn) { struct nd_namespace_common *ndns = nd_pfn->ndns; struct nd_namespace_io *nsio = to_nd_namespace_io(&ndns->dev); - u32 start_pad, end_trunc, reserve = info_block_reserve(); resource_size_t start, size; struct nd_region *nd_region; struct nd_pfn_sb *pfn_sb; @@ -736,43 +695,35 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) return -ENXIO; } - memset(pfn_sb, 0, sizeof(*pfn_sb)); - - trim_pfn_device(nd_pfn, &start_pad, &end_trunc); - if (start_pad + end_trunc) - dev_info(&nd_pfn->dev, "%s alignment collision, truncate %d bytes\n", - dev_name(&ndns->dev), start_pad + end_trunc); - /* * Note, we use 64 here for the standard size of struct page, * debugging options may cause it to be larger in which case the * implementation will limit the pfns advertised through * ->direct_access() to those that are included in the memmap. */ - start = nsio->res.start + start_pad; + start = nsio->res.start; size = resource_size(&nsio->res); - npfns = PFN_SECTION_ALIGN_UP((size - start_pad - end_trunc - reserve) - / PAGE_SIZE); + npfns = PHYS_PFN(size - SZ_8K); if (nd_pfn->mode == PFN_MODE_PMEM) { /* * The altmap should be padded out to the block size used * when populating the vmemmap. This *should* be equal to * PMD_SIZE for most architectures. */ - offset = ALIGN(start + reserve + 64 * npfns, - max(nd_pfn->align, PMD_SIZE)) - start; + offset = ALIGN(start + SZ_8K + 64 * npfns, + max(nd_pfn->align, SECTION_ACTIVE_SIZE)) - start; } else if (nd_pfn->mode == PFN_MODE_RAM) - offset = ALIGN(start + reserve, nd_pfn->align) - start; + offset = ALIGN(start + SZ_8K, nd_pfn->align) - start; else return -ENXIO; - if (offset + start_pad + end_trunc >= size) { + if (offset >= size) { dev_err(&nd_pfn->dev, "%s unable to satisfy requested alignment\n", dev_name(&ndns->dev)); return -ENXIO; } - npfns = (size - offset - start_pad - end_trunc) / SZ_4K; + npfns = PHYS_PFN(size - offset); pfn_sb->mode = cpu_to_le32(nd_pfn->mode); pfn_sb->dataoff = cpu_to_le64(offset); pfn_sb->npfns = cpu_to_le64(npfns); @@ -781,8 +732,6 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn) memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16); pfn_sb->version_major = cpu_to_le16(1); pfn_sb->version_minor = cpu_to_le16(3); - pfn_sb->start_pad = cpu_to_le32(start_pad); - pfn_sb->end_trunc = cpu_to_le32(end_trunc); pfn_sb->align = cpu_to_le32(nd_pfn->align); checksum = nd_sb_checksum((struct nd_gen_sb *) pfn_sb); pfn_sb->checksum = cpu_to_le64(checksum); diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 3237c5e456df..d2445c483ad4 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -1155,6 +1155,10 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec) #define PAGES_PER_SUB_SECTION (SECTION_ACTIVE_SIZE / PAGE_SIZE) #define PAGE_SUB_SECTION_MASK (~(PAGES_PER_SUB_SECTION-1)) +#define SUB_SECTION_ALIGN_UP(pfn) (((pfn) + PAGES_PER_SUB_SECTION - 1) \ + & PAGE_SUB_SECTION_MASK) +#define SUB_SECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SUB_SECTION_MASK) + struct mem_section_usage { /* * SECTION_ACTIVE_SIZE portions of the section that are populated in