From patchwork Fri Nov 30 17:59:19 2018
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 10706957
From: David Hildenbrand
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
 linux-sh@vger.kernel.org, linux-acpi@vger.kernel.org,
 devel@linuxdriverproject.org, xen-devel@lists.xenproject.org,
 x86@kernel.org, David Hildenbrand, Greg Kroah-Hartman,
 "Rafael J. Wysocki", Andrew Morton, Ingo Molnar, Pavel Tatashin,
 Stephen Rothwell, Andrew Banman, "mike.travis@hpe.com", Oscar Salvador,
 Dave Hansen, Michal Hocko, Michal Suchánek, Vitaly Kuznetsov,
 Dan Williams, Pavel Tatashin, Martin Schwidefsky, Heiko Carstens
Subject: [PATCH RFCv2 1/4] mm/memory_hotplug: Introduce memory block types
Date: Fri, 30 Nov 2018 18:59:19 +0100
Message-Id: <20181130175922.10425-2-david@redhat.com>
In-Reply-To: <20181130175922.10425-1-david@redhat.com>
References: <20181130175922.10425-1-david@redhat.com>

Memory onlining should always be handled by user space, because only user
space knows which use cases it wants to satisfy. E.g. memory might be
onlined to the MOVABLE zone even if it can never be removed from the
system, e.g. to make usage of huge pages more reliable.

However, to implement such rules (especially default rules in
distributions) we need more information in user space about the memory
that was added.

E.g. on x86 we want to online memory provided by balloon devices (e.g.
XEN, Hyper-V) differently (-> will not be unplugged by offlining the whole
block) than ordinary DIMMs (-> might eventually be unplugged by offlining
the whole block). This might also become relevant for other architectures.

Also, udev rules right now check if running on s390x and treat all added
memory blocks as standby memory (-> don't online automatically). As soon
as we support other memory hotplug mechanisms (e.g. virtio-mem), the
checks would have to get more involved (e.g. also check if under KVM) and
would eventually still be wrong (e.g. if KVM ever supports standby memory
we are doomed).

I decided to allow specifying the type of memory that is getting added
to the system. Let's start with two types, BOOT and UNSPECIFIED, to get
the basic infrastructure running. We'll introduce and use further types
in follow-up patches. For now we temporarily classify any hotplugged
memory as UNSPECIFIED (which will be dropped later on).

Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Andrew Morton
Cc: Ingo Molnar
Cc: Pavel Tatashin
Cc: Stephen Rothwell
Cc: Andrew Banman
Cc: "mike.travis@hpe.com"
Cc: Oscar Salvador
Cc: Dave Hansen
Cc: Michal Hocko
Cc: Michal Suchánek
Cc: Vitaly Kuznetsov
Cc: Dan Williams
Cc: Pavel Tatashin
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Signed-off-by: David Hildenbrand
---
 drivers/base/memory.c  | 38 +++++++++++++++++++++++++++++++++++---
 include/linux/memory.h | 27 +++++++++++++++++++++++++++
 2 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 0c290f86ab20..17f2985c07c5 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -381,6 +381,29 @@ static ssize_t show_phys_device(struct device *dev,
 	return sprintf(buf, "%d\n", mem->phys_device);
 }

+static ssize_t type_show(struct device *dev, struct device_attribute *attr,
+			 char *buf)
+{
+	struct memory_block *mem = to_memory_block(dev);
+	ssize_t len = 0;
+
+	switch (mem->type) {
+	case MEMORY_BLOCK_UNSPECIFIED:
+		len = sprintf(buf, "unspecified\n");
+		break;
+	case MEMORY_BLOCK_BOOT:
+		len = sprintf(buf, "boot\n");
+		break;
+	default:
+		len = sprintf(buf, "ERROR-UNKNOWN-%d\n",
+				mem->type);
+		WARN_ON(1);
+		break;
+	}
+
+	return len;
+}
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 static void print_allowed_zone(char *buf, int nid, unsigned long start_pfn,
 		unsigned long nr_pages, int online_type,
@@ -442,6 +465,7 @@ static DEVICE_ATTR(phys_index, 0444, show_mem_start_phys_index, NULL);
 static DEVICE_ATTR(state, 0644, show_mem_state, store_mem_state);
 static DEVICE_ATTR(phys_device, 0444, show_phys_device, NULL);
 static DEVICE_ATTR(removable, 0444, show_mem_removable, NULL);
+static DEVICE_ATTR_RO(type);

 /*
  * Block size attribute stuff
@@ -620,6 +644,7 @@ static struct attribute *memory_memblk_attrs[] = {
 	&dev_attr_state.attr,
 	&dev_attr_phys_device.attr,
 	&dev_attr_removable.attr,
+	&dev_attr_type.attr,
 #ifdef CONFIG_MEMORY_HOTREMOVE
 	&dev_attr_valid_zones.attr,
 #endif
@@ -657,13 +682,17 @@ int register_memory(struct memory_block *memory)
 }

 static int init_memory_block(struct memory_block **memory,
-			     struct mem_section *section, unsigned long state)
+			     struct mem_section *section, unsigned long state,
+			     int type)
 {
 	struct memory_block *mem;
 	unsigned long start_pfn;
 	int scn_nr;
 	int ret = 0;

+	if (type == MEMORY_BLOCK_NONE)
+		return -EINVAL;
+
 	mem = kzalloc(sizeof(*mem), GFP_KERNEL);
 	if (!mem)
 		return -ENOMEM;
@@ -675,6 +704,7 @@ static int init_memory_block(struct memory_block **memory,
 	mem->state = state;
 	start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	mem->phys_device = arch_get_memory_phys_device(start_pfn);
+	mem->type = type;

 	ret = register_memory(mem);

@@ -699,7 +729,8 @@ static int add_memory_block(int base_section_nr)

 	if (section_count == 0)
 		return 0;
-	ret = init_memory_block(&mem, __nr_to_section(section_nr), MEM_ONLINE);
+	ret = init_memory_block(&mem, __nr_to_section(section_nr), MEM_ONLINE,
+				MEMORY_BLOCK_BOOT);
 	if (ret)
 		return ret;
 	mem->section_count = section_count;
@@ -722,7 +753,8 @@ int hotplug_memory_register(int nid, struct mem_section *section)
 		mem->section_count++;
 		put_device(&mem->dev);
 	} else {
-		ret = init_memory_block(&mem, section, MEM_OFFLINE);
+		ret = init_memory_block(&mem, section, MEM_OFFLINE,
+					MEMORY_BLOCK_UNSPECIFIED);
 		if (ret)
 			goto out;
 		mem->section_count++;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index d75ec88ca09d..06268e96e0da 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -34,12 +34,39 @@ struct memory_block {
 	int (*phys_callback)(struct memory_block *);
 	struct device dev;
 	int nid;			/* NID for this memory block */
+	int type;			/* type of this memory block */
 };

 int arch_get_memory_phys_device(unsigned long start_pfn);
 unsigned long memory_block_size_bytes(void);
 int set_memory_block_size_order(unsigned int order);

+/*
+ * Memory block types allow user space to formulate rules if and how to
+ * online memory blocks. The types are exposed to user space as text
+ * strings in sysfs.
+ *
+ * MEMORY_BLOCK_NONE:
+ *  No memory block is to be created (e.g. device memory). Not exposed to
+ *  user space.
+ *
+ * MEMORY_BLOCK_UNSPECIFIED:
+ *  The type of memory block was not further specified when adding the
+ *  memory block.
+ *
+ * MEMORY_BLOCK_BOOT:
+ *  This memory block was added during boot by the basic system. No
+ *  specific device driver takes care of this memory block. This memory
+ *  block type is onlined automatically by the kernel during boot and might
+ *  later be managed by a different device driver, in which case the type
+ *  might change.
+ */
+enum {
+	MEMORY_BLOCK_NONE = 0,
+	MEMORY_BLOCK_UNSPECIFIED,
+	MEMORY_BLOCK_BOOT,
+};
+
 /* These states are exposed to userspace as text strings in sysfs */
 #define	MEM_ONLINE		(1<<0) /* exposed to userspace */
 #define	MEM_GOING_OFFLINE	(1<<1) /* exposed to userspace */

From patchwork Fri Nov 30 17:59:21 2018
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 10706973
From: David Hildenbrand
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
 linux-sh@vger.kernel.org, linux-acpi@vger.kernel.org,
 devel@linuxdriverproject.org, xen-devel@lists.xenproject.org,
 x86@kernel.org, David Hildenbrand, Benjamin Herrenschmidt,
 Paul Mackerras, Michael Ellerman, "Rafael J. Wysocki", Len Brown,
 Greg Kroah-Hartman, "K. Y. Srinivasan", Haiyang Zhang, Stephen Hemminger,
 Martin Schwidefsky, Heiko Carstens, Boris Ostrovsky, Juergen Gross,
 Stefano Stabellini, Rashmica Gupta, Andrew Morton, Pavel Tatashin,
 Balbir Singh, Michael Neuling, Nathan Fontenot, YueHaibing,
 Vasily Gorbik, Ingo Molnar, Stephen Rothwell, "mike.travis@hpe.com",
 Oscar Salvador, Joonsoo Kim, Mathieu Malaterre, Michal Hocko, Arun KS,
 Andrew Banman, Dave Hansen, Michal Suchánek, Vitaly Kuznetsov,
 Dan Williams
Subject: [PATCH RFCv2 3/4] mm/memory_hotplug: Introduce and use more memory types
Date: Fri, 30 Nov 2018 18:59:21 +0100
Message-Id: <20181130175922.10425-4-david@redhat.com>
In-Reply-To: <20181130175922.10425-1-david@redhat.com>
References: <20181130175922.10425-1-david@redhat.com>

Let's introduce new types for different kinds of memory blocks and use
them in existing code. As I don't see an easy way to split this up, do it
in one hunk for now.

acpi:
 Use DIMM or DIMM_UNREMOVABLE depending on hotremove support in the kernel.
 Properly change the type when trying to add memory that was already
 detected and used during boot (so this memory will correctly end up as
 "dimm" or "dimm-unremovable" in user space).

pseries:
 Use DIMM or DIMM_UNREMOVABLE depending on hotremove support in the kernel.
 As far as I see, handling like in the acpi case for existing blocks is
 not required.

probed memory from user space:
 Use DIMM_UNREMOVABLE as there is no interface to get rid of this memory
 again.

hv_balloon,xen/balloon:
 Use BALLOON. As simple as that :)

s390x/sclp:
 Use a dedicated type S390X_STANDBY as this type of memory and its
 semantics are very s390x specific.

powernv/memtrace:
 Only allow using BOOT memory for memtrace. I consider this code in
 general dangerous, but we have to keep it working ... most probably just
 a debug feature.

Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: "Rafael J. Wysocki"
Cc: Len Brown
Cc: Greg Kroah-Hartman
Cc: "K. Y. Srinivasan"
Cc: Haiyang Zhang
Cc: Stephen Hemminger
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: Rashmica Gupta
Cc: Andrew Morton
Cc: Pavel Tatashin
Cc: Balbir Singh
Cc: Michael Neuling
Cc: Nathan Fontenot
Cc: YueHaibing
Cc: Vasily Gorbik
Cc: Ingo Molnar
Cc: Stephen Rothwell
Cc: "mike.travis@hpe.com"
Cc: Oscar Salvador
Cc: Joonsoo Kim
Cc: Mathieu Malaterre
Cc: Michal Hocko
Cc: Arun KS
Cc: Andrew Banman
Cc: Dave Hansen
Cc: Michal Suchánek
Cc: Vitaly Kuznetsov
Cc: Dan Williams
Signed-off-by: David Hildenbrand
---

At first I tried to abstract the types quite a lot, but I think there are
subtle differences that are worth differentiating. More details about the
types can be found in the extensive documentation.

It is worth noting that BALLOON_MOVABLE has no user yet, but I have
something in mind that might want to make use of that (virtio-mem).
Just included it to discuss the general approach.
I can drop it from this patch.
---
 arch/powerpc/platforms/powernv/memtrace.c     |  9 ++--
 .../platforms/pseries/hotplug-memory.c        |  7 ++-
 drivers/acpi/acpi_memhotplug.c                | 16 ++++++-
 drivers/base/memory.c                         | 18 ++++++-
 drivers/hv/hv_balloon.c                       |  3 +-
 drivers/s390/char/sclp_cmd.c                  |  3 +-
 drivers/xen/balloon.c                         |  2 +-
 include/linux/memory.h                        | 47 ++++++++++++++++++-
 include/linux/memory_hotplug.h                |  6 +--
 mm/memory_hotplug.c                           | 15 +++---
 10 files changed, 104 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c
index 248a38ad25c7..5d08db87091e 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -54,9 +54,9 @@ static const struct file_operations memtrace_fops = {
 	.open	= simple_open,
 };

-static int check_memblock_online(struct memory_block *mem, void *arg)
+static int check_memblock_boot_and_online(struct memory_block *mem, void *arg)
 {
-	if (mem->state != MEM_ONLINE)
+	if (mem->type != MEMORY_BLOCK_BOOT || mem->state != MEM_ONLINE)
 		return -1;

 	return 0;
@@ -77,7 +77,7 @@ static bool memtrace_offline_pages(u32 nid, u64 start_pfn, u64 nr_pages)
 	u64 end_pfn = start_pfn + nr_pages - 1;

 	if (walk_memory_range(start_pfn, end_pfn, NULL,
-	    check_memblock_online))
+	    check_memblock_boot_and_online))
 		return false;

 	walk_memory_range(start_pfn, end_pfn, (void *)MEM_GOING_OFFLINE,
@@ -233,7 +233,8 @@ static int memtrace_online(void)
 			ent->mem = 0;
 		}

-		if (add_memory(ent->nid, ent->start, ent->size)) {
+		if (add_memory(ent->nid, ent->start, ent->size,
+			       MEMORY_BLOCK_BOOT)) {
 			pr_err("Failed to add trace memory to node %d\n",
 				ent->nid);
 			ret += 1;
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 2a983b5a52e1..5f91359c7993 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -651,7 +651,7 @@ static int dlpar_memory_remove_by_ic(u32 lmbs_to_remove, u32 drc_index)
 static int dlpar_add_lmb(struct drmem_lmb *lmb)
 {
 	unsigned long block_sz;
-	int nid, rc;
+	int nid, rc, type = MEMORY_BLOCK_DIMM;

 	if (lmb->flags & DRCONF_MEM_ASSIGNED)
 		return -EINVAL;
@@ -667,8 +667,11 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
 	/* Find the node id for this address */
 	nid = memory_add_physaddr_to_nid(lmb->base_addr);

+	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
+		type = MEMORY_BLOCK_DIMM_UNREMOVABLE;
+
 	/* Add the memory */
-	rc = __add_memory(nid, lmb->base_addr, block_sz);
+	rc = __add_memory(nid, lmb->base_addr, block_sz, type);
 	if (rc) {
 		invalidate_lmb_associativity_index(lmb);
 		return rc;
diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
index 8fe0960ea572..f841113b450d 100644
--- a/drivers/acpi/acpi_memhotplug.c
+++ b/drivers/acpi/acpi_memhotplug.c
@@ -177,6 +177,13 @@ static unsigned long acpi_meminfo_end_pfn(struct acpi_memory_info *info)

 static int acpi_bind_memblk(struct memory_block *mem, void *arg)
 {
+	/* switch the type of memory block if this memory was already present */
+	if (mem->type == MEMORY_BLOCK_BOOT) {
+		if (IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
+			mem->type = MEMORY_BLOCK_DIMM;
+		else
+			mem->type = MEMORY_BLOCK_DIMM_UNREMOVABLE;
+	}
 	return acpi_bind_one(&mem->dev, arg);
 }

@@ -191,6 +198,7 @@ static int acpi_bind_memory_blocks(struct acpi_memory_info *info,
 static int acpi_unbind_memblk(struct memory_block *mem, void *arg)
 {
 	acpi_unbind_one(&mem->dev);
+	mem->type = MEMORY_BLOCK_BOOT;
 	return 0;
 }

@@ -203,10 +211,13 @@ static void acpi_unbind_memory_blocks(struct acpi_memory_info *info)
 static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 {
 	acpi_handle handle = mem_device->device->handle;
-	int result, num_enabled = 0;
+	int result, num_enabled = 0, type = MEMORY_BLOCK_DIMM;
 	struct acpi_memory_info *info;
 	int node;

+	if (!IS_ENABLED(CONFIG_MEMORY_HOTREMOVE))
+		type = MEMORY_BLOCK_DIMM_UNREMOVABLE;
+
 	node = acpi_get_node(handle);
 	/*
 	 * Tell the VM there is more memory here...
@@ -228,7 +239,8 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device)
 		if (node < 0)
 			node = memory_add_physaddr_to_nid(info->start_addr);

-		result = __add_memory(node, info->start_addr, info->length);
+		result = __add_memory(node, info->start_addr, info->length,
+				      type);

 		/*
 		 * If the memory block has been used by the kernel, add_memory()
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index c42300082c88..c5fdca7a3009 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -394,6 +394,21 @@ static ssize_t type_show(struct device *dev, struct device_attribute *attr,
 	case MEMORY_BLOCK_BOOT:
 		len = sprintf(buf, "boot\n");
 		break;
+	case MEMORY_BLOCK_DIMM:
+		len = sprintf(buf, "dimm\n");
+		break;
+	case MEMORY_BLOCK_DIMM_UNREMOVABLE:
+		len = sprintf(buf, "dimm-unremovable\n");
+		break;
+	case MEMORY_BLOCK_BALLOON:
+		len = sprintf(buf, "balloon\n");
+		break;
+	case MEMORY_BLOCK_BALLOON_MOVABLE:
+		len = sprintf(buf, "balloon-movable\n");
+		break;
+	case MEMORY_BLOCK_S390X_STANDBY:
+		len = sprintf(buf, "s390x-standby\n");
+		break;
 	default:
 		len = sprintf(buf, "ERROR-UNKNOWN-%d\n",
 				mem->type);
@@ -538,7 +553,8 @@ memory_probe_store(struct device *dev, struct device_attribute *attr,

 	nid = memory_add_physaddr_to_nid(phys_addr);
 	ret = __add_memory(nid, phys_addr,
-			   MIN_MEMORY_BLOCK_SIZE * sections_per_block);
+			   MIN_MEMORY_BLOCK_SIZE * sections_per_block,
+			   MEMORY_BLOCK_DIMM_UNREMOVABLE);
 	if (ret)
 		goto out;

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 47719862e57f..f502ea6cd255 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -741,7 +741,8 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,

 		nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn));
 		ret = add_memory(nid, PFN_PHYS((start_pfn)),
-				(HA_CHUNK << PAGE_SHIFT));
+				(HA_CHUNK << PAGE_SHIFT),
+				MEMORY_BLOCK_BALLOON);

 		if (ret) {
 			pr_err("hot_add memory failed error is %d\n", ret);
diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c
index 37d42de06079..0ca6f77e7e1d 100644
--- a/drivers/s390/char/sclp_cmd.c
+++ b/drivers/s390/char/sclp_cmd.c
@@ -406,7 +406,8 @@ static void __init add_memory_merged(u16 rn)
 	if (!size)
 		goto skip_add;
 	for (addr = start; addr < start + size; addr += block_size)
-		add_memory(numa_pfn_to_nid(PFN_DOWN(addr)), addr, block_size);
+		add_memory(numa_pfn_to_nid(PFN_DOWN(addr)), addr, block_size,
+			   MEMORY_BLOCK_S390X_STANDBY);
 skip_add:
 	first_rn = rn;
 	num = 1;
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 5d2d7a917b4e..953ff86d609b 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -352,7 +352,7 @@ static enum bp_state reserve_additional_memory(void)
 	mutex_unlock(&balloon_mutex);
 	/* add_memory_resource() requires the device_hotplug lock */
 	lock_device_hotplug();
-	rc = add_memory_resource(nid, resource);
+	rc = add_memory_resource(nid, resource, MEMORY_BLOCK_BALLOON);
 	unlock_device_hotplug();
 	mutex_lock(&balloon_mutex);

diff --git a/include/linux/memory.h b/include/linux/memory.h
index 9f39ef41e6d2..a3a1e9764805 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -59,12 +59,57 @@ int set_memory_block_size_order(unsigned int order);
  *  specific device driver takes care of this memory block. This memory
  *  block type is onlined automatically by the kernel during boot and might
  *  later be managed by a different device driver, in which case the type
- *  might change.
+ *  might change (e.g. to MEMORY_BLOCK_DIMM).
+ *
+ * MEMORY_BLOCK_DIMM:
+ *  This memory block is managed by a device driver taking care of DIMMs
+ *  (or similar). Once all memory blocks belonging to the DIMM have been
+ *  offlined, the DIMM along with the memory blocks can be removed to
+ *  effectively unplug it. This memory block type is usually onlined to the
+ *  MOVABLE zone, to make offlining and unplug possible. Examples include
+ *  ACPI DIMMs and PPC LMBs if the kernel supports removal of memory.
+ *
+ * MEMORY_BLOCK_DIMM_UNREMOVABLE:
+ *  This memory block is managed by a device driver taking care of DIMMs
+ *  (or similar). There is either no HW interface to remove the DIMM or
+ *  the kernel does not support offlining/removal of memory, so this memory
+ *  block can never be removed. Examples include ACPI DIMMs and PPC LMBs
+ *  when removal of memory is not supported by the kernel, as well as
+ *  memory probed manually from user space.
+ *  This memory block type is usually onlined to the NORMAL zone.
+ *
+ * MEMORY_BLOCK_BALLOON:
+ *  This memory block was added by a balloon device driver (or similar)
+ *  that does not require a specific zone for optimal operation
+ *  (e.g. unplug memory using balloon inflation on this memory block on
+ *  page granularity). Examples include memory added by the XEN and Hyper-V
+ *  balloon driver.
+ *  This memory block type is usually onlined to the NORMAL zone.
+ *
+ * MEMORY_BLOCK_BALLOON_MOVABLE:
+ *  This memory block was added by a balloon device driver (or similar)
+ *  that suggests to online this memory block to the MOVABLE zone for
+ *  optimal operation (e.g. unplug using balloon inflation on this memory
+ *  block in bigger chunks than pages). There are no examples yet.
+ *  This memory block type is usually onlined to the MOVABLE zone.
+ *
+ * MEMORY_BLOCK_S390X_STANDBY:
+ *  The memory block is special standby memory on s390x. As long as
+ *  offline, no memory will be allocated to the system for this memory
+ *  block. Onlining memory will result in memory getting allocated to the
+ *  system and memory can usually not be offlined again. The memory block
+ *  will never be removed. This memory type is usually not onlined
+ *  automatically but explicitly by the administrator.
  */
 enum {
 	MEMORY_BLOCK_NONE = 0,
 	MEMORY_BLOCK_UNSPECIFIED,
 	MEMORY_BLOCK_BOOT,
+	MEMORY_BLOCK_DIMM,
+	MEMORY_BLOCK_DIMM_UNREMOVABLE,
+	MEMORY_BLOCK_BALLOON,
+	MEMORY_BLOCK_BALLOON_MOVABLE,
+	MEMORY_BLOCK_S390X_STANDBY,
 };

 /* These states are exposed to userspace as text strings in sysfs */
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 667a37aa9a3c..7c8895299e8c 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -326,9 +326,9 @@ static inline void __remove_memory(int nid, u64 start, u64 size) {}
 extern void __ref free_area_init_core_hotplug(int nid);
 extern int walk_memory_range(unsigned long start_pfn, unsigned long end_pfn,
 		void *arg, int (*func)(struct memory_block *, void *));
-extern int __add_memory(int nid, u64 start, u64 size);
-extern int add_memory(int nid, u64 start, u64 size);
-extern int add_memory_resource(int nid, struct resource *resource);
+extern int __add_memory(int nid, u64 start, u64 size, int type);
+extern int add_memory(int nid, u64 start, u64 size, int type);
+extern int add_memory_resource(int nid, struct resource *resource, int type);
 extern int arch_add_memory(int nid, u64 start, u64 size,
 			   struct vmem_altmap *altmap, int type);
 extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7246faa44488..f109002d6e6e 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1071,7 +1071,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
  *
  * we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG
  */
-int __ref add_memory_resource(int nid, struct resource *res)
+int __ref add_memory_resource(int nid, struct resource *res, int type)
 {
 	u64 start, size;
 	bool new_node = false;
@@ -1080,6 +1080,9 @@ int __ref add_memory_resource(int nid, struct resource *res)
 	start = res->start;
 	size = resource_size(res);

+	if (type == MEMORY_BLOCK_NONE)
+		return -EINVAL;
+
 	ret = check_hotplug_memory_range(start, size);
 	if (ret)
 		return ret;
@@ -1100,7 +1103,7 @@ int __ref add_memory_resource(int nid, struct resource *res)
 	new_node = ret;

 	/* call arch's memory hotadd */
-	ret = arch_add_memory(nid, start, size, NULL, MEMORY_TYPE_UNSPECIFIED);
+	ret = arch_add_memory(nid, start, size, NULL, type);
 	if (ret < 0)
 		goto error;

@@ -1141,7 +1144,7 @@ int __ref add_memory_resource(int nid, struct resource *res)
 }

 /* requires device_hotplug_lock, see add_memory_resource() */
-int __ref __add_memory(int nid, u64 start, u64 size)
+int __ref __add_memory(int nid, u64 start, u64 size, int type)
 {
 	struct resource *res;
 	int ret;
@@ -1150,18 +1153,18 @@ int __ref __add_memory(int nid, u64 start, u64 size)
 	if (IS_ERR(res))
 		return PTR_ERR(res);

-	ret = add_memory_resource(nid, res);
+	ret = add_memory_resource(nid, res, type);
 	if (ret < 0)
 		release_memory_resource(res);
 	return ret;
 }

-int add_memory(int nid, u64 start, u64 size)
+int add_memory(int nid, u64 start, u64 size, int type)
 {
 	int rc;

 	lock_device_hotplug();
-	rc = __add_memory(nid, start, size);
+	rc = __add_memory(nid, start, size, type);
 	unlock_device_hotplug();

 	return rc;

From patchwork Fri Nov 30 17:59:22 2018
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 10706969
From: David Hildenbrand
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
 linux-sh@vger.kernel.org, linux-acpi@vger.kernel.org,
 devel@linuxdriverproject.org, xen-devel@lists.xenproject.org,
 x86@kernel.org, David Hildenbrand, Greg Kroah-Hartman,
 "Rafael J. Wysocki", Andrew Morton, Ingo Molnar, Pavel Tatashin,
 Stephen Rothwell, Andrew Banman, "mike.travis@hpe.com", Oscar Salvador,
 Dave Hansen, Michal Hocko, Michal Suchánek, Vitaly Kuznetsov,
 Dan Williams, Pavel Tatashin
Subject: [PATCH RFCv2 4/4] mm/memory_hotplug: Drop MEMORY_TYPE_UNSPECIFIED
Date: Fri, 30 Nov 2018 18:59:22 +0100
Message-Id: <20181130175922.10425-5-david@redhat.com>
In-Reply-To: <20181130175922.10425-1-david@redhat.com>
References: <20181130175922.10425-1-david@redhat.com>

We now have proper types for all users, so we can drop this one.

Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Andrew Morton
Cc: Ingo Molnar
Cc: Pavel Tatashin
Cc: Stephen Rothwell
Cc: Andrew Banman
Cc: "mike.travis@hpe.com"
Cc: Oscar Salvador
Cc: Dave Hansen
Cc: Michal Hocko
Cc: Michal Suchánek
Cc: Vitaly Kuznetsov
Cc: Dan Williams
Cc: Pavel Tatashin
Signed-off-by: David Hildenbrand
---
 drivers/base/memory.c  | 3 ---
 include/linux/memory.h | 5 -----
 2 files changed, 8 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index c5fdca7a3009..a6e524f0ea38 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -388,9 +388,6 @@ static ssize_t type_show(struct device *dev, struct device_attribute *attr,
 	ssize_t len = 0;

 	switch (mem->type) {
-	case MEMORY_BLOCK_UNSPECIFIED:
-		len = sprintf(buf, "unspecified\n");
-		break;
 	case MEMORY_BLOCK_BOOT:
 		len = sprintf(buf, "boot\n");
 		break;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index a3a1e9764805..11679622f743 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -50,10 +50,6 @@ int set_memory_block_size_order(unsigned int order);
  *  No memory block is to be created (e.g. device memory). Not exposed to
  *  user space.
  *
- * MEMORY_BLOCK_UNSPECIFIED:
- *  The type of memory block was not further specified when adding the
- *  memory block.
- *
  * MEMORY_BLOCK_BOOT:
  *  This memory block was added during boot by the basic system. No
  *  specific device driver takes care of this memory block. This memory
@@ -103,7 +99,6 @@ int set_memory_block_size_order(unsigned int order);
  */
 enum {
 	MEMORY_BLOCK_NONE = 0,
-	MEMORY_BLOCK_UNSPECIFIED,
 	MEMORY_BLOCK_BOOT,
 	MEMORY_BLOCK_DIMM,
 	MEMORY_BLOCK_DIMM_UNREMOVABLE,
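
For illustration only: with the series applied, a user-space policy could
key off the strings exposed by the new "type" attribute. The C sketch below
is not part of the series. The helper name suggested_state() and the chosen
mapping are assumptions that merely follow the "usually onlined to ..."
hints in the memory.h documentation; "online_kernel" and "online_movable"
are the values the existing memory block "state" attribute already accepts.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption, not code from this series): map the
 * string read from /sys/devices/system/memory/memoryX/type to the value a
 * user-space policy might write to .../memoryX/state.
 * Returns NULL when the block should be left alone.
 */
const char *suggested_state(const char *type)
{
	static const struct {
		const char *type;
		const char *state;
	} policy[] = {
		{ "dimm",             "online_movable" }, /* may be unplugged */
		{ "dimm-unremovable", "online_kernel"  }, /* never removed */
		{ "balloon",          "online_kernel"  }, /* page granularity */
		{ "balloon-movable",  "online_movable" }, /* bigger chunks */
	};
	/* sysfs strings end in '\n'; compare only up to the newline */
	size_t i, len = strcspn(type, "\n");

	for (i = 0; i < sizeof(policy) / sizeof(policy[0]); i++)
		if (strlen(policy[i].type) == len &&
		    !strncmp(policy[i].type, type, len))
			return policy[i].state;

	/*
	 * "boot" is already online, "s390x-standby" is left to the
	 * administrator, and unknown types are not touched.
	 */
	return NULL;
}
```

A udev helper or daemon would call this with the content of the "type" file
and, for a non-NULL result, write that string to the block's "state" file.
Leaving unknown types offline is a deliberately conservative choice for this
sketch; a distribution might prefer a different default.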