From patchwork Thu Sep 10 09:13:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 11767247 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CFF69698 for ; Thu, 10 Sep 2020 09:17:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9E74420C09 for ; Thu, 10 Sep 2020 09:17:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="g0xcm5Ha" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728631AbgIJJRd (ORCPT ); Thu, 10 Sep 2020 05:17:33 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:38779 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730204AbgIJJOQ (ORCPT ); Thu, 10 Sep 2020 05:14:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599729254; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0L/VNlFzuDB/95oeV+NVSs6BIkKkE+sXG0xqekszlyk=; b=g0xcm5Hanux7U1zw3r5XNByLH/bojfd/kWEG3cUCQ6jOKG+uAGAliHb0KyU4BOiQ8Ai6l5 jyZpLETr3e3MHlVMkp9dHB6rDgA4NHlQ6KTBOT0xbYxSo8QAuqdThciunh7TEuHTrJMXUF fpxgY65A0CIBMxUL/PvmkPI8qfJsvzU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-292-_e125CRHNGeNx5hwN_T7eg-1; Thu, 10 Sep 2020 05:14:10 -0400 X-MC-Unique: _e125CRHNGeNx5hwN_T7eg-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 87B7664082; Thu, 10 Sep 2020 09:14:08 +0000 (UTC) Received: from t480s.redhat.com (ovpn-113-88.ams2.redhat.com [10.36.113.88]) by smtp.corp.redhat.com (Postfix) with ESMTP id 996B427CCC; Thu, 10 Sep 2020 09:14:01 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Michal Hocko , Dan Williams , Jason Gunthorpe , Kees Cook , Ard Biesheuvel , Pankaj Gupta , Baoquan He , Wei Yang Subject: [PATCH v3 1/7] kernel/resource: make release_mem_region_adjustable() never fail Date: Thu, 10 Sep 2020 11:13:34 +0200 Message-Id: <20200910091340.8654-2-david@redhat.com> In-Reply-To: <20200910091340.8654-1-david@redhat.com> References: <20200910091340.8654-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org Let's make sure splitting a resource on memory hotunplug will never fail. This will become more relevant once we merge selected System RAM resources - then, we'll trigger that case more often on memory hotunplug. In general, this function is already unlikely to fail. When we remove memory, we free up quite a lot of metadata (memmap, page tables, memory block device, etc.). The only reason it could really fail would be when injecting allocation errors. All other error cases inside release_mem_region_adjustable() seem to be sanity checks if the function would be abused in different context - let's add WARN_ON_ONCE() in these cases so we can catch them. Cc: Andrew Morton Cc: Michal Hocko Cc: Dan Williams Cc: Jason Gunthorpe Cc: Kees Cook Cc: Ard Biesheuvel Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Signed-off-by: David Hildenbrand --- include/linux/ioport.h | 4 ++-- kernel/resource.c | 49 ++++++++++++++++++++++++------------------ mm/memory_hotplug.c | 22 +------------------ 3 files changed, 31 insertions(+), 44 deletions(-) diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 6c2b06fe8beb7..52a91f5fa1a36 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -248,8 +248,8 @@ extern struct resource * __request_region(struct resource *, extern void __release_region(struct resource *, resource_size_t, resource_size_t); #ifdef CONFIG_MEMORY_HOTREMOVE -extern int release_mem_region_adjustable(struct resource *, resource_size_t, - resource_size_t); +extern void release_mem_region_adjustable(struct resource *, resource_size_t, + resource_size_t); #endif /* Wrappers for managed devices */ diff --git a/kernel/resource.c b/kernel/resource.c index f1175ce93a1d5..36b3552210120 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -1258,21 +1258,28 @@ EXPORT_SYMBOL(__release_region); * assumes that all children remain in the lower address entry for * simplicity. Enhance this logic when necessary. */ -int release_mem_region_adjustable(struct resource *parent, - resource_size_t start, resource_size_t size) +void release_mem_region_adjustable(struct resource *parent, + resource_size_t start, resource_size_t size) { + struct resource *new_res = NULL; + bool alloc_nofail = false; struct resource **p; struct resource *res; - struct resource *new_res; resource_size_t end; - int ret = -EINVAL; end = start + size - 1; - if ((start < parent->start) || (end > parent->end)) - return ret; + if (WARN_ON_ONCE((start < parent->start) || (end > parent->end))) + return; - /* The alloc_resource() result gets checked later */ - new_res = alloc_resource(GFP_KERNEL); + /* + * We free up quite a lot of memory on memory hotunplug (esp., memap), + * just before releasing the region. This is highly unlikely to + * fail - let's play save and make it never fail as the caller cannot + * perform any error handling (e.g., trying to re-add memory will fail + * similarly). + */ +retry: + new_res = alloc_resource(GFP_KERNEL | alloc_nofail ? __GFP_NOFAIL : 0); p = &parent->child; write_lock(&resource_lock); @@ -1298,7 +1305,6 @@ int release_mem_region_adjustable(struct resource *parent, * so if we are dealing with them, let us just back off here. */ if (!(res->flags & IORESOURCE_SYSRAM)) { - ret = 0; break; } @@ -1315,20 +1321,23 @@ int release_mem_region_adjustable(struct resource *parent, /* free the whole entry */ *p = res->sibling; free_resource(res); - ret = 0; } else if (res->start == start && res->end != end) { /* adjust the start */ - ret = __adjust_resource(res, end + 1, - res->end - end); + WARN_ON_ONCE(__adjust_resource(res, end + 1, + res->end - end)); } else if (res->start != start && res->end == end) { /* adjust the end */ - ret = __adjust_resource(res, res->start, - start - res->start); + WARN_ON_ONCE(__adjust_resource(res, res->start, + start - res->start)); } else { - /* split into two entries */ + /* split into two entries - we need a new resource */ if (!new_res) { - ret = -ENOMEM; - break; + new_res = alloc_resource(GFP_ATOMIC); + if (!new_res) { + alloc_nofail = true; + write_unlock(&resource_lock); + goto retry; + } } new_res->name = res->name; new_res->start = end + 1; @@ -1339,9 +1348,8 @@ int release_mem_region_adjustable(struct resource *parent, new_res->sibling = res->sibling; new_res->child = NULL; - ret = __adjust_resource(res, res->start, - start - res->start); - if (ret) + if (WARN_ON_ONCE(__adjust_resource(res, res->start, + start - res->start))) break; res->sibling = new_res; new_res = NULL; @@ -1352,7 +1360,6 @@ int release_mem_region_adjustable(struct resource *parent, write_unlock(&resource_lock); free_resource(new_res); - return ret; } #endif /* CONFIG_MEMORY_HOTREMOVE */ diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index baded53b9ff92..4c47b68a9f4b5 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1724,26 +1724,6 @@ void try_offline_node(int nid) } EXPORT_SYMBOL(try_offline_node); -static void __release_memory_resource(resource_size_t start, - resource_size_t size) -{ - int ret; - - /* - * When removing memory in the same granularity as it was added, - * this function never fails. It might only fail if resources - * have to be adjusted or split. We'll ignore the error, as - * removing of memory cannot fail. - */ - ret = release_mem_region_adjustable(&iomem_resource, start, size); - if (ret) { - resource_size_t endres = start + size - 1; - - pr_warn("Unable to release resource <%pa-%pa> (%d)\n", - &start, &endres, ret); - } -} - static int __ref try_remove_memory(int nid, u64 start, u64 size) { int rc = 0; @@ -1777,7 +1757,7 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size) memblock_remove(start, size); } - __release_memory_resource(start, size); + release_mem_region_adjustable(&iomem_resource, start, size); try_offline_node(nid); From patchwork Thu Sep 10 09:13:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 11767225 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6298A698 for ; Thu, 10 Sep 2020 09:15:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3BDFE208CA for ; Thu, 10 Sep 2020 09:15:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="ZK+Bvk7m" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730127AbgIJJPb (ORCPT ); Thu, 10 Sep 2020 05:15:31 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:42959 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730193AbgIJJOW (ORCPT ); Thu, 10 Sep 2020 05:14:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599729260; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zFptVInSDSbB2nDLyh5K9NKtnygiDyhn7CU7AXmV4mo=; b=ZK+Bvk7m7basu/rtGK9lRajkhlHLta5kNFKE4iol6Z4Oq4JiS9XnJPs0C063kJIwaMOhVJ TlVZN2Y16oa/LvPNdZczJ2ZHO5We2ioKdh0Avw28SveSk8ekzz4y7ogV+WN8IGaFo64c+I 4cLkYo1uZr0+L8vLciN5NEZO+dbJ+D0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-444-O9Y06Jz4OkiDaTZaHIbCkg-1; Thu, 10 Sep 2020 05:14:16 -0400 X-MC-Unique: O9Y06Jz4OkiDaTZaHIbCkg-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0280418B9F00; Thu, 10 Sep 2020 09:14:14 +0000 (UTC) Received: from t480s.redhat.com (ovpn-113-88.ams2.redhat.com [10.36.113.88]) by smtp.corp.redhat.com (Postfix) with ESMTP id D920727CCC; Thu, 10 Sep 2020 09:14:08 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Michal Hocko , Dan Williams , Jason Gunthorpe , Kees Cook , Ard Biesheuvel , Pankaj Gupta , Baoquan He , Wei Yang , Eric Biederman , Thomas Gleixner , Greg Kroah-Hartman , kexec@lists.infradead.org Subject: [PATCH v3 2/7] kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED Date: Thu, 10 Sep 2020 11:13:35 +0200 Message-Id: <20200910091340.8654-3-david@redhat.com> In-Reply-To: <20200910091340.8654-1-david@redhat.com> References: <20200910091340.8654-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is always set to 0 by hardware. This is far from beautiful (and confusing), and the bit only applies to SYSRAM. So let's move it out of the bus-specific (PnP) defined bits. We'll add another SYSRAM specific bit soon. If we ever need more bits for other purposes, we can steal some from "desc", or reshuffle/regroup what we have. Cc: Andrew Morton Cc: Michal Hocko Cc: Dan Williams Cc: Jason Gunthorpe Cc: Kees Cook Cc: Ard Biesheuvel Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Cc: Eric Biederman Cc: Thomas Gleixner Cc: Greg Kroah-Hartman Cc: kexec@lists.infradead.org Signed-off-by: David Hildenbrand --- include/linux/ioport.h | 4 +++- kernel/kexec_file.c | 2 +- mm/memory_hotplug.c | 4 ++-- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 52a91f5fa1a36..d7620d7c941a0 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -58,6 +58,9 @@ struct resource { #define IORESOURCE_EXT_TYPE_BITS 0x01000000 /* Resource extended types */ #define IORESOURCE_SYSRAM 0x01000000 /* System RAM (modifier) */ +/* IORESOURCE_SYSRAM specific bits. */ +#define IORESOURCE_SYSRAM_DRIVER_MANAGED 0x02000000 /* Always detected via a driver. */ + #define IORESOURCE_EXCLUSIVE 0x08000000 /* Userland may not map this resource */ #define IORESOURCE_DISABLED 0x10000000 @@ -103,7 +106,6 @@ struct resource { #define IORESOURCE_MEM_32BIT (3<<3) #define IORESOURCE_MEM_SHADOWABLE (1<<5) /* dup: IORESOURCE_SHADOWABLE */ #define IORESOURCE_MEM_EXPANSIONROM (1<<6) -#define IORESOURCE_MEM_DRIVER_MANAGED (1<<7) /* PnP I/O specific bits (IORESOURCE_BITS) */ #define IORESOURCE_IO_16BIT_ADDR (1<<0) diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index ca40bef75a616..dfeeed1aed084 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -520,7 +520,7 @@ static int locate_mem_hole_callback(struct resource *res, void *arg) /* Returning 0 will take to next memory range */ /* Don't use memory that will be detected and handled by a driver. */ - if (res->flags & IORESOURCE_MEM_DRIVER_MANAGED) + if (res->flags & IORESOURCE_SYSRAM_DRIVER_MANAGED) return 0; if (sz < kbuf->memsz) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 4c47b68a9f4b5..8e1cd18b5cf14 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -105,7 +105,7 @@ static struct resource *register_memory_resource(u64 start, u64 size, unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; if (strcmp(resource_name, "System RAM")) - flags |= IORESOURCE_MEM_DRIVER_MANAGED; + flags |= IORESOURCE_SYSRAM_DRIVER_MANAGED; /* * Make sure value parsed from 'mem=' only restricts memory adding @@ -1160,7 +1160,7 @@ EXPORT_SYMBOL_GPL(add_memory); * * For this memory, no entries in /sys/firmware/memmap ("raw firmware-provided * memory map") are created. Also, the created memory resource is flagged - * with IORESOURCE_MEM_DRIVER_MANAGED, so in-kernel users can special-case + * with IORESOURCE_SYSRAM_DRIVER_MANAGED, so in-kernel users can special-case * this memory as well (esp., not place kexec images onto it). * * The resource_name (visible via /proc/iomem) has to have the format From patchwork Thu Sep 10 09:13:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 11767229 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 41E79698 for ; Thu, 10 Sep 2020 09:15:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 08E30206BE for ; Thu, 10 Sep 2020 09:15:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="ckVmIeos" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730136AbgIJJPd (ORCPT ); Thu, 10 Sep 2020 05:15:33 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:47732 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726847AbgIJJOh (ORCPT ); Thu, 10 Sep 2020 05:14:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599729274; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bG9jcFZqt1ezDhzv7IvKf+DjV9LEx4e4KE6iRuRMdvg=; b=ckVmIeosHVrzW8DwH90ZLffnOxoAH/ZrUHMELW9clqUSroFf2XYb+mEDlyxA2hAlDFPeWC sicHID76alTECPMScJI7VaHjvcEqOKOWArSKB9ipCxAsfZCsWFenB5xGIQPqeFv7NAMY8v IK4S8ZJBGZwQC7ysSH3dv2IzPRs1j0M= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-85-ptkUy2LiMBG7-ccx0nM6Tg-1; Thu, 10 Sep 2020 05:14:30 -0400 X-MC-Unique: ptkUy2LiMBG7-ccx0nM6Tg-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 796B71091061; Thu, 10 Sep 2020 09:14:25 +0000 (UTC) Received: from t480s.redhat.com (ovpn-113-88.ams2.redhat.com [10.36.113.88]) by smtp.corp.redhat.com (Postfix) with ESMTP id 55A301A8EC; Thu, 10 Sep 2020 09:14:14 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Wei Liu , Juergen Gross , Michal Hocko , Dan Williams , Jason Gunthorpe , Pankaj Gupta , Baoquan He , Wei Yang , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , "Rafael J. Wysocki" , Len Brown , Greg Kroah-Hartman , Vishal Verma , Dave Jiang , "K. Y. Srinivasan" , Haiyang Zhang , Stephen Hemminger , Heiko Carstens , Vasily Gorbik , Christian Borntraeger , "Michael S. Tsirkin" , Jason Wang , Boris Ostrovsky , Stefano Stabellini , "Oliver O'Halloran" , Pingfan Liu , Nathan Lynch , Libor Pechacek , Anton Blanchard , Leonardo Bras , linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends Date: Thu, 10 Sep 2020 11:13:36 +0200 Message-Id: <20200910091340.8654-4-david@redhat.com> In-Reply-To: <20200910091340.8654-1-david@redhat.com> References: <20200910091340.8654-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org We soon want to pass flags, e.g., to mark added System RAM resources. mergeable. Prepare for that. This patch is based on a similar patch by Oscar Salvador: https://lkml.kernel.org/r/20190625075227.15193-3-osalvador@suse.de Acked-by: Wei Liu Reviewed-by: Juergen Gross # Xen related part Cc: Andrew Morton Cc: Michal Hocko Cc: Dan Williams Cc: Jason Gunthorpe Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: "Rafael J. Wysocki" Cc: Len Brown Cc: Greg Kroah-Hartman Cc: Vishal Verma Cc: Dave Jiang Cc: "K. Y. Srinivasan" Cc: Haiyang Zhang Cc: Stephen Hemminger Cc: Wei Liu Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Christian Borntraeger Cc: David Hildenbrand Cc: "Michael S. Tsirkin" Cc: Jason Wang Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Stefano Stabellini Cc: "Oliver O'Halloran" Cc: Pingfan Liu Cc: Nathan Lynch Cc: Libor Pechacek Cc: Anton Blanchard Cc: Leonardo Bras Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-acpi@vger.kernel.org Cc: linux-nvdimm@lists.01.org Cc: linux-hyperv@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Signed-off-by: David Hildenbrand Reviewed-by: Pankaj Gupta Reported-by: kernel test robot Reported-by: kernel test robot --- arch/powerpc/platforms/powernv/memtrace.c | 2 +- arch/powerpc/platforms/pseries/hotplug-memory.c | 2 +- drivers/acpi/acpi_memhotplug.c | 3 ++- drivers/base/memory.c | 3 ++- drivers/dax/kmem.c | 2 +- drivers/hv/hv_balloon.c | 2 +- drivers/s390/char/sclp_cmd.c | 2 +- drivers/virtio/virtio_mem.c | 2 +- drivers/xen/balloon.c | 2 +- include/linux/memory_hotplug.h | 16 ++++++++++++---- mm/memory_hotplug.c | 14 +++++++------- 11 files changed, 30 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c index 13b369d2cc454..6828108486f83 100644 --- a/arch/powerpc/platforms/powernv/memtrace.c +++ b/arch/powerpc/platforms/powernv/memtrace.c @@ -224,7 +224,7 @@ static int memtrace_online(void) ent->mem = 0; } - if (add_memory(ent->nid, ent->start, ent->size)) { + if (add_memory(ent->nid, ent->start, ent->size, MHP_NONE)) { pr_err("Failed to add trace memory to node %d\n", ent->nid); ret += 1; diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 0ea976d1cac47..e1c9fa0d730f5 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -615,7 +615,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb) nid = memory_add_physaddr_to_nid(lmb->base_addr); /* Add the memory */ - rc = __add_memory(nid, lmb->base_addr, block_sz); + rc = __add_memory(nid, lmb->base_addr, block_sz, MHP_NONE); if (rc) { invalidate_lmb_associativity_index(lmb); return rc; diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c index e294f44a78504..2067c3bc55763 100644 --- a/drivers/acpi/acpi_memhotplug.c +++ b/drivers/acpi/acpi_memhotplug.c @@ -207,7 +207,8 @@ static int acpi_memory_enable_device(struct acpi_memory_device *mem_device) if (node < 0) node = memory_add_physaddr_to_nid(info->start_addr); - result = __add_memory(node, info->start_addr, info->length); + result = __add_memory(node, info->start_addr, info->length, + MHP_NONE); /* * If the memory block has been used by the kernel, add_memory() diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 4db3c660de831..b4c297dd04755 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -432,7 +432,8 @@ static ssize_t probe_store(struct device *dev, struct device_attribute *attr, nid = memory_add_physaddr_to_nid(phys_addr); ret = __add_memory(nid, phys_addr, - MIN_MEMORY_BLOCK_SIZE * sections_per_block); + MIN_MEMORY_BLOCK_SIZE * sections_per_block, + MHP_NONE); if (ret) goto out; diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c index 7dcb2902e9b1b..896cb9444e727 100644 --- a/drivers/dax/kmem.c +++ b/drivers/dax/kmem.c @@ -95,7 +95,7 @@ int dev_dax_kmem_probe(struct dev_dax *dev_dax) * this as RAM automatically. */ rc = add_memory_driver_managed(numa_node, range.start, - range_len(&range), kmem_name); + range_len(&range), kmem_name, MHP_NONE); res->flags |= IORESOURCE_BUSY; if (rc) { diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 32e3bc0aa665a..3c0d52e244520 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -726,7 +726,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn)); ret = add_memory(nid, PFN_PHYS((start_pfn)), - (HA_CHUNK << PAGE_SHIFT)); + (HA_CHUNK << PAGE_SHIFT), MHP_NONE); if (ret) { pr_err("hot_add memory failed error is %d\n", ret); diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c index a864b21af602a..f6e97f0830f64 100644 --- a/drivers/s390/char/sclp_cmd.c +++ b/drivers/s390/char/sclp_cmd.c @@ -406,7 +406,7 @@ static void __init add_memory_merged(u16 rn) if (!size) goto skip_add; for (addr = start; addr < start + size; addr += block_size) - add_memory(0, addr, block_size); + add_memory(0, addr, block_size, MHP_NONE); skip_add: first_rn = rn; num = 1; diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index 834b7c13ef3dc..ed99e43354010 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -424,7 +424,7 @@ static int virtio_mem_mb_add(struct virtio_mem *vm, unsigned long mb_id) dev_dbg(&vm->vdev->dev, "adding memory block: %lu\n", mb_id); return add_memory_driver_managed(nid, addr, memory_block_size_bytes(), - vm->resource_name); + vm->resource_name, MHP_NONE); } /* diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 51427c752b37b..9f40a294d398d 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -331,7 +331,7 @@ static enum bp_state reserve_additional_memory(void) mutex_unlock(&balloon_mutex); /* add_memory_resource() requires the device_hotplug lock */ lock_device_hotplug(); - rc = add_memory_resource(nid, resource); + rc = add_memory_resource(nid, resource, MHP_NONE); unlock_device_hotplug(); mutex_lock(&balloon_mutex); diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 51a877fec8da8..e53d1058f3443 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -57,6 +57,12 @@ enum { MMOP_ONLINE_MOVABLE, }; +/* Flags for add_memory() and friends to specify memory hotplug details. */ +typedef int __bitwise mhp_t; + +/* No special request */ +#define MHP_NONE ((__force mhp_t)0) + /* * Extended parameters for memory hotplug: * altmap: alternative allocator for memmap array (optional) @@ -345,11 +351,13 @@ extern void set_zone_contiguous(struct zone *zone); extern void clear_zone_contiguous(struct zone *zone); extern void __ref free_area_init_core_hotplug(int nid); -extern int __add_memory(int nid, u64 start, u64 size); -extern int add_memory(int nid, u64 start, u64 size); -extern int add_memory_resource(int nid, struct resource *resource); +extern int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags); +extern int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags); +extern int add_memory_resource(int nid, struct resource *resource, + mhp_t mhp_flags); extern int add_memory_driver_managed(int nid, u64 start, u64 size, - const char *resource_name); + const char *resource_name, + mhp_t mhp_flags); extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap, int migratetype); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 8e1cd18b5cf14..8f0bd7c9a63a5 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1039,7 +1039,7 @@ static int online_memory_block(struct memory_block *mem, void *arg) * * we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */ -int __ref add_memory_resource(int nid, struct resource *res) +int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) { struct mhp_params params = { .pgprot = PAGE_KERNEL }; u64 start, size; @@ -1118,7 +1118,7 @@ int __ref add_memory_resource(int nid, struct resource *res) } /* requires device_hotplug_lock, see add_memory_resource() */ -int __ref __add_memory(int nid, u64 start, u64 size) +int __ref __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags) { struct resource *res; int ret; @@ -1127,18 +1127,18 @@ int __ref __add_memory(int nid, u64 start, u64 size) if (IS_ERR(res)) return PTR_ERR(res); - ret = add_memory_resource(nid, res); + ret = add_memory_resource(nid, res, mhp_flags); if (ret < 0) release_memory_resource(res); return ret; } -int add_memory(int nid, u64 start, u64 size) +int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags) { int rc; lock_device_hotplug(); - rc = __add_memory(nid, start, size); + rc = __add_memory(nid, start, size, mhp_flags); unlock_device_hotplug(); return rc; @@ -1167,7 +1167,7 @@ EXPORT_SYMBOL_GPL(add_memory); * "System RAM ($DRIVER)". */ int add_memory_driver_managed(int nid, u64 start, u64 size, - const char *resource_name) + const char *resource_name, mhp_t mhp_flags) { struct resource *res; int rc; @@ -1185,7 +1185,7 @@ int add_memory_driver_managed(int nid, u64 start, u64 size, goto out_unlock; } - rc = add_memory_resource(nid, res); + rc = add_memory_resource(nid, res, mhp_flags); if (rc < 0) release_memory_resource(res); From patchwork Thu Sep 10 09:13:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 11767245 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4788692C for ; Thu, 10 Sep 2020 09:16:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1BE8320882 for ; Thu, 10 Sep 2020 09:16:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="X1w2bRfP" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730338AbgIJJQn (ORCPT ); Thu, 10 Sep 2020 05:16:43 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:27433 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730157AbgIJJOo (ORCPT ); Thu, 10 Sep 2020 05:14:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599729282; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=48UuGMu8lAd3i+T5wgFdxU+g4zP+de252WBvi4+PyDM=; b=X1w2bRfPpBPtQA2230bsPzi50aEABcTontEYqR9qfwmVnZnfVAB+Z4Plv46agPnMYoDqAX 2+54XciDRGn1wR7qmkN5m1qbnLLAKBBO5/WzEMq+yA2mQRiUyDQZu5OJNJ18t1uKgAVCum k9qt7M5PH+nCHTXPXgN2xe8UHXoRbkc= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-89-nJixZS_8MP68XFaEi9SX_g-1; Thu, 10 Sep 2020 05:14:37 -0400 X-MC-Unique: nJixZS_8MP68XFaEi9SX_g-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 44BCF1091061; Thu, 10 Sep 2020 09:14:34 +0000 (UTC) Received: from t480s.redhat.com (ovpn-113-88.ams2.redhat.com [10.36.113.88]) by smtp.corp.redhat.com (Postfix) with ESMTP id CBFC51A8EC; Thu, 10 Sep 2020 09:14:25 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Michal Hocko , Dan Williams , Jason Gunthorpe , Kees Cook , Ard Biesheuvel , Thomas Gleixner , "K. Y. Srinivasan" , Haiyang Zhang , Stephen Hemminger , Wei Liu , Boris Ostrovsky , Juergen Gross , Stefano Stabellini , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Julien Grall , Pankaj Gupta , Baoquan He , Wei Yang Subject: [PATCH v3 4/7] mm/memory_hotplug: MEMHP_MERGE_RESOURCE to specify merging of System RAM resources Date: Thu, 10 Sep 2020 11:13:37 +0200 Message-Id: <20200910091340.8654-5-david@redhat.com> In-Reply-To: <20200910091340.8654-1-david@redhat.com> References: <20200910091340.8654-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org Some add_memory*() users add memory in small, contiguous memory blocks. Examples include virtio-mem, hyper-v balloon, and the XEN balloon. This can quickly result in a lot of memory resources, whereby the actual resource boundaries are not of interest (e.g., it might be relevant for DIMMs, exposed via /proc/iomem to user space). We really want to merge added resources in this scenario where possible. Let's provide a flag (MEMHP_MERGE_RESOURCE) to specify that a resource either created within add_memory*() or passed via add_memory_resource() shall be marked mergeable and merged with applicable siblings. To implement that, we need a kernel/resource interface to mark selected System RAM resources mergeable (IORESOURCE_SYSRAM_MERGEABLE) and trigger merging. Note: We really want to merge after the whole operation succeeded, not directly when adding a resource to the resource tree (it would break add_memory_resource() and require splitting resources again when the operation failed - e.g., due to -ENOMEM). Cc: Andrew Morton Cc: Michal Hocko Cc: Dan Williams Cc: Jason Gunthorpe Cc: Kees Cook Cc: Ard Biesheuvel Cc: Thomas Gleixner Cc: "K. Y. Srinivasan" Cc: Haiyang Zhang Cc: Stephen Hemminger Cc: Wei Liu Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Stefano Stabellini Cc: Roger Pau Monné Cc: Julien Grall Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Signed-off-by: David Hildenbrand Reviewed-by: Pankaj Gupta --- include/linux/ioport.h | 4 +++ include/linux/memory_hotplug.h | 7 ++++ kernel/resource.c | 60 ++++++++++++++++++++++++++++++++++ mm/memory_hotplug.c | 7 ++++ 4 files changed, 78 insertions(+) diff --git a/include/linux/ioport.h b/include/linux/ioport.h index d7620d7c941a0..7e61389dcb017 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -60,6 +60,7 @@ struct resource { /* IORESOURCE_SYSRAM specific bits. */ #define IORESOURCE_SYSRAM_DRIVER_MANAGED 0x02000000 /* Always detected via a driver. */ +#define IORESOURCE_SYSRAM_MERGEABLE 0x04000000 /* Resource can be merged. */ #define IORESOURCE_EXCLUSIVE 0x08000000 /* Userland may not map this resource */ @@ -253,6 +254,9 @@ extern void __release_region(struct resource *, resource_size_t, extern void release_mem_region_adjustable(struct resource *, resource_size_t, resource_size_t); #endif +#ifdef CONFIG_MEMORY_HOTPLUG +extern void merge_system_ram_resource(struct resource *res); +#endif /* Wrappers for managed devices */ struct device; diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index e53d1058f3443..869a59006cd8e 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -62,6 +62,13 @@ typedef int __bitwise mhp_t; /* No special request */ #define MHP_NONE ((__force mhp_t)0) +/* + * Allow merging of the added System RAM resource with adjacent, + * mergeable resources. After a successful call to add_memory_resource() + * with this flag set, the resource pointer must no longer be used as it + * might be stale, or the resource might have changed. + */ +#define MEMHP_MERGE_RESOURCE ((__force mhp_t)BIT(0)) /* * Extended parameters for memory hotplug: diff --git a/kernel/resource.c b/kernel/resource.c index 36b3552210120..7a91b935f4c20 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -1363,6 +1363,66 @@ void release_mem_region_adjustable(struct resource *parent, } #endif /* CONFIG_MEMORY_HOTREMOVE */ +#ifdef CONFIG_MEMORY_HOTPLUG +static bool system_ram_resources_mergeable(struct resource *r1, + struct resource *r2) +{ + /* We assume either r1 or r2 is IORESOURCE_SYSRAM_MERGEABLE. */ + return r1->flags == r2->flags && r1->end + 1 == r2->start && + r1->name == r2->name && r1->desc == r2->desc && + !r1->child && !r2->child; +} + +/* + * merge_system_ram_resource - mark the System RAM resource mergeable and try to + * merge it with adjacent, mergeable resources + * @res: resource descriptor + * + * This interface is intended for memory hotplug, whereby lots of contiguous + * system ram resources are added (e.g., via add_memory*()) by a driver, and + * the actual resource boundaries are not of interest (e.g., it might be + * relevant for DIMMs). Only resources that are marked mergeable, that have the + * same parent, and that don't have any children are considered. All mergeable + * resources must be immutable during the request. + * + * Note: + * - The caller has to make sure that no pointers to resources that are + * marked mergeable are used anymore after this call - the resource might + * be freed and the pointer might be stale! + * - release_mem_region_adjustable() will split on demand on memory hotunplug + */ +void merge_system_ram_resource(struct resource *res) +{ + const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; + struct resource *cur; + + if (WARN_ON_ONCE((res->flags & flags) != flags)) + return; + + write_lock(&resource_lock); + res->flags |= IORESOURCE_SYSRAM_MERGEABLE; + + /* Try to merge with next item in the list. */ + cur = res->sibling; + if (cur && system_ram_resources_mergeable(res, cur)) { + res->end = cur->end; + res->sibling = cur->sibling; + free_resource(cur); + } + + /* Try to merge with previous item in the list. */ + cur = res->parent->child; + while (cur && cur->sibling != res) + cur = cur->sibling; + if (cur && system_ram_resources_mergeable(cur, res)) { + cur->end = res->end; + cur->sibling = res->sibling; + free_resource(res); + } + write_unlock(&resource_lock); +} +#endif /* CONFIG_MEMORY_HOTPLUG */ + /* * Managed region resource */ diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 8f0bd7c9a63a5..553c718226b3e 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1102,6 +1102,13 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) /* device_online() will take the lock when calling online_pages() */ mem_hotplug_done(); + /* + * In case we're allowed to merge the resource, flag it and trigger + * merging now that adding succeeded. + */ + if (mhp_flags & MEMHP_MERGE_RESOURCE) + merge_system_ram_resource(res); + /* online pages if requested */ if (memhp_default_online_type != MMOP_OFFLINE) walk_memory_blocks(start, size, NULL, online_memory_block); From patchwork Thu Sep 10 09:13:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 11767243 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A24D3698 for ; Thu, 10 Sep 2020 09:16:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7C02B20882 for ; Thu, 10 Sep 2020 09:16:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="NO/Pwa1b" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730225AbgIJJQm (ORCPT ); Thu, 10 Sep 2020 05:16:42 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:54885 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730233AbgIJJOq (ORCPT ); Thu, 10 Sep 2020 05:14:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599729283; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UKbsrpgJcVk/I/seBIDa5mjqSR3CWWtNLQlSkA+xweM=; b=NO/Pwa1b+vp7qL8VJNUOz0fPOG3S7afG4YBaXrdmUdmTS8inIY1tK5jrEICAnpD7QdlnSb OU/EHfdPEyneZFUj0G6Y38D53JwXwG5YdqlF/jiaJXAKeL0TGklvklljv/FKfpL1bhM6F+ n1i3n/snvGXpPWHEWkIRnfkevgZnvx0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-276-4Ir0pFWkOr2SyXBA0eB5IQ-1; Thu, 10 Sep 2020 05:14:39 -0400 X-MC-Unique: 4Ir0pFWkOr2SyXBA0eB5IQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id DEF4E18B9F11; Thu, 10 Sep 2020 09:14:37 +0000 (UTC) Received: from t480s.redhat.com (ovpn-113-88.ams2.redhat.com [10.36.113.88]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9B8ED1A8EC; Thu, 10 Sep 2020 09:14:34 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Michal Hocko , Dan Williams , "Michael S . Tsirkin" , Jason Wang , Pankaj Gupta , Baoquan He , Wei Yang Subject: [PATCH v3 5/7] virtio-mem: try to merge system ram resources Date: Thu, 10 Sep 2020 11:13:38 +0200 Message-Id: <20200910091340.8654-6-david@redhat.com> In-Reply-To: <20200910091340.8654-1-david@redhat.com> References: <20200910091340.8654-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org virtio-mem adds memory in memory block granularity, to be able to remove it in the same granularity again later, and to grow slowly on demand. This, however, results in quite a lot of resources when adding a lot of memory. Resources are effectively stored in a list-based tree. Having a lot of resources not only wastes memory, it also makes traversing that tree more expensive, and makes /proc/iomem explode in size (e.g., requiring kexec-tools to manually merge resources later when e.g., trying to create a kdump header). Before this patch, we get (/proc/iomem) when hotplugging 2G via virtio-mem on x86-64: [...] 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-147ffffff : System RAM (virtio_mem) 148000000-14fffffff : System RAM (virtio_mem) 150000000-157ffffff : System RAM (virtio_mem) 158000000-15fffffff : System RAM (virtio_mem) 160000000-167ffffff : System RAM (virtio_mem) 168000000-16fffffff : System RAM (virtio_mem) 170000000-177ffffff : System RAM (virtio_mem) 178000000-17fffffff : System RAM (virtio_mem) 180000000-187ffffff : System RAM (virtio_mem) 188000000-18fffffff : System RAM (virtio_mem) 190000000-197ffffff : System RAM (virtio_mem) 198000000-19fffffff : System RAM (virtio_mem) 1a0000000-1a7ffffff : System RAM (virtio_mem) 1a8000000-1afffffff : System RAM (virtio_mem) 1b0000000-1b7ffffff : System RAM (virtio_mem) 1b8000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 With this patch, we get (/proc/iomem): [...] fffc0000-ffffffff : Reserved 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 Of course, with more hotplugged memory, it gets worse. When unplugging memory blocks again, try_remove_memory() (via offline_and_remove_memory()) will properly split the resource up again. Cc: Andrew Morton Cc: Michal Hocko Cc: Dan Williams Cc: Michael S. Tsirkin Cc: Jason Wang Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Signed-off-by: David Hildenbrand Reviewed-by: Pankaj Gupta --- drivers/virtio/virtio_mem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index ed99e43354010..ba4de598f6636 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -424,7 +424,8 @@ static int virtio_mem_mb_add(struct virtio_mem *vm, unsigned long mb_id) dev_dbg(&vm->vdev->dev, "adding memory block: %lu\n", mb_id); return add_memory_driver_managed(nid, addr, memory_block_size_bytes(), - vm->resource_name, MHP_NONE); + vm->resource_name, + MEMHP_MERGE_RESOURCE); } /* From patchwork Thu Sep 10 09:13:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 11767239 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E2EE692C for ; Thu, 10 Sep 2020 09:16:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BD3AA207DE for ; Thu, 10 Sep 2020 09:16:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="ggBtz7HK" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730184AbgIJJQH (ORCPT ); Thu, 10 Sep 2020 05:16:07 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:41019 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730268AbgIJJOu (ORCPT ); Thu, 10 Sep 2020 05:14:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599729288; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pcPWZv9PYAHyw7ygKg8TV8AW3ZEfFOjouUb95Y7pmVU=; b=ggBtz7HKaK9A03rUvqwACI3V5StcqeAQSdpO36PoTftsPskhPBdgrEyTOhwaQYM1OKE3wA pR898mI2ALQhwhlu/seAQIRFflXS0KhWRYb2rfmaeXXrFfuPiVz9bwjihsnA5iuft7R37q qljDlcUOqpd+I/LtpwnSONGX5aVM+rQ= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-97-F3s5MUEMNf-YywNaYeYo4Q-1; Thu, 10 Sep 2020 05:14:44 -0400 X-MC-Unique: F3s5MUEMNf-YywNaYeYo4Q-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id E63CB18B9F00; Thu, 10 Sep 2020 09:14:41 +0000 (UTC) Received: from t480s.redhat.com (ovpn-113-88.ams2.redhat.com [10.36.113.88]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3BE081A8EC; Thu, 10 Sep 2020 09:14:38 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Juergen Gross , Andrew Morton , Michal Hocko , Boris Ostrovsky , Stefano Stabellini , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Julien Grall , Pankaj Gupta , Baoquan He , Wei Yang Subject: [PATCH v3 6/7] xen/balloon: try to merge system ram resources Date: Thu, 10 Sep 2020 11:13:39 +0200 Message-Id: <20200910091340.8654-7-david@redhat.com> In-Reply-To: <20200910091340.8654-1-david@redhat.com> References: <20200910091340.8654-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org Let's try to merge system ram resources we add, to minimize the number of resources in /proc/iomem. We don't care about the boundaries of individual chunks we added. Reviewed-by: Juergen Gross Cc: Andrew Morton Cc: Michal Hocko Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Stefano Stabellini Cc: Roger Pau Monné Cc: Julien Grall Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Signed-off-by: David Hildenbrand --- drivers/xen/balloon.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 9f40a294d398d..b57b2067ecbfb 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -331,7 +331,7 @@ static enum bp_state reserve_additional_memory(void) mutex_unlock(&balloon_mutex); /* add_memory_resource() requires the device_hotplug lock */ lock_device_hotplug(); - rc = add_memory_resource(nid, resource, MHP_NONE); + rc = add_memory_resource(nid, resource, MEMHP_MERGE_RESOURCE); unlock_device_hotplug(); mutex_lock(&balloon_mutex); From patchwork Thu Sep 10 09:13:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 11767241 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A8B4A698 for ; Thu, 10 Sep 2020 09:16:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7FF8621D40 for ; Thu, 10 Sep 2020 09:16:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Q69G4RFV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730324AbgIJJQh (ORCPT ); Thu, 10 Sep 2020 05:16:37 -0400 Received: from us-smtp-1.mimecast.com ([207.211.31.81]:26127 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730287AbgIJJOx (ORCPT ); Thu, 10 Sep 2020 05:14:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599729291; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YsfsAQ2XAkLrSeK5gDaCbmO+kEOKaYFgY1Wng1dGBy0=; b=Q69G4RFVHHK/W6Jrw+wNa65ycMJjAmBmaW3xOxkEpTGSl0xBiwjvm1OfIiVqeU2V7/fTqi bu2SG5ZJ7kG5IcEuV76jf0YAixLfxzAJUlLd7nZdIz2EA+F/SWmu0yz6f2svtPyElxWJu6 JHtp/swa2vFmsY39t0x8vSJT5pGxWg0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-475-HBflxYiUOtC-j12zoJgTEQ-1; Thu, 10 Sep 2020 05:14:47 -0400 X-MC-Unique: HBflxYiUOtC-j12zoJgTEQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id A191D109106C; Thu, 10 Sep 2020 09:14:45 +0000 (UTC) Received: from t480s.redhat.com (ovpn-113-88.ams2.redhat.com [10.36.113.88]) by smtp.corp.redhat.com (Postfix) with ESMTP id 445081A8EC; Thu, 10 Sep 2020 09:14:42 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Wei Liu , Michal Hocko , "K. Y. Srinivasan" , Haiyang Zhang , Stephen Hemminger , Pankaj Gupta , Baoquan He , Wei Yang Subject: [PATCH v3 7/7] hv_balloon: try to merge system ram resources Date: Thu, 10 Sep 2020 11:13:40 +0200 Message-Id: <20200910091340.8654-8-david@redhat.com> In-Reply-To: <20200910091340.8654-1-david@redhat.com> References: <20200910091340.8654-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org Let's try to merge system ram resources we add, to minimize the number of resources in /proc/iomem. We don't care about the boundaries of individual chunks we added. Reviewed-by: Wei Liu Cc: Andrew Morton Cc: Michal Hocko Cc: "K. Y. Srinivasan" Cc: Haiyang Zhang Cc: Stephen Hemminger Cc: Wei Liu Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Signed-off-by: David Hildenbrand --- drivers/hv/hv_balloon.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 3c0d52e244520..b64d2efbefe71 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -726,7 +726,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size, nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn)); ret = add_memory(nid, PFN_PHYS((start_pfn)), - (HA_CHUNK << PAGE_SHIFT), MHP_NONE); + (HA_CHUNK << PAGE_SHIFT), MEMHP_MERGE_RESOURCE); if (ret) { pr_err("hot_add memory failed error is %d\n", ret);