[v1,2/5] kernel/resource: merge_system_ram_resources() to merge resources after hotplug

Message ID 20200821103431.13481-3-david@redhat.com (mailing list archive)
State Superseded
Series mm/memory_hotplug: selective merging of system ram resources

Commit Message

David Hildenbrand Aug. 21, 2020, 10:34 a.m. UTC
Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, the Hyper-V balloon, and the XEN balloon.

This can quickly result in a lot of memory resources, even though the
actual resource boundaries are not of interest (unlike, e.g., for DIMMs,
where the boundaries exposed via /proc/iomem to user space might be
relevant). We really want to merge added resources in this scenario where
possible.

Let's provide an interface to trigger merging of applicable child
resources. It will be, for example, used by virtio-mem to trigger
merging of system ram resources it added to its resource container, but
also by XEN and Hyper-V to trigger merging of system ram resources in
iomem_resource.

Note: We really want to merge only after the whole operation has succeeded,
not directly when adding a resource to the resource tree. Merging that
early would break add_memory_resource() and require splitting resources
again if the operation failed (e.g., due to -ENOMEM).

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Julien Grall <julien@xen.org>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/ioport.h |  3 +++
 kernel/resource.c      | 52 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+)
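
To illustrate the intended effect, here is a minimal user-space sketch of
the merge loop. It is only a model, not kernel code: struct model_res, the
MODEL_* flag values, model_add() and model_count() are hypothetical
stand-ins invented for this example, the desc field and all locking are
left out, and free() takes the place of free_resource().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in flag values for this model only; the kernel defines its own. */
#define MODEL_SYSTEM_RAM 0x1UL
#define MODEL_BUSY       0x2UL

/* Simplified user-space stand-in for the kernel's struct resource. */
struct model_res {
	uint64_t start, end;		/* inclusive address range */
	unsigned long flags;
	const char *name;
	struct model_res *sibling, *child;
};

/* Same conditions as in the patch, except that desc is not modelled. */
static bool model_mergeable(const struct model_res *r1,
			    const struct model_res *r2)
{
	/* Names are compared by pointer, not by string content. */
	return r1->flags == r2->flags && r1->end + 1 == r2->start &&
	       r1->name == r2->name && !r1->child && !r2->child;
}

/* Same loop shape as the patch, without the resource_lock handling. */
static void model_merge(struct model_res *parent)
{
	const unsigned long flags = MODEL_SYSTEM_RAM | MODEL_BUSY;
	struct model_res *cur = parent->child, *next;

	while (cur && cur->sibling) {
		next = cur->sibling;
		if ((cur->flags & flags) == flags &&
		    model_mergeable(cur, next)) {
			cur->end = next->end;		/* absorb the neighbour */
			cur->sibling = next->sibling;
			free(next);
			next = cur->sibling;
		}
		cur = next;	/* continue with the node after cur */
	}
}

/* Append a busy system-RAM block; returns false if allocation fails. */
static bool model_add(struct model_res *parent, uint64_t start,
		      uint64_t size, const char *name)
{
	struct model_res **link = &parent->child;
	struct model_res *res = calloc(1, sizeof(*res));

	assert(size > 0);
	if (!res)
		return false;
	res->start = start;
	res->end = start + size - 1;
	res->flags = MODEL_SYSTEM_RAM | MODEL_BUSY;
	res->name = name;
	while (*link)
		link = &(*link)->sibling;
	*link = res;
	return true;
}

/* Number of immediate children of @parent. */
static size_t model_count(const struct model_res *parent)
{
	size_t n = 0;

	for (const struct model_res *r = parent->child; r; r = r->sibling)
		n++;
	return n;
}
```

Calling model_merge() after every successful model_add() keeps a run of
contiguous blocks collapsed into a single entry. In this model, a single
call over four blocks that were all added beforehand leaves two entries,
because the loop continues with the node after a freshly merged one; a
second call then merges those two.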

Comments

Pankaj Gupta Aug. 31, 2020, 9:35 a.m. UTC | #1
> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> index 52a91f5fa1a36..3bb0020cd6ddc 100644
> --- a/include/linux/ioport.h
> +++ b/include/linux/ioport.h
> @@ -251,6 +251,9 @@ extern void __release_region(struct resource *, resource_size_t,
>  extern void release_mem_region_adjustable(struct resource *, resource_size_t,
>                                           resource_size_t);
>  #endif
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +extern void merge_system_ram_resources(struct resource *res);
> +#endif
>
>  /* Wrappers for managed devices */
>  struct device;
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 1dcef5d53d76e..b4e0963edadd2 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -1360,6 +1360,58 @@ void release_mem_region_adjustable(struct resource *parent,
>  }
>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +static bool system_ram_resources_mergeable(struct resource *r1,
> +                                          struct resource *r2)
> +{
> +       return r1->flags == r2->flags && r1->end + 1 == r2->start &&
> +              r1->name == r2->name && r1->desc == r2->desc &&
> +              !r1->child && !r2->child;
> +}
> +
> +/*
> + * merge_system_ram_resources - try to merge contiguous system ram resources
> + * @parent: parent resource descriptor
> + *
> + * This interface is intended for memory hotplug, whereby lots of contiguous
> + * system ram resources are added (e.g., via add_memory*()) by a driver, and
> + * the actual resource boundaries are not of interest (e.g., it might be
> + * relevant for DIMMs). Only immediate child resources that are busy and
> + * don't have any children are considered. All applicable child resources
> + * must be immutable during the request.
> + *
> + * Note:
> + * - The caller has to make sure that no pointers to resources that might
> + *   get merged are held anymore. Callers should only trigger merging of child
> + *   resources when they are the only one adding system ram resources to the
> + *   parent (besides during boot).
> + * - release_mem_region_adjustable() will split on demand on memory hotunplug
> + */
> +void merge_system_ram_resources(struct resource *parent)
> +{
> +       const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> +       struct resource *cur, *next;
> +
> +       write_lock(&resource_lock);
> +
> +       cur = parent->child;
> +       while (cur && cur->sibling) {
> +               next = cur->sibling;
> +               if ((cur->flags & flags) == flags &&

Maybe this can be changed to:
!(cur->flags & ~flags)

> +                   system_ram_resources_mergeable(cur, next)) {
> +                       cur->end = next->end;
> +                       cur->sibling = next->sibling;
> +                       free_resource(next);
> +                       next = cur->sibling;
> +               }
> +               cur = next;
> +       }
> +
> +       write_unlock(&resource_lock);
> +}
> +EXPORT_SYMBOL(merge_system_ram_resources);
> +#endif /* CONFIG_MEMORY_HOTPLUG */
> +
>  /*
>   * Managed region resource
>   */
> --
> 2.26.2
>
David Hildenbrand Sept. 8, 2020, 10:26 a.m. UTC | #2
On 31.08.20 11:35, Pankaj Gupta wrote:
>> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
>> index 52a91f5fa1a36..3bb0020cd6ddc 100644
>> --- a/include/linux/ioport.h
>> +++ b/include/linux/ioport.h
>> @@ -251,6 +251,9 @@ extern void __release_region(struct resource *, resource_size_t,
>>  extern void release_mem_region_adjustable(struct resource *, resource_size_t,
>>                                           resource_size_t);
>>  #endif
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +extern void merge_system_ram_resources(struct resource *res);
>> +#endif
>>
>>  /* Wrappers for managed devices */
>>  struct device;
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index 1dcef5d53d76e..b4e0963edadd2 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -1360,6 +1360,58 @@ void release_mem_region_adjustable(struct resource *parent,
>>  }
>>  #endif /* CONFIG_MEMORY_HOTREMOVE */
>>
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +static bool system_ram_resources_mergeable(struct resource *r1,
>> +                                          struct resource *r2)
>> +{
>> +       return r1->flags == r2->flags && r1->end + 1 == r2->start &&
>> +              r1->name == r2->name && r1->desc == r2->desc &&
>> +              !r1->child && !r2->child;
>> +}
>> +
>> +/*
>> + * merge_system_ram_resources - try to merge contiguous system ram resources
>> + * @parent: parent resource descriptor
>> + *
>> + * This interface is intended for memory hotplug, whereby lots of contiguous
>> + * system ram resources are added (e.g., via add_memory*()) by a driver, and
>> + * the actual resource boundaries are not of interest (e.g., it might be
>> + * relevant for DIMMs). Only immediate child resources that are busy and
>> + * don't have any children are considered. All applicable child resources
>> + * must be immutable during the request.
>> + *
>> + * Note:
>> + * - The caller has to make sure that no pointers to resources that might
>> + *   get merged are held anymore. Callers should only trigger merging of child
>> + *   resources when they are the only one adding system ram resources to the
>> + *   parent (besides during boot).
>> + * - release_mem_region_adjustable() will split on demand on memory hotunplug
>> + */
>> +void merge_system_ram_resources(struct resource *parent)
>> +{
>> +       const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>> +       struct resource *cur, *next;
>> +
>> +       write_lock(&resource_lock);
>> +
>> +       cur = parent->child;
>> +       while (cur && cur->sibling) {
>> +               next = cur->sibling;
>> +               if ((cur->flags & flags) == flags &&
> 
> Maybe this can be changed to:
> !(cur->flags & ~flags)

That would be different, I think.

(cur->flags & flags) == flags
checks that all "flags" are set (additional ones might be set).

!(cur->flags & ~flags)
checks that no other flags besides "flags" are set (and "flags" are not
required to be set).

We use the same handling in find_next_iomem_res(), which is called, e.g.,
via walk_system_ram_range(), also with IORESOURCE_SYSTEM_RAM |
IORESOURCE_BUSY.
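
The difference between the two checks can be seen with a few sample
values. The sketch below is self-contained user-space C; the DEMO_* bit
values and the helper names are made up for illustration and are not the
kernel's definitions.

```c
#include <stdbool.h>

/* Illustrative bit values for this sketch; not the kernel's definitions. */
#define DEMO_SYSTEM_RAM 0x1UL
#define DEMO_BUSY       0x2UL
#define DEMO_OTHER      0x4UL

/* True if every bit of @mask is set in @flags; extra bits are allowed. */
static bool has_all(unsigned long flags, unsigned long mask)
{
	return (flags & mask) == mask;
}

/* True if @flags has no bit outside @mask; bits of @mask may be clear. */
static bool has_no_others(unsigned long flags, unsigned long mask)
{
	return !(flags & ~mask);
}
```

With mask = DEMO_SYSTEM_RAM | DEMO_BUSY, a value that additionally has
DEMO_OTHER set passes has_all() but fails has_no_others(), while a value
of 0 or of only DEMO_BUSY fails has_all() but passes has_no_others(). The
two helpers agree only when the flags equal the mask exactly.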

Thanks for having a look!

Patch

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 52a91f5fa1a36..3bb0020cd6ddc 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -251,6 +251,9 @@  extern void __release_region(struct resource *, resource_size_t,
 extern void release_mem_region_adjustable(struct resource *, resource_size_t,
 					  resource_size_t);
 #endif
+#ifdef CONFIG_MEMORY_HOTPLUG
+extern void merge_system_ram_resources(struct resource *res);
+#endif
 
 /* Wrappers for managed devices */
 struct device;
diff --git a/kernel/resource.c b/kernel/resource.c
index 1dcef5d53d76e..b4e0963edadd2 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1360,6 +1360,58 @@  void release_mem_region_adjustable(struct resource *parent,
 }
 #endif	/* CONFIG_MEMORY_HOTREMOVE */
 
+#ifdef CONFIG_MEMORY_HOTPLUG
+static bool system_ram_resources_mergeable(struct resource *r1,
+					   struct resource *r2)
+{
+	return r1->flags == r2->flags && r1->end + 1 == r2->start &&
+	       r1->name == r2->name && r1->desc == r2->desc &&
+	       !r1->child && !r2->child;
+}
+
+/*
+ * merge_system_ram_resources - try to merge contiguous system ram resources
+ * @parent: parent resource descriptor
+ *
+ * This interface is intended for memory hotplug, whereby lots of contiguous
+ * system ram resources are added (e.g., via add_memory*()) by a driver, and
+ * the actual resource boundaries are not of interest (unlike, e.g., for
+ * DIMMs, where they might be relevant). Only immediate child resources that
+ * are busy and don't have any children are considered. All applicable child
+ * resources must be immutable during the request.
+ *
+ * Note:
+ * - The caller has to make sure that no pointers to resources that might
+ *   get merged are held anymore. Callers should only trigger merging of child
+ *   resources when they are the only one adding system ram resources to the
+ *   parent (besides during boot).
+ * - release_mem_region_adjustable() will split on demand on memory hotunplug
+ */
+void merge_system_ram_resources(struct resource *parent)
+{
+	const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	struct resource *cur, *next;
+
+	write_lock(&resource_lock);
+
+	cur = parent->child;
+	while (cur && cur->sibling) {
+		next = cur->sibling;
+		if ((cur->flags & flags) == flags &&
+		    system_ram_resources_mergeable(cur, next)) {
+			cur->end = next->end;
+			cur->sibling = next->sibling;
+			free_resource(next);
+			next = cur->sibling;
+		}
+		cur = next;
+	}
+
+	write_unlock(&resource_lock);
+}
+EXPORT_SYMBOL(merge_system_ram_resources);
+#endif	/* CONFIG_MEMORY_HOTPLUG */
+
 /*
  * Managed region resource
  */