Message ID | 20220207135618.17231-1-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New |
Series | mm/memory_hotplug: fix kfree() of bootmem memory |
On 07.02.22 14:56, Miaohe Lin wrote:
> We can't use kfree() to release the resource as it might come from bootmem.
> Use release_mem_region() instead.

How can this happen? release_memory_resource() is called either from
__add_memory() or from add_memory_driver_managed(), where we allocated
the region via register_memory_resource(). Both functions shouldn't ever
be called before the buddy is up and running.

Do you have a backtrace of an actual instance of this issue? Or was this
identified as possibly broken by code inspection?
Hi:
On 2022/2/7 22:33, David Hildenbrand wrote:
> On 07.02.22 14:56, Miaohe Lin wrote:
>> We can't use kfree() to release the resource as it might come from bootmem.
>> Use release_mem_region() instead.
>
> How can this happen? release_memory_resource() is called either from
> __add_memory() or from add_memory_driver_managed(), where we allocated
> the region via register_memory_resource(). Both functions shouldn't ever
> be called before the buddy is up and running.
>
> Do you have a backtrace of an actual instance of this issue? Or was this
> identified as possibly broken by code inspection?
>

This was identified as possibly broken by code inspection. IIUC, alloc_resource()
is always used to allocate the resource. It has the below logic:

	if (bootmem_resource_free) {
		res = bootmem_resource_free;
		bootmem_resource_free = res->sibling;
	}

where bootmem_resource_free is used to reuse the resource entries allocated by
boot mem after the system is up:

	/*
	 * For memory hotplug, there is no way to free resource entries allocated
	 * by boot mem after the system is up. So for reusing the resource entry
	 * we need to remember the resource.
	 */
	static struct resource *bootmem_resource_free;

So I think register_memory_resource() can reuse the resource allocated by bootmem.
Or am I missing anything?

Thanks.
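The reuse path described in this message can be modelled in user space. What follows is a minimal sketch, not kernel code: `struct res`, `boot_entry`, `model_alloc()`, `model_free()`, `plain_free_would_be_bug()` and `demo()` are hypothetical names, `calloc()`/`free()` stand in for `kzalloc()`/`kfree()`, and the `from_slab` flag is an assumed stand-in for the kernel's page-type check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct resource; only what the model needs. */
struct res {
	struct res *sibling;
	bool from_slab;		/* assumed stand-in for the slab page check */
};

/* Static storage models a boot-time entry that must never reach free(). */
static struct res boot_entry;
static struct res *bootmem_free_list;

static struct res *model_alloc(void)
{
	struct res *r = bootmem_free_list;

	if (r) {
		/* Reuse a parked boot-time entry before touching the heap. */
		bootmem_free_list = r->sibling;
		memset(r, 0, sizeof(*r));
		assert(!r->from_slab);
		return r;
	}
	r = calloc(1, sizeof(*r));
	if (r)
		r->from_slab = true;
	return r;
}

static void model_free(struct res *r)
{
	if (!r)
		return;
	if (!r->from_slab) {
		/* Boot-time entry: park it for later reuse. */
		r->sibling = bootmem_free_list;
		bootmem_free_list = r;
	} else {
		free(r);
	}
}

/* True if a caller bypassing model_free() with a plain free() would be wrong. */
static bool plain_free_would_be_bug(const struct res *r)
{
	return !r->from_slab;
}

static void demo(void)
{
	struct res *reused, *fresh;

	model_free(&boot_entry);	/* a boot-time entry gets released */
	reused = model_alloc();		/* ...and is handed out again */
	assert(reused == &boot_entry);
	assert(plain_free_would_be_bug(reused));

	fresh = model_alloc();		/* list is empty, so this is heap memory */
	assert(fresh && !plain_free_would_be_bug(fresh));

	model_free(fresh);
	model_free(reused);
}
```

Calling `demo()` from a `main()` exercises the case under discussion: once a boot-time entry has been parked and handed out again, the caller cannot tell from the pointer alone whether a direct `free()` is safe.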
On 08.02.22 02:59, Miaohe Lin wrote:
> Hi:
> On 2022/2/7 22:33, David Hildenbrand wrote:
>> On 07.02.22 14:56, Miaohe Lin wrote:
>>> We can't use kfree() to release the resource as it might come from bootmem.
>>> Use release_mem_region() instead.
>>
>> How can this happen? release_memory_resource() is called either from
>> __add_memory() or from add_memory_driver_managed(), where we allocated
>> the region via register_memory_resource(). Both functions shouldn't ever
>> be called before the buddy is up and running.
>>
>> Do you have a backtrace of an actual instance of this issue? Or was this
>> identified as possibly broken by code inspection?
>>
>
> This was identified as possibly broken by code inspection. IIUC, alloc_resource()
> is always used to allocate the resource. It has the below logic:
>
> 	if (bootmem_resource_free) {
> 		res = bootmem_resource_free;
> 		bootmem_resource_free = res->sibling;
> 	}
>
> where bootmem_resource_free is used to reuse the resource entries allocated by
> boot mem after the system is up:
>
> 	/*
> 	 * For memory hotplug, there is no way to free resource entries allocated
> 	 * by boot mem after the system is up. So for reusing the resource entry
> 	 * we need to remember the resource.
> 	 */
> 	static struct resource *bootmem_resource_free;
>
> So I think register_memory_resource() can reuse the resource allocated by bootmem.
> Or am I missing anything?

I think you're right: if we did a previous free_resource() of a resource allocated
during boot, we could end up reusing that here. My best guess is that this never
really happens.

Wow, that's ugly. It affects essentially anybody reserving+freeing a resource.

E.g., dax/kmem.c similarly does a release_resource(res)+kfree(res)

We could either

a) Expose free_resource() and replace all kfree(res) instances by it

b) Just simplify that. I don't think we care about saving a couple of
bytes in corner cases. I might be wrong (IIRC primarily ppc64 really
succeeds in unplugging boot memory)

diff --git a/kernel/resource.c b/kernel/resource.c
index 9c08d6e9eef2..fe91a72fd951 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -56,14 +56,6 @@ struct resource_constraint {
 
 static DEFINE_RWLOCK(resource_lock);
 
-/*
- * For memory hotplug, there is no way to free resource entries allocated
- * by boot mem after the system is up. So for reusing the resource entry
- * we need to remember the resource.
- */
-static struct resource *bootmem_resource_free;
-static DEFINE_SPINLOCK(bootmem_resource_lock);
-
 static struct resource *next_resource(struct resource *p)
 {
 	if (p->child)
@@ -160,36 +152,19 @@ __initcall(ioresources_init);
 
 static void free_resource(struct resource *res)
 {
-	if (!res)
-		return;
-
-	if (!PageSlab(virt_to_head_page(res))) {
-		spin_lock(&bootmem_resource_lock);
-		res->sibling = bootmem_resource_free;
-		bootmem_resource_free = res;
-		spin_unlock(&bootmem_resource_lock);
-	} else {
+	/*
+	 * If the resource was allocated using memblock early during boot
+	 * we'll leak it here: we can only return full pages back to the
+	 * buddy and trying to be smart and reusing them eventually in
+	 * alloc_resource() overcomplicates resource handling.
+	 */
+	if (res && PageSlab(virt_to_head_page(res)))
 		kfree(res);
-	}
 }
 
 static struct resource *alloc_resource(gfp_t flags)
 {
-	struct resource *res = NULL;
-
-	spin_lock(&bootmem_resource_lock);
-	if (bootmem_resource_free) {
-		res = bootmem_resource_free;
-		bootmem_resource_free = res->sibling;
-	}
-	spin_unlock(&bootmem_resource_lock);
-
-	if (res)
-		memset(res, 0, sizeof(struct resource));
-	else
-		res = kzalloc(sizeof(struct resource), flags);
-
-	return res;
+	return kzalloc(sizeof(struct resource), flags);
 }
 
 /* Return the conflict entry if you can't request it */
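The effect of option b) can be illustrated with a small user-space model. This is a sketch under stated assumptions rather than the kernel implementation: `struct res_model`, `simple_alloc()`, `simple_free()`, `leaked_boot_entries` and `demo()` are hypothetical names, `calloc()`/`free()` stand in for `kzalloc()`/`kfree()`, and the `from_slab` flag stands in for `PageSlab(virt_to_head_page(res))`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct resource. */
struct res_model {
	bool from_slab;		/* models PageSlab(virt_to_head_page(res)) */
};

/* Demo-only bookkeeping; the kernel would not count leaked entries. */
static size_t leaked_boot_entries;

/* After the simplification, every allocation comes from the heap. */
static struct res_model *simple_alloc(void)
{
	struct res_model *r = calloc(1, sizeof(*r));

	if (r)
		r->from_slab = true;
	return r;
}

/* Heap entries are freed; boot-time entries are deliberately leaked. */
static void simple_free(struct res_model *r)
{
	if (!r)
		return;
	if (r->from_slab)
		free(r);
	else
		leaked_boot_entries++;
}

static void demo(void)
{
	struct res_model boot_entry = { .from_slab = false };
	struct res_model *r = simple_alloc();

	assert(r && r->from_slab);	/* never a recycled boot-time entry */
	simple_free(r);

	simple_free(&boot_entry);	/* leaked, not parked for reuse */
	assert(leaked_boot_entries == 1);
}
```

In this model nothing returned by `simple_alloc()` can be a boot-time entry, so a caller that frees the pointer directly is safe. The cost is the few bytes leaked per boot-time entry, which is the trade-off named in option b).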
On 2022/2/8 17:19, David Hildenbrand wrote:
> On 08.02.22 02:59, Miaohe Lin wrote:
>> Hi:
>> On 2022/2/7 22:33, David Hildenbrand wrote:
>>> On 07.02.22 14:56, Miaohe Lin wrote:
>>>> We can't use kfree() to release the resource as it might come from bootmem.
>>>> Use release_mem_region() instead.
>>>
>>> How can this happen? release_memory_resource() is called either from
>>> __add_memory() or from add_memory_driver_managed(), where we allocated
>>> the region via register_memory_resource(). Both functions shouldn't ever
>>> be called before the buddy is up and running.
>>>
>>> Do you have a backtrace of an actual instance of this issue? Or was this
>>> identified as possibly broken by code inspection?
>>>
>>
>> This was identified as possibly broken by code inspection. IIUC, alloc_resource()
>> is always used to allocate the resource. It has the below logic:
>>
>> 	if (bootmem_resource_free) {
>> 		res = bootmem_resource_free;
>> 		bootmem_resource_free = res->sibling;
>> 	}
>>
>> where bootmem_resource_free is used to reuse the resource entries allocated by
>> boot mem after the system is up:
>>
>> 	/*
>> 	 * For memory hotplug, there is no way to free resource entries allocated
>> 	 * by boot mem after the system is up. So for reusing the resource entry
>> 	 * we need to remember the resource.
>> 	 */
>> 	static struct resource *bootmem_resource_free;
>>
>> So I think register_memory_resource() can reuse the resource allocated by bootmem.
>> Or am I missing anything?
>
> I think you're right: if we did a previous free_resource() of a resource allocated
> during boot, we could end up reusing that here. My best guess is that this never
> really happens.
>

Agree with you. This reuse mechanism was introduced in 2013 and this possible
issue has never happened.

> Wow, that's ugly. It affects essentially anybody reserving+freeing a resource.
>
> E.g., dax/kmem.c similarly does a release_resource(res)+kfree(res)
>
>
> We could either
>
> a) Expose free_resource() and replace all kfree(res) instances by it
>
> b) Just simplify that. I don't think we care about saving a couple of
> bytes in corner cases. I might be wrong (IIRC primarily ppc64 really
> succeeds in unplugging boot memory)
>

I prefer this one, as there would be a huge change if we chose a). I will drop
my patch; the patch below looks good to me.

Many thanks!

>
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 9c08d6e9eef2..fe91a72fd951 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -56,14 +56,6 @@ struct resource_constraint {
>  
>  static DEFINE_RWLOCK(resource_lock);
>  
> -/*
> - * For memory hotplug, there is no way to free resource entries allocated
> - * by boot mem after the system is up. So for reusing the resource entry
> - * we need to remember the resource.
> - */
> -static struct resource *bootmem_resource_free;
> -static DEFINE_SPINLOCK(bootmem_resource_lock);
> -
>  static struct resource *next_resource(struct resource *p)
>  {
>  	if (p->child)
> @@ -160,36 +152,19 @@ __initcall(ioresources_init);
>  
>  static void free_resource(struct resource *res)
>  {
> -	if (!res)
> -		return;
> -
> -	if (!PageSlab(virt_to_head_page(res))) {
> -		spin_lock(&bootmem_resource_lock);
> -		res->sibling = bootmem_resource_free;
> -		bootmem_resource_free = res;
> -		spin_unlock(&bootmem_resource_lock);
> -	} else {
> +	/*
> +	 * If the resource was allocated using memblock early during boot
> +	 * we'll leak it here: we can only return full pages back to the
> +	 * buddy and trying to be smart and reusing them eventually in
> +	 * alloc_resource() overcomplicates resource handling.
> +	 */
> +	if (res && PageSlab(virt_to_head_page(res)))
>  		kfree(res);
> -	}
>  }
>  
>  static struct resource *alloc_resource(gfp_t flags)
>  {
> -	struct resource *res = NULL;
> -
> -	spin_lock(&bootmem_resource_lock);
> -	if (bootmem_resource_free) {
> -		res = bootmem_resource_free;
> -		bootmem_resource_free = res->sibling;
> -	}
> -	spin_unlock(&bootmem_resource_lock);
> -
> -	if (res)
> -		memset(res, 0, sizeof(struct resource));
> -	else
> -		res = kzalloc(sizeof(struct resource), flags);
> -
> -	return res;
> +	return kzalloc(sizeof(struct resource), flags);
>  }
>  
>  /* Return the conflict entry if you can't request it */
>
>
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7dc7e12302db..dc570772b4b1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -216,8 +216,7 @@ static void release_memory_resource(struct resource *res)
 {
 	if (!res)
 		return;
-	release_resource(res);
-	kfree(res);
+	release_mem_region(res->start, resource_size(res));
 }
 
 static int check_pfn_span(unsigned long pfn, unsigned long nr_pages,
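The idea behind this change is that a release keyed on the address range lets the resource code free the entry itself, so the caller never has to know how the entry was allocated. The sketch below models that idea in user space under loud assumptions: `struct region`, `busy_list`, `model_request_region()`, `model_release_region()` and `demo()` are hypothetical names, a flat list replaces the kernel's resource tree, there is no locking or overlap checking, and this is not the kernel's implementation of `release_mem_region()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flat model of a busy region; the kernel uses a tree. */
struct region {
	uint64_t start, end;	/* inclusive range */
	struct region *next;
};

static struct region *busy_list;

static bool model_request_region(uint64_t start, uint64_t size)
{
	struct region *r;

	if (size == 0)
		return false;
	r = calloc(1, sizeof(*r));
	if (!r)
		return false;
	r->start = start;
	r->end = start + size - 1;
	r->next = busy_list;
	busy_list = r;
	return true;
}

/* Release by range: the entry is looked up and freed internally. */
static bool model_release_region(uint64_t start, uint64_t size)
{
	struct region **pp;
	uint64_t end;

	if (size == 0)
		return false;
	end = start + size - 1;
	for (pp = &busy_list; *pp; pp = &(*pp)->next) {
		struct region *r = *pp;

		if (r->start == start && r->end == end) {
			*pp = r->next;	/* unlink the matching entry */
			free(r);	/* freeing stays inside the resource code */
			return true;
		}
	}
	return false;		/* no exact match: nothing is released */
}

static void demo(void)
{
	assert(model_request_region(0x1000, 0x1000));
	assert(!model_release_region(0x1000, 0x800));	/* partial range */
	assert(model_release_region(0x1000, 0x1000));	/* exact range */
	assert(!model_release_region(0x1000, 0x1000));	/* already gone */
}
```

Because the caller passes only a start and a size, the decision about how to dispose of the entry is kept in one place, which is the property the patch relies on.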
We can't use kfree() to release the resource as it might come from bootmem.
Use release_mem_region() instead.

Fixes: ebff7d8f270d ("mem hotunplug: fix kfree() of bootmem memory")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory_hotplug.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)