Message ID | 20221112114602.1268989-4-liushixin2@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Refactor __kmem_cache_create() and fix memory leak | expand |
On Sat, Nov 12, 2022 at 07:46:02PM +0800, Liu Shixin wrote: > There is a memory leak of kobj->name in sysfs_slab_add(): > > unreferenced object 0xffff88817e446440 (size 32): > comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) > hex dump (first 32 bytes): > 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab > 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ > backtrace: > [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 > [<000000002f70da0c>] kstrdup_const+0x4b/0x80 > [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 > [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 > [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 > [<000000009326fd57>] __kmem_cache_create+0x406/0x590 > [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 > [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 > [<000000007a6531c8>] 0xffffffffa02d802d > [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 > [<00000000995ecdcf>] do_init_module+0xdf/0x320 > [<000000008821941f>] load_module+0x2f98/0x3330 > [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 > [<000000009339fbce>] do_syscall_64+0x35/0x80 > [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 > > Following the rules stated in the comment for kobject_init_and_add(): > If this function returns an error, kobject_put() must be called to > properly clean up the memory associated with the object. > > kobject_put() is more appropriate for error handling after kobject_init(). > And we can use this function to solve this problem. > > For the cache created early, the related sysfs_slab_add() is called in > slab_sysfs_init(). Skip free these kmem_cache since they are important > for system. Keep them working without sysfs. > > Fixes: 80da026a8e5d ("mm/slub: fix slab double-free in case of duplicate sysfs filename") > Signed-off-by: Liu Shixin <liushixin2@huawei.com> > --- > include/linux/slub_def.h | 4 ++-- > mm/slab_common.c | 6 ++---- > mm/slub.c | 21 +++++++++++++++++---- > 3 files changed, 21 insertions(+), 10 deletions(-) > > diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h > index 26d56c4c74d1..90c3e06b77b1 100644 > --- a/include/linux/slub_def.h > +++ b/include/linux/slub_def.h > @@ -144,11 +144,11 @@ struct kmem_cache { > > #ifdef CONFIG_SYSFS > #define SLAB_SUPPORTS_SYSFS > -int sysfs_slab_add(struct kmem_cache *); > +int sysfs_slab_add(struct kmem_cache *, bool); > void sysfs_slab_unlink(struct kmem_cache *); > void sysfs_slab_release(struct kmem_cache *); > #else > -static inline int sysfs_slab_add(struct kmem_cache *s) > +static inline int sysfs_slab_add(struct kmem_cache *s, bool free_slab) > { > return 0; > } > diff --git a/mm/slab_common.c b/mm/slab_common.c > index 55e2cf064dfe..30808a1d1b32 100644 > --- a/mm/slab_common.c > +++ b/mm/slab_common.c > @@ -237,11 +237,9 @@ static struct kmem_cache *create_cache(const char *name, > #ifdef SLAB_SUPPORTS_SYSFS > /* Mutex is not taken during early boot */ > if (slab_state >= FULL) { > - err = sysfs_slab_add(s); > - if (err) { > - slab_kmem_cache_release(s); > + err = sysfs_slab_add(s, true); > + if (err) > return ERR_PTR(err); > - } > debugfs_slab_add(s); > } > #endif > diff --git a/mm/slub.c b/mm/slub.c > index a1ad759753ce..25575bce0c3c 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -5881,7 +5881,7 @@ static char *create_unique_id(struct kmem_cache *s) > return name; > } > > -int sysfs_slab_add(struct kmem_cache *s) > +int sysfs_slab_add(struct kmem_cache *s, bool free_slab) free_slab is confusing. Maybe 'release' or 'shutdown'? > { > int err; > const char *name; > @@ -5911,14 +5911,17 @@ int sysfs_slab_add(struct kmem_cache *s) > * for the symlinks. > */ > name = create_unique_id(s); > - if (IS_ERR(name)) > + if (IS_ERR(name)) { > + if (free_slab) > + slab_kmem_cache_release(s); > return PTR_ERR(name); > + } > } > > s->kobj.kset = kset; > err = kobject_init_and_add(&s->kobj, &slab_ktype, NULL, "%s", name); > if (err) > - goto out; > + goto out_put_kobj; > > err = sysfs_create_group(&s->kobj, &slab_attr_group); > if (err) > @@ -5934,6 +5937,16 @@ int sysfs_slab_add(struct kmem_cache *s) > return err; > out_del_kobj: > kobject_del(&s->kobj); > +out_put_kobj: > + /* > + * Skip free kmem_cache that create early since they are important > + * for system. Keep them working without sysfs. Only free the name > + * for early allocated kmem_cache. > + */ > + if (free_slab) > + kobject_put(&s->kobj); > + else > + kfree_const(s->kobj.name); This is bypassing kobject API - can we initialize it with different ktype that has different .release function, depending on the boolean parameter? > goto out; > } > > @@ -6002,7 +6015,7 @@ static int __init slab_sysfs_init(void) > slab_state = FULL; > > list_for_each_entry(s, &slab_caches, list) { > - err = sysfs_slab_add(s); > + err = sysfs_slab_add(s, false); > if (err) > pr_err("SLUB: Unable to add boot slab %s to sysfs\n", > s->name); > -- > 2.25.1 > >
On 2022/11/12 19:46, Liu Shixin wrote: > There is a memory leak of kobj->name in sysfs_slab_add(): > > unreferenced object 0xffff88817e446440 (size 32): > comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) > hex dump (first 32 bytes): > 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab > 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ > backtrace: > [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 > [<000000002f70da0c>] kstrdup_const+0x4b/0x80 > [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 > [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 > [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 > [<000000009326fd57>] __kmem_cache_create+0x406/0x590 > [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 > [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 > [<000000007a6531c8>] 0xffffffffa02d802d > [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 > [<00000000995ecdcf>] do_init_module+0xdf/0x320 > [<000000008821941f>] load_module+0x2f98/0x3330 > [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 > [<000000009339fbce>] do_syscall_64+0x35/0x80 > [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Hi,every one, I found the same problem and it solve this problem with the patch, is there any plan to update the patch and solve it. > > Following the rules stated in the comment for kobject_init_and_add(): > If this function returns an error, kobject_put() must be called to > properly clean up the memory associated with the object. > > kobject_put() is more appropriate for error handling after kobject_init(). > And we can use this function to solve this problem. > > For the cache created early, the related sysfs_slab_add() is called in > slab_sysfs_init(). Skip free these kmem_cache since they are important > for system. Keep them working without sysfs. > > Fixes: 80da026a8e5d ("mm/slub: fix slab double-free in case of duplicate sysfs filename") > Signed-off-by: Liu Shixin <liushixin2@huawei.com> > --- > include/linux/slub_def.h | 4 ++-- > mm/slab_common.c | 6 ++---- > mm/slub.c | 21 +++++++++++++++++---- > 3 files changed, 21 insertions(+), 10 deletions(-) > > diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h > index 26d56c4c74d1..90c3e06b77b1 100644 > --- a/include/linux/slub_def.h > +++ b/include/linux/slub_def.h > @@ -144,11 +144,11 @@ struct kmem_cache { > > #ifdef CONFIG_SYSFS > #define SLAB_SUPPORTS_SYSFS > -int sysfs_slab_add(struct kmem_cache *); > +int sysfs_slab_add(struct kmem_cache *, bool); > void sysfs_slab_unlink(struct kmem_cache *); > void sysfs_slab_release(struct kmem_cache *); > #else > -static inline int sysfs_slab_add(struct kmem_cache *s) > +static inline int sysfs_slab_add(struct kmem_cache *s, bool free_slab) > { > return 0; > } > diff --git a/mm/slab_common.c b/mm/slab_common.c > index 55e2cf064dfe..30808a1d1b32 100644 > --- a/mm/slab_common.c > +++ b/mm/slab_common.c > @@ -237,11 +237,9 @@ static struct kmem_cache *create_cache(const char *name, > #ifdef SLAB_SUPPORTS_SYSFS > /* Mutex is not taken during early boot */ > if (slab_state >= FULL) { > - err = sysfs_slab_add(s); > - if (err) { > - slab_kmem_cache_release(s); > + err = sysfs_slab_add(s, true); > + if (err) > return ERR_PTR(err); > - } > debugfs_slab_add(s); > } > #endif > diff --git a/mm/slub.c b/mm/slub.c > index a1ad759753ce..25575bce0c3c 100644 > --- a/mm/slub.c > +++ b/mm/slub.c > @@ -5881,7 +5881,7 @@ static char *create_unique_id(struct kmem_cache *s) > return name; > } > > -int sysfs_slab_add(struct kmem_cache *s) > +int sysfs_slab_add(struct kmem_cache *s, bool free_slab) > { > int err; > const char *name; > @@ -5911,14 +5911,17 @@ int sysfs_slab_add(struct kmem_cache *s) > * for the symlinks. > */ > name = create_unique_id(s); > - if (IS_ERR(name)) > + if (IS_ERR(name)) { > + if (free_slab) > + slab_kmem_cache_release(s); > return PTR_ERR(name); > + } > } > > s->kobj.kset = kset; > err = kobject_init_and_add(&s->kobj, &slab_ktype, NULL, "%s", name); > if (err) > - goto out; > + goto out_put_kobj; > > err = sysfs_create_group(&s->kobj, &slab_attr_group); > if (err) > @@ -5934,6 +5937,16 @@ int sysfs_slab_add(struct kmem_cache *s) > return err; > out_del_kobj: > kobject_del(&s->kobj); > +out_put_kobj: > + /* > + * Skip free kmem_cache that create early since they are important > + * for system. Keep them working without sysfs. Only free the name > + * for early allocated kmem_cache. > + */ > + if (free_slab) > + kobject_put(&s->kobj); > + else > + kfree_const(s->kobj.name); > goto out; > } > > @@ -6002,7 +6015,7 @@ static int __init slab_sysfs_init(void) > slab_state = FULL; > > list_for_each_entry(s, &slab_caches, list) { > - err = sysfs_slab_add(s); > + err = sysfs_slab_add(s, false); > if (err) > pr_err("SLUB: Unable to add boot slab %s to sysfs\n", > s->name);
On Thu, Sep 5, 2024 at 12:41 PM Jinjie Ruan <ruanjinjie@huawei.com> wrote: > > > > On 2022/11/12 19:46, Liu Shixin wrote: > > There is a memory leak of kobj->name in sysfs_slab_add(): > > > > unreferenced object 0xffff88817e446440 (size 32): > > comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) > > hex dump (first 32 bytes): > > 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab > > 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ > > backtrace: > > [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 > > [<000000002f70da0c>] kstrdup_const+0x4b/0x80 > > [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 > > [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 > > [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 > > [<000000009326fd57>] __kmem_cache_create+0x406/0x590 > > [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 > > [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 > > [<000000007a6531c8>] 0xffffffffa02d802d > > [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 > > [<00000000995ecdcf>] do_init_module+0xdf/0x320 > > [<000000008821941f>] load_module+0x2f98/0x3330 > > [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 > > [<000000009339fbce>] do_syscall_64+0x35/0x80 > > [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 > > > Hi,every one, Hi. > I found the same problem and it solve this problem with the patch, is > there any plan to update the patch and solve it. What kernel version do you use, and when do you encounter it or how do you reproduce it? -- Hyeonggon
On 2024/9/5 21:59, Hyeonggon Yoo wrote: > On Thu, Sep 5, 2024 at 12:41 PM Jinjie Ruan <ruanjinjie@huawei.com> wrote: >> >> >> >> On 2022/11/12 19:46, Liu Shixin wrote: >>> There is a memory leak of kobj->name in sysfs_slab_add(): >>> >>> unreferenced object 0xffff88817e446440 (size 32): >>> comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) >>> hex dump (first 32 bytes): >>> 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab >>> 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ >>> backtrace: >>> [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 >>> [<000000002f70da0c>] kstrdup_const+0x4b/0x80 >>> [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 >>> [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 >>> [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 >>> [<000000009326fd57>] __kmem_cache_create+0x406/0x590 >>> [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 >>> [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 >>> [<000000007a6531c8>] 0xffffffffa02d802d >>> [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 >>> [<00000000995ecdcf>] do_init_module+0xdf/0x320 >>> [<000000008821941f>] load_module+0x2f98/0x3330 >>> [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 >>> [<000000009339fbce>] do_syscall_64+0x35/0x80 >>> [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 >> >> >> Hi,every one, > > Hi. > >> I found the same problem and it solve this problem with the patch, is >> there any plan to update the patch and solve it. > > What kernel version do you use, 6.11.0-rc6 > and when do you encounter it or how do you reproduce it? Hi, Hyeonggon, Thank you, I encounter it when doing inject fault test while modprobe amdgpu.ko. > > -- > Hyeonggon
On 9/6/24 10:10, Jinjie Ruan wrote: > > > On 2024/9/5 21:59, Hyeonggon Yoo wrote: >> On Thu, Sep 5, 2024 at 12:41 PM Jinjie Ruan <ruanjinjie@huawei.com> wrote: >>> >>> >>> >>> On 2022/11/12 19:46, Liu Shixin wrote: >>>> There is a memory leak of kobj->name in sysfs_slab_add(): >>>> >>>> unreferenced object 0xffff88817e446440 (size 32): >>>> comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) >>>> hex dump (first 32 bytes): >>>> 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab >>>> 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ >>>> backtrace: >>>> [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 >>>> [<000000002f70da0c>] kstrdup_const+0x4b/0x80 >>>> [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 >>>> [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 >>>> [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 >>>> [<000000009326fd57>] __kmem_cache_create+0x406/0x590 >>>> [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 >>>> [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 >>>> [<000000007a6531c8>] 0xffffffffa02d802d >>>> [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 >>>> [<00000000995ecdcf>] do_init_module+0xdf/0x320 >>>> [<000000008821941f>] load_module+0x2f98/0x3330 >>>> [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 >>>> [<000000009339fbce>] do_syscall_64+0x35/0x80 >>>> [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 >>> >>> >>> Hi,every one, >> >> Hi. >> >>> I found the same problem and it solve this problem with the patch, is >>> there any plan to update the patch and solve it. Hmm looks like back in 2022, Hyeonggon had some feedback to the series which was not answered and then it got forgotten. Feel free to take over and send an updated version. >> What kernel version do you use, > > 6.11.0-rc6 > >> and when do you encounter it or how do you reproduce it? > > Hi, Hyeonggon, > > Thank you, I encounter it when doing inject fault test while modprobe > amdgpu.ko. So I wonder where's the problem that results in kobject_init_and_add() failing. If it's genuinely duplicate name as commit 80da026a8e5d suggests, 6.12-rc1 will have a warning to prevent that. Delayed destruction of SLAB_TYPESAFE_BY_RCU caches should also no longer happen with 6.12-rc1. So worth retrying with that and if it's still failing, we should look at the root cause perhaps. >> >> -- >> Hyeonggon
> On Sep 13, 2024, at 11:10 PM, Vlastimil Babka <vbabka@suse.cz> wrote: > > On 9/6/24 10:10, Jinjie Ruan wrote: >> >> >> On 2024/9/5 21:59, Hyeonggon Yoo wrote: >>> On Thu, Sep 5, 2024 at 12:41 PM Jinjie Ruan <ruanjinjie@huawei.com> wrote: >>>> >>>> >>>> >>>> On 2022/11/12 19:46, Liu Shixin wrote: >>>>> There is a memory leak of kobj->name in sysfs_slab_add(): >>>>> >>>>> unreferenced object 0xffff88817e446440 (size 32): >>>>> comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) >>>>> hex dump (first 32 bytes): >>>>> 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab >>>>> 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ >>>>> backtrace: >>>>> [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 >>>>> [<000000002f70da0c>] kstrdup_const+0x4b/0x80 >>>>> [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 >>>>> [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 >>>>> [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 >>>>> [<000000009326fd57>] __kmem_cache_create+0x406/0x590 >>>>> [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 >>>>> [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 >>>>> [<000000007a6531c8>] 0xffffffffa02d802d >>>>> [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 >>>>> [<00000000995ecdcf>] do_init_module+0xdf/0x320 >>>>> [<000000008821941f>] load_module+0x2f98/0x3330 >>>>> [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 >>>>> [<000000009339fbce>] do_syscall_64+0x35/0x80 >>>>> [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 >>>> >>>> >>>> Hi,every one, >>> >>> Hi. >>> >>>> I found the same problem and it solve this problem with the patch, is >>>> there any plan to update the patch and solve it. > > Hmm looks like back in 2022, Hyeonggon had some feedback to the series which > was not answered and then it got forgotten. Feel free to take over and send > an updated version. I was thinking of what the fix would be with my feedback, and I still think passing different kobj_type (with a dummy release function) for early kmem_caches will be a more appropriate approach. However, there is one concern: people that wrote kobject.rst might not like it :( in Documentation/core-api/kobject.rst: > One important point cannot be overstated: every kobject must have a release() method, > and the kobject must persist (in a consistent state) until that method is called. If these constraints are not met, > the code is flawed. Note that the kernel will warn you if you forget to provide a release() method. > Do not try to get rid of this warning by providing an "empty" release function. But obviously we don't want to release caches just because the kernel failed to add it to sysfs. >>> What kernel version do you use, >> >> 6.11.0-rc6 >> >>> and when do you encounter it or how do you reproduce it? >> >> Hi, Hyeonggon, >> >> Thank you, I encounter it when doing inject fault test while modprobe >> amdgpu.ko. > > So I wonder where's the problem that results in kobject_init_and_add() > failing. If it's genuinely duplicate name as commit 80da026a8e5d suggests, > 6.12-rc1 will have a warning to prevent that. Delayed destruction of > SLAB_TYPESAFE_BY_RCU caches should also no longer happen with 6.12-rc1. So > worth retrying with that and if it's still failing, we should look at the > root cause perhaps. I thought it was because the memory allocation for a name string failed due to fault injection? > >>> >>> -- >>> Hyeonggon
On 9/13/24 17:00, Hyeonggon Yoo wrote: > > >> On Sep 13, 2024, at 11:10 PM, Vlastimil Babka <vbabka@suse.cz> wrote: >> >> On 9/6/24 10:10, Jinjie Ruan wrote: >>> >>> >>> On 2024/9/5 21:59, Hyeonggon Yoo wrote: >>>> On Thu, Sep 5, 2024 at 12:41 PM Jinjie Ruan <ruanjinjie@huawei.com> wrote: >>>>> >>>>> >>>>> >>>>> On 2022/11/12 19:46, Liu Shixin wrote: >>>>>> There is a memory leak of kobj->name in sysfs_slab_add(): >>>>>> >>>>>> unreferenced object 0xffff88817e446440 (size 32): >>>>>> comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) >>>>>> hex dump (first 32 bytes): >>>>>> 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab >>>>>> 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ >>>>>> backtrace: >>>>>> [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 >>>>>> [<000000002f70da0c>] kstrdup_const+0x4b/0x80 >>>>>> [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 >>>>>> [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 >>>>>> [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 >>>>>> [<000000009326fd57>] __kmem_cache_create+0x406/0x590 >>>>>> [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 >>>>>> [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 >>>>>> [<000000007a6531c8>] 0xffffffffa02d802d >>>>>> [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 >>>>>> [<00000000995ecdcf>] do_init_module+0xdf/0x320 >>>>>> [<000000008821941f>] load_module+0x2f98/0x3330 >>>>>> [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 >>>>>> [<000000009339fbce>] do_syscall_64+0x35/0x80 >>>>>> [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 >>>>> >>>>> >>>>> Hi,every one, >>>> >>>> Hi. >>>> >>>>> I found the same problem and it solve this problem with the patch, is >>>>> there any plan to update the patch and solve it. >> >> Hmm looks like back in 2022, Hyeonggon had some feedback to the series which >> was not answered and then it got forgotten. Feel free to take over and send >> an updated version. > > > I was thinking of what the fix would be with my feedback, > and I still think passing different kobj_type (with a dummy release function) for early kmem_caches > will be a more appropriate approach. > > However, there is one concern: people that wrote kobject.rst might not like it :( > > in Documentation/core-api/kobject.rst: >> One important point cannot be overstated: every kobject must have a release() method, >> and the kobject must persist (in a consistent state) until that method is called. If these constraints are not met, >> the code is flawed. Note that the kernel will warn you if you forget to provide a release() method. >> Do not try to get rid of this warning by providing an "empty" release function. > > But obviously we don't want to release caches just because the kernel failed to add it to sysfs. > >>>> What kernel version do you use, >>> >>> 6.11.0-rc6 >>> >>>> and when do you encounter it or how do you reproduce it? >>> >>> Hi, Hyeonggon, >>> >>> Thank you, I encounter it when doing inject fault test while modprobe >>> amdgpu.ko. >> >> So I wonder where's the problem that results in kobject_init_and_add() >> failing. If it's genuinely duplicate name as commit 80da026a8e5d suggests, >> 6.12-rc1 will have a warning to prevent that. Delayed destruction of >> SLAB_TYPESAFE_BY_RCU caches should also no longer happen with 6.12-rc1. So >> worth retrying with that and if it's still failing, we should look at the >> root cause perhaps. > > I thought it was because the memory allocation for a name string failed due to fault injection? Well in any case 6.12-rc1 introduced a new one, fixed by: https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/commit/?h=slab/for-6.12-rc1/fixes&id=77ced98f0f03fdc196561d1afbe652899c318073 So once that's mainline, we can see if anything remains >> >>>> >>>> -- >>>> Hyeonggon > >
On 2024/10/2 19:35, Vlastimil Babka wrote: > On 9/13/24 17:00, Hyeonggon Yoo wrote: >> >> >>> On Sep 13, 2024, at 11:10 PM, Vlastimil Babka <vbabka@suse.cz> wrote: >>> >>> On 9/6/24 10:10, Jinjie Ruan wrote: >>>> >>>> >>>> On 2024/9/5 21:59, Hyeonggon Yoo wrote: >>>>> On Thu, Sep 5, 2024 at 12:41 PM Jinjie Ruan <ruanjinjie@huawei.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 2022/11/12 19:46, Liu Shixin wrote: >>>>>>> There is a memory leak of kobj->name in sysfs_slab_add(): >>>>>>> >>>>>>> unreferenced object 0xffff88817e446440 (size 32): >>>>>>> comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) >>>>>>> hex dump (first 32 bytes): >>>>>>> 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab >>>>>>> 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ >>>>>>> backtrace: >>>>>>> [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 >>>>>>> [<000000002f70da0c>] kstrdup_const+0x4b/0x80 >>>>>>> [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 >>>>>>> [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 >>>>>>> [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 >>>>>>> [<000000009326fd57>] __kmem_cache_create+0x406/0x590 >>>>>>> [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 >>>>>>> [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 >>>>>>> [<000000007a6531c8>] 0xffffffffa02d802d >>>>>>> [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 >>>>>>> [<00000000995ecdcf>] do_init_module+0xdf/0x320 >>>>>>> [<000000008821941f>] load_module+0x2f98/0x3330 >>>>>>> [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 >>>>>>> [<000000009339fbce>] do_syscall_64+0x35/0x80 >>>>>>> [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 >>>>>> >>>>>> >>>>>> Hi,every one, >>>>> >>>>> Hi. >>>>> >>>>>> I found the same problem and it solve this problem with the patch, is >>>>>> there any plan to update the patch and solve it. >>> >>> Hmm looks like back in 2022, Hyeonggon had some feedback to the series which >>> was not answered and then it got forgotten. Feel free to take over and send >>> an updated version. >> >> >> I was thinking of what the fix would be with my feedback, >> and I still think passing different kobj_type (with a dummy release function) for early kmem_caches >> will be a more appropriate approach. >> >> However, there is one concern: people that wrote kobject.rst might not like it :( >> >> in Documentation/core-api/kobject.rst: >>> One important point cannot be overstated: every kobject must have a release() method, >>> and the kobject must persist (in a consistent state) until that method is called. If these constraints are not met, >>> the code is flawed. Note that the kernel will warn you if you forget to provide a release() method. >>> Do not try to get rid of this warning by providing an "empty" release function. >> >> But obviously we don't want to release caches just because the kernel failed to add it to sysfs. >> >>>>> What kernel version do you use, >>>> >>>> 6.11.0-rc6 >>>> >>>>> and when do you encounter it or how do you reproduce it? >>>> >>>> Hi, Hyeonggon, >>>> >>>> Thank you, I encounter it when doing inject fault test while modprobe >>>> amdgpu.ko. >>> >>> So I wonder where's the problem that results in kobject_init_and_add() >>> failing. If it's genuinely duplicate name as commit 80da026a8e5d suggests, >>> 6.12-rc1 will have a warning to prevent that. Delayed destruction of >>> SLAB_TYPESAFE_BY_RCU caches should also no longer happen with 6.12-rc1. So >>> worth retrying with that and if it's still failing, we should look at the >>> root cause perhaps. >> >> I thought it was because the memory allocation for a name string failed due to fault injection? > > Well in any case 6.12-rc1 introduced a new one, fixed by: > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/commit/?h=slab/for-6.12-rc1/fixes&id=77ced98f0f03fdc196561d1afbe652899c318073 > > So once that's mainline, we can see if anything remains Using the newest 6.12.0-rc4, the issue still exists in slab: unreferenced object 0xffffff80ce6da9c0 (size 16): comm "modprobe", pid 12782, jiffies 4299073226 hex dump (first 16 bytes): 6f 76 6c 5f 69 6e 6f 64 65 00 6d ce 80 ff ff ff ovl_inode.m..... backtrace (crc 1a460899): [<00000000edf3be8b>] kmemleak_alloc+0x34/0x40 [<0000000004121c8d>] __kmalloc_node_track_caller_noprof+0x304/0x3e8 [<00000000515e9eda>] kstrdup+0x48/0x84 [<000000005d2d0c1a>] kstrdup_const+0x34/0x40 [<00000000d14076ce>] kvasprintf_const+0x170/0x1e0 [<0000000060f79972>] kobject_set_name_vargs+0x5c/0x12c [<00000000299f544a>] kobject_init_and_add+0xd4/0x168 [<000000008ceb40f4>] sysfs_slab_add+0x190/0x21c [<00000000027371b9>] do_kmem_cache_create+0x354/0x5cc [<00000000cc9eb2aa>] __kmem_cache_create_args+0x1b8/0x2c8 [<000000006a3e21cc>] 0xffffffea545a409c [<0000000002f945b3>] do_one_initcall+0x110/0x77c [<0000000024f23211>] do_init_module+0x1dc/0x5c8 [<00000000a16337d6>] load_module+0x4acc/0x4e90 [<00000000be447e77>] init_module_from_file+0xd4/0x128 [<0000000048065de1>] idempotent_init_module+0x2d4/0x57c > >>> >>>>> >>>>> -- >>>>> Hyeonggon >> >> >
On Fri, Oct 25, 2024 at 11:10 AM Jinjie Ruan <ruanjinjie@huawei.com> wrote: > > > > On 2024/10/2 19:35, Vlastimil Babka wrote: > > On 9/13/24 17:00, Hyeonggon Yoo wrote: > >> > >> > >>> On Sep 13, 2024, at 11:10 PM, Vlastimil Babka <vbabka@suse.cz> wrote: > >>> > >>> On 9/6/24 10:10, Jinjie Ruan wrote: > >>>> > >>>> > >>>> On 2024/9/5 21:59, Hyeonggon Yoo wrote: > >>>>> On Thu, Sep 5, 2024 at 12:41 PM Jinjie Ruan <ruanjinjie@huawei.com> wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>>> On 2022/11/12 19:46, Liu Shixin wrote: > >>>>>>> There is a memory leak of kobj->name in sysfs_slab_add(): > >>>>>>> > >>>>>>> unreferenced object 0xffff88817e446440 (size 32): > >>>>>>> comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) > >>>>>>> hex dump (first 32 bytes): > >>>>>>> 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab > >>>>>>> 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ > >>>>>>> backtrace: > >>>>>>> [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 > >>>>>>> [<000000002f70da0c>] kstrdup_const+0x4b/0x80 > >>>>>>> [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 > >>>>>>> [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 > >>>>>>> [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 > >>>>>>> [<000000009326fd57>] __kmem_cache_create+0x406/0x590 > >>>>>>> [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 > >>>>>>> [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 > >>>>>>> [<000000007a6531c8>] 0xffffffffa02d802d > >>>>>>> [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 > >>>>>>> [<00000000995ecdcf>] do_init_module+0xdf/0x320 > >>>>>>> [<000000008821941f>] load_module+0x2f98/0x3330 > >>>>>>> [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 > >>>>>>> [<000000009339fbce>] do_syscall_64+0x35/0x80 > >>>>>>> [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 > >>>>>> > >>>>>> > >>>>>> Hi,every one, > >>>>> > >>>>> Hi. > >>>>> > >>>>>> I found the same problem and it solve this problem with the patch, is > >>>>>> there any plan to update the patch and solve it. > >>> > >>> Hmm looks like back in 2022, Hyeonggon had some feedback to the series which > >>> was not answered and then it got forgotten. Feel free to take over and send > >>> an updated version. > >> > >> > >> I was thinking of what the fix would be with my feedback, > >> and I still think passing different kobj_type (with a dummy release function) for early kmem_caches > >> will be a more appropriate approach. > >> > >> However, there is one concern: people that wrote kobject.rst might not like it :( > >> > >> in Documentation/core-api/kobject.rst: > >>> One important point cannot be overstated: every kobject must have a release() method, > >>> and the kobject must persist (in a consistent state) until that method is called. If these constraints are not met, > >>> the code is flawed. Note that the kernel will warn you if you forget to provide a release() method. > >>> Do not try to get rid of this warning by providing an "empty" release function. > >> > >> But obviously we don't want to release caches just because the kernel failed to add it to sysfs. > >> > >>>>> What kernel version do you use, > >>>> > >>>> 6.11.0-rc6 > >>>> > >>>>> and when do you encounter it or how do you reproduce it? > >>>> > >>>> Hi, Hyeonggon, > >>>> > >>>> Thank you, I encounter it when doing inject fault test while modprobe > >>>> amdgpu.ko. > >>> > >>> So I wonder where's the problem that results in kobject_init_and_add() > >>> failing. If it's genuinely duplicate name as commit 80da026a8e5d suggests, > >>> 6.12-rc1 will have a warning to prevent that. Delayed destruction of > >>> SLAB_TYPESAFE_BY_RCU caches should also no longer happen with 6.12-rc1. So > >>> worth retrying with that and if it's still failing, we should look at the > >>> root cause perhaps. > >> > >> I thought it was because the memory allocation for a name string failed due to fault injection? > > > > Well in any case 6.12-rc1 introduced a new one, fixed by: > > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/commit/?h=slab/for-6.12-rc1/fixes&id=77ced98f0f03fdc196561d1afbe652899c318073 > > > > So once that's mainline, we can see if anything remains > > Using the newest 6.12.0-rc4, the issue still exists in slab: > > unreferenced object 0xffffff80ce6da9c0 (size 16): > comm "modprobe", pid 12782, jiffies 4299073226 > hex dump (first 16 bytes): > 6f 76 6c 5f 69 6e 6f 64 65 00 6d ce 80 ff ff ff ovl_inode.m..... > backtrace (crc 1a460899): > [<00000000edf3be8b>] kmemleak_alloc+0x34/0x40 > [<0000000004121c8d>] __kmalloc_node_track_caller_noprof+0x304/0x3e8 > [<00000000515e9eda>] kstrdup+0x48/0x84 > [<000000005d2d0c1a>] kstrdup_const+0x34/0x40 > [<00000000d14076ce>] kvasprintf_const+0x170/0x1e0 > [<0000000060f79972>] kobject_set_name_vargs+0x5c/0x12c > [<00000000299f544a>] kobject_init_and_add+0xd4/0x168 > [<000000008ceb40f4>] sysfs_slab_add+0x190/0x21c > [<00000000027371b9>] do_kmem_cache_create+0x354/0x5cc > [<00000000cc9eb2aa>] __kmem_cache_create_args+0x1b8/0x2c8 > [<000000006a3e21cc>] 0xffffffea545a409c > [<0000000002f945b3>] do_one_initcall+0x110/0x77c > [<0000000024f23211>] do_init_module+0x1dc/0x5c8 > [<00000000a16337d6>] load_module+0x4acc/0x4e90 > [<00000000be447e77>] init_module_from_file+0xd4/0x128 > [<0000000048065de1>] idempotent_init_module+0x2d4/0x57c Hi Jinjie, Oh, I should've Cc'd you... https://lore.kernel.org/linux-mm/20241021091413.154775-1-42.hyeyoo@gmail.com/ I wrote a fix for this and I'm adjusting feedback from Christoph and Vlastimil. Thanks, Hyeonggon
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h index 26d56c4c74d1..90c3e06b77b1 100644 --- a/include/linux/slub_def.h +++ b/include/linux/slub_def.h @@ -144,11 +144,11 @@ struct kmem_cache { #ifdef CONFIG_SYSFS #define SLAB_SUPPORTS_SYSFS -int sysfs_slab_add(struct kmem_cache *); +int sysfs_slab_add(struct kmem_cache *, bool); void sysfs_slab_unlink(struct kmem_cache *); void sysfs_slab_release(struct kmem_cache *); #else -static inline int sysfs_slab_add(struct kmem_cache *s) +static inline int sysfs_slab_add(struct kmem_cache *s, bool free_slab) { return 0; } diff --git a/mm/slab_common.c b/mm/slab_common.c index 55e2cf064dfe..30808a1d1b32 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -237,11 +237,9 @@ static struct kmem_cache *create_cache(const char *name, #ifdef SLAB_SUPPORTS_SYSFS /* Mutex is not taken during early boot */ if (slab_state >= FULL) { - err = sysfs_slab_add(s); - if (err) { - slab_kmem_cache_release(s); + err = sysfs_slab_add(s, true); + if (err) return ERR_PTR(err); - } debugfs_slab_add(s); } #endif diff --git a/mm/slub.c b/mm/slub.c index a1ad759753ce..25575bce0c3c 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -5881,7 +5881,7 @@ static char *create_unique_id(struct kmem_cache *s) return name; } -int sysfs_slab_add(struct kmem_cache *s) +int sysfs_slab_add(struct kmem_cache *s, bool free_slab) { int err; const char *name; @@ -5911,14 +5911,17 @@ int sysfs_slab_add(struct kmem_cache *s) * for the symlinks. */ name = create_unique_id(s); - if (IS_ERR(name)) + if (IS_ERR(name)) { + if (free_slab) + slab_kmem_cache_release(s); return PTR_ERR(name); + } } s->kobj.kset = kset; err = kobject_init_and_add(&s->kobj, &slab_ktype, NULL, "%s", name); if (err) - goto out; + goto out_put_kobj; err = sysfs_create_group(&s->kobj, &slab_attr_group); if (err) @@ -5934,6 +5937,16 @@ int sysfs_slab_add(struct kmem_cache *s) return err; out_del_kobj: kobject_del(&s->kobj); +out_put_kobj: + /* + * Skip free kmem_cache that create early since they are important + * for system. Keep them working without sysfs. Only free the name + * for early allocated kmem_cache. + */ + if (free_slab) + kobject_put(&s->kobj); + else + kfree_const(s->kobj.name); goto out; } @@ -6002,7 +6015,7 @@ static int __init slab_sysfs_init(void) slab_state = FULL; list_for_each_entry(s, &slab_caches, list) { - err = sysfs_slab_add(s); + err = sysfs_slab_add(s, false); if (err) pr_err("SLUB: Unable to add boot slab %s to sysfs\n", s->name);
There is a memory leak of kobj->name in sysfs_slab_add(): unreferenced object 0xffff88817e446440 (size 32): comm "insmod", pid 4085, jiffies 4296564501 (age 126.272s) hex dump (first 32 bytes): 75 62 69 66 73 5f 69 6e 6f 64 65 5f 73 6c 61 62 ubifs_inode_slab 00 65 44 7e 81 88 ff ff 00 00 00 00 00 00 00 00 .eD~............ backtrace: [<000000005b30fbbd>] __kmalloc_node_track_caller+0x4e/0x150 [<000000002f70da0c>] kstrdup_const+0x4b/0x80 [<00000000c6712c61>] kobject_set_name_vargs+0x2f/0xb0 [<00000000b151218e>] kobject_init_and_add+0xb0/0x120 [<00000000e56a4cf5>] sysfs_slab_add+0x17d/0x220 [<000000009326fd57>] __kmem_cache_create+0x406/0x590 [<00000000dde33cff>] kmem_cache_create_usercopy+0x1fc/0x300 [<00000000fe90cedb>] kmem_cache_create+0x12/0x20 [<000000007a6531c8>] 0xffffffffa02d802d [<000000000e3b13c7>] do_one_initcall+0x87/0x2a0 [<00000000995ecdcf>] do_init_module+0xdf/0x320 [<000000008821941f>] load_module+0x2f98/0x3330 [<00000000ef51efa4>] __do_sys_finit_module+0x113/0x1b0 [<000000009339fbce>] do_syscall_64+0x35/0x80 [<000000006b7f2033>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Following the rules stated in the comment for kobject_init_and_add(): If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. kobject_put() is more appropriate for error handling after kobject_init(). And we can use this function to solve this problem. For the cache created early, the related sysfs_slab_add() is called in slab_sysfs_init(). Skip free these kmem_cache since they are important for system. Keep them working without sysfs. Fixes: 80da026a8e5d ("mm/slub: fix slab double-free in case of duplicate sysfs filename") Signed-off-by: Liu Shixin <liushixin2@huawei.com> --- include/linux/slub_def.h | 4 ++-- mm/slab_common.c | 6 ++---- mm/slub.c | 21 +++++++++++++++++---- 3 files changed, 21 insertions(+), 10 deletions(-)