Message ID | 20241014032336.482088-1-chenridong@huaweicloud.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] mm: shrinker: avoid memleak in alloc_shrinker_info | expand |
> On Oct 14, 2024, at 11:23, Chen Ridong <chenridong@huaweicloud.com> wrote: > > From: Chen Ridong <chenridong@huawei.com> > > A memleak was found as bellow: > > unreferenced object 0xffff8881010d2a80 (size 32): > comm "mkdir", pid 1559, jiffies 4294932666 > hex dump (first 32 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... > backtrace (crc 2e7ef6fa): > [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 > [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 > [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 > [<ffffffff81198dd9>] online_css+0x29/0xa0 > [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 > [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 > [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 > [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 > [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 > [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 > [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 > [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > In the alloc_shrinker_info function, when shrinker_unit_alloc return > err, the info won't be freed. Just fix it. > > Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") > Signed-off-by: Chen Ridong <chenridong@huawei.com> > --- > mm/shrinker.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/shrinker.c b/mm/shrinker.c > index dc5d2a6fcfc4..92270413190d 100644 > --- a/mm/shrinker.c > +++ b/mm/shrinker.c > @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > > err: > mutex_unlock(&shrinker_mutex); > + kvfree(info); Please transpose kvfree() to the place above mutex_unlock(). We should release the resources in the order which is inverse to the that of acquiring resources. Thanks. > free_shrinker_info(memcg); > return -ENOMEM; > } > -- > 2.34.1 >
On 10/14/24 08:53, Chen Ridong wrote: > From: Chen Ridong <chenridong@huawei.com> > > A memleak was found as bellow: > > unreferenced object 0xffff8881010d2a80 (size 32): > comm "mkdir", pid 1559, jiffies 4294932666 > hex dump (first 32 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... > backtrace (crc 2e7ef6fa): > [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 > [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 > [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 > [<ffffffff81198dd9>] online_css+0x29/0xa0 > [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 > [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 > [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 > [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 > [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 > [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 > [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 > [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > In the alloc_shrinker_info function, when shrinker_unit_alloc return > err, the info won't be freed. Just fix it. > > Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") > Signed-off-by: Chen Ridong <chenridong@huawei.com> > --- > mm/shrinker.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/shrinker.c b/mm/shrinker.c > index dc5d2a6fcfc4..92270413190d 100644 > --- a/mm/shrinker.c > +++ b/mm/shrinker.c > @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > > err: > mutex_unlock(&shrinker_mutex); > + kvfree(info); > free_shrinker_info(memcg); > return -ENOMEM; > } There are two scenarios when "goto err:" gets called - When shrinker_info allocations fails, no kvfree() is required - but after this change kvfree() would be called even when the allocation had failed originally, which does not sound right - shrinker_unit_alloc() fails, kvfree() is actually required I guess kvfree() should be called just after shrinker_unit_alloc() fails but before calling into "goto err". But curious, should not both kvzalloc_node()/kvfree() be avoided while inside mutex lock to avoid possible lockdep issues ?
> On Oct 14, 2024, at 16:13, Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > > > On 10/14/24 08:53, Chen Ridong wrote: >> From: Chen Ridong <chenridong@huawei.com> >> >> A memleak was found as bellow: >> >> unreferenced object 0xffff8881010d2a80 (size 32): >> comm "mkdir", pid 1559, jiffies 4294932666 >> hex dump (first 32 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >> backtrace (crc 2e7ef6fa): >> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >> [<ffffffff81198dd9>] online_css+0x29/0xa0 >> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >> >> In the alloc_shrinker_info function, when shrinker_unit_alloc return >> err, the info won't be freed. Just fix it. >> >> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >> Signed-off-by: Chen Ridong <chenridong@huawei.com> >> --- >> mm/shrinker.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/mm/shrinker.c b/mm/shrinker.c >> index dc5d2a6fcfc4..92270413190d 100644 >> --- a/mm/shrinker.c >> +++ b/mm/shrinker.c >> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >> >> err: >> mutex_unlock(&shrinker_mutex); >> + kvfree(info); >> free_shrinker_info(memcg); >> return -ENOMEM; >> } > > There are two scenarios when "goto err:" gets called > > - When shrinker_info allocations fails, no kvfree() is required > - but after this change kvfree() would be called even > when the allocation had failed originally, which does > not sound right Yes. In this case, @info is NULL and kvfree could handle NULL. It seems strange but the final behaviour correct. > > - shrinker_unit_alloc() fails, kvfree() is actually required > > I guess kvfree() should be called just after shrinker_unit_alloc() > fails but before calling into "goto err". We could do it like this, which avoids ambiguity (if someone ignores that kvfree could handle NULL). Something like: --- a/mm/shrinker.c +++ b/mm/shrinker.c @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) goto err; info->map_nr_max = shrinker_nr_max; if (shrinker_unit_alloc(info, NULL, nid)) - goto err; + goto free; rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); } mutex_unlock(&shrinker_mutex); return ret; - +free: + kvfree(info); err: mutex_unlock(&shrinker_mutex); free_shrinker_info(memcg); Thanks. > > But curious, should not both kvzalloc_node()/kvfree() be avoided > while inside mutex lock to avoid possible lockdep issues ?
On 2024/10/14 16:43, Muchun Song wrote: > > >> On Oct 14, 2024, at 16:13, Anshuman Khandual <anshuman.khandual@arm.com> wrote: >> >> >> >> On 10/14/24 08:53, Chen Ridong wrote: >>> From: Chen Ridong <chenridong@huawei.com> >>> >>> A memleak was found as bellow: >>> >>> unreferenced object 0xffff8881010d2a80 (size 32): >>> comm "mkdir", pid 1559, jiffies 4294932666 >>> hex dump (first 32 bytes): >>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>> backtrace (crc 2e7ef6fa): >>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>> >>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>> err, the info won't be freed. Just fix it. >>> >>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>> --- >>> mm/shrinker.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>> index dc5d2a6fcfc4..92270413190d 100644 >>> --- a/mm/shrinker.c >>> +++ b/mm/shrinker.c >>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>> >>> err: >>> mutex_unlock(&shrinker_mutex); >>> + kvfree(info); >>> free_shrinker_info(memcg); >>> return -ENOMEM; >>> } >> >> There are two scenarios when "goto err:" gets called >> >> - When shrinker_info allocations fails, no kvfree() is required >> - but after this change kvfree() would be called even >> when the allocation had failed originally, which does >> not sound right > > Yes. In this case, @info is NULL and kvfree could handle NULL. > It seems strange but the final behaviour correct. > >> >> - shrinker_unit_alloc() fails, kvfree() is actually required >> >> I guess kvfree() should be called just after shrinker_unit_alloc() >> fails but before calling into "goto err". > > We could do it like this, which avoids ambiguity (if someone ignores > that kvfree could handle NULL). Something like: > > --- a/mm/shrinker.c > +++ b/mm/shrinker.c > @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > goto err; > info->map_nr_max = shrinker_nr_max; > if (shrinker_unit_alloc(info, NULL, nid)) > - goto err; > + goto free; > rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); > } > mutex_unlock(&shrinker_mutex); > > return ret; > - > +free: > + kvfree(info); > err: > mutex_unlock(&shrinker_mutex); > free_shrinker_info(memcg); > > Thanks. > >> >> But curious, should not both kvzalloc_node()/kvfree() be avoided >> while inside mutex lock to avoid possible lockdep issues ? > How about: diff --git a/mm/shrinker.c b/mm/shrinker.c index dc5d2a6fcfc4..7baee7f00497 100644 --- a/mm/shrinker.c +++ b/mm/shrinker.c @@ -87,9 +87,9 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) if (!info) goto err; info->map_nr_max = shrinker_nr_max; + rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); if (shrinker_unit_alloc(info, NULL, nid)) goto err; - rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); } mutex_unlock(&shrinker_mutex); I think this is concise. Best regards, Ridong
> On Oct 14, 2024, at 17:04, chenridong <chenridong@huawei.com> wrote: > > > > On 2024/10/14 16:43, Muchun Song wrote: >>> On Oct 14, 2024, at 16:13, Anshuman Khandual <anshuman.khandual@arm.com> wrote: >>> >>> >>> >>> On 10/14/24 08:53, Chen Ridong wrote: >>>> From: Chen Ridong <chenridong@huawei.com> >>>> >>>> A memleak was found as bellow: >>>> >>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>> hex dump (first 32 bytes): >>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>> backtrace (crc 2e7ef6fa): >>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>> >>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>> err, the info won't be freed. Just fix it. >>>> >>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>> --- >>>> mm/shrinker.c | 1 + >>>> 1 file changed, 1 insertion(+) >>>> >>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>> index dc5d2a6fcfc4..92270413190d 100644 >>>> --- a/mm/shrinker.c >>>> +++ b/mm/shrinker.c >>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>> >>>> err: >>>> mutex_unlock(&shrinker_mutex); >>>> + kvfree(info); >>>> free_shrinker_info(memcg); >>>> return -ENOMEM; >>>> } >>> >>> There are two scenarios when "goto err:" gets called >>> >>> - When shrinker_info allocations fails, no kvfree() is required >>> - but after this change kvfree() would be called even >>> when the allocation had failed originally, which does >>> not sound right >> Yes. In this case, @info is NULL and kvfree could handle NULL. >> It seems strange but the final behaviour correct. >>> >>> - shrinker_unit_alloc() fails, kvfree() is actually required >>> >>> I guess kvfree() should be called just after shrinker_unit_alloc() >>> fails but before calling into "goto err". >> We could do it like this, which avoids ambiguity (if someone ignores >> that kvfree could handle NULL). Something like: >> --- a/mm/shrinker.c >> +++ b/mm/shrinker.c >> @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >> goto err; >> info->map_nr_max = shrinker_nr_max; >> if (shrinker_unit_alloc(info, NULL, nid)) >> - goto err; >> + goto free; >> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >> } >> mutex_unlock(&shrinker_mutex); >> return ret; >> - >> +free: >> + kvfree(info); >> err: >> mutex_unlock(&shrinker_mutex); >> free_shrinker_info(memcg); >> Thanks. >>> >>> But curious, should not both kvzalloc_node()/kvfree() be avoided >>> while inside mutex lock to avoid possible lockdep issues ? > How about: > > diff --git a/mm/shrinker.c b/mm/shrinker.c > index dc5d2a6fcfc4..7baee7f00497 100644 > --- a/mm/shrinker.c > +++ b/mm/shrinker.c > @@ -87,9 +87,9 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > if (!info) > goto err; > info->map_nr_max = shrinker_nr_max; > + rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); > if (shrinker_unit_alloc(info, NULL, nid)) > goto err; > - rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); > } > mutex_unlock(&shrinker_mutex); No. We should make sure the @info is fully initialized before others could see it. That's why rcu_assign_pointer is used here. > > I think this is concise. > > Best regards, > Ridong
On 2024/10/14 17:20, Muchun Song wrote: > > >> On Oct 14, 2024, at 17:04, chenridong <chenridong@huawei.com> wrote: >> >> >> >> On 2024/10/14 16:43, Muchun Song wrote: >>>> On Oct 14, 2024, at 16:13, Anshuman Khandual <anshuman.khandual@arm.com> wrote: >>>> >>>> >>>> >>>> On 10/14/24 08:53, Chen Ridong wrote: >>>>> From: Chen Ridong <chenridong@huawei.com> >>>>> >>>>> A memleak was found as bellow: >>>>> >>>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>>> hex dump (first 32 bytes): >>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>>> backtrace (crc 2e7ef6fa): >>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>>> >>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>>> err, the info won't be freed. Just fix it. >>>>> >>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>>> --- >>>>> mm/shrinker.c | 1 + >>>>> 1 file changed, 1 insertion(+) >>>>> >>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>> index dc5d2a6fcfc4..92270413190d 100644 >>>>> --- a/mm/shrinker.c >>>>> +++ b/mm/shrinker.c >>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>> >>>>> err: >>>>> mutex_unlock(&shrinker_mutex); >>>>> + kvfree(info); >>>>> free_shrinker_info(memcg); >>>>> return -ENOMEM; >>>>> } >>>> >>>> There are two scenarios when "goto err:" gets called >>>> >>>> - When shrinker_info allocations fails, no kvfree() is required >>>> - but after this change kvfree() would be called even >>>> when the allocation had failed originally, which does >>>> not sound right >>> Yes. In this case, @info is NULL and kvfree could handle NULL. >>> It seems strange but the final behaviour correct. >>>> >>>> - shrinker_unit_alloc() fails, kvfree() is actually required >>>> >>>> I guess kvfree() should be called just after shrinker_unit_alloc() >>>> fails but before calling into "goto err". >>> We could do it like this, which avoids ambiguity (if someone ignores >>> that kvfree could handle NULL). Something like: >>> --- a/mm/shrinker.c >>> +++ b/mm/shrinker.c >>> @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>> goto err; >>> info->map_nr_max = shrinker_nr_max; >>> if (shrinker_unit_alloc(info, NULL, nid)) >>> - goto err; >>> + goto free; >>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>> } >>> mutex_unlock(&shrinker_mutex); >>> return ret; >>> - >>> +free: >>> + kvfree(info); >>> err: >>> mutex_unlock(&shrinker_mutex); >>> free_shrinker_info(memcg); >>> Thanks. >>>> >>>> But curious, should not both kvzalloc_node()/kvfree() be avoided >>>> while inside mutex lock to avoid possible lockdep issues ? >> How about: >> >> diff --git a/mm/shrinker.c b/mm/shrinker.c >> index dc5d2a6fcfc4..7baee7f00497 100644 >> --- a/mm/shrinker.c >> +++ b/mm/shrinker.c >> @@ -87,9 +87,9 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >> if (!info) >> goto err; >> info->map_nr_max = shrinker_nr_max; >> + rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >> if (shrinker_unit_alloc(info, NULL, nid)) >> goto err; >> - rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >> } >> mutex_unlock(&shrinker_mutex); > > No. We should make sure the @info is fully initialized before others > could see it. That's why rcu_assign_pointer is used here. > Thank you, it seems that 'goto free' is a better choice. Will update. Thanks, Ridong >> >> I think this is concise. >> >> Best regards, >> Ridong >
On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: > From: Chen Ridong <chenridong@huawei.com> > > A memleak was found as bellow: > > unreferenced object 0xffff8881010d2a80 (size 32): > comm "mkdir", pid 1559, jiffies 4294932666 > hex dump (first 32 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... > backtrace (crc 2e7ef6fa): > [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 > [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 > [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 > [<ffffffff81198dd9>] online_css+0x29/0xa0 > [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 > [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 > [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 > [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 > [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 > [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 > [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 > [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > In the alloc_shrinker_info function, when shrinker_unit_alloc return > err, the info won't be freed. Just fix it. > > Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") > Signed-off-by: Chen Ridong <chenridong@huawei.com> > --- > mm/shrinker.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/shrinker.c b/mm/shrinker.c > index dc5d2a6fcfc4..92270413190d 100644 > --- a/mm/shrinker.c > +++ b/mm/shrinker.c > @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > > err: > mutex_unlock(&shrinker_mutex); > + kvfree(info); > free_shrinker_info(memcg); > return -ENOMEM; > } NAK. If in the future there going to one more error case after rcu_assign_pointer() we will end up with double free. This should be safer: diff --git a/mm/shrinker.c b/mm/shrinker.c index dc5d2a6fcfc4..763fd556bc7d 100644 --- a/mm/shrinker.c +++ b/mm/shrinker.c @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) if (!info) goto err; info->map_nr_max = shrinker_nr_max; - if (shrinker_unit_alloc(info, NULL, nid)) + if (shrinker_unit_alloc(info, NULL, nid)) { + kvfree(info); goto err; + } rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); } mutex_unlock(&shrinker_mutex);
On 2024/10/14 19:29, Kirill A. Shutemov wrote: > On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: >> From: Chen Ridong <chenridong@huawei.com> >> >> A memleak was found as bellow: >> >> unreferenced object 0xffff8881010d2a80 (size 32): >> comm "mkdir", pid 1559, jiffies 4294932666 >> hex dump (first 32 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >> backtrace (crc 2e7ef6fa): >> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >> [<ffffffff81198dd9>] online_css+0x29/0xa0 >> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >> >> In the alloc_shrinker_info function, when shrinker_unit_alloc return >> err, the info won't be freed. Just fix it. >> >> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >> Signed-off-by: Chen Ridong <chenridong@huawei.com> >> --- >> mm/shrinker.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/mm/shrinker.c b/mm/shrinker.c >> index dc5d2a6fcfc4..92270413190d 100644 >> --- a/mm/shrinker.c >> +++ b/mm/shrinker.c >> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >> >> err: >> mutex_unlock(&shrinker_mutex); >> + kvfree(info); >> free_shrinker_info(memcg); >> return -ENOMEM; >> } > > NAK. If in the future there going to one more error case after > rcu_assign_pointer() we will end up with double free. > > This should be safer: > > diff --git a/mm/shrinker.c b/mm/shrinker.c > index dc5d2a6fcfc4..763fd556bc7d 100644 > --- a/mm/shrinker.c > +++ b/mm/shrinker.c > @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > if (!info) > goto err; > info->map_nr_max = shrinker_nr_max; > - if (shrinker_unit_alloc(info, NULL, nid)) > + if (shrinker_unit_alloc(info, NULL, nid)) { > + kvfree(info); > goto err; > + } > rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); > } > mutex_unlock(&shrinker_mutex); This is what I sent in v1. Muchun thinks it's better to do the similar thing in the same way. Thanks, Ridong
On 10/14/24 16:59, Kirill A. Shutemov wrote: > On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: >> From: Chen Ridong <chenridong@huawei.com> >> >> A memleak was found as bellow: >> >> unreferenced object 0xffff8881010d2a80 (size 32): >> comm "mkdir", pid 1559, jiffies 4294932666 >> hex dump (first 32 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >> backtrace (crc 2e7ef6fa): >> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >> [<ffffffff81198dd9>] online_css+0x29/0xa0 >> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >> >> In the alloc_shrinker_info function, when shrinker_unit_alloc return >> err, the info won't be freed. Just fix it. >> >> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >> Signed-off-by: Chen Ridong <chenridong@huawei.com> >> --- >> mm/shrinker.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/mm/shrinker.c b/mm/shrinker.c >> index dc5d2a6fcfc4..92270413190d 100644 >> --- a/mm/shrinker.c >> +++ b/mm/shrinker.c >> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >> >> err: >> mutex_unlock(&shrinker_mutex); >> + kvfree(info); >> free_shrinker_info(memcg); >> return -ENOMEM; >> } > > NAK. If in the future there going to one more error case after > rcu_assign_pointer() we will end up with double free. > > This should be safer: > > diff --git a/mm/shrinker.c b/mm/shrinker.c > index dc5d2a6fcfc4..763fd556bc7d 100644 > --- a/mm/shrinker.c > +++ b/mm/shrinker.c > @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > if (!info) > goto err; > info->map_nr_max = shrinker_nr_max; > - if (shrinker_unit_alloc(info, NULL, nid)) > + if (shrinker_unit_alloc(info, NULL, nid)) { > + kvfree(info); > goto err; > + } > rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); > } > mutex_unlock(&shrinker_mutex); Agreed, this is what I mentioned earlier as well. ------------------------------------------------------------------ I guess kvfree() should be called just after shrinker_unit_alloc() fails but before calling into "goto err" ------------------------------------------------------------------
On 2024/10/15 14:55, Anshuman Khandual wrote: > > > On 10/14/24 16:59, Kirill A. Shutemov wrote: >> On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: >>> From: Chen Ridong <chenridong@huawei.com> >>> >>> A memleak was found as bellow: >>> >>> unreferenced object 0xffff8881010d2a80 (size 32): >>> comm "mkdir", pid 1559, jiffies 4294932666 >>> hex dump (first 32 bytes): >>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>> backtrace (crc 2e7ef6fa): >>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>> >>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>> err, the info won't be freed. Just fix it. >>> >>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>> --- >>> mm/shrinker.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>> index dc5d2a6fcfc4..92270413190d 100644 >>> --- a/mm/shrinker.c >>> +++ b/mm/shrinker.c >>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>> >>> err: >>> mutex_unlock(&shrinker_mutex); >>> + kvfree(info); >>> free_shrinker_info(memcg); >>> return -ENOMEM; >>> } >> >> NAK. If in the future there going to one more error case after >> rcu_assign_pointer() we will end up with double free. >> >> This should be safer: >> >> diff --git a/mm/shrinker.c b/mm/shrinker.c >> index dc5d2a6fcfc4..763fd556bc7d 100644 >> --- a/mm/shrinker.c >> +++ b/mm/shrinker.c >> @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >> if (!info) >> goto err; >> info->map_nr_max = shrinker_nr_max; >> - if (shrinker_unit_alloc(info, NULL, nid)) >> + if (shrinker_unit_alloc(info, NULL, nid)) { >> + kvfree(info); >> goto err; >> + } >> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >> } >> mutex_unlock(&shrinker_mutex); > > Agreed, this is what I mentioned earlier as well. > > ------------------------------------------------------------------ > I guess kvfree() should be called just after shrinker_unit_alloc() > fails but before calling into "goto err" > ------------------------------------------------------------------ > After discussion, it seems that v1 is acceptable. Hi, Muchun, do you have any other opinions? Best regards, Ridong
> On Oct 16, 2024, at 09:25, chenridong <chenridong@huawei.com> wrote: > > > > On 2024/10/15 14:55, Anshuman Khandual wrote: >> On 10/14/24 16:59, Kirill A. Shutemov wrote: >>> On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: >>>> From: Chen Ridong <chenridong@huawei.com> >>>> >>>> A memleak was found as bellow: >>>> >>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>> hex dump (first 32 bytes): >>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>> backtrace (crc 2e7ef6fa): >>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>> >>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>> err, the info won't be freed. Just fix it. >>>> >>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>> --- >>>> mm/shrinker.c | 1 + >>>> 1 file changed, 1 insertion(+) >>>> >>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>> index dc5d2a6fcfc4..92270413190d 100644 >>>> --- a/mm/shrinker.c >>>> +++ b/mm/shrinker.c >>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>> err: >>>> mutex_unlock(&shrinker_mutex); >>>> + kvfree(info); >>>> free_shrinker_info(memcg); >>>> return -ENOMEM; >>>> } >>> >>> NAK. If in the future there going to one more error case after >>> rcu_assign_pointer() we will end up with double free. >>> >>> This should be safer: >>> >>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>> index dc5d2a6fcfc4..763fd556bc7d 100644 >>> --- a/mm/shrinker.c >>> +++ b/mm/shrinker.c >>> @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>> if (!info) >>> goto err; >>> info->map_nr_max = shrinker_nr_max; >>> - if (shrinker_unit_alloc(info, NULL, nid)) >>> + if (shrinker_unit_alloc(info, NULL, nid)) { >>> + kvfree(info); >>> goto err; >>> + } >>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>> } >>> mutex_unlock(&shrinker_mutex); >> Agreed, this is what I mentioned earlier as well. >> ------------------------------------------------------------------ >> I guess kvfree() should be called just after shrinker_unit_alloc() >> fails but before calling into "goto err" >> ------------------------------------------------------------------ > > After discussion, it seems that v1 is acceptable. > Hi, Muchun, do you have any other opinions? I insist on my opinion, not mixing two different approaches to do release resources. Thanks. > > Best regards, > Ridong
On Wed, Oct 16, 2024 at 10:21:30AM +0800, Muchun Song wrote: > > > > On Oct 16, 2024, at 09:25, chenridong <chenridong@huawei.com> wrote: > > > > > > > > On 2024/10/15 14:55, Anshuman Khandual wrote: > >> On 10/14/24 16:59, Kirill A. Shutemov wrote: > >>> On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: > >>>> From: Chen Ridong <chenridong@huawei.com> > >>>> > >>>> A memleak was found as bellow: > >>>> > >>>> unreferenced object 0xffff8881010d2a80 (size 32): > >>>> comm "mkdir", pid 1559, jiffies 4294932666 > >>>> hex dump (first 32 bytes): > >>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > >>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... > >>>> backtrace (crc 2e7ef6fa): > >>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 > >>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 > >>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 > >>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 > >>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 > >>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 > >>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 > >>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 > >>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 > >>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 > >>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 > >>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e > >>>> > >>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return > >>>> err, the info won't be freed. Just fix it. > >>>> > >>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") > >>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> > >>>> --- > >>>> mm/shrinker.c | 1 + > >>>> 1 file changed, 1 insertion(+) > >>>> > >>>> diff --git a/mm/shrinker.c b/mm/shrinker.c > >>>> index dc5d2a6fcfc4..92270413190d 100644 > >>>> --- a/mm/shrinker.c > >>>> +++ b/mm/shrinker.c > >>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > >>>> err: > >>>> mutex_unlock(&shrinker_mutex); > >>>> + kvfree(info); > >>>> free_shrinker_info(memcg); > >>>> return -ENOMEM; > >>>> } > >>> > >>> NAK. If in the future there going to one more error case after > >>> rcu_assign_pointer() we will end up with double free. > >>> > >>> This should be safer: > >>> > >>> diff --git a/mm/shrinker.c b/mm/shrinker.c > >>> index dc5d2a6fcfc4..763fd556bc7d 100644 > >>> --- a/mm/shrinker.c > >>> +++ b/mm/shrinker.c > >>> @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > >>> if (!info) > >>> goto err; > >>> info->map_nr_max = shrinker_nr_max; > >>> - if (shrinker_unit_alloc(info, NULL, nid)) > >>> + if (shrinker_unit_alloc(info, NULL, nid)) { > >>> + kvfree(info); > >>> goto err; > >>> + } > >>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); > >>> } > >>> mutex_unlock(&shrinker_mutex); > >> Agreed, this is what I mentioned earlier as well. > >> ------------------------------------------------------------------ > >> I guess kvfree() should be called just after shrinker_unit_alloc() > >> fails but before calling into "goto err" > >> ------------------------------------------------------------------ > > > > After discussion, it seems that v1 is acceptable. > > Hi, Muchun, do you have any other opinions? > > I insist on my opinion, not mixing two different approaches > to do release resources. It makes no sense. This kvfree() is specifically to handle the case when 'info' is allocated, but not yet assigned to ->shrinker_info. And 'err:' block handles all other error cases. Putting kvfree() in 'err:' section is double-free timebomb.
On 10/16/24 04:21, Muchun Song wrote: > > >> On Oct 16, 2024, at 09:25, chenridong <chenridong@huawei.com> wrote: >> >> >> >> On 2024/10/15 14:55, Anshuman Khandual wrote: >>> On 10/14/24 16:59, Kirill A. Shutemov wrote: >>>> On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: >>>>> From: Chen Ridong <chenridong@huawei.com> >>>>> >>>>> A memleak was found as bellow: >>>>> >>>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>>> hex dump (first 32 bytes): >>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>>> backtrace (crc 2e7ef6fa): >>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>>> >>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>>> err, the info won't be freed. Just fix it. >>>>> >>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>>> --- >>>>> mm/shrinker.c | 1 + >>>>> 1 file changed, 1 insertion(+) >>>>> >>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>> index dc5d2a6fcfc4..92270413190d 100644 >>>>> --- a/mm/shrinker.c >>>>> +++ b/mm/shrinker.c >>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>> err: >>>>> mutex_unlock(&shrinker_mutex); >>>>> + kvfree(info); >>>>> free_shrinker_info(memcg); >>>>> return -ENOMEM; >>>>> } >>>> >>>> NAK. If in the future there going to one more error case after >>>> rcu_assign_pointer() we will end up with double free. >>>> >>>> This should be safer: >>>> >>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>> index dc5d2a6fcfc4..763fd556bc7d 100644 >>>> --- a/mm/shrinker.c >>>> +++ b/mm/shrinker.c >>>> @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>> if (!info) >>>> goto err; >>>> info->map_nr_max = shrinker_nr_max; >>>> - if (shrinker_unit_alloc(info, NULL, nid)) >>>> + if (shrinker_unit_alloc(info, NULL, nid)) { >>>> + kvfree(info); >>>> goto err; >>>> + } >>>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>>> } >>>> mutex_unlock(&shrinker_mutex); >>> Agreed, this is what I mentioned earlier as well. >>> ------------------------------------------------------------------ >>> I guess kvfree() should be called just after shrinker_unit_alloc() >>> fails but before calling into "goto err" >>> ------------------------------------------------------------------ >> >> After discussion, it seems that v1 is acceptable. >> Hi, Muchun, do you have any other opinions? > > I insist on my opinion, not mixing two different approaches > to do release resources. So instead we mix the cleanup of the whole function with the cleanup of what is effectively a per-iteration temporary variable? The fact there was already a confusion in this thread about whether it's safe and relies on kvfree(NULL) to be a no-op, should be a hint. So no, I a gree with Kirill and others. Ideally the fix would also move the declaration of info into the for loop to make its scope more obvious. > Thanks. > >> >> Best regards, >> Ridong > >
On 10/14/24 11:20, Muchun Song wrote: > > >> On Oct 14, 2024, at 17:04, chenridong <chenridong@huawei.com> wrote: >> >> >> >> On 2024/10/14 16:43, Muchun Song wrote: >>>> On Oct 14, 2024, at 16:13, Anshuman Khandual <anshuman.khandual@arm.com> wrote: >>>> >>>> >>>> >>>> On 10/14/24 08:53, Chen Ridong wrote: >>>>> From: Chen Ridong <chenridong@huawei.com> >>>>> >>>>> A memleak was found as bellow: >>>>> >>>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>>> hex dump (first 32 bytes): >>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>>> backtrace (crc 2e7ef6fa): >>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>>> >>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>>> err, the info won't be freed. Just fix it. >>>>> >>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>>> --- >>>>> mm/shrinker.c | 1 + >>>>> 1 file changed, 1 insertion(+) >>>>> >>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>> index dc5d2a6fcfc4..92270413190d 100644 >>>>> --- a/mm/shrinker.c >>>>> +++ b/mm/shrinker.c >>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>> >>>>> err: >>>>> mutex_unlock(&shrinker_mutex); >>>>> + kvfree(info); >>>>> free_shrinker_info(memcg); >>>>> return -ENOMEM; >>>>> } >>>> >>>> There are two scenarios when "goto err:" gets called >>>> >>>> - When shrinker_info allocations fails, no kvfree() is required >>>> - but after this change kvfree() would be called even >>>> when the allocation had failed originally, which does >>>> not sound right >>> Yes. In this case, @info is NULL and kvfree could handle NULL. >>> It seems strange but the final behaviour correct. >>>> >>>> - shrinker_unit_alloc() fails, kvfree() is actually required >>>> >>>> I guess kvfree() should be called just after shrinker_unit_alloc() >>>> fails but before calling into "goto err". >>> We could do it like this, which avoids ambiguity (if someone ignores >>> that kvfree could handle NULL). Something like: >>> --- a/mm/shrinker.c >>> +++ b/mm/shrinker.c >>> @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>> goto err; >>> info->map_nr_max = shrinker_nr_max; >>> if (shrinker_unit_alloc(info, NULL, nid)) >>> - goto err; >>> + goto free; >>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>> } >>> mutex_unlock(&shrinker_mutex); >>> return ret; >>> - >>> +free: >>> + kvfree(info); >>> err: >>> mutex_unlock(&shrinker_mutex); >>> free_shrinker_info(memcg); >>> Thanks. >>>> >>>> But curious, should not both kvzalloc_node()/kvfree() be avoided >>>> while inside mutex lock to avoid possible lockdep issues ? >> How about: >> >> diff --git a/mm/shrinker.c b/mm/shrinker.c >> index dc5d2a6fcfc4..7baee7f00497 100644 >> --- a/mm/shrinker.c >> +++ b/mm/shrinker.c >> @@ -87,9 +87,9 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >> if (!info) >> goto err; >> info->map_nr_max = shrinker_nr_max; >> + rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >> if (shrinker_unit_alloc(info, NULL, nid)) >> goto err; >> - rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >> } >> mutex_unlock(&shrinker_mutex); > > No. We should make sure the @info is fully initialized before others > could see it. That's why rcu_assign_pointer is used here. If the info is immediately visible, is the failure cleanup free_shrinker_info() safe? It uses kvfree(info) and not kvfree_rcu(), and shrinker_unit_free() is also doing kfree(). >> >> I think this is concise. >> >> Best regards, >> Ridong > >
> On Oct 16, 2024, at 18:17, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > On Wed, Oct 16, 2024 at 10:21:30AM +0800, Muchun Song wrote: >> >> >>>> On Oct 16, 2024, at 09:25, chenridong <chenridong@huawei.com> wrote: >>> >>> >>> >>> On 2024/10/15 14:55, Anshuman Khandual wrote: >>>> On 10/14/24 16:59, Kirill A. Shutemov wrote: >>>>> On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: >>>>>> From: Chen Ridong <chenridong@huawei.com> >>>>>> >>>>>> A memleak was found as bellow: >>>>>> >>>>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>>>> hex dump (first 32 bytes): >>>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>>>> backtrace (crc 2e7ef6fa): >>>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>>>> >>>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>>>> err, the info won't be freed. Just fix it. >>>>>> >>>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>>>> --- >>>>>> mm/shrinker.c | 1 + >>>>>> 1 file changed, 1 insertion(+) >>>>>> >>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>>> index dc5d2a6fcfc4..92270413190d 100644 >>>>>> --- a/mm/shrinker.c >>>>>> +++ b/mm/shrinker.c >>>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>>> err: >>>>>> mutex_unlock(&shrinker_mutex); >>>>>> + kvfree(info); >>>>>> free_shrinker_info(memcg); >>>>>> return -ENOMEM; >>>>>> } >>>>> >>>>> NAK. If in the future there going to one more error case after >>>>> rcu_assign_pointer() we will end up with double free. >>>>> >>>>> This should be safer: >>>>> >>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>> index dc5d2a6fcfc4..763fd556bc7d 100644 >>>>> --- a/mm/shrinker.c >>>>> +++ b/mm/shrinker.c >>>>> @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>> if (!info) >>>>> goto err; >>>>> info->map_nr_max = shrinker_nr_max; >>>>> - if (shrinker_unit_alloc(info, NULL, nid)) >>>>> + if (shrinker_unit_alloc(info, NULL, nid)) { >>>>> + kvfree(info); >>>>> goto err; >>>>> + } >>>>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>>>> } >>>>> mutex_unlock(&shrinker_mutex); >>>> Agreed, this is what I mentioned earlier as well. >>>> ------------------------------------------------------------------ >>>> I guess kvfree() should be called just after shrinker_unit_alloc() >>>> fails but before calling into "goto err" >>>> ------------------------------------------------------------------ >>> >>> After discussion, it seems that v1 is acceptable. >>> Hi, Muchun, do you have any other opinions? >> >> I insist on my opinion, not mixing two different approaches >> to do release resources. > > It makes no sense. > > This kvfree() is specifically to handle the case when 'info' is allocated, > but not yet assigned to ->shrinker_info. And 'err:' block handles all > other error cases. Putting kvfree() in 'err:' section is double-free > timebomb. Please carefully read my previous reply. I don’t think there is any double-free. FYI, I pasted my previous suggested fix here. Thanks. --- a/mm/shrinker.c +++ b/mm/shrinker.c @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) goto err; info->map_nr_max = shrinker_nr_max; if (shrinker_unit_alloc(info, NULL, nid)) - goto err; + goto free; rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); } mutex_unlock(&shrinker_mutex); return ret; - +free: + kvfree(info); err: mutex_unlock(&shrinker_mutex); free_shrinker_info(memcg); > > -- > Kiryl Shutsemau / Kirill A. Shutemov
> On Oct 16, 2024, at 19:43, Vlastimil Babka <vbabka@suse.cz> wrote: > On 10/16/24 04:21, Muchun Song wrote: >> >> >>> On Oct 16, 2024, at 09:25, chenridong <chenridong@huawei.com> wrote: >>> On 2024/10/15 14:55, Anshuman Khandual wrote: >>>> On 10/14/24 16:59, Kirill A. Shutemov wrote: >>>>> On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: >>>>>> From: Chen Ridong <chenridong@huawei.com> >>>>>> A memleak was found as bellow: >>>>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>>>> hex dump (first 32 bytes): >>>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>>>> backtrace (crc 2e7ef6fa): >>>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>>>> err, the info won't be freed. Just fix it. >>>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>>>> --- >>>>>> mm/shrinker.c | 1 + >>>>>> 1 file changed, 1 insertion(+) >>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>>> index dc5d2a6fcfc4..92270413190d 100644 >>>>>> --- a/mm/shrinker.c >>>>>> +++ b/mm/shrinker.c >>>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>>> err: >>>>>> mutex_unlock(&shrinker_mutex); >>>>>> + kvfree(info); >>>>>> free_shrinker_info(memcg); >>>>>> return -ENOMEM; >>>>>> } >>>>> NAK. If in the future there going to one more error case after >>>>> rcu_assign_pointer() we will end up with double free. >>>>> This should be safer: >>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>> index dc5d2a6fcfc4..763fd556bc7d 100644 >>>>> --- a/mm/shrinker.c >>>>> +++ b/mm/shrinker.c >>>>> @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>> if (!info) >>>>> goto err; >>>>> info->map_nr_max = shrinker_nr_max; >>>>> - if (shrinker_unit_alloc(info, NULL, nid)) >>>>> + if (shrinker_unit_alloc(info, NULL, nid)) { >>>>> + kvfree(info); >>>>> goto err; >>>>> + } >>>>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>>>> } >>>>> mutex_unlock(&shrinker_mutex); >>>> Agreed, this is what I mentioned earlier as well. >>>> ------------------------------------------------------------------ >>>> I guess kvfree() should be called just after shrinker_unit_alloc() >>>> fails but before calling into "goto err" >>>> ------------------------------------------------------------------ >>> After discussion, it seems that v1 is acceptable. >>> Hi, Muchun, do you have any other opinions? >> >> I insist on my opinion, not mixing two different approaches >> to do release resources. > > So instead we mix the cleanup of the whole function with the cleanup of what > is effectively a per-iteration temporary variable? > > The fact there was already a confusion in this thread about whether it's > safe and relies on kvfree(NULL) to be a no-op, should be a hint. Yes. I think someone is confused about my opinion. I don’t care about whether we should apply this hit. If we think the hint is tricky, we could add another label to fix it like I suggested previously. Because we already use goto-based approaches to cleanup the resources, why not keeping consistent? It will be easier for us to add a new "if" statement and handle the failure case in the future. For example, if we use his v1 proposal, we should do the cleanups again for info. But for goto-based version, we just add another label to do the cleanups and go to the new label for failure case. goto-based fix is what I insisted on. I copied my previous suggested fix here to clarify my opinion. --- a/mm/shrinker.c +++ b/mm/shrinker.c @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) goto err; info->map_nr_max = shrinker_nr_max; if (shrinker_unit_alloc(info, NULL, nid)) - goto err; + goto free; rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); } mutex_unlock(&shrinker_mutex); return ret; - +free: + kvfree(info); err: mutex_unlock(&shrinker_mutex); free_shrinker_info(memcg); Muchun, Thanks. > > So no, I a gree with Kirill and others. Ideally the fix would also move the > declaration of info into the for loop to make its scope more obvious. > >> Thanks. >> >>> Best regards, >>> Ridong
> On Oct 16, 2024, at 20:13, Vlastimil Babka <vbabka@suse.cz> wrote: > > On 10/14/24 11:20, Muchun Song wrote: >> >> >>>> On Oct 14, 2024, at 17:04, chenridong <chenridong@huawei.com> wrote: >>> >>> >>> >>> On 2024/10/14 16:43, Muchun Song wrote: >>>>> On Oct 14, 2024, at 16:13, Anshuman Khandual <anshuman.khandual@arm.com> wrote: >>>>> >>>>> >>>>> >>>>> On 10/14/24 08:53, Chen Ridong wrote: >>>>>> From: Chen Ridong <chenridong@huawei.com> >>>>>> >>>>>> A memleak was found as bellow: >>>>>> >>>>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>>>> hex dump (first 32 bytes): >>>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>>>> backtrace (crc 2e7ef6fa): >>>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>>>> >>>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>>>> err, the info won't be freed. Just fix it. >>>>>> >>>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>>>> --- >>>>>> mm/shrinker.c | 1 + >>>>>> 1 file changed, 1 insertion(+) >>>>>> >>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>>> index dc5d2a6fcfc4..92270413190d 100644 >>>>>> --- a/mm/shrinker.c >>>>>> +++ b/mm/shrinker.c >>>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>>> >>>>>> err: >>>>>> mutex_unlock(&shrinker_mutex); >>>>>> + kvfree(info); >>>>>> free_shrinker_info(memcg); >>>>>> return -ENOMEM; >>>>>> } >>>>> >>>>> There are two scenarios when "goto err:" gets called >>>>> >>>>> - When shrinker_info allocations fails, no kvfree() is required >>>>> - but after this change kvfree() would be called even >>>>> when the allocation had failed originally, which does >>>>> not sound right >>>> Yes. In this case, @info is NULL and kvfree could handle NULL. >>>> It seems strange but the final behaviour correct. >>>>> >>>>> - shrinker_unit_alloc() fails, kvfree() is actually required >>>>> >>>>> I guess kvfree() should be called just after shrinker_unit_alloc() >>>>> fails but before calling into "goto err". >>>> We could do it like this, which avoids ambiguity (if someone ignores >>>> that kvfree could handle NULL). Something like: >>>> --- a/mm/shrinker.c >>>> +++ b/mm/shrinker.c >>>> @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>> goto err; >>>> info->map_nr_max = shrinker_nr_max; >>>> if (shrinker_unit_alloc(info, NULL, nid)) >>>> - goto err; >>>> + goto free; >>>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>>> } >>>> mutex_unlock(&shrinker_mutex); >>>> return ret; >>>> - >>>> +free: >>>> + kvfree(info); >>>> err: >>>> mutex_unlock(&shrinker_mutex); >>>> free_shrinker_info(memcg); >>>> Thanks. >>>>> >>>>> But curious, should not both kvzalloc_node()/kvfree() be avoided >>>>> while inside mutex lock to avoid possible lockdep issues ? >>> How about: >>> >>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>> index dc5d2a6fcfc4..7baee7f00497 100644 >>> --- a/mm/shrinker.c >>> +++ b/mm/shrinker.c >>> @@ -87,9 +87,9 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>> if (!info) >>> goto err; >>> info->map_nr_max = shrinker_nr_max; >>> + rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>> if (shrinker_unit_alloc(info, NULL, nid)) >>> goto err; >>> - rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>> } >>> mutex_unlock(&shrinker_mutex); >> >> No. We should make sure the @info is fully initialized before others >> could see it. That's why rcu_assign_pointer is used here. > > If the info is immediately visible, is the failure cleanup > free_shrinker_info() safe? It uses kvfree(info) and not kvfree_rcu(), and > shrinker_unit_free() is also doing kfree(). Qi told me that the @info will not visible immediately yesterday. So non-rcu-based kvfree is safe. Even if this fix could work properly, it’s a bit strange for me to use rcu_assign_pointer to assign the @info without full initialization to it. Muchun, Thanks. > >>> >>> I think this is concise. >>> >>> Best regards, >>> Ridong >> >> >
On 10/16/24 16:08, Muchun Song wrote: > > >> On Oct 16, 2024, at 19:43, Vlastimil Babka <vbabka@suse.cz> wrote: >> On 10/16/24 04:21, Muchun Song wrote: >>> >>> >>>> On Oct 16, 2024, at 09:25, chenridong <chenridong@huawei.com> wrote: >>>> On 2024/10/15 14:55, Anshuman Khandual wrote: >>>>> On 10/14/24 16:59, Kirill A. Shutemov wrote: >>>>>> On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: >>>>>>> From: Chen Ridong <chenridong@huawei.com> >>>>>>> A memleak was found as bellow: >>>>>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>>>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>>>>> hex dump (first 32 bytes): >>>>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>>>>> backtrace (crc 2e7ef6fa): >>>>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>>>>> err, the info won't be freed. Just fix it. >>>>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>>>>> --- >>>>>>> mm/shrinker.c | 1 + >>>>>>> 1 file changed, 1 insertion(+) >>>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>>>> index dc5d2a6fcfc4..92270413190d 100644 >>>>>>> --- a/mm/shrinker.c >>>>>>> +++ b/mm/shrinker.c >>>>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>>>> err: >>>>>>> mutex_unlock(&shrinker_mutex); >>>>>>> + kvfree(info); >>>>>>> free_shrinker_info(memcg); >>>>>>> return -ENOMEM; >>>>>>> } >>>>>> NAK. If in the future there going to one more error case after >>>>>> rcu_assign_pointer() we will end up with double free. >>>>>> This should be safer: >>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>>> index dc5d2a6fcfc4..763fd556bc7d 100644 >>>>>> --- a/mm/shrinker.c >>>>>> +++ b/mm/shrinker.c >>>>>> @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>>> if (!info) >>>>>> goto err; >>>>>> info->map_nr_max = shrinker_nr_max; >>>>>> - if (shrinker_unit_alloc(info, NULL, nid)) >>>>>> + if (shrinker_unit_alloc(info, NULL, nid)) { >>>>>> + kvfree(info); >>>>>> goto err; >>>>>> + } >>>>>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>>>>> } >>>>>> mutex_unlock(&shrinker_mutex); >>>>> Agreed, this is what I mentioned earlier as well. >>>>> ------------------------------------------------------------------ >>>>> I guess kvfree() should be called just after shrinker_unit_alloc() >>>>> fails but before calling into "goto err" >>>>> ------------------------------------------------------------------ >>>> After discussion, it seems that v1 is acceptable. >>>> Hi, Muchun, do you have any other opinions? >>> >>> I insist on my opinion, not mixing two different approaches >>> to do release resources. >> >> So instead we mix the cleanup of the whole function with the cleanup of what >> is effectively a per-iteration temporary variable? >> >> The fact there was already a confusion in this thread about whether it's >> safe and relies on kvfree(NULL) to be a no-op, should be a hint. > > Yes. I think someone is confused about my opinion. > I don’t care about whether we should apply this hit. > If we think the hint is tricky, we could add another > label to fix it like I suggested previously. Because > we already use goto-based approaches to > cleanup the resources, why not keeping > consistent? I think we're rather pragmatic than striving to be consistent for the sake of consistency. goto is not the nicest thing in the world, but we (unlike other projects) use it where it makes sense to avoid if/else nesting explosion. Here for the info it's not the most pragmatic option. > It will be easier for us to add a new > "if" statement and handle the failure case in the future. Let's not overengineer things for hypothetical future. > For example, if we use his v1 proposal, we should do > the cleanups again for info. But for goto-based > version, we just add another label to do the > cleanups and go to the new label for failure case. goto-based fix is what I insisted on. I copied my previous suggested fix here to clarify my opinion. Again, info is a loop-iteration-local variable, v1 fix making it truly local is the way to go. If there are further cleanups added in the loop itself in the future, they could hopefully keep being local to the loop as well. Cleanup of info outside the loop iteration is breaking its real scope. > > --- a/mm/shrinker.c > +++ b/mm/shrinker.c > @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > goto err; > info->map_nr_max = shrinker_nr_max; > if (shrinker_unit_alloc(info, NULL, nid)) > - goto err; > + goto free; > rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); > } > mutex_unlock(&shrinker_mutex); > > return ret; > - > +free: > + kvfree(info); > err: > mutex_unlock(&shrinker_mutex); > free_shrinker_info(memcg); > > Muchun, > Thanks. > >> >> So no, I a gree with Kirill and others. Ideally the fix would also move the >> declaration of info into the for loop to make its scope more obvious. >> >>> Thanks. >>> >>>> Best regards, >>>> Ridong
On Wed, Oct 16, 2024 at 07:02:23PM +0200, Vlastimil Babka wrote: > On 10/16/24 16:08, Muchun Song wrote: > > > > > >> On Oct 16, 2024, at 19:43, Vlastimil Babka <vbabka@suse.cz> wrote: > >> On 10/16/24 04:21, Muchun Song wrote: > >>> > >>> > >>>> On Oct 16, 2024, at 09:25, chenridong <chenridong@huawei.com> wrote: > >>>> On 2024/10/15 14:55, Anshuman Khandual wrote: > >>>>> On 10/14/24 16:59, Kirill A. Shutemov wrote: > >>>>>> On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: > >>>>>>> From: Chen Ridong <chenridong@huawei.com> > >>>>>>> A memleak was found as bellow: > >>>>>>> unreferenced object 0xffff8881010d2a80 (size 32): > >>>>>>> comm "mkdir", pid 1559, jiffies 4294932666 > >>>>>>> hex dump (first 32 bytes): > >>>>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > >>>>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... > >>>>>>> backtrace (crc 2e7ef6fa): > >>>>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 > >>>>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 > >>>>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 > >>>>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 > >>>>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 > >>>>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 > >>>>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 > >>>>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 > >>>>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 > >>>>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 > >>>>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 > >>>>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e > >>>>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return > >>>>>>> err, the info won't be freed. Just fix it. > >>>>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") > >>>>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> > >>>>>>> --- > >>>>>>> mm/shrinker.c | 1 + > >>>>>>> 1 file changed, 1 insertion(+) > >>>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c > >>>>>>> index dc5d2a6fcfc4..92270413190d 100644 > >>>>>>> --- a/mm/shrinker.c > >>>>>>> +++ b/mm/shrinker.c > >>>>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > >>>>>>> err: > >>>>>>> mutex_unlock(&shrinker_mutex); > >>>>>>> + kvfree(info); > >>>>>>> free_shrinker_info(memcg); > >>>>>>> return -ENOMEM; > >>>>>>> } > >>>>>> NAK. If in the future there going to one more error case after > >>>>>> rcu_assign_pointer() we will end up with double free. > >>>>>> This should be safer: > >>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c > >>>>>> index dc5d2a6fcfc4..763fd556bc7d 100644 > >>>>>> --- a/mm/shrinker.c > >>>>>> +++ b/mm/shrinker.c > >>>>>> @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) > >>>>>> if (!info) > >>>>>> goto err; > >>>>>> info->map_nr_max = shrinker_nr_max; > >>>>>> - if (shrinker_unit_alloc(info, NULL, nid)) > >>>>>> + if (shrinker_unit_alloc(info, NULL, nid)) { > >>>>>> + kvfree(info); > >>>>>> goto err; > >>>>>> + } > >>>>>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); > >>>>>> } > >>>>>> mutex_unlock(&shrinker_mutex); > >>>>> Agreed, this is what I mentioned earlier as well. > >>>>> ------------------------------------------------------------------ > >>>>> I guess kvfree() should be called just after shrinker_unit_alloc() > >>>>> fails but before calling into "goto err" > >>>>> ------------------------------------------------------------------ > >>>> After discussion, it seems that v1 is acceptable. > >>>> Hi, Muchun, do you have any other opinions? > >>> > >>> I insist on my opinion, not mixing two different approaches > >>> to do release resources. > >> > >> So instead we mix the cleanup of the whole function with the cleanup of what > >> is effectively a per-iteration temporary variable? > >> > >> The fact there was already a confusion in this thread about whether it's > >> safe and relies on kvfree(NULL) to be a no-op, should be a hint. > > > > Yes. I think someone is confused about my opinion. > > I don’t care about whether we should apply this hit. > > If we think the hint is tricky, we could add another > > label to fix it like I suggested previously. Because > > we already use goto-based approaches to > > cleanup the resources, why not keeping > > consistent? > > I think we're rather pragmatic than striving to be consistent for the sake > of consistency. goto is not the nicest thing in the world, but we (unlike > other projects) use it where it makes sense to avoid if/else nesting > explosion. Here for the info it's not the most pragmatic option. > > > It will be easier for us to add a new > > "if" statement and handle the failure case in the future. > > Let's not overengineer things for hypothetical future. > > > For example, if we use his v1 proposal, we should do > > the cleanups again for info. But for goto-based > > version, we just add another label to do the > > cleanups and go to the new label for failure case. goto-based fix is what I insisted on. I copied my previous suggested fix here to clarify my opinion. > > Again, info is a loop-iteration-local variable, v1 fix making it truly local > is the way to go. If there are further cleanups added in the loop itself in > the future, they could hopefully keep being local to the loop as well. > Cleanup of info outside the loop iteration is breaking its real scope. +1 to that. I don't think it's such a big deal and both versions are ok, but I strongly prefer the original version (without introduction of a new label). Please, feel free to use Acked-by: Roman Gushchin <roman.gushchin@linux.dev> with the original version. Thanks!
On 2024/10/16 22:22, Muchun Song wrote: > > >> On Oct 16, 2024, at 20:13, Vlastimil Babka <vbabka@suse.cz> wrote: >> >> On 10/14/24 11:20, Muchun Song wrote: >>> >>> >>>>> On Oct 14, 2024, at 17:04, chenridong <chenridong@huawei.com> wrote: >>>> >>>> >>>> >>>> On 2024/10/14 16:43, Muchun Song wrote: >>>>>> On Oct 14, 2024, at 16:13, Anshuman Khandual <anshuman.khandual@arm.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 10/14/24 08:53, Chen Ridong wrote: >>>>>>> From: Chen Ridong <chenridong@huawei.com> >>>>>>> >>>>>>> A memleak was found as bellow: >>>>>>> >>>>>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>>>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>>>>> hex dump (first 32 bytes): >>>>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>>>>> backtrace (crc 2e7ef6fa): >>>>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>>>>> >>>>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>>>>> err, the info won't be freed. Just fix it. >>>>>>> >>>>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>>>>> --- >>>>>>> mm/shrinker.c | 1 + >>>>>>> 1 file changed, 1 insertion(+) >>>>>>> >>>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>>>> index dc5d2a6fcfc4..92270413190d 100644 >>>>>>> --- a/mm/shrinker.c >>>>>>> +++ b/mm/shrinker.c >>>>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>>>> >>>>>>> err: >>>>>>> mutex_unlock(&shrinker_mutex); >>>>>>> + kvfree(info); >>>>>>> free_shrinker_info(memcg); >>>>>>> return -ENOMEM; >>>>>>> } >>>>>> >>>>>> There are two scenarios when "goto err:" gets called >>>>>> >>>>>> - When shrinker_info allocations fails, no kvfree() is required >>>>>> - but after this change kvfree() would be called even >>>>>> when the allocation had failed originally, which does >>>>>> not sound right >>>>> Yes. In this case, @info is NULL and kvfree could handle NULL. >>>>> It seems strange but the final behaviour correct. >>>>>> >>>>>> - shrinker_unit_alloc() fails, kvfree() is actually required >>>>>> >>>>>> I guess kvfree() should be called just after shrinker_unit_alloc() >>>>>> fails but before calling into "goto err". >>>>> We could do it like this, which avoids ambiguity (if someone ignores >>>>> that kvfree could handle NULL). Something like: >>>>> --- a/mm/shrinker.c >>>>> +++ b/mm/shrinker.c >>>>> @@ -88,13 +88,14 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>> goto err; >>>>> info->map_nr_max = shrinker_nr_max; >>>>> if (shrinker_unit_alloc(info, NULL, nid)) >>>>> - goto err; >>>>> + goto free; >>>>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>>>> } >>>>> mutex_unlock(&shrinker_mutex); >>>>> return ret; >>>>> - >>>>> +free: >>>>> + kvfree(info); >>>>> err: >>>>> mutex_unlock(&shrinker_mutex); >>>>> free_shrinker_info(memcg); >>>>> Thanks. >>>>>> >>>>>> But curious, should not both kvzalloc_node()/kvfree() be avoided >>>>>> while inside mutex lock to avoid possible lockdep issues ? >>>> How about: >>>> >>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>> index dc5d2a6fcfc4..7baee7f00497 100644 >>>> --- a/mm/shrinker.c >>>> +++ b/mm/shrinker.c >>>> @@ -87,9 +87,9 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>> if (!info) >>>> goto err; >>>> info->map_nr_max = shrinker_nr_max; >>>> + rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>>> if (shrinker_unit_alloc(info, NULL, nid)) >>>> goto err; >>>> - rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>>> } >>>> mutex_unlock(&shrinker_mutex); >>> >>> No. We should make sure the @info is fully initialized before others >>> could see it. That's why rcu_assign_pointer is used here. >> >> If the info is immediately visible, is the failure cleanup >> free_shrinker_info() safe? It uses kvfree(info) and not kvfree_rcu(), and >> shrinker_unit_free() is also doing kfree(). > > Qi told me that the @info will not visible immediately yesterday. > So non-rcu-based kvfree is safe. Even if this fix could Yes, alloc_shrinker_info() is only called by mem_cgroup_css_online(). At this time, the memcg is not online yet, so it is not visible to shrink_slab(). And free_shrinker_info() is also called by mem_cgroup_css_free(), the memcg has already been offline. The shrinker_unit_free() is also called by expand_one_shrinker_info(), but the shrinker_info 'new' is also not visible at that time. So non-rcu-based kvfree is safe. > work properly, it’s a bit strange for me to use > rcu_assign_pointer to assign the @info without full initialization to it. Agree. > > Muchun, > Thanks. > >> >>>> >>>> I think this is concise. >>>> >>>> Best regards, >>>> Ridong >>> >>> >>
On 2024/10/17 1:31, Roman Gushchin wrote: > On Wed, Oct 16, 2024 at 07:02:23PM +0200, Vlastimil Babka wrote: >> On 10/16/24 16:08, Muchun Song wrote: >>> >>> >>>> On Oct 16, 2024, at 19:43, Vlastimil Babka <vbabka@suse.cz> wrote: >>>> On 10/16/24 04:21, Muchun Song wrote: >>>>> >>>>> >>>>>> On Oct 16, 2024, at 09:25, chenridong <chenridong@huawei.com> wrote: >>>>>> On 2024/10/15 14:55, Anshuman Khandual wrote: >>>>>>> On 10/14/24 16:59, Kirill A. Shutemov wrote: >>>>>>>> On Mon, Oct 14, 2024 at 03:23:36AM +0000, Chen Ridong wrote: >>>>>>>>> From: Chen Ridong <chenridong@huawei.com> >>>>>>>>> A memleak was found as bellow: >>>>>>>>> unreferenced object 0xffff8881010d2a80 (size 32): >>>>>>>>> comm "mkdir", pid 1559, jiffies 4294932666 >>>>>>>>> hex dump (first 32 bytes): >>>>>>>>> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>>>>>>>> 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 @............... >>>>>>>>> backtrace (crc 2e7ef6fa): >>>>>>>>> [<ffffffff81372754>] __kmalloc_node_noprof+0x394/0x470 >>>>>>>>> [<ffffffff813024ab>] alloc_shrinker_info+0x7b/0x1a0 >>>>>>>>> [<ffffffff813b526a>] mem_cgroup_css_online+0x11a/0x3b0 >>>>>>>>> [<ffffffff81198dd9>] online_css+0x29/0xa0 >>>>>>>>> [<ffffffff811a243d>] cgroup_apply_control_enable+0x20d/0x360 >>>>>>>>> [<ffffffff811a5728>] cgroup_mkdir+0x168/0x5f0 >>>>>>>>> [<ffffffff8148543e>] kernfs_iop_mkdir+0x5e/0x90 >>>>>>>>> [<ffffffff813dbb24>] vfs_mkdir+0x144/0x220 >>>>>>>>> [<ffffffff813e1c97>] do_mkdirat+0x87/0x130 >>>>>>>>> [<ffffffff813e1de9>] __x64_sys_mkdir+0x49/0x70 >>>>>>>>> [<ffffffff81f8c928>] do_syscall_64+0x68/0x140 >>>>>>>>> [<ffffffff8200012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e >>>>>>>>> In the alloc_shrinker_info function, when shrinker_unit_alloc return >>>>>>>>> err, the info won't be freed. Just fix it. >>>>>>>>> Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}") >>>>>>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com> >>>>>>>>> --- >>>>>>>>> mm/shrinker.c | 1 + >>>>>>>>> 1 file changed, 1 insertion(+) >>>>>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>>>>>> index dc5d2a6fcfc4..92270413190d 100644 >>>>>>>>> --- a/mm/shrinker.c >>>>>>>>> +++ b/mm/shrinker.c >>>>>>>>> @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>>>>>> err: >>>>>>>>> mutex_unlock(&shrinker_mutex); >>>>>>>>> + kvfree(info); >>>>>>>>> free_shrinker_info(memcg); >>>>>>>>> return -ENOMEM; >>>>>>>>> } >>>>>>>> NAK. If in the future there going to one more error case after >>>>>>>> rcu_assign_pointer() we will end up with double free. >>>>>>>> This should be safer: >>>>>>>> diff --git a/mm/shrinker.c b/mm/shrinker.c >>>>>>>> index dc5d2a6fcfc4..763fd556bc7d 100644 >>>>>>>> --- a/mm/shrinker.c >>>>>>>> +++ b/mm/shrinker.c >>>>>>>> @@ -87,8 +87,10 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) >>>>>>>> if (!info) >>>>>>>> goto err; >>>>>>>> info->map_nr_max = shrinker_nr_max; >>>>>>>> - if (shrinker_unit_alloc(info, NULL, nid)) >>>>>>>> + if (shrinker_unit_alloc(info, NULL, nid)) { >>>>>>>> + kvfree(info); >>>>>>>> goto err; >>>>>>>> + } >>>>>>>> rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_info, info); >>>>>>>> } >>>>>>>> mutex_unlock(&shrinker_mutex); >>>>>>> Agreed, this is what I mentioned earlier as well. >>>>>>> ------------------------------------------------------------------ >>>>>>> I guess kvfree() should be called just after shrinker_unit_alloc() >>>>>>> fails but before calling into "goto err" >>>>>>> ------------------------------------------------------------------ >>>>>> After discussion, it seems that v1 is acceptable. >>>>>> Hi, Muchun, do you have any other opinions? >>>>> >>>>> I insist on my opinion, not mixing two different approaches >>>>> to do release resources. >>>> >>>> So instead we mix the cleanup of the whole function with the cleanup of what >>>> is effectively a per-iteration temporary variable? >>>> >>>> The fact there was already a confusion in this thread about whether it's >>>> safe and relies on kvfree(NULL) to be a no-op, should be a hint. >>> >>> Yes. I think someone is confused about my opinion. >>> I don’t care about whether we should apply this hit. >>> If we think the hint is tricky, we could add another >>> label to fix it like I suggested previously. Because >>> we already use goto-based approaches to >>> cleanup the resources, why not keeping >>> consistent? >> >> I think we're rather pragmatic than striving to be consistent for the sake >> of consistency. goto is not the nicest thing in the world, but we (unlike >> other projects) use it where it makes sense to avoid if/else nesting >> explosion. Here for the info it's not the most pragmatic option. >> >>> It will be easier for us to add a new >>> "if" statement and handle the failure case in the future. >> >> Let's not overengineer things for hypothetical future. >> >>> For example, if we use his v1 proposal, we should do >>> the cleanups again for info. But for goto-based >>> version, we just add another label to do the >>> cleanups and go to the new label for failure case. goto-based fix is what I insisted on. I copied my previous suggested fix here to clarify my opinion. >> >> Again, info is a loop-iteration-local variable, v1 fix making it truly local >> is the way to go. If there are further cleanups added in the loop itself in >> the future, they could hopefully keep being local to the loop as well. >> Cleanup of info outside the loop iteration is breaking its real scope. > > +1 to that. > > I don't think it's such a big deal and both versions are ok, but I strongly > prefer the original version (without introduction of a new label). > > Please, feel free to use > Acked-by: Roman Gushchin <roman.gushchin@linux.dev> > with the original version. > > Thanks! I agree with Roman. Hello, Andrew and Dave, Do you have any opinions? The original version: https://lore.kernel.org/linux-kernel/4184c61f-80f7-4adc-8929-c29f959cb8df@huawei.com/ Best regards, Ridong
On 10/24/24 03:26, Chen Ridong wrote: >>>> For example, if we use his v1 proposal, we should do >>>> the cleanups again for info. But for goto-based >>>> version, we just add another label to do the >>>> cleanups and go to the new label for failure case. goto-based fix is what I insisted on. I copied my previous suggested fix here to clarify my opinion. >>> >>> Again, info is a loop-iteration-local variable, v1 fix making it truly local >>> is the way to go. If there are further cleanups added in the loop itself in >>> the future, they could hopefully keep being local to the loop as well. >>> Cleanup of info outside the loop iteration is breaking its real scope. >> >> +1 to that. >> >> I don't think it's such a big deal and both versions are ok, but I strongly >> prefer the original version (without introduction of a new label). >> >> Please, feel free to use >> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> >> with the original version. >> >> Thanks! > > I agree with Roman. > Hello, Andrew and Dave, Do you have any opinions? > > The original version: > https://lore.kernel.org/linux-kernel/4184c61f-80f7-4adc-8929-c29f959cb8df@huawei.com/ Hi, Can you resend the v1 as v3, but also move the declaration of "struct shrinker_info *info;" inside the for_each_node() block? Also you can add all the acks collected for that version and mine too: Acked-by: Vlastimil Babka <vbabka@suse.cz> Thanks. > Best regards, > Ridong > >
On 2024/10/24 17:08, Vlastimil Babka wrote: > On 10/24/24 03:26, Chen Ridong wrote: >>>>> For example, if we use his v1 proposal, we should do >>>>> the cleanups again for info. But for goto-based >>>>> version, we just add another label to do the >>>>> cleanups and go to the new label for failure case. goto-based fix is what I insisted on. I copied my previous suggested fix here to clarify my opinion. >>>> >>>> Again, info is a loop-iteration-local variable, v1 fix making it truly local >>>> is the way to go. If there are further cleanups added in the loop itself in >>>> the future, they could hopefully keep being local to the loop as well. >>>> Cleanup of info outside the loop iteration is breaking its real scope. >>> >>> +1 to that. >>> >>> I don't think it's such a big deal and both versions are ok, but I strongly >>> prefer the original version (without introduction of a new label). >>> >>> Please, feel free to use >>> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> >>> with the original version. >>> >>> Thanks! >> >> I agree with Roman. >> Hello, Andrew and Dave, Do you have any opinions? >> >> The original version: >> https://lore.kernel.org/linux-kernel/4184c61f-80f7-4adc-8929-c29f959cb8df@huawei.com/ > > Hi, > > Can you resend the v1 as v3, but also move the declaration of "struct > shrinker_info *info;" inside the for_each_node() block? Also you can add all > the acks collected for that version and mine too: > > Acked-by: Vlastimil Babka <vbabka@suse.cz> > > Thanks. > >> Best regards, >> Ridong >> >> Will update, Thanks. Best regards, Ridong
diff --git a/mm/shrinker.c b/mm/shrinker.c index dc5d2a6fcfc4..92270413190d 100644 --- a/mm/shrinker.c +++ b/mm/shrinker.c @@ -97,6 +97,7 @@ int alloc_shrinker_info(struct mem_cgroup *memcg) err: mutex_unlock(&shrinker_mutex); + kvfree(info); free_shrinker_info(memcg); return -ENOMEM; }