diff mbox series

mm,page_owner: Defer enablement of static branch

Message ID 20240409131715.13632-1-osalvador@suse.de (mailing list archive)
State New
Headers show
Series mm,page_owner: Defer enablement of static branch | expand

Commit Message

Oscar Salvador April 9, 2024, 1:17 p.m. UTC
Kefeng Wang reported that he was seeing some memory leaks with kmemleak
with page_owner enabled.
The reason behind is that we enable the page_owner_inited static branch
and then proceed with the linking of stack_list struct to dummy_stack,
which means that exists a race window between these two steps where we
can have pages already being allocated calling add_stack_record_to_list(),
allocating objects and linking them to stack_list, but then we set
stack_list pointing to dummy_stack in init_page_owner.
Which means that the objects that have been allocated during that time
window are unreferenced and lost.

Fix this by deferring the enablement of the branch until we have properly
set up the list.

Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Closes: https://lore.kernel.org/linux-mm/74b147b0-718d-4d50-be75-d6afc801cd24@huawei.com/
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Fixes: 4bedfb314bdd ("mm,page_owner: maintain own list of stack_records structs")
Signed-off-by: Oscar Salvador <osalvador@suse.de>
---
Special thanks and kudos go to Kefeng Wang for helping me out chasing
down this bug, as I could not reproduce it with any of my machines, and
to Vlastimil to bring another pair of eyes, which was very helpful.

 mm/page_owner.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Vlastimil Babka April 9, 2024, 1:26 p.m. UTC | #1
On 4/9/24 3:17 PM, Oscar Salvador wrote:
> Kefeng Wang reported that he was seeing some memory leaks with kmemleak
> with page_owner enabled.
> The reason behind is that we enable the page_owner_inited static branch
> and then proceed with the linking of stack_list struct to dummy_stack,
> which means that exists a race window between these two steps where we
> can have pages already being allocated calling add_stack_record_to_list(),
> allocating objects and linking them to stack_list, but then we set
> stack_list pointing to dummy_stack in init_page_owner.
> Which means that the objects that have been allocated during that time
> window are unreferenced and lost.
> 
> Fix this by deferring the enablement of the branch until we have properly
> set up the list.
> 
> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Closes: https://lore.kernel.org/linux-mm/74b147b0-718d-4d50-be75-d6afc801cd24@huawei.com/
> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Fixes: 4bedfb314bdd ("mm,page_owner: maintain own list of stack_records structs")

That's in rc1 so ideally mm-hotfixes.

> Signed-off-by: Oscar Salvador <osalvador@suse.de>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
> Special thanks and kudos go to Kefeng Wang for helping me out chasing
> down this bug, as I could not reproduce it with any of my machines, and
> to Vlastimil to bring another pair of eyes, which was very helpful.
> 
>  mm/page_owner.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index 9bef0b442863..742f432e5bf0 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -118,7 +118,6 @@ static __init void init_page_owner(void)
>  	register_dummy_stack();
>  	register_failure_stack();
>  	register_early_stack();
> -	static_branch_enable(&page_owner_inited);
>  	init_early_allocated_pages();
>  	/* Initialize dummy and failure stacks and link them to stack_list */
>  	dummy_stack.stack_record = __stack_depot_get_stack_record(dummy_handle);
> @@ -129,6 +128,7 @@ static __init void init_page_owner(void)
>  		refcount_set(&failure_stack.stack_record->count, 1);
>  	dummy_stack.next = &failure_stack;
>  	stack_list = &dummy_stack;
> +	static_branch_enable(&page_owner_inited);
>  }
>  
>  struct page_ext_operations page_owner_ops = {
Kefeng Wang April 9, 2024, 1:27 p.m. UTC | #2
On 2024/4/9 21:17, Oscar Salvador wrote:
> Kefeng Wang reported that he was seeing some memory leaks with kmemleak
> with page_owner enabled.
> The reason behind is that we enable the page_owner_inited static branch
> and then proceed with the linking of stack_list struct to dummy_stack,
> which means that exists a race window between these two steps where we
> can have pages already being allocated calling add_stack_record_to_list(),
> allocating objects and linking them to stack_list, but then we set
> stack_list pointing to dummy_stack in init_page_owner.
> Which means that the objects that have been allocated during that time
> window are unreferenced and lost.
> 
> Fix this by deferring the enablement of the branch until we have properly
> set up the list.
> 
> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Closes: https://lore.kernel.org/linux-mm/74b147b0-718d-4d50-be75-d6afc801cd24@huawei.com/
> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> Fixes: 4bedfb314bdd ("mm,page_owner: maintain own list of stack_records structs")
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> ---
> Special thanks and kudos go to Kefeng Wang for helping me out chasing
> down this bug, as I could not reproduce it with any of my machines, and
> to Vlastimil to bring another pair of eyes, which was very helpful.
> 
The issue is found by accident when test my migrate changes, thanks for
your great job and quick response, this does pass my test, thanks ;)

>   mm/page_owner.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index 9bef0b442863..742f432e5bf0 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -118,7 +118,6 @@ static __init void init_page_owner(void)
>   	register_dummy_stack();
>   	register_failure_stack();
>   	register_early_stack();
> -	static_branch_enable(&page_owner_inited);
>   	init_early_allocated_pages();
>   	/* Initialize dummy and failure stacks and link them to stack_list */
>   	dummy_stack.stack_record = __stack_depot_get_stack_record(dummy_handle);
> @@ -129,6 +128,7 @@ static __init void init_page_owner(void)
>   		refcount_set(&failure_stack.stack_record->count, 1);
>   	dummy_stack.next = &failure_stack;
>   	stack_list = &dummy_stack;
> +	static_branch_enable(&page_owner_inited);
>   }
>   
>   struct page_ext_operations page_owner_ops = {
diff mbox series

Patch

diff --git a/mm/page_owner.c b/mm/page_owner.c
index 9bef0b442863..742f432e5bf0 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -118,7 +118,6 @@  static __init void init_page_owner(void)
 	register_dummy_stack();
 	register_failure_stack();
 	register_early_stack();
-	static_branch_enable(&page_owner_inited);
 	init_early_allocated_pages();
 	/* Initialize dummy and failure stacks and link them to stack_list */
 	dummy_stack.stack_record = __stack_depot_get_stack_record(dummy_handle);
@@ -129,6 +128,7 @@  static __init void init_page_owner(void)
 		refcount_set(&failure_stack.stack_record->count, 1);
 	dummy_stack.next = &failure_stack;
 	stack_list = &dummy_stack;
+	static_branch_enable(&page_owner_inited);
 }
 
 struct page_ext_operations page_owner_ops = {