Message ID | 20240319183212.17156-3-osalvador@suse.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | page_owner: Refcount fixups | expand |
On Tue, Mar 19, 2024 at 07:32:12PM +0100, Oscar Salvador wrote: > Upon migration, new allocated pages are being given the handle of the old > pages. This is problematic because it means that for the stack which > allocated the old page, we will be substracting the old page + the new one > when that page is freed, creating an accounting imbalance. > > Fix this by adding a new migrate_handle in the page_owner struct, and > record the handle that allocated the new page in __folio_copy_owner(). > Upon freeing, we check whether we have a migrate_handle, and if we do, > we use migrate_handle for dec_stack_record_count(), which will > subtract those pages from its right handle. Is this the right way to fix this problem? I would have thought we'd be better off accounting this as migration freeing the old page and allocating the new page. If I understand correctly, this is the code which says "This page was last allocated by X and freed by Y", and I would think that being last freed (or allocated) by the migration code would be a very nice hint about where a problem might stem from.
On Tue, Mar 19, 2024 at 06:48:31PM +0000, Matthew Wilcox wrote: > Is this the right way to fix this problem? I would have thought we'd > be better off accounting this as migration freeing the old page and > allocating the new page. If I understand correctly, this is the code > which says "This page was last allocated by X and freed by Y", and I > would think that being last freed (or allocated) by the migration code > would be a very nice hint about where a problem might stem from. I hear you, and I had the same thought when I stumbled upon this. I did not know that the handle was being changed, otherwise it would have saved me quite a lot of debugging time. Checking the history of this, I can see this decision was made in 2016 in: commit d435edca928805074dae005ab9a42d9fa60fc702 Author: Vlastimil Babka <vbabka@suse.cz> Date: Tue Mar 15 14:56:15 2016 -0700 mm, page_owner: copy page owner info during migration And let me quote: "The page_owner mechanism stores gfp_flags of an allocation and stack trace that lead to it. During page migration, the original information is practically replaced by the allocation of free page as the migration target. Arguably this is less useful and might lead to all the page_owner info for migratable pages gradually converge towards compaction or numa balancing migrations. It has also lead to inaccuracies such as one fixed by commit e2cfc91120fa ("mm/page_owner: set correct gfp_mask on page_owner")." A following patch stored the migration reason in last_migrate_reason, and the patch also add a bit of information if last_migrate_reason was other than 0: + if (page_ext->last_migrate_reason != -1) { + ret += snprintf(kbuf + ret, count - ret, + "Page has been migrated, last migrate reason: %s\n", + migrate_reason_names[page_ext->last_migrate_reason]); + if (ret >= count) + goto err; + } Now, thinking about this some more, it kind of makes sense, because one of the things page_owner is used for, in my experience, is for memory leaks. We print the output, try to correlate allocation/free operations per stack so one can easily spot a stack that just keeps allocating memory and never frees it (it might be legit, and not a leak, but it gives a hint). Now imagine there are 10k pages pointing to stack A. If those pages were to be migrated e.g: kcompactd jumps in, we will lose the original stack and replace it with something like: migrate_pages() ... ... kcompatd After that, stack A does not have those 10k pages pointing to it anymore, although it stills "own" them, just that got replaced by new ones due to migration. This kind of defeats the purpose of page_owner. And after all, we do record some migration information in those new pages, which will give us a hint when looking at the output. So, all in all, I am leaning towards "this is fine".
On 3/20/24 06:00, Oscar Salvador wrote: > On Tue, Mar 19, 2024 at 06:48:31PM +0000, Matthew Wilcox wrote: >> Is this the right way to fix this problem? I would have thought we'd >> be better off accounting this as migration freeing the old page and >> allocating the new page. If I understand correctly, this is the code >> which says "This page was last allocated by X and freed by Y", and I >> would think that being last freed (or allocated) by the migration code >> would be a very nice hint about where a problem might stem from. > > I hear you, and I had the same thought when I stumbled upon this. > I did not know that the handle was being changed, otherwise it would > have saved me quite a lot of debugging time. > > Checking the history of this, I can see this decision was made in > 2016 in: > > commit d435edca928805074dae005ab9a42d9fa60fc702 > Author: Vlastimil Babka <vbabka@suse.cz> > Date: Tue Mar 15 14:56:15 2016 -0700 > > mm, page_owner: copy page owner info during migration > Yeah I think we could keep that logic. But we could also simply subtract the refcount of the old handle (the "allocated for migration") in __folio_copy_owner() no? Then we wouldn't need the extra migrate_handle. Also we might have more issues here. Most page owner code takes care to set everything for all pages within a folio, but __folio_copy_owner() and __set_page_owner_migrate_reason() don't.
On Thu, Mar 21, 2024 at 11:50:36AM +0100, Vlastimil Babka wrote: > Yeah I think we could keep that logic. I am all for keeping it. > But we could also simply subtract the refcount of the old handle (the > "allocated for migration") in __folio_copy_owner() no? Then we wouldn't need > the extra migrate_handle. Since new_page will have the old handle pointing to the old stack after the call, we could uncharge the old_page to the migrate_stack, which new_page->_handle holds before it gets changed. So we basically swap it. It could work, but I kinda have a bittersweet feeling here. I am trying to work towards to reduce the number of lookups in the hlist, but for the approach described above I would need to lookup the stack for new_page->handle in order to substract the page. OTHO, I understand that adding migrate_handle kinda wasted memory. 16MB for 16GB of memory. > Also we might have more issues here. Most page owner code takes care to set > everything for all pages within a folio, but __folio_copy_owner() and > __set_page_owner_migrate_reason() don't. I did not check deeply but do not we split the folio upon migration in case it is large? Which means we should reach split_page_owner() before the copy takes place. Do I get it right?
On 3/21/24 12:07, Oscar Salvador wrote: > On Thu, Mar 21, 2024 at 11:50:36AM +0100, Vlastimil Babka wrote: >> Yeah I think we could keep that logic. > > I am all for keeping it. > >> But we could also simply subtract the refcount of the old handle (the >> "allocated for migration") in __folio_copy_owner() no? Then we wouldn't need >> the extra migrate_handle. > > Since new_page will have the old handle pointing to the old stack after > the call, we > could uncharge the old_page to the migrate_stack, which new_page->_handle holds > before it gets changed. > > So we basically swap it. > > It could work, but I kinda have a bittersweet feeling here. > I am trying to work towards to reduce the number of lookups in the > hlist, but for the approach described above I would need to lookup > the stack for new_page->handle in order to substract the page. Understood, but migration is kinda heavy and non-fast-path operation already so the extra lookup wouldn't be in a critical fast path. > OTHO, I understand that adding migrate_handle kinda wasted memory. > 16MB for 16GB of memory. > >> Also we might have more issues here. Most page owner code takes care to set >> everything for all pages within a folio, but __folio_copy_owner() and >> __set_page_owner_migrate_reason() don't. > > I did not check deeply but do not we split the folio upon migration > in case it is large? > Which means we should reach split_page_owner() before the copy takes > place. Do I get it right? When I mean is we have __set_page_owner_handle() setting up everything for tail pages and then we have __folio_copy_owner updating only the head page, so this will create kinda a mixed up information. Which might not be an issue as __folio_copy_owner() should mean it's a proper folio thus compound page thus nobody ever will check those mismatched tail pages... Maybe we could adjust __set_page_owner_handle() to skip tail pages for compound pages as well and unify this, and tail pages would be only setup for those weird high-order non-compound pages so that the odd case in __free_pages() works? Or maybe page_owner is considered debugging functionality enough so it might be worth having the redundant data in tail pages in case something goes wrong. But either way now it seems we're not doing it consistently.
On Thu, Mar 21, 2024 at 12:20:24PM +0100, Vlastimil Babka wrote: > Understood, but migration is kinda heavy and non-fast-path operation already > so the extra lookup wouldn't be in a critical fast path. Ok, you convinced me, let us save that memory. > When I mean is we have __set_page_owner_handle() setting up everything for > tail pages and then we have __folio_copy_owner updating only the head page, > so this will create kinda a mixed up information. Which might not be an > issue as __folio_copy_owner() should mean it's a proper folio thus compound > page thus nobody ever will check those mismatched tail pages... Maybe we > could adjust __set_page_owner_handle() to skip tail pages for compound > pages as well and unify this, and tail pages would be only setup for those > weird high-order non-compound pages so that the odd case in __free_pages() > works? > > Or maybe page_owner is considered debugging functionality enough so it might > be worth having the redundant data in tail pages in case something goes > wrong. But either way now it seems we're not doing it consistently. So we basically have two options here, 1) is to also update tail pages in __folio_copy_owner, and 2) is to follow __folio_copy_owner example and skip tail pages in __set_page_owner. The thing is, going with 2) might mean, as you pointed out, that if something goes wrong we lose the information for those pages as page_owner does not have a record for them. OTOH, would that information be really useful? Sure we could see the stack, but AFAICS not even the migrate_reason would be recorded in those tail pages, which means that if they are migrated and freed, we would only see that they were freed. (the free stack would point to the migrate code though, so that is a hint). Not really sure which path to take here, skipping tail pages would be less of a complexity but at a risk of loosing information. Anyway, let me first tackle the issue at hand here, and then I will think more about that. Thanks!
diff --git a/mm/page_owner.c b/mm/page_owner.c index 2613805cb665..1a7d0d1dc640 100644 --- a/mm/page_owner.c +++ b/mm/page_owner.c @@ -27,6 +27,7 @@ struct page_owner { gfp_t gfp_mask; depot_stack_handle_t handle; depot_stack_handle_t free_handle; + depot_stack_handle_t migrate_handle; u64 ts_nsec; u64 free_ts_nsec; char comm[TASK_COMM_LEN]; @@ -240,7 +241,15 @@ void __reset_page_owner(struct page *page, unsigned short order) return; page_owner = get_page_owner(page_ext); - alloc_handle = page_owner->handle; + /* + * If this page was allocated for migration purposes, its handle doesn't + * reference the stack it was allocated from, so make sure to use the + * migrate_handle in order to subtract it from the right stack. + */ + if (!page_owner->migrate_handle) + alloc_handle = page_owner->handle; + else + alloc_handle = page_owner->migrate_handle; handle = save_stack(GFP_NOWAIT | __GFP_NOWARN); for (i = 0; i < (1 << order); i++) { @@ -277,6 +286,7 @@ static inline void __set_page_owner_handle(struct page_ext *page_ext, page_owner->handle = handle; page_owner->order = order; page_owner->gfp_mask = gfp_mask; + page_owner->migrate_handle = 0; page_owner->last_migrate_reason = -1; page_owner->pid = current->pid; page_owner->tgid = current->tgid; @@ -358,6 +368,7 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old) new_page_owner->gfp_mask = old_page_owner->gfp_mask; new_page_owner->last_migrate_reason = old_page_owner->last_migrate_reason; + new_page_owner->migrate_handle = new_page_owner->handle; new_page_owner->handle = old_page_owner->handle; new_page_owner->pid = old_page_owner->pid; new_page_owner->tgid = old_page_owner->tgid;
Upon migration, new allocated pages are being given the handle of the old pages. This is problematic because it means that for the stack which allocated the old page, we will be substracting the old page + the new one when that page is freed, creating an accounting imbalance. Fix this by adding a new migrate_handle in the page_owner struct, and record the handle that allocated the new page in __folio_copy_owner(). Upon freeing, we check whether we have a migrate_handle, and if we do, we use migrate_handle for dec_stack_record_count(), which will subtract those pages from its right handle. Fixes: 217b2119b9e2 ("mm,page_owner: implement the tracking of the stacks count") Signed-off-by: Oscar Salvador <osalvador@suse.de> --- mm/page_owner.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)