Message ID | 20220131203504.3458775-1-willmcvicker@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v1,1/1] mm/gup: skip pinnable check for refs==1 | expand |
On 1/31/22 12:35, Will McVicker wrote: > This fixes commit 54d516b1d62f ("mm/gup: small refactoring: simplify > try_grab_page()") which refactors try_grab_page() to call > try_grab_compound_head() with refs=1. The refactor commit is causing > pin_user_pages() to return -ENOMEM when we try to pin one user page that > is migratable and not in the movable zone. Previously, try_grab_page() > didn't check if the page was pinnable for FOLL_PIN. To match the same > functionality, this fix adds the check `refs > 1 &&` to skip the call to > is_pinnable_page(). > That's a clear write-up of what you're seeing, what caused it, and how you'd like to correct it. The previous code had a loophole, and you want to keep that loophole. More below... > This issue is reproducible with the Pixel 6 on the 5.15 LTS kernel. Here > is the call stack to reproduce the -ENOMEM error: ... > Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()") > Cc: John Hubbard <jhubbard@nvidia.com> > Cc: Minchan Kim <minchan@google.com> > Signed-off-by: Will McVicker <willmcvicker@google.com> > --- > mm/gup.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/gup.c b/mm/gup.c > index f0af462ac1e2..0509c49c46a3 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page, > * right zone, so fail and let the caller fall back to the slow > * path. > */ > - if (unlikely((flags & FOLL_LONGTERM) && > + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) && > !is_pinnable_page(page))) > return NULL; > ...but are you really sure that this is the best way to "fix" the problem? This trades correctness for "bug-for-bug compatibility" with the previous code. It says, "it's OK to violate the pinnable and longterm checks, as long as you do it one page at a time, rather than in larger chunks. Wouldn't it be better to try to fix up the calling code so that it's not in violation of these zone rules? thanks,
On Mon, Jan 31, 2022 at 12:49 PM John Hubbard <jhubbard@nvidia.com> wrote: > > On 1/31/22 12:35, Will McVicker wrote: > > This fixes commit 54d516b1d62f ("mm/gup: small refactoring: simplify > > try_grab_page()") which refactors try_grab_page() to call > > try_grab_compound_head() with refs=1. The refactor commit is causing > > pin_user_pages() to return -ENOMEM when we try to pin one user page that > > is migratable and not in the movable zone. Previously, try_grab_page() > > didn't check if the page was pinnable for FOLL_PIN. To match the same > > functionality, this fix adds the check `refs > 1 &&` to skip the call to > > is_pinnable_page(). > > > > That's a clear write-up of what you're seeing, what caused it, and how > you'd like to correct it. The previous code had a loophole, and you want > to keep that loophole. More below... > > > This issue is reproducible with the Pixel 6 on the 5.15 LTS kernel. Here > > is the call stack to reproduce the -ENOMEM error: > ... > > Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()") > > Cc: John Hubbard <jhubbard@nvidia.com> > > Cc: Minchan Kim <minchan@google.com> > > Signed-off-by: Will McVicker <willmcvicker@google.com> > > --- > > mm/gup.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/mm/gup.c b/mm/gup.c > > index f0af462ac1e2..0509c49c46a3 100644 > > --- a/mm/gup.c > > +++ b/mm/gup.c > > @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page, > > * right zone, so fail and let the caller fall back to the slow > > * path. > > */ > > - if (unlikely((flags & FOLL_LONGTERM) && > > + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) && > > !is_pinnable_page(page))) > > return NULL; > > > > ...but are you really sure that this is the best way to "fix" the > problem? This trades correctness for "bug-for-bug compatibility" with > the previous code. It says, "it's OK to violate the pinnable and > longterm checks, as long as you do it one page at a time, rather than in > larger chunks. > > Wouldn't it be better to try to fix up the calling code so that it's > not in violation of these zone rules? > > > thanks, > -- > John Hubbard > NVIDIA Hi John, Thanks for the prompt response! I'm not super familiar with what PIN+LONGTERM conditions require, but if this was previously a bug, then I definitely don't want to re-introduce it. Since you're confirming that, let me sync-up with the driver owner to see how I can fix this on the side. Thanks! Will
Hi John, On Mon, Jan 31, 2022 at 12:49:35PM -0800, John Hubbard wrote: > On 1/31/22 12:35, Will McVicker wrote: > > This fixes commit 54d516b1d62f ("mm/gup: small refactoring: simplify > > try_grab_page()") which refactors try_grab_page() to call > > try_grab_compound_head() with refs=1. The refactor commit is causing > > pin_user_pages() to return -ENOMEM when we try to pin one user page that > > is migratable and not in the movable zone. Previously, try_grab_page() > > didn't check if the page was pinnable for FOLL_PIN. To match the same > > functionality, this fix adds the check `refs > 1 &&` to skip the call to > > is_pinnable_page(). > > > > That's a clear write-up of what you're seeing, what caused it, and how > you'd like to correct it. The previous code had a loophole, and you want > to keep that loophole. More below... > > > This issue is reproducible with the Pixel 6 on the 5.15 LTS kernel. Here > > is the call stack to reproduce the -ENOMEM error: > ... > > Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()") > > Cc: John Hubbard <jhubbard@nvidia.com> > > Cc: Minchan Kim <minchan@google.com> > > Signed-off-by: Will McVicker <willmcvicker@google.com> > > --- > > mm/gup.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/mm/gup.c b/mm/gup.c > > index f0af462ac1e2..0509c49c46a3 100644 > > --- a/mm/gup.c > > +++ b/mm/gup.c > > @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page, > > * right zone, so fail and let the caller fall back to the slow > > * path. > > */ > > - if (unlikely((flags & FOLL_LONGTERM) && > > + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) && > > !is_pinnable_page(page))) > > return NULL; > > ...but are you really sure that this is the best way to "fix" the > problem? This trades correctness for "bug-for-bug compatibility" with > the previous code. It says, "it's OK to violate the pinnable and > longterm checks, as long as you do it one page at a time, rather than in > larger chunks. > > Wouldn't it be better to try to fix up the calling code so that it's > not in violation of these zone rules? I think the problem is before pin_user_pages can work with CMA pages in the fallback path but now it doesn't work with CMA page. Driver couldn't know whether it will work with CMA page or not in advance. pin_user_pages __get_user_pages_locked follow_page_mask follow_page_pte try_grab_page !is_pinnable_page(page) return NULL; return ERR_PTR(-ENOMEM); return -ENOMEM without faultin_page
On 1/31/22 23:28, Minchan Kim wrote: ... >>> --- a/mm/gup.c >>> +++ b/mm/gup.c >>> @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page, >>> * right zone, so fail and let the caller fall back to the slow >>> * path. >>> */ >>> - if (unlikely((flags & FOLL_LONGTERM) && >>> + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) && >>> !is_pinnable_page(page))) >>> return NULL; >> >> ...but are you really sure that this is the best way to "fix" the >> problem? This trades correctness for "bug-for-bug compatibility" with >> the previous code. It says, "it's OK to violate the pinnable and >> longterm checks, as long as you do it one page at a time, rather than in >> larger chunks. >> >> Wouldn't it be better to try to fix up the calling code so that it's >> not in violation of these zone rules? > > I think the problem is before pin_user_pages can work with CMA pages > in the fallback path but now it doesn't work with CMA page. Driver Actually, it "worked" only if the caller did it one page at a time. (See how the above attempted fix restores the "make it work for refs == 1.) > couldn't know whether it will work with CMA page or not in advance. > > pin_user_pages > __get_user_pages_locked > follow_page_mask > follow_page_pte > try_grab_page > !is_pinnable_page(page) > return NULL; > return ERR_PTR(-ENOMEM); > return -ENOMEM without faultin_page Yes, that's all clear. thanks,
On 1/31/22 23:35, John Hubbard wrote: > On 1/31/22 23:28, Minchan Kim wrote: > ... >>>> --- a/mm/gup.c >>>> +++ b/mm/gup.c >>>> @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page, >>>> * right zone, so fail and let the caller fall back to the slow >>>> * path. >>>> */ >>>> - if (unlikely((flags & FOLL_LONGTERM) && >>>> + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) && >>>> !is_pinnable_page(page))) >>>> return NULL; >>> >>> ...but are you really sure that this is the best way to "fix" the >>> problem? This trades correctness for "bug-for-bug compatibility" with >>> the previous code. It says, "it's OK to violate the pinnable and >>> longterm checks, as long as you do it one page at a time, rather than in >>> larger chunks. >>> >>> Wouldn't it be better to try to fix up the calling code so that it's >>> not in violation of these zone rules? >> >> I think the problem is before pin_user_pages can work with CMA pages >> in the fallback path but now it doesn't work with CMA page. Driver > > Actually, it "worked" only if the caller did it one page at a time. > (See how the above attempted fix restores the "make it work for > refs == 1.) > >> couldn't know whether it will work with CMA page or not in advance. >> >> pin_user_pages >> __get_user_pages_locked >> follow_page_mask >> follow_page_pte >> try_grab_page >> !is_pinnable_page(page) >> return NULL; >> return ERR_PTR(-ENOMEM); >> return -ENOMEM without faultin_page > > Yes, that's all clear. > ...oh, but I guess you're pointing out that it's always going to be page-at-a-time this deep in the pin_user_pages() call path. Which is true. I hadn't worked through how to fix this yet, my initial reaction was that allowing single refs to go through, while prohibiting multiple refs, was clearly *not* the way to go. thanks,
On 1/31/22 23:43, John Hubbard wrote: >>> pin_user_pages >>> __get_user_pages_locked >>> follow_page_mask >>> follow_page_pte >>> try_grab_page >>> !is_pinnable_page(page) >>> return NULL; >>> return ERR_PTR(-ENOMEM); >>> return -ENOMEM without faultin_page >> >> Yes, that's all clear. >> ...oh, but I guess you're pointing out that it's always going to be > page-at-a-time this deep in the pin_user_pages() call path. Which is true. > > I hadn't worked through how to fix this yet, my initial reaction was > that allowing single refs to go through, while prohibiting multiple refs, > was clearly *not* the way to go. > OK, so after looking at this some more, I think that the real problem with commit 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()") is that it funnels fast and slow gup through the same routine (try_grab_compound_head()). And the problem with *that*, is that try_grab_compound_head() is coded up with that assumption that it is being called *only* by fast-gup: /* * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a * right zone, so fail and let the caller fall back to the slow * path. */ if (unlikely((flags & FOLL_LONGTERM) && !is_pinnable_page(page))) return NULL; Now, to fix that, I'd really rather not conflate "refs == 1" with "on the slow path", because that's a conceptual mismatch. So, other ideas: a) Remove the above check, and fail fast gup at a different point for the (FOLL_LONGTERM && !is_pinnable) case. Haven't looked closely at this yet. b) Pass in FOLL_FAST_ONLY from all call sites *except* try_grag_page(). Skip the above check in slow-gup (!FOLL_FAST_ONLY) cases. This makes the code ugly, though, and I'm just listing it here for completeness. c) Just call the entire refactoring idea a mistake, and roll it back either entirely, or enough to keep fast and slow gup separate. Unless a better idea shows up, (c) is probably the way to go, I think... thanks,
diff --git a/mm/gup.c b/mm/gup.c index f0af462ac1e2..0509c49c46a3 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page, * right zone, so fail and let the caller fall back to the slow * path. */ - if (unlikely((flags & FOLL_LONGTERM) && + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) && !is_pinnable_page(page))) return NULL;
This fixes commit 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()") which refactors try_grab_page() to call try_grab_compound_head() with refs=1. The refactor commit is causing pin_user_pages() to return -ENOMEM when we try to pin one user page that is migratable and not in the movable zone. Previously, try_grab_page() didn't check if the page was pinnable for FOLL_PIN. To match the same functionality, this fix adds the check `refs > 1 &&` to skip the call to is_pinnable_page(). This issue is reproducible with the Pixel 6 on the 5.15 LTS kernel. Here is the call stack to reproduce the -ENOMEM error: Call trace: : dump_backtrace.cfi_jt+0x0/0x8 : show_stack+0x1c/0x2c : dump_stack_lvl+0x68/0x98 : try_grab_compound_head+0x298/0x3c4 : follow_page_pte+0x1dc/0x330 : follow_page_mask+0x174/0x340 : __get_user_pages+0x158/0x34c : __gup_longterm_locked+0xfc/0x194 : __gup_longterm_unlocked+0x80/0xf4 : internal_get_user_pages_fast+0xf0/0x15c : pin_user_pages_fast+0x24/0x40 : edgetpu_device_group_map+0x130/0x584 [abrolhos] : edgetpu_ioctl_map_buffer+0x110/0x3b4 [abrolhos] : edgetpu_ioctl+0x238/0x408 [abrolhos] : edgetpu_fs_ioctl+0x14/0x24 [abrolhos] Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()") Cc: John Hubbard <jhubbard@nvidia.com> Cc: Minchan Kim <minchan@google.com> Signed-off-by: Will McVicker <willmcvicker@google.com> --- mm/gup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)