Message ID | 20210710100329.49174-2-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New |
Series | Cleanup and fixup for vmscan |
On Sat, Jul 10, 2021 at 4:03 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> If the MADV_FREE pages are redirtied before they could be reclaimed, put
> the pages back to anonymous LRU list by setting SwapBacked flag and the
> pages will be reclaimed in normal swapout way. Otherwise MADV_FREE pages
> won't be reclaimed as expected.
>
> Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")

This is not a bug -- the dirty check isn't needed but it was copied
from __remove_mapping().

The page has only one reference left, which is from the isolation.
After the caller puts the page back on lru and drops the reference,
the page will be freed anyway. It doesn't matter which lru it goes.

> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/vmscan.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a7602f71ec04..6483fe0e2065 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1628,6 +1628,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>  			if (!page_ref_freeze(page, 1))
>  				goto keep_locked;
>  			if (PageDirty(page)) {
> +				SetPageSwapBacked(page);
>  				page_ref_unfreeze(page, 1);
>  				goto keep_locked;
>  			}
> --
> 2.23.0
>
>
On 2021/7/11 7:22, Yu Zhao wrote:
> On Sat, Jul 10, 2021 at 4:03 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>>
>> If the MADV_FREE pages are redirtied before they could be reclaimed, put
>> the pages back to anonymous LRU list by setting SwapBacked flag and the
>> pages will be reclaimed in normal swapout way. Otherwise MADV_FREE pages
>> won't be reclaimed as expected.
>>
>> Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
>
> This is not a bug -- the dirty check isn't needed but it was copied
> from __remove_mapping().

Yes, this is not a bug and harmless. When we reach here, page should not be
dirtied because PageDirty is handled above and there is no way to redirty it
again as pagetable references are all gone and it's not in the swap cache.

>
> The page has only one reference left, which is from the isolation.
> After the caller puts the page back on lru and drops the reference,
> the page will be freed anyway. It doesn't matter which lru it goes.

But it looks buggy as it didn't perform the expected ops from code view.
Should I drop the Fixes tag and send a v2 version?

Many thanks for reply!

>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/vmscan.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index a7602f71ec04..6483fe0e2065 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1628,6 +1628,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>>  			if (!page_ref_freeze(page, 1))
>>  				goto keep_locked;
>>  			if (PageDirty(page)) {
>> +				SetPageSwapBacked(page);
>>  				page_ref_unfreeze(page, 1);
>>  				goto keep_locked;
>>  			}
>> --
>> 2.23.0
>>
>>
> .
>
On Sat 10-07-21 18:03:25, Miaohe Lin wrote:
> If the MADV_FREE pages are redirtied before they could be reclaimed, put
> the pages back to anonymous LRU list by setting SwapBacked flag and the
> pages will be reclaimed in normal swapout way. Otherwise MADV_FREE pages
> won't be reclaimed as expected.

Could you describe problem which you are trying to address? What does it
mean that pages won't be reclaimed as expected?

Also why is SetPageSwapBacked in shrink_page_list insufficient?

> Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/vmscan.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a7602f71ec04..6483fe0e2065 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1628,6 +1628,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>  			if (!page_ref_freeze(page, 1))
>  				goto keep_locked;
>  			if (PageDirty(page)) {
> +				SetPageSwapBacked(page);
>  				page_ref_unfreeze(page, 1);
>  				goto keep_locked;
>  			}
> --
> 2.23.0
On 2021/7/12 15:22, Michal Hocko wrote:
> On Sat 10-07-21 18:03:25, Miaohe Lin wrote:
>> If the MADV_FREE pages are redirtied before they could be reclaimed, put
>> the pages back to anonymous LRU list by setting SwapBacked flag and the
>> pages will be reclaimed in normal swapout way. Otherwise MADV_FREE pages
>> won't be reclaimed as expected.
>
> Could you describe problem which you are trying to address? What does it
> mean that pages won't be reclaimed as expected?
>

In fact, this is not a bug and harmless. But it looks buggy as it didn't perform
the expected ops from code view. Lazyfree (MADV_FREE) pages are clean anonymous
pages. They have SwapBacked flag cleared to distinguish normal anonymous pages.
When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
should be put back to anonymous LRU list by setting SwapBacked flag, thus the
pages will be reclaimed in normal swapout way.

Many thanks for review and reply.

> Also why is SetPageSwapBacked in shrink_page_list insufficient?
>
>> Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/vmscan.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index a7602f71ec04..6483fe0e2065 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1628,6 +1628,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>>  			if (!page_ref_freeze(page, 1))
>>  				goto keep_locked;
>>  			if (PageDirty(page)) {
>> +				SetPageSwapBacked(page);
>>  				page_ref_unfreeze(page, 1);
>>  				goto keep_locked;
>>  			}
>> --
>> 2.23.0
>
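The lazyfree state described in the reply above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_page`, the `MODEL_*` bits, and the helper names are hypothetical stand-ins for `struct page`, the `PG_*` flags, and the real madvise/unmap paths. It assumes, as the thread states, that a lazyfree page is a clean anonymous page with SwapBacked cleared, and that a redirtied page gets SwapBacked set again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel page flags discussed in the thread. */
enum model_flag {
	MODEL_ANON       = 1u << 0,
	MODEL_SWAPBACKED = 1u << 1,
	MODEL_DIRTY      = 1u << 2,
};

struct model_page {
	unsigned int flags;
};

/* Same shape as the test in the quoted diff: PageAnon && !PageSwapBacked. */
static inline bool model_is_lazyfree(const struct model_page *p)
{
	return (p->flags & MODEL_ANON) && !(p->flags & MODEL_SWAPBACKED);
}

/* Model of marking a page lazyfree: it becomes clean and loses SwapBacked. */
static inline void model_mark_lazyfree(struct model_page *p)
{
	assert(p->flags & MODEL_ANON);	/* only anonymous pages qualify */
	p->flags &= ~(MODEL_SWAPBACKED | MODEL_DIRTY);
}

/* Model of a redirtied page: it goes back to being a normal anon page. */
static inline void model_redirty(struct model_page *p)
{
	p->flags |= MODEL_DIRTY | MODEL_SWAPBACKED;
}
```

In this model a page stops being lazyfree as soon as it is redirtied, which is the behaviour the changelog argues for; whether that has to happen in shrink_page_list or is already covered by the unmap path is the open question in the thread.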
On Mon, Jul 12, 2021 at 1:12 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> On 2021/7/11 7:22, Yu Zhao wrote:
> > On Sat, Jul 10, 2021 at 4:03 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
> >>
> >> If the MADV_FREE pages are redirtied before they could be reclaimed, put
> >> the pages back to anonymous LRU list by setting SwapBacked flag and the
> >> pages will be reclaimed in normal swapout way. Otherwise MADV_FREE pages
> >> won't be reclaimed as expected.
> >>
> >> Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
> >
> > This is not a bug -- the dirty check isn't needed but it was copied
> > from __remove_mapping().
>
> Yes, this is not a bug and harmless. When we reach here, page should not be
> dirtied because PageDirty is handled above and there is no way to redirty it
> again as pagetable references are all gone and it's not in the swap cache.
>
> >
> > The page has only one reference left, which is from the isolation.
> > After the caller puts the page back on lru and drops the reference,
> > the page will be freed anyway. It doesn't matter which lru it goes.
>
> But it looks buggy as it didn't perform the expected ops from code view.
> Should I drop the Fixes tag and send a v2 version?

I don't understand the logic here -- it looks pretty obvious to me
that, if we want to change anything, we should delete the dirty check,
not add another line that would enforce the belief that the dirty
check is needed.

>
> Many thanks for reply!
>
> >
> >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> >> ---
> >>  mm/vmscan.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >> index a7602f71ec04..6483fe0e2065 100644
> >> --- a/mm/vmscan.c
> >> +++ b/mm/vmscan.c
> >> @@ -1628,6 +1628,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
> >>  			if (!page_ref_freeze(page, 1))
> >>  				goto keep_locked;
> >>  			if (PageDirty(page)) {
> >> +				SetPageSwapBacked(page);
> >>  				page_ref_unfreeze(page, 1);
> >>  				goto keep_locked;
> >>  			}
> >> --
> >> 2.23.0
> >>
> >>
> > .
> >
>
On Mon 12-07-21 19:03:39, Miaohe Lin wrote:
> On 2021/7/12 15:22, Michal Hocko wrote:
> > On Sat 10-07-21 18:03:25, Miaohe Lin wrote:
> >> If the MADV_FREE pages are redirtied before they could be reclaimed, put
> >> the pages back to anonymous LRU list by setting SwapBacked flag and the
> >> pages will be reclaimed in normal swapout way. Otherwise MADV_FREE pages
> >> won't be reclaimed as expected.
> >
> > Could you describe problem which you are trying to address? What does it
> > mean that pages won't be reclaimed as expected?
> >
>
> In fact, this is not a bug and harmless.

Fixes tag is then misleading and the changelog should be more clear
about this as well.

> But it looks buggy as it didn't perform
> the expected ops from code view. Lazyfree (MADV_FREE) pages are clean anonymous
> pages. They have SwapBacked flag cleared to distinguish normal anonymous pages.

yes.

> When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
> should be put back to anonymous LRU list by setting SwapBacked flag, thus the
> pages will be reclaimed in normal swapout way.

Agreed. But the question is why this needs an explicit handling here
when we already do handle this case when trying to unmap the page.

Please make sure to document the behavior you are observing, why it is
not desirable.

> Many thanks for review and reply.
>
> > Also why is SetPageSwapBacked in shrink_page_list insufficient?

Sorry I meant to say try_to_unmap path here

> >> Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
> >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> >> ---
> >>  mm/vmscan.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >> index a7602f71ec04..6483fe0e2065 100644
> >> --- a/mm/vmscan.c
> >> +++ b/mm/vmscan.c
> >> @@ -1628,6 +1628,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
> >>  			if (!page_ref_freeze(page, 1))
> >>  				goto keep_locked;
> >>  			if (PageDirty(page)) {
> >> +				SetPageSwapBacked(page);
> >>  				page_ref_unfreeze(page, 1);
> >>  				goto keep_locked;
> >>  			}
> >> --
> >> 2.23.0
> >
On 2021/7/13 17:30, Michal Hocko wrote:
> On Mon 12-07-21 19:03:39, Miaohe Lin wrote:
>> On 2021/7/12 15:22, Michal Hocko wrote:
>>> On Sat 10-07-21 18:03:25, Miaohe Lin wrote:
>>>> If the MADV_FREE pages are redirtied before they could be reclaimed, put
>>>> the pages back to anonymous LRU list by setting SwapBacked flag and the
>>>> pages will be reclaimed in normal swapout way. Otherwise MADV_FREE pages
>>>> won't be reclaimed as expected.
>>>
>>> Could you describe problem which you are trying to address? What does it
>>> mean that pages won't be reclaimed as expected?
>>>
>>
>> In fact, this is not a bug and harmless.
>
> Fixes tag is then misleading and the changelog should be more clear
> about this as well.

Sure.

>
>> But it looks buggy as it didn't perform
>> the expected ops from code view. Lazyfree (MADV_FREE) pages are clean anonymous
>> pages. They have SwapBacked flag cleared to distinguish normal anonymous pages.
>
> yes.
>
>> When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
>> should be put back to anonymous LRU list by setting SwapBacked flag, thus the
>> pages will be reclaimed in normal swapout way.
>
> Agreed. But the question is why this needs an explicit handling here
> when we already do handle this case when trying to unmap the page.

This makes me think more. It seems even the page_ref_freeze call is guaranteed to
success as no one can grab the page refcnt after the page is successfully unmapped.
Does the change below makes sense for you?

Many Thanks.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6e26b3c93242..c31925320b33 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1624,15 +1624,11 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 		}

 		if (PageAnon(page) && !PageSwapBacked(page)) {
-			/* follow __remove_mapping for reference */
-			if (!page_ref_freeze(page, 1))
-				goto keep_locked;
-			if (PageDirty(page)) {
-				SetPageSwapBacked(page);
-				page_ref_unfreeze(page, 1);
-				goto keep_locked;
-			}
-
+			/*
+			 * No one can grab the page refcnt or redirty the page
+			 * after the page is successfully unmapped.
+			 */
+			WARN_ON_ONCE(!page_ref_freeze(page, 1));
 			count_vm_event(PGLAZYFREED);
 			count_memcg_page_event(page, PGLAZYFREED);
 		} else if (!mapping || !__remove_mapping(mapping, page, true,

> Please make sure to document the behavior you are observing, why it is
> not desirable.
>
>> Many thanks for review and reply.
>>
>>> Also why is SetPageSwapBacked in shrink_page_list insufficient?
>
> Sorry I meant to say try_to_unmap path here
>
>>>> Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>>> ---
>>>>  mm/vmscan.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index a7602f71ec04..6483fe0e2065 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -1628,6 +1628,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>>>>  			if (!page_ref_freeze(page, 1))
>>>>  				goto keep_locked;
>>>>  			if (PageDirty(page)) {
>>>> +				SetPageSwapBacked(page);
>>>>  				page_ref_unfreeze(page, 1);
>>>>  				goto keep_locked;
>>>>  			}
>>>> --
>>>> 2.23.0
>>>
>
On 2021/7/13 15:25, Yu Zhao wrote:
> On Mon, Jul 12, 2021 at 1:12 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>>
>> On 2021/7/11 7:22, Yu Zhao wrote:
>>> On Sat, Jul 10, 2021 at 4:03 AM Miaohe Lin <linmiaohe@huawei.com> wrote:
>>>>
>>>> If the MADV_FREE pages are redirtied before they could be reclaimed, put
>>>> the pages back to anonymous LRU list by setting SwapBacked flag and the
>>>> pages will be reclaimed in normal swapout way. Otherwise MADV_FREE pages
>>>> won't be reclaimed as expected.
>>>>
>>>> Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
>>>
>>> This is not a bug -- the dirty check isn't needed but it was copied
>>> from __remove_mapping().
>>
>> Yes, this is not a bug and harmless. When we reach here, page should not be
>> dirtied because PageDirty is handled above and there is no way to redirty it
>> again as pagetable references are all gone and it's not in the swap cache.
>>
>>>
>>> The page has only one reference left, which is from the isolation.
>>> After the caller puts the page back on lru and drops the reference,
>>> the page will be freed anyway. It doesn't matter which lru it goes.
>>
>> But it looks buggy as it didn't perform the expected ops from code view.
>> Should I drop the Fixes tag and send a v2 version?
>
> I don't understand the logic here -- it looks pretty obvious to me
> that, if we want to change anything, we should delete the dirty check,
> not add another line that would enforce the belief that the dirty
> check is needed.
>

The dirty check could be removed even with the page_ref_freeze check because
no one can grab the page refcnt after the page is successfully unmapped.
Does the change below makes sense for you?

Many Thanks.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6e26b3c93242..c31925320b33 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1624,15 +1624,11 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 		}

 		if (PageAnon(page) && !PageSwapBacked(page)) {
-			/* follow __remove_mapping for reference */
-			if (!page_ref_freeze(page, 1))
-				goto keep_locked;
-			if (PageDirty(page)) {
-				SetPageSwapBacked(page);
-				page_ref_unfreeze(page, 1);
-				goto keep_locked;
-			}
-
+			/*
+			 * No one can grab the page refcnt or redirty the page
+			 * after the page is successfully unmapped.
+			 */
+			WARN_ON_ONCE(!page_ref_freeze(page, 1));
 			count_vm_event(PGLAZYFREED);
 			count_memcg_page_event(page, PGLAZYFREED);
 		} else if (!mapping || !__remove_mapping(mapping, page, true,

>>
>> Many thanks for reply!
>>
>>>
>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>>>> ---
>>>>  mm/vmscan.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index a7602f71ec04..6483fe0e2065 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -1628,6 +1628,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>>>>  			if (!page_ref_freeze(page, 1))
>>>>  				goto keep_locked;
>>>>  			if (PageDirty(page)) {
>>>> +				SetPageSwapBacked(page);
>>>>  				page_ref_unfreeze(page, 1);
>>>>  				goto keep_locked;
>>>>  			}
>>>> --
>>>> 2.23.0
>>>>
>>>>
>>> .
>>>
>>
> .
>
On Tue, Jul 13, 2021 at 09:13:51PM +0800, Miaohe Lin wrote:
> >> When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
> >> should be put back to anonymous LRU list by setting SwapBacked flag, thus the
> >> pages will be reclaimed in normal swapout way.
> >
> > Agreed. But the question is why this needs an explicit handling here
> > when we already do handle this case when trying to unmap the page.
>
> This makes me think more. It seems even the page_ref_freeze call is guaranteed to
> success as no one can grab the page refcnt after the page is successfully unmapped.

NO! This is wrong. Every page can have its refcount speculatively raised
(and then lowered). The two prime candidates for this are lockless GUP
and page cache lookups, but there can be others too.
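The point about speculative references can be sketched in userspace with C11 atomics. This is a model under stated assumptions, not the kernel implementation: `struct model_page` and the `model_*` helpers are hypothetical names, with `model_ref_freeze()` standing in for `page_ref_freeze()` (a compare-and-exchange from the expected count to zero) and `model_get_unless_zero()` standing in for a speculative reference that may be taken on any page whose count is non-zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct model_page {
	atomic_int refcount;
};

/* Freeze succeeds only if the count is exactly 'count'; it then becomes 0. */
static inline bool model_ref_freeze(struct model_page *p, int count)
{
	int expected = count;

	return atomic_compare_exchange_strong(&p->refcount, &expected, 0);
}

/* Speculative reference: take one unless the count is already zero. */
static inline bool model_get_unless_zero(struct model_page *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded and the loop retries. */
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously taken. */
static inline void model_put(struct model_page *p)
{
	int old = atomic_fetch_sub(&p->refcount, 1);

	assert(old > 0);
	(void)old;
}
```

With one reference held by reclaim, a single speculative `model_get_unless_zero()` from elsewhere is enough to make `model_ref_freeze(p, 1)` fail, which is why the freeze cannot be turned into a WARN_ON_ONCE in this model. Once the count is frozen at zero, later speculative gets fail instead.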
On 2021/7/13 21:34, Matthew Wilcox wrote:
> On Tue, Jul 13, 2021 at 09:13:51PM +0800, Miaohe Lin wrote:
>>>> When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
>>>> should be put back to anonymous LRU list by setting SwapBacked flag, thus the
>>>> pages will be reclaimed in normal swapout way.
>>>
>>> Agreed. But the question is why this needs an explicit handling here
>>> when we already do handle this case when trying to unmap the page.
>>
>> This makes me think more. It seems even the page_ref_freeze call is guaranteed to
>> success as no one can grab the page refcnt after the page is successfully unmapped.
>
> NO! This is wrong. Every page can have its refcount speculatively raised
> (and then lowered). The two prime candidates for this are lockless GUP
> and page cache lookups, but there can be others too.
>

Many thanks for pointing this out. My overlook! Sorry!
So, it seems lockless GUP can redirty the MADV_FREE page. But is it ok to just release
a redirtied MADV_FREE pages? Because we hold the last reference here and the page will
be freed anyway...

> .
>
On Wed, Jul 14, 2021 at 07:36:57PM +0800, Miaohe Lin wrote:
> On 2021/7/13 21:34, Matthew Wilcox wrote:
> > On Tue, Jul 13, 2021 at 09:13:51PM +0800, Miaohe Lin wrote:
> >>>> When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
> >>>> should be put back to anonymous LRU list by setting SwapBacked flag, thus the
> >>>> pages will be reclaimed in normal swapout way.
> >>>
> >>> Agreed. But the question is why this needs an explicit handling here
> >>> when we already do handle this case when trying to unmap the page.
> >>
> >> This makes me think more. It seems even the page_ref_freeze call is guaranteed to
> >> success as no one can grab the page refcnt after the page is successfully unmapped.
> >
> > NO! This is wrong. Every page can have its refcount speculatively raised
> > (and then lowered). The two prime candidates for this are lockless GUP
> > and page cache lookups, but there can be others too.
> >
>
> Many thanks for pointing this out. My overlook! Sorry!
> So, it seems lockless GUP can redirty the MADV_FREE page. But is it ok to just release
> a redirtied MADV_FREE pages? Because we hold the last reference here and the page will
> be freed anyway...

I don't see how lockless GUP can redirty the page. It can grab the
refcount, thus making the refcount here two. Then the call to freeze
here fails and the page stays on the list. But the lockless GUP checks
the page is still in the page table (and discovers it isn't, so releases
the reference count). Am I missing a path that lets lockless GUP dirty
the page?
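The grab-then-recheck sequence described above can be modelled in a few lines of userspace C. This is a sketch, not the kernel's gup-fast code: `struct model_pte`, `model_gup_fast()`, `model_unmap()`, and the `race_hook` callback are all hypothetical, and the hook exists only so a single-threaded test can inject an unmap between taking the reference and rechecking the page table entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; a "pte" here is just an atomic pointer to a page. */
struct model_page {
	atomic_int refcount;
};

struct model_pte {
	struct model_page *_Atomic page;
};

/* Test-only hook, called after the reference is taken, to simulate a race. */
typedef void (*race_hook)(struct model_pte *pte);

/* Simulated unmap: the page table entry no longer points at the page. */
static inline void model_unmap(struct model_pte *pte)
{
	atomic_store(&pte->page, (struct model_page *)NULL);
}

/* Lockless lookup: read the entry, take a ref, then recheck the entry. */
static inline struct model_page *model_gup_fast(struct model_pte *pte,
						race_hook hook)
{
	struct model_page *page = atomic_load(&pte->page);
	int old;

	if (page == NULL)
		return NULL;

	/* Speculative reference: fails if the count is already zero. */
	old = atomic_load(&page->refcount);
	do {
		if (old == 0)
			return NULL;
	} while (!atomic_compare_exchange_weak(&page->refcount, &old, old + 1));

	if (hook != NULL)
		hook(pte);

	/* Entry changed under us: drop the reference and report failure. */
	if (atomic_load(&pte->page) != page) {
		int prev = atomic_fetch_sub(&page->refcount, 1);

		assert(prev > 0);
		(void)prev;
		return NULL;
	}
	return page;
}
```

In the racing case the reference count is raised only transiently: the lookup returns NULL and the count is back to its previous value, so the caller never gets a page it could dirty through this path.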
On 7/14/21 4:48 AM, Matthew Wilcox wrote:
> On Wed, Jul 14, 2021 at 07:36:57PM +0800, Miaohe Lin wrote:
>> On 2021/7/13 21:34, Matthew Wilcox wrote:
>>> On Tue, Jul 13, 2021 at 09:13:51PM +0800, Miaohe Lin wrote:
>>>>>> When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
>>>>>> should be put back to anonymous LRU list by setting SwapBacked flag, thus the
>>>>>> pages will be reclaimed in normal swapout way.
>>>>>
>>>>> Agreed. But the question is why this needs an explicit handling here
>>>>> when we already do handle this case when trying to unmap the page.
>>>>
>>>> This makes me think more. It seems even the page_ref_freeze call is guaranteed to
>>>> success as no one can grab the page refcnt after the page is successfully unmapped.
>>>
>>> NO! This is wrong. Every page can have its refcount speculatively raised
>>> (and then lowered). The two prime candidates for this are lockless GUP
>>> and page cache lookups, but there can be others too.
>>>
>>
>> Many thanks for pointing this out. My overlook! Sorry!
>> So, it seems lockless GUP can redirty the MADV_FREE page. But is it ok to just release
>> a redirtied MADV_FREE pages? Because we hold the last reference here and the page will
>> be freed anyway...
>
> I don't see how lockless GUP can redirty the page. It can grab the
> refcount, thus making the refcount here two. Then the call to freeze
> here fails and the page stays on the list. But the lockless GUP checks
> the page is still in the page table (and discovers it isn't, so releases
> the reference count). Am I missing a path that lets lockless GUP dirty
> the page?
>

If a device driver pins some pages using gup, and the device then uses dma
to write to those pages, then you could get there. That story is part of the
reasoning that led to creating pin_user_pages(), which btw does not yet
fully solve that case.

Basically, though, unless a non-CPU device has access to the page, it's
hard to see how gup itself can lead to a page getting dirtied.

thanks,
On 2021/7/15 3:43, John Hubbard wrote:
> On 7/14/21 4:48 AM, Matthew Wilcox wrote:
>> On Wed, Jul 14, 2021 at 07:36:57PM +0800, Miaohe Lin wrote:
>>> On 2021/7/13 21:34, Matthew Wilcox wrote:
>>>> On Tue, Jul 13, 2021 at 09:13:51PM +0800, Miaohe Lin wrote:
>>>>>>> When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
>>>>>>> should be put back to anonymous LRU list by setting SwapBacked flag, thus the
>>>>>>> pages will be reclaimed in normal swapout way.
>>>>>>
>>>>>> Agreed. But the question is why this needs an explicit handling here
>>>>>> when we already do handle this case when trying to unmap the page.
>>>>>
>>>>> This makes me think more. It seems even the page_ref_freeze call is guaranteed to
>>>>> success as no one can grab the page refcnt after the page is successfully unmapped.
>>>>
>>>> NO! This is wrong. Every page can have its refcount speculatively raised
>>>> (and then lowered). The two prime candidates for this are lockless GUP
>>>> and page cache lookups, but there can be others too.
>>>>
>>>
>>> Many thanks for pointing this out. My overlook! Sorry!
>>> So, it seems lockless GUP can redirty the MADV_FREE page. But is it ok to just release
>>> a redirtied MADV_FREE pages? Because we hold the last reference here and the page will
>>> be freed anyway...
>>
>> I don't see how lockless GUP can redirty the page. It can grab the
>> refcount, thus making the refcount here two. Then the call to freeze
>> here fails and the page stays on the list. But the lockless GUP checks
>> the page is still in the page table (and discovers it isn't, so releases
>> the reference count). Am I missing a path that lets lockless GUP dirty
>> the page?
>>
>
> If a device driver pins some pages using gup, and the device then uses dma
> to write to those pages, then you could get there. That story is part of the
> reasoning that led to creating pin_user_pages(), which btw does not yet
> fully solve that case.

Many thanks for your explanation.
So the similar scenario that is clarified in the __remove_mapping() is possible:

	get_user_pages(&page);
	[user mapping goes away]
	write_to(page);
				!PageDirty(page)    [good]
	SetPageDirty(page);
	put_page(page);
				!page_count(page)   [good, discard it]

	[oops, our write_to data is lost]

The page can be redirtied after the page is unmapped. And there is no way to
restore the page table as clean MADV_FREE page is simply cleared from page
table via the try_to_unmap path. Is it ok to just release the redirtied
MADV_FREE pages here as we hold the last reference and the page will be
freed anyway... ?

>
> Basically, though, unless a non-CPU device has access to the page, it's
> hard to see how gup itself can lead to a page getting dirtied.
>
> thanks,
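The ordering that this scenario motivates (freeze the refcount first, only then look at the dirty bit) can be sketched as a userspace model. It mirrors the shape of the existing lazyfree branch quoted in the patch context, without the proposed SetPageSwapBacked line; `struct model_page`, `model_reclaim_lazyfree()`, and the `MODEL_*` results are hypothetical names, and the dirty flag here stands in for PageDirty.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: a refcount plus a dirty bit. */
struct model_page {
	atomic_int refcount;
	atomic_bool dirty;
};

enum model_result {
	MODEL_KEEP,	/* models "goto keep_locked" */
	MODEL_DISCARD,	/* page would be freed */
};

/* Freeze first, then check dirty, as in the quoted lazyfree branch. */
static inline enum model_result model_reclaim_lazyfree(struct model_page *p)
{
	int expected = 1;

	assert(p != NULL);

	/* Models page_ref_freeze(page, 1): fails if anyone else holds a ref. */
	if (!atomic_compare_exchange_strong(&p->refcount, &expected, 0))
		return MODEL_KEEP;

	if (atomic_load(&p->dirty)) {
		/* Models page_ref_unfreeze(page, 1). */
		atomic_store(&p->refcount, 1);
		return MODEL_KEEP;
	}
	return MODEL_DISCARD;
}
```

In this model, while a pin holder still has its reference the freeze fails and the page is kept. If the holder dirties the page and then drops the reference, the freeze succeeds but the dirty check still keeps the page. Checking dirty before freezing would leave the window shown in the scenario above.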
On 7/15/21 4:30 AM, Miaohe Lin wrote:
...
>>>> So, it seems lockless GUP can redirty the MADV_FREE page. But is it ok to just release
>>>> a redirtied MADV_FREE pages? Because we hold the last reference here and the page will
>>>> be freed anyway...
>>>
>>> I don't see how lockless GUP can redirty the page. It can grab the
>>> refcount, thus making the refcount here two. Then the call to freeze
>>> here fails and the page stays on the list. But the lockless GUP checks
>>> the page is still in the page table (and discovers it isn't, so releases
>>> the reference count). Am I missing a path that lets lockless GUP dirty
>>> the page?
>>>
>>
>> If a device driver pins some pages using gup, and the device then uses dma
>> to write to those pages, then you could get there. That story is part of the
>> reasoning that led to creating pin_user_pages(), which btw does not yet
>> fully solve that case.
>
> Many thanks for your explanation.
> So the similar scenario that is clarified in the __remove_mapping() is possible:

I probably should have added that the scenario I was describing is broken even
before any patches that you might apply here. I was just trying to ensure that
the complete list of scenarios was known.

thanks,
On 2021/7/16 8:01, John Hubbard wrote:
> On 7/15/21 4:30 AM, Miaohe Lin wrote:
> ...
>>>>> So, it seems lockless GUP can redirty the MADV_FREE page. But is it ok to just release
>>>>> a redirtied MADV_FREE pages? Because we hold the last reference here and the page will
>>>>> be freed anyway...
>>>>
>>>> I don't see how lockless GUP can redirty the page. It can grab the
>>>> refcount, thus making the refcount here two. Then the call to freeze
>>>> here fails and the page stays on the list. But the lockless GUP checks
>>>> the page is still in the page table (and discovers it isn't, so releases
>>>> the reference count). Am I missing a path that lets lockless GUP dirty
>>>> the page?
>>>>
>>>
>>> If a device driver pins some pages using gup, and the device then uses dma
>>> to write to those pages, then you could get there. That story is part of the
>>> reasoning that led to creating pin_user_pages(), which btw does not yet
>>> fully solve that case.
>>
>> Many thanks for your explanation.
>> So the similar scenario that is clarified in the __remove_mapping() is possible:
>
> I probably should have added that the scenario I was describing is broken even
> before any patches that you might apply here. I was just trying to ensure that
> the complete list of scenarios was known.
>

Many thanks for doing this! :)

>
>
> thanks,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a7602f71ec04..6483fe0e2065 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1628,6 +1628,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			if (!page_ref_freeze(page, 1))
 				goto keep_locked;
 			if (PageDirty(page)) {
+				SetPageSwapBacked(page);
 				page_ref_unfreeze(page, 1);
 				goto keep_locked;
 			}
If the MADV_FREE pages are redirtied before they could be reclaimed, put
the pages back to anonymous LRU list by setting SwapBacked flag and the
pages will be reclaimed in normal swapout way. Otherwise MADV_FREE pages
won't be reclaimed as expected.

Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/vmscan.c | 1 +
 1 file changed, 1 insertion(+)