Message ID | 20220425132723.34824-3-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | A few cleanup and fixup patches for migration | expand |
On 25.04.22 15:27, Miaohe Lin wrote: > When non-lru movable page was freed from under us, __ClearPageMovable must > have been done. Even if it's not done, ClearPageIsolated here won't hurt > as page will be freed anyway. So we can thus remove unneeded lock page and > PageMovable check here. > > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> > --- > mm/migrate.c | 8 ++------ > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/mm/migrate.c b/mm/migrate.c > index b779646665fe..0fc4651b3e39 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1093,12 +1093,8 @@ static int unmap_and_move(new_page_t get_new_page, > /* page was freed from under us. So we are done. */ > ClearPageActive(page); > ClearPageUnevictable(page); > - if (unlikely(__PageMovable(page))) { > - lock_page(page); > - if (!PageMovable(page)) > - ClearPageIsolated(page); > - unlock_page(page); > - } > + if (unlikely(__PageMovable(page))) > + ClearPageIsolated(page); > goto out; > } Hm, that code+change raises a couple of questions. We're doing here the same as in putback_movable_pages(). So I guess the difference here is that the caller did release the reference while the page was isolated, while we don't assume the same in putback_movable_pages(). Shouldn't whoever owned the page have cleared that? IOW, is it even valid that we see a movable or isolated page here (WARN/BUG?)? At least for balloon compaction, I remember that __PageMovable() is properly cleared before freeing it via balloon_page_delete(). Also, I am not sure how reliable that page count check is here: if we'd have another speculative reference to the page, we might see "page_count(page) > 1" and not take that path, although the previous owner released the last reference.
On 2022/4/29 18:07, David Hildenbrand wrote: > On 25.04.22 15:27, Miaohe Lin wrote: >> When non-lru movable page was freed from under us, __ClearPageMovable must >> have been done. Even if it's not done, ClearPageIsolated here won't hurt >> as page will be freed anyway. So we can thus remove unneeded lock page and >> PageMovable check here. >> >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >> Reviewed-by: Christoph Hellwig <hch@lst.de> >> --- >> mm/migrate.c | 8 ++------ >> 1 file changed, 2 insertions(+), 6 deletions(-) >> >> diff --git a/mm/migrate.c b/mm/migrate.c >> index b779646665fe..0fc4651b3e39 100644 >> --- a/mm/migrate.c >> +++ b/mm/migrate.c >> @@ -1093,12 +1093,8 @@ static int unmap_and_move(new_page_t get_new_page, >> /* page was freed from under us. So we are done. */ >> ClearPageActive(page); >> ClearPageUnevictable(page); >> - if (unlikely(__PageMovable(page))) { >> - lock_page(page); >> - if (!PageMovable(page)) >> - ClearPageIsolated(page); >> - unlock_page(page); >> - } >> + if (unlikely(__PageMovable(page))) >> + ClearPageIsolated(page); >> goto out; >> } > > Hm, that code+change raises a couple of questions. > > We're doing here the same as in putback_movable_pages(). So I guess the > difference here is that the caller did release the reference while the > page was isolated, while we don't assume the same in > putback_movable_pages(). Agree. > > > Shouldn't whoever owned the page have cleared that? IOW, is it even > valid that we see a movable or isolated page here (WARN/BUG?)? > > At least for balloon compaction, I remember that __PageMovable() is > properly cleared before freeing it via balloon_page_delete(). z3fold, zsmalloc will do __ClearPageMovable when the page is going to be released. So I think we shouldn't see a movable page here: void __ClearPageMovable(struct page *page) { VM_BUG_ON_PAGE(!PageMovable(page), page); /* * Clear registered address_space val with keeping PAGE_MAPPING_MOVABLE * flag so that VM can catch up released page by driver after isolation. * With it, VM migration doesn't try to put it back. */ page->mapping = (void *)((unsigned long)page->mapping & PAGE_MAPPING_MOVABLE); } But it seems there is no guarantee for PageIsolated flag. Or am I miss something? > > > Also, I am not sure how reliable that page count check is here: if we'd > have another speculative reference to the page, we might see > "page_count(page) > 1" and not take that path, although the previous > owner released the last reference. IIUC, there should not be such speculative reference. The driver should have taken care of it. Thanks! > >
On 09.05.22 10:51, Miaohe Lin wrote: > On 2022/4/29 18:07, David Hildenbrand wrote: >> On 25.04.22 15:27, Miaohe Lin wrote: >>> When non-lru movable page was freed from under us, __ClearPageMovable must >>> have been done. Even if it's not done, ClearPageIsolated here won't hurt >>> as page will be freed anyway. So we can thus remove unneeded lock page and >>> PageMovable check here. >>> >>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>> Reviewed-by: Christoph Hellwig <hch@lst.de> >>> --- >>> mm/migrate.c | 8 ++------ >>> 1 file changed, 2 insertions(+), 6 deletions(-) >>> >>> diff --git a/mm/migrate.c b/mm/migrate.c >>> index b779646665fe..0fc4651b3e39 100644 >>> --- a/mm/migrate.c >>> +++ b/mm/migrate.c >>> @@ -1093,12 +1093,8 @@ static int unmap_and_move(new_page_t get_new_page, >>> /* page was freed from under us. So we are done. */ >>> ClearPageActive(page); >>> ClearPageUnevictable(page); >>> - if (unlikely(__PageMovable(page))) { >>> - lock_page(page); >>> - if (!PageMovable(page)) >>> - ClearPageIsolated(page); >>> - unlock_page(page); >>> - } >>> + if (unlikely(__PageMovable(page))) >>> + ClearPageIsolated(page); >>> goto out; >>> } >> >> Hm, that code+change raises a couple of questions. >> >> We're doing here the same as in putback_movable_pages(). So I guess the >> difference here is that the caller did release the reference while the >> page was isolated, while we don't assume the same in >> putback_movable_pages(). > > Agree. > >> >> >> Shouldn't whoever owned the page have cleared that? IOW, is it even >> valid that we see a movable or isolated page here (WARN/BUG?)? >> >> At least for balloon compaction, I remember that __PageMovable() is >> properly cleared before freeing it via balloon_page_delete(). > > z3fold, zsmalloc will do __ClearPageMovable when the page is going to be released. > So I think we shouldn't see a movable page here: > > void __ClearPageMovable(struct page *page) > { > VM_BUG_ON_PAGE(!PageMovable(page), page); > /* > * Clear registered address_space val with keeping PAGE_MAPPING_MOVABLE > * flag so that VM can catch up released page by driver after isolation. > * With it, VM migration doesn't try to put it back. > */ > page->mapping = (void *)((unsigned long)page->mapping & > PAGE_MAPPING_MOVABLE); > } > > But it seems there is no guarantee for PageIsolated flag. Or am I miss something? At least the code we have now: if (unlikely(__PageMovable(page))) ClearPageIsolated(page); Should be dead code. So PG_isolated could remain set. If PG_isolated is still set, it will get cleared in the buddy when freeing the page via page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; > >> >> >> Also, I am not sure how reliable that page count check is here: if we'd >> have another speculative reference to the page, we might see >> "page_count(page) > 1" and not take that path, although the previous >> owner released the last reference. > > IIUC, there should not be such speculative reference. The driver should have taken care > of it. How can you prevent any kind of speculative references? See isolate_movable_page() as an example, which grabs a speculative reference to then find out that the page is already isolated by someone else, to then back off.
On 2022/5/11 23:23, David Hildenbrand wrote: > On 09.05.22 10:51, Miaohe Lin wrote: >> On 2022/4/29 18:07, David Hildenbrand wrote: snip >> >> z3fold, zsmalloc will do __ClearPageMovable when the page is going to be released. >> So I think we shouldn't see a movable page here: >> >> void __ClearPageMovable(struct page *page) >> { >> VM_BUG_ON_PAGE(!PageMovable(page), page); >> /* >> * Clear registered address_space val with keeping PAGE_MAPPING_MOVABLE >> * flag so that VM can catch up released page by driver after isolation. >> * With it, VM migration doesn't try to put it back. >> */ >> page->mapping = (void *)((unsigned long)page->mapping & >> PAGE_MAPPING_MOVABLE); >> } >> >> But it seems there is no guarantee for PageIsolated flag. Or am I miss something? > > At least the code we have now: > > if (unlikely(__PageMovable(page))) > ClearPageIsolated(page); > > Should be dead code. So PG_isolated could remain set. > > If PG_isolated is still set, it will get cleared in the buddy when > freeing the page via > > page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; Yes, check_free_page only complains about flags belonging to PAGE_FLAGS_CHECK_AT_FREE and PG_isolated will be cleared in the buddy when freeing the page. But it might not be a good idea to reply on this ? IMHO, it should be better to clear the PG_isolated explicitly ourselves. > >> >>> >>> >>> Also, I am not sure how reliable that page count check is here: if we'd >>> have another speculative reference to the page, we might see >>> "page_count(page) > 1" and not take that path, although the previous >>> owner released the last reference. >> >> IIUC, there should not be such speculative reference. The driver should have taken care >> of it. > > How can you prevent any kind of speculative references? > > See isolate_movable_page() as an example, which grabs a speculative > reference to then find out that the page is already isolated by someone > else, to then back off. You're right. isolate_movable_page will be an speculative references case. But the page count check here is just an optimization. If we encounter speculative references, it still works with useless effort of migrating to be released page. Thanks! >
>> If PG_isolated is still set, it will get cleared in the buddy when >> freeing the page via >> >> page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; > > Yes, check_free_page only complains about flags belonging to PAGE_FLAGS_CHECK_AT_FREE and PG_isolated > will be cleared in the buddy when freeing the page. But it might not be a good idea to reply on this ? > IMHO, it should be better to clear the PG_isolated explicitly ourselves. I think we can pretty much rely on this handling in the buddy :) > >> >>> >>>> >>>> >>>> Also, I am not sure how reliable that page count check is here: if we'd >>>> have another speculative reference to the page, we might see >>>> "page_count(page) > 1" and not take that path, although the previous >>>> owner released the last reference. >>> >>> IIUC, there should not be such speculative reference. The driver should have taken care >>> of it. >> >> How can you prevent any kind of speculative references? >> >> See isolate_movable_page() as an example, which grabs a speculative >> reference to then find out that the page is already isolated by someone >> else, to then back off. > > You're right. isolate_movable_page will be an speculative references case. But the page count check here > is just an optimization. If we encounter speculative references, it still works with useless effort of > migrating to be released page. Not really. The issue is that PAGE_FLAGS_CHECK_AT_FREE contains PG_active and PG_unevictable. We only clear those 2 flags if "page_count(page) == 1". Consequently, with a speculative reference, we'll run into the check_free_page_bad() when dropping the last reference. This is just shaky. Special casing on "page_count(page) == 1" for detecting "was this freed by the owner" is not 100% water proof. In an ideal world, we'd just get rid of that whole block of code and let the actual freeing code clear PG_active and PG_unevictable. But that would require changes to free_pages_prepare(). Now I do wonder, if we ever even have PG_active or PG_unevictable still set when the page was freed by the owner in this code. IOW, maybe that is dead code as well and we can just remove the whole shaky "page_count(page) == 1" code block. Ccing Minchan, who added clearing of the pageflags at that point.
On 2022/5/12 15:10, David Hildenbrand wrote: >>> If PG_isolated is still set, it will get cleared in the buddy when >>> freeing the page via >>> >>> page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; >> >> Yes, check_free_page only complains about flags belonging to PAGE_FLAGS_CHECK_AT_FREE and PG_isolated >> will be cleared in the buddy when freeing the page. But it might not be a good idea to reply on this ? >> IMHO, it should be better to clear the PG_isolated explicitly ourselves. > > I think we can pretty much rely on this handling in the buddy :) So is the below code change what you're suggesting? if (page_count(page) == 1) { /* page was freed from under us. So we are done. */ ClearPageActive(page); ClearPageUnevictable(page); - if (unlikely(__PageMovable(page))) - ClearPageIsolated(page); goto out; } > >> >>> >>>> >>>>> >>>>> >>>>> Also, I am not sure how reliable that page count check is here: if we'd >>>>> have another speculative reference to the page, we might see >>>>> "page_count(page) > 1" and not take that path, although the previous >>>>> owner released the last reference. >>>> >>>> IIUC, there should not be such speculative reference. The driver should have taken care >>>> of it. >>> >>> How can you prevent any kind of speculative references? >>> >>> See isolate_movable_page() as an example, which grabs a speculative >>> reference to then find out that the page is already isolated by someone >>> else, to then back off. >> >> You're right. isolate_movable_page will be an speculative references case. But the page count check here >> is just an optimization. If we encounter speculative references, it still works with useless effort of >> migrating to be released page. > > > Not really. The issue is that PAGE_FLAGS_CHECK_AT_FREE contains > PG_active and PG_unevictable. > > We only clear those 2 flags if "page_count(page) == 1". Consequently, > with a speculative reference, we'll run into the check_free_page_bad() > when dropping the last reference. It seems if a speculative reference happens after the "page_count(page) == 1" check, it's ok because we cleared the PG_active and PG_unevictable. And if it happens before the check, this code block is skipped and the page will be freed after migration. The PG_active and PG_unevictable will be correctly cleared when page is actually freed via __folio_clear_active. (Please see below comment) > > This is just shaky. Special casing on "page_count(page) == 1" for > detecting "was this freed by the owner" is not 100% water proof. > > In an ideal world, we'd just get rid of that whole block of code and let > the actual freeing code clear PG_active and PG_unevictable. But that > would require changes to free_pages_prepare(). > > > Now I do wonder, if we ever even have PG_active or PG_unevictable still > set when the page was freed by the owner in this code. IOW, maybe that > is dead code as well and we can just remove the whole shaky > "page_count(page) == 1" code block. Think about below common scene: Anonymous page is actively used by the sole owner process, so it will have PG_active set. Then process exited while vm tries to migrate that page. So the page should have refcnt == 1 while PG_active is set? Note normally PG_active should be cleared when the page is released: __put_single_page PageLRU __clear_page_lru_flags __folio_clear_active __folio_clear_unevictable But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think this code block works. Or am I miss something again? Thanks! > > Ccing Minchan, who added clearing of the pageflags at that point. >
On 12.05.22 15:26, Miaohe Lin wrote: > On 2022/5/12 15:10, David Hildenbrand wrote: >>>> If PG_isolated is still set, it will get cleared in the buddy when >>>> freeing the page via >>>> >>>> page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; >>> >>> Yes, check_free_page only complains about flags belonging to PAGE_FLAGS_CHECK_AT_FREE and PG_isolated >>> will be cleared in the buddy when freeing the page. But it might not be a good idea to reply on this ? >>> IMHO, it should be better to clear the PG_isolated explicitly ourselves. >> >> I think we can pretty much rely on this handling in the buddy :) > > So is the below code change what you're suggesting? > > if (page_count(page) == 1) { > /* page was freed from under us. So we are done. */ > ClearPageActive(page); > ClearPageUnevictable(page); > - if (unlikely(__PageMovable(page))) > - ClearPageIsolated(page); > goto out; > } Yeah, unless I am missing something important :) >> >>> >>>> >>>>> >>>>>> >>>>>> >>>>>> Also, I am not sure how reliable that page count check is here: if we'd >>>>>> have another speculative reference to the page, we might see >>>>>> "page_count(page) > 1" and not take that path, although the previous >>>>>> owner released the last reference. >>>>> >>>>> IIUC, there should not be such speculative reference. The driver should have taken care >>>>> of it. >>>> >>>> How can you prevent any kind of speculative references? >>>> >>>> See isolate_movable_page() as an example, which grabs a speculative >>>> reference to then find out that the page is already isolated by someone >>>> else, to then back off. >>> >>> You're right. isolate_movable_page will be an speculative references case. But the page count check here >>> is just an optimization. If we encounter speculative references, it still works with useless effort of >>> migrating to be released page. >> >> >> Not really. The issue is that PAGE_FLAGS_CHECK_AT_FREE contains >> PG_active and PG_unevictable. >> >> We only clear those 2 flags if "page_count(page) == 1". Consequently, >> with a speculative reference, we'll run into the check_free_page_bad() >> when dropping the last reference. > > It seems if a speculative reference happens after the "page_count(page) == 1" check, > it's ok because we cleared the PG_active and PG_unevictable. And if it happens before > the check, this code block is skipped and the page will be freed after migration. The > PG_active and PG_unevictable will be correctly cleared when page is actually freed via > __folio_clear_active. (Please see below comment) > >> >> This is just shaky. Special casing on "page_count(page) == 1" for >> detecting "was this freed by the owner" is not 100% water proof. >> >> In an ideal world, we'd just get rid of that whole block of code and let >> the actual freeing code clear PG_active and PG_unevictable. But that >> would require changes to free_pages_prepare(). >> >> >> Now I do wonder, if we ever even have PG_active or PG_unevictable still >> set when the page was freed by the owner in this code. IOW, maybe that >> is dead code as well and we can just remove the whole shaky >> "page_count(page) == 1" code block. > > Think about below common scene: Anonymous page is actively used by the sole owner process, so it > will have PG_active set. Then process exited while vm tries to migrate that page. So the page > should have refcnt == 1 while PG_active is set? Note normally PG_active should be cleared when > the page is released: > > __put_single_page > PageLRU > __clear_page_lru_flags > __folio_clear_active > __folio_clear_unevictable > > But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags > won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think > this code block works. Or am I miss something again? Let's assume the following: page as freed by the owner and we enter unmap_and_move(). #1: enter unmap_and_move() // page_count is 1 #2: enter isolate_movable_page() // page_count is 1 #2: get_page_unless_zero() // page_count is now 2 #1: if (page_count(page) == 1) { // does not trigger #2: put_page(page); // page_count is now 1 #1: put_page(page); // page_count is now 0 -> freed #1 will trigger __put_page() -> __put_single_page() -> __page_cache_release() will not clear the flags because it's not an LRU page at that point in time, right (-> isolated)? We did not run that code block that would clear PG_active and PG_unevictable. Which still leaves the questions: a) If PG_active and PG_unevictable was cleared, where? b) Why is that code block that conditionally clears the flags of any value and why can't we simply drop it?
On 2022/5/13 0:50, David Hildenbrand wrote: > On 12.05.22 15:26, Miaohe Lin wrote: >> On 2022/5/12 15:10, David Hildenbrand wrote: >>>>> If PG_isolated is still set, it will get cleared in the buddy when >>>>> freeing the page via >>>>> >>>>> page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; >>>> >>>> Yes, check_free_page only complains about flags belonging to PAGE_FLAGS_CHECK_AT_FREE and PG_isolated >>>> will be cleared in the buddy when freeing the page. But it might not be a good idea to reply on this ? >>>> IMHO, it should be better to clear the PG_isolated explicitly ourselves. >>> >>> I think we can pretty much rely on this handling in the buddy :) >> >> So is the below code change what you're suggesting? >> >> if (page_count(page) == 1) { >> /* page was freed from under us. So we are done. */ >> ClearPageActive(page); >> ClearPageUnevictable(page); >> - if (unlikely(__PageMovable(page))) >> - ClearPageIsolated(page); >> goto out; >> } > > Yeah, unless I am missing something important :) > >>> >>>> >>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>> Also, I am not sure how reliable that page count check is here: if we'd >>>>>>> have another speculative reference to the page, we might see >>>>>>> "page_count(page) > 1" and not take that path, although the previous >>>>>>> owner released the last reference. >>>>>> >>>>>> IIUC, there should not be such speculative reference. The driver should have taken care >>>>>> of it. >>>>> >>>>> How can you prevent any kind of speculative references? >>>>> >>>>> See isolate_movable_page() as an example, which grabs a speculative >>>>> reference to then find out that the page is already isolated by someone >>>>> else, to then back off. >>>> >>>> You're right. isolate_movable_page will be an speculative references case. But the page count check here >>>> is just an optimization. If we encounter speculative references, it still works with useless effort of >>>> migrating to be released page. >>> >>> >>> Not really. The issue is that PAGE_FLAGS_CHECK_AT_FREE contains >>> PG_active and PG_unevictable. >>> >>> We only clear those 2 flags if "page_count(page) == 1". Consequently, >>> with a speculative reference, we'll run into the check_free_page_bad() >>> when dropping the last reference. >> >> It seems if a speculative reference happens after the "page_count(page) == 1" check, >> it's ok because we cleared the PG_active and PG_unevictable. And if it happens before >> the check, this code block is skipped and the page will be freed after migration. The >> PG_active and PG_unevictable will be correctly cleared when page is actually freed via >> __folio_clear_active. (Please see below comment) >> >>> >>> This is just shaky. Special casing on "page_count(page) == 1" for >>> detecting "was this freed by the owner" is not 100% water proof. >>> >>> In an ideal world, we'd just get rid of that whole block of code and let >>> the actual freeing code clear PG_active and PG_unevictable. But that >>> would require changes to free_pages_prepare(). >>> >>> >>> Now I do wonder, if we ever even have PG_active or PG_unevictable still >>> set when the page was freed by the owner in this code. IOW, maybe that >>> is dead code as well and we can just remove the whole shaky >>> "page_count(page) == 1" code block. >> >> Think about below common scene: Anonymous page is actively used by the sole owner process, so it >> will have PG_active set. Then process exited while vm tries to migrate that page. So the page >> should have refcnt == 1 while PG_active is set? Note normally PG_active should be cleared when >> the page is released: >> >> __put_single_page >> PageLRU >> __clear_page_lru_flags >> __folio_clear_active >> __folio_clear_unevictable >> >> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >> this code block works. Or am I miss something again? > > Let's assume the following: page as freed by the owner and we enter > unmap_and_move(). > > > #1: enter unmap_and_move() // page_count is 1 > #2: enter isolate_movable_page() // page_count is 1 > #2: get_page_unless_zero() // page_count is now 2 > #1: if (page_count(page) == 1) { // does not trigger > #2: put_page(page); // page_count is now 1 > #1: put_page(page); // page_count is now 0 -> freed > > > #1 will trigger __put_page() -> __put_single_page() -> > __page_cache_release() will not clear the flags because it's not an LRU > page at that point in time, right (-> isolated)? Sorry, you're right. I thought the old page will be freed via putback_lru_page which will set PageLRU back instead of put_page directly. So if the above race occurs, PG_active and PG_unevictable will remain set while page goes to the buddy and check_free_page will complain about it. But it seems this is never witnessed? > > We did not run that code block that would clear PG_active and > PG_unevictable. > > Which still leaves the questions: > > a) If PG_active and PG_unevictable was cleared, where? For LRU pages, PG_active and PG_unevictable are cleared via __page_cache_release. And for isolated (LRU) pages, PG_active and PG_unevictable should be cleared ourselves? > b) Why is that code block that conditionally clears the flags of any > value and why can't we simply drop it? > To fix the issue, should we clear PG_active and PG_unevictable unconditionally here? Thanks a lot!
On 2022/5/13 0:50, David Hildenbrand wrote: > On 12.05.22 15:26, Miaohe Lin wrote: >> On 2022/5/12 15:10, David Hildenbrand wrote: >>>>> If PG_isolated is still set, it will get cleared in the buddy when >>>>> freeing the page via >>>>> >>>>> page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; >>>> >>>> Yes, check_free_page only complains about flags belonging to PAGE_FLAGS_CHECK_AT_FREE and PG_isolated >>>> will be cleared in the buddy when freeing the page. But it might not be a good idea to reply on this ? >>>> IMHO, it should be better to clear the PG_isolated explicitly ourselves. >>> >>> I think we can pretty much rely on this handling in the buddy :) >> >> So is the below code change what you're suggesting? >> >> if (page_count(page) == 1) { >> /* page was freed from under us. So we are done. */ >> ClearPageActive(page); >> ClearPageUnevictable(page); >> - if (unlikely(__PageMovable(page))) >> - ClearPageIsolated(page); >> goto out; >> } > > Yeah, unless I am missing something important :) > >>> >>>> >>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>> Also, I am not sure how reliable that page count check is here: if we'd >>>>>>> have another speculative reference to the page, we might see >>>>>>> "page_count(page) > 1" and not take that path, although the previous >>>>>>> owner released the last reference. >>>>>> >>>>>> IIUC, there should not be such speculative reference. The driver should have taken care >>>>>> of it. >>>>> >>>>> How can you prevent any kind of speculative references? >>>>> >>>>> See isolate_movable_page() as an example, which grabs a speculative >>>>> reference to then find out that the page is already isolated by someone >>>>> else, to then back off. >>>> >>>> You're right. isolate_movable_page will be an speculative references case. But the page count check here >>>> is just an optimization. If we encounter speculative references, it still works with useless effort of >>>> migrating to be released page. >>> >>> >>> Not really. The issue is that PAGE_FLAGS_CHECK_AT_FREE contains >>> PG_active and PG_unevictable. >>> >>> We only clear those 2 flags if "page_count(page) == 1". Consequently, >>> with a speculative reference, we'll run into the check_free_page_bad() >>> when dropping the last reference. >> >> It seems if a speculative reference happens after the "page_count(page) == 1" check, >> it's ok because we cleared the PG_active and PG_unevictable. And if it happens before >> the check, this code block is skipped and the page will be freed after migration. The >> PG_active and PG_unevictable will be correctly cleared when page is actually freed via >> __folio_clear_active. (Please see below comment) >> >>> >>> This is just shaky. Special casing on "page_count(page) == 1" for >>> detecting "was this freed by the owner" is not 100% water proof. >>> >>> In an ideal world, we'd just get rid of that whole block of code and let >>> the actual freeing code clear PG_active and PG_unevictable. But that >>> would require changes to free_pages_prepare(). >>> >>> >>> Now I do wonder, if we ever even have PG_active or PG_unevictable still >>> set when the page was freed by the owner in this code. IOW, maybe that >>> is dead code as well and we can just remove the whole shaky >>> "page_count(page) == 1" code block. >> >> Think about below common scene: Anonymous page is actively used by the sole owner process, so it >> will have PG_active set. Then process exited while vm tries to migrate that page. So the page >> should have refcnt == 1 while PG_active is set? Note normally PG_active should be cleared when >> the page is released: >> >> __put_single_page >> PageLRU >> __clear_page_lru_flags >> __folio_clear_active >> __folio_clear_unevictable >> >> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >> this code block works. Or am I miss something again? > > Let's assume the following: page as freed by the owner and we enter > unmap_and_move(). > > > #1: enter unmap_and_move() // page_count is 1 > #2: enter isolate_movable_page() // page_count is 1 > #2: get_page_unless_zero() // page_count is now 2 > #1: if (page_count(page) == 1) { // does not trigger > #2: put_page(page); // page_count is now 1 > #1: put_page(page); // page_count is now 0 -> freed > > > #1 will trigger __put_page() -> __put_single_page() -> > __page_cache_release() will not clear the flags because it's not an LRU > page at that point in time, right (-> isolated)? > > We did not run that code block that would clear PG_active and > PG_unevictable. > > Which still leaves the questions: > > a) If PG_active and PG_unevictable was cleared, where? > b) Why is that code block that conditionally clears the flags of any > value and why can't we simply drop it? > I took a more close look of the code today. And I think the current code works: There are 3 cases in unmap_and_move: 1.page is freed through "if (page_count(page) == 1)" code block. This works as PG_active and PG_unevictable are cleared here. 2. Failed to migrate the page. The page won't be release so we don't care about it. 3. The page is migrated successfully. The PG_active and PG_unevictable are cleared via folio_migrate_flags(): if (folio_test_clear_active(folio)) { VM_BUG_ON_FOLIO(folio_test_unevictable(folio), folio); folio_set_active(newfolio); } else if (folio_test_clear_unevictable(folio)) folio_set_unevictable(newfolio); For the above race case, the page won't be freed through "if (page_count(page) == 1)" code block. It will just be migrated and freed via put_page() after folio_migrate_flags() having cleared PG_active and PG_unevictable. Am I miss something again ? ;) Thanks a lot!
Sorry for the late reply, was on vacation. >>> >>> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >>> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >>> this code block works. Or am I miss something again? >> >> Let's assume the following: page as freed by the owner and we enter >> unmap_and_move(). >> >> >> #1: enter unmap_and_move() // page_count is 1 >> #2: enter isolate_movable_page() // page_count is 1 >> #2: get_page_unless_zero() // page_count is now 2 >> #1: if (page_count(page) == 1) { // does not trigger >> #2: put_page(page); // page_count is now 1 >> #1: put_page(page); // page_count is now 0 -> freed >> >> >> #1 will trigger __put_page() -> __put_single_page() -> >> __page_cache_release() will not clear the flags because it's not an LRU >> page at that point in time, right (-> isolated)? > > Sorry, you're right. I thought the old page will be freed via putback_lru_page which will > set PageLRU back instead of put_page directly. So if the above race occurs, PG_active and > PG_unevictable will remain set while page goes to the buddy and check_free_page will complain > about it. But it seems this is never witnessed? Maybe a) we were lucky so far and didn't trigger it b) the whole code block is dead code because we are missing something c) we are missing something else :) > >> >> We did not run that code block that would clear PG_active and >> PG_unevictable. >> >> Which still leaves the questions: >> >> a) If PG_active and PG_unevictable was cleared, where? > > For LRU pages, PG_active and PG_unevictable are cleared via __page_cache_release. And for isolated > (LRU) pages, PG_active and PG_unevictable should be cleared ourselves? > >> b) Why is that code block that conditionally clears the flags of any >> value and why can't we simply drop it? >> > > To fix the issue, should we clear PG_active and PG_unevictable unconditionally here? I wonder if we should simply teach actual freeing code to simply clear both flags when freeing an isolated page? IOW, to detect "isolated LRU" is getting freed and fixup?
On 2022/5/31 19:59, David Hildenbrand wrote: > Sorry for the late reply, was on vacation. That's all right. Hope you have a great time. ;) > >>>> >>>> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >>>> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >>>> this code block works. Or am I miss something again? >>> >>> Let's assume the following: page as freed by the owner and we enter >>> unmap_and_move(). >>> >>> >>> #1: enter unmap_and_move() // page_count is 1 >>> #2: enter isolate_movable_page() // page_count is 1 >>> #2: get_page_unless_zero() // page_count is now 2 >>> #1: if (page_count(page) == 1) { // does not trigger >>> #2: put_page(page); // page_count is now 1 >>> #1: put_page(page); // page_count is now 0 -> freed >>> >>> >>> #1 will trigger __put_page() -> __put_single_page() -> >>> __page_cache_release() will not clear the flags because it's not an LRU >>> page at that point in time, right (-> isolated)? >> >> Sorry, you're right. I thought the old page will be freed via putback_lru_page which will >> set PageLRU back instead of put_page directly. So if the above race occurs, PG_active and >> PG_unevictable will remain set while page goes to the buddy and check_free_page will complain >> about it. But it seems this is never witnessed? > > Maybe > > a) we were lucky so far and didn't trigger it > b) the whole code block is dead code because we are missing something > c) we are missing something else :) I think I found the things we missed in another email [1]. [1]: https://lore.kernel.org/all/948ea45e-3b2b-e16c-5b8c-4c34de0ea593@huawei.com/ Paste the main content of [1] here: " There are 3 cases in unmap_and_move: 1.page is freed through "if (page_count(page) == 1)" code block. This works as PG_active and PG_unevictable are cleared here. 2. Failed to migrate the page. The page won't be release so we don't care about it. 3. The page is migrated successfully. The PG_active and PG_unevictable are cleared via folio_migrate_flags(): if (folio_test_clear_active(folio)) { VM_BUG_ON_FOLIO(folio_test_unevictable(folio), folio); folio_set_active(newfolio); } else if (folio_test_clear_unevictable(folio)) folio_set_unevictable(newfolio); For the above race case, the page won't be freed through "if (page_count(page) == 1)" code block. It will just be migrated and freed via put_page() after folio_migrate_flags() having cleared PG_active and PG_unevictable. " Or Am I miss something again? :) > >> >>> >>> We did not run that code block that would clear PG_active and >>> PG_unevictable. >>> >>> Which still leaves the questions: >>> >>> a) If PG_active and PG_unevictable was cleared, where? >> >> For LRU pages, PG_active and PG_unevictable are cleared via __page_cache_release. And for isolated >> (LRU) pages, PG_active and PG_unevictable should be cleared ourselves? >> >>> b) Why is that code block that conditionally clears the flags of any >>> value and why can't we simply drop it? >>> >> >> To fix the issue, should we clear PG_active and PG_unevictable unconditionally here? > > I wonder if we should simply teach actual freeing code to simply clear > both flags when freeing an isolated page? IOW, to detect "isolated LRU" > is getting freed and fixup? IMHO, clearing both flags are done by the caller indeed. Another example I found when I read the khugepaged code recently is pasted below: collapse_file ... page_ref_unfreeze(page, 1); ClearPageActive(page); ClearPageUnevictable(page); unlock_page(page); put_page(page); index++; ... Thanks! >
On 31.05.22 14:37, Miaohe Lin wrote: > On 2022/5/31 19:59, David Hildenbrand wrote: >> Sorry for the late reply, was on vacation. > > That's all right. Hope you have a great time. ;) > >> >>>>> >>>>> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >>>>> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >>>>> this code block works. Or am I miss something again? >>>> >>>> Let's assume the following: page as freed by the owner and we enter >>>> unmap_and_move(). >>>> >>>> >>>> #1: enter unmap_and_move() // page_count is 1 >>>> #2: enter isolate_movable_page() // page_count is 1 >>>> #2: get_page_unless_zero() // page_count is now 2 >>>> #1: if (page_count(page) == 1) { // does not trigger >>>> #2: put_page(page); // page_count is now 1 >>>> #1: put_page(page); // page_count is now 0 -> freed >>>> >>>> >>>> #1 will trigger __put_page() -> __put_single_page() -> >>>> __page_cache_release() will not clear the flags because it's not an LRU >>>> page at that point in time, right (-> isolated)? >>> >>> Sorry, you're right. I thought the old page will be freed via putback_lru_page which will >>> set PageLRU back instead of put_page directly. So if the above race occurs, PG_active and >>> PG_unevictable will remain set while page goes to the buddy and check_free_page will complain >>> about it. But it seems this is never witnessed? >> >> Maybe >> >> a) we were lucky so far and didn't trigger it >> b) the whole code block is dead code because we are missing something >> c) we are missing something else :) > > I think I found the things we missed in another email [1]. > [1]: https://lore.kernel.org/all/948ea45e-3b2b-e16c-5b8c-4c34de0ea593@huawei.com/ > > Paste the main content of [1] here: > > " > There are 3 cases in unmap_and_move: > > 1.page is freed through "if (page_count(page) == 1)" code block. This works > as PG_active and PG_unevictable are cleared here. > > 2. Failed to migrate the page. The page won't be release so we don't care about it. Right, page is un-isolated. > > 3. The page is migrated successfully. The PG_active and PG_unevictable are cleared > via folio_migrate_flags(): > > if (folio_test_clear_active(folio)) { > VM_BUG_ON_FOLIO(folio_test_unevictable(folio), folio); > folio_set_active(newfolio); > } else if (folio_test_clear_unevictable(folio)) > folio_set_unevictable(newfolio); Right. > > For the above race case, the page won't be freed through "if (page_count(page) == 1)" code block. > It will just be migrated and freed via put_page() after folio_migrate_flags() having cleared PG_active > and PG_unevictable. > " > Or Am I miss something again? :) For #1, I'm still not sure what would happen on a speculative reference. It's worth summarizing that a) free_pages_prepare() will clear both flags via page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; b) free_pages_prepare() will bail out if any flag is set in check_free_page(). As we've never seen b) in the wild, this certainly has low priority, and maybe it really cannot happen right now. However, maybe really allowing these flags to be set when freeing the page and removing the "page_count(page) == 1" case from migration code would be the clean thing to do.
On 2022/6/1 18:31, David Hildenbrand wrote: > On 31.05.22 14:37, Miaohe Lin wrote: >> On 2022/5/31 19:59, David Hildenbrand wrote: >>> Sorry for the late reply, was on vacation. >> >> That's all right. Hope you have a great time. ;) >> >>> >>>>>> >>>>>> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >>>>>> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >>>>>> this code block works. Or am I miss something again? >>>>> >>>>> Let's assume the following: page as freed by the owner and we enter >>>>> unmap_and_move(). >>>>> >>>>> >>>>> #1: enter unmap_and_move() // page_count is 1 >>>>> #2: enter isolate_movable_page() // page_count is 1 >>>>> #2: get_page_unless_zero() // page_count is now 2 >>>>> #1: if (page_count(page) == 1) { // does not trigger >>>>> #2: put_page(page); // page_count is now 1 >>>>> #1: put_page(page); // page_count is now 0 -> freed >>>>> >>>>> >>>>> #1 will trigger __put_page() -> __put_single_page() -> >>>>> __page_cache_release() will not clear the flags because it's not an LRU >>>>> page at that point in time, right (-> isolated)? >>>> >>>> Sorry, you're right. I thought the old page will be freed via putback_lru_page which will >>>> set PageLRU back instead of put_page directly. So if the above race occurs, PG_active and >>>> PG_unevictable will remain set while page goes to the buddy and check_free_page will complain >>>> about it. But it seems this is never witnessed? >>> >>> Maybe >>> >>> a) we were lucky so far and didn't trigger it >>> b) the whole code block is dead code because we are missing something >>> c) we are missing something else :) >> >> I think I found the things we missed in another email [1]. >> [1]: https://lore.kernel.org/all/948ea45e-3b2b-e16c-5b8c-4c34de0ea593@huawei.com/ >> >> Paste the main content of [1] here: >> >> " >> There are 3 cases in unmap_and_move: >> >> 1.page is freed through "if (page_count(page) == 1)" code block. This works >> as PG_active and PG_unevictable are cleared here. >> >> 2. Failed to migrate the page. The page won't be release so we don't care about it. > > Right, page is un-isolated. > >> >> 3. The page is migrated successfully. The PG_active and PG_unevictable are cleared >> via folio_migrate_flags(): >> >> if (folio_test_clear_active(folio)) { >> VM_BUG_ON_FOLIO(folio_test_unevictable(folio), folio); >> folio_set_active(newfolio); >> } else if (folio_test_clear_unevictable(folio)) >> folio_set_unevictable(newfolio); > > Right. > >> >> For the above race case, the page won't be freed through "if (page_count(page) == 1)" code block. >> It will just be migrated and freed via put_page() after folio_migrate_flags() having cleared PG_active >> and PG_unevictable. >> " >> Or Am I miss something again? :) > > For #1, I'm still not sure what would happen on a speculative reference. > > It's worth summarizing that > > a) free_pages_prepare() will clear both flags via page->flags &= > ~PAGE_FLAGS_CHECK_AT_PREP; > > b) free_pages_prepare() will bail out if any flag is set in > check_free_page(). > > As we've never seen b) in the wild, this certainly has low priority, and > maybe it really cannot happen right now. > > However, maybe really allowing these flags to be set when freeing the > page and removing the "page_count(page) == 1" case from migration code > would be the clean thing to do. IMHO, check_free_page is used to catch possible problem. There's the comment of PAGE_FLAGS_CHECK_AT_FREE: /* * Flags checked when a page is freed. Pages being freed should not have * these flags set. If they are, there is a problem. */ #define PAGE_FLAGS_CHECK_AT_FREE There might be an assumption: when page is freed, it shouldn't be an active or unevictable page. It should be inactive and evictable. So allowing these flags to be set when freeing the page might not be a good idea? Thanks! >
On 02.06.22 09:40, Miaohe Lin wrote: > On 2022/6/1 18:31, David Hildenbrand wrote: >> On 31.05.22 14:37, Miaohe Lin wrote: >>> On 2022/5/31 19:59, David Hildenbrand wrote: >>>> Sorry for the late reply, was on vacation. >>> >>> That's all right. Hope you have a great time. ;) >>> >>>> >>>>>>> >>>>>>> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >>>>>>> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >>>>>>> this code block works. Or am I miss something again? >>>>>> >>>>>> Let's assume the following: page as freed by the owner and we enter >>>>>> unmap_and_move(). >>>>>> >>>>>> >>>>>> #1: enter unmap_and_move() // page_count is 1 >>>>>> #2: enter isolate_movable_page() // page_count is 1 >>>>>> #2: get_page_unless_zero() // page_count is now 2 >>>>>> #1: if (page_count(page) == 1) { // does not trigger >>>>>> #2: put_page(page); // page_count is now 1 >>>>>> #1: put_page(page); // page_count is now 0 -> freed >>>>>> >>>>>> >>>>>> #1 will trigger __put_page() -> __put_single_page() -> >>>>>> __page_cache_release() will not clear the flags because it's not an LRU >>>>>> page at that point in time, right (-> isolated)? >>>>> >>>>> Sorry, you're right. I thought the old page will be freed via putback_lru_page which will >>>>> set PageLRU back instead of put_page directly. So if the above race occurs, PG_active and >>>>> PG_unevictable will remain set while page goes to the buddy and check_free_page will complain >>>>> about it. But it seems this is never witnessed? >>>> >>>> Maybe >>>> >>>> a) we were lucky so far and didn't trigger it >>>> b) the whole code block is dead code because we are missing something >>>> c) we are missing something else :) >>> >>> I think I found the things we missed in another email [1]. >>> [1]: https://lore.kernel.org/all/948ea45e-3b2b-e16c-5b8c-4c34de0ea593@huawei.com/ >>> >>> Paste the main content of [1] here: >>> >>> " >>> There are 3 cases in unmap_and_move: >>> >>> 1.page is freed through "if (page_count(page) == 1)" code block. This works >>> as PG_active and PG_unevictable are cleared here. >>> >>> 2. Failed to migrate the page. The page won't be release so we don't care about it. >> >> Right, page is un-isolated. >> >>> >>> 3. The page is migrated successfully. The PG_active and PG_unevictable are cleared >>> via folio_migrate_flags(): >>> >>> if (folio_test_clear_active(folio)) { >>> VM_BUG_ON_FOLIO(folio_test_unevictable(folio), folio); >>> folio_set_active(newfolio); >>> } else if (folio_test_clear_unevictable(folio)) >>> folio_set_unevictable(newfolio); >> >> Right. >> >>> >>> For the above race case, the page won't be freed through "if (page_count(page) == 1)" code block. >>> It will just be migrated and freed via put_page() after folio_migrate_flags() having cleared PG_active >>> and PG_unevictable. >>> " >>> Or Am I miss something again? :) >> >> For #1, I'm still not sure what would happen on a speculative reference. >> >> It's worth summarizing that >> >> a) free_pages_prepare() will clear both flags via page->flags &= >> ~PAGE_FLAGS_CHECK_AT_PREP; >> >> b) free_pages_prepare() will bail out if any flag is set in >> check_free_page(). >> >> As we've never seen b) in the wild, this certainly has low priority, and >> maybe it really cannot happen right now. >> >> However, maybe really allowing these flags to be set when freeing the >> page and removing the "page_count(page) == 1" case from migration code >> would be the clean thing to do. > > IMHO, check_free_page is used to catch possible problem. There's the comment of PAGE_FLAGS_CHECK_AT_FREE: > > /* > * Flags checked when a page is freed. Pages being freed should not have > * these flags set. If they are, there is a problem. > */ > #define PAGE_FLAGS_CHECK_AT_FREE > > There might be an assumption: when page is freed, it shouldn't be an active or unevictable page. It should be > inactive and evictable. So allowing these flags to be set when freeing the page might not be a good idea? Yeah, and we'd be lifting that restriction because there is good reason to do so. Maybe we *could* special case for isolated pages; however, that adds runtime overhead. Of course, we could perform different checks for e.g., DEBUG_VM vs !DEBUG_VM.
On 2022/6/2 16:47, David Hildenbrand wrote: > On 02.06.22 09:40, Miaohe Lin wrote: >> On 2022/6/1 18:31, David Hildenbrand wrote: >>> On 31.05.22 14:37, Miaohe Lin wrote: >>>> On 2022/5/31 19:59, David Hildenbrand wrote: >>>>> Sorry for the late reply, was on vacation. >>>> >>>> That's all right. Hope you have a great time. ;) >>>> >>>>> >>>>>>>> >>>>>>>> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >>>>>>>> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >>>>>>>> this code block works. Or am I miss something again? >>>>>>> >>>>>>> Let's assume the following: page as freed by the owner and we enter >>>>>>> unmap_and_move(). >>>>>>> >>>>>>> >>>>>>> #1: enter unmap_and_move() // page_count is 1 >>>>>>> #2: enter isolate_movable_page() // page_count is 1 >>>>>>> #2: get_page_unless_zero() // page_count is now 2 >>>>>>> #1: if (page_count(page) == 1) { // does not trigger >>>>>>> #2: put_page(page); // page_count is now 1 >>>>>>> #1: put_page(page); // page_count is now 0 -> freed >>>>>>> >>>>>>> >>>>>>> #1 will trigger __put_page() -> __put_single_page() -> >>>>>>> __page_cache_release() will not clear the flags because it's not an LRU >>>>>>> page at that point in time, right (-> isolated)? >>>>>> >>>>>> Sorry, you're right. I thought the old page will be freed via putback_lru_page which will >>>>>> set PageLRU back instead of put_page directly. So if the above race occurs, PG_active and >>>>>> PG_unevictable will remain set while page goes to the buddy and check_free_page will complain >>>>>> about it. But it seems this is never witnessed? >>>>> >>>>> Maybe >>>>> >>>>> a) we were lucky so far and didn't trigger it >>>>> b) the whole code block is dead code because we are missing something >>>>> c) we are missing something else :) >>>> >>>> I think I found the things we missed in another email [1]. >>>> [1]: https://lore.kernel.org/all/948ea45e-3b2b-e16c-5b8c-4c34de0ea593@huawei.com/ >>>> >>>> Paste the main content of [1] here: >>>> >>>> " >>>> There are 3 cases in unmap_and_move: >>>> >>>> 1.page is freed through "if (page_count(page) == 1)" code block. This works >>>> as PG_active and PG_unevictable are cleared here. >>>> >>>> 2. Failed to migrate the page. The page won't be release so we don't care about it. >>> >>> Right, page is un-isolated. >>> >>>> >>>> 3. The page is migrated successfully. The PG_active and PG_unevictable are cleared >>>> via folio_migrate_flags(): >>>> >>>> if (folio_test_clear_active(folio)) { >>>> VM_BUG_ON_FOLIO(folio_test_unevictable(folio), folio); >>>> folio_set_active(newfolio); >>>> } else if (folio_test_clear_unevictable(folio)) >>>> folio_set_unevictable(newfolio); >>> >>> Right. >>> >>>> >>>> For the above race case, the page won't be freed through "if (page_count(page) == 1)" code block. >>>> It will just be migrated and freed via put_page() after folio_migrate_flags() having cleared PG_active >>>> and PG_unevictable. >>>> " >>>> Or Am I miss something again? :) >>> >>> For #1, I'm still not sure what would happen on a speculative reference. >>> >>> It's worth summarizing that >>> >>> a) free_pages_prepare() will clear both flags via page->flags &= >>> ~PAGE_FLAGS_CHECK_AT_PREP; >>> >>> b) free_pages_prepare() will bail out if any flag is set in >>> check_free_page(). >>> >>> As we've never seen b) in the wild, this certainly has low priority, and >>> maybe it really cannot happen right now. >>> >>> However, maybe really allowing these flags to be set when freeing the >>> page and removing the "page_count(page) == 1" case from migration code >>> would be the clean thing to do. >> >> IMHO, check_free_page is used to catch possible problem. There's the comment of PAGE_FLAGS_CHECK_AT_FREE: >> >> /* >> * Flags checked when a page is freed. Pages being freed should not have >> * these flags set. If they are, there is a problem. >> */ >> #define PAGE_FLAGS_CHECK_AT_FREE >> >> There might be an assumption: when page is freed, it shouldn't be an active or unevictable page. It should be >> inactive and evictable. So allowing these flags to be set when freeing the page might not be a good idea? > > Yeah, and we'd be lifting that restriction because there is good reason > to do so. > > Maybe we *could* special case for isolated pages; however, that adds > runtime overhead. Of course, we could perform different checks for e.g., > DEBUG_VM vs !DEBUG_VM. I found there is one assumption about PG_active and PG_unevictable, i.e. in __folio_clear_lru_flags: /* this shouldn't happen, so leave the flags to bad_page() */ if (folio_test_active(folio) && folio_test_unevictable(folio)) return; If PG_active and PG_unevictable are both set, this case will be caught in the bad_page() via check_free_page(). There might be some other assumptions about PG_active and PG_unevictable. So I think it's not safe to lift that restriction. But maybe we could limit this check within DEBUG_VM as you suggested. Am I supposed to do it? Thanks! >
On 07.06.22 04:20, Miaohe Lin wrote: > On 2022/6/2 16:47, David Hildenbrand wrote: >> On 02.06.22 09:40, Miaohe Lin wrote: >>> On 2022/6/1 18:31, David Hildenbrand wrote: >>>> On 31.05.22 14:37, Miaohe Lin wrote: >>>>> On 2022/5/31 19:59, David Hildenbrand wrote: >>>>>> Sorry for the late reply, was on vacation. >>>>> >>>>> That's all right. Hope you have a great time. ;) >>>>> >>>>>> >>>>>>>>> >>>>>>>>> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >>>>>>>>> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >>>>>>>>> this code block works. Or am I miss something again? >>>>>>>> >>>>>>>> Let's assume the following: page as freed by the owner and we enter >>>>>>>> unmap_and_move(). >>>>>>>> >>>>>>>> >>>>>>>> #1: enter unmap_and_move() // page_count is 1 >>>>>>>> #2: enter isolate_movable_page() // page_count is 1 >>>>>>>> #2: get_page_unless_zero() // page_count is now 2 >>>>>>>> #1: if (page_count(page) == 1) { // does not trigger >>>>>>>> #2: put_page(page); // page_count is now 1 >>>>>>>> #1: put_page(page); // page_count is now 0 -> freed >>>>>>>> >>>>>>>> >>>>>>>> #1 will trigger __put_page() -> __put_single_page() -> >>>>>>>> __page_cache_release() will not clear the flags because it's not an LRU >>>>>>>> page at that point in time, right (-> isolated)? >>>>>>> >>>>>>> Sorry, you're right. I thought the old page will be freed via putback_lru_page which will >>>>>>> set PageLRU back instead of put_page directly. So if the above race occurs, PG_active and >>>>>>> PG_unevictable will remain set while page goes to the buddy and check_free_page will complain >>>>>>> about it. But it seems this is never witnessed? >>>>>> >>>>>> Maybe >>>>>> >>>>>> a) we were lucky so far and didn't trigger it >>>>>> b) the whole code block is dead code because we are missing something >>>>>> c) we are missing something else :) >>>>> >>>>> I think I found the things we missed in another email [1]. >>>>> [1]: https://lore.kernel.org/all/948ea45e-3b2b-e16c-5b8c-4c34de0ea593@huawei.com/ >>>>> >>>>> Paste the main content of [1] here: >>>>> >>>>> " >>>>> There are 3 cases in unmap_and_move: >>>>> >>>>> 1.page is freed through "if (page_count(page) == 1)" code block. This works >>>>> as PG_active and PG_unevictable are cleared here. >>>>> >>>>> 2. Failed to migrate the page. The page won't be release so we don't care about it. >>>> >>>> Right, page is un-isolated. >>>> >>>>> >>>>> 3. The page is migrated successfully. The PG_active and PG_unevictable are cleared >>>>> via folio_migrate_flags(): >>>>> >>>>> if (folio_test_clear_active(folio)) { >>>>> VM_BUG_ON_FOLIO(folio_test_unevictable(folio), folio); >>>>> folio_set_active(newfolio); >>>>> } else if (folio_test_clear_unevictable(folio)) >>>>> folio_set_unevictable(newfolio); >>>> >>>> Right. >>>> >>>>> >>>>> For the above race case, the page won't be freed through "if (page_count(page) == 1)" code block. >>>>> It will just be migrated and freed via put_page() after folio_migrate_flags() having cleared PG_active >>>>> and PG_unevictable. >>>>> " >>>>> Or Am I miss something again? :) >>>> >>>> For #1, I'm still not sure what would happen on a speculative reference. >>>> >>>> It's worth summarizing that >>>> >>>> a) free_pages_prepare() will clear both flags via page->flags &= >>>> ~PAGE_FLAGS_CHECK_AT_PREP; >>>> >>>> b) free_pages_prepare() will bail out if any flag is set in >>>> check_free_page(). >>>> >>>> As we've never seen b) in the wild, this certainly has low priority, and >>>> maybe it really cannot happen right now. >>>> >>>> However, maybe really allowing these flags to be set when freeing the >>>> page and removing the "page_count(page) == 1" case from migration code >>>> would be the clean thing to do. >>> >>> IMHO, check_free_page is used to catch possible problem. There's the comment of PAGE_FLAGS_CHECK_AT_FREE: >>> >>> /* >>> * Flags checked when a page is freed. Pages being freed should not have >>> * these flags set. If they are, there is a problem. >>> */ >>> #define PAGE_FLAGS_CHECK_AT_FREE >>> >>> There might be an assumption: when page is freed, it shouldn't be an active or unevictable page. It should be >>> inactive and evictable. So allowing these flags to be set when freeing the page might not be a good idea? >> >> Yeah, and we'd be lifting that restriction because there is good reason >> to do so. >> >> Maybe we *could* special case for isolated pages; however, that adds >> runtime overhead. Of course, we could perform different checks for e.g., >> DEBUG_VM vs !DEBUG_VM. > > I found there is one assumption about PG_active and PG_unevictable, i.e. in __folio_clear_lru_flags: > > /* this shouldn't happen, so leave the flags to bad_page() */ > if (folio_test_active(folio) && folio_test_unevictable(folio)) > return; > > If PG_active and PG_unevictable are both set, this case will be caught in the bad_page() via check_free_page(). > There might be some other assumptions about PG_active and PG_unevictable. So I think it's not safe to lift that > restriction. > > But maybe we could limit this check within DEBUG_VM as you suggested. Am I supposed to do it? Well, if you want, you can look into ways of cleaning that up and removing the "if there is more than one reference, the owner hasn't freed the page" condition, because there are corner cases where the owner might have freed the page but speculative references keep the refcount temporarily incremented.
On 2022/6/8 18:05, David Hildenbrand wrote: > On 07.06.22 04:20, Miaohe Lin wrote: >> On 2022/6/2 16:47, David Hildenbrand wrote: >>> On 02.06.22 09:40, Miaohe Lin wrote: >>>> On 2022/6/1 18:31, David Hildenbrand wrote: >>>>> On 31.05.22 14:37, Miaohe Lin wrote: >>>>>> On 2022/5/31 19:59, David Hildenbrand wrote: >>>>>>> Sorry for the late reply, was on vacation. >>>>>> >>>>>> That's all right. Hope you have a great time. ;) >>>>>> >>>>>>> >>>>>>>>>> >>>>>>>>>> But for isolated page, PageLRU is cleared. So when the isolated page is released, __clear_page_lru_flags >>>>>>>>>> won't be called. So we have to clear the PG_active and PG_unevictable here manully. So I think >>>>>>>>>> this code block works. Or am I miss something again? >>>>>>>>> >>>>>>>>> Let's assume the following: page as freed by the owner and we enter >>>>>>>>> unmap_and_move(). >>>>>>>>> >>>>>>>>> >>>>>>>>> #1: enter unmap_and_move() // page_count is 1 >>>>>>>>> #2: enter isolate_movable_page() // page_count is 1 >>>>>>>>> #2: get_page_unless_zero() // page_count is now 2 >>>>>>>>> #1: if (page_count(page) == 1) { // does not trigger >>>>>>>>> #2: put_page(page); // page_count is now 1 >>>>>>>>> #1: put_page(page); // page_count is now 0 -> freed >>>>>>>>> >>>>>>>>> >>>>>>>>> #1 will trigger __put_page() -> __put_single_page() -> >>>>>>>>> __page_cache_release() will not clear the flags because it's not an LRU >>>>>>>>> page at that point in time, right (-> isolated)? >>>>>>>> >>>>>>>> Sorry, you're right. I thought the old page will be freed via putback_lru_page which will >>>>>>>> set PageLRU back instead of put_page directly. So if the above race occurs, PG_active and >>>>>>>> PG_unevictable will remain set while page goes to the buddy and check_free_page will complain >>>>>>>> about it. But it seems this is never witnessed? >>>>>>> >>>>>>> Maybe >>>>>>> >>>>>>> a) we were lucky so far and didn't trigger it >>>>>>> b) the whole code block is dead code because we are missing something >>>>>>> c) we are missing something else :) >>>>>> >>>>>> I think I found the things we missed in another email [1]. >>>>>> [1]: https://lore.kernel.org/all/948ea45e-3b2b-e16c-5b8c-4c34de0ea593@huawei.com/ >>>>>> >>>>>> Paste the main content of [1] here: >>>>>> >>>>>> " >>>>>> There are 3 cases in unmap_and_move: >>>>>> >>>>>> 1.page is freed through "if (page_count(page) == 1)" code block. This works >>>>>> as PG_active and PG_unevictable are cleared here. >>>>>> >>>>>> 2. Failed to migrate the page. The page won't be release so we don't care about it. >>>>> >>>>> Right, page is un-isolated. >>>>> >>>>>> >>>>>> 3. The page is migrated successfully. The PG_active and PG_unevictable are cleared >>>>>> via folio_migrate_flags(): >>>>>> >>>>>> if (folio_test_clear_active(folio)) { >>>>>> VM_BUG_ON_FOLIO(folio_test_unevictable(folio), folio); >>>>>> folio_set_active(newfolio); >>>>>> } else if (folio_test_clear_unevictable(folio)) >>>>>> folio_set_unevictable(newfolio); >>>>> >>>>> Right. >>>>> >>>>>> >>>>>> For the above race case, the page won't be freed through "if (page_count(page) == 1)" code block. >>>>>> It will just be migrated and freed via put_page() after folio_migrate_flags() having cleared PG_active >>>>>> and PG_unevictable. >>>>>> " >>>>>> Or Am I miss something again? :) >>>>> >>>>> For #1, I'm still not sure what would happen on a speculative reference. >>>>> >>>>> It's worth summarizing that >>>>> >>>>> a) free_pages_prepare() will clear both flags via page->flags &= >>>>> ~PAGE_FLAGS_CHECK_AT_PREP; >>>>> >>>>> b) free_pages_prepare() will bail out if any flag is set in >>>>> check_free_page(). >>>>> >>>>> As we've never seen b) in the wild, this certainly has low priority, and >>>>> maybe it really cannot happen right now. >>>>> >>>>> However, maybe really allowing these flags to be set when freeing the >>>>> page and removing the "page_count(page) == 1" case from migration code >>>>> would be the clean thing to do. >>>> >>>> IMHO, check_free_page is used to catch possible problem. There's the comment of PAGE_FLAGS_CHECK_AT_FREE: >>>> >>>> /* >>>> * Flags checked when a page is freed. Pages being freed should not have >>>> * these flags set. If they are, there is a problem. >>>> */ >>>> #define PAGE_FLAGS_CHECK_AT_FREE >>>> >>>> There might be an assumption: when page is freed, it shouldn't be an active or unevictable page. It should be >>>> inactive and evictable. So allowing these flags to be set when freeing the page might not be a good idea? >>> >>> Yeah, and we'd be lifting that restriction because there is good reason >>> to do so. >>> >>> Maybe we *could* special case for isolated pages; however, that adds >>> runtime overhead. Of course, we could perform different checks for e.g., >>> DEBUG_VM vs !DEBUG_VM. >> >> I found there is one assumption about PG_active and PG_unevictable, i.e. in __folio_clear_lru_flags: >> >> /* this shouldn't happen, so leave the flags to bad_page() */ >> if (folio_test_active(folio) && folio_test_unevictable(folio)) >> return; >> >> If PG_active and PG_unevictable are both set, this case will be caught in the bad_page() via check_free_page(). >> There might be some other assumptions about PG_active and PG_unevictable. So I think it's not safe to lift that >> restriction. >> >> But maybe we could limit this check within DEBUG_VM as you suggested. Am I supposed to do it? > > Well, if you want, you can look into ways of cleaning that up and > removing the "if there is more than one reference, the owner hasn't > freed the page" condition, because there are corner cases where the > owner might have freed the page but speculative references keep the > refcount temporarily incremented.> Let me queue it to my TODO list. :) Thanks for your valuable suggestion!
diff --git a/mm/migrate.c b/mm/migrate.c index b779646665fe..0fc4651b3e39 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1093,12 +1093,8 @@ static int unmap_and_move(new_page_t get_new_page, /* page was freed from under us. So we are done. */ ClearPageActive(page); ClearPageUnevictable(page); - if (unlikely(__PageMovable(page))) { - lock_page(page); - if (!PageMovable(page)) - ClearPageIsolated(page); - unlock_page(page); - } + if (unlikely(__PageMovable(page))) + ClearPageIsolated(page); goto out; }