| Message ID | dc903b378d1e2d26bbbe85409ab9d009631f175c.1651056365.git.baolin.wang@linux.alibaba.com (mailing list archive) |
|---|---|
| State | New |
| Series | Fix cache flush issues considering PMD sharing |
On 4/27/22 03:52, Baolin Wang wrote:
> Now we will use flush_cache_page() to flush cache for anonymous hugetlb
> pages when unmapping or migrating a hugetlb page mapping, but the
> flush_cache_page() only handles a PAGE_SIZE range on some architectures
> (like arm32, arc and so on), which will cause potential cache issues.
> Thus change to use flush_cache_range() to cover the whole size of a
> hugetlb page.
>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  mm/rmap.c | 90 ++++++++++++++++++++++++++++++++++-----------------------------
>  1 file changed, 48 insertions(+), 42 deletions(-)
>
> [...]

Thanks,
The code looks fine. It is unfortunate that we need so many levels of
indenting and exceed 80 columns. But, that is OK.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

I see you have a followup series to address the call to ptep_clear_flush()
for hugetlb pages not unmapped via huge_pmd_share and will take a look at
that soon.
On 5/4/2022 4:17 AM, Mike Kravetz wrote:
> On 4/27/22 03:52, Baolin Wang wrote:
>> Now we will use flush_cache_page() to flush cache for anonymous hugetlb
>> pages when unmapping or migrating a hugetlb page mapping, but the
>> flush_cache_page() only handles a PAGE_SIZE range on some architectures
>> (like arm32, arc and so on), which will cause potential cache issues.
>> Thus change to use flush_cache_range() to cover the whole size of a
>> hugetlb page.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>>  mm/rmap.c | 90 ++++++++++++++++++++++++++++++++++-----------------------------
>>  1 file changed, 48 insertions(+), 42 deletions(-)
>>
>> [...]
>
> Thanks,
> The code looks fine. It is unfortunate that we need so many levels of
> indenting and exceed 80 columns. But, that is OK.

I'll do a cleanup to make it more readable by factoring the hugetlb case
into a new function after the fix series [1].

[1] https://lore.kernel.org/linux-arm-kernel/20220503120343.6264e126@thinkpad/T/

> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
>
> I see you have a followup series to address the call to ptep_clear_flush()
> for hugetlb pages not unmapped via huge_pmd_share and will take a look at
> that soon.

Thanks.
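The cleanup Baolin mentions above is not part of this patch. Purely as an illustration of the idea (hoisting the deeply nested hugetlb branch of try_to_unmap_one() into a helper so the indentation Mike points out goes away), it might look roughly like the sketch below. The helper name and parameter list are invented here; only the body mirrors the logic of the patch.

/*
 * Illustrative sketch only -- not from this patch or any follow-up.
 * A hypothetical helper that hoists the hugetlb-specific handling out
 * of try_to_unmap_one().  Returns true when a shared PMD was unshared
 * and the rmap walk for this VMA is finished.
 */
static bool hugetlb_unmap_one_sketch(struct folio *folio,
				     struct vm_area_struct *vma,
				     struct mmu_notifier_range *range,
				     struct page_vma_mapped_walk *pvmw,
				     unsigned long *address,
				     enum ttu_flags flags)
{
	struct mm_struct *mm = vma->vm_mm;

	/*
	 * huge_pmd_unshare() may unmap an entire PMD page, so flush the
	 * cache for the whole range up front, as in the patch.
	 */
	flush_cache_range(vma, range->start, range->end);

	/* Anonymous hugetlb pages never share PMDs. */
	if (folio_test_anon(folio))
		return false;

	/* Caller must hold i_mmap_rwsem in write mode. */
	VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));

	if (!huge_pmd_unshare(mm, vma, address, pvmw->pte))
		return false;

	flush_tlb_range(vma, range->start, range->end);
	mmu_notifier_invalidate_range(mm, range->start, range->end);

	/*
	 * The PMD page's ref count was dropped as part of shared-PMD map
	 * counting; tell the caller to stop walking this VMA.
	 */
	page_vma_mapped_walk_done(pvmw);
	return true;
}

The caller in try_to_unmap_one() or try_to_migrate_one() would then simply break out of the walk when the helper returns true, and fall through to the existing pte handling otherwise.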
diff --git a/mm/rmap.c b/mm/rmap.c
index 4f0d115..6fdd198 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1528,13 +1528,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
 
-		if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
-			/*
-			 * To call huge_pmd_unshare, i_mmap_rwsem must be
-			 * held in write mode. Caller needs to explicitly
-			 * do this outside rmap routines.
-			 */
-			VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+		if (folio_test_hugetlb(folio)) {
 			/*
 			 * huge_pmd_unshare may unmap an entire PMD page.
 			 * There is no way of knowing exactly which PMDs may
@@ -1544,22 +1538,31 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 */
 			flush_cache_range(vma, range.start, range.end);
 
-			if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
-				flush_tlb_range(vma, range.start, range.end);
-				mmu_notifier_invalidate_range(mm, range.start,
-							      range.end);
-
+			if (!folio_test_anon(folio)) {
 				/*
-				 * The ref count of the PMD page was dropped
-				 * which is part of the way map counting
-				 * is done for shared PMDs. Return 'true'
-				 * here. When there is no other sharing,
-				 * huge_pmd_unshare returns false and we will
-				 * unmap the actual page and drop map count
-				 * to zero.
+				 * To call huge_pmd_unshare, i_mmap_rwsem must be
+				 * held in write mode. Caller needs to explicitly
+				 * do this outside rmap routines.
 				 */
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+				VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+
+				if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
+					flush_tlb_range(vma, range.start, range.end);
+					mmu_notifier_invalidate_range(mm, range.start,
+								      range.end);
+
+					/*
+					 * The ref count of the PMD page was dropped
+					 * which is part of the way map counting
+					 * is done for shared PMDs. Return 'true'
+					 * here. When there is no other sharing,
+					 * huge_pmd_unshare returns false and we will
+					 * unmap the actual page and drop map count
+					 * to zero.
+					 */
+					page_vma_mapped_walk_done(&pvmw);
+					break;
+				}
 			}
 		} else {
 			flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
@@ -1885,13 +1888,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
 
-		if (folio_test_hugetlb(folio) && !folio_test_anon(folio)) {
-			/*
-			 * To call huge_pmd_unshare, i_mmap_rwsem must be
-			 * held in write mode. Caller needs to explicitly
-			 * do this outside rmap routines.
-			 */
-			VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+		if (folio_test_hugetlb(folio)) {
 			/*
 			 * huge_pmd_unshare may unmap an entire PMD page.
 			 * There is no way of knowing exactly which PMDs may
@@ -1901,22 +1898,31 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			 */
 			flush_cache_range(vma, range.start, range.end);
 
-			if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
-				flush_tlb_range(vma, range.start, range.end);
-				mmu_notifier_invalidate_range(mm, range.start,
-							      range.end);
-
+			if (!folio_test_anon(folio)) {
 				/*
-				 * The ref count of the PMD page was dropped
-				 * which is part of the way map counting
-				 * is done for shared PMDs. Return 'true'
-				 * here. When there is no other sharing,
-				 * huge_pmd_unshare returns false and we will
-				 * unmap the actual page and drop map count
-				 * to zero.
+				 * To call huge_pmd_unshare, i_mmap_rwsem must be
+				 * held in write mode. Caller needs to explicitly
+				 * do this outside rmap routines.
 				 */
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+				VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+
+				if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
+					flush_tlb_range(vma, range.start, range.end);
+					mmu_notifier_invalidate_range(mm, range.start,
+								      range.end);
+
+					/*
+					 * The ref count of the PMD page was dropped
+					 * which is part of the way map counting
+					 * is done for shared PMDs. Return 'true'
+					 * here. When there is no other sharing,
+					 * huge_pmd_unshare returns false and we will
+					 * unmap the actual page and drop map count
+					 * to zero.
+					 */
+					page_vma_mapped_walk_done(&pvmw);
+					break;
+				}
 			}
 		} else {
 			flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
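One detail worth noting in the diff above: the VM_BUG_ON(!(flags & TTU_RMAP_LOCKED)) check is kept, but now only in the !folio_test_anon() branch, because anonymous hugetlb pages never share PMDs. The check documents a contract with the callers rather than something try_to_unmap_one() can arrange itself. A hedged sketch of what the caller side is expected to do (the function name is invented here, and it assumes the caller already holds the folio lock as rmap callers do):

/*
 * Sketch only, not an actual call site: unmapping a file-backed hugetlb
 * folio.  i_mmap_rwsem is taken in write mode around the rmap walk so
 * that try_to_unmap_one() is allowed to call huge_pmd_unshare().
 */
static void unmap_file_hugetlb_folio_sketch(struct folio *folio)
{
	struct address_space *mapping = folio_mapping(folio);

	if (!folio_test_hugetlb(folio) || folio_test_anon(folio) || !mapping)
		return;

	i_mmap_lock_write(mapping);
	try_to_unmap(folio, TTU_RMAP_LOCKED);
	i_mmap_unlock_write(mapping);
}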
Now we will use flush_cache_page() to flush cache for anonymous hugetlb
pages when unmapping or migrating a hugetlb page mapping, but the
flush_cache_page() only handles a PAGE_SIZE range on some architectures
(like arm32, arc and so on), which will cause potential cache issues.
Thus change to use flush_cache_range() to cover the whole size of a
hugetlb page.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/rmap.c | 90 ++++++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 48 insertions(+), 42 deletions(-)
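To restate the fix in isolation: on architectures such as arm32 and arc, flush_cache_page() is only guaranteed to cover PAGE_SIZE, so a hugetlb mapping needs a range flush. The sketch below is not code from the patch (the patch reuses the precomputed range.start/range.end of the mmu notifier range rather than recomputing a size) and the helper name is made up; it only contrasts the old and new behavior:

/*
 * Illustration only.  Contrast the old per-PAGE_SIZE flush with the
 * range flush the patch switches to for hugetlb mappings.
 */
static void flush_hugetlb_mapping_sketch(struct vm_area_struct *vma,
					 unsigned long address)
{
	/* Size of the huge page backing this hugetlb VMA, e.g. 2MB or 1GB. */
	unsigned long sz = huge_page_size(hstate_vma(vma));

	/*
	 * Insufficient on arm32, arc, ...: flush_cache_page(vma, address, pfn)
	 * only writes back the first PAGE_SIZE worth of cache lines.
	 *
	 * Sufficient: flush the whole huge page mapping.
	 */
	flush_cache_range(vma, address, address + sz);
}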