Message ID | 20240221234732.187629-6-vishal.moola@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Handle hugetlb faults under the VMA lock | expand |
On Wed, Feb 21, 2024 at 03:47:32PM -0800, Vishal Moola (Oracle) wrote: > Hugetlb can now safely handle faults under the VMA lock, so allow it to > do so. > > This patch may cause ltp hugemmap10 to "fail". Hugemmap10 tests hugetlb > counters, and expects the counters to remain unchanged on failure to > handle a fault. > > In hugetlb_no_page(), vmf_anon_prepare() may bailout with no anon_vma > under the VMA lock after allocating a folio for the hugepage. In > free_huge_folio(), this folio is completely freed on bailout iff there > is a surplus of hugetlb pages. This will remove a folio off the freelist > and decrement the number of hugepages while ltp expects these counters > to remain unchanged on failure. > > Originally this could only happen due to OOM failures, but now it may > also occur after we allocate a hugetlb folio without a suitable anon_vma > under the VMA lock. This should only happen for the first freshly > allocated hugepage in this vma. Hmm, so it's a bug in LTP. Have you talked to the LTP people about it (cc's added)? Also, did you try moving the anon_vma check befor the folio allocation so that you can bail out without disturbing the counters? > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
On Wed, Feb 21, 2024 at 7:55 PM Matthew Wilcox <willy@infradead.org> wrote: > > On Wed, Feb 21, 2024 at 03:47:32PM -0800, Vishal Moola (Oracle) wrote: > > Hugetlb can now safely handle faults under the VMA lock, so allow it to > > do so. > > > > This patch may cause ltp hugemmap10 to "fail". Hugemmap10 tests hugetlb > > counters, and expects the counters to remain unchanged on failure to > > handle a fault. > > > > In hugetlb_no_page(), vmf_anon_prepare() may bailout with no anon_vma > > under the VMA lock after allocating a folio for the hugepage. In > > free_huge_folio(), this folio is completely freed on bailout iff there > > is a surplus of hugetlb pages. This will remove a folio off the freelist > > and decrement the number of hugepages while ltp expects these counters > > to remain unchanged on failure. > > > > Originally this could only happen due to OOM failures, but now it may > > also occur after we allocate a hugetlb folio without a suitable anon_vma > > under the VMA lock. This should only happen for the first freshly > > allocated hugepage in this vma. > > Hmm, so it's a bug in LTP. Have you talked to the LTP people about it > (cc's added)? > > Also, did you try moving the anon_vma check befor the folio allocation > so that you can bail out without disturbing the counters? Moving the check before folio allocation occurs keeps the folios on the freelist so the counters remain static as expected. If we are looking at a shareable vma, hugetlb_no_page() does not need this check at all, so I left the check where it is. We could definitely place the anon_vma check above allocation, it would just make the code slightly more complicated - needing another variable (or reassigning ret in various places) as well as adding another potentially unnecessary vma flag check. > > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> > > Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index ae8c8b3da981..688017ca0cc2 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6354,12 +6354,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, */ }; - /* TODO: Handle faults under the VMA lock */ - if (flags & FAULT_FLAG_VMA_LOCK) { - vma_end_read(vma); - return VM_FAULT_RETRY; - } - /* * Serialize hugepage allocation and instantiation, so that we don't * get spurious allocation failures if two CPUs race to instantiate
Hugetlb can now safely handle faults under the VMA lock, so allow it to do so. This patch may cause ltp hugemmap10 to "fail". Hugemmap10 tests hugetlb counters, and expects the counters to remain unchanged on failure to handle a fault. In hugetlb_no_page(), vmf_anon_prepare() may bailout with no anon_vma under the VMA lock after allocating a folio for the hugepage. In free_huge_folio(), this folio is completely freed on bailout iff there is a surplus of hugetlb pages. This will remove a folio off the freelist and decrement the number of hugepages while ltp expects these counters to remain unchanged on failure. Originally this could only happen due to OOM failures, but now it may also occur after we allocate a hugetlb folio without a suitable anon_vma under the VMA lock. This should only happen for the first freshly allocated hugepage in this vma. Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> --- mm/hugetlb.c | 6 ------ 1 file changed, 6 deletions(-)