Message ID | 20240401202651.31440-3-vishal.moola@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Hugetlb fault path to use struct vm_fault | expand |
On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote: > hugetlb_no_page() can use the struct vm_fault passed in from > hugetlb_fault(). This alleviates the stack by consolidating 7 > variables into a single struct. > > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> > --- > mm/hugetlb.c | 59 ++++++++++++++++++++++++++-------------------------- > 1 file changed, 29 insertions(+), 30 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 360b82374a89..aca2f11b4138 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -6189,9 +6189,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, > > static vm_fault_t hugetlb_no_page(struct mm_struct *mm, > struct vm_area_struct *vma, > - struct address_space *mapping, pgoff_t idx, > - unsigned long address, pte_t *ptep, > - pte_t old_pte, unsigned int flags, > + struct address_space *mapping, AFAICS all this can be self-contained in vm_fault struct. vmf->vma->mm and vmf->vma. I mean, if we want to convert this interface, why not going all the way? Looks a bit odd some fields yes while some others remain. Or am I missing something?
On Thu, Apr 4, 2024 at 5:49 AM Oscar Salvador <osalvador@suse.de> wrote: > > On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote: > > hugetlb_no_page() can use the struct vm_fault passed in from > > hugetlb_fault(). This alleviates the stack by consolidating 7 > > variables into a single struct. > > > > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> > > --- > > mm/hugetlb.c | 59 ++++++++++++++++++++++++++-------------------------- > > 1 file changed, 29 insertions(+), 30 deletions(-) > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index 360b82374a89..aca2f11b4138 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -6189,9 +6189,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, > > > > static vm_fault_t hugetlb_no_page(struct mm_struct *mm, > > struct vm_area_struct *vma, > > - struct address_space *mapping, pgoff_t idx, > > - unsigned long address, pte_t *ptep, > > - pte_t old_pte, unsigned int flags, > > + struct address_space *mapping, > > AFAICS all this can be self-contained in vm_fault struct. > vmf->vma->mm and vmf->vma. > I mean, if we want to convert this interface, why not going all the way? > > Looks a bit odd some fields yes while some others remain. > > Or am I missing something? Mainly just minimizing code churn, we would either unnecessarily change multiple lines using vma or have to declare the variables again anyways (or have extra churn I didn't like).
On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote: > hugetlb_no_page() can use the struct vm_fault passed in from > hugetlb_fault(). This alleviates the stack by consolidating 7 > variables into a single struct. > > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> On Apr 5, 2024, at 03:58, Vishal Moola <vishal.moola@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 5:49 AM Oscar Salvador <osalvador@suse.de> wrote: >> >> On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote: >>> hugetlb_no_page() can use the struct vm_fault passed in from >>> hugetlb_fault(). This alleviates the stack by consolidating 7 >>> variables into a single struct. >>> >>> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> >>> --- >>> mm/hugetlb.c | 59 ++++++++++++++++++++++++++-------------------------- >>> 1 file changed, 29 insertions(+), 30 deletions(-) >>> >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >>> index 360b82374a89..aca2f11b4138 100644 >>> --- a/mm/hugetlb.c >>> +++ b/mm/hugetlb.c >>> @@ -6189,9 +6189,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, >>> >>> static vm_fault_t hugetlb_no_page(struct mm_struct *mm, >>> struct vm_area_struct *vma, >>> - struct address_space *mapping, pgoff_t idx, >>> - unsigned long address, pte_t *ptep, >>> - pte_t old_pte, unsigned int flags, >>> + struct address_space *mapping, >> >> AFAICS all this can be self-contained in vm_fault struct. >> vmf->vma->mm and vmf->vma. >> I mean, if we want to convert this interface, why not going all the way? >> >> Looks a bit odd some fields yes while some others remain. >> >> Or am I missing something? > > Mainly just minimizing code churn, we would either unnecessarily > change multiple lines using vma or have to declare the variables > again anyways (or have extra churn I didn't like). I don't think adding some variables is a problem. I suppose the compiler could do some optimization for us. So I think it is better to pass only one argument vmf to hugetlb_no_page(). Otherwise, LGTM. Muchun, Thanks.
On Sun, Apr 07, 2024 at 04:59:13PM +0800, Muchun Song wrote: > > > > On Apr 5, 2024, at 03:58, Vishal Moola <vishal.moola@gmail.com> wrote: > > > > On Thu, Apr 4, 2024 at 5:49 AM Oscar Salvador <osalvador@suse.de> wrote: > >> > >> On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote: > >>> hugetlb_no_page() can use the struct vm_fault passed in from > >>> hugetlb_fault(). This alleviates the stack by consolidating 7 > >>> variables into a single struct. > >>> > >>> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> > >>> --- > >>> mm/hugetlb.c | 59 ++++++++++++++++++++++++++-------------------------- > >>> 1 file changed, 29 insertions(+), 30 deletions(-) > >>> > >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c > >>> index 360b82374a89..aca2f11b4138 100644 > >>> --- a/mm/hugetlb.c > >>> +++ b/mm/hugetlb.c > >>> @@ -6189,9 +6189,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, > >>> > >>> static vm_fault_t hugetlb_no_page(struct mm_struct *mm, > >>> struct vm_area_struct *vma, > >>> - struct address_space *mapping, pgoff_t idx, > >>> - unsigned long address, pte_t *ptep, > >>> - pte_t old_pte, unsigned int flags, > >>> + struct address_space *mapping, > >> > >> AFAICS all this can be self-contained in vm_fault struct. > >> vmf->vma->mm and vmf->vma. > >> I mean, if we want to convert this interface, why not going all the way? > >> > >> Looks a bit odd some fields yes while some others remain. > >> > >> Or am I missing something? > > > > Mainly just minimizing code churn, we would either unnecessarily > > change multiple lines using vma or have to declare the variables > > again anyways (or have extra churn I didn't like). > > I don't think adding some variables is a problem. I suppose the compiler > could do some optimization for us. So I think it is better to pass > only one argument vmf to hugetlb_no_page(). Otherwise, LGTM. Alright we can get rid of the vm_area_struct and mm_struct arguments as well. Andrew, could you please fold the attached patch into this one? From 891e085115a06f638e238bea267d520bb2432fba Mon Sep 17 00:00:00 2001 From: "Vishal Moola (Oracle)" <vishal.moola@gmail.com> Date: Mon, 8 Apr 2024 10:17:54 -0700 Subject: [PATCH 1/2] hugetlb: Simplify hugetlb_no_page() arguments To simplify the function arguments, as suggested by Oscar and Muchun. Suggested-by: Muchun Song <muchun.song@linux.dev> Suggested-by: Oscar Salvador <osalvador@suse.de> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> --- mm/hugetlb.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 456c81fbf8f5..05fe610f4699 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6186,11 +6186,11 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, return same; } -static vm_fault_t hugetlb_no_page(struct mm_struct *mm, - struct vm_area_struct *vma, - struct address_space *mapping, +static vm_fault_t hugetlb_no_page(struct address_space *mapping, struct vm_fault *vmf) { + struct vm_area_struct *vma = vmf->vma; + struct mm_struct *mm = vma->vm_mm; struct hstate *h = hstate_vma(vma); vm_fault_t ret = VM_FAULT_SIGBUS; int anon_rmap = 0; @@ -6483,7 +6483,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * hugetlb_no_page will drop vma lock and hugetlb fault * mutex internally, which make us return immediately. */ - return hugetlb_no_page(mm, vma, mapping, &vmf); + return hugetlb_no_page(mapping, &vmf); } ret = 0;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 360b82374a89..aca2f11b4138 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6189,9 +6189,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, static vm_fault_t hugetlb_no_page(struct mm_struct *mm, struct vm_area_struct *vma, - struct address_space *mapping, pgoff_t idx, - unsigned long address, pte_t *ptep, - pte_t old_pte, unsigned int flags, + struct address_space *mapping, struct vm_fault *vmf) { struct hstate *h = hstate_vma(vma); @@ -6200,10 +6198,8 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, unsigned long size; struct folio *folio; pte_t new_pte; - spinlock_t *ptl; - unsigned long haddr = address & huge_page_mask(h); bool new_folio, new_pagecache_folio = false; - u32 hash = hugetlb_fault_mutex_hash(mapping, idx); + u32 hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff); /* * Currently, we are forced to kill the process in the event the @@ -6222,10 +6218,10 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * before we get page_table_lock. */ new_folio = false; - folio = filemap_lock_hugetlb_folio(h, mapping, idx); + folio = filemap_lock_hugetlb_folio(h, mapping, vmf->pgoff); if (IS_ERR(folio)) { size = i_size_read(mapping->host) >> huge_page_shift(h); - if (idx >= size) + if (vmf->pgoff >= size) goto out; /* Check for page in userfault range */ if (userfaultfd_missing(vma)) { @@ -6246,7 +6242,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * never happen on the page after UFFDIO_COPY has * correctly installed the page and returned. */ - if (!hugetlb_pte_stable(h, mm, ptep, old_pte)) { + if (!hugetlb_pte_stable(h, mm, vmf->pte, vmf->orig_pte)) { ret = 0; goto out; } @@ -6255,7 +6251,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, VM_UFFD_MISSING); } - folio = alloc_hugetlb_folio(vma, haddr, 0); + folio = alloc_hugetlb_folio(vma, vmf->address, 0); if (IS_ERR(folio)) { /* * Returning error will result in faulting task being @@ -6269,18 +6265,20 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * here. Before returning error, get ptl and make * sure there really is no pte entry. */ - if (hugetlb_pte_stable(h, mm, ptep, old_pte)) + if (hugetlb_pte_stable(h, mm, vmf->pte, vmf->orig_pte)) ret = vmf_error(PTR_ERR(folio)); else ret = 0; goto out; } - clear_huge_page(&folio->page, address, pages_per_huge_page(h)); + clear_huge_page(&folio->page, vmf->real_address, + pages_per_huge_page(h)); __folio_mark_uptodate(folio); new_folio = true; if (vma->vm_flags & VM_MAYSHARE) { - int err = hugetlb_add_to_page_cache(folio, mapping, idx); + int err = hugetlb_add_to_page_cache(folio, mapping, + vmf->pgoff); if (err) { /* * err can't be -EEXIST which implies someone @@ -6289,7 +6287,8 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * to the page cache. So it's safe to call * restore_reserve_on_error() here. */ - restore_reserve_on_error(h, vma, haddr, folio); + restore_reserve_on_error(h, vma, vmf->address, + folio); folio_put(folio); goto out; } @@ -6319,7 +6318,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, folio_unlock(folio); folio_put(folio); /* See comment in userfaultfd_missing() block above */ - if (!hugetlb_pte_stable(h, mm, ptep, old_pte)) { + if (!hugetlb_pte_stable(h, mm, vmf->pte, vmf->orig_pte)) { ret = 0; goto out; } @@ -6334,23 +6333,23 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * any allocations necessary to record that reservation occur outside * the spinlock. */ - if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) { - if (vma_needs_reservation(h, vma, haddr) < 0) { + if ((vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) { + if (vma_needs_reservation(h, vma, vmf->address) < 0) { ret = VM_FAULT_OOM; goto backout_unlocked; } /* Just decrements count, does not deallocate */ - vma_end_reservation(h, vma, haddr); + vma_end_reservation(h, vma, vmf->address); } - ptl = huge_pte_lock(h, mm, ptep); + vmf->ptl = huge_pte_lock(h, mm, vmf->pte); ret = 0; /* If pte changed from under us, retry */ - if (!pte_same(huge_ptep_get(ptep), old_pte)) + if (!pte_same(huge_ptep_get(vmf->pte), vmf->orig_pte)) goto backout; if (anon_rmap) - hugetlb_add_new_anon_rmap(folio, vma, haddr); + hugetlb_add_new_anon_rmap(folio, vma, vmf->address); else hugetlb_add_file_rmap(folio); new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE) @@ -6359,17 +6358,18 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, * If this pte was previously wr-protected, keep it wr-protected even * if populated. */ - if (unlikely(pte_marker_uffd_wp(old_pte))) + if (unlikely(pte_marker_uffd_wp(vmf->orig_pte))) new_pte = huge_pte_mkuffd_wp(new_pte); - set_huge_pte_at(mm, haddr, ptep, new_pte, huge_page_size(h)); + set_huge_pte_at(mm, vmf->address, vmf->pte, new_pte, huge_page_size(h)); hugetlb_count_add(pages_per_huge_page(h), mm); - if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) { + if ((vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) { /* Optimization, do the COW without a second fault */ - ret = hugetlb_wp(mm, vma, address, ptep, flags, folio, ptl, vmf); + ret = hugetlb_wp(mm, vma, vmf->real_address, vmf->pte, + vmf->flags, folio, vmf->ptl, vmf); } - spin_unlock(ptl); + spin_unlock(vmf->ptl); /* * Only set hugetlb_migratable in newly allocated pages. Existing pages @@ -6386,10 +6386,10 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, return ret; backout: - spin_unlock(ptl); + spin_unlock(vmf->ptl); backout_unlocked: if (new_folio && !new_pagecache_folio) - restore_reserve_on_error(h, vma, haddr, folio); + restore_reserve_on_error(h, vma, vmf->address, folio); folio_unlock(folio); folio_put(folio); @@ -6485,8 +6485,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * hugetlb_no_page will drop vma lock and hugetlb fault * mutex internally, which make us return immediately. */ - return hugetlb_no_page(mm, vma, mapping, vmf.pgoff, address, - vmf.pte, vmf.orig_pte, flags, &vmf); + return hugetlb_no_page(mm, vma, mapping, &vmf); } ret = 0;
hugetlb_no_page() can use the struct vm_fault passed in from hugetlb_fault(). This alleviates the stack by consolidating 7 variables into a single struct. Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> --- mm/hugetlb.c | 59 ++++++++++++++++++++++++++-------------------------- 1 file changed, 29 insertions(+), 30 deletions(-)