diff mbox series

[v2,2/3] hugetlb: Convert hugetlb_no_page() to use struct vm_fault

Message ID 20240401202651.31440-3-vishal.moola@gmail.com (mailing list archive)
State New
Headers show
Series Hugetlb fault path to use struct vm_fault | expand

Commit Message

Vishal Moola April 1, 2024, 8:26 p.m. UTC
hugetlb_no_page() can use the struct vm_fault passed in from
hugetlb_fault(). This alleviates the stack by consolidating 7
variables into a single struct.

Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
 mm/hugetlb.c | 59 ++++++++++++++++++++++++++--------------------------
 1 file changed, 29 insertions(+), 30 deletions(-)

Comments

Oscar Salvador April 4, 2024, 12:50 p.m. UTC | #1
On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote:
> hugetlb_no_page() can use the struct vm_fault passed in from
> hugetlb_fault(). This alleviates the stack by consolidating 7
> variables into a single struct.
> 
> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
> ---
>  mm/hugetlb.c | 59 ++++++++++++++++++++++++++--------------------------
>  1 file changed, 29 insertions(+), 30 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 360b82374a89..aca2f11b4138 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6189,9 +6189,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm,
>  
>  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
>  			struct vm_area_struct *vma,
> -			struct address_space *mapping, pgoff_t idx,
> -			unsigned long address, pte_t *ptep,
> -			pte_t old_pte, unsigned int flags,
> +			struct address_space *mapping,

AFAICS all this can be self-contained in vm_fault struct.
vmf->vma->mm and vmf->vma.
I mean, if we want to convert this interface, why not going all the way?

Looks a bit odd some fields yes while some others remain.

Or am I missing something?
Vishal Moola April 4, 2024, 7:58 p.m. UTC | #2
On Thu, Apr 4, 2024 at 5:49 AM Oscar Salvador <osalvador@suse.de> wrote:
>
> On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote:
> > hugetlb_no_page() can use the struct vm_fault passed in from
> > hugetlb_fault(). This alleviates the stack by consolidating 7
> > variables into a single struct.
> >
> > Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
> > ---
> >  mm/hugetlb.c | 59 ++++++++++++++++++++++++++--------------------------
> >  1 file changed, 29 insertions(+), 30 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 360b82374a89..aca2f11b4138 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -6189,9 +6189,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm,
> >
> >  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
> >                       struct vm_area_struct *vma,
> > -                     struct address_space *mapping, pgoff_t idx,
> > -                     unsigned long address, pte_t *ptep,
> > -                     pte_t old_pte, unsigned int flags,
> > +                     struct address_space *mapping,
>
> AFAICS all this can be self-contained in vm_fault struct.
> vmf->vma->mm and vmf->vma.
> I mean, if we want to convert this interface, why not going all the way?
>
> Looks a bit odd some fields yes while some others remain.
>
> Or am I missing something?

Mainly just minimizing code churn, we would either unnecessarily
change multiple lines using vma or have to declare the variables
again anyways (or have extra churn I didn't like).
Oscar Salvador April 5, 2024, 3:12 a.m. UTC | #3
On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote:
> hugetlb_no_page() can use the struct vm_fault passed in from
> hugetlb_fault(). This alleviates the stack by consolidating 7
> variables into a single struct.
> 
> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>


Reviewed-by: Oscar Salvador <osalvador@suse.de>
Muchun Song April 7, 2024, 8:59 a.m. UTC | #4
> On Apr 5, 2024, at 03:58, Vishal Moola <vishal.moola@gmail.com> wrote:
> 
> On Thu, Apr 4, 2024 at 5:49 AM Oscar Salvador <osalvador@suse.de> wrote:
>> 
>> On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote:
>>> hugetlb_no_page() can use the struct vm_fault passed in from
>>> hugetlb_fault(). This alleviates the stack by consolidating 7
>>> variables into a single struct.
>>> 
>>> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
>>> ---
>>> mm/hugetlb.c | 59 ++++++++++++++++++++++++++--------------------------
>>> 1 file changed, 29 insertions(+), 30 deletions(-)
>>> 
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 360b82374a89..aca2f11b4138 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -6189,9 +6189,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm,
>>> 
>>> static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
>>>                      struct vm_area_struct *vma,
>>> -                     struct address_space *mapping, pgoff_t idx,
>>> -                     unsigned long address, pte_t *ptep,
>>> -                     pte_t old_pte, unsigned int flags,
>>> +                     struct address_space *mapping,
>> 
>> AFAICS all this can be self-contained in vm_fault struct.
>> vmf->vma->mm and vmf->vma.
>> I mean, if we want to convert this interface, why not going all the way?
>> 
>> Looks a bit odd some fields yes while some others remain.
>> 
>> Or am I missing something?
> 
> Mainly just minimizing code churn, we would either unnecessarily
> change multiple lines using vma or have to declare the variables
> again anyways (or have extra churn I didn't like).

I don't think adding some variables is a problem. I suppose the compiler
could do some optimization for us. So I think it is better to pass
only one argument vmf to hugetlb_no_page(). Otherwise, LGTM.

Muchun,
Thanks.
Vishal Moola April 8, 2024, 5:45 p.m. UTC | #5
On Sun, Apr 07, 2024 at 04:59:13PM +0800, Muchun Song wrote:
> 
> 
> > On Apr 5, 2024, at 03:58, Vishal Moola <vishal.moola@gmail.com> wrote:
> > 
> > On Thu, Apr 4, 2024 at 5:49 AM Oscar Salvador <osalvador@suse.de> wrote:
> >> 
> >> On Mon, Apr 01, 2024 at 01:26:50PM -0700, Vishal Moola (Oracle) wrote:
> >>> hugetlb_no_page() can use the struct vm_fault passed in from
> >>> hugetlb_fault(). This alleviates the stack by consolidating 7
> >>> variables into a single struct.
> >>> 
> >>> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
> >>> ---
> >>> mm/hugetlb.c | 59 ++++++++++++++++++++++++++--------------------------
> >>> 1 file changed, 29 insertions(+), 30 deletions(-)
> >>> 
> >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> >>> index 360b82374a89..aca2f11b4138 100644
> >>> --- a/mm/hugetlb.c
> >>> +++ b/mm/hugetlb.c
> >>> @@ -6189,9 +6189,7 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm,
> >>> 
> >>> static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
> >>>                      struct vm_area_struct *vma,
> >>> -                     struct address_space *mapping, pgoff_t idx,
> >>> -                     unsigned long address, pte_t *ptep,
> >>> -                     pte_t old_pte, unsigned int flags,
> >>> +                     struct address_space *mapping,
> >> 
> >> AFAICS all this can be self-contained in vm_fault struct.
> >> vmf->vma->mm and vmf->vma.
> >> I mean, if we want to convert this interface, why not going all the way?
> >> 
> >> Looks a bit odd some fields yes while some others remain.
> >> 
> >> Or am I missing something?
> > 
> > Mainly just minimizing code churn, we would either unnecessarily
> > change multiple lines using vma or have to declare the variables
> > again anyways (or have extra churn I didn't like).
> 
> I don't think adding some variables is a problem. I suppose the compiler
> could do some optimization for us. So I think it is better to pass
> only one argument vmf to hugetlb_no_page(). Otherwise, LGTM.

Alright we can get rid of the vm_area_struct and mm_struct arguments as
well.

Andrew, could you please fold the attached patch into this one?
From 891e085115a06f638e238bea267d520bb2432fba Mon Sep 17 00:00:00 2001
From: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
Date: Mon, 8 Apr 2024 10:17:54 -0700
Subject: [PATCH 1/2] hugetlb: Simplify hugetlb_no_page() arguments

To simplify the function arguments, as suggested by Oscar and Muchun.

Suggested-by: Muchun Song <muchun.song@linux.dev>
Suggested-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
 mm/hugetlb.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 456c81fbf8f5..05fe610f4699 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6186,11 +6186,11 @@ static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm,
 	return same;
 }
 
-static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
-			struct vm_area_struct *vma,
-			struct address_space *mapping,
+static vm_fault_t hugetlb_no_page(struct address_space *mapping,
 			struct vm_fault *vmf)
 {
+	struct vm_area_struct *vma = vmf->vma;
+	struct mm_struct *mm = vma->vm_mm;
 	struct hstate *h = hstate_vma(vma);
 	vm_fault_t ret = VM_FAULT_SIGBUS;
 	int anon_rmap = 0;
@@ -6483,7 +6483,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * hugetlb_no_page will drop vma lock and hugetlb fault
 		 * mutex internally, which make us return immediately.
 		 */
-		return hugetlb_no_page(mm, vma, mapping, &vmf);
+		return hugetlb_no_page(mapping, &vmf);
 	}
 
 	ret = 0;
diff mbox series

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 360b82374a89..aca2f11b4138 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6189,9 +6189,7 @@  static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm,
 
 static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			struct vm_area_struct *vma,
-			struct address_space *mapping, pgoff_t idx,
-			unsigned long address, pte_t *ptep,
-			pte_t old_pte, unsigned int flags,
+			struct address_space *mapping,
 			struct vm_fault *vmf)
 {
 	struct hstate *h = hstate_vma(vma);
@@ -6200,10 +6198,8 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	unsigned long size;
 	struct folio *folio;
 	pte_t new_pte;
-	spinlock_t *ptl;
-	unsigned long haddr = address & huge_page_mask(h);
 	bool new_folio, new_pagecache_folio = false;
-	u32 hash = hugetlb_fault_mutex_hash(mapping, idx);
+	u32 hash = hugetlb_fault_mutex_hash(mapping, vmf->pgoff);
 
 	/*
 	 * Currently, we are forced to kill the process in the event the
@@ -6222,10 +6218,10 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	 * before we get page_table_lock.
 	 */
 	new_folio = false;
-	folio = filemap_lock_hugetlb_folio(h, mapping, idx);
+	folio = filemap_lock_hugetlb_folio(h, mapping, vmf->pgoff);
 	if (IS_ERR(folio)) {
 		size = i_size_read(mapping->host) >> huge_page_shift(h);
-		if (idx >= size)
+		if (vmf->pgoff >= size)
 			goto out;
 		/* Check for page in userfault range */
 		if (userfaultfd_missing(vma)) {
@@ -6246,7 +6242,7 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			 * never happen on the page after UFFDIO_COPY has
 			 * correctly installed the page and returned.
 			 */
-			if (!hugetlb_pte_stable(h, mm, ptep, old_pte)) {
+			if (!hugetlb_pte_stable(h, mm, vmf->pte, vmf->orig_pte)) {
 				ret = 0;
 				goto out;
 			}
@@ -6255,7 +6251,7 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 							VM_UFFD_MISSING);
 		}
 
-		folio = alloc_hugetlb_folio(vma, haddr, 0);
+		folio = alloc_hugetlb_folio(vma, vmf->address, 0);
 		if (IS_ERR(folio)) {
 			/*
 			 * Returning error will result in faulting task being
@@ -6269,18 +6265,20 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			 * here.  Before returning error, get ptl and make
 			 * sure there really is no pte entry.
 			 */
-			if (hugetlb_pte_stable(h, mm, ptep, old_pte))
+			if (hugetlb_pte_stable(h, mm, vmf->pte, vmf->orig_pte))
 				ret = vmf_error(PTR_ERR(folio));
 			else
 				ret = 0;
 			goto out;
 		}
-		clear_huge_page(&folio->page, address, pages_per_huge_page(h));
+		clear_huge_page(&folio->page, vmf->real_address,
+				pages_per_huge_page(h));
 		__folio_mark_uptodate(folio);
 		new_folio = true;
 
 		if (vma->vm_flags & VM_MAYSHARE) {
-			int err = hugetlb_add_to_page_cache(folio, mapping, idx);
+			int err = hugetlb_add_to_page_cache(folio, mapping,
+							vmf->pgoff);
 			if (err) {
 				/*
 				 * err can't be -EEXIST which implies someone
@@ -6289,7 +6287,8 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 				 * to the page cache. So it's safe to call
 				 * restore_reserve_on_error() here.
 				 */
-				restore_reserve_on_error(h, vma, haddr, folio);
+				restore_reserve_on_error(h, vma, vmf->address,
+							folio);
 				folio_put(folio);
 				goto out;
 			}
@@ -6319,7 +6318,7 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 			folio_unlock(folio);
 			folio_put(folio);
 			/* See comment in userfaultfd_missing() block above */
-			if (!hugetlb_pte_stable(h, mm, ptep, old_pte)) {
+			if (!hugetlb_pte_stable(h, mm, vmf->pte, vmf->orig_pte)) {
 				ret = 0;
 				goto out;
 			}
@@ -6334,23 +6333,23 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	 * any allocations necessary to record that reservation occur outside
 	 * the spinlock.
 	 */
-	if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
-		if (vma_needs_reservation(h, vma, haddr) < 0) {
+	if ((vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
+		if (vma_needs_reservation(h, vma, vmf->address) < 0) {
 			ret = VM_FAULT_OOM;
 			goto backout_unlocked;
 		}
 		/* Just decrements count, does not deallocate */
-		vma_end_reservation(h, vma, haddr);
+		vma_end_reservation(h, vma, vmf->address);
 	}
 
-	ptl = huge_pte_lock(h, mm, ptep);
+	vmf->ptl = huge_pte_lock(h, mm, vmf->pte);
 	ret = 0;
 	/* If pte changed from under us, retry */
-	if (!pte_same(huge_ptep_get(ptep), old_pte))
+	if (!pte_same(huge_ptep_get(vmf->pte), vmf->orig_pte))
 		goto backout;
 
 	if (anon_rmap)
-		hugetlb_add_new_anon_rmap(folio, vma, haddr);
+		hugetlb_add_new_anon_rmap(folio, vma, vmf->address);
 	else
 		hugetlb_add_file_rmap(folio);
 	new_pte = make_huge_pte(vma, &folio->page, ((vma->vm_flags & VM_WRITE)
@@ -6359,17 +6358,18 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	 * If this pte was previously wr-protected, keep it wr-protected even
 	 * if populated.
 	 */
-	if (unlikely(pte_marker_uffd_wp(old_pte)))
+	if (unlikely(pte_marker_uffd_wp(vmf->orig_pte)))
 		new_pte = huge_pte_mkuffd_wp(new_pte);
-	set_huge_pte_at(mm, haddr, ptep, new_pte, huge_page_size(h));
+	set_huge_pte_at(mm, vmf->address, vmf->pte, new_pte, huge_page_size(h));
 
 	hugetlb_count_add(pages_per_huge_page(h), mm);
-	if ((flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
+	if ((vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) {
 		/* Optimization, do the COW without a second fault */
-		ret = hugetlb_wp(mm, vma, address, ptep, flags, folio, ptl, vmf);
+		ret = hugetlb_wp(mm, vma, vmf->real_address, vmf->pte,
+				vmf->flags, folio, vmf->ptl, vmf);
 	}
 
-	spin_unlock(ptl);
+	spin_unlock(vmf->ptl);
 
 	/*
 	 * Only set hugetlb_migratable in newly allocated pages.  Existing pages
@@ -6386,10 +6386,10 @@  static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 	return ret;
 
 backout:
-	spin_unlock(ptl);
+	spin_unlock(vmf->ptl);
 backout_unlocked:
 	if (new_folio && !new_pagecache_folio)
-		restore_reserve_on_error(h, vma, haddr, folio);
+		restore_reserve_on_error(h, vma, vmf->address, folio);
 
 	folio_unlock(folio);
 	folio_put(folio);
@@ -6485,8 +6485,7 @@  vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 		 * hugetlb_no_page will drop vma lock and hugetlb fault
 		 * mutex internally, which make us return immediately.
 		 */
-		return hugetlb_no_page(mm, vma, mapping, vmf.pgoff, address,
-					vmf.pte, vmf.orig_pte, flags, &vmf);
+		return hugetlb_no_page(mm, vma, mapping, &vmf);
 	}
 
 	ret = 0;