[v3,4/6] mm: introduce arch_do_swap_page_nr() which allows restoring metadata for nr pages

Message ID: 20240503005023.174597-5-21cnbao@gmail.com
State: New
Series: large folios swap-in: handle refault cases first

Commit Message

Barry Song May 3, 2024, 12:50 a.m. UTC
From: Barry Song <v-songbaohua@oppo.com>

If do_swap_page() gains the ability to map a large folio directly,
metadata must be restored for all nr pages of that folio rather than
for a single page. Only SPARC requires metadata restoration, and SPARC
does not enable THP_SWAP, so in the current kernel no configuration
actually needs to restore metadata for more than one page. Platforms
that do implement THP_SWAP may call this function with nr greater than
1 once do_swap_page() maps an entire large folio, but their
arch_do_swap_page_nr() implementations are empty.

Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 include/linux/pgtable.h | 26 ++++++++++++++++++++------
 mm/memory.c             |  3 ++-
 2 files changed, 22 insertions(+), 7 deletions(-)

Comments

Ryan Roberts May 3, 2024, 10:02 a.m. UTC | #1
On 03/05/2024 01:50, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Should do_swap_page() have the capability to directly map a large folio,
> metadata restoration becomes necessary for a specified number of pages
> denoted as nr. It's important to highlight that metadata restoration is
> solely required by the SPARC platform, which, however, does not enable
> THP_SWAP. Consequently, in the present kernel configuration, there
> exists no practical scenario where users necessitate the restoration of
> nr metadata. Platforms implementing THP_SWAP might invoke this function
> with nr values exceeding 1, subsequent to do_swap_page() successfully
> mapping an entire large folio. Nonetheless, their arch_do_swap_page_nr()
> functions remain empty.
> 
> Cc: Khalid Aziz <khalid.aziz@oracle.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Andreas Larsson <andreas@gaisler.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>  include/linux/pgtable.h | 26 ++++++++++++++++++++------
>  mm/memory.c             |  3 ++-
>  2 files changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 18019f037bae..463e84c3de26 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1084,6 +1084,15 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
>  })
>  
>  #ifndef __HAVE_ARCH_DO_SWAP_PAGE
> +static inline void arch_do_swap_page_nr(struct mm_struct *mm,
> +				     struct vm_area_struct *vma,
> +				     unsigned long addr,
> +				     pte_t pte, pte_t oldpte,
> +				     int nr)
> +{
> +
> +}
> +#else
>  /*
>   * Some architectures support metadata associated with a page. When a
>   * page is being swapped out, this metadata must be saved so it can be
> @@ -1092,12 +1101,17 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
>   * page as metadata for the page. arch_do_swap_page() can restore this
>   * metadata when a page is swapped back in.
>   */
> -static inline void arch_do_swap_page(struct mm_struct *mm,
> -				     struct vm_area_struct *vma,
> -				     unsigned long addr,
> -				     pte_t pte, pte_t oldpte)

This hook seems very similar to arch_swap_restore(); I wonder if it makes
sense to merge them. That is out of scope for this patch series, though.


> -{
> -
> +static inline void arch_do_swap_page_nr(struct mm_struct *mm,
> +					struct vm_area_struct *vma,
> +					unsigned long addr,
> +					pte_t pte, pte_t oldpte,
> +					int nr)
> +{
> +	for (int i = 0; i < nr; i++) {
> +		arch_do_swap_page(vma->vm_mm, vma, addr + i * PAGE_SIZE,
> +				pte_advance_pfn(pte, i),
> +				pte_advance_pfn(oldpte, i));

It seems a bit odd to create a batched version of this, but not allow arches to
take advantage. Although I guess your point is that only SPARC implements it and
on that platform nr will always be 1? So no point right now? So this is just a
convenience for do_swap_page()? Makes sense.

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

> +	}
>  }
>  #endif
>  
> diff --git a/mm/memory.c b/mm/memory.c
> index f033eb3528ba..74cdefd58f5f 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4266,7 +4266,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  	VM_BUG_ON(!folio_test_anon(folio) ||
>  			(pte_write(pte) && !PageAnonExclusive(page)));
>  	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
> -	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
> +	arch_do_swap_page_nr(vma->vm_mm, vma, vmf->address,
> +			pte, vmf->orig_pte, 1);
>  
>  	folio_unlock(folio);
>  	if (folio != swapcache && swapcache) {
Khalid Aziz May 6, 2024, 4:51 p.m. UTC | #2
On 5/2/24 18:50, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Should do_swap_page() have the capability to directly map a large folio,
> metadata restoration becomes necessary for a specified number of pages
> denoted as nr. It's important to highlight that metadata restoration is
> solely required by the SPARC platform, which, however, does not enable
> THP_SWAP. Consequently, in the present kernel configuration, there
> exists no practical scenario where users necessitate the restoration of
> nr metadata. Platforms implementing THP_SWAP might invoke this function
> with nr values exceeding 1, subsequent to do_swap_page() successfully
> mapping an entire large folio. Nonetheless, their arch_do_swap_page_nr()
> functions remain empty.
> 
> Cc: Khalid Aziz <khalid.aziz@oracle.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Andreas Larsson <andreas@gaisler.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>

Looks good to me.

Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>


Patch

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 18019f037bae..463e84c3de26 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1084,6 +1084,15 @@  static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
 })
 
 #ifndef __HAVE_ARCH_DO_SWAP_PAGE
+static inline void arch_do_swap_page_nr(struct mm_struct *mm,
+				     struct vm_area_struct *vma,
+				     unsigned long addr,
+				     pte_t pte, pte_t oldpte,
+				     int nr)
+{
+
+}
+#else
 /*
  * Some architectures support metadata associated with a page. When a
  * page is being swapped out, this metadata must be saved so it can be
@@ -1092,12 +1101,17 @@  static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
  * page as metadata for the page. arch_do_swap_page() can restore this
  * metadata when a page is swapped back in.
  */
-static inline void arch_do_swap_page(struct mm_struct *mm,
-				     struct vm_area_struct *vma,
-				     unsigned long addr,
-				     pte_t pte, pte_t oldpte)
-{
-
+static inline void arch_do_swap_page_nr(struct mm_struct *mm,
+					struct vm_area_struct *vma,
+					unsigned long addr,
+					pte_t pte, pte_t oldpte,
+					int nr)
+{
+	for (int i = 0; i < nr; i++) {
+		arch_do_swap_page(vma->vm_mm, vma, addr + i * PAGE_SIZE,
+				pte_advance_pfn(pte, i),
+				pte_advance_pfn(oldpte, i));
+	}
 }
 #endif
 
diff --git a/mm/memory.c b/mm/memory.c
index f033eb3528ba..74cdefd58f5f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4266,7 +4266,8 @@  vm_fault_t do_swap_page(struct vm_fault *vmf)
 	VM_BUG_ON(!folio_test_anon(folio) ||
 			(pte_write(pte) && !PageAnonExclusive(page)));
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
-	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
+	arch_do_swap_page_nr(vma->vm_mm, vma, vmf->address,
+			pte, vmf->orig_pte, 1);
 
 	folio_unlock(folio);
 	if (folio != swapcache && swapcache) {
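
For readers who want to see the shape of the change outside the kernel tree, here is a minimal userspace sketch of the pattern the patch introduces: a compile-time choice between an empty stub and a wrapper that loops over a per-page hook. It is not kernel code. Every name prefixed with model_ or MODEL_ is hypothetical, the PTE is reduced to a bare page frame number, and the old PTE (a swap entry in the kernel) is modelled the same way purely for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_PAGE_SIZE 4096UL
#define MODEL_MAX_CALLS 16

/* Stand-in for pte_t: only the page frame number is modelled. */
typedef struct { unsigned long pfn; } model_pte_t;

/* Log of per-page hook invocations so a caller can inspect them. */
struct model_call {
	unsigned long addr;
	unsigned long pfn;
	unsigned long old_pfn;
};
static struct model_call model_calls[MODEL_MAX_CALLS];
static size_t model_ncalls;

/* Stand-in for pte_advance_pfn(): same entry, PFN moved forward by nr. */
static inline model_pte_t model_pte_advance_pfn(model_pte_t pte,
						unsigned long nr)
{
	pte.pfn += nr;
	return pte;
}

/* Defining this selects the "architecture has a per-page hook" branch. */
#define MODEL_HAVE_ARCH_DO_SWAP_PAGE

#ifdef MODEL_HAVE_ARCH_DO_SWAP_PAGE
/* Stand-in for an architecture's per-page arch_do_swap_page(). */
static void model_arch_do_swap_page(unsigned long addr,
				    model_pte_t pte, model_pte_t oldpte)
{
	assert(model_ncalls < MODEL_MAX_CALLS);
	model_calls[model_ncalls++] =
		(struct model_call){ addr, pte.pfn, oldpte.pfn };
}
#endif

#ifndef MODEL_HAVE_ARCH_DO_SWAP_PAGE
/* No per-page hook: the batched wrapper compiles to nothing. */
static inline void model_arch_do_swap_page_nr(unsigned long addr,
					      model_pte_t pte,
					      model_pte_t oldpte, int nr)
{
	(void)addr; (void)pte; (void)oldpte; (void)nr;
}
#else
/* Per-page hook present: call it once for each of the nr pages. */
static inline void model_arch_do_swap_page_nr(unsigned long addr,
					      model_pte_t pte,
					      model_pte_t oldpte, int nr)
{
	for (int i = 0; i < nr; i++) {
		model_arch_do_swap_page(addr + i * MODEL_PAGE_SIZE,
					model_pte_advance_pfn(pte, i),
					model_pte_advance_pfn(oldpte, i));
	}
}
#endif
```

In this model, calling model_arch_do_swap_page_nr() with nr = 4 logs four hook invocations at consecutive page-sized offsets from the starting address, each with its PFN advanced by the page index. Removing the MODEL_HAVE_ARCH_DO_SWAP_PAGE define turns the wrapper into a no-op, which mirrors the situation the commit message describes for platforms that enable THP_SWAP.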