[v2,2/2] arm64/hugetlb: Implement arm64 specific huge_ptep_get()

Message ID de60e44dc6fa7991889320a7dfc9ee7ea38f01d8.1652411252.git.baolin.wang@linux.alibaba.com (mailing list archive)
State New, archived
Series Implement arm64 specific huge_ptep_get()

Commit Message

Baolin Wang May 13, 2022, 3:37 a.m. UTC
Currently huge_ptep_get() is used to read the pte value of a hugetlb
page. However, for a CONT-PTE or CONT-PMD size hugetlb page on ARM64,
which consists of several contiguous pte or pmd entries with the same
page table attributes, it returns the value of only one entry. As a
result it does not take into account the dirty or young bits of the
other subpages of a CONT-PTE/PMD size hugetlb page.

This makes huge_ptep_get() inconsistent with huge_ptep_get_and_clear(),
which already takes into account the dirty or young bits of every
subpage in a CONT-PTE/PMD size hugetlb page [1]. With the current
huge_ptep_get() we can also miss dirty or young state when gathering
statistics for hugetlb pages, for example in gather_hugetlb_stats() and
in CONT-PTE/PMD hugetlb monitoring with DAMON.

Thus define an ARM64 specific huge_ptep_get() implementation that takes
into account the dirty or young bits of every subpage of a CONT-PTE/PMD
size hugetlb page, for those functions that want to check the dirty and
young flags of a hugetlb page.

[1] https://lore.kernel.org/linux-mm/85bd80b4-b4fd-0d3f-a2e5-149559f2f387@oracle.com/

Suggested-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
---
 arch/arm64/include/asm/hugetlb.h |  2 ++
 arch/arm64/mm/hugetlbpage.c      | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+)
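
The aggregation described in the commit message can be illustrated with a
small user-space model. This is only a sketch under stated assumptions: the
MODEL_PTE_* flag bits, the model_pte_t type and model_huge_ptep_get() are
hypothetical names made up for this example and do not reflect the real
arm64 PTE layout or kernel API. The caller passes the number of contiguous
entries directly, whereas the patch derives it from the page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this model only; not the arm64 PTE layout. */
#define MODEL_PTE_PRESENT (UINT64_C(1) << 0)
#define MODEL_PTE_CONT    (UINT64_C(1) << 1)
#define MODEL_PTE_YOUNG   (UINT64_C(1) << 2)
#define MODEL_PTE_DIRTY   (UINT64_C(1) << 3)

typedef uint64_t model_pte_t;

/*
 * Return the entry at ptep with the dirty and young bits of all
 * ncontig entries of the contiguous range folded in.
 */
static model_pte_t model_huge_ptep_get(const model_pte_t *ptep, size_t ncontig)
{
	model_pte_t orig;
	size_t i;

	assert(ptep != NULL && ncontig > 0);
	orig = ptep[0];

	/* Non-present or non-contiguous entries are returned unchanged. */
	if (!(orig & MODEL_PTE_PRESENT) || !(orig & MODEL_PTE_CONT))
		return orig;

	/* Any dirty or young subpage marks the whole huge page. */
	for (i = 0; i < ncontig; i++)
		orig |= ptep[i] & (MODEL_PTE_DIRTY | MODEL_PTE_YOUNG);

	return orig;
}
```

In this model, a range in which only the third entry is dirty and only the
fourth is young yields a value that is both dirty and young, while reading
the first entry alone would report neither.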

Comments

Anshuman Khandual May 13, 2022, 11:03 a.m. UTC | #1
On 5/13/22 09:07, Baolin Wang wrote:
> Now we use huge_ptep_get() to get the pte value of a hugetlb page,
> however it will only return one specific pte value for the CONT-PTE
> or CONT-PMD size hugetlb on ARM64 system, which can contain seravel

A small nit.

s/seravel/several

> continuous pte or pmd entries with same page table attributes. And it
> will not take into account the subpages' dirty or young bits of a
> CONT-PTE/PMD size hugetlb page.
> 
> So the huge_ptep_get() is inconsistent with huge_ptep_get_and_clear(),
> which already takes account the dirty or young bits for any subpages
> in this CONT-PTE/PMD size hugetlb [1]. Meanwhile we can miss dirty or
> young flags statistics for hugetlb pages with current huge_ptep_get(),
> such as the gather_hugetlb_stats() function, and CONT-PTE/PMD hugetlb
> monitoring with DAMON.
> 
> Thus define an ARM64 specific  huge_ptep_get() implementation, that will
> take into account any subpages' dirty or young bits for CONT-PTE/PMD size
> hugetlb page, for those functions that want to check the dirty and young
> flags of a hugetlb page.
> 
> [1] https://lore.kernel.org/linux-mm/85bd80b4-b4fd-0d3f-a2e5-149559f2f387@oracle.com/

Might be worth mentioning that arm64 now enables __HAVE_ARCH_HUGE_PTEP_GET.

> 
> Suggested-by: Muchun Song <songmuchun@bytedance.com>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  arch/arm64/include/asm/hugetlb.h |  2 ++
>  arch/arm64/mm/hugetlbpage.c      | 24 ++++++++++++++++++++++++
>  2 files changed, 26 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
> index 616b2ca..1fd2846 100644
> --- a/arch/arm64/include/asm/hugetlb.h
> +++ b/arch/arm64/include/asm/hugetlb.h
> @@ -44,6 +44,8 @@ extern pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
>  #define __HAVE_ARCH_HUGE_PTE_CLEAR
>  extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
>  			   pte_t *ptep, unsigned long sz);
> +#define __HAVE_ARCH_HUGE_PTEP_GET
> +extern pte_t huge_ptep_get(pte_t *ptep);
>  extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
>  				 pte_t *ptep, pte_t pte, unsigned long sz);
>  #define set_huge_swap_pte_at set_huge_swap_pte_at
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 9553851..9a3f7f1 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -158,6 +158,30 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
>  	return contig_ptes;
>  }
>  
> +pte_t huge_ptep_get(pte_t *ptep)
> +{
> +	int ncontig, i;
> +	size_t pgsize;
> +	pte_t orig_pte = ptep_get(ptep);
> +
> +	if (!pte_present(orig_pte) || !pte_cont(orig_pte))
> +		return orig_pte;
> +
> +	ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);

Hmm, I guess there is no better way of deriving page size here.

Please drop the extra line here.

> +
> +	for (i = 0; i < ncontig; i++, ptep++) {
> +		pte_t pte = ptep_get(ptep);
> +
> +		if (pte_dirty(pte))
> +			orig_pte = pte_mkdirty(orig_pte);
> +
> +		if (pte_young(pte))
> +			orig_pte = pte_mkyoung(orig_pte);
> +	}

Please drop the extra line here.

> +
> +	return orig_pte;
> +}
> +
>  /*
>   * Changing some bits of contiguous entries requires us to follow a
>   * Break-Before-Make approach, breaking the whole contiguous set
Otherwise LGTM.

With those small changes accommodated,

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

Baolin Wang May 14, 2022, 1:43 a.m. UTC | #2
On 5/13/2022 7:03 PM, Anshuman Khandual wrote:
> 
> 
> On 5/13/22 09:07, Baolin Wang wrote:
>> Now we use huge_ptep_get() to get the pte value of a hugetlb page,
>> however it will only return one specific pte value for the CONT-PTE
>> or CONT-PMD size hugetlb on ARM64 system, which can contain seravel
> 
> A small nit.
> 
> s/seravel/several

Will fix.

> 
>> continuous pte or pmd entries with same page table attributes. And it
>> will not take into account the subpages' dirty or young bits of a
>> CONT-PTE/PMD size hugetlb page.
>>
>> So the huge_ptep_get() is inconsistent with huge_ptep_get_and_clear(),
>> which already takes account the dirty or young bits for any subpages
>> in this CONT-PTE/PMD size hugetlb [1]. Meanwhile we can miss dirty or
>> young flags statistics for hugetlb pages with current huge_ptep_get(),
>> such as the gather_hugetlb_stats() function, and CONT-PTE/PMD hugetlb
>> monitoring with DAMON.
>>
>> Thus define an ARM64 specific  huge_ptep_get() implementation, that will
>> take into account any subpages' dirty or young bits for CONT-PTE/PMD size
>> hugetlb page, for those functions that want to check the dirty and young
>> flags of a hugetlb page.
>>
>> [1] https://lore.kernel.org/linux-mm/85bd80b4-b4fd-0d3f-a2e5-149559f2f387@oracle.com/
> 
> Might be worth mentioning that arm64 now enables __HAVE_ARCH_HUGE_PTEP_GET.

Sure.

> 
>>
>> Suggested-by: Muchun Song <songmuchun@bytedance.com>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> Reviewed-by: Muchun Song <songmuchun@bytedance.com>
>> ---
>>   arch/arm64/include/asm/hugetlb.h |  2 ++
>>   arch/arm64/mm/hugetlbpage.c      | 24 ++++++++++++++++++++++++
>>   2 files changed, 26 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
>> index 616b2ca..1fd2846 100644
>> --- a/arch/arm64/include/asm/hugetlb.h
>> +++ b/arch/arm64/include/asm/hugetlb.h
>> @@ -44,6 +44,8 @@ extern pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
>>   #define __HAVE_ARCH_HUGE_PTE_CLEAR
>>   extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
>>   			   pte_t *ptep, unsigned long sz);
>> +#define __HAVE_ARCH_HUGE_PTEP_GET
>> +extern pte_t huge_ptep_get(pte_t *ptep);
>>   extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
>>   				 pte_t *ptep, pte_t pte, unsigned long sz);
>>   #define set_huge_swap_pte_at set_huge_swap_pte_at
>> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
>> index 9553851..9a3f7f1 100644
>> --- a/arch/arm64/mm/hugetlbpage.c
>> +++ b/arch/arm64/mm/hugetlbpage.c
>> @@ -158,6 +158,30 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
>>   	return contig_ptes;
>>   }
>>   
>> +pte_t huge_ptep_get(pte_t *ptep)
>> +{
>> +	int ncontig, i;
>> +	size_t pgsize;
>> +	pte_t orig_pte = ptep_get(ptep);
>> +
>> +	if (!pte_present(orig_pte) || !pte_cont(orig_pte))
>> +		return orig_pte;
>> +
>> +	ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);
> 
> Hmm, I guess there is no better way of deriving page size here.
> 
> Please drop the extra line here.

OK.

> 
>> +
>> +	for (i = 0; i < ncontig; i++, ptep++) {
>> +		pte_t pte = ptep_get(ptep);
>> +
>> +		if (pte_dirty(pte))
>> +			orig_pte = pte_mkdirty(orig_pte);
>> +
>> +		if (pte_young(pte))
>> +			orig_pte = pte_mkyoung(orig_pte);
>> +	}
> 
> Please drop the extra line here.

Sure.

> 
>> +
>> +	return orig_pte;
>> +}
>> +
>>   /*
>>    * Changing some bits of contiguous entries requires us to follow a
>>    * Break-Before-Make approach, breaking the whole contiguous set
> Otherwise LGTM.
> 
> With those small changes accommodated,
> 
> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

Thanks for reviewing.

Patch

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 616b2ca..1fd2846 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -44,6 +44,8 @@  extern pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 #define __HAVE_ARCH_HUGE_PTE_CLEAR
 extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 			   pte_t *ptep, unsigned long sz);
+#define __HAVE_ARCH_HUGE_PTEP_GET
+extern pte_t huge_ptep_get(pte_t *ptep);
 extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
 				 pte_t *ptep, pte_t pte, unsigned long sz);
 #define set_huge_swap_pte_at set_huge_swap_pte_at
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 9553851..9a3f7f1 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -158,6 +158,30 @@  static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
 	return contig_ptes;
 }
 
+pte_t huge_ptep_get(pte_t *ptep)
+{
+	int ncontig, i;
+	size_t pgsize;
+	pte_t orig_pte = ptep_get(ptep);
+
+	if (!pte_present(orig_pte) || !pte_cont(orig_pte))
+		return orig_pte;
+
+	ncontig = num_contig_ptes(page_size(pte_page(orig_pte)), &pgsize);
+
+	for (i = 0; i < ncontig; i++, ptep++) {
+		pte_t pte = ptep_get(ptep);
+
+		if (pte_dirty(pte))
+			orig_pte = pte_mkdirty(orig_pte);
+
+		if (pte_young(pte))
+			orig_pte = pte_mkyoung(orig_pte);
+	}
+
+	return orig_pte;
+}
+
 /*
  * Changing some bits of contiguous entries requires us to follow a
  * Break-Before-Make approach, breaking the whole contiguous set
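
The review asked for the commit message to mention that arm64 now defines
__HAVE_ARCH_HUGE_PTEP_GET. The following user-space sketch shows why that
macro matters, assuming the generic fallback is wrapped in an #ifndef guard
on the same macro. The pte_t typedef, the MODEL_DIRTY bit and the two-entry
range are stand-ins invented for this example, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in type for this model; the kernel's pte_t is not a plain integer. */
typedef unsigned long pte_t;

/* Hypothetical flag bit for this model only. */
#define MODEL_DIRTY 0x1UL

/* What the arch header does in the patch: claim the hook, declare it. */
#define __HAVE_ARCH_HUGE_PTEP_GET
pte_t huge_ptep_get(pte_t *ptep);

/* Assumed shape of the generic fallback: compiled out once the macro is set. */
#ifndef __HAVE_ARCH_HUGE_PTEP_GET
static inline pte_t huge_ptep_get(pte_t *ptep)
{
	return *ptep;
}
#endif

/* Arch version for this model: fold the dirty bit of a two-entry range. */
pte_t huge_ptep_get(pte_t *ptep)
{
	assert(ptep != NULL);
	return ptep[0] | (ptep[1] & MODEL_DIRTY);
}
```

Because the macro is defined before the guarded fallback, only the arch
version is compiled, so every caller of huge_ptep_get() picks up the
aggregated value without any change at the call sites.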