
[v1,03/16] arm64: hugetlb: Fix flush_hugetlb_tlb_range() invalidation level

Message ID 20250205151003.88959-4-ryan.roberts@arm.com (mailing list archive)
State: New
Series: hugetlb and vmalloc fixes and perf improvements

Commit Message

Ryan Roberts Feb. 5, 2025, 3:09 p.m. UTC
commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
spuriously try to invalidate at level 0 on LPA2-enabled systems.

Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
CONT_PTE_SIZE, which should provide a minor optimization.

Cc: <stable@vger.kernel.org>
Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

Comments

Anshuman Khandual Feb. 6, 2025, 6:46 a.m. UTC | #1
On 2/5/25 20:39, Ryan Roberts wrote:
> commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
> FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
> TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
> flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
> to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
> spuriously try to invalidate at level 0 on LPA2-enabled systems.
> 
> Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
> at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
> CONT_PTE_SIZE, which should provide a minor optimization.
> 
> Cc: <stable@vger.kernel.org>
> Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
> index 03db9cb21ace..8ab9542d2d22 100644
> --- a/arch/arm64/include/asm/hugetlb.h
> +++ b/arch/arm64/include/asm/hugetlb.h
> @@ -76,12 +76,20 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
>  {
>  	unsigned long stride = huge_page_size(hstate_vma(vma));
>  
> -	if (stride == PMD_SIZE)
> -		__flush_tlb_range(vma, start, end, stride, false, 2);
> -	else if (stride == PUD_SIZE)
> -		__flush_tlb_range(vma, start, end, stride, false, 1);
> -	else
> -		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
> +	switch (stride) {
> +	case PUD_SIZE:
> +		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
> +		break;

Just wondering - should not !__PAGETABLE_PMD_FOLDED and pud_sect_supported()
checks also be added here for this PUD_SIZE case ?

> +	case CONT_PMD_SIZE:
> +	case PMD_SIZE:
> +		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
> +		break;
> +	case CONT_PTE_SIZE:
> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
> +		break;
> +	default:
> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
> +	}
>  }
>  
>  #endif /* __ASM_HUGETLB_H */
Ryan Roberts Feb. 6, 2025, 1:04 p.m. UTC | #2
On 06/02/2025 06:46, Anshuman Khandual wrote:
> 
> 
> On 2/5/25 20:39, Ryan Roberts wrote:
>> commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
>> FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
>> TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
>> flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
>> to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
>> spuriously try to invalidate at level 0 on LPA2-enabled systems.
>>
>> Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
>> at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
>> CONT_PTE_SIZE, which should provide a minor optimization.
>>
>> Cc: <stable@vger.kernel.org>
>> Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>  arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
>> index 03db9cb21ace..8ab9542d2d22 100644
>> --- a/arch/arm64/include/asm/hugetlb.h
>> +++ b/arch/arm64/include/asm/hugetlb.h
>> @@ -76,12 +76,20 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
>>  {
>>  	unsigned long stride = huge_page_size(hstate_vma(vma));
>>  
>> -	if (stride == PMD_SIZE)
>> -		__flush_tlb_range(vma, start, end, stride, false, 2);
>> -	else if (stride == PUD_SIZE)
>> -		__flush_tlb_range(vma, start, end, stride, false, 1);
>> -	else
>> -		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
>> +	switch (stride) {
>> +	case PUD_SIZE:
>> +		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
>> +		break;
> 
> Just wondering - should not !__PAGETABLE_PMD_FOLDED and pud_sect_supported()
> checks also be added here for this PUD_SIZE case ?

Yeah I guess so. TBH, it's never been entirely clear to me what the benefit is.
Is it just to remove (a tiny amount of) dead code when we know we don't support
blocks at the level? Or is there something more fundamental going on that I've
missed?

We seem to be quite inconsistent with the use of pud_sect_supported() in
hugetlbpage.c.

Anyway, I'll add this in, I guess it's preferable to follow the established pattern.

Thanks,
Ryan

> 
>> +	case CONT_PMD_SIZE:
>> +	case PMD_SIZE:
>> +		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
>> +		break;
>> +	case CONT_PTE_SIZE:
>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
>> +		break;
>> +	default:
>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
>> +	}
>>  }
>>  
>>  #endif /* __ASM_HUGETLB_H */
Anshuman Khandual Feb. 13, 2025, 4:57 a.m. UTC | #3
On 2/6/25 18:34, Ryan Roberts wrote:
> On 06/02/2025 06:46, Anshuman Khandual wrote:
>>
>>
>> On 2/5/25 20:39, Ryan Roberts wrote:
>>> commit c910f2b65518 ("arm64/mm: Update tlb invalidation routines for
>>> FEAT_LPA2") changed the "invalidation level unknown" hint from 0 to
>>> TLBI_TTL_UNKNOWN (INT_MAX). But the fallback "unknown level" path in
>>> flush_hugetlb_tlb_range() was not updated. So as it stands, when trying
>>> to invalidate CONT_PMD_SIZE or CONT_PTE_SIZE hugetlb mappings, we will
>>> spuriously try to invalidate at level 0 on LPA2-enabled systems.
>>>
>>> Fix this so that the fallback passes TLBI_TTL_UNKNOWN, and while we are
>>> at it, explicitly use the correct stride and level for CONT_PMD_SIZE and
>>> CONT_PTE_SIZE, which should provide a minor optimization.
>>>
>>> Cc: <stable@vger.kernel.org>
>>> Fixes: c910f2b65518 ("arm64/mm: Update tlb invalidation routines for FEAT_LPA2")
>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>> ---
>>>  arch/arm64/include/asm/hugetlb.h | 20 ++++++++++++++------
>>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
>>> index 03db9cb21ace..8ab9542d2d22 100644
>>> --- a/arch/arm64/include/asm/hugetlb.h
>>> +++ b/arch/arm64/include/asm/hugetlb.h
>>> @@ -76,12 +76,20 @@ static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
>>>  {
>>>  	unsigned long stride = huge_page_size(hstate_vma(vma));
>>>  
>>> -	if (stride == PMD_SIZE)
>>> -		__flush_tlb_range(vma, start, end, stride, false, 2);
>>> -	else if (stride == PUD_SIZE)
>>> -		__flush_tlb_range(vma, start, end, stride, false, 1);
>>> -	else
>>> -		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
>>> +	switch (stride) {
>>> +	case PUD_SIZE:
>>> +		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
>>> +		break;
>>
>> Just wondering - should not !__PAGETABLE_PMD_FOLDED and pud_sect_supported()
>> checks also be added here for this PUD_SIZE case ?
> 
> Yeah I guess so. TBH, it's never been entirely clear to me what the benefit is?
> Is it just to remove (a tiny amount of) dead code when we know we don't support
> blocks at the level? Or is there something more fundamental going on that I've
> missed?

There is a generic fallback for PUD_SIZE in include/asm-generic/pgtable-nopud.h when
it is not defined on the arm64 platform, and pud_sect_supported() might also get
optimized away by the compiler.

static inline bool pud_sect_supported(void)
{
        return PAGE_SIZE == SZ_4K;
}

IIUC this just saves dead code from being compiled as you mentioned.

> 
> We seem to be quite inconsistent with the use of pud_sect_supported() in
> hugetlbpage.c.

The PUD_SIZE switch cases in hugetlb_mask_last_page() and arch_make_huge_pte()? Those
should be fixed.

> 
> Anyway, I'll add this in, I guess it's preferable to follow the established pattern.

Agreed.

> 
> Thanks,
> Ryan
> 
>>
>>> +	case CONT_PMD_SIZE:
>>> +	case PMD_SIZE:
>>> +		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
>>> +		break;
>>> +	case CONT_PTE_SIZE:
>>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
>>> +		break;
>>> +	default:
>>> +		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
>>> +	}
>>>  }
>>>  
>>>  #endif /* __ASM_HUGETLB_H */
>

Patch

diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 03db9cb21ace..8ab9542d2d22 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -76,12 +76,20 @@  static inline void flush_hugetlb_tlb_range(struct vm_area_struct *vma,
 {
 	unsigned long stride = huge_page_size(hstate_vma(vma));
 
-	if (stride == PMD_SIZE)
-		__flush_tlb_range(vma, start, end, stride, false, 2);
-	else if (stride == PUD_SIZE)
-		__flush_tlb_range(vma, start, end, stride, false, 1);
-	else
-		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 0);
+	switch (stride) {
+	case PUD_SIZE:
+		__flush_tlb_range(vma, start, end, PUD_SIZE, false, 1);
+		break;
+	case CONT_PMD_SIZE:
+	case PMD_SIZE:
+		__flush_tlb_range(vma, start, end, PMD_SIZE, false, 2);
+		break;
+	case CONT_PTE_SIZE:
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, 3);
+		break;
+	default:
+		__flush_tlb_range(vma, start, end, PAGE_SIZE, false, TLBI_TTL_UNKNOWN);
+	}
 }
 
 #endif /* __ASM_HUGETLB_H */