diff mbox series

tlb: hugetlb: Add arm64 contiguous hint awareness

Message ID 20220323165218.35499-1-steve.capper@arm.com (mailing list archive)
State New
Headers show
Series tlb: hugetlb: Add arm64 contiguous hint awareness | expand

Commit Message

Steve Capper March 23, 2022, 4:52 p.m. UTC
tlb_remove_huge_tlb_entry only considers PMD_SIZE and PUD_SIZE when
updating the mmu_gather structure.

Unfortunately on arm64 there are two additional huge page sizes that
need to be covered: CONT_PTE_SIZE and CONT_PMD_SIZE. Where an end-user
attempts to employ contiguous huge pages, a VM_BUG_ON can be experienced
due to the fact that the tlb structure hasn't been correctly updated by
the relevant tlb_flush_p.._range() call from tlb_remove_huge_tlb_entry.

This patch adds the ability to define per-arch implementations of
tlb_remove_huge_tlb_entry and supplies an arm64 implementation that
covers the contiguous hint sizes.

Reported-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/linux-mm/811c5c8e-b3a2-85d2-049c-717f17c3a03a@redhat.com/
Signed-off-by: Steve Capper <steve.capper@arm.com>
---
 arch/arm64/include/asm/tlb.h | 14 ++++++++++++++
 include/asm-generic/tlb.h    |  2 ++
 2 files changed, 16 insertions(+)

Comments

Catalin Marinas March 23, 2022, 6:30 p.m. UTC | #1
On Wed, Mar 23, 2022 at 04:52:18PM +0000, Steve Capper wrote:
> tlb_remove_huge_tlb_entry only considers PMD_SIZE and PUD_SIZE when
> updating the mmu_gather structure.
> 
> Unfortunately on arm64 there are two additional huge page sizes that
> need to be covered: CONT_PTE_SIZE and CONT_PMD_SIZE. Where an end-user
> attempts to employ contiguous huge pages, a VM_BUG_ON can be experienced
> due to the fact that the tlb structure hasn't been correctly updated by
> the relevant tlb_flush_p.._range() call from tlb_remove_huge_tlb_entry.
> 
> This patch adds the ability to define per-arch implementations of
> tlb_remove_huge_tlb_entry and supplies an arm64 implementation that
> covers the contiguous hint sizes.
> 
> Reported-by: David Hildenbrand <david@redhat.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Link: https://lore.kernel.org/linux-mm/811c5c8e-b3a2-85d2-049c-717f17c3a03a@redhat.com/
> Signed-off-by: Steve Capper <steve.capper@arm.com>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Peter Zijlstra March 24, 2022, 2:33 p.m. UTC | #2
On Wed, Mar 23, 2022 at 04:52:18PM +0000, Steve Capper wrote:

> +#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
> +	do {							\
> +		unsigned long _sz = huge_page_size(h);		\
> +		if (_sz == CONT_PTE_SIZE)			\
> +			tlb_flush_pte_range(tlb, address, _sz);	\
> +		else if (_sz == PMD_SIZE)			\
> +			tlb_flush_pmd_range(tlb, address, _sz);	\
> +		else if (_sz == CONT_PMD_SIZE)			\
> +			tlb_flush_pmd_range(tlb, address, _sz);	\
> +		else if (_sz == PUD_SIZE)			\
> +			tlb_flush_pud_range(tlb, address, _sz);	\
> +		__tlb_remove_tlb_entry(tlb, ptep, address);	\
> +	} while (0)


It occurs to me that perhaps this can be written like:

	unsigned long _sz = huge_page_size(h);
	if (_sz >= P4D_SIZE)
		tlb_flush_p4d_range(tlb, address, _sz);
	else if (_sz >= PUD_SIZE)
		tlb_flush_pud_range(tlb, address, _sz);
	else if (_sz >= PMD_SIZE)
		tlb_flush_pmd_range(tlb, address, _sz);
	else
		tlb_flush_pte_range(tlb, address, _sz);
	__tlb_remove_tlb_entry(tlb, ptep, address);

And then it can still be generic..
Steve Capper March 24, 2022, 2:35 p.m. UTC | #3
On 24/03/2022 14:33, Peter Zijlstra wrote:
> On Wed, Mar 23, 2022 at 04:52:18PM +0000, Steve Capper wrote:
> 
>> +#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
>> +	do {							\
>> +		unsigned long _sz = huge_page_size(h);		\
>> +		if (_sz == CONT_PTE_SIZE)			\
>> +			tlb_flush_pte_range(tlb, address, _sz);	\
>> +		else if (_sz == PMD_SIZE)			\
>> +			tlb_flush_pmd_range(tlb, address, _sz);	\
>> +		else if (_sz == CONT_PMD_SIZE)			\
>> +			tlb_flush_pmd_range(tlb, address, _sz);	\
>> +		else if (_sz == PUD_SIZE)			\
>> +			tlb_flush_pud_range(tlb, address, _sz);	\
>> +		__tlb_remove_tlb_entry(tlb, ptep, address);	\
>> +	} while (0)
> 
> 
> It occurs to me that perhaps this can be written like:
> 
> 	unsigned long _sz = huge_page_size(h);
> 	if (_sz >= P4D_SIZE)
> 		tlb_flush_p4d_range(tlb, address, _sz);
> 	else if (_sz >= PUD_SIZE)
> 		tlb_flush_pud_range(tlb, address, _sz);
> 	else if (_sz >= PMD_SIZE)
> 		tlb_flush_pmd_range(tlb, address, _sz);
> 	else
> 		tlb_flush_pte_range(tlb, address, _sz);
> 	__tlb_remove_tlb_entry(tlb, ptep, address);
> 
> And then it can still be generic..

Thanks Peter,
My concern with that would be the CONT_PMD_SIZE case would result in a 
call to tlb_flush_pte_range rather than tlb_flush_pmd_range causing some 
of the level parameters to be different.

Would that cause an issue?

Cheers,
Peter Zijlstra March 24, 2022, 3:58 p.m. UTC | #4
On Thu, Mar 24, 2022 at 02:35:44PM +0000, Steve Capper wrote:
> > It occurs to me that perhaps this can be written like:
> > 
> > 	unsigned long _sz = huge_page_size(h);
> > 	if (_sz >= P4D_SIZE)
> > 		tlb_flush_p4d_range(tlb, address, _sz);
> > 	else if (_sz >= PUD_SIZE)
> > 		tlb_flush_pud_range(tlb, address, _sz);
> > 	else if (_sz >= PMD_SIZE)
> > 		tlb_flush_pmd_range(tlb, address, _sz);
> > 	else
> > 		tlb_flush_pte_range(tlb, address, _sz);
> > 	__tlb_remove_tlb_entry(tlb, ptep, address);
> > 
> > And then it can still be generic..
> 
> Thanks Peter,
> My concern with that would be the CONT_PMD_SIZE case would result in a call
> to tlb_flush_pte_range rather than tlb_flush_pmd_range causing some of the
> level parameters to be different.

arch/arm64/include/asm/pgtable-hwdef.h:#define CONT_PMD_SIZE            (CONT_PMDS * PMD_SIZE)

Seems to imply CONT_PMD_SIZE >= PMD_SIZE, and would thus tickle:

> > 	else if (_sz >= PMD_SIZE)
> > 		tlb_flush_pmd_range(tlb, address, _sz);

Or am I confused?
Steve Capper March 24, 2022, 4:11 p.m. UTC | #5
On 24/03/2022 15:58, Peter Zijlstra wrote:
> On Thu, Mar 24, 2022 at 02:35:44PM +0000, Steve Capper wrote:
>>> It occurs to me that perhaps this can be written like:
>>>
>>> 	unsigned long _sz = huge_page_size(h);
>>> 	if (_sz >= P4D_SIZE)
>>> 		tlb_flush_p4d_range(tlb, address, _sz);
>>> 	else if (_sz >= PUD_SIZE)
>>> 		tlb_flush_pud_range(tlb, address, _sz);
>>> 	else if (_sz >= PMD_SIZE)
>>> 		tlb_flush_pmd_range(tlb, address, _sz);
>>> 	else
>>> 		tlb_flush_pte_range(tlb, address, _sz);
>>> 	__tlb_remove_tlb_entry(tlb, ptep, address);
>>>
>>> And then it can still be generic..
>>
>> Thanks Peter,
>> My concern with that would be the CONT_PMD_SIZE case would result in a call
>> to tlb_flush_pte_range rather than tlb_flush_pmd_range causing some of the
>> level parameters to be different.
> 
> arch/arm64/include/asm/pgtable-hwdef.h:#define CONT_PMD_SIZE            (CONT_PMDS * PMD_SIZE)
> 
> Seems to imply CONT_PMD_SIZE >= PMD_SIZE, and would thus tickle:
> 
>>> 	else if (_sz >= PMD_SIZE)
>>> 		tlb_flush_pmd_range(tlb, address, _sz);
> 
> Or am I confused?

Nope, it was me who was confused :-).
I misread one of the the lines sorry.
Yeah this should be fine for arm64.

Cheers,
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index c995d1f4594f..311d201d4e72 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -19,6 +19,20 @@  static inline void __tlb_remove_table(void *_table)
 #define tlb_flush tlb_flush
 static void tlb_flush(struct mmu_gather *tlb);
 
+#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
+	do {							\
+		unsigned long _sz = huge_page_size(h);		\
+		if (_sz == CONT_PTE_SIZE)			\
+			tlb_flush_pte_range(tlb, address, _sz);	\
+		else if (_sz == PMD_SIZE)			\
+			tlb_flush_pmd_range(tlb, address, _sz);	\
+		else if (_sz == CONT_PMD_SIZE)			\
+			tlb_flush_pmd_range(tlb, address, _sz);	\
+		else if (_sz == PUD_SIZE)			\
+			tlb_flush_pud_range(tlb, address, _sz);	\
+		__tlb_remove_tlb_entry(tlb, ptep, address);	\
+	} while (0)
+
 #include <asm-generic/tlb.h>
 
 /*
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 2c68a545ffa7..4622ee45f739 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -562,6 +562,7 @@  static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 		__tlb_remove_tlb_entry(tlb, ptep, address);	\
 	} while (0)
 
+#ifndef tlb_remove_huge_tlb_entry
 #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
 	do {							\
 		unsigned long _sz = huge_page_size(h);		\
@@ -571,6 +572,7 @@  static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 			tlb_flush_pud_range(tlb, address, _sz);	\
 		__tlb_remove_tlb_entry(tlb, ptep, address);	\
 	} while (0)
+#endif
 
 /**
  * tlb_remove_pmd_tlb_entry - remember a pmd mapping for later tlb invalidation