Message ID | 20230920080133.944717-2-oliver.upton@linux.dev (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: arm64: Address soft lockups due to I-cache CMOs |
On 9/20/23 18:01, Oliver Upton wrote:
> Perhaps unsurprisingly, I-cache invalidations suffer from performance
> issues similar to TLB invalidations on certain systems. TLB and I-cache
> maintenance all result in DVM on the mesh, which is where the real
> bottleneck lies.
>
> Rename the heuristic to point the finger at DVM, such that it may be
> reused for limiting I-cache invalidations.
>
> Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
> ---
>  arch/arm64/include/asm/tlbflush.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>

Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Gavin Shan <gshan@redhat.com>

> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index b149cf9f91bc..3431d37e5054 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -333,7 +333,7 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>   * This is meant to avoid soft lock-ups on large TLB flushing ranges and not
>   * necessarily a performance improvement.
>   */
> -#define MAX_TLBI_OPS PTRS_PER_PTE
> +#define MAX_DVM_OPS PTRS_PER_PTE
>
>  /*
>   * __flush_tlb_range_op - Perform TLBI operation upon a range
> @@ -413,12 +413,12 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
>
>      /*
>       * When not uses TLB range ops, we can handle up to
> -     * (MAX_TLBI_OPS - 1) pages;
> +     * (MAX_DVM_OPS - 1) pages;
>       * When uses TLB range ops, we can handle up to
>       * (MAX_TLBI_RANGE_PAGES - 1) pages.
>       */
>      if ((!system_supports_tlb_range() &&
> -         (end - start) >= (MAX_TLBI_OPS * stride)) ||
> +         (end - start) >= (MAX_DVM_OPS * stride)) ||
>          pages >= MAX_TLBI_RANGE_PAGES) {
>          flush_tlb_mm(vma->vm_mm);
>          return;
> @@ -451,7 +451,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
>  {
>      unsigned long addr;
>
> -    if ((end - start) > (MAX_TLBI_OPS * PAGE_SIZE)) {
> +    if ((end - start) > (MAX_DVM_OPS * PAGE_SIZE)) {
>          flush_tlb_all();
>          return;
>      }
On Wed, Sep 20, 2023 at 08:01:32AM +0000, Oliver Upton wrote:
> Perhaps unsurprisingly, I-cache invalidations suffer from performance
> issues similar to TLB invalidations on certain systems. TLB and I-cache
> maintenance all result in DVM on the mesh, which is where the real
> bottleneck lies.
>
> Rename the heuristic to point the finger at DVM, such that it may be
> reused for limiting I-cache invalidations.
>
> Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
> ---
>  arch/arm64/include/asm/tlbflush.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index b149cf9f91bc..3431d37e5054 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -333,7 +333,7 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>   * This is meant to avoid soft lock-ups on large TLB flushing ranges and not
>   * necessarily a performance improvement.
>   */
> -#define MAX_TLBI_OPS PTRS_PER_PTE
> +#define MAX_DVM_OPS PTRS_PER_PTE
>
>  /*
>   * __flush_tlb_range_op - Perform TLBI operation upon a range
> @@ -413,12 +413,12 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
>
>      /*
>       * When not uses TLB range ops, we can handle up to
> -     * (MAX_TLBI_OPS - 1) pages;
> +     * (MAX_DVM_OPS - 1) pages;
>       * When uses TLB range ops, we can handle up to
>       * (MAX_TLBI_RANGE_PAGES - 1) pages.
>       */
>      if ((!system_supports_tlb_range() &&
> -         (end - start) >= (MAX_TLBI_OPS * stride)) ||
> +         (end - start) >= (MAX_DVM_OPS * stride)) ||
>          pages >= MAX_TLBI_RANGE_PAGES) {
>          flush_tlb_mm(vma->vm_mm);
>          return;
> @@ -451,7 +451,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
>  {
>      unsigned long addr;
>
> -    if ((end - start) > (MAX_TLBI_OPS * PAGE_SIZE)) {
> +    if ((end - start) > (MAX_DVM_OPS * PAGE_SIZE)) {
>          flush_tlb_all();
>          return;
>      }
> --
> 2.42.0.459.ge4e396fd5e-goog

Acked-by: Will Deacon <will@kernel.org>

Will
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index b149cf9f91bc..3431d37e5054 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -333,7 +333,7 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
  * This is meant to avoid soft lock-ups on large TLB flushing ranges and not
  * necessarily a performance improvement.
  */
-#define MAX_TLBI_OPS PTRS_PER_PTE
+#define MAX_DVM_OPS PTRS_PER_PTE

 /*
  * __flush_tlb_range_op - Perform TLBI operation upon a range
@@ -413,12 +413,12 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,

     /*
      * When not uses TLB range ops, we can handle up to
-     * (MAX_TLBI_OPS - 1) pages;
+     * (MAX_DVM_OPS - 1) pages;
      * When uses TLB range ops, we can handle up to
      * (MAX_TLBI_RANGE_PAGES - 1) pages.
      */
     if ((!system_supports_tlb_range() &&
-         (end - start) >= (MAX_TLBI_OPS * stride)) ||
+         (end - start) >= (MAX_DVM_OPS * stride)) ||
         pages >= MAX_TLBI_RANGE_PAGES) {
         flush_tlb_mm(vma->vm_mm);
         return;
@@ -451,7 +451,7 @@ static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end
 {
     unsigned long addr;

-    if ((end - start) > (MAX_TLBI_OPS * PAGE_SIZE)) {
+    if ((end - start) > (MAX_DVM_OPS * PAGE_SIZE)) {
         flush_tlb_all();
         return;
     }
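For a rough sense of scale (this arithmetic assumes a 4 KiB page configuration, where PTRS_PER_PTE is 512; other page sizes give different values), the flush_tlb_kernel_range() fallback above kicks in once a range would need more than 512 per-page operations:

    MAX_DVM_OPS * PAGE_SIZE = 512 * 4096 bytes = 2 MiB

so anything larger than about 2 MiB of kernel mappings is handled with a single flush_tlb_all() instead of hundreds of individual TLBI instructions, each of which generates DVM traffic.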
Perhaps unsurprisingly, I-cache invalidations suffer from performance
issues similar to TLB invalidations on certain systems. TLB and I-cache
maintenance all result in DVM on the mesh, which is where the real
bottleneck lies.

Rename the heuristic to point the finger at DVM, such that it may be
reused for limiting I-cache invalidations.

Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
---
 arch/arm64/include/asm/tlbflush.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
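To illustrate the intended reuse, here is a sketch only (the helper name and placement are hypothetical, not the actual follow-up patch): an I-cache invalidation path could apply the same bound and fall back to a full invalidate to the point of unification once a range would need more than MAX_DVM_OPS by-line operations, using the existing arm64 icache_inval_pou()/icache_inval_all_pou() helpers.

    /*
     * Illustrative sketch only: bound by-line I-cache invalidation with the
     * renamed MAX_DVM_OPS limit, falling back to a full I-cache invalidate
     * to the point of unification when the range is too large to walk
     * without risking a soft lockup. The helper name is hypothetical.
     */
    static inline void invalidate_icache_range_bounded(void *va, size_t size)
    {
        if (size > MAX_DVM_OPS * PAGE_SIZE) {
            /* Too many DVM-generating operations: invalidate everything. */
            icache_inval_all_pou();
            return;
        }

        /* Small enough: invalidate the range by cache line. */
        icache_inval_pou((unsigned long)va, (unsigned long)va + size);
    }

Where exactly the series applies this cap is outside the scope of this patch; the snippet only shows how MAX_DVM_OPS generalizes beyond TLBI.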