Message ID | 20190219103233.148854086@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Series | generic mmu_gather patches |
On Tue, Feb 19, 2019 at 11:31:53AM +0100, Peter Zijlstra wrote:
> When an architecture does not have (an efficient) flush_tlb_range(),
> but instead always uses full TLB invalidates, the current generic
> tlb_flush() is sub-optimal, for it will generate extra flushes in
> order to keep the range small.
>
> But if we cannot do range flushes, that is a moot concern. Optionally
> provide this simplified default.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  include/asm-generic/tlb.h | 41 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 40 insertions(+), 1 deletion(-)
>
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -114,7 +114,8 @@
>   * returns the smallest TLB entry size unmapped in this range.
>   *
>   * If an architecture does not provide tlb_flush() a default implementation
> - * based on flush_tlb_range() will be used.
> + * based on flush_tlb_range() will be used, unless MMU_GATHER_NO_RANGE is
> + * specified, in which case we'll default to flush_tlb_mm().
>   *
>   * Additionally there are a few opt-in features:
>   *
> @@ -140,6 +141,9 @@
>   *  the page-table pages. Required if you use HAVE_RCU_TABLE_FREE and your
>   *  architecture uses the Linux page-tables natively.
>   *
> + * MMU_GATHER_NO_RANGE
> + *
> + *  Use this if your architecture lacks an efficient flush_tlb_range().
>   */
>  #define HAVE_GENERIC_MMU_GATHER
>
> @@ -302,12 +306,45 @@ static inline void __tlb_reset_range(str
>   */
>  }
>
> +#ifdef CONFIG_MMU_GATHER_NO_RANGE
> +
> +#if defined(tlb_flush) || defined(tlb_start_vma) || defined(tlb_end_vma)
> +#error MMU_GATHER_NO_RANGE relies on default tlb_flush(), tlb_start_vma() and tlb_end_vma()
> +#endif
> +
> +/*
> + * When an architecture does not have efficient means of range flushing TLBs
> + * there is no point in doing intermediate flushes on tlb_end_vma() to keep the
> + * range small. We equally don't have to worry about page granularity or other
> + * things.
> + *
> + * All we need to do is issue a full flush for any !0 range.
> + */
> +static inline void tlb_flush(struct mmu_gather *tlb)
> +{
> +	if (tlb->end)
> +		flush_tlb_mm(tlb->mm);
> +}

I guess another way we could handle these architectures is by unconditionally
resetting tlb->fullmm to 1, but this works too.

Acked-by: Will Deacon <will.deacon@arm.com>

Will
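For illustration, Will's alternative would amount to something like the sketch
below, against tlb_gather_mmu() as it looked in mm/mmu_gather.c at the time.
This is a hypothetical rendering of his one-line suggestion, not code from the
series:

	/*
	 * Hypothetical sketch of Will's alternative: force full-mm
	 * semantics at gather time when the architecture cannot flush by
	 * range, so the eventual flush always degenerates to
	 * flush_tlb_mm(). Not part of the posted patch.
	 */
	void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
			    unsigned long start, unsigned long end)
	{
		arch_tlb_gather_mmu(tlb, mm, start, end);
		if (IS_ENABLED(CONFIG_MMU_GATHER_NO_RANGE))
			tlb->fullmm = 1; /* behave as if tearing down the mm */
		inc_tlb_flush_pending(tlb->mm);
	}

One reason to prefer the dedicated tlb_flush() default may be that fullmm
normally signals the whole address space is going away, and other code keys
optimisations off that, so overloading it could have unintended side effects.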
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -114,7 +114,8 @@
  * returns the smallest TLB entry size unmapped in this range.
  *
  * If an architecture does not provide tlb_flush() a default implementation
- * based on flush_tlb_range() will be used.
+ * based on flush_tlb_range() will be used, unless MMU_GATHER_NO_RANGE is
+ * specified, in which case we'll default to flush_tlb_mm().
  *
  * Additionally there are a few opt-in features:
  *
@@ -140,6 +141,9 @@
  *  the page-table pages. Required if you use HAVE_RCU_TABLE_FREE and your
  *  architecture uses the Linux page-tables natively.
  *
+ * MMU_GATHER_NO_RANGE
+ *
+ *  Use this if your architecture lacks an efficient flush_tlb_range().
  */
 #define HAVE_GENERIC_MMU_GATHER
 
@@ -302,12 +306,45 @@ static inline void __tlb_reset_range(str
  */
 }
 
+#ifdef CONFIG_MMU_GATHER_NO_RANGE
+
+#if defined(tlb_flush) || defined(tlb_start_vma) || defined(tlb_end_vma)
+#error MMU_GATHER_NO_RANGE relies on default tlb_flush(), tlb_start_vma() and tlb_end_vma()
+#endif
+
+/*
+ * When an architecture does not have efficient means of range flushing TLBs
+ * there is no point in doing intermediate flushes on tlb_end_vma() to keep the
+ * range small. We equally don't have to worry about page granularity or other
+ * things.
+ *
+ * All we need to do is issue a full flush for any !0 range.
+ */
+static inline void tlb_flush(struct mmu_gather *tlb)
+{
+	if (tlb->end)
+		flush_tlb_mm(tlb->mm);
+}
+
+static inline void
+tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma) { }
+
+#define tlb_end_vma tlb_end_vma
+static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma) { }
+
+#else /* CONFIG_MMU_GATHER_NO_RANGE */
+
 #ifndef tlb_flush
 
 #if defined(tlb_start_vma) || defined(tlb_end_vma)
 #error Default tlb_flush() relies on default tlb_start_vma() and tlb_end_vma()
 #endif
 
+/*
+ * When an architecture does not provide its own tlb_flush() implementation
+ * but does have a reasonably efficient flush_tlb_range() implementation,
+ * use that.
+ */
 static inline void tlb_flush(struct mmu_gather *tlb)
 {
 	if (tlb->fullmm || tlb->need_flush_all) {
@@ -348,6 +385,8 @@ tlb_update_vma_flags(struct mmu_gather *
 
 #endif
 
+#endif /* CONFIG_MMU_GATHER_NO_RANGE */
+
 static inline void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
 {
 	if (!tlb->end)
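To make the opt-in concrete, a minimal sketch of the architecture side
follows. The "foo" architecture, its header path, and its Kconfig select are
invented for illustration; the point is that an architecture selecting
MMU_GATHER_NO_RANGE defines none of the hooks itself:

	/*
	 * arch/foo/include/asm/tlb.h -- hypothetical architecture that does
	 * "select MMU_GATHER_NO_RANGE" in its Kconfig. It must not define
	 * tlb_flush(), tlb_start_vma() or tlb_end_vma() as macros (the
	 * #error above enforces this); the generic header then supplies
	 * the flush_tlb_mm() based tlb_flush() and the empty tlb_end_vma().
	 */
	#ifndef __ASM_FOO_TLB_H
	#define __ASM_FOO_TLB_H

	#include <asm-generic/tlb.h>

	#endif /* __ASM_FOO_TLB_H */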
When an architecture does not have (an efficient) flush_tlb_range(),
but instead always uses full TLB invalidates, the current generic
tlb_flush() is sub-optimal, for it will generate extra flushes in
order to keep the range small.

But if we cannot do range flushes, that is a moot concern. Optionally
provide this simplified default.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/asm-generic/tlb.h | 41 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)
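For context on the "extra flushes" mentioned above: the generic tlb_end_vma()
flushes at every VMA boundary so the tracked range does not grow across the
gaps between consecutive VMAs. Roughly paraphrased from the same header (not
part of this diff):

	/*
	 * Paraphrased sketch of the default behaviour this option bypasses:
	 * flush and reset the tracked range at each VMA boundary. On an
	 * architecture whose only primitive is a full invalidate, every one
	 * of these intermediate flushes is a full TLB invalidate, hence a
	 * single deferred flush_tlb_mm() wins.
	 */
	static inline void
	tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
	{
		if (tlb->fullmm)
			return;

		tlb_flush_mmu_tlbonly(tlb);
	}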