Message ID: 20210512014231.466aff04@xhacker (mailing list archive)
State: New, archived
Series: riscv: Optimize switch_mm by passing "cpu" to flush_icache_deferred()
On Tue, 11 May 2021 10:42:31 PDT (-0700), jszhang3@mail.ustc.edu.cn wrote:
> From: Jisheng Zhang <jszhang@kernel.org>
>
> Directly pass the cpu to flush_icache_deferred() rather than calling
> smp_processor_id() again.
>
> Here are some performance numbers:
>
> With a run of hackbench 30 times on a single-core riscv64 QEMU instance
> with 1GB memory:
>
> without this patch: mean 36.934
> with this patch:    mean 36.104 (improved by 2.24%)

I don't really put any stock in QEMU performance numbers for this sort of
thing, but for something like this where we're just skipping an expensive
call I don't see a reason to even need performance numbers: we can just
consider it an obvious optimization.

> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> ---
>  arch/riscv/mm/context.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
> index 68aa312fc352..6d445f2888ec 100644
> --- a/arch/riscv/mm/context.c
> +++ b/arch/riscv/mm/context.c
> @@ -281,10 +281,9 @@ static inline void set_mm(struct mm_struct *mm, unsigned int cpu)
>   * actually performs that local instruction cache flush, which implicitly only
>   * refers to the current hart.
>   */
> -static inline void flush_icache_deferred(struct mm_struct *mm)
> +static inline void flush_icache_deferred(struct mm_struct *mm, unsigned int cpu)
>  {
>  #ifdef CONFIG_SMP
> -	unsigned int cpu = smp_processor_id();
>  	cpumask_t *mask = &mm->context.icache_stale_mask;
>
>  	if (cpumask_test_cpu(cpu, mask)) {

This proceeds to perform only a local icache flush, which means it will
break if any caller passes a CPU number other than the one it is currently
running on. The large comment at the top alludes to that behavior, but I
went ahead and added a line to make it more explicit.

> @@ -320,5 +319,5 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>
>  	set_mm(next, cpu);
>
> -	flush_icache_deferred(next);
> +	flush_icache_deferred(next, cpu);
>  }

Thanks, this is on for-next.