Message ID | 20240229232211.161961-4-samuel.holland@sifive.com (mailing list archive) |
---|---|
State | New |
Series | riscv: ASID-related and UP-related TLB flush enhancements |
On Thu, Feb 29, 2024, at 6:21 PM, Samuel Holland wrote:
> An IPI backend is always required in an SMP configuration, but an SBI
> implementation is not. For example, SBI will be unavailable when the
> kernel runs in M mode.
>
> Generally, IPIs are assumed to be faster than SBI calls due to the SBI
> context switch overhead. However, when SBI is used as the IPI backend,
> then the context switch cost must be paid anyway, and performing the
> cache/TLB flush directly in the SBI implementation is more efficient
> than inserting an interrupt to the kernel. This is the only scenario
> where riscv_ipi_set_virq_range()'s use_for_rfence parameter is false.
>
> Thus, it makes sense for remote fences to use IPIs by default, and make
> the SBI remote fence extension the special case.

The historical intention of providing SBI calls for remote fences is that
it abstracts over hardware that allows for performing remote fences
_without an IPI at all_. The TH1520 is an example of such hardware, since
it can (at least according to the documentation) perform broadcast fence,
fence.i, and sfence.vma operations using bits in the mhint register.

T-Head's public opensbi repository doesn't actually use this feature, and
in general SBI remote fences come from a much more optimistic time about
how much we can successfully hide from supervisor software. But I don't
think we can generalize that an IPI will always do less work than an SBI
remote fence.

-s

> sbi_ipi_init() already checks riscv_ipi_have_virq_range(), so it only
> calls riscv_ipi_set_virq_range() when no other IPI device is available.
> So we can move the static key and drop the use_for_rfence parameter.
>
> Furthermore, the static branch only makes sense when CONFIG_RISCV_SBI is
> enabled. Otherwise, IPIs must be used. Add a fallback definition of
> riscv_use_sbi_for_rfence() which handles this case and removes the need
> to check CONFIG_RISCV_SBI elsewhere, such as in cacheflush.c.
>
> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> ---
>
> Changes in v5:
>  - Also switch to riscv_use_sbi_for_rfence() in asm/pgalloc.h
>
> Changes in v4:
>  - New patch for v4
>
>  arch/riscv/include/asm/pgalloc.h  |  7 ++++---
>  arch/riscv/include/asm/sbi.h      |  4 ++++
>  arch/riscv/include/asm/smp.h      | 15 ++-------------
>  arch/riscv/kernel/sbi-ipi.c       | 11 ++++++++++-
>  arch/riscv/kernel/smp.c           | 11 +----------
>  arch/riscv/mm/cacheflush.c        |  5 ++---
>  arch/riscv/mm/tlbflush.c          | 31 ++++++++++++++-----------------
>  drivers/clocksource/timer-clint.c |  2 +-
>  8 files changed, 38 insertions(+), 48 deletions(-)
>
> diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
> index 87468f67951a..6578054977ef 100644
> --- a/arch/riscv/include/asm/pgalloc.h
> +++ b/arch/riscv/include/asm/pgalloc.h
> @@ -8,6 +8,7 @@
>  #define _ASM_RISCV_PGALLOC_H
>
>  #include <linux/mm.h>
> +#include <asm/sbi.h>
>  #include <asm/tlb.h>
>
>  #ifdef CONFIG_MMU
> @@ -90,10 +91,10 @@ static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
>
>  static inline void riscv_tlb_remove_ptdesc(struct mmu_gather *tlb, void *pt)
>  {
> -	if (riscv_use_ipi_for_rfence())
> -		tlb_remove_page_ptdesc(tlb, pt);
> -	else
> +	if (riscv_use_sbi_for_rfence())
>  		tlb_remove_ptdesc(tlb, pt);
> +	else
> +		tlb_remove_page_ptdesc(tlb, pt);
>  }
>
>  #define pud_free pud_free
> diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
> index 6e68f8dff76b..ea84392ca9d7 100644
> --- a/arch/riscv/include/asm/sbi.h
> +++ b/arch/riscv/include/asm/sbi.h
> @@ -375,8 +375,12 @@ unsigned long riscv_cached_marchid(unsigned int cpu_id);
>  unsigned long riscv_cached_mimpid(unsigned int cpu_id);
>
>  #if IS_ENABLED(CONFIG_SMP) && IS_ENABLED(CONFIG_RISCV_SBI)
> +DECLARE_STATIC_KEY_FALSE(riscv_sbi_for_rfence);
> +#define riscv_use_sbi_for_rfence() \
> +	static_branch_unlikely(&riscv_sbi_for_rfence)
>  void sbi_ipi_init(void);
>  #else
> +static inline bool riscv_use_sbi_for_rfence(void) { return false; }
>  static inline void sbi_ipi_init(void) { }
>  #endif
>
> diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
> index 0d555847cde6..7ac80e9f2288 100644
> --- a/arch/riscv/include/asm/smp.h
> +++ b/arch/riscv/include/asm/smp.h
> @@ -49,12 +49,7 @@ void riscv_ipi_disable(void);
>  bool riscv_ipi_have_virq_range(void);
>
>  /* Set the IPI interrupt numbers for arch (called by irqchip drivers) */
> -void riscv_ipi_set_virq_range(int virq, int nr, bool use_for_rfence);
> -
> -/* Check if we can use IPIs for remote FENCEs */
> -DECLARE_STATIC_KEY_FALSE(riscv_ipi_for_rfence);
> -#define riscv_use_ipi_for_rfence() \
> -	static_branch_unlikely(&riscv_ipi_for_rfence)
> +void riscv_ipi_set_virq_range(int virq, int nr);
>
>  /* Check other CPUs stop or not */
>  bool smp_crash_stop_failed(void);
> @@ -104,16 +99,10 @@ static inline bool riscv_ipi_have_virq_range(void)
>  	return false;
>  }
>
> -static inline void riscv_ipi_set_virq_range(int virq, int nr,
> -					    bool use_for_rfence)
> +static inline void riscv_ipi_set_virq_range(int virq, int nr)
>  {
>  }
>
> -static inline bool riscv_use_ipi_for_rfence(void)
> -{
> -	return false;
> -}
> -
>  #endif /* CONFIG_SMP */
>
>  #if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
> diff --git a/arch/riscv/kernel/sbi-ipi.c b/arch/riscv/kernel/sbi-ipi.c
> index a4559695ce62..1026e22955cc 100644
> --- a/arch/riscv/kernel/sbi-ipi.c
> +++ b/arch/riscv/kernel/sbi-ipi.c
> @@ -13,6 +13,9 @@
>  #include <linux/irqdomain.h>
>  #include <asm/sbi.h>
>
> +DEFINE_STATIC_KEY_FALSE(riscv_sbi_for_rfence);
> +EXPORT_SYMBOL_GPL(riscv_sbi_for_rfence);
> +
>  static int sbi_ipi_virq;
>
>  static void sbi_ipi_handle(struct irq_desc *desc)
> @@ -72,6 +75,12 @@ void __init sbi_ipi_init(void)
>  			  "irqchip/sbi-ipi:starting",
>  			  sbi_ipi_starting_cpu, NULL);
>
> -	riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, false);
> +	riscv_ipi_set_virq_range(virq, BITS_PER_BYTE);
>  	pr_info("providing IPIs using SBI IPI extension\n");
> +
> +	/*
> +	 * Use the SBI remote fence extension to avoid
> +	 * the extra context switch needed to handle IPIs.
> +	 */
> +	static_branch_enable(&riscv_sbi_for_rfence);
>  }
> diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
> index 45dd4035416e..8e6eb64459af 100644
> --- a/arch/riscv/kernel/smp.c
> +++ b/arch/riscv/kernel/smp.c
> @@ -171,10 +171,7 @@ bool riscv_ipi_have_virq_range(void)
>  	return (ipi_virq_base) ? true : false;
>  }
>
> -DEFINE_STATIC_KEY_FALSE(riscv_ipi_for_rfence);
> -EXPORT_SYMBOL_GPL(riscv_ipi_for_rfence);
> -
> -void riscv_ipi_set_virq_range(int virq, int nr, bool use_for_rfence)
> +void riscv_ipi_set_virq_range(int virq, int nr)
>  {
>  	int i, err;
>
> @@ -197,12 +194,6 @@ void riscv_ipi_set_virq_range(int virq, int nr, bool use_for_rfence)
>
>  	/* Enabled IPIs for boot CPU immediately */
>  	riscv_ipi_enable();
> -
> -	/* Update RFENCE static key */
> -	if (use_for_rfence)
> -		static_branch_enable(&riscv_ipi_for_rfence);
> -	else
> -		static_branch_disable(&riscv_ipi_for_rfence);
>  }
>
>  static const char * const ipi_names[] = {
> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> index 55a34f2020a8..47c485bc7df0 100644
> --- a/arch/riscv/mm/cacheflush.c
> +++ b/arch/riscv/mm/cacheflush.c
> @@ -21,7 +21,7 @@ void flush_icache_all(void)
>  {
>  	local_flush_icache_all();
>
> -	if (IS_ENABLED(CONFIG_RISCV_SBI) && !riscv_use_ipi_for_rfence())
> +	if (riscv_use_sbi_for_rfence())
>  		sbi_remote_fence_i(NULL);
>  	else
>  		on_each_cpu(ipi_remote_fence_i, NULL, 1);
> @@ -69,8 +69,7 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
>  		 * with flush_icache_deferred().
>  		 */
>  		smp_mb();
> -	} else if (IS_ENABLED(CONFIG_RISCV_SBI) &&
> -		   !riscv_use_ipi_for_rfence()) {
> +	} else if (riscv_use_sbi_for_rfence()) {
>  		sbi_remote_fence_i(&others);
>  	} else {
>  		on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> index 8d12b26f5ac3..0373661bd1c4 100644
> --- a/arch/riscv/mm/tlbflush.c
> +++ b/arch/riscv/mm/tlbflush.c
> @@ -78,10 +78,10 @@ static void __ipi_flush_tlb_all(void *info)
>
>  void flush_tlb_all(void)
>  {
> -	if (riscv_use_ipi_for_rfence())
> -		on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
> -	else
> +	if (riscv_use_sbi_for_rfence())
>  		sbi_remote_sfence_vma_asid(NULL, 0, FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID);
> +	else
> +		on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
>  }
>
>  struct flush_tlb_range_data {
> @@ -102,7 +102,6 @@ static void __flush_tlb_range(struct cpumask *cmask, unsigned long asid,
>  			      unsigned long start, unsigned long size,
>  			      unsigned long stride)
>  {
> -	struct flush_tlb_range_data ftd;
>  	bool broadcast;
>
>  	if (cpumask_empty(cmask))
> @@ -118,20 +117,18 @@ static void __flush_tlb_range(struct cpumask *cmask, unsigned long asid,
>  		broadcast = true;
>  	}
>
> -	if (broadcast) {
> -		if (riscv_use_ipi_for_rfence()) {
> -			ftd.asid = asid;
> -			ftd.start = start;
> -			ftd.size = size;
> -			ftd.stride = stride;
> -			on_each_cpu_mask(cmask,
> -					 __ipi_flush_tlb_range_asid,
> -					 &ftd, 1);
> -		} else
> -			sbi_remote_sfence_vma_asid(cmask,
> -						   start, size, asid);
> -	} else {
> +	if (!broadcast) {
>  		local_flush_tlb_range_asid(start, size, stride, asid);
> +	} else if (riscv_use_sbi_for_rfence()) {
> +		sbi_remote_sfence_vma_asid(cmask, start, size, asid);
> +	} else {
> +		struct flush_tlb_range_data ftd;
> +
> +		ftd.asid = asid;
> +		ftd.start = start;
> +		ftd.size = size;
> +		ftd.stride = stride;
> +		on_each_cpu_mask(cmask, __ipi_flush_tlb_range_asid, &ftd, 1);
>  	}
>
>  	if (cmask != cpu_online_mask)
> diff --git a/drivers/clocksource/timer-clint.c b/drivers/clocksource/timer-clint.c
> index 09fd292eb83d..0bdd9d7ec545 100644
> --- a/drivers/clocksource/timer-clint.c
> +++ b/drivers/clocksource/timer-clint.c
> @@ -251,7 +251,7 @@ static int __init clint_timer_init_dt(struct device_node *np)
>  	}
>
>  	irq_set_chained_handler(clint_ipi_irq, clint_ipi_interrupt);
> -	riscv_ipi_set_virq_range(rc, BITS_PER_BYTE, true);
> +	riscv_ipi_set_virq_range(rc, BITS_PER_BYTE);
>  	clint_clear_ipi();
>  #endif
>
> --
> 2.43.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
On Mon, Mar 11, 2024 at 8:37 AM Stefan O'Rear <sorear@fastmail.com> wrote:
>
> On Thu, Feb 29, 2024, at 6:21 PM, Samuel Holland wrote:
> > An IPI backend is always required in an SMP configuration, but an SBI
> > implementation is not. For example, SBI will be unavailable when the
> > kernel runs in M mode.
> >
> > Generally, IPIs are assumed to be faster than SBI calls due to the SBI
> > context switch overhead. However, when SBI is used as the IPI backend,
> > then the context switch cost must be paid anyway, and performing the
> > cache/TLB flush directly in the SBI implementation is more efficient
> > than inserting an interrupt to the kernel. This is the only scenario
> > where riscv_ipi_set_virq_range()'s use_for_rfence parameter is false.
> >
> > Thus, it makes sense for remote fences to use IPIs by default, and make
> > the SBI remote fence extension the special case.
>
> The historical intention of providing SBI calls for remote fences is that
> it abstracts over hardware that allows for performing remote fences
> _without an IPI at all_. The TH1520 is an example of such hardware, since
> it can (at least according to the documentation) perform broadcast fence,
> fence.i, and sfence.vma operations using bits in the mhint register.
>
> T-Head's public opensbi repository doesn't actually use this feature, and
> in general SBI remote fences come from a much more optimistic time about
> how much we can successfully hide from supervisor software. But I don't
> think we can generalize that an IPI will always do less work than a SBI
> remote fence.

On platforms where SBI is the only way of injecting IPIs in S-mode, the
SBI based remote fences are actually much faster. This is because on
such platforms injecting an IPI from a HART to a set of HARTs will
require multiple SBI calls which can be directly replaced by one (or
few) SBI remote fence SBI calls.
Most of the current platforms still have M-mode CLINT and depend on
SBI IPI for S-mode IPI injection so we should not slow down remote
fences for these platforms.

Until all platforms have moved to RISC-V AIA (which supports S-mode
IPIs), we should keep this boot-time choice of SBI RFENCEs versus
direct S-mode IPIs.

IMO, this patch is ahead of its time.

Regards,
Anup
On Mon, Mar 11, 2024 at 9:34 AM Anup Patel <apatel@ventanamicro.com> wrote:
>
> On Mon, Mar 11, 2024 at 8:37 AM Stefan O'Rear <sorear@fastmail.com> wrote:
> >
> > On Thu, Feb 29, 2024, at 6:21 PM, Samuel Holland wrote:
> > > An IPI backend is always required in an SMP configuration, but an SBI
> > > implementation is not. For example, SBI will be unavailable when the
> > > kernel runs in M mode.
> > >
> > > Generally, IPIs are assumed to be faster than SBI calls due to the SBI
> > > context switch overhead. However, when SBI is used as the IPI backend,
> > > then the context switch cost must be paid anyway, and performing the
> > > cache/TLB flush directly in the SBI implementation is more efficient
> > > than inserting an interrupt to the kernel. This is the only scenario
> > > where riscv_ipi_set_virq_range()'s use_for_rfence parameter is false.
> > >
> > > Thus, it makes sense for remote fences to use IPIs by default, and make
> > > the SBI remote fence extension the special case.
> >
> > The historical intention of providing SBI calls for remote fences is that
> > it abstracts over hardware that allows for performing remote fences
> > _without an IPI at all_. The TH1520 is an example of such hardware, since
> > it can (at least according to the documentation) perform broadcast fence,
> > fence.i, and sfence.vma operations using bits in the mhint register.
> >
> > T-Head's public opensbi repository doesn't actually use this feature, and
> > in general SBI remote fences come from a much more optimistic time about
> > how much we can successfully hide from supervisor software. But I don't
> > think we can generalize that an IPI will always do less work than a SBI
> > remote fence.
>
> On platforms where SBI is the only way of injecting IPIs in S-mode, the
> SBI based remote fences are actually much faster.
> This is because on
> such platforms injecting an IPI from a HART to a set of HARTs will
> require multiple SBI calls which can be directly replaced by one (or
> few) SBI remote fence SBI calls.
>
> Most of the current platforms still have M-mode CLINT and depend on
> SBI IPI for S-mode IPI injection so we should not slow down remote
> fences for these platforms.
>
> Until all platforms have moved to RISC-V AIA (which supports S-mode
> IPIs), we should keep this boot-time choice of SBI RFENCEs versus
> direct S-mode IPIs.
>
> IMO, this patch is ahead of its time.

I think this patch is fine since it replaces the static key
riscv_ipi_for_rfence with riscv_sbi_for_rfence. I got confused by
the removal of riscv_ipi_for_rfence.

Reviewed-by: Anup Patel <anup@brainfault.org>

Regards,
Anup
Hi Stefan, Anup,

On 2024-03-10 11:12 PM, Anup Patel wrote:
> On Mon, Mar 11, 2024 at 9:34 AM Anup Patel <apatel@ventanamicro.com> wrote:
>>
>> On Mon, Mar 11, 2024 at 8:37 AM Stefan O'Rear <sorear@fastmail.com> wrote:
>>>
>>> On Thu, Feb 29, 2024, at 6:21 PM, Samuel Holland wrote:
>>>> An IPI backend is always required in an SMP configuration, but an SBI
>>>> implementation is not. For example, SBI will be unavailable when the
>>>> kernel runs in M mode.
>>>>
>>>> Generally, IPIs are assumed to be faster than SBI calls due to the SBI
>>>> context switch overhead. However, when SBI is used as the IPI backend,
>>>> then the context switch cost must be paid anyway, and performing the
>>>> cache/TLB flush directly in the SBI implementation is more efficient
>>>> than inserting an interrupt to the kernel. This is the only scenario
>>>> where riscv_ipi_set_virq_range()'s use_for_rfence parameter is false.
>>>>
>>>> Thus, it makes sense for remote fences to use IPIs by default, and make
>>>> the SBI remote fence extension the special case.
>>>
>>> The historical intention of providing SBI calls for remote fences is that
>>> it abstracts over hardware that allows for performing remote fences
>>> _without an IPI at all_. The TH1520 is an example of such hardware, since
>>> it can (at least according to the documentation) perform broadcast fence,
>>> fence.i, and sfence.vma operations using bits in the mhint register.

Platform-specific code can call static_branch_enable(&riscv_sbi_for_rfence)
if it somehow knows SBI remote fences are faster. That option is still
available.

>>> T-Head's public opensbi repository doesn't actually use this feature, and
>>> in general SBI remote fences come from a much more optimistic time about
>>> how much we can successfully hide from supervisor software. But I don't
>>> think we can generalize that an IPI will always do less work than a SBI
>>> remote fence.
Agreed, and as Anup pointed out below, the case where IPIs go through SBI
is an obvious case where IPIs will do more work than SBI remote fences.

However, there is not currently any way to detect platforms where SBI
remote fences have special optimizations. So, in the absence of extra
information, the kernel assumes SBI remote fences have the performance
characteristics implied by using only standard RISC-V privileged ISA
features.

The status quo is that in all cases where IPIs can be delivered to the
kernel's privilege mode without firmware assistance, remote fences are
sent via IPI. This patch is just recognizing that.

>> On platforms where SBI is the only way of injecting IPIs in S-mode, the
>> SBI based remote fences are actually much faster. This is because on
>> such platforms injecting an IPI from a HART to a set of HARTs will
>> require multiple SBI calls which can be directly replaced by one (or
>> few) SBI remote fence SBI calls.
>>
>> Most of the current platforms still have M-mode CLINT and depend on
>> SBI IPI for S-mode IPI injection so we should not slow down remote
>> fences for these platforms.

Just to be clear, this patch does not change the behavior on any existing
platform. All it does is simplify the logic the kernel uses to choose that
behavior at boot time. In fact, it makes using something like the T-HEAD
instructions easier, because it decouples the remote fence static branch
from irqchip driver registration. And it also paves the way for supporting
an SBI implementation that omits the remote fence extension, if needed.

Like I mentioned in the commit message, the goal was to make IPIs the
"else" case, since SMP simply won't work without IPIs, so they must
exist. And any optimizations can be added on top of that.

>> Until all platforms have moved to RISC-V AIA (which supports S-mode
>> IPIs), we should keep this boot-time choice of SBI RFENCEs versus
>> direct S-mode IPIs.
>>
>> IMO, this patch is ahead of its time.
>
> I think this patch is fine since it replaces the static key
> riscv_ipi_for_rfence with riscv_sbi_for_rfence.
>
> I got confused by the removal of riscv_ipi_for_rfence.

Thanks for taking the time to re-review.

Regards,
Samuel
diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index 87468f67951a..6578054977ef 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -8,6 +8,7 @@
 #define _ASM_RISCV_PGALLOC_H

 #include <linux/mm.h>
+#include <asm/sbi.h>
 #include <asm/tlb.h>

 #ifdef CONFIG_MMU
@@ -90,10 +91,10 @@ static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)

 static inline void riscv_tlb_remove_ptdesc(struct mmu_gather *tlb, void *pt)
 {
-	if (riscv_use_ipi_for_rfence())
-		tlb_remove_page_ptdesc(tlb, pt);
-	else
+	if (riscv_use_sbi_for_rfence())
 		tlb_remove_ptdesc(tlb, pt);
+	else
+		tlb_remove_page_ptdesc(tlb, pt);
 }

 #define pud_free pud_free
diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index 6e68f8dff76b..ea84392ca9d7 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -375,8 +375,12 @@ unsigned long riscv_cached_marchid(unsigned int cpu_id);
 unsigned long riscv_cached_mimpid(unsigned int cpu_id);

 #if IS_ENABLED(CONFIG_SMP) && IS_ENABLED(CONFIG_RISCV_SBI)
+DECLARE_STATIC_KEY_FALSE(riscv_sbi_for_rfence);
+#define riscv_use_sbi_for_rfence() \
+	static_branch_unlikely(&riscv_sbi_for_rfence)
 void sbi_ipi_init(void);
 #else
+static inline bool riscv_use_sbi_for_rfence(void) { return false; }
 static inline void sbi_ipi_init(void) { }
 #endif

diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 0d555847cde6..7ac80e9f2288 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -49,12 +49,7 @@ void riscv_ipi_disable(void);
 bool riscv_ipi_have_virq_range(void);

 /* Set the IPI interrupt numbers for arch (called by irqchip drivers) */
-void riscv_ipi_set_virq_range(int virq, int nr, bool use_for_rfence);
-
-/* Check if we can use IPIs for remote FENCEs */
-DECLARE_STATIC_KEY_FALSE(riscv_ipi_for_rfence);
-#define riscv_use_ipi_for_rfence() \
-	static_branch_unlikely(&riscv_ipi_for_rfence)
+void riscv_ipi_set_virq_range(int virq, int nr);

 /* Check other CPUs stop or not */
 bool smp_crash_stop_failed(void);
@@ -104,16 +99,10 @@ static inline bool riscv_ipi_have_virq_range(void)
 	return false;
 }

-static inline void riscv_ipi_set_virq_range(int virq, int nr,
-					    bool use_for_rfence)
+static inline void riscv_ipi_set_virq_range(int virq, int nr)
 {
 }

-static inline bool riscv_use_ipi_for_rfence(void)
-{
-	return false;
-}
-
 #endif /* CONFIG_SMP */

 #if defined(CONFIG_HOTPLUG_CPU) && (CONFIG_SMP)
diff --git a/arch/riscv/kernel/sbi-ipi.c b/arch/riscv/kernel/sbi-ipi.c
index a4559695ce62..1026e22955cc 100644
--- a/arch/riscv/kernel/sbi-ipi.c
+++ b/arch/riscv/kernel/sbi-ipi.c
@@ -13,6 +13,9 @@
 #include <linux/irqdomain.h>
 #include <asm/sbi.h>

+DEFINE_STATIC_KEY_FALSE(riscv_sbi_for_rfence);
+EXPORT_SYMBOL_GPL(riscv_sbi_for_rfence);
+
 static int sbi_ipi_virq;

 static void sbi_ipi_handle(struct irq_desc *desc)
@@ -72,6 +75,12 @@ void __init sbi_ipi_init(void)
 			  "irqchip/sbi-ipi:starting",
 			  sbi_ipi_starting_cpu, NULL);

-	riscv_ipi_set_virq_range(virq, BITS_PER_BYTE, false);
+	riscv_ipi_set_virq_range(virq, BITS_PER_BYTE);
 	pr_info("providing IPIs using SBI IPI extension\n");
+
+	/*
+	 * Use the SBI remote fence extension to avoid
+	 * the extra context switch needed to handle IPIs.
+	 */
+	static_branch_enable(&riscv_sbi_for_rfence);
 }
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
index 45dd4035416e..8e6eb64459af 100644
--- a/arch/riscv/kernel/smp.c
+++ b/arch/riscv/kernel/smp.c
@@ -171,10 +171,7 @@ bool riscv_ipi_have_virq_range(void)
 	return (ipi_virq_base) ? true : false;
 }

-DEFINE_STATIC_KEY_FALSE(riscv_ipi_for_rfence);
-EXPORT_SYMBOL_GPL(riscv_ipi_for_rfence);
-
-void riscv_ipi_set_virq_range(int virq, int nr, bool use_for_rfence)
+void riscv_ipi_set_virq_range(int virq, int nr)
 {
 	int i, err;

@@ -197,12 +194,6 @@ void riscv_ipi_set_virq_range(int virq, int nr, bool use_for_rfence)

 	/* Enabled IPIs for boot CPU immediately */
 	riscv_ipi_enable();
-
-	/* Update RFENCE static key */
-	if (use_for_rfence)
-		static_branch_enable(&riscv_ipi_for_rfence);
-	else
-		static_branch_disable(&riscv_ipi_for_rfence);
 }

 static const char * const ipi_names[] = {
diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 55a34f2020a8..47c485bc7df0 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -21,7 +21,7 @@ void flush_icache_all(void)
 {
 	local_flush_icache_all();

-	if (IS_ENABLED(CONFIG_RISCV_SBI) && !riscv_use_ipi_for_rfence())
+	if (riscv_use_sbi_for_rfence())
 		sbi_remote_fence_i(NULL);
 	else
 		on_each_cpu(ipi_remote_fence_i, NULL, 1);
@@ -69,8 +69,7 @@ void flush_icache_mm(struct mm_struct *mm, bool local)
 		 * with flush_icache_deferred().
 		 */
 		smp_mb();
-	} else if (IS_ENABLED(CONFIG_RISCV_SBI) &&
-		   !riscv_use_ipi_for_rfence()) {
+	} else if (riscv_use_sbi_for_rfence()) {
 		sbi_remote_fence_i(&others);
 	} else {
 		on_each_cpu_mask(&others, ipi_remote_fence_i, NULL, 1);
diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
index 8d12b26f5ac3..0373661bd1c4 100644
--- a/arch/riscv/mm/tlbflush.c
+++ b/arch/riscv/mm/tlbflush.c
@@ -78,10 +78,10 @@ static void __ipi_flush_tlb_all(void *info)

 void flush_tlb_all(void)
 {
-	if (riscv_use_ipi_for_rfence())
-		on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
-	else
+	if (riscv_use_sbi_for_rfence())
 		sbi_remote_sfence_vma_asid(NULL, 0, FLUSH_TLB_MAX_SIZE, FLUSH_TLB_NO_ASID);
+	else
+		on_each_cpu(__ipi_flush_tlb_all, NULL, 1);
 }

 struct flush_tlb_range_data {
@@ -102,7 +102,6 @@ static void __flush_tlb_range(struct cpumask *cmask, unsigned long asid,
 			      unsigned long start, unsigned long size,
 			      unsigned long stride)
 {
-	struct flush_tlb_range_data ftd;
 	bool broadcast;

 	if (cpumask_empty(cmask))
@@ -118,20 +117,18 @@ static void __flush_tlb_range(struct cpumask *cmask, unsigned long asid,
 		broadcast = true;
 	}

-	if (broadcast) {
-		if (riscv_use_ipi_for_rfence()) {
-			ftd.asid = asid;
-			ftd.start = start;
-			ftd.size = size;
-			ftd.stride = stride;
-			on_each_cpu_mask(cmask,
-					 __ipi_flush_tlb_range_asid,
-					 &ftd, 1);
-		} else
-			sbi_remote_sfence_vma_asid(cmask,
-						   start, size, asid);
-	} else {
+	if (!broadcast) {
 		local_flush_tlb_range_asid(start, size, stride, asid);
+	} else if (riscv_use_sbi_for_rfence()) {
+		sbi_remote_sfence_vma_asid(cmask, start, size, asid);
+	} else {
+		struct flush_tlb_range_data ftd;
+
+		ftd.asid = asid;
+		ftd.start = start;
+		ftd.size = size;
+		ftd.stride = stride;
+		on_each_cpu_mask(cmask, __ipi_flush_tlb_range_asid, &ftd, 1);
 	}

 	if (cmask != cpu_online_mask)
diff --git a/drivers/clocksource/timer-clint.c b/drivers/clocksource/timer-clint.c
index 09fd292eb83d..0bdd9d7ec545 100644
--- a/drivers/clocksource/timer-clint.c
+++ b/drivers/clocksource/timer-clint.c
@@ -251,7 +251,7 @@ static int __init clint_timer_init_dt(struct device_node *np)
 	}

 	irq_set_chained_handler(clint_ipi_irq, clint_ipi_interrupt);
-	riscv_ipi_set_virq_range(rc, BITS_PER_BYTE, true);
+	riscv_ipi_set_virq_range(rc, BITS_PER_BYTE);
 	clint_clear_ipi();
 #endif
An IPI backend is always required in an SMP configuration, but an SBI
implementation is not. For example, SBI will be unavailable when the
kernel runs in M mode.

Generally, IPIs are assumed to be faster than SBI calls due to the SBI
context switch overhead. However, when SBI is used as the IPI backend,
then the context switch cost must be paid anyway, and performing the
cache/TLB flush directly in the SBI implementation is more efficient
than injecting an interrupt into the kernel. This is the only scenario
where riscv_ipi_set_virq_range()'s use_for_rfence parameter is false.

Thus, it makes sense for remote fences to use IPIs by default, and make
the SBI remote fence extension the special case.

sbi_ipi_init() already checks riscv_ipi_have_virq_range(), so it only
calls riscv_ipi_set_virq_range() when no other IPI device is available.
So we can move the static key and drop the use_for_rfence parameter.

Furthermore, the static branch only makes sense when CONFIG_RISCV_SBI is
enabled. Otherwise, IPIs must be used. Add a fallback definition of
riscv_use_sbi_for_rfence() which handles this case and removes the need
to check CONFIG_RISCV_SBI elsewhere, such as in cacheflush.c.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
Changes in v5:
 - Also switch to riscv_use_sbi_for_rfence() in asm/pgalloc.h

Changes in v4:
 - New patch for v4

 arch/riscv/include/asm/pgalloc.h  |  7 ++++---
 arch/riscv/include/asm/sbi.h      |  4 ++++
 arch/riscv/include/asm/smp.h      | 15 ++-------------
 arch/riscv/kernel/sbi-ipi.c       | 11 ++++++++++-
 arch/riscv/kernel/smp.c           | 11 +----------
 arch/riscv/mm/cacheflush.c        |  5 ++---
 arch/riscv/mm/tlbflush.c          | 31 ++++++++++++++-----------------
 drivers/clocksource/timer-clint.c |  2 +-
 8 files changed, 38 insertions(+), 48 deletions(-)