Message ID | 20231229143627.22898-11-andy.chiu@sifive.com (mailing list archive) |
---|---|
State | Superseded |
Series | riscv: support kernel-mode Vector |
Hi Andy, > -----Original Message----- > From: Andy Chiu <andy.chiu@sifive.com> > Sent: Friday, December 29, 2023 10:36 PM > To: linux-riscv@lists.infradead.org; palmer@dabbelt.com > Cc: paul.walmsley@sifive.com; greentime.hu@sifive.com; > guoren@linux.alibaba.com; bjorn@kernel.org; charlie@rivosinc.com; > ardb@kernel.org; arnd@arndb.de; peterz@infradead.org; tglx@linutronix.de; > ebiggers@kernel.org; Andy Chiu <andy.chiu@sifive.com>; Albert Ou > <aou@eecs.berkeley.edu>; Guo Ren <guoren@kernel.org>; Han-Kuan Chen > <hankuan.chen@sifive.com>; Sami Tolvanen <samitolvanen@google.com>; > Deepak Gupta <debug@rivosinc.com>; Vincent Chen > <vincent.chen@sifive.com>; Heiko Stuebner <heiko@sntech.de>; Clément > Léger <cleger@rivosinc.com>; Björn Töpel <bjorn@rivosinc.com>; Wang, Xiao > W <xiao.w.wang@intel.com>; Nathan Chancellor <nathan@kernel.org>; > Jisheng Zhang <jszhang@kernel.org>; Conor Dooley > <conor.dooley@microchip.com>; Joel Granados <j.granados@samsung.com> > Subject: [v9, 10/10] riscv: vector: allow kernel-mode Vector with preemption > > Add kernel_vstate to keep track of kernel-mode Vector registers when a > trap-introduced context switch happens. Also, provide riscv_v_flags to > let the context save/restore routines track context status. Context tracking > happens whenever the core starts its in-kernel Vector executions. An > active (dirty) kernel task's V contexts will be saved to memory whenever > a trap-introduced context switch happens, or when a softirq, which > happens to nest on top of it, uses Vector. Context restoring happens when > the execution transfers back to the original kernel context where it > first enabled preempt_v. > > Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an > option to disable preemptible kernel-mode Vector at build time. Users > with constrained memory may want to disable this config, as preemptible > kernel-mode Vector needs extra space for tracking each thread's > kernel-mode V context. Users may also want to disable it if all > kernel-mode Vector code is time-sensitive and cannot tolerate context > switch overhead. > > Signed-off-by: Andy Chiu <andy.chiu@sifive.com> > --- > Changelog v9: > - Separate context depth tracking out to an individual bitmap. > - Use bitwise ops to mask on/off the preempt_v status and drop unused masks > - Do not turn off bh on success path of preempt_v (To make preempt_v > available for task context that turns off irq). > - Remove and test lockdep assertion. > Changelog v8: > - fix -Wmissing-prototypes for functions with asmlinkage > Changelog v6: > - re-write patch to handle context nesting for softirqs > - drop thread flag and track context instead in riscv_v_flags > - refine some asm code and constrain it into C functions > - preallocate v context for preempt_v > - Return non-zero in riscv_v_start_kernel_context with non-preemptible > kernel-mode Vector > Changelog v4: > - dropped from v4 > Changelog v3: > - Guard vstate_save with {get,set}_cpu_vector_context > - Add comments on preventions of nesting V contexts > - remove warnings in context switch when trap's reg is not present (Conor) > - refactor code (Björn) > Changelog v2: > - fix build fail when compiling without RISCV_ISA_V (Conor) > - 's/TIF_RISCV_V_KMV/TIF_RISCV_V_KERNEL_MODE' and add comment > (Conor) > - merge Kconfig patch into this one (Conor). > - > 's/CONFIG_RISCV_ISA_V_PREEMPTIVE_KMV/CONFIG_RISCV_ISA_V_PREEMP > TIVE/' > (Conor) > - fix some typos (Conor) > - enclose assembly with RISCV_ISA_V_PREEMPTIVE. 
> - change riscv_v_vstate_ctrl_config_kmv() to > kernel_vector_allow_preemption() for better understanding. (Conor) > - 's/riscv_v_kmv_preempitble/kernel_vector_preemptible/' > --- > arch/riscv/Kconfig | 14 +++ > arch/riscv/include/asm/asm-prototypes.h | 5 + > arch/riscv/include/asm/processor.h | 30 +++++- > arch/riscv/include/asm/simd.h | 26 ++++- > arch/riscv/include/asm/vector.h | 68 +++++++++++- > arch/riscv/kernel/entry.S | 8 ++ > arch/riscv/kernel/kernel_mode_vector.c | 137 ++++++++++++++++++++++-- > arch/riscv/kernel/process.c | 3 + > arch/riscv/kernel/vector.c | 31 ++++-- > 9 files changed, 300 insertions(+), 22 deletions(-) > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > index 3c5ba05e8a2d..0a03d72706b5 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -533,6 +533,20 @@ config RISCV_ISA_V_UCOPY_THRESHOLD > Prefer using vectorized copy_to_user()/copy_from_user() when the > workload size exceeds this value. > > +config RISCV_ISA_V_PREEMPTIVE > + bool "Run kernel-mode Vector with kernel preemption" > + depends on PREEMPTION > + depends on RISCV_ISA_V > + default y > + help > + Usually, in-kernel SIMD routines are run with preemption disabled. > + Functions which invoke long-running SIMD thus must yield the core's > + vector unit to prevent blocking other tasks for too long. > + > + This config allows the kernel to run SIMD without explicitly disabling > + preemption. Enabling this config will result in higher memory > + consumption due to the allocation of per-task kernel Vector > context. > + > config TOOLCHAIN_HAS_ZBB > bool > default y > diff --git a/arch/riscv/include/asm/asm-prototypes.h > b/arch/riscv/include/asm/asm-prototypes.h > index be438932f321..cd627ec289f1 100644 > --- a/arch/riscv/include/asm/asm-prototypes.h > +++ b/arch/riscv/include/asm/asm-prototypes.h > @@ -30,6 +30,11 @@ void xor_regs_5_(unsigned long bytes, unsigned long > *__restrict p1, > const unsigned long *__restrict p4, > const unsigned long *__restrict p5); > > +#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE > +asmlinkage void riscv_v_context_nesting_start(struct pt_regs *regs); > +asmlinkage void riscv_v_context_nesting_end(struct pt_regs *regs); > +#endif /* CONFIG_RISCV_ISA_V_PREEMPTIVE */ > + > #endif /* CONFIG_RISCV_ISA_V */ > > #define DECLARE_DO_ERROR_INFO(name) asmlinkage void name(struct > pt_regs *regs) > diff --git a/arch/riscv/include/asm/processor.h > b/arch/riscv/include/asm/processor.h > index e76839789067..b503fd34728d 100644 > --- a/arch/riscv/include/asm/processor.h > +++ b/arch/riscv/include/asm/processor.h > @@ -81,8 +81,35 @@ struct pt_regs; > * activation of this state disables preemption. On a non-RT kernel, it > * also disables bh. Currently only 0 and 1 are valid values for this field. > * Other values are reserved for future uses. > + * - bit 8: is used for tracking preemptible kernel-mode Vector, when > + * RISCV_ISA_V_PREEMPTIVE is enabled. Calling kernel_vector_begin() does > not > + * disable preemption if the thread's kernel_vstate.datap is allocated. > + * Instead, the kernel sets this bit field. Then the trap entry/exit code > + * knows if we are entering/exiting the context that owns preempt_v. > + * - 0: the task is not using preempt_v > + * - 1: the task is actively using preempt_v. But whether the task owns > + * the preempt_v context is decided by the bits in > RISCV_V_CTX_DEPTH_MASK. 
> + * - bits 16-23 are RISCV_V_CTX_DEPTH_MASK, used by the context tracking > routine > + * when preempt_v starts: > + * - 0: the task is actively using, and owns, the preempt_v context. > + * - non-zero: the task was using preempt_v, but then took a trap within. > + * Thus, the task does not own preempt_v. Any use of Vector will have to > + * save preempt_v, if dirty, and fall back to non-preemptible kernel-mode > + * Vector. > + * - bit 30: The in-kernel preempt_v context is saved, and needs to be > + * restored when returning to the context that owns the preempt_v. > + * - bit 31: The in-kernel preempt_v context is dirty, as signaled by the > + * trap entry code. Any context switch out of the current task needs to save > + * it to the task's in-kernel V context. Also, any trap nesting on top of > + * preempt_v that requests to use V needs a save. > */ > -#define RISCV_KERNEL_MODE_V 0x1 > +#define RISCV_V_CTX_DEPTH_MASK 0x00ff0000 > + > +#define RISCV_V_CTX_UNIT_DEPTH 0x00010000 > +#define RISCV_KERNEL_MODE_V 0x00000001 > +#define RISCV_PREEMPT_V 0x00000100 > +#define RISCV_PREEMPT_V_DIRTY 0x80000000 > +#define RISCV_PREEMPT_V_NEED_RESTORE 0x40000000 > > /* CPU-specific state of a task */ > struct thread_struct { > @@ -96,6 +123,7 @@ struct thread_struct { > u32 vstate_ctrl; > struct __riscv_v_ext_state vstate; > unsigned long align_ctl; > + struct __riscv_v_ext_state kernel_vstate; > }; > > /* Whitelist the fstate from the task_struct for hardened usercopy */ > diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h > index 4d699e16c9a9..54efbf523d49 100644 > --- a/arch/riscv/include/asm/simd.h > +++ b/arch/riscv/include/asm/simd.h > @@ -12,6 +12,7 @@ > #include <linux/percpu.h> > #include <linux/preempt.h> > #include <linux/types.h> > +#include <linux/thread_info.h> > > #include <asm/vector.h> > > @@ -28,12 +29,27 @@ static __must_check inline bool may_use_simd(void) > /* > * RISCV_KERNEL_MODE_V is only set while preemption is disabled, > * and is clear whenever preemption is enabled. > - * > - * Kernel-mode Vector temporarily disables bh. So we must not return > - * true on irq_disabled(). Otherwise we would fail the lockdep check > - * calling local_bh_enable() > */ > - return !in_hardirq() && !in_nmi() && !irqs_disabled() > && !(riscv_v_flags() & RISCV_KERNEL_MODE_V); > + if (in_hardirq() || in_nmi()) > + return false; > + > + /* > + * Nesting is achieved in preempt_v by spreading the control for > + * preemptible and non-preemptible kernel-mode Vector into two > fields. > + * Always try to match with preempt_v if a kernel V-context exists. Then, > + * fall back to checking non-preempt_v if nesting happens, or if the config > + * is not set. > + */ > + if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) && current->thread.kernel_vstate.datap) { > + if (!riscv_preempt_v_started(current)) > + return true; > + } > + /* > + * Non-preemptible kernel-mode Vector temporarily disables bh. So > we > + * must not return true on irqs_disabled(). Otherwise we would fail the > + * lockdep check calling local_bh_enable() > + */ > + return !irqs_disabled() && !(riscv_v_flags() & > RISCV_KERNEL_MODE_V); > } > > #else /* ! 
CONFIG_RISCV_ISA_V */ > diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h > index 7b316050f24f..d69844906d51 100644 > --- a/arch/riscv/include/asm/vector.h > +++ b/arch/riscv/include/asm/vector.h > @@ -28,10 +28,11 @@ void get_cpu_vector_context(void); > void put_cpu_vector_context(void); > void riscv_v_thread_free(struct task_struct *tsk); > void __init riscv_v_setup_ctx_cache(void); > +void riscv_v_thread_alloc(struct task_struct *tsk); > > static inline u32 riscv_v_flags(void) > { > - return current->thread.riscv_v_flags; > + return READ_ONCE(current->thread.riscv_v_flags); > } > > static __always_inline bool has_vector(void) > @@ -200,14 +201,72 @@ static inline void riscv_v_vstate_set_restore(struct > task_struct *task, > } > } > > +#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE > +static inline bool riscv_preempt_v_dirty(struct task_struct *task) > +{ > + u32 val = READ_ONCE(task->thread.riscv_v_flags); > + > + return !!(val & RISCV_PREEMPT_V_DIRTY); > +} > + > +static inline bool riscv_preempt_v_restore(struct task_struct *task) > +{ > + u32 val = READ_ONCE(task->thread.riscv_v_flags); > + > + return !!(val & RISCV_PREEMPT_V_NEED_RESTORE); > +} > + > +static inline void riscv_preempt_v_clear_dirty(struct task_struct *task) > +{ > + barrier(); > + task->thread.riscv_v_flags &= ~RISCV_PREEMPT_V_DIRTY; > +} > + > +static inline void riscv_preempt_v_set_restore(struct task_struct *task) > +{ > + barrier(); > + task->thread.riscv_v_flags |= RISCV_PREEMPT_V_NEED_RESTORE; > +} > + > +static inline bool riscv_preempt_v_started(struct task_struct *task) > +{ > + return !!(READ_ONCE(task->thread.riscv_v_flags) & > RISCV_PREEMPT_V); > +} > + > +static inline bool riscv_preempt_v_need_save(struct task_struct *task) > +{ > + u32 val = READ_ONCE(task->thread.riscv_v_flags); > + > + /* preempt_v has started and the state is dirty */ > + return !!((val & RISCV_PREEMPT_V_DIRTY) && (val & > RISCV_PREEMPT_V)); > +} > +#else /* !CONFIG_RISCV_ISA_V_PREEMPTIVE */ > +static inline bool riscv_preempt_v_dirty(struct task_struct *task) { return > false; } > +static inline bool riscv_preempt_v_restore(struct task_struct *task) { return > false; } > +static inline bool riscv_preempt_v_started(struct task_struct *task) { return > false; } > +static inline bool riscv_preempt_v_need_save(struct task_struct *task) > { return false; } > +#define riscv_preempt_v_clear_dirty(tsk) do {} while (0) > +#define riscv_preempt_v_set_restore(tsk) do {} while (0) > +#endif /* CONFIG_RISCV_ISA_V_PREEMPTIVE */ > + > static inline void __switch_to_vector(struct task_struct *prev, > struct task_struct *next) > { > struct pt_regs *regs; > > - regs = task_pt_regs(prev); > - riscv_v_vstate_save(&prev->thread.vstate, regs); > - riscv_v_vstate_set_restore(next, task_pt_regs(next)); > + if (riscv_preempt_v_need_save(prev)) { > + __riscv_v_vstate_save(&prev->thread.kernel_vstate, > + prev->thread.kernel_vstate.datap); > + riscv_preempt_v_clear_dirty(prev); > + } else { > + regs = task_pt_regs(prev); > + riscv_v_vstate_save(&prev->thread.vstate, regs); > + } > + > + if (riscv_preempt_v_started(next)) > + riscv_preempt_v_set_restore(next); > + else > + riscv_v_vstate_set_restore(next, task_pt_regs(next)); > } > > void riscv_v_vstate_ctrl_init(struct task_struct *tsk); > @@ -231,6 +290,7 @@ static inline bool > riscv_v_vstate_ctrl_user_allowed(void) { return false; } > #define riscv_v_vstate_on(regs) do {} while (0) > #define riscv_v_thread_free(tsk) do {} while (0) > #define riscv_v_setup_ctx_cache() do {} while (0) > 
+#define riscv_v_thread_alloc(tsk) do {} while (0) > > #endif /* CONFIG_RISCV_ISA_V */ > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 54ca4564a926..9d1a305d5508 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -83,6 +83,10 @@ SYM_CODE_START(handle_exception) > /* Load the kernel shadow call stack pointer if coming from userspace > */ > scs_load_current_if_task_changed s5 > > +#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE > + move a0, sp > + call riscv_v_context_nesting_start > +#endif > move a0, sp /* pt_regs */ > la ra, ret_from_exception > > @@ -138,6 +142,10 @@ SYM_CODE_START_NOALIGN(ret_from_exception) > */ > csrw CSR_SCRATCH, tp > 1: > +#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE > + move a0, sp > + call riscv_v_context_nesting_end > +#endif > REG_L a0, PT_STATUS(sp) > /* > * The current load reservation is effectively part of the processor's > diff --git a/arch/riscv/kernel/kernel_mode_vector.c > b/arch/riscv/kernel/kernel_mode_vector.c > index 241a8f834e1c..22580d36fd08 100644 > --- a/arch/riscv/kernel/kernel_mode_vector.c > +++ b/arch/riscv/kernel/kernel_mode_vector.c > @@ -14,10 +14,13 @@ > #include <asm/vector.h> > #include <asm/switch_to.h> > #include <asm/simd.h> > +#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE > +#include <asm/asm-prototypes.h> > +#endif > > static inline void riscv_v_flags_set(u32 flags) > { > - current->thread.riscv_v_flags = flags; > + WRITE_ONCE(current->thread.riscv_v_flags, flags); > } > > static inline void riscv_v_start(u32 flags) > @@ -27,12 +30,14 @@ static inline void riscv_v_start(u32 flags) > orig = riscv_v_flags(); > BUG_ON((orig & flags) != 0); > riscv_v_flags_set(orig | flags); > + barrier(); > } > > static inline void riscv_v_stop(u32 flags) > { > int orig; > > + barrier(); > orig = riscv_v_flags(); > BUG_ON((orig & flags) == 0); > riscv_v_flags_set(orig & ~flags); > @@ -75,6 +80,121 @@ void put_cpu_vector_context(void) > preempt_enable(); > } > > +#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE > +static inline void riscv_preempt_v_set_dirty(void) > +{ > + current->thread.riscv_v_flags |= RISCV_PREEMPT_V_DIRTY; > +} > + > +static inline void riscv_preempt_v_reset_flags(void) > +{ > + current->thread.riscv_v_flags &= ~(RISCV_PREEMPT_V_DIRTY | > RISCV_PREEMPT_V_NEED_RESTORE); > +} > + > +static __always_inline volatile u32 *riscv_v_flags_ptr(void) > +{ > + return &current->thread.riscv_v_flags; > +} > + > +static inline void riscv_v_ctx_depth_inc(void) > +{ > + *riscv_v_flags_ptr() += RISCV_V_CTX_UNIT_DEPTH; > + barrier(); > +} > + > +static inline void riscv_v_ctx_depth_dec(void) > +{ > + barrier(); > + *riscv_v_flags_ptr() -= RISCV_V_CTX_UNIT_DEPTH; > +} > + > +static inline u32 riscv_v_ctx_get_depth(void) > +{ > + return riscv_v_flags() & RISCV_V_CTX_DEPTH_MASK; > +} > + > +static int riscv_v_stop_kernel_context(void) > +{ > + if (riscv_v_ctx_get_depth() != 0 || !riscv_preempt_v_started(current)) > + return 1; > + > + riscv_v_stop(RISCV_PREEMPT_V); > + return 0; > +} > + > +static int riscv_v_start_kernel_context(bool *is_nested) > +{ > + struct __riscv_v_ext_state *kvstate, *uvstate; > + > + kvstate = &current->thread.kernel_vstate; > + if (!kvstate->datap) > + return -ENOENT; > + > + if (riscv_preempt_v_started(current)) { > + WARN_ON(riscv_v_ctx_get_depth() == 0); > + *is_nested = true; > + if (riscv_preempt_v_dirty(current)) { > + get_cpu_vector_context(); > + __riscv_v_vstate_save(kvstate, kvstate->datap); > + riscv_preempt_v_clear_dirty(current); > + put_cpu_vector_context(); > + } > + get_cpu_vector_context(); Can we 
get_cpu_vector_context() before the above check? It looks like we would get/put/get the context when preempt_v_dirty is true. > + riscv_preempt_v_set_restore(current); > + return 0; > + } > + > + riscv_v_start(RISCV_PREEMPT_V); > + if ((task_pt_regs(current)->status & SR_VS) == SR_VS_DIRTY) { > + uvstate = &current->thread.vstate; > + riscv_preempt_v_set_dirty(); > + __riscv_v_vstate_save(uvstate, uvstate->datap); > + __riscv_v_vstate_clean(task_pt_regs(current)); Why set the status as CLEAN? Later in kernel_vector_begin, the status would then be set to INITIAL by riscv_v_vstate_set_restore()/riscv_v_vstate_on(). BTW, the "status" is just a value in memory, not yet written back to the CSR, so I don't understand how this check (regs->status & SR_VS) == SR_VS_DIRTY in riscv_v_context_nesting_start() can work. Maybe I missed something. > + riscv_preempt_v_clear_dirty(current); I assume there would be some vector instructions running between kernel_vector_begin() and kernel_vector_end(), so why not call riscv_preempt_v_clear_dirty() in riscv_v_stop_kernel_context()? Or are we relying on the check (regs->status & SR_VS) == SR_VS_DIRTY in nesting_start to mark RISCV_PREEMPT_V_DIRTY? BRs, Xiao > + } > + return 0; > +} > + > +/* low-level V context handling code, called with irq disabled */ > +asmlinkage void riscv_v_context_nesting_start(struct pt_regs *regs) > +{ > + int depth; > + > + if (!riscv_preempt_v_started(current)) > + return; > + > + depth = riscv_v_ctx_get_depth(); > + if (depth == 0 && (regs->status & SR_VS) == SR_VS_DIRTY) > + riscv_preempt_v_set_dirty(); > + > + riscv_v_ctx_depth_inc(); > +}
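
[Aside, for readers following the flag discussion above: a minimal, self-contained userspace sketch of how the riscv_v_flags word from this patch packs its fields. The #define values are copied from the patch's processor.h hunk; the main() driver and its printf are illustrative only and are not part of the series.]

  #include <stdio.h>
  #include <stdint.h>

  /* Values copied from the patch's processor.h hunk */
  #define RISCV_V_CTX_DEPTH_MASK       0x00ff0000 /* bits 16-23: trap nesting depth */
  #define RISCV_V_CTX_UNIT_DEPTH       0x00010000 /* one step in the depth field */
  #define RISCV_KERNEL_MODE_V          0x00000001 /* bit 0: non-preemptible kernel V */
  #define RISCV_PREEMPT_V              0x00000100 /* bit 8: preempt_v started */
  #define RISCV_PREEMPT_V_DIRTY        0x80000000 /* bit 31: in-kernel V context is dirty */
  #define RISCV_PREEMPT_V_NEED_RESTORE 0x40000000 /* bit 30: restore on return to owner */

  int main(void)
  {
      uint32_t flags = 0;

      flags |= RISCV_PREEMPT_V;         /* kernel_vector_begin() starts preempt_v */
      /* trap entry (riscv_v_context_nesting_start): depth was 0 and V was dirty */
      flags |= RISCV_PREEMPT_V_DIRTY;
      flags += RISCV_V_CTX_UNIT_DEPTH;  /* riscv_v_ctx_depth_inc() */
      flags -= RISCV_V_CTX_UNIT_DEPTH;  /* trap exit: riscv_v_ctx_depth_dec() */

      printf("depth=%u started=%d dirty=%d need_restore=%d\n",
             (unsigned)((flags & RISCV_V_CTX_DEPTH_MASK) >> 16),
             !!(flags & RISCV_PREEMPT_V),
             !!(flags & RISCV_PREEMPT_V_DIRTY),
             !!(flags & RISCV_PREEMPT_V_NEED_RESTORE));
      return 0;
  }
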
Hi Xiao, On Thu, Jan 4, 2024 at 11:03 PM Wang, Xiao W <xiao.w.wang@intel.com> wrote: > > Hi Andy, > > [... full quote of the patch and changelog trimmed ...] > > > +static int riscv_v_start_kernel_context(bool *is_nested) > > +{ > > + struct __riscv_v_ext_state *kvstate, 
*uvstate; > > + > > + kvstate = &current->thread.kernel_vstate; > > + if (!kvstate->datap) > > + return -ENOENT; > > + > > + if (riscv_preempt_v_started(current)) { > > + WARN_ON(riscv_v_ctx_get_depth() == 0); > > + *is_nested = true; > > + if (riscv_preempt_v_dirty(current)) { > > + get_cpu_vector_context(); > > + __riscv_v_vstate_save(kvstate, kvstate->datap); > > + riscv_preempt_v_clear_dirty(current); > > + put_cpu_vector_context(); > > + } > > + get_cpu_vector_context(); > > Can we get_cpu_vector_context() before the above check? It looks like we would get/put/get the context > when preempt_v_dirty is true. Yes, we can. I will fix that in v10. > > > + riscv_preempt_v_set_restore(current); > > + return 0; > > + } > > + > > + riscv_v_start(RISCV_PREEMPT_V); > > + if ((task_pt_regs(current)->status & SR_VS) == SR_VS_DIRTY) { > > + uvstate = &current->thread.vstate; > > + riscv_preempt_v_set_dirty(); > > + __riscv_v_vstate_save(uvstate, uvstate->datap); > > + __riscv_v_vstate_clean(task_pt_regs(current)); > > Why set the status as CLEAN? Later in kernel_vector_begin, the status would then be set to INITIAL > by riscv_v_vstate_set_restore()/riscv_v_vstate_on(). Thanks! Yes, we don't need to clean the state here because we call riscv_v_vstate_set_restore() right after returning. > > BTW, the "status" is just a value in memory, not yet written back to the CSR, so I don't understand how > this check (regs->status & SR_VS) == SR_VS_DIRTY in riscv_v_context_nesting_start() can work. > Maybe I missed something. We are about to start kernel-mode Vector, so we save the user's V context if it is dirty. task_pt_regs always points to the regset at the user/kernel boundary, because an in-kernel trap grows on the original kernel stack. The purpose of this code segment is to let preempt_v own the user's V context for the context saving. So, we have to copy the dirty status into riscv_v_flags when starting preempt_v. However, it has to be seen atomically w.r.t. the context-nesting code on the same core. So, I am going to update the code to something like this: if ((task_pt_regs(current)->status & SR_VS) == SR_VS_DIRTY) { /* Transfer the ownership of V from user to kernel, then save */ riscv_v_start(RISCV_PREEMPT_V | RISCV_PREEMPT_V_DIRTY); uvstate = &current->thread.vstate; __riscv_v_vstate_save(uvstate, uvstate->datap); riscv_preempt_v_clear_dirty(current); } else { riscv_v_start(RISCV_PREEMPT_V); } Also, the context saving code in context switch shall not save anything as long as preempt_v is enabled. I will address that in v10. > > + riscv_preempt_v_clear_dirty(current); > > I assume there would be some vector instructions running between kernel_vector_begin() > and kernel_vector_end(), so why not call riscv_preempt_v_clear_dirty() in riscv_v_stop_kernel_context()? > Or are we relying on the check (regs->status & SR_VS) == SR_VS_DIRTY in nesting_start to mark > RISCV_PREEMPT_V_DIRTY? Yes, we should clear RISCV_PREEMPT_V_DIRTY at riscv_v_stop_kernel_context(). Besides, we should only reset the flags when a restore happens in riscv_v_context_nesting_end(). > > BRs, > Xiao > > > + } > > + return 0; > > +} > > + > > +/* low-level V context handling code, called with irq disabled */ > > +asmlinkage void riscv_v_context_nesting_start(struct pt_regs *regs) > > +{ > > + int depth; > > + > > + if (!riscv_preempt_v_started(current)) > > + return; > > + > > + depth = riscv_v_ctx_get_depth(); > > + if (depth == 0 && (regs->status & SR_VS) == SR_VS_DIRTY) > > + riscv_preempt_v_set_dirty(); > > + > > + riscv_v_ctx_depth_inc(); > > +} Thanks for the suggestions! 
Regards, Andy
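
[Aside: a sketch of how a caller would consume this API, assuming the kernel_vector_begin()/kernel_vector_end() and may_use_simd() entry points named in this thread; the xor_words() helper and its includes are made up for illustration and are not code from the series.]

  #include <asm/simd.h>
  #include <asm/vector.h>

  /* Hypothetical kernel-mode Vector user under the preempt_v scheme. */
  static void xor_words(unsigned long n, unsigned long *dst, const unsigned long *src)
  {
      if (!may_use_simd()) {
          /* V is not usable in this context: plain scalar fallback */
          while (n--)
              *dst++ ^= *src++;
          return;
      }

      kernel_vector_begin();  /* with preempt_v, this no longer disables preemption */
      /* ... an RVV implementation of the xor loop would go here ... */
      kernel_vector_end();    /* stop kernel-mode V; with preempt_v, drops RISCV_PREEMPT_V */
  }
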
On Wed, Jan 10, 2024 at 11:31 PM Andy Chiu <andy.chiu@sifive.com> wrote: > > Hi Xiao, > > On Thu, Jan 4, 2024 at 11:03 PM Wang, Xiao W <xiao.w.wang@intel.com> wrote: > > > > Hi Andy, > > [... full quote of the patch and the earlier exchange trimmed ...] 
> + return riscv_v_flags() & RISCV_V_CTX_DEPTH_MASK; > > > +} > > > + > > > +static int riscv_v_stop_kernel_context(void) > > > +{ > > > + if (riscv_v_ctx_get_depth() != 0 || !riscv_preempt_v_started(current)) > > > + return 1; > > > + > > > + riscv_v_stop(RISCV_PREEMPT_V); > > > + return 0; > > > +} > > > + > > > +static int riscv_v_start_kernel_context(bool *is_nested) > > > +{ > > > + struct __riscv_v_ext_state *kvstate, *uvstate; > > > + > > > + kvstate = ¤t->thread.kernel_vstate; > > > + if (!kvstate->datap) > > > + return -ENOENT; > > > + > > > + if (riscv_preempt_v_started(current)) { > > > + WARN_ON(riscv_v_ctx_get_depth() == 0); > > > + *is_nested = true; > > > + if (riscv_preempt_v_dirty(current)) { > > > + get_cpu_vector_context(); > > > + __riscv_v_vstate_save(kvstate, kvstate->datap); > > > + riscv_preempt_v_clear_dirty(current); > > > + put_cpu_vector_context(); > > > + } > > > + get_cpu_vector_context(); > > > > Can we get_cpu_vector_context() before the above check? It looks we would get/put/get context > > when the preempt_v_dirty is true. > > Yes, we can. I will fix that in v10 > > > > > > + riscv_preempt_v_set_restore(current); > > > + return 0; > > > + } > > > + > > > + riscv_v_start(RISCV_PREEMPT_V); > > > + if ((task_pt_regs(current)->status & SR_VS) == SR_VS_DIRTY) { > > > + uvstate = ¤t->thread.vstate; > > > + riscv_preempt_v_set_dirty(); > > > + __riscv_v_vstate_save(uvstate, uvstate->datap); > > > + __riscv_v_vstate_clean(task_pt_regs(current)); > > > > Why set the status as CLEAN? Later in kernel_vector_begin, the status would then be set to INITIAL > > by riscv_v_vstate_set_restore()/riscv_v_vstate_on(). > > Thanks! Yes, we don't need to clean the state here because we call > riscv_v_vstate_set_restore() right after returning back. > > > > > BTW, the "status" is just a value in memory, not updated to CSR yet, I don't understand how > > this check (regs->status & SR_VS) == SR_VS_DIRTY in riscv_v_context_nesting_start() can work. > > Maybe I miss something. > > We are about to start kernel-mode Vector. So save the user's V context > if it is dirty. task_pt_regs always point to the regset at user/kernel > boundary because in-kernel trap grows on the original kernel stack. > > The purpose of this code segment is to own the user's V context with > preempt_v for the context saving. So, we have to copy the dirty status > to riscv_v_flag when starting preempt_v. However, it has to be seen > atomically w.r.t context-nesting code at the same core. So, I am going > to update the code to something like this: > > if ((task_pt_regs(current)->status & SR_VS) == SR_VS_DIRTY) { > /* Transfer the ownership of V from user to kernel, then save */ > riscv_v_start(RISCV_PREEMPT_V | RISCV_PREEMPT_V_DIRTY); > uvstate = ¤t->thread.vstate; > __riscv_v_vstate_save(uvstate, uvstate->datap); > riscv_preempt_v_clear_dirty(current); > } else { > riscv_v_start(RISCV_PREEMPT_V); > } The dirty status checking for user context must perform after starting preempt_v. Or, we will pollute context in csr if a softirq nest between the check and the start of preempt_v. This is addressed in v10. However, always starting preempt_v with dirty may cause us to save the user's context even if it is not dirty. Another way is to use the one we have in v8: get_cpu_vector_context(); riscv_v_vstate_save(¤t->thread.vstate, task_pt_regs(current)); put_cpu_vector_context(); riscv_v_start(RISCV_PREEMPT_V); > > Also, context saving code in context switch shall not save anything as > long as preempt_v is enabled. 
I will address that in v10. > > > > > > + riscv_preempt_v_clear_dirty(current); > > > > I assume there would be some vector instructions running between the kernel_vector_begin() > > and kernel_vector_end(), so why not call riscv_preempt_v_clear_dirty() in riscv_v_stop_kernel_context()? > > Or we are relying on the check (regs->status & SR_VS) == SR_VS_DIRTY in nesting_start to mark > > RISCV_PREEMPT_V_DIRTY? > > Yes, we should clear RISCV_PREEMPT_V_DIRTY at > riscv_v_stop_kernel_context(). Besides, we should only reset flags > when a restore happens in riscv_v_context_nesting_end(). > > > > > BRs, > > Xiao > > > > > + } > > > + return 0; > > > +} > > > + > > > +/* low-level V context handling code, called with irq disabled */ > > > +asmlinkage void riscv_v_context_nesting_start(struct pt_regs *regs) > > > +{ > > > + int depth; > > > + > > > + if (!riscv_preempt_v_started(current)) > > > + return; > > > + > > > + depth = riscv_v_ctx_get_depth(); > > > + if (depth == 0 && (regs->status & SR_VS) == SR_VS_DIRTY) > > > + riscv_preempt_v_set_dirty(); > > > + > > > + riscv_v_ctx_depth_inc(); > > > +} > > > > Thanks for the suggestions! > > Regards, > Andy
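Since most of the discussion above turns on how riscv_v_flags packs the
preempt_v state into one per-thread word, here is a minimal standalone C
model of that bookkeeping. This is not kernel code: it only assumes the
bit layout defined in this patch (kernel mode in bit 0, preempt_v in bit
8, nesting depth in bits 16-23, restore/dirty in bits 30/31) and
simulates one trap nesting on top of a preempt_v user; the printf
tracing is purely illustrative.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define RISCV_KERNEL_MODE_V           0x00000001u
#define RISCV_PREEMPT_V               0x00000100u
#define RISCV_V_CTX_UNIT_DEPTH        0x00010000u
#define RISCV_V_CTX_DEPTH_MASK        0x00ff0000u
#define RISCV_PREEMPT_V_NEED_RESTORE  0x40000000u
#define RISCV_PREEMPT_V_DIRTY         0x80000000u

static uint32_t flags;  /* stands in for current->thread.riscv_v_flags */

static uint32_t depth(void) { return flags & RISCV_V_CTX_DEPTH_MASK; }

/* mirrors riscv_v_context_nesting_start(): latch dirty at depth 0, then descend */
static void nesting_start(bool vs_dirty)
{
	if (!(flags & RISCV_PREEMPT_V))
		return;
	if (depth() == 0 && vs_dirty)
		flags |= RISCV_PREEMPT_V_DIRTY;
	flags += RISCV_V_CTX_UNIT_DEPTH;
}

/* mirrors riscv_v_context_nesting_end(): restore only when back at depth 0 */
static void nesting_end(void)
{
	if (!(flags & RISCV_PREEMPT_V))
		return;
	flags -= RISCV_V_CTX_UNIT_DEPTH;
	if (depth() == 0) {
		if (flags & RISCV_PREEMPT_V_NEED_RESTORE)
			printf("restore kernel_vstate here\n");
		flags &= ~(RISCV_PREEMPT_V_DIRTY | RISCV_PREEMPT_V_NEED_RESTORE);
	}
}

int main(void)
{
	flags = RISCV_PREEMPT_V;             /* kernel_vector_begin() started preempt_v */
	nesting_start(true);                 /* trap arrives while V is dirty */
	flags |= RISCV_PREEMPT_V_NEED_RESTORE; /* a nested V user would save and set this */
	printf("in trap: flags=%08x (depth=1, dirty latched)\n", (unsigned)flags);
	nesting_end();                       /* back to the owner context */
	printf("after trap: flags=%08x\n", (unsigned)flags);
	return 0;
}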
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 3c5ba05e8a2d..0a03d72706b5 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -533,6 +533,20 @@ config RISCV_ISA_V_UCOPY_THRESHOLD
 	  Prefer using vectorized copy_to_user()/copy_from_user() when the
 	  workload size exceeds this value.
 
+config RISCV_ISA_V_PREEMPTIVE
+	bool "Run kernel-mode Vector with kernel preemption"
+	depends on PREEMPTION
+	depends on RISCV_ISA_V
+	default y
+	help
+	  Usually, in-kernel SIMD routines are run with preemption disabled.
+	  Functions which invoke long-running SIMD thus must yield the core's
+	  vector unit to prevent blocking other tasks for too long.
+
+	  This config allows the kernel to run SIMD without explicitly
+	  disabling preemption. Enabling this config will result in higher
+	  memory consumption due to the allocation of per-task kernel Vector context.
+
 config TOOLCHAIN_HAS_ZBB
 	bool
 	default y
diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h
index be438932f321..cd627ec289f1 100644
--- a/arch/riscv/include/asm/asm-prototypes.h
+++ b/arch/riscv/include/asm/asm-prototypes.h
@@ -30,6 +30,11 @@ void xor_regs_5_(unsigned long bytes, unsigned long *__restrict p1,
 		const unsigned long *__restrict p4,
 		const unsigned long *__restrict p5);
 
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+asmlinkage void riscv_v_context_nesting_start(struct pt_regs *regs);
+asmlinkage void riscv_v_context_nesting_end(struct pt_regs *regs);
+#endif /* CONFIG_RISCV_ISA_V_PREEMPTIVE */
+
 #endif /* CONFIG_RISCV_ISA_V */
 
 #define DECLARE_DO_ERROR_INFO(name)	asmlinkage void name(struct pt_regs *regs)
diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index e76839789067..b503fd34728d 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -81,8 +81,35 @@ struct pt_regs;
  * activation of this state disables the preemption. On a non-RT kernel, it
  * also disable bh. Currently only 0 and 1 are valid value for this field.
  * Other values are reserved for future uses.
+ * - bit 8: is used for tracking preemptible kernel-mode Vector, when
+ *   RISCV_ISA_V_PREEMPTIVE is enabled. Calling kernel_vector_begin() does not
+ *   disable the preemption if the thread's kernel_vstate.datap is allocated.
+ *   Instead, the kernel sets this bit field. Then the trap entry/exit code
+ *   knows if we are entering/exiting the context that owns preempt_v.
+ *     - 0: the task is not using preempt_v
+ *     - 1: the task is actively using preempt_v. But whether the task owns
+ *          the preempt_v context is decided by bits in RISCV_V_CTX_DEPTH_MASK.
+ * - bits 16-23 are RISCV_V_CTX_DEPTH_MASK, used by the context tracking routine
+ *   when preempt_v starts:
+ *     - 0: the task is actively using, and owns the preempt_v context.
+ *     - non-zero: the task was using preempt_v, but then took a trap within.
+ *       Thus, the task does not own preempt_v. Any use of Vector will have to
+ *       save preempt_v, if dirty, and fall back to non-preemptible kernel-mode
+ *       Vector.
+ * - bit 30: the in-kernel preempt_v context is saved, and needs to be
+ *   restored when returning to the context that owns the preempt_v.
+ * - bit 31: the in-kernel preempt_v context is dirty, as signaled by the
+ *   trap entry code. Any context switch out of the current task needs to save
+ *   it to the task's in-kernel V context. Also, any traps nesting on top of
+ *   preempt_v requesting to use V need a save.
  */
-#define RISCV_KERNEL_MODE_V	0x1
+#define RISCV_V_CTX_DEPTH_MASK		0x00ff0000
+
+#define RISCV_V_CTX_UNIT_DEPTH		0x00010000
+#define RISCV_KERNEL_MODE_V		0x00000001
+#define RISCV_PREEMPT_V			0x00000100
+#define RISCV_PREEMPT_V_DIRTY		0x80000000
+#define RISCV_PREEMPT_V_NEED_RESTORE	0x40000000
 
 /* CPU-specific state of a task */
 struct thread_struct {
@@ -96,6 +123,7 @@ struct thread_struct {
 	u32 vstate_ctrl;
 	struct __riscv_v_ext_state vstate;
 	unsigned long align_ctl;
+	struct __riscv_v_ext_state kernel_vstate;
 };
 
 /* Whitelist the fstate from the task_struct for hardened usercopy */
diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
index 4d699e16c9a9..54efbf523d49 100644
--- a/arch/riscv/include/asm/simd.h
+++ b/arch/riscv/include/asm/simd.h
@@ -12,6 +12,7 @@
 #include <linux/percpu.h>
 #include <linux/preempt.h>
 #include <linux/types.h>
+#include <linux/thread_info.h>
 
 #include <asm/vector.h>
 
@@ -28,12 +29,27 @@ static __must_check inline bool may_use_simd(void)
 {
 	/*
 	 * RISCV_KERNEL_MODE_V is only set while preemption is disabled,
 	 * and is clear whenever preemption is enabled.
-	 *
-	 * Kernel-mode Vector temporarily disables bh. So we must not return
-	 * true on irq_disabled(). Otherwise we would fail the lockdep check
-	 * calling local_bh_enable()
 	 */
-	return !in_hardirq() && !in_nmi() && !irqs_disabled() && !(riscv_v_flags() & RISCV_KERNEL_MODE_V);
+	if (in_hardirq() || in_nmi())
+		return false;
+
+	/*
+	 * Nesting is achieved in preempt_v by spreading the control for
+	 * preemptible and non-preemptible kernel-mode Vector into two fields.
+	 * Always try to match with preempt_v if the kernel V-context exists. Then,
+	 * fall back to checking non-preempt_v if nesting happens, or if the config
+	 * is not set.
+	 */
+	if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) && current->thread.kernel_vstate.datap) {
+		if (!riscv_preempt_v_started(current))
+			return true;
+	}
+	/*
+	 * Non-preemptible kernel-mode Vector temporarily disables bh. So we
+	 * must not return true on irq_disabled(). Otherwise we would fail the
+	 * lockdep check calling local_bh_enable()
+	 */
+	return !irqs_disabled() && !(riscv_v_flags() & RISCV_KERNEL_MODE_V);
 }
 
 #else /* ! CONFIG_RISCV_ISA_V */
diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index 7b316050f24f..d69844906d51 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -28,10 +28,11 @@ void get_cpu_vector_context(void);
 void put_cpu_vector_context(void);
 void riscv_v_thread_free(struct task_struct *tsk);
 void __init riscv_v_setup_ctx_cache(void);
+void riscv_v_thread_alloc(struct task_struct *tsk);
 
 static inline u32 riscv_v_flags(void)
 {
-	return current->thread.riscv_v_flags;
+	return READ_ONCE(current->thread.riscv_v_flags);
 }
 
 static __always_inline bool has_vector(void)
@@ -200,14 +201,72 @@ static inline void riscv_v_vstate_set_restore(struct task_struct *task,
 	}
 }
 
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+static inline bool riscv_preempt_v_dirty(struct task_struct *task)
+{
+	u32 val = READ_ONCE(task->thread.riscv_v_flags);
+
+	return !!(val & RISCV_PREEMPT_V_DIRTY);
+}
+
+static inline bool riscv_preempt_v_restore(struct task_struct *task)
+{
+	u32 val = READ_ONCE(task->thread.riscv_v_flags);
+
+	return !!(val & RISCV_PREEMPT_V_NEED_RESTORE);
+}
+
+static inline void riscv_preempt_v_clear_dirty(struct task_struct *task)
+{
+	barrier();
+	task->thread.riscv_v_flags &= ~RISCV_PREEMPT_V_DIRTY;
+}
+
+static inline void riscv_preempt_v_set_restore(struct task_struct *task)
+{
+	barrier();
+	task->thread.riscv_v_flags |= RISCV_PREEMPT_V_NEED_RESTORE;
+}
+
+static inline bool riscv_preempt_v_started(struct task_struct *task)
+{
+	return !!(READ_ONCE(task->thread.riscv_v_flags) & RISCV_PREEMPT_V);
+}
+
+static inline bool riscv_preempt_v_need_save(struct task_struct *task)
+{
+	u32 val = READ_ONCE(task->thread.riscv_v_flags);
+
+	/* preempt_v has started and the state is dirty */
+	return !!((val & RISCV_PREEMPT_V_DIRTY) && (val & RISCV_PREEMPT_V));
+}
+#else /* !CONFIG_RISCV_ISA_V_PREEMPTIVE */
+static inline bool riscv_preempt_v_dirty(struct task_struct *task) { return false; }
+static inline bool riscv_preempt_v_restore(struct task_struct *task) { return false; }
+static inline bool riscv_preempt_v_started(struct task_struct *task) { return false; }
+static inline bool riscv_preempt_v_need_save(struct task_struct *task) { return false; }
+#define riscv_preempt_v_clear_dirty(tsk)	do {} while (0)
+#define riscv_preempt_v_set_restore(tsk)	do {} while (0)
+#endif /* CONFIG_RISCV_ISA_V_PREEMPTIVE */
+
 static inline void __switch_to_vector(struct task_struct *prev,
 				      struct task_struct *next)
 {
 	struct pt_regs *regs;
 
-	regs = task_pt_regs(prev);
-	riscv_v_vstate_save(&prev->thread.vstate, regs);
-	riscv_v_vstate_set_restore(next, task_pt_regs(next));
+	if (riscv_preempt_v_need_save(prev)) {
+		__riscv_v_vstate_save(&prev->thread.kernel_vstate,
+				      prev->thread.kernel_vstate.datap);
+		riscv_preempt_v_clear_dirty(prev);
+	} else {
+		regs = task_pt_regs(prev);
+		riscv_v_vstate_save(&prev->thread.vstate, regs);
+	}
+
+	if (riscv_preempt_v_started(next))
+		riscv_preempt_v_set_restore(next);
+	else
+		riscv_v_vstate_set_restore(next, task_pt_regs(next));
 }
 
 void riscv_v_vstate_ctrl_init(struct task_struct *tsk);
@@ -231,6 +290,7 @@ static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; }
 #define riscv_v_vstate_on(regs)		do {} while (0)
 #define riscv_v_thread_free(tsk)	do {} while (0)
 #define riscv_v_setup_ctx_cache()	do {} while (0)
+#define riscv_v_thread_alloc(tsk)	do {} while (0)
 
 #endif /* CONFIG_RISCV_ISA_V */
 
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 54ca4564a926..9d1a305d5508 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -83,6 +83,10 @@ SYM_CODE_START(handle_exception)
 	/* Load the kernel shadow call stack pointer if coming from userspace */
 	scs_load_current_if_task_changed s5
 
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+	move a0, sp
+	call riscv_v_context_nesting_start
+#endif
 	move a0, sp /* pt_regs */
 	la ra, ret_from_exception
 
@@ -138,6 +142,10 @@ SYM_CODE_START_NOALIGN(ret_from_exception)
 	 */
 	csrw CSR_SCRATCH, tp
 1:
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+	move a0, sp
+	call riscv_v_context_nesting_end
+#endif
 	REG_L a0, PT_STATUS(sp)
 	/*
 	 * The current load reservation is effectively part of the processor's
diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
index 241a8f834e1c..22580d36fd08 100644
--- a/arch/riscv/kernel/kernel_mode_vector.c
+++ b/arch/riscv/kernel/kernel_mode_vector.c
@@ -14,10 +14,13 @@
 #include <asm/vector.h>
 #include <asm/switch_to.h>
 #include <asm/simd.h>
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+#include <asm/asm-prototypes.h>
+#endif
 
 static inline void riscv_v_flags_set(u32 flags)
 {
-	current->thread.riscv_v_flags = flags;
+	WRITE_ONCE(current->thread.riscv_v_flags, flags);
 }
 
 static inline void riscv_v_start(u32 flags)
@@ -27,12 +30,14 @@ static inline void riscv_v_start(u32 flags)
 	orig = riscv_v_flags();
 	BUG_ON((orig & flags) != 0);
 	riscv_v_flags_set(orig | flags);
+	barrier();
 }
 
 static inline void riscv_v_stop(u32 flags)
 {
 	int orig;
 
+	barrier();
 	orig = riscv_v_flags();
 	BUG_ON((orig & flags) == 0);
 	riscv_v_flags_set(orig & ~flags);
@@ -75,6 +80,121 @@ void put_cpu_vector_context(void)
 	preempt_enable();
 }
 
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+static inline void riscv_preempt_v_set_dirty(void)
+{
+	current->thread.riscv_v_flags |= RISCV_PREEMPT_V_DIRTY;
+}
+
+static inline void riscv_preempt_v_reset_flags(void)
+{
+	current->thread.riscv_v_flags &= ~(RISCV_PREEMPT_V_DIRTY | RISCV_PREEMPT_V_NEED_RESTORE);
+}
+
+static __always_inline volatile u32 *riscv_v_flags_ptr(void)
+{
+	return &current->thread.riscv_v_flags;
+}
+
+static inline void riscv_v_ctx_depth_inc(void)
+{
+	*riscv_v_flags_ptr() += RISCV_V_CTX_UNIT_DEPTH;
+	barrier();
+}
+
+static inline void riscv_v_ctx_depth_dec(void)
+{
+	barrier();
+	*riscv_v_flags_ptr() -= RISCV_V_CTX_UNIT_DEPTH;
+}
+
+static inline u32 riscv_v_ctx_get_depth(void)
+{
+	return riscv_v_flags() & RISCV_V_CTX_DEPTH_MASK;
+}
+
+static int riscv_v_stop_kernel_context(void)
+{
+	if (riscv_v_ctx_get_depth() != 0 || !riscv_preempt_v_started(current))
+		return 1;
+
+	riscv_v_stop(RISCV_PREEMPT_V);
+	return 0;
+}
+
+static int riscv_v_start_kernel_context(bool *is_nested)
+{
+	struct __riscv_v_ext_state *kvstate, *uvstate;
+
+	kvstate = &current->thread.kernel_vstate;
+	if (!kvstate->datap)
+		return -ENOENT;
+
+	if (riscv_preempt_v_started(current)) {
+		WARN_ON(riscv_v_ctx_get_depth() == 0);
+		*is_nested = true;
+		if (riscv_preempt_v_dirty(current)) {
+			get_cpu_vector_context();
+			__riscv_v_vstate_save(kvstate, kvstate->datap);
+			riscv_preempt_v_clear_dirty(current);
+			put_cpu_vector_context();
+		}
+		get_cpu_vector_context();
+		riscv_preempt_v_set_restore(current);
+		return 0;
+	}
+
+	riscv_v_start(RISCV_PREEMPT_V);
+	if ((task_pt_regs(current)->status & SR_VS) == SR_VS_DIRTY) {
+		uvstate = &current->thread.vstate;
+		riscv_preempt_v_set_dirty();
+		__riscv_v_vstate_save(uvstate, uvstate->datap);
+		__riscv_v_vstate_clean(task_pt_regs(current));
+		riscv_preempt_v_clear_dirty(current);
+	}
+	return 0;
+}
+
+/* low-level V context handling code, called with irq disabled */
+asmlinkage void riscv_v_context_nesting_start(struct pt_regs *regs)
+{
+	int depth;
+
+	if (!riscv_preempt_v_started(current))
+		return;
+
+	depth = riscv_v_ctx_get_depth();
+	if (depth == 0 && (regs->status & SR_VS) == SR_VS_DIRTY)
+		riscv_preempt_v_set_dirty();
+
+	riscv_v_ctx_depth_inc();
+}
+
+asmlinkage void riscv_v_context_nesting_end(struct pt_regs *regs)
+{
+	struct __riscv_v_ext_state *vstate = &current->thread.kernel_vstate;
+	u32 depth;
+
+	WARN_ON(!irqs_disabled());
+
+	if (!riscv_preempt_v_started(current))
+		return;
+
+	riscv_v_ctx_depth_dec();
+	depth = riscv_v_ctx_get_depth();
+	if (depth == 0) {
+		if (riscv_preempt_v_restore(current)) {
+			__riscv_v_vstate_restore(vstate, vstate->datap);
+			__riscv_v_vstate_clean(regs);
+		}
+		riscv_preempt_v_reset_flags();
+	}
+}
+#else
+#define riscv_v_start_kernel_context(nested)	(-ENOENT)
+#define riscv_v_stop_kernel_context()		(-ENOENT)
+#endif /* CONFIG_RISCV_ISA_V_PREEMPTIVE */
+
 /*
  * kernel_vector_begin(): obtain the CPU vector registers for use by the calling
  * context
@@ -90,14 +210,20 @@ void put_cpu_vector_context(void)
  */
 void kernel_vector_begin(void)
 {
+	bool nested = false;
+
 	if (WARN_ON(!has_vector()))
 		return;
 
 	BUG_ON(!may_use_simd());
 
-	get_cpu_vector_context();
+	if (riscv_v_start_kernel_context(&nested)) {
+		get_cpu_vector_context();
+		riscv_v_vstate_save(&current->thread.vstate, task_pt_regs(current));
+	}
 
-	riscv_v_vstate_save(&current->thread.vstate, task_pt_regs(current));
+	if (!nested)
+		riscv_v_vstate_set_restore(current, task_pt_regs(current));
 
 	riscv_v_enable();
 }
@@ -117,10 +243,9 @@ void kernel_vector_end(void)
 	if (WARN_ON(!has_vector()))
 		return;
 
-	riscv_v_vstate_set_restore(current, task_pt_regs(current));
-
 	riscv_v_disable();
 
-	put_cpu_vector_context();
+	if (riscv_v_stop_kernel_context())
+		put_cpu_vector_context();
 }
 EXPORT_SYMBOL_GPL(kernel_vector_end);
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 862d59c3872e..92922dbd5b5c 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -188,6 +188,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	*dst = *src;
 	/* clear entire V context, including datap for a new task */
 	memset(&dst->thread.vstate, 0, sizeof(struct __riscv_v_ext_state));
+	memset(&dst->thread.kernel_vstate, 0, sizeof(struct __riscv_v_ext_state));
 	clear_tsk_thread_flag(dst, TIF_RISCV_V_DEFER_RESTORE);
 
 	return 0;
@@ -224,6 +225,8 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 		p->thread.s[0] = 0;
 	}
 	p->thread.riscv_v_flags = 0;
+	if (has_vector())
+		riscv_v_thread_alloc(p);
 	p->thread.ra = (unsigned long)ret_from_fork;
 	p->thread.sp = (unsigned long)childregs; /* kernel sp */
 	return 0;
diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
index 1fe140e34557..f9769703fd39 100644
--- a/arch/riscv/kernel/vector.c
+++ b/arch/riscv/kernel/vector.c
@@ -22,6 +22,9 @@
 static bool riscv_v_implicit_uacc = IS_ENABLED(CONFIG_RISCV_ISA_V_DEFAULT_ENABLE);
 static struct kmem_cache *riscv_v_user_cachep;
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+static struct kmem_cache *riscv_v_kernel_cachep;
+#endif
 
 unsigned long riscv_v_vsize __read_mostly;
 EXPORT_SYMBOL_GPL(riscv_v_vsize);
@@ -53,6 +56,11 @@ void __init riscv_v_setup_ctx_cache(void)
 	riscv_v_user_cachep = kmem_cache_create_usercopy("riscv_vector_ctx",
 							 riscv_v_vsize, 16, SLAB_PANIC,
 							 0, riscv_v_vsize, NULL);
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+	riscv_v_kernel_cachep = kmem_cache_create("riscv_vector_kctx",
+						  riscv_v_vsize, 16,
+						  SLAB_PANIC, NULL);
+#endif
 }
 
 static bool insn_is_vector(u32 insn_buf)
@@ -88,24 +96,35 @@ static bool insn_is_vector(u32 insn_buf)
 	return false;
 }
 
-static int riscv_v_thread_zalloc(void)
+static int riscv_v_thread_zalloc(struct kmem_cache *cache,
+				 struct __riscv_v_ext_state *ctx)
 {
 	void *datap;
 
-	datap = kmem_cache_zalloc(riscv_v_user_cachep, GFP_KERNEL);
+	datap = kmem_cache_zalloc(cache, GFP_KERNEL);
 	if (!datap)
 		return -ENOMEM;
 
-	current->thread.vstate.datap = datap;
-	memset(&current->thread.vstate, 0, offsetof(struct __riscv_v_ext_state,
-						    datap));
+	ctx->datap = datap;
+	memset(ctx, 0, offsetof(struct __riscv_v_ext_state, datap));
 	return 0;
 }
 
+void riscv_v_thread_alloc(struct task_struct *tsk)
+{
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+	riscv_v_thread_zalloc(riscv_v_kernel_cachep, &tsk->thread.kernel_vstate);
+#endif
+}
+
 void riscv_v_thread_free(struct task_struct *tsk)
 {
 	if (tsk->thread.vstate.datap)
 		kmem_cache_free(riscv_v_user_cachep, tsk->thread.vstate.datap);
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+	if (tsk->thread.kernel_vstate.datap)
+		kmem_cache_free(riscv_v_kernel_cachep, tsk->thread.kernel_vstate.datap);
+#endif
 }
 
 #define VSTATE_CTRL_GET_CUR(x) ((x) & PR_RISCV_V_VSTATE_CTRL_CUR_MASK)
@@ -177,7 +196,7 @@ bool riscv_v_first_use_handler(struct pt_regs *regs)
 	 * context where VS has been off. So, try to allocate the user's V
 	 * context and resume execution.
 	 */
-	if (riscv_v_thread_zalloc()) {
+	if (riscv_v_thread_zalloc(riscv_v_user_cachep, &current->thread.vstate)) {
 		force_sig(SIGBUS);
 		return true;
 	}
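Taken together, the diff leaves the calling convention for kernel-mode
Vector unchanged; only the begin/end pair behaves differently under
RISCV_ISA_V_PREEMPTIVE. A rough sketch of a caller follows, where
do_bulk_op(), vector_accel_fn(), and scalar_fallback_fn() are
hypothetical names; only kernel_vector_begin()/kernel_vector_end(),
may_use_simd(), and has_vector() come from the tree.

#include <asm/simd.h>
#include <asm/vector.h>

/*
 * Hypothetical consumer of kernel-mode Vector. With
 * CONFIG_RISCV_ISA_V_PREEMPTIVE=y and a preallocated kernel_vstate,
 * the region between begin/end may now be preempted; the trap
 * entry/exit hooks and __switch_to_vector() handle the save/restore.
 */
static void do_bulk_op(void *dst, const void *src, size_t len)
{
	if (!has_vector() || !may_use_simd()) {
		scalar_fallback_fn(dst, src, len);	/* hypothetical */
		return;
	}

	kernel_vector_begin();
	vector_accel_fn(dst, src, len);	/* hypothetical; runs with VS enabled */
	kernel_vector_end();
}

And a standalone model, again not kernel code, of the
__switch_to_vector() policy introduced above: a preempt_v task whose
kernel V context is dirty saves kernel_vstate at context switch, while
everything else keeps the existing user-vstate path. The struct task
type and the printf tracing are stand-ins for illustration.

#include <stdio.h>
#include <stdint.h>

#define RISCV_PREEMPT_V        0x00000100u
#define RISCV_PREEMPT_V_DIRTY  0x80000000u

struct task {
	const char *name;
	uint32_t riscv_v_flags;
};

static void switch_to_vector(struct task *prev, struct task *next)
{
	uint32_t f = prev->riscv_v_flags;

	/* mirrors riscv_preempt_v_need_save(): started and dirty */
	if ((f & RISCV_PREEMPT_V) && (f & RISCV_PREEMPT_V_DIRTY)) {
		printf("%s: save kernel_vstate, clear dirty\n", prev->name);
		prev->riscv_v_flags &= ~RISCV_PREEMPT_V_DIRTY;
	} else {
		printf("%s: save user vstate (if SR_VS is dirty)\n", prev->name);
	}

	if (next->riscv_v_flags & RISCV_PREEMPT_V)
		printf("%s: mark kernel_vstate for restore\n", next->name);
	else
		printf("%s: defer user vstate restore\n", next->name);
}

int main(void)
{
	struct task a = { "A", RISCV_PREEMPT_V | RISCV_PREEMPT_V_DIRTY };
	struct task b = { "B", 0 };

	switch_to_vector(&a, &b);	/* A preempted mid kernel-V */
	switch_to_vector(&b, &a);	/* coming back to A */
	return 0;
}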
Add kernel_vstate to keep track of kernel-mode Vector registers when a
trap-introduced context switch happens. Also, provide riscv_v_flags to
let the context save/restore routine track context status. Context
tracking happens whenever the core starts its in-kernel Vector
execution. An active (dirty) kernel task's V context will be saved to
memory whenever a trap-introduced context switch happens, or when a
softirq that nests on top of it uses Vector. Context restoring happens
when the execution transfers back to the original kernel context where
it first enabled preempt_v.

Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an
option to disable preemptible kernel-mode Vector at build time. Users
with constrained memory may want to disable this config, as preemptible
kernel-mode Vector needs extra space to track each thread's kernel-mode
V context. Users may also want to disable it if all kernel-mode Vector
code is time-sensitive and cannot tolerate context-switch overhead.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
---
Changelog v9:
 - Separate context depth tracking out to an individual bitmap.
 - Use bitwise ops to mask on/off the preempt_v status and drop unused masks
 - Do not turn off bh on the success path of preempt_v (to make preempt_v
   available for task context that turns off irq).
 - Remove and test lockdep assertion.
Changelog v8:
 - fix -Wmissing-prototypes for functions with asmlinkage
Changelog v6:
 - re-write patch to handle context nesting for softirqs
 - drop thread flag and track context instead in riscv_v_flags
 - refine some asm code and constrain it into C functions
 - preallocate v context for preempt_v
 - Return non-zero in riscv_v_start_kernel_context with non-preemptible
   kernel-mode Vector
Changelog v4:
 - dropped from v4
Changelog v3:
 - Guard vstate_save with {get,set}_cpu_vector_context
 - Add comments on prevention of nesting V contexts
 - remove warnings in context switch when trap's reg is not present (Conor)
 - refactor code (Björn)
Changelog v2:
 - fix build fail when compiling without RISCV_ISA_V (Conor)
 - 's/TIF_RISCV_V_KMV/TIF_RISCV_V_KERNEL_MODE' and add comment (Conor)
 - merge Kconfig patch into this one (Conor).
 - 's/CONFIG_RISCV_ISA_V_PREEMPTIVE_KMV/CONFIG_RISCV_ISA_V_PREEMPTIVE/' (Conor)
 - fix some typos (Conor)
 - enclose assembly with RISCV_ISA_V_PREEMPTIVE.
 - change riscv_v_vstate_ctrl_config_kmv() to kernel_vector_allow_preemption()
   for better understanding. (Conor)
 - 's/riscv_v_kmv_preempitble/kernel_vector_preemptible/'
---
 arch/riscv/Kconfig                      |  14 +++
 arch/riscv/include/asm/asm-prototypes.h |   5 +
 arch/riscv/include/asm/processor.h      |  30 +++++-
 arch/riscv/include/asm/simd.h           |  26 ++++-
 arch/riscv/include/asm/vector.h         |  68 +++++++++++-
 arch/riscv/kernel/entry.S               |   8 ++
 arch/riscv/kernel/kernel_mode_vector.c  | 137 ++++++++++++++++++++++--
 arch/riscv/kernel/process.c             |   3 +
 arch/riscv/kernel/vector.c              |  31 ++++--
 9 files changed, 300 insertions(+), 22 deletions(-)
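As a rough sense of the "higher memory consumption" the Kconfig help
text warns about: the preallocated per-thread buffer is riscv_v_vsize
bytes, which the kernel computes as 32 vector registers of vlenb bytes
each. A back-of-envelope sketch follows; the VLEN value here is an
assumption for illustration, not something fixed by the patch.

#include <stdio.h>

int main(void)
{
	unsigned long vlenb = 16;	/* assumed: VLEN=128 bits, so vlenb=16 bytes */
	unsigned long riscv_v_vsize = 32 * vlenb;	/* matches csr_read(CSR_VLENB) * 32 */

	printf("per-task kernel V context: %lu bytes\n", riscv_v_vsize);
	printf("1000 tasks: ~%lu KiB\n", 1000 * riscv_v_vsize / 1024);
	return 0;
}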