diff mbox series

[v2,5/5] riscv: vector: allow kernel-mode Vector with preemption

Message ID 20230721112855.1006-6-andy.chiu@sifive.com (mailing list archive)
State Superseded
Headers show
Series riscv: support kernel-mode Vector | expand

Checks

Context Check Description
conchuod/cover_letter success Series has a cover letter
conchuod/tree_selection success Guessed tree name to be for-next at HEAD 471aba2e4760
conchuod/fixes_present success Fixes tag not required for -next series
conchuod/maintainers_pattern success MAINTAINERS pattern errors before the patch: 4 and now 4
conchuod/verify_signedoff success Signed-off-by tag matches author and committer
conchuod/kdoc success Errors and warnings before: 0 this patch: 0
conchuod/build_rv64_clang_allmodconfig success Errors and warnings before: 2810 this patch: 2810
conchuod/module_param success Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig success Errors and warnings before: 15878 this patch: 15877
conchuod/build_rv32_defconfig success Build OK
conchuod/dtb_warn_rv64 success Errors and warnings before: 3 this patch: 3
conchuod/header_inline success No static functions without inline keyword in header files
conchuod/checkpatch warning CHECK: Prefer using the BIT macro
conchuod/build_rv64_nommu_k210_defconfig success Build OK
conchuod/verify_fixes success No Fixes tag
conchuod/build_rv64_nommu_virt_defconfig success Build OK

Commit Message

Andy Chiu July 21, 2023, 11:28 a.m. UTC
Add kernel_vstate to keep track of kernel-mode Vector registers when
trap introduced context switch happens. Also, provide trap_pt_regs to
let context save/restore routine reference status.VS at which the trap
takes place. The thread flag TIF_RISCV_V_KERNEL_MODE indicates whether
a task is running in kernel-mode Vector with preemption 'ON'. So context
switch routines know and would save V-regs to kernel_vstate and restore
V-regs immediately from kernel_vstate if the bit is set.

Apart from a task's preemption status, the capability of
running preemptive kernel-mode Vector is jointly controlled by the
RISCV_V_VSTATE_CTRL_PREEMPTIBLE mask in the task's
thread.vstate_ctrl. This bit is masked whenever a trap takes place in
kernel mode while executing preemptive Vector code.

Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an
option to disable preemptible kernel-mode Vector at build time. Users
with constraint memory may want to disable this config as preemptible
kernel-mode Vector needs extra space for tracking per thread's
kernel-mode V context. Or, users might as well want to disable it if all
kernel-mode Vector code is time sensitive and cannot tolerate context
swicth overhead.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
---
Changelog v2:
 - fix build fail when compiling without RISCV_ISA_V (Conor)
 - 's/TIF_RISCV_V_KMV/TIF_RISCV_V_KERNEL_MODE' and add comment (Conor)
 - merge Kconfig patch into this oine (Conor).
 - 's/CONFIG_RISCV_ISA_V_PREEMPTIVE_KMV/CONFIG_RISCV_ISA_V_PREEMPTIVE/'
   (Conor)
 - fix some typos (Conor)
 - enclose assembly with RISCV_ISA_V_PREEMPTIVE.
 - change riscv_v_vstate_ctrl_config_kmv() to
   kernel_vector_allow_preemption() for better understanding. (Conor)
 - 's/riscv_v_kmv_preempitble/kernel_vector_preemptible/'
---
 arch/riscv/Kconfig                     | 10 +++++
 arch/riscv/include/asm/processor.h     |  2 +
 arch/riscv/include/asm/simd.h          |  4 +-
 arch/riscv/include/asm/thread_info.h   |  4 ++
 arch/riscv/include/asm/vector.h        | 27 +++++++++++--
 arch/riscv/kernel/asm-offsets.c        |  2 +
 arch/riscv/kernel/entry.S              | 45 ++++++++++++++++++++++
 arch/riscv/kernel/kernel_mode_vector.c | 53 ++++++++++++++++++++++++--
 arch/riscv/kernel/process.c            |  8 +++-
 arch/riscv/kernel/vector.c             |  3 +-
 10 files changed, 148 insertions(+), 10 deletions(-)

Comments

Conor Dooley July 24, 2023, 12:18 p.m. UTC | #1
Hey Andy,

On Fri, Jul 21, 2023 at 11:28:55AM +0000, Andy Chiu wrote:
> Add kernel_vstate to keep track of kernel-mode Vector registers when
> trap introduced context switch happens. Also, provide trap_pt_regs to
> let context save/restore routine reference status.VS at which the trap
> takes place. The thread flag TIF_RISCV_V_KERNEL_MODE indicates whether
> a task is running in kernel-mode Vector with preemption 'ON'. So context
> switch routines know and would save V-regs to kernel_vstate and restore
> V-regs immediately from kernel_vstate if the bit is set.
> 
> Apart from a task's preemption status, the capability of
> running preemptive kernel-mode Vector is jointly controlled by the
> RISCV_V_VSTATE_CTRL_PREEMPTIBLE mask in the task's
> thread.vstate_ctrl. This bit is masked whenever a trap takes place in
> kernel mode while executing preemptive Vector code.
> 
> Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an
> option to disable preemptible kernel-mode Vector at build time. Users
> with constraint memory may want to disable this config as preemptible
> kernel-mode Vector needs extra space for tracking per thread's
> kernel-mode V context. Or, users might as well want to disable it if all
> kernel-mode Vector code is time sensitive and cannot tolerate context
> swicth overhead.
> 
> Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> ---
> Changelog v2:
>  - fix build fail when compiling without RISCV_ISA_V (Conor)
>  - 's/TIF_RISCV_V_KMV/TIF_RISCV_V_KERNEL_MODE' and add comment (Conor)
>  - merge Kconfig patch into this oine (Conor).
>  - 's/CONFIG_RISCV_ISA_V_PREEMPTIVE_KMV/CONFIG_RISCV_ISA_V_PREEMPTIVE/'
>    (Conor)
>  - fix some typos (Conor)
>  - enclose assembly with RISCV_ISA_V_PREEMPTIVE.
>  - change riscv_v_vstate_ctrl_config_kmv() to
>    kernel_vector_allow_preemption() for better understanding. (Conor)
>  - 's/riscv_v_kmv_preempitble/kernel_vector_preemptible/'
> ---
>  arch/riscv/Kconfig                     | 10 +++++
>  arch/riscv/include/asm/processor.h     |  2 +
>  arch/riscv/include/asm/simd.h          |  4 +-
>  arch/riscv/include/asm/thread_info.h   |  4 ++
>  arch/riscv/include/asm/vector.h        | 27 +++++++++++--
>  arch/riscv/kernel/asm-offsets.c        |  2 +
>  arch/riscv/kernel/entry.S              | 45 ++++++++++++++++++++++
>  arch/riscv/kernel/kernel_mode_vector.c | 53 ++++++++++++++++++++++++--
>  arch/riscv/kernel/process.c            |  8 +++-
>  arch/riscv/kernel/vector.c             |  3 +-
>  10 files changed, 148 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 4c07b9189c86..0622951b15dd 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -507,6 +507,16 @@ config RISCV_ISA_V_DEFAULT_ENABLE
>  
>  	  If you don't know what to do here, say Y.
>  
> +config RISCV_ISA_V_PREEMPTIVE
> +	bool "Run kernel-mode Vector with kernel preemption"
> +	depends on PREEMPTION
> +	depends on RISCV_ISA_V
> +	default y
> +	help
> +	  Ordinarily the kernel disables preemption before running in-kernel
> +	  Vector code. This config frees the kernel from disabling preemption
> +	  by adding memory on demand for tracking kernel's V-context.
> +
>  config TOOLCHAIN_HAS_ZBB
>  	bool
>  	default y
> diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
> index c950a8d9edef..497c0dd30b2a 100644
> --- a/arch/riscv/include/asm/processor.h
> +++ b/arch/riscv/include/asm/processor.h
> @@ -42,6 +42,8 @@ struct thread_struct {
>  	unsigned long bad_cause;
>  	unsigned long vstate_ctrl;
>  	struct __riscv_v_ext_state vstate;
> +	struct pt_regs *trap_pt_regs;
> +	struct __riscv_v_ext_state kernel_vstate;
>  };
>  
>  /* Whitelist the fstate from the task_struct for hardened usercopy */
> diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
> index ef70af78005d..a54a0ce58f4d 100644
> --- a/arch/riscv/include/asm/simd.h
> +++ b/arch/riscv/include/asm/simd.h
> @@ -12,6 +12,7 @@
>  #include <linux/percpu.h>
>  #include <linux/preempt.h>
>  #include <linux/types.h>
> +#include <linux/thread_info.h>
>  
>  #ifdef CONFIG_RISCV_ISA_V
>  
> @@ -35,7 +36,8 @@ static __must_check inline bool may_use_simd(void)
>  	 * where it is set.
>  	 */
>  	return !in_irq() && !irqs_disabled() && !in_nmi() &&
> -	       !this_cpu_read(vector_context_busy);
> +	       !this_cpu_read(vector_context_busy) &&
> +	       !test_thread_flag(TIF_RISCV_V_KERNEL_MODE);
>  }
>  
>  #else /* ! CONFIG_RISCV_ISA_V */
> diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
> index b182f2d03e25..8797d520e8ef 100644
> --- a/arch/riscv/include/asm/thread_info.h
> +++ b/arch/riscv/include/asm/thread_info.h
> @@ -94,6 +94,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
>  #define TIF_UPROBE		10	/* uprobe breakpoint or singlestep */
>  #define TIF_32BIT		11	/* compat-mode 32bit process */
>  #define TIF_RISCV_V_DEFER_RESTORE	12 /* restore Vector before returing to user */
> +#define TIF_RISCV_V_KERNEL_MODE			13 /* kernel-mode Vector run with preemption-on */
>  
>  #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
>  #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
> @@ -101,9 +102,12 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
>  #define _TIF_NOTIFY_SIGNAL	(1 << TIF_NOTIFY_SIGNAL)
>  #define _TIF_UPROBE		(1 << TIF_UPROBE)
>  #define _TIF_RISCV_V_DEFER_RESTORE	(1 << TIF_RISCV_V_DEFER_RESTORE)
> +#define _TIF_RISCV_V_KERNEL_MODE	(1 << TIF_RISCV_V_KERNEL_MODE)
>  
>  #define _TIF_WORK_MASK \
>  	(_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED | \
>  	 _TIF_NOTIFY_SIGNAL | _TIF_UPROBE)
>  
> +#define RISCV_V_VSTATE_CTRL_PREEMPTIBLE	0x20
> +
>  #endif /* _ASM_RISCV_THREAD_INFO_H */
> diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> index 3b783b317112..c2776851d50d 100644
> --- a/arch/riscv/include/asm/vector.h
> +++ b/arch/riscv/include/asm/vector.h
> @@ -195,9 +195,24 @@ static inline void __switch_to_vector(struct task_struct *prev,
>  {
>  	struct pt_regs *regs;
>  
> -	regs = task_pt_regs(prev);
> -	riscv_v_vstate_save(&prev->thread.vstate, regs);
> -	riscv_v_vstate_set_restore(next, task_pt_regs(next));
> +	if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) &&
> +	    test_tsk_thread_flag(prev, TIF_RISCV_V_KERNEL_MODE)) {
> +		regs = prev->thread.trap_pt_regs;
> +		WARN_ON(!regs);

In what cases could these WARN_ON()s be triggered?

> +		riscv_v_vstate_save(&prev->thread.kernel_vstate, regs);
> +	} else {
> +		regs = task_pt_regs(prev);
> +		riscv_v_vstate_save(&prev->thread.vstate, regs);
> +	}
> +
> +	if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) &&
> +	    test_tsk_thread_flag(next, TIF_RISCV_V_KERNEL_MODE)) {
> +		regs = next->thread.trap_pt_regs;
> +		WARN_ON(!regs);
> +		riscv_v_vstate_restore(&next->thread.kernel_vstate, regs);
> +	} else {
> +		riscv_v_vstate_set_restore(next, task_pt_regs(next));
> +	}
>  }


>  /*
>   * kernel_vector_begin(): obtain the CPU vector registers for use by the calling
>   * context
> @@ -70,11 +109,14 @@ void kernel_vector_begin(void)
>  
>  	riscv_v_vstate_save(&current->thread.vstate, task_pt_regs(current));
>  
> -	get_cpu_vector_context();
> +	if (!preemptible() || !kernel_vector_preemptible()) {
> +		get_cpu_vector_context();
> +	} else {
> +		if (riscv_v_start_kernel_context())
> +			get_cpu_vector_context();

What happens here if riscv_v_start_kernel_context() fails w/ -ENOMEM?

> +	}
>  
>  	riscv_v_enable();
> -
> -	return 0;
>  }
>  EXPORT_SYMBOL_GPL(kernel_vector_begin);
>  
> @@ -96,6 +138,9 @@  void kernel_vector_end(void)
>  
>  	riscv_v_disable();
>  
> -	put_cpu_vector_context();
> +	if (!test_thread_flag(TIF_RISCV_V_KERNEL_MODE))
> +		put_cpu_vector_context();
> +	else
> +		riscv_v_stop_kernel_context();
>  }

Probably just missing something here, but how come we don't need to call
put_cpu_vector_context() here. I'm just a little confused, since, in
kernel_vector_begin, get_cpu_vector_context() is called.
Andy Chiu July 24, 2023, 3:45 p.m. UTC | #2
On Mon, Jul 24, 2023 at 8:19 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> Hey Andy,
>
> On Fri, Jul 21, 2023 at 11:28:55AM +0000, Andy Chiu wrote:
> > Add kernel_vstate to keep track of kernel-mode Vector registers when
> > trap introduced context switch happens. Also, provide trap_pt_regs to
> > let context save/restore routine reference status.VS at which the trap
> > takes place. The thread flag TIF_RISCV_V_KERNEL_MODE indicates whether
> > a task is running in kernel-mode Vector with preemption 'ON'. So context
> > switch routines know and would save V-regs to kernel_vstate and restore
> > V-regs immediately from kernel_vstate if the bit is set.
> >
> > Apart from a task's preemption status, the capability of
> > running preemptive kernel-mode Vector is jointly controlled by the
> > RISCV_V_VSTATE_CTRL_PREEMPTIBLE mask in the task's
> > thread.vstate_ctrl. This bit is masked whenever a trap takes place in
> > kernel mode while executing preemptive Vector code.
> >
> > Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an
> > option to disable preemptible kernel-mode Vector at build time. Users
> > with constraint memory may want to disable this config as preemptible
> > kernel-mode Vector needs extra space for tracking per thread's
> > kernel-mode V context. Or, users might as well want to disable it if all
> > kernel-mode Vector code is time sensitive and cannot tolerate context
> > swicth overhead.
> >
> > Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> > ---
> > Changelog v2:
> >  - fix build fail when compiling without RISCV_ISA_V (Conor)
> >  - 's/TIF_RISCV_V_KMV/TIF_RISCV_V_KERNEL_MODE' and add comment (Conor)
> >  - merge Kconfig patch into this oine (Conor).
> >  - 's/CONFIG_RISCV_ISA_V_PREEMPTIVE_KMV/CONFIG_RISCV_ISA_V_PREEMPTIVE/'
> >    (Conor)
> >  - fix some typos (Conor)
> >  - enclose assembly with RISCV_ISA_V_PREEMPTIVE.
> >  - change riscv_v_vstate_ctrl_config_kmv() to
> >    kernel_vector_allow_preemption() for better understanding. (Conor)
> >  - 's/riscv_v_kmv_preempitble/kernel_vector_preemptible/'
> > ---
> >  arch/riscv/Kconfig                     | 10 +++++
> >  arch/riscv/include/asm/processor.h     |  2 +
> >  arch/riscv/include/asm/simd.h          |  4 +-
> >  arch/riscv/include/asm/thread_info.h   |  4 ++
> >  arch/riscv/include/asm/vector.h        | 27 +++++++++++--
> >  arch/riscv/kernel/asm-offsets.c        |  2 +
> >  arch/riscv/kernel/entry.S              | 45 ++++++++++++++++++++++
> >  arch/riscv/kernel/kernel_mode_vector.c | 53 ++++++++++++++++++++++++--
> >  arch/riscv/kernel/process.c            |  8 +++-
> >  arch/riscv/kernel/vector.c             |  3 +-
> >  10 files changed, 148 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index 4c07b9189c86..0622951b15dd 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -507,6 +507,16 @@ config RISCV_ISA_V_DEFAULT_ENABLE
> >
> >         If you don't know what to do here, say Y.
> >
> > +config RISCV_ISA_V_PREEMPTIVE
> > +     bool "Run kernel-mode Vector with kernel preemption"
> > +     depends on PREEMPTION
> > +     depends on RISCV_ISA_V
> > +     default y
> > +     help
> > +       Ordinarily the kernel disables preemption before running in-kernel
> > +       Vector code. This config frees the kernel from disabling preemption
> > +       by adding memory on demand for tracking kernel's V-context.
> > +
> >  config TOOLCHAIN_HAS_ZBB
> >       bool
> >       default y
> > diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
> > index c950a8d9edef..497c0dd30b2a 100644
> > --- a/arch/riscv/include/asm/processor.h
> > +++ b/arch/riscv/include/asm/processor.h
> > @@ -42,6 +42,8 @@ struct thread_struct {
> >       unsigned long bad_cause;
> >       unsigned long vstate_ctrl;
> >       struct __riscv_v_ext_state vstate;
> > +     struct pt_regs *trap_pt_regs;
> > +     struct __riscv_v_ext_state kernel_vstate;
> >  };
> >
> >  /* Whitelist the fstate from the task_struct for hardened usercopy */
> > diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
> > index ef70af78005d..a54a0ce58f4d 100644
> > --- a/arch/riscv/include/asm/simd.h
> > +++ b/arch/riscv/include/asm/simd.h
> > @@ -12,6 +12,7 @@
> >  #include <linux/percpu.h>
> >  #include <linux/preempt.h>
> >  #include <linux/types.h>
> > +#include <linux/thread_info.h>
> >
> >  #ifdef CONFIG_RISCV_ISA_V
> >
> > @@ -35,7 +36,8 @@ static __must_check inline bool may_use_simd(void)
> >        * where it is set.
> >        */
> >       return !in_irq() && !irqs_disabled() && !in_nmi() &&
> > -            !this_cpu_read(vector_context_busy);
> > +            !this_cpu_read(vector_context_busy) &&
> > +            !test_thread_flag(TIF_RISCV_V_KERNEL_MODE);
> >  }
> >
> >  #else /* ! CONFIG_RISCV_ISA_V */
> > diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
> > index b182f2d03e25..8797d520e8ef 100644
> > --- a/arch/riscv/include/asm/thread_info.h
> > +++ b/arch/riscv/include/asm/thread_info.h
> > @@ -94,6 +94,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
> >  #define TIF_UPROBE           10      /* uprobe breakpoint or singlestep */
> >  #define TIF_32BIT            11      /* compat-mode 32bit process */
> >  #define TIF_RISCV_V_DEFER_RESTORE    12 /* restore Vector before returing to user */
> > +#define TIF_RISCV_V_KERNEL_MODE                      13 /* kernel-mode Vector run with preemption-on */
> >
> >  #define _TIF_NOTIFY_RESUME   (1 << TIF_NOTIFY_RESUME)
> >  #define _TIF_SIGPENDING              (1 << TIF_SIGPENDING)
> > @@ -101,9 +102,12 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
> >  #define _TIF_NOTIFY_SIGNAL   (1 << TIF_NOTIFY_SIGNAL)
> >  #define _TIF_UPROBE          (1 << TIF_UPROBE)
> >  #define _TIF_RISCV_V_DEFER_RESTORE   (1 << TIF_RISCV_V_DEFER_RESTORE)
> > +#define _TIF_RISCV_V_KERNEL_MODE     (1 << TIF_RISCV_V_KERNEL_MODE)
> >
> >  #define _TIF_WORK_MASK \
> >       (_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED | \
> >        _TIF_NOTIFY_SIGNAL | _TIF_UPROBE)
> >
> > +#define RISCV_V_VSTATE_CTRL_PREEMPTIBLE      0x20
> > +
> >  #endif /* _ASM_RISCV_THREAD_INFO_H */
> > diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> > index 3b783b317112..c2776851d50d 100644
> > --- a/arch/riscv/include/asm/vector.h
> > +++ b/arch/riscv/include/asm/vector.h
> > @@ -195,9 +195,24 @@ static inline void __switch_to_vector(struct task_struct *prev,
> >  {
> >       struct pt_regs *regs;
> >
> > -     regs = task_pt_regs(prev);
> > -     riscv_v_vstate_save(&prev->thread.vstate, regs);
> > -     riscv_v_vstate_set_restore(next, task_pt_regs(next));
> > +     if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) &&
> > +         test_tsk_thread_flag(prev, TIF_RISCV_V_KERNEL_MODE)) {
> > +             regs = prev->thread.trap_pt_regs;
> > +             WARN_ON(!regs);
>
> In what cases could these WARN_ON()s be triggered?

It probably happens when a kernel thread calls schedule() in the
middle of preemptible kernel mode Vector code. Because the kernel sets
trap_pt_regs only at trap entries. For example

// assume preemption = "ON" and memory allocation
// for kernel_vstate.datap success
kernel_vector_begin();
// some vector code
...
schedule();
...
kernel_vector_end();

It is possible to support making scheduler calls in preemptible kernel
mode Vector though. We just need to save nothing (all V regs are
caller-save) and set an appropriate status.VS for the "next" process.

>
> > +             riscv_v_vstate_save(&prev->thread.kernel_vstate, regs);
> > +     } else {
> > +             regs = task_pt_regs(prev);
> > +             riscv_v_vstate_save(&prev->thread.vstate, regs);
> > +     }
> > +
> > +     if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) &&
> > +         test_tsk_thread_flag(next, TIF_RISCV_V_KERNEL_MODE)) {
> > +             regs = next->thread.trap_pt_regs;
> > +             WARN_ON(!regs);
> > +             riscv_v_vstate_restore(&next->thread.kernel_vstate, regs);
> > +     } else {
> > +             riscv_v_vstate_set_restore(next, task_pt_regs(next));
> > +     }
> >  }
>
>
> >  /*
> >   * kernel_vector_begin(): obtain the CPU vector registers for use by the calling
> >   * context
> > @@ -70,11 +109,14 @@ void kernel_vector_begin(void)
> >
> >       riscv_v_vstate_save(&current->thread.vstate, task_pt_regs(current));
> >
> > -     get_cpu_vector_context();
> > +     if (!preemptible() || !kernel_vector_preemptible()) {
> > +             get_cpu_vector_context();
> > +     } else {
> > +             if (riscv_v_start_kernel_context())
> > +                     get_cpu_vector_context();
>
> What happens here if riscv_v_start_kernel_context() fails w/ -ENOMEM?

Here we would fallback to starting kernel-mode Vector with preemption
disabled, by calling get_cpu_vector_context(). This makes calling
kernel_vector_begin() end up with 2 possible consequences, if the
caller runs in a preemptible context. One, which is the success path
of riscv_v_start_kernel_context(), will not alter the preemption
status but may increase memory usage if the context does not exist
yet.

However, if, on the other path, riscv_v_start_kernel_context() fails
with -ENOMEM, then the kernel-mode Vector code will be executed with
preemption "off".

Another way of solving this ambiguity is to add another function to
enable kernel mode Vector with preemption, and let the user check if
the allocation fails. So users who really want to run their Vector
code with preemption shall make this call. Otherwise, kernel mode
Vector runs with preemption off. However, I don't really want to add
it because I'd like to make the "upgrade" transparent to the caller.

>
> > +     }
> >
> >       riscv_v_enable();
> > -
> > -     return 0;
> >  }
> >  EXPORT_SYMBOL_GPL(kernel_vector_begin);
> >
> > @@ -96,6 +138,9 @@  void kernel_vector_end(void)
> >
> >       riscv_v_disable();
> >
> > -     put_cpu_vector_context();
> > +     if (!test_thread_flag(TIF_RISCV_V_KERNEL_MODE))
> > +             put_cpu_vector_context();
> > +     else
> > +             riscv_v_stop_kernel_context();
> >  }
>
> Probably just missing something here, but how come we don't need to call
> put_cpu_vector_context() here. I'm just a little confused, since, in
> kernel_vector_begin, get_cpu_vector_context() is called.

If "TIF_RISCV_V_KERNEL_MODE" is set, then we are running kernel-mode
Vector with preemption "ON". In such cases we don't need to call
put_cpu_vector_context(), which is the epilogue of kernel-mode Vector
with preemption "OFF". Instead, we should call
riscv_v_stop_kernel_context() to end the session.

Thanks,
Andy
Conor Dooley July 24, 2023, 4:26 p.m. UTC | #3
On Mon, Jul 24, 2023 at 11:45:47PM +0800, Andy Chiu wrote:
> On Mon, Jul 24, 2023 at 8:19 PM Conor Dooley <conor.dooley@microchip.com> wrote:
> > On Fri, Jul 21, 2023 at 11:28:55AM +0000, Andy Chiu wrote:
> > > Add kernel_vstate to keep track of kernel-mode Vector registers when
> > > trap introduced context switch happens. Also, provide trap_pt_regs to
> > > let context save/restore routine reference status.VS at which the trap
> > > takes place. The thread flag TIF_RISCV_V_KERNEL_MODE indicates whether
> > > a task is running in kernel-mode Vector with preemption 'ON'. So context
> > > switch routines know and would save V-regs to kernel_vstate and restore
> > > V-regs immediately from kernel_vstate if the bit is set.
> > >
> > > Apart from a task's preemption status, the capability of
> > > running preemptive kernel-mode Vector is jointly controlled by the
> > > RISCV_V_VSTATE_CTRL_PREEMPTIBLE mask in the task's
> > > thread.vstate_ctrl. This bit is masked whenever a trap takes place in
> > > kernel mode while executing preemptive Vector code.
> > >
> > > Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an
> > > option to disable preemptible kernel-mode Vector at build time. Users
> > > with constraint memory may want to disable this config as preemptible
> > > kernel-mode Vector needs extra space for tracking per thread's
> > > kernel-mode V context. Or, users might as well want to disable it if all
> > > kernel-mode Vector code is time sensitive and cannot tolerate context
> > > swicth overhead.
> > >
> > > Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> > > ---
> > > Changelog v2:
> > >  - fix build fail when compiling without RISCV_ISA_V (Conor)
> > >  - 's/TIF_RISCV_V_KMV/TIF_RISCV_V_KERNEL_MODE' and add comment (Conor)
> > >  - merge Kconfig patch into this oine (Conor).
> > >  - 's/CONFIG_RISCV_ISA_V_PREEMPTIVE_KMV/CONFIG_RISCV_ISA_V_PREEMPTIVE/'
> > >    (Conor)
> > >  - fix some typos (Conor)
> > >  - enclose assembly with RISCV_ISA_V_PREEMPTIVE.
> > >  - change riscv_v_vstate_ctrl_config_kmv() to
> > >    kernel_vector_allow_preemption() for better understanding. (Conor)
> > >  - 's/riscv_v_kmv_preempitble/kernel_vector_preemptible/'
> > > ---
> > >  arch/riscv/Kconfig                     | 10 +++++
> > >  arch/riscv/include/asm/processor.h     |  2 +
> > >  arch/riscv/include/asm/simd.h          |  4 +-
> > >  arch/riscv/include/asm/thread_info.h   |  4 ++
> > >  arch/riscv/include/asm/vector.h        | 27 +++++++++++--
> > >  arch/riscv/kernel/asm-offsets.c        |  2 +
> > >  arch/riscv/kernel/entry.S              | 45 ++++++++++++++++++++++
> > >  arch/riscv/kernel/kernel_mode_vector.c | 53 ++++++++++++++++++++++++--
> > >  arch/riscv/kernel/process.c            |  8 +++-
> > >  arch/riscv/kernel/vector.c             |  3 +-
> > >  10 files changed, 148 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > > index 4c07b9189c86..0622951b15dd 100644
> > > --- a/arch/riscv/Kconfig
> > > +++ b/arch/riscv/Kconfig
> > > @@ -507,6 +507,16 @@ config RISCV_ISA_V_DEFAULT_ENABLE
> > >
> > >         If you don't know what to do here, say Y.
> > >
> > > +config RISCV_ISA_V_PREEMPTIVE
> > > +     bool "Run kernel-mode Vector with kernel preemption"
> > > +     depends on PREEMPTION
> > > +     depends on RISCV_ISA_V
> > > +     default y
> > > +     help
> > > +       Ordinarily the kernel disables preemption before running in-kernel
> > > +       Vector code. This config frees the kernel from disabling preemption
> > > +       by adding memory on demand for tracking kernel's V-context.
> > > +
> > >  config TOOLCHAIN_HAS_ZBB
> > >       bool
> > >       default y
> > > diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
> > > index c950a8d9edef..497c0dd30b2a 100644
> > > --- a/arch/riscv/include/asm/processor.h
> > > +++ b/arch/riscv/include/asm/processor.h
> > > @@ -42,6 +42,8 @@ struct thread_struct {
> > >       unsigned long bad_cause;
> > >       unsigned long vstate_ctrl;
> > >       struct __riscv_v_ext_state vstate;
> > > +     struct pt_regs *trap_pt_regs;
> > > +     struct __riscv_v_ext_state kernel_vstate;
> > >  };
> > >
> > >  /* Whitelist the fstate from the task_struct for hardened usercopy */
> > > diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
> > > index ef70af78005d..a54a0ce58f4d 100644
> > > --- a/arch/riscv/include/asm/simd.h
> > > +++ b/arch/riscv/include/asm/simd.h
> > > @@ -12,6 +12,7 @@
> > >  #include <linux/percpu.h>
> > >  #include <linux/preempt.h>
> > >  #include <linux/types.h>
> > > +#include <linux/thread_info.h>
> > >
> > >  #ifdef CONFIG_RISCV_ISA_V
> > >
> > > @@ -35,7 +36,8 @@ static __must_check inline bool may_use_simd(void)
> > >        * where it is set.
> > >        */
> > >       return !in_irq() && !irqs_disabled() && !in_nmi() &&
> > > -            !this_cpu_read(vector_context_busy);
> > > +            !this_cpu_read(vector_context_busy) &&
> > > +            !test_thread_flag(TIF_RISCV_V_KERNEL_MODE);
> > >  }
> > >
> > >  #else /* ! CONFIG_RISCV_ISA_V */
> > > diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
> > > index b182f2d03e25..8797d520e8ef 100644
> > > --- a/arch/riscv/include/asm/thread_info.h
> > > +++ b/arch/riscv/include/asm/thread_info.h
> > > @@ -94,6 +94,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
> > >  #define TIF_UPROBE           10      /* uprobe breakpoint or singlestep */
> > >  #define TIF_32BIT            11      /* compat-mode 32bit process */
> > >  #define TIF_RISCV_V_DEFER_RESTORE    12 /* restore Vector before returing to user */
> > > +#define TIF_RISCV_V_KERNEL_MODE                      13 /* kernel-mode Vector run with preemption-on */
> > >
> > >  #define _TIF_NOTIFY_RESUME   (1 << TIF_NOTIFY_RESUME)
> > >  #define _TIF_SIGPENDING              (1 << TIF_SIGPENDING)
> > > @@ -101,9 +102,12 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
> > >  #define _TIF_NOTIFY_SIGNAL   (1 << TIF_NOTIFY_SIGNAL)
> > >  #define _TIF_UPROBE          (1 << TIF_UPROBE)
> > >  #define _TIF_RISCV_V_DEFER_RESTORE   (1 << TIF_RISCV_V_DEFER_RESTORE)
> > > +#define _TIF_RISCV_V_KERNEL_MODE     (1 << TIF_RISCV_V_KERNEL_MODE)
> > >
> > >  #define _TIF_WORK_MASK \
> > >       (_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED | \
> > >        _TIF_NOTIFY_SIGNAL | _TIF_UPROBE)
> > >
> > > +#define RISCV_V_VSTATE_CTRL_PREEMPTIBLE      0x20
> > > +
> > >  #endif /* _ASM_RISCV_THREAD_INFO_H */
> > > diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> > > index 3b783b317112..c2776851d50d 100644
> > > --- a/arch/riscv/include/asm/vector.h
> > > +++ b/arch/riscv/include/asm/vector.h
> > > @@ -195,9 +195,24 @@ static inline void __switch_to_vector(struct task_struct *prev,
> > >  {
> > >       struct pt_regs *regs;
> > >
> > > -     regs = task_pt_regs(prev);
> > > -     riscv_v_vstate_save(&prev->thread.vstate, regs);
> > > -     riscv_v_vstate_set_restore(next, task_pt_regs(next));
> > > +     if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) &&
> > > +         test_tsk_thread_flag(prev, TIF_RISCV_V_KERNEL_MODE)) {
> > > +             regs = prev->thread.trap_pt_regs;
> > > +             WARN_ON(!regs);
> >
> > In what cases could these WARN_ON()s be triggered?
> 
> It probably happens when a kernel thread calls schedule() in the
> middle of preemptible kernel mode Vector code. Because the kernel sets
> trap_pt_regs only at trap entries. For example
> 
> // assume preemption = "ON" and memory allocation
> // for kernel_vstate.datap success
> kernel_vector_begin();
> // some vector code
> ...
> schedule();
> ...
> kernel_vector_end();
> 
> It is possible to support making scheduler calls in preemptible kernel
> mode Vector though. We just need to save nothing (all V regs are
> caller-save) and set an appropriate status.VS for the "next" process.

I'm struggling to theorycraft where this can go wrong, because my
knowledge in this area is limited. If the only way it can go wrong is by
calling schedule() in a "protected" section like this, that seems
"okay". Are there not non-trap induced context switches that we need to
worry about?

> > > +             riscv_v_vstate_save(&prev->thread.kernel_vstate, regs);
> > > +     } else {
> > > +             regs = task_pt_regs(prev);
> > > +             riscv_v_vstate_save(&prev->thread.vstate, regs);
> > > +     }
> > > +
> > > +     if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) &&
> > > +         test_tsk_thread_flag(next, TIF_RISCV_V_KERNEL_MODE)) {
> > > +             regs = next->thread.trap_pt_regs;
> > > +             WARN_ON(!regs);
> > > +             riscv_v_vstate_restore(&next->thread.kernel_vstate, regs);
> > > +     } else {
> > > +             riscv_v_vstate_set_restore(next, task_pt_regs(next));
> > > +     }
> > >  }
> >
> >
> > >  /*
> > >   * kernel_vector_begin(): obtain the CPU vector registers for use by the calling
> > >   * context
> > > @@ -70,11 +109,14 @@ void kernel_vector_begin(void)
> > >
> > >       riscv_v_vstate_save(&current->thread.vstate, task_pt_regs(current));
> > >
> > > -     get_cpu_vector_context();
> > > +     if (!preemptible() || !kernel_vector_preemptible()) {
> > > +             get_cpu_vector_context();
> > > +     } else {
> > > +             if (riscv_v_start_kernel_context())
> > > +                     get_cpu_vector_context();
> >
> > What happens here if riscv_v_start_kernel_context() fails w/ -ENOMEM?
> 
> Here we would fallback to starting kernel-mode Vector with preemption
> disabled, by calling get_cpu_vector_context(). This makes calling
> kernel_vector_begin() end up with 2 possible consequences, if the
> caller runs in a preemptible context. One, which is the success path
> of riscv_v_start_kernel_context(), will not alter the preemption
> status but may increase memory usage if the context does not exist
> yet.
> 
> However, if, on the other path, riscv_v_start_kernel_context() fails
> with -ENOMEM, then the kernel-mode Vector code will be executed with
> preemption "off".
> 
> Another way of solving this ambiguity is to add another function to
> enable kernel mode Vector with preemption, and let the user check if
> the allocation fails. So users who really want to run their Vector
> code with preemption shall make this call. Otherwise, kernel mode
> Vector runs with preemption off. However, I don't really want to add
> it because I'd like to make the "upgrade" transparent to the caller.
> 
> >
> > > +     }
> > >
> > >       riscv_v_enable();
> > > -
> > > -     return 0;
> > >  }
> > >  EXPORT_SYMBOL_GPL(kernel_vector_begin);
> > >
> > > @@ -96,6 +138,9 @@  void kernel_vector_end(void)
> > >
> > >       riscv_v_disable();
> > >
> > > -     put_cpu_vector_context();
> > > +     if (!test_thread_flag(TIF_RISCV_V_KERNEL_MODE))
> > > +             put_cpu_vector_context();
> > > +     else
> > > +             riscv_v_stop_kernel_context();
> > >  }
> >
> > Probably just missing something here, but how come we don't need to call
> > put_cpu_vector_context() here. I'm just a little confused, since, in
> > kernel_vector_begin, get_cpu_vector_context() is called.
> 
> If "TIF_RISCV_V_KERNEL_MODE" is set, then we are running kernel-mode
> Vector with preemption "ON". In such cases we don't need to call
> put_cpu_vector_context(), which is the epilogue of kernel-mode Vector
> with preemption "OFF". Instead, we should call
> riscv_v_stop_kernel_context() to end the session.

I think, for these last two comments, I screwed up. I misread
if (riscv_v_start_kernel_context())
as
if (!riscv_v_start_kernel_context())
which is the source of my confusion about this being imbalanced.

Thanks for your explanations however!
Björn Töpel Aug. 15, 2023, 12:19 p.m. UTC | #4
Andy Chiu <andy.chiu@sifive.com> writes:

> Add kernel_vstate to keep track of kernel-mode Vector registers when
> trap introduced context switch happens. Also, provide trap_pt_regs to
> let context save/restore routine reference status.VS at which the trap
> takes place. The thread flag TIF_RISCV_V_KERNEL_MODE indicates whether
> a task is running in kernel-mode Vector with preemption 'ON'. So context
> switch routines know and would save V-regs to kernel_vstate and restore
> V-regs immediately from kernel_vstate if the bit is set.
>
> Apart from a task's preemption status, the capability of
> running preemptive kernel-mode Vector is jointly controlled by the
> RISCV_V_VSTATE_CTRL_PREEMPTIBLE mask in the task's
> thread.vstate_ctrl. This bit is masked whenever a trap takes place in
> kernel mode while executing preemptive Vector code.
>
> Also, provide a config CONFIG_RISCV_ISA_V_PREEMPTIVE to give users an
> option to disable preemptible kernel-mode Vector at build time. Users
> with constraint memory may want to disable this config as preemptible
> kernel-mode Vector needs extra space for tracking per thread's
> kernel-mode V context. Or, users might as well want to disable it if all
> kernel-mode Vector code is time sensitive and cannot tolerate context
> swicth overhead.

Nice idea! Did you perform any benchmarking? Would be really interesting
to get some numbers.

Nit: "switch"

I like that the most "controversial" patch is last, so it can easily be
dropped if the discussions doesn't settle! It would be nice with kernel
vector support in 6.6!

> diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
[...]
> @@ -70,11 +109,14 @@ void kernel_vector_begin(void)
>  
>  	riscv_v_vstate_save(&current->thread.vstate, task_pt_regs(current));
>  
> -	get_cpu_vector_context();
> +	if (!preemptible() || !kernel_vector_preemptible()) {
> +		get_cpu_vector_context();
> +	} else {
> +		if (riscv_v_start_kernel_context())
> +			get_cpu_vector_context();
> +	}

Wdyt about replacing this with:
        if (!preemptible() || !kernel_vector_preemptible() || riscv_v_start_kernel_context())
                get_cpu_vector_context();

Björn
diff mbox series

Patch

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 4c07b9189c86..0622951b15dd 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -507,6 +507,16 @@  config RISCV_ISA_V_DEFAULT_ENABLE
 
 	  If you don't know what to do here, say Y.
 
+config RISCV_ISA_V_PREEMPTIVE
+	bool "Run kernel-mode Vector with kernel preemption"
+	depends on PREEMPTION
+	depends on RISCV_ISA_V
+	default y
+	help
+	  Ordinarily the kernel disables preemption before running in-kernel
+	  Vector code. This config frees the kernel from disabling preemption
+	  by adding memory on demand for tracking kernel's V-context.
+
 config TOOLCHAIN_HAS_ZBB
 	bool
 	default y
diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index c950a8d9edef..497c0dd30b2a 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -42,6 +42,8 @@  struct thread_struct {
 	unsigned long bad_cause;
 	unsigned long vstate_ctrl;
 	struct __riscv_v_ext_state vstate;
+	struct pt_regs *trap_pt_regs;
+	struct __riscv_v_ext_state kernel_vstate;
 };
 
 /* Whitelist the fstate from the task_struct for hardened usercopy */
diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
index ef70af78005d..a54a0ce58f4d 100644
--- a/arch/riscv/include/asm/simd.h
+++ b/arch/riscv/include/asm/simd.h
@@ -12,6 +12,7 @@ 
 #include <linux/percpu.h>
 #include <linux/preempt.h>
 #include <linux/types.h>
+#include <linux/thread_info.h>
 
 #ifdef CONFIG_RISCV_ISA_V
 
@@ -35,7 +36,8 @@  static __must_check inline bool may_use_simd(void)
 	 * where it is set.
 	 */
 	return !in_irq() && !irqs_disabled() && !in_nmi() &&
-	       !this_cpu_read(vector_context_busy);
+	       !this_cpu_read(vector_context_busy) &&
+	       !test_thread_flag(TIF_RISCV_V_KERNEL_MODE);
 }
 
 #else /* ! CONFIG_RISCV_ISA_V */
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index b182f2d03e25..8797d520e8ef 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -94,6 +94,7 @@  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
 #define TIF_UPROBE		10	/* uprobe breakpoint or singlestep */
 #define TIF_32BIT		11	/* compat-mode 32bit process */
 #define TIF_RISCV_V_DEFER_RESTORE	12 /* restore Vector before returing to user */
+#define TIF_RISCV_V_KERNEL_MODE			13 /* kernel-mode Vector run with preemption-on */
 
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
@@ -101,9 +102,12 @@  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
 #define _TIF_NOTIFY_SIGNAL	(1 << TIF_NOTIFY_SIGNAL)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
 #define _TIF_RISCV_V_DEFER_RESTORE	(1 << TIF_RISCV_V_DEFER_RESTORE)
+#define _TIF_RISCV_V_KERNEL_MODE	(1 << TIF_RISCV_V_KERNEL_MODE)
 
 #define _TIF_WORK_MASK \
 	(_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED | \
 	 _TIF_NOTIFY_SIGNAL | _TIF_UPROBE)
 
+#define RISCV_V_VSTATE_CTRL_PREEMPTIBLE	0x20
+
 #endif /* _ASM_RISCV_THREAD_INFO_H */
diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index 3b783b317112..c2776851d50d 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -195,9 +195,24 @@  static inline void __switch_to_vector(struct task_struct *prev,
 {
 	struct pt_regs *regs;
 
-	regs = task_pt_regs(prev);
-	riscv_v_vstate_save(&prev->thread.vstate, regs);
-	riscv_v_vstate_set_restore(next, task_pt_regs(next));
+	if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) &&
+	    test_tsk_thread_flag(prev, TIF_RISCV_V_KERNEL_MODE)) {
+		regs = prev->thread.trap_pt_regs;
+		WARN_ON(!regs);
+		riscv_v_vstate_save(&prev->thread.kernel_vstate, regs);
+	} else {
+		regs = task_pt_regs(prev);
+		riscv_v_vstate_save(&prev->thread.vstate, regs);
+	}
+
+	if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE) &&
+	    test_tsk_thread_flag(next, TIF_RISCV_V_KERNEL_MODE)) {
+		regs = next->thread.trap_pt_regs;
+		WARN_ON(!regs);
+		riscv_v_vstate_restore(&next->thread.kernel_vstate, regs);
+	} else {
+		riscv_v_vstate_set_restore(next, task_pt_regs(next));
+	}
 }
 
 void riscv_v_vstate_ctrl_init(struct task_struct *tsk);
@@ -222,4 +237,10 @@  static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; }
 
 #endif /* CONFIG_RISCV_ISA_V */
 
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+void kernel_vector_allow_preemption(void);
+#else
+#define kernel_vector_allow_preemption()	do {} while (0)
+#endif
+
 #endif /* ! __ASM_RISCV_VECTOR_H */
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
index d6a75aac1d27..4b062f7741b2 100644
--- a/arch/riscv/kernel/asm-offsets.c
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -38,6 +38,8 @@  void asm_offsets(void)
 	OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
 	OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
 	OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp);
+	OFFSET(TASK_THREAD_TRAP_REGP, task_struct, thread.trap_pt_regs);
+	OFFSET(TASK_THREAD_VSTATE_CTRL, task_struct, thread.vstate_ctrl);
 
 	OFFSET(TASK_THREAD_F0,  task_struct, thread.fstate.f[0]);
 	OFFSET(TASK_THREAD_F1,  task_struct, thread.fstate.f[1]);
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 143a2bb3e697..b6a7d4e9f526 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -66,6 +66,29 @@  _save_context:
 	REG_S s4, PT_CAUSE(sp)
 	REG_S s5, PT_TP(sp)
 
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+	/*
+	 * Record the register set at the frame where in-kernel V registers are
+	 * last alive.
+	 */
+	REG_L s0, TASK_TI_FLAGS(tp)
+	li s1, 1 << TIF_RISCV_V_KERNEL_MODE
+	and s0, s0, s1
+	beqz s0, 1f
+	li s0, TASK_THREAD_TRAP_REGP
+	add s0, s0, tp
+	REG_L s1, (s0)
+	bnez s1, 1f
+	REG_S sp, (s0)
+	li s0, TASK_THREAD_VSTATE_CTRL
+	add s0, s0, tp
+	REG_L s1, (s0)
+	li s2, ~RISCV_V_VSTATE_CTRL_PREEMPTIBLE
+	and s1, s1, s2
+	REG_S s1, (s0)
+1:
+#endif
+
 	/*
 	 * Set the scratch register to 0, so that if a recursive exception
 	 * occurs, the exception vector knows it came from the kernel
@@ -129,6 +152,28 @@  SYM_CODE_START_NOALIGN(ret_from_exception)
 	 */
 	csrw CSR_SCRATCH, tp
 1:
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+	/*
+	 * Clear tracking of the trap registers when we return to the frame
+	 * that uses kernel mode Vector.
+	 */
+	REG_L s0, TASK_TI_FLAGS(tp)
+	li s1, 1 << TIF_RISCV_V_KERNEL_MODE
+	and s0, s0, s1
+	beqz s0, 1f
+	li s0, TASK_THREAD_TRAP_REGP
+	add s0, s0, tp
+	REG_L s1, (s0)
+	bne s1, sp, 1f
+	REG_S x0, (s0)
+	li s0, TASK_THREAD_VSTATE_CTRL
+	add s0, s0, tp
+	REG_L s1, (s0)
+	ori s1, s1, RISCV_V_VSTATE_CTRL_PREEMPTIBLE
+	REG_S s1, (s0)
+1:
+#endif
+
 	REG_L a0, PT_STATUS(sp)
 	/*
 	 * The current load reservation is effectively part of the processor's
diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
index d9e097e68937..5c64f2034cdc 100644
--- a/arch/riscv/kernel/kernel_mode_vector.c
+++ b/arch/riscv/kernel/kernel_mode_vector.c
@@ -10,6 +10,7 @@ 
 #include <linux/percpu.h>
 #include <linux/preempt.h>
 #include <linux/types.h>
+#include <linux/slab.h>
 
 #include <asm/vector.h>
 #include <asm/switch_to.h>
@@ -48,6 +49,44 @@  static void put_cpu_vector_context(void)
 	preempt_enable();
 }
 
+#ifdef CONFIG_RISCV_ISA_V_PREEMPTIVE
+void kernel_vector_allow_preemption(void)
+{
+	current->thread.vstate_ctrl |= RISCV_V_VSTATE_CTRL_PREEMPTIBLE;
+}
+
+static bool kernel_vector_preemptible(void)
+{
+	return !!(current->thread.vstate_ctrl & RISCV_V_VSTATE_CTRL_PREEMPTIBLE);
+}
+
+static int riscv_v_start_kernel_context(void)
+{
+	struct __riscv_v_ext_state *vstate;
+
+	vstate = &current->thread.kernel_vstate;
+	if (!vstate->datap) {
+		vstate->datap = kmalloc(riscv_v_vsize, GFP_KERNEL);
+		if (!vstate->datap)
+			return -ENOMEM;
+	}
+
+	current->thread.trap_pt_regs = NULL;
+	WARN_ON(test_and_set_thread_flag(TIF_RISCV_V_KERNEL_MODE));
+	return 0;
+}
+
+static void riscv_v_stop_kernel_context(void)
+{
+	WARN_ON(!test_and_clear_thread_flag(TIF_RISCV_V_KERNEL_MODE));
+	current->thread.trap_pt_regs = NULL;
+}
+#else
+#define kernel_vector_preemptible()	(false)
+#define riscv_v_start_kernel_context()	(0)
+#define riscv_v_stop_kernel_context()	do {} while (0)
+#endif /* CONFIG_RISCV_ISA_V_PREEMPTIVE */
+
 /*
  * kernel_vector_begin(): obtain the CPU vector registers for use by the calling
  * context
@@ -70,11 +109,14 @@  void kernel_vector_begin(void)
 
 	riscv_v_vstate_save(&current->thread.vstate, task_pt_regs(current));
 
-	get_cpu_vector_context();
+	if (!preemptible() || !kernel_vector_preemptible()) {
+		get_cpu_vector_context();
+	} else {
+		if (riscv_v_start_kernel_context())
+			get_cpu_vector_context();
+	}
 
 	riscv_v_enable();
-
-	return 0;
 }
 EXPORT_SYMBOL_GPL(kernel_vector_begin);
 
@@ -96,6 +138,9 @@  void kernel_vector_end(void)
 
 	riscv_v_disable();
 
-	put_cpu_vector_context();
+	if (!test_thread_flag(TIF_RISCV_V_KERNEL_MODE))
+		put_cpu_vector_context();
+	else
+		riscv_v_stop_kernel_context();
 }
 EXPORT_SYMBOL_GPL(kernel_vector_end);
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index ec89e7edb6fd..18cb37c305ab 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -160,8 +160,11 @@  void flush_thread(void)
 void arch_release_task_struct(struct task_struct *tsk)
 {
 	/* Free the vector context of datap. */
-	if (has_vector())
+	if (has_vector()) {
 		kfree(tsk->thread.vstate.datap);
+		if (IS_ENABLED(CONFIG_RISCV_ISA_V_PREEMPTIVE))
+			kfree(tsk->thread.kernel_vstate.datap);
+	}
 }
 
 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
@@ -170,7 +173,9 @@  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	*dst = *src;
 	/* clear entire V context, including datap for a new task */
 	memset(&dst->thread.vstate, 0, sizeof(struct __riscv_v_ext_state));
+	memset(&dst->thread.kernel_vstate, 0, sizeof(struct __riscv_v_ext_state));
 	clear_tsk_thread_flag(dst, TIF_RISCV_V_DEFER_RESTORE);
+	clear_tsk_thread_flag(dst, TIF_RISCV_V_KERNEL_MODE);
 
 	return 0;
 }
@@ -205,6 +210,7 @@  int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 		childregs->a0 = 0; /* Return value of fork() */
 		p->thread.s[0] = 0;
 	}
+	kernel_vector_allow_preemption();
 	p->thread.ra = (unsigned long)ret_from_fork;
 	p->thread.sp = (unsigned long)childregs; /* kernel sp */
 	return 0;
diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
index 9d583b760db4..42f227077ee5 100644
--- a/arch/riscv/kernel/vector.c
+++ b/arch/riscv/kernel/vector.c
@@ -122,7 +122,8 @@  static inline void riscv_v_ctrl_set(struct task_struct *tsk, int cur, int nxt,
 	ctrl |= VSTATE_CTRL_MAKE_NEXT(nxt);
 	if (inherit)
 		ctrl |= PR_RISCV_V_VSTATE_CTRL_INHERIT;
-	tsk->thread.vstate_ctrl = ctrl;
+	tsk->thread.vstate_ctrl &= ~PR_RISCV_V_VSTATE_CTRL_MASK;
+	tsk->thread.vstate_ctrl |= ctrl;
 }
 
 bool riscv_v_vstate_ctrl_user_allowed(void)