diff mbox series

[v3] RISC-V: Don't check text_mutex during stop_machine

Message ID 20230215164317.727657-1-conor@kernel.org (mailing list archive)
State Superseded
Series [v3] RISC-V: Don't check text_mutex during stop_machine

Checks

Context                                   Check    Description
conchuod/cover_letter                     success  Single patches do not need cover letters
conchuod/tree_selection                   success  Guessed tree name to be fixes
conchuod/fixes_present                    success  Fixes tag present in non-next series
conchuod/maintainers_pattern              success  MAINTAINERS pattern errors before the patch: 13 and now 13
conchuod/verify_signedoff                 success  Signed-off-by tag matches author and committer
conchuod/kdoc                             success  Errors and warnings before: 0 this patch: 0
conchuod/build_rv64_clang_allmodconfig    success  Errors and warnings before: 585 this patch: 585
conchuod/module_param                     success  Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig      success  Errors and warnings before: 5737 this patch: 5737
conchuod/alphanumeric_selects             success  Out of order selects before the patch: 729 and now 729
conchuod/build_rv32_defconfig             success  Build OK
conchuod/dtb_warn_rv64                    success  Errors and warnings before: 2 this patch: 2
conchuod/header_inline                    success  No static functions without inline keyword in header files
conchuod/checkpatch                       warning  CHECK: Consider using #include <linux/ftrace.h> instead of <asm/ftrace.h>
conchuod/source_inline                    success  Was 0 now: 0
conchuod/build_rv64_nommu_k210_defconfig  success  Build OK
conchuod/verify_fixes                     success  Fixes tag looks correct
conchuod/build_rv64_nommu_virt_defconfig  success  Build OK

Commit Message

Conor Dooley Feb. 15, 2023, 4:43 p.m. UTC
From: Palmer Dabbelt <palmerdabbelt@google.com>

We're currently using stop_machine() to update ftrace, which means that
the thread that takes text_mutex during ftrace_prepare() may not be the
same as the thread that eventually patches the code.  This isn't
actually a race because the lock is still held (preventing any other
concurrent accesses) and there is only one thread running during
stop_machine(), but it does trigger a lockdep failure.

This patch just elides the lockdep check during stop_machine.

Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support")
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
Resending this version as I am quite averse to deleting the assertion!

Changes since v2 [<20220322022331.32136-1-palmer@rivosinc.com>]:
* rebase on riscv/for-next as it has been a year.
* incorporate Changbin's suggestion that init_nop should take the lock
  rather than call prepare() & post_process().

Changes since v1 [<20210506071041.417854-1-palmer@dabbelt.com>]:
* Use ftrace_arch_code_modify_{prepare,post_process}() to set the flag.
  I remember having a reason I wanted the function when I wrote the v1,
  but it's been almost a year and I forget what that was -- maybe I was
  just crazy, the patch was sent at midnight.
* Fix DYNAMIC_FTRACE=n builds.
---
 arch/riscv/include/asm/ftrace.h |  7 +++++++
 arch/riscv/kernel/ftrace.c      | 15 +++++++++++++--
 arch/riscv/kernel/patch.c       | 10 +++++++++-
 3 files changed, 29 insertions(+), 3 deletions(-)
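
To illustrate the commit message above, here is a minimal userspace sketch, not kernel code. It assumes a toy model in which integer task IDs stand in for kernel threads and a single owner variable stands in for lockdep's tracking of text_mutex. All model_* functions, the TASK_* constants and the in_stop_machine variable are hypothetical names chosen for the illustration. It shows why an owner-based assertion fires when the write runs in a different context from the one that took the lock, and how a flag set while the lock is held elides the check.

```c
#include <stdbool.h>

/* Hypothetical model: task IDs stand in for kernel threads. */
enum { TASK_NONE = -1, TASK_FTRACE = 1, TASK_STOPPER = 2 };

static int current_task = TASK_FTRACE;   /* stand-in for the kernel's `current` */
static int text_mutex_owner = TASK_NONE; /* task holding the modelled text_mutex */
static bool in_stop_machine;             /* mirrors riscv_ftrace_in_stop_machine */

static void model_lock(void)
{
	text_mutex_owner = current_task;
}

static void model_unlock(void)
{
	text_mutex_owner = TASK_NONE;
}

/* Analogue of lockdep_assert_held(): passes only if the current task owns the lock. */
static bool model_held_by_current(void)
{
	return text_mutex_owner == current_task;
}

/* Analogue of patch_insn_write(): 0 on success, -1 where lockdep would warn. */
int model_patch_insn_write(void)
{
	if (!in_stop_machine && !model_held_by_current())
		return -1;
	return 0;
}

/* Analogue of stop_machine(): the callback runs as the stopper task, not the caller. */
static int model_stop_machine(int (*fn)(void))
{
	int caller = current_task;
	int ret;

	current_task = TASK_STOPPER;
	ret = fn();
	current_task = caller;
	return ret;
}

/* Analogue of prepare(), then stop_machine(), then post_process(). */
int model_ftrace_update(bool use_flag)
{
	int ret;

	model_lock();
	in_stop_machine = use_flag;
	ret = model_stop_machine(model_patch_insn_write);
	in_stop_machine = false;
	model_unlock();
	return ret;
}
```

In this model, model_ftrace_update(false) returns -1 because the write runs as the stopper task while the lock is owned by the caller, and model_ftrace_update(true) returns 0 because the flag skips the ownership check. A direct model_patch_insn_write() call with no lock held still returns -1, so the check stays active everywhere else.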

Comments

Changbin Du Feb. 16, 2023, 11:31 a.m. UTC | #1
On Wed, Feb 15, 2023 at 04:43:17PM +0000, Conor Dooley wrote:
> From: Palmer Dabbelt <palmerdabbelt@google.com>
> 
> We're currently using stop_machine() to update ftrace, which means that
> the thread that takes text_mutex during ftrace_prepare() may not be the
> same as the thread that eventually patches the code.  This isn't
> actually a race because the lock is still held (preventing any other
> concurrent accesses) and there is only one thread running during
> stop_machine(), but it does trigger a lockdep failure.
> 
> This patch just elides the lockdep check during stop_machine.
> 
> Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support")
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Reported-by: Changbin Du <changbin.du@gmail.com>
> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> ---
> Resending this version as I am quite averse to deleting the assertion!
> 
> Changes since v2 [<20220322022331.32136-1-palmer@rivosinc.com>]:
> * rebase on riscv/for-next as it has been a year.
> * incorporate Changbin's suggestion that init_nop should take the lock
>   rather than call prepare() & post_process().
> 
> Changes since v1 [<20210506071041.417854-1-palmer@dabbelt.com>]:
> * Use ftrace_arch_code_modify_{prepare,post_process}() to set the flag.
>   I remember having a reason I wanted the function when I wrote the v1,
>   but it's been almost a year and I forget what that was -- maybe I was
>   just crazy, the patch was sent at midnight.
> * Fix DYNAMIC_FTRACE=n builds.
> ---
>  arch/riscv/include/asm/ftrace.h |  7 +++++++
>  arch/riscv/kernel/ftrace.c      | 15 +++++++++++++--
>  arch/riscv/kernel/patch.c       | 10 +++++++++-
>  3 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
> index 04dad3380041..3ac7609f4ee9 100644
> --- a/arch/riscv/include/asm/ftrace.h
> +++ b/arch/riscv/include/asm/ftrace.h
> @@ -81,8 +81,15 @@ do {									\
>  struct dyn_ftrace;
>  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
>  #define ftrace_init_nop ftrace_init_nop
> +extern int riscv_ftrace_in_stop_machine;
>  #endif
>  
> +#else /* CONFIG_DYNAMIC_FTRACE */
> +
> +#ifndef __ASSEMBLY__
> +#define riscv_ftrace_in_stop_machine 0
>  #endif
>  
> +#endif /* CONFIG_DYNAMIC_FTRACE */
> +
>  #endif /* _ASM_RISCV_FTRACE_H */
> diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
> index 2086f6585773..661bfa72f359 100644
> --- a/arch/riscv/kernel/ftrace.c
> +++ b/arch/riscv/kernel/ftrace.c
> @@ -11,14 +11,25 @@
>  #include <asm/cacheflush.h>
>  #include <asm/patch.h>
>  
> +int riscv_ftrace_in_stop_machine;
> +
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  void ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
>  {
>  	mutex_lock(&text_mutex);
> +
> +	/*
> +	 * The code sequences we use for ftrace can't be patched while the
> +	 * kernel is running, so we need to use stop_machine() to modify them
> +	 * for now.  This doesn't play nice with text_mutex, we use this flag
> +	 * to elide the check.
> +	 */
> +	riscv_ftrace_in_stop_machine = true;
>  }
>  
>  void ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
>  {
> +	riscv_ftrace_in_stop_machine = false;
>  	mutex_unlock(&text_mutex);
>  }
>  
> @@ -134,9 +145,9 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
>  {
>  	int out;
>  
> -	ftrace_arch_code_modify_prepare();
> +	mutex_lock(&text_mutex);
>  	out = ftrace_make_nop(mod, rec, MCOUNT_ADDR);
> -	ftrace_arch_code_modify_post_process();
> +	mutex_unlock(&text_mutex);
>  
>  	return out;
>  }
> diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> index 765004b60513..56b70271518d 100644
> --- a/arch/riscv/kernel/patch.c
> +++ b/arch/riscv/kernel/patch.c
> @@ -11,6 +11,7 @@
>  #include <asm/kprobes.h>
>  #include <asm/cacheflush.h>
>  #include <asm/fixmap.h>
> +#include <asm/ftrace.h>
>  #include <asm/patch.h>
>  
>  struct patch_insn {
> @@ -59,8 +60,15 @@ static int patch_insn_write(void *addr, const void *insn, size_t len)
>  	 * Before reaching here, it was expected to lock the text_mutex
>  	 * already, so we don't need to give another lock here and could
>  	 * ensure that it was safe between each cores.
> +	 *
> +	 * We're currently using stop_machine() for ftrace, and while that
> +	 * ensures text_mutex is held before installing the mappings it does
> +	 * not ensure text_mutex is held by the calling thread.  That's safe
> +	 * but triggers a lockdep failure, so just elide it for that specific
> +	 * case.
>  	 */
> -	lockdep_assert_held(&text_mutex);
> +	if (!riscv_ftrace_in_stop_machine)
> +		lockdep_assert_held(&text_mutex);
>  
>  	if (across_pages)
>  		patch_map(addr + len, FIX_TEXT_POKE1);
This misses this function.

int patch_text(void *addr, u32 insn)
{
	struct patch_insn patch = {
		.addr = addr,
		.insn = insn,
		.cpu_count = ATOMIC_INIT(0),
	};

	return stop_machine_cpuslocked(patch_text_cb,
				       &patch, cpu_online_mask);
}

> -- 
> 2.39.1
Conor Dooley Feb. 24, 2023, 11:07 a.m. UTC | #2
On Thu, Feb 16, 2023 at 07:31:26PM +0800, Changbin Du wrote:
> On Wed, Feb 15, 2023 at 04:43:17PM +0000, Conor Dooley wrote:
> > From: Palmer Dabbelt <palmerdabbelt@google.com>
> > 
> > We're currently using stop_machine() to update ftrace, which means that
> > the thread that takes text_mutex during ftrace_prepare() may not be the
> > same as the thread that eventually patches the code.  This isn't
> > actually a race because the lock is still held (preventing any other
> > concurrent accesses) and there is only one thread running during
> > stop_machine(), but it does trigger a lockdep failure.
> > 
> > This patch just elides the lockdep check during stop_machine.
> > 
> > Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support")
> > Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> > Reported-by: Changbin Du <changbin.du@gmail.com>
> > Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
> > Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> > Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
> > ---
> > Resending this version as I am quite averse to deleting the assertion!
> > 
> > Changes since v2 [<20220322022331.32136-1-palmer@rivosinc.com>]:
> > * rebase on riscv/for-next as it has been a year.
> > * incorporate Changbin's suggestion that init_nop should take the lock
> >   rather than call prepare() & post_process().
> > 
> > Changes since v1 [<20210506071041.417854-1-palmer@dabbelt.com>]:
> > * Use ftrace_arch_code_modify_{prepare,post_process}() to set the flag.
> >   I remember having a reason I wanted the function when I wrote the v1,
> >   but it's been almost a year and I forget what that was -- maybe I was
> >   just crazy, the patch was sent at midnight.
> > * Fix DYNAMIC_FTRACE=n builds.
> > ---
> >  arch/riscv/include/asm/ftrace.h |  7 +++++++
> >  arch/riscv/kernel/ftrace.c      | 15 +++++++++++++--
> >  arch/riscv/kernel/patch.c       | 10 +++++++++-
> >  3 files changed, 29 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
> > index 04dad3380041..3ac7609f4ee9 100644
> > --- a/arch/riscv/include/asm/ftrace.h
> > +++ b/arch/riscv/include/asm/ftrace.h
> > @@ -81,8 +81,15 @@ do {									\
> >  struct dyn_ftrace;
> >  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
> >  #define ftrace_init_nop ftrace_init_nop
> > +extern int riscv_ftrace_in_stop_machine;
> >  #endif
> >  
> > +#else /* CONFIG_DYNAMIC_FTRACE */
> > +
> > +#ifndef __ASSEMBLY__
> > +#define riscv_ftrace_in_stop_machine 0
> >  #endif
> >  
> > +#endif /* CONFIG_DYNAMIC_FTRACE */
> > +
> >  #endif /* _ASM_RISCV_FTRACE_H */
> > diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
> > index 2086f6585773..661bfa72f359 100644
> > --- a/arch/riscv/kernel/ftrace.c
> > +++ b/arch/riscv/kernel/ftrace.c
> > @@ -11,14 +11,25 @@
> >  #include <asm/cacheflush.h>
> >  #include <asm/patch.h>
> >  
> > +int riscv_ftrace_in_stop_machine;
> > +
> >  #ifdef CONFIG_DYNAMIC_FTRACE
> >  void ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
> >  {
> >  	mutex_lock(&text_mutex);
> > +
> > +	/*
> > +	 * The code sequences we use for ftrace can't be patched while the
> > +	 * kernel is running, so we need to use stop_machine() to modify them
> > +	 * for now.  This doesn't play nice with text_mutex, we use this flag
> > +	 * to elide the check.
> > +	 */
> > +	riscv_ftrace_in_stop_machine = true;
> >  }
> >  
> >  void ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
> >  {
> > +	riscv_ftrace_in_stop_machine = false;
> >  	mutex_unlock(&text_mutex);
> >  }
> >  
> > @@ -134,9 +145,9 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
> >  {
> >  	int out;
> >  
> > -	ftrace_arch_code_modify_prepare();
> > +	mutex_lock(&text_mutex);
> >  	out = ftrace_make_nop(mod, rec, MCOUNT_ADDR);
> > -	ftrace_arch_code_modify_post_process();
> > +	mutex_unlock(&text_mutex);
> >  
> >  	return out;
> >  }
> > diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
> > index 765004b60513..56b70271518d 100644
> > --- a/arch/riscv/kernel/patch.c
> > +++ b/arch/riscv/kernel/patch.c
> > @@ -11,6 +11,7 @@
> >  #include <asm/kprobes.h>
> >  #include <asm/cacheflush.h>
> >  #include <asm/fixmap.h>
> > +#include <asm/ftrace.h>
> >  #include <asm/patch.h>
> >  
> >  struct patch_insn {
> > @@ -59,8 +60,15 @@ static int patch_insn_write(void *addr, const void *insn, size_t len)
> >  	 * Before reaching here, it was expected to lock the text_mutex
> >  	 * already, so we don't need to give another lock here and could
> >  	 * ensure that it was safe between each cores.
> > +	 *
> > +	 * We're currently using stop_machine() for ftrace, and while that
> > +	 * ensures text_mutex is held before installing the mappings it does
> > +	 * not ensure text_mutex is held by the calling thread.  That's safe
> > +	 * but triggers a lockdep failure, so just elide it for that specific
> > +	 * case.
> >  	 */
> > -	lockdep_assert_held(&text_mutex);
> > +	if (!riscv_ftrace_in_stop_machine)
> > +		lockdep_assert_held(&text_mutex);
> >  
> >  	if (across_pages)
> >  		patch_map(addr + len, FIX_TEXT_POKE1);
> This misses this function.
> 
> int patch_text(void *addr, u32 insn)

So, with a corresponding rename to the symbol, does the following look
okay to you?

diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index f21592d20306..433b454e693f 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -27,9 +27,15 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
 
 	p->ainsn.api.restore = (unsigned long)p->addr + offset;
 
+	/*
+	 * kprobes takes text_mutex, but patch_text() calls stop_machine and
+	 * lockdep gets confused by the context in which the lock is taken.
+	 */
+	riscv_patch_in_stop_machine = true;
 	patch_text(p->ainsn.api.insn, p->opcode);
 	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
 		   __BUG_INSN_32);
+	riscv_patch_in_stop_machine = false;
 }
 
 static void __kprobes arch_prepare_simulate(struct kprobe *p)
@@ -96,16 +102,28 @@ void *alloc_insn_page(void)
 /* install breakpoint in text */
 void __kprobes arch_arm_kprobe(struct kprobe *p)
 {
+	/*
+	 * kprobes takes text_mutex, but patch_text() calls stop_machine and
+	 * lockdep gets confused by the context in which the lock is taken.
+	 */
+	riscv_patch_in_stop_machine = true;
 	if ((p->opcode & __INSN_LENGTH_MASK) == __INSN_LENGTH_32)
 		patch_text(p->addr, __BUG_INSN_32);
 	else
 		patch_text(p->addr, __BUG_INSN_16);
+	riscv_patch_in_stop_machine = false;
 }
 
 /* remove breakpoint from text */
 void __kprobes arch_disarm_kprobe(struct kprobe *p)
 {
+	/*
+	 * kprobes takes text_mutex, but patch_text() calls stop_machine and
+	 * lockdep gets confused by the context in which the lock is taken.
+	 */
+	riscv_patch_in_stop_machine = true;
 	patch_text(p->addr, p->opcode);
+	riscv_patch_in_stop_machine = false;
 }
 
 void __kprobes arch_remove_kprobe(struct kprobe *p)
Changbin Du Feb. 24, 2023, 12:58 p.m. UTC | #3
On Fri, Feb 24, 2023 at 11:07:42AM +0000, Conor Dooley wrote:
> > > -	lockdep_assert_held(&text_mutex);
> > > +	if (!riscv_ftrace_in_stop_machine)
> > > +		lockdep_assert_held(&text_mutex);
> > >  
> > >  	if (across_pages)
> > >  		patch_map(addr + len, FIX_TEXT_POKE1);
> > This misses this function.
> > 
> > int patch_text(void *addr, u32 insn)
> 
> So, with a corresponding rename to the symbol, does the following look
> okay to you?
> 
> diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> index f21592d20306..433b454e693f 100644
> --- a/arch/riscv/kernel/probes/kprobes.c
> +++ b/arch/riscv/kernel/probes/kprobes.c
> @@ -27,9 +27,15 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>  
>  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
>  
> +	/*
> +	 * kprobes takes text_mutex, but patch_text() calls stop_machine and
> +	 * lockdep gets confused by the context in which the lock is taken.
> +	 */
> +	riscv_patch_in_stop_machine = true;
>  	patch_text(p->ainsn.api.insn, p->opcode);
>  	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
>  		   __BUG_INSN_32);
> +	riscv_patch_in_stop_machine = false;
>  }
hmm, why not just put 'riscv_patch_in_stop_machine' into patch_text()? Then you
just need to modify that function.
Conor Dooley Feb. 24, 2023, 1:46 p.m. UTC | #4
On Fri, Feb 24, 2023 at 08:58:57PM +0800, Changbin Du wrote:
> On Fri, Feb 24, 2023 at 11:07:42AM +0000, Conor Dooley wrote:
> > > > -	lockdep_assert_held(&text_mutex);
> > > > +	if (!riscv_ftrace_in_stop_machine)
> > > > +		lockdep_assert_held(&text_mutex);
> > > >  
> > > >  	if (across_pages)
> > > >  		patch_map(addr + len, FIX_TEXT_POKE1);
> > > This misses this function.
> > > 
> > > int patch_text(void *addr, u32 insn)
> > 
> > So, with a corresponding rename to the symbol, does the following look
> > okay to you?
> > 
> > diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> > index f21592d20306..433b454e693f 100644
> > --- a/arch/riscv/kernel/probes/kprobes.c
> > +++ b/arch/riscv/kernel/probes/kprobes.c
> > @@ -27,9 +27,15 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> >  
> >  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
> >  
> > +	/*
> > +	 * kprobes takes text_mutex, but patch_text() calls stop_machine and
> > +	 * lockdep gets confused by the context in which the lock is taken.
> > +	 */
> > +	riscv_patch_in_stop_machine = true;
> >  	patch_text(p->ainsn.api.insn, p->opcode);
> >  	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> >  		   __BUG_INSN_32);
> > +	riscv_patch_in_stop_machine = false;
> >  }
> hmm, why not just put 'riscv_patch_in_stop_machine' into patch_text()? Then you
> just need to modify that function.

Right, I intentionally didn't do that as `riscv_patch_in_stop_machine`
skips the lockdep check, which we only want to do for codepaths we know
the lock will be held for.
I didn't want to put it in patch_text() so if users of patch_text() that
do not take the lock are added, they will be caught.

I'm probably just erring on the paranoid/conservative side of things!
Changbin Du Feb. 25, 2023, 1:50 a.m. UTC | #5
On Fri, Feb 24, 2023 at 01:46:38PM +0000, Conor Dooley wrote:
> On Fri, Feb 24, 2023 at 08:58:57PM +0800, Changbin Du wrote:
> > On Fri, Feb 24, 2023 at 11:07:42AM +0000, Conor Dooley wrote:
> > > > > -	lockdep_assert_held(&text_mutex);
> > > > > +	if (!riscv_ftrace_in_stop_machine)
> > > > > +		lockdep_assert_held(&text_mutex);
> > > > >  
> > > > >  	if (across_pages)
> > > > >  		patch_map(addr + len, FIX_TEXT_POKE1);
> > > > This misses this function.
> > > > 
> > > > int patch_text(void *addr, u32 insn)
> > > 
> > > So, with a corresponding rename to the symbol, does the following look
> > > okay to you?
> > > 
> > > diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
> > > index f21592d20306..433b454e693f 100644
> > > --- a/arch/riscv/kernel/probes/kprobes.c
> > > +++ b/arch/riscv/kernel/probes/kprobes.c
> > > @@ -27,9 +27,15 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> > >  
> > >  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
> > >  
> > > +	/*
> > > +	 * kprobes takes text_mutex, but patch_text() calls stop_machine and
> > > +	 * lockdep gets confused by the context in which the lock is taken.
> > > +	 */
> > > +	riscv_patch_in_stop_machine = true;
> > >  	patch_text(p->ainsn.api.insn, p->opcode);
> > >  	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
> > >  		   __BUG_INSN_32);
> > > +	riscv_patch_in_stop_machine = false;
> > >  }
> > hmm, why not just put 'riscv_patch_in_stop_machine' into patch_text()? Then you
> > just need to modify that function.
> 
> Right, I intentionally didn't do that as `riscv_patch_in_stop_machine`
> skips the lockdep check, which we only want to do for codepaths we know
> the lock will be held for.
> I didn't want to put it in patch_text() so if users of patch_text() that
> do not take the lock are added, they will be caught.
> 
Understood your concern. Then how about the below instead of discrete changes?

int patch_text(void *addr, u32 insn)
 {
+       int ret;
        struct patch_insn patch = {
                .addr = addr,
                .insn = insn,
                .cpu_count = ATOMIC_INIT(0),
        };

-       return stop_machine_cpuslocked(patch_text_cb,
+       lockdep_assert_held(&text_mutex);
+       riscv_patch_in_stop_machine = true;
+       ret = stop_machine_cpuslocked(patch_text_cb,
                                       &patch, cpu_online_mask);
+       riscv_patch_in_stop_machine = false;
+       return ret;
 }

> I'm probably just erring on the paranoid/conservative side of things!
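
The trade-off discussed in this exchange can be sketched in userspace C. This is a toy model under stated assumptions, not the kernel implementation: integer task IDs stand in for threads, a counter stands in for lockdep warnings, and every model_* name and TASK_* constant is hypothetical. It contrasts setting the flag unconditionally inside patch_text() with asserting in the caller's context first and only then setting the flag.

```c
#include <stdbool.h>

/* Hypothetical model: task IDs stand in for kernel threads. */
enum { TASK_NONE = -1, TASK_KPROBES = 1, TASK_STOPPER = 2 };

static int current_task = TASK_KPROBES;  /* stand-in for the kernel's `current` */
static int text_mutex_owner = TASK_NONE; /* task holding the modelled text_mutex */
static bool patch_in_stop_machine;       /* mirrors riscv_patch_in_stop_machine */
static int lockdep_warnings;             /* counts assertions that would fire */

void model_lock(void)
{
	text_mutex_owner = current_task;
}

void model_unlock(void)
{
	text_mutex_owner = TASK_NONE;
}

/* Analogue of lockdep_assert_held(&text_mutex). */
static void model_lockdep_assert_held(void)
{
	if (text_mutex_owner != current_task)
		lockdep_warnings++;
}

/* Analogue of patch_insn_write(): the check is skipped while the flag is set. */
static void model_patch_insn_write(void)
{
	if (!patch_in_stop_machine)
		model_lockdep_assert_held();
}

/* Analogue of stop_machine_cpuslocked(): the callback runs as the stopper task. */
static void model_stop_machine(void (*fn)(void))
{
	int caller = current_task;

	current_task = TASK_STOPPER;
	fn();
	current_task = caller;
}

/* Flag set unconditionally: a caller that forgot the lock goes unnoticed. */
int model_patch_text_flag_only(void)
{
	int before = lockdep_warnings;

	patch_in_stop_machine = true;
	model_stop_machine(model_patch_insn_write);
	patch_in_stop_machine = false;
	return lockdep_warnings - before;
}

/* Assert in the caller's context first, then set the flag. */
int model_patch_text_checked(void)
{
	int before = lockdep_warnings;

	model_lockdep_assert_held();
	patch_in_stop_machine = true;
	model_stop_machine(model_patch_insn_write);
	patch_in_stop_machine = false;
	return lockdep_warnings - before;
}
```

Called without the lock, model_patch_text_flag_only() reports 0 warnings, so the missing lock is hidden, while model_patch_text_checked() reports 1. Called between model_lock() and model_unlock(), the checked variant reports 0. That is the property wanted in the thread: one change in patch_text() that still catches callers that do not hold the lock.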
Conor Dooley Feb. 25, 2023, 1:45 p.m. UTC | #6
On 25 February 2023 01:50:14 GMT, Changbin Du <changbin.du@huawei.com> wrote:
>On Fri, Feb 24, 2023 at 01:46:38PM +0000, Conor Dooley wrote:
>> On Fri, Feb 24, 2023 at 08:58:57PM +0800, Changbin Du wrote:
>> > On Fri, Feb 24, 2023 at 11:07:42AM +0000, Conor Dooley wrote:
>> > > > > -	lockdep_assert_held(&text_mutex);
>> > > > > +	if (!riscv_ftrace_in_stop_machine)
>> > > > > +		lockdep_assert_held(&text_mutex);
>> > > > >  
>> > > > >  	if (across_pages)
>> > > > >  		patch_map(addr + len, FIX_TEXT_POKE1);
>> > > > This misses this function.
>> > > > 
>> > > > int patch_text(void *addr, u32 insn)
>> > > 
>> > > So, with a corresponding rename to the symbol, does the following look
>> > > okay to you?
>> > > 
>> > > diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
>> > > index f21592d20306..433b454e693f 100644
>> > > --- a/arch/riscv/kernel/probes/kprobes.c
>> > > +++ b/arch/riscv/kernel/probes/kprobes.c
>> > > @@ -27,9 +27,15 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>> > >  
>> > >  	p->ainsn.api.restore = (unsigned long)p->addr + offset;
>> > >  
>> > > +	/*
>> > > +	 * kprobes takes text_mutex, but patch_text() calls stop_machine and
>> > > +	 * lockdep gets confused by the context in which the lock is taken.
>> > > +	 */
>> > > +	riscv_patch_in_stop_machine = true;
>> > >  	patch_text(p->ainsn.api.insn, p->opcode);
>> > >  	patch_text((void *)((unsigned long)(p->ainsn.api.insn) + offset),
>> > >  		   __BUG_INSN_32);
>> > > +	riscv_patch_in_stop_machine = false;
>> > >  }
>> > hmm, why not just put 'riscv_patch_in_stop_machine' into patch_text()? Then you
>> > just need to modify that function.
>> 
>> Right, I intentionally didn't do that as `riscv_patch_in_stop_machine`
>> skips the lockdep check, which we only want to do for codepaths we know
>> the lock will be held for.
>> I didn't want to put it in patch_text() so if users of patch_text() that
>> do not take the lock are added, they will be caught.
>> 
>Understood your concern. Then how about the below instead of discrete changes?

Seems fair enough to me, I'll respin Monday - had some hardware fail so out of action right now.

Thanks,
Conor.

>
>int patch_text(void *addr, u32 insn)
> {
>+       int ret;
>        struct patch_insn patch = {
>                .addr = addr,
>                .insn = insn,
>                .cpu_count = ATOMIC_INIT(0),
>        };
>
>-       return stop_machine_cpuslocked(patch_text_cb,
>+       lockdep_assert_held(&text_mutex);
>+       riscv_patch_in_stop_machine = true;
>+       ret = stop_machine_cpuslocked(patch_text_cb,
>                                       &patch, cpu_online_mask);
>+       riscv_patch_in_stop_machine = false;
>+       return ret;
> }
>
>> I'm probably just erring on the paranoid/conservative side of things!

Patch

diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h
index 04dad3380041..3ac7609f4ee9 100644
--- a/arch/riscv/include/asm/ftrace.h
+++ b/arch/riscv/include/asm/ftrace.h
@@ -81,8 +81,15 @@  do {									\
 struct dyn_ftrace;
 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
 #define ftrace_init_nop ftrace_init_nop
+extern int riscv_ftrace_in_stop_machine;
 #endif
 
+#else /* CONFIG_DYNAMIC_FTRACE */
+
+#ifndef __ASSEMBLY__
+#define riscv_ftrace_in_stop_machine 0
 #endif
 
+#endif /* CONFIG_DYNAMIC_FTRACE */
+
 #endif /* _ASM_RISCV_FTRACE_H */
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index 2086f6585773..661bfa72f359 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -11,14 +11,25 @@ 
 #include <asm/cacheflush.h>
 #include <asm/patch.h>
 
+int riscv_ftrace_in_stop_machine;
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 void ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
 {
 	mutex_lock(&text_mutex);
+
+	/*
+	 * The code sequences we use for ftrace can't be patched while the
+	 * kernel is running, so we need to use stop_machine() to modify them
+	 * for now.  This doesn't play nice with text_mutex, we use this flag
+	 * to elide the check.
+	 */
+	riscv_ftrace_in_stop_machine = true;
 }
 
 void ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
 {
+	riscv_ftrace_in_stop_machine = false;
 	mutex_unlock(&text_mutex);
 }
 
@@ -134,9 +145,9 @@  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
 {
 	int out;
 
-	ftrace_arch_code_modify_prepare();
+	mutex_lock(&text_mutex);
 	out = ftrace_make_nop(mod, rec, MCOUNT_ADDR);
-	ftrace_arch_code_modify_post_process();
+	mutex_unlock(&text_mutex);
 
 	return out;
 }
diff --git a/arch/riscv/kernel/patch.c b/arch/riscv/kernel/patch.c
index 765004b60513..56b70271518d 100644
--- a/arch/riscv/kernel/patch.c
+++ b/arch/riscv/kernel/patch.c
@@ -11,6 +11,7 @@ 
 #include <asm/kprobes.h>
 #include <asm/cacheflush.h>
 #include <asm/fixmap.h>
+#include <asm/ftrace.h>
 #include <asm/patch.h>
 
 struct patch_insn {
@@ -59,8 +60,15 @@  static int patch_insn_write(void *addr, const void *insn, size_t len)
 	 * Before reaching here, it was expected to lock the text_mutex
 	 * already, so we don't need to give another lock here and could
 	 * ensure that it was safe between each cores.
+	 *
+	 * We're currently using stop_machine() for ftrace, and while that
+	 * ensures text_mutex is held before installing the mappings it does
+	 * not ensure text_mutex is held by the calling thread.  That's safe
+	 * but triggers a lockdep failure, so just elide it for that specific
+	 * case.
 	 */
-	lockdep_assert_held(&text_mutex);
+	if (!riscv_ftrace_in_stop_machine)
+		lockdep_assert_held(&text_mutex);
 
 	if (across_pages)
 		patch_map(addr + len, FIX_TEXT_POKE1);
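
The ftrace.h hunk in this patch gives the flag a constant fallback when dynamic ftrace is disabled. A minimal sketch of that pattern follows, assuming a hypothetical MODEL_DYNAMIC_FTRACE macro in place of CONFIG_DYNAMIC_FTRACE and hypothetical model_* names. With the fallback, the condition is a compile-time constant and the assertion always runs.

```c
/* MODEL_DYNAMIC_FTRACE is a hypothetical stand-in for CONFIG_DYNAMIC_FTRACE. */
#ifdef MODEL_DYNAMIC_FTRACE
int model_in_stop_machine;       /* real variable, may be set at runtime */
#else
#define model_in_stop_machine 0  /* constant fallback, nothing to link against */
#endif

/* Returns 1 if the modelled lockdep assertion would run, 0 if it is elided. */
int model_assertion_runs(void)
{
	if (!model_in_stop_machine)
		return 1;
	return 0;
}
```

Because the fallback is a macro rather than a variable, code such as patch.c can test the flag without an #ifdef of its own, and a build without the feature has no symbol to resolve.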