diff mbox series

ftrace: Correct preemption accounting for function tracing.

Message ID 20250220140749.pfw8qoNZ@linutronix.de (mailing list archive)
State Queued
Headers show
Series ftrace: Correct preemption accounting for function tracing. | expand

Commit Message

Sebastian Andrzej Siewior Feb. 20, 2025, 2:07 p.m. UTC
The function tracer should record the preemption level at the point when
the function is invoked. If the tracing subsystem decrement the
preemption counter it needs to correct this before feeding the data into
the trace buffer. This was broken in the commit cited below while
shifting the preempt-disabled section.

Use tracing_gen_ctx_dec() which properly subtracts one from the
preemption counter on a preemptible kernel.

Cc: Wander Lairson Costa <wander@redhat.com>
Fixes: ce5e48036c9e7 ("ftrace: disable preemption when recursion locked")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
I haven't been ftrace for quite some time if I got the fixes tag right.

 kernel/trace/trace_functions.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Comments

Wander Lairson Costa Feb. 20, 2025, 3:10 p.m. UTC | #1
On Thu, Feb 20, 2025 at 03:07:49PM +0100, Sebastian Andrzej Siewior wrote:
> The function tracer should record the preemption level at the point when
> the function is invoked. If the tracing subsystem decrement the
> preemption counter it needs to correct this before feeding the data into
> the trace buffer. This was broken in the commit cited below while
> shifting the preempt-disabled section.
> 
> Use tracing_gen_ctx_dec() which properly subtracts one from the
> preemption counter on a preemptible kernel.
> 
> Cc: Wander Lairson Costa <wander@redhat.com>
> Fixes: ce5e48036c9e7 ("ftrace: disable preemption when recursion locked")
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> I haven't been ftrace for quite some time if I got the fixes tag right.
> 
>  kernel/trace/trace_functions.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
> index d358c9935164d..df56f9b760109 100644
> --- a/kernel/trace/trace_functions.c
> +++ b/kernel/trace/trace_functions.c
> @@ -216,7 +216,7 @@ function_trace_call(unsigned long ip, unsigned long parent_ip,
>  
>  	parent_ip = function_get_true_parent_ip(parent_ip, fregs);
>  
> -	trace_ctx = tracing_gen_ctx();
> +	trace_ctx = tracing_gen_ctx_dec();
>  
>  	data = this_cpu_ptr(tr->array_buffer.data);
>  	if (!atomic_read(&data->disabled))
> @@ -321,7 +321,6 @@ function_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
>  	struct trace_array *tr = op->private;
>  	struct trace_array_cpu *data;
>  	unsigned int trace_ctx;
> -	unsigned long flags;
>  	int bit;
>  
>  	if (unlikely(!tr->function_enabled))
> @@ -347,8 +346,7 @@ function_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
>  	if (is_repeat_check(tr, last_info, ip, parent_ip))
>  		goto out;
>  
> -	local_save_flags(flags);
> -	trace_ctx = tracing_gen_ctx_flags(flags);
> +	trace_ctx = tracing_gen_ctx_dec();
>  	process_repeats(tr, ip, parent_ip, last_info, trace_ctx);
>  
>  	trace_function(tr, ip, parent_ip, trace_ctx);
> -- 
> 2.47.2
> 

It now reports the preemption count correctly:

    kworker/0:5-88      [000] b...3   211.972885: e1000_check_for_ack_vf <-e1000_write_posted_mbx
     kworker/0:5-88      [000] b...3   211.983386: e1000_read_posted_mbx <-e1000_reset_hw_vf
     kworker/0:5-88      [000] b...3   211.983389: e1000_init_hw_vf <-igbvf_reset
     kworker/0:5-88      [000] b...3   211.983390: e1000_rar_set_vf <-e1000_init_hw_vf
     kworker/0:5-88      [000] b...3   211.983390: e1000_write_posted_mbx <-e1000_rar_set_vf
     kworker/0:5-88      [000] D.Zf2   211.983391: igbvf_reset_L14: (igbvf_reset+0x62/0x120 [igbvf])
     kworker/0:5-88      [000] .N...   211.983396: igbvf_reset_L16: (igbvf_reset+0x7b/0x120 [igbvf])
  irq/63-ens14f0-1122    [000] b...2   211.983399: igb_msix_other <-irq_thread_fn
  irq/63-ens14f0-1122    [000] b...2   211.983399: igb_rd32 <-igb_msix_other
  irq/63-ens14f0-1122    [000] b...3   211.983400: igb_check_for_rst <-igb_msix_other
  irq/63-ens14f0-1122    [000] b...3   211.983400: igb_check_for_rst_pf <-igb_msix_other
  irq/63-ens14f0-1122    [000] b...3   211.983400: igb_rd32 <-igb_check_for_rst_pf
  irq/63-ens14f0-1122    [000] b...3   211.983401: igb_check_for_msg <-igb_msix_other
  irq/63-ens14f0-1122    [000] b...3   211.983401: igb_check_for_msg_pf <-igb_msix_other
  irq/63-ens14f0-1122    [000] b...3   211.983401: igb_rd32 <-igb_check_for_msg_pf
  irq/63-ens14f0-1122    [000] b...3   211.983402: igb_check_for_ack <-igb_msix_other
  irq/63-ens14f0-1122    [000] b...3   211.983402: igb_check_for_ack_pf <-igb_msix_other

(Except in line 6 where it show 'f', which seems to be an unrelated issue)

Tested-by: Wander Lairson Costa <wander@redhat.com>
Steven Rostedt Feb. 20, 2025, 4:54 p.m. UTC | #2
On Thu, 20 Feb 2025 12:10:30 -0300
Wander Lairson Costa <wander@redhat.com> wrote:

> On Thu, Feb 20, 2025 at 03:07:49PM +0100, Sebastian Andrzej Siewior wrote:
> > The function tracer should record the preemption level at the point when
> > the function is invoked. If the tracing subsystem decrement the
> > preemption counter it needs to correct this before feeding the data into
> > the trace buffer. This was broken in the commit cited below while
> > shifting the preempt-disabled section.
> > 
> > Use tracing_gen_ctx_dec() which properly subtracts one from the
> > preemption counter on a preemptible kernel.
> > 
> > Cc: Wander Lairson Costa <wander@redhat.com>
> > Fixes: ce5e48036c9e7 ("ftrace: disable preemption when recursion locked")
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> > I haven't been ftrace for quite some time if I got the fixes tag right.
> > 
> >  kernel/trace/trace_functions.c | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
> > index d358c9935164d..df56f9b760109 100644
> > --- a/kernel/trace/trace_functions.c
> > +++ b/kernel/trace/trace_functions.c
> > @@ -216,7 +216,7 @@ function_trace_call(unsigned long ip, unsigned long parent_ip,
> >  
> >  	parent_ip = function_get_true_parent_ip(parent_ip, fregs);
> >  
> > -	trace_ctx = tracing_gen_ctx();
> > +	trace_ctx = tracing_gen_ctx_dec();
> >  
> >  	data = this_cpu_ptr(tr->array_buffer.data);
> >  	if (!atomic_read(&data->disabled))
> > @@ -321,7 +321,6 @@ function_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
> >  	struct trace_array *tr = op->private;
> >  	struct trace_array_cpu *data;
> >  	unsigned int trace_ctx;
> > -	unsigned long flags;
> >  	int bit;
> >  
> >  	if (unlikely(!tr->function_enabled))
> > @@ -347,8 +346,7 @@ function_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
> >  	if (is_repeat_check(tr, last_info, ip, parent_ip))
> >  		goto out;
> >  
> > -	local_save_flags(flags);
> > -	trace_ctx = tracing_gen_ctx_flags(flags);
> > +	trace_ctx = tracing_gen_ctx_dec();
> >  	process_repeats(tr, ip, parent_ip, last_info, trace_ctx);
> >  
> >  	trace_function(tr, ip, parent_ip, trace_ctx);
> > -- 
> > 2.47.2
> >   
> 
> It now reports the preemption count correctly:
> 
>     kworker/0:5-88      [000] b...3   211.972885: e1000_check_for_ack_vf <-e1000_write_posted_mbx
>      kworker/0:5-88      [000] b...3   211.983386: e1000_read_posted_mbx <-e1000_reset_hw_vf
>      kworker/0:5-88      [000] b...3   211.983389: e1000_init_hw_vf <-igbvf_reset
>      kworker/0:5-88      [000] b...3   211.983390: e1000_rar_set_vf <-e1000_init_hw_vf
>      kworker/0:5-88      [000] b...3   211.983390: e1000_write_posted_mbx <-e1000_rar_set_vf
>      kworker/0:5-88      [000] D.Zf2   211.983391: igbvf_reset_L14: (igbvf_reset+0x62/0x120 [igbvf])
>      kworker/0:5-88      [000] .N...   211.983396: igbvf_reset_L16: (igbvf_reset+0x7b/0x120 [igbvf])
>   irq/63-ens14f0-1122    [000] b...2   211.983399: igb_msix_other <-irq_thread_fn
>   irq/63-ens14f0-1122    [000] b...2   211.983399: igb_rd32 <-igb_msix_other
>   irq/63-ens14f0-1122    [000] b...3   211.983400: igb_check_for_rst <-igb_msix_other
>   irq/63-ens14f0-1122    [000] b...3   211.983400: igb_check_for_rst_pf <-igb_msix_other
>   irq/63-ens14f0-1122    [000] b...3   211.983400: igb_rd32 <-igb_check_for_rst_pf
>   irq/63-ens14f0-1122    [000] b...3   211.983401: igb_check_for_msg <-igb_msix_other
>   irq/63-ens14f0-1122    [000] b...3   211.983401: igb_check_for_msg_pf <-igb_msix_other
>   irq/63-ens14f0-1122    [000] b...3   211.983401: igb_rd32 <-igb_check_for_msg_pf
>   irq/63-ens14f0-1122    [000] b...3   211.983402: igb_check_for_ack <-igb_msix_other
>   irq/63-ens14f0-1122    [000] b...3   211.983402: igb_check_for_ack_pf <-igb_msix_other
> 
> (Except in line 6 where it show 'f', which seems to be an unrelated issue)

But it appears to be consistent.

Let me go and add some probes and see what they show. The output you gave
me that I requested didn't show anything unusual.

-- Steve
Sebastian Andrzej Siewior Feb. 21, 2025, 4:21 p.m. UTC | #3
On 2025-02-20 11:54:00 [-0500], Steven Rostedt wrote:
> >      kworker/0:5-88      [000] D.Zf2   211.983391: igbvf_reset_L14: (igbvf_reset+0x62/0x120 [igbvf])
> >      kworker/0:5-88      [000] .N...   211.983396: igbvf_reset_L16: (igbvf_reset+0x7b/0x120 [igbvf])> > (Except in line 6 where it show 'f', which seems to be an unrelated issue)
> 
> But it appears to be consistent.
> 
> Let me go and add some probes and see what they show. The output you gave
> me that I requested didn't show anything unusual.

Since I just noticed that you posted the patches, did you find anything?
From looking at it, if we preemption at 0, migrate disable at 3 then a
substraction of one would underflow the preemption counter to f and
migrate disable to 2. But based on code inspection mg should be at 1.
And this does not explain where D & Z is coming from.

> -- Steve

Sebastian
Steven Rostedt Feb. 21, 2025, 4:48 p.m. UTC | #4
On Fri, 21 Feb 2025 17:21:46 +0100
Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:

> On 2025-02-20 11:54:00 [-0500], Steven Rostedt wrote:
> > >      kworker/0:5-88      [000] D.Zf2   211.983391: igbvf_reset_L14: (igbvf_reset+0x62/0x120 [igbvf])
> > >      kworker/0:5-88      [000] .N...   211.983396: igbvf_reset_L16: (igbvf_reset+0x7b/0x120 [igbvf])  
> …
> > > (Except in line 6 where it show 'f', which seems to be an unrelated issue)  
> > 
> > But it appears to be consistent.
> > 
> > Let me go and add some probes and see what they show. The output you gave
> > me that I requested didn't show anything unusual.  
> 
> Since I just noticed that you posted the patches, did you find anything?
> From looking at it, if we preemption at 0, migrate disable at 3 then a
> substraction of one would underflow the preemption counter to f and
> migrate disable to 2. But based on code inspection mg should be at 1.
> And this does not explain where D & Z is coming from.

I haven't found the bug for this yet, but I need to send out other fixes.
The preempt count issue you posted is agnostic to this issue, so I just
took it and added it to my current queue for this rc release.

-- Steve
diff mbox series

Patch

diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index d358c9935164d..df56f9b760109 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -216,7 +216,7 @@  function_trace_call(unsigned long ip, unsigned long parent_ip,
 
 	parent_ip = function_get_true_parent_ip(parent_ip, fregs);
 
-	trace_ctx = tracing_gen_ctx();
+	trace_ctx = tracing_gen_ctx_dec();
 
 	data = this_cpu_ptr(tr->array_buffer.data);
 	if (!atomic_read(&data->disabled))
@@ -321,7 +321,6 @@  function_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
 	struct trace_array *tr = op->private;
 	struct trace_array_cpu *data;
 	unsigned int trace_ctx;
-	unsigned long flags;
 	int bit;
 
 	if (unlikely(!tr->function_enabled))
@@ -347,8 +346,7 @@  function_no_repeats_trace_call(unsigned long ip, unsigned long parent_ip,
 	if (is_repeat_check(tr, last_info, ip, parent_ip))
 		goto out;
 
-	local_save_flags(flags);
-	trace_ctx = tracing_gen_ctx_flags(flags);
+	trace_ctx = tracing_gen_ctx_dec();
 	process_repeats(tr, ip, parent_ip, last_info, trace_ctx);
 
 	trace_function(tr, ip, parent_ip, trace_ctx);