diff mbox series

[V4,2/2] rcu: Update jiffies in rcu_cpu_stall_reset()

Message ID 20230814020045.51950-2-chenhuacai@loongson.cn (mailing list archive)
State New, archived
Headers show
Series [V4,1/2] tick: Rename tick_do_update_jiffies64() and allow external usage | expand

Commit Message

Huacai Chen Aug. 14, 2023, 2 a.m. UTC
The KGDB initial breakpoint gets an rcu stall warning after commit
a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
rcu_cpu_stall_reset()").

[   53.452051] rcu: INFO: rcu_preempt self-detected stall on CPU
[   53.487950] rcu:     3-...0: (1 ticks this GP) idle=0e2c/1/0x4000000000000000 softirq=375/375 fqs=8
[   53.528243] rcu:     (t=12297 jiffies g=-995 q=1 ncpus=4)
[   53.564840] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc2+ #4848
[   53.603005] Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-CRB/Loongson-LS3A5000-7A2000-1w-CRB-V1.21, BIOS Loongson-UDK2018-V2.0.05099-beta8 08
[   53.682062] pc 9000000000332100 ra 90000000003320f4 tp 90000001000a0000 sp 90000001000a3710
[   53.724934] a0 9000000001d4b488 a1 0000000000000000 a2 0000000000000001 a3 0000000000000000
[   53.768179] a4 9000000001d526c8 a5 90000001000a38f0 a6 000000000000002c a7 0000000000000000
[   53.810751] t0 00000000000002b0 t1 0000000000000004 t2 900000000131c9c0 t3 fffffffffffffffa
[   53.853249] t4 0000000000000080 t5 90000001002ac190 t6 0000000000000004 t7 9000000001912d58
[   53.895684] t8 0000000000000000 u0 90000000013141a0 s9 0000000000000028 s0 9000000001d512f0
[   53.937633] s1 9000000001d51278 s2 90000001000a3798 s3 90000000019fc410 s4 9000000001d4b488
[   53.979486] s5 9000000001d512f0 s6 90000000013141a0 s7 0000000000000078 s8 9000000001d4b450
[   54.021175]    ra: 90000000003320f4 kgdb_cpu_enter+0x534/0x640
[   54.060150]   ERA: 9000000000332100 kgdb_cpu_enter+0x540/0x640
[   54.098347]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
[   54.136621]  PRMD: 0000000c (PPLV0 +PIE +PWE)
[   54.172192]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
[   54.207838]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
[   54.242503] ESTAT: 00000800 [INT] (IS=11 ECode=0 EsubCode=0)
[   54.277996]  PRID: 0014c011 (Loongson-64bit, Loongson-3A5000-HV)
[   54.313544] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc2+ #4848
[   54.430170] Stack : 0072617764726148 0000000000000000 9000000000223504 90000001000a0000
[   54.472308]         9000000100073a90 9000000100073a98 0000000000000000 9000000100073bd8
[   54.514413]         9000000100073bd0 9000000100073bd0 9000000100073a00 0000000000000001
[   54.556018]         0000000000000001 9000000100073a98 99828271f24e961a 90000001002810c0
[   54.596924]         0000000000000001 0000000000010003 0000000000000000 0000000000000001
[   54.637115]         ffff8000337cdb80 0000000000000001 0000000006360000 900000000131c9c0
[   54.677049]         0000000000000000 0000000000000000 90000000017b4c98 9000000001912000
[   54.716394]         9000000001912f68 9000000001913000 9000000001912f70 00000000000002b0
[   54.754880]         90000000014a8840 0000000000000000 900000000022351c 0000000000000000
[   54.792372]         00000000000002b0 000000000000000c 0000000000000000 0000000000071c1c
[   54.829302]         ...
[   54.859163] Call Trace:
[   54.859165] [<900000000022351c>] show_stack+0x5c/0x180
[   54.918298] [<90000000012f6100>] dump_stack_lvl+0x60/0x88
[   54.949251] [<90000000012dd5d8>] rcu_dump_cpu_stacks+0xf0/0x148
[   54.981116] [<90000000002d2fb8>] rcu_sched_clock_irq+0xb78/0xe60
[   55.012744] [<90000000002e47cc>] update_process_times+0x6c/0xc0
[   55.044169] [<90000000002f65d4>] tick_sched_timer+0x54/0x100
[   55.075488] [<90000000002e5174>] __hrtimer_run_queues+0x154/0x240
[   55.107347] [<90000000002e6288>] hrtimer_interrupt+0x108/0x2a0
[   55.139112] [<9000000000226418>] constant_timer_interrupt+0x38/0x60
[   55.170749] [<90000000002b3010>] __handle_irq_event_percpu+0x50/0x160
[   55.203141] [<90000000002b3138>] handle_irq_event_percpu+0x18/0x80
[   55.235064] [<90000000002b9d54>] handle_percpu_irq+0x54/0xa0
[   55.266241] [<90000000002b2168>] generic_handle_domain_irq+0x28/0x40
[   55.298466] [<9000000000aba95c>] handle_cpu_irq+0x5c/0xa0
[   55.329749] [<90000000012f7270>] handle_loongarch_irq+0x30/0x60
[   55.361476] [<90000000012f733c>] do_vint+0x9c/0x100
[   55.391737] [<9000000000332100>] kgdb_cpu_enter+0x540/0x640
[   55.422440] [<9000000000332b64>] kgdb_handle_exception+0x104/0x180
[   55.452911] [<9000000000232478>] kgdb_loongarch_notify+0x38/0xa0
[   55.481964] [<900000000026b4d4>] notify_die+0x94/0x100
[   55.509184] [<90000000012f685c>] do_bp+0x21c/0x340
[   55.562475] [<90000000003315b8>] kgdb_compiled_break+0x0/0x28
[   55.590319] [<9000000000332e80>] kgdb_register_io_module+0x160/0x1c0
[   55.618901] [<9000000000c0f514>] configure_kgdboc+0x154/0x1c0
[   55.647034] [<9000000000c0f5e0>] kgdboc_probe+0x60/0x80
[   55.674647] [<9000000000c96da8>] platform_probe+0x68/0x100
[   55.702613] [<9000000000c938e0>] really_probe+0xc0/0x340
[   55.730528] [<9000000000c93be4>] __driver_probe_device+0x84/0x140
[   55.759615] [<9000000000c93cdc>] driver_probe_device+0x3c/0x120
[   55.787990] [<9000000000c93e8c>] __device_attach_driver+0xcc/0x160
[   55.817145] [<9000000000c91290>] bus_for_each_drv+0x90/0x100
[   55.845654] [<9000000000c94328>] __device_attach+0xa8/0x1a0
[   55.874145] [<9000000000c925f0>] bus_probe_device+0xb0/0xe0
[   55.902572] [<9000000000c8ec7c>] device_add+0x65c/0x860
[   55.930635] [<9000000000c96704>] platform_device_add+0x124/0x2c0
[   55.959669] [<9000000001452b38>] init_kgdboc+0x58/0xa0
[   55.987677] [<900000000022015c>] do_one_initcall+0x7c/0x1e0
[   56.016134] [<9000000001420f1c>] kernel_init_freeable+0x22c/0x2a0
[   56.045128] [<90000000012f923c>] kernel_init+0x20/0x124

Currently rcu_cpu_stall_reset() sets rcu_state.jiffies_stall to one check
period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
is only updated in the timer interrupt, so by the time kgdb_cpu_enter()
begins to run, jiffies may already be nearly one check period stale.
Since all interrupts are disabled during kgdb_cpu_enter(), jiffies is
not updated while it runs, so by the time kgdb_cpu_enter() returns,
rcu_state.jiffies_stall may already have expired.

We could set rcu_state.jiffies_stall to two check periods later, i.e.
jiffies + (rcu_jiffies_till_stall_check() * 2), in rcu_cpu_stall_reset()
to avoid this problem. But this is not a complete solution, because kgdb
may spend an arbitrarily long time in irq-disabled context.

Instead, updating jiffies at the beginning of rcu_cpu_stall_reset()
solves all of these problems.

Cc: stable@vger.kernel.org
Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
Reported-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
---
 kernel/rcu/tree_stall.h | 1 +
 1 file changed, 1 insertion(+)

Comments

Paul E. McKenney Aug. 14, 2023, 4:15 p.m. UTC | #1
On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
> The KGDB initial breakpoint gets an rcu stall warning after commit
> a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
> rcu_cpu_stall_reset()").
> 
> [ ... RCU stall splat snipped, identical to the one above ... ]
> 
> Currently rcu_cpu_stall_reset() set rcu_state.jiffies_stall to one check
> period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> is only updated in the timer interrupt, so when kgdb_cpu_enter() begins
> to run there may already be nearly one rcu check period after jiffies.
> Since all interrupts are disabled during kgdb_cpu_enter(), jiffies will
> not be updated. When kgdb_cpu_enter() returns, rcu_state.jiffies_stall
> maybe already gets timeout.
> 
> We can set rcu_state.jiffies_stall to two rcu check periods later, e.g.
> jiffies + (rcu_jiffies_till_stall_check() * 2) in rcu_cpu_stall_reset()
> to avoid this problem. But this isn't a complete solution because kgdb
> may take a very long time in irq disabled context.
> 
> Instead, update jiffies at the beginning of rcu_cpu_stall_reset() can
> solve all kinds of problems.

Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
the stalls could simply be suppressed at the beginning of the debug
session and re-enabled upon exit, as is currently done for sysrq output
via rcu_sysrq_start() and rcu_sysrq_end().

							Thanx, Paul

> Cc: stable@vger.kernel.org
> Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
> Reported-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> ---
>  kernel/rcu/tree_stall.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index b10b8349bb2a..1c7b540985bf 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
>   */
>  void rcu_cpu_stall_reset(void)
>  {
> +	do_update_jiffies_64(ktime_get());
>  	WRITE_ONCE(rcu_state.jiffies_stall,
>  		   jiffies + rcu_jiffies_till_stall_check());
>  }
> -- 
> 2.39.3
>
Huacai Chen Aug. 15, 2023, 6:05 a.m. UTC | #2
Hi, Paul,

On Tue, Aug 15, 2023 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
> > The KGDB initial breakpoint gets an rcu stall warning after commit
> > a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
> > rcu_cpu_stall_reset()").
> >
> > [ ... RCU stall splat snipped, identical to the one above ... ]
> >
> > Currently rcu_cpu_stall_reset() set rcu_state.jiffies_stall to one check
> > period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> > is only updated in the timer interrupt, so when kgdb_cpu_enter() begins
> > to run there may already be nearly one rcu check period after jiffies.
> > Since all interrupts are disabled during kgdb_cpu_enter(), jiffies will
> > not be updated. When kgdb_cpu_enter() returns, rcu_state.jiffies_stall
> > maybe already gets timeout.
> >
> > We can set rcu_state.jiffies_stall to two rcu check periods later, e.g.
> > jiffies + (rcu_jiffies_till_stall_check() * 2) in rcu_cpu_stall_reset()
> > to avoid this problem. But this isn't a complete solution because kgdb
> > may take a very long time in irq disabled context.
> >
> > Instead, update jiffies at the beginning of rcu_cpu_stall_reset() can
> > solve all kinds of problems.
>
> Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> the stalls could simply be suppressed at the beginning of the debug
> session and re-enabled upon exit, as is currently done for sysrq output
> via rcu_sysrq_start() and rcu_sysrq_end().
Thank you for your advice, but that doesn't help: rcu_sysrq_start()
and rcu_sysrq_end() suppress the warnings during sysrq, whereas kgdb
already produces no warnings during kgdb_cpu_enter(), since it runs
in irq-disabled context. Instead, this patch aims to suppress the
warnings *after* kgdb_cpu_enter() returns, which are caused by a very
stale jiffies value.


Huacai

>
>                                                         Thanx, Paul
>
> > Cc: stable@vger.kernel.org
> > Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
> > Reported-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
> > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > ---
> >  kernel/rcu/tree_stall.h | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > index b10b8349bb2a..1c7b540985bf 100644
> > --- a/kernel/rcu/tree_stall.h
> > +++ b/kernel/rcu/tree_stall.h
> > @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
> >   */
> >  void rcu_cpu_stall_reset(void)
> >  {
> > +     do_update_jiffies_64(ktime_get());
> >       WRITE_ONCE(rcu_state.jiffies_stall,
> >                  jiffies + rcu_jiffies_till_stall_check());
> >  }
> > --
> > 2.39.3
> >
Zqiang Aug. 16, 2023, 3:16 a.m. UTC | #3
>
> Hi, Paul,
>
> On Tue, Aug 15, 2023 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
> > > The KGDB initial breakpoint gets an rcu stall warning after commit
> > > a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
> > > rcu_cpu_stall_reset()").
> > >
> > > [ ... RCU stall splat snipped, identical to the one above ... ]
> > >
> > > Currently rcu_cpu_stall_reset() set rcu_state.jiffies_stall to one check
> > > period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> > > is only updated in the timer interrupt, so when kgdb_cpu_enter() begins
> > > to run there may already be nearly one rcu check period after jiffies.
> > > Since all interrupts are disabled during kgdb_cpu_enter(), jiffies will
> > > not be updated. When kgdb_cpu_enter() returns, rcu_state.jiffies_stall
> > > maybe already gets timeout.
> > >
> > > We can set rcu_state.jiffies_stall to two rcu check periods later, e.g.
> > > jiffies + (rcu_jiffies_till_stall_check() * 2) in rcu_cpu_stall_reset()
> > > to avoid this problem. But this isn't a complete solution because kgdb
> > > may take a very long time in irq disabled context.
> > >
> > > Instead, update jiffies at the beginning of rcu_cpu_stall_reset() can
> > > solve all kinds of problems.
> >
> > Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> > the stalls could simply be suppressed at the beginning of the debug
> > session and re-enabled upon exit, as is currently done for sysrq output
> > via rcu_sysrq_start() and rcu_sysrq_end().
> Thank you for your advice, but that doesn't help. Because
> rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
> during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
> since it is executed in irq disabled context. Instead, this patch
> wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
> old jiffies value.
>

Hello, Huacai

Is it possible to set rcu_cpu_stall_suppress to true in
dbg_touch_watchdogs() and to reset rcu_cpu_stall_suppress at the
beginning and end of the RCU grace period? Alternatively, setting
rcupdate.rcu_cpu_stall_suppress_at_boot=1 in the boot arguments can
suppress RCU stall warnings during boot.
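For context, rcu_cpu_stall_suppress and rcu_cpu_stall_suppress_at_boot are
existing module parameters of rcupdate, so (assuming a kernel with stall
warnings enabled) they can be driven without any code change:

```
# kernel command line: suppress stall warnings from early boot onward
rcupdate.rcu_cpu_stall_suppress_at_boot=1

# at runtime, from userspace
echo 1 > /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress
```

These knobs suppress the warnings globally, though, rather than fixing the
stale-jiffies deadline that the patch targets.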


Thanks
Zqiang


>
> Huacai
>
> >
> >                                                         Thanx, Paul
> >
> > > Cc: stable@vger.kernel.org
> > > Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
> > > Reported-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
> > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > > ---
> > >  kernel/rcu/tree_stall.h | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > > index b10b8349bb2a..1c7b540985bf 100644
> > > --- a/kernel/rcu/tree_stall.h
> > > +++ b/kernel/rcu/tree_stall.h
> > > @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
> > >   */
> > >  void rcu_cpu_stall_reset(void)
> > >  {
> > > +     do_update_jiffies_64(ktime_get());
> > >       WRITE_ONCE(rcu_state.jiffies_stall,
> > >                  jiffies + rcu_jiffies_till_stall_check());
> > >  }
> > > --
> > > 2.39.3
> > >
Huacai Chen Aug. 16, 2023, 4:53 a.m. UTC | #4
Hi, Qiang,

On Wed, Aug 16, 2023 at 11:16 AM Z qiang <qiang.zhang1211@gmail.com> wrote:
>
> >
> > Hi, Paul,
> >
> > On Tue, Aug 15, 2023 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
> > > > The KGDB initial breakpoint gets an rcu stall warning after commit
> > > > a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
> > > > rcu_cpu_stall_reset()").
> > > >
> > > > [   53.452051] rcu: INFO: rcu_preempt self-detected stall on CPU
> > > > [   53.487950] rcu:     3-...0: (1 ticks this GP) idle=0e2c/1/0x4000000000000000 softirq=375/375 fqs=8
> > > > [   53.528243] rcu:     (t=12297 jiffies g=-995 q=1 ncpus=4)
> > > > [   53.564840] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc2+ #4848
> > > > [   53.603005] Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-CRB/Loongson-LS3A5000-7A2000-1w-CRB-V1.21, BIOS Loongson-UDK2018-V2.0.05099-beta8 08
> > > > [   53.682062] pc 9000000000332100 ra 90000000003320f4 tp 90000001000a0000 sp 90000001000a3710
> > > > [   53.724934] a0 9000000001d4b488 a1 0000000000000000 a2 0000000000000001 a3 0000000000000000
> > > > [   53.768179] a4 9000000001d526c8 a5 90000001000a38f0 a6 000000000000002c a7 0000000000000000
> > > > [   53.810751] t0 00000000000002b0 t1 0000000000000004 t2 900000000131c9c0 t3 fffffffffffffffa
> > > > [   53.853249] t4 0000000000000080 t5 90000001002ac190 t6 0000000000000004 t7 9000000001912d58
> > > > [   53.895684] t8 0000000000000000 u0 90000000013141a0 s9 0000000000000028 s0 9000000001d512f0
> > > > [   53.937633] s1 9000000001d51278 s2 90000001000a3798 s3 90000000019fc410 s4 9000000001d4b488
> > > > [   53.979486] s5 9000000001d512f0 s6 90000000013141a0 s7 0000000000000078 s8 9000000001d4b450
> > > > [   54.021175]    ra: 90000000003320f4 kgdb_cpu_enter+0x534/0x640
> > > > [   54.060150]   ERA: 9000000000332100 kgdb_cpu_enter+0x540/0x640
> > > > [   54.098347]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > > [   54.136621]  PRMD: 0000000c (PPLV0 +PIE +PWE)
> > > > [   54.172192]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
> > > > [   54.207838]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
> > > > [   54.242503] ESTAT: 00000800 [INT] (IS=11 ECode=0 EsubCode=0)
> > > > [   54.277996]  PRID: 0014c011 (Loongson-64bit, Loongson-3A5000-HV)
> > > > [   54.313544] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc2+ #4848
> > > > [   54.430170] Stack : 0072617764726148 0000000000000000 9000000000223504 90000001000a0000
> > > > [   54.472308]         9000000100073a90 9000000100073a98 0000000000000000 9000000100073bd8
> > > > [   54.514413]         9000000100073bd0 9000000100073bd0 9000000100073a00 0000000000000001
> > > > [   54.556018]         0000000000000001 9000000100073a98 99828271f24e961a 90000001002810c0
> > > > [   54.596924]         0000000000000001 0000000000010003 0000000000000000 0000000000000001
> > > > [   54.637115]         ffff8000337cdb80 0000000000000001 0000000006360000 900000000131c9c0
> > > > [   54.677049]         0000000000000000 0000000000000000 90000000017b4c98 9000000001912000
> > > > [   54.716394]         9000000001912f68 9000000001913000 9000000001912f70 00000000000002b0
> > > > [   54.754880]         90000000014a8840 0000000000000000 900000000022351c 0000000000000000
> > > > [   54.792372]         00000000000002b0 000000000000000c 0000000000000000 0000000000071c1c
> > > > [   54.829302]         ...
> > > > [   54.859163] Call Trace:
> > > > [   54.859165] [<900000000022351c>] show_stack+0x5c/0x180
> > > > [   54.918298] [<90000000012f6100>] dump_stack_lvl+0x60/0x88
> > > > [   54.949251] [<90000000012dd5d8>] rcu_dump_cpu_stacks+0xf0/0x148
> > > > [   54.981116] [<90000000002d2fb8>] rcu_sched_clock_irq+0xb78/0xe60
> > > > [   55.012744] [<90000000002e47cc>] update_process_times+0x6c/0xc0
> > > > [   55.044169] [<90000000002f65d4>] tick_sched_timer+0x54/0x100
> > > > [   55.075488] [<90000000002e5174>] __hrtimer_run_queues+0x154/0x240
> > > > [   55.107347] [<90000000002e6288>] hrtimer_interrupt+0x108/0x2a0
> > > > [   55.139112] [<9000000000226418>] constant_timer_interrupt+0x38/0x60
> > > > [   55.170749] [<90000000002b3010>] __handle_irq_event_percpu+0x50/0x160
> > > > [   55.203141] [<90000000002b3138>] handle_irq_event_percpu+0x18/0x80
> > > > [   55.235064] [<90000000002b9d54>] handle_percpu_irq+0x54/0xa0
> > > > [   55.266241] [<90000000002b2168>] generic_handle_domain_irq+0x28/0x40
> > > > [   55.298466] [<9000000000aba95c>] handle_cpu_irq+0x5c/0xa0
> > > > [   55.329749] [<90000000012f7270>] handle_loongarch_irq+0x30/0x60
> > > > [   55.361476] [<90000000012f733c>] do_vint+0x9c/0x100
> > > > [   55.391737] [<9000000000332100>] kgdb_cpu_enter+0x540/0x640
> > > > [   55.422440] [<9000000000332b64>] kgdb_handle_exception+0x104/0x180
> > > > [   55.452911] [<9000000000232478>] kgdb_loongarch_notify+0x38/0xa0
> > > > [   55.481964] [<900000000026b4d4>] notify_die+0x94/0x100
> > > > [   55.509184] [<90000000012f685c>] do_bp+0x21c/0x340
> > > > [   55.562475] [<90000000003315b8>] kgdb_compiled_break+0x0/0x28
> > > > [   55.590319] [<9000000000332e80>] kgdb_register_io_module+0x160/0x1c0
> > > > [   55.618901] [<9000000000c0f514>] configure_kgdboc+0x154/0x1c0
> > > > [   55.647034] [<9000000000c0f5e0>] kgdboc_probe+0x60/0x80
> > > > [   55.674647] [<9000000000c96da8>] platform_probe+0x68/0x100
> > > > [   55.702613] [<9000000000c938e0>] really_probe+0xc0/0x340
> > > > [   55.730528] [<9000000000c93be4>] __driver_probe_device+0x84/0x140
> > > > [   55.759615] [<9000000000c93cdc>] driver_probe_device+0x3c/0x120
> > > > [   55.787990] [<9000000000c93e8c>] __device_attach_driver+0xcc/0x160
> > > > [   55.817145] [<9000000000c91290>] bus_for_each_drv+0x90/0x100
> > > > [   55.845654] [<9000000000c94328>] __device_attach+0xa8/0x1a0
> > > > [   55.874145] [<9000000000c925f0>] bus_probe_device+0xb0/0xe0
> > > > [   55.902572] [<9000000000c8ec7c>] device_add+0x65c/0x860
> > > > [   55.930635] [<9000000000c96704>] platform_device_add+0x124/0x2c0
> > > > [   55.959669] [<9000000001452b38>] init_kgdboc+0x58/0xa0
> > > > [   55.987677] [<900000000022015c>] do_one_initcall+0x7c/0x1e0
> > > > [   56.016134] [<9000000001420f1c>] kernel_init_freeable+0x22c/0x2a0
> > > > [   56.045128] [<90000000012f923c>] kernel_init+0x20/0x124
> > > >
> > > > Currently rcu_cpu_stall_reset() sets rcu_state.jiffies_stall to one check
> > > > period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> > > > is only updated in the timer interrupt, so by the time kgdb_cpu_enter()
> > > > begins to run, jiffies may already be nearly one check period stale.
> > > > Since all interrupts are disabled during kgdb_cpu_enter(), jiffies is
> > > > not updated there either, so when kgdb_cpu_enter() returns,
> > > > rcu_state.jiffies_stall may already have timed out.
> > > >
> > > > We could set rcu_state.jiffies_stall to two check periods later, e.g.
> > > > jiffies + (rcu_jiffies_till_stall_check() * 2), in rcu_cpu_stall_reset()
> > > > to avoid this problem. But that isn't a complete solution, because kgdb
> > > > may spend a very long time in irq-disabled context.
> > > >
> > > > Instead, updating jiffies at the beginning of rcu_cpu_stall_reset()
> > > > solves all of these problems.
> > >
> > > Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> > > the stalls could simply be suppressed at the beginning of the debug
> > > session and re-enabled upon exit, as is currently done for sysrq output
> > > via rcu_sysrq_start() and rcu_sysrq_end().
> > Thank you for your advice, but that doesn't help, because
> > rcu_sysrq_start() and rcu_sysrq_end() suppress the warnings
> > during sysrq, while kgdb already produces no warnings during
> > kgdb_cpu_enter(), since it executes in an irq-disabled context.
> > Instead, this patch wants to suppress the warnings caused by a very
> > old jiffies value *after* kgdb_cpu_enter().
> >
>
> Hello, Huacai
>
> Is it possible to set rcu_cpu_stall_suppress to true in
> dbg_touch_watchdogs() and reset rcu_cpu_stall_suppress at the
> beginning and end of the RCU grace period?
This is possible but not ideal: first, kgdb is not the only caller of
rcu_cpu_stall_reset(); second, it is difficult to find the "end" point
at which to reset rcu_cpu_stall_suppress.

> Alternatively, setting rcupdate.rcu_cpu_stall_suppress_at_boot=1 in the
> boot arguments can suppress RCU stall warnings during boot.
This is also possible, but it suppresses all kinds of stall warnings,
which is not what we want.

Huacai
Zqiang Aug. 16, 2023, 5:09 a.m. UTC | #5
>
> Hi, Qiang,
>
> On Wed, Aug 16, 2023 at 11:16 AM Z qiang <qiang.zhang1211@gmail.com> wrote:
> >
> > >
> > > Hi, Paul,
> > >
> > > On Tue, Aug 15, 2023 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
> > > > > The KGDB initial breakpoint gets an rcu stall warning after commit
> > > > > a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
> > > > > rcu_cpu_stall_reset()").
> > > > >
> > > > > [...]
> > > >
> > > > Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> > > > the stalls could simply be suppressed at the beginning of the debug
> > > > session and re-enabled upon exit, as is currently done for sysrq output
> > > > via rcu_sysrq_start() and rcu_sysrq_end().
> > > Thank you for your advice, but that doesn't help. Because
> > > rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
> > > during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
> > > since it is executed in irq disabled context. Instead, this patch
> > > wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
> > > old jiffies value.
> > >
> >
> > Hello, Huacai
> >
> > Is it possible to set rcu_cpu_stall_suppress to true in
> > dbg_touch_watchdogs() and reset rcu_cpu_stall_suppress at the
> > beginning and end of the RCU grace period?
> This is possible but not ideal: first, kgdb is not the only caller of
> rcu_cpu_stall_reset(); second, it is difficult to find the "end" point
> at which to reset rcu_cpu_stall_suppress.
>

You can replace the rcu_state.jiffies_stall update with setting
rcu_cpu_stall_suppress in rcu_cpu_stall_reset(), and reset
rcu_cpu_stall_suppress in rcu_gp_init() and rcu_gp_cleanup().

Thanks
Zqiang

Huacai Chen Aug. 16, 2023, 9:33 a.m. UTC | #6
Hi, Qiang,

On Wed, Aug 16, 2023 at 1:09 PM Z qiang <qiang.zhang1211@gmail.com> wrote:
>
> >
> > Hi, Qiang,
> >
> > On Wed, Aug 16, 2023 at 11:16 AM Z qiang <qiang.zhang1211@gmail.com> wrote:
> > >
> > > >
> > > > Hi, Paul,
> > > >
> > > > On Tue, Aug 15, 2023 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > >
> > > > > On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
> > > > > > The KGDB initial breakpoint gets an rcu stall warning after commit
> > > > > > a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
> > > > > > rcu_cpu_stall_reset()").
> > > > > >
> > > > > > [...]
> > > > >
> > > > > Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> > > > > the stalls could simply be suppressed at the beginning of the debug
> > > > > session and re-enabled upon exit, as is currently done for sysrq output
> > > > > via rcu_sysrq_start() and rcu_sysrq_end().
> > > > Thank you for your advice, but that doesn't help. Because
> > > > rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
> > > > during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
> > > > since it is executed in irq disabled context. Instead, this patch
> > > > wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
> > > > old jiffies value.
> > > >
> > >
> > > Hello, Huacai
> > >
> > > Is it possible to set rcu_cpu_stall_suppress to true in
> > > dbg_touch_watchdogs() and reset rcu_cpu_stall_suppress at the
> > > beginning and end of the RCU grace period?
> > This is possible but not ideal: first, kgdb is not the only caller of
> > rcu_cpu_stall_reset(); second, it is difficult to find the "end" point
> > at which to reset rcu_cpu_stall_suppress.
> >
>
> You can replace the rcu_state.jiffies_stall update with setting
> rcu_cpu_stall_suppress in rcu_cpu_stall_reset(), and reset
> rcu_cpu_stall_suppress in rcu_gp_init() and rcu_gp_cleanup().
What's the advantage compared with updating jiffies? Updating jiffies
seems more straightforward.

Huacai

Zqiang Aug. 16, 2023, 10:06 a.m. UTC | #7
>
> Hi, Qiang,
>
> On Wed, Aug 16, 2023 at 1:09 PM Z qiang <qiang.zhang1211@gmail.com> wrote:
> >
> > >
> > > Hi, Qiang,
> > >
> > > On Wed, Aug 16, 2023 at 11:16 AM Z qiang <qiang.zhang1211@gmail.com> wrote:
> > > >
> > > > >
> > > > > Hi, Paul,
> > > > >
> > > > > On Tue, Aug 15, 2023 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > >
> > > > > > On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
> > > > > > > The KGDB initial breakpoint gets an rcu stall warning after commit
> > > > > > > a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
> > > > > > > rcu_cpu_stall_reset()").
> > > > > > >
> > > > > > > [...]
> > > > > > > [   54.918298] [<90000000012f6100>] dump_stack_lvl+0x60/0x88
> > > > > > > [   54.949251] [<90000000012dd5d8>] rcu_dump_cpu_stacks+0xf0/0x148
> > > > > > > [   54.981116] [<90000000002d2fb8>] rcu_sched_clock_irq+0xb78/0xe60
> > > > > > > [   55.012744] [<90000000002e47cc>] update_process_times+0x6c/0xc0
> > > > > > > [   55.044169] [<90000000002f65d4>] tick_sched_timer+0x54/0x100
> > > > > > > [   55.075488] [<90000000002e5174>] __hrtimer_run_queues+0x154/0x240
> > > > > > > [   55.107347] [<90000000002e6288>] hrtimer_interrupt+0x108/0x2a0
> > > > > > > [   55.139112] [<9000000000226418>] constant_timer_interrupt+0x38/0x60
> > > > > > > [   55.170749] [<90000000002b3010>] __handle_irq_event_percpu+0x50/0x160
> > > > > > > [   55.203141] [<90000000002b3138>] handle_irq_event_percpu+0x18/0x80
> > > > > > > [   55.235064] [<90000000002b9d54>] handle_percpu_irq+0x54/0xa0
> > > > > > > [   55.266241] [<90000000002b2168>] generic_handle_domain_irq+0x28/0x40
> > > > > > > [   55.298466] [<9000000000aba95c>] handle_cpu_irq+0x5c/0xa0
> > > > > > > [   55.329749] [<90000000012f7270>] handle_loongarch_irq+0x30/0x60
> > > > > > > [   55.361476] [<90000000012f733c>] do_vint+0x9c/0x100
> > > > > > > [   55.391737] [<9000000000332100>] kgdb_cpu_enter+0x540/0x640
> > > > > > > [   55.422440] [<9000000000332b64>] kgdb_handle_exception+0x104/0x180
> > > > > > > [   55.452911] [<9000000000232478>] kgdb_loongarch_notify+0x38/0xa0
> > > > > > > [   55.481964] [<900000000026b4d4>] notify_die+0x94/0x100
> > > > > > > [   55.509184] [<90000000012f685c>] do_bp+0x21c/0x340
> > > > > > > [   55.562475] [<90000000003315b8>] kgdb_compiled_break+0x0/0x28
> > > > > > > [   55.590319] [<9000000000332e80>] kgdb_register_io_module+0x160/0x1c0
> > > > > > > [   55.618901] [<9000000000c0f514>] configure_kgdboc+0x154/0x1c0
> > > > > > > [   55.647034] [<9000000000c0f5e0>] kgdboc_probe+0x60/0x80
> > > > > > > [   55.674647] [<9000000000c96da8>] platform_probe+0x68/0x100
> > > > > > > [   55.702613] [<9000000000c938e0>] really_probe+0xc0/0x340
> > > > > > > [   55.730528] [<9000000000c93be4>] __driver_probe_device+0x84/0x140
> > > > > > > [   55.759615] [<9000000000c93cdc>] driver_probe_device+0x3c/0x120
> > > > > > > [   55.787990] [<9000000000c93e8c>] __device_attach_driver+0xcc/0x160
> > > > > > > [   55.817145] [<9000000000c91290>] bus_for_each_drv+0x90/0x100
> > > > > > > [   55.845654] [<9000000000c94328>] __device_attach+0xa8/0x1a0
> > > > > > > [   55.874145] [<9000000000c925f0>] bus_probe_device+0xb0/0xe0
> > > > > > > [   55.902572] [<9000000000c8ec7c>] device_add+0x65c/0x860
> > > > > > > [   55.930635] [<9000000000c96704>] platform_device_add+0x124/0x2c0
> > > > > > > [   55.959669] [<9000000001452b38>] init_kgdboc+0x58/0xa0
> > > > > > > [   55.987677] [<900000000022015c>] do_one_initcall+0x7c/0x1e0
> > > > > > > [   56.016134] [<9000000001420f1c>] kernel_init_freeable+0x22c/0x2a0
> > > > > > > [   56.045128] [<90000000012f923c>] kernel_init+0x20/0x124
> > > > > > >
> > > > > > > Currently rcu_cpu_stall_reset() set rcu_state.jiffies_stall to one check
> > > > > > > period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> > > > > > > is only updated in the timer interrupt, so when kgdb_cpu_enter() begins
> > > > > > > to run there may already be nearly one rcu check period after jiffies.
> > > > > > > Since all interrupts are disabled during kgdb_cpu_enter(), jiffies will
> > > > > > > not be updated. When kgdb_cpu_enter() returns, rcu_state.jiffies_stall
> > > > > > > maybe already gets timeout.
> > > > > > >
> > > > > > > We can set rcu_state.jiffies_stall to two rcu check periods later, e.g.
> > > > > > > jiffies + (rcu_jiffies_till_stall_check() * 2) in rcu_cpu_stall_reset()
> > > > > > > to avoid this problem. But this isn't a complete solution because kgdb
> > > > > > > may take a very long time in irq disabled context.
> > > > > > >
> > > > > > > Instead, update jiffies at the beginning of rcu_cpu_stall_reset() can
> > > > > > > solve all kinds of problems.
> > > > > >
> > > > > > Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> > > > > > the stalls could simply be suppressed at the beginning of the debug
> > > > > > session and re-enabled upon exit, as is currently done for sysrq output
> > > > > > via rcu_sysrq_start() and rcu_sysrq_end().
> > > > > Thank you for your advice, but that doesn't help. Because
> > > > > rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
> > > > > during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
> > > > > since it is executed in irq disabled context. Instead, this patch
> > > > > wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
> > > > > old jiffies value.
> > > > >
> > > >
> > > > Hello, Huacai
> > > >
> > > > Is it possible to set  the rcu_cpu_stall_suppress is true in
> > > > dbg_touch_watchdogs()
> > > > and reset the rcu_cpu_stall_suppress at the beginning and end of the
> > > > RCU grace period?
> > > This is possible but not the best: 1, kgdb is not the only caller of
> > > rcu_cpu_stall_reset(); 2, it is difficult to find the "end" to reset
> > > rcu_cpu_stall_suppress.
> > >
> >
> > You can replace rcu_state.jiffies_stall update by setting rcu_cpu_stall_suppress
> > in rcu_cpu_stall_reset(),  and reset rcu_cpu_stall_suppress in rcu_gp_init() and
> > rcu_gp_cleanup().
> What's the advantage compared with updating jiffies? Updating jiffies
> seems more straight forward.
>

do_update_jiffies_64() needs to acquire the jiffies_lock raw spinlock, and, as you said, kgdb is not the only caller of rcu_cpu_stall_reset(): rcu_cpu_stall_reset() may also be invoked in NMI context (arch/x86/platform/uv/uv_nmi.c).

Thanks
Zqiang


> Huacai
>
> >
> > Thanks
> > Zqiang
> >
> > >
> > > > or set rcupdate.rcu_cpu_stall_suppress_at_boot=1 in bootargs can
> > > > suppress RCU stall
> > > > in booting.
> > > This is also possible, but it suppresses all kinds of stall warnings,
> > > which is not what we want.
> > >
> > > Huacai
> > > >
> > > >
> > > > Thanks
> > > > Zqiang
> > > >
> > > >
> > > > >
> > > > > Huacai
> > > > >
> > > > > >
> > > > > >                                                         Thanx, Paul
> > > > > >
> > > > > > > Cc: stable@vger.kernel.org
> > > > > > > Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
> > > > > > > Reported-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
> > > > > > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > > > > > > ---
> > > > > > >  kernel/rcu/tree_stall.h | 1 +
> > > > > > >  1 file changed, 1 insertion(+)
> > > > > > >
> > > > > > > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > > > > > > index b10b8349bb2a..1c7b540985bf 100644
> > > > > > > --- a/kernel/rcu/tree_stall.h
> > > > > > > +++ b/kernel/rcu/tree_stall.h
> > > > > > > @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
> > > > > > >   */
> > > > > > >  void rcu_cpu_stall_reset(void)
> > > > > > >  {
> > > > > > > +     do_update_jiffies_64(ktime_get());
> > > > > > >       WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > >                  jiffies + rcu_jiffies_till_stall_check());
> > > > > > >  }
> > > > > > > --
> > > > > > > 2.39.3
> > > > > > >
Huacai Chen Aug. 16, 2023, 12:28 p.m. UTC | #8
Hi, Qiang,

On Wed, Aug 16, 2023 at 6:06 PM Z qiang <qiang.zhang1211@gmail.com> wrote:
>
> >
> > Hi, Qiang,
> >
> > On Wed, Aug 16, 2023 at 1:09 PM Z qiang <qiang.zhang1211@gmail.com> wrote:
> > >
> > > >
> > > > Hi, Qiang,
> > > >
> > > > On Wed, Aug 16, 2023 at 11:16 AM Z qiang <qiang.zhang1211@gmail.com> wrote:
> > > > >
> > > > > >
> > > > > > Hi, Paul,
> > > > > >
> > > > > > On Tue, Aug 15, 2023 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > > >
> > > > > > > On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
> > > > > > > > The KGDB initial breakpoint gets an rcu stall warning after commit
> > > > > > > > a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
> > > > > > > > rcu_cpu_stall_reset()").
> > > > > > > >
> > > > > > > > [   53.452051] rcu: INFO: rcu_preempt self-detected stall on CPU
> > > > > > > > [   53.487950] rcu:     3-...0: (1 ticks this GP) idle=0e2c/1/0x4000000000000000 softirq=375/375 fqs=8
> > > > > > > > [   53.528243] rcu:     (t=12297 jiffies g=-995 q=1 ncpus=4)
> > > > > > > > [   53.564840] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc2+ #4848
> > > > > > > > [   53.603005] Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-CRB/Loongson-LS3A5000-7A2000-1w-CRB-V1.21, BIOS Loongson-UDK2018-V2.0.05099-beta8 08
> > > > > > > > [   53.682062] pc 9000000000332100 ra 90000000003320f4 tp 90000001000a0000 sp 90000001000a3710
> > > > > > > > [   53.724934] a0 9000000001d4b488 a1 0000000000000000 a2 0000000000000001 a3 0000000000000000
> > > > > > > > [   53.768179] a4 9000000001d526c8 a5 90000001000a38f0 a6 000000000000002c a7 0000000000000000
> > > > > > > > [   53.810751] t0 00000000000002b0 t1 0000000000000004 t2 900000000131c9c0 t3 fffffffffffffffa
> > > > > > > > [   53.853249] t4 0000000000000080 t5 90000001002ac190 t6 0000000000000004 t7 9000000001912d58
> > > > > > > > [   53.895684] t8 0000000000000000 u0 90000000013141a0 s9 0000000000000028 s0 9000000001d512f0
> > > > > > > > [   53.937633] s1 9000000001d51278 s2 90000001000a3798 s3 90000000019fc410 s4 9000000001d4b488
> > > > > > > > [   53.979486] s5 9000000001d512f0 s6 90000000013141a0 s7 0000000000000078 s8 9000000001d4b450
> > > > > > > > [   54.021175]    ra: 90000000003320f4 kgdb_cpu_enter+0x534/0x640
> > > > > > > > [   54.060150]   ERA: 9000000000332100 kgdb_cpu_enter+0x540/0x640
> > > > > > > > [   54.098347]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> > > > > > > > [   54.136621]  PRMD: 0000000c (PPLV0 +PIE +PWE)
> > > > > > > > [   54.172192]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
> > > > > > > > [   54.207838]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
> > > > > > > > [   54.242503] ESTAT: 00000800 [INT] (IS=11 ECode=0 EsubCode=0)
> > > > > > > > [   54.277996]  PRID: 0014c011 (Loongson-64bit, Loongson-3A5000-HV)
> > > > > > > > [   54.313544] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc2+ #4848
> > > > > > > > [   54.430170] Stack : 0072617764726148 0000000000000000 9000000000223504 90000001000a0000
> > > > > > > > [   54.472308]         9000000100073a90 9000000100073a98 0000000000000000 9000000100073bd8
> > > > > > > > [   54.514413]         9000000100073bd0 9000000100073bd0 9000000100073a00 0000000000000001
> > > > > > > > [   54.556018]         0000000000000001 9000000100073a98 99828271f24e961a 90000001002810c0
> > > > > > > > [   54.596924]         0000000000000001 0000000000010003 0000000000000000 0000000000000001
> > > > > > > > [   54.637115]         ffff8000337cdb80 0000000000000001 0000000006360000 900000000131c9c0
> > > > > > > > [   54.677049]         0000000000000000 0000000000000000 90000000017b4c98 9000000001912000
> > > > > > > > [   54.716394]         9000000001912f68 9000000001913000 9000000001912f70 00000000000002b0
> > > > > > > > [   54.754880]         90000000014a8840 0000000000000000 900000000022351c 0000000000000000
> > > > > > > > [   54.792372]         00000000000002b0 000000000000000c 0000000000000000 0000000000071c1c
> > > > > > > > [   54.829302]         ...
> > > > > > > > [   54.859163] Call Trace:
> > > > > > > > [   54.859165] [<900000000022351c>] show_stack+0x5c/0x180
> > > > > > > > [   54.918298] [<90000000012f6100>] dump_stack_lvl+0x60/0x88
> > > > > > > > [   54.949251] [<90000000012dd5d8>] rcu_dump_cpu_stacks+0xf0/0x148
> > > > > > > > [   54.981116] [<90000000002d2fb8>] rcu_sched_clock_irq+0xb78/0xe60
> > > > > > > > [   55.012744] [<90000000002e47cc>] update_process_times+0x6c/0xc0
> > > > > > > > [   55.044169] [<90000000002f65d4>] tick_sched_timer+0x54/0x100
> > > > > > > > [   55.075488] [<90000000002e5174>] __hrtimer_run_queues+0x154/0x240
> > > > > > > > [   55.107347] [<90000000002e6288>] hrtimer_interrupt+0x108/0x2a0
> > > > > > > > [   55.139112] [<9000000000226418>] constant_timer_interrupt+0x38/0x60
> > > > > > > > [   55.170749] [<90000000002b3010>] __handle_irq_event_percpu+0x50/0x160
> > > > > > > > [   55.203141] [<90000000002b3138>] handle_irq_event_percpu+0x18/0x80
> > > > > > > > [   55.235064] [<90000000002b9d54>] handle_percpu_irq+0x54/0xa0
> > > > > > > > [   55.266241] [<90000000002b2168>] generic_handle_domain_irq+0x28/0x40
> > > > > > > > [   55.298466] [<9000000000aba95c>] handle_cpu_irq+0x5c/0xa0
> > > > > > > > [   55.329749] [<90000000012f7270>] handle_loongarch_irq+0x30/0x60
> > > > > > > > [   55.361476] [<90000000012f733c>] do_vint+0x9c/0x100
> > > > > > > > [   55.391737] [<9000000000332100>] kgdb_cpu_enter+0x540/0x640
> > > > > > > > [   55.422440] [<9000000000332b64>] kgdb_handle_exception+0x104/0x180
> > > > > > > > [   55.452911] [<9000000000232478>] kgdb_loongarch_notify+0x38/0xa0
> > > > > > > > [   55.481964] [<900000000026b4d4>] notify_die+0x94/0x100
> > > > > > > > [   55.509184] [<90000000012f685c>] do_bp+0x21c/0x340
> > > > > > > > [   55.562475] [<90000000003315b8>] kgdb_compiled_break+0x0/0x28
> > > > > > > > [   55.590319] [<9000000000332e80>] kgdb_register_io_module+0x160/0x1c0
> > > > > > > > [   55.618901] [<9000000000c0f514>] configure_kgdboc+0x154/0x1c0
> > > > > > > > [   55.647034] [<9000000000c0f5e0>] kgdboc_probe+0x60/0x80
> > > > > > > > [   55.674647] [<9000000000c96da8>] platform_probe+0x68/0x100
> > > > > > > > [   55.702613] [<9000000000c938e0>] really_probe+0xc0/0x340
> > > > > > > > [   55.730528] [<9000000000c93be4>] __driver_probe_device+0x84/0x140
> > > > > > > > [   55.759615] [<9000000000c93cdc>] driver_probe_device+0x3c/0x120
> > > > > > > > [   55.787990] [<9000000000c93e8c>] __device_attach_driver+0xcc/0x160
> > > > > > > > [   55.817145] [<9000000000c91290>] bus_for_each_drv+0x90/0x100
> > > > > > > > [   55.845654] [<9000000000c94328>] __device_attach+0xa8/0x1a0
> > > > > > > > [   55.874145] [<9000000000c925f0>] bus_probe_device+0xb0/0xe0
> > > > > > > > [   55.902572] [<9000000000c8ec7c>] device_add+0x65c/0x860
> > > > > > > > [   55.930635] [<9000000000c96704>] platform_device_add+0x124/0x2c0
> > > > > > > > [   55.959669] [<9000000001452b38>] init_kgdboc+0x58/0xa0
> > > > > > > > [   55.987677] [<900000000022015c>] do_one_initcall+0x7c/0x1e0
> > > > > > > > [   56.016134] [<9000000001420f1c>] kernel_init_freeable+0x22c/0x2a0
> > > > > > > > [   56.045128] [<90000000012f923c>] kernel_init+0x20/0x124
> > > > > > > >
> > > > > > > > Currently rcu_cpu_stall_reset() set rcu_state.jiffies_stall to one check
> > > > > > > > period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> > > > > > > > is only updated in the timer interrupt, so when kgdb_cpu_enter() begins
> > > > > > > > to run there may already be nearly one rcu check period after jiffies.
> > > > > > > > Since all interrupts are disabled during kgdb_cpu_enter(), jiffies will
> > > > > > > > not be updated. When kgdb_cpu_enter() returns, rcu_state.jiffies_stall
> > > > > > > > maybe already gets timeout.
> > > > > > > >
> > > > > > > > We can set rcu_state.jiffies_stall to two rcu check periods later, e.g.
> > > > > > > > jiffies + (rcu_jiffies_till_stall_check() * 2) in rcu_cpu_stall_reset()
> > > > > > > > to avoid this problem. But this isn't a complete solution because kgdb
> > > > > > > > may take a very long time in irq disabled context.
> > > > > > > >
> > > > > > > > Instead, update jiffies at the beginning of rcu_cpu_stall_reset() can
> > > > > > > > solve all kinds of problems.
> > > > > > >
> > > > > > > Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> > > > > > > the stalls could simply be suppressed at the beginning of the debug
> > > > > > > session and re-enabled upon exit, as is currently done for sysrq output
> > > > > > > via rcu_sysrq_start() and rcu_sysrq_end().
> > > > > > Thank you for your advice, but that doesn't help. Because
> > > > > > rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
> > > > > > during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
> > > > > > since it is executed in irq disabled context. Instead, this patch
> > > > > > wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
> > > > > > old jiffies value.
> > > > > >
> > > > >
> > > > > Hello, Huacai
> > > > >
> > > > > Is it possible to set  the rcu_cpu_stall_suppress is true in
> > > > > dbg_touch_watchdogs()
> > > > > and reset the rcu_cpu_stall_suppress at the beginning and end of the
> > > > > RCU grace period?
> > > > This is possible but not the best: 1, kgdb is not the only caller of
> > > > rcu_cpu_stall_reset(); 2, it is difficult to find the "end" to reset
> > > > rcu_cpu_stall_suppress.
> > > >
> > >
> > > You can replace rcu_state.jiffies_stall update by setting rcu_cpu_stall_suppress
> > > in rcu_cpu_stall_reset(),  and reset rcu_cpu_stall_suppress in rcu_gp_init() and
> > > rcu_gp_cleanup().
> > What's the advantage compared with updating jiffies? Updating jiffies
> > seems more straight forward.
> >
>
> In do_update_jiffies_64(), need to acquire jiffies_lock raw spinlock,
> like you said, kgdb is not the only caller of rcu_cpu_stall_reset(),
> the rcu_cpu_stall_reset() maybe invoke in NMI  (arch/x86/platform/uv/uv_nmi.c)
Resetting rcu_cpu_stall_suppress in rcu_gp_init()/rcu_gp_cleanup() still does not look good to me, because it performs a useless operation in most cases. Moreover, the RCU core is refactored again and again, so something may change in the future.

If do_update_jiffies_64() cannot be used in NMI context, can we
consider my old method [1]?
https://lore.kernel.org/rcu/CAAhV-H7j9Y=VvRLm8thLw-EX1PGqBA9YfT4G1AN7ucYS=iP+DQ@mail.gmail.com/T/#t

Of course we should set rcu_state.jiffies_stall far enough in the
future, so we can do it like this:

 void rcu_cpu_stall_reset(void)
 {
  WRITE_ONCE(rcu_state.jiffies_stall,
-   jiffies + rcu_jiffies_till_stall_check());
+   jiffies + 300 * HZ);
 }

300s is the largest allowed stall-timeout value, and I think 300s is enough here in practice.

Huacai

>
> Thanks
> Zqiang
>
>
> > Huacai
> >
> > >
> > > Thanks
> > > Zqiang
> > >
> > > >
> > > > > or set rcupdate.rcu_cpu_stall_suppress_at_boot=1 in bootargs can
> > > > > suppress RCU stall
> > > > > in booting.
> > > > This is also possible, but it suppresses all kinds of stall warnings,
> > > > which is not what we want.
> > > >
> > > > Huacai
> > > > >
> > > > >
> > > > > Thanks
> > > > > Zqiang
> > > > >
> > > > >
> > > > > >
> > > > > > Huacai
> > > > > >
> > > > > > >
> > > > > > >                                                         Thanx, Paul
> > > > > > >
> > > > > > > > Cc: stable@vger.kernel.org
> > > > > > > > Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
> > > > > > > > Reported-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
> > > > > > > > Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> > > > > > > > ---
> > > > > > > >  kernel/rcu/tree_stall.h | 1 +
> > > > > > > >  1 file changed, 1 insertion(+)
> > > > > > > >
> > > > > > > > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > > > > > > > index b10b8349bb2a..1c7b540985bf 100644
> > > > > > > > --- a/kernel/rcu/tree_stall.h
> > > > > > > > +++ b/kernel/rcu/tree_stall.h
> > > > > > > > @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
> > > > > > > >   */
> > > > > > > >  void rcu_cpu_stall_reset(void)
> > > > > > > >  {
> > > > > > > > +     do_update_jiffies_64(ktime_get());
> > > > > > > >       WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > > >                  jiffies + rcu_jiffies_till_stall_check());
> > > > > > > >  }
> > > > > > > > --
> > > > > > > > 2.39.3
> > > > > > > >
Alan Huang Aug. 16, 2023, 3:56 p.m. UTC | #9
>>>>>>>>> 
>>>>>>>>> Currently rcu_cpu_stall_reset() set rcu_state.jiffies_stall to one check
>>>>>>>>> period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
>>>>>>>>> is only updated in the timer interrupt, so when kgdb_cpu_enter() begins
>>>>>>>>> to run there may already be nearly one rcu check period after jiffies.
>>>>>>>>> Since all interrupts are disabled during kgdb_cpu_enter(), jiffies will
>>>>>>>>> not be updated. When kgdb_cpu_enter() returns, rcu_state.jiffies_stall
>>>>>>>>> maybe already gets timeout.
>>>>>>>>> 
>>>>>>>>> We can set rcu_state.jiffies_stall to two rcu check periods later, e.g.
>>>>>>>>> jiffies + (rcu_jiffies_till_stall_check() * 2) in rcu_cpu_stall_reset()
>>>>>>>>> to avoid this problem. But this isn't a complete solution because kgdb
>>>>>>>>> may take a very long time in irq disabled context.
>>>>>>>>> 
>>>>>>>>> Instead, update jiffies at the beginning of rcu_cpu_stall_reset() can
>>>>>>>>> solve all kinds of problems.
>>>>>>>> 
>>>>>>>> Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
>>>>>>>> the stalls could simply be suppressed at the beginning of the debug
>>>>>>>> session and re-enabled upon exit, as is currently done for sysrq output
>>>>>>>> via rcu_sysrq_start() and rcu_sysrq_end().
>>>>>>> Thank you for your advice, but that doesn't help. Because
>>>>>>> rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
>>>>>>> during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
>>>>>>> since it is executed in irq disabled context. Instead, this patch
>>>>>>> wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
>>>>>>> old jiffies value.
>>>>>>> 
>>>>>> 
>>>>>> Hello, Huacai
>>>>>> 
>>>>>> Is it possible to set  the rcu_cpu_stall_suppress is true in
>>>>>> dbg_touch_watchdogs()
>>>>>> and reset the rcu_cpu_stall_suppress at the beginning and end of the
>>>>>> RCU grace period?
>>>>> This is possible but not the best: 1, kgdb is not the only caller of
>>>>> rcu_cpu_stall_reset(); 2, it is difficult to find the "end" to reset
>>>>> rcu_cpu_stall_suppress.
>>>>> 
>>>> 
>>>> You can replace rcu_state.jiffies_stall update by setting rcu_cpu_stall_suppress
>>>> in rcu_cpu_stall_reset(),  and reset rcu_cpu_stall_suppress in rcu_gp_init() and
>>>> rcu_gp_cleanup().
>>> What's the advantage compared with updating jiffies? Updating jiffies
>>> seems more straight forward.
>>> 
>> 
>> In do_update_jiffies_64(), need to acquire jiffies_lock raw spinlock,
>> like you said, kgdb is not the only caller of rcu_cpu_stall_reset(),
>> the rcu_cpu_stall_reset() maybe invoke in NMI  (arch/x86/platform/uv/uv_nmi.c)
> Reset rcu_cpu_stall_suppress in rcu_gp_init()/rcu_gp_cleanup() is
> still not so good to me, because it does a useless operation in most
> cases. Moreover, the rcu core is refactored again and again, something
> may be changed in future.
> 
> If  do_update_jiffies_64() cannot be used in NMI context, can we

What about updating jiffies in dbg_touch_watchdogs(), or adding a wrapper that updates
both jiffies and jiffies_stall?

> consider my old method [1]?
> https://lore.kernel.org/rcu/CAAhV-H7j9Y=VvRLm8thLw-EX1PGqBA9YfT4G1AN7ucYS=iP+DQ@mail.gmail.com/T/#t
> 
> Of course we should set rcu_state.jiffies_stall large enough, so we
> can do like this:
> 
> void rcu_cpu_stall_reset(void)
> {
>  WRITE_ONCE(rcu_state.jiffies_stall,
> -   jiffies + rcu_jiffies_till_stall_check());
> +   jiffies + 300 * HZ);
> }
> 
> 300s is the largest timeout value, and I think 300s is enough here in practice.
> 
> Huacai
> 
>> 
>> Thanks
>> Zqiang
>> 
>> 
>>> Huacai
>>> 
>>>> 
>>>> Thanks
>>>> Zqiang
>>>> 
>>>>> 
>>>>>> or set rcupdate.rcu_cpu_stall_suppress_at_boot=1 in bootargs can
>>>>>> suppress RCU stall
>>>>>> in booting.
>>>>> This is also possible, but it suppresses all kinds of stall warnings,
>>>>> which is not what we want.
>>>>> 
>>>>> Huacai
>>>>>> 
>>>>>> 
>>>>>> Thanks
>>>>>> Zqiang
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Huacai
>>>>>>> 
>>>>>>>> 
>>>>>>>>                                                        Thanx, Paul
>>>>>>>> 
>>>>>>>>> Cc: stable@vger.kernel.org
>>>>>>>>> Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
>>>>>>>>> Reported-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
>>>>>>>>> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>>>>>>>>> ---
>>>>>>>>> kernel/rcu/tree_stall.h | 1 +
>>>>>>>>> 1 file changed, 1 insertion(+)
>>>>>>>>> 
>>>>>>>>> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
>>>>>>>>> index b10b8349bb2a..1c7b540985bf 100644
>>>>>>>>> --- a/kernel/rcu/tree_stall.h
>>>>>>>>> +++ b/kernel/rcu/tree_stall.h
>>>>>>>>> @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
>>>>>>>>>  */
>>>>>>>>> void rcu_cpu_stall_reset(void)
>>>>>>>>> {
>>>>>>>>> +     do_update_jiffies_64(ktime_get());
>>>>>>>>>      WRITE_ONCE(rcu_state.jiffies_stall,
>>>>>>>>>                 jiffies + rcu_jiffies_till_stall_check());
>>>>>>>>> }
>>>>>>>>> --
>>>>>>>>> 2.39.3
Huacai Chen Aug. 16, 2023, 4:13 p.m. UTC | #10
Hi, Alan,

On Wed, Aug 16, 2023 at 11:57 PM Alan Huang <mmpgouride@gmail.com> wrote:
>
> >>>>>>>>>
> >>>>>>>>> Currently rcu_cpu_stall_reset() set rcu_state.jiffies_stall to one check
> >>>>>>>>> period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> >>>>>>>>> is only updated in the timer interrupt, so when kgdb_cpu_enter() begins
> >>>>>>>>> to run there may already be nearly one rcu check period after jiffies.
> >>>>>>>>> Since all interrupts are disabled during kgdb_cpu_enter(), jiffies will
> >>>>>>>>> not be updated. When kgdb_cpu_enter() returns, rcu_state.jiffies_stall
> >>>>>>>>> maybe already gets timeout.
> >>>>>>>>>
> >>>>>>>>> We can set rcu_state.jiffies_stall to two rcu check periods later, e.g.
> >>>>>>>>> jiffies + (rcu_jiffies_till_stall_check() * 2) in rcu_cpu_stall_reset()
> >>>>>>>>> to avoid this problem. But this isn't a complete solution because kgdb
> >>>>>>>>> may take a very long time in irq disabled context.
> >>>>>>>>>
> >>>>>>>>> Instead, update jiffies at the beginning of rcu_cpu_stall_reset() can
> >>>>>>>>> solve all kinds of problems.
> >>>>>>>>
> >>>>>>>> Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> >>>>>>>> the stalls could simply be suppressed at the beginning of the debug
> >>>>>>>> session and re-enabled upon exit, as is currently done for sysrq output
> >>>>>>>> via rcu_sysrq_start() and rcu_sysrq_end().
> >>>>>>> Thank you for your advice, but that doesn't help. Because
> >>>>>>> rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
> >>>>>>> during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
> >>>>>>> since it is executed in irq disabled context. Instead, this patch
> >>>>>>> wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
> >>>>>>> old jiffies value.
> >>>>>>>
> >>>>>>
> >>>>>> Hello, Huacai
> >>>>>>
> >>>>>> Is it possible to set  the rcu_cpu_stall_suppress is true in
> >>>>>> dbg_touch_watchdogs()
> >>>>>> and reset the rcu_cpu_stall_suppress at the beginning and end of the
> >>>>>> RCU grace period?
> >>>>> This is possible but not the best: 1, kgdb is not the only caller of
> >>>>> rcu_cpu_stall_reset(); 2, it is difficult to find the "end" to reset
> >>>>> rcu_cpu_stall_suppress.
> >>>>>
> >>>>
> >>>> You can replace rcu_state.jiffies_stall update by setting rcu_cpu_stall_suppress
> >>>> in rcu_cpu_stall_reset(),  and reset rcu_cpu_stall_suppress in rcu_gp_init() and
> >>>> rcu_gp_cleanup().
> >>> What's the advantage compared with updating jiffies? Updating jiffies
> >>> seems more straight forward.
> >>>
> >>
> >> In do_update_jiffies_64(), need to acquire jiffies_lock raw spinlock,
> >> like you said, kgdb is not the only caller of rcu_cpu_stall_reset(),
> >> the rcu_cpu_stall_reset() maybe invoke in NMI  (arch/x86/platform/uv/uv_nmi.c)
> > Reset rcu_cpu_stall_suppress in rcu_gp_init()/rcu_gp_cleanup() is
> > still not so good to me, because it does a useless operation in most
> > cases. Moreover, the rcu core is refactored again and again, something
> > may be changed in future.
> >
> > If  do_update_jiffies_64() cannot be used in NMI context, can we
>
> What about updating jiffies in dbg_touch_watchdogs or adding a wrapper which updates
> both jiffies and jiffies_stall?
This can solve the kgdb problem, but I found that most callers of
rcu_cpu_stall_reset() run in irq-disabled context, so they may hit
similar problems. Modifying rcu_cpu_stall_reset() itself can solve all
of them.

But due to the NMI issue, from my point of view, setting jiffies_stall
to jiffies + 300*HZ is the best solution for now. :)

Huacai
>
> > consider my old method [1]?
> > https://lore.kernel.org/rcu/CAAhV-H7j9Y=VvRLm8thLw-EX1PGqBA9YfT4G1AN7ucYS=iP+DQ@mail.gmail.com/T/#t
> >
> > Of course we should set rcu_state.jiffies_stall large enough, so we
> > can do like this:
> >
> > void rcu_cpu_stall_reset(void)
> > {
> >  WRITE_ONCE(rcu_state.jiffies_stall,
> > -   jiffies + rcu_jiffies_till_stall_check());
> > +   jiffies + 300 * HZ);
> > }
> >
> > 300s is the largest timeout value, and I think 300s is enough here in practice.
> >
> > Huacai
> >
> >>
> >> Thanks
> >> Zqiang
> >>
> >>
> >>> Huacai
> >>>
> >>>>
> >>>> Thanks
> >>>> Zqiang
> >>>>
> >>>>>
> >>>>>> or set rcupdate.rcu_cpu_stall_suppress_at_boot=1 in bootargs can
> >>>>>> suppress RCU stall
> >>>>>> in booting.
> >>>>> This is also possible, but it suppresses all kinds of stall warnings,
> >>>>> which is not what we want.
> >>>>>
> >>>>> Huacai
> >>>>>>
> >>>>>>
> >>>>>> Thanks
> >>>>>> Zqiang
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Huacai
> >>>>>>>
> >>>>>>>>
> >>>>>>>>                                                        Thanx, Paul
> >>>>>>>>
> >>>>>>>>> Cc: stable@vger.kernel.org
> >>>>>>>>> Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
> >>>>>>>>> Reported-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
> >>>>>>>>> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> >>>>>>>>> ---
> >>>>>>>>> kernel/rcu/tree_stall.h | 1 +
> >>>>>>>>> 1 file changed, 1 insertion(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> >>>>>>>>> index b10b8349bb2a..1c7b540985bf 100644
> >>>>>>>>> --- a/kernel/rcu/tree_stall.h
> >>>>>>>>> +++ b/kernel/rcu/tree_stall.h
> >>>>>>>>> @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
> >>>>>>>>>  */
> >>>>>>>>> void rcu_cpu_stall_reset(void)
> >>>>>>>>> {
> >>>>>>>>> +     do_update_jiffies_64(ktime_get());
> >>>>>>>>>      WRITE_ONCE(rcu_state.jiffies_stall,
> >>>>>>>>>                 jiffies + rcu_jiffies_till_stall_check());
> >>>>>>>>> }
> >>>>>>>>> --
> >>>>>>>>> 2.39.3
>
>
Alan Huang Aug. 16, 2023, 4:52 p.m. UTC | #11
>> 
>>>>>>>>>>> 
>>>>>>>>>>> Currently rcu_cpu_stall_reset() sets rcu_state.jiffies_stall to one check
>>>>>>>>>>> period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
>>>>>>>>>>> is only updated in the timer interrupt, so by the time kgdb_cpu_enter()
>>>>>>>>>>> begins to run, nearly one check period may already have elapsed since
>>>>>>>>>>> jiffies was last updated. Since all interrupts are disabled during
>>>>>>>>>>> kgdb_cpu_enter(), jiffies will not be updated there, and by the time
>>>>>>>>>>> kgdb_cpu_enter() returns, rcu_state.jiffies_stall may already have
>>>>>>>>>>> timed out.
>>>>>>>>>>> 
>>>>>>>>>>> We could set rcu_state.jiffies_stall to two check periods later, e.g.
>>>>>>>>>>> jiffies + (rcu_jiffies_till_stall_check() * 2), in rcu_cpu_stall_reset()
>>>>>>>>>>> to avoid this problem. But that is not a complete solution, because kgdb
>>>>>>>>>>> may spend a very long time in irq-disabled context.
>>>>>>>>>>> 
>>>>>>>>>>> Instead, updating jiffies at the beginning of rcu_cpu_stall_reset()
>>>>>>>>>>> solves all of these problems.
>>>>>>>>>> 
>>>>>>>>>> Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
>>>>>>>>>> the stalls could simply be suppressed at the beginning of the debug
>>>>>>>>>> session and re-enabled upon exit, as is currently done for sysrq output
>>>>>>>>>> via rcu_sysrq_start() and rcu_sysrq_end().
>>>>>>>>> Thank you for your advice, but that doesn't help. Because
>>>>>>>>> rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
>>>>>>>>> during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
>>>>>>>>> since it is executed in irq disabled context. Instead, this patch
>>>>>>>>> wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
>>>>>>>>> old jiffies value.
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Hello, Huacai
>>>>>>>> 
> >>>>>>>> Is it possible to set rcu_cpu_stall_suppress to true in
> >>>>>>>> dbg_touch_watchdogs() and reset rcu_cpu_stall_suppress at the
> >>>>>>>> beginning and end of the RCU grace period?
>>>>>>> This is possible but not the best: 1, kgdb is not the only caller of
>>>>>>> rcu_cpu_stall_reset(); 2, it is difficult to find the "end" to reset
>>>>>>> rcu_cpu_stall_suppress.
>>>>>>> 
>>>>>> 
>>>>>> You can replace rcu_state.jiffies_stall update by setting rcu_cpu_stall_suppress
>>>>>> in rcu_cpu_stall_reset(),  and reset rcu_cpu_stall_suppress in rcu_gp_init() and
>>>>>> rcu_gp_cleanup().
>>>>> What's the advantage compared with updating jiffies? Updating jiffies
> >>>>> seems more straightforward.
>>>>> 
>>>> 
> >>>> do_update_jiffies_64() needs to acquire the jiffies_lock raw spinlock.
> >>>> As you said, kgdb is not the only caller of rcu_cpu_stall_reset(), and
> >>>> rcu_cpu_stall_reset() may be invoked in NMI context (arch/x86/platform/uv/uv_nmi.c).
> >>> Resetting rcu_cpu_stall_suppress in rcu_gp_init()/rcu_gp_cleanup() still
> >>> does not look good to me, because it does a useless operation in most
> >>> cases. Moreover, the RCU core is refactored again and again, so something
> >>> may change in the future.
> >>>
> >>> If do_update_jiffies_64() cannot be used in NMI context, can we
>> 
>> What about updating jiffies in dbg_touch_watchdogs or adding a wrapper which updates
>> both jiffies and jiffies_stall?
> This can solve the kgdb problem, but I found that most callers of
> rcu_cpu_stall_reset() are in irq disabled context so they may meet

The duration of other contexts where interrupts are disabled may not be as long as in the case of kgdb?

> similar problems. Modifying rcu_cpu_stall_reset() can solve all of
> them.
> 
> But due to the NMI issue, from my point of view, setting jiffies_stall
> to jiffies + 300*HZ is the best solution now. :)

If I understand correctly, the NMI issue is the deadlock issue? If so, given the short duration of other irq-disabled
contexts, it's OK to just update jiffies in dbg_touch_watchdogs.

Please correct me if anything wrong. :)

> 
> Huacai
>> 
>>> consider my old method [1]?
>>> https://lore.kernel.org/rcu/CAAhV-H7j9Y=VvRLm8thLw-EX1PGqBA9YfT4G1AN7ucYS=iP+DQ@mail.gmail.com/T/#t
>>> 
>>> Of course we should set rcu_state.jiffies_stall large enough, so we
>>> can do like this:
>>> 
>>> void rcu_cpu_stall_reset(void)
>>> {
>>> WRITE_ONCE(rcu_state.jiffies_stall,
>>> -   jiffies + rcu_jiffies_till_stall_check());
>>> +   jiffies + 300 * HZ);
>>> }
>>> 
>>> 300s is the largest timeout value, and I think 300s is enough here in practice.
>>> 
>>> Huacai
>>> 
>>>> 
>>>> Thanks
>>>> Zqiang
>>>> 
>>>> 
>>>>> Huacai
>>>>> 
>>>>>> 
>>>>>> Thanks
>>>>>> Zqiang
>>>>>> 
>>>>>>> 
>>>>>>>> or set rcupdate.rcu_cpu_stall_suppress_at_boot=1 in bootargs can
>>>>>>>> suppress RCU stall
>>>>>>>> in booting.
>>>>>>> This is also possible, but it suppresses all kinds of stall warnings,
>>>>>>> which is not what we want.
>>>>>>> 
>>>>>>> Huacai
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> Zqiang
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Huacai
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>                                                       Thanx, Paul
>>>>>>>>>> 
>>>>>>>>>>> Cc: stable@vger.kernel.org
>>>>>>>>>>> Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
>>>>>>>>>>> Reported-by: Binbin Zhou <zhoubinbin@loongson.cn>
>>>>>>>>>>> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>>>>>>>>>>> ---
>>>>>>>>>>> kernel/rcu/tree_stall.h | 1 +
>>>>>>>>>>> 1 file changed, 1 insertion(+)
>>>>>>>>>>> 
>>>>>>>>>>> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
>>>>>>>>>>> index b10b8349bb2a..1c7b540985bf 100644
>>>>>>>>>>> --- a/kernel/rcu/tree_stall.h
>>>>>>>>>>> +++ b/kernel/rcu/tree_stall.h
>>>>>>>>>>> @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
>>>>>>>>>>> */
>>>>>>>>>>> void rcu_cpu_stall_reset(void)
>>>>>>>>>>> {
>>>>>>>>>>> +     do_update_jiffies_64(ktime_get());
>>>>>>>>>>>     WRITE_ONCE(rcu_state.jiffies_stall,
>>>>>>>>>>>                jiffies + rcu_jiffies_till_stall_check());
>>>>>>>>>>> }
>>>>>>>>>>> --
>>>>>>>>>>> 2.39.3
Joel Fernandes Aug. 16, 2023, 7:27 p.m. UTC | #12
> On Aug 16, 2023, at 8:29 AM, Huacai Chen <chenhuacai@kernel.org> wrote:
> 
> Hi, Qiang,
> 
>> On Wed, Aug 16, 2023 at 6:06 PM Z qiang <qiang.zhang1211@gmail.com> wrote:
>> 
>>> 
>>> Hi, Qiang,
>>> 
>>> On Wed, Aug 16, 2023 at 1:09 PM Z qiang <qiang.zhang1211@gmail.com> wrote:
>>>> 
>>>>> 
>>>>> Hi, Qiang,
>>>>> 
>>>>> On Wed, Aug 16, 2023 at 11:16 AM Z qiang <qiang.zhang1211@gmail.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> Hi, Paul,
>>>>>>> 
>>>>>>> On Tue, Aug 15, 2023 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>>>>>>>> 
>>>>>>>> On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
>>>>>>>>> The KGDB initial breakpoint gets an rcu stall warning after commit
>>>>>>>>> a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
>>>>>>>>> rcu_cpu_stall_reset()").
>>>>>>>>> 
>>>>>>>>> [   53.452051] rcu: INFO: rcu_preempt self-detected stall on CPU
>>>>>>>>> [   53.487950] rcu:     3-...0: (1 ticks this GP) idle=0e2c/1/0x4000000000000000 softirq=375/375 fqs=8
>>>>>>>>> [   53.528243] rcu:     (t=12297 jiffies g=-995 q=1 ncpus=4)
>>>>>>>>> [   53.564840] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc2+ #4848
>>>>>>>>> [   53.603005] Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-CRB/Loongson-LS3A5000-7A2000-1w-CRB-V1.21, BIOS Loongson-UDK2018-V2.0.05099-beta8 08
>>>>>>>>> [   53.682062] pc 9000000000332100 ra 90000000003320f4 tp 90000001000a0000 sp 90000001000a3710
>>>>>>>>> [   53.724934] a0 9000000001d4b488 a1 0000000000000000 a2 0000000000000001 a3 0000000000000000
>>>>>>>>> [   53.768179] a4 9000000001d526c8 a5 90000001000a38f0 a6 000000000000002c a7 0000000000000000
>>>>>>>>> [   53.810751] t0 00000000000002b0 t1 0000000000000004 t2 900000000131c9c0 t3 fffffffffffffffa
>>>>>>>>> [   53.853249] t4 0000000000000080 t5 90000001002ac190 t6 0000000000000004 t7 9000000001912d58
>>>>>>>>> [   53.895684] t8 0000000000000000 u0 90000000013141a0 s9 0000000000000028 s0 9000000001d512f0
>>>>>>>>> [   53.937633] s1 9000000001d51278 s2 90000001000a3798 s3 90000000019fc410 s4 9000000001d4b488
>>>>>>>>> [   53.979486] s5 9000000001d512f0 s6 90000000013141a0 s7 0000000000000078 s8 9000000001d4b450
>>>>>>>>> [   54.021175]    ra: 90000000003320f4 kgdb_cpu_enter+0x534/0x640
>>>>>>>>> [   54.060150]   ERA: 9000000000332100 kgdb_cpu_enter+0x540/0x640
>>>>>>>>> [   54.098347]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
>>>>>>>>> [   54.136621]  PRMD: 0000000c (PPLV0 +PIE +PWE)
>>>>>>>>> [   54.172192]  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
>>>>>>>>> [   54.207838]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
>>>>>>>>> [   54.242503] ESTAT: 00000800 [INT] (IS=11 ECode=0 EsubCode=0)
>>>>>>>>> [   54.277996]  PRID: 0014c011 (Loongson-64bit, Loongson-3A5000-HV)
>>>>>>>>> [   54.313544] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc2+ #4848
>>>>>>>>> [   54.430170] Stack : 0072617764726148 0000000000000000 9000000000223504 90000001000a0000
>>>>>>>>> [   54.472308]         9000000100073a90 9000000100073a98 0000000000000000 9000000100073bd8
>>>>>>>>> [   54.514413]         9000000100073bd0 9000000100073bd0 9000000100073a00 0000000000000001
>>>>>>>>> [   54.556018]         0000000000000001 9000000100073a98 99828271f24e961a 90000001002810c0
>>>>>>>>> [   54.596924]         0000000000000001 0000000000010003 0000000000000000 0000000000000001
>>>>>>>>> [   54.637115]         ffff8000337cdb80 0000000000000001 0000000006360000 900000000131c9c0
>>>>>>>>> [   54.677049]         0000000000000000 0000000000000000 90000000017b4c98 9000000001912000
>>>>>>>>> [   54.716394]         9000000001912f68 9000000001913000 9000000001912f70 00000000000002b0
>>>>>>>>> [   54.754880]         90000000014a8840 0000000000000000 900000000022351c 0000000000000000
>>>>>>>>> [   54.792372]         00000000000002b0 000000000000000c 0000000000000000 0000000000071c1c
>>>>>>>>> [   54.829302]         ...
>>>>>>>>> [   54.859163] Call Trace:
>>>>>>>>> [   54.859165] [<900000000022351c>] show_stack+0x5c/0x180
>>>>>>>>> [   54.918298] [<90000000012f6100>] dump_stack_lvl+0x60/0x88
>>>>>>>>> [   54.949251] [<90000000012dd5d8>] rcu_dump_cpu_stacks+0xf0/0x148
>>>>>>>>> [   54.981116] [<90000000002d2fb8>] rcu_sched_clock_irq+0xb78/0xe60
>>>>>>>>> [   55.012744] [<90000000002e47cc>] update_process_times+0x6c/0xc0
>>>>>>>>> [   55.044169] [<90000000002f65d4>] tick_sched_timer+0x54/0x100
>>>>>>>>> [   55.075488] [<90000000002e5174>] __hrtimer_run_queues+0x154/0x240
>>>>>>>>> [   55.107347] [<90000000002e6288>] hrtimer_interrupt+0x108/0x2a0
>>>>>>>>> [   55.139112] [<9000000000226418>] constant_timer_interrupt+0x38/0x60
>>>>>>>>> [   55.170749] [<90000000002b3010>] __handle_irq_event_percpu+0x50/0x160
>>>>>>>>> [   55.203141] [<90000000002b3138>] handle_irq_event_percpu+0x18/0x80
>>>>>>>>> [   55.235064] [<90000000002b9d54>] handle_percpu_irq+0x54/0xa0
>>>>>>>>> [   55.266241] [<90000000002b2168>] generic_handle_domain_irq+0x28/0x40
>>>>>>>>> [   55.298466] [<9000000000aba95c>] handle_cpu_irq+0x5c/0xa0
>>>>>>>>> [   55.329749] [<90000000012f7270>] handle_loongarch_irq+0x30/0x60
>>>>>>>>> [   55.361476] [<90000000012f733c>] do_vint+0x9c/0x100
>>>>>>>>> [   55.391737] [<9000000000332100>] kgdb_cpu_enter+0x540/0x640
>>>>>>>>> [   55.422440] [<9000000000332b64>] kgdb_handle_exception+0x104/0x180
>>>>>>>>> [   55.452911] [<9000000000232478>] kgdb_loongarch_notify+0x38/0xa0
>>>>>>>>> [   55.481964] [<900000000026b4d4>] notify_die+0x94/0x100
>>>>>>>>> [   55.509184] [<90000000012f685c>] do_bp+0x21c/0x340
>>>>>>>>> [   55.562475] [<90000000003315b8>] kgdb_compiled_break+0x0/0x28
>>>>>>>>> [   55.590319] [<9000000000332e80>] kgdb_register_io_module+0x160/0x1c0
>>>>>>>>> [   55.618901] [<9000000000c0f514>] configure_kgdboc+0x154/0x1c0
>>>>>>>>> [   55.647034] [<9000000000c0f5e0>] kgdboc_probe+0x60/0x80
>>>>>>>>> [   55.674647] [<9000000000c96da8>] platform_probe+0x68/0x100
>>>>>>>>> [   55.702613] [<9000000000c938e0>] really_probe+0xc0/0x340
>>>>>>>>> [   55.730528] [<9000000000c93be4>] __driver_probe_device+0x84/0x140
>>>>>>>>> [   55.759615] [<9000000000c93cdc>] driver_probe_device+0x3c/0x120
>>>>>>>>> [   55.787990] [<9000000000c93e8c>] __device_attach_driver+0xcc/0x160
>>>>>>>>> [   55.817145] [<9000000000c91290>] bus_for_each_drv+0x90/0x100
>>>>>>>>> [   55.845654] [<9000000000c94328>] __device_attach+0xa8/0x1a0
>>>>>>>>> [   55.874145] [<9000000000c925f0>] bus_probe_device+0xb0/0xe0
>>>>>>>>> [   55.902572] [<9000000000c8ec7c>] device_add+0x65c/0x860
>>>>>>>>> [   55.930635] [<9000000000c96704>] platform_device_add+0x124/0x2c0
>>>>>>>>> [   55.959669] [<9000000001452b38>] init_kgdboc+0x58/0xa0
>>>>>>>>> [   55.987677] [<900000000022015c>] do_one_initcall+0x7c/0x1e0
>>>>>>>>> [   56.016134] [<9000000001420f1c>] kernel_init_freeable+0x22c/0x2a0
>>>>>>>>> [   56.045128] [<90000000012f923c>] kernel_init+0x20/0x124
>>>>>>>>> 
> >>>>>>>>> Currently rcu_cpu_stall_reset() sets rcu_state.jiffies_stall to one check
> >>>>>>>>> period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> >>>>>>>>> is only updated in the timer interrupt, so by the time kgdb_cpu_enter()
> >>>>>>>>> begins to run, nearly one check period may already have elapsed since
> >>>>>>>>> jiffies was last updated. Since all interrupts are disabled during
> >>>>>>>>> kgdb_cpu_enter(), jiffies will not be updated there, and by the time
> >>>>>>>>> kgdb_cpu_enter() returns, rcu_state.jiffies_stall may already have
> >>>>>>>>> timed out.
> >>>>>>>>>
> >>>>>>>>> We could set rcu_state.jiffies_stall to two check periods later, e.g.
> >>>>>>>>> jiffies + (rcu_jiffies_till_stall_check() * 2), in rcu_cpu_stall_reset()
> >>>>>>>>> to avoid this problem. But that is not a complete solution, because kgdb
> >>>>>>>>> may spend a very long time in irq-disabled context.
> >>>>>>>>>
> >>>>>>>>> Instead, updating jiffies at the beginning of rcu_cpu_stall_reset()
> >>>>>>>>> solves all of these problems.
>>>>>>>> 
>>>>>>>> Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
>>>>>>>> the stalls could simply be suppressed at the beginning of the debug
>>>>>>>> session and re-enabled upon exit, as is currently done for sysrq output
>>>>>>>> via rcu_sysrq_start() and rcu_sysrq_end().
>>>>>>> Thank you for your advice, but that doesn't help. Because
>>>>>>> rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
>>>>>>> during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
>>>>>>> since it is executed in irq disabled context. Instead, this patch
>>>>>>> wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
>>>>>>> old jiffies value.
>>>>>>> 
>>>>>> 
>>>>>> Hello, Huacai
>>>>>> 
>>>>>> Is it possible to set rcu_cpu_stall_suppress to true in
>>>>>> dbg_touch_watchdogs() and reset rcu_cpu_stall_suppress at the
>>>>>> beginning and end of the RCU grace period?
>>>>> This is possible but not the best: 1, kgdb is not the only caller of
>>>>> rcu_cpu_stall_reset(); 2, it is difficult to find the "end" to reset
>>>>> rcu_cpu_stall_suppress.
>>>>> 
>>>> 
>>>> You can replace rcu_state.jiffies_stall update by setting rcu_cpu_stall_suppress
>>>> in rcu_cpu_stall_reset(),  and reset rcu_cpu_stall_suppress in rcu_gp_init() and
>>>> rcu_gp_cleanup().
>>> What's the advantage compared with updating jiffies? Updating jiffies
>>> seems more straightforward.
>>> 
>> 
>> do_update_jiffies_64() needs to acquire the jiffies_lock raw spinlock.
>> As you said, kgdb is not the only caller of rcu_cpu_stall_reset(), and
>> rcu_cpu_stall_reset() may be invoked in NMI context (arch/x86/platform/uv/uv_nmi.c).
> Resetting rcu_cpu_stall_suppress in rcu_gp_init()/rcu_gp_cleanup() still
> does not look good to me, because it does a useless operation in most
> cases. Moreover, the RCU core is refactored again and again, so something
> may change in the future.
> 
> If do_update_jiffies_64() cannot be used in NMI context,

Can you not make the jiffies update conditional on whether it is called within NMI context?

> can we
> consider my old method [1]?
> https://lore.kernel.org/rcu/CAAhV-H7j9Y=VvRLm8thLw-EX1PGqBA9YfT4G1AN7ucYS=iP+DQ@mail.gmail.com/T/#t
> 
> Of course we should set rcu_state.jiffies_stall large enough, so we
> can do like this:
> 
> void rcu_cpu_stall_reset(void)
> {
>  WRITE_ONCE(rcu_state.jiffies_stall,
> -   jiffies + rcu_jiffies_till_stall_check());
> +   jiffies + 300 * HZ);
> }
> 
> 300s is the largest timeout value, and I think 300s is enough here in practice.

I dislike that..

Thanks,

 - Joel



> 
> Huacai
> 
>> 
>> Thanks
>> Zqiang
>> 
>> 
>>> Huacai
>>> 
>>>> 
>>>> Thanks
>>>> Zqiang
>>>> 
>>>>> 
>>>>>> or set rcupdate.rcu_cpu_stall_suppress_at_boot=1 in bootargs can
>>>>>> suppress RCU stall
>>>>>> in booting.
>>>>> This is also possible, but it suppresses all kinds of stall warnings,
>>>>> which is not what we want.
>>>>> 
>>>>> Huacai
>>>>>> 
>>>>>> 
>>>>>> Thanks
>>>>>> Zqiang
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> Huacai
>>>>>>> 
>>>>>>>> 
>>>>>>>>                                                        Thanx, Paul
>>>>>>>> 
>>>>>>>>> Cc: stable@vger.kernel.org
>>>>>>>>> Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
> >>>>>>>>> Reported-by: Binbin Zhou <zhoubinbin@loongson.cn>
>>>>>>>>> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
>>>>>>>>> ---
>>>>>>>>> kernel/rcu/tree_stall.h | 1 +
>>>>>>>>> 1 file changed, 1 insertion(+)
>>>>>>>>> 
>>>>>>>>> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
>>>>>>>>> index b10b8349bb2a..1c7b540985bf 100644
>>>>>>>>> --- a/kernel/rcu/tree_stall.h
>>>>>>>>> +++ b/kernel/rcu/tree_stall.h
>>>>>>>>> @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
>>>>>>>>>  */
>>>>>>>>> void rcu_cpu_stall_reset(void)
>>>>>>>>> {
>>>>>>>>> +     do_update_jiffies_64(ktime_get());
>>>>>>>>>      WRITE_ONCE(rcu_state.jiffies_stall,
>>>>>>>>>                 jiffies + rcu_jiffies_till_stall_check());
>>>>>>>>> }
>>>>>>>>> --
>>>>>>>>> 2.39.3
>>>>>>>>>
Huacai Chen Aug. 17, 2023, 4:04 a.m. UTC | #13
Hi, Alan,

On Thu, Aug 17, 2023 at 12:52 AM Alan Huang <mmpgouride@gmail.com> wrote:
>
> >>
> >>>>>>>>>>>
> >>>>>>>>>>> Currently rcu_cpu_stall_reset() sets rcu_state.jiffies_stall to one check
> >>>>>>>>>>> period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> >>>>>>>>>>> is only updated in the timer interrupt, so by the time kgdb_cpu_enter()
> >>>>>>>>>>> begins to run, nearly one check period may already have elapsed since
> >>>>>>>>>>> jiffies was last updated. Since all interrupts are disabled during
> >>>>>>>>>>> kgdb_cpu_enter(), jiffies will not be updated there, and by the time
> >>>>>>>>>>> kgdb_cpu_enter() returns, rcu_state.jiffies_stall may already have
> >>>>>>>>>>> timed out.
> >>>>>>>>>>>
> >>>>>>>>>>> We could set rcu_state.jiffies_stall to two check periods later, e.g.
> >>>>>>>>>>> jiffies + (rcu_jiffies_till_stall_check() * 2), in rcu_cpu_stall_reset()
> >>>>>>>>>>> to avoid this problem. But that is not a complete solution, because kgdb
> >>>>>>>>>>> may spend a very long time in irq-disabled context.
> >>>>>>>>>>>
> >>>>>>>>>>> Instead, updating jiffies at the beginning of rcu_cpu_stall_reset()
> >>>>>>>>>>> solves all of these problems.
> >>>>>>>>>>
> >>>>>>>>>> Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> >>>>>>>>>> the stalls could simply be suppressed at the beginning of the debug
> >>>>>>>>>> session and re-enabled upon exit, as is currently done for sysrq output
> >>>>>>>>>> via rcu_sysrq_start() and rcu_sysrq_end().
> >>>>>>>>> Thank you for your advice, but that doesn't help. Because
> >>>>>>>>> rcu_sysrq_start() and rcu_sysrq_end() try to suppress the warnings
> >>>>>>>>> during sysrq, but kgdb already has no warnings during kgdb_cpu_enter()
> >>>>>>>>> since it is executed in irq disabled context. Instead, this patch
> >>>>>>>>> wants to suppress the warnings *after* kgdb_cpu_enter() due to a very
> >>>>>>>>> old jiffies value.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Hello, Huacai
> >>>>>>>>
> >>>>>>>> Is it possible to set rcu_cpu_stall_suppress to true in
> >>>>>>>> dbg_touch_watchdogs() and reset rcu_cpu_stall_suppress at the
> >>>>>>>> beginning and end of the RCU grace period?
> >>>>>>> This is possible but not the best: 1, kgdb is not the only caller of
> >>>>>>> rcu_cpu_stall_reset(); 2, it is difficult to find the "end" to reset
> >>>>>>> rcu_cpu_stall_suppress.
> >>>>>>>
> >>>>>>
> >>>>>> You can replace rcu_state.jiffies_stall update by setting rcu_cpu_stall_suppress
> >>>>>> in rcu_cpu_stall_reset(),  and reset rcu_cpu_stall_suppress in rcu_gp_init() and
> >>>>>> rcu_gp_cleanup().
> >>>>> What's the advantage compared with updating jiffies? Updating jiffies
> >>>>> seems more straightforward.
> >>>>>
> >>>>
> >>>> do_update_jiffies_64() needs to acquire the jiffies_lock raw spinlock.
> >>>> As you said, kgdb is not the only caller of rcu_cpu_stall_reset(), and
> >>>> rcu_cpu_stall_reset() may be invoked in NMI context (arch/x86/platform/uv/uv_nmi.c).
> >>> Resetting rcu_cpu_stall_suppress in rcu_gp_init()/rcu_gp_cleanup() still
> >>> does not look good to me, because it does a useless operation in most
> >>> cases. Moreover, the RCU core is refactored again and again, so something
> >>> may change in the future.
> >>>
> >>> If do_update_jiffies_64() cannot be used in NMI context, can we
> >>
> >> What about updating jiffies in dbg_touch_watchdogs or adding a wrapper which updates
> >> both jiffies and jiffies_stall?
> > This can solve the kgdb problem, but I found that most callers of
> > rcu_cpu_stall_reset() are in irq disabled context so they may meet
>
> The duration of other contexts where interrupts are disabled may not be as long as in the case of kgdb?
>
> > similar problems. Modifying rcu_cpu_stall_reset() can solve all of
> > them.
> >
> > But due to the NMI issue, from my point of view, setting jiffies_stall
> > to jiffies + 300*HZ is the best solution now. :)
>
> If I understand correctly, the NMI issue is the deadlock issue? If so, plus the short duration of other irq disabled
> contexts, it’s ok just update jiffies in dbg_touch_watchdogs.
>
> Please correct me if anything wrong. :)
The timeout value can be configured to be as short as 3 seconds; in that
case other callers may also have problems.

Huacai
>
> >
> > Huacai
> >>
> >>> consider my old method [1]?
> >>> https://lore.kernel.org/rcu/CAAhV-H7j9Y=VvRLm8thLw-EX1PGqBA9YfT4G1AN7ucYS=iP+DQ@mail.gmail.com/T/#t
> >>>
> >>> Of course we should set rcu_state.jiffies_stall large enough, so we
> >>> can do like this:
> >>>
> >>> void rcu_cpu_stall_reset(void)
> >>> {
> >>> WRITE_ONCE(rcu_state.jiffies_stall,
> >>> -   jiffies + rcu_jiffies_till_stall_check());
> >>> +   jiffies + 300 * HZ);
> >>> }
> >>>
> >>> 300s is the largest timeout value, and I think 300s is enough here in practice.
> >>>
> >>> Huacai
> >>>
> >>>>
> >>>> Thanks
> >>>> Zqiang
> >>>>
> >>>>
> >>>>> Huacai
> >>>>>
> >>>>>>
> >>>>>> Thanks
> >>>>>> Zqiang
> >>>>>>
> >>>>>>>
> >>>>>>>> or set rcupdate.rcu_cpu_stall_suppress_at_boot=1 in bootargs can
> >>>>>>>> suppress RCU stall
> >>>>>>>> in booting.
> >>>>>>> This is also possible, but it suppresses all kinds of stall warnings,
> >>>>>>> which is not what we want.
> >>>>>>>
> >>>>>>> Huacai
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Zqiang
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Huacai
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>                                                       Thanx, Paul
> >>>>>>>>>>
> >>>>>>>>>>> Cc: stable@vger.kernel.org
> >>>>>>>>>>> Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
> >>>>>>>>>>> Reported-by: Binbin Zhou <zhoubinbin@loongson.cn>
> >>>>>>>>>>> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> >>>>>>>>>>> ---
> >>>>>>>>>>> kernel/rcu/tree_stall.h | 1 +
> >>>>>>>>>>> 1 file changed, 1 insertion(+)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> >>>>>>>>>>> index b10b8349bb2a..1c7b540985bf 100644
> >>>>>>>>>>> --- a/kernel/rcu/tree_stall.h
> >>>>>>>>>>> +++ b/kernel/rcu/tree_stall.h
> >>>>>>>>>>> @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
> >>>>>>>>>>> */
> >>>>>>>>>>> void rcu_cpu_stall_reset(void)
> >>>>>>>>>>> {
> >>>>>>>>>>> +     do_update_jiffies_64(ktime_get());
> >>>>>>>>>>>     WRITE_ONCE(rcu_state.jiffies_stall,
> >>>>>>>>>>>                jiffies + rcu_jiffies_till_stall_check());
> >>>>>>>>>>> }
> >>>>>>>>>>> --
> >>>>>>>>>>> 2.39.3
>
>
Huacai Chen Aug. 17, 2023, 8:06 a.m. UTC | #14
Hi, Joel,

On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
>
>
> > On Aug 16, 2023, at 8:29 AM, Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Qiang,
> >
> >> On Wed, Aug 16, 2023 at 6:06 PM Z qiang <qiang.zhang1211@gmail.com> wrote:
> >>
> >>>
> >>> Hi, Qiang,
> >>>
> >>> On Wed, Aug 16, 2023 at 1:09 PM Z qiang <qiang.zhang1211@gmail.com> wrote:
> >>>>
> >>>>>
> >>>>> Hi, Qiang,
> >>>>>
> >>>>> On Wed, Aug 16, 2023 at 11:16 AM Z qiang <qiang.zhang1211@gmail.com> wrote:
> >>>>>>
> >>>>>>>
> >>>>>>> Hi, Paul,
> >>>>>>>
> >>>>>>> On Tue, Aug 15, 2023 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >>>>>>>>
> >>>>>>>> On Mon, Aug 14, 2023 at 10:00:45AM +0800, Huacai Chen wrote:
> >>>>>>>>> The KGDB initial breakpoint gets an rcu stall warning after commit
> >>>>>>>>> a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in
> >>>>>>>>> rcu_cpu_stall_reset()").
> >>>>>>>>>
> >>>>>>>>> [ ... RCU stall splat and backtrace snipped; identical to the log quoted earlier in the thread ... ]
> >>>>>>>>>
> >>>>>>>>> Currently rcu_cpu_stall_reset() sets rcu_state.jiffies_stall to one check
> >>>>>>>>> period later, i.e. jiffies + rcu_jiffies_till_stall_check(). But jiffies
> >>>>>>>>> is only updated in the timer interrupt, so when kgdb_cpu_enter() begins
> >>>>>>>>> to run, nearly one RCU check period may already have elapsed since the
> >>>>>>>>> last jiffies update. Since all interrupts are disabled during
> >>>>>>>>> kgdb_cpu_enter(), jiffies is not updated there either. By the time
> >>>>>>>>> kgdb_cpu_enter() returns, rcu_state.jiffies_stall may already have timed out.
> >>>>>>>>>
> >>>>>>>>> We can set rcu_state.jiffies_stall to two RCU check periods later, e.g.
> >>>>>>>>> jiffies + (rcu_jiffies_till_stall_check() * 2) in rcu_cpu_stall_reset()
> >>>>>>>>> to avoid this problem. But that isn't a complete solution, because kgdb
> >>>>>>>>> may run for a very long time with interrupts disabled.
> >>>>>>>>>
> >>>>>>>>> Instead, updating jiffies at the beginning of rcu_cpu_stall_reset()
> >>>>>>>>> solves all of these problems.
> >>>>>>>>
> >>>>>>>> Would it make sense for there to be a kgdb_cpu_exit()?  In that case,
> >>>>>>>> the stalls could simply be suppressed at the beginning of the debug
> >>>>>>>> session and re-enabled upon exit, as is currently done for sysrq output
> >>>>>>>> via rcu_sysrq_start() and rcu_sysrq_end().
> >>>>>>> Thank you for your advice, but that doesn't help, because
> >>>>>>> rcu_sysrq_start() and rcu_sysrq_end() suppress the warnings during
> >>>>>>> sysrq, while kgdb already produces no warnings during kgdb_cpu_enter(),
> >>>>>>> since it runs with interrupts disabled. Instead, this patch wants to
> >>>>>>> suppress the warnings *after* kgdb_cpu_enter() returns, which are
> >>>>>>> caused by a very stale jiffies value.
> >>>>>>>
> >>>>>>
> >>>>>> Hello, Huacai
> >>>>>>
> >>>>>> Is it possible to set rcu_cpu_stall_suppress to true in
> >>>>>> dbg_touch_watchdogs(), and reset rcu_cpu_stall_suppress at the
> >>>>>> beginning and end of the RCU grace period?
> >>>>> This is possible but not ideal: first, kgdb is not the only caller of
> >>>>> rcu_cpu_stall_reset(); second, it is difficult to find the right "end"
> >>>>> point at which to reset rcu_cpu_stall_suppress.
> >>>>>
> >>>>
> >>>> You can replace the rcu_state.jiffies_stall update by setting
> >>>> rcu_cpu_stall_suppress in rcu_cpu_stall_reset(), and then reset
> >>>> rcu_cpu_stall_suppress in rcu_gp_init() and rcu_gp_cleanup().
> >>> What's the advantage compared with updating jiffies? Updating jiffies
> >>> seems more straightforward.
> >>>
> >>
> >> do_update_jiffies_64() needs to acquire the jiffies_lock raw spinlock.
> >> As you said, kgdb is not the only caller of rcu_cpu_stall_reset(), and
> >> rcu_cpu_stall_reset() may also be invoked in NMI context
> >> (arch/x86/platform/uv/uv_nmi.c).
> > Resetting rcu_cpu_stall_suppress in rcu_gp_init()/rcu_gp_cleanup() still
> > doesn't look good to me, because it performs a useless operation in most
> > cases. Moreover, the RCU core is refactored again and again, so something
> > may change in the future.
> >
> > If  do_update_jiffies_64() cannot be used in NMI context,
>
> Can you not make the jiffies update conditional on whether it is called within NMI context?
>
> > can we
> > consider my old method [1]?
> > https://lore.kernel.org/rcu/CAAhV-H7j9Y=VvRLm8thLw-EX1PGqBA9YfT4G1AN7ucYS=iP+DQ@mail.gmail.com/T/#t
> >
> > Of course we should set rcu_state.jiffies_stall far enough in the future,
> > so we can do something like this:
> >
> > void rcu_cpu_stall_reset(void)
> > {
> >  WRITE_ONCE(rcu_state.jiffies_stall,
> > -   jiffies + rcu_jiffies_till_stall_check());
> > +   jiffies + 300 * HZ);
> > }
> >
> > 300s is the largest timeout value, and I think 300s is enough here in practice.
>
> I dislike that..
Is this acceptable?

void rcu_cpu_stall_reset(void)
{
        unsigned long delta;

        delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());

        WRITE_ONCE(rcu_state.jiffies_stall,
                   jiffies + delta + rcu_jiffies_till_stall_check());
}

This can update jiffies_stall without updating jiffies (but has the
same effect).



Huacai
>
> Thanks,
>
>  - Joel
>
>
>
> >
> > Huacai
> >
> >>
> >> Thanks
> >> Zqiang
> >>
> >>
> >>> Huacai
> >>>
> >>>>
> >>>> Thanks
> >>>> Zqiang
> >>>>
> >>>>>
> >>>>>> or set rcupdate.rcu_cpu_stall_suppress_at_boot=1 in bootargs can
> >>>>>> suppress RCU stall
> >>>>>> in booting.
> >>>>> This is also possible, but it suppresses all kinds of stall warnings,
> >>>>> which is not what we want.
> >>>>>
> >>>>> Huacai
> >>>>>>
> >>>>>>
> >>>>>> Thanks
> >>>>>> Zqiang
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Huacai
> >>>>>>>
> >>>>>>>>
> >>>>>>>>                                                        Thanx, Paul
> >>>>>>>>
> >>>>>>>>> Cc: stable@vger.kernel.org
> >>>>>>>>> Fixes: a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()")
> >>>>>>>>> Reported-by: Binbin Zhou <zhoubinbin@loongson.cn>
> >>>>>>>>> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
> >>>>>>>>> ---
> >>>>>>>>> kernel/rcu/tree_stall.h | 1 +
> >>>>>>>>> 1 file changed, 1 insertion(+)
> >>>>>>>>>
> >>>>>>>>> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> >>>>>>>>> index b10b8349bb2a..1c7b540985bf 100644
> >>>>>>>>> --- a/kernel/rcu/tree_stall.h
> >>>>>>>>> +++ b/kernel/rcu/tree_stall.h
> >>>>>>>>> @@ -153,6 +153,7 @@ static void panic_on_rcu_stall(void)
> >>>>>>>>>  */
> >>>>>>>>> void rcu_cpu_stall_reset(void)
> >>>>>>>>> {
> >>>>>>>>> +     do_update_jiffies_64(ktime_get());
> >>>>>>>>>      WRITE_ONCE(rcu_state.jiffies_stall,
> >>>>>>>>>                 jiffies + rcu_jiffies_till_stall_check());
> >>>>>>>>> }
> >>>>>>>>> --
> >>>>>>>>> 2.39.3
> >>>>>>>>>
Thomas Gleixner Aug. 23, 2023, 9:36 p.m. UTC | #15
On Wed, Aug 16 2023 at 18:06, Z. qiang wrote:
>> On Wed, Aug 16, 2023 at 1:09 PM Z qiang <qiang.zhang1211@gmail.com> wrote:
>> > You can replace rcu_state.jiffies_stall update by setting rcu_cpu_stall_suppress
>> > in rcu_cpu_stall_reset(),  and reset rcu_cpu_stall_suppress in rcu_gp_init() and
>> > rcu_gp_cleanup().
>> What's the advantage compared with updating jiffies? Updating jiffies
>> seems more straight forward.
>>
>
> In do_update_jiffies_64(), need to acquire jiffies_lock raw spinlock,
> like you said, kgdb is not the only caller of rcu_cpu_stall_reset(),
> the rcu_cpu_stall_reset() maybe invoke in NMI  (arch/x86/platform/uv/uv_nmi.c)

What's worse is that KGDB can set breakpoints pretty much everywhere and
there is no guarantee that the jiffies lock is not held when a
breakpoint hits.

Thanks,

        tglx
Thomas Gleixner Aug. 23, 2023, 9:41 p.m. UTC | #16
On Wed, Aug 16 2023 at 23:56, Alan Huang wrote:
>> If  do_update_jiffies_64() cannot be used in NMI context, can we
>
> What about updating jiffies in dbg_touch_watchdogs(), or adding a wrapper
> that updates both jiffies and jiffies_stall?

What makes dbg_touch_watchdogs() any different?

KGDB can pretty much have a breakpoint everywhere and therefore also
within the jiffies lock held region.

Thanks,

        tglx
Thomas Gleixner Aug. 23, 2023, 10:03 p.m. UTC | #17
On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>> > If  do_update_jiffies_64() cannot be used in NMI context,
>>
>> Can you not make the jiffies update conditional on whether it is
>> called within NMI context?

Which solves what? If KGDB has a breakpoint in the jiffies lock held
region then you still dead lock.

>> I dislike that..
> Is this acceptable?
>
> void rcu_cpu_stall_reset(void)
> {
>         unsigned long delta;
>
>         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
>
>         WRITE_ONCE(rcu_state.jiffies_stall,
>                    jiffies + delta + rcu_jiffies_till_stall_check());
> }
>
> This can update jiffies_stall without updating jiffies (but has the
> same effect).

Now you traded the potential dead lock on jiffies lock for a potential
live lock vs. tk_core.seq. Not really an improvement, right?

The only way you can do the above is something like the incomplete and
uncompiled below. NMI safe and therefore livelock proof time interfaces
exist for a reason.

Thanks,

        tglx
---
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
  */
 static ktime_t last_jiffies_update;
 
+unsigned long tick_estimate_stale_jiffies(void)
+{
+	ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
+
+	return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
+}
+
 /*
  * Must be called with interrupts disabled !
  */
Paul E. McKenney Aug. 23, 2023, 10:41 p.m. UTC | #18
On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> >> > If  do_update_jiffies_64() cannot be used in NMI context,
> >>
> >> Can you not make the jiffies update conditional on whether it is
> >> called within NMI context?
> 
> Which solves what? If KGDB has a breakpoint in the jiffies lock held
> region then you still dead lock.
> 
> >> I dislike that..
> > Is this acceptable?
> >
> > void rcu_cpu_stall_reset(void)
> > {
> >         unsigned long delta;
> >
> >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> >
> >         WRITE_ONCE(rcu_state.jiffies_stall,
> >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > }
> >
> > This can update jiffies_stall without updating jiffies (but has the
> > same effect).
> 
> Now you traded the potential dead lock on jiffies lock for a potential
> live lock vs. tk_core.seq. Not really an improvement, right?
> 
> The only way you can do the above is something like the incomplete and
> uncompiled below. NMI safe and therefore livelock proof time interfaces
> exist for a reason.

Just for completeness, another approach, with its own advantages
and disadvantage, is to add something like ULONG_MAX/4 to
rcu_state.jiffies_stall, but also set a counter indicating that this
has been done.  Then RCU's force-quiescent processing could decrement
that counter (if non-zero) and reset rcu_state.jiffies_stall when it
does reach zero.

Setting the counter to three should cover most cases, but "live by the
heuristic, die by the heuristic".  ;-)

It would be good to have some indication when gdb exited, but things
like the gdb "next" command can make that "interesting" when applied to
a long-running function.

 							Thanx, Paul

> Thanks,
> 
>         tglx
> ---
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
>   */
>  static ktime_t last_jiffies_update;
>  
> +unsigned long tick_estimate_stale_jiffies(void)
> +{
> +	ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> +
> +	return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> +}
> +
>  /*
>   * Must be called with interrupts disabled !
>   */
> 
>
Huacai Chen Aug. 24, 2023, 2:47 a.m. UTC | #19
Hi, Thomas,

On Thu, Aug 24, 2023 at 6:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> >> > If  do_update_jiffies_64() cannot be used in NMI context,
> >>
> >> Can you not make the jiffies update conditional on whether it is
> >> called within NMI context?
>
> Which solves what? If KGDB has a breakpoint in the jiffies lock held
> region then you still dead lock.
>
> >> I dislike that..
> > Is this acceptable?
> >
> > void rcu_cpu_stall_reset(void)
> > {
> >         unsigned long delta;
> >
> >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> >
> >         WRITE_ONCE(rcu_state.jiffies_stall,
> >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > }
> >
> > This can update jiffies_stall without updating jiffies (but has the
> > same effect).
>
> Now you traded the potential dead lock on jiffies lock for a potential
> live lock vs. tk_core.seq. Not really an improvement, right?
>
> The only way you can do the above is something like the incomplete and
> uncompiled below. NMI safe and therefore livelock proof time interfaces
> exist for a reason.
>
> Thanks,
>
>         tglx
> ---
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
>   */
>  static ktime_t last_jiffies_update;
>
> +unsigned long tick_estimate_stale_jiffies(void)
> +{
> +       ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> +
> +       return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> +}
> +
>  /*
>   * Must be called with interrupts disabled !
>   */
Thank you for your advice. The latest proposal is now here [1]; it is very
similar to your diff, please take a look.

[1] https://lore.kernel.org/rcu/CAAhV-H5mePbbF8Y3t-JfV+PNP8BEcjKtX4UokzL_vDzyw+2BRg@mail.gmail.com/T/#t

Huacai
>
>
Huacai Chen Aug. 24, 2023, 2:50 a.m. UTC | #20
Hi, Paul,

On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > >>
> > >> Can you not make the jiffies update conditional on whether it is
> > >> called within NMI context?
> >
> > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > region then you still dead lock.
> >
> > >> I dislike that..
> > > Is this acceptable?
> > >
> > > void rcu_cpu_stall_reset(void)
> > > {
> > >         unsigned long delta;
> > >
> > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > >
> > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > }
> > >
> > > This can update jiffies_stall without updating jiffies (but has the
> > > same effect).
> >
> > Now you traded the potential dead lock on jiffies lock for a potential
> > live lock vs. tk_core.seq. Not really an improvement, right?
> >
> > The only way you can do the above is something like the incomplete and
> > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > exist for a reason.
>
> Just for completeness, another approach, with its own advantages
> and disadvantage, is to add something like ULONG_MAX/4 to
> rcu_state.jiffies_stall, but also set a counter indicating that this
> has been done.  Then RCU's force-quiescent processing could decrement
> that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> does reach zero.
>
> Setting the counter to three should cover most cases, but "live by the
> heuristic, die by the heuristic".  ;-)
>
> It would be good to have some indication when gdb exited, but things
> like the gdb "next" command can make that "interesting" when applied to
> a long-running function.
The original code adds ULONG_MAX/2, so adding ULONG_MAX/4 may not make
much difference? The simplest way is to add 300*HZ, but Joel dislikes
that.

Huacai

>
>                                                         Thanx, Paul
>
> > Thanks,
> >
> >         tglx
> > ---
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> >   */
> >  static ktime_t last_jiffies_update;
> >
> > +unsigned long tick_estimate_stale_jiffies(void)
> > +{
> > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > +
> > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > +}
> > +
> >  /*
> >   * Must be called with interrupts disabled !
> >   */
> >
> >
Thomas Gleixner Aug. 24, 2023, 9:39 a.m. UTC | #21
On Thu, Aug 24 2023 at 10:47, Huacai Chen wrote:
> On Thu, Aug 24, 2023 at 6:03 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> +unsigned long tick_estimate_stale_jiffies(void)
>> +{
>> +       ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
>> +
>> +       return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
>> +}
>> +
>>  /*
>>   * Must be called with interrupts disabled !
>>   */
> Thank you for your advice, now the latest proposal is here [1], this
> is very similar to your diff, please take a look.

Similar by some definition of similar.
Paul E. McKenney Aug. 24, 2023, 11:40 a.m. UTC | #22
On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> Hi, Paul,
> 
> On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > >>
> > > >> Can you not make the jiffies update conditional on whether it is
> > > >> called within NMI context?
> > >
> > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > region then you still dead lock.
> > >
> > > >> I dislike that..
> > > > Is this acceptable?
> > > >
> > > > void rcu_cpu_stall_reset(void)
> > > > {
> > > >         unsigned long delta;
> > > >
> > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > >
> > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > }
> > > >
> > > > This can update jiffies_stall without updating jiffies (but has the
> > > > same effect).
> > >
> > > Now you traded the potential dead lock on jiffies lock for a potential
> > > live lock vs. tk_core.seq. Not really an improvement, right?
> > >
> > > The only way you can do the above is something like the incomplete and
> > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > exist for a reason.
> >
> > Just for completeness, another approach, with its own advantages
> > and disadvantage, is to add something like ULONG_MAX/4 to
> > rcu_state.jiffies_stall, but also set a counter indicating that this
> > has been done.  Then RCU's force-quiescent processing could decrement
> > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > does reach zero.
> >
> > Setting the counter to three should cover most cases, but "live by the
> > heuristic, die by the heuristic".  ;-)
> >
> > It would be good to have some indication when gdb exited, but things
> > like the gdb "next" command can make that "interesting" when applied to
> > a long-running function.
> 
> The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> make no much difference? The simplest way is adding 300*HZ, but Joel
> dislikes that.

I am not seeing the ULONG_MAX/2, so could you please point me to that
original code?

The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
and time_before() macros have ULONG_MAX/4 slop in either direction
before giving you the wrong answer.  You can get nearly the same result
using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
or jiffies-update delay before you start getting false positives.

Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
also the current reset at the beginning of a grace period, which
is in record_gp_stall_check_time().

It would be better if RCU could get notified at both ends of the debug
session, but given gdb commands such as "next", along with Thomas's
point about gdb breakpoints being pretty much anywhere, this might or
might not be so helpful in real life.  But worth looking into.

							Thanx, Paul

> Huacai
> 
> >
> >                                                         Thanx, Paul
> >
> > > Thanks,
> > >
> > >         tglx
> > > ---
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > >   */
> > >  static ktime_t last_jiffies_update;
> > >
> > > +unsigned long tick_estimate_stale_jiffies(void)
> > > +{
> > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > +
> > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > +}
> > > +
> > >  /*
> > >   * Must be called with interrupts disabled !
> > >   */
> > >
> > >
Huacai Chen Aug. 24, 2023, 12:40 p.m. UTC | #23
Hi, Paul,

On Thu, Aug 24, 2023 at 7:40 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > Hi, Paul,
> >
> > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > >>
> > > > >> Can you not make the jiffies update conditional on whether it is
> > > > >> called within NMI context?
> > > >
> > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > region then you still dead lock.
> > > >
> > > > >> I dislike that..
> > > > > Is this acceptable?
> > > > >
> > > > > void rcu_cpu_stall_reset(void)
> > > > > {
> > > > >         unsigned long delta;
> > > > >
> > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > >
> > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > }
> > > > >
> > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > same effect).
> > > >
> > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > >
> > > > The only way you can do the above is something like the incomplete and
> > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > exist for a reason.
> > >
> > > Just for completeness, another approach, with its own advantages
> > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > has been done.  Then RCU's force-quiescent processing could decrement
> > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > does reach zero.
> > >
> > > Setting the counter to three should cover most cases, but "live by the
> > > heuristic, die by the heuristic".  ;-)
> > >
> > > It would be good to have some indication when gdb exited, but things
> > > like the gdb "next" command can make that "interesting" when applied to
> > > a long-running function.
> >
> > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > make no much difference? The simplest way is adding 300*HZ, but Joel
> > dislikes that.
>
> I am not seeing the ULONG_MAX/2, so could you please point me to that
> original code?
Maybe I misunderstood something. By "the original code" I mean the code
before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall
detection in rcu_cpu_stall_reset()").

Huacai
>
> The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> and time_before() macros have ULONG_MAX/4 slop in either direction
> before giving you the wrong answer.  You can get nearly the same result
> using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> or jiffies-update delay before you start getting false positives.
>
> Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> also the current reset at the beginning of a grace period, which
> is in record_gp_stall_check_time().
>
> It would be better if RCU could get notified at both ends of the debug
> session, but given gdb commands such as "next", along with Thomas's
> point about gdb breakpoints being pretty much anywhere, this might or
> might not be so helpful in real life.  But worth looking into.
>
>                                                         Thanx, Paul
>
> > Huacai
> >
> > >
> > >                                                         Thanx, Paul
> > >
> > > > Thanks,
> > > >
> > > >         tglx
> > > > ---
> > > > --- a/kernel/time/tick-sched.c
> > > > +++ b/kernel/time/tick-sched.c
> > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > >   */
> > > >  static ktime_t last_jiffies_update;
> > > >
> > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > +{
> > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > +
> > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > +}
> > > > +
> > > >  /*
> > > >   * Must be called with interrupts disabled !
> > > >   */
> > > >
> > > >
Joel Fernandes Aug. 24, 2023, 1:09 p.m. UTC | #24
On Thu, Aug 24, 2023 at 04:40:42AM -0700, Paul E. McKenney wrote:
> On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > Hi, Paul,
> > 
> > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > >>
> > > > >> Can you not make the jiffies update conditional on whether it is
> > > > >> called within NMI context?
> > > >
> > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > region then you still dead lock.
> > > >
> > > > >> I dislike that..
> > > > > Is this acceptable?
> > > > >
> > > > > void rcu_cpu_stall_reset(void)
> > > > > {
> > > > >         unsigned long delta;
> > > > >
> > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > >
> > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > }
> > > > >
> > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > same effect).
> > > >
> > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > >
> > > > The only way you can do the above is something like the incomplete and
> > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > exist for a reason.
> > >
> > > Just for completeness, another approach, with its own advantages
> > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > has been done.  Then RCU's force-quiescent processing could decrement
> > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > does reach zero.
> > >
> > > Setting the counter to three should cover most cases, but "live by the
> > > heuristic, die by the heuristic".  ;-)
> > >
> > > It would be good to have some indication when gdb exited, but things
> > > like the gdb "next" command can make that "interesting" when applied to
> > > a long-running function.
> > 
> > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > make no much difference? The simplest way is adding 300*HZ, but Joel
> > dislikes that.
> 
> I am not seeing the ULONG_MAX/2, so could you please point me to that
> original code?
> 
> The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> and time_before() macros have ULONG_MAX/4 slop in either direction
> before giving you the wrong answer.  You can get nearly the same result
> using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> or jiffies-update delay before you start getting false positives.
> 
> Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> also the current reset at the beginning of a grace period, which
> is in record_gp_stall_check_time().

I like Paul's suggestion a lot, except that someone could set a breakpoint
right when jiffies is being reset, in which case we would have to come back
to Thomas's suggestion.

So maybe a combination of Paul's and Thomas's suggestions (of using
last_jiffies_update with the NMI-safe timestamp read) may work.

> It would be better if RCU could get notified at both ends of the debug
> session, but given gdb commands such as "next", along with Thomas's
> point about gdb breakpoints being pretty much anywhere, this might or
> might not be so helpful in real life.  But worth looking into.

True. I was curious whether rcu_cpu_stall_reset() can also be called on a
tickless kernel before jiffies gets a chance to update; in that case, I think
your suggestion of biasing the stall time and later resetting it would help
a lot.

thanks,

 - Joel


> 							Thanx, Paul
> 
> > Huacai
> > 
> > >
> > >                                                         Thanx, Paul
> > >
> > > > Thanks,
> > > >
> > > >         tglx
> > > > ---
> > > > --- a/kernel/time/tick-sched.c
> > > > +++ b/kernel/time/tick-sched.c
> > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > >   */
> > > >  static ktime_t last_jiffies_update;
> > > >
> > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > +{
> > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > +
> > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > +}
> > > > +
> > > >  /*
> > > >   * Must be called with interrupts disabled !
> > > >   */
> > > >
> > > >
Joel Fernandes Aug. 24, 2023, 1:21 p.m. UTC | #25
Hello Thomas,

On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> >> > If  do_update_jiffies_64() cannot be used in NMI context,
> >>
> >> Can you not make the jiffies update conditional on whether it is
> >> called within NMI context?
> 
> Which solves what? If KGDB has a breakpoint in the jiffies lock held
> region then you still dead lock.

Yes, we had already discussed that updating jiffies is not possible from
here. There are too many threads because different patch revisions were
being reviewed in different threads.

> >> I dislike that..
> > Is this acceptable?
> >
> > void rcu_cpu_stall_reset(void)
> > {
> >         unsigned long delta;
> >
> >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> >
> >         WRITE_ONCE(rcu_state.jiffies_stall,
> >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > }
> >
> > This can update jiffies_stall without updating jiffies (but has the
> > same effect).
> 
> Now you traded the potential dead lock on jiffies lock for a potential
> live lock vs. tk_core.seq. Not really an improvement, right?
> 
> The only way you can do the above is something like the incomplete and
> uncompiled below. NMI safe and therefore livelock proof time interfaces
> exist for a reason.

Yes, I had already mentioned exactly this issue of not using an NMI-safe
interface here:
https://lore.kernel.org/all/CAEXW_YT+uw5JodtrqjY0B2xx0J8ukF=FAB9-p5rxgWobSU2P2A@mail.gmail.com/
I do like your suggestion of using last_jiffies_update, though (which, as
you mentioned, needs to be explored more).

There are too many threads, which makes the discussion hard to follow.
Huacai, it would be great if we could keep the discussion in the same thread
(for example, by passing --in-reply-to to the "git send-email" command).

thanks,

 - Joel


> 
> Thanks,
> 
>         tglx
> ---
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
>   */
>  static ktime_t last_jiffies_update;
>  
> +unsigned long tick_estimate_stale_jiffies(void)
> +{
> +	ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> +
> +	return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> +}
> +
>  /*
>   * Must be called with interrupts disabled !
>   */
> 
>
Paul E. McKenney Aug. 24, 2023, 1:24 p.m. UTC | #26
On Thu, Aug 24, 2023 at 08:40:00PM +0800, Huacai Chen wrote:
> Hi, Paul,
> On Thu, Aug 24, 2023 at 7:40 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > Hi, Paul,
> > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > >>
> > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > >> called within NMI context?
> > > > >
> > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > region then you still dead lock.
> > > > >
> > > > > >> I dislike that..
> > > > > > Is this acceptable?
> > > > > >
> > > > > > void rcu_cpu_stall_reset(void)
> > > > > > {
> > > > > >         unsigned long delta;
> > > > > >
> > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > >
> > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > }
> > > > > >
> > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > same effect).
> > > > >
> > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > >
> > > > > The only way you can do the above is something like the incomplete and
> > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > exist for a reason.
> > > >
> > > > Just for completeness, another approach, with its own advantages
> > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > does reach zero.
> > > >
> > > > Setting the counter to three should cover most cases, but "live by the
> > > > heuristic, die by the heuristic".  ;-)
> > > >
> > > > It would be good to have some indication when gdb exited, but things
> > > > like the gdb "next" command can make that "interesting" when applied to
> > > > a long-running function.
> > >
> > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > make no much difference? The simplest way is adding 300*HZ, but Joel
> > > dislikes that.
> >
> > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > original code?
> 
> Maybe I misunderstand something, I say the original code means code
> before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall
> detection in rcu_cpu_stall_reset()").

Yes, my suggestion would essentially revert that patch.  It would
compensate by resetting rcu_state.jiffies_stall after a few calls
to rcu_gp_fqs().

Alternatively, we could simply provide a way for gdb users to manually
disable RCU CPU stall warnings at the beginning of their debug sessions
and to manually re-enable them when they are done.

							Thanx, Paul

> Huacai
> >
> > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > and time_before() macros have ULONG_MAX/4 slop in either direction
> > before giving you the wrong answer.  You can get nearly the same result
> > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > or jiffies-update delay before you start getting false positives.
> >
> > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > also the current reset at the beginning of a grace period, which
> > is in record_gp_stall_check_time().
> >
> > It would be better if RCU could get notified at both ends of the debug
> > session, but given gdb commands such as "next", along with Thomas's
> > point about gdb breakpoints being pretty much anywhere, this might or
> > might not be so helpful in real life.  But worth looking into.
> >
> >                                                         Thanx, Paul
> >
> > > Huacai
> > >
> > > >
> > > >                                                         Thanx, Paul
> > > >
> > > > > Thanks,
> > > > >
> > > > >         tglx
> > > > > ---
> > > > > --- a/kernel/time/tick-sched.c
> > > > > +++ b/kernel/time/tick-sched.c
> > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > >   */
> > > > >  static ktime_t last_jiffies_update;
> > > > >
> > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > +{
> > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > +
> > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > +}
> > > > > +
> > > > >  /*
> > > > >   * Must be called with interrupts disabled !
> > > > >   */
> > > > >
> > > > >
Paul E. McKenney Aug. 24, 2023, 1:28 p.m. UTC | #27
On Thu, Aug 24, 2023 at 01:09:42PM +0000, Joel Fernandes wrote:
> On Thu, Aug 24, 2023 at 04:40:42AM -0700, Paul E. McKenney wrote:
> > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > Hi, Paul,
> > > 
> > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > >>
> > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > >> called within NMI context?
> > > > >
> > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > region then you still dead lock.
> > > > >
> > > > > >> I dislike that..
> > > > > > Is this acceptable?
> > > > > >
> > > > > > void rcu_cpu_stall_reset(void)
> > > > > > {
> > > > > >         unsigned long delta;
> > > > > >
> > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > >
> > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > }
> > > > > >
> > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > same effect).
> > > > >
> > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > >
> > > > > The only way you can do the above is something like the incomplete and
> > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > exist for a reason.
> > > >
> > > > Just for completeness, another approach, with its own advantages
> > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > does reach zero.
> > > >
> > > > Setting the counter to three should cover most cases, but "live by the
> > > > heuristic, die by the heuristic".  ;-)
> > > >
> > > > It would be good to have some indication when gdb exited, but things
> > > > like the gdb "next" command can make that "interesting" when applied to
> > > > a long-running function.
> > > 
> > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > make no much difference? The simplest way is adding 300*HZ, but Joel
> > > dislikes that.
> > 
> > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > original code?
> > 
> > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > and time_before() macros have ULONG_MAX/4 slop in either direction
> > before giving you the wrong answer.  You can get nearly the same result
> > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > or jiffies-update delay before you start getting false positives.
> > 
> > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > also the current reset at the beginning of a grace period, which
> > is in record_gp_stall_check_time().
> 
> I like Paul's suggestion a lot except that if someone sets a breakpoint right
> when the jiffies is being reset, so then we have to come back to doing
> Thomas's suggestion.

Please note that ULONG_MAX / 4 allows for jiffies not having been reset
for more than 10 days on 32-bit systems and for many millions of years
on 64-bit systems.  ;-)

> So maybe a combination of Paul's and Thomas's suggestions (of using
> last_jiffies_update with the NMI-safe timestamp read) may work.

I am absolutely not a fan of reworking all of the RCU CPU stall-warning
code to use some other timebase, at least not without a very important
reason to do so.  Nothing mentioned in this thread even comes close to
that level of importance.

> > It would be better if RCU could get notified at both ends of the debug
> > session, but given gdb commands such as "next", along with Thomas's
> > point about gdb breakpoints being pretty much anywhere, this might or
> > might not be so helpful in real life.  But worth looking into.
> 
> True, I was curious if rcu_cpu_stall_reset() can be called on a tickless
> kernel as well before jiffies gets a chance to update, in which case I think
> your suggestion of biasing the stall time and later resetting it would help a
> lot for such situations.

What code path can possibly invoke rcu_cpu_stall_reset() after an
extended full-system nohz_full time period without first doing at least
one context switch on the CPU that invokes rcu_cpu_stall_reset()?

							Thanx, Paul

> thanks,
> 
>  - Joel
> 
> 
> > 							Thanx, Paul
> > 
> > > Huacai
> > > 
> > > >
> > > >                                                         Thanx, Paul
> > > >
> > > > > Thanks,
> > > > >
> > > > >         tglx
> > > > > ---
> > > > > --- a/kernel/time/tick-sched.c
> > > > > +++ b/kernel/time/tick-sched.c
> > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > >   */
> > > > >  static ktime_t last_jiffies_update;
> > > > >
> > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > +{
> > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > +
> > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > +}
> > > > > +
> > > > >  /*
> > > > >   * Must be called with interrupts disabled !
> > > > >   */
> > > > >
> > > > >
Paul E. McKenney Aug. 24, 2023, 1:29 p.m. UTC | #28
On Thu, Aug 24, 2023 at 01:21:55PM +0000, Joel Fernandes wrote:
> Hello Thomas,
> 
> On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > >>
> > >> Can you not make the jiffies update conditional on whether it is
> > >> called within NMI context?
> > 
> > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > region then you still dead lock.
> 
> Yes, we had already discussed this that jiffies update is not possible from
> here. There are too many threads since different patch revisions were being
> reviewed in different threads.

One of the nice properties of the jiffies counter is its trivially
provable NMI safety  ;-)

							Thanx, Paul

> > >> I dislike that..
> > > Is this acceptable?
> > >
> > > void rcu_cpu_stall_reset(void)
> > > {
> > >         unsigned long delta;
> > >
> > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > >
> > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > }
> > >
> > > This can update jiffies_stall without updating jiffies (but has the
> > > same effect).
> > 
> > Now you traded the potential dead lock on jiffies lock for a potential
> > live lock vs. tk_core.seq. Not really an improvement, right?
> > 
> > The only way you can do the above is something like the incomplete and
> > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > exist for a reason.
> 
> Yes, I had already mentioned exactly this issue here of not using an NMI-safe
> interface:
> https://lore.kernel.org/all/CAEXW_YT+uw5JodtrqjY0B2xx0J8ukF=FAB9-p5rxgWobSU2P2A@mail.gmail.com/
> I like your suggestion of using last_jiffies_update though (which as you
> mentioned needs to be explored more).
> 
> There are too many threads which makes the discussion hard to follow. Huacai,
> it would be great if we can keep the discussions in the same thread (Say for
> example by passing options like --in-reply-to to "git send-email" command).
> 
> thanks,
> 
>  - Joel
> 
> 
> > 
> > Thanks,
> > 
> >         tglx
> > ---
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> >   */
> >  static ktime_t last_jiffies_update;
> >  
> > +unsigned long tick_estimate_stale_jiffies(void)
> > +{
> > +	ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > +
> > +	return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > +}
> > +
> >  /*
> >   * Must be called with interrupts disabled !
> >   */
> > 
> >
Huacai Chen Aug. 24, 2023, 3:43 p.m. UTC | #29
Hi, Paul,

On Thu, Aug 24, 2023 at 9:24 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Aug 24, 2023 at 08:40:00PM +0800, Huacai Chen wrote:
> > Hi, Paul,
> > On Thu, Aug 24, 2023 at 7:40 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > > Hi, Paul,
> > > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > >>
> > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > >> called within NMI context?
> > > > > >
> > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > region then you still dead lock.
> > > > > >
> > > > > > >> I dislike that..
> > > > > > > Is this acceptable?
> > > > > > >
> > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > {
> > > > > > >         unsigned long delta;
> > > > > > >
> > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > >
> > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > }
> > > > > > >
> > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > same effect).
> > > > > >
> > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > >
> > > > > > The only way you can do the above is something like the incomplete and
> > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > exist for a reason.
> > > > >
> > > > > Just for completeness, another approach, with its own advantages
> > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > does reach zero.
> > > > >
> > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > heuristic, die by the heuristic".  ;-)
> > > > >
> > > > > It would be good to have some indication when gdb exited, but things
> > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > a long-running function.
> > > >
> > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > make no much difference? The simplest way is adding 300*HZ, but Joel
> > > > dislikes that.
> > >
> > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > original code?
> >
> > Maybe I misunderstand something, I say the original code means code
> > before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall
> > detection in rcu_cpu_stall_reset()").
>
> Yes, my suggestion would essentially revert that patch.  It would
> compensate by resetting rcu_state.jiffies_stall after a few calls
> to rcu_gp_fqs().
>
> Alternatively, we could simply provide a way for gdb users to manually
> disable RCU CPU stall warnings at the beginning of their debug sessions
> and to manually re-enable them when they are done.
This problem is not KGDB-specific (though it was first found in the
KGDB case), so I want to fix it in the rcu code rather than in the
kgdb code.

Huacai
>
>                                                         Thanx, Paul
>
> > Huacai
> > >
> > > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > > and time_before() macros have ULONG_MAX/4 slop in either direction
> > > before giving you the wrong answer.  You can get nearly the same result
> > > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > > or jiffies-update delay before you start getting false positives.
> > >
> > > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > > also the current reset at the beginning of a grace period, which
> > > is in record_gp_stall_check_time().
> > >
> > > It would be better if RCU could get notified at both ends of the debug
> > > session, but given gdb commands such as "next", along with Thomas's
> > > point about gdb breakpoints being pretty much anywhere, this might or
> > > might not be so helpful in real life.  But worth looking into.
> > >
> > >                                                         Thanx, Paul
> > >
> > > > Huacai
> > > >
> > > > >
> > > > >                                                         Thanx, Paul
> > > > >
> > > > > > Thanks,
> > > > > >
> > > > > >         tglx
> > > > > > ---
> > > > > > --- a/kernel/time/tick-sched.c
> > > > > > +++ b/kernel/time/tick-sched.c
> > > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > > >   */
> > > > > >  static ktime_t last_jiffies_update;
> > > > > >
> > > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > > +{
> > > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > > +
> > > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > > +}
> > > > > > +
> > > > > >  /*
> > > > > >   * Must be called with interrupts disabled !
> > > > > >   */
> > > > > >
> > > > > >
Huacai Chen Aug. 24, 2023, 4:03 p.m. UTC | #30
Hi, Paul,

On Thu, Aug 24, 2023 at 9:28 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Aug 24, 2023 at 01:09:42PM +0000, Joel Fernandes wrote:
> > On Thu, Aug 24, 2023 at 04:40:42AM -0700, Paul E. McKenney wrote:
> > > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > > Hi, Paul,
> > > >
> > > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > >
> > > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > >>
> > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > >> called within NMI context?
> > > > > >
> > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > region then you still dead lock.
> > > > > >
> > > > > > >> I dislike that..
> > > > > > > Is this acceptable?
> > > > > > >
> > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > {
> > > > > > >         unsigned long delta;
> > > > > > >
> > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > >
> > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > }
> > > > > > >
> > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > same effect).
> > > > > >
> > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > >
> > > > > > The only way you can do the above is something like the incomplete and
> > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > exist for a reason.
> > > > >
> > > > > Just for completeness, another approach, with its own advantages
> > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > does reach zero.
> > > > >
> > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > heuristic, die by the heuristic".  ;-)
> > > > >
> > > > > It would be good to have some indication when gdb exited, but things
> > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > a long-running function.
> > > >
> > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > make no much difference? The simplest way is adding 300*HZ, but Joel
> > > > dislikes that.
> > >
> > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > original code?
> > >
> > > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > > and time_before() macros have ULONG_MAX/4 slop in either direction
> > > before giving you the wrong answer.  You can get nearly the same result
> > > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > > or jiffies-update delay before you start getting false positives.
> > >
> > > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > > also the current reset at the beginning of a grace period, which
> > > is in record_gp_stall_check_time().
> >
> > I like Paul's suggestion a lot except that if someone sets a breakpoint right
> > when the jiffies is being reset, so then we have to come back to doing
> > Thomas's suggestion.
>
> Please note that ULONG_MAX / 4 allows for jiffies not having been reset
> for more than 10 days on 32-bit systems and for many millions of years
> on 64-bit systems.  ;-)
>
> > So maybe a combination of Paul's and Thomas's suggestions (of using
> > last_jiffies_update with the NMI-safe timestamp read) may work.
>
> I am absolutely not a fan of reworking all of the RCU CPU stall-warning
> code to use some other timebase, at least not without a very important
> reason to do so.  Nothing mentioned in this thread even comes close to
> that level of importance.
>
> > > It would be better if RCU could get notified at both ends of the debug
> > > session, but given gdb commands such as "next", along with Thomas's
> > > point about gdb breakpoints being pretty much anywhere, this might or
> > > might not be so helpful in real life.  But worth looking into.
> >
> > True, I was curious if rcu_cpu_stall_reset() can be called on a tickless
> > kernel as well before jiffies gets a chance to update, in which case I think
> > your suggestion of biasing the stall time and later resetting it would help a
> > lot for such situations.
>
> What code path can possibly invoke rcu_cpu_stall_reset() after an
> extended full-system nohz_full time period without first doing at least
> one context switch on the CPU that invokes rcu_cpu_stall_reset()?
In my commit message, the "KGDB initial breakpoint" means the
automatic call to kgdb_initial_breakpoint() at system boot. In my
test:
1, the "stall timeout" is 21s;
2, when I use "continue" to exit kgdb, the "total jiffies delayed
time" is ~40s (which of course causes a stall warning);
3, the "irq disabled time" (nearly the same as the execution time of
kgdb_cpu_enter()) is ~12s;
4, this means the "jiffies delayed time" due to the tickless mechanism is ~28s.

So, at least in this case, the tickless mechanism contributes
significantly to the jiffies delay.

Huacai

>
>                                                         Thanx, Paul
>
> > thanks,
> >
> >  - Joel
> >
> >
> > >                                                     Thanx, Paul
> > >
> > > > Huacai
> > > >
> > > > >
> > > > >                                                         Thanx, Paul
> > > > >
> > > > > > Thanks,
> > > > > >
> > > > > >         tglx
> > > > > > ---
> > > > > > --- a/kernel/time/tick-sched.c
> > > > > > +++ b/kernel/time/tick-sched.c
> > > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > > >   */
> > > > > >  static ktime_t last_jiffies_update;
> > > > > >
> > > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > > +{
> > > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > > +
> > > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > > +}
> > > > > > +
> > > > > >  /*
> > > > > >   * Must be called with interrupts disabled !
> > > > > >   */
> > > > > >
> > > > > >
Huacai Chen Aug. 24, 2023, 4:15 p.m. UTC | #31
Hi, Joel,

On Thu, Aug 24, 2023 at 9:22 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> Hello Thomas,
>
> On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > >>
> > >> Can you not make the jiffies update conditional on whether it is
> > >> called within NMI context?
> >
> > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > region then you still dead lock.
>
> Yes, we had already discussed this that jiffies update is not possible from
> here. There are too many threads since different patch revisions were being
> reviewed in different threads.
>
> > >> I dislike that..
> > > Is this acceptable?
> > >
> > > void rcu_cpu_stall_reset(void)
> > > {
> > >         unsigned long delta;
> > >
> > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > >
> > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > }
> > >
> > > This can update jiffies_stall without updating jiffies (but has the
> > > same effect).
> >
> > Now you traded the potential dead lock on jiffies lock for a potential
> > live lock vs. tk_core.seq. Not really an improvement, right?
> >
> > The only way you can do the above is something like the incomplete and
> > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > exist for a reason.
>
> Yes, I had already mentioned exactly this issue here of not using an NMI-safe
> interface:
> https://lore.kernel.org/all/CAEXW_YT+uw5JodtrqjY0B2xx0J8ukF=FAB9-p5rxgWobSU2P2A@mail.gmail.com/
> I like your suggestion of using last_jiffies_update though (which as you
> mentioned needs to be explored more).
>
> There are too many threads which makes the discussion hard to follow. Huacai,
> it would be great if we can keep the discussions in the same thread (Say for
> example by passing options like --in-reply-to to "git send-email" command).
I will try my best. Early on, not all of us were involved, and
whenever I thought we were all ready to accept a new solution, I sent
a new patch. But unfortunately others then suggested different
approaches, which made the patch subject change again and again...

Now I am inclined to use Thomas's method of introducing and using
tick_estimate_stale_jiffies(). But let me wait for some time before
sending new patches.

Huacai

>
> thanks,
>
>  - Joel
>
>
> >
> > Thanks,
> >
> >         tglx
> > ---
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> >   */
> >  static ktime_t last_jiffies_update;
> >
> > +unsigned long tick_estimate_stale_jiffies(void)
> > +{
> > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > +
> > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > +}
> > +
> >  /*
> >   * Must be called with interrupts disabled !
> >   */
> >
> >
Huacai Chen Aug. 24, 2023, 4:32 p.m. UTC | #32
On Fri, Aug 25, 2023 at 12:03 AM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Paul,
>
> On Thu, Aug 24, 2023 at 9:28 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Aug 24, 2023 at 01:09:42PM +0000, Joel Fernandes wrote:
> > > On Thu, Aug 24, 2023 at 04:40:42AM -0700, Paul E. McKenney wrote:
> > > > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > > > Hi, Paul,
> > > > >
> > > > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > >
> > > > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > > >>
> > > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > > >> called within NMI context?
> > > > > > >
> > > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > > region then you still dead lock.
> > > > > > >
> > > > > > > >> I dislike that..
> > > > > > > > Is this acceptable?
> > > > > > > >
> > > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > > {
> > > > > > > >         unsigned long delta;
> > > > > > > >
> > > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > > >
> > > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > > }
> > > > > > > >
> > > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > > same effect).
> > > > > > >
> > > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > > >
> > > > > > > The only way you can do the above is something like the incomplete and
> > > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > > exist for a reason.
> > > > > >
> > > > > > Just for completeness, another approach, with its own advantages
> > > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > > does reach zero.
> > > > > >
> > > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > > heuristic, die by the heuristic".  ;-)
> > > > > >
> > > > > > It would be good to have some indication when gdb exited, but things
> > > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > > a long-running function.
> > > > >
> > > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > > make no much difference? The simplest way is adding 300*HZ, but Joel
> > > > > dislikes that.
> > > >
> > > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > > original code?
> > > >
> > > > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > > > and time_before() macros have ULONG_MAX/4 slop in either direction
> > > > before giving you the wrong answer.  You can get nearly the same result
> > > > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > > > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > > > or jiffies-update delay before you start getting false positives.
> > > >
> > > > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > > > also the current reset at the beginning of a grace period, which
> > > > is in record_gp_stall_check_time().
> > >
> > > I like Paul's suggestion a lot except that if someone sets a breakpoint right
> > > when the jiffies is being reset, so then we have to come back to doing
> > > Thomas's suggestion.
> >
> > Please note that ULONG_MAX / 4 allows for jiffies not having been reset
> > for more than 10 days on 32-bit systems and for many millions of years
> > on 64-bit systems.  ;-)
> >
> > > So maybe a combination of Paul's and Thomas's suggestions (of using
> > > last_jiffies_update with the NMI-safe timestamp read) may work.
> >
> > I am absolutely not a fan of reworking all of the RCU CPU stall-warning
> > code to use some other timebase, at least not without a very important
> > reason to do so.  Nothing mentioned in this thread even comes close to
> > that level of importance.
> >
> > > > It would be better if RCU could get notified at both ends of the debug
> > > > session, but given gdb commands such as "next", along with Thomas's
> > > > point about gdb breakpoints being pretty much anywhere, this might or
> > > > might not be so helpful in real life.  But worth looking into.
> > >
> > > True, I was curious if rcu_cpu_stall_reset() can be called on a tickless
> > > kernel as well before jiffies gets a chance to update, in which case I think
> > > your suggestion of biasing the stall time and later resetting it would help a
> > > lot for such situations.
> >
> > What code path can possibly invoke rcu_cpu_stall_reset() after an
> > extended full-system nohz_full time period without first doing at least
> > one context switch on the CPU that invokes rcu_cpu_stall_reset()?
> In my commit message, the "KGDB initial breakpoint" means the
> automatic call to kgdb_initial_breakpoint() at system boot. In my
> test:
> 1, the "stall timeout" is 21s;
> 2, when I use "continue" to exit kgdb, the "total jiffies delayed
> time" is ~40s (of course it will cause stall warning);
> 3, the "irq disabled time" (nearly the same as execution time of
> kgdb_cpu_enter()) is ~12s;
> 4, this means the "jiffies delayed time" due to the tickless mechanism is ~28s.
>
> So, at least in this case, the tickless mechanism contributes much for
> the jiffies delay.
I'm sorry, there is a typo above: the "irq disabled time" is ~28s and the
"jiffies delayed time" caused by the tickless mechanism is ~12s in the above test.

>
> Huacai
>
> >
> >                                                         Thanx, Paul
> >
> > > thanks,
> > >
> > >  - Joel
> > >
> > >
> > > >                                                     Thanx, Paul
> > > >
> > > > > Huacai
> > > > >
> > > > > >
> > > > > >                                                         Thanx, Paul
> > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > >         tglx
> > > > > > > ---
> > > > > > > --- a/kernel/time/tick-sched.c
> > > > > > > +++ b/kernel/time/tick-sched.c
> > > > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > > > >   */
> > > > > > >  static ktime_t last_jiffies_update;
> > > > > > >
> > > > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > > > +{
> > > > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > > > +
> > > > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > > > +}
> > > > > > > +
> > > > > > >  /*
> > > > > > >   * Must be called with interrupts disabled !
> > > > > > >   */
> > > > > > >
> > > > > > >
Paul E. McKenney Aug. 24, 2023, 4:34 p.m. UTC | #33
On Fri, Aug 25, 2023 at 12:03:31AM +0800, Huacai Chen wrote:
> Hi, Paul,
> 
> On Thu, Aug 24, 2023 at 9:28 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Aug 24, 2023 at 01:09:42PM +0000, Joel Fernandes wrote:
> > > On Thu, Aug 24, 2023 at 04:40:42AM -0700, Paul E. McKenney wrote:
> > > > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > > > Hi, Paul,
> > > > >
> > > > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > >
> > > > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > > >>
> > > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > > >> called within NMI context?
> > > > > > >
> > > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > > region then you still dead lock.
> > > > > > >
> > > > > > > >> I dislike that..
> > > > > > > > Is this acceptable?
> > > > > > > >
> > > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > > {
> > > > > > > >         unsigned long delta;
> > > > > > > >
> > > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > > >
> > > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > > }
> > > > > > > >
> > > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > > same effect).
> > > > > > >
> > > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > > >
> > > > > > > The only way you can do the above is something like the incomplete and
> > > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > > exist for a reason.
> > > > > >
> > > > > > Just for completeness, another approach, with its own advantages
> > > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > > does reach zero.
> > > > > >
> > > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > > heuristic, die by the heuristic".  ;-)
> > > > > >
> > > > > > It would be good to have some indication when gdb exited, but things
> > > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > > a long-running function.
> > > > >
> > > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > > make no much difference? The simplest way is adding 300*HZ, but Joel
> > > > > dislikes that.
> > > >
> > > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > > original code?
> > > >
> > > > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > > > and time_before() macros have ULONG_MAX/4 slop in either direction
> > > > before giving you the wrong answer.  You can get nearly the same result
> > > > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > > > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > > > or jiffies-update delay before you start getting false positives.
> > > >
> > > > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > > > also the current reset at the beginning of a grace period, which
> > > > is in record_gp_stall_check_time().
> > >
> > > I like Paul's suggestion a lot except that if someone sets a breakpoint right
> > > when the jiffies is being reset, so then we have to come back to doing
> > > Thomas's suggestion.
> >
> > Please note that ULONG_MAX / 4 allows for jiffies not having been reset
> > for more than 10 days on 32-bit systems and for many millions of years
> > on 64-bit systems.  ;-)
> >
> > > So maybe a combination of Paul's and Thomas's suggestions (of using
> > > last_jiffies_update with the NMI-safe timestamp read) may work.
> >
> > I am absolutely not a fan of reworking all of the RCU CPU stall-warning
> > code to use some other timebase, at least not without a very important
> > reason to do so.  Nothing mentioned in this thread even comes close to
> > that level of importance.
> >
> > > > It would be better if RCU could get notified at both ends of the debug
> > > > session, but given gdb commands such as "next", along with Thomas's
> > > > point about gdb breakpoints being pretty much anywhere, this might or
> > > > might not be so helpful in real life.  But worth looking into.
> > >
> > > True, I was curious if rcu_cpu_stall_reset() can be called on a tickless
> > > kernel as well before jiffies gets a chance to update, in which case I think
> > > your suggestion of biasing the stall time and later resetting it would help a
> > > lot for such situations.
> >
> > What code path can possibly invoke rcu_cpu_stall_reset() after an
> > extended full-system nohz_full time period without first doing at least
> > one context switch on the CPU that invokes rcu_cpu_stall_reset()?
> 
> In my commit message, the "KGDB initial breakpoint" means the
> automatic call to kgdb_initial_breakpoint() at system boot. In my
> test:
> 1, the "stall timeout" is 21s;
> 2, when I use "continue" to exit kgdb, the "total jiffies delayed
> time" is ~40s (of course it will cause stall warning);
> 3, the "irq disabled time" (nearly the same as execution time of
> kgdb_cpu_enter()) is ~12s;
> 4, this means the "jiffies delayed time" due to the tickless mechanism is ~28s.
> 
> So, at least in this case, the tickless mechanism contributes much for
> the jiffies delay.

Checking kgdb_initial_breakpoint() call points:

o	From start_kernel():  This is before interrupts have been enabled
	on the boot CPU, and so is before the first RCU grace period has
	started.  Also before any CPU is able to enter nohz_full mode.
	So this is presumably not the code path you are hitting.

o	From opt_kgdb_wait():  This is an early_param() function, thus
	also invoked before interrupts have been enabled on the boot CPU.

o	From kgdb_register_io_module(): I don't have a good handle
	on this set of code paths, but I am guessing that this is
	where you are hitting your breakpoint.

And in the post-boot cases, if there is no RCU grace period in flight,
there will be no RCU CPU stall warning.  Otherwise, the RCU grace-period
kthread should have been running every few jiffies for the duration of
the grace period.  If it wasn't, for example, due to being starved by
the nohz_full user processes running at real-time priority, you would
get RCU CPU stall warnings anyway.

If the RCU grace-period kthread is running, then the system is not in pure
system-wide nohz_full mode, and the jiffies updates should be happening.
This means that even if you were running in this mode for many weeks on
a 32-bit system, the jiffies value should be sane.

Which in turn should mean that this should be fixable by adding
ULONG_MAX/4 to rcu_state.jiffies_stall in rcu_cpu_stall_reset(), along
with a reset via rcu_jiffies_till_stall_check() after a few calls to
rcu_gp_fqs().

Or am I missing something subtle here?

							Thanx, Paul

> Huacai
> 
> >
> >                                                         Thanx, Paul
> >
> > > thanks,
> > >
> > >  - Joel
> > >
> > >
> > > >                                                     Thanx, Paul
> > > >
> > > > > Huacai
> > > > >
> > > > > >
> > > > > >                                                         Thanx, Paul
> > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > >         tglx
> > > > > > > ---
> > > > > > > --- a/kernel/time/tick-sched.c
> > > > > > > +++ b/kernel/time/tick-sched.c
> > > > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > > > >   */
> > > > > > >  static ktime_t last_jiffies_update;
> > > > > > >
> > > > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > > > +{
> > > > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > > > +
> > > > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > > > +}
> > > > > > > +
> > > > > > >  /*
> > > > > > >   * Must be called with interrupts disabled !
> > > > > > >   */
> > > > > > >
> > > > > > >
Paul E. McKenney Aug. 24, 2023, 6:28 p.m. UTC | #34
On Thu, Aug 24, 2023 at 11:43:04PM +0800, Huacai Chen wrote:
> Hi, Paul,
> 
> On Thu, Aug 24, 2023 at 9:24 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Aug 24, 2023 at 08:40:00PM +0800, Huacai Chen wrote:
> > > Hi, Paul,
> > > On Thu, Aug 24, 2023 at 7:40 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > > > Hi, Paul,
> > > > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > > >>
> > > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > > >> called within NMI context?
> > > > > > >
> > > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > > region then you still dead lock.
> > > > > > >
> > > > > > > >> I dislike that..
> > > > > > > > Is this acceptable?
> > > > > > > >
> > > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > > {
> > > > > > > >         unsigned long delta;
> > > > > > > >
> > > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > > >
> > > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > > }
> > > > > > > >
> > > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > > same effect).
> > > > > > >
> > > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > > >
> > > > > > > The only way you can do the above is something like the incomplete and
> > > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > > exist for a reason.
> > > > > >
> > > > > > Just for completeness, another approach, with its own advantages
> > > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > > does reach zero.
> > > > > >
> > > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > > heuristic, die by the heuristic".  ;-)
> > > > > >
> > > > > > It would be good to have some indication when gdb exited, but things
> > > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > > a long-running function.
> > > > >
> > > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > > make no much difference? The simplest way is adding 300*HZ, but Joel
> > > > > dislikes that.
> > > >
> > > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > > original code?
> > >
> > > Maybe I misunderstand something, I say the original code means code
> > > before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall
> > > detection in rcu_cpu_stall_reset()").
> >
> > Yes, my suggestion would essentially revert that patch.  It would
> > compensate by resetting rcu_state.jiffies_stall after a few calls
> > to rcu_gp_fqs().
> >
> > Alternatively, we could simply provide a way for gdb users to manually
> > disable RCU CPU stall warnings at the beginning of their debug sessions
> > and to manually re-enable them when they are done.
> 
> This problem is not KGDB-specific (though it is firstly found in the
> KGDB case), so I want to fix it in the rcu code rather than in the
> kgdb code.

Sure, for example, there is also PowerPC XMON.

But this problem also is not RCU-specific.  There are also hardlockups,
softlockups, workqueue lockups, networking timeouts, and who knows what
all else.

Plus, and again to Thomas's point, gdb breakpoints can happen anywhere.
For example, immediately after RCU computes the RCU CPU stall time for
a new grace period, and right before it stores it.  The gdb callout
updates rcu_state.jiffies_stall, but that update is overwritten with a
stale value as soon as the system starts back up.

Low probability, to be sure, but there are quite a few places in
the kernel right after a read from some timebase or another, and many
(perhaps all) of these can see similar stale-time-use problems.

The only way I know of to avoid these sorts of false positives is for
the user to manually suppress all timeouts (perhaps using a kernel-boot
parameter for your early-boot case), do the gdb work, and then unsuppress
all stalls.  Even that won't work for networking, because the other
system's clock will be running throughout.
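
For reference, RCU does already expose such a manual switch as the rcupdate.rcu_cpu_stall_suppress parameter, usable either on the kernel command line for the early-boot case or at run time via sysfs (shown here as a sketch of the workflow):

```shell
# Suppress RCU CPU stall warnings for the duration of a debug session.
echo 1 > /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress

# ... kgdb/gdb work ...

# Re-enable stall warnings afterwards.
echo 0 > /sys/module/rcupdate/parameters/rcu_cpu_stall_suppress
```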

In other words, from what I know now, there is no perfect solution.
Therefore, there are sharp limits to the complexity of any solution that
I will be willing to accept.

							Thanx, Paul

> Huacai
> >
> >                                                         Thanx, Paul
> >
> > > Huacai
> > > >
> > > > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > > > and time_before() macros have ULONG_MAX/4 slop in either direction
> > > > before giving you the wrong answer.  You can get nearly the same result
> > > > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > > > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > > > or jiffies-update delay before you start getting false positives.
> > > >
> > > > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > > > also the current reset at the beginning of a grace period, which
> > > > is in record_gp_stall_check_time().
> > > >
> > > > It would be better if RCU could get notified at both ends of the debug
> > > > session, but given gdb commands such as "next", along with Thomas's
> > > > point about gdb breakpoints being pretty much anywhere, this might or
> > > > might not be so helpful in real life.  But worth looking into.
> > > >
> > > >                                                         Thanx, Paul
> > > >
> > > > > Huacai
> > > > >
> > > > > >
> > > > > >                                                         Thanx, Paul
> > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > >         tglx
> > > > > > > ---
> > > > > > > --- a/kernel/time/tick-sched.c
> > > > > > > +++ b/kernel/time/tick-sched.c
> > > > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > > > >   */
> > > > > > >  static ktime_t last_jiffies_update;
> > > > > > >
> > > > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > > > +{
> > > > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > > > +
> > > > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > > > +}
> > > > > > > +
> > > > > > >  /*
> > > > > > >   * Must be called with interrupts disabled !
> > > > > > >   */
> > > > > > >
> > > > > > >
Huacai Chen Aug. 25, 2023, 11:15 a.m. UTC | #35
Hi, Paul,

On Fri, Aug 25, 2023 at 2:28 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Aug 24, 2023 at 11:43:04PM +0800, Huacai Chen wrote:
> > Hi, Paul,
> >
> > On Thu, Aug 24, 2023 at 9:24 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Thu, Aug 24, 2023 at 08:40:00PM +0800, Huacai Chen wrote:
> > > > Hi, Paul,
> > > > On Thu, Aug 24, 2023 at 7:40 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > > > > Hi, Paul,
> > > > > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > > > >>
> > > > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > > > >> called within NMI context?
> > > > > > > >
> > > > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > > > region then you still dead lock.
> > > > > > > >
> > > > > > > > >> I dislike that..
> > > > > > > > > Is this acceptable?
> > > > > > > > >
> > > > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > > > {
> > > > > > > > >         unsigned long delta;
> > > > > > > > >
> > > > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > > > >
> > > > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > > > same effect).
> > > > > > > >
> > > > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > > > >
> > > > > > > > The only way you can do the above is something like the incomplete and
> > > > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > > > exist for a reason.
> > > > > > >
> > > > > > > Just for completeness, another approach, with its own advantages
> > > > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > > > does reach zero.
> > > > > > >
> > > > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > > > heuristic, die by the heuristic".  ;-)
> > > > > > >
> > > > > > > It would be good to have some indication when gdb exited, but things
> > > > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > > > a long-running function.
> > > > > >
> > > > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > > > make no much difference? The simplest way is adding 300*HZ, but Joel
> > > > > > dislikes that.
> > > > >
> > > > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > > > original code?
> > > >
> > > > Maybe I misunderstand something, I say the original code means code
> > > > before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall
> > > > detection in rcu_cpu_stall_reset()").
> > >
> > > Yes, my suggestion would essentially revert that patch.  It would
> > > compensate by resetting rcu_state.jiffies_stall after a few calls
> > > to rcu_gp_fqs().
> > >
> > > Alternatively, we could simply provide a way for gdb users to manually
> > > disable RCU CPU stall warnings at the beginning of their debug sessions
> > > and to manually re-enable them when they are done.
> >
> > This problem is not KGDB-specific (though it is firstly found in the
> > KGDB case), so I want to fix it in the rcu code rather than in the
> > kgdb code.
>
> Sure, for example, there is also PowerPC XMON.
>
> But this problem also is not RCU-specific.  There are also hardlockups,
> softlockups, workqueue lockups, networking timeouts, and who knows what
> all else.
>
> Plus, and again to Thomas's point, gdb breakpoints can happen anywhere.
> For example, immediately after RCU computes the RCU CPU stall time for
> a new grace period, and right before it stores it.  The gdb callout
> updates rcu_state.jiffies_stall, but that update is overwritten with a
> stale value as soon as the system starts back up.
>
> Low probabillity, to be sure, but there are quite a few places in
> the kernel right after a read from some timebase or another, and many
> (perhaps all) of these can see similar stale-time-use problems.
>
> The only way I know of to avoid these sorts of false positives is for
> the user to manually suppress all timeouts (perhaps using a kernel-boot
> parameter for your early-boot case), do the gdb work, and then unsuppress
> all stalls.  Even that won't work for networking, because the other
> system's clock will be running throughout.
>
> In other words, from what I know now, there is no perfect solution.
> Therefore, there are sharp limits to the complexity of any solution that
> I will be willing to accept.
I think the simplest solution is (I hope Joel will not be angry):

void rcu_cpu_stall_reset(void)
{
        WRITE_ONCE(rcu_state.jiffies_stall, jiffies + 300*HZ);
}

300s is the upper limit of "stall timeout" we can configure
(RCU_CPU_STALL_TIMEOUT in kernel/rcu/Kconfig.debug), so it isn't just
a "magic number". In practice, 300s is also enough for any normal kgdb
operation. And compared to "resetting after a few calls to
rcu_gp_fqs()", this simple solution means "automatically resetting
after 300s".

If this is completely unacceptable, I prefer Thomas's
tick_estimate_stale_jiffies() solution.

Huacai

>
>                                                         Thanx, Paul
>
> > Huacai
> > >
> > >                                                         Thanx, Paul
> > >
> > > > Huacai
> > > > >
> > > > > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > > > > and time_before() macros have ULONG_MAX/4 slop in either direction
> > > > > before giving you the wrong answer.  You can get nearly the same result
> > > > > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > > > > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > > > > or jiffies-update delay before you start getting false positives.
> > > > >
> > > > > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > > > > also the current reset at the beginning of a grace period, which
> > > > > is in record_gp_stall_check_time().
> > > > >
> > > > > It would be better if RCU could get notified at both ends of the debug
> > > > > session, but given gdb commands such as "next", along with Thomas's
> > > > > point about gdb breakpoints being pretty much anywhere, this might or
> > > > > might not be so helpful in real life.  But worth looking into.
> > > > >
> > > > >                                                         Thanx, Paul
> > > > >
> > > > > > Huacai
> > > > > >
> > > > > > >
> > > > > > >                                                         Thanx, Paul
> > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > >         tglx
> > > > > > > > ---
> > > > > > > > --- a/kernel/time/tick-sched.c
> > > > > > > > +++ b/kernel/time/tick-sched.c
> > > > > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > > > > >   */
> > > > > > > >  static ktime_t last_jiffies_update;
> > > > > > > >
> > > > > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > > > > +{
> > > > > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > > > > +
> > > > > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >  /*
> > > > > > > >   * Must be called with interrupts disabled !
> > > > > > > >   */
> > > > > > > >
> > > > > > > >
Joel Fernandes Aug. 25, 2023, 11:28 p.m. UTC | #36
On Fri, Aug 25, 2023 at 07:15:44PM +0800, Huacai Chen wrote:
> Hi, Paul,
> 
> On Fri, Aug 25, 2023 at 2:28 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Aug 24, 2023 at 11:43:04PM +0800, Huacai Chen wrote:
> > > Hi, Paul,
> > >
> > > On Thu, Aug 24, 2023 at 9:24 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Thu, Aug 24, 2023 at 08:40:00PM +0800, Huacai Chen wrote:
> > > > > Hi, Paul,
> > > > > On Thu, Aug 24, 2023 at 7:40 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > > > > > Hi, Paul,
> > > > > > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > > > > >>
> > > > > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > > > > >> called within NMI context?
> > > > > > > > >
> > > > > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > > > > region then you still dead lock.
> > > > > > > > >
> > > > > > > > > >> I dislike that..
> > > > > > > > > > Is this acceptable?
> > > > > > > > > >
> > > > > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > > > > {
> > > > > > > > > >         unsigned long delta;
> > > > > > > > > >
> > > > > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > > > > >
> > > > > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > > > > same effect).
> > > > > > > > >
> > > > > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > > > > >
> > > > > > > > > The only way you can do the above is something like the incomplete and
> > > > > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > > > > exist for a reason.
> > > > > > > >
> > > > > > > > Just for completeness, another approach, with its own advantages
> > > > > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > > > > does reach zero.
> > > > > > > >
> > > > > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > > > > heuristic, die by the heuristic".  ;-)
> > > > > > > >
> > > > > > > > It would be good to have some indication when gdb exited, but things
> > > > > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > > > > a long-running function.
> > > > > > >
> > > > > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > > > > not make much difference? The simplest way is adding 300*HZ, but Joel
> > > > > > > dislikes that.
> > > > > >
> > > > > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > > > > original code?
> > > > >
> > > > > Maybe I misunderstand something, I say the original code means code
> > > > > before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall
> > > > > detection in rcu_cpu_stall_reset()").
> > > >
> > > > Yes, my suggestion would essentially revert that patch.  It would
> > > > compensate by resetting rcu_state.jiffies_stall after a few calls
> > > > to rcu_gp_fqs().
> > > >
> > > > Alternatively, we could simply provide a way for gdb users to manually
> > > > disable RCU CPU stall warnings at the beginning of their debug sessions
> > > > and to manually re-enable them when they are done.
> > >
> > > This problem is not KGDB-specific (though it was first found in the
> > > KGDB case), so I want to fix it in the rcu code rather than in the
> > > kgdb code.
> >
> > Sure, for example, there is also PowerPC XMON.
> >
> > But this problem also is not RCU-specific.  There are also hardlockups,
> > softlockups, workqueue lockups, networking timeouts, and who knows what
> > all else.
> >
> > Plus, and again to Thomas's point, gdb breakpoints can happen anywhere.
> > For example, immediately after RCU computes the RCU CPU stall time for
> > a new grace period, and right before it stores it.  The gdb callout
> > updates rcu_state.jiffies_stall, but that update is overwritten with a
> > stale value as soon as the system starts back up.
> >
> > Low probability, to be sure, but there are quite a few places in
> > the kernel right after a read from some timebase or another, and many
> > (perhaps all) of these can see similar stale-time-use problems.
> >
> > The only way I know of to avoid these sorts of false positives is for
> > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > parameter for your early-boot case), do the gdb work, and then unsuppress
> > all stalls.  Even that won't work for networking, because the other
> > system's clock will be running throughout.
> >
> > In other words, from what I know now, there is no perfect solution.
> > Therefore, there are sharp limits to the complexity of any solution that
> > I will be willing to accept.
> I think the simplest solution is (I hope Joel will not be angry):

Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
will also affect the VM workloads, which also do a similar reset.  Allow me a few
days to see if I can take a shot at fixing it slightly differently. I am
trying Paul's idea of setting jiffies at a later time. I think it is doable.
I think the advantage of doing this is it will make stall detection more
robust in the face of these gaps in the jiffies update. And that solution does
not even need us to rely on ktime (and all the issues that come with that).

thanks,

 - Joel


> 
> void rcu_cpu_stall_reset(void)
> {
>         WRITE_ONCE(rcu_state.jiffies_stall, jiffies + 300*HZ);
> }
> 
> 300s is the upper limit of "stall timeout" we can configure
> (RCU_CPU_STALL_TIMEOUT in kernel/rcu/Kconfig.debug), so it isn't just
> a "magic number". In practice, 300s is also enough for any normal kgdb
> operation. And compared to "resetting after a few calls to
> rcu_gp_fqs()", this simple solution means "automatically resetting
> after 300s".
> 
> If this is completely unacceptable, I prefer Thomas's
> tick_estimate_stale_jiffies() solution.
> 
> Huacai
> 
> >
> >                                                         Thanx, Paul
> >
> > > Huacai
> > > >
> > > >                                                         Thanx, Paul
> > > >
> > > > > Huacai
> > > > > >
> > > > > > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > > > > > and time_before() macros have ULONG_MAX/4 slop in either direction
> > > > > > before giving you the wrong answer.  You can get nearly the same result
> > > > > > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > > > > > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > > > > > or jiffies-update delay before you start getting false positives.
> > > > > >
> > > > > > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > > > > > also the current reset at the beginning of a grace period, which
> > > > > > is in record_gp_stall_check_time().
> > > > > >
> > > > > > It would be better if RCU could get notified at both ends of the debug
> > > > > > session, but given gdb commands such as "next", along with Thomas's
> > > > > > point about gdb breakpoints being pretty much anywhere, this might or
> > > > > > might not be so helpful in real life.  But worth looking into.
> > > > > >
> > > > > >                                                         Thanx, Paul
> > > > > >
> > > > > > > Huacai
> > > > > > >
> > > > > > > >
> > > > > > > >                                                         Thanx, Paul
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > >         tglx
> > > > > > > > > ---
> > > > > > > > > --- a/kernel/time/tick-sched.c
> > > > > > > > > +++ b/kernel/time/tick-sched.c
> > > > > > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > > > > > >   */
> > > > > > > > >  static ktime_t last_jiffies_update;
> > > > > > > > >
> > > > > > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > > > > > +{
> > > > > > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > > > > > +
> > > > > > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > >  /*
> > > > > > > > >   * Must be called with interrupts disabled !
> > > > > > > > >   */
> > > > > > > > >
> > > > > > > > >
Paul E. McKenney Aug. 26, 2023, 1:45 a.m. UTC | #37
On Fri, Aug 25, 2023 at 07:15:44PM +0800, Huacai Chen wrote:
> Hi, Paul,
> 
> On Fri, Aug 25, 2023 at 2:28 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Aug 24, 2023 at 11:43:04PM +0800, Huacai Chen wrote:
> > > Hi, Paul,
> > >
> > > On Thu, Aug 24, 2023 at 9:24 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Thu, Aug 24, 2023 at 08:40:00PM +0800, Huacai Chen wrote:
> > > > > Hi, Paul,
> > > > > On Thu, Aug 24, 2023 at 7:40 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote:
> > > > > > > Hi, Paul,
> > > > > > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote:
> > > > > > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote:
> > > > > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > > > > >>
> > > > > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > > > > >> called within NMI context?
> > > > > > > > >
> > > > > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > > > > region then you still dead lock.
> > > > > > > > >
> > > > > > > > > >> I dislike that..
> > > > > > > > > > Is this acceptable?
> > > > > > > > > >
> > > > > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > > > > {
> > > > > > > > > >         unsigned long delta;
> > > > > > > > > >
> > > > > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > > > > >
> > > > > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > > > > same effect).
> > > > > > > > >
> > > > > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > > > > >
> > > > > > > > > The only way you can do the above is something like the incomplete and
> > > > > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > > > > exist for a reason.
> > > > > > > >
> > > > > > > > Just for completeness, another approach, with its own advantages
> > > > > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > > > > does reach zero.
> > > > > > > >
> > > > > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > > > > heuristic, die by the heuristic".  ;-)
> > > > > > > >
> > > > > > > > It would be good to have some indication when gdb exited, but things
> > > > > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > > > > a long-running function.
> > > > > > >
> > > > > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > > > > not make much difference? The simplest way is adding 300*HZ, but Joel
> > > > > > > dislikes that.
> > > > > >
> > > > > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > > > > original code?
> > > > >
> > > > > Maybe I misunderstand something, I say the original code means code
> > > > > before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall
> > > > > detection in rcu_cpu_stall_reset()").
> > > >
> > > > Yes, my suggestion would essentially revert that patch.  It would
> > > > compensate by resetting rcu_state.jiffies_stall after a few calls
> > > > to rcu_gp_fqs().
> > > >
> > > > Alternatively, we could simply provide a way for gdb users to manually
> > > > disable RCU CPU stall warnings at the beginning of their debug sessions
> > > > and to manually re-enable them when they are done.
> > >
> > > This problem is not KGDB-specific (though it was first found in the
> > > KGDB case), so I want to fix it in the rcu code rather than in the
> > > kgdb code.
> >
> > Sure, for example, there is also PowerPC XMON.
> >
> > But this problem also is not RCU-specific.  There are also hardlockups,
> > softlockups, workqueue lockups, networking timeouts, and who knows what
> > all else.
> >
> > Plus, and again to Thomas's point, gdb breakpoints can happen anywhere.
> > For example, immediately after RCU computes the RCU CPU stall time for
> > a new grace period, and right before it stores it.  The gdb callout
> > updates rcu_state.jiffies_stall, but that update is overwritten with a
> > stale value as soon as the system starts back up.
> >
> > Low probability, to be sure, but there are quite a few places in
> > the kernel right after a read from some timebase or another, and many
> > (perhaps all) of these can see similar stale-time-use problems.
> >
> > The only way I know of to avoid these sorts of false positives is for
> > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > parameter for your early-boot case), do the gdb work, and then unsuppress
> > all stalls.  Even that won't work for networking, because the other
> > system's clock will be running throughout.
> >
> > In other words, from what I know now, there is no perfect solution.
> > Therefore, there are sharp limits to the complexity of any solution that
> > I will be willing to accept.
> I think the simplest solution is (I hope Joel will not be angry):
> 
> void rcu_cpu_stall_reset(void)
> {
>         WRITE_ONCE(rcu_state.jiffies_stall, jiffies + 300*HZ);
> }
> 
> 300s is the upper limit of "stall timeout" we can configure
> (RCU_CPU_STALL_TIMEOUT in kernel/rcu/Kconfig.debug), so it isn't just
> a "magic number". In practice, 300s is also enough for any normal kgdb
> operation. And compared to "resetting after a few calls to
> rcu_gp_fqs()", this simple solution means "automatically resetting
> after 300s".

Please keep in mind that the long-ago choice of 300s did not take things
like kernel debuggers into account.  But that 300s limit still makes
sense in the absence of things like kernel debuggers.  So this code is
what makes up the difference.

> If this is completely unacceptable, I prefer Thomas's
> tick_estimate_stale_jiffies() solution.

Thomas's tick_estimate_stale_jiffies() does have its merits, but the
advantage of ULONG_MAX/4 is that you don't have to care about jiffies
being stale.

							Thanx, Paul

> > > > > > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after()
> > > > > > and time_before() macros have ULONG_MAX/4 slop in either direction
> > > > > > before giving you the wrong answer.  You can get nearly the same result
> > > > > > using ULONG_MAX/2, but it requires a bit more care.  And even on 32-bit
> > > > > > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session
> > > > > > or jiffies-update delay before you start getting false positives.
> > > > > >
> > > > > > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and
> > > > > > also the current reset at the beginning of a grace period, which
> > > > > > is in record_gp_stall_check_time().
> > > > > >
> > > > > > It would be better if RCU could get notified at both ends of the debug
> > > > > > session, but given gdb commands such as "next", along with Thomas's
> > > > > > point about gdb breakpoints being pretty much anywhere, this might or
> > > > > > might not be so helpful in real life.  But worth looking into.
> > > > > >
> > > > > >                                                         Thanx, Paul
> > > > > >
> > > > > > > Huacai
> > > > > > >
> > > > > > > >
> > > > > > > >                                                         Thanx, Paul
> > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > >         tglx
> > > > > > > > > ---
> > > > > > > > > --- a/kernel/time/tick-sched.c
> > > > > > > > > +++ b/kernel/time/tick-sched.c
> > > > > > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i
> > > > > > > > >   */
> > > > > > > > >  static ktime_t last_jiffies_update;
> > > > > > > > >
> > > > > > > > > +unsigned long tick_estimate_stale_jiffies(void)
> > > > > > > > > +{
> > > > > > > > > +     ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update);
> > > > > > > > > +
> > > > > > > > > +     return delta < 0 ? 0 : div_s64(delta, TICK_NSEC);
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > >  /*
> > > > > > > > >   * Must be called with interrupts disabled !
> > > > > > > > >   */
> > > > > > > > >
> > > > > > > > >
Joel Fernandes Aug. 27, 2023, 3:27 a.m. UTC | #38
On Fri, Aug 25, 2023 at 7:28 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Fri, Aug 25, 2023 at 07:15:44PM +0800, Huacai Chen wrote:
> > Hi, Paul,
> >
> > On Fri, Aug 25, 2023 at 2:28 AM Paul E. McKenney <paulmck@kernel.org> wrote:
[..]
> > > > > > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > > > > > >>
> > > > > > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > > > > > >> called within NMI context?
> > > > > > > > > >
> > > > > > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > > > > > region then you still dead lock.
> > > > > > > > > >
> > > > > > > > > > >> I dislike that..
> > > > > > > > > > > Is this acceptable?
> > > > > > > > > > >
> > > > > > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > > > > > {
> > > > > > > > > > >         unsigned long delta;
> > > > > > > > > > >
> > > > > > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > > > > > >
> > > > > > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > > > > > same effect).
> > > > > > > > > >
> > > > > > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > > > > > >
> > > > > > > > > > The only way you can do the above is something like the incomplete and
> > > > > > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > > > > > exist for a reason.
> > > > > > > > >
> > > > > > > > > Just for completeness, another approach, with its own advantages
> > > > > > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > > > > > does reach zero.
> > > > > > > > >
> > > > > > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > > > > > heuristic, die by the heuristic".  ;-)
> > > > > > > > >
> > > > > > > > > It would be good to have some indication when gdb exited, but things
> > > > > > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > > > > > a long-running function.
> > > > > > > >
> > > > > > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > > > > > not make much difference? The simplest way is adding 300*HZ, but Joel
> > > > > > > > dislikes that.
> > > > > > >
> > > > > > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > > > > > original code?
> > > > > >
> > > > > > Maybe I misunderstand something, I say the original code means code
> > > > > > before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall
> > > > > > detection in rcu_cpu_stall_reset()").
> > > > >
> > > > > Yes, my suggestion would essentially revert that patch.  It would
> > > > > compensate by resetting rcu_state.jiffies_stall after a few calls
> > > > > to rcu_gp_fqs().
> > > > >
> > > > > Alternatively, we could simply provide a way for gdb users to manually
> > > > > disable RCU CPU stall warnings at the beginning of their debug sessions
> > > > > and to manually re-enable them when they are done.
> > > >
> > > > This problem is not KGDB-specific (though it was first found in the
> > > > KGDB case), so I want to fix it in the rcu code rather than in the
> > > > kgdb code.
> > >
> > > Sure, for example, there is also PowerPC XMON.
> > >
> > > But this problem also is not RCU-specific.  There are also hardlockups,
> > > softlockups, workqueue lockups, networking timeouts, and who knows what
> > > all else.
> > >
> > > Plus, and again to Thomas's point, gdb breakpoints can happen anywhere.
> > > For example, immediately after RCU computes the RCU CPU stall time for
> > > a new grace period, and right before it stores it.  The gdb callout
> > > updates rcu_state.jiffies_stall, but that update is overwritten with a
> > > stale value as soon as the system starts back up.
> > >
> > > Low probability, to be sure, but there are quite a few places in
> > > the kernel right after a read from some timebase or another, and many
> > > (perhaps all) of these can see similar stale-time-use problems.
> > >
> > > The only way I know of to avoid these sorts of false positives is for
> > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > all stalls.  Even that won't work for networking, because the other
> > > system's clock will be running throughout.
> > >
> > > In other words, from what I know now, there is no perfect solution.
> > > Therefore, there are sharp limits to the complexity of any solution that
> > > I will be willing to accept.
> > I think the simplest solution is (I hope Joel will not be angry):
>
> Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> will also affect the VM workloads, which also do a similar reset.  Allow me a few
> days to see if I can take a shot at fixing it slightly differently. I am
> trying Paul's idea of setting jiffies at a later time. I think it is doable.
> I think the advantage of doing this is it will make stall detection more
> robust in the face of these gaps in the jiffies update. And that solution does
> not even need us to rely on ktime (and all the issues that come with that).
>

I wrote a patch similar to Paul's idea and sent it out for review, the
advantage being it purely is based on jiffies. Could you try it out
and let me know?

thanks,

 - Joel
Huacai Chen Aug. 27, 2023, 5:50 a.m. UTC | #39
Hi, Joel,

On Sun, Aug 27, 2023 at 11:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Fri, Aug 25, 2023 at 7:28 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> >
> > On Fri, Aug 25, 2023 at 07:15:44PM +0800, Huacai Chen wrote:
> > > Hi, Paul,
> > >
> > > On Fri, Aug 25, 2023 at 2:28 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> [..]
> > > > > > > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > > > > > > > >> > If  do_update_jiffies_64() cannot be used in NMI context,
> > > > > > > > > > > >>
> > > > > > > > > > > >> Can you not make the jiffies update conditional on whether it is
> > > > > > > > > > > >> called within NMI context?
> > > > > > > > > > >
> > > > > > > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held
> > > > > > > > > > > region then you still dead lock.
> > > > > > > > > > >
> > > > > > > > > > > >> I dislike that..
> > > > > > > > > > > > Is this acceptable?
> > > > > > > > > > > >
> > > > > > > > > > > > void rcu_cpu_stall_reset(void)
> > > > > > > > > > > > {
> > > > > > > > > > > >         unsigned long delta;
> > > > > > > > > > > >
> > > > > > > > > > > >         delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns());
> > > > > > > > > > > >
> > > > > > > > > > > >         WRITE_ONCE(rcu_state.jiffies_stall,
> > > > > > > > > > > >                    jiffies + delta + rcu_jiffies_till_stall_check());
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > This can update jiffies_stall without updating jiffies (but has the
> > > > > > > > > > > > same effect).
> > > > > > > > > > >
> > > > > > > > > > > Now you traded the potential dead lock on jiffies lock for a potential
> > > > > > > > > > > live lock vs. tk_core.seq. Not really an improvement, right?
> > > > > > > > > > >
> > > > > > > > > > > The only way you can do the above is something like the incomplete and
> > > > > > > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces
> > > > > > > > > > > exist for a reason.
> > > > > > > > > >
> > > > > > > > > > Just for completeness, another approach, with its own advantages
> > > > > > > > > > and disadvantage, is to add something like ULONG_MAX/4 to
> > > > > > > > > > rcu_state.jiffies_stall, but also set a counter indicating that this
> > > > > > > > > > has been done.  Then RCU's force-quiescent processing could decrement
> > > > > > > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it
> > > > > > > > > > does reach zero.
> > > > > > > > > >
> > > > > > > > > > Setting the counter to three should cover most cases, but "live by the
> > > > > > > > > > heuristic, die by the heuristic".  ;-)
> > > > > > > > > >
> > > > > > > > > > It would be good to have some indication when gdb exited, but things
> > > > > > > > > > like the gdb "next" command can make that "interesting" when applied to
> > > > > > > > > > a long-running function.
> > > > > > > > >
> > > > > > > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may
> > > > > > > > > not make much difference? The simplest way is adding 300*HZ, but Joel
> > > > > > > > > dislikes that.
> > > > > > > >
> > > > > > > > I am not seeing the ULONG_MAX/2, so could you please point me to that
> > > > > > > > original code?
> > > > > > >
> > > > > > > Maybe I misunderstand something, I say the original code means code
> > > > > > > before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall
> > > > > > > detection in rcu_cpu_stall_reset()").
> > > > > >
> > > > > > Yes, my suggestion would essentially revert that patch.  It would
> > > > > > compensate by resetting rcu_state.jiffies_stall after a few calls
> > > > > > to rcu_gp_fqs().
> > > > > >
> > > > > > Alternatively, we could simply provide a way for gdb users to manually
> > > > > > disable RCU CPU stall warnings at the beginning of their debug sessions
> > > > > > and to manually re-enable them when they are done.
> > > > >
> > > > > This problem is not KGDB-specific (though it was first found in the
> > > > > KGDB case), so I want to fix it in the rcu code rather than in the
> > > > > kgdb code.
> > > >
> > > > Sure, for example, there is also PowerPC XMON.
> > > >
> > > > But this problem also is not RCU-specific.  There are also hardlockups,
> > > > softlockups, workqueue lockups, networking timeouts, and who knows what
> > > > all else.
> > > >
> > > > Plus, and again to Thomas's point, gdb breakpoints can happen anywhere.
> > > > For example, immediately after RCU computes the RCU CPU stall time for
> > > > a new grace period, and right before it stores it.  The gdb callout
> > > > updates rcu_state.jiffies_stall, but that update is overwritten with a
> > > > stale value as soon as the system starts back up.
> > > >
> > > > Low probability, to be sure, but there are quite a few places in
> > > > the kernel right after a read from some timebase or another, and many
> > > > (perhaps all) of these can see similar stale-time-use problems.
> > > >
> > > > The only way I know of to avoid these sorts of false positives is for
> > > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > > all stalls.  Even that won't work for networking, because the other
> > > > system's clock will be running throughout.
> > > >
> > > > In other words, from what I know now, there is no perfect solution.
> > > > Therefore, there are sharp limits to the complexity of any solution that
> > > > I will be willing to accept.
> > > I think the simplest solution is (I hope Joel will not be angry):
> >
> > Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> > will also affect the VM workloads, which also do a similar reset.  Allow me a few
> > days to see if I can take a shot at fixing it slightly differently. I am
> > trying Paul's idea of setting jiffies at a later time. I think it is doable.
> > I think the advantage of doing this is it will make stall detection more
> > robust in the face of these gaps in the jiffies update. And that solution does
> > not even need us to rely on ktime (and all the issues that come with that).
> >
>
> I wrote a patch similar to Paul's idea and sent it out for review, the
> advantage being it purely is based on jiffies. Could you try it out
> and let me know?
If you can cc my gmail <chenhuacai@gmail.com>, that could be better.

I have read your patch, maybe the counter (nr_fqs_jiffies_stall)
should be an atomic_t and we should use an atomic operation to decrement
its value, because rcu_gp_fqs() can run concurrently and we may miss
the (nr_fqs == 1) condition.

Huacai
>
> thanks,
>
>  - Joel
Joel Fernandes Aug. 27, 2023, 10:11 p.m. UTC | #40
On Sun, Aug 27, 2023 at 1:51 AM Huacai Chen <chenhuacai@kernel.org> wrote:
[..]
> > > > > The only way I know of to avoid these sorts of false positives is for
> > > > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > > > all stalls.  Even that won't work for networking, because the other
> > > > > system's clock will be running throughout.
> > > > >
> > > > > In other words, from what I know now, there is no perfect solution.
> > > > > Therefore, there are sharp limits to the complexity of any solution that
> > > > > I will be willing to accept.
> > > > I think the simplest solution is (I hope Joel will not be angry):
> > >
> > > Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> > > will also affect the VM workloads, which also do a similar reset.  Allow me a few
> > > days to see if I can take a shot at fixing it slightly differently. I am
> > > trying Paul's idea of setting jiffies at a later time. I think it is doable.
> > > I think the advantage of doing this is it will make stall detection more
> > > robust in the face of these gaps in the jiffies update. And that solution does
> > > not even need us to rely on ktime (and all the issues that come with that).
> > >
> >
> > I wrote a patch similar to Paul's idea and sent it out for review, the
> > advantage being it purely is based on jiffies. Could you try it out
> > and let me know?
> If you can cc my gmail <chenhuacai@gmail.com>, that could be better.

Sure, will do.

>
> I have read your patch, maybe the counter (nr_fqs_jiffies_stall)
> should be an atomic_t and we should use an atomic operation to decrement
> its value, because rcu_gp_fqs() can run concurrently and we may miss
> the (nr_fqs == 1) condition.

I don't think so. There is only one place where the RMW operation happens
and rcu_gp_fqs() is called only from the GP kthread. So a concurrent
RMW (and hence a lost update) is not possible.

Could you test the patch for the issue you are seeing and provide your
Tested-by tag? Thanks,

 - Joel
Paul E. McKenney Aug. 28, 2023, 10:47 a.m. UTC | #41
On Sun, Aug 27, 2023 at 06:11:40PM -0400, Joel Fernandes wrote:
> On Sun, Aug 27, 2023 at 1:51 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> [..]
> > > > > > The only way I know of to avoid these sorts of false positives is for
> > > > > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > > > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > > > > all stalls.  Even that won't work for networking, because the other
> > > > > > system's clock will be running throughout.
> > > > > >
> > > > > > In other words, from what I know now, there is no perfect solution.
> > > > > > Therefore, there are sharp limits to the complexity of any solution that
> > > > > > I will be willing to accept.
> > > > > I think the simplest solution is (I hope Joel will not be angry):
> > > >
> > > > Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> > > > will also effect the VM workloads which also do a similar reset.  Allow me few
> > > > days to see if I can take a shot at fixing it slightly differently. I am
> > > > trying Paul's idea of setting jiffies at a later time. I think it is doable.
> > > > I think the advantage of doing this is it will make stall detection more
> > > > robust in this face of these gaps in jiffie update. And that solution does
> > > > not even need us to rely on ktime (and all the issues that come with that).
> > > >
> > >
> > > I wrote a patch similar to Paul's idea and sent it out for review, the
> > > advantage being it purely is based on jiffies. Could you try it out
> > > and let me know?
> > If you can cc my gmail <chenhuacai@gmail.com>, that could be better.
> 
> Sure, will do.
> 
> > I have read your patch, maybe the counter (nr_fqs_jiffies_stall)
> > should be atomic_t and we should use atomic operation to decrement its
> > value. Because rcu_gp_fqs() can be run concurrently, and we may miss
> > the (nr_fqs == 1) condition.
> 
> I don't think so. There is only 1 place where RMW operation happens
> and rcu_gp_fqs() is called only from the GP kthread. So a concurrent
> RMW (and hence a lost update) is not possible.

Huacai, is your concern that the gdb user might have created a script
(for example, printing a variable or two, then automatically continuing),
so that breakpoints could happen in quick succession, such that the
second breakpoint might run concurrently with rcu_gp_fqs()?

If this can really happen, the point that Joel makes is a good one, namely
that rcu_gp_fqs() is single-threaded and (absent rcutorture) runs only
once every few jiffies.  And gdb breakpoints, even with scripting, should
also be rather rare.  So if this is an issue, a global lock should do the
trick, perhaps even one of the existing locks in the rcu_state structure.
The result should then be just as performant/scalable and a lot simpler
than use of atomics.
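For illustration, a minimal userspace sketch of that lock-based approach (not the kernel code itself; the scaffolding around the nr_fqs_jiffies_stall countdown is hypothetical, with a pthread mutex standing in for a lock from the rcu_state structure):

```c
#include <assert.h>
#include <pthread.h>

/*
 * Userspace model of guarding the stall-countdown state with one
 * global lock instead of atomics.  Names mirror the patch under
 * discussion; everything else is illustrative scaffolding.
 */
static pthread_mutex_t stall_lock = PTHREAD_MUTEX_INITIALIZER;
static int nr_fqs_jiffies_stall;
static unsigned long jiffies_stall;
static unsigned long jiffies;		/* stand-in for the kernel's jiffies */

static unsigned long rcu_jiffies_till_stall_check(void)
{
	return 21 * 100;		/* placeholder: 21 s at HZ=100 */
}

/* Models rcu_cpu_stall_reset(), e.g. on kgdb (re-)entry. */
static void stall_reset(void)
{
	pthread_mutex_lock(&stall_lock);
	nr_fqs_jiffies_stall = 3;
	jiffies_stall = (unsigned long)-1;	/* ULONG_MAX: suppress checks */
	pthread_mutex_unlock(&stall_lock);
}

/* Models the countdown step in rcu_gp_fqs(); a racing reset is harmless. */
static void fqs_countdown(void)
{
	pthread_mutex_lock(&stall_lock);
	if (nr_fqs_jiffies_stall) {
		if (nr_fqs_jiffies_stall == 1)
			jiffies_stall = jiffies + rcu_jiffies_till_stall_check();
		nr_fqs_jiffies_stall--;
	}
	pthread_mutex_unlock(&stall_lock);
}
```

Since both writers take stall_lock, a reset landing between the nr_fqs check and the decrement can no longer produce a lost update, at the cost of a lock acquisition on a path that runs only once every few jiffies.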

> Could you test the patch for the issue you are seeing and provide your
> Tested-by tag? Thanks,

Either way, testing would of course be very good!  ;-)

							Thanx, Paul
Huacai Chen Aug. 28, 2023, 11:30 a.m. UTC | #42
Hi, Paul and Joel,

On Mon, Aug 28, 2023 at 6:47 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Sun, Aug 27, 2023 at 06:11:40PM -0400, Joel Fernandes wrote:
> > On Sun, Aug 27, 2023 at 1:51 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > [..]
> > > > > > > The only way I know of to avoid these sorts of false positives is for
> > > > > > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > > > > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > > > > > all stalls.  Even that won't work for networking, because the other
> > > > > > > system's clock will be running throughout.
> > > > > > >
> > > > > > > In other words, from what I know now, there is no perfect solution.
> > > > > > > Therefore, there are sharp limits to the complexity of any solution that
> > > > > > > I will be willing to accept.
> > > > > > I think the simplest solution is (I hope Joel will not angry):
> > > > >
> > > > > Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> > > > > will also effect the VM workloads which also do a similar reset.  Allow me few
> > > > > days to see if I can take a shot at fixing it slightly differently. I am
> > > > > trying Paul's idea of setting jiffies at a later time. I think it is doable.
> > > > > I think the advantage of doing this is it will make stall detection more
> > > > > robust in this face of these gaps in jiffie update. And that solution does
> > > > > not even need us to rely on ktime (and all the issues that come with that).
> > > > >
> > > >
> > > > I wrote a patch similar to Paul's idea and sent it out for review, the
> > > > advantage being it purely is based on jiffies. Could you try it out
> > > > and let me know?
> > > If you can cc my gmail <chenhuacai@gmail.com>, that could be better.
> >
> > Sure, will do.
> >
> > > I have read your patch, maybe the counter (nr_fqs_jiffies_stall)
> > > should be atomic_t and we should use atomic operation to decrement its
> > > value. Because rcu_gp_fqs() can be run concurrently, and we may miss
> > > the (nr_fqs == 1) condition.
> >
> > I don't think so. There is only 1 place where RMW operation happens
> > and rcu_gp_fqs() is called only from the GP kthread. So a concurrent
> > RMW (and hence a lost update) is not possible.
>
> Huacai, is your concern that the gdb user might have created a script
> (for example, printing a variable or two, then automatically continuing),
> so that breakpoints could happen in quick successsion, such that the
> second breakpoint might run concurrently with rcu_gp_fqs()?
>
> If this can really happen, the point that Joel makes is a good one, namely
> that rcu_gp_fqs() is single-threaded and (absent rcutorture) runs only
> once every few jiffies.  And gdb breakpoints, even with scripting, should
> also be rather rare.  So if this is an issue, a global lock should do the
> trick, perhaps even one of the existing locks in the rcu_state structure.
> The result should then be just as performant/scalable and a lot simpler
> than use of atomics.
Sorry, I made a mistake. Yes, there is no concurrency issue, and this
approach probably works. But I have another question: how do we ensure
that there is a jiffies update within three calls to rcu_gp_fqs()? Or in
other words, is three also a magic number here?

And I rechecked the commit message of a80be428fbc1f1f3bc9e ("rcu: Do
not disable GP stall detection in rcu_cpu_stall_reset()"). I don't
know why Sergey said that the original code disables stall detection
forever; in fact, it only disables detection for the current GP.

Huacai

>
> > Could you test the patch for the issue you are seeing and provide your
> > Tested-by tag? Thanks,
>
> Either way, testing would of course be very good!  ;-)
>
>                                                         Thanx, Paul
Paul E. McKenney Aug. 28, 2023, 11:54 a.m. UTC | #43
On Mon, Aug 28, 2023 at 07:30:43PM +0800, Huacai Chen wrote:
> Hi, Paul and Joel,
> 
> On Mon, Aug 28, 2023 at 6:47 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Sun, Aug 27, 2023 at 06:11:40PM -0400, Joel Fernandes wrote:
> > > On Sun, Aug 27, 2023 at 1:51 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > [..]
> > > > > > > > The only way I know of to avoid these sorts of false positives is for
> > > > > > > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > > > > > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > > > > > > all stalls.  Even that won't work for networking, because the other
> > > > > > > > system's clock will be running throughout.
> > > > > > > >
> > > > > > > > In other words, from what I know now, there is no perfect solution.
> > > > > > > > Therefore, there are sharp limits to the complexity of any solution that
> > > > > > > > I will be willing to accept.
> > > > > > > I think the simplest solution is (I hope Joel will not angry):
> > > > > >
> > > > > > Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> > > > > > will also effect the VM workloads which also do a similar reset.  Allow me few
> > > > > > days to see if I can take a shot at fixing it slightly differently. I am
> > > > > > trying Paul's idea of setting jiffies at a later time. I think it is doable.
> > > > > > I think the advantage of doing this is it will make stall detection more
> > > > > > robust in this face of these gaps in jiffie update. And that solution does
> > > > > > not even need us to rely on ktime (and all the issues that come with that).
> > > > > >
> > > > >
> > > > > I wrote a patch similar to Paul's idea and sent it out for review, the
> > > > > advantage being it purely is based on jiffies. Could you try it out
> > > > > and let me know?
> > > > If you can cc my gmail <chenhuacai@gmail.com>, that could be better.
> > >
> > > Sure, will do.
> > >
> > > > I have read your patch, maybe the counter (nr_fqs_jiffies_stall)
> > > > should be atomic_t and we should use atomic operation to decrement its
> > > > value. Because rcu_gp_fqs() can be run concurrently, and we may miss
> > > > the (nr_fqs == 1) condition.
> > >
> > > I don't think so. There is only 1 place where RMW operation happens
> > > and rcu_gp_fqs() is called only from the GP kthread. So a concurrent
> > > RMW (and hence a lost update) is not possible.
> >
> > Huacai, is your concern that the gdb user might have created a script
> > (for example, printing a variable or two, then automatically continuing),
> > so that breakpoints could happen in quick successsion, such that the
> > second breakpoint might run concurrently with rcu_gp_fqs()?
> >
> > If this can really happen, the point that Joel makes is a good one, namely
> > that rcu_gp_fqs() is single-threaded and (absent rcutorture) runs only
> > once every few jiffies.  And gdb breakpoints, even with scripting, should
> > also be rather rare.  So if this is an issue, a global lock should do the
> > trick, perhaps even one of the existing locks in the rcu_state structure.
> > The result should then be just as performant/scalable and a lot simpler
> > than use of atomics.
> 
> Sorry, I made a mistake. Yes, there is no concurrent issue, and this
> approach probably works. But I have another problem: how to ensure
> that there is a jiffies update in three calls to rcu_gp_fqs()? Or in
> other word, is three also a magic number here?

Each of the three calls to rcu_gp_fqs() involves a wakeup of and a
context switch to RCU's grace-period kthread, each of which should be
sufficient to update jiffies if initially in an out-of-date-jiffies state.
The three is to some extent magic, the idea being to avoid a situation
where a currently running FQS re-enables stall warnings immediately
after gdb disables them.

Obviously, if your testing shows that some other value works better,
please do let us know so that we can update!  But we have to start
somewhere.

> And I rechecked the commit message of a80be428fbc1f1f3bc9e ("rcu: Do
> not disable GP stall detection in rcu_cpu_stall_reset()"). I don't
> know why Sergey said that the original code disables stall-detection
> forever, in fact it only disables the detection in the current GP.

Well, it does disable stall detection forever in the case where the
current grace period lasts forever, which if I recall correctly was the
case that Sergey was encountering.

							Thanx, Paul

> Huacai
> 
> >
> > > Could you test the patch for the issue you are seeing and provide your
> > > Tested-by tag? Thanks,
> >
> > Either way, testing would of course be very good!  ;-)
> >
> >                                                         Thanx, Paul
Joel Fernandes Aug. 28, 2023, 1:33 p.m. UTC | #44
On Mon, Aug 28, 2023 at 03:47:12AM -0700, Paul E. McKenney wrote:
> On Sun, Aug 27, 2023 at 06:11:40PM -0400, Joel Fernandes wrote:
> > On Sun, Aug 27, 2023 at 1:51 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > [..]
> > > > > > > The only way I know of to avoid these sorts of false positives is for
> > > > > > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > > > > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > > > > > all stalls.  Even that won't work for networking, because the other
> > > > > > > system's clock will be running throughout.
> > > > > > >
> > > > > > > In other words, from what I know now, there is no perfect solution.
> > > > > > > Therefore, there are sharp limits to the complexity of any solution that
> > > > > > > I will be willing to accept.
> > > > > > I think the simplest solution is (I hope Joel will not angry):
> > > > >
> > > > > Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> > > > > will also effect the VM workloads which also do a similar reset.  Allow me few
> > > > > days to see if I can take a shot at fixing it slightly differently. I am
> > > > > trying Paul's idea of setting jiffies at a later time. I think it is doable.
> > > > > I think the advantage of doing this is it will make stall detection more
> > > > > robust in this face of these gaps in jiffie update. And that solution does
> > > > > not even need us to rely on ktime (and all the issues that come with that).
> > > > >
> > > >
> > > > I wrote a patch similar to Paul's idea and sent it out for review, the
> > > > advantage being it purely is based on jiffies. Could you try it out
> > > > and let me know?
> > > If you can cc my gmail <chenhuacai@gmail.com>, that could be better.
> > 
> > Sure, will do.
> > 
> > > I have read your patch, maybe the counter (nr_fqs_jiffies_stall)
> > > should be atomic_t and we should use atomic operation to decrement its
> > > value. Because rcu_gp_fqs() can be run concurrently, and we may miss
> > > the (nr_fqs == 1) condition.
> > 
> > I don't think so. There is only 1 place where RMW operation happens
> > and rcu_gp_fqs() is called only from the GP kthread. So a concurrent
> > RMW (and hence a lost update) is not possible.
> 
> Huacai, is your concern that the gdb user might have created a script
> (for example, printing a variable or two, then automatically continuing),
> so that breakpoints could happen in quick successsion, such that the
> second breakpoint might run concurrently with rcu_gp_fqs()?
> 
> If this can really happen, the point that Joel makes is a good one, namely
> that rcu_gp_fqs() is single-threaded and (absent rcutorture) runs only
> once every few jiffies.  And gdb breakpoints, even with scripting, should
> also be rather rare.  So if this is an issue, a global lock should do the
> trick, perhaps even one of the existing locks in the rcu_state structure.
> The result should then be just as performant/scalable and a lot simpler
> than use of atomics.

Thanks Paul and Huacai. I was also thinking about the case of such a
concurrent breakpoint stalling jiffies updates while the GP thread and
rcu_gp_fqs() keep chugging along; we could make the patch more robust for
that situation as follows (diff on top of the previous patch [1]). Thoughts?

Also, if someone sets a breakpoint right after the "nr_fqs == 1" check, then
they are kind of asking for it anyway, since the GP kthread getting
stalled is an actual reason for RCU stalls (in fact rcutorture even has a
test mode for it :P), and as such the false positive may not be that false. ;-)

Btw apologies for forgetting to CC Thomas on [1] since he is involved in the
timekeeping discussions. I relied on "git send-email" to populate the Cc list
but did not add Cc: to the patch.

[1] https://lore.kernel.org/all/20230827025349.4161262-1-joel@joelfernandes.org/

---8<-----------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9273f2318ea1..ffb165a2ef41 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1559,13 +1559,15 @@ static void rcu_gp_fqs(bool first_time)
 	WRITE_ONCE(rcu_state.n_force_qs, rcu_state.n_force_qs + 1);
 
 	WARN_ON_ONCE(nr_fqs > 3);
-	if (nr_fqs) {
+	/* Only countdown nr_fqs for stall purposes if jiffies moves. */
+	if (nr_fqs && jiffies != READ_ONCE(rcu_state.jiffies_last_fqs)) {
 		if (nr_fqs == 1) {
 			WRITE_ONCE(rcu_state.jiffies_stall,
 				   jiffies + rcu_jiffies_till_stall_check());
 		}
 		WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, --nr_fqs);
 	}
+	WRITE_ONCE(rcu_state.jiffies_last_fqs, jiffies);
 
 	if (first_time) {
 		/* Collect dyntick-idle snapshots. */
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index e9821a8422db..72128e348fa1 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -386,6 +386,8 @@ struct rcu_state {
 						/*  in jiffies. */
 	unsigned long jiffies_stall;		/* Time at which to check */
 						/*  for CPU stalls. */
+	unsigned long jiffies_last_fqs;		/* jiffies value at last FQS.
+						   to confirm jiffies moves. */
 	int nr_fqs_jiffies_stall;		/* Number of fqs loops after
 						 * which read jiffies and set
 						 * jiffies_stall. Stall
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index a2fa6b22e248..0ddd22afbc3a 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -160,6 +160,7 @@ void rcu_cpu_stall_reset(void)
 {
 	WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 3);
 	WRITE_ONCE(rcu_state.jiffies_stall, ULONG_MAX);
+	WRITE_ONCE(rcu_state.jiffies_last_fqs, 0);
 }
 
 //////////////////////////////////////////////////////////////////////////////
@@ -177,6 +178,7 @@ static void record_gp_stall_check_time(void)
 	smp_mb(); // ->gp_start before ->jiffies_stall and caller's ->gp_seq.
 	WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 0);
 	WRITE_ONCE(rcu_state.jiffies_stall, j + j1);
+	WRITE_ONCE(rcu_state.jiffies_last_fqs, 0);
 	rcu_state.jiffies_resched = j + j1 / 2;
 	rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
 }
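To show the intended effect of the jiffies_last_fqs guard, here is a minimal userspace model of the countdown step above (illustration only, with the surrounding kernel machinery reduced to plain globals):

```c
#include <assert.h>

/*
 * Userspace model of the guarded countdown in rcu_gp_fqs(): the
 * nr_fqs_jiffies_stall counter only advances on FQS passes where
 * jiffies has actually moved since the previous pass, so frozen
 * jiffies (e.g. during a kgdb breakpoint) cannot burn the countdown.
 */
static unsigned long jiffies;		/* stand-in for the kernel's jiffies */
static unsigned long jiffies_stall;
static unsigned long jiffies_last_fqs;
static int nr_fqs_jiffies_stall;

static unsigned long rcu_jiffies_till_stall_check(void)
{
	return 21 * 100;		/* placeholder: 21 s at HZ=100 */
}

static void gp_fqs_pass(void)
{
	int nr_fqs = nr_fqs_jiffies_stall;

	/* Only count down for stall purposes if jiffies moved. */
	if (nr_fqs && jiffies != jiffies_last_fqs) {
		if (nr_fqs == 1)
			jiffies_stall = jiffies + rcu_jiffies_till_stall_check();
		nr_fqs_jiffies_stall = --nr_fqs;
	}
	jiffies_last_fqs = jiffies;
}
```

A pass with jiffies unchanged leaves the counter alone, and jiffies_stall is only re-armed on the pass that takes the counter from 1 to 0, matching the intent of the diff.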
Paul E. McKenney Aug. 28, 2023, 2:02 p.m. UTC | #45
On Mon, Aug 28, 2023 at 01:33:48PM +0000, Joel Fernandes wrote:
> On Mon, Aug 28, 2023 at 03:47:12AM -0700, Paul E. McKenney wrote:
> > On Sun, Aug 27, 2023 at 06:11:40PM -0400, Joel Fernandes wrote:
> > > On Sun, Aug 27, 2023 at 1:51 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > [..]
> > > > > > > > The only way I know of to avoid these sorts of false positives is for
> > > > > > > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > > > > > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > > > > > > all stalls.  Even that won't work for networking, because the other
> > > > > > > > system's clock will be running throughout.
> > > > > > > >
> > > > > > > > In other words, from what I know now, there is no perfect solution.
> > > > > > > > Therefore, there are sharp limits to the complexity of any solution that
> > > > > > > > I will be willing to accept.
> > > > > > > I think the simplest solution is (I hope Joel will not angry):
> > > > > >
> > > > > > Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> > > > > > will also effect the VM workloads which also do a similar reset.  Allow me few
> > > > > > days to see if I can take a shot at fixing it slightly differently. I am
> > > > > > trying Paul's idea of setting jiffies at a later time. I think it is doable.
> > > > > > I think the advantage of doing this is it will make stall detection more
> > > > > > robust in this face of these gaps in jiffie update. And that solution does
> > > > > > not even need us to rely on ktime (and all the issues that come with that).
> > > > > >
> > > > >
> > > > > I wrote a patch similar to Paul's idea and sent it out for review, the
> > > > > advantage being it purely is based on jiffies. Could you try it out
> > > > > and let me know?
> > > > If you can cc my gmail <chenhuacai@gmail.com>, that could be better.
> > > 
> > > Sure, will do.
> > > 
> > > > I have read your patch, maybe the counter (nr_fqs_jiffies_stall)
> > > > should be atomic_t and we should use atomic operation to decrement its
> > > > value. Because rcu_gp_fqs() can be run concurrently, and we may miss
> > > > the (nr_fqs == 1) condition.
> > > 
> > > I don't think so. There is only 1 place where RMW operation happens
> > > and rcu_gp_fqs() is called only from the GP kthread. So a concurrent
> > > RMW (and hence a lost update) is not possible.
> > 
> > Huacai, is your concern that the gdb user might have created a script
> > (for example, printing a variable or two, then automatically continuing),
> > so that breakpoints could happen in quick successsion, such that the
> > second breakpoint might run concurrently with rcu_gp_fqs()?
> > 
> > If this can really happen, the point that Joel makes is a good one, namely
> > that rcu_gp_fqs() is single-threaded and (absent rcutorture) runs only
> > once every few jiffies.  And gdb breakpoints, even with scripting, should
> > also be rather rare.  So if this is an issue, a global lock should do the
> > trick, perhaps even one of the existing locks in the rcu_state structure.
> > The result should then be just as performant/scalable and a lot simpler
> > than use of atomics.
> 
> Thanks Paul and Huacai, also I was thinking in the event of such concurrent
> breakpoint stalling jiffies updates but GP thread / rcu_gp_fqs() chugging
> along, we could also make the patch more robust for such a situation as
> follows (diff on top of previous patch [1]). Thoughts?
> 
> Also if someone sets a breakpoint right after the "nr_fqs == 1" check, then
> they are kind of asking for it anyway since the GP kthread getting
> stalled is an actual reason for RCU stalls (infact rcutorture has a test mode
> for it even :P) and as such the false-positive may not be that false. ;-)

That would indeed be asking for it.  But then again, they might have set
a breakpoint elsewhere that had the unintended side-effect of catching
the RCU grace-period kthread right at that point.

If that isn't something we are worried about, your original is fine.
If it is something we are worried about, I recommend learning from my
RCU CPU stall warning experiences and just using a lock.  ;-)

							Thanx, Paul

> Btw apologies for forgetting to CC Thomas on [1] since he is involved in the
> timekeeping discussions. I relied on "git send-email" to populate the Cc list
> but did not add Cc: to the patch.
> 
> [1] https://lore.kernel.org/all/20230827025349.4161262-1-joel@joelfernandes.org/
> 
> ---8<-----------------------
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9273f2318ea1..ffb165a2ef41 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1559,13 +1559,15 @@ static void rcu_gp_fqs(bool first_time)
>  	WRITE_ONCE(rcu_state.n_force_qs, rcu_state.n_force_qs + 1);
>  
>  	WARN_ON_ONCE(nr_fqs > 3);
> -	if (nr_fqs) {
> +	/* Only countdown nr_fqs for stall purposes if jiffies moves. */
> +	if (nr_fqs && jiffies != READ_ONCE(rcu_state.jiffies_last_fqs)) {
>  		if (nr_fqs == 1) {
>  			WRITE_ONCE(rcu_state.jiffies_stall,
>  				   jiffies + rcu_jiffies_till_stall_check());
>  		}
>  		WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, --nr_fqs);
>  	}
> +	WRITE_ONCE(rcu_state.jiffies_last_fqs, jiffies);
>  
>  	if (first_time) {
>  		/* Collect dyntick-idle snapshots. */
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index e9821a8422db..72128e348fa1 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -386,6 +386,8 @@ struct rcu_state {
>  						/*  in jiffies. */
>  	unsigned long jiffies_stall;		/* Time at which to check */
>  						/*  for CPU stalls. */
> +	unsigned long jiffies_last_fqs;		/* jiffies value at last FQS.
> +						   to confirm jiffies moves. */
>  	int nr_fqs_jiffies_stall;		/* Number of fqs loops after
>  						 * which read jiffies and set
>  						 * jiffies_stall. Stall
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index a2fa6b22e248..0ddd22afbc3a 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -160,6 +160,7 @@ void rcu_cpu_stall_reset(void)
>  {
>  	WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 3);
>  	WRITE_ONCE(rcu_state.jiffies_stall, ULONG_MAX);
> +	WRITE_ONCE(rcu_state.jiffies_last_fqs, 0);
>  }
>  
>  //////////////////////////////////////////////////////////////////////////////
> @@ -177,6 +178,7 @@ static void record_gp_stall_check_time(void)
>  	smp_mb(); // ->gp_start before ->jiffies_stall and caller's ->gp_seq.
>  	WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 0);
>  	WRITE_ONCE(rcu_state.jiffies_stall, j + j1);
> +	WRITE_ONCE(rcu_state.jiffies_last_fqs, 0);
>  	rcu_state.jiffies_resched = j + j1 / 2;
>  	rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
>  }
Huacai Chen Aug. 28, 2023, 2:37 p.m. UTC | #46
On Mon, Aug 28, 2023 at 10:02 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Mon, Aug 28, 2023 at 01:33:48PM +0000, Joel Fernandes wrote:
> > On Mon, Aug 28, 2023 at 03:47:12AM -0700, Paul E. McKenney wrote:
> > > On Sun, Aug 27, 2023 at 06:11:40PM -0400, Joel Fernandes wrote:
> > > > On Sun, Aug 27, 2023 at 1:51 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > [..]
> > > > > > > > > The only way I know of to avoid these sorts of false positives is for
> > > > > > > > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > > > > > > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > > > > > > > all stalls.  Even that won't work for networking, because the other
> > > > > > > > > system's clock will be running throughout.
> > > > > > > > >
> > > > > > > > > In other words, from what I know now, there is no perfect solution.
> > > > > > > > > Therefore, there are sharp limits to the complexity of any solution that
> > > > > > > > > I will be willing to accept.
> > > > > > > > I think the simplest solution is (I hope Joel will not angry):
> > > > > > >
> > > > > > > Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> > > > > > > will also effect the VM workloads which also do a similar reset.  Allow me few
> > > > > > > days to see if I can take a shot at fixing it slightly differently. I am
> > > > > > > trying Paul's idea of setting jiffies at a later time. I think it is doable.
> > > > > > > I think the advantage of doing this is it will make stall detection more
> > > > > > > robust in this face of these gaps in jiffie update. And that solution does
> > > > > > > not even need us to rely on ktime (and all the issues that come with that).
> > > > > > >
> > > > > >
> > > > > > I wrote a patch similar to Paul's idea and sent it out for review, the
> > > > > > advantage being it purely is based on jiffies. Could you try it out
> > > > > > and let me know?
> > > > > If you can cc my gmail <chenhuacai@gmail.com>, that could be better.
> > > >
> > > > Sure, will do.
> > > >
> > > > > I have read your patch, maybe the counter (nr_fqs_jiffies_stall)
> > > > > should be atomic_t and we should use atomic operation to decrement its
> > > > > value. Because rcu_gp_fqs() can be run concurrently, and we may miss
> > > > > the (nr_fqs == 1) condition.
> > > >
> > > > I don't think so. There is only 1 place where RMW operation happens
> > > > and rcu_gp_fqs() is called only from the GP kthread. So a concurrent
> > > > RMW (and hence a lost update) is not possible.
> > >
> > > Huacai, is your concern that the gdb user might have created a script
> > > (for example, printing a variable or two, then automatically continuing),
> > > so that breakpoints could happen in quick successsion, such that the
> > > second breakpoint might run concurrently with rcu_gp_fqs()?
> > >
> > > If this can really happen, the point that Joel makes is a good one, namely
> > > that rcu_gp_fqs() is single-threaded and (absent rcutorture) runs only
> > > once every few jiffies.  And gdb breakpoints, even with scripting, should
> > > also be rather rare.  So if this is an issue, a global lock should do the
> > > trick, perhaps even one of the existing locks in the rcu_state structure.
> > > The result should then be just as performant/scalable and a lot simpler
> > > than use of atomics.
> >
> > Thanks Paul and Huacai, also I was thinking in the event of such concurrent
> > breakpoint stalling jiffies updates but GP thread / rcu_gp_fqs() chugging
> > along, we could also make the patch more robust for such a situation as
> > follows (diff on top of previous patch [1]). Thoughts?
> >
> > Also if someone sets a breakpoint right after the "nr_fqs == 1" check, then
> > they are kind of asking for it anyway since the GP kthread getting
> > stalled is an actual reason for RCU stalls (infact rcutorture has a test mode
> > for it even :P) and as such the false-positive may not be that false. ;-)
>
> That would indeed be asking for it.  But then again, they might have set
> a breakpoint elsewhere that had the unintended side-effect of catching
> the RCU grace-period kthread right at that point.
>
> If that isn't something we are worried about, your original is fine.
> If it is something we are worried about, I recommend learning from my
> RCU CPU stall warning experiences and just using a lock.  ;-)
I also think the original patch should be OK, but I have another
question: what will happen if the current GP ends before
nr_fqs_jiffies_stall reaches zero?

Huacai
>
>                                                         Thanx, Paul
>
> > Btw apologies for forgetting to CC Thomas on [1] since he is involved in the
> > timekeeping discussions. I relied on "git send-email" to populate the Cc list
> > but did not add Cc: to the patch.
> >
> > [1] https://lore.kernel.org/all/20230827025349.4161262-1-joel@joelfernandes.org/
> >
> > ---8<-----------------------
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 9273f2318ea1..ffb165a2ef41 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1559,13 +1559,15 @@ static void rcu_gp_fqs(bool first_time)
> >       WRITE_ONCE(rcu_state.n_force_qs, rcu_state.n_force_qs + 1);
> >
> >       WARN_ON_ONCE(nr_fqs > 3);
> > -     if (nr_fqs) {
> > +     /* Only countdown nr_fqs for stall purposes if jiffies moves. */
> > +     if (nr_fqs && jiffies != READ_ONCE(rcu_state.jiffies_last_fqs)) {
> >               if (nr_fqs == 1) {
> >                       WRITE_ONCE(rcu_state.jiffies_stall,
> >                                  jiffies + rcu_jiffies_till_stall_check());
> >               }
> >               WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, --nr_fqs);
> >       }
> > +     WRITE_ONCE(rcu_state.jiffies_last_fqs, jiffies);
> >
> >       if (first_time) {
> >               /* Collect dyntick-idle snapshots. */
> > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> > index e9821a8422db..72128e348fa1 100644
> > --- a/kernel/rcu/tree.h
> > +++ b/kernel/rcu/tree.h
> > @@ -386,6 +386,8 @@ struct rcu_state {
> >                                               /*  in jiffies. */
> >       unsigned long jiffies_stall;            /* Time at which to check */
> >                                               /*  for CPU stalls. */
> > +     unsigned long jiffies_last_fqs;         /* jiffies value at last FQS.
> > +                                                to confirm jiffies moves. */
> >       int nr_fqs_jiffies_stall;               /* Number of fqs loops after
> >                                                * which read jiffies and set
> >                                                * jiffies_stall. Stall
> > diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> > index a2fa6b22e248..0ddd22afbc3a 100644
> > --- a/kernel/rcu/tree_stall.h
> > +++ b/kernel/rcu/tree_stall.h
> > @@ -160,6 +160,7 @@ void rcu_cpu_stall_reset(void)
> >  {
> >       WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 3);
> >       WRITE_ONCE(rcu_state.jiffies_stall, ULONG_MAX);
> > +     WRITE_ONCE(rcu_state.jiffies_last_fqs, 0);
> >  }
> >
> >  //////////////////////////////////////////////////////////////////////////////
> > @@ -177,6 +178,7 @@ static void record_gp_stall_check_time(void)
> >       smp_mb(); // ->gp_start before ->jiffies_stall and caller's ->gp_seq.
> >       WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 0);
> >       WRITE_ONCE(rcu_state.jiffies_stall, j + j1);
> > +     WRITE_ONCE(rcu_state.jiffies_last_fqs, 0);
> >       rcu_state.jiffies_resched = j + j1 / 2;
> >       rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
> >  }
Joel Fernandes Aug. 28, 2023, 2:50 p.m. UTC | #47
On Mon, Aug 28, 2023 at 10:37 AM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Mon, Aug 28, 2023 at 10:02 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Mon, Aug 28, 2023 at 01:33:48PM +0000, Joel Fernandes wrote:
> > > On Mon, Aug 28, 2023 at 03:47:12AM -0700, Paul E. McKenney wrote:
> > > > On Sun, Aug 27, 2023 at 06:11:40PM -0400, Joel Fernandes wrote:
> > > > > On Sun, Aug 27, 2023 at 1:51 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > [..]
> > > > > > > > > > The only way I know of to avoid these sorts of false positives is for
> > > > > > > > > > the user to manually suppress all timeouts (perhaps using a kernel-boot
> > > > > > > > > > parameter for your early-boot case), do the gdb work, and then unsuppress
> > > > > > > > > > all stalls.  Even that won't work for networking, because the other
> > > > > > > > > > system's clock will be running throughout.
> > > > > > > > > >
> > > > > > > > > > In other words, from what I know now, there is no perfect solution.
> > > > > > > > > > Therefore, there are sharp limits to the complexity of any solution that
> > > > > > > > > > I will be willing to accept.
> > > > > > > > > I think the simplest solution is (I hope Joel will not be angry):
> > > > > > > >
> > > > > > > > Not angry at all, just want to help. ;-). The problem is the 300*HZ solution
> > > > > > > > will also affect the VM workloads which also do a similar reset.  Allow me a few
> > > > > > > > days to see if I can take a shot at fixing it slightly differently. I am
> > > > > > > > trying Paul's idea of setting jiffies at a later time. I think it is doable.
> > > > > > > > I think the advantage of doing this is it will make stall detection more
> > > > > > > > robust in the face of these gaps in jiffies updates. And that solution does
> > > > > > > > not even need us to rely on ktime (and all the issues that come with that).
> > > > > > > >
> > > > > > >
> > > > > > > I wrote a patch similar to Paul's idea and sent it out for review, the
> > > > > > > advantage being it purely is based on jiffies. Could you try it out
> > > > > > > and let me know?
> > > > > > If you can cc my gmail <chenhuacai@gmail.com>, that could be better.
> > > > >
> > > > > Sure, will do.
> > > > >
> > > > > > I have read your patch, maybe the counter (nr_fqs_jiffies_stall)
> > > > > > should be atomic_t and we should use atomic operation to decrement its
> > > > > > value. Because rcu_gp_fqs() can be run concurrently, and we may miss
> > > > > > the (nr_fqs == 1) condition.
> > > > >
> > > > > I don't think so. There is only 1 place where RMW operation happens
> > > > > and rcu_gp_fqs() is called only from the GP kthread. So a concurrent
> > > > > RMW (and hence a lost update) is not possible.
> > > >
> > > > Huacai, is your concern that the gdb user might have created a script
> > > > (for example, printing a variable or two, then automatically continuing),
> > > > so that breakpoints could happen in quick succession, such that the
> > > > second breakpoint might run concurrently with rcu_gp_fqs()?
> > > >
> > > > If this can really happen, the point that Joel makes is a good one, namely
> > > > that rcu_gp_fqs() is single-threaded and (absent rcutorture) runs only
> > > > once every few jiffies.  And gdb breakpoints, even with scripting, should
> > > > also be rather rare.  So if this is an issue, a global lock should do the
> > > > trick, perhaps even one of the existing locks in the rcu_state structure.
> > > > The result should then be just as performant/scalable and a lot simpler
> > > > than use of atomics.
> > >
> > > Thanks Paul and Huacai, also I was thinking in the event of such concurrent
> > > breakpoint stalling jiffies updates but GP thread / rcu_gp_fqs() chugging
> > > along, we could also make the patch more robust for such a situation as
> > > follows (diff on top of previous patch [1]). Thoughts?
> > >
> > > Also if someone sets a breakpoint right after the "nr_fqs == 1" check, then
> > > they are kind of asking for it anyway since the GP kthread getting
> > > stalled is an actual reason for RCU stalls (in fact rcutorture has a test mode
> > > for it even :P) and as such the false-positive may not be that false. ;-)
> >
> > [Paul]
> > That would indeed be asking for it.  But then again, they might have set
> > a breakpoint elsewhere that had the unintended side-effect of catching
> > the RCU grace-period kthread right at that point.
> >
> > If that isn't something we are worried about, your original is fine.
> > If it is something we are worried about, I recommend learning from my
> > RCU CPU stall warning experiences and just using a lock.  ;-)

This sounds good to me.

> [Huacai]
> I also think the original patch should be OK, but I have another
> question: what will happen if the current GP ends before
> nr_fqs_jiffies_stall reaches zero?

Nothing should happen. Stall detection only happens when a GP is in
progress. If a new GP starts, it resets nr_fqs_jiffies_stall.

Or can you elaborate your concern more?

Thanks.
Huacai Chen Aug. 28, 2023, 3:12 p.m. UTC | #48
Hi, Joel,

On Mon, Aug 28, 2023 at 10:51 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Mon, Aug 28, 2023 at 10:37 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > On Mon, Aug 28, 2023 at 10:02 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Mon, Aug 28, 2023 at 01:33:48PM +0000, Joel Fernandes wrote:
> > > > On Mon, Aug 28, 2023 at 03:47:12AM -0700, Paul E. McKenney wrote:
> > > > > On Sun, Aug 27, 2023 at 06:11:40PM -0400, Joel Fernandes wrote:
> > > > [...]
> > >
> > > [Paul]
> > > That would indeed be asking for it.  But then again, they might have set
> > > a breakpoint elsewhere that had the unintended side-effect of catching
> > > the RCU grace-period kthread right at that point.
> > >
> > > If that isn't something we are worried about, your original is fine.
> > > If it is something we are worried about, I recommend learning from my
> > > RCU CPU stall warning experiences and just using a lock.  ;-)
>
> This sounds good to me.
>
> > [Huacai]
> > I also think the original patch should be OK, but I have another
> > question: what will happen if the current GP ends before
> > nr_fqs_jiffies_stall reaches zero?
>
> Nothing should happen. Stall detection only happens when a GP is in
> progress. If a new GP starts, it resets nr_fqs_jiffies_stall.
>
> Or can you elaborate your concern more?
OK, I will test your patch these days. Maybe putting
nr_fqs_jiffies_stall before jiffies_force_qs is better, because I
think putting an 'int' between two 'long' is wasting space. :)

Huacai

>
> Thanks.
Joel Fernandes Aug. 28, 2023, 8:47 p.m. UTC | #49
Hi Huacai,

On Mon, Aug 28, 2023 at 11:13 AM Huacai Chen <chenhuacai@kernel.org> wrote:
>
[...]
> >
> > > [Huacai]
> > > I also think the original patch should be OK, but I have another
> > > question: what will happen if the current GP ends before
> > > nr_fqs_jiffies_stall reaches zero?
> >
> > Nothing should happen. Stall detection only happens when a GP is in
> > progress. If a new GP starts, it resets nr_fqs_jiffies_stall.
> >
> > Or can you elaborate your concern more?
> OK, I will test your patch these days. Maybe putting
> nr_fqs_jiffies_stall before jiffies_force_qs is better, because I
> think putting an 'int' between two 'long' is wasting space. :)

That's a good point and I'll look into that.

Meanwhile I pushed the patch out to my 6.4 stable tree for testing on my fleet.

Ideally, I'd like to change the stall detection test in the rcutorture
to actually fail rcutorture if stalls don't happen in time. But at
least I verified this manually using rcutorture.

I should also add a documentation patch for stallwarn.rst to document
the understandable sensitivity of RCU stall detection to jiffies
updates (or lack thereof). Or if you have time, I'd appreciate support
on such a patch (not mandatory but I thought it would not hurt to
ask).

Looking forward to how your testing goes as well!

thanks,

 - Joel
Huacai Chen Aug. 29, 2023, 4:07 a.m. UTC | #50
Hi, Joel,

On Tue, Aug 29, 2023 at 4:47 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> Hi Huacai,
>
> On Mon, Aug 28, 2023 at 11:13 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> [...]
> > > [...]
> > OK, I will test your patch these days. Maybe putting
> > nr_fqs_jiffies_stall before jiffies_force_qs is better, because I
> > think putting an 'int' between two 'long' is wasting space. :)
>
> That's a good point and I'll look into that.
Another point, is it better to replace ULONG_MAX with ULONG_MAX/4 as
Paul suggested?

>
> Meanwhile I pushed the patch out to my 6.4 stable tree for testing on my fleet.
>
> Ideally, I'd like to change the stall detection test in the rcutorture
> to actually fail rcutorture if stalls don't happen in time. But at
> least I verified this manually using rcutorture.
>
> I should also add a documentation patch for stallwarn.rst to document
> the understandable sensitivity of RCU stall detection to jiffies
> updates (or lack thereof). Or if you have time, I'd appreciate support
> on such a patch (not mandatory but I thought it would not hurt to
> ask).
>
> Looking forward to how your testing goes as well!
I have tested, it works for KGDB.

Huacai
>
> thanks,
>
>  - Joel
Joel Fernandes Aug. 29, 2023, 2:46 p.m. UTC | #51
On Tue, Aug 29, 2023 at 12:08 AM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Joel,
>
> [...]
> >
> > That's a good point and I'll look into that.
> Another point, is it better to replace ULONG_MAX with ULONG_MAX/4 as
> Paul suggested?
>

I could do that but I don't feel too strongly about it. I will keep it
at ULONG_MAX if it's OK with everyone.

> > Meanwhile I pushed the patch out to my 6.4 stable tree for testing on my fleet.
> >
> > Ideally, I'd like to change the stall detection test in the rcutorture
> > to actually fail rcutorture if stalls don't happen in time. But at
> > least I verified this manually using rcutorture.
> >
> > I should also add a documentation patch for stallwarn.rst to document
> > the understandable sensitivity of RCU stall detection to jiffies
> > updates (or lack thereof). Or if you have time, I'd appreciate support
> > on such a patch (not mandatory but I thought it would not hurt to
> > ask).
> >
> > Looking forward to how your testing goes as well!
> I have tested, it works for KGDB.

Thanks! If you don't mind, I will add your Tested-by tag to the patch
and send it out soon. My tests also look good!


 - Joel
Joel Fernandes Aug. 30, 2023, 1:04 a.m. UTC | #52
On Mon, Aug 28, 2023 at 11:12:49PM +0800, Huacai Chen wrote:
> Maybe putting
> nr_fqs_jiffies_stall before jiffies_force_qs is better, because I
> think putting an 'int' between two 'long' is wasting space. :)

Though it is a decent suggestion, moving it before jiffies_force_qs does not
make the structure smaller. Trying to outsmart the compiler is not a good
idea when doing so does not actually pack the structure any tighter.

Besides, I am not too worried about 4-byte holes considering the structure is
full of them and that ofl_lock is ____cacheline_internodealigned_in_smp which
adds a giant 55-byte hole to the structure. That should keep you up at night if
you are worried about 4-byte holes.

And the total size of the structure on a 64-bit build is 3776 bytes so 4
bytes is about 1/10th of a percent. Puny.

So I'd rather leave nr_fqs_jiffies_stall where it is especially considering
it is more readable where it is. :-)

thanks,

 - Joel


> 
> Huacai
> 
> >
> > Thanks.
Huacai Chen Aug. 30, 2023, 4:25 a.m. UTC | #53
On Tue, Aug 29, 2023 at 10:46 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Tue, Aug 29, 2023 at 12:08 AM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > [...]
> > > Looking forward to how your testing goes as well!
> > I have tested, it works for KGDB.
>
> Thanks! If you don't mind, I will add your Tested-by tag to the patch
> and send it out soon. My tests also look good!
You can add my Tested-by, but Reported-by should be "Binbin Zhou
<zhoubinbin@loongson.cn>"

Huacai
>
>
>  - Joel
Joel Fernandes Aug. 30, 2023, 10:18 a.m. UTC | #54
On Wed, Aug 30, 2023 at 12:25:56PM +0800, Huacai Chen wrote:
> On Tue, Aug 29, 2023 at 10:46 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> >
> > > [...]
> > > I have tested, it works for KGDB.
> >
> > Thanks! If you don't mind, I will add your Tested-by tag to the patch
> > and send it out soon. My tests also look good!
>
> You can add my Tested-by, but Reported-by should be "Binbin Zhou
> <zhoubinbin@loongson.cn>"

Thanks, if/when Paul takes it, he could kindly change the Reported-by, or I
could.

thanks,

 - Joel
diff mbox series

Patch

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index b10b8349bb2a..1c7b540985bf 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -153,6 +153,7 @@  static void panic_on_rcu_stall(void)
  */
 void rcu_cpu_stall_reset(void)
 {
+	do_update_jiffies_64(ktime_get());
 	WRITE_ONCE(rcu_state.jiffies_stall,
 		   jiffies + rcu_jiffies_till_stall_check());
 }