Message ID | 20230323134621.336832-1-qiang1.zhang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] srcu: Fix flush srcu structure's->sup work warning in cleanup_srcu_struct() |
On Thu, Mar 23, 2023 at 09:46:21PM +0800, Zqiang wrote:
> Loading and then unloading the rcutorture module triggers the following
> warning:
>
> insmod rcutorture.ko
> rmmod rcutorture.ko
>
> [ 209.437327] WARNING: CPU: 0 PID: 508 at kernel/workqueue.c:3167 __flush_work+0x50a/0x540
> [ 209.437346] Modules linked in: rcutorture(-) torture [last unloaded: rcutorture]
> [ 209.437382] CPU: 0 PID: 508 Comm: rmmod Tainted: G W 6.3.0-rc1-yocto-standard+
> [ 209.437406] RIP: 0010:__flush_work+0x50a/0x540
> .....
> [ 209.437758] flush_delayed_work+0x36/0x90
> [ 209.437776] cleanup_srcu_struct+0x68/0x2e0
> [ 209.437817] srcu_module_notify+0x71/0x140
> [ 209.437854] blocking_notifier_call_chain+0x9d/0xd0
> [ 209.437880] __x64_sys_delete_module+0x223/0x2e0
> [ 209.438046] do_syscall_64+0x43/0x90
> [ 209.438062] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>
> flush_delayed_work()
> ->__flush_work()
>   ->if (WARN_ON(!work->func))
>         return false;
>
> For srcu_struct structures defined with DEFINE_SRCU() or
> DEFINE_STATIC_SRCU() in code that is built and loaded as a module,
> srcu_module_coming() is invoked at module-load time.  It allocates
> memory for the srcu_struct's ->sda and initializes the sda structure,
> but it does not fully initialize the srcu_struct's ->sup, so at this
> point the sup structure's ->work.work.func is NULL.  If
> init_srcu_struct_fields() is not invoked before the module is unloaded,
> __flush_work() is invoked from srcu_module_going(), finds that
> work->func is NULL, and raises the warning above.
>
> This commit adds a check of the srcu_sup structure's
> ->srcu_gp_seq_needed in srcu_module_going() to determine whether
> check_init_srcu_struct() has been invoked to initialize the
> srcu_struct.  If it has not, there is no pending or running work, so
> there is no need to flush.  In that case, only free_percpu() is invoked
> to release the srcu_struct's ->sda.
>
> Co-developed-by: Paul E. McKenney <paulmck@kernel.org>

Thank you for the testing, bug-finding, and problem-solving!

In theory, you would need a Signed-off-by here from me as well, but
in practice bisectability means that this must be folded into this:

e7c778489040 ("srcu: Use static init for statically allocated in-module srcu_struct")

This will of course be with attribution.

> Signed-off-by: Zqiang <qiang1.zhang@intel.com>

But this is still a bit more complex than needed.  How about something
like this?

							Thanx, Paul

------------------------------------------------------------------------

/* Initialize any global-scope srcu_struct structures used by this module. */
static int srcu_module_coming(struct module *mod)
{
	int i;
	struct srcu_struct *ssp;
	struct srcu_struct **sspp = mod->srcu_struct_ptrs;

	for (i = 0; i < mod->num_srcu_structs; i++) {
		ssp = *(sspp++);
		ssp->sda = alloc_percpu(struct srcu_data);
		if (WARN_ON_ONCE(!ssp->sda))
			return -ENOMEM;
	}
	return 0;
}

/* Clean up any global-scope srcu_struct structures used by this module. */
static void srcu_module_going(struct module *mod)
{
	int i;
	struct srcu_struct *ssp;
	struct srcu_struct **sspp = mod->srcu_struct_ptrs;

	for (i = 0; i < mod->num_srcu_structs; i++) {
		ssp = *(sspp++);
		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
			cleanup_srcu_struct(ssp);
		free_percpu(ssp->sda);
	}
}
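The test at the top of the loop in srcu_module_going() depends on the low-order state bits of ->srcu_gp_seq_needed. The following user-space sketch models only that test. It is not kernel code: the two-bit state mask and the all-ones static initializer are assumptions recalled from kernel/rcu/rcu.h and the SRCU static initializer rather than anything quoted in this thread, and model_seq_state() and model_needs_cleanup() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: mirrors kernel/rcu/rcu.h, where the two low bits hold state. */
#define MODEL_SEQ_CTR_SHIFT  2
#define MODEL_SEQ_STATE_MASK ((1UL << MODEL_SEQ_CTR_SHIFT) - 1)

/* Assumption: value given to ->srcu_gp_seq_needed by static initialization. */
#define MODEL_SEQ_UNINIT (~0UL)

/* Hypothetical stand-in for rcu_seq_state(): extract the state bits. */
static unsigned long model_seq_state(unsigned long s)
{
	return s & MODEL_SEQ_STATE_MASK;
}

/*
 * Hypothetical stand-in for the test in srcu_module_going(): zero state
 * bits are taken to mean that first-use initialization has run, so the
 * full cleanup path (including the work flush) is appropriate.
 */
static bool model_needs_cleanup(unsigned long gp_seq_needed)
{
	return model_seq_state(gp_seq_needed) == 0;
}
```

Under these assumptions, model_needs_cleanup(MODEL_SEQ_UNINIT) is false, so a never-used srcu_struct skips the flush, while a value with zero state bits selects the full cleanup.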
Cc: my personal email qiang.zhang1211@gmail.com

> [...]
>
>But this is still a bit more complex than needed. How about something
>like this?

Agree, from a logical point of view, this is more rigorous
On Fri, Mar 24, 2023 at 02:20:18AM +0000, Zhang, Qiang1 wrote:
> [...]
>
> Agree, from a logical point of view, this is more rigorous
On Fri, Mar 24, 2023 at 03:53:08AM +0000, Zhang, Qiang1 wrote:
> [...]
>
> > Agree, from a logical point of view, this is more rigorous
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 1fb078abbdc9..edf894e3b96e 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1921,7 +1921,6 @@ static int srcu_module_coming(struct module *mod)
 		ssp->sda = alloc_percpu(struct srcu_data);
 		if (WARN_ON_ONCE(!ssp->sda))
 			return -ENOMEM;
-		init_srcu_struct_data(ssp);
 	}
 	return 0;
 }
@@ -1931,9 +1930,14 @@ static void srcu_module_going(struct module *mod)
 {
 	int i;
 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
+	struct srcu_struct *ssp;
 
-	for (i = 0; i < mod->num_srcu_structs; i++)
-		cleanup_srcu_struct(*(sspp++));
+	for (i = 0; i < mod->num_srcu_structs; i++) {
+		ssp = (*sspp++);
+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)))
+			cleanup_srcu_struct(ssp);
+		free_percpu(ssp->sda);
+	}
 }
 
 /* Handle one module, either coming or going. */
Loading and then unloading the rcutorture module triggers the following
warning:

insmod rcutorture.ko
rmmod rcutorture.ko

[ 209.437327] WARNING: CPU: 0 PID: 508 at kernel/workqueue.c:3167 __flush_work+0x50a/0x540
[ 209.437346] Modules linked in: rcutorture(-) torture [last unloaded: rcutorture]
[ 209.437382] CPU: 0 PID: 508 Comm: rmmod Tainted: G W 6.3.0-rc1-yocto-standard+
[ 209.437406] RIP: 0010:__flush_work+0x50a/0x540
.....
[ 209.437758] flush_delayed_work+0x36/0x90
[ 209.437776] cleanup_srcu_struct+0x68/0x2e0
[ 209.437817] srcu_module_notify+0x71/0x140
[ 209.437854] blocking_notifier_call_chain+0x9d/0xd0
[ 209.437880] __x64_sys_delete_module+0x223/0x2e0
[ 209.438046] do_syscall_64+0x43/0x90
[ 209.438062] entry_SYSCALL_64_after_hwframe+0x72/0xdc

flush_delayed_work()
->__flush_work()
  ->if (WARN_ON(!work->func))
        return false;

For srcu_struct structures defined with DEFINE_SRCU() or
DEFINE_STATIC_SRCU() in code that is built and loaded as a module,
srcu_module_coming() is invoked at module-load time.  It allocates
memory for the srcu_struct's ->sda and initializes the sda structure,
but it does not fully initialize the srcu_struct's ->sup, so at this
point the sup structure's ->work.work.func is NULL.  If
init_srcu_struct_fields() is not invoked before the module is unloaded,
__flush_work() is invoked from srcu_module_going(), finds that
work->func is NULL, and raises the warning above.

This commit adds a check of the srcu_sup structure's
->srcu_gp_seq_needed in srcu_module_going() to determine whether
check_init_srcu_struct() has been invoked to initialize the
srcu_struct.  If it has not, there is no pending or running work, so
there is no need to flush.  In that case, only free_percpu() is invoked
to release the srcu_struct's ->sda.

Co-developed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 kernel/rcu/srcutree.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
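The failure and the fix described in the commit message can be modeled in user space. The sketch below is a simplified illustration under stated assumptions, not the kernel implementation: every model_* name is hypothetical, a plain bool stands in for the ->srcu_gp_seq_needed test, malloc()/free() stand in for alloc_percpu()/free_percpu(), and a counter stands in for WARN_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for struct work_struct. */
struct model_work {
	void (*func)(struct model_work *work);
};

/* Simplified stand-in for struct srcu_struct plus its ->srcu_sup. */
struct model_srcu {
	void *sda;		/* allocated when the module is loaded */
	bool first_use_done;	/* models the ->srcu_gp_seq_needed test */
	struct model_work work;	/* func stays NULL until first use */
};

static int model_warnings;	/* counts simulated WARN_ON() hits */

static void model_work_fn(struct model_work *work) { (void)work; }

/* Models __flush_work(): warn and bail out if the work was never set up. */
static bool model_flush_work(struct model_work *work)
{
	if (!work->func) {
		model_warnings++;
		return false;
	}
	return true;
}

/* Models srcu_module_coming(): allocate ->sda and nothing else. */
static int model_module_coming(struct model_srcu *ssp)
{
	ssp->sda = malloc(64);
	return ssp->sda ? 0 : -1;
}

/* Models first-use initialization, which sets up the work function. */
static void model_first_use(struct model_srcu *ssp)
{
	ssp->work.func = model_work_fn;
	ssp->first_use_done = true;
}

/* Models the patched srcu_module_going(): flush only if initialized. */
static void model_module_going(struct model_srcu *ssp)
{
	if (ssp->first_use_done)
		model_flush_work(&ssp->work);
	free(ssp->sda);
	ssp->sda = NULL;
}
```

In this model, calling model_flush_work() on a structure that has only been through model_module_coming() increments model_warnings, which corresponds to the unpatched unload path. model_module_going() avoids the warning in both the used and never-used cases while still releasing ->sda.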