diff mbox series

[v2] srcu: Fix flush srcu structure's->sup work warning in cleanup_srcu_struct()

Message ID 20230323134621.336832-1-qiang1.zhang@intel.com (mailing list archive)
State New, archived

Commit Message

Zqiang March 23, 2023, 1:46 p.m. UTC
Unloading the rcutorture module triggers the following warning:

insmod rcutorture.ko
rmmod rcutorture.ko

[  209.437327] WARNING: CPU: 0 PID: 508 at kernel/workqueue.c:3167 __flush_work+0x50a/0x540
[  209.437346] Modules linked in: rcutorture(-) torture [last unloaded: rcutorture]
[  209.437382] CPU: 0 PID: 508 Comm: rmmod Tainted: G  W  6.3.0-rc1-yocto-standard+
[  209.437406] RIP: 0010:__flush_work+0x50a/0x540
.....
[  209.437758]  flush_delayed_work+0x36/0x90
[  209.437776]  cleanup_srcu_struct+0x68/0x2e0
[  209.437817]  srcu_module_notify+0x71/0x140
[  209.437854]  blocking_notifier_call_chain+0x9d/0xd0
[  209.437880]  __x64_sys_delete_module+0x223/0x2e0
[  209.438046]  do_syscall_64+0x43/0x90
[  209.438062]  entry_SYSCALL_64_after_hwframe+0x72/0xdc

flush_delayed_work()
->__flush_work()
   ->if (WARN_ON(!work->func))
        return false;

For srcu_struct objects defined with DEFINE_SRCU() or DEFINE_STATIC_SRCU()
in code that is built and loaded as a module, srcu_module_coming() allocates
memory for ssp->sda and initializes the sda structure, but it does not fully
initialize ssp->srcu_sup, so at that point ssp->srcu_sup->work.work.func is
still NULL.  If init_srcu_struct_fields() has not been invoked by the time
the module is unloaded, srcu_module_going() calls cleanup_srcu_struct(),
which reaches __flush_work(), finds work->func to be NULL, and raises the
warning above.
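
The failure mode described above can be illustrated with a small userspace
sketch.  This is not kernel code: struct work_model, flush_work_model(),
noop_work() and the warnings counter are hypothetical names, and the function
models only the early NULL-func check, not real flushing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct work_struct. */
struct work_model {
	void (*func)(struct work_model *work);
};

static int warnings;	/* counts how often the model "warned" */

/* Models only the early-exit check in the call chain, nothing else. */
static bool flush_work_model(struct work_model *work)
{
	assert(work != NULL);
	if (!work->func) {
		/* Corresponds to the WARN_ON(!work->func) path. */
		warnings++;
		fprintf(stderr, "WARNING: flushing work with NULL func\n");
		return false;
	}
	/* Simplification: pretend there was pending work to wait for. */
	return true;
}

/* A handler that does nothing, used to mark a work item as initialized. */
static void noop_work(struct work_model *work)
{
	(void)work;
}
```

A work item whose func was never set takes the warning path, while one set
up with noop_work() does not.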

This commit checks ssp->srcu_sup->srcu_gp_seq_needed in srcu_module_going()
to determine whether check_init_srcu_struct() has been invoked to initialize
the srcu_struct.  If it has not, there is no pending or running work, so
there is nothing to flush and it suffices to invoke free_percpu() to release
ssp->sda.

Co-developed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 kernel/rcu/srcutree.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
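
The srcu_gp_seq_needed check described in the commit message can be sketched
in userspace as follows.  The two-bit state mask and the -1UL static initial
value are assumptions based on how the kernel's sequence counters are
believed to work; struct srcu_sup_model, init_fields_model() and
was_initialized() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: the low two bits of a sequence number carry its state. */
#define RCU_SEQ_CTR_SHIFT	2
#define RCU_SEQ_STATE_MASK	((1UL << RCU_SEQ_CTR_SHIFT) - 1)

static unsigned long rcu_seq_state_model(unsigned long s)
{
	return s & RCU_SEQ_STATE_MASK;
}

/* Hypothetical stand-in for the one srcu_sup field the patch reads. */
struct srcu_sup_model {
	unsigned long srcu_gp_seq_needed;
};

/* Assumption: static initialization leaves the field at -1UL. */
#define SRCU_SUP_MODEL_STATIC_INIT	{ .srcu_gp_seq_needed = -1UL }

/* Models what the lazy initialization is assumed to do to this field. */
static void init_fields_model(struct srcu_sup_model *sup)
{
	assert(sup != NULL);
	sup->srcu_gp_seq_needed = 0;
}

/* True when the state bits are clear, i.e. initialization has run. */
static bool was_initialized(const struct srcu_sup_model *sup)
{
	assert(sup != NULL);
	return !rcu_seq_state_model(sup->srcu_gp_seq_needed);
}
```

Under these assumptions, a statically initialized structure has nonzero state
bits until the first real use, so the check distinguishes "never used" from
"initialized" without needing an extra flag.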

Comments

Paul E. McKenney March 23, 2023, 5:36 p.m. UTC | #1
On Thu, Mar 23, 2023 at 09:46:21PM +0800, Zqiang wrote:
> Co-developed-by: Paul E. McKenney <paulmck@kernel.org>

Thank you for the testing, bug-finding, and problem-solving!

In theory, you would need a Signed-off-by here from me as well, but
in practice bisectability means that this must be folded into this:

e7c778489040 ("srcu: Use static init for statically allocated in-module srcu_struct")

This will of course be with attribution.

> Signed-off-by: Zqiang <qiang1.zhang@intel.com>

But this is still a bit more complex than needed.  How about something
like this?

							Thanx, Paul

------------------------------------------------------------------------

/* Initialize any global-scope srcu_struct structures used by this module. */
static int srcu_module_coming(struct module *mod)
{
	int i;
	struct srcu_struct *ssp;
	struct srcu_struct **sspp = mod->srcu_struct_ptrs;

	for (i = 0; i < mod->num_srcu_structs; i++) {
		ssp = *(sspp++);
		ssp->sda = alloc_percpu(struct srcu_data);
		if (WARN_ON_ONCE(!ssp->sda))
			return -ENOMEM;
	}
	return 0;
}

/* Clean up any global-scope srcu_struct structures used by this module. */
static void srcu_module_going(struct module *mod)
{
	int i;
	struct srcu_struct *ssp;
	struct srcu_struct **sspp = mod->srcu_struct_ptrs;

	for (i = 0; i < mod->num_srcu_structs; i++) {
		ssp = *(sspp++);
		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)) &&
		    !WARN_ON_ONCE(!ssp->srcu_sup->sda_is_static))
			cleanup_srcu_struct(ssp);
		free_percpu(ssp->sda);
	}
}
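
The control flow of the suggested srcu_module_going() can be modeled in
userspace roughly as below.  This is only a sketch: struct ssp_model and
module_going_model() are hypothetical, the initialized flag stands in for the
rcu_seq_state() test, and the counter records every would-be warning rather
than warning once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of one srcu_struct, for illustration only. */
struct ssp_model {
	bool initialized;	/* stands in for !rcu_seq_state(srcu_gp_seq_needed) */
	bool sda_is_static;
	bool sda_allocated;	/* set at module load, cleared by the free step */
	bool cleaned_up;	/* set when the cleanup step would have run */
};

static int warnings;	/* models WARN_ON_ONCE(), but counts every hit */

static void module_going_model(struct ssp_model *ssps, size_t n)
{
	assert(ssps != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct ssp_model *ssp = &ssps[i];

		/* The second condition is evaluated only if the first holds. */
		if (ssp->initialized) {
			if (ssp->sda_is_static)
				ssp->cleaned_up = true;
			else
				warnings++;
		}
		/* The per-CPU data is released in every case. */
		ssp->sda_allocated = false;
	}
}
```

The point of the structure is that the cleanup (and therefore the flush) is
skipped for a never-initialized srcu_struct, while the per-CPU allocation
made at module load is always released.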
Zqiang March 24, 2023, 2:20 a.m. UTC | #2
Cc:  my personal email qiang.zhang1211@gmail.com

>But this is still a bit more complex than needed.  How about something
>like this?

Agreed.  From a logical point of view, this is more rigorous.
Paul E. McKenney March 24, 2023, 3:31 a.m. UTC | #3
On Fri, Mar 24, 2023 at 02:20:18AM +0000, Zhang, Qiang1 wrote:
> 
> Agree,  from a logical point of view, this is more rigorous
Zqiang March 24, 2023, 3:53 a.m. UTC | #4
> 
> Agree,  from a logical point of view, this is more rigorous
Paul E. McKenney March 24, 2023, 2:02 p.m. UTC | #5
On Fri, Mar 24, 2023 at 03:53:08AM +0000, Zhang, Qiang1 wrote:
> > 
> > Agree,  from a logical point of view, this is more rigorous
Patch

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 1fb078abbdc9..edf894e3b96e 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1921,7 +1921,6 @@  static int srcu_module_coming(struct module *mod)
 		ssp->sda = alloc_percpu(struct srcu_data);
 		if (WARN_ON_ONCE(!ssp->sda))
 			return -ENOMEM;
-		init_srcu_struct_data(ssp);
 	}
 	return 0;
 }
@@ -1931,9 +1930,14 @@  static void srcu_module_going(struct module *mod)
 {
 	int i;
 	struct srcu_struct **sspp = mod->srcu_struct_ptrs;
+	struct srcu_struct *ssp;
 
-	for (i = 0; i < mod->num_srcu_structs; i++)
-		cleanup_srcu_struct(*(sspp++));
+	for (i = 0; i < mod->num_srcu_structs; i++) {
+		ssp = (*sspp++);
+		if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed)))
+			cleanup_srcu_struct(ssp);
+		free_percpu(ssp->sda);
+	}
 }
 
 /* Handle one module, either coming or going. */