diff mbox series

dm: Avoid flush_scheduled_work() usage

Message ID e3ac2722-01c4-decb-2b79-bfd1b6c2b782@I-love.SAKURA.ne.jp (mailing list archive)
State New, archived
Headers show
Series dm: Avoid flush_scheduled_work() usage | expand

Commit Message

Tetsuo Handa April 20, 2022, 5:12 a.m. UTC
Flushing system-wide workqueues is dangerous and will be forbidden.
Remove flush_scheduled_work() from local_init() in dm.c and instead
use local workqueue to dm-mpath.c, dm-raid1.c, dm-stripe.c.

Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
Note: This patch was made blindly and completely untested. Maybe simply
removing flush_scheduled_work() from local_init() is sufficient. But I
don't know dependency across all files. Therefore, please carefully check
which schedule_work() needs to be converted to queue_work().

As far as I can see, dm-stripe.c which contains dm_stripe_exit() which is
called via _exits[] before local_exit() calls flush_scheduled_work() is
the only file which calls schedule_work(). Therefore, I used dm_stripe_wq
for dm-stripe.c.

Since I don't know dependency, and dm-raid1.c and dm-mpath.c are also calling
schedule_work(), I used dm_raid1_wq and dm_mpath_wq. But these changes might be
unnecessary because of not calling flush_scheduled_work().

 drivers/md/dm-mpath.c  | 17 +++++++++++++----
 drivers/md/dm-raid1.c  | 14 +++++++++++---
 drivers/md/dm-stripe.c | 12 ++++++++++--
 drivers/md/dm.c        |  1 -
 4 files changed, 34 insertions(+), 10 deletions(-)

Comments

Tetsuo Handa June 13, 2022, 9:52 a.m. UTC | #1
Please see commit c4f135d643823a86 ("workqueue: Wrap flush_workqueue()
using a macro") for background.

What should we do for this module?

On 2022/04/20 14:12, Tetsuo Handa wrote:
> Flushing system-wide workqueues is dangerous and will be forbidden.
> Remove flush_scheduled_work() from local_init() in dm.c and instead
> use local workqueue to dm-mpath.c, dm-raid1.c, dm-stripe.c.
> 
> Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
> Note: This patch was made blindly and completely untested. Maybe simply
> removing flush_scheduled_work() from local_init() is sufficient. But I
> don't know dependency across all files. Therefore, please carefully check
> which schedule_work() needs to be converted to queue_work().
> 
> As far as I can see, dm-stripe.c which contains dm_stripe_exit() which is
> called via _exits[] before local_exit() calls flush_scheduled_work() is
> the only file which calls schedule_work(). Therefore, I used dm_stripe_wq
> for dm-stripe.c.
> 
> Since I don't know dependency, and dm-raid1.c and dm-mpath.c are also calling
> schedule_work(), I used dm_raid1_wq and dm_mpath_wq. But these changes might be
> unnecessary because of not calling flush_scheduled_work().
> 
>  drivers/md/dm-mpath.c  | 17 +++++++++++++----
>  drivers/md/dm-raid1.c  | 14 +++++++++++---
>  drivers/md/dm-stripe.c | 12 ++++++++++--
>  drivers/md/dm.c        |  1 -
>  4 files changed, 34 insertions(+), 10 deletions(-)
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Tetsuo Handa July 15, 2022, 1:27 p.m. UTC | #2
Ping?

On 2022/06/13 18:52, Tetsuo Handa wrote:
> Please see commit c4f135d643823a86 ("workqueue: Wrap flush_workqueue()
> using a macro") for background.
> 
> What should we do for this module?
> 
> On 2022/04/20 14:12, Tetsuo Handa wrote:
>> Flushing system-wide workqueues is dangerous and will be forbidden.
>> Remove flush_scheduled_work() from local_init() in dm.c and instead
>> use local workqueue to dm-mpath.c, dm-raid1.c, dm-stripe.c.
>>
>> Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp
>> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> ---
>> Note: This patch was made blindly and completely untested. Maybe simply
>> removing flush_scheduled_work() from local_init() is sufficient. But I
>> don't know dependency across all files. Therefore, please carefully check
>> which schedule_work() needs to be converted to queue_work().
>>
>> As far as I can see, dm-stripe.c which contains dm_stripe_exit() which is
>> called via _exits[] before local_exit() calls flush_scheduled_work() is
>> the only file which calls schedule_work(). Therefore, I used dm_stripe_wq
>> for dm-stripe.c.
>>
>> Since I don't know dependency, and dm-raid1.c and dm-mpath.c are also calling
>> schedule_work(), I used dm_raid1_wq and dm_mpath_wq. But these changes might be
>> unnecessary because of not calling flush_scheduled_work().
>>
>>  drivers/md/dm-mpath.c  | 17 +++++++++++++----
>>  drivers/md/dm-raid1.c  | 14 +++++++++++---
>>  drivers/md/dm-stripe.c | 12 ++++++++++--
>>  drivers/md/dm.c        |  1 -
>>  4 files changed, 34 insertions(+), 10 deletions(-)
>>

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer July 15, 2022, 2:25 p.m. UTC | #3
I'm not applying such a fundamental change to the DM subsystem when
the header for the change leads with: "Note: This patch was made
blindly and completely untested."

I've been busy... when is the deadline for this conversion?  I read
commit c4f135d643823a86 but it doesn't say.

I don't think I have the time to make and test such change in time for
5.20 as I'm working to resolve other issues.

Mike


On Fri, Jul 15 2022 at  9:27P -0400,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> wrote:

> Ping?
> 
> On 2022/06/13 18:52, Tetsuo Handa wrote:
> > Please see commit c4f135d643823a86 ("workqueue: Wrap flush_workqueue()
> > using a macro") for background.
> > 
> > What should we do for this module?
> > 
> > On 2022/04/20 14:12, Tetsuo Handa wrote:
> >> Flushing system-wide workqueues is dangerous and will be forbidden.
> >> Remove flush_scheduled_work() from local_init() in dm.c and instead
> >> use local workqueue to dm-mpath.c, dm-raid1.c, dm-stripe.c.
> >>
> >> Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp
> >> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> >> ---
> >> Note: This patch was made blindly and completely untested. Maybe simply
> >> removing flush_scheduled_work() from local_init() is sufficient. But I
> >> don't know dependency across all files. Therefore, please carefully check
> >> which schedule_work() needs to be converted to queue_work().
> >>
> >> As far as I can see, dm-stripe.c which contains dm_stripe_exit() which is
> >> called via _exits[] before local_exit() calls flush_scheduled_work() is
> >> the only file which calls schedule_work(). Therefore, I used dm_stripe_wq
> >> for dm-stripe.c.
> >>
> >> Since I don't know dependency, and dm-raid1.c and dm-mpath.c are also calling
> >> schedule_work(), I used dm_raid1_wq and dm_mpath_wq. But these changes might be
> >> unnecessary because of not calling flush_scheduled_work().
> >>
> >>  drivers/md/dm-mpath.c  | 17 +++++++++++++----
> >>  drivers/md/dm-raid1.c  | 14 +++++++++++---
> >>  drivers/md/dm-stripe.c | 12 ++++++++++--
> >>  drivers/md/dm.c        |  1 -
> >>  4 files changed, 34 insertions(+), 10 deletions(-)
> >>
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Tetsuo Handa July 16, 2022, 5:33 a.m. UTC | #4
Thanks for responding.

On 2022/07/15 23:25, Mike Snitzer wrote:
> I'm not applying such a fundamental change to the DM subsystem when
> the header for the change leads with: "Note: This patch was made
> blindly and completely untested."

Of course, I don't think this patch will be applied as-is.

> 
> I've been busy... when is the deadline for this conversion?  I read
> commit c4f135d643823a86 but it doesn't say.

There is no deadline. I estimate we need about 6 months for getting rid of
all in-tree flush_scheduled_work() users.

> 
> I don't think I have the time to make and test such change in time for
> 5.20 as I'm working to resolve other issues.

That's no problem. This patch was proposed for heads-up purpose.

Please clarify which work items does flush_scheduled_work() from local_exit()
needs to wait. Depending on the clarification result, we might be able to
simply remove flush_scheduled_work() from local_exit() and/or replace with
flush_work() instead of introducing dedicated workqueues.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
Tetsuo Handa Sept. 1, 2022, 2:35 p.m. UTC | #5
How is the progress?

On 2022/07/16 14:33, Tetsuo Handa wrote:
> Please clarify which work items does flush_scheduled_work() from local_exit()
> needs to wait. Depending on the clarification result, we might be able to
> simply remove flush_scheduled_work() from local_exit() and/or replace with
> flush_work() instead of introducing dedicated workqueues.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel
diff mbox series

Patch

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 6ed9d2731254..9b592ff2e387 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -27,6 +27,8 @@ 
 #include <linux/atomic.h>
 #include <linux/blk-mq.h>
 
+static struct workqueue_struct *dm_mpath_wq;
+
 #define DM_MSG_PREFIX "multipath"
 #define DM_PG_INIT_DELAY_MSECS 2000
 #define DM_PG_INIT_DELAY_DEFAULT ((unsigned) -1)
@@ -1342,7 +1344,7 @@  static int fail_path(struct pgpath *pgpath)
 	dm_path_uevent(DM_UEVENT_PATH_FAILED, m->ti,
 		       pgpath->path.dev->name, atomic_read(&m->nr_valid_paths));
 
-	schedule_work(&m->trigger_event);
+	queue_work(dm_mpath_wq, &m->trigger_event);
 
 	enable_nopath_timeout(m);
 
@@ -2190,12 +2192,11 @@  static struct target_type multipath_target = {
 
 static int __init dm_multipath_init(void)
 {
-	int r;
+	int r = -ENOMEM;
 
 	kmultipathd = alloc_workqueue("kmpathd", WQ_MEM_RECLAIM, 0);
 	if (!kmultipathd) {
 		DMERR("failed to create workqueue kmpathd");
-		r = -ENOMEM;
 		goto bad_alloc_kmultipathd;
 	}
 
@@ -2209,10 +2210,15 @@  static int __init dm_multipath_init(void)
 						  WQ_MEM_RECLAIM);
 	if (!kmpath_handlerd) {
 		DMERR("failed to create workqueue kmpath_handlerd");
-		r = -ENOMEM;
 		goto bad_alloc_kmpath_handlerd;
 	}
 
+	dm_mpath_wq = alloc_workqueue("dm_mpath_wq", 0, 0);
+	if (!dm_mpath_wq) {
+		DMERR("failed to create workqueue dm_mpath_wq");
+		goto bad_alloc_dm_mpath_wq;
+	}
+
 	r = dm_register_target(&multipath_target);
 	if (r < 0) {
 		DMERR("request-based register failed %d", r);
@@ -2223,6 +2229,8 @@  static int __init dm_multipath_init(void)
 	return 0;
 
 bad_register_target:
+	destroy_workqueue(dm_mpath_wq);
+bad_alloc_dm_mpath_wq:
 	destroy_workqueue(kmpath_handlerd);
 bad_alloc_kmpath_handlerd:
 	destroy_workqueue(kmultipathd);
@@ -2232,6 +2240,7 @@  static int __init dm_multipath_init(void)
 
 static void __exit dm_multipath_exit(void)
 {
+	destroy_workqueue(dm_mpath_wq);
 	destroy_workqueue(kmpath_handlerd);
 	destroy_workqueue(kmultipathd);
 
diff --git a/drivers/md/dm-raid1.c b/drivers/md/dm-raid1.c
index 8811d484fdd1..df630d7f2b9b 100644
--- a/drivers/md/dm-raid1.c
+++ b/drivers/md/dm-raid1.c
@@ -19,6 +19,8 @@ 
 #include <linux/dm-kcopyd.h>
 #include <linux/dm-region-hash.h>
 
+static struct workqueue_struct *dm_raid1_wq;
+
 #define DM_MSG_PREFIX "raid1"
 
 #define MAX_RECOVERY 1	/* Maximum number of regions recovered in parallel. */
@@ -248,7 +250,7 @@  static void fail_mirror(struct mirror *m, enum dm_raid1_error error_type)
 		DMWARN("All sides of mirror have failed.");
 
 out:
-	schedule_work(&ms->trigger_event);
+	queue_work(dm_raid1_wq, &ms->trigger_event);
 }
 
 static int mirror_flush(struct dm_target *ti)
@@ -1486,22 +1488,28 @@  static struct target_type mirror_target = {
 
 static int __init dm_mirror_init(void)
 {
-	int r;
+	int r = -ENOMEM;
+
+	dm_raid1_wq = alloc_workqueue("dm_raid1_wq", 0, 0);
+	if (!dm_raid1_wq)
+		goto bad_target;
 
 	r = dm_register_target(&mirror_target);
 	if (r < 0) {
-		DMERR("Failed to register mirror target");
+		destroy_workqueue(dm_raid1_wq);
 		goto bad_target;
 	}
 
 	return 0;
 
 bad_target:
+	DMERR("Failed to register mirror target");
 	return r;
 }
 
 static void __exit dm_mirror_exit(void)
 {
+	destroy_workqueue(dm_raid1_wq);
 	dm_unregister_target(&mirror_target);
 }
 
diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index c81d331d1afe..e44d3c98568d 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -15,6 +15,8 @@ 
 #include <linux/slab.h>
 #include <linux/log2.h>
 
+static struct workqueue_struct *dm_stripe_wq;
+
 #define DM_MSG_PREFIX "striped"
 #define DM_IO_ERROR_THRESHOLD 15
 
@@ -423,7 +425,7 @@  static int stripe_end_io(struct dm_target *ti, struct bio *bio,
 			atomic_inc(&(sc->stripe[i].error_count));
 			if (atomic_read(&(sc->stripe[i].error_count)) <
 			    DM_IO_ERROR_THRESHOLD)
-				schedule_work(&sc->trigger_event);
+				queue_work(dm_stripe_wq, &sc->trigger_event);
 		}
 
 	return DM_ENDIO_DONE;
@@ -475,9 +477,14 @@  int __init dm_stripe_init(void)
 {
 	int r;
 
+	dm_stripe_wq = alloc_workqueue("dm_stripe_wq", 0, 0);
+	if (!dm_stripe_wq)
+		return -ENOMEM;
 	r = dm_register_target(&stripe_target);
-	if (r < 0)
+	if (r < 0) {
+		destroy_workqueue(dm_stripe_wq);
 		DMWARN("target registration failed");
+	}
 
 	return r;
 }
@@ -485,4 +492,5 @@  int __init dm_stripe_init(void)
 void dm_stripe_exit(void)
 {
 	dm_unregister_target(&stripe_target);
+	destroy_workqueue(dm_stripe_wq);
 }
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7e3b5bdcf520..ed0f7aa82b35 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -243,7 +243,6 @@  static int __init local_init(void)
 
 static void local_exit(void)
 {
-	flush_scheduled_work();
 	destroy_workqueue(deferred_remove_workqueue);
 
 	unregister_blkdev(_major, _name);