Message ID | 03f43407-c34b-b7b2-68cd-d4ca93a993b8@i-love.sakura.ne.jp (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | make autoclear operation synchronous again | expand |
The subject line is missing the loop: prefix. This approach looks very reasonable, but a few comments below: > - lo->workqueue = alloc_workqueue("loop%d", > - WQ_UNBOUND | WQ_FREEZABLE, > - 0, > - lo->lo_number); > + if (!lo->workqueue) > + lo->workqueue = alloc_workqueue("loop%d", > + WQ_UNBOUND | WQ_FREEZABLE, > + 0, lo->lo_number); Instead of having to deal with sometimes present workqueues, why not move the workqueue allocation to loop_add? > + /* This is safe: open() is still holding a reference. */ > + module_put(THIS_MODULE); Any reason to move the module_put here insted of keeping it at the end of the function as the old code did?
On 2021/12/30 2:29, Christoph Hellwig wrote: > The subject line is missing the loop: prefix. Oops. > > This approach looks very reasonable, but a few comments below: I prefer task_work approach if it is safe, for it can call destroy_workqueue() without global locks held. Note that __loop_clear_fd() is also called from ioctl(LOOP_CLR_FD), and creating locking dependency chain for both with and without global locks held looks stupid. If we can do without global locks held, why bother to do with global locks held? > >> - lo->workqueue = alloc_workqueue("loop%d", >> - WQ_UNBOUND | WQ_FREEZABLE, >> - 0, >> - lo->lo_number); >> + if (!lo->workqueue) >> + lo->workqueue = alloc_workqueue("loop%d", >> + WQ_UNBOUND | WQ_FREEZABLE, >> + 0, lo->lo_number); > > Instead of having to deal with sometimes present workqueues, why > not move the workqueue allocation to loop_add? A bit of worrisome thing is that destroy_workqueue() can be called with major_names_lock held, for loop_add() may be called as probe function from blk_request_module(). Some unexpected dependency might bite us in future. We can avoid destroy_workqueue() from loop_add() if we call alloc_workqueue() after add_disk() succeeded. But in that case calling alloc_workqueue() from loop_configure() (which is called without global locks like major_names_lock) sounds safer. > >> + /* This is safe: open() is still holding a reference. */ >> + module_put(THIS_MODULE); > > Any reason to move the module_put here insted of keeping it at the > end of the function as the old code did? OK. Two patches shown below. Are these look reasonable? From 1409a49181efcc474fbae2cf4e60cbc37adf34aa Mon Sep 17 00:00:00 2001 From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Date: Thu, 30 Dec 2021 18:41:05 +0900 Subject: [PATCH 1/2] loop: Revert "loop: make autoclear operation asynchronous" The kernel test robot is reporting that xfstest can fail at umount ext2 on xfs umount xfs sequence, for commit 322c4293ecc58110 ("loop: make autoclear operation asynchronous") broke what commit ("loop: Make explicit loop device destruction lazy") wanted to achieve. Although we cannot guarantee that nobody is holding a reference when "umount xfs" is called, we should try to close a race window opened by asynchronous autoclear operation. Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> --- drivers/block/loop.c | 65 ++++++++++++++++++++------------------------ drivers/block/loop.h | 1 - 2 files changed, 29 insertions(+), 37 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index b1b05c45c07c..e52a8a5e8cbc 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1082,7 +1082,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode, return error; } -static void __loop_clr_fd(struct loop_device *lo) +static void __loop_clr_fd(struct loop_device *lo, bool release) { struct file *filp; gfp_t gfp = lo->old_gfp_mask; @@ -1144,6 +1144,8 @@ static void __loop_clr_fd(struct loop_device *lo) /* let user-space know about this change */ kobject_uevent(&disk_to_dev(lo->lo_disk)->kobj, KOBJ_CHANGE); mapping_set_gfp_mask(filp->f_mapping, gfp); + /* This is safe: open() is still holding a reference. */ + module_put(THIS_MODULE); blk_mq_unfreeze_queue(lo->lo_queue); disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE); @@ -1151,52 +1153,44 @@ static void __loop_clr_fd(struct loop_device *lo) if (lo->lo_flags & LO_FLAGS_PARTSCAN) { int err; - mutex_lock(&lo->lo_disk->open_mutex); + /* + * open_mutex has been held already in release path, so don't + * acquire it if this function is called in such case. + * + * If the reread partition isn't from release path, lo_refcnt + * must be at least one and it can only become zero when the + * current holder is released. + */ + if (!release) + mutex_lock(&lo->lo_disk->open_mutex); err = bdev_disk_changed(lo->lo_disk, false); - mutex_unlock(&lo->lo_disk->open_mutex); + if (!release) + mutex_unlock(&lo->lo_disk->open_mutex); if (err) pr_warn("%s: partition scan of loop%d failed (rc=%d)\n", __func__, lo->lo_number, err); /* Device is gone, no point in returning error */ } + /* + * lo->lo_state is set to Lo_unbound here after above partscan has + * finished. There cannot be anybody else entering __loop_clr_fd() as + * Lo_rundown state protects us from all the other places trying to + * change the 'lo' device. + */ lo->lo_flags = 0; if (!part_shift) lo->lo_disk->flags |= GENHD_FL_NO_PART; - - fput(filp); -} - -static void loop_rundown_completed(struct loop_device *lo) -{ mutex_lock(&lo->lo_mutex); lo->lo_state = Lo_unbound; mutex_unlock(&lo->lo_mutex); - module_put(THIS_MODULE); -} - -static void loop_rundown_workfn(struct work_struct *work) -{ - struct loop_device *lo = container_of(work, struct loop_device, - rundown_work); - struct block_device *bdev = lo->lo_device; - struct gendisk *disk = lo->lo_disk; - - __loop_clr_fd(lo); - kobject_put(&bdev->bd_device.kobj); - module_put(disk->fops->owner); - loop_rundown_completed(lo); -} -static void loop_schedule_rundown(struct loop_device *lo) -{ - struct block_device *bdev = lo->lo_device; - struct gendisk *disk = lo->lo_disk; - - __module_get(disk->fops->owner); - kobject_get(&bdev->bd_device.kobj); - INIT_WORK(&lo->rundown_work, loop_rundown_workfn); - queue_work(system_long_wq, &lo->rundown_work); + /* + * Need not hold lo_mutex to fput backing file. Calling fput holding + * lo_mutex triggers a circular lock dependency possibility warning as + * fput can take open_mutex which is usually taken before lo_mutex. + */ + fput(filp); } static int loop_clr_fd(struct loop_device *lo) @@ -1228,8 +1222,7 @@ static int loop_clr_fd(struct loop_device *lo) lo->lo_state = Lo_rundown; mutex_unlock(&lo->lo_mutex); - __loop_clr_fd(lo); - loop_rundown_completed(lo); + __loop_clr_fd(lo, false); return 0; } @@ -1754,7 +1747,7 @@ static void lo_release(struct gendisk *disk, fmode_t mode) * In autoclear mode, stop the loop thread * and remove configuration after last close. */ - loop_schedule_rundown(lo); + __loop_clr_fd(lo, true); return; } else if (lo->lo_state == Lo_bound) { /* diff --git a/drivers/block/loop.h b/drivers/block/loop.h index 918a7a2dc025..082d4b6bfc6a 100644 --- a/drivers/block/loop.h +++ b/drivers/block/loop.h @@ -56,7 +56,6 @@ struct loop_device { struct gendisk *lo_disk; struct mutex lo_mutex; bool idr_visible; - struct work_struct rundown_work; }; struct loop_cmd {
On Thu, Dec 30, 2021 at 07:52:34PM +0900, Tetsuo Handa wrote: > > Instead of having to deal with sometimes present workqueues, why > > not move the workqueue allocation to loop_add? > > A bit of worrisome thing is that destroy_workqueue() can be called with > major_names_lock held, for loop_add() may be called as probe function from > blk_request_module(). Some unexpected dependency might bite us in future. > > We can avoid destroy_workqueue() from loop_add() if we call alloc_workqueue() > after add_disk() succeeded. But in that case calling alloc_workqueue() from > loop_configure() (which is called without global locks like major_names_lock) > sounds safer. Ok. > OK. Two patches shown below. Are these look reasonable? They do look reasonable to me based on a quick glance, but please post them one patch per mail in a separate thread for proper review.
On Thu, Dec 30, 2021 at 07:52:34PM +0900, Tetsuo Handa wrote: >OK. Two patches shown below. Are these look reasonable? > > > >>From 1409a49181efcc474fbae2cf4e60cbc37adf34aa Mon Sep 17 00:00:00 2001 >From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >Date: Thu, 30 Dec 2021 18:41:05 +0900 >Subject: [PATCH 1/2] loop: Revert "loop: make autoclear operation asynchronous" > Thanks, the revert fixes failures we saw recently in LTP tests, which do mount/umount in close succession: # for i in `seq 1 2`;do mount -t iso9660 -o loop /root/isofs.iso /mnt/isofs; umount /mnt/isofs/; done mount: /mnt/isofs: WARNING: source write-protected, mounted read-only. mount: /mnt/isofs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. umount: /mnt/isofs/: not mounted. >The kernel test robot is reporting that xfstest can fail at > > umount ext2 on xfs > umount xfs > >sequence, for commit 322c4293ecc58110 ("loop: make autoclear operation >asynchronous") broke what commit ("loop: Make explicit loop device >destruction lazy") wanted to achieve. > >Although we cannot guarantee that nobody is holding a reference when >"umount xfs" is called, we should try to close a race window opened >by asynchronous autoclear operation. > >Reported-by: kernel test robot <oliver.sang@intel.com> >Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >--- > drivers/block/loop.c | 65 ++++++++++++++++++++------------------------ > drivers/block/loop.h | 1 - > 2 files changed, 29 insertions(+), 37 deletions(-) > >diff --git a/drivers/block/loop.c b/drivers/block/loop.c >index b1b05c45c07c..e52a8a5e8cbc 100644 >--- a/drivers/block/loop.c >+++ b/drivers/block/loop.c >@@ -1082,7 +1082,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode, > return error; > } > >-static void __loop_clr_fd(struct loop_device *lo) >+static void __loop_clr_fd(struct loop_device *lo, bool release) > { > struct file *filp; > gfp_t gfp = lo->old_gfp_mask; >@@ -1144,6 +1144,8 @@ static void __loop_clr_fd(struct loop_device *lo) > /* let user-space know about this change */ > kobject_uevent(&disk_to_dev(lo->lo_disk)->kobj, KOBJ_CHANGE); > mapping_set_gfp_mask(filp->f_mapping, gfp); >+ /* This is safe: open() is still holding a reference. */ >+ module_put(THIS_MODULE); > blk_mq_unfreeze_queue(lo->lo_queue); > > disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE); >@@ -1151,52 +1153,44 @@ static void __loop_clr_fd(struct loop_device *lo) > if (lo->lo_flags & LO_FLAGS_PARTSCAN) { > int err; > >- mutex_lock(&lo->lo_disk->open_mutex); >+ /* >+ * open_mutex has been held already in release path, so don't >+ * acquire it if this function is called in such case. >+ * >+ * If the reread partition isn't from release path, lo_refcnt >+ * must be at least one and it can only become zero when the >+ * current holder is released. >+ */ >+ if (!release) >+ mutex_lock(&lo->lo_disk->open_mutex); > err = bdev_disk_changed(lo->lo_disk, false); >- mutex_unlock(&lo->lo_disk->open_mutex); >+ if (!release) >+ mutex_unlock(&lo->lo_disk->open_mutex); > if (err) > pr_warn("%s: partition scan of loop%d failed (rc=%d)\n", > __func__, lo->lo_number, err); > /* Device is gone, no point in returning error */ > } > >+ /* >+ * lo->lo_state is set to Lo_unbound here after above partscan has >+ * finished. There cannot be anybody else entering __loop_clr_fd() as >+ * Lo_rundown state protects us from all the other places trying to >+ * change the 'lo' device. >+ */ > lo->lo_flags = 0; > if (!part_shift) > lo->lo_disk->flags |= GENHD_FL_NO_PART; >- >- fput(filp); >-} >- >-static void loop_rundown_completed(struct loop_device *lo) >-{ > mutex_lock(&lo->lo_mutex); > lo->lo_state = Lo_unbound; > mutex_unlock(&lo->lo_mutex); >- module_put(THIS_MODULE); >-} >- >-static void loop_rundown_workfn(struct work_struct *work) >-{ >- struct loop_device *lo = container_of(work, struct loop_device, >- rundown_work); >- struct block_device *bdev = lo->lo_device; >- struct gendisk *disk = lo->lo_disk; >- >- __loop_clr_fd(lo); >- kobject_put(&bdev->bd_device.kobj); >- module_put(disk->fops->owner); >- loop_rundown_completed(lo); >-} > >-static void loop_schedule_rundown(struct loop_device *lo) >-{ >- struct block_device *bdev = lo->lo_device; >- struct gendisk *disk = lo->lo_disk; >- >- __module_get(disk->fops->owner); >- kobject_get(&bdev->bd_device.kobj); >- INIT_WORK(&lo->rundown_work, loop_rundown_workfn); >- queue_work(system_long_wq, &lo->rundown_work); >+ /* >+ * Need not hold lo_mutex to fput backing file. Calling fput holding >+ * lo_mutex triggers a circular lock dependency possibility warning as >+ * fput can take open_mutex which is usually taken before lo_mutex. >+ */ >+ fput(filp); > } > > static int loop_clr_fd(struct loop_device *lo) >@@ -1228,8 +1222,7 @@ static int loop_clr_fd(struct loop_device *lo) > lo->lo_state = Lo_rundown; > mutex_unlock(&lo->lo_mutex); > >- __loop_clr_fd(lo); >- loop_rundown_completed(lo); >+ __loop_clr_fd(lo, false); > return 0; > } > >@@ -1754,7 +1747,7 @@ static void lo_release(struct gendisk *disk, fmode_t mode) > * In autoclear mode, stop the loop thread > * and remove configuration after last close. > */ >- loop_schedule_rundown(lo); >+ __loop_clr_fd(lo, true); > return; > } else if (lo->lo_state == Lo_bound) { > /* >diff --git a/drivers/block/loop.h b/drivers/block/loop.h >index 918a7a2dc025..082d4b6bfc6a 100644 >--- a/drivers/block/loop.h >+++ b/drivers/block/loop.h >@@ -56,7 +56,6 @@ struct loop_device { > struct gendisk *lo_disk; > struct mutex lo_mutex; > bool idr_visible; >- struct work_struct rundown_work; > }; > > struct loop_cmd { >-- >2.32.0 >
On 2022/01/05 15:02, Jan Stancek wrote: > On Thu, Dec 30, 2021 at 07:52:34PM +0900, Tetsuo Handa wrote: >> OK. Two patches shown below. Are these look reasonable? >> >> >> >>> From 1409a49181efcc474fbae2cf4e60cbc37adf34aa Mon Sep 17 00:00:00 2001 >> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> >> Date: Thu, 30 Dec 2021 18:41:05 +0900 >> Subject: [PATCH 1/2] loop: Revert "loop: make autoclear operation asynchronous" >> > > Thanks, the revert fixes failures we saw recently in LTP tests, > which do mount/umount in close succession: > > # for i in `seq 1 2`;do mount -t iso9660 -o loop /root/isofs.iso /mnt/isofs; umount /mnt/isofs/; done > mount: /mnt/isofs: WARNING: source write-protected, mounted read-only. > mount: /mnt/isofs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. > umount: /mnt/isofs/: not mounted. > I'm waiting for Jens to come back to https://lkml.kernel.org/r/c205dcd2-db55-a35c-e2ef-20193b5ac0da@i-love.sakura.ne.jp .
On Wed, Jan 05, 2022 at 03:10:23PM +0900, Tetsuo Handa wrote: > On 2022/01/05 15:02, Jan Stancek wrote: > > On Thu, Dec 30, 2021 at 07:52:34PM +0900, Tetsuo Handa wrote: > >> OK. Two patches shown below. Are these look reasonable? > >> > >> > >> > >>> From 1409a49181efcc474fbae2cf4e60cbc37adf34aa Mon Sep 17 00:00:00 2001 > >> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > >> Date: Thu, 30 Dec 2021 18:41:05 +0900 > >> Subject: [PATCH 1/2] loop: Revert "loop: make autoclear operation asynchronous" > >> > > > > Thanks, the revert fixes failures we saw recently in LTP tests, > > which do mount/umount in close succession: > > > > # for i in `seq 1 2`;do mount -t iso9660 -o loop /root/isofs.iso /mnt/isofs; umount /mnt/isofs/; done > > mount: /mnt/isofs: WARNING: source write-protected, mounted read-only. > > mount: /mnt/isofs: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error. > > umount: /mnt/isofs/: not mounted. > > > > I'm waiting for Jens to come back to > https://lkml.kernel.org/r/c205dcd2-db55-a35c-e2ef-20193b5ac0da@i-love.sakura.ne.jp . So I think the version in this thread is what we should merge. On top of that we should: - remove the probe mechanism (patch already sent) - stop taking open_mutex in del_gendisk and bdev_disk_changed (I have a series for that now)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index b1b05c45c07c..93137858b00c 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1006,10 +1006,10 @@ static int loop_configure(struct loop_device *lo, fmode_t mode, !file->f_op->write_iter) lo->lo_flags |= LO_FLAGS_READ_ONLY; - lo->workqueue = alloc_workqueue("loop%d", - WQ_UNBOUND | WQ_FREEZABLE, - 0, - lo->lo_number); + if (!lo->workqueue) + lo->workqueue = alloc_workqueue("loop%d", + WQ_UNBOUND | WQ_FREEZABLE, + 0, lo->lo_number); if (!lo->workqueue) { error = -ENOMEM; goto out_unlock; @@ -1082,7 +1082,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode, return error; } -static void __loop_clr_fd(struct loop_device *lo) +static void __loop_clr_fd(struct loop_device *lo, bool release) { struct file *filp; gfp_t gfp = lo->old_gfp_mask; @@ -1115,7 +1115,6 @@ static void __loop_clr_fd(struct loop_device *lo) /* freeze request queue during the transition */ blk_mq_freeze_queue(lo->lo_queue); - destroy_workqueue(lo->workqueue); spin_lock_irq(&lo->lo_work_lock); list_for_each_entry_safe(worker, pos, &lo->idle_worker_list, idle_list) { @@ -1144,6 +1143,8 @@ static void __loop_clr_fd(struct loop_device *lo) /* let user-space know about this change */ kobject_uevent(&disk_to_dev(lo->lo_disk)->kobj, KOBJ_CHANGE); mapping_set_gfp_mask(filp->f_mapping, gfp); + /* This is safe: open() is still holding a reference. */ + module_put(THIS_MODULE); blk_mq_unfreeze_queue(lo->lo_queue); disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE); @@ -1151,52 +1152,44 @@ static void __loop_clr_fd(struct loop_device *lo) if (lo->lo_flags & LO_FLAGS_PARTSCAN) { int err; - mutex_lock(&lo->lo_disk->open_mutex); + /* + * open_mutex has been held already in release path, so don't + * acquire it if this function is called in such case. + * + * If the reread partition isn't from release path, lo_refcnt + * must be at least one and it can only become zero when the + * current holder is released. + */ + if (!release) + mutex_lock(&lo->lo_disk->open_mutex); err = bdev_disk_changed(lo->lo_disk, false); - mutex_unlock(&lo->lo_disk->open_mutex); + if (!release) + mutex_unlock(&lo->lo_disk->open_mutex); if (err) pr_warn("%s: partition scan of loop%d failed (rc=%d)\n", __func__, lo->lo_number, err); /* Device is gone, no point in returning error */ } + /* + * lo->lo_state is set to Lo_unbound here after above partscan has + * finished. There cannot be anybody else entering __loop_clr_fd() as + * Lo_rundown state protects us from all the other places trying to + * change the 'lo' device. + */ lo->lo_flags = 0; if (!part_shift) lo->lo_disk->flags |= GENHD_FL_NO_PART; - - fput(filp); -} - -static void loop_rundown_completed(struct loop_device *lo) -{ mutex_lock(&lo->lo_mutex); lo->lo_state = Lo_unbound; mutex_unlock(&lo->lo_mutex); - module_put(THIS_MODULE); -} - -static void loop_rundown_workfn(struct work_struct *work) -{ - struct loop_device *lo = container_of(work, struct loop_device, - rundown_work); - struct block_device *bdev = lo->lo_device; - struct gendisk *disk = lo->lo_disk; - - __loop_clr_fd(lo); - kobject_put(&bdev->bd_device.kobj); - module_put(disk->fops->owner); - loop_rundown_completed(lo); -} -static void loop_schedule_rundown(struct loop_device *lo) -{ - struct block_device *bdev = lo->lo_device; - struct gendisk *disk = lo->lo_disk; - - __module_get(disk->fops->owner); - kobject_get(&bdev->bd_device.kobj); - INIT_WORK(&lo->rundown_work, loop_rundown_workfn); - queue_work(system_long_wq, &lo->rundown_work); + /* + * Need not hold lo_mutex to fput backing file. Calling fput holding + * lo_mutex triggers a circular lock dependency possibility warning as + * fput can take open_mutex which is usually taken before lo_mutex. + */ + fput(filp); } static int loop_clr_fd(struct loop_device *lo) @@ -1228,8 +1221,7 @@ static int loop_clr_fd(struct loop_device *lo) lo->lo_state = Lo_rundown; mutex_unlock(&lo->lo_mutex); - __loop_clr_fd(lo); - loop_rundown_completed(lo); + __loop_clr_fd(lo, false); return 0; } @@ -1754,7 +1746,7 @@ static void lo_release(struct gendisk *disk, fmode_t mode) * In autoclear mode, stop the loop thread * and remove configuration after last close. */ - loop_schedule_rundown(lo); + __loop_clr_fd(lo, true); return; } else if (lo->lo_state == Lo_bound) { /* @@ -2080,6 +2072,8 @@ static void loop_remove(struct loop_device *lo) mutex_unlock(&loop_ctl_mutex); /* There is no route which can find this loop device. */ mutex_destroy(&lo->lo_mutex); + if (lo->workqueue) + destroy_workqueue(lo->workqueue); kfree(lo); } diff --git a/drivers/block/loop.h b/drivers/block/loop.h index 918a7a2dc025..082d4b6bfc6a 100644 --- a/drivers/block/loop.h +++ b/drivers/block/loop.h @@ -56,7 +56,6 @@ struct loop_device { struct gendisk *lo_disk; struct mutex lo_mutex; bool idr_visible; - struct work_struct rundown_work; }; struct loop_cmd {