[0/14] loop: Fix oops and possible deadlocks

Message ID	20180927113652.5422-1-jack@suse.cz (mailing list archive)
Headers	show Return-Path: <linux-block-owner@kernel.org> From: Jan Kara <jack@suse.cz> To: Jens Axboe <axboe@kernel.dk> Cc: <linux-block@vger.kernel.org>, Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>, Jan Kara <jack@suse.cz> Subject: [PATCH 0/14] loop: Fix oops and possible deadlocks Date: Thu, 27 Sep 2018 13:36:38 +0200 Message-Id: <20180927113652.5422-1-jack@suse.cz> Sender: linux-block-owner@vger.kernel.org Precedence: bulk
Series	loop: Fix oops and possible deadlocks \| expand [0/14] loop: Fix oops and possible deadlocks [01/14] block/loop: Don't grab "struct file" for vfs_getattr() operation. [02/14] block/loop: Use global lock for ioctl() operation. [03/14] loop: Fold __loop_release into loop_release [04/14] loop: Get rid of loop_index_mutex [05/14] loop: Push lo_ctl_mutex down into individual ioctls [06/14] loop: Split setting of lo_state from loop_clr_fd [07/14] loop: Push loop_ctl_mutex down into loop_clr_fd() [08/14] loop: Push loop_ctl_mutex down to loop_get_status() [09/14] loop: Push loop_ctl_mutex down to loop_set_status() [10/14] loop: Push loop_ctl_mutex down to loop_set_fd() [11/14] loop: Push loop_ctl_mutex down to loop_change_fd() [12/14] loop: Move special partition reread handling in loop_clr_fd() [13/14] loop: Move loop_reread_partitions() out of loop_ctl_mutex [14/14] loop: Fix deadlock when calling blkdev_reread_part()

Jan Kara Sept. 27, 2018, 11:36 a.m. UTC

Hi,

this patch series fixes oops and possible deadlocks as reported by syzbot [1]
[2]. The second patch in the series (from Tetsuo) fixes the oops, the remaining
patches are cleaning up the locking in the loop driver so that we can in the
end reasonably easily switch to rereading partitions without holding mutex
protecting the loop device.

I have lightly tested the patches by creating, deleting, and modifying loop
devices but if there's some more comprehensive loopback device testsuite, I
can try running it. Review is welcome!

								Honza

[1] https://syzkaller.appspot.com/bug?id=f3cfe26e785d85f9ee259f385515291d21bd80a3
[2] https://syzkaller.appspot.com/bug?id=bf154052f0eea4bc7712499e4569505907d15889

Tetsuo Handa Sept. 27, 2018, 2:47 p.m. UTC | #1

Possible changes folded into this series.

  (1) (I guess) no need to use _nested version.

  (2) Use mutex_lock_killable() where possible.

  (3) Move fput() to after mutex_unlock().

  (4) Don't return 0 upon invalid loop_control_ioctl().

  (5) No need to mutex_lock()/mutex_unlock() on each loop device at
      unregister_transfer_cb() callback.

  (6) No need to mutex_unlock()+mutex_lock() when calling __loop_clr_fd().

---
 drivers/block/loop.c | 78 ++++++++++++++++++++++++----------------------------
 1 file changed, 36 insertions(+), 42 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index a0fb7bf..395d1e9 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -678,11 +678,11 @@ static int loop_validate_file(struct file *file, struct block_device *bdev)
 static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
 			  unsigned int arg)
 {
-	struct file	*file, *old_file;
+	struct file	*file = NULL, *old_file;
 	int		error;
 	bool		partscan;
 
-	error = mutex_lock_killable_nested(&loop_ctl_mutex, 1);
+	error = mutex_lock_killable(&loop_ctl_mutex);
 	if (error)
 		return error;
 	error = -ENXIO;
@@ -701,7 +701,7 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
 
 	error = loop_validate_file(file, bdev);
 	if (error)
-		goto out_putf;
+		goto out_unlock;
 
 	old_file = lo->lo_backing_file;
 
@@ -709,29 +709,31 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
 
 	/* size of the new backing store needs to be the same */
 	if (get_loop_size(lo, file) != get_loop_size(lo, old_file))
-		goto out_putf;
+		goto out_unlock;
 
 	/* and ... switch */
 	blk_mq_freeze_queue(lo->lo_queue);
 	mapping_set_gfp_mask(old_file->f_mapping, lo->old_gfp_mask);
+	spin_lock_irq(&lo->lo_lock);
 	lo->lo_backing_file = file;
+	spin_unlock_irq(&lo->lo_lock);
 	lo->old_gfp_mask = mapping_gfp_mask(file->f_mapping);
 	mapping_set_gfp_mask(file->f_mapping,
 			     lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));
 	loop_update_dio(lo);
 	blk_mq_unfreeze_queue(lo->lo_queue);
 
-	fput(old_file);
 	partscan = lo->lo_flags & LO_FLAGS_PARTSCAN;
 	mutex_unlock(&loop_ctl_mutex);
+	fput(old_file);
 	if (partscan)
 		loop_reread_partitions(lo, bdev);
 	return 0;
 
-out_putf:
-	fput(file);
 out_unlock:
 	mutex_unlock(&loop_ctl_mutex);
+	if (file)
+		fput(file);
 	return error;
 }
 
@@ -916,7 +918,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	if (!file)
 		goto out;
 
-	error = mutex_lock_killable_nested(&loop_ctl_mutex, 1);
+	error = mutex_lock_killable(&loop_ctl_mutex);
 	if (error)
 		goto out_putf;
 
@@ -975,7 +977,8 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 		lo->lo_flags |= LO_FLAGS_PARTSCAN;
 	partscan = lo->lo_flags & LO_FLAGS_PARTSCAN;
 
-	/* Grab the block_device to prevent its destruction after we
+	/*
+	 * Grab the block_device to prevent its destruction after we
 	 * put /dev/loopXX inode. Later in __loop_clr_fd() we bdput(bdev).
 	 */
 	bdgrab(bdev);
@@ -1031,26 +1034,21 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	return err;
 }
 
+/* loop_ctl_mutex is held by caller, and is released before return. */
 static int __loop_clr_fd(struct loop_device *lo, bool release)
 {
-	struct file *filp = NULL;
+	/*
+	 * filp != NULL because the caller checked lo->lo_state == Lo_bound
+	 * under loop_ctl_mutex.
+	 */
+	struct file *filp = lo->lo_backing_file;
 	gfp_t gfp = lo->old_gfp_mask;
 	struct block_device *bdev = lo->lo_device;
 	int err = 0;
 	bool partscan = false;
 	int lo_number;
 
-	mutex_lock(&loop_ctl_mutex);
-	if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) {
-		err = -ENXIO;
-		goto out_unlock;
-	}
-
-	filp = lo->lo_backing_file;
-	if (filp == NULL) {
-		err = -EINVAL;
-		goto out_unlock;
-	}
+	lo->lo_state = Lo_rundown;
 
 	/* freeze request queue during the transition */
 	blk_mq_freeze_queue(lo->lo_queue);
@@ -1091,13 +1089,12 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 	module_put(THIS_MODULE);
 	blk_mq_unfreeze_queue(lo->lo_queue);
 
-	partscan = lo->lo_flags & LO_FLAGS_PARTSCAN && bdev;
+	partscan = (lo->lo_flags & LO_FLAGS_PARTSCAN) && bdev;
 	lo_number = lo->lo_number;
 	lo->lo_flags = 0;
 	if (!part_shift)
 		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
 	loop_unprepare_queue(lo);
-out_unlock:
 	mutex_unlock(&loop_ctl_mutex);
 	if (partscan) {
 		/*
@@ -1123,8 +1120,7 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 	 * lock dependency possibility warning as fput can take
 	 * bd_mutex which is usually taken before loop_ctl_mutex.
 	 */
-	if (filp)
-		fput(filp);
+	fput(filp);
 	return err;
 }
 
@@ -1132,7 +1128,7 @@ static int loop_clr_fd(struct loop_device *lo)
 {
 	int err;
 
-	err = mutex_lock_killable_nested(&loop_ctl_mutex, 1);
+	err = mutex_lock_killable(&loop_ctl_mutex);
 	if (err)
 		return err;
 	if (lo->lo_state != Lo_bound) {
@@ -1154,9 +1150,6 @@ static int loop_clr_fd(struct loop_device *lo)
 		mutex_unlock(&loop_ctl_mutex);
 		return 0;
 	}
-	lo->lo_state = Lo_rundown;
-	mutex_unlock(&loop_ctl_mutex);
-
 	return __loop_clr_fd(lo, false);
 }
 
@@ -1169,7 +1162,7 @@ static int loop_clr_fd(struct loop_device *lo)
 	struct block_device *bdev;
 	bool partscan = false;
 
-	err = mutex_lock_killable_nested(&loop_ctl_mutex, 1);
+	err = mutex_lock_killable(&loop_ctl_mutex);
 	if (err)
 		return err;
 	if (lo->lo_encrypt_key_size &&
@@ -1274,7 +1267,7 @@ static int loop_clr_fd(struct loop_device *lo)
 	struct kstat stat;
 	int ret;
 
-	ret = mutex_lock_killable_nested(&loop_ctl_mutex, 1);
+	ret = mutex_lock_killable(&loop_ctl_mutex);
 	if (ret)
 		return ret;
 	if (lo->lo_state != Lo_bound) {
@@ -1463,7 +1456,7 @@ static int lo_simple_ioctl(struct loop_device *lo, unsigned int cmd,
 {
 	int err;
 
-	err = mutex_lock_killable_nested(&loop_ctl_mutex, 1);
+	err = mutex_lock_killable(&loop_ctl_mutex);
 	if (err)
 		return err;
 	switch (cmd) {
@@ -1712,8 +1705,6 @@ static void lo_release(struct gendisk *disk, fmode_t mode)
 	if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) {
 		if (lo->lo_state != Lo_bound)
 			goto out_unlock;
-		lo->lo_state = Lo_rundown;
-		mutex_unlock(&loop_ctl_mutex);
 		/*
 		 * In autoclear mode, stop the loop thread
 		 * and remove configuration after last close.
@@ -1769,10 +1760,8 @@ static int unregister_transfer_cb(int id, void *ptr, void *data)
 	struct loop_device *lo = ptr;
 	struct loop_func_table *xfer = data;
 
-	mutex_lock(&loop_ctl_mutex);
 	if (lo->lo_encryption == xfer)
 		loop_release_xfer(lo);
-	mutex_unlock(&loop_ctl_mutex);
 	return 0;
 }
 
@@ -1785,7 +1774,13 @@ int loop_unregister_transfer(int number)
 		return -EINVAL;
 
 	xfer_funcs[n] = NULL;
+	/*
+	 * cleanup_cryptoloop() cannot handle errors because it is called
+	 * from module_exit(). Thus, don't give up upon SIGKILL here.
+	 */
+	mutex_lock(&loop_ctl_mutex);
 	idr_for_each(&loop_index_idr, &unregister_transfer_cb, xfer);
+	mutex_unlock(&loop_ctl_mutex);
 	return 0;
 }
 
@@ -2031,7 +2026,9 @@ static struct kobject *loop_probe(dev_t dev, int *part, void *data)
 	struct kobject *kobj;
 	int err;
 
-	mutex_lock(&loop_ctl_mutex);
+	*part = 0;
+	if (mutex_lock_killable(&loop_ctl_mutex))
+		return NULL;
 	err = loop_lookup(&lo, MINOR(dev) >> part_shift);
 	if (err < 0)
 		err = loop_add(&lo, MINOR(dev) >> part_shift);
@@ -2040,8 +2037,6 @@ static struct kobject *loop_probe(dev_t dev, int *part, void *data)
 	else
 		kobj = get_disk_and_module(lo->lo_disk);
 	mutex_unlock(&loop_ctl_mutex);
-
-	*part = 0;
 	return kobj;
 }
 
@@ -2049,12 +2044,11 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
 			       unsigned long parm)
 {
 	struct loop_device *lo;
-	int ret = -ENOSYS;
+	int ret = mutex_lock_killable(&loop_ctl_mutex);
 
-	ret = mutex_lock_killable(&loop_ctl_mutex);
 	if (ret)
 		return ret;
-
+	ret = -ENOSYS;
 	switch (cmd) {
 	case LOOP_CTL_ADD:
 		ret = loop_lookup(&lo, parm);

Jan Kara Oct. 4, 2018, 10:15 a.m. UTC | #2

On Thu 27-09-18 23:47:01, Tetsuo Handa wrote:
> Possible changes folded into this series.

Thanks for having a look. But please comment on individual patches at
appropriate places instead of sending this patch where everything is just
mixed together. It is much easier to find out what we are talking about
that way.

>   (1) (I guess) no need to use _nested version.

I just preserved the current status as I didn't want to dig into that hole.
Even if you're right, that would be a separate change. Not something these
patches should deal with.

>   (2) Use mutex_lock_killable() where possible.

Where exactly? I've only noticed you've changed loop_probe() where I think
the change is just bogus. That gets called on device insertion and you have
nowhere to deliver the signal in that case.

>   (3) Move fput() to after mutex_unlock().

Again, independent change. I've just preserved the current situation. But
probably worth including in this series as a separate patch. Care to send a
follow up patch with proper changelog etc.?

>   (4) Don't return 0 upon invalid loop_control_ioctl().

Good catch, I'll fix that up.

>   (5) No need to mutex_lock()/mutex_unlock() on each loop device at
>       unregister_transfer_cb() callback.

Another independent optimization. Will you send a follow up patch? I can
write the patch (and the one above) but I don't want to steal the credit
from you...

>   (6) No need to mutex_unlock()+mutex_lock() when calling __loop_clr_fd().

This is deliberate so that we get rid of the weird "__loop_clr_fd()
releases mutex it did not acquire". This is not performance critical path
by any means so better keep the locking simple.

								Honza

[0/14] loop: Fix oops and possible deadlocks

Message

Comments