Message ID | d15e9392-44d0-f42c-cbac-859459a99395@i-love.sakura.ne.jp (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | loop: drop loop_ctl_mutex around del_gendisk() in loop_remove() | expand |
Found that commit 990e78116d38059c ("block: loop: fix deadlock between open and remove") went to 5.13-rc6. #syz fix: block: loop: fix deadlock between open and remove Now syzbot is reporting Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(nbd_index_mutex); lock(&bdev->bd_mutex); lock(nbd_index_mutex); lock(&bdev->bd_mutex); at https://syzkaller.appspot.com/text?tag=CrashReport&x=11b8a5bbd00000 .
> Found that commit 990e78116d38059c ("block: loop: fix deadlock between open and remove") went to 5.13-rc6. > > #syz fix: block: loop: fix deadlock between open and remove Your 'fix:' command is accepted, but please keep syzkaller-bugs@googlegroups.com mailing list in CC next time. It serves as a history of what happened with each bug report. Thank you. > > Now syzbot is reporting > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(nbd_index_mutex); > lock(&bdev->bd_mutex); > lock(nbd_index_mutex); > lock(&bdev->bd_mutex); > > at https://syzkaller.appspot.com/text?tag=CrashReport&x=11b8a5bbd00000 . >
On Sat 12-06-21 00:14:20, Tetsuo Handa wrote: > syzbot is reporting circular locking dependency between loop_ctl_mutex and > bdev->bd_mutex [1] due to commit c76f48eb5c084b1e ("block: take bd_mutex > around delete_partitions in del_gendisk"). > > But calling del_gendisk() from loop_remove() without loop_ctl_mutex held > triggers a different race problem regarding sysfs entry management. We > somehow need to serialize "add_disk() from loop_add()" and "del_gendisk() > from loop_remove()". Fortunately, since loop_control_ioctl() is called > with no locks held, we can use "sleep and retry" approach without risking > deadlock. > > Since "struct loop_device"->lo_disk->private_data is set to non-NULL at > loop_add() and is reset to NULL before calling loop_remove(), we can use > it as a flag for taking appropriate action ("sleep and retry" or "skip") > when loop_remove() is in progress. > > Link: https://syzkaller.appspot.com/bug?extid=61e04e51b7ac86930589 [1] > Reported-by: syzbot <syzbot+61e04e51b7ac86930589@syzkaller.appspotmail.com> > Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> > Tested-by: syzbot <syzbot+61e04e51b7ac86930589@syzkaller.appspotmail.com> > Fixes: c76f48eb5c084b1e ("block: take bd_mutex around delete_partitions in del_gendisk") Christoph seems to have already fixed this by 990e78116d380 ("block: loop: fix deadlock between open and remove"). Honza
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index d58d68f3c7cd..d4c9567b2904 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -2188,7 +2188,9 @@ static int loop_add(struct loop_device **l, int i) static void loop_remove(struct loop_device *lo) { + mutex_unlock(&loop_ctl_mutex); del_gendisk(lo->lo_disk); + mutex_lock(&loop_ctl_mutex); blk_cleanup_queue(lo->lo_queue); blk_mq_free_tag_set(&lo->tag_set); put_disk(lo->lo_disk); @@ -2201,7 +2203,7 @@ static int find_free_cb(int id, void *ptr, void *data) struct loop_device *lo = ptr; struct loop_device **l = data; - if (lo->lo_state == Lo_unbound) { + if (lo->lo_state == Lo_unbound && lo->lo_disk->private_data) { *l = lo; return 1; } @@ -2254,6 +2256,13 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd, struct loop_device *lo; int ret; + if (0) { +unlock_and_retry: + mutex_unlock(&loop_ctl_mutex); + if (schedule_timeout_killable(HZ / 10)) + return -EINTR; + } + debug_check_no_locks_held(); ret = mutex_lock_killable(&loop_ctl_mutex); if (ret) return ret; @@ -2263,6 +2272,8 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd, case LOOP_CTL_ADD: ret = loop_lookup(&lo, parm); if (ret >= 0) { + if (!lo->lo_disk->private_data) + goto unlock_and_retry; ret = -EEXIST; break; } @@ -2272,6 +2283,8 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd, ret = loop_lookup(&lo, parm); if (ret < 0) break; + if (!lo->lo_disk->private_data) + goto unlock_and_retry; ret = mutex_lock_killable(&lo->lo_mutex); if (ret) break; @@ -2286,9 +2299,10 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd, break; } lo->lo_disk->private_data = NULL; + parm = lo->lo_number; mutex_unlock(&lo->lo_mutex); - idr_remove(&loop_index_idr, lo->lo_number); loop_remove(lo); + idr_remove(&loop_index_idr, parm); break; case LOOP_CTL_GET_FREE: ret = loop_lookup(&lo, -1);