Message ID | 4d91224a-ecba-3696-1116-3da2e48ec4d3@I-love.SAKURA.ne.jp (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, May 08, 2018 at 08:05:12PM +0900, Tetsuo Handa wrote: > > So, it is time to think how to solve this race condition, as well as how to solve > lockdep's deadlock warning (and I guess that syzbot is actually hitting deadlocks). > An approach which serializes loop operations using global lock was proposed at > https://groups.google.com/d/msg/syzkaller-bugs/2Rw8-OM6IbM/PzdobV8kAgAJ . > Please respond... I'm looking at your patch which you proposed on this, and the locking architecture still looks way too complex. Things like loop_mutex_owner, and all of the infrastructure around lo->ioctl_in_progress should be removed, if at all possible. I believe it should be possible to do things with a single global mutex, some code refactoring, and some unlocked versions of some of the functions. Again, this requires root, and it requires someone deliberately trying to induce a race. So "it's time" is not necessarily the priority I would set for this item. But if we are going to fix it, let's fix it right, and not make the code more complex and less maintainable, all in the name of trying to make a rare, not-likely-to-happen-in-real-life syzbot reported problem to go away. Cheers, - Ted
Theodore Y. Ts'o wrote: > On Tue, May 08, 2018 at 08:05:12PM +0900, Tetsuo Handa wrote: > > > > So, it is time to think how to solve this race condition, as well as how to solve > > lockdep's deadlock warning (and I guess that syzbot is actually hitting deadlocks). > > An approach which serializes loop operations using global lock was proposed at > > https://groups.google.com/d/msg/syzkaller-bugs/2Rw8-OM6IbM/PzdobV8kAgAJ . > > Please respond... > > I'm looking at your patch which you proposed on this, and the locking > architecture still looks way too complex. Things like > loop_mutex_owner, and all of the infrastructure around > lo->ioctl_in_progress should be removed, if at all possible. The patch in the above link no longer uses "lo->ioctl_in_progress". You looked at previous version rather than current version. > > I believe it should be possible to do things with a single global > mutex, some code refactoring, and some unlocked versions of some of > the functions. The patch in the above link uses single global mutex "loop_mutex". > > Again, this requires root, and it requires someone deliberately trying > to induce a race. So "it's time" is not necessarily the priority I > would set for this item. But if we are going to fix it, let's fix it > right, and not make the code more complex and less maintainable, all > in the name of trying to make a rare, not-likely-to-happen-in-real-life > syzbot reported problem to go away. While NULL pointer dereference would be rare, deadlocks might not be rare enough to postpone the patch. Deadlocks can cause pile of false-positive hung task reports and can prevent syzbot from finding other bugs. That's why I say "it is time to think".
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -909,6 +909,9 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode, error = -EINVAL; goto out_putf; } + pr_err("Start sleeping\n"); + schedule_timeout_killable(3 * HZ); + pr_err("End sleeping\n"); f = l->lo_backing_file; }