Message ID | 20190516140127.23272-1-jack@suse.cz (mailing list archive)
---|---
State | New, archived
Series | loop: Don't change loop device under exclusive opener
On 5/16/19 8:01 AM, Jan Kara wrote:
> Loop module allows calling LOOP_SET_FD while there are other openers of
> the loop device. Even exclusive ones. This can lead to weird
> consequences such as kernel deadlocks like:
>
> mount_bdev()                            lo_ioctl()
>   udf_fill_super()
>     udf_load_vrs()
>       sb_set_blocksize() - sets desired block size B
>       udf_tread()
>         sb_bread()
>           __bread_gfp(bdev, block, B)
>                                           loop_set_fd()
>                                             set_blocksize()
>             - now __getblk_slow() indefinitely loops because B != bdev
>               block size
>
> Fix the problem by disallowing the LOOP_SET_FD ioctl when there are
> exclusive openers of a loop device.
>
> [Deliberately chosen not to CC stable as a user with privileges to
> trigger this race has other means of taking the system down, and this
> has a potential of breaking some weird userspace setup]
>
> Reported-and-tested-by: syzbot+10007d66ca02b08f0e60@syzkaller.appspotmail.com
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  drivers/block/loop.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> Hi Jens!
>
> What do you think about this patch? It fixes the problem, but it also
> changes user-visible behavior, so there is a chance it breaks some
> existing setup (although I have a hard time coming up with a realistic
> scenario where it would matter).

I also have a hard time thinking of valid cases where this would be a
problem. I think, in the end, that fixing the issue is more important
than a potentially hypothetical use case.

> Alternatively, we could change the getblk() code to handle a changing
> block size. That would fix the particular issue syzkaller found as well,
> but I'm not sure what else is broken when the block device changes while
> an fs driver is working with it.

I think your solution here is saner.
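The hang described above comes from an invariant __getblk_slow() relies on: the requested buffer size B must eventually match the device's block size, and a racing LOOP_SET_FD silently breaks that. A toy model of that loop (purely illustrative Python, not kernel code; the real function spins indefinitely rather than giving up):

```python
def getblk_slow(read_bdev_blocksize, requested_size, max_spins=1000):
    """Toy model of __getblk_slow(): keep retrying until a buffer of
    the requested size can be created, i.e. until the requested size
    matches the device block size. Capped at max_spins so the sketch
    terminates; the kernel version loops forever."""
    for _ in range(max_spins):
        if requested_size == read_bdev_blocksize():
            return "buffer of %d bytes" % requested_size
    return None  # the real kernel never reaches this: it keeps looping

# udf_fill_super() set B = 2048, then a racing LOOP_SET_FD reset the
# device block size to 512: the request can never succeed.
assert getblk_slow(lambda: 512, 2048) is None
assert getblk_slow(lambda: 2048, 2048) == "buffer of 2048 bytes"
```

The patch removes the race by refusing LOOP_SET_FD while an exclusive opener (such as the mounting filesystem) holds the device, so the block size cannot change underneath it.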
On Thu 16-05-19 14:44:07, Jens Axboe wrote:
> On 5/16/19 8:01 AM, Jan Kara wrote:
> > Loop module allows calling LOOP_SET_FD while there are other openers of
> > the loop device. Even exclusive ones. This can lead to weird
> > consequences such as kernel deadlocks like: [...]
> >
> > What do you think about this patch? It fixes the problem but it also
> > changes user visible behavior so there are chances it breaks some
> > existing setup (although I have hard time coming up with a realistic
> > scenario where it would matter).
>
> I also have a hard time thinking about valid cases where this would be a
> problem. I think, in the end, that fixing the issue is more important
> than a potentially hypothetical use case.
>
> I think your solution here is saner.

Will you pick up the patch please? I cannot find it in your tree...

Thanks!
								Honza
On 5/27/19 6:29 AM, Jan Kara wrote:
> On Thu 16-05-19 14:44:07, Jens Axboe wrote:
>> On 5/16/19 8:01 AM, Jan Kara wrote:
>>> Loop module allows calling LOOP_SET_FD while there are other openers of
>>> the loop device. Even exclusive ones. [...]
>>
>> I think your solution here is saner.
>
> Will you pick up the patch please? I cannot find it in your tree...
> Thanks!

Done!
Hi Jan,

at 21:34, Jens Axboe <axboe@kernel.dk> wrote:
> On 5/27/19 6:29 AM, Jan Kara wrote:
>> On Thu 16-05-19 14:44:07, Jens Axboe wrote:
>>> On 5/16/19 8:01 AM, Jan Kara wrote:
>>>> Loop module allows calling LOOP_SET_FD while there are other openers of
>>>> the loop device. Even exclusive ones. [...]
>>>
>>> I think your solution here is saner.
>>
>> Will you pick up the patch please? I cannot find it in your tree...
>> Thanks!
>
> Done!

This patch introduced a regression [1].
A reproducer can be found at [2].

[1] https://bugs.launchpad.net/bugs/1836914
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1836914/comments/4

Kai-Heng

> --
> Jens Axboe
On Thu 18-07-19 16:15:42, Kai-Heng Feng wrote:
> Hi Jan,
>
> at 21:34, Jens Axboe <axboe@kernel.dk> wrote:
> > [...]
> > Done!
>
> This patch introduced a regression [1].
> A reproducer can be found at [2].
>
> [1] https://bugs.launchpad.net/bugs/1836914
> [2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1836914/comments/4

Thanks for the notice and the references. What's your version of
util-linux? What your test script does is indeed racy. You have there:

    echo Running:
    for i in {a..z}{a..z}; do
        mount $i.squash /mnt/$i &
    done

So all mount(8) commands will run in parallel and race to set up loop
devices with LOOP_SET_FD and mount them. However, util-linux (at least in
the current version) seems to handle EBUSY from LOOP_SET_FD just fine and
retries with a new loop device. So at this point I don't see why the patch
makes a difference... I guess I'll need to reproduce this and see what's
going on in detail.

								Honza
On Tue, 30 Jul 2019 at 10:29, Jan Kara <jack@suse.cz> wrote:
>
> Thanks for the notice and the references. What's your version of
> util-linux? What your test script does is indeed racy. [...] So at this
> point I don't see why the patch makes difference... I guess I'll need to
> reproduce and see what's going on in detail.

We've observed this on Arch with util-linux 2.34, and on Ubuntu 19.10
(eoan ermine) with util-linux 2.33.

Just to be clear, the initial reports didn't involve a zany loop of
mounts, but were triggered by effectively the same thing as systemd
booted a system with a lot of snaps. The reproducer just tries to make
things simpler to reproduce :-). FWIW, the systemd versions were 244 and
242 for those systems, respectively.
On 2019/07/30 18:29, Jan Kara wrote:
>> This patch introduced a regression [1].
>> A reproducer can be found at [2].
>>
>> [1] https://bugs.launchpad.net/bugs/1836914
>> [2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1836914/comments/4
>
> Thanks for the notice and the references. [...] So at this point I don't
> see why the patch makes difference... I guess I'll need to reproduce and
> see what's going on in detail.

Firstly, why not check the return value of blkdev_get()? EBUSY is not the
only error code blkdev_get() might return.

	/*
	 * If we don't hold exclusive handle for the device, upgrade to it
	 * here to avoid changing device under exclusive owner.
	 */
	if (!(mode & FMODE_EXCL)) {
		bdgrab(bdev);
		error = blkdev_get(bdev, mode | FMODE_EXCL, loop_set_fd);
-		if (error)
+		if (error) {
+			printk("loop_set_fd: %d\n", error);
			goto out_putf;
+		}
	}

And try finding which line is returning the error (like
https://marc.info/?l=linux-xfs&m=156437221703110 does).
On Tue 30-07-19 10:36:59, John Lenton wrote:
> We've observed this in arch with util-linux 2.34, and ubuntu 19.10
> (eoan ermine) with util-linux 2.33.
>
> just to be clear, the initial reports didn't involve a zany loop of
> mounts, but were triggered by effectively the same thing as systemd
> booted a system with a lot of snaps. [...] FWIW, systemd versions were
> 244 and 242 for those systems, respectively.

Thanks for the info! So I think I see what's going on. The two mounts race
like:

MOUNT1                                    MOUNT2
num = ioctl(LOOP_CTL_GET_FREE)
                                          num = ioctl(LOOP_CTL_GET_FREE)
ioctl("/dev/loop$num", LOOP_SET_FD, ..)
  - returns OK
                                          ioctl("/dev/loop$num", LOOP_SET_FD, ..)
                                            - acquires exclusive loop$num
                                              reference
mount("/dev/loop$num", ...)
  - sees exclusive reference from MOUNT2
    and fails
                                            - sees loop device is already
                                              bound and fails

It is a bug in the scheme I've chosen that a racing LOOP_SET_FD can block
a perfectly valid mount. I'll think about how to fix this...

								Honza
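The window in Jan's race diagram sits between LOOP_CTL_GET_FREE and LOOP_SET_FD: two processes can be handed the same free device number. The retry that util-linux performs when it loses that race can be sketched from userspace roughly as follows. This is a minimal illustration, not util-linux's actual code; `attach_backing_file` and `max_tries` are made-up names, and the ioctl numbers are the ones defined in `<linux/loop.h>`. Running it for real requires root and a loop-enabled kernel.

```python
import errno
import fcntl
import os

# ioctl request numbers from <linux/loop.h>
LOOP_SET_FD = 0x4C00
LOOP_CTL_GET_FREE = 0x4C82

def attach_backing_file(path, max_tries=16):
    """Bind `path` to a free loop device, retrying on EBUSY.

    Between LOOP_CTL_GET_FREE and LOOP_SET_FD another process may grab
    the same /dev/loopN; losing that race yields EBUSY, and the recovery
    is simply to ask for another free device and try again.
    """
    backing = os.open(path, os.O_RDWR)
    try:
        for _ in range(max_tries):
            ctl = os.open("/dev/loop-control", os.O_RDWR)
            try:
                # Returns the index of a free loop device.
                num = fcntl.ioctl(ctl, LOOP_CTL_GET_FREE)
            finally:
                os.close(ctl)
            dev = os.open(f"/dev/loop{num}", os.O_RDWR)
            try:
                fcntl.ioctl(dev, LOOP_SET_FD, backing)
                return f"/dev/loop{num}"  # bound successfully
            except OSError as err:
                if err.errno != errno.EBUSY:
                    raise
                # Lost the race for this device; retry with a fresh one.
            finally:
                os.close(dev)
        raise RuntimeError("no free loop device after %d tries" % max_tries)
    finally:
        os.close(backing)
```

The regression the thread discusses is that, with the original fix applied, the loser of this race could also make the *winner's* subsequent mount(2) fail, which no amount of retrying in the loser repairs.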
On Tue 30-07-19 12:16:46, Jan Kara wrote:
> On Tue 30-07-19 10:36:59, John Lenton wrote:
> > [...]
>
> Thanks for info! So I think I see what's going on. The two mounts race
> like: [...]
>
> It is a bug in the scheme I've chosen that racing LOOP_SET_FD can block
> perfectly valid mount. I'll think how to fix this...

So how about the attached patch? It fixes the regression for me.

								Honza
at 21:36, Jan Kara <jack@suse.cz> wrote:
> On Tue 30-07-19 12:16:46, Jan Kara wrote:
>> [...]
>>
>> It is a bug in the scheme I've chosen that racing LOOP_SET_FD can block
>> perfectly valid mount. I'll think how to fix this...
>
> So how about attached patch? It fixes the regression for me.

Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

> 								Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
> <0001-loop-Fix-mount-2-failure-due-to-race-with-LOOP_SET_F.patch>
On 7/30/19 7:36 AM, Jan Kara wrote:
> On Tue 30-07-19 12:16:46, Jan Kara wrote:
>> [...]
>>
>> It is a bug in the scheme I've chosen that racing LOOP_SET_FD can block
>> perfectly valid mount. I'll think how to fix this...
>
> So how about attached patch? It fixes the regression for me.

Jan, I've applied this patch - and also marked it for stable, so it'll
end up in 5.2-stable. Thanks.
On Tue, 30 Jul 2019 at 20:17, Jens Axboe <axboe@kernel.dk> wrote:
>
> Jan, I've applied this patch - and also marked it for stable, so it'll
> end up in 5.2-stable. Thanks.

Thank you all!
On Tue 30-07-19 13:17:28, Jens Axboe wrote:
> On 7/30/19 7:36 AM, Jan Kara wrote:
> > [...]
> > So how about attached patch? It fixes the regression for me.
>
> Jan, I've applied this patch - and also marked it for stable, so it'll
> end up in 5.2-stable. Thanks.

Thanks Jens!

								Honza
On 7/30/19 6:36 AM, Jan Kara wrote:
> On Tue 30-07-19 12:16:46, Jan Kara wrote:
>> [...]
>>
>> It is a bug in the scheme I've chosen that racing LOOP_SET_FD can block
>> perfectly valid mount. I'll think how to fix this...
>
> So how about attached patch? It fixes the regression for me.

Hi Jan,

A new kernel warning is triggered by blktests block/001 that did not happen
without this patch. Reverting commit 89e524c04fa9 ("loop: Fix mount(2)
failure due to race with LOOP_SET_FD") makes that kernel warning disappear.
Is this reproducible on your setup?

Thanks,

Bart.

kernel: sr 10:0:0:0: [sr1] scsi-1 drive
kernel: scsi host9: scsi_runtime_resume
kernel: sr 10:0:0:0: Attached scsi CD-ROM sr1
kernel: driver: 'sr': driver_bound: bound to device '10:0:0:0'
kernel: bus: 'scsi': really_probe: bound device 10:0:0:0 to driver sr
kernel: scsi 9:0:0:0: CD-ROM Linux scsi_debug 0188 PQ: 0 ANSI: 7
kernel: WARNING: CPU: 5 PID: 907 at fs/block_dev.c:1899 __blkdev_put+0x396/0x3a0
systemd-udevd[906]: Process 'cdrom_id --lock-media /dev/sr2' failed with exit code 1.
kernel: bus: 'scsi': driver_probe_device: matched device 9:0:0:0 with driver sr
kernel: Modules linked in: scsi_debug crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper crypto_simd joydev cryptd virtio_balloon virtio_console serio_raw qemu_fw_cfg iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 hid_generic usbhid virtio_net psmouse hid ahci net_failover libahci i2c_piix4 floppy virtio_blk virtio_scsi failover pata_acpi [last unloaded: scsi_debug]
kernel: bus: 'scsi': really_probe: probing driver sr with device 9:0:0:0
kernel: sr 10:0:0:0: Attached scsi generic sg2 type 5
kernel: CPU: 5 PID: 907 Comm: systemd-udevd Not tainted 5.3.0-rc3-dbg+ #5
kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
kernel: RIP: 0010:__blkdev_put+0x396/0x3a0
kernel: Code: 49 8d 7d 08 e8 6b c8 f4 ff 49 8b 45 08 48 85 c0 0f 84 9c fe ff ff 8b b5 64 ff ff ff 48 8b bd 70 ff ff ff ff d0 e9 88 fe ff ff <0f> 0b e9 54 fd ff ff 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 57 41
kernel: RSP: 0018:ffff8881111afd18 EFLAGS: 00010202
kernel: RAX: 0000000000000000 RBX: ffff888035472e00 RCX: ffffffff814a4783
kernel: RDX: 0000000000000002 RSI: dffffc0000000000 RDI: ffff888035472ea8
kernel: RBP: ffff8881111afde0 R08: ffffed1006a8e5c4 R09: ffffed1006a8e5c4
kernel: R10: ffff8881111afd08 R11: ffff888035472e1f R12: 0000000000000000
kernel: R13: 00000000080a005d R14: ffff888035472e18 R15: ffff888115e5f028
kernel: sr 9:0:0:0: Power-on or device reset occurred
kernel: FS: 00007f113411d8c0(0000) GS:ffff88811b740000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
systemd-udevd[1257]: Process '/usr/bin/sg_inq --export /dev/sr1' failed with exit code 15.
systemd-udevd[1257]: Process 'cdrom_id --lock-media /dev/sr1' failed with exit code 1.
systemd-udevd[906]: Process '/usr/bin/sg_inq --export /dev/sr3' failed with exit code 15.
systemd-udevd[906]: Process 'cdrom_id --lock-media /dev/sr3' failed with exit code 1.
kernel: sr 9:0:0:0: [sr2] scsi-1 drive
kernel: CR2: 00007ffc4ac7fc98 CR3: 000000011127a001 CR4: 0000000000360ee0
kernel: debugfs: Directory 'sr2' with parent 'block' already present!
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: Call Trace:
kernel:  ? __kasan_check_write+0x14/0x20
kernel: sr 9:0:0:0: Attached scsi CD-ROM sr2
kernel:  ? __mutex_unlock_slowpath+0xbd/0x400
kernel:  ? bd_set_size+0x70/0x70
kernel:  ? preempt_count_sub+0x18/0xc0
kernel:  blkdev_put+0x62/0x200
kernel:  blkdev_close+0x49/0x50
kernel:  __fput+0x15c/0x390
kernel: driver: 'sr': driver_bound: bound to device '9:0:0:0'
systemd-udevd[913]: Process '/usr/bin/sg_inq --export /dev/sr1' failed with exit code 15.
kernel:  ____fput+0xe/0x10
kernel:  task_work_run+0xc3/0xf0
kernel: bus: 'scsi': really_probe: bound device 9:0:0:0 to driver sr
kernel:  exit_to_usermode_loop+0xee/0xf0
systemd-udevd[913]: Process 'cdrom_id --lock-media /dev/sr1' failed with exit code 1.
kernel: sr 9:0:0:0: Attached scsi generic sg2 type 5
kernel:  do_syscall_64+0x213/0x270
systemd-udevd[1004]: Process '/usr/bin/sg_inq --export /dev/sr2' failed with exit code 15.
systemd-udevd[906]: Process '/usr/bin/sg_inq --export /dev/sr1' failed with exit code 15.
kernel:  entry_SYSCALL_64_after_hwframe+0x49/0xbe
kernel: RIP: 0033:0x7f11343a0674
kernel: Code: eb 8d e8 bf 1b 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8d 05 69 cd 0d 00 8b 00 85 c0 75 13 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3c c3 0f 1f 00 53 89 fb 48 83 ec 10 e8 44 e7
kernel: RSP: 002b:00007ffc4ac80d78 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
kernel: RAX: 0000000000000000 RBX: 0000000000000006 RCX: 00007f11343a0674
kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000006
kernel: RBP: 00007f113411d7e0 R08: 000055f286c9f4b0 R09: 0000000000000000
kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
kernel: R13: 00000000000061b0 R14: 000055f287600fd0 R15: 000055f28757f3d0
kernel: irq event stamp: 2072492
kernel: hardirqs last enabled at (2072491): [<ffffffff81ef8b1c>] _raw_spin_unlock_irq+0x2c/0x50
kernel: hardirqs last disabled at (2072492): [<ffffffff8100291a>] trace_hardirqs_off_thunk+0x1a/0x20
kernel: softirqs last enabled at (2069158): [<ffffffff81043a03>] fpu__copy+0xe3/0x470
kernel: softirqs last disabled at (2069156): [<ffffffff81043975>] fpu__copy+0x55/0x470
kernel: ---[ end trace 352d17ea465743b6 ]---
systemd-udevd[1004]: Process 'cdrom_id --lock-media /dev/sr2' failed with exit code 1.
On 2019/08/06 1:41, Bart Van Assche wrote:
> Hi Jan,
>
> A new kernel warning is triggered by blktests block/001 that did not happen
> without this patch. Reverting commit 89e524c04fa9 ("loop: Fix mount(2)
> failure due to race with LOOP_SET_FD") makes that kernel warning disappear.
> Is this reproducible on your setup?
>
> kernel: WARNING: CPU: 5 PID: 907 at fs/block_dev.c:1899 __blkdev_put+0x396/0x3a0

Hmm, this is reported as the one below.

WARNING in __blkdev_put (2)
https://syzkaller.appspot.com/bug?id=1ac7a7a8440522302ccb634ba69f8e1e6f203902
On Mon 05-08-19 09:41:39, Bart Van Assche wrote:
> On 7/30/19 6:36 AM, Jan Kara wrote:
> > On Tue 30-07-19 12:16:46, Jan Kara wrote:
> > > On Tue 30-07-19 10:36:59, John Lenton wrote:
> > > > On Tue, 30 Jul 2019 at 10:29, Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > Thanks for the notice and the references. What's your version of
> > > > > util-linux? What your test script does is indeed racy. You have there:
> > > > >
> > > > > echo Running:
> > > > > for i in {a..z}{a..z}; do
> > > > >     mount $i.squash /mnt/$i &
> > > > > done
> > > > >
> > > > > So all mount(8) commands will run in parallel and race to set up loop
> > > > > devices with LOOP_SET_FD and mount them. However, util-linux (at least in
> > > > > the current version) seems to handle EBUSY from LOOP_SET_FD just fine and
> > > > > retries with a new loop device. So at this point I don't see why the patch
> > > > > makes a difference... I guess I'll need to reproduce it and see what's
> > > > > going on in detail.
> > > >
> > > > We've observed this in Arch with util-linux 2.34, and Ubuntu 19.10
> > > > (eoan ermine) with util-linux 2.33.
> > > >
> > > > Just to be clear, the initial reports didn't involve a zany loop of
> > > > mounts, but were triggered by effectively the same thing as systemd
> > > > booted a system with a lot of snaps. The reproducer just tries to make
> > > > things simpler to reproduce :-). FWIW, the systemd versions were 244 and
> > > > 242 for those systems, respectively.
> > >
> > > Thanks for the info! So I think I see what's going on. The two mounts race
> > > like:
> > >
> > > MOUNT1                                  MOUNT2
> > > num = ioctl(LOOP_CTL_GET_FREE)
> > >                                         num = ioctl(LOOP_CTL_GET_FREE)
> > > ioctl("/dev/loop$num", LOOP_SET_FD, ..)
> > >   - returns OK
> > >                                         ioctl("/dev/loop$num", LOOP_SET_FD, ..)
> > >                                           - acquires exclusive loop$num
> > >                                             reference
> > > mount("/dev/loop$num", ...)
> > >   - sees exclusive reference from MOUNT2 and fails
> > >                                           - sees loop device is already
> > >                                             bound and fails
> > >
> > > It is a bug in the scheme I've chosen that a racing LOOP_SET_FD can block
> > > a perfectly valid mount. I'll think about how to fix this...
> >
> > So how about the attached patch? It fixes the regression for me.

Hi Bart,

> A new kernel warning is triggered by blktests block/001 that did not happen
> without this patch. Reverting commit 89e524c04fa9 ("loop: Fix mount(2)
> failure due to race with LOOP_SET_FD") makes that kernel warning disappear.
> Is this reproducible on your setup?

Thanks for the report! Hum, no, the warning doesn't seem to trigger in my
test VM. But reviewing the mentioned commit with a fresh head, I can see
where I made a mistake during my conversion of blkdev_get(). Does the
attached patch fix the warning for you?

								Honza
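The MOUNT1/MOUNT2 interleaving in the diagram above can be replayed deterministically with a small toy model (plain Python, not kernel code; the `LoopDev` class and its method names are illustrative inventions, sketching what the 89e524c04fa9 scheme did):

```python
class LoopDev:
    """Toy state machine for one /dev/loopN: a bound flag plus at most
    one exclusive opener. This is a simplification, not kernel code."""

    def __init__(self):
        self.bound = False   # has a backing file been attached?
        self.excl = None     # current exclusive reference holder, if any

    def set_fd_acquire(self, who):
        # Under 89e524c04fa9, LOOP_SET_FD first takes an exclusive reference.
        if self.excl is not None:
            return "EBUSY"
        self.excl = who
        return "OK"

    def set_fd_finish(self, who):
        # With the reference held, check whether the device is still free.
        assert self.excl == who
        self.excl = None
        if self.bound:
            return "EBUSY"   # lost the race: someone already bound it
        self.bound = True
        return "OK"

    def mount(self, who):
        # mount() fails if a different task holds an exclusive reference.
        if self.excl is not None and self.excl != who:
            return "EBUSY"
        return "OK" if self.bound else "ENXIO"

dev = LoopDev()
assert dev.set_fd_acquire("MOUNT1") == "OK"
assert dev.set_fd_finish("MOUNT1") == "OK"     # MOUNT1 binds loop$num
assert dev.set_fd_acquire("MOUNT2") == "OK"    # MOUNT2 grabs exclusive ref
assert dev.mount("MOUNT1") == "EBUSY"          # MOUNT1's mount() fails
assert dev.set_fd_finish("MOUNT2") == "EBUSY"  # MOUNT2 sees device bound
```

The model makes the bug visible: both tasks end up failing even though the device was successfully bound, which is exactly the regression the follow-up patch addresses.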
On 8/7/19 2:45 AM, Jan Kara wrote:
> On Mon 05-08-19 09:41:39, Bart Van Assche wrote:
>> A new kernel warning is triggered by blktests block/001 that did not happen
>> without this patch. Reverting commit 89e524c04fa9 ("loop: Fix mount(2)
>> failure due to race with LOOP_SET_FD") makes that kernel warning disappear.
>> Is this reproducible on your setup?
>
> Thanks for the report! Hum, no, the warning doesn't seem to trigger in my
> test VM. But reviewing the mentioned commit with a fresh head, I can see
> where I made a mistake during my conversion of blkdev_get(). Does the
> attached patch fix the warning for you?

Hi Jan,

That patch indeed fixes the warning. Feel free to add my Tested-by.

Thanks,

Bart.
On 8/7/19 2:45 AM, Jan Kara wrote:
> On Mon 05-08-19 09:41:39, Bart Van Assche wrote:
>> On 7/30/19 6:36 AM, Jan Kara wrote:
>>> On Tue 30-07-19 12:16:46, Jan Kara wrote:
>>>> On Tue 30-07-19 10:36:59, John Lenton wrote:
>>>>> On Tue, 30 Jul 2019 at 10:29, Jan Kara <jack@suse.cz> wrote:
>>>>>>
>>>>>> Thanks for the notice and the references. What's your version of
>>>>>> util-linux? What your test script does is indeed racy. You have there:
>>>>>>
>>>>>> echo Running:
>>>>>> for i in {a..z}{a..z}; do
>>>>>>     mount $i.squash /mnt/$i &
>>>>>> done
>>>>>>
>>>>>> So all mount(8) commands will run in parallel and race to set up loop
>>>>>> devices with LOOP_SET_FD and mount them. However, util-linux (at least in
>>>>>> the current version) seems to handle EBUSY from LOOP_SET_FD just fine and
>>>>>> retries with a new loop device. So at this point I don't see why the patch
>>>>>> makes a difference... I guess I'll need to reproduce it and see what's
>>>>>> going on in detail.
>>>>>
>>>>> We've observed this in Arch with util-linux 2.34, and Ubuntu 19.10
>>>>> (eoan ermine) with util-linux 2.33.
>>>>>
>>>>> Just to be clear, the initial reports didn't involve a zany loop of
>>>>> mounts, but were triggered by effectively the same thing as systemd
>>>>> booted a system with a lot of snaps. The reproducer just tries to make
>>>>> things simpler to reproduce :-). FWIW, the systemd versions were 244 and
>>>>> 242 for those systems, respectively.
>>>>
>>>> Thanks for the info! So I think I see what's going on. The two mounts race
>>>> like:
>>>>
>>>> MOUNT1                                  MOUNT2
>>>> num = ioctl(LOOP_CTL_GET_FREE)
>>>>                                         num = ioctl(LOOP_CTL_GET_FREE)
>>>> ioctl("/dev/loop$num", LOOP_SET_FD, ..)
>>>>   - returns OK
>>>>                                         ioctl("/dev/loop$num", LOOP_SET_FD, ..)
>>>>                                           - acquires exclusive loop$num
>>>>                                             reference
>>>> mount("/dev/loop$num", ...)
>>>>   - sees exclusive reference from MOUNT2 and fails
>>>>                                           - sees loop device is already
>>>>                                             bound and fails
>>>>
>>>> It is a bug in the scheme I've chosen that a racing LOOP_SET_FD can block
>>>> a perfectly valid mount. I'll think about how to fix this...
>>>
>>> So how about the attached patch? It fixes the regression for me.
>
> Hi Bart,
>
>> A new kernel warning is triggered by blktests block/001 that did not happen
>> without this patch. Reverting commit 89e524c04fa9 ("loop: Fix mount(2)
>> failure due to race with LOOP_SET_FD") makes that kernel warning disappear.
>> Is this reproducible on your setup?
>
> Thanks for the report! Hum, no, the warning doesn't seem to trigger in my
> test VM. But reviewing the mentioned commit with a fresh head, I can see
> where I made a mistake during my conversion of blkdev_get(). Does the
> attached patch fix the warning for you?

I've queued this up, Jan. Thanks for taking a look at it.
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 102d79575895..f11b7dc16e9d 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -945,9 +945,20 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	if (!file)
 		goto out;
 
+	/*
+	 * If we don't hold exclusive handle for the device, upgrade to it
+	 * here to avoid changing device under exclusive owner.
+	 */
+	if (!(mode & FMODE_EXCL)) {
+		bdgrab(bdev);
+		error = blkdev_get(bdev, mode | FMODE_EXCL, loop_set_fd);
+		if (error)
+			goto out_putf;
+	}
+
 	error = mutex_lock_killable(&loop_ctl_mutex);
 	if (error)
-		goto out_putf;
+		goto out_bdev;
 
 	error = -EBUSY;
 	if (lo->lo_state != Lo_unbound)
@@ -1012,10 +1023,15 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	mutex_unlock(&loop_ctl_mutex);
 	if (partscan)
 		loop_reread_partitions(lo, bdev);
+	if (!(mode & FMODE_EXCL))
+		blkdev_put(bdev, mode | FMODE_EXCL);
 	return 0;
 
 out_unlock:
 	mutex_unlock(&loop_ctl_mutex);
+out_bdev:
+	if (!(mode & FMODE_EXCL))
+		blkdev_put(bdev, mode | FMODE_EXCL);
 out_putf:
 	fput(file);
 out:
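To see what the patch above buys us, here is a toy model (plain Python, not kernel code; `BlockDev` and all names are hypothetical stand-ins) of LOOP_SET_FD upgrading to an exclusive reference before binding, and dropping that temporary reference on both the success and error paths:

```python
class BlockDev:
    """Toy stand-in for a struct block_device: tracks one exclusive holder
    and the attached backing file. A simplification, not kernel code."""

    def __init__(self):
        self.excl_holder = None  # who holds the device with FMODE_EXCL
        self.backing = None      # backing file once LOOP_SET_FD succeeds

    def blkdev_get_excl(self, holder):
        # Models blkdev_get(bdev, mode | FMODE_EXCL, holder): fails if a
        # different exclusive owner already exists.
        if self.excl_holder not in (None, holder):
            return False
        self.excl_holder = holder
        return True

    def blkdev_put_excl(self, holder):
        # Models blkdev_put(bdev, mode | FMODE_EXCL).
        assert self.excl_holder == holder
        self.excl_holder = None

def loop_set_fd(bdev, holder, backing_file):
    # Upgrade to an exclusive reference first, as the FMODE_EXCL hunk does.
    if not bdev.blkdev_get_excl(holder):
        return "EBUSY"                    # e.g. a mounted fs owns the device
    if bdev.backing is not None:          # the Lo_unbound check
        bdev.blkdev_put_excl(holder)      # error path: out_bdev
        return "EBUSY"
    bdev.backing = backing_file
    bdev.blkdev_put_excl(holder)          # success path drops the temp ref too
    return "OK"

dev = BlockDev()
assert dev.blkdev_get_excl("udf-mount")                     # fs holds it
assert loop_set_fd(dev, "loop_set_fd", "a.img") == "EBUSY"  # cannot rebind
dev.blkdev_put_excl("udf-mount")
assert loop_set_fd(dev, "loop_set_fd", "a.img") == "OK"
```

The point of the temporary upgrade is that LOOP_SET_FD can no longer swap the backing store out from under an exclusive opener such as a filesystem mid-mount, while an unowned device still binds normally.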
Loop module allows calling LOOP_SET_FD while there are other openers of
the loop device. Even exclusive ones. This can lead to weird
consequences such as kernel deadlocks like:

mount_bdev()					lo_ioctl()
  udf_fill_super()
    udf_load_vrs()
      sb_set_blocksize() - sets desired block size B
      udf_tread()
        sb_bread()
          __bread_gfp(bdev, block, B)
						  loop_set_fd()
						    set_blocksize()
      - now __getblk_slow() indefinitely loops
        because B != bdev block size

Fix the problem by disallowing the LOOP_SET_FD ioctl when there are
exclusive openers of a loop device.

[Deliberately chosen not to CC stable as a user with privileges to
trigger this race has other means of taking the system down, and this
has the potential of breaking some weird userspace setup]

Reported-and-tested-by: syzbot+10007d66ca02b08f0e60@syzkaller.appspotmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
---
 drivers/block/loop.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Hi Jens!

What do you think about this patch? It fixes the problem, but it also
changes user-visible behavior, so there is a chance it breaks some
existing setup (although I have a hard time coming up with a realistic
scenario where it would matter).

Alternatively, we could change the getblk() code to handle a changing
block size. That would fix the particular issue syzkaller found as well,
but I'm not sure what else is broken when the block device changes while
the fs driver is working with it.

								Honza
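The __getblk_slow() livelock in the commit message can be illustrated with a toy model (plain Python, deliberately simplified from the real fs/buffer.c logic): the caller keeps asking for buffers of size B while buffer creation uses the device's current block size, so once LOOP_SET_FD has changed that size underneath the filesystem, the two never meet.

```python
def getblk_slow(requested_size, bdev_block_size, max_spins=100):
    """Return the iteration count on success, or None after max_spins.
    The real kernel loop has no such bound and would spin forever."""
    for i in range(max_spins):
        # The lookup only matches buffers of the requested size, while
        # buffer creation (grow_buffers) uses the bdev's current block size.
        created_size = bdev_block_size
        if created_size == requested_size:
            return i + 1
    return None  # requested size can never be satisfied: livelock

assert getblk_slow(2048, 2048) == 1    # sizes agree: first pass succeeds
assert getblk_slow(2048, 512) is None  # B != bdev block size: never succeeds
```

This is why the fix targets the ioctl side rather than getblk(): once no one can change the device's block size under an exclusive opener, the mismatch that feeds the spin cannot arise.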