Message ID | 20220420131925.2464685-1-brauner@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fs: unset MNT_WRITE_HOLD on failure | expand |
On Wed, Apr 20, 2022 at 03:19:25PM +0200, Christian Brauner wrote: > After mnt_hold_writers() has been called we will always have set MNT_WRITE_HOLD > and consequently we always need to pair mnt_hold_writers() with > mnt_unhold_writers(). After the recent cleanup in [1] where Al switched from a > do-while to a for loop the cleanup currently fails to unset MNT_WRITE_HOLD for > the first mount that was changed. Fix this and make sure that the first mount > will be cleaned up and add some comments to make it more obvious. > > Reported-by: syzbot+10a16d1c43580983f6a2@syzkaller.appspotmail.com > Reported-by: syzbot+306090cfa3294f0bbfb3@syzkaller.appspotmail.com > Fixes: e257039f0fc7 ("mount_setattr(): clean the control flow and calling conventions") [1] > Link: https://lore.kernel.org/lkml/0000000000007cc21d05dd0432b8@google.com > Link: https://lore.kernel.org/lkml/00000000000080e10e05dd043247@google.com > Cc: Hillf Danton <hdanton@sina.com> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Al Viro <viro@zeniv.linux.org.uk> > Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> > --- > This should fix the syzbot issue. This is only relevant for making a > mount or mount tree read-only: > 1. successul recursive read-only mount tree change: > Cleanup loop isn't executed. > 2. failed recursive read-only mount tree change: > m will point to the mount we failed to handle. The cleanup loop will > run until p == m and then terminate. > 3. successful single read-only mount change: > Cleanup loop won't be executed. > 4. failed single read-only mount change: > m will point to mnt and the cleanup loop will terminate if p == m. > I don't think there's any other weird corner cases since we now that > MNT_WRITE_HOLD can only have been set by us as it requires > lock_mount_hash() which we hold. So unconditionally unsetting it is > fine. But please make sure to take a close look at the changed loop. > --- Unless I hear objections I'll route this upstream before -rc4 to get this fixed because it's pretty trivial to trigger this.
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
diff --git a/fs/namespace.c b/fs/namespace.c index a0a36bfa3aa0..afe2b64b14f1 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -4058,10 +4058,22 @@ static int mount_setattr_prepare(struct mount_kattr *kattr, struct mount *mnt) if (err) { struct mount *p; - for (p = mnt; p != m; p = next_mnt(p, mnt)) { + /* + * If we had to call mnt_hold_writers() MNT_WRITE_HOLD will + * be set in @mnt_flags. The loop unsets MNT_WRITE_HOLD for all + * mounts and needs to take care to include the first mount. + */ + for (p = mnt; p; p = next_mnt(p, mnt)) { /* If we had to hold writers unblock them. */ if (p->mnt.mnt_flags & MNT_WRITE_HOLD) mnt_unhold_writers(p); + + /* + * We're done once the first mount we changed got + * MNT_WRITE_HOLD unset. + */ + if (p == m) + break; } } return err;
After mnt_hold_writers() has been called we will always have set MNT_WRITE_HOLD and consequently we always need to pair mnt_hold_writers() with mnt_unhold_writers(). After the recent cleanup in [1] where Al switched from a do-while to a for loop the cleanup currently fails to unset MNT_WRITE_HOLD for the first mount that was changed. Fix this and make sure that the first mount will be cleaned up and add some comments to make it more obvious. Reported-by: syzbot+10a16d1c43580983f6a2@syzkaller.appspotmail.com Reported-by: syzbot+306090cfa3294f0bbfb3@syzkaller.appspotmail.com Fixes: e257039f0fc7 ("mount_setattr(): clean the control flow and calling conventions") [1] Link: https://lore.kernel.org/lkml/0000000000007cc21d05dd0432b8@google.com Link: https://lore.kernel.org/lkml/00000000000080e10e05dd043247@google.com Cc: Hillf Danton <hdanton@sina.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> --- This should fix the syzbot issue. This is only relevant for making a mount or mount tree read-only: 1. successul recursive read-only mount tree change: Cleanup loop isn't executed. 2. failed recursive read-only mount tree change: m will point to the mount we failed to handle. The cleanup loop will run until p == m and then terminate. 3. successful single read-only mount change: Cleanup loop won't be executed. 4. failed single read-only mount change: m will point to mnt and the cleanup loop will terminate if p == m. I don't think there's any other weird corner cases since we now that MNT_WRITE_HOLD can only have been set by us as it requires lock_mount_hash() which we hold. So unconditionally unsetting it is fine. But please make sure to take a close look at the changed loop. --- fs/namespace.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) base-commit: b2d229d4ddb17db541098b83524d901257e93845