Message ID | 20230525101624.15814-4-jack@suse.cz (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | fs: Fix directory corruption when moving directories | expand |
On Thu, May 25, 2023 at 12:16:10PM +0200, Jan Kara wrote: > Currently the locking order of inode locks for directories that are not > in ancestor relationship is not defined because all operations that > needed to lock two directories like this were serialized by > sb->s_vfs_rename_mutex. However some filesystems need to lock two > subdirectories for RENAME_EXCHANGE operations and for this we need the > locking order established even for two tree-unrelated directories. > Provide a helper function lock_two_inodes() that establishes lock > ordering for any two inodes and use it in lock_two_directories(). > > CC: stable@vger.kernel.org > Signed-off-by: Jan Kara <jack@suse.cz> > --- > fs/inode.c | 34 ++++++++++++++++++++++++++++++++++ > fs/internal.h | 2 ++ > fs/namei.c | 4 ++-- > 3 files changed, 38 insertions(+), 2 deletions(-) > > diff --git a/fs/inode.c b/fs/inode.c > index 577799b7855f..2015fa50d34a 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -1103,6 +1103,40 @@ void discard_new_inode(struct inode *inode) > } > EXPORT_SYMBOL(discard_new_inode); > > +/** > + * lock_two_inodes - lock two inodes (may be regular files but also dirs) > + * > + * Lock any non-NULL argument. The caller must make sure that if he is passing > + * in two directories, one is not ancestor of the other. Zero, one or two > + * objects may be locked by this function. > + * > + * @inode1: first inode to lock > + * @inode2: second inode to lock > + * @subclass1: inode lock subclass for the first lock obtained > + * @subclass2: inode lock subclass for the second lock obtained > + */ > +void lock_two_inodes(struct inode *inode1, struct inode *inode2, > + unsigned subclass1, unsigned subclass2) > +{ > + if (!inode1 || !inode2) > + goto lock; Before this change in lock_two_nondirectories(struct inode *inode1, struct inode *inode2) the swap() would cause the non-NULL inode to always be locked with I_MUTEX_NONDIR2. Now it can be either I_MUTEX_NORMAL or I_MUTEX_NONDIR2. Is that change intentional?
On Fri 26-05-23 11:45:15, Christian Brauner wrote: > On Thu, May 25, 2023 at 12:16:10PM +0200, Jan Kara wrote: > > Currently the locking order of inode locks for directories that are not > > in ancestor relationship is not defined because all operations that > > needed to lock two directories like this were serialized by > > sb->s_vfs_rename_mutex. However some filesystems need to lock two > > subdirectories for RENAME_EXCHANGE operations and for this we need the > > locking order established even for two tree-unrelated directories. > > Provide a helper function lock_two_inodes() that establishes lock > > ordering for any two inodes and use it in lock_two_directories(). > > > > CC: stable@vger.kernel.org > > Signed-off-by: Jan Kara <jack@suse.cz> > > --- > > fs/inode.c | 34 ++++++++++++++++++++++++++++++++++ > > fs/internal.h | 2 ++ > > fs/namei.c | 4 ++-- > > 3 files changed, 38 insertions(+), 2 deletions(-) > > > > diff --git a/fs/inode.c b/fs/inode.c > > index 577799b7855f..2015fa50d34a 100644 > > --- a/fs/inode.c > > +++ b/fs/inode.c > > @@ -1103,6 +1103,40 @@ void discard_new_inode(struct inode *inode) > > } > > EXPORT_SYMBOL(discard_new_inode); > > > > +/** > > + * lock_two_inodes - lock two inodes (may be regular files but also dirs) > > + * > > + * Lock any non-NULL argument. The caller must make sure that if he is passing > > + * in two directories, one is not ancestor of the other. Zero, one or two > > + * objects may be locked by this function. > > + * > > + * @inode1: first inode to lock > > + * @inode2: second inode to lock > > + * @subclass1: inode lock subclass for the first lock obtained > > + * @subclass2: inode lock subclass for the second lock obtained > > + */ > > +void lock_two_inodes(struct inode *inode1, struct inode *inode2, > > + unsigned subclass1, unsigned subclass2) > > +{ > > + if (!inode1 || !inode2) > > + goto lock; > > Before this change in > > lock_two_nondirectories(struct inode *inode1, struct inode *inode2) > > the swap() would cause the non-NULL inode to always be locked with > I_MUTEX_NONDIR2. Now it can be either I_MUTEX_NORMAL or I_MUTEX_NONDIR2. > Is that change intentional? Kind of. I don't think we really care so I didn't bother to complicate the code for this. If you think keeping the lockdep class consistent is worth it, I can modify the patch... Honza
On Mon, May 29, 2023 at 02:41:31PM +0200, Jan Kara wrote: > On Fri 26-05-23 11:45:15, Christian Brauner wrote: > > On Thu, May 25, 2023 at 12:16:10PM +0200, Jan Kara wrote: > > > Currently the locking order of inode locks for directories that are not > > > in ancestor relationship is not defined because all operations that > > > needed to lock two directories like this were serialized by > > > sb->s_vfs_rename_mutex. However some filesystems need to lock two > > > subdirectories for RENAME_EXCHANGE operations and for this we need the > > > locking order established even for two tree-unrelated directories. > > > Provide a helper function lock_two_inodes() that establishes lock > > > ordering for any two inodes and use it in lock_two_directories(). > > > > > > CC: stable@vger.kernel.org > > > Signed-off-by: Jan Kara <jack@suse.cz> > > > --- > > > fs/inode.c | 34 ++++++++++++++++++++++++++++++++++ > > > fs/internal.h | 2 ++ > > > fs/namei.c | 4 ++-- > > > 3 files changed, 38 insertions(+), 2 deletions(-) > > > > > > diff --git a/fs/inode.c b/fs/inode.c > > > index 577799b7855f..2015fa50d34a 100644 > > > --- a/fs/inode.c > > > +++ b/fs/inode.c > > > @@ -1103,6 +1103,40 @@ void discard_new_inode(struct inode *inode) > > > } > > > EXPORT_SYMBOL(discard_new_inode); > > > > > > +/** > > > + * lock_two_inodes - lock two inodes (may be regular files but also dirs) > > > + * > > > + * Lock any non-NULL argument. The caller must make sure that if he is passing > > > + * in two directories, one is not ancestor of the other. Zero, one or two > > > + * objects may be locked by this function. > > > + * > > > + * @inode1: first inode to lock > > > + * @inode2: second inode to lock > > > + * @subclass1: inode lock subclass for the first lock obtained > > > + * @subclass2: inode lock subclass for the second lock obtained > > > + */ > > > +void lock_two_inodes(struct inode *inode1, struct inode *inode2, > > > + unsigned subclass1, unsigned subclass2) > > > +{ > > > + if (!inode1 || !inode2) > > > + goto lock; > > > > Before this change in > > > > lock_two_nondirectories(struct inode *inode1, struct inode *inode2) > > > > the swap() would cause the non-NULL inode to always be locked with > > I_MUTEX_NONDIR2. Now it can be either I_MUTEX_NORMAL or I_MUTEX_NONDIR2. > > Is that change intentional? > > Kind of. I don't think we really care so I didn't bother to complicate the > code for this. If you think keeping the lockdep class consistent is worth > it, I can modify the patch... Either a short comment or consistent lockdep class would be nice. I know it probably doesn't matter much but otherwise someone may end up wondering whether that's ok or not.
diff --git a/fs/inode.c b/fs/inode.c index 577799b7855f..2015fa50d34a 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1103,6 +1103,40 @@ void discard_new_inode(struct inode *inode) } EXPORT_SYMBOL(discard_new_inode); +/** + * lock_two_inodes - lock two inodes (may be regular files but also dirs) + * + * Lock any non-NULL argument. The caller must make sure that if he is passing + * in two directories, one is not ancestor of the other. Zero, one or two + * objects may be locked by this function. + * + * @inode1: first inode to lock + * @inode2: second inode to lock + * @subclass1: inode lock subclass for the first lock obtained + * @subclass2: inode lock subclass for the second lock obtained + */ +void lock_two_inodes(struct inode *inode1, struct inode *inode2, + unsigned subclass1, unsigned subclass2) +{ + if (!inode1 || !inode2) + goto lock; + + /* + * If one object is directory and the other is not, we must make sure + * to lock directory first as the other object may be its child. + */ + if (S_ISDIR(inode2->i_mode) == S_ISDIR(inode1->i_mode)) { + if (inode1 > inode2) + swap(inode1, inode2); + } else if (!S_ISDIR(inode1->i_mode)) + swap(inode1, inode2); +lock: + if (inode1) + inode_lock_nested(inode1, subclass1); + if (inode2 && inode2 != inode1) + inode_lock_nested(inode2, subclass2); +} + /** * lock_two_nondirectories - take two i_mutexes on non-directory objects * diff --git a/fs/internal.h b/fs/internal.h index bd3b2810a36b..377030a50aca 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -152,6 +152,8 @@ extern long prune_icache_sb(struct super_block *sb, struct shrink_control *sc); int dentry_needs_remove_privs(struct mnt_idmap *, struct dentry *dentry); bool in_group_or_capable(struct mnt_idmap *idmap, const struct inode *inode, vfsgid_t vfsgid); +void lock_two_inodes(struct inode *inode1, struct inode *inode2, + unsigned subclass1, unsigned subclass2); /* * fs-writeback.c diff --git a/fs/namei.c b/fs/namei.c index e4fe0879ae55..148570aabe74 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3028,8 +3028,8 @@ static struct dentry *lock_two_directories(struct dentry *p1, struct dentry *p2) return p; } - inode_lock_nested(p1->d_inode, I_MUTEX_PARENT); - inode_lock_nested(p2->d_inode, I_MUTEX_PARENT2); + lock_two_inodes(p1->d_inode, p2->d_inode, + I_MUTEX_PARENT, I_MUTEX_PARENT2); return NULL; }
Currently the locking order of inode locks for directories that are not in ancestor relationship is not defined because all operations that needed to lock two directories like this were serialized by sb->s_vfs_rename_mutex. However some filesystems need to lock two subdirectories for RENAME_EXCHANGE operations and for this we need the locking order established even for two tree-unrelated directories. Provide a helper function lock_two_inodes() that establishes lock ordering for any two inodes and use it in lock_two_directories(). CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> --- fs/inode.c | 34 ++++++++++++++++++++++++++++++++++ fs/internal.h | 2 ++ fs/namei.c | 4 ++-- 3 files changed, 38 insertions(+), 2 deletions(-)