Message ID | 20220218153249.406028-1-mszeredi@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | vfs: fix link vs. rename race |

On Fri, Feb 18, 2022 at 04:32:49PM +0100, Miklos Szeredi wrote:
> Reported-by: Xavier Roche <xavier.roche@algolia.com>

Just one minor detail for the records: this was tested by me but reported by
another Xavier. But that's not a big deal.

Reported-by: Xavier Grand <xavier.grand@algolia.com>
Tested-by: Xavier Roche <xavier.roche@algolia.com>

On Fri, Feb 18, 2022 at 04:32:49PM +0100, Miklos Szeredi wrote:
> There has been a longstanding race condition between rename(2) and link(2),
> when those operations are done in parallel:
>
> 1. Moving a file to an existing target file (eg. mv file target)
> 2. Creating a link from the target file to a third file (eg. ln target
>    link)
>
> By the time vfs_link() locks the target inode, it might already be unlinked
> by rename. This results in vfs_link() returning -ENOENT in order to
> prevent linking to already unlinked files. This check was introduced in
> v2.6.39 by commit aae8a97d3ec3 ("fs: Don't allow to create hardlink for
> deleted file").
>
> This breaks apparent atomicity of rename(2), which is described in
> standards and the man page:
>
>     "If newpath already exists, it will be atomically replaced, so that
>      there is no point at which another process attempting to access
>      newpath will find it missing."
>
> The simplest fix is to exclude renames for the complete link operation.
>
> This patch introduces a global rw_semaphore that is locked for read in
> rename and for write in link. To prevent excessive contention, do not take
> the lock in link on the first try. If the source of the link was found to
> be unlinked, then retry with the lock held.

AFAICS, that deadlocks if lock_rename() is taken in ecryptfs_rename()
(with lock_rename() already taken by its caller) after another thread
blocks trying to take your link_rwsem exclusive.

diff --git a/fs/namei.c b/fs/namei.c
index 3f1829b3ab5b..dd6908cee49d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -122,6 +122,8 @@
  * PATH_MAX includes the nul terminator --RR.
  */
 
+static DECLARE_RWSEM(link_rwsem);
+
 #define EMBEDDED_NAME_MAX	(PATH_MAX - offsetof(struct filename, iname))
 
 struct filename *
@@ -2961,6 +2963,8 @@ struct dentry *lock_rename(struct dentry *p1, struct dentry *p2)
 {
 	struct dentry *p;
 
+	down_read(&link_rwsem);
+
 	if (p1 == p2) {
 		inode_lock_nested(p1->d_inode, I_MUTEX_PARENT);
 		return NULL;
@@ -2995,6 +2999,8 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
 		inode_unlock(p2->d_inode);
 		mutex_unlock(&p1->d_sb->s_vfs_rename_mutex);
 	}
+
+	up_read(&link_rwsem);
 }
 EXPORT_SYMBOL(unlock_rename);
 
@@ -4456,6 +4462,7 @@ int do_linkat(int olddfd, struct filename *old, int newdfd,
 	struct path old_path, new_path;
 	struct inode *delegated_inode = NULL;
 	int how = 0;
+	bool lock = false;
 	int error;
 
 	if ((flags & ~(AT_SYMLINK_FOLLOW | AT_EMPTY_PATH)) != 0) {
@@ -4474,10 +4481,13 @@ int do_linkat(int olddfd, struct filename *old, int newdfd,
 
 	if (flags & AT_SYMLINK_FOLLOW)
 		how |= LOOKUP_FOLLOW;
+retry_lock:
+	if (lock)
+		down_write(&link_rwsem);
 retry:
 	error = filename_lookup(olddfd, old, how, &old_path, NULL);
 	if (error)
-		goto out_putnames;
+		goto out_unlock_link;
 
 	new_dentry = filename_create(newdfd, new, &new_path,
 					(how & LOOKUP_REVAL));
@@ -4511,8 +4521,16 @@ int do_linkat(int olddfd, struct filename *old, int newdfd,
 		how |= LOOKUP_REVAL;
 		goto retry;
 	}
+	if (!lock && error == -ENOENT) {
+		path_put(&old_path);
+		lock = true;
+		goto retry_lock;
+	}
 out_putpath:
 	path_put(&old_path);
+out_unlock_link:
+	if (lock)
+		up_write(&link_rwsem);
 out_putnames:
 	putname(old);
 	putname(new);