Message ID | 20241004102342.179434-2-amir73il@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Stash overlay real upper file in backing_file | expand |
On Fri, Oct 04, 2024 at 12:23:39PM +0200, Amir Goldstein wrote: > ovl_fsync() with !datasync opens a backing file from the top most dentry > in the stack, checks if this dentry is non-upper and skips the fsync. > > In case of an overlay dentry stack with lower data and lower metadata > above it, but without an upper metadata above it, the backing file is > opened from the top most lower metadata dentry and never used. > > Fix the helper ovl_real_fdget_meta() to return an empty struct fd in > that case to avoid the unneeded backing file open. Umm... Won't that screw the callers of ovl_real_fd()? I mean, here > @@ -395,7 +397,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync) > return ret; > > ret = ovl_real_fdget_meta(file, &real, !datasync); > - if (ret) > + if (ret || fd_empty(real)) > return ret; > you are taking account of that, but what of e.g. ret = ovl_real_fdget(file, &real); if (ret) return ret; /* * Overlay file f_pos is the master copy that is preserved * through copy up and modified on read/write, but only real * fs knows how to SEEK_HOLE/SEEK_DATA and real fs may impose * limitations that are more strict than ->s_maxbytes for specific * files, so we use the real file to perform seeks. */ ovl_inode_lock(inode); fd_file(real)->f_pos = file->f_pos; in ovl_llseek()? Get ovl_real_fdget_meta() called by ovl_real_fdget() and have it return 0 with NULL in fd_file(real), and you've got an oops right there, don't you? At the very least it's a bisect hazard...
On Fri, Oct 04, 2024 at 11:16:25PM +0100, Al Viro wrote: > On Fri, Oct 04, 2024 at 12:23:39PM +0200, Amir Goldstein wrote: > > ovl_fsync() with !datasync opens a backing file from the top most dentry > > in the stack, checks if this dentry is non-upper and skips the fsync. > > > > In case of an overlay dentry stack with lower data and lower metadata > > above it, but without an upper metadata above it, the backing file is > > opened from the top most lower metadata dentry and never used. > > > > Fix the helper ovl_real_fdget_meta() to return an empty struct fd in > > that case to avoid the unneeded backing file open. > > Umm... Won't that screw the callers of ovl_real_fd()? > > I mean, here > > > @@ -395,7 +397,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync) > > return ret; > > > > ret = ovl_real_fdget_meta(file, &real, !datasync); > > - if (ret) > > + if (ret || fd_empty(real)) > > return ret; > > > > you are taking account of that, but what of e.g. > ret = ovl_real_fdget(file, &real); > if (ret) > return ret; > > /* > * Overlay file f_pos is the master copy that is preserved > * through copy up and modified on read/write, but only real > * fs knows how to SEEK_HOLE/SEEK_DATA and real fs may impose > * limitations that are more strict than ->s_maxbytes for specific > * files, so we use the real file to perform seeks. > */ > ovl_inode_lock(inode); > fd_file(real)->f_pos = file->f_pos; > in ovl_llseek()? Get ovl_real_fdget_meta() called by ovl_real_fdget() and > have it return 0 with NULL in fd_file(real), and you've got an oops right > there, don't you? I see... so you rely upon that thing never happening when the last argument of ovl_real_fdget_meta() is false, including the call from ovl_real_fdget(). I still don't like the calling conventions, TBH. Let me think a bit...
On Fri, Oct 04, 2024 at 11:28:11PM +0100, Al Viro wrote: > > /* > > * Overlay file f_pos is the master copy that is preserved > > * through copy up and modified on read/write, but only real > > * fs knows how to SEEK_HOLE/SEEK_DATA and real fs may impose > > * limitations that are more strict than ->s_maxbytes for specific > > * files, so we use the real file to perform seeks. > > */ > > ovl_inode_lock(inode); > > fd_file(real)->f_pos = file->f_pos; > > in ovl_llseek()? Get ovl_real_fdget_meta() called by ovl_real_fdget() and > > have it return 0 with NULL in fd_file(real), and you've got an oops right > > there, don't you? > > I see... so you rely upon that thing never happening when the last argument of > ovl_real_fdget_meta() is false, including the call from ovl_real_fdget(). > > I still don't like the calling conventions, TBH. Let me think a bit... Sorry, I'm afraid I'll have to leave that until tomorrow - over 38C after the sodding shingles shot really screws the ability to dig through the code ;-/ My apologies...
On Sat, Oct 5, 2024 at 3:35 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > On Fri, Oct 04, 2024 at 11:28:11PM +0100, Al Viro wrote: > > > /* > > > * Overlay file f_pos is the master copy that is preserved > > > * through copy up and modified on read/write, but only real > > > * fs knows how to SEEK_HOLE/SEEK_DATA and real fs may impose > > > * limitations that are more strict than ->s_maxbytes for specific > > > * files, so we use the real file to perform seeks. > > > */ > > > ovl_inode_lock(inode); > > > fd_file(real)->f_pos = file->f_pos; > > > in ovl_llseek()? Get ovl_real_fdget_meta() called by ovl_real_fdget() and > > > have it return 0 with NULL in fd_file(real), and you've got an oops right > > > there, don't you? > > > > I see... so you rely upon that thing never happening when the last argument of > > ovl_real_fdget_meta() is false, including the call from ovl_real_fdget(). > > Correct. I had considered renaming the argument to allow_empty_upper_meta but I don't think that will make the contract a lot better. The thing is that ovl_fsync() caller is really different in two different aspects: 1. It wants only upper and therefore fd_empty() is a possible outcome 2. It (may) want the metadata inode (when data is still in lower inode) Overlayfs can have up to 3 different inodes in the stack for a regular file: lower_data+lower_metadata+upper_metdata There is currently no file operation that requires opening the lower_metadata inode and therefore, staching one backing file in ->private_data and another optional backing file chained from the first one is enough. If there is ever a file operation that needs to open a realfile from lower_metadata (only ioctl comes to mind), we may need to reevaluate. > > I still don't like the calling conventions, TBH. Let me think a bit... > I understand your concern, but honestly, I am not sure that returning struct fderr is fundamentally different from checking IS_ERR_OR_NULL. What I can do is refactor the helpers differently so that ovl_fsync() will call ovl_file_upper() to clarify that it may return NULL, just like ovl_{dentry,inode,path}_upper() and all the other callers will call ovl_file_real() which cannot return NULL, because it returns either lower or upper file, just like ovl_{inode,path}_real{,data}(). > Sorry, I'm afraid I'll have to leave that until tomorrow - over 38C after the sodding > shingles shot really screws the ability to dig through the code ;-/ My apologies... Ouch! feel well soon. Thanks, Amir.
On Sat, Oct 05, 2024 at 08:30:23AM +0200, Amir Goldstein wrote: > I understand your concern, but honestly, I am not sure that returning > struct fderr is fundamentally different from checking IS_ERR_OR_NULL. > > What I can do is refactor the helpers differently so that ovl_fsync() will > call ovl_file_upper() to clarify that it may return NULL, just like ovl_dentry_upper(), you mean? > ovl_{dentry,inode,path}_upper() and all the other callers will > call ovl_file_real() which cannot return NULL, because it returns > either lower or upper file, just like ovl_{inode,path}_real{,data}(). OK... One thing I'm not happy about is the control (and data) flow in there: stashed_upper: if (upperfile && file_inode(upperfile) == d_inode(realpath.dentry)) realfile = upperfile; /* * If realfile is lower and has been copied up since we'd opened it, * open the real upper file and stash it in backing_file_private(). */ if (unlikely(file_inode(realfile) != d_inode(realpath.dentry))) { struct file *old; /* Stashed upperfile has a mismatched inode */ if (unlikely(upperfile)) return ERR_PTR(-EIO); upperfile = ovl_open_realfile(file, &realpath); if (IS_ERR(upperfile)) return upperfile; old = cmpxchg_release(backing_file_private_ptr(realfile), NULL, upperfile); if (old) { fput(upperfile); upperfile = old; } goto stashed_upper; } Unless I'm misreading that, the value of realfile seen inside the second if is always the original; reassignment in the first if might as well had been followed by goto past the second one. What's more, if you win the race in the second if, you'll have upperfile != NULL and its file_inode() matching realpath.dentry->d_inode (you'd better, or you have a real problem in backing_file_open()). So that branch to stashed_upper in case old == NULL might as well had been "realfile = upperfile;". Correct? In case old != NULL we go there with upperfile != NULL. If it (i.e. old) has the right file_inode(), we are done; otherwise it's going to hit ERR_PTR(-EIO) in the second if. So it seems to be equivalent to this: if (unlikely(file_inode(realfile) != d_inode(realpath.dentry))) { /* * If realfile is lower and has been copied up since we'd * opened it, we need the real upper file opened. Whoever gets * there first stashes the result in backing_file_private(). */ struct file *upperfile = backing_file_private(realfile); if (unlikely(!upperfile)) { struct file *old; upperfile = ovl_open_realfile(file, &realpath); if (IS_ERR(upperfile)) return upperfile; old = cmpxchg_release(backing_file_private_ptr(realfile), NULL, upperfile); if (old) { fput(upperfile); upperfile = old; } } // upperfile reference is owned by realfile at that point if (unlikely(file_inode(upperfile) != d_inode(realpath.dentry))) /* Stashed upperfile has a mismatched inode */ return ERR_PTR(-EIO); realfile = upperfile; } Or am I misreading it? Seems to be more straightforward that way...
On Sat, Oct 5, 2024 at 9:49 PM Al Viro <viro@zeniv.linux.org.uk> wrote: > > On Sat, Oct 05, 2024 at 08:30:23AM +0200, Amir Goldstein wrote: > > > I understand your concern, but honestly, I am not sure that returning > > struct fderr is fundamentally different from checking IS_ERR_OR_NULL. > > > > What I can do is refactor the helpers differently so that ovl_fsync() will > > call ovl_file_upper() to clarify that it may return NULL, just like > > ovl_dentry_upper(), you mean? No, I meant I created a new helper ovl_upper_file() that only ovl_fsync() uses and can return NULL. All the rest of the callers are using the helper ovl_real_file() which cannot return NULL. I will post it. > > > ovl_{dentry,inode,path}_upper() and all the other callers will > > call ovl_file_real() which cannot return NULL, because it returns > > either lower or upper file, just like ovl_{inode,path}_real{,data}(). > > OK... One thing I'm not happy about is the control (and data) flow in there: > stashed_upper: > if (upperfile && file_inode(upperfile) == d_inode(realpath.dentry)) > realfile = upperfile; > > /* > * If realfile is lower and has been copied up since we'd opened it, > * open the real upper file and stash it in backing_file_private(). > */ > if (unlikely(file_inode(realfile) != d_inode(realpath.dentry))) { > struct file *old; > > /* Stashed upperfile has a mismatched inode */ > if (unlikely(upperfile)) > return ERR_PTR(-EIO); > > upperfile = ovl_open_realfile(file, &realpath); > if (IS_ERR(upperfile)) > return upperfile; > > old = cmpxchg_release(backing_file_private_ptr(realfile), NULL, > upperfile); > if (old) { > fput(upperfile); > upperfile = old; > } > > goto stashed_upper; > } > Unless I'm misreading that, the value of realfile seen inside the second > if is always the original; reassignment in the first if might as well had > been followed by goto past the second one. What's more, if you win the > race in the second if, you'll have upperfile != NULL and its file_inode() > matching realpath.dentry->d_inode (you'd better, or you have a real problem > in backing_file_open()). So that branch to stashed_upper in case old == NULL > might as well had been "realfile = upperfile;". Correct? In case old != NULL Correct. > we go there with upperfile != NULL. If it (i.e. old) has the right file_inode(), > we are done; otherwise it's going to hit ERR_PTR(-EIO) in the second if. > > So it seems to be equivalent to this: > if (unlikely(file_inode(realfile) != d_inode(realpath.dentry))) { > /* > * If realfile is lower and has been copied up since we'd > * opened it, we need the real upper file opened. Whoever gets > * there first stashes the result in backing_file_private(). > */ > struct file *upperfile = backing_file_private(realfile); > if (unlikely(!upperfile)) { > struct file *old; > > upperfile = ovl_open_realfile(file, &realpath); > if (IS_ERR(upperfile)) > return upperfile; > > old = cmpxchg_release(backing_file_private_ptr(realfile), NULL, > upperfile); > if (old) { > fput(upperfile); > upperfile = old; > } > } > // upperfile reference is owned by realfile at that point > if (unlikely(file_inode(upperfile) != d_inode(realpath.dentry))) > /* Stashed upperfile has a mismatched inode */ > return ERR_PTR(-EIO); > realfile = upperfile; > } > Or am I misreading it? Seems to be more straightforward that way... Yeh, that's a bit more clear without to goto, but I would not remove the if (upperfile) assertion. Actually the first if has a bug. It assumes that if the upperfile is stashed then it must be used, but this is incorrect. I have made the following change: static bool ovl_is_real_file(const struct file *realfile, const struct path *realpath) { return file_inode(realfile) == d_inode(realpath->dentry); } ... /* * Usually, if we operated on a stashed upperfile once, all following * operations will operate on the stashed upperfile, but there is one * exception - ovl_fsync(datasync = false) can populate the stashed * upperfile to perform fsync on upper metadata inode. In this case, * following read/write operations will not use the stashed upperfile. */ if (upperfile && likely(ovl_is_real_file(upperfile, realpath))) { realfile = upperfile; goto checkflags; } /* * If realfile is lower and has been copied up since we'd opened it, * open the real upper file and stash it in backing_file_private(). */ if (unlikely(!ovl_is_real_file(realfile, realpath))) { struct file *old; /* Either stashed realfile or upperfile must match realinode */ if (WARN_ON_ONCE(upperfile)) return ERR_PTR(-EIO); ... /* Stashed upperfile that won the race must match realinode */ if (WARN_ON_ONCE(!ovl_is_real_file(upperfile, realpath))) return ERR_PTR(-EIO); realfile = upperfile; } checkflags: ... What happens is that we have two slots for stashing real files and there are subtle assumptions in the code that 1. We will never be requested to open a real file for more than two inodes (lower and upper) 2. If we are requested to open two inodes, we will always be requested to open the lower inode first 3. IOW if we are requested to open the upper inode first, we will not be requested to open a different inode Therefore, the names realfile/upperfile are a bit misleading. If file is opened after copyup, then the realfile in ->private_data is the upper file and stashed upperfile is NULL. I will post the revised version. Thanks, Amir.
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index 4504493b20be..3d64d00ef981 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -90,17 +90,19 @@ static int ovl_change_flags(struct file *file, unsigned int flags) } static int ovl_real_fdget_meta(const struct file *file, struct fd *real, - bool allow_meta) + bool upper_meta) { struct dentry *dentry = file_dentry(file); struct file *realfile = file->private_data; struct path realpath; int err; - real->word = (unsigned long)realfile; + real->word = 0; - if (allow_meta) { - ovl_path_real(dentry, &realpath); + if (upper_meta) { + ovl_path_upper(dentry, &realpath); + if (!realpath.dentry) + return 0; } else { /* lazy lookup and verify of lowerdata */ err = ovl_verify_lowerdata(dentry); @@ -395,7 +397,7 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync) return ret; ret = ovl_real_fdget_meta(file, &real, !datasync); - if (ret) + if (ret || fd_empty(real)) return ret; /* Don't sync lower file for fear of receiving EROFS error */
ovl_fsync() with !datasync opens a backing file from the top most dentry in the stack, checks if this dentry is non-upper and skips the fsync. In case of an overlay dentry stack with lower data and lower metadata above it, but without an upper metadata above it, the backing file is opened from the top most lower metadata dentry and never used. Fix the helper ovl_real_fdget_meta() to return an empty struct fd in that case to avoid the unneeded backing file open. Signed-off-by: Amir Goldstein <amir73il@gmail.com> --- fs/overlayfs/file.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)