Message ID | 20190526061100.21761-7-amir73il@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fixes for major copy_file_range() issues | expand |
On Sun, May 26, 2019 at 09:10:57AM +0300, Amir Goldstein wrote: > From: Dave Chinner <dchinner@redhat.com> > > Timestamps are not updated right now, so programs looking for > timestamp updates for file modifications (like rsync) will not > detect that files have changed. We are also accessing the source > data when doing a copy (but not when cloning) so we need to update > atime on the source file as well. > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > --- > fs/read_write.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/fs/read_write.c b/fs/read_write.c > index e16bcafc0da2..4b23a86aacd9 100644 > --- a/fs/read_write.c > +++ b/fs/read_write.c > @@ -1576,6 +1576,16 @@ int generic_copy_file_range_prep(struct file *file_in, struct file *file_out) > > WARN_ON_ONCE(!inode_is_locked(file_inode(file_out))); > > + /* Update source timestamps, because we are accessing file data */ > + file_accessed(file_in); > + > + /* Update destination timestamps, since we can alter file contents. */ > + if (!(file_out->f_mode & FMODE_NOCMTIME)) { > + ret = file_update_time(file_out); > + if (ret) > + return ret; > + } > + Is this the right place for updating the timestamps? I see that in same cases we may be updating the timestamp even if there was an error and no copy was performed. For example, if file_remove_privs fails. (btw, I've re-tested everything on ceph and everything seems to be working fine.) Cheers, -- Luís > /* > * Clear the security bits if the process is not being run by root. > * This keeps people from modifying setuid and setgid binaries. > -- > 2.17.1 > >
On Mon, May 27, 2019 at 03:35:39PM +0100, Luis Henriques wrote: > On Sun, May 26, 2019 at 09:10:57AM +0300, Amir Goldstein wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > Timestamps are not updated right now, so programs looking for > > timestamp updates for file modifications (like rsync) will not > > detect that files have changed. We are also accessing the source > > data when doing a copy (but not when cloning) so we need to update > > atime on the source file as well. > > > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > --- > > fs/read_write.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/fs/read_write.c b/fs/read_write.c > > index e16bcafc0da2..4b23a86aacd9 100644 > > --- a/fs/read_write.c > > +++ b/fs/read_write.c > > @@ -1576,6 +1576,16 @@ int generic_copy_file_range_prep(struct file *file_in, struct file *file_out) > > > > WARN_ON_ONCE(!inode_is_locked(file_inode(file_out))); > > > > + /* Update source timestamps, because we are accessing file data */ > > + file_accessed(file_in); > > + > > + /* Update destination timestamps, since we can alter file contents. */ > > + if (!(file_out->f_mode & FMODE_NOCMTIME)) { > > + ret = file_update_time(file_out); > > + if (ret) > > + return ret; > > + } > > + > > Is this the right place for updating the timestamps? I see that in same > cases we may be updating the timestamp even if there was an error and no > copy was performed. For example, if file_remove_privs fails. It's the same place we do it for read - file_accessed() is called before we do the IO - and the same place for write - file_update_time() is called before we copy data into the pagecache or do direct IO. As such, it really doesn't matter if it is before or after file_remove_privs() - the IO can still fail for many reasons after we've updated the timestamps and in some of the failure cases (e.g. we failed the sync at the end of an O_DSYNC buffered write) we still want the timestamps to be modified because the data and/or user visible metadata /may/ have been changed. cfr operates under the same constraints as read() and write(), so we need to update the timestamps up front regardless of whether the copy ends up succeeding or not.... Cheers, Dave.
Dave Chinner <david@fromorbit.com> writes: > On Mon, May 27, 2019 at 03:35:39PM +0100, Luis Henriques wrote: >> On Sun, May 26, 2019 at 09:10:57AM +0300, Amir Goldstein wrote: >> > From: Dave Chinner <dchinner@redhat.com> >> > >> > Timestamps are not updated right now, so programs looking for >> > timestamp updates for file modifications (like rsync) will not >> > detect that files have changed. We are also accessing the source >> > data when doing a copy (but not when cloning) so we need to update >> > atime on the source file as well. >> > >> > Signed-off-by: Dave Chinner <dchinner@redhat.com> >> > Signed-off-by: Amir Goldstein <amir73il@gmail.com> >> > --- >> > fs/read_write.c | 10 ++++++++++ >> > 1 file changed, 10 insertions(+) >> > >> > diff --git a/fs/read_write.c b/fs/read_write.c >> > index e16bcafc0da2..4b23a86aacd9 100644 >> > --- a/fs/read_write.c >> > +++ b/fs/read_write.c >> > @@ -1576,6 +1576,16 @@ int generic_copy_file_range_prep(struct file *file_in, struct file *file_out) >> > >> > WARN_ON_ONCE(!inode_is_locked(file_inode(file_out))); >> > >> > + /* Update source timestamps, because we are accessing file data */ >> > + file_accessed(file_in); >> > + >> > + /* Update destination timestamps, since we can alter file contents. */ >> > + if (!(file_out->f_mode & FMODE_NOCMTIME)) { >> > + ret = file_update_time(file_out); >> > + if (ret) >> > + return ret; >> > + } >> > + >> >> Is this the right place for updating the timestamps? I see that in same >> cases we may be updating the timestamp even if there was an error and no >> copy was performed. For example, if file_remove_privs fails. > > It's the same place we do it for read - file_accessed() is called > before we do the IO - and the same place for write - > file_update_time() is called before we copy data into the pagecache > or do direct IO. As such, it really doesn't matter if it is before > or after file_remove_privs() - the IO can still fail for many > reasons after we've updated the timestamps and in some of the > failure cases (e.g. we failed the sync at the end of an O_DSYNC > buffered write) we still want the timestamps to be modified because > the data and/or user visible metadata /may/ have been changed. > > cfr operates under the same constraints as read() and write(), so we > need to update the timestamps up front regardless of whether the > copy ends up succeeding or not.... Great, thanks for explaining it. It now makes sense, even for consistency, to have this operation here. Cheers,
On Sun, May 26, 2019 at 09:10:57AM +0300, Amir Goldstein wrote: > From: Dave Chinner <dchinner@redhat.com> > > Timestamps are not updated right now, so programs looking for > timestamp updates for file modifications (like rsync) will not > detect that files have changed. We are also accessing the source > data when doing a copy (but not when cloning) so we need to update > atime on the source file as well. > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > --- > fs/read_write.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/fs/read_write.c b/fs/read_write.c > index e16bcafc0da2..4b23a86aacd9 100644 > --- a/fs/read_write.c > +++ b/fs/read_write.c > @@ -1576,6 +1576,16 @@ int generic_copy_file_range_prep(struct file *file_in, struct file *file_out) > > WARN_ON_ONCE(!inode_is_locked(file_inode(file_out))); > > + /* Update source timestamps, because we are accessing file data */ > + file_accessed(file_in); > + > + /* Update destination timestamps, since we can alter file contents. */ > + if (!(file_out->f_mode & FMODE_NOCMTIME)) { > + ret = file_update_time(file_out); > + if (ret) > + return ret; > + } > + > /* > * Clear the security bits if the process is not being run by root. > * This keeps people from modifying setuid and setgid binaries. Should the file_update_time and file_remove_privs calls be factored into a separate small function to be called by generic_{copy,remap}_range_prep? Looks ok otherwise, Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > -- > 2.17.1 >
On Tue, May 28, 2019 at 7:30 PM Darrick J. Wong <darrick.wong@oracle.com> wrote: > > On Sun, May 26, 2019 at 09:10:57AM +0300, Amir Goldstein wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > Timestamps are not updated right now, so programs looking for > > timestamp updates for file modifications (like rsync) will not > > detect that files have changed. We are also accessing the source > > data when doing a copy (but not when cloning) so we need to update > > atime on the source file as well. > > > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > > --- > > fs/read_write.c | 10 ++++++++++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/fs/read_write.c b/fs/read_write.c > > index e16bcafc0da2..4b23a86aacd9 100644 > > --- a/fs/read_write.c > > +++ b/fs/read_write.c > > @@ -1576,6 +1576,16 @@ int generic_copy_file_range_prep(struct file *file_in, struct file *file_out) > > > > WARN_ON_ONCE(!inode_is_locked(file_inode(file_out))); > > > > + /* Update source timestamps, because we are accessing file data */ > > + file_accessed(file_in); > > + > > + /* Update destination timestamps, since we can alter file contents. */ > > + if (!(file_out->f_mode & FMODE_NOCMTIME)) { > > + ret = file_update_time(file_out); > > + if (ret) > > + return ret; > > + } > > + > > /* > > * Clear the security bits if the process is not being run by root. > > * This keeps people from modifying setuid and setgid binaries. > > Should the file_update_time and file_remove_privs calls be factored into > a separate small function to be called by generic_{copy,remap}_range_prep? > Ok. same helper could be called also after copy when inode is not locked throughout the copy. Thanks, Amir.
diff --git a/fs/read_write.c b/fs/read_write.c index e16bcafc0da2..4b23a86aacd9 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1576,6 +1576,16 @@ int generic_copy_file_range_prep(struct file *file_in, struct file *file_out) WARN_ON_ONCE(!inode_is_locked(file_inode(file_out))); + /* Update source timestamps, because we are accessing file data */ + file_accessed(file_in); + + /* Update destination timestamps, since we can alter file contents. */ + if (!(file_out->f_mode & FMODE_NOCMTIME)) { + ret = file_update_time(file_out); + if (ret) + return ret; + } + /* * Clear the security bits if the process is not being run by root. * This keeps people from modifying setuid and setgid binaries.