Message ID | 20170612122316.13244-15-jlayton@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Jun 12, 2017 at 08:23:06AM -0400, Jeff Layton wrote: > Add a new FS_WB_ERRSEQ flag to the fstype. Later patches will set and > key off of that to decide what behavior should be used. Please invert this so that only file systems that keep the old semantics need a flag.
On Mon, 2017-06-12 at 05:45 -0700, Christoph Hellwig wrote: > On Mon, Jun 12, 2017 at 08:23:06AM -0400, Jeff Layton wrote: > > Add a new FS_WB_ERRSEQ flag to the fstype. Later patches will set and > > key off of that to decide what behavior should be used. > > Please invert this so that only file systems that keep the old semantics > need a flag. That's definitely what I want for the endgame here. My plan was to add this flag for now, and then eventually reverse it (or drop it) once all or most filesystems are converted. We can do it that way from the get-go if you like. It'll mean tossing in a patch add this flag to all filesystems that have an fsync operation and that use the pagecache, and then gradually remove it from them as we convert them. Which method do you prefer?
On Tue, Jun 13, 2017 at 06:24:32AM -0400, Jeff Layton wrote: > That's definitely what I want for the endgame here. My plan was to add > this flag for now, and then eventually reverse it (or drop it) once all > or most filesystems are converted. > > We can do it that way from the get-go if you like. It'll mean tossing in > a patch add this flag to all filesystems that have an fsync operation > and that use the pagecache, and then gradually remove it from them as we > convert them. > > Which method do you prefer? Please do it from the get-go. Or in fact figure out if we can get away without it entirely. Moving the error reporting into ->fsync should help greatly with that, so what's missing after that?
On Tue, 2017-06-13 at 23:47 -0700, Christoph Hellwig wrote: > On Tue, Jun 13, 2017 at 06:24:32AM -0400, Jeff Layton wrote: > > That's definitely what I want for the endgame here. My plan was to add > > this flag for now, and then eventually reverse it (or drop it) once all > > or most filesystems are converted. > > > > We can do it that way from the get-go if you like. It'll mean tossing in > > a patch add this flag to all filesystems that have an fsync operation > > and that use the pagecache, and then gradually remove it from them as we > > convert them. > > > > Which method do you prefer? > > Please do it from the get-go. Or in fact figure out if we can get > away without it entirely. Moving the error reporting into ->fsync > should help greatly with that, so what's missing after that? In this smaller set, it's only really used for DAX. In the larger patch series I have (which needs updating on top of this), there are other things that key off of it: sync_file_range: ->fsync isn't called directly there, and I think we probably want similar semantics to fsync() for it JBD2: will try to re-set the error after clearing it with filemap_fdatawait. That's problematic with the new infrastructure so we need some way to avoid it. What I think I'll do for now is add a new FS_DAX_WB_ERRSEQ flag that will go away by the end of the series. As the need arises for a similar flag in other areas, I'll add them then. The overall goal is not to need these flags. It may take a bit of time to get there though. Thanks for the review so far!
On Wed, Jun 14, 2017 at 01:24:43PM -0400, Jeff Layton wrote: > In this smaller set, it's only really used for DAX. DAX only is implemented by three filesystems, please just fix them up in one go. > sync_file_range: ->fsync isn't called directly there, and I think we > probably want similar semantics to fsync() for it sync_file_range is only supposed to sync data, so it should not call ->fsync. > JBD2: will try to re-set the error after clearing it with > filemap_fdatawait. That's problematic with the new infrastructure so we > need some way to avoid it. JBD2 only has two users, please fix them up in one go.
On Thu, 2017-06-15 at 01:22 -0700, Christoph Hellwig wrote: > On Wed, Jun 14, 2017 at 01:24:43PM -0400, Jeff Layton wrote: > > In this smaller set, it's only really used for DAX. > > DAX only is implemented by three filesystems, please just fix them > up in one go. > Ok. > > sync_file_range: ->fsync isn't called directly there, and I think we > > probably want similar semantics to fsync() for it > > sync_file_range is only supposed to sync data, so it should not call > ->fsync. > Correct. But if there is a data writeback error, should we report an error on all open fds at that time (like we will for fsync)? I think we probably do want to do that, but like you say...there is no file op for sync_file_range. It'll need some way to figure out what sort of error tracking is in play. > > JBD2: will try to re-set the error after clearing it with > > filemap_fdatawait. That's problematic with the new infrastructure so we > > need some way to avoid it. > > JBD2 only has two users, please fix them up in one go. I came up with a fix yesterday that makes the flag unnecessary there. Thanks,
On Thu, Jun 15, 2017 at 06:42:12AM -0400, Jeff Layton wrote: > Correct. > > But if there is a data writeback error, should we report an error on all > open fds at that time (like we will for fsync)? We should in theory, but I don't see how to properly do it. In addition sync_file_range just can't be used for data integrity to start with, so I don't think it's worth it. At some point we should add a proper fsync_range syscall, though.
On Thu, 2017-06-15 at 07:57 -0700, Christoph Hellwig wrote: > On Thu, Jun 15, 2017 at 06:42:12AM -0400, Jeff Layton wrote: > > Correct. > > > > But if there is a data writeback error, should we report an error on all > > open fds at that time (like we will for fsync)? > > We should in theory, but I don't see how to properly do it. In addition > sync_file_range just can't be used for data integrity to start with, so > I don't think it's worth it. At some point we should add a proper > fsync_range syscall, though. filemap_report_wb_err will always return 0 if the inode never has mapping_set_error called on it. So, I think we should be able to do it there once we get all of the fs' converted over. That'll have to happen at the end of the series however.
diff --git a/include/linux/fs.h b/include/linux/fs.h index 6cd87887430b..17ba6284ab14 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2023,6 +2023,7 @@ struct file_system_type { #define FS_BINARY_MOUNTDATA 2 #define FS_HAS_SUBTYPE 4 #define FS_USERNS_MOUNT 8 /* Can be mounted by userns root */ +#define FS_WB_ERRSEQ 16 /* errseq_t writeback err tracking */ #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */ struct dentry *(*mount) (struct file_system_type *, int, const char *, void *);
Now that we have new infrastructure for handling writeback errors using errseq_t, we need to convert the existing code to use it. We could attempt to retrofit the old interfaces on top of the new, but there is a conceptual disconnect here in the case of internal callers that invoke filemap_fdatawait and the like. When reporting writeback errors, we will always report errors that have occurred since a particular point in time. With the old writeback error reporting, the time we used was "since it was last tested/cleared" which is entirely arbitrary and potentially racy. Now, we can report the latest error that has occurred since an arbitrary point in time (represented as a sampled errseq_t value). This means that we need to touch each filesystem that calls filemap_check_errors in some fashion and ensure that we establish sane "since" values for those callers. But...some code is shared between filesystems and needs to be able to handle both error tracking schemes. Add a new FS_WB_ERRSEQ flag to the fstype. Later patches will set and key off of that to decide what behavior should be used. Signed-off-by: Jeff Layton <jlayton@redhat.com> --- include/linux/fs.h | 1 + 1 file changed, 1 insertion(+)