Message ID | 20230509080128.457489-1-hao.xu@linux.dev (mailing list archive) |
---|---|
State | Mainlined, archived |
Headers | show |
Series | [RFC] fuse: invalidate page cache pages before direct write | expand |
On 5/9/23 16:01, Hao Xu wrote: > From: Hao Xu <howeyxu@tencent.com> > > In FOPEN_DIRECT_IO, page cache may still be there for a file, direct > write should respect that and invalidate the corresponding pages so > that page cache readers don't get stale data. Another thing this patch > does is flush related pages to avoid its loss. > > Signed-off-by: Hao Xu <howeyxu@tencent.com> > --- > > Reference: > https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset > > fs/fuse/file.c | 14 +++++++++++++- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c > index 89d97f6188e0..edc84c1dfc5c 100644 > --- a/fs/fuse/file.c > +++ b/fs/fuse/file.c > @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > int write = flags & FUSE_DIO_WRITE; > int cuse = flags & FUSE_DIO_CUSE; > struct file *file = io->iocb->ki_filp; > - struct inode *inode = file->f_mapping->host; > + struct address_space *mapping = file->f_mapping; > + struct inode *inode = mapping->host; > struct fuse_file *ff = file->private_data; > struct fuse_conn *fc = ff->fm->fc; > size_t nmax = write ? fc->max_write : fc->max_read; > @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > inode_unlock(inode); > } > > + res = filemap_write_and_wait_range(mapping, pos, pos + count - 1); Seems We don't need to flush dirty page if the page is only private mmaped, because the pages are always clean. I'll fix this in v2. > + if (res) > + return res; > + > + if (write) { > + if (invalidate_inode_pages2_range(mapping, > + idx_from, idx_to)) { > + return -ENOTBLK; > + } > + } > + > io->should_dirty = !write && user_backed_iter(iter); > while (count) { > ssize_t nres;
Ping... On 5/9/23 16:01, Hao Xu wrote: > From: Hao Xu <howeyxu@tencent.com> > > In FOPEN_DIRECT_IO, page cache may still be there for a file, direct > write should respect that and invalidate the corresponding pages so > that page cache readers don't get stale data. Another thing this patch > does is flush related pages to avoid its loss. > > Signed-off-by: Hao Xu <howeyxu@tencent.com> > --- > > Reference: > https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset > > fs/fuse/file.c | 14 +++++++++++++- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/fs/fuse/file.c b/fs/fuse/file.c > index 89d97f6188e0..edc84c1dfc5c 100644 > --- a/fs/fuse/file.c > +++ b/fs/fuse/file.c > @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > int write = flags & FUSE_DIO_WRITE; > int cuse = flags & FUSE_DIO_CUSE; > struct file *file = io->iocb->ki_filp; > - struct inode *inode = file->f_mapping->host; > + struct address_space *mapping = file->f_mapping; > + struct inode *inode = mapping->host; > struct fuse_file *ff = file->private_data; > struct fuse_conn *fc = ff->fm->fc; > size_t nmax = write ? fc->max_write : fc->max_read; > @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, > inode_unlock(inode); > } > > + res = filemap_write_and_wait_range(mapping, pos, pos + count - 1); > + if (res) > + return res; > + > + if (write) { > + if (invalidate_inode_pages2_range(mapping, > + idx_from, idx_to)) { > + return -ENOTBLK; > + } > + } > + > io->should_dirty = !write && user_backed_iter(iter); > while (count) { > ssize_t nres;
On 6/8/23 09:17, Hao Xu wrote: > Ping... > > On 5/9/23 16:01, Hao Xu wrote: >> From: Hao Xu <howeyxu@tencent.com> >> >> In FOPEN_DIRECT_IO, page cache may still be there for a file, direct >> write should respect that and invalidate the corresponding pages so >> that page cache readers don't get stale data. Another thing this patch >> does is flush related pages to avoid its loss. >> >> Signed-off-by: Hao Xu <howeyxu@tencent.com> >> --- >> >> Reference: >> https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset >> >> fs/fuse/file.c | 14 +++++++++++++- >> 1 file changed, 13 insertions(+), 1 deletion(-) >> >> diff --git a/fs/fuse/file.c b/fs/fuse/file.c >> index 89d97f6188e0..edc84c1dfc5c 100644 >> --- a/fs/fuse/file.c >> +++ b/fs/fuse/file.c >> @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, >> struct iov_iter *iter, >> int write = flags & FUSE_DIO_WRITE; >> int cuse = flags & FUSE_DIO_CUSE; >> struct file *file = io->iocb->ki_filp; >> - struct inode *inode = file->f_mapping->host; >> + struct address_space *mapping = file->f_mapping; >> + struct inode *inode = mapping->host; >> struct fuse_file *ff = file->private_data; >> struct fuse_conn *fc = ff->fm->fc; >> size_t nmax = write ? fc->max_write : fc->max_read; >> @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, >> struct iov_iter *iter, >> inode_unlock(inode); >> } >> + res = filemap_write_and_wait_range(mapping, pos, pos + count - 1); >> + if (res) >> + return res; >> + >> + if (write) { >> + if (invalidate_inode_pages2_range(mapping, >> + idx_from, idx_to)) { >> + return -ENOTBLK; >> + } >> + } >> + >> io->should_dirty = !write && user_backed_iter(iter); >> while (count) { >> ssize_t nres; > Is this part not working? if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) { if (!write) inode_lock(inode); fuse_sync_writes(inode); if (!write) inode_unlock(inode); } Thanks, Bernd
Hi Bernd, On 6/27/23 02:23, Bernd Schubert wrote: > > > On 6/8/23 09:17, Hao Xu wrote: >> Ping... >> >> On 5/9/23 16:01, Hao Xu wrote: >>> From: Hao Xu <howeyxu@tencent.com> >>> >>> In FOPEN_DIRECT_IO, page cache may still be there for a file, direct >>> write should respect that and invalidate the corresponding pages so >>> that page cache readers don't get stale data. Another thing this patch >>> does is flush related pages to avoid its loss. >>> >>> Signed-off-by: Hao Xu <howeyxu@tencent.com> >>> --- >>> >>> Reference: >>> https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset >>> >>> fs/fuse/file.c | 14 +++++++++++++- >>> 1 file changed, 13 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c >>> index 89d97f6188e0..edc84c1dfc5c 100644 >>> --- a/fs/fuse/file.c >>> +++ b/fs/fuse/file.c >>> @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, >>> struct iov_iter *iter, >>> int write = flags & FUSE_DIO_WRITE; >>> int cuse = flags & FUSE_DIO_CUSE; >>> struct file *file = io->iocb->ki_filp; >>> - struct inode *inode = file->f_mapping->host; >>> + struct address_space *mapping = file->f_mapping; >>> + struct inode *inode = mapping->host; >>> struct fuse_file *ff = file->private_data; >>> struct fuse_conn *fc = ff->fm->fc; >>> size_t nmax = write ? fc->max_write : fc->max_read; >>> @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv >>> *io, struct iov_iter *iter, >>> inode_unlock(inode); >>> } >>> + res = filemap_write_and_wait_range(mapping, pos, pos + count - 1); >>> + if (res) >>> + return res; >>> + >>> + if (write) { >>> + if (invalidate_inode_pages2_range(mapping, >>> + idx_from, idx_to)) { >>> + return -ENOTBLK; >>> + } >>> + } >>> + >>> io->should_dirty = !write && user_backed_iter(iter); >>> while (count) { >>> ssize_t nres; >> > > Is this part not working? > > if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) { > if (!write) > inode_lock(inode); > fuse_sync_writes(inode); > if (!write) > inode_unlock(inode); > } > > This code seems to be waiting for already triggered page cache writeback requests, it's not related with the issue this patch tries to address. The issue here is we should invalidate related page cache page before we do direct write. Regards, Hao > > Thanks, > Bernd
Hi Hao, On 6/29/23 14:00, Hao Xu wrote: > Hi Bernd, > > On 6/27/23 02:23, Bernd Schubert wrote: >> >> >> On 6/8/23 09:17, Hao Xu wrote: >>> Ping... >>> >>> On 5/9/23 16:01, Hao Xu wrote: >>>> From: Hao Xu <howeyxu@tencent.com> >>>> >>>> In FOPEN_DIRECT_IO, page cache may still be there for a file, direct >>>> write should respect that and invalidate the corresponding pages so >>>> that page cache readers don't get stale data. Another thing this patch >>>> does is flush related pages to avoid its loss. >>>> >>>> Signed-off-by: Hao Xu <howeyxu@tencent.com> >>>> --- >>>> >>>> Reference: >>>> https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset >>>> >>>> fs/fuse/file.c | 14 +++++++++++++- >>>> 1 file changed, 13 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c >>>> index 89d97f6188e0..edc84c1dfc5c 100644 >>>> --- a/fs/fuse/file.c >>>> +++ b/fs/fuse/file.c >>>> @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv >>>> *io, struct iov_iter *iter, >>>> int write = flags & FUSE_DIO_WRITE; >>>> int cuse = flags & FUSE_DIO_CUSE; >>>> struct file *file = io->iocb->ki_filp; >>>> - struct inode *inode = file->f_mapping->host; >>>> + struct address_space *mapping = file->f_mapping; >>>> + struct inode *inode = mapping->host; >>>> struct fuse_file *ff = file->private_data; >>>> struct fuse_conn *fc = ff->fm->fc; >>>> size_t nmax = write ? fc->max_write : fc->max_read; >>>> @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv >>>> *io, struct iov_iter *iter, >>>> inode_unlock(inode); >>>> } >>>> + res = filemap_write_and_wait_range(mapping, pos, pos + count - 1); >>>> + if (res) >>>> + return res; >>>> + >>>> + if (write) { >>>> + if (invalidate_inode_pages2_range(mapping, >>>> + idx_from, idx_to)) { >>>> + return -ENOTBLK; >>>> + } >>>> + } >>>> + >>>> io->should_dirty = !write && user_backed_iter(iter); >>>> while (count) { >>>> ssize_t nres; >>> >> >> Is this part not working? >> >> if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) { >> if (!write) >> inode_lock(inode); >> fuse_sync_writes(inode); >> if (!write) >> inode_unlock(inode); >> } >> >> > > > This code seems to be waiting for already triggered page cache writeback > requests, it's not related with the issue this patch tries to address. > The issue here is we should invalidate related page cache page before we > do direct write. oh, right, I just see it. I think you should move your filemap_write_and_wait_range() call above that piece in order to ensure it is send to the daemon/server side. Thanks, Bernd
diff --git a/fs/fuse/file.c b/fs/fuse/file.c index 89d97f6188e0..edc84c1dfc5c 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, int write = flags & FUSE_DIO_WRITE; int cuse = flags & FUSE_DIO_CUSE; struct file *file = io->iocb->ki_filp; - struct inode *inode = file->f_mapping->host; + struct address_space *mapping = file->f_mapping; + struct inode *inode = mapping->host; struct fuse_file *ff = file->private_data; struct fuse_conn *fc = ff->fm->fc; size_t nmax = write ? fc->max_write : fc->max_read; @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter, inode_unlock(inode); } + res = filemap_write_and_wait_range(mapping, pos, pos + count - 1); + if (res) + return res; + + if (write) { + if (invalidate_inode_pages2_range(mapping, + idx_from, idx_to)) { + return -ENOTBLK; + } + } + io->should_dirty = !write && user_backed_iter(iter); while (count) { ssize_t nres;