diff mbox series

[RFC] fuse: invalidate page cache pages before direct write

Message ID 20230509080128.457489-1-hao.xu@linux.dev (mailing list archive)
State Mainlined, archived

Commit Message

Hao Xu May 9, 2023, 8:01 a.m. UTC
From: Hao Xu <howeyxu@tencent.com>

With FOPEN_DIRECT_IO, a file may still have pages in the page cache. A
direct write should respect that and invalidate the corresponding pages
so that page cache readers don't get stale data. This patch also flushes
any dirty pages in the range first so that their contents are not lost.

Signed-off-by: Hao Xu <howeyxu@tencent.com>
---

Reference:
https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset

 fs/fuse/file.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
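
The ordering described in the commit message can be illustrated with a small
userspace toy model. This is only a sketch under stated assumptions: the
struct toy_file type, the toy_* helpers, and the 4-byte "page" size are all
hypothetical and are not kernel or FUSE APIs. It only mimics the idea of
flushing dirty cached pages, invalidating them, and then writing directly to
the backing store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4   /* hypothetical tiny page size, for illustration */
#define TOY_NPAGES    4

/* Hypothetical model: a backing store plus a per-page cache. */
struct toy_file {
	char backing[TOY_NPAGES * TOY_PAGE_SIZE]; /* what the server holds */
	char cache[TOY_NPAGES][TOY_PAGE_SIZE];    /* cached copies of pages */
	bool cached[TOY_NPAGES];
	bool dirty[TOY_NPAGES];
};

/* Buffered read of one byte: fill the cache page on a miss. */
static char toy_cached_read(struct toy_file *f, size_t pos)
{
	size_t idx = pos / TOY_PAGE_SIZE;

	if (!f->cached[idx]) {
		memcpy(f->cache[idx], &f->backing[idx * TOY_PAGE_SIZE],
		       TOY_PAGE_SIZE);
		f->cached[idx] = true;
		f->dirty[idx] = false;
	}
	return f->cache[idx][pos % TOY_PAGE_SIZE];
}

/* Store through the cache (as a shared mmap would): marks the page dirty. */
static void toy_cached_write(struct toy_file *f, size_t pos, char c)
{
	size_t idx = pos / TOY_PAGE_SIZE;

	(void)toy_cached_read(f, pos); /* make sure the page is cached */
	f->cache[idx][pos % TOY_PAGE_SIZE] = c;
	f->dirty[idx] = true;
}

/* Write dirty cached pages in [first, last] back to the backing store. */
static void toy_flush_range(struct toy_file *f, size_t first, size_t last)
{
	for (size_t i = first; i <= last; i++) {
		if (f->cached[i] && f->dirty[i]) {
			memcpy(&f->backing[i * TOY_PAGE_SIZE], f->cache[i],
			       TOY_PAGE_SIZE);
			f->dirty[i] = false;
		}
	}
}

/* Drop cached pages in [first, last] so later reads refetch them. */
static void toy_invalidate_range(struct toy_file *f, size_t first, size_t last)
{
	for (size_t i = first; i <= last; i++)
		f->cached[i] = false;
}

/* Direct write; with 'fix' set, flush and invalidate the range first. */
static void toy_direct_write(struct toy_file *f, size_t pos,
			     const char *buf, size_t count, bool fix)
{
	assert(count > 0 && pos + count <= sizeof(f->backing));

	if (fix) {
		size_t first = pos / TOY_PAGE_SIZE;
		size_t last = (pos + count - 1) / TOY_PAGE_SIZE;

		toy_flush_range(f, first, last);
		toy_invalidate_range(f, first, last);
	}
	memcpy(&f->backing[pos], buf, count);
}
```

In this model, a direct write with fix == false leaves the old cached page in
place, so a later toy_cached_read() still returns the pre-write byte. With
fix == true, dirty bytes outside the written range reach the backing store
before the page is dropped, and the next cached read sees the new data.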

Comments

Hao Xu May 24, 2023, 10:02 a.m. UTC | #1
On 5/9/23 16:01, Hao Xu wrote:
> From: Hao Xu <howeyxu@tencent.com>
> 
> In FOPEN_DIRECT_IO, page cache may still be there for a file, direct
> write should respect that and invalidate the corresponding pages so
> that page cache readers don't get stale data. Another thing this patch
> does is flush related pages to avoid its loss.
> 
> Signed-off-by: Hao Xu <howeyxu@tencent.com>
> ---
> 
> Reference:
> https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset
> 
>   fs/fuse/file.c | 14 +++++++++++++-
>   1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 89d97f6188e0..edc84c1dfc5c 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
>   	int write = flags & FUSE_DIO_WRITE;
>   	int cuse = flags & FUSE_DIO_CUSE;
>   	struct file *file = io->iocb->ki_filp;
> -	struct inode *inode = file->f_mapping->host;
> +	struct address_space *mapping = file->f_mapping;
> +	struct inode *inode = mapping->host;
>   	struct fuse_file *ff = file->private_data;
>   	struct fuse_conn *fc = ff->fm->fc;
>   	size_t nmax = write ? fc->max_write : fc->max_read;
> @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
>   			inode_unlock(inode);
>   	}
>   
> +	res = filemap_write_and_wait_range(mapping, pos, pos + count - 1);

It seems we don't need to flush dirty pages if the file is only privately
mmapped, because those pages are always clean. I'll fix this in v2.

> +	if (res)
> +		return res;
> +
> +	if (write) {
> +		if (invalidate_inode_pages2_range(mapping,
> +				idx_from, idx_to)) {
> +			return -ENOTBLK;
> +		}
> +	}
> +
>   	io->should_dirty = !write && user_backed_iter(iter);
>   	while (count) {
>   		ssize_t nres;
Hao Xu June 8, 2023, 7:17 a.m. UTC | #2
Ping...

Bernd Schubert June 26, 2023, 6:23 p.m. UTC | #3
On 6/8/23 09:17, Hao Xu wrote:
> Ping...
> 
> On 5/9/23 16:01, Hao Xu wrote:
>> From: Hao Xu <howeyxu@tencent.com>
>>
>> In FOPEN_DIRECT_IO, page cache may still be there for a file, direct
>> write should respect that and invalidate the corresponding pages so
>> that page cache readers don't get stale data. Another thing this patch
>> does is flush related pages to avoid its loss.
>>
>> Signed-off-by: Hao Xu <howeyxu@tencent.com>
>> ---
>>
>> Reference:
>> https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset
>>
>>   fs/fuse/file.c | 14 +++++++++++++-
>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>> index 89d97f6188e0..edc84c1dfc5c 100644
>> --- a/fs/fuse/file.c
>> +++ b/fs/fuse/file.c
>> @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, 
>> struct iov_iter *iter,
>>       int write = flags & FUSE_DIO_WRITE;
>>       int cuse = flags & FUSE_DIO_CUSE;
>>       struct file *file = io->iocb->ki_filp;
>> -    struct inode *inode = file->f_mapping->host;
>> +    struct address_space *mapping = file->f_mapping;
>> +    struct inode *inode = mapping->host;
>>       struct fuse_file *ff = file->private_data;
>>       struct fuse_conn *fc = ff->fm->fc;
>>       size_t nmax = write ? fc->max_write : fc->max_read;
>> @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, 
>> struct iov_iter *iter,
>>               inode_unlock(inode);
>>       }
>> +    res = filemap_write_and_wait_range(mapping, pos, pos + count - 1);
>> +    if (res)
>> +        return res;
>> +
>> +    if (write) {
>> +        if (invalidate_inode_pages2_range(mapping,
>> +                idx_from, idx_to)) {
>> +            return -ENOTBLK;
>> +        }
>> +    }
>> +
>>       io->should_dirty = !write && user_backed_iter(iter);
>>       while (count) {
>>           ssize_t nres;
> 

Is this part not working?

	if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) {
		if (!write)
			inode_lock(inode);
		fuse_sync_writes(inode);
		if (!write)
			inode_unlock(inode);
	}



Thanks,
Bernd
Hao Xu June 29, 2023, noon UTC | #4
Hi Bernd,

On 6/27/23 02:23, Bernd Schubert wrote:
> 
> 
> On 6/8/23 09:17, Hao Xu wrote:
>> Ping...
>>
>> On 5/9/23 16:01, Hao Xu wrote:
>>> From: Hao Xu <howeyxu@tencent.com>
>>>
>>> In FOPEN_DIRECT_IO, page cache may still be there for a file, direct
>>> write should respect that and invalidate the corresponding pages so
>>> that page cache readers don't get stale data. Another thing this patch
>>> does is flush related pages to avoid its loss.
>>>
>>> Signed-off-by: Hao Xu <howeyxu@tencent.com>
>>> ---
>>>
>>> Reference:
>>> https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset
>>>
>>>   fs/fuse/file.c | 14 +++++++++++++-
>>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>>> index 89d97f6188e0..edc84c1dfc5c 100644
>>> --- a/fs/fuse/file.c
>>> +++ b/fs/fuse/file.c
>>> @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv *io, 
>>> struct iov_iter *iter,
>>>       int write = flags & FUSE_DIO_WRITE;
>>>       int cuse = flags & FUSE_DIO_CUSE;
>>>       struct file *file = io->iocb->ki_filp;
>>> -    struct inode *inode = file->f_mapping->host;
>>> +    struct address_space *mapping = file->f_mapping;
>>> +    struct inode *inode = mapping->host;
>>>       struct fuse_file *ff = file->private_data;
>>>       struct fuse_conn *fc = ff->fm->fc;
>>>       size_t nmax = write ? fc->max_write : fc->max_read;
>>> @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv 
>>> *io, struct iov_iter *iter,
>>>               inode_unlock(inode);
>>>       }
>>> +    res = filemap_write_and_wait_range(mapping, pos, pos + count - 1);
>>> +    if (res)
>>> +        return res;
>>> +
>>> +    if (write) {
>>> +        if (invalidate_inode_pages2_range(mapping,
>>> +                idx_from, idx_to)) {
>>> +            return -ENOTBLK;
>>> +        }
>>> +    }
>>> +
>>>       io->should_dirty = !write && user_backed_iter(iter);
>>>       while (count) {
>>>           ssize_t nres;
>>
> 
> Is this part not working?
> 
>      if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) {
>          if (!write)
>              inode_lock(inode);
>          fuse_sync_writes(inode);
>          if (!write)
>              inode_unlock(inode);
>      }
> 
> 


This code seems to wait for page cache writeback requests that have
already been triggered; it is not related to the issue this patch tries
to address. The issue here is that we should invalidate the related page
cache pages before we do a direct write.

Regards,
Hao

> 
> Thanks,
> Bernd
Bernd Schubert June 29, 2023, 3:35 p.m. UTC | #5
Hi Hao,

On 6/29/23 14:00, Hao Xu wrote:
> Hi Bernd,
> 
> On 6/27/23 02:23, Bernd Schubert wrote:
>>
>>
>> On 6/8/23 09:17, Hao Xu wrote:
>>> Ping...
>>>
>>> On 5/9/23 16:01, Hao Xu wrote:
>>>> From: Hao Xu <howeyxu@tencent.com>
>>>>
>>>> In FOPEN_DIRECT_IO, page cache may still be there for a file, direct
>>>> write should respect that and invalidate the corresponding pages so
>>>> that page cache readers don't get stale data. Another thing this patch
>>>> does is flush related pages to avoid its loss.
>>>>
>>>> Signed-off-by: Hao Xu <howeyxu@tencent.com>
>>>> ---
>>>>
>>>> Reference:
>>>> https://lore.kernel.org/linux-fsdevel/ee8380b3-683f-c526-5f10-1ce2ee6f79ad@linux.dev/#:~:text=I%20think%20this%20problem%20exists%20before%20this%20patchset
>>>>
>>>>   fs/fuse/file.c | 14 +++++++++++++-
>>>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
>>>> index 89d97f6188e0..edc84c1dfc5c 100644
>>>> --- a/fs/fuse/file.c
>>>> +++ b/fs/fuse/file.c
>>>> @@ -1490,7 +1490,8 @@ ssize_t fuse_direct_io(struct fuse_io_priv 
>>>> *io, struct iov_iter *iter,
>>>>       int write = flags & FUSE_DIO_WRITE;
>>>>       int cuse = flags & FUSE_DIO_CUSE;
>>>>       struct file *file = io->iocb->ki_filp;
>>>> -    struct inode *inode = file->f_mapping->host;
>>>> +    struct address_space *mapping = file->f_mapping;
>>>> +    struct inode *inode = mapping->host;
>>>>       struct fuse_file *ff = file->private_data;
>>>>       struct fuse_conn *fc = ff->fm->fc;
>>>>       size_t nmax = write ? fc->max_write : fc->max_read;
>>>> @@ -1516,6 +1517,17 @@ ssize_t fuse_direct_io(struct fuse_io_priv 
>>>> *io, struct iov_iter *iter,
>>>>               inode_unlock(inode);
>>>>       }
>>>> +    res = filemap_write_and_wait_range(mapping, pos, pos + count - 1);
>>>> +    if (res)
>>>> +        return res;
>>>> +
>>>> +    if (write) {
>>>> +        if (invalidate_inode_pages2_range(mapping,
>>>> +                idx_from, idx_to)) {
>>>> +            return -ENOTBLK;
>>>> +        }
>>>> +    }
>>>> +
>>>>       io->should_dirty = !write && user_backed_iter(iter);
>>>>       while (count) {
>>>>           ssize_t nres;
>>>
>>
>> Is this part not working?
>>
>>      if (!cuse && fuse_range_is_writeback(inode, idx_from, idx_to)) {
>>          if (!write)
>>              inode_lock(inode);
>>          fuse_sync_writes(inode);
>>          if (!write)
>>              inode_unlock(inode);
>>      }
>>
>>
> 
> 
> This code seems to be waiting for already triggered page cache writeback
> requests, it's not related with the issue this patch tries to address.
> The issue here is we should invalidate related page cache page before we
> do direct write.

Oh, right, I see it now. I think you should move your
filemap_write_and_wait_range() call above that piece in order to ensure
it is sent to the daemon/server side.


Thanks,
Bernd
Patch

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 89d97f6188e0..edc84c1dfc5c 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1490,7 +1490,8 @@  ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 	int write = flags & FUSE_DIO_WRITE;
 	int cuse = flags & FUSE_DIO_CUSE;
 	struct file *file = io->iocb->ki_filp;
-	struct inode *inode = file->f_mapping->host;
+	struct address_space *mapping = file->f_mapping;
+	struct inode *inode = mapping->host;
 	struct fuse_file *ff = file->private_data;
 	struct fuse_conn *fc = ff->fm->fc;
 	size_t nmax = write ? fc->max_write : fc->max_read;
@@ -1516,6 +1517,17 @@  ssize_t fuse_direct_io(struct fuse_io_priv *io, struct iov_iter *iter,
 			inode_unlock(inode);
 	}
 
+	res = filemap_write_and_wait_range(mapping, pos, pos + count - 1);
+	if (res)
+		return res;
+
+	if (write) {
+		if (invalidate_inode_pages2_range(mapping,
+				idx_from, idx_to)) {
+			return -ENOTBLK;
+		}
+	}
+
 	io->should_dirty = !write && user_backed_iter(iter);
 	while (count) {
 		ssize_t nres;
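
The hunk above passes byte offsets (pos, pos + count - 1) to the flush call
but page indices (idx_from, idx_to) to the invalidate call. The definitions
of idx_from and idx_to are not part of the hunks shown here; the userspace
sketch below assumes they are derived from pos and count by shifting with the
page shift. The names SKETCH_PAGE_SHIFT, struct page_range, and
byte_range_to_pages() are hypothetical, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Inclusive range of page indices, mirroring idx_from..idx_to. */
struct page_range {
	uint64_t first;
	uint64_t last;
};

/*
 * Map the byte range [pos, pos + count - 1] to the inclusive range of
 * page indices it touches. count must be non-zero, otherwise
 * pos + count - 1 would name the byte before pos.
 */
static struct page_range byte_range_to_pages(uint64_t pos, size_t count)
{
	struct page_range r;

	assert(count > 0);
	r.first = pos >> SKETCH_PAGE_SHIFT;
	r.last = (pos + count - 1) >> SKETCH_PAGE_SHIFT;
	return r;
}
```

For example, a 2-byte write at offset 4095 straddles a page boundary and maps
to pages 0 through 1, while a full 4096-byte write at offset 0 stays within
page 0.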