diff mbox series

[1/8] cachefiles: Fix incorrect block calculations in __cachefiles_prepare_write()

Message ID 20240821024301.1058918-2-wozizhi@huawei.com (mailing list archive)
State New
Headers show
Series netfs/cachefiles: Some bugfixes | expand

Commit Message

Zizhi Wo Aug. 21, 2024, 2:42 a.m. UTC
In the __cachefiles_prepare_write function, DIO aligns blocks using
PAGE_SIZE as the unit. And currently cachefiles_add_cache() binds
cache->bsize with the requirement that it must not exceed PAGE_SIZE.
However, if cache->bsize is smaller than PAGE_SIZE, the calculated block
count will be incorrect in __cachefiles_prepare_write().

Set the block size to cache->bsize to resolve this issue.

Fixes: 047487c947e8 ("cachefiles: Implement the I/O routines")
Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
---
 fs/cachefiles/io.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

David Howells Oct. 10, 2024, 10:34 a.m. UTC | #1
Zizhi Wo <wozizhi@huawei.com> wrote:

> In the __cachefiles_prepare_write function, DIO aligns blocks using
> PAGE_SIZE as the unit. And currently cachefiles_add_cache() binds
> cache->bsize with the requirement that it must not exceed PAGE_SIZE.
> However, if cache->bsize is smaller than PAGE_SIZE, the calculated block
> count will be incorrect in __cachefiles_prepare_write().
> 
> Set the block size to cache->bsize to resolve this issue.

Have you tested this with 9p, afs, cifs, ceph and/or nfs?  This may cause an
issue there as it assumed that the cache file will be padded out to
PAGE_SIZE (see cachefiles_adjust_size()).

David
Zizhi Wo Oct. 10, 2024, 11:11 a.m. UTC | #2
在 2024/10/10 18:34, David Howells 写道:
> Zizhi Wo <wozizhi@huawei.com> wrote:
> 
>> In the __cachefiles_prepare_write function, DIO aligns blocks using
>> PAGE_SIZE as the unit. And currently cachefiles_add_cache() binds
>> cache->bsize with the requirement that it must not exceed PAGE_SIZE.
>> However, if cache->bsize is smaller than PAGE_SIZE, the calculated block
>> count will be incorrect in __cachefiles_prepare_write().
>>
>> Set the block size to cache->bsize to resolve this issue.
> 
> Have you tested this with 9p, afs, cifs, ceph and/or nfs?  This may cause an
> issue there as it assumed that the cache file will be padded out to
> PAGE_SIZE (see cachefiles_adjust_size()).
> 
> David
> 
> 

In my opinion, cachefiles_add_cache() will pass the corresponding size
to cache->bsize. For scenarios such as nfs/cifs, the corresponding bsize
is PAGE_SIZE aligned, which is fine. For scenarios where cache->bsize is
specified for non-PAGE_SIZE alignment (such as erofs on demand mode),
imposing PAGE_SIZE here can be problematic. So modify cache->bsize to be
more generic.

Thanks,
Zizhi Wo
David Howells Oct. 10, 2024, 11:36 a.m. UTC | #3
Zizhi Wo <wozizhi@huawei.com> wrote:

> For scenarios such as nfs/cifs, the corresponding bsize is PAGE_SIZE aligned

cache->bsize is a property of the cache device, not the network filesystems
that might be making use of it (and it might be shared between multiple
volumes from multiple network filesystems, all of which could, in theory, have
different 'block sizes', inasmuch as network filesystems have block sizes).

David
Zizhi Wo Oct. 10, 2024, 12:17 p.m. UTC | #4
在 2024/10/10 19:36, David Howells 写道:
> Zizhi Wo <wozizhi@huawei.com> wrote:
> 
>> For scenarios such as nfs/cifs, the corresponding bsize is PAGE_SIZE aligned
> 
> cache->bsize is a property of the cache device, not the network filesystems
> that might be making use of it (and it might be shared between multiple
> volumes from multiple network filesystems, all of which could, in theory, have
> different 'block sizes', inasmuch as network filesystems have block sizes).
> 
> David
> 
> 

Then I was wrong. Thank you for pointing it out. I'll be thinking about
new solutions for non-PAGE_SIZE scenarios.

Thanks,
Zizhi Wo
Zizhi Wo Oct. 10, 2024, 1:15 p.m. UTC | #5
在 2024/10/10 19:36, David Howells 写道:
> Zizhi Wo <wozizhi@huawei.com> wrote:
> 
>> For scenarios such as nfs/cifs, the corresponding bsize is PAGE_SIZE aligned
> 
> cache->bsize is a property of the cache device, not the network filesystems
> that might be making use of it (and it might be shared between multiple
> volumes from multiple network filesystems, all of which could, in theory, have
> different 'block sizes', inasmuch as network filesystems have block sizes).
> 
> David
> 

Sorry, I might have a slightly different idea now. The EROFS process
based on fscache is similar to NFS and others, supporting only
PAGE_SIZE-granularity reads/writes, which is fine. The issue now lies in
the __cachefiles_prepare_write() function, specifically in the block
count calculation when calling cachefiles_has_space(). By default, the
block size used there is PAGE_SIZE.

If the block size of the underlying cache filesystem is not PAGE_SIZE
(for example, smaller than PAGE_SIZE), it leads to an incorrect
calculation of the required block count, which makes the system think
there is enough space when there might not be. This should be the same
problem for all network file systems that specify the back-end cache
file system mode.

For example, assume len=12k, pagesize=4k, and the underlying cache
filesystem block size is 1k. The currently available blocks are 4, with
4k of space remaining. However, cachefiles_has_space() calculates the
block count as 12k/4k=3, and since 4 > 3, it incorrectly returns that
there is enough space.

Therefore, I believe this could be a general issue. If
cachefiles_add_cache() is part of a common workflow, then using
cache->bsize here might be more appropriate? Looking forward to your
reply.

Thanks,
Zizhi Wo

>
diff mbox series

Patch

diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
index a91acd03ee12..59c5c08f921a 100644
--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -524,10 +524,10 @@  int __cachefiles_prepare_write(struct cachefiles_object *object,
 	struct cachefiles_cache *cache = object->volume->cache;
 	loff_t start = *_start, pos;
 	size_t len = *_len;
-	int ret;
+	int ret, block_size = cache->bsize;
 
 	/* Round to DIO size */
-	start = round_down(*_start, PAGE_SIZE);
+	start = round_down(*_start, block_size);
 	if (start != *_start || *_len > upper_len) {
 		/* Probably asked to cache a streaming write written into the
 		 * pagecache when the cookie was temporarily out of service to
@@ -537,7 +537,7 @@  int __cachefiles_prepare_write(struct cachefiles_object *object,
 		return -ENOBUFS;
 	}
 
-	*_len = round_up(len, PAGE_SIZE);
+	*_len = round_up(len, block_size);
 
 	/* We need to work out whether there's sufficient disk space to perform
 	 * the write - but we can skip that check if we have space already
@@ -563,7 +563,7 @@  int __cachefiles_prepare_write(struct cachefiles_object *object,
 	 * space, we need to see if it's fully allocated.  If it's not, we may
 	 * want to cull it.
 	 */
-	if (cachefiles_has_space(cache, 0, *_len / PAGE_SIZE,
+	if (cachefiles_has_space(cache, 0, *_len / block_size,
 				 cachefiles_has_space_check) == 0)
 		return 0; /* Enough space to simply overwrite the whole block */
 
@@ -595,7 +595,7 @@  int __cachefiles_prepare_write(struct cachefiles_object *object,
 	return ret;
 
 check_space:
-	return cachefiles_has_space(cache, 0, *_len / PAGE_SIZE,
+	return cachefiles_has_space(cache, 0, *_len / block_size,
 				    cachefiles_has_space_for_write);
 }