Message ID | 20240821024301.1058918-2-wozizhi@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | netfs/cachefiles: Some bugfixes | expand |
Zizhi Wo <wozizhi@huawei.com> wrote: > In the __cachefiles_prepare_write function, DIO aligns blocks using > PAGE_SIZE as the unit. And currently cachefiles_add_cache() binds > cache->bsize with the requirement that it must not exceed PAGE_SIZE. > However, if cache->bsize is smaller than PAGE_SIZE, the calculated block > count will be incorrect in __cachefiles_prepare_write(). > > Set the block size to cache->bsize to resolve this issue. Have you tested this with 9p, afs, cifs, ceph and/or nfs? This may cause an issue there as it assumed that the cache file will be padded out to PAGE_SIZE (see cachefiles_adjust_size()). David
在 2024/10/10 18:34, David Howells 写道: > Zizhi Wo <wozizhi@huawei.com> wrote: > >> In the __cachefiles_prepare_write function, DIO aligns blocks using >> PAGE_SIZE as the unit. And currently cachefiles_add_cache() binds >> cache->bsize with the requirement that it must not exceed PAGE_SIZE. >> However, if cache->bsize is smaller than PAGE_SIZE, the calculated block >> count will be incorrect in __cachefiles_prepare_write(). >> >> Set the block size to cache->bsize to resolve this issue. > > Have you tested this with 9p, afs, cifs, ceph and/or nfs? This may cause an > issue there as it assumed that the cache file will be padded out to > PAGE_SIZE (see cachefiles_adjust_size()). > > David > > In my opinion, cachefiles_add_cache() will pass the corresponding size to cache->bsize. For scenarios such as nfs/cifs, the corresponding bsize is PAGE_SIZE aligned, which is fine. For scenarios where cache->bsize is specified for non-PAGE_SIZE alignment (such as erofs on demand mode), imposing PAGE_SIZE here can be problematic. So modify cache->bsize to be more generic. Thanks, Zizhi Wo
Zizhi Wo <wozizhi@huawei.com> wrote:
> For scenarios such as nfs/cifs, the corresponding bsize is PAGE_SIZE aligned
cache->bsize is a property of the cache device, not the network filesystems
that might be making use of it (and it might be shared between multiple
volumes from multiple network filesystems, all of which could, in theory, have
different 'block sizes', inasmuch as network filesystems have block sizes).
David
在 2024/10/10 19:36, David Howells 写道: > Zizhi Wo <wozizhi@huawei.com> wrote: > >> For scenarios such as nfs/cifs, the corresponding bsize is PAGE_SIZE aligned > > cache->bsize is a property of the cache device, not the network filesystems > that might be making use of it (and it might be shared between multiple > volumes from multiple network filesystems, all of which could, in theory, have > different 'block sizes', inasmuch as network filesystems have block sizes). > > David > > Then I was wrong. Thank you for pointing it out. I'll be thinking about new solutions for non-PAGE_SIZE scenarios. Thanks, Zizhi Wo
在 2024/10/10 19:36, David Howells 写道: > Zizhi Wo <wozizhi@huawei.com> wrote: > >> For scenarios such as nfs/cifs, the corresponding bsize is PAGE_SIZE aligned > > cache->bsize is a property of the cache device, not the network filesystems > that might be making use of it (and it might be shared between multiple > volumes from multiple network filesystems, all of which could, in theory, have > different 'block sizes', inasmuch as network filesystems have block sizes). > > David > Sorry, I might have a slightly different idea now. The EROFS process based on fscache is similar to NFS and others, supporting only PAGE_SIZE-granularity reads/writes, which is fine. The issue now lies in the __cachefiles_prepare_write() function, specifically in the block count calculation when calling cachefiles_has_space(). By default, the block size used there is PAGE_SIZE. If the block size of the underlying cache filesystem is not PAGE_SIZE (for example, smaller than PAGE_SIZE), it leads to an incorrect calculation of the required block count, which makes the system think there is enough space when there might not be. This should be the same problem for all network file systems that specify the back-end cache file system mode. For example, assume len=12k, pagesize=4k, and the underlying cache filesystem block size is 1k. The currently available blocks are 4, with 4k of space remaining. However, cachefiles_has_space() calculates the block count as 12k/4k=3, and since 4 > 3, it incorrectly returns that there is enough space. Therefore, I believe this could be a general issue. If cachefiles_add_cache() is part of a common workflow, then using cache->bsize here might be more appropriate? Looking forward to your reply. Thanks, Zizhi Wo >
diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c index a91acd03ee12..59c5c08f921a 100644 --- a/fs/cachefiles/io.c +++ b/fs/cachefiles/io.c @@ -524,10 +524,10 @@ int __cachefiles_prepare_write(struct cachefiles_object *object, struct cachefiles_cache *cache = object->volume->cache; loff_t start = *_start, pos; size_t len = *_len; - int ret; + int ret, block_size = cache->bsize; /* Round to DIO size */ - start = round_down(*_start, PAGE_SIZE); + start = round_down(*_start, block_size); if (start != *_start || *_len > upper_len) { /* Probably asked to cache a streaming write written into the * pagecache when the cookie was temporarily out of service to @@ -537,7 +537,7 @@ int __cachefiles_prepare_write(struct cachefiles_object *object, return -ENOBUFS; } - *_len = round_up(len, PAGE_SIZE); + *_len = round_up(len, block_size); /* We need to work out whether there's sufficient disk space to perform * the write - but we can skip that check if we have space already @@ -563,7 +563,7 @@ int __cachefiles_prepare_write(struct cachefiles_object *object, * space, we need to see if it's fully allocated. If it's not, we may * want to cull it. */ - if (cachefiles_has_space(cache, 0, *_len / PAGE_SIZE, + if (cachefiles_has_space(cache, 0, *_len / block_size, cachefiles_has_space_check) == 0) return 0; /* Enough space to simply overwrite the whole block */ @@ -595,7 +595,7 @@ int __cachefiles_prepare_write(struct cachefiles_object *object, return ret; check_space: - return cachefiles_has_space(cache, 0, *_len / PAGE_SIZE, + return cachefiles_has_space(cache, 0, *_len / block_size, cachefiles_has_space_for_write); }
In the __cachefiles_prepare_write function, DIO aligns blocks using PAGE_SIZE as the unit. And currently cachefiles_add_cache() binds cache->bsize with the requirement that it must not exceed PAGE_SIZE. However, if cache->bsize is smaller than PAGE_SIZE, the calculated block count will be incorrect in __cachefiles_prepare_write(). Set the block size to cache->bsize to resolve this issue. Fixes: 047487c947e8 ("cachefiles: Implement the I/O routines") Signed-off-by: Zizhi Wo <wozizhi@huawei.com> --- fs/cachefiles/io.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)