Message ID | 20220223211305.296816-1-trondmy@kernel.org (mailing list archive) |
---|---|
Headers | show |
Series | Readdir improvements | expand |
On Wed, Feb 23, 2022 at 4:24 PM <trondmy@kernel.org> wrote: > > From: Trond Myklebust <trond.myklebust@hammerspace.com> > > The current NFS readdir code will always try to maximise the amount of > readahead it performs on the assumption that we can cache anything that > isn't immediately read by the process. > There are several cases where this assumption breaks down, including > when the 'ls -l' heuristic kicks in to try to force use of readdirplus > as a batch replacement for lookup/getattr. > > This series also implement Ben's page cache filter to ensure that we can > improve the ability to share cached data between processes that are > reading the same directory at the same time, and to avoid live-locks > when the directory is simultaneously changing. > > -- > v2: Remove reset of dtsize when NFS_INO_FORCE_READDIR is set > v3: Avoid excessive window shrinking in uncached_readdir case > v4: Track 'ls -l' cache hit/miss statistics > Improved algorithm for falling back to uncached readdir > Skip readdirplus when files are being written to > v5: bugfixes > Skip readdirplus when the acdirmax/acregmax values are low > Request a full XDR buffer when doing READDIRPLUS > v6: Add tracing > Don't have lookup request readdirplus when it won't help > v7: Implement Ben's page cache filter > Reduce the use of uncached readdir > Change indexing of the page cache to improve seekdir() performance. > > Trond Myklebust (21): > NFS: constify nfs_server_capable() and nfs_have_writebacks() > NFS: Trace lookup revalidation failure > NFS: Use kzalloc() to avoid initialising the nfs_open_dir_context > NFS: Calculate page offsets algorithmically > NFS: Store the change attribute in the directory page cache > NFS: If the cookie verifier changes, we must invalidate the page cache > NFS: Don't re-read the entire page cache to find the next cookie > NFS: Adjust the amount of readahead performed by NFS readdir > NFS: Simplify nfs_readdir_xdr_to_array() > NFS: Reduce use of uncached readdir > NFS: Improve heuristic for readdirplus > NFS: Don't ask for readdirplus unless it can help nfs_getattr() > NFSv4: Ask for a full XDR buffer of readdir goodness > NFS: Readdirplus can't help lookup for case insensitive filesystems > NFS: Don't request readdirplus when revalidation was forced > NFS: Add basic readdir tracing > NFS: Trace effects of readdirplus on the dcache > NFS: Trace effects of the readdirplus heuristic > NFS: Convert readdir page cache to use a cookie based index > NFS: Fix up forced readdirplus > NFS: Remove unnecessary cache invalidations for directories > > fs/nfs/dir.c | 450 ++++++++++++++++++++++++----------------- > fs/nfs/inode.c | 46 ++--- > fs/nfs/internal.h | 4 +- > fs/nfs/nfs3xdr.c | 7 +- > fs/nfs/nfs4proc.c | 2 - > fs/nfs/nfs4xdr.c | 6 +- > fs/nfs/nfstrace.h | 122 ++++++++++- > include/linux/nfs_fs.h | 19 +- > 8 files changed, 421 insertions(+), 235 deletions(-) > > -- > 2.35.1 > Trond I have been following your work here with periodic tests though not fully following all the patches content. As you know this is a tricky area and seems to be a hotspot area for customers that use NFS, with many scenarios that may go wrong. Thanks for your work, which now includes even some tracepoints and Ben's page cache filler. This patchset seems to be the best of all the ones so far. My initial tests (listings when modifying as well as idle directories) indicate that the issue that Gonzalo reported on Jan 14th [1] looks to be fixed by this set, but I'll let him confirm. I'll do some more testing and let you know if there's anything else I find. If there're some scenarios (mount options, servers, etc) you need more testing on, let us know and we'll try to make that happen. [1] [PATCH] NFS: limit block size reported for directories
On Wed, Feb 23, 2022 at 4:24 PM <trondmy@kernel.org> wrote: > > From: Trond Myklebust <trond.myklebust@hammerspace.com> > > The current NFS readdir code will always try to maximise the amount of > readahead it performs on the assumption that we can cache anything that > isn't immediately read by the process. > There are several cases where this assumption breaks down, including > when the 'ls -l' heuristic kicks in to try to force use of readdirplus > as a batch replacement for lookup/getattr. > > This series also implement Ben's page cache filter to ensure that we can > improve the ability to share cached data between processes that are > reading the same directory at the same time, and to avoid live-locks > when the directory is simultaneously changing. > > -- > v2: Remove reset of dtsize when NFS_INO_FORCE_READDIR is set > v3: Avoid excessive window shrinking in uncached_readdir case > v4: Track 'ls -l' cache hit/miss statistics > Improved algorithm for falling back to uncached readdir > Skip readdirplus when files are being written to > v5: bugfixes > Skip readdirplus when the acdirmax/acregmax values are low > Request a full XDR buffer when doing READDIRPLUS > v6: Add tracing > Don't have lookup request readdirplus when it won't help > v7: Implement Ben's page cache filter > Reduce the use of uncached readdir > Change indexing of the page cache to improve seekdir() performance. > > Trond Myklebust (21): > NFS: constify nfs_server_capable() and nfs_have_writebacks() > NFS: Trace lookup revalidation failure > NFS: Use kzalloc() to avoid initialising the nfs_open_dir_context > NFS: Calculate page offsets algorithmically > NFS: Store the change attribute in the directory page cache > NFS: If the cookie verifier changes, we must invalidate the page cache > NFS: Don't re-read the entire page cache to find the next cookie > NFS: Adjust the amount of readahead performed by NFS readdir > NFS: Simplify nfs_readdir_xdr_to_array() > NFS: Reduce use of uncached readdir > NFS: Improve heuristic for readdirplus > NFS: Don't ask for readdirplus unless it can help nfs_getattr() > NFSv4: Ask for a full XDR buffer of readdir goodness > NFS: Readdirplus can't help lookup for case insensitive filesystems > NFS: Don't request readdirplus when revalidation was forced > NFS: Add basic readdir tracing > NFS: Trace effects of readdirplus on the dcache > NFS: Trace effects of the readdirplus heuristic > NFS: Convert readdir page cache to use a cookie based index > NFS: Fix up forced readdirplus > NFS: Remove unnecessary cache invalidations for directories > > fs/nfs/dir.c | 450 ++++++++++++++++++++++++----------------- > fs/nfs/inode.c | 46 ++--- > fs/nfs/internal.h | 4 +- > fs/nfs/nfs3xdr.c | 7 +- > fs/nfs/nfs4proc.c | 2 - > fs/nfs/nfs4xdr.c | 6 +- > fs/nfs/nfstrace.h | 122 ++++++++++- > include/linux/nfs_fs.h | 19 +- > 8 files changed, 421 insertions(+), 235 deletions(-) > > -- > 2.35.1 > I don't see this pushed this to your 'testing' branch, but I applied manually after resetting at 3f58e709c162
On Thu, 2022-02-24 at 07:25 -0500, David Wysochanski wrote: > On Wed, Feb 23, 2022 at 4:24 PM <trondmy@kernel.org> wrote: > > > > From: Trond Myklebust <trond.myklebust@hammerspace.com> > > > > The current NFS readdir code will always try to maximise the amount > > of > > readahead it performs on the assumption that we can cache anything > > that > > isn't immediately read by the process. > > There are several cases where this assumption breaks down, > > including > > when the 'ls -l' heuristic kicks in to try to force use of > > readdirplus > > as a batch replacement for lookup/getattr. > > > > This series also implement Ben's page cache filter to ensure that > > we can > > improve the ability to share cached data between processes that are > > reading the same directory at the same time, and to avoid live- > > locks > > when the directory is simultaneously changing. > > > > -- > > v2: Remove reset of dtsize when NFS_INO_FORCE_READDIR is set > > v3: Avoid excessive window shrinking in uncached_readdir case > > v4: Track 'ls -l' cache hit/miss statistics > > Improved algorithm for falling back to uncached readdir > > Skip readdirplus when files are being written to > > v5: bugfixes > > Skip readdirplus when the acdirmax/acregmax values are low > > Request a full XDR buffer when doing READDIRPLUS > > v6: Add tracing > > Don't have lookup request readdirplus when it won't help > > v7: Implement Ben's page cache filter > > Reduce the use of uncached readdir > > Change indexing of the page cache to improve seekdir() > > performance. > > > > Trond Myklebust (21): > > NFS: constify nfs_server_capable() and nfs_have_writebacks() > > NFS: Trace lookup revalidation failure > > NFS: Use kzalloc() to avoid initialising the nfs_open_dir_context > > NFS: Calculate page offsets algorithmically > > NFS: Store the change attribute in the directory page cache > > NFS: If the cookie verifier changes, we must invalidate the page > > cache > > NFS: Don't re-read the entire page cache to find the next cookie > > NFS: Adjust the amount of readahead performed by NFS readdir > > NFS: Simplify nfs_readdir_xdr_to_array() > > NFS: Reduce use of uncached readdir > > NFS: Improve heuristic for readdirplus > > NFS: Don't ask for readdirplus unless it can help nfs_getattr() > > NFSv4: Ask for a full XDR buffer of readdir goodness > > NFS: Readdirplus can't help lookup for case insensitive > > filesystems > > NFS: Don't request readdirplus when revalidation was forced > > NFS: Add basic readdir tracing > > NFS: Trace effects of readdirplus on the dcache > > NFS: Trace effects of the readdirplus heuristic > > NFS: Convert readdir page cache to use a cookie based index > > NFS: Fix up forced readdirplus > > NFS: Remove unnecessary cache invalidations for directories > > > > fs/nfs/dir.c | 450 ++++++++++++++++++++++++------------- > > ---- > > fs/nfs/inode.c | 46 ++--- > > fs/nfs/internal.h | 4 +- > > fs/nfs/nfs3xdr.c | 7 +- > > fs/nfs/nfs4proc.c | 2 - > > fs/nfs/nfs4xdr.c | 6 +- > > fs/nfs/nfstrace.h | 122 ++++++++++- > > include/linux/nfs_fs.h | 19 +- > > 8 files changed, 421 insertions(+), 235 deletions(-) > > > > -- > > 2.35.1 > > > > Trond I have been following your work here with periodic tests though > not fully following all the patches content. As you know this is a > tricky > area and seems to be a hotspot area for customers that use NFS, with > many scenarios that may go wrong. Thanks for your work, which now > includes even some tracepoints and Ben's page cache filler. > > This patchset seems to be the best of all the ones so far. My > initial > tests (listings when modifying as well as idle directories) indicate > that the issue that Gonzalo reported on Jan 14th [1] looks to be > fixed > by this set, but I'll let him confirm. I'll do some more testing and > let you know if there's anything else I find. If there're some > scenarios (mount options, servers, etc) you need more testing on, let > us know and we'll try to make that happen. > > [1] [PATCH] NFS: limit block size reported for directories > Thanks Dave! I very much appreciate your testing the patches. This is a very complex area due to the interplay of readdir/readdirplus, the dcache and the attribute cache, and so it is key to get as much statistics as possible. Thanks again to you and Red Hat for contributing to that effort. Thanks also to Ben for his comments and reviews. His insights are key to the progress we appear to be making.
From: Trond Myklebust <trond.myklebust@hammerspace.com> The current NFS readdir code will always try to maximise the amount of readahead it performs on the assumption that we can cache anything that isn't immediately read by the process. There are several cases where this assumption breaks down, including when the 'ls -l' heuristic kicks in to try to force use of readdirplus as a batch replacement for lookup/getattr. This series also implement Ben's page cache filter to ensure that we can improve the ability to share cached data between processes that are reading the same directory at the same time, and to avoid live-locks when the directory is simultaneously changing. -- v2: Remove reset of dtsize when NFS_INO_FORCE_READDIR is set v3: Avoid excessive window shrinking in uncached_readdir case v4: Track 'ls -l' cache hit/miss statistics Improved algorithm for falling back to uncached readdir Skip readdirplus when files are being written to v5: bugfixes Skip readdirplus when the acdirmax/acregmax values are low Request a full XDR buffer when doing READDIRPLUS v6: Add tracing Don't have lookup request readdirplus when it won't help v7: Implement Ben's page cache filter Reduce the use of uncached readdir Change indexing of the page cache to improve seekdir() performance. Trond Myklebust (21): NFS: constify nfs_server_capable() and nfs_have_writebacks() NFS: Trace lookup revalidation failure NFS: Use kzalloc() to avoid initialising the nfs_open_dir_context NFS: Calculate page offsets algorithmically NFS: Store the change attribute in the directory page cache NFS: If the cookie verifier changes, we must invalidate the page cache NFS: Don't re-read the entire page cache to find the next cookie NFS: Adjust the amount of readahead performed by NFS readdir NFS: Simplify nfs_readdir_xdr_to_array() NFS: Reduce use of uncached readdir NFS: Improve heuristic for readdirplus NFS: Don't ask for readdirplus unless it can help nfs_getattr() NFSv4: Ask for a full XDR buffer of readdir goodness NFS: Readdirplus can't help lookup for case insensitive filesystems NFS: Don't request readdirplus when revalidation was forced NFS: Add basic readdir tracing NFS: Trace effects of readdirplus on the dcache NFS: Trace effects of the readdirplus heuristic NFS: Convert readdir page cache to use a cookie based index NFS: Fix up forced readdirplus NFS: Remove unnecessary cache invalidations for directories fs/nfs/dir.c | 450 ++++++++++++++++++++++++----------------- fs/nfs/inode.c | 46 ++--- fs/nfs/internal.h | 4 +- fs/nfs/nfs3xdr.c | 7 +- fs/nfs/nfs4proc.c | 2 - fs/nfs/nfs4xdr.c | 6 +- fs/nfs/nfstrace.h | 122 ++++++++++- include/linux/nfs_fs.h | 19 +- 8 files changed, 421 insertions(+), 235 deletions(-)