Message ID | 163969801519.20885.3977673503103544412.stgit@noble.brown (mailing list archive) |
---|---|
Headers | show |
Series | Repair SWAP-over-NFS | expand |
Hi Neil, On Thu, Dec 16, 2021 at 7:07 PM NeilBrown <neilb@suse.de> wrote: > > swap-over-NFS currently has a variety of problems. > > swap writes call generic_write_checks(), which always fails on a swap > file, so it completely fails. > Even without this, various deadlocks are possible - largely due to > improvements in NFS memory allocation (using NOFS instead of ATOMIC) > which weren't tested against swap-out. > > NFS is the only filesystem that has supported fs-based swap IO, and it > hasn't worked for several releases, so now is a convenient time to clean > up the swap-via-filesystem interfaces - we cannot break anything ! > > So the first few patches here clean up and improve various parts of the > swap-via-filesystem code. ->activate_swap() is given a cleaner > interface, a new ->swap_rw is introduced instead of burdening > ->direct_IO, etc. > > Current swap-to-filesystem code only ever submits single-page reads and > writes. These patches change that to allow multi-page IO when adjacent > requests are submitted. Writes are also changed to be async rather than > sync. This substantially speeds up write throughput for swap-over-NFS. > > Some of the NFS patches can land independently of the MM patches. A few > require the MM patches to land first. Thanks for fixing swap-over-NFS! Looks like it passes all the swap-related xfstests except for generic/357 on NFS v4.2. This test checks that we get -EINVAL on a reflinked swapfile, but I'm not sure if there is a way to check for that on the client side but if you have any ideas it would be nice to get that test passing while you're at it! Anna > > Thanks, > NeilBrown > > > --- > > NeilBrown (18): > Structural cleanup for filesystem-based swap > MM: create new mm/swap.h header file. > MM: use ->swap_rw for reads from SWP_FS_OPS swap-space > MM: perform async writes to SWP_FS_OPS swap-space > MM: reclaim mustn't enter FS for SWP_FS_OPS swap-space > MM: submit multipage reads for SWP_FS_OPS swap-space > MM: submit multipage write for SWP_FS_OPS swap-space > MM: Add AS_CAN_DIO mapping flag > NFS: rename nfs_direct_IO and use as ->swap_rw > NFS: swap IO handling is slightly different for O_DIRECT IO > SUNRPC/call_alloc: async tasks mustn't block waiting for memory > SUNRPC/auth: async tasks mustn't block waiting for memory > SUNRPC/xprt: async tasks mustn't block waiting for memory > SUNRPC: remove scheduling boost for "SWAPPER" tasks. > NFS: discard NFS_RPC_SWAPFLAGS and RPC_TASK_ROOTCREDS > SUNRPC: improve 'swap' handling: scheduling and PF_MEMALLOC > NFSv4: keep state manager thread active if swap is enabled > NFS: swap-out must always use STABLE writes. > > > drivers/block/loop.c | 4 +- > fs/fcntl.c | 5 +- > fs/inode.c | 3 + > fs/nfs/direct.c | 56 ++++++---- > fs/nfs/file.c | 25 +++-- > fs/nfs/inode.c | 1 + > fs/nfs/nfs4_fs.h | 1 + > fs/nfs/nfs4proc.c | 20 ++++ > fs/nfs/nfs4state.c | 39 ++++++- > fs/nfs/read.c | 4 - > fs/nfs/write.c | 2 + > fs/open.c | 2 +- > fs/overlayfs/file.c | 10 +- > include/linux/fs.h | 2 +- > include/linux/nfs_fs.h | 11 +- > include/linux/nfs_xdr.h | 2 + > include/linux/pagemap.h | 3 +- > include/linux/sunrpc/auth.h | 1 + > include/linux/sunrpc/sched.h | 1 - > include/linux/swap.h | 121 -------------------- > include/linux/writeback.h | 7 ++ > include/trace/events/sunrpc.h | 1 - > mm/madvise.c | 9 +- > mm/memory.c | 3 +- > mm/mincore.c | 1 + > mm/page_alloc.c | 1 + > mm/page_io.c | 189 ++++++++++++++++++++++++++------ > mm/shmem.c | 1 + > mm/swap.h | 140 +++++++++++++++++++++++ > mm/swap_state.c | 32 ++++-- > mm/swapfile.c | 6 + > mm/util.c | 1 + > mm/vmscan.c | 31 +++++- > net/sunrpc/auth.c | 8 +- > net/sunrpc/auth_gss/auth_gss.c | 6 +- > net/sunrpc/auth_unix.c | 10 +- > net/sunrpc/clnt.c | 7 +- > net/sunrpc/sched.c | 29 +++-- > net/sunrpc/xprt.c | 19 ++-- > net/sunrpc/xprtrdma/transport.c | 10 +- > net/sunrpc/xprtsock.c | 8 ++ > 41 files changed, 558 insertions(+), 274 deletions(-) > create mode 100644 mm/swap.h > > -- > Signature >
On Sat, 18 Dec 2021, Anna Schumaker wrote: > Hi Neil, > > On Thu, Dec 16, 2021 at 7:07 PM NeilBrown <neilb@suse.de> wrote: > > > > swap-over-NFS currently has a variety of problems. > > > > swap writes call generic_write_checks(), which always fails on a swap > > file, so it completely fails. > > Even without this, various deadlocks are possible - largely due to > > improvements in NFS memory allocation (using NOFS instead of ATOMIC) > > which weren't tested against swap-out. > > > > NFS is the only filesystem that has supported fs-based swap IO, and it > > hasn't worked for several releases, so now is a convenient time to clean > > up the swap-via-filesystem interfaces - we cannot break anything ! > > > > So the first few patches here clean up and improve various parts of the > > swap-via-filesystem code. ->activate_swap() is given a cleaner > > interface, a new ->swap_rw is introduced instead of burdening > > ->direct_IO, etc. > > > > Current swap-to-filesystem code only ever submits single-page reads and > > writes. These patches change that to allow multi-page IO when adjacent > > requests are submitted. Writes are also changed to be async rather than > > sync. This substantially speeds up write throughput for swap-over-NFS. > > > > Some of the NFS patches can land independently of the MM patches. A few > > require the MM patches to land first. > > Thanks for fixing swap-over-NFS! Looks like it passes all the > swap-related xfstests except for generic/357 on NFS v4.2. This test > checks that we get -EINVAL on a reflinked swapfile, but I'm not sure > if there is a way to check for that on the client side but if you have > any ideas it would be nice to get that test passing while you're at > it! Thanks for testing!. I think that testing that swap fails on a reflinked file is bogus. This isn't an important part of the API, it is just an internal implementation detail. I certainly understand that it could be problematic implementing swap on a reflinked file within XFS and it is perfectly acceptable to fail such a request. But if one day someone decided to implement it - should that be seen as a regression? Certainly over NFS there is no reason at all not to swap to a file that happens to be reflinked on the server. I don't think it even makes sense to test if the file has holes as the current nfs_swap_activate() does. I don't exactly object to the test, but I think it is misguided and pointless. Thanks, NeilBrown
On Mon, Dec 20, 2021 at 08:07:15AM +1100, NeilBrown wrote: > > Thanks for fixing swap-over-NFS! Looks like it passes all the > > swap-related xfstests except for generic/357 on NFS v4.2. This test > > checks that we get -EINVAL on a reflinked swapfile, but I'm not sure > > if there is a way to check for that on the client side but if you have > > any ideas it would be nice to get that test passing while you're at > > it! > > Thanks for testing!. > I think that testing that swap fails on a reflinked file is bogus. This > isn't an important part of the API, it is just an internal > implementation detail. > I certainly understand that it could be problematic implementing swap on > a reflinked file within XFS and it is perfectly acceptable to fail such > a request. But if one day someone decided to implement it - should that > be seen as a regression? Yes, there is really no fundamental reason not to support swap to reflinked files, especially for NFS. We'll need some kind of opt-in/out for this test.