mbox series

[00/10] Convert NFS fscache read paths to netfs API

Message ID 1611845708-6752-1-git-send-email-dwysocha@redhat.com (mailing list archive)
Headers show
Series Convert NFS fscache read paths to netfs API | expand

Message

David Wysochanski Jan. 28, 2021, 2:54 p.m. UTC
This minimal set of patches update the NFS client to use the new
readahead method, and convert the NFS fscache to use the new netfs
IO API, and are at:
https://github.com/DaveWysochanskiRH/kernel/releases/tag/fscache-iter-lib-nfs-20210128
https://github.com/DaveWysochanskiRH/kernel/commit/74357eb291c9c292f3ab3bc9ed1227cb76f52c51

The patches are based on David Howells fscache-netfs-lib tree at
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-netfs-lib

The first 6 patches refactor some of the NFS read code to facilitate
re-use, the next 4 patches do the conversion to the new API.  Note
patch 8 converts nfs_readpages to nfs_readahead.

Changes since my last posting on Jan 27, 2021
- Fix oops with fscache enabled on parallel read unit test
- Add patches to handle invalidate and releasepage
- Use #define FSCACHE_USE_NEW_IO_API to select the new API
- Minor cleanup in nfs_readahead_from_fscache

Still TODO
1. Fix known bugs
a) nfs_issue_op: takes rcu_read_lock but may calls nfs_page_alloc()
   with GFP_KERNEL which may sleep (dhowells noted this in a review)
b) nfs_refresh_inode() takes inode->i_lock but may call
   __fscache_invalidate() which may sleep (found with lockdep)
c) WARN with xfstest fscache/netapp/pnfs/nfs41
2. Fixup NFS fscache stats (NFSIOS_FSCACHE_*)
* Compare with netfs stats and determine if still needed
3. Cleanup dfprintks and/or convert to tracepoints
4. Further tests (see "Not tested yet")

Tests run
1. Custom NFS+fscache unit tests for basic operation: PASS
* vers=3,4.0,4.1,4.2,sec=sys,server=localhost (same kernel)
2. cthon04: PASS
* test options "-b -g -s -l", fsc,vers=3,4.0,4.1,4.2,sec=sys
* No failures, oopses or hangs
3. iozone tests: PASS
* nofsc,fsc,vers=3,4.0,4.1,4.2,sec=sys,server=rhel7,rhel8
* No failures, oopses, or hangs
4. xfstests/generic: PASS*
* no hangs or crashes (one WARN); failures unrelated to these patches
* Ran following configurations
  * vers=4.1,fsc,sec=sys,rhel7-server: PASS
  * vers=4.0,fsc,sec=sys,rhel7-server: PASS
  * vers=3,fsc,sec=sys,rhel7-server: PASS
  * vers=4.1,nofsc,sec=sys,netapp-server(pnfs/files): PASS
  * vers=4.1,fsc,sec=sys,netapp-server(pnfs/files): INCOMPLETE
    * WARN_ON fs/netfs/read_helper.c:616
    * ran with kernel.panic_on_oops=1
  * vers=4.2,fsc,sec=sys,rhel7-server: running at generic/438
  * vers=4.2,fsc,sec=sys,rhel8-server: running at generic/127
5. kernel build: PASS
  * vers=4.2,fsc,sec=sys,rhel8-server: PASS

Not tested yet:
* error injections (for example, connection disruptions, server errors during IO, etc)
* many process mixed read/write on same file
* performance

Dave Wysochanski (10):
  NFS: Clean up nfs_readpage() and nfs_readpages()
  NFS: In nfs_readpage() only increment NFSIOS_READPAGES when read
    succeeds
  NFS: Refactor nfs_readpage() and nfs_readpage_async() to use
    nfs_readdesc
  NFS: Call readpage_async_filler() from nfs_readpage_async()
  NFS: Add nfs_pageio_complete_read() and remove nfs_readpage_async()
  NFS: Allow internal use of read structs and functions
  NFS: Convert to the netfs API and nfs_readpage to use netfs_readpage
  NFS: Convert readpages to readahead and use netfs_readahead for
    fscache
  NFS: Update releasepage to handle new fscache kiocb IO API
  NFS: update various invalidation code paths for new IO API

 fs/nfs/file.c              |  22 +++--
 fs/nfs/fscache.c           | 230 +++++++++++++++++++------------------------
 fs/nfs/fscache.h           | 105 +++-----------------
 fs/nfs/internal.h          |   8 ++
 fs/nfs/pagelist.c          |   2 +
 fs/nfs/read.c              | 240 ++++++++++++++++++++-------------------------
 fs/nfs/write.c             |  10 +-
 include/linux/nfs_fs.h     |   5 +-
 include/linux/nfs_iostat.h |   2 +-
 include/linux/nfs_page.h   |   1 +
 include/linux/nfs_xdr.h    |   1 +
 11 files changed, 257 insertions(+), 369 deletions(-)

Comments

David Wysochanski Feb. 1, 2021, 2:15 a.m. UTC | #1
On Thu, Jan 28, 2021 at 9:59 AM Dave Wysochanski <dwysocha@redhat.com> wrote:
>
> This minimal set of patches update the NFS client to use the new
> readahead method, and convert the NFS fscache to use the new netfs
> IO API, and are at:
> https://github.com/DaveWysochanskiRH/kernel/releases/tag/fscache-iter-lib-nfs-20210128
> https://github.com/DaveWysochanskiRH/kernel/commit/74357eb291c9c292f3ab3bc9ed1227cb76f52c51
>
> The patches are based on David Howells fscache-netfs-lib tree at
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-netfs-lib
>
> The first 6 patches refactor some of the NFS read code to facilitate
> re-use, the next 4 patches do the conversion to the new API.  Note
> patch 8 converts nfs_readpages to nfs_readahead.
>
> Changes since my last posting on Jan 27, 2021
> - Fix oops with fscache enabled on parallel read unit test
> - Add patches to handle invalidate and releasepage
> - Use #define FSCACHE_USE_NEW_IO_API to select the new API
> - Minor cleanup in nfs_readahead_from_fscache
>
> Still TODO
> 1. Fix known bugs
> a) nfs_issue_op: takes rcu_read_lock but may calls nfs_page_alloc()
>    with GFP_KERNEL which may sleep (dhowells noted this in a review)
> b) nfs_refresh_inode() takes inode->i_lock but may call
>    __fscache_invalidate() which may sleep (found with lockdep)
> c) WARN with xfstest fscache/netapp/pnfs/nfs41

Turns out this is a bit more involved and I would not consider pNFS +
fscache stable right now.
For now I may have to disable fscache if pNFS is enabled unless I can
quickly come up
with a reasonable fix for the problem.

The problem is as follows. Once netfs calls us in "issue_op" for a
given subrequest, it expects
one call back when the subrequest completes.  Now the "clamp_length"
function was developed
so we tell the netfs caller how big of an IO we can handle.  However,
right now it only implements
an 'rsize' check, and it does not take into account pNFS
characteristics such as segments
which may split up the IO into multiple RPCs. Since each of the RPC
have their own
completion, and so far I've not come up with a way to just call back
into netfs when the
last one is done, I am not sure what the right approach is.  One
obvious approach would be
a more sophisticated "clamp_length" function which adds similar logic
as to the *pg_test()
functions.  But I don't want to duplicate that and so it's not really clear.

> 2. Fixup NFS fscache stats (NFSIOS_FSCACHE_*)
> * Compare with netfs stats and determine if still needed
> 3. Cleanup dfprintks and/or convert to tracepoints
> 4. Further tests (see "Not tested yet")
>
> Tests run
> 1. Custom NFS+fscache unit tests for basic operation: PASS
> * vers=3,4.0,4.1,4.2,sec=sys,server=localhost (same kernel)
> 2. cthon04: PASS
> * test options "-b -g -s -l", fsc,vers=3,4.0,4.1,4.2,sec=sys
> * No failures, oopses or hangs
> 3. iozone tests: PASS
> * nofsc,fsc,vers=3,4.0,4.1,4.2,sec=sys,server=rhel7,rhel8
> * No failures, oopses, or hangs
> 4. xfstests/generic: PASS*
> * no hangs or crashes (one WARN); failures unrelated to these patches
> * Ran following configurations
>   * vers=4.1,fsc,sec=sys,rhel7-server: PASS
>   * vers=4.0,fsc,sec=sys,rhel7-server: PASS
>   * vers=3,fsc,sec=sys,rhel7-server: PASS
>   * vers=4.1,nofsc,sec=sys,netapp-server(pnfs/files): PASS
>   * vers=4.1,fsc,sec=sys,netapp-server(pnfs/files): INCOMPLETE
>     * WARN_ON fs/netfs/read_helper.c:616
>     * ran with kernel.panic_on_oops=1
>   * vers=4.2,fsc,sec=sys,rhel7-server: running at generic/438
>   * vers=4.2,fsc,sec=sys,rhel8-server: running at generic/127
> 5. kernel build: PASS
>   * vers=4.2,fsc,sec=sys,rhel8-server: PASS
>
> Not tested yet:
> * error injections (for example, connection disruptions, server errors during IO, etc)
> * many process mixed read/write on same file
> * performance
>
> Dave Wysochanski (10):
>   NFS: Clean up nfs_readpage() and nfs_readpages()
>   NFS: In nfs_readpage() only increment NFSIOS_READPAGES when read
>     succeeds
>   NFS: Refactor nfs_readpage() and nfs_readpage_async() to use
>     nfs_readdesc
>   NFS: Call readpage_async_filler() from nfs_readpage_async()
>   NFS: Add nfs_pageio_complete_read() and remove nfs_readpage_async()
>   NFS: Allow internal use of read structs and functions
>   NFS: Convert to the netfs API and nfs_readpage to use netfs_readpage
>   NFS: Convert readpages to readahead and use netfs_readahead for
>     fscache
>   NFS: Update releasepage to handle new fscache kiocb IO API
>   NFS: update various invalidation code paths for new IO API
>
>  fs/nfs/file.c              |  22 +++--
>  fs/nfs/fscache.c           | 230 +++++++++++++++++++------------------------
>  fs/nfs/fscache.h           | 105 +++-----------------
>  fs/nfs/internal.h          |   8 ++
>  fs/nfs/pagelist.c          |   2 +
>  fs/nfs/read.c              | 240 ++++++++++++++++++++-------------------------
>  fs/nfs/write.c             |  10 +-
>  include/linux/nfs_fs.h     |   5 +-
>  include/linux/nfs_iostat.h |   2 +-
>  include/linux/nfs_page.h   |   1 +
>  include/linux/nfs_xdr.h    |   1 +
>  11 files changed, 257 insertions(+), 369 deletions(-)
>
> --
> 1.8.3.1
>
Schumaker, Anna Feb. 1, 2021, 2:30 p.m. UTC | #2
Hi David,

On Sun, Jan 31, 2021 at 9:20 PM David Wysochanski <dwysocha@redhat.com> wrote:
>
> On Thu, Jan 28, 2021 at 9:59 AM Dave Wysochanski <dwysocha@redhat.com> wrote:
> >
> > This minimal set of patches update the NFS client to use the new
> > readahead method, and convert the NFS fscache to use the new netfs
> > IO API, and are at:
> > https://github.com/DaveWysochanskiRH/kernel/releases/tag/fscache-iter-lib-nfs-20210128
> > https://github.com/DaveWysochanskiRH/kernel/commit/74357eb291c9c292f3ab3bc9ed1227cb76f52c51
> >
> > The patches are based on David Howells fscache-netfs-lib tree at
> > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-netfs-lib
> >
> > The first 6 patches refactor some of the NFS read code to facilitate
> > re-use, the next 4 patches do the conversion to the new API.  Note
> > patch 8 converts nfs_readpages to nfs_readahead.
> >
> > Changes since my last posting on Jan 27, 2021
> > - Fix oops with fscache enabled on parallel read unit test
> > - Add patches to handle invalidate and releasepage
> > - Use #define FSCACHE_USE_NEW_IO_API to select the new API
> > - Minor cleanup in nfs_readahead_from_fscache
> >
> > Still TODO
> > 1. Fix known bugs
> > a) nfs_issue_op: takes rcu_read_lock but may calls nfs_page_alloc()
> >    with GFP_KERNEL which may sleep (dhowells noted this in a review)
> > b) nfs_refresh_inode() takes inode->i_lock but may call
> >    __fscache_invalidate() which may sleep (found with lockdep)
> > c) WARN with xfstest fscache/netapp/pnfs/nfs41
>
> Turns out this is a bit more involved and I would not consider pNFS +
> fscache stable right now.
> For now I may have to disable fscache if pNFS is enabled unless I can
> quickly come up
> with a reasonable fix for the problem.

So my thought right now is to take the first 6 cleanup / preparation
patches for the 5.12 merge window and save the cutover for 5.13. This
would give you an extra release cycle to fix the pNFS stability, and
it would give more time to find and fix any issues in netfs before
switching NFS over to it.

Would that work?
Anna

>
> The problem is as follows. Once netfs calls us in "issue_op" for a
> given subrequest, it expects
> one call back when the subrequest completes.  Now the "clamp_length"
> function was developed
> so we tell the netfs caller how big of an IO we can handle.  However,
> right now it only implements
> an 'rsize' check, and it does not take into account pNFS
> characteristics such as segments
> which may split up the IO into multiple RPCs. Since each of the RPC
> have their own
> completion, and so far I've not come up with a way to just call back
> into netfs when the
> last one is done, I am not sure what the right approach is.  One
> obvious approach would be
> a more sophisticated "clamp_length" function which adds similar logic
> as to the *pg_test()
> functions.  But I don't want to duplicate that and so it's not really clear.
>
> > 2. Fixup NFS fscache stats (NFSIOS_FSCACHE_*)
> > * Compare with netfs stats and determine if still needed
> > 3. Cleanup dfprintks and/or convert to tracepoints
> > 4. Further tests (see "Not tested yet")
> >
> > Tests run
> > 1. Custom NFS+fscache unit tests for basic operation: PASS
> > * vers=3,4.0,4.1,4.2,sec=sys,server=localhost (same kernel)
> > 2. cthon04: PASS
> > * test options "-b -g -s -l", fsc,vers=3,4.0,4.1,4.2,sec=sys
> > * No failures, oopses or hangs
> > 3. iozone tests: PASS
> > * nofsc,fsc,vers=3,4.0,4.1,4.2,sec=sys,server=rhel7,rhel8
> > * No failures, oopses, or hangs
> > 4. xfstests/generic: PASS*
> > * no hangs or crashes (one WARN); failures unrelated to these patches
> > * Ran following configurations
> >   * vers=4.1,fsc,sec=sys,rhel7-server: PASS
> >   * vers=4.0,fsc,sec=sys,rhel7-server: PASS
> >   * vers=3,fsc,sec=sys,rhel7-server: PASS
> >   * vers=4.1,nofsc,sec=sys,netapp-server(pnfs/files): PASS
> >   * vers=4.1,fsc,sec=sys,netapp-server(pnfs/files): INCOMPLETE
> >     * WARN_ON fs/netfs/read_helper.c:616
> >     * ran with kernel.panic_on_oops=1
> >   * vers=4.2,fsc,sec=sys,rhel7-server: running at generic/438
> >   * vers=4.2,fsc,sec=sys,rhel8-server: running at generic/127
> > 5. kernel build: PASS
> >   * vers=4.2,fsc,sec=sys,rhel8-server: PASS
> >
> > Not tested yet:
> > * error injections (for example, connection disruptions, server errors during IO, etc)
> > * many process mixed read/write on same file
> > * performance
> >
> > Dave Wysochanski (10):
> >   NFS: Clean up nfs_readpage() and nfs_readpages()
> >   NFS: In nfs_readpage() only increment NFSIOS_READPAGES when read
> >     succeeds
> >   NFS: Refactor nfs_readpage() and nfs_readpage_async() to use
> >     nfs_readdesc
> >   NFS: Call readpage_async_filler() from nfs_readpage_async()
> >   NFS: Add nfs_pageio_complete_read() and remove nfs_readpage_async()
> >   NFS: Allow internal use of read structs and functions
> >   NFS: Convert to the netfs API and nfs_readpage to use netfs_readpage
> >   NFS: Convert readpages to readahead and use netfs_readahead for
> >     fscache
> >   NFS: Update releasepage to handle new fscache kiocb IO API
> >   NFS: update various invalidation code paths for new IO API
> >
> >  fs/nfs/file.c              |  22 +++--
> >  fs/nfs/fscache.c           | 230 +++++++++++++++++++------------------------
> >  fs/nfs/fscache.h           | 105 +++-----------------
> >  fs/nfs/internal.h          |   8 ++
> >  fs/nfs/pagelist.c          |   2 +
> >  fs/nfs/read.c              | 240 ++++++++++++++++++++-------------------------
> >  fs/nfs/write.c             |  10 +-
> >  include/linux/nfs_fs.h     |   5 +-
> >  include/linux/nfs_iostat.h |   2 +-
> >  include/linux/nfs_page.h   |   1 +
> >  include/linux/nfs_xdr.h    |   1 +
> >  11 files changed, 257 insertions(+), 369 deletions(-)
> >
> > --
> > 1.8.3.1
> >
>
David Wysochanski Feb. 2, 2021, 12:19 p.m. UTC | #3
On Mon, Feb 1, 2021 at 9:30 AM Anna Schumaker <anna.schumaker@netapp.com> wrote:
>
> Hi David,
>
> On Sun, Jan 31, 2021 at 9:20 PM David Wysochanski <dwysocha@redhat.com> wrote:
> >
> > On Thu, Jan 28, 2021 at 9:59 AM Dave Wysochanski <dwysocha@redhat.com> wrote:
> > >
> > > This minimal set of patches update the NFS client to use the new
> > > readahead method, and convert the NFS fscache to use the new netfs
> > > IO API, and are at:
> > > https://github.com/DaveWysochanskiRH/kernel/releases/tag/fscache-iter-lib-nfs-20210128
> > > https://github.com/DaveWysochanskiRH/kernel/commit/74357eb291c9c292f3ab3bc9ed1227cb76f52c51
> > >
> > > The patches are based on David Howells fscache-netfs-lib tree at
> > > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-netfs-lib
> > >
> > > The first 6 patches refactor some of the NFS read code to facilitate
> > > re-use, the next 4 patches do the conversion to the new API.  Note
> > > patch 8 converts nfs_readpages to nfs_readahead.
> > >
> > > Changes since my last posting on Jan 27, 2021
> > > - Fix oops with fscache enabled on parallel read unit test
> > > - Add patches to handle invalidate and releasepage
> > > - Use #define FSCACHE_USE_NEW_IO_API to select the new API
> > > - Minor cleanup in nfs_readahead_from_fscache
> > >
> > > Still TODO
> > > 1. Fix known bugs
> > > a) nfs_issue_op: takes rcu_read_lock but may calls nfs_page_alloc()
> > >    with GFP_KERNEL which may sleep (dhowells noted this in a review)
> > > b) nfs_refresh_inode() takes inode->i_lock but may call
> > >    __fscache_invalidate() which may sleep (found with lockdep)
> > > c) WARN with xfstest fscache/netapp/pnfs/nfs41
> >
> > Turns out this is a bit more involved and I would not consider pNFS +
> > fscache stable right now.
> > For now I may have to disable fscache if pNFS is enabled unless I can
> > quickly come up
> > with a reasonable fix for the problem.
>
> So my thought right now is to take the first 6 cleanup / preparation
> patches for the 5.12 merge window and save the cutover for 5.13. This
> would give you an extra release cycle to fix the pNFS stability, and
> it would give more time to find and fix any issues in netfs before
> switching NFS over to it.
>
> Would that work?
> Anna
>

Yes that's fine.


> >
> > The problem is as follows. Once netfs calls us in "issue_op" for a
> > given subrequest, it expects
> > one call back when the subrequest completes.  Now the "clamp_length"
> > function was developed
> > so we tell the netfs caller how big of an IO we can handle.  However,
> > right now it only implements
> > an 'rsize' check, and it does not take into account pNFS
> > characteristics such as segments
> > which may split up the IO into multiple RPCs. Since each of the RPC
> > have their own
> > completion, and so far I've not come up with a way to just call back
> > into netfs when the
> > last one is done, I am not sure what the right approach is.  One
> > obvious approach would be
> > a more sophisticated "clamp_length" function which adds similar logic
> > as to the *pg_test()
> > functions.  But I don't want to duplicate that and so it's not really clear.
> >
> > > 2. Fixup NFS fscache stats (NFSIOS_FSCACHE_*)
> > > * Compare with netfs stats and determine if still needed
> > > 3. Cleanup dfprintks and/or convert to tracepoints
> > > 4. Further tests (see "Not tested yet")
> > >
> > > Tests run
> > > 1. Custom NFS+fscache unit tests for basic operation: PASS
> > > * vers=3,4.0,4.1,4.2,sec=sys,server=localhost (same kernel)
> > > 2. cthon04: PASS
> > > * test options "-b -g -s -l", fsc,vers=3,4.0,4.1,4.2,sec=sys
> > > * No failures, oopses or hangs
> > > 3. iozone tests: PASS
> > > * nofsc,fsc,vers=3,4.0,4.1,4.2,sec=sys,server=rhel7,rhel8
> > > * No failures, oopses, or hangs
> > > 4. xfstests/generic: PASS*
> > > * no hangs or crashes (one WARN); failures unrelated to these patches
> > > * Ran following configurations
> > >   * vers=4.1,fsc,sec=sys,rhel7-server: PASS
> > >   * vers=4.0,fsc,sec=sys,rhel7-server: PASS
> > >   * vers=3,fsc,sec=sys,rhel7-server: PASS
> > >   * vers=4.1,nofsc,sec=sys,netapp-server(pnfs/files): PASS
> > >   * vers=4.1,fsc,sec=sys,netapp-server(pnfs/files): INCOMPLETE
> > >     * WARN_ON fs/netfs/read_helper.c:616
> > >     * ran with kernel.panic_on_oops=1
> > >   * vers=4.2,fsc,sec=sys,rhel7-server: running at generic/438
> > >   * vers=4.2,fsc,sec=sys,rhel8-server: running at generic/127
> > > 5. kernel build: PASS
> > >   * vers=4.2,fsc,sec=sys,rhel8-server: PASS
> > >
> > > Not tested yet:
> > > * error injections (for example, connection disruptions, server errors during IO, etc)
> > > * many process mixed read/write on same file
> > > * performance
> > >
> > > Dave Wysochanski (10):
> > >   NFS: Clean up nfs_readpage() and nfs_readpages()
> > >   NFS: In nfs_readpage() only increment NFSIOS_READPAGES when read
> > >     succeeds
> > >   NFS: Refactor nfs_readpage() and nfs_readpage_async() to use
> > >     nfs_readdesc
> > >   NFS: Call readpage_async_filler() from nfs_readpage_async()
> > >   NFS: Add nfs_pageio_complete_read() and remove nfs_readpage_async()
> > >   NFS: Allow internal use of read structs and functions
> > >   NFS: Convert to the netfs API and nfs_readpage to use netfs_readpage
> > >   NFS: Convert readpages to readahead and use netfs_readahead for
> > >     fscache
> > >   NFS: Update releasepage to handle new fscache kiocb IO API
> > >   NFS: update various invalidation code paths for new IO API
> > >
> > >  fs/nfs/file.c              |  22 +++--
> > >  fs/nfs/fscache.c           | 230 +++++++++++++++++++------------------------
> > >  fs/nfs/fscache.h           | 105 +++-----------------
> > >  fs/nfs/internal.h          |   8 ++
> > >  fs/nfs/pagelist.c          |   2 +
> > >  fs/nfs/read.c              | 240 ++++++++++++++++++++-------------------------
> > >  fs/nfs/write.c             |  10 +-
> > >  include/linux/nfs_fs.h     |   5 +-
> > >  include/linux/nfs_iostat.h |   2 +-
> > >  include/linux/nfs_page.h   |   1 +
> > >  include/linux/nfs_xdr.h    |   1 +
> > >  11 files changed, 257 insertions(+), 369 deletions(-)
> > >
> > > --
> > > 1.8.3.1
> > >
> >
>