mbox series

[v6,0/5] NFSD: add support for NFSv4.1+ write delegation

Message ID 1688006176-32597-1-git-send-email-dai.ngo@oracle.com (mailing list archive)
Headers show
Series NFSD: add support for NFSv4.1+ write delegation | expand

Message

Dai Ngo June 29, 2023, 2:36 a.m. UTC
The NFSv4 server currently supports read delegation using VFS lease
which is implemented using file_lock. 

This patch series add write delegation support for NFSv4.1+ client by:

    . remove the check for F_WRLCK in generic_add_lease to allow
      file_lock to be used for write delegation.  

    . grant write delegation for OPEN with NFS4_SHARE_ACCESS_WRITE
      if there is no conflict with other OPENs.

Write delegation conflict with another OPEN, REMOVE, RENAME and SETATTR
are handled the same as read delegation using notify_change, try_break_deleg.

The write delegation support is for NFSv4.1+ client only since the NFSv4.0
Linux client behavior is not compliant with RFC 7530 Section 16.7.5. It
expects the server to look ahead in the compound to find a stateid in order
to determine whether the client that sends the GETATTR is the same client
that holds the write delegation. RFC 7530 spec does not call for the server
to look ahead in order to service the GETATTR op.

Changes since v1:

[PATCH 3/4] NFSD: add supports for CB_GETATTR callback
- remove WARN_ON_ONCE from encode_bitmap4
- replace decode_bitmap4 with xdr_stream_decode_uint32_array
- replace xdr_inline_decode and xdr_decode_hyper in decode_cb_getattr
   with xdr_stream_decode_u64. Also remove the un-needed likely().
- modify signature of encode_cb_getattr4args to take pointer to
   nfs4_cb_fattr
- replace decode_attr_length with xdr_stream_decode_u32
- rename decode_cb_getattr to decode_cb_fattr4
- fold the initialization of cb_cinfo and cb_fsize into decode_cb_fattr4
- rename ncf_cb_cinfo to ncf_cb_change to avoid confusion of cindo usage
  in fs/nfsd/nfs4xdr.c
- correct NFS4_dec_cb_getattr_sz and update size description

[PATCH 4/4] NFSD: handle GETATTR conflict with write delegation
- change nfs4_handle_wrdeleg_conflict returns __be32 to fix test robot
- change ncf_cb_cinfo to ncf_cb_change to avoid confusion of cindo usage
  in fs/nfsd/nfs4xdr.c

Changes since v2:

[PATCH 2/4] NFSD: enable support for write delegation
- rename 'deleg' to 'dl_type' in nfs4_set_delegation
- remove 'wdeleg' in nfs4_open_delegation

- drop [PATCH 3/4] NFSD: add supports for CB_GETATTR callback
  and [PATCH 4/4] NFSD: handle GETATTR conflict with write delegation
  for futher clarification of the benefits of these patches

Changes since v3:

- recall write delegation when there is GETATTR from 2nd client
- add trace point to track when write delegation is granted 

Changes since v4:
- squash 4/4 into 2/4
- apply 1/4 last instead of first
- combine nfs4_wrdeleg_filelock and nfs4_handle_wrdeleg_conflict to
  nfsd4_deleg_getattr_conflict and move it to fs/nfsd/nfs4state.c
- check for lock belongs to delegation before proceed and do it
  under the fl_lock
- check and skip FL_LAYOUT file_locks

Changes since v5:
- [patch 2/5] disable write delegation for NFSv4.0 client

- [patch 4/5] allow client to use write delegation stateid for READ (same
  behavior as Solaris server)

  When the server receives a READ request with write delegation stateid
  the server may returns the NFS4ERR_OPENMODE or allows the READ to proceed
  to accommodate clients whose WRITE implementation may unavoidably do reads
  (e.g., due to buffer cache constraints).  Per RFC 8881 section 9.1.2. Use
  of the Stateid and Locking

  Returning NFS4ERR_OPENMODE causes the client and server to enter an infinite
  loop of READ, NFS4ERR_OPENMODE, TEST_STATEID, READs, NFS4ERR_OPENMODEs,
  TEST_STATEID, READs, NFS4ERR_OPENMODEs. The Linux NFS client can not recover
  from NFS4ERR_OPENMODE for READ request if the file was opened with
  OPEN4_SHARE_ACCESS_WRITE. This READ was initiated internally from the NFS
  client and not from the read(2) system call.

- pass git regression test with 40 threads

Comments

Jeff Layton June 29, 2023, 2:51 p.m. UTC | #1
On Wed, 2023-06-28 at 19:36 -0700, Dai Ngo wrote:
> The NFSv4 server currently supports read delegation using VFS lease
> which is implemented using file_lock. 
> 
> This patch series add write delegation support for NFSv4.1+ client by:
> 
>     . remove the check for F_WRLCK in generic_add_lease to allow
>       file_lock to be used for write delegation.  
> 
>     . grant write delegation for OPEN with NFS4_SHARE_ACCESS_WRITE
>       if there is no conflict with other OPENs.
> 
> Write delegation conflict with another OPEN, REMOVE, RENAME and SETATTR
> are handled the same as read delegation using notify_change, try_break_deleg.
> 
> The write delegation support is for NFSv4.1+ client only since the NFSv4.0
> Linux client behavior is not compliant with RFC 7530 Section 16.7.5. It
> expects the server to look ahead in the compound to find a stateid in order
> to determine whether the client that sends the GETATTR is the same client
> that holds the write delegation. RFC 7530 spec does not call for the server
> to look ahead in order to service the GETATTR op.
> 
> Changes since v1:
> 
> [PATCH 3/4] NFSD: add supports for CB_GETATTR callback
> - remove WARN_ON_ONCE from encode_bitmap4
> - replace decode_bitmap4 with xdr_stream_decode_uint32_array
> - replace xdr_inline_decode and xdr_decode_hyper in decode_cb_getattr
>    with xdr_stream_decode_u64. Also remove the un-needed likely().
> - modify signature of encode_cb_getattr4args to take pointer to
>    nfs4_cb_fattr
> - replace decode_attr_length with xdr_stream_decode_u32
> - rename decode_cb_getattr to decode_cb_fattr4
> - fold the initialization of cb_cinfo and cb_fsize into decode_cb_fattr4
> - rename ncf_cb_cinfo to ncf_cb_change to avoid confusion of cindo usage
>   in fs/nfsd/nfs4xdr.c
> - correct NFS4_dec_cb_getattr_sz and update size description
> 
> [PATCH 4/4] NFSD: handle GETATTR conflict with write delegation
> - change nfs4_handle_wrdeleg_conflict returns __be32 to fix test robot
> - change ncf_cb_cinfo to ncf_cb_change to avoid confusion of cindo usage
>   in fs/nfsd/nfs4xdr.c
> 
> Changes since v2:
> 
> [PATCH 2/4] NFSD: enable support for write delegation
> - rename 'deleg' to 'dl_type' in nfs4_set_delegation
> - remove 'wdeleg' in nfs4_open_delegation
> 
> - drop [PATCH 3/4] NFSD: add supports for CB_GETATTR callback
>   and [PATCH 4/4] NFSD: handle GETATTR conflict with write delegation
>   for futher clarification of the benefits of these patches
> 
> Changes since v3:
> 
> - recall write delegation when there is GETATTR from 2nd client
> - add trace point to track when write delegation is granted 
> 
> Changes since v4:
> - squash 4/4 into 2/4
> - apply 1/4 last instead of first
> - combine nfs4_wrdeleg_filelock and nfs4_handle_wrdeleg_conflict to
>   nfsd4_deleg_getattr_conflict and move it to fs/nfsd/nfs4state.c
> - check for lock belongs to delegation before proceed and do it
>   under the fl_lock
> - check and skip FL_LAYOUT file_locks
> 
> Changes since v5:
> - [patch 2/5] disable write delegation for NFSv4.0 client
> 
> - [patch 4/5] allow client to use write delegation stateid for READ (same
>   behavior as Solaris server)
> 
>   When the server receives a READ request with write delegation stateid
>   the server may returns the NFS4ERR_OPENMODE or allows the READ to proceed
>   to accommodate clients whose WRITE implementation may unavoidably do reads
>   (e.g., due to buffer cache constraints).  Per RFC 8881 section 9.1.2. Use
>   of the Stateid and Locking
> 
>   Returning NFS4ERR_OPENMODE causes the client and server to enter an infinite
>   loop of READ, NFS4ERR_OPENMODE, TEST_STATEID, READs, NFS4ERR_OPENMODEs,
>   TEST_STATEID, READs, NFS4ERR_OPENMODEs. The Linux NFS client can not recover
>   from NFS4ERR_OPENMODE for READ request if the file was opened with
>   OPEN4_SHARE_ACCESS_WRITE. This READ was initiated internally from the NFS
>   client and not from the read(2) system call.
> 
> - pass git regression test with 40 threads
> 

Nice work, Dai! This all looks good to me. You can add:

Reviewed-by: Jeff Layton <jlayton@kernel.org>