Message ID | 20240425104938.3363417-1-dan.aloni@vastdata.com (mailing list archive) |
---|---|
State | New |
Series | sunrpc: fix NFSACL RPC retry on soft mount |
On 25 Apr 2024, at 6:49, Dan Aloni wrote:

> It used to be quite awhile ago since 1b63a75180c6 ('SUNRPC: Refactor
> rpc_clone_client()'), in 2012, that `cl_timeout` was copied in so that
> all mount parameters propagate to NFSACL clients. However since that
> change, if mount options as follows are given:
>
>     soft,timeo=50,retrans=16,vers=3
>
> The resultant NFSACL client receives:
>
>     cl_softrtry: 1
>     cl_timeout: to_initval=60000, to_maxval=60000, to_increment=0, to_retries=2, to_exponential=0
>
> These values lead to NFSACL operations not being retried under the
> condition of transient network outages with soft mount. Instead, getacl
> call fails after 60 seconds with EIO.
>
> The simple fix is to pass the existing client's `cl_timeout` as the new
> client timeout.
>
> Cc: Chuck Lever <chuck.lever@oracle.com>
> Cc: Benjamin Coddington <bcodding@redhat.com>
> Link: https://lore.kernel.org/all/20231105154857.ryakhmgaptq3hb6b@gmail.com/T/
> Fixes: 1b63a75180c6 ('SUNRPC: Refactor rpc_clone_client()')
> Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>

This also affects the local rpcbind, and makes the change in
6b996476f364 sunrpc: honor rpc_task's timeout value in rpcb_create()
redundant.

Just an observation, thanks for fixing this!

Reviewed-by: Benjamin Coddington <bcodding@redhat.com>

Ben

> ---
>  net/sunrpc/clnt.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index cda0935a68c9..07ffd4ee695a 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -1068,6 +1068,7 @@ struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *old,
>  		.version	= vers,
>  		.authflavor	= old->cl_auth->au_flavor,
>  		.cred		= old->cl_cred,
> +		.timeout	= old->cl_timeout,
>  	};
>  	struct rpc_clnt *clnt;
>  	int err;
> --
> 2.39.3
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index cda0935a68c9..07ffd4ee695a 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1068,6 +1068,7 @@ struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *old,
 		.version	= vers,
 		.authflavor	= old->cl_auth->au_flavor,
 		.cred		= old->cl_cred,
+		.timeout	= old->cl_timeout,
 	};
 	struct rpc_clnt *clnt;
 	int err;
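The one-line change above matters because a designated initializer leaves every unnamed field zeroed, so without `.timeout` the new client is created with a NULL timeout pointer. The following is a minimal userspace sketch of that effect, not the kernel code: `struct create_args`, `xprt_default`, and `pick_timeout()` are hypothetical stand-ins, and the rule "an explicit timeout wins, otherwise a default is used" is an assumption about how the client is set up. The default values are the ones the commit message reports for the NFSACL client, assumed here to come from the transport.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's timeout parameters (values in ms here). */
struct rpc_timeout {
	unsigned long to_initval;     /* first timeout */
	unsigned long to_maxval;      /* upper bound on the timeout */
	unsigned long to_increment;   /* linear backoff step */
	unsigned int  to_retries;     /* retransmissions before a major timeout */
	unsigned char to_exponential; /* nonzero selects exponential backoff */
};

/* Hypothetical model of the creation arguments; only the field of interest. */
struct create_args {
	const struct rpc_timeout *timeout; /* NULL when the caller omits it */
};

/* Values reported in the commit message, assumed to be the fallback default. */
static const struct rpc_timeout xprt_default = {
	.to_initval = 60000,
	.to_maxval  = 60000,
	.to_retries = 2,
};

/* Assumed selection rule: use the caller's timeout if given, else the default. */
static const struct rpc_timeout *pick_timeout(const struct create_args *args)
{
	return args->timeout ? args->timeout : &xprt_default;
}
```

In this model, arguments built without `.timeout` resolve to the 60000 ms default, while arguments that carry the old client's timeout resolve to that timeout, which is the behaviour the patch restores.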
It used to be quite awhile ago since 1b63a75180c6 ('SUNRPC: Refactor
rpc_clone_client()'), in 2012, that `cl_timeout` was copied in so that
all mount parameters propagate to NFSACL clients. However since that
change, if mount options as follows are given:

    soft,timeo=50,retrans=16,vers=3

The resultant NFSACL client receives:

    cl_softrtry: 1
    cl_timeout: to_initval=60000, to_maxval=60000, to_increment=0, to_retries=2, to_exponential=0

These values lead to NFSACL operations not being retried under the
condition of transient network outages with soft mount. Instead, getacl
call fails after 60 seconds with EIO.

The simple fix is to pass the existing client's `cl_timeout` as the new
client timeout.

Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/all/20231105154857.ryakhmgaptq3hb6b@gmail.com/T/
Fixes: 1b63a75180c6 ('SUNRPC: Refactor rpc_clone_client()')
Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>

---
 net/sunrpc/clnt.c | 1 +
 1 file changed, 1 insertion(+)
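To make the "fails after 60 seconds" figure concrete, here is a rough userspace model of how the reported `cl_timeout` fields could combine into the total time before a soft RPC gives up. It is a sketch under stated assumptions, not the kernel implementation: `major_timeout_ms()` is a hypothetical helper, and `mount_assumed` is an estimate of what `timeo=50,retrans=16` would produce, assuming `timeo` is in tenths of a second and the timeout grows linearly by `timeo` per retransmission over TCP.

```c
/* Userspace stand-in for the kernel's timeout parameters (values in ms here). */
struct rpc_timeout {
	unsigned long to_initval;
	unsigned long to_maxval;
	unsigned long to_increment;
	unsigned int  to_retries;
	unsigned char to_exponential;
};

/* Values the commit message reports for the NFSACL client. */
static const struct rpc_timeout nfsacl_reported = {
	.to_initval = 60000, .to_maxval = 60000,
	.to_increment = 0, .to_retries = 2, .to_exponential = 0,
};

/* ASSUMPTION: what soft,timeo=50,retrans=16 might translate to over TCP. */
static const struct rpc_timeout mount_assumed = {
	.to_initval = 5000, .to_maxval = 85000,
	.to_increment = 5000, .to_retries = 16, .to_exponential = 0,
};

/*
 * Hypothetical helper: total time until the request is treated as a major
 * timeout. Exponential mode doubles the initial value once per retry;
 * linear mode adds one increment per retry. The result is capped at
 * to_maxval (and falls back to to_maxval if the computation yields 0).
 */
static unsigned long major_timeout_ms(const struct rpc_timeout *to)
{
	unsigned long t = to->to_initval;

	if (to->to_exponential)
		t <<= to->to_retries;
	else
		t += to->to_increment * to->to_retries;

	if (t > to->to_maxval || t == 0)
		t = to->to_maxval;
	return t;
}
```

With the reported values the model yields 60000 ms, which agrees with the 60 seconds mentioned in the commit message. With the assumed mount-derived values it yields 85000 ms, spread across retransmissions rather than a single 60-second wait.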