Message ID | CAN-5tyEEVqnQgNJq_pbaujsVBeVQ0y0AwNgP=cJxA9aj2=o_9w@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, 2018-05-22 at 15:28 -0400, Olga Kornievskaia wrote: > On Tue, May 22, 2018 at 3:03 PM, Trond Myklebust > <trondmy@hammerspace.com> wrote: > > On Tue, 2018-05-22 at 14:40 -0400, Olga Kornievskaia wrote: > > > From: Olga Kornievskaia <olga.kornievskaia@gmail.com> > > > > > > For pNFS, the operations to DS currently timeout in 10s. > > > According > > > to the spec, the client must not be re-trying an NFSv4.1 > > > operation > > > unless the connection was broken. > > > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com> > > > --- > > > net/sunrpc/clnt.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c > > > index 6e432ec..97517eb 100644 > > > --- a/net/sunrpc/clnt.c > > > +++ b/net/sunrpc/clnt.c > > > @@ -668,6 +668,7 @@ struct rpc_clnt * > > > .prognumber = clnt->cl_prog, > > > .version = clnt->cl_vers, > > > .authflavor = flavor, > > > + .timeout = clnt->cl_timeout, > > > }; > > > return __rpc_clone_client(&args, clnt); > > > } > > > > What does this patch have to do with pNFS? That's the generic RPC > > client cloning API you are changing. > > > > The pNFS/files timeouts are intended to be set using the > > dataserver_retrans and dataserver_timeo module parameters described > > at > > the bottom of fs/nfs/filelayout/filelayoutdev.c > > Ok so perhaps the code needs to re-written so that it allows for the > DS to get an rpc client with its timeouts set. Which currently > doesn't > happen. > > From what I could tell the DS code tries to set the timeout values in > nfs4_set_ds_client() but that has no effect. > > nfs4_find_or_create_ds_client() calls rpc_clone_client_set_auth() > which creates an rpc client but the timeout that were set are ignored > and instead the rpc client is getting created with this 10s timeout. > > (but I thought that in general it made sense that a clone also copies > the timeout values) > It does not make sense when you consider that the timeout is a per- transport attribute. FWIW, I've no idea where this 10s timeout you are seeing is coming from. Perhaps it is worthwhile figuring that out first? -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com
On Tue, May 22, 2018 at 3:56 PM, Trond Myklebust <trondmy@hammerspace.com> wrote: > On Tue, 2018-05-22 at 15:28 -0400, Olga Kornievskaia wrote: >> On Tue, May 22, 2018 at 3:03 PM, Trond Myklebust >> <trondmy@hammerspace.com> wrote: >> > On Tue, 2018-05-22 at 14:40 -0400, Olga Kornievskaia wrote: >> > > From: Olga Kornievskaia <olga.kornievskaia@gmail.com> >> > > >> > > For pNFS, the operations to DS currently timeout in 10s. >> > > According >> > > to the spec, the client must not be re-trying an NFSv4.1 >> > > operation >> > > unless the connection was broken. >> > > >> > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com> >> > > --- >> > > net/sunrpc/clnt.c | 1 + >> > > 1 file changed, 1 insertion(+) >> > > >> > > diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c >> > > index 6e432ec..97517eb 100644 >> > > --- a/net/sunrpc/clnt.c >> > > +++ b/net/sunrpc/clnt.c >> > > @@ -668,6 +668,7 @@ struct rpc_clnt * >> > > .prognumber = clnt->cl_prog, >> > > .version = clnt->cl_vers, >> > > .authflavor = flavor, >> > > + .timeout = clnt->cl_timeout, >> > > }; >> > > return __rpc_clone_client(&args, clnt); >> > > } >> > >> > What does this patch have to do with pNFS? That's the generic RPC >> > client cloning API you are changing. >> > >> > The pNFS/files timeouts are intended to be set using the >> > dataserver_retrans and dataserver_timeo module parameters described >> > at >> > the bottom of fs/nfs/filelayout/filelayoutdev.c >> >> Ok so perhaps the code needs to re-written so that it allows for the >> DS to get an rpc client with its timeouts set. Which currently >> doesn't >> happen. >> >> From what I could tell the DS code tries to set the timeout values in >> nfs4_set_ds_client() but that has no effect. >> >> nfs4_find_or_create_ds_client() calls rpc_clone_client_set_auth() >> which creates an rpc client but the timeout that were set are ignored >> and instead the rpc client is getting created with this 10s timeout. >> >> (but I thought that in general it made sense that a clone also copies >> the timeout values) >> > > It does not make sense when you consider that the timeout is a per- > transport attribute. > > FWIW, I've no idea where this 10s timeout you are seeing is coming > from. Perhaps it is worthwhile figuring that out first? Besides the value of the 10s (which I also have been having a really hard time figuring out) it's also the max timeout and the fact that, after the 10s are up it's giving up and failing the operation which is then re-tried against the MDS. This shouldn't happen. So I felt like even if that value was 60s, it shouldn't have timed out after 60s and re-tried (without the fix that I'm proposing). I'll give it a bit more to figure out where 10s is coming from. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c index 04612c2..2c81eef8 100644 --- a/fs/nfs/nfs4client.c +++ b/fs/nfs/nfs4client.c @@ -99,7 +99,7 @@ struct nfs4_ds_server { if (dss == NULL) return ERR_PTR(-ENOMEM); - dss->rpc_clnt = rpc_clone_client_set_auth(ds_clp->cl_rpcclient, flavor); + dss->rpc_clnt = rpc_clone_client_set_auth_and_timeout(ds_clp->cl_rpcclient, flavor); if (IS_ERR(dss->rpc_clnt)) { int err = PTR_ERR(dss->rpc_clnt); kfree (dss); diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index ed761f7..1c2b317 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -150,6 +150,8 @@ struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *, struct rpc_clnt *rpc_clone_client(struct rpc_clnt *); struct rpc_clnt *rpc_clone_client_set_auth(struct rpc_clnt *, rpc_authflavor_t); +struct rpc_clnt *rpc_clone_client_set_auth_and_timeout(struct rpc_clnt *, + rpc_authflavor_t); int rpc_switch_client_transport(struct rpc_clnt *, struct xprt_create *, const struct rpc_timeout *); diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 6e432ec..50a4a44 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -674,6 +674,29 @@ struct rpc_clnt * EXPORT_SYMBOL_GPL(rpc_clone_client_set_auth); /** + * rpc_clone_client_set_auth_and_timeout - Clone an RPC client structure + * and set its auth and timeouts + * + * @clnt: RPC client whose parameters are copied + * @flavor: security flavor for new client + * + * Returns a fresh RPC client or an ERR_PTR. + */ +struct rpc_clnt * +rpc_clone_client_set_auth_and_timeout(struct rpc_clnt *clnt, rpc_authflavor_t flavor) +{ + struct rpc_create_args args = { + .program = clnt->cl_program, + .prognumber = clnt->cl_prog, + .version = clnt->cl_vers, + .authflavor = flavor, + .timeout = clnt->cl_timeout, + }; + return __rpc_clone_client(&args, clnt); +} +EXPORT_SYMBOL_GPL(rpc_clone_client_set_auth_and_timeout); + +/** * rpc_switch_client_transport: switch the RPC transport on the fly * @clnt: pointer to a struct rpc_clnt