Message ID | 20210608184527.87018-1-olga.kornievskaia@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/1] NFSv4.1+ add trunking when server trunking detected |
> On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
>
> From: Olga Kornievskaia <kolga@netapp.com>
>
> After trunking is discovered in nfs4_discover_server_trunking(),
> add the transport to the old client structure before destroying
> the new client structure (along with its transport).
>
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
> fs/nfs/nfs4client.c | 40 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 40 insertions(+)
>
> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> index 42719384e25f..984c851844d8 100644
> --- a/fs/nfs/nfs4client.c
> +++ b/fs/nfs/nfs4client.c
> @@ -361,6 +361,44 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> 	return nfs4_init_callback(clp);
> }
>
> +static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old)
> +{
> +	struct sockaddr_storage clp_addr, old_addr;
> +	struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
> +	struct sockaddr *old_sap = (struct sockaddr *)&old_addr;
> +	size_t clp_salen, old_salen;
> +	struct xprt_create xprt_args = {
> +		.ident = old->cl_proto,
> +		.net = old->cl_net,
> +		.servername = old->cl_hostname,
> +	};
> +	struct nfs4_add_xprt_data xprtdata = {
> +		.clp = old,
> +	};
> +	struct rpc_add_xprt_test rpcdata = {
> +		.add_xprt_test = old->cl_mvops->session_trunk,
> +		.data = &xprtdata,
> +	};
> +
> +	if (clp->cl_proto != old->cl_proto)
> +		return;
> +	clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
> +	old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
> +
> +	if (clp_addr.ss_family != old_addr.ss_family)
> +		return;
> +
> +	xprt_args.dstaddr = clp_sap;
> +	xprt_args.addrlen = clp_salen;
> +
> +	xprtdata.cred = nfs4_get_clid_cred(old);
> +	rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
> +			  rpc_clnt_setup_test_and_add_xprt, &rpcdata);

Is there an upper bound on the number of transports that
are added to the NFS client's switch?

> +
> +	if (xprtdata.cred)
> +		put_cred(xprtdata.cred);
> +}
> +
> /**
>  * nfs4_init_client - Initialise an NFS4 client record
>  *
> @@ -434,6 +472,8 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>  		 * won't try to use it.
>  		 */
>  		nfs_mark_client_ready(clp, -EPERM);
> +		if (old->cl_mvops->session_trunk)
> +			nfs4_add_trunk(clp, old);
>  	}
>  	clear_bit(NFS_CS_TSM_POSSIBLE, &clp->cl_flags);
>  	nfs_put_client(clp);
> --
> 2.27.0

--
Chuck Lever
On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
> > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
> >
> > [...]
>
> Is there an upper bound on the number of transports that
> are added to the NFS client's switch?

I don't believe any limits exist right now. Why should there be a
limit? Are you saying that the client should limit trunking? While
this is not what's happening here, say FS_LOCATION returned 100
ips for the server. Are you saying the client should be limiting how
many trunkable connections it would be establishing and picking just a
few addresses to try? What's happening with this patch is that, say
there are 100 mounts to 100 ips (each representing the same server or
trunkable server(s)), without this patch a single connection is kept,
with this patch we'll have 100 connections.
> On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
>
> [...]
>
> I don't believe any limits exist right now. Why should there be a
> limit? Are you saying that the client should limit trunking? While
> this is not what's happening here, but say FS_LOCATION returned 100
> ips for the server. Are you saying the client should be limiting how
> many trunkable connections it would be establishing and picking just a
> few addresses to try? What's happening with this patch is that say
> there are 100 mounts to 100 ips (each representing the same server or
> trunkable server(s)), without this patch a single connection is kept,
> with this patch we'll have 100 connections.

The patch description needs to make this behavior more clear. It
needs to explain "why" -- the body of the patch already explains
"what". Can you include your last sentence in the description as
a use case?

As for the behavior... Seems to me that there are going to be only
infinitesimal gains after the first few connections, and after
that, it's going to be a lot for both sides to manage for no real
gain. I think you do want to cap the total number of connections
at a reasonable number, even in the FS_LOCATIONS case.

--
Chuck Lever
On Tue, 2021-06-08 at 21:06 +0000, Chuck Lever III wrote:
> [...]
>
> As for the behavior... Seems to me that there are going to be only
> infinitesimal gains after the first few connections, and after
> that, it's going to be a lot for both sides to manage for no real
> gain. I think you do want to cap the total number of connections
> at a reasonable number, even in the FS_LOCATIONS case.

I'd tend to agree. If you want to scale I/O then pNFS is the way to go,
not vast numbers of connections to a single server.
On Tue, 2021-06-08 at 21:19 +0000, Trond Myklebust wrote:
> [...]
>
> I'd tend to agree. If you want to scale I/O then pNFS is the way to go,
> not vast numbers of connections to a single server.

BTW: AFAICS this patch will end up adding another connection every time
we mount a new filesystem, whether or not a connection already exists
to that IP address. That's unacceptable.
On Tue, Jun 8, 2021 at 5:26 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> [...]
>
> BTW: AFAICS this patch will end up adding another connection every time
> we mount a new filesystem, whether or not a connection already exists
> to that IP address. That's unacceptable.

Not in my testing. Mounts to the same server (same IP) different
export volumes lead to use of a single connection.
On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> [...]
>
> The patch description needs to make this behavior more clear. It
> needs to explain "why" -- the body of the patch already explains
> "what". Can you include your last sentence in the description as
> a use case?

I sure can.

> As for the behavior... Seems to me that there are going to be only
> infinitesimal gains after the first few connections, and after
> that, it's going to be a lot for both sides to manage for no real
> gain. I think you do want to cap the total number of connections
> at a reasonable number, even in the FS_LOCATIONS case.

Do you want to cap it at 16 like we do for nconnect? I'm ok with that
for now. I don't understand why a setup with X NICs on each side
(client/server) and X mounts wouldn't get throughput close to X times
that of a single NIC.
On Tue, 2021-06-08 at 17:30 -0400, Olga Kornievskaia wrote:
> [...]
>
> Not in my testing. Mounts to the same server (same IP) different
> export volumes lead to use of a single connection.

Ah... Never mind. The comparison is made by
rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if it
is already present.

However, why are you running trunking detection a second time on the
address you've just run trunking detection on and have decided to add?
On Tue, 2021-06-08 at 17:40 -0400, Olga Kornievskaia wrote:
> On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> > [...]
> >
> > The patch description needs to make this behavior more clear. It
> > needs to explain "why" -- the body of the patch already explains
> > "what". Can you include your last sentence in the description as
> > a use case?
>
> I sure can.
>
> > As for the behavior... Seems to me that there are going to be only
> > infinitesimal gains after the first few connections, and after that,
> > it's going to be a lot for both sides to manage for no real gain. I
> > think you do want to cap the total number of connections at a
> > reasonable number, even in the FS_LOCATIONS case.
>
> Do you want to cap it at 16 like we do for nconnect? I'm ok with that
> for now.
>
> I don't understand why a setup where there X NICs on each side
> (client/server) and X mounts are done, there won't be throughput of
> close to X * throughput of a single NIC.

That still does not make the patch acceptable. There are already server
vendors seeing problems with supporting nconnect for various reasons.
There needs to be a way to ensure that we can keep the current client
behaviour without requiring changes to server DNS entries, etc.

> [...]
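The cap being discussed above would bound how many trunked transports the client stacks onto one rpc_clnt. A minimal userspace sketch of such a check, assuming a ceiling of 16 to mirror the existing nconnect limit (the constant `NFS_MAX_TRANSPORTS` and the helper name are invented for illustration; this is not kernel API):

```c
#include <stddef.h>

/*
 * Hypothetical sketch (not kernel code): a guard that the trunking
 * path could consult before adding another transport to the switch.
 * The ceiling of 16 matches the limit nconnect enforces today.
 */
#define NFS_MAX_TRANSPORTS 16

int may_add_trunked_xprt(size_t active_xprts)
{
	/* Refuse once the switch already holds the maximum. */
	return active_xprts < NFS_MAX_TRANSPORTS;
}
```

With this guard in place, the 100-mounts-to-100-IPs scenario described earlier would stop adding connections after the sixteenth, rather than creating one per mount.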
On Tue, Jun 8, 2021 at 6:13 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Tue, 2021-06-08 at 17:40 -0400, Olga Kornievskaia wrote:
> > [...]
> >
> > Do you want to cap it at 16 like we do for nconnect? I'm ok with that
> > for now.
> >
> > I don't understand why a setup where there X NICs on each side
> > (client/server) and X mounts are done, there won't be throughput of
> > close to X * throughput of a single NIC.
>
> That still does not make the patch acceptable. There are already server
> vendors seeing problems with supporting nconnect for various reasons.

Why are there servers presenting multiple IP addresses and returning
the same server identity if they can not support trunking? That seems
to be going against the spec?

> There needs to be a way to ensure that we can keep the current client
> behaviour without requiring changes to server DNS entries, etc.

Ok, how about we introduce a mount option, "enable_trunking", and if
that's present we would not collapse transports?

> [...]
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
On Tue, Jun 8, 2021 at 5:41 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Tue, 2021-06-08 at 17:30 -0400, Olga Kornievskaia wrote:
> > > [...]
> > >
> > > BTW: AFAICS this patch will end up adding another connection every
> > > time we mount a new filesystem, whether or not a connection already
> > > exists to that IP address. That's unacceptable.
> >
> > Not in my testing. Mounts to the same server (same IP) different
> > export volumes lead to use of a single connection.
>
> Ah... Never mind. The comparison is made by
> rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if it
> is already present.
>
> However why are you running trunking detection a second time on the
> address you've just run trunking detection on and have decided to add?

Because that's where we determined that these are different clients
and are trunkable? But I guess also for the ease of re-using existing
code. There is no hook to create an xprt and add it to an existing RPC
client. One can be created.

> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
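The "never mind" above refers to the duplicate-address test inside rpc_clnt_setup_test_and_add_xprt(): a candidate transport whose peer address is already present in the transport switch is discarded rather than added twice. A rough userspace approximation of that comparison is sketched below; `sockaddr_equal()`, `xprt_addr_in_use()` and `mk_ipv4()` are made-up names for illustration, and the kernel uses its own sunrpc address-comparison helpers rather than this code:

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Compare two peer addresses: same family and same host address.
 * (Ports are ignored here, as trunking cares about the endpoint host.) */
int sockaddr_equal(const struct sockaddr_storage *a,
		   const struct sockaddr_storage *b)
{
	if (a->ss_family != b->ss_family)
		return 0;
	if (a->ss_family == AF_INET) {
		const struct sockaddr_in *a4 = (const struct sockaddr_in *)a;
		const struct sockaddr_in *b4 = (const struct sockaddr_in *)b;
		return a4->sin_addr.s_addr == b4->sin_addr.s_addr;
	}
	if (a->ss_family == AF_INET6) {
		const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)a;
		const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *)b;
		return memcmp(&a6->sin6_addr, &b6->sin6_addr,
			      sizeof(a6->sin6_addr)) == 0;
	}
	return 0;
}

/* Discard a candidate whose address is already in the switch. */
int xprt_addr_in_use(const struct sockaddr_storage *xprts, size_t n,
		     const struct sockaddr_storage *candidate)
{
	for (size_t i = 0; i < n; i++)
		if (sockaddr_equal(&xprts[i], candidate))
			return 1;	/* already present: don't re-add */
	return 0;
}

/* Test helper: build an IPv4 sockaddr_storage from a host-order address. */
struct sockaddr_storage mk_ipv4(uint32_t addr_host_order)
{
	struct sockaddr_storage ss;
	struct sockaddr_in *sin = (struct sockaddr_in *)&ss;

	memset(&ss, 0, sizeof(ss));
	sin->sin_family = AF_INET;
	sin->sin_addr.s_addr = htonl(addr_host_order);
	return ss;
}
```

This is why repeated mounts to the same IP collapse onto one connection even with the patch applied: the second add is rejected by the address comparison, not by the trunking test itself.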
On Tue, 2021-06-08 at 18:18 -0400, Olga Kornievskaia wrote:
> On Tue, Jun 8, 2021 at 6:13 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> > [...]
> >
> > That still does not make the patch acceptable. There are already
> > server vendors seeing problems with supporting nconnect for various
> > reasons.
>
> Why are there servers presenting multiple IP addresses and returning
> the same server identity if they can not support trunking? That seems
> to be going against the spec?

That's not a question I can answer, but I'm not thinking of our server
or the Linux server.

> > There needs to be a way to ensure that we can keep the current
> > client behaviour without requiring changes to server DNS entries,
> > etc.
>
> Ok, how about we introduce a mount option, "enable_trunking", and if
> that's present we would not collapse transports?

I'd suggest making it 'max_connect=X' so that we can actually set a
limit like we do for nconnect. That limit probably needs to be lower
bounded by the value of nconnect.

> [...]
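The suggested 'max_connect=X' semantics reduce to a simple clamp: the effective transport limit is max_connect, but never less than nconnect, so explicitly requested nconnect connections are not cut off. As a sketch (the function name is invented; this is the arithmetic, not the eventual kernel implementation):

```c
/*
 * Hypothetical sketch of the limit computation for a max_connect=X
 * mount option, lower bounded by nconnect as suggested above.
 */
unsigned int effective_xprt_limit(unsigned int nconnect,
				  unsigned int max_connect)
{
	if (max_connect < nconnect)
		return nconnect;	/* honour nconnect as the floor */
	return max_connect;
}
```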
On Tue, 2021-06-08 at 18:21 -0400, Olga Kornievskaia wrote:
> On Tue, Jun 8, 2021 at 5:41 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> > [...]
> >
> > Ah... Never mind. The comparison is made by
> > rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if
> > it is already present.
> >
> > However why are you running trunking detection a second time on the
> > address you've just run trunking detection on and have decided to
> > add?
>
> Because that's where we determined that these are different clients
> and are trunkable? But I guess also for the ease of re-using existing
> code. There is no hook to create an xprt and add it to an existing RPC
> client. One can be created.

Why not rpc_clnt_add_xprt()? That's what pNFS uses for this case.
On Tue, Jun 8, 2021 at 6:28 PM Trond Myklebust <trondmy@hammerspace.com> wrote: > > On Tue, 2021-06-08 at 18:18 -0400, Olga Kornievskaia wrote: > > On Tue, Jun 8, 2021 at 6:13 PM Trond Myklebust < > > trondmy@hammerspace.com> wrote: > > > > > > On Tue, 2021-06-08 at 17:40 -0400, Olga Kornievskaia wrote: > > > > On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III < > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia < > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III < > > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia < > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > From: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > > > > > > > After trunking is discovered in > > > > > > > > nfs4_discover_server_trunking(), > > > > > > > > add the transport to the old client structure before > > > > > > > > destroying > > > > > > > > the new client structure (along with its transport). 
> > > > > > > > > > > > > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > --- > > > > > > > > fs/nfs/nfs4client.c | 40 > > > > > > > > ++++++++++++++++++++++++++++++++++++++++ > > > > > > > > 1 file changed, 40 insertions(+) > > > > > > > > > > > > > > > > diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c > > > > > > > > index 42719384e25f..984c851844d8 100644 > > > > > > > > --- a/fs/nfs/nfs4client.c > > > > > > > > +++ b/fs/nfs/nfs4client.c > > > > > > > > @@ -361,6 +361,44 @@ static int > > > > > > > > nfs4_init_client_minor_version(struct nfs_client *clp) > > > > > > > > return nfs4_init_callback(clp); > > > > > > > > } > > > > > > > > > > > > > > > > +static void nfs4_add_trunk(struct nfs_client *clp, > > > > > > > > struct > > > > > > > > nfs_client *old) > > > > > > > > +{ > > > > > > > > + struct sockaddr_storage clp_addr, old_addr; > > > > > > > > + struct sockaddr *clp_sap = (struct sockaddr > > > > > > > > *)&clp_addr; > > > > > > > > + struct sockaddr *old_sap = (struct sockaddr > > > > > > > > *)&old_addr; > > > > > > > > + size_t clp_salen, old_salen; > > > > > > > > + struct xprt_create xprt_args = { > > > > > > > > + .ident = old->cl_proto, > > > > > > > > + .net = old->cl_net, > > > > > > > > + .servername = old->cl_hostname, > > > > > > > > + }; > > > > > > > > + struct nfs4_add_xprt_data xprtdata = { > > > > > > > > + .clp = old, > > > > > > > > + }; > > > > > > > > + struct rpc_add_xprt_test rpcdata = { > > > > > > > > + .add_xprt_test = old->cl_mvops- > > > > > > > > >session_trunk, > > > > > > > > + .data = &xprtdata, > > > > > > > > + }; > > > > > > > > + > > > > > > > > + if (clp->cl_proto != old->cl_proto) > > > > > > > > + return; > > > > > > > > + clp_salen = rpc_peeraddr(clp->cl_rpcclient, > > > > > > > > clp_sap, > > > > > > > > sizeof(clp_addr)); > > > > > > > > + old_salen = rpc_peeraddr(old->cl_rpcclient, > > > > > > > > old_sap, > > > > > > > > sizeof(old_addr)); > > > > > > > > + > > > > > > > > + if 
(clp_addr.ss_family != old_addr.ss_family) > > > > > > > > + return; > > > > > > > > + > > > > > > > > + xprt_args.dstaddr = clp_sap; > > > > > > > > + xprt_args.addrlen = clp_salen; > > > > > > > > + > > > > > > > > + xprtdata.cred = nfs4_get_clid_cred(old); > > > > > > > > + rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args, > > > > > > > > + rpc_clnt_setup_test_and_add_xprt, > > > > > > > > &rpcdata); > > > > > > > > > > > > > > Is there an upper bound on the number of transports that > > > > > > > are added to the NFS client's switch? > > > > > > > > > > > > I don't believe any limits exist right now. Why should there > > > > > > be a > > > > > > limit? Are you saying that the client should limit trunking? > > > > > > While > > > > > > this is not what's happening here, but say FS_LOCATION > > > > > > returned > > > > > > 100 > > > > > > ips for the server. Are you saying the client should be > > > > > > limiting > > > > > > how > > > > > > many trunkable connections it would be establishing and > > > > > > picking > > > > > > just a > > > > > > few addresses to try? What's happening with this patch is > > > > > > that > > > > > > say > > > > > > there are 100mounts to 100 ips (each representing the same > > > > > > server > > > > > > or > > > > > > trunkable server(s)), without this patch a single connection > > > > > > is > > > > > > kept, > > > > > > with this patch we'll have 100 connections. > > > > > > > > > > The patch description needs to make this behavior more clear. > > > > > It > > > > > needs to explain "why" -- the body of the patch already > > > > > explains > > > > > "what". Can you include your last sentence in the description > > > > > as > > > > > a use case? > > > > > > > > I sure can. > > > > > > > > > As for the behavior... 
Seems to me that there are going to be > > > > > only > > > > > infinitesimal gains after the first few connections, and after > > > > > that, it's going to be a lot for both sides to manage for no > > > > > real > > > > > gain. I think you do want to cap the total number of > > > > > connections > > > > > at a reasonable number, even in the FS_LOCATIONS case. > > > > > > > > Do you want to cap it at 16 like we do for nconnect? I'm ok with > > > > that > > > > for now. > > > > > > > > I don't understand why a setup where there X NICs on each side > > > > (client/server) and X mounts are done, there won't be throughput > > > > of > > > > close to X * throughput of a single NIC. > > > > > > That still does not make the patch acceptable. There are already > > > server > > > vendors seeing problems with supporting nconnect for various > > > reasons. > > > > Why are there servers presenting multiple IP addresses and returning > > the same server identity if they can not support trunking? That seems > > to be going against the spec? > > That's not a question I can answer, but I'm not thinking of our server > or the Linux server. > > > > > > There needs to be a way to ensure that we can keep the current > > > client > > > behaviour without requiring changes to server DNS entries, etc. > > > > Ok, how about we introduce a mount option, "enable_trunking", and if > > that's present we would not collapse transports? > > I'd suggest making it 'max_connect=X' so that we can actually set a > limit like we do for nconnect. That limit probably needs to be lower > bounded by the value of nconnect. Sure I can do that. But wouldn't that be confusing since we already have nconnect option? How about "max_trunking=X"? The lack of the setting or max_trunking=1, would result in all transports collapsed, other values would result in all additions of trunkable transports. 
> > > > > > > > > > > > > > > + > > > > > > > > + if (xprtdata.cred) > > > > > > > > + put_cred(xprtdata.cred); > > > > > > > > +} > > > > > > > > + > > > > > > > > /** > > > > > > > > * nfs4_init_client - Initialise an NFS4 client record > > > > > > > > * > > > > > > > > @@ -434,6 +472,8 @@ struct nfs_client > > > > > > > > *nfs4_init_client(struct nfs_client *clp, > > > > > > > > * won't try to use it. > > > > > > > > */ > > > > > > > > nfs_mark_client_ready(clp, -EPERM); > > > > > > > > + if (old->cl_mvops->session_trunk) > > > > > > > > + nfs4_add_trunk(clp, old); > > > > > > > > } > > > > > > > > clear_bit(NFS_CS_TSM_POSSIBLE, &clp->cl_flags); > > > > > > > > nfs_put_client(clp); > > > > > > > > -- > > > > > > > > 2.27.0 > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Chuck Lever > > > > > > > > > > -- > > > > > Chuck Lever > > > > > > > > > > > > > > > > > > > > > -- > > > Trond Myklebust > > > Linux NFS client maintainer, Hammerspace > > > trond.myklebust@hammerspace.com > > > > > > > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com > >
On Tue, Jun 8, 2021 at 6:31 PM Trond Myklebust <trondmy@hammerspace.com> wrote: > > On Tue, 2021-06-08 at 18:21 -0400, Olga Kornievskaia wrote: > > On Tue, Jun 8, 2021 at 5:41 PM Trond Myklebust < > > trondmy@hammerspace.com> wrote: > > > > > > On Tue, 2021-06-08 at 17:30 -0400, Olga Kornievskaia wrote: > > > > On Tue, Jun 8, 2021 at 5:26 PM Trond Myklebust < > > > > trondmy@hammerspace.com> wrote: > > > > > > > > > > On Tue, 2021-06-08 at 21:19 +0000, Trond Myklebust wrote: > > > > > > On Tue, 2021-06-08 at 21:06 +0000, Chuck Lever III wrote: > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia < > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III < > > > > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia < > > > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > From: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > > > > > > > > > > > After trunking is discovered in > > > > > > > > > > nfs4_discover_server_trunking(), > > > > > > > > > > add the transport to the old client structure before > > > > > > > > > > destroying > > > > > > > > > > the new client structure (along with its transport). 
> > > > > > > > > > > > > > > > > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > --- > > > > > > > > > > fs/nfs/nfs4client.c | 40 > > > > > > > > > > ++++++++++++++++++++++++++++++++++++++++ > > > > > > > > > > 1 file changed, 40 insertions(+) > > > > > > > > > > > > > > > > > > > > diff --git a/fs/nfs/nfs4client.c > > > > > > > > > > b/fs/nfs/nfs4client.c > > > > > > > > > > index 42719384e25f..984c851844d8 100644 > > > > > > > > > > --- a/fs/nfs/nfs4client.c > > > > > > > > > > +++ b/fs/nfs/nfs4client.c > > > > > > > > > > @@ -361,6 +361,44 @@ static int > > > > > > > > > > nfs4_init_client_minor_version(struct nfs_client > > > > > > > > > > *clp) > > > > > > > > > > return nfs4_init_callback(clp); > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > +static void nfs4_add_trunk(struct nfs_client *clp, > > > > > > > > > > struct > > > > > > > > > > nfs_client *old) > > > > > > > > > > +{ > > > > > > > > > > + struct sockaddr_storage clp_addr, old_addr; > > > > > > > > > > + struct sockaddr *clp_sap = (struct sockaddr > > > > > > > > > > *)&clp_addr; > > > > > > > > > > + struct sockaddr *old_sap = (struct sockaddr > > > > > > > > > > *)&old_addr; > > > > > > > > > > + size_t clp_salen, old_salen; > > > > > > > > > > + struct xprt_create xprt_args = { > > > > > > > > > > + .ident = old->cl_proto, > > > > > > > > > > + .net = old->cl_net, > > > > > > > > > > + .servername = old->cl_hostname, > > > > > > > > > > + }; > > > > > > > > > > + struct nfs4_add_xprt_data xprtdata = { > > > > > > > > > > + .clp = old, > > > > > > > > > > + }; > > > > > > > > > > + struct rpc_add_xprt_test rpcdata = { > > > > > > > > > > + .add_xprt_test = old->cl_mvops- > > > > > > > > > > > session_trunk, > > > > > > > > > > + .data = &xprtdata, > > > > > > > > > > + }; > > > > > > > > > > + > > > > > > > > > > + if (clp->cl_proto != old->cl_proto) > > > > > > > > > > + return; > > > > > > > > > > + clp_salen = rpc_peeraddr(clp->cl_rpcclient, > > > > 
> > > > > > clp_sap, > > > > > > > > > > sizeof(clp_addr)); > > > > > > > > > > + old_salen = rpc_peeraddr(old->cl_rpcclient, > > > > > > > > > > old_sap, > > > > > > > > > > sizeof(old_addr)); > > > > > > > > > > + > > > > > > > > > > + if (clp_addr.ss_family != old_addr.ss_family) > > > > > > > > > > + return; > > > > > > > > > > + > > > > > > > > > > + xprt_args.dstaddr = clp_sap; > > > > > > > > > > + xprt_args.addrlen = clp_salen; > > > > > > > > > > + > > > > > > > > > > + xprtdata.cred = nfs4_get_clid_cred(old); > > > > > > > > > > + rpc_clnt_add_xprt(old->cl_rpcclient, > > > > > > > > > > &xprt_args, > > > > > > > > > > + > > > > > > > > > > rpc_clnt_setup_test_and_add_xprt, > > > > > > > > > > &rpcdata); > > > > > > > > > > > > > > > > > > Is there an upper bound on the number of transports > > > > > > > > > that > > > > > > > > > are added to the NFS client's switch? > > > > > > > > > > > > > > > > I don't believe any limits exist right now. Why should > > > > > > > > there > > > > > > > > be a > > > > > > > > limit? Are you saying that the client should limit > > > > > > > > trunking? > > > > > > > > While > > > > > > > > this is not what's happening here, but say FS_LOCATION > > > > > > > > returned > > > > > > > > 100 > > > > > > > > ips for the server. Are you saying the client should be > > > > > > > > limiting > > > > > > > > how > > > > > > > > many trunkable connections it would be establishing and > > > > > > > > picking > > > > > > > > just a > > > > > > > > few addresses to try? What's happening with this patch is > > > > > > > > that > > > > > > > > say > > > > > > > > there are 100mounts to 100 ips (each representing the > > > > > > > > same > > > > > > > > server > > > > > > > > or > > > > > > > > trunkable server(s)), without this patch a single > > > > > > > > connection > > > > > > > > is > > > > > > > > kept, > > > > > > > > with this patch we'll have 100 connections. 
> > > > > > > > > > > > > > The patch description needs to make this behavior more > > > > > > > clear. > > > > > > > It > > > > > > > needs to explain "why" -- the body of the patch already > > > > > > > explains > > > > > > > "what". Can you include your last sentence in the > > > > > > > description > > > > > > > as > > > > > > > a use case? > > > > > > > > > > > > > > As for the behavior... Seems to me that there are going to > > > > > > > be > > > > > > > only > > > > > > > infinitesimal gains after the first few connections, and > > > > > > > after > > > > > > > that, it's going to be a lot for both sides to manage for > > > > > > > no > > > > > > > real > > > > > > > gain. I think you do want to cap the total number of > > > > > > > connections > > > > > > > at a reasonable number, even in the FS_LOCATIONS case. > > > > > > > > > > > > > > > > > > > I'd tend to agree. If you want to scale I/O then pNFS is the > > > > > > way > > > > > > to > > > > > > go, > > > > > > not vast numbers of connections to a single server. > > > > > > > > > > > BTW: AFAICS this patch will end up adding another connection > > > > > every > > > > > time > > > > > we mount a new filesystem, whether or not a connection already > > > > > exists > > > > > to that IP address. That's unacceptable. > > > > > > > > Not in my testing. Mounts to the same server (same IP) different > > > > export volumes lead to use of a single connection. > > > > > > > > > > > Ah... Never mind. The comparison is made by > > > rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if > > > it > > > is already present. > > > > > > However why are you running trunking detection a second time on the > > > address you've just run trunking detection on and have decided to > > > add? > > > > Because that's where we determined that these are different clients > > and are trunkable? But I guess also for the ease of re-using existing > > code. 
There is no hook to create an xprt and add it to an existing > > RPC > > client. One can be created. > > Why not rpc_clnt_add_xprt()? That's what pNFS uses for this case. That's what this code is using. > > > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com > >
Hi Olga, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on nfs/linux-next] [also build test WARNING on v5.13-rc5 next-20210608] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Olga-Kornievskaia/NFSv4-1-add-trunking-when-server-trunking-detected/20210609-025208 base: git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next config: arm-defconfig (attached as .config) compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/52d86dfa29146024beeca51af401aabbb329a942 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Olga-Kornievskaia/NFSv4-1-add-trunking-when-server-trunking-detected/20210609-025208 git checkout 52d86dfa29146024beeca51af401aabbb329a942 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): fs/nfs/nfs4client.c: In function 'nfs4_add_trunk': >> fs/nfs/nfs4client.c:369:20: warning: variable 'old_salen' set but not used [-Wunused-but-set-variable] 369 | size_t clp_salen, old_salen; | ^~~~~~~~~ vim +/old_salen +369 fs/nfs/nfs4client.c 363 364 static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old) 365 { 366 struct sockaddr_storage clp_addr, old_addr; 367 struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr; 368 struct sockaddr *old_sap = (struct sockaddr *)&old_addr; > 369 size_t clp_salen, old_salen; 370 struct xprt_create xprt_args = { 371 .ident = old->cl_proto, 372 .net = 
old->cl_net, 373 .servername = old->cl_hostname, 374 }; 375 struct nfs4_add_xprt_data xprtdata = { 376 .clp = old, 377 }; 378 struct rpc_add_xprt_test rpcdata = { 379 .add_xprt_test = old->cl_mvops->session_trunk, 380 .data = &xprtdata, 381 }; 382 383 if (clp->cl_proto != old->cl_proto) 384 return; 385 clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr)); 386 old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr)); 387 388 if (clp_addr.ss_family != old_addr.ss_family) 389 return; 390 391 xprt_args.dstaddr = clp_sap; 392 xprt_args.addrlen = clp_salen; 393 394 xprtdata.cred = nfs4_get_clid_cred(old); 395 rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args, 396 rpc_clnt_setup_test_and_add_xprt, &rpcdata); 397 398 if (xprtdata.cred) 399 put_cred(xprtdata.cred); 400 } 401 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, 2021-06-08 at 18:38 -0400, Olga Kornievskaia wrote: > On Tue, Jun 8, 2021 at 6:28 PM Trond Myklebust < > trondmy@hammerspace.com> wrote: > > > > On Tue, 2021-06-08 at 18:18 -0400, Olga Kornievskaia wrote: > > > On Tue, Jun 8, 2021 at 6:13 PM Trond Myklebust < > > > trondmy@hammerspace.com> wrote: > > > > > > > > On Tue, 2021-06-08 at 17:40 -0400, Olga Kornievskaia wrote: > > > > > On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III < > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia < > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III < > > > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia < > > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > > > From: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > > > > > > > > > After trunking is discovered in > > > > > > > > > nfs4_discover_server_trunking(), > > > > > > > > > add the transport to the old client structure before > > > > > > > > > destroying > > > > > > > > > the new client structure (along with its transport). 
> > > > > > > > > > > > > > > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > --- > > > > > > > > > fs/nfs/nfs4client.c | 40 > > > > > > > > > ++++++++++++++++++++++++++++++++++++++++ > > > > > > > > > 1 file changed, 40 insertions(+) > > > > > > > > > > > > > > > > > > diff --git a/fs/nfs/nfs4client.c > > > > > > > > > b/fs/nfs/nfs4client.c > > > > > > > > > index 42719384e25f..984c851844d8 100644 > > > > > > > > > --- a/fs/nfs/nfs4client.c > > > > > > > > > +++ b/fs/nfs/nfs4client.c > > > > > > > > > @@ -361,6 +361,44 @@ static int > > > > > > > > > nfs4_init_client_minor_version(struct nfs_client > > > > > > > > > *clp) > > > > > > > > > return nfs4_init_callback(clp); > > > > > > > > > } > > > > > > > > > > > > > > > > > > +static void nfs4_add_trunk(struct nfs_client *clp, > > > > > > > > > struct > > > > > > > > > nfs_client *old) > > > > > > > > > +{ > > > > > > > > > + struct sockaddr_storage clp_addr, old_addr; > > > > > > > > > + struct sockaddr *clp_sap = (struct sockaddr > > > > > > > > > *)&clp_addr; > > > > > > > > > + struct sockaddr *old_sap = (struct sockaddr > > > > > > > > > *)&old_addr; > > > > > > > > > + size_t clp_salen, old_salen; > > > > > > > > > + struct xprt_create xprt_args = { > > > > > > > > > + .ident = old->cl_proto, > > > > > > > > > + .net = old->cl_net, > > > > > > > > > + .servername = old->cl_hostname, > > > > > > > > > + }; > > > > > > > > > + struct nfs4_add_xprt_data xprtdata = { > > > > > > > > > + .clp = old, > > > > > > > > > + }; > > > > > > > > > + struct rpc_add_xprt_test rpcdata = { > > > > > > > > > + .add_xprt_test = old->cl_mvops- > > > > > > > > > > session_trunk, > > > > > > > > > + .data = &xprtdata, > > > > > > > > > + }; > > > > > > > > > + > > > > > > > > > + if (clp->cl_proto != old->cl_proto) > > > > > > > > > + return; > > > > > > > > > + clp_salen = rpc_peeraddr(clp->cl_rpcclient, > > > > > > > > > clp_sap, > > > > > > > > > sizeof(clp_addr)); > > > > > > > > > + old_salen = 
rpc_peeraddr(old->cl_rpcclient, > > > > > > > > > old_sap, > > > > > > > > > sizeof(old_addr)); > > > > > > > > > + > > > > > > > > > + if (clp_addr.ss_family != old_addr.ss_family) > > > > > > > > > + return; > > > > > > > > > + > > > > > > > > > + xprt_args.dstaddr = clp_sap; > > > > > > > > > + xprt_args.addrlen = clp_salen; > > > > > > > > > + > > > > > > > > > + xprtdata.cred = nfs4_get_clid_cred(old); > > > > > > > > > + rpc_clnt_add_xprt(old->cl_rpcclient, > > > > > > > > > &xprt_args, > > > > > > > > > + > > > > > > > > > rpc_clnt_setup_test_and_add_xprt, > > > > > > > > > &rpcdata); > > > > > > > > > > > > > > > > Is there an upper bound on the number of transports > > > > > > > > that > > > > > > > > are added to the NFS client's switch? > > > > > > > > > > > > > > I don't believe any limits exist right now. Why should > > > > > > > there > > > > > > > be a > > > > > > > limit? Are you saying that the client should limit > > > > > > > trunking? > > > > > > > While > > > > > > > this is not what's happening here, but say FS_LOCATION > > > > > > > returned > > > > > > > 100 > > > > > > > ips for the server. Are you saying the client should be > > > > > > > limiting > > > > > > > how > > > > > > > many trunkable connections it would be establishing and > > > > > > > picking > > > > > > > just a > > > > > > > few addresses to try? What's happening with this patch is > > > > > > > that > > > > > > > say > > > > > > > there are 100mounts to 100 ips (each representing the > > > > > > > same > > > > > > > server > > > > > > > or > > > > > > > trunkable server(s)), without this patch a single > > > > > > > connection > > > > > > > is > > > > > > > kept, > > > > > > > with this patch we'll have 100 connections. > > > > > > > > > > > > The patch description needs to make this behavior more > > > > > > clear. > > > > > > It > > > > > > needs to explain "why" -- the body of the patch already > > > > > > explains > > > > > > "what". 
Can you include your last sentence in the > > > > > > description > > > > > > as > > > > > > a use case? > > > > > > > > > > I sure can. > > > > > > > > > > > As for the behavior... Seems to me that there are going to > > > > > > be > > > > > > only > > > > > > infinitesimal gains after the first few connections, and > > > > > > after > > > > > > that, it's going to be a lot for both sides to manage for > > > > > > no > > > > > > real > > > > > > gain. I think you do want to cap the total number of > > > > > > connections > > > > > > at a reasonable number, even in the FS_LOCATIONS case. > > > > > > > > > > Do you want to cap it at 16 like we do for nconnect? I'm ok > > > > > with > > > > > that > > > > > for now. > > > > > > > > > > I don't understand why a setup where there X NICs on each > > > > > side > > > > > (client/server) and X mounts are done, there won't be > > > > > throughput > > > > > of > > > > > close to X * throughput of a single NIC. > > > > > > > > That still does not make the patch acceptable. There are > > > > already > > > > server > > > > vendors seeing problems with supporting nconnect for various > > > > reasons. > > > > > > Why are there servers presenting multiple IP addresses and > > > returning > > > the same server identity if they can not support trunking? That > > > seems > > > to be going against the spec? > > > > That's not a question I can answer, but I'm not thinking of our > > server > > or the Linux server. > > > > > > > > > There needs to be a way to ensure that we can keep the current > > > > client > > > > behaviour without requiring changes to server DNS entries, etc. > > > > > > Ok, how about we introduce a mount option, "enable_trunking", and > > > if > > > that's present we would not collapse transports? > > > > I'd suggest making it 'max_connect=X' so that we can actually set a > > limit like we do for nconnect. That limit probably needs to be > > lower > > bounded by the value of nconnect. > > Sure I can do that. 
But wouldn't that be confusing since we already > have nconnect option? How about "max_trunking=X"? The lack of the > setting or max_trunking=1, would result in all transports collapsed, > other values would result in all additions of trunkable transports. > > I'm not comfortable using the word 'trunking' in the mount API. The NFS community uses 'trunking' to mean something very different from what the networking community does. You're only talking about session trunking here, and not clientid trunking, so I suggest we stick with a terminology that reminds us this is connection related.
On Tue, 2021-06-08 at 18:38 -0400, Olga Kornievskaia wrote: > On Tue, Jun 8, 2021 at 6:31 PM Trond Myklebust < > trondmy@hammerspace.com> wrote: > > > > On Tue, 2021-06-08 at 18:21 -0400, Olga Kornievskaia wrote: > > > On Tue, Jun 8, 2021 at 5:41 PM Trond Myklebust < > > > trondmy@hammerspace.com> wrote: > > > > > > > > On Tue, 2021-06-08 at 17:30 -0400, Olga Kornievskaia wrote: > > > > > On Tue, Jun 8, 2021 at 5:26 PM Trond Myklebust < > > > > > trondmy@hammerspace.com> wrote: > > > > > > > > > > > > On Tue, 2021-06-08 at 21:19 +0000, Trond Myklebust wrote: > > > > > > > On Tue, 2021-06-08 at 21:06 +0000, Chuck Lever III wrote: > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia < > > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > > > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III < > > > > > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia < > > > > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > From: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > > > > > > > > > > > > > After trunking is discovered in > > > > > > > > > > > nfs4_discover_server_trunking(), > > > > > > > > > > > add the transport to the old client structure > > > > > > > > > > > before > > > > > > > > > > > destroying > > > > > > > > > > > the new client structure (along with its > > > > > > > > > > > transport). 
> > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Olga Kornievskaia < > > > > > > > > > > > kolga@netapp.com> > > > > > > > > > > > --- > > > > > > > > > > > fs/nfs/nfs4client.c | 40 > > > > > > > > > > > ++++++++++++++++++++++++++++++++++++++++ > > > > > > > > > > > 1 file changed, 40 insertions(+) > > > > > > > > > > > > > > > > > > > > > > diff --git a/fs/nfs/nfs4client.c > > > > > > > > > > > b/fs/nfs/nfs4client.c > > > > > > > > > > > index 42719384e25f..984c851844d8 100644 > > > > > > > > > > > --- a/fs/nfs/nfs4client.c > > > > > > > > > > > +++ b/fs/nfs/nfs4client.c > > > > > > > > > > > @@ -361,6 +361,44 @@ static int > > > > > > > > > > > nfs4_init_client_minor_version(struct nfs_client > > > > > > > > > > > *clp) > > > > > > > > > > > return nfs4_init_callback(clp); > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > +static void nfs4_add_trunk(struct nfs_client > > > > > > > > > > > *clp, > > > > > > > > > > > struct > > > > > > > > > > > nfs_client *old) > > > > > > > > > > > +{ > > > > > > > > > > > + struct sockaddr_storage clp_addr, old_addr; > > > > > > > > > > > + struct sockaddr *clp_sap = (struct sockaddr > > > > > > > > > > > *)&clp_addr; > > > > > > > > > > > + struct sockaddr *old_sap = (struct sockaddr > > > > > > > > > > > *)&old_addr; > > > > > > > > > > > + size_t clp_salen, old_salen; > > > > > > > > > > > + struct xprt_create xprt_args = { > > > > > > > > > > > + .ident = old->cl_proto, > > > > > > > > > > > + .net = old->cl_net, > > > > > > > > > > > + .servername = old->cl_hostname, > > > > > > > > > > > + }; > > > > > > > > > > > + struct nfs4_add_xprt_data xprtdata = { > > > > > > > > > > > + .clp = old, > > > > > > > > > > > + }; > > > > > > > > > > > + struct rpc_add_xprt_test rpcdata = { > > > > > > > > > > > + .add_xprt_test = old->cl_mvops- > > > > > > > > > > > > session_trunk, > > > > > > > > > > > + .data = &xprtdata, > > > > > > > > > > > + }; > > > > > > > > > > > + > > > > > > > > > > > + if 
> > > > > > > > > > > +	if (clp->cl_proto != old->cl_proto)
> > > > > > > > > > > +		return;
> > > > > > > > > > > +	clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
> > > > > > > > > > > +	old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
> > > > > > > > > > > +
> > > > > > > > > > > +	if (clp_addr.ss_family != old_addr.ss_family)
> > > > > > > > > > > +		return;
> > > > > > > > > > > +
> > > > > > > > > > > +	xprt_args.dstaddr = clp_sap;
> > > > > > > > > > > +	xprt_args.addrlen = clp_salen;
> > > > > > > > > > > +
> > > > > > > > > > > +	xprtdata.cred = nfs4_get_clid_cred(old);
> > > > > > > > > > > +	rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
> > > > > > > > > > > +			  rpc_clnt_setup_test_and_add_xprt, &rpcdata);
> > > > > > > > > > 
> > > > > > > > > > Is there an upper bound on the number of transports that
> > > > > > > > > > are added to the NFS client's switch?
> > > > > > > > > 
> > > > > > > > > I don't believe any limits exist right now. Why should there be a
> > > > > > > > > limit? Are you saying that the client should limit trunking? While
> > > > > > > > > this is not what's happening here, say FS_LOCATION returned 100
> > > > > > > > > ips for the server. Are you saying the client should be limiting how
> > > > > > > > > many trunkable connections it would be establishing and picking just a
> > > > > > > > > few addresses to try? What's happening with this patch is that, say
> > > > > > > > > there are 100 mounts to 100 ips (each representing the same server or
> > > > > > > > > trunkable server(s)); without this patch a single connection is kept,
> > > > > > > > > with this patch we'll have 100 connections.
> > > > > > > > 
> > > > > > > > The patch description needs to make this behavior more clear. It
> > > > > > > > needs to explain "why" -- the body of the patch already explains
> > > > > > > > "what". Can you include your last sentence in the description as
> > > > > > > > a use case?
> > > > > > > > 
> > > > > > > > As for the behavior... Seems to me that there are going to be only
> > > > > > > > infinitesimal gains after the first few connections, and after
> > > > > > > > that, it's going to be a lot for both sides to manage for no real
> > > > > > > > gain. I think you do want to cap the total number of connections
> > > > > > > > at a reasonable number, even in the FS_LOCATIONS case.
> > > > > > > 
> > > > > > > I'd tend to agree. If you want to scale I/O then pNFS is the way to
> > > > > > > go, not vast numbers of connections to a single server.
> > > > > > 
> > > > > > BTW: AFAICS this patch will end up adding another connection every
> > > > > > time we mount a new filesystem, whether or not a connection already
> > > > > > exists to that IP address. That's unacceptable.
> > > > > 
> > > > > Not in my testing. Mounts to the same server (same IP) but different
> > > > > export volumes lead to use of a single connection.
> > > > 
> > > > Ah... Never mind. The comparison is made by
> > > > rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if it
> > > > is already present.
> > > > 
> > > > However, why are you running trunking detection a second time on the
> > > > address you've just run trunking detection on and have decided to add?
> > > 
> > > Because that's where we determined that these are different clients
> > > and are trunkable? But I guess also for the ease of re-using existing
> > > code. There is no hook to create an xprt and add it to an existing RPC
> > > client. One could be created.
> > 
> > Why not rpc_clnt_add_xprt()? That's what pNFS uses for this case.
> 
> That's what this code is using.

Sorry. I should have written: 'rpc_clnt_add_xprt() with
rpc_clnt_setup_test_and_add_xprt()', which is what pNFS uses when
connecting to NFSv3 servers. i.e.

	rpc_clnt_add_xprt(clp->cl_rpcclient, &xprt_args,
			  rpc_clnt_test_and_add_xprt, NULL);
On Tue, 2021-06-08 at 23:02 +0000, Trond Myklebust wrote:
> Sorry. I should have written: 'rpc_clnt_add_xprt() with
> rpc_clnt_setup_test_and_add_xprt()', which is what pNFS uses when

Growl... My keyboard is hexed today. I mean 'rpc_clnt_add_xprt() with
rpc_clnt_test_and_add_xprt'.

> connecting to NFSv3 servers.
> i.e.
> 
> 	rpc_clnt_add_xprt(clp->cl_rpcclient, &xprt_args,
> 			  rpc_clnt_test_and_add_xprt, NULL);
> 

The above line is correct...
Hi Olga,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nfs/linux-next]
[also build test WARNING on v5.13-rc5 next-20210608]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Olga-Kornievskaia/NFSv4-1-add-trunking-when-server-trunking-detected/20210609-025208
base:   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
config: x86_64-randconfig-a015-20210608 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project d32cc150feb72f315a5bbd34f92e7beca21a50da)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/52d86dfa29146024beeca51af401aabbb329a942
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Olga-Kornievskaia/NFSv4-1-add-trunking-when-server-trunking-detected/20210609-025208
        git checkout 52d86dfa29146024beeca51af401aabbb329a942
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> fs/nfs/nfs4client.c:369:20: warning: variable 'old_salen' set but not used [-Wunused-but-set-variable]
           size_t clp_salen, old_salen;
                             ^
   1 warning generated.
vim +/old_salen +369 fs/nfs/nfs4client.c

   363	
   364	static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old)
   365	{
   366		struct sockaddr_storage clp_addr, old_addr;
   367		struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
   368		struct sockaddr *old_sap = (struct sockaddr *)&old_addr;
 > 369		size_t clp_salen, old_salen;
   370		struct xprt_create xprt_args = {
   371			.ident = old->cl_proto,
   372			.net = old->cl_net,
   373			.servername = old->cl_hostname,
   374		};
   375		struct nfs4_add_xprt_data xprtdata = {
   376			.clp = old,
   377		};
   378		struct rpc_add_xprt_test rpcdata = {
   379			.add_xprt_test = old->cl_mvops->session_trunk,
   380			.data = &xprtdata,
   381		};
   382	
   383		if (clp->cl_proto != old->cl_proto)
   384			return;
   385		clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
   386		old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
   387	
   388		if (clp_addr.ss_family != old_addr.ss_family)
   389			return;
   390	
   391		xprt_args.dstaddr = clp_sap;
   392		xprt_args.addrlen = clp_salen;
   393	
   394		xprtdata.cred = nfs4_get_clid_cred(old);
   395		rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
   396				  rpc_clnt_setup_test_and_add_xprt, &rpcdata);
   397	
   398		if (xprtdata.cred)
   399			put_cred(xprtdata.cred);
   400	}
   401	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
> > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
> >
> > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>
> >>> On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
> >>>
> >>> From: Olga Kornievskaia <kolga@netapp.com>
> >>>
> >>> After trunking is discovered in nfs4_discover_server_trunking(),
> >>> add the transport to the old client structure before destroying
> >>> the new client structure (along with its transport).
> >>>
> >>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> >>> ---
> >>> fs/nfs/nfs4client.c | 40 ++++++++++++++++++++++++++++++++++++++++
> >>> 1 file changed, 40 insertions(+)
> >>>
> >>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> >>> index 42719384e25f..984c851844d8 100644
> >>> --- a/fs/nfs/nfs4client.c
> >>> +++ b/fs/nfs/nfs4client.c
> >>> @@ -361,6 +361,44 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> >>>     return nfs4_init_callback(clp);
> >>> }
> >>>
> >>> +static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old)
> >>> +{
> >>> +   struct sockaddr_storage clp_addr, old_addr;
> >>> +   struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
> >>> +   struct sockaddr *old_sap = (struct sockaddr *)&old_addr;
> >>> +   size_t clp_salen, old_salen;
> >>> +   struct xprt_create xprt_args = {
> >>> +           .ident = old->cl_proto,
> >>> +           .net = old->cl_net,
> >>> +           .servername = old->cl_hostname,
> >>> +   };
> >>> +   struct nfs4_add_xprt_data xprtdata = {
> >>> +           .clp = old,
> >>> +   };
> >>> +   struct rpc_add_xprt_test rpcdata = {
> >>> +           .add_xprt_test = old->cl_mvops->session_trunk,
> >>> +           .data = &xprtdata,
> >>> +   };
> >>> +
> >>> +   if (clp->cl_proto != old->cl_proto)
> >>> +           return;
> >>> +   clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
> >>> +   old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
> >>> +
> >>> +   if (clp_addr.ss_family != old_addr.ss_family)
> >>> +           return;
> >>> +
> >>> +   xprt_args.dstaddr = clp_sap;
> >>> +   xprt_args.addrlen = clp_salen;
> >>> +
> >>> +   xprtdata.cred = nfs4_get_clid_cred(old);
> >>> +   rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
> >>> +                     rpc_clnt_setup_test_and_add_xprt, &rpcdata);
> >>
> >> Is there an upper bound on the number of transports that
> >> are added to the NFS client's switch?
> >
> > I don't believe any limits exist right now. Why should there be a
> > limit? Are you saying that the client should limit trunking? While
> > this is not what's happening here, say FS_LOCATION returned 100
> > ips for the server. Are you saying the client should be limiting how
> > many trunkable connections it would be establishing and picking just a
> > few addresses to try? What's happening with this patch is that, say
> > there are 100 mounts to 100 ips (each representing the same server or
> > trunkable server(s)); without this patch a single connection is kept,
> > with this patch we'll have 100 connections.
>
> The patch description needs to make this behavior more clear. It
> needs to explain "why" -- the body of the patch already explains
> "what". Can you include your last sentence in the description as
> a use case?
>
> As for the behavior... Seems to me that there are going to be only
> infinitesimal gains after the first few connections, and after
> that, it's going to be a lot for both sides to manage for no real
> gain. I think you do want to cap the total number of connections
> at a reasonable number, even in the FS_LOCATIONS case.

To both Trond and Chuck: if we were to cap the max number of transports,
how do we manage nconnect transports and trunking transports? Are they
the "same" transports? For instance, if somebody specified
nconnect=10,max_connect=2, which one matters more? Should nconnect be
cut at 2? That would mean all the allowable transports are used up by
nconnect. But we've seen that there is benefit from creating multiple
connections for a single NIC, and in a multi-NIC environment that means
multiple connections on each NIC, wouldn't it? So should we instead
allow nconnect * max_connect transports? Or does the admin need to
calculate that value to make sure that max_connect reflects it?

> >>> +
> >>> +   if (xprtdata.cred)
> >>> +           put_cred(xprtdata.cred);
> >>> +}
> >>> +
> >>> /**
> >>> * nfs4_init_client - Initialise an NFS4 client record
> >>> *
> >>> @@ -434,6 +472,8 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> >>> * won't try to use it.
> >>> */
> >>> nfs_mark_client_ready(clp, -EPERM);
> >>> +   if (old->cl_mvops->session_trunk)
> >>> +           nfs4_add_trunk(clp, old);
> >>> }
> >>> clear_bit(NFS_CS_TSM_POSSIBLE, &clp->cl_flags);
> >>> nfs_put_client(clp);
> >>> --
> >>> 2.27.0
> >>>
> >>
> >> --
> >> Chuck Lever
>
> --
> Chuck Lever
diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 42719384e25f..984c851844d8 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -361,6 +361,44 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
 	return nfs4_init_callback(clp);
 }
 
+static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old)
+{
+	struct sockaddr_storage clp_addr, old_addr;
+	struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
+	struct sockaddr *old_sap = (struct sockaddr *)&old_addr;
+	size_t clp_salen, old_salen;
+	struct xprt_create xprt_args = {
+		.ident = old->cl_proto,
+		.net = old->cl_net,
+		.servername = old->cl_hostname,
+	};
+	struct nfs4_add_xprt_data xprtdata = {
+		.clp = old,
+	};
+	struct rpc_add_xprt_test rpcdata = {
+		.add_xprt_test = old->cl_mvops->session_trunk,
+		.data = &xprtdata,
+	};
+
+	if (clp->cl_proto != old->cl_proto)
+		return;
+	clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
+	old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
+
+	if (clp_addr.ss_family != old_addr.ss_family)
+		return;
+
+	xprt_args.dstaddr = clp_sap;
+	xprt_args.addrlen = clp_salen;
+
+	xprtdata.cred = nfs4_get_clid_cred(old);
+	rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
+			  rpc_clnt_setup_test_and_add_xprt, &rpcdata);
+
+	if (xprtdata.cred)
+		put_cred(xprtdata.cred);
+}
+
 /**
  * nfs4_init_client - Initialise an NFS4 client record
  *
@@ -434,6 +472,8 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
 	 * won't try to use it.
 	 */
 	nfs_mark_client_ready(clp, -EPERM);
+	if (old->cl_mvops->session_trunk)
+		nfs4_add_trunk(clp, old);
 	}
 	clear_bit(NFS_CS_TSM_POSSIBLE, &clp->cl_flags);
 	nfs_put_client(clp);