Message ID | 20210608184527.87018-1-olga.kornievskaia@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/1] NFSv4.1+ add trunking when server trunking detected |
> On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
>
> From: Olga Kornievskaia <kolga@netapp.com>
>
> After trunking is discovered in nfs4_discover_server_trunking(),
> add the transport to the old client structure before destroying
> the new client structure (along with its transport).
>
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
> fs/nfs/nfs4client.c | 40 ++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 40 insertions(+)
>
> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> index 42719384e25f..984c851844d8 100644
> --- a/fs/nfs/nfs4client.c
> +++ b/fs/nfs/nfs4client.c
> @@ -361,6 +361,44 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> 	return nfs4_init_callback(clp);
> }
>
> +static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old)
> +{
> +	struct sockaddr_storage clp_addr, old_addr;
> +	struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
> +	struct sockaddr *old_sap = (struct sockaddr *)&old_addr;
> +	size_t clp_salen, old_salen;
> +	struct xprt_create xprt_args = {
> +		.ident = old->cl_proto,
> +		.net = old->cl_net,
> +		.servername = old->cl_hostname,
> +	};
> +	struct nfs4_add_xprt_data xprtdata = {
> +		.clp = old,
> +	};
> +	struct rpc_add_xprt_test rpcdata = {
> +		.add_xprt_test = old->cl_mvops->session_trunk,
> +		.data = &xprtdata,
> +	};
> +
> +	if (clp->cl_proto != old->cl_proto)
> +		return;
> +	clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
> +	old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
> +
> +	if (clp_addr.ss_family != old_addr.ss_family)
> +		return;
> +
> +	xprt_args.dstaddr = clp_sap;
> +	xprt_args.addrlen = clp_salen;
> +
> +	xprtdata.cred = nfs4_get_clid_cred(old);
> +	rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
> +			  rpc_clnt_setup_test_and_add_xprt, &rpcdata);

Is there an upper bound on the number of transports that
are added to the NFS client's switch?

> +
> +	if (xprtdata.cred)
> +		put_cred(xprtdata.cred);
> +}
> +
> /**
>  * nfs4_init_client - Initialise an NFS4 client record
>  *
> @@ -434,6 +472,8 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
>  		 * won't try to use it.
>  		 */
>  		nfs_mark_client_ready(clp, -EPERM);
> +		if (old->cl_mvops->session_trunk)
> +			nfs4_add_trunk(clp, old);
>  	}
>  	clear_bit(NFS_CS_TSM_POSSIBLE, &clp->cl_flags);
>  	nfs_put_client(clp);
> --
> 2.27.0

--
Chuck Lever
On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
> > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
> >
> > [...]
>
> Is there an upper bound on the number of transports that
> are added to the NFS client's switch?

I don't believe any limits exist right now. Why should there be a
limit? Are you saying that the client should limit trunking? While
this is not what's happening here, say FS_LOCATION returned 100
ips for the server. Are you saying the client should be limiting how
many trunkable connections it would be establishing and picking just a
few addresses to try? What's happening with this patch is that, say
there are 100 mounts to 100 ips (each representing the same server or
trunkable server(s)), without this patch a single connection is kept,
with this patch we'll have 100 connections.
> On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
>
> [...]
>
> I don't believe any limits exist right now. Why should there be a
> limit? Are you saying that the client should limit trunking? While
> this is not what's happening here, but say FS_LOCATION returned 100
> ips for the server. Are you saying the client should be limiting how
> many trunkable connections it would be establishing and picking just a
> few addresses to try? What's happening with this patch is that say
> there are 100 mounts to 100 ips (each representing the same server or
> trunkable server(s)), without this patch a single connection is kept,
> with this patch we'll have 100 connections.

The patch description needs to make this behavior more clear. It
needs to explain "why" -- the body of the patch already explains
"what". Can you include your last sentence in the description as
a use case?

As for the behavior... Seems to me that there are going to be only
infinitesimal gains after the first few connections, and after
that, it's going to be a lot for both sides to manage for no real
gain. I think you do want to cap the total number of connections
at a reasonable number, even in the FS_LOCATIONS case.

--
Chuck Lever
On Tue, 2021-06-08 at 21:06 +0000, Chuck Lever III wrote:
> [...]
>
> As for the behavior... Seems to me that there are going to be only
> infinitesimal gains after the first few connections, and after
> that, it's going to be a lot for both sides to manage for no real
> gain. I think you do want to cap the total number of connections
> at a reasonable number, even in the FS_LOCATIONS case.

I'd tend to agree. If you want to scale I/O then pNFS is the way to go,
not vast numbers of connections to a single server.
On Tue, 2021-06-08 at 21:19 +0000, Trond Myklebust wrote:
> [...]
>
> I'd tend to agree. If you want to scale I/O then pNFS is the way to go,
> not vast numbers of connections to a single server.

BTW: AFAICS this patch will end up adding another connection every time
we mount a new filesystem, whether or not a connection already exists
to that IP address. That's unacceptable.
On Tue, Jun 8, 2021 at 5:26 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> [...]
>
> BTW: AFAICS this patch will end up adding another connection every time
> we mount a new filesystem, whether or not a connection already exists
> to that IP address. That's unacceptable.

Not in my testing. Mounts to the same server (same IP) different
export volumes lead to use of a single connection.
On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> [...]
>
> The patch description needs to make this behavior more clear. It
> needs to explain "why" -- the body of the patch already explains
> "what". Can you include your last sentence in the description as
> a use case?

I sure can.

> As for the behavior... Seems to me that there are going to be only
> infinitesimal gains after the first few connections, and after
> that, it's going to be a lot for both sides to manage for no real
> gain. I think you do want to cap the total number of connections
> at a reasonable number, even in the FS_LOCATIONS case.

Do you want to cap it at 16 like we do for nconnect? I'm ok with that
for now. I don't understand why a setup with X NICs on each side
(client/server) and X mounts wouldn't get throughput close to X times
that of a single NIC.
On Tue, 2021-06-08 at 17:30 -0400, Olga Kornievskaia wrote:
> [...]
>
> Not in my testing. Mounts to the same server (same IP) different
> export volumes lead to use of a single connection.

Ah... Never mind. The comparison is made by
rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if it
is already present.

However, why are you running trunking detection a second time on the
address you've just run trunking detection on and have decided to add?
On Tue, 2021-06-08 at 17:40 -0400, Olga Kornievskaia wrote:
> On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> > [...]
> >
> > The patch description needs to make this behavior more clear. It
> > needs to explain "why" -- the body of the patch already explains
> > "what". Can you include your last sentence in the description as
> > a use case?
>
> I sure can.
>
> > As for the behavior... Seems to me that there are going to be only
> > infinitesimal gains after the first few connections, and after that,
> > it's going to be a lot for both sides to manage for no real gain. I
> > think you do want to cap the total number of connections at a
> > reasonable number, even in the FS_LOCATIONS case.
>
> Do you want to cap it at 16 like we do for nconnect? I'm ok with that
> for now.
>
> I don't understand why a setup where there X NICs on each side
> (client/server) and X mounts are done, there won't be throughput of
> close to X * throughput of a single NIC.

That still does not make the patch acceptable. There are already server
vendors seeing problems with supporting nconnect for various reasons.
There needs to be a way to ensure that we can keep the current client
behaviour without requiring changes to server DNS entries, etc.

> [...]
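The cap being discussed above would bound how many trunked transports the client stacks onto one rpc_clnt. A minimal userspace sketch of such a check, assuming a ceiling of 16 to mirror the existing nconnect limit (the constant `NFS_MAX_TRANSPORTS` and the helper name are invented for illustration; this is not kernel API):

```c
#include <stddef.h>

/*
 * Hypothetical sketch (not kernel code): a guard that the trunking
 * path could consult before adding another transport to the switch.
 * The ceiling of 16 matches the limit nconnect enforces today.
 */
#define NFS_MAX_TRANSPORTS 16

int may_add_trunked_xprt(size_t active_xprts)
{
	/* Refuse once the switch already holds the maximum. */
	return active_xprts < NFS_MAX_TRANSPORTS;
}
```

With this guard in place, the 100-mounts-to-100-IPs scenario described earlier would stop adding connections after the sixteenth, rather than creating one per mount.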
On Tue, Jun 8, 2021 at 6:13 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Tue, 2021-06-08 at 17:40 -0400, Olga Kornievskaia wrote:
> > [...]
> >
> > Do you want to cap it at 16 like we do for nconnect? I'm ok with that
> > for now.
> >
> > I don't understand why a setup where there X NICs on each side
> > (client/server) and X mounts are done, there won't be throughput of
> > close to X * throughput of a single NIC.
>
> That still does not make the patch acceptable. There are already server
> vendors seeing problems with supporting nconnect for various reasons.

Why are there servers presenting multiple IP addresses and returning
the same server identity if they can not support trunking? That seems
to be going against the spec?

> There needs to be a way to ensure that we can keep the current client
> behaviour without requiring changes to server DNS entries, etc.

Ok, how about we introduce a mount option, "enable_trunking", and if
that's present we would not collapse transports?

> [...]
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
On Tue, Jun 8, 2021 at 5:41 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
>
> On Tue, 2021-06-08 at 17:30 -0400, Olga Kornievskaia wrote:
> > > [...]
> > >
> > > BTW: AFAICS this patch will end up adding another connection every
> > > time we mount a new filesystem, whether or not a connection already
> > > exists to that IP address. That's unacceptable.
> >
> > Not in my testing. Mounts to the same server (same IP) different
> > export volumes lead to use of a single connection.
>
> Ah... Never mind. The comparison is made by
> rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if it
> is already present.
>
> However why are you running trunking detection a second time on the
> address you've just run trunking detection on and have decided to add?

Because that's where we determined that these are different clients
and are trunkable? But I guess also for the ease of re-using existing
code. There is no hook to create an xprt and add it to an existing RPC
client. One can be created.

> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
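The "never mind" above refers to the duplicate-address test inside rpc_clnt_setup_test_and_add_xprt(): a candidate transport whose peer address is already present in the transport switch is discarded rather than added twice. A rough userspace approximation of that comparison is sketched below; `sockaddr_equal()`, `xprt_addr_in_use()` and `mk_ipv4()` are made-up names for illustration, and the kernel uses its own sunrpc address-comparison helpers rather than this code:

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Compare two peer addresses: same family and same host address.
 * (Ports are ignored here, as trunking cares about the endpoint host.) */
int sockaddr_equal(const struct sockaddr_storage *a,
		   const struct sockaddr_storage *b)
{
	if (a->ss_family != b->ss_family)
		return 0;
	if (a->ss_family == AF_INET) {
		const struct sockaddr_in *a4 = (const struct sockaddr_in *)a;
		const struct sockaddr_in *b4 = (const struct sockaddr_in *)b;
		return a4->sin_addr.s_addr == b4->sin_addr.s_addr;
	}
	if (a->ss_family == AF_INET6) {
		const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)a;
		const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *)b;
		return memcmp(&a6->sin6_addr, &b6->sin6_addr,
			      sizeof(a6->sin6_addr)) == 0;
	}
	return 0;
}

/* Discard a candidate whose address is already in the switch. */
int xprt_addr_in_use(const struct sockaddr_storage *xprts, size_t n,
		     const struct sockaddr_storage *candidate)
{
	for (size_t i = 0; i < n; i++)
		if (sockaddr_equal(&xprts[i], candidate))
			return 1;	/* already present: don't re-add */
	return 0;
}

/* Test helper: build an IPv4 sockaddr_storage from a host-order address. */
struct sockaddr_storage mk_ipv4(uint32_t addr_host_order)
{
	struct sockaddr_storage ss;
	struct sockaddr_in *sin = (struct sockaddr_in *)&ss;

	memset(&ss, 0, sizeof(ss));
	sin->sin_family = AF_INET;
	sin->sin_addr.s_addr = htonl(addr_host_order);
	return ss;
}
```

This is why repeated mounts to the same IP collapse onto one connection even with the patch applied: the second add is rejected by the address comparison, not by the trunking test itself.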
On Tue, 2021-06-08 at 18:18 -0400, Olga Kornievskaia wrote:
> On Tue, Jun 8, 2021 at 6:13 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> > [...]
> >
> > That still does not make the patch acceptable. There are already
> > server vendors seeing problems with supporting nconnect for various
> > reasons.
>
> Why are there servers presenting multiple IP addresses and returning
> the same server identity if they can not support trunking? That seems
> to be going against the spec?

That's not a question I can answer, but I'm not thinking of our server
or the Linux server.

> > There needs to be a way to ensure that we can keep the current
> > client behaviour without requiring changes to server DNS entries,
> > etc.
>
> Ok, how about we introduce a mount option, "enable_trunking", and if
> that's present we would not collapse transports?

I'd suggest making it 'max_connect=X' so that we can actually set a
limit like we do for nconnect. That limit probably needs to be lower
bounded by the value of nconnect.

> [...]
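The suggested 'max_connect=X' semantics reduce to a simple clamp: the effective transport limit is max_connect, but never less than nconnect, so explicitly requested nconnect connections are not cut off. As a sketch (the function name is invented; this is the arithmetic, not the eventual kernel implementation):

```c
/*
 * Hypothetical sketch of the limit computation for a max_connect=X
 * mount option, lower bounded by nconnect as suggested above.
 */
unsigned int effective_xprt_limit(unsigned int nconnect,
				  unsigned int max_connect)
{
	if (max_connect < nconnect)
		return nconnect;	/* honour nconnect as the floor */
	return max_connect;
}
```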
On Tue, 2021-06-08 at 18:21 -0400, Olga Kornievskaia wrote:
> On Tue, Jun 8, 2021 at 5:41 PM Trond Myklebust <trondmy@hammerspace.com> wrote:
> > [...]
> >
> > Ah... Never mind. The comparison is made by
> > rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if
> > it is already present.
> >
> > However why are you running trunking detection a second time on the
> > address you've just run trunking detection on and have decided to
> > add?
>
> Because that's where we determined that these are different clients
> and are trunkable? But I guess also for the ease of re-using existing
> code. There is no hook to create an xprt and add it to an existing RPC
> client. One can be created.

Why not rpc_clnt_add_xprt()? That's what pNFS uses for this case.
On Tue, Jun 8, 2021 at 6:28 PM Trond Myklebust <trondmy@hammerspace.com> wrote: > > On Tue, 2021-06-08 at 18:18 -0400, Olga Kornievskaia wrote: > > On Tue, Jun 8, 2021 at 6:13 PM Trond Myklebust < > > trondmy@hammerspace.com> wrote: > > > > > > On Tue, 2021-06-08 at 17:40 -0400, Olga Kornievskaia wrote: > > > > On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III < > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia < > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III < > > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia < > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > From: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > > > > > > > After trunking is discovered in > > > > > > > > nfs4_discover_server_trunking(), > > > > > > > > add the transport to the old client structure before > > > > > > > > destroying > > > > > > > > the new client structure (along with its transport). 
> > > > > > > > > > > > > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > --- > > > > > > > > fs/nfs/nfs4client.c | 40 > > > > > > > > ++++++++++++++++++++++++++++++++++++++++ > > > > > > > > 1 file changed, 40 insertions(+) > > > > > > > > > > > > > > > > diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c > > > > > > > > index 42719384e25f..984c851844d8 100644 > > > > > > > > --- a/fs/nfs/nfs4client.c > > > > > > > > +++ b/fs/nfs/nfs4client.c > > > > > > > > @@ -361,6 +361,44 @@ static int > > > > > > > > nfs4_init_client_minor_version(struct nfs_client *clp) > > > > > > > > return nfs4_init_callback(clp); > > > > > > > > } > > > > > > > > > > > > > > > > +static void nfs4_add_trunk(struct nfs_client *clp, > > > > > > > > struct > > > > > > > > nfs_client *old) > > > > > > > > +{ > > > > > > > > + struct sockaddr_storage clp_addr, old_addr; > > > > > > > > + struct sockaddr *clp_sap = (struct sockaddr > > > > > > > > *)&clp_addr; > > > > > > > > + struct sockaddr *old_sap = (struct sockaddr > > > > > > > > *)&old_addr; > > > > > > > > + size_t clp_salen, old_salen; > > > > > > > > + struct xprt_create xprt_args = { > > > > > > > > + .ident = old->cl_proto, > > > > > > > > + .net = old->cl_net, > > > > > > > > + .servername = old->cl_hostname, > > > > > > > > + }; > > > > > > > > + struct nfs4_add_xprt_data xprtdata = { > > > > > > > > + .clp = old, > > > > > > > > + }; > > > > > > > > + struct rpc_add_xprt_test rpcdata = { > > > > > > > > + .add_xprt_test = old->cl_mvops- > > > > > > > > >session_trunk, > > > > > > > > + .data = &xprtdata, > > > > > > > > + }; > > > > > > > > + > > > > > > > > + if (clp->cl_proto != old->cl_proto) > > > > > > > > + return; > > > > > > > > + clp_salen = rpc_peeraddr(clp->cl_rpcclient, > > > > > > > > clp_sap, > > > > > > > > sizeof(clp_addr)); > > > > > > > > + old_salen = rpc_peeraddr(old->cl_rpcclient, > > > > > > > > old_sap, > > > > > > > > sizeof(old_addr)); > > > > > > > > + > > > > > > > > + if 
(clp_addr.ss_family != old_addr.ss_family) > > > > > > > > + return; > > > > > > > > + > > > > > > > > + xprt_args.dstaddr = clp_sap; > > > > > > > > + xprt_args.addrlen = clp_salen; > > > > > > > > + > > > > > > > > + xprtdata.cred = nfs4_get_clid_cred(old); > > > > > > > > + rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args, > > > > > > > > + rpc_clnt_setup_test_and_add_xprt, > > > > > > > > &rpcdata); > > > > > > > > > > > > > > Is there an upper bound on the number of transports that > > > > > > > are added to the NFS client's switch? > > > > > > > > > > > > I don't believe any limits exist right now. Why should there > > > > > > be a > > > > > > limit? Are you saying that the client should limit trunking? > > > > > > While > > > > > > this is not what's happening here, but say FS_LOCATION > > > > > > returned > > > > > > 100 > > > > > > ips for the server. Are you saying the client should be > > > > > > limiting > > > > > > how > > > > > > many trunkable connections it would be establishing and > > > > > > picking > > > > > > just a > > > > > > few addresses to try? What's happening with this patch is > > > > > > that > > > > > > say > > > > > > there are 100mounts to 100 ips (each representing the same > > > > > > server > > > > > > or > > > > > > trunkable server(s)), without this patch a single connection > > > > > > is > > > > > > kept, > > > > > > with this patch we'll have 100 connections. > > > > > > > > > > The patch description needs to make this behavior more clear. > > > > > It > > > > > needs to explain "why" -- the body of the patch already > > > > > explains > > > > > "what". Can you include your last sentence in the description > > > > > as > > > > > a use case? > > > > > > > > I sure can. > > > > > > > > > As for the behavior... 
Seems to me that there are going to be > > > > > only > > > > > infinitesimal gains after the first few connections, and after > > > > > that, it's going to be a lot for both sides to manage for no > > > > > real > > > > > gain. I think you do want to cap the total number of > > > > > connections > > > > > at a reasonable number, even in the FS_LOCATIONS case. > > > > > > > > Do you want to cap it at 16 like we do for nconnect? I'm ok with > > > > that > > > > for now. > > > > > > > > I don't understand why a setup where there X NICs on each side > > > > (client/server) and X mounts are done, there won't be throughput > > > > of > > > > close to X * throughput of a single NIC. > > > > > > That still does not make the patch acceptable. There are already > > > server > > > vendors seeing problems with supporting nconnect for various > > > reasons. > > > > Why are there servers presenting multiple IP addresses and returning > > the same server identity if they can not support trunking? That seems > > to be going against the spec? > > That's not a question I can answer, but I'm not thinking of our server > or the Linux server. > > > > > > There needs to be a way to ensure that we can keep the current > > > client > > > behaviour without requiring changes to server DNS entries, etc. > > > > Ok, how about we introduce a mount option, "enable_trunking", and if > > that's present we would not collapse transports? > > I'd suggest making it 'max_connect=X' so that we can actually set a > limit like we do for nconnect. That limit probably needs to be lower > bounded by the value of nconnect. Sure I can do that. But wouldn't that be confusing since we already have nconnect option? How about "max_trunking=X"? The lack of the setting or max_trunking=1, would result in all transports collapsed, other values would result in all additions of trunkable transports. 
> > > > > > > > > > > > > > > + > > > > > > > > + if (xprtdata.cred) > > > > > > > > + put_cred(xprtdata.cred); > > > > > > > > +} > > > > > > > > + > > > > > > > > /** > > > > > > > > * nfs4_init_client - Initialise an NFS4 client record > > > > > > > > * > > > > > > > > @@ -434,6 +472,8 @@ struct nfs_client > > > > > > > > *nfs4_init_client(struct nfs_client *clp, > > > > > > > > * won't try to use it. > > > > > > > > */ > > > > > > > > nfs_mark_client_ready(clp, -EPERM); > > > > > > > > + if (old->cl_mvops->session_trunk) > > > > > > > > + nfs4_add_trunk(clp, old); > > > > > > > > } > > > > > > > > clear_bit(NFS_CS_TSM_POSSIBLE, &clp->cl_flags); > > > > > > > > nfs_put_client(clp); > > > > > > > > -- > > > > > > > > 2.27.0 > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Chuck Lever > > > > > > > > > > -- > > > > > Chuck Lever > > > > > > > > > > > > > > > > > > > > > -- > > > Trond Myklebust > > > Linux NFS client maintainer, Hammerspace > > > trond.myklebust@hammerspace.com > > > > > > > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com > >
On Tue, Jun 8, 2021 at 6:31 PM Trond Myklebust <trondmy@hammerspace.com> wrote: > > On Tue, 2021-06-08 at 18:21 -0400, Olga Kornievskaia wrote: > > On Tue, Jun 8, 2021 at 5:41 PM Trond Myklebust < > > trondmy@hammerspace.com> wrote: > > > > > > On Tue, 2021-06-08 at 17:30 -0400, Olga Kornievskaia wrote: > > > > On Tue, Jun 8, 2021 at 5:26 PM Trond Myklebust < > > > > trondmy@hammerspace.com> wrote: > > > > > > > > > > On Tue, 2021-06-08 at 21:19 +0000, Trond Myklebust wrote: > > > > > > On Tue, 2021-06-08 at 21:06 +0000, Chuck Lever III wrote: > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia < > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III < > > > > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia < > > > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > From: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > > > > > > > > > > > After trunking is discovered in > > > > > > > > > > nfs4_discover_server_trunking(), > > > > > > > > > > add the transport to the old client structure before > > > > > > > > > > destroying > > > > > > > > > > the new client structure (along with its transport). 
> > > > > > > > > > > > > > > > > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > --- > > > > > > > > > > fs/nfs/nfs4client.c | 40 > > > > > > > > > > ++++++++++++++++++++++++++++++++++++++++ > > > > > > > > > > 1 file changed, 40 insertions(+) > > > > > > > > > > > > > > > > > > > > diff --git a/fs/nfs/nfs4client.c > > > > > > > > > > b/fs/nfs/nfs4client.c > > > > > > > > > > index 42719384e25f..984c851844d8 100644 > > > > > > > > > > --- a/fs/nfs/nfs4client.c > > > > > > > > > > +++ b/fs/nfs/nfs4client.c > > > > > > > > > > @@ -361,6 +361,44 @@ static int > > > > > > > > > > nfs4_init_client_minor_version(struct nfs_client > > > > > > > > > > *clp) > > > > > > > > > > return nfs4_init_callback(clp); > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > +static void nfs4_add_trunk(struct nfs_client *clp, > > > > > > > > > > struct > > > > > > > > > > nfs_client *old) > > > > > > > > > > +{ > > > > > > > > > > + struct sockaddr_storage clp_addr, old_addr; > > > > > > > > > > + struct sockaddr *clp_sap = (struct sockaddr > > > > > > > > > > *)&clp_addr; > > > > > > > > > > + struct sockaddr *old_sap = (struct sockaddr > > > > > > > > > > *)&old_addr; > > > > > > > > > > + size_t clp_salen, old_salen; > > > > > > > > > > + struct xprt_create xprt_args = { > > > > > > > > > > + .ident = old->cl_proto, > > > > > > > > > > + .net = old->cl_net, > > > > > > > > > > + .servername = old->cl_hostname, > > > > > > > > > > + }; > > > > > > > > > > + struct nfs4_add_xprt_data xprtdata = { > > > > > > > > > > + .clp = old, > > > > > > > > > > + }; > > > > > > > > > > + struct rpc_add_xprt_test rpcdata = { > > > > > > > > > > + .add_xprt_test = old->cl_mvops- > > > > > > > > > > > session_trunk, > > > > > > > > > > + .data = &xprtdata, > > > > > > > > > > + }; > > > > > > > > > > + > > > > > > > > > > + if (clp->cl_proto != old->cl_proto) > > > > > > > > > > + return; > > > > > > > > > > + clp_salen = rpc_peeraddr(clp->cl_rpcclient, > > > > 
> > > > > > clp_sap, > > > > > > > > > > sizeof(clp_addr)); > > > > > > > > > > + old_salen = rpc_peeraddr(old->cl_rpcclient, > > > > > > > > > > old_sap, > > > > > > > > > > sizeof(old_addr)); > > > > > > > > > > + > > > > > > > > > > + if (clp_addr.ss_family != old_addr.ss_family) > > > > > > > > > > + return; > > > > > > > > > > + > > > > > > > > > > + xprt_args.dstaddr = clp_sap; > > > > > > > > > > + xprt_args.addrlen = clp_salen; > > > > > > > > > > + > > > > > > > > > > + xprtdata.cred = nfs4_get_clid_cred(old); > > > > > > > > > > + rpc_clnt_add_xprt(old->cl_rpcclient, > > > > > > > > > > &xprt_args, > > > > > > > > > > + > > > > > > > > > > rpc_clnt_setup_test_and_add_xprt, > > > > > > > > > > &rpcdata); > > > > > > > > > > > > > > > > > > Is there an upper bound on the number of transports > > > > > > > > > that > > > > > > > > > are added to the NFS client's switch? > > > > > > > > > > > > > > > > I don't believe any limits exist right now. Why should > > > > > > > > there > > > > > > > > be a > > > > > > > > limit? Are you saying that the client should limit > > > > > > > > trunking? > > > > > > > > While > > > > > > > > this is not what's happening here, but say FS_LOCATION > > > > > > > > returned > > > > > > > > 100 > > > > > > > > ips for the server. Are you saying the client should be > > > > > > > > limiting > > > > > > > > how > > > > > > > > many trunkable connections it would be establishing and > > > > > > > > picking > > > > > > > > just a > > > > > > > > few addresses to try? What's happening with this patch is > > > > > > > > that > > > > > > > > say > > > > > > > > there are 100mounts to 100 ips (each representing the > > > > > > > > same > > > > > > > > server > > > > > > > > or > > > > > > > > trunkable server(s)), without this patch a single > > > > > > > > connection > > > > > > > > is > > > > > > > > kept, > > > > > > > > with this patch we'll have 100 connections. 
> > > > > > > > > > > > > > The patch description needs to make this behavior more > > > > > > > clear. > > > > > > > It > > > > > > > needs to explain "why" -- the body of the patch already > > > > > > > explains > > > > > > > "what". Can you include your last sentence in the > > > > > > > description > > > > > > > as > > > > > > > a use case? > > > > > > > > > > > > > > As for the behavior... Seems to me that there are going to > > > > > > > be > > > > > > > only > > > > > > > infinitesimal gains after the first few connections, and > > > > > > > after > > > > > > > that, it's going to be a lot for both sides to manage for > > > > > > > no > > > > > > > real > > > > > > > gain. I think you do want to cap the total number of > > > > > > > connections > > > > > > > at a reasonable number, even in the FS_LOCATIONS case. > > > > > > > > > > > > > > > > > > > I'd tend to agree. If you want to scale I/O then pNFS is the > > > > > > way > > > > > > to > > > > > > go, > > > > > > not vast numbers of connections to a single server. > > > > > > > > > > > BTW: AFAICS this patch will end up adding another connection > > > > > every > > > > > time > > > > > we mount a new filesystem, whether or not a connection already > > > > > exists > > > > > to that IP address. That's unacceptable. > > > > > > > > Not in my testing. Mounts to the same server (same IP) different > > > > export volumes lead to use of a single connection. > > > > > > > > > > > Ah... Never mind. The comparison is made by > > > rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if > > > it > > > is already present. > > > > > > However why are you running trunking detection a second time on the > > > address you've just run trunking detection on and have decided to > > > add? > > > > Because that's where we determined that these are different clients > > and are trunkable? But I guess also for the ease of re-using existing > > code. 
There is no hook to create an xprt and add it to an existing > > RPC > > client. One can be created. > > Why not rpc_clnt_add_xprt()? That's what pNFS uses for this case. That's what this code is using. > > > > -- > Trond Myklebust > Linux NFS client maintainer, Hammerspace > trond.myklebust@hammerspace.com > >
Hi Olga, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on nfs/linux-next] [also build test WARNING on v5.13-rc5 next-20210608] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Olga-Kornievskaia/NFSv4-1-add-trunking-when-server-trunking-detected/20210609-025208 base: git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next config: arm-defconfig (attached as .config) compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/52d86dfa29146024beeca51af401aabbb329a942 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Olga-Kornievskaia/NFSv4-1-add-trunking-when-server-trunking-detected/20210609-025208 git checkout 52d86dfa29146024beeca51af401aabbb329a942 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): fs/nfs/nfs4client.c: In function 'nfs4_add_trunk': >> fs/nfs/nfs4client.c:369:20: warning: variable 'old_salen' set but not used [-Wunused-but-set-variable] 369 | size_t clp_salen, old_salen; | ^~~~~~~~~ vim +/old_salen +369 fs/nfs/nfs4client.c 363 364 static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old) 365 { 366 struct sockaddr_storage clp_addr, old_addr; 367 struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr; 368 struct sockaddr *old_sap = (struct sockaddr *)&old_addr; > 369 size_t clp_salen, old_salen; 370 struct xprt_create xprt_args = { 371 .ident = old->cl_proto, 372 .net = 
old->cl_net, 373 .servername = old->cl_hostname, 374 }; 375 struct nfs4_add_xprt_data xprtdata = { 376 .clp = old, 377 }; 378 struct rpc_add_xprt_test rpcdata = { 379 .add_xprt_test = old->cl_mvops->session_trunk, 380 .data = &xprtdata, 381 }; 382 383 if (clp->cl_proto != old->cl_proto) 384 return; 385 clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr)); 386 old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr)); 387 388 if (clp_addr.ss_family != old_addr.ss_family) 389 return; 390 391 xprt_args.dstaddr = clp_sap; 392 xprt_args.addrlen = clp_salen; 393 394 xprtdata.cred = nfs4_get_clid_cred(old); 395 rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args, 396 rpc_clnt_setup_test_and_add_xprt, &rpcdata); 397 398 if (xprtdata.cred) 399 put_cred(xprtdata.cred); 400 } 401 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, 2021-06-08 at 18:38 -0400, Olga Kornievskaia wrote: > On Tue, Jun 8, 2021 at 6:28 PM Trond Myklebust < > trondmy@hammerspace.com> wrote: > > > > On Tue, 2021-06-08 at 18:18 -0400, Olga Kornievskaia wrote: > > > On Tue, Jun 8, 2021 at 6:13 PM Trond Myklebust < > > > trondmy@hammerspace.com> wrote: > > > > > > > > On Tue, 2021-06-08 at 17:40 -0400, Olga Kornievskaia wrote: > > > > > On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III < > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia < > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III < > > > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia < > > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > > > From: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > > > > > > > > > After trunking is discovered in > > > > > > > > > nfs4_discover_server_trunking(), > > > > > > > > > add the transport to the old client structure before > > > > > > > > > destroying > > > > > > > > > the new client structure (along with its transport). 
> > > > > > > > > > > > > > > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > --- > > > > > > > > > fs/nfs/nfs4client.c | 40 > > > > > > > > > ++++++++++++++++++++++++++++++++++++++++ > > > > > > > > > 1 file changed, 40 insertions(+) > > > > > > > > > > > > > > > > > > diff --git a/fs/nfs/nfs4client.c > > > > > > > > > b/fs/nfs/nfs4client.c > > > > > > > > > index 42719384e25f..984c851844d8 100644 > > > > > > > > > --- a/fs/nfs/nfs4client.c > > > > > > > > > +++ b/fs/nfs/nfs4client.c > > > > > > > > > @@ -361,6 +361,44 @@ static int > > > > > > > > > nfs4_init_client_minor_version(struct nfs_client > > > > > > > > > *clp) > > > > > > > > > return nfs4_init_callback(clp); > > > > > > > > > } > > > > > > > > > > > > > > > > > > +static void nfs4_add_trunk(struct nfs_client *clp, > > > > > > > > > struct > > > > > > > > > nfs_client *old) > > > > > > > > > +{ > > > > > > > > > + struct sockaddr_storage clp_addr, old_addr; > > > > > > > > > + struct sockaddr *clp_sap = (struct sockaddr > > > > > > > > > *)&clp_addr; > > > > > > > > > + struct sockaddr *old_sap = (struct sockaddr > > > > > > > > > *)&old_addr; > > > > > > > > > + size_t clp_salen, old_salen; > > > > > > > > > + struct xprt_create xprt_args = { > > > > > > > > > + .ident = old->cl_proto, > > > > > > > > > + .net = old->cl_net, > > > > > > > > > + .servername = old->cl_hostname, > > > > > > > > > + }; > > > > > > > > > + struct nfs4_add_xprt_data xprtdata = { > > > > > > > > > + .clp = old, > > > > > > > > > + }; > > > > > > > > > + struct rpc_add_xprt_test rpcdata = { > > > > > > > > > + .add_xprt_test = old->cl_mvops- > > > > > > > > > > session_trunk, > > > > > > > > > + .data = &xprtdata, > > > > > > > > > + }; > > > > > > > > > + > > > > > > > > > + if (clp->cl_proto != old->cl_proto) > > > > > > > > > + return; > > > > > > > > > + clp_salen = rpc_peeraddr(clp->cl_rpcclient, > > > > > > > > > clp_sap, > > > > > > > > > sizeof(clp_addr)); > > > > > > > > > + old_salen = 
rpc_peeraddr(old->cl_rpcclient, > > > > > > > > > old_sap, > > > > > > > > > sizeof(old_addr)); > > > > > > > > > + > > > > > > > > > + if (clp_addr.ss_family != old_addr.ss_family) > > > > > > > > > + return; > > > > > > > > > + > > > > > > > > > + xprt_args.dstaddr = clp_sap; > > > > > > > > > + xprt_args.addrlen = clp_salen; > > > > > > > > > + > > > > > > > > > + xprtdata.cred = nfs4_get_clid_cred(old); > > > > > > > > > + rpc_clnt_add_xprt(old->cl_rpcclient, > > > > > > > > > &xprt_args, > > > > > > > > > + > > > > > > > > > rpc_clnt_setup_test_and_add_xprt, > > > > > > > > > &rpcdata); > > > > > > > > > > > > > > > > Is there an upper bound on the number of transports > > > > > > > > that > > > > > > > > are added to the NFS client's switch? > > > > > > > > > > > > > > I don't believe any limits exist right now. Why should > > > > > > > there > > > > > > > be a > > > > > > > limit? Are you saying that the client should limit > > > > > > > trunking? > > > > > > > While > > > > > > > this is not what's happening here, but say FS_LOCATION > > > > > > > returned > > > > > > > 100 > > > > > > > ips for the server. Are you saying the client should be > > > > > > > limiting > > > > > > > how > > > > > > > many trunkable connections it would be establishing and > > > > > > > picking > > > > > > > just a > > > > > > > few addresses to try? What's happening with this patch is > > > > > > > that > > > > > > > say > > > > > > > there are 100mounts to 100 ips (each representing the > > > > > > > same > > > > > > > server > > > > > > > or > > > > > > > trunkable server(s)), without this patch a single > > > > > > > connection > > > > > > > is > > > > > > > kept, > > > > > > > with this patch we'll have 100 connections. > > > > > > > > > > > > The patch description needs to make this behavior more > > > > > > clear. > > > > > > It > > > > > > needs to explain "why" -- the body of the patch already > > > > > > explains > > > > > > "what". 
Can you include your last sentence in the > > > > > > description > > > > > > as > > > > > > a use case? > > > > > > > > > > I sure can. > > > > > > > > > > > As for the behavior... Seems to me that there are going to > > > > > > be > > > > > > only > > > > > > infinitesimal gains after the first few connections, and > > > > > > after > > > > > > that, it's going to be a lot for both sides to manage for > > > > > > no > > > > > > real > > > > > > gain. I think you do want to cap the total number of > > > > > > connections > > > > > > at a reasonable number, even in the FS_LOCATIONS case. > > > > > > > > > > Do you want to cap it at 16 like we do for nconnect? I'm ok > > > > > with > > > > > that > > > > > for now. > > > > > > > > > > I don't understand why a setup where there X NICs on each > > > > > side > > > > > (client/server) and X mounts are done, there won't be > > > > > throughput > > > > > of > > > > > close to X * throughput of a single NIC. > > > > > > > > That still does not make the patch acceptable. There are > > > > already > > > > server > > > > vendors seeing problems with supporting nconnect for various > > > > reasons. > > > > > > Why are there servers presenting multiple IP addresses and > > > returning > > > the same server identity if they can not support trunking? That > > > seems > > > to be going against the spec? > > > > That's not a question I can answer, but I'm not thinking of our > > server > > or the Linux server. > > > > > > > > > There needs to be a way to ensure that we can keep the current > > > > client > > > > behaviour without requiring changes to server DNS entries, etc. > > > > > > Ok, how about we introduce a mount option, "enable_trunking", and > > > if > > > that's present we would not collapse transports? > > > > I'd suggest making it 'max_connect=X' so that we can actually set a > > limit like we do for nconnect. That limit probably needs to be > > lower > > bounded by the value of nconnect. > > Sure I can do that. 
But wouldn't that be confusing since we already > have nconnect option? How about "max_trunking=X"? The lack of the > setting or max_trunking=1, would result in all transports collapsed, > other values would result in all additions of trunkable transports. > > I'm not comfortable using the word 'trunking' in the mount API. The NFS community uses 'trunking' to mean something very different from what the networking community does. You're only talking about session trunking here, and not clientid trunking, so I suggest we stick with a terminology that reminds us this is connection related.
On Tue, 2021-06-08 at 18:38 -0400, Olga Kornievskaia wrote: > On Tue, Jun 8, 2021 at 6:31 PM Trond Myklebust < > trondmy@hammerspace.com> wrote: > > > > On Tue, 2021-06-08 at 18:21 -0400, Olga Kornievskaia wrote: > > > On Tue, Jun 8, 2021 at 5:41 PM Trond Myklebust < > > > trondmy@hammerspace.com> wrote: > > > > > > > > On Tue, 2021-06-08 at 17:30 -0400, Olga Kornievskaia wrote: > > > > > On Tue, Jun 8, 2021 at 5:26 PM Trond Myklebust < > > > > > trondmy@hammerspace.com> wrote: > > > > > > > > > > > > On Tue, 2021-06-08 at 21:19 +0000, Trond Myklebust wrote: > > > > > > > On Tue, 2021-06-08 at 21:06 +0000, Chuck Lever III wrote: > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia < > > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > > > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III < > > > > > > > > > chuck.lever@oracle.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia < > > > > > > > > > > > olga.kornievskaia@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > From: Olga Kornievskaia <kolga@netapp.com> > > > > > > > > > > > > > > > > > > > > > > After trunking is discovered in > > > > > > > > > > > nfs4_discover_server_trunking(), > > > > > > > > > > > add the transport to the old client structure > > > > > > > > > > > before > > > > > > > > > > > destroying > > > > > > > > > > > the new client structure (along with its > > > > > > > > > > > transport). 
> > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Olga Kornievskaia < > > > > > > > > > > > kolga@netapp.com> > > > > > > > > > > > --- > > > > > > > > > > > fs/nfs/nfs4client.c | 40 > > > > > > > > > > > ++++++++++++++++++++++++++++++++++++++++ > > > > > > > > > > > 1 file changed, 40 insertions(+) > > > > > > > > > > > > > > > > > > > > > > diff --git a/fs/nfs/nfs4client.c > > > > > > > > > > > b/fs/nfs/nfs4client.c > > > > > > > > > > > index 42719384e25f..984c851844d8 100644 > > > > > > > > > > > --- a/fs/nfs/nfs4client.c > > > > > > > > > > > +++ b/fs/nfs/nfs4client.c > > > > > > > > > > > @@ -361,6 +361,44 @@ static int > > > > > > > > > > > nfs4_init_client_minor_version(struct nfs_client > > > > > > > > > > > *clp) > > > > > > > > > > > return nfs4_init_callback(clp); > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > +static void nfs4_add_trunk(struct nfs_client > > > > > > > > > > > *clp, > > > > > > > > > > > struct > > > > > > > > > > > nfs_client *old) > > > > > > > > > > > +{ > > > > > > > > > > > + struct sockaddr_storage clp_addr, old_addr; > > > > > > > > > > > + struct sockaddr *clp_sap = (struct sockaddr > > > > > > > > > > > *)&clp_addr; > > > > > > > > > > > + struct sockaddr *old_sap = (struct sockaddr > > > > > > > > > > > *)&old_addr; > > > > > > > > > > > + size_t clp_salen, old_salen; > > > > > > > > > > > + struct xprt_create xprt_args = { > > > > > > > > > > > + .ident = old->cl_proto, > > > > > > > > > > > + .net = old->cl_net, > > > > > > > > > > > + .servername = old->cl_hostname, > > > > > > > > > > > + }; > > > > > > > > > > > + struct nfs4_add_xprt_data xprtdata = { > > > > > > > > > > > + .clp = old, > > > > > > > > > > > + }; > > > > > > > > > > > + struct rpc_add_xprt_test rpcdata = { > > > > > > > > > > > + .add_xprt_test = old->cl_mvops- > > > > > > > > > > > > session_trunk, > > > > > > > > > > > + .data = &xprtdata, > > > > > > > > > > > + }; > > > > > > > > > > > + > > > > > > > > > > > + if 
> > > > > > > > > > > +	if (clp->cl_proto != old->cl_proto)
> > > > > > > > > > > +		return;
> > > > > > > > > > > +	clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
> > > > > > > > > > > +	old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
> > > > > > > > > > > +
> > > > > > > > > > > +	if (clp_addr.ss_family != old_addr.ss_family)
> > > > > > > > > > > +		return;
> > > > > > > > > > > +
> > > > > > > > > > > +	xprt_args.dstaddr = clp_sap;
> > > > > > > > > > > +	xprt_args.addrlen = clp_salen;
> > > > > > > > > > > +
> > > > > > > > > > > +	xprtdata.cred = nfs4_get_clid_cred(old);
> > > > > > > > > > > +	rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
> > > > > > > > > > > +			  rpc_clnt_setup_test_and_add_xprt, &rpcdata);
> > > > > > > > > > 
> > > > > > > > > > Is there an upper bound on the number of transports that
> > > > > > > > > > are added to the NFS client's switch?
> > > > > > > > > 
> > > > > > > > > I don't believe any limits exist right now. Why should there be a
> > > > > > > > > limit? Are you saying that the client should limit trunking? While
> > > > > > > > > this is not what's happening here, say FS_LOCATION returned 100
> > > > > > > > > ips for the server. Are you saying the client should be limiting how
> > > > > > > > > many trunkable connections it would be establishing and picking just a
> > > > > > > > > few addresses to try? What's happening with this patch is that, say
> > > > > > > > > there are 100 mounts to 100 ips (each representing the same server or
> > > > > > > > > trunkable server(s)); without this patch a single connection is kept,
> > > > > > > > > with this patch we'll have 100 connections.
> > > > > > > > 
> > > > > > > > The patch description needs to make this behavior more clear. It
> > > > > > > > needs to explain "why" -- the body of the patch already explains
> > > > > > > > "what". Can you include your last sentence in the description as
> > > > > > > > a use case?
> > > > > > > > 
> > > > > > > > As for the behavior... Seems to me that there are going to be only
> > > > > > > > infinitesimal gains after the first few connections, and after
> > > > > > > > that, it's going to be a lot for both sides to manage for no real
> > > > > > > > gain. I think you do want to cap the total number of connections
> > > > > > > > at a reasonable number, even in the FS_LOCATIONS case.
> > > > > > > 
> > > > > > > I'd tend to agree. If you want to scale I/O then pNFS is the way to
> > > > > > > go, not vast numbers of connections to a single server.
> > > > > > 
> > > > > > BTW: AFAICS this patch will end up adding another connection every
> > > > > > time we mount a new filesystem, whether or not a connection already
> > > > > > exists to that IP address. That's unacceptable.
> > > > > 
> > > > > Not in my testing. Mounts to the same server (same IP) but different
> > > > > export volumes lead to use of a single connection.
> > > > 
> > > > Ah... Never mind. The comparison is made by
> > > > rpc_clnt_setup_test_and_add_xprt(), and the address is discarded if it
> > > > is already present.
> > > > 
> > > > However, why are you running trunking detection a second time on the
> > > > address you've just run trunking detection on and have decided to add?
> > > 
> > > Because that's where we determined that these are different clients
> > > and are trunkable? But I guess also for the ease of re-using existing
> > > code. There is no hook to create an xprt and add it to an existing RPC
> > > client. One could be created.
> > 
> > Why not rpc_clnt_add_xprt()? That's what pNFS uses for this case.
> 
> That's what this code is using.

Sorry. I should have written: 'rpc_clnt_add_xprt() with
rpc_clnt_setup_test_and_add_xprt()', which is what pNFS uses when
connecting to NFSv3 servers. i.e.

	rpc_clnt_add_xprt(clp->cl_rpcclient, &xprt_args,
			  rpc_clnt_test_and_add_xprt, NULL);
On Tue, 2021-06-08 at 23:02 +0000, Trond Myklebust wrote:
> Sorry. I should have written: 'rpc_clnt_add_xprt() with
> rpc_clnt_setup_test_and_add_xprt()', which is what pNFS uses when

Growl... My keyboard is hexed today. I mean 'rpc_clnt_add_xprt() with
rpc_clnt_test_and_add_xprt'.

> connecting to NFSv3 servers.
> i.e.
> 
> 	rpc_clnt_add_xprt(clp->cl_rpcclient, &xprt_args,
> 			  rpc_clnt_test_and_add_xprt, NULL);
> 

The above line is correct...
Hi Olga,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nfs/linux-next]
[also build test WARNING on v5.13-rc5 next-20210608]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Olga-Kornievskaia/NFSv4-1-add-trunking-when-server-trunking-detected/20210609-025208
base:   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
config: x86_64-randconfig-a015-20210608 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project d32cc150feb72f315a5bbd34f92e7beca21a50da)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/52d86dfa29146024beeca51af401aabbb329a942
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Olga-Kornievskaia/NFSv4-1-add-trunking-when-server-trunking-detected/20210609-025208
        git checkout 52d86dfa29146024beeca51af401aabbb329a942
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> fs/nfs/nfs4client.c:369:20: warning: variable 'old_salen' set but not used [-Wunused-but-set-variable]
           size_t clp_salen, old_salen;
                             ^
   1 warning generated.
vim +/old_salen +369 fs/nfs/nfs4client.c

   363	
   364	static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old)
   365	{
   366		struct sockaddr_storage clp_addr, old_addr;
   367		struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
   368		struct sockaddr *old_sap = (struct sockaddr *)&old_addr;
 > 369		size_t clp_salen, old_salen;
   370		struct xprt_create xprt_args = {
   371			.ident = old->cl_proto,
   372			.net = old->cl_net,
   373			.servername = old->cl_hostname,
   374		};
   375		struct nfs4_add_xprt_data xprtdata = {
   376			.clp = old,
   377		};
   378		struct rpc_add_xprt_test rpcdata = {
   379			.add_xprt_test = old->cl_mvops->session_trunk,
   380			.data = &xprtdata,
   381		};
   382	
   383		if (clp->cl_proto != old->cl_proto)
   384			return;
   385		clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
   386		old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
   387	
   388		if (clp_addr.ss_family != old_addr.ss_family)
   389			return;
   390	
   391		xprt_args.dstaddr = clp_sap;
   392		xprt_args.addrlen = clp_salen;
   393	
   394		xprtdata.cred = nfs4_get_clid_cred(old);
   395		rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
   396				  rpc_clnt_setup_test_and_add_xprt, &rpcdata);
   397	
   398		if (xprtdata.cred)
   399			put_cred(xprtdata.cred);
   400	}
   401	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Tue, Jun 8, 2021 at 5:06 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
>
> > On Jun 8, 2021, at 4:56 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
> >
> > On Tue, Jun 8, 2021 at 4:41 PM Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>
> >>> On Jun 8, 2021, at 2:45 PM, Olga Kornievskaia <olga.kornievskaia@gmail.com> wrote:
> >>>
> >>> From: Olga Kornievskaia <kolga@netapp.com>
> >>>
> >>> After trunking is discovered in nfs4_discover_server_trunking(),
> >>> add the transport to the old client structure before destroying
> >>> the new client structure (along with its transport).
> >>>
> >>> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> >>> ---
> >>> fs/nfs/nfs4client.c | 40 ++++++++++++++++++++++++++++++++++++++++
> >>> 1 file changed, 40 insertions(+)
> >>>
> >>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> >>> index 42719384e25f..984c851844d8 100644
> >>> --- a/fs/nfs/nfs4client.c
> >>> +++ b/fs/nfs/nfs4client.c
> >>> @@ -361,6 +361,44 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
> >>>     return nfs4_init_callback(clp);
> >>> }
> >>>
> >>> +static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old)
> >>> +{
> >>> +   struct sockaddr_storage clp_addr, old_addr;
> >>> +   struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
> >>> +   struct sockaddr *old_sap = (struct sockaddr *)&old_addr;
> >>> +   size_t clp_salen, old_salen;
> >>> +   struct xprt_create xprt_args = {
> >>> +           .ident = old->cl_proto,
> >>> +           .net = old->cl_net,
> >>> +           .servername = old->cl_hostname,
> >>> +   };
> >>> +   struct nfs4_add_xprt_data xprtdata = {
> >>> +           .clp = old,
> >>> +   };
> >>> +   struct rpc_add_xprt_test rpcdata = {
> >>> +           .add_xprt_test = old->cl_mvops->session_trunk,
> >>> +           .data = &xprtdata,
> >>> +   };
> >>> +
> >>> +   if (clp->cl_proto != old->cl_proto)
> >>> +           return;
> >>> +   clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
> >>> +   old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
> >>> +
> >>> +   if (clp_addr.ss_family != old_addr.ss_family)
> >>> +           return;
> >>> +
> >>> +   xprt_args.dstaddr = clp_sap;
> >>> +   xprt_args.addrlen = clp_salen;
> >>> +
> >>> +   xprtdata.cred = nfs4_get_clid_cred(old);
> >>> +   rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
> >>> +                     rpc_clnt_setup_test_and_add_xprt, &rpcdata);
> >>
> >> Is there an upper bound on the number of transports that
> >> are added to the NFS client's switch?
> >
> > I don't believe any limits exist right now. Why should there be a
> > limit? Are you saying that the client should limit trunking? While
> > this is not what's happening here, say FS_LOCATION returned 100
> > ips for the server. Are you saying the client should be limiting how
> > many trunkable connections it would be establishing and picking just a
> > few addresses to try? What's happening with this patch is that, say
> > there are 100 mounts to 100 ips (each representing the same server or
> > trunkable server(s)); without this patch a single connection is kept,
> > with this patch we'll have 100 connections.
>
> The patch description needs to make this behavior more clear. It
> needs to explain "why" -- the body of the patch already explains
> "what". Can you include your last sentence in the description as
> a use case?
>
> As for the behavior... Seems to me that there are going to be only
> infinitesimal gains after the first few connections, and after
> that, it's going to be a lot for both sides to manage for no real
> gain. I think you do want to cap the total number of connections
> at a reasonable number, even in the FS_LOCATIONS case.

To both Trond and Chuck: if we were to cap the max number of transports,
how do we manage nconnect transports and trunking transports? Are they
the "same" transports? For instance, if somebody specified
nconnect=10,max_connect=2, which one matters more? Should nconnect be
cut at 2? That would mean all the allowable transports are used up by
nconnect. But we've seen that there is benefit from creating multiple
connections for a single NIC, and in a multi-NIC environment that means
multiple connections on each NIC, wouldn't it? So should we instead
allow nconnect * max_connect transports? Or does the admin need to
calculate that value to make sure that max_connect reflects it?

> >>> +
> >>> +   if (xprtdata.cred)
> >>> +           put_cred(xprtdata.cred);
> >>> +}
> >>> +
> >>> /**
> >>> * nfs4_init_client - Initialise an NFS4 client record
> >>> *
> >>> @@ -434,6 +472,8 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
> >>> * won't try to use it.
> >>> */
> >>> nfs_mark_client_ready(clp, -EPERM);
> >>> +   if (old->cl_mvops->session_trunk)
> >>> +           nfs4_add_trunk(clp, old);
> >>> }
> >>> clear_bit(NFS_CS_TSM_POSSIBLE, &clp->cl_flags);
> >>> nfs_put_client(clp);
> >>> --
> >>> 2.27.0
> >>>
> >>
> >> --
> >> Chuck Lever
>
> --
> Chuck Lever
diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 42719384e25f..984c851844d8 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -361,6 +361,44 @@ static int nfs4_init_client_minor_version(struct nfs_client *clp)
 	return nfs4_init_callback(clp);
 }
 
+static void nfs4_add_trunk(struct nfs_client *clp, struct nfs_client *old)
+{
+	struct sockaddr_storage clp_addr, old_addr;
+	struct sockaddr *clp_sap = (struct sockaddr *)&clp_addr;
+	struct sockaddr *old_sap = (struct sockaddr *)&old_addr;
+	size_t clp_salen, old_salen;
+	struct xprt_create xprt_args = {
+		.ident = old->cl_proto,
+		.net = old->cl_net,
+		.servername = old->cl_hostname,
+	};
+	struct nfs4_add_xprt_data xprtdata = {
+		.clp = old,
+	};
+	struct rpc_add_xprt_test rpcdata = {
+		.add_xprt_test = old->cl_mvops->session_trunk,
+		.data = &xprtdata,
+	};
+
+	if (clp->cl_proto != old->cl_proto)
+		return;
+	clp_salen = rpc_peeraddr(clp->cl_rpcclient, clp_sap, sizeof(clp_addr));
+	old_salen = rpc_peeraddr(old->cl_rpcclient, old_sap, sizeof(old_addr));
+
+	if (clp_addr.ss_family != old_addr.ss_family)
+		return;
+
+	xprt_args.dstaddr = clp_sap;
+	xprt_args.addrlen = clp_salen;
+
+	xprtdata.cred = nfs4_get_clid_cred(old);
+	rpc_clnt_add_xprt(old->cl_rpcclient, &xprt_args,
+			  rpc_clnt_setup_test_and_add_xprt, &rpcdata);
+
+	if (xprtdata.cred)
+		put_cred(xprtdata.cred);
+}
+
 /**
  * nfs4_init_client - Initialise an NFS4 client record
  *
@@ -434,6 +472,8 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
 	 * won't try to use it.
 	 */
 	nfs_mark_client_ready(clp, -EPERM);
+	if (old->cl_mvops->session_trunk)
+		nfs4_add_trunk(clp, old);
 	}
 	clear_bit(NFS_CS_TSM_POSSIBLE, &clp->cl_flags);
 	nfs_put_client(clp);