diff mbox series

[RFC] NFSD: Force all NFSv4.2 COPY requests to be synchronous

Message ID 20240506210408.4760-1-cel@kernel.org (mailing list archive)
State New, archived
Headers show
Series [RFC] NFSD: Force all NFSv4.2 COPY requests to be synchronous | expand

Commit Message

Chuck Lever May 6, 2024, 9:04 p.m. UTC
From: Chuck Lever <chuck.lever@oracle.com>

We've discovered that delivering a CB_OFFLOAD operation can be
unreliable in some pretty unremarkable situations, and the Linux
NFS client does not yet support sending an OFFLOAD_STATUS
operation to probe whether an asynchronous COPY operation has
finished. On Linux NFS clients, COPY can hang until manually
interrupted.

I've tried a couple of remedies, but so far the side-effects are
worse than the disease. For now, force COPY operations to be
synchronous so that the use of CB_OFFLOAD is avoided entirely.

I have some patches that add an OFFLOAD_STATUS implementation to the
Linux NFS client, but that is not likely to fix older clients.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4proc.c | 7 +++++++
 1 file changed, 7 insertions(+)


base-commit: 939cb14d51a150e3c12ef7a8ce0ba04ce6131bd2

Comments

Rick Macklem May 6, 2024, 10:30 p.m. UTC | #1
On Mon, May 6, 2024 at 2:04 PM <cel@kernel.org> wrote:
>
> From: Chuck Lever <chuck.lever@oracle.com>
>
> We've discovered that delivering a CB_OFFLOAD operation can be
> unreliable in some pretty unremarkable situations, and the Linux
> NFS client does not yet support sending an OFFLOAD_STATUS
> operation to probe whether an asynchronous COPY operation has
> finished. On Linux NFS clients, COPY can hang until manually
> interrupted.
>
> I've tried a couple of remedies, but so far the side-effects are
> worse than the disease. For now, force COPY operations to be
> synchronous so that the use of CB_OFFLOAD is avoided entirely.
Just a yellow warning light from my experience with the FreeBSD server
(which always does synchronous Copy's).
A large synchronous Copy can take a long time, resulting in a slow RPC
response time. A user reported 25sec.
The 25sec case turned out to be a ZFS specific issue that could be avoided
via a ZFS tunable.

I do not have a good generic solution, but I do have a tunable that can
be used to clip the Copy len, which is a rather blunt and ugly workaround.
(The generic copy routine internal to FreeBSD can accept a flag that indicates
"return after 1sec with a partial copy done", but that is not implemented for
file systems like ZFS.)

Although there is no hard limit in the RFCs for RPC response times,
I've typically
assumed a max of 1-2sec is desirable.

rick

>
> I have some patches that add an OFFLOAD_STATUS implementation to the
> Linux NFS client, but that is not likely to fix older clients.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/nfs4proc.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index ea3cc3e870a7..12722c709cc6 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1807,6 +1807,13 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>         __be32 status;
>         struct nfsd4_copy *async_copy = NULL;
>
> +       /*
> +        * Currently, async COPY is not reliable. Force all COPY
> +        * requests to be synchronous to avoid client application
> +        * hangs waiting for completion.
> +        */
> +       nfsd4_copy_set_sync(copy, true);
> +
>         copy->cp_clp = cstate->clp;
>         if (nfsd4_ssc_is_inter(copy)) {
>                 trace_nfsd_copy_inter(copy);
>
> base-commit: 939cb14d51a150e3c12ef7a8ce0ba04ce6131bd2
> --
> 2.44.0
>
>
Dai Ngo May 6, 2024, 11:37 p.m. UTC | #2
On 5/6/24 2:04 PM, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> We've discovered that delivering a CB_OFFLOAD operation can be
> unreliable in some pretty unremarkable situations,

Since the fore and back channel use the same connection so I assume
this is not a connection related problem.

Sounds like this is a bug that we should find and fix if possible
instead of work around it. Do you know any scenarios where the
CB_OFFLOAD operation is unreliable?

-Dai

>   and the Linux
> NFS client does not yet support sending an OFFLOAD_STATUS
> operation to probe whether an asynchronous COPY operation has
> finished. On Linux NFS clients, COPY can hang until manually
> interrupted.
>
> I've tried a couple of remedies, but so far the side-effects are
> worse than the disease. For now, force COPY operations to be
> synchronous so that the use of CB_OFFLOAD is avoided entirely.
>
> I have some patches that add an OFFLOAD_STATUS implementation to the
> Linux NFS client, but that is not likely to fix older clients.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>   fs/nfsd/nfs4proc.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index ea3cc3e870a7..12722c709cc6 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1807,6 +1807,13 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>   	__be32 status;
>   	struct nfsd4_copy *async_copy = NULL;
>   
> +	/*
> +	 * Currently, async COPY is not reliable. Force all COPY
> +	 * requests to be synchronous to avoid client application
> +	 * hangs waiting for completion.
> +	 */
> +	nfsd4_copy_set_sync(copy, true);
> +
>   	copy->cp_clp = cstate->clp;
>   	if (nfsd4_ssc_is_inter(copy)) {
>   		trace_nfsd_copy_inter(copy);
>
> base-commit: 939cb14d51a150e3c12ef7a8ce0ba04ce6131bd2
Chuck Lever May 7, 2024, 12:14 a.m. UTC | #3
On Mon, May 06, 2024 at 03:30:18PM -0700, Rick Macklem wrote:
> On Mon, May 6, 2024 at 2:04 PM <cel@kernel.org> wrote:
> >
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > We've discovered that delivering a CB_OFFLOAD operation can be
> > unreliable in some pretty unremarkable situations, and the Linux
> > NFS client does not yet support sending an OFFLOAD_STATUS
> > operation to probe whether an asynchronous COPY operation has
> > finished. On Linux NFS clients, COPY can hang until manually
> > interrupted.
> >
> > I've tried a couple of remedies, but so far the side-effects are
> > worse than the disease. For now, force COPY operations to be
> > synchronous so that the use of CB_OFFLOAD is avoided entirely.
> Just a yellow warning light from my experience with the FreeBSD server
> (which always does synchronous Copy's).
> A large synchronous Copy can take a long time, resulting in a slow RPC
> response time. A user reported 25sec.
> The 25sec case turned out to be a ZFS specific issue that could be avoided
> via a ZFS tunable.
> 
> I do not have a good generic solution, but I do have a tunable that can
> be used to clip the Copy len, which is a rather blunt and ugly workaround.
> (The generic copy routine internal to FreeBSD can accept a flag that indicates
> "return after 1sec with a partial copy done", but that is not implemented for
> file systems like ZFS.)
> 
> Although there is no hard limit in the RFCs for RPC response times,
> I've typically
> assumed a max of 1-2sec is desirable.

This is not meant to be a permanent change, but rather one that can
be backported to LTS kernels if we see the need. A long-term fix
will re-enable async COPY but deal properly with the loss of a
CB_OFFLOAD.

The server should return NFS4ERR_DELAY if it expects to take a long
time, no?

> rick
> 
> >
> > I have some patches that add an OFFLOAD_STATUS implementation to the
> > Linux NFS client, but that is not likely to fix older clients.
> >
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> >  fs/nfsd/nfs4proc.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index ea3cc3e870a7..12722c709cc6 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -1807,6 +1807,13 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >         __be32 status;
> >         struct nfsd4_copy *async_copy = NULL;
> >
> > +       /*
> > +        * Currently, async COPY is not reliable. Force all COPY
> > +        * requests to be synchronous to avoid client application
> > +        * hangs waiting for completion.
> > +        */
> > +       nfsd4_copy_set_sync(copy, true);
> > +
> >         copy->cp_clp = cstate->clp;
> >         if (nfsd4_ssc_is_inter(copy)) {
> >                 trace_nfsd_copy_inter(copy);
> >
> > base-commit: 939cb14d51a150e3c12ef7a8ce0ba04ce6131bd2
> > --
> > 2.44.0
> >
> >
Chuck Lever May 7, 2024, 12:27 a.m. UTC | #4
On Mon, May 06, 2024 at 04:37:15PM -0700, Dai Ngo wrote:
> 
> On 5/6/24 2:04 PM, cel@kernel.org wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> > 
> > We've discovered that delivering a CB_OFFLOAD operation can be
> > unreliable in some pretty unremarkable situations,
> 
> Since the fore and back channel use the same connection so I assume
> this is not a connection related problem.

This is totally a connection-related problem. The underlying issue
is that NFSD does not retransmit backchannel requests when the
connection is lost, and the Linux NFS client does not implement
OFFLOAD_STATUS. Neither side right now recovers from connection
loss while a COPY operation is pending.


> Sounds like this is a bug that we should find and fix if possible
> instead of work around it.

I've been looking for a fix for the past several months. The last
fix I put in, you asked me to revert. So, I would prefer to fix
the root cause of this issue, but right now the best we can do is
create a surgical patch that can be backported to LTS kernels, and
keep working on a longer term fix.

It's either temporarily force all COPY operations to become
synchronous, or temporarily drop support for COPY in NFSD. Actually
the latter sounds safer.


> Do you know any scenarios where the CB_OFFLOAD operation is
> unreliable?

Any scenario where the connection is dropped (say, because the
server wants the client to retransmit forechannel requests, or
because of a GSS sequence number window under-run, or because of a
network partition, etc) can potentially result in the loss of a
backchannel operation.

I can reproduce this issue 100% of the time with an NFSv4.2 mount
from a 6.8.7-200.fc39.x86_64 NFS client, using the git regression
suite.


> > and the Linux
> > NFS client does not yet support sending an OFFLOAD_STATUS
> > operation to probe whether an asynchronous COPY operation has
> > finished. On Linux NFS clients, COPY can hang until manually
> > interrupted.
> > 
> > I've tried a couple of remedies, but so far the side-effects are
> > worse than the disease. For now, force COPY operations to be
> > synchronous so that the use of CB_OFFLOAD is avoided entirely.
> > 
> > I have some patches that add an OFFLOAD_STATUS implementation to the
> > Linux NFS client, but that is not likely to fix older clients.
> > 
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> >   fs/nfsd/nfs4proc.c | 7 +++++++
> >   1 file changed, 7 insertions(+)
> > 
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index ea3cc3e870a7..12722c709cc6 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -1807,6 +1807,13 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> >   	__be32 status;
> >   	struct nfsd4_copy *async_copy = NULL;
> > +	/*
> > +	 * Currently, async COPY is not reliable. Force all COPY
> > +	 * requests to be synchronous to avoid client application
> > +	 * hangs waiting for completion.
> > +	 */
> > +	nfsd4_copy_set_sync(copy, true);
> > +
> >   	copy->cp_clp = cstate->clp;
> >   	if (nfsd4_ssc_is_inter(copy)) {
> >   		trace_nfsd_copy_inter(copy);
> > 
> > base-commit: 939cb14d51a150e3c12ef7a8ce0ba04ce6131bd2
Rick Macklem May 7, 2024, 12:31 a.m. UTC | #5
On Mon, May 6, 2024 at 5:14 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On Mon, May 06, 2024 at 03:30:18PM -0700, Rick Macklem wrote:
> > On Mon, May 6, 2024 at 2:04 PM <cel@kernel.org> wrote:
> > >
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > >
> > > We've discovered that delivering a CB_OFFLOAD operation can be
> > > unreliable in some pretty unremarkable situations, and the Linux
> > > NFS client does not yet support sending an OFFLOAD_STATUS
> > > operation to probe whether an asynchronous COPY operation has
> > > finished. On Linux NFS clients, COPY can hang until manually
> > > interrupted.
> > >
> > > I've tried a couple of remedies, but so far the side-effects are
> > > worse than the disease. For now, force COPY operations to be
> > > synchronous so that the use of CB_OFFLOAD is avoided entirely.
> > Just a yellow warning light from my experience with the FreeBSD server
> > (which always does synchronous Copy's).
> > A large synchronous Copy can take a long time, resulting in a slow RPC
> > response time. A user reported 25sec.
> > The 25sec case turned out to be a ZFS specific issue that could be avoided
> > via a ZFS tunable.
> >
> > I do not have a good generic solution, but I do have a tunable that can
> > be used to clip the Copy len, which is a rather blunt and ugly workaround.
> > (The generic copy routine internal to FreeBSD can accept a flag that indicates
> > "return after 1sec with a partial copy done", but that is not implemented for
> > file systems like ZFS.)
> >
> > Although there is no hard limit in the RFCs for RPC response times,
> > I've typically
> > assumed a max of 1-2sec is desirable.
>
> This is not meant to be a permanent change, but rather one that can
> be backported to LTS kernels if we see the need. A long-term fix
> will re-enable async COPY but deal properly with the loss of a
> CB_OFFLOAD.
>
> The server should return NFS4ERR_DELAY if it expects to take a long
> time, no?
And then the client is going to try the same Copy again after waiting a while,
yes?
I think you need the Copy to make some progress, so that the client doesn't
just try try again. (Also, in FreeBSD it wouldn't be easy to "interrupt" a copy
that is in progress on a server file system so, at least in FreeBSD, I don't
know how the NFS server would know to expect a long delay.  As I noted,
there is a "stop copying after 1sec" flag that, if obeyed by all file systems,
does allow some progress without taking an excessive time.)

rick

>
> > rick
> >
> > >
> > > I have some patches that add an OFFLOAD_STATUS implementation to the
> > > Linux NFS client, but that is not likely to fix older clients.
> > >
> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > ---
> > >  fs/nfsd/nfs4proc.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > > index ea3cc3e870a7..12722c709cc6 100644
> > > --- a/fs/nfsd/nfs4proc.c
> > > +++ b/fs/nfsd/nfs4proc.c
> > > @@ -1807,6 +1807,13 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > >         __be32 status;
> > >         struct nfsd4_copy *async_copy = NULL;
> > >
> > > +       /*
> > > +        * Currently, async COPY is not reliable. Force all COPY
> > > +        * requests to be synchronous to avoid client application
> > > +        * hangs waiting for completion.
> > > +        */
> > > +       nfsd4_copy_set_sync(copy, true);
> > > +
> > >         copy->cp_clp = cstate->clp;
> > >         if (nfsd4_ssc_is_inter(copy)) {
> > >                 trace_nfsd_copy_inter(copy);
> > >
> > > base-commit: 939cb14d51a150e3c12ef7a8ce0ba04ce6131bd2
> > > --
> > > 2.44.0
> > >
> > >
>
> --
> Chuck Lever
diff mbox series

Patch

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index ea3cc3e870a7..12722c709cc6 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1807,6 +1807,13 @@  nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	__be32 status;
 	struct nfsd4_copy *async_copy = NULL;
 
+	/*
+	 * Currently, async COPY is not reliable. Force all COPY
+	 * requests to be synchronous to avoid client application
+	 * hangs waiting for completion.
+	 */
+	nfsd4_copy_set_sync(copy, true);
+
 	copy->cp_clp = cstate->clp;
 	if (nfsd4_ssc_is_inter(copy)) {
 		trace_nfsd_copy_inter(copy);