nfsd: CB_RECALL can race with FREE_STATEID

Message ID	20190418132400.24161-1-smayhew@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-nfs-owner@kernel.org> From: Scott Mayhew <smayhew@redhat.com> To: bfields@fieldses.org, jlayton@kernel.org Cc: linux-nfs@vger.kernel.org Subject: [PATCH] nfsd: CB_RECALL can race with FREE_STATEID Date: Thu, 18 Apr 2019 09:24:00 -0400 Message-Id: <20190418132400.24161-1-smayhew@redhat.com> Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk
Series	nfsd: CB_RECALL can race with FREE_STATEID \| expand nfsd: CB_RECALL can race with FREE_STATEID

Scott Mayhew April 18, 2019, 1:24 p.m. UTC

While trying to track down some issues involving large numbers of
delegations being recalled/revoked, I caught the server setting
SEQ4_STATUS_CB_PATH_DOWN while the client was actively responding to
CB_RECALLs.  It turns out that the client had already done a
TEST_STATEID and FREE_STATEID for a delegation being recalled by the
time it received the CB_RECALL.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
---
 fs/nfsd/nfs4state.c | 8 ++++++++
 1 file changed, 8 insertions(+)

J. Bruce Fields April 18, 2019, 3:13 p.m. UTC | #1

On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> While trying to track down some issues involving large numbers of
> delegations being recalled/revoked, I caught the server setting
> SEQ4_STATUS_CB_PATH_DOWN while the client was actively responding to
> CB_RECALLs.  It turns out that the client had already done a
> TEST_STATEID and FREE_STATEID for a delegation being recalled by the
> time it received the CB_RECALL.

That's interesting, thanks!

This exception seems awfully narrow, though.

If we get back any NFS-level error at all, then I think the callback
channel is working (am I wrong?) and telling the client to set up a new
one is probably not going to help.  The best we can do is probably just
give up and let the client deal with the ensuing
RECALLABLE_STATE_REVOKED flag.

--b.

> 
> Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> ---
>  fs/nfsd/nfs4state.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 6a45fb00c5fc..e88e429133a8 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -3958,6 +3958,14 @@ static int nfsd4_cb_recall_done(struct nfsd4_callback *cb,
>  			rpc_delay(task, 2 * HZ);
>  			return 0;
>  		}
> +		/*
> +		 * Race: client may have done a FREE_STATEID before
> +		 * receiving the CB_RECALL.
> +		 */
> +		if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID &&
> +				refcount_read(&dp->dl_stid.sc_count) == 1 &&
> +				list_empty(&dp->dl_recall_lru))
> +			return 1;
>  		/*FALLTHRU*/
>  	default:
>  		return -1;
> -- 
> 2.17.2

Scott Mayhew April 18, 2019, 8:50 p.m. UTC | #2

On Thu, 18 Apr 2019, J. Bruce Fields wrote:

> On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> > While trying to track down some issues involving large numbers of
> > delegations being recalled/revoked, I caught the server setting
> > SEQ4_STATUS_CB_PATH_DOWN while the client was actively responding to
> > CB_RECALLs.  It turns out that the client had already done a
> > TEST_STATEID and FREE_STATEID for a delegation being recalled by the
> > time it received the CB_RECALL.
> 
> That's interesting, thanks!
> 
> This exception seems awfully narrow, though.
> 
> If we get back any NFS-level error at all, then I think the callback
> channel is working (am I wrong?)

Correct, if the client replies with either NFS4ERR_DELAY or
NFS4ERR_BAD_STATEID, the server will retry 1 time (see dl_retries).
After that, we fall thru and nfsd4_cb_recall_done() returns -1 which
causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.

> and telling the client to set up a new
> one is probably not going to help.  The best we can do is probably just
> give up

That's what the patch is essentially doing.  Or are you saying don't
even bother with the checks but still return 1 so we don't set the
SEQ4_STATUS_CB_PATH_DOWN flag?

> and let the client deal with the ensuing
> RECALLABLE_STATE_REVOKED flag.

The client's already dealing with the RECALLABLE_STATE_REVOKED flag,
that's why it sent a TEST_STATEID and FREE_STATEID before it got this
particular CB_RECALL.  The idea behind the patch is to not give the
state manager on the client additional work by setting CB_PATH_DOWN when
the callback channel is clearly working...

-Scott
> 
> --b.
> 
> > 
> > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > ---
> >  fs/nfsd/nfs4state.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 6a45fb00c5fc..e88e429133a8 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -3958,6 +3958,14 @@ static int nfsd4_cb_recall_done(struct nfsd4_callback *cb,
> >  			rpc_delay(task, 2 * HZ);
> >  			return 0;
> >  		}
> > +		/*
> > +		 * Race: client may have done a FREE_STATEID before
> > +		 * receiving the CB_RECALL.
> > +		 */
> > +		if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID &&
> > +				refcount_read(&dp->dl_stid.sc_count) == 1 &&
> > +				list_empty(&dp->dl_recall_lru))
> > +			return 1;
> >  		/*FALLTHRU*/
> >  	default:
> >  		return -1;
> > -- 
> > 2.17.2

J. Bruce Fields April 18, 2019, 9:37 p.m. UTC | #3

On Thu, Apr 18, 2019 at 04:50:24PM -0400, Scott Mayhew wrote:
> On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> 
> > On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> > > While trying to track down some issues involving large numbers of
> > > delegations being recalled/revoked, I caught the server setting
> > > SEQ4_STATUS_CB_PATH_DOWN while the client was actively responding to
> > > CB_RECALLs.  It turns out that the client had already done a
> > > TEST_STATEID and FREE_STATEID for a delegation being recalled by the
> > > time it received the CB_RECALL.
> > 
> > That's interesting, thanks!
> > 
> > This exception seems awfully narrow, though.
> > 
> > If we get back any NFS-level error at all, then I think the callback
> > channel is working (am I wrong?)
> 
> Correct, if the client replies with either NFS4ERR_DELAY or
> NFS4ERR_BAD_STATEID, the server will retry 1 time (see dl_retries).
> After that, we fall thru and nfsd4_cb_recall_done() returns -1 which
> causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.
> 
> > and telling the client to set up a new
> > one is probably not going to help.  The best we can do is probably just
> > give up
> 
> That's what the patch is essentially doing.  Or are you saying don't
> even bother with the checks but still return 1 so we don't set the
> SEQ4_STATUS_CB_PATH_DOWN flag?

Right, I don't see any point returning -1 (which ends up setting
CB_PATH_DOWN) in any case where we get an nfs-level error.  If the
client got so far as returning an error, then the callback path is
working.

I'm not sure exactly what errors *should* result in CB_PATH_DOWN,
though.  ETIMEDOUT, ENOTCONN, EIO?  And maybe we should be checking for
those in nfsd4_cb_done, and do away with the convention that -1 means
CB_PATH_DOWN.  I don't think there's a reason individual callback ops
would need different rules for when to mark the callback channel down.

--b.

> 
> > and let the client deal with the ensuing
> > RECALLABLE_STATE_REVOKED flag.
> 
> The client's already dealing with the RECALLABLE_STATE_REVOKED flag,
> that's why it sent a TEST_STATEID and FREE_STATEID before it got this
> particular CB_RECALL.  The idea behind the patch is to not give the
> state manager on the client additional work by setting CB_PATH_DOWN when
> the callback channel is clearly working...
> 
> -Scott
> > 
> > --b.
> > 
> > > 
> > > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > > ---
> > >  fs/nfsd/nfs4state.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 6a45fb00c5fc..e88e429133a8 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -3958,6 +3958,14 @@ static int nfsd4_cb_recall_done(struct nfsd4_callback *cb,
> > >  			rpc_delay(task, 2 * HZ);
> > >  			return 0;
> > >  		}
> > > +		/*
> > > +		 * Race: client may have done a FREE_STATEID before
> > > +		 * receiving the CB_RECALL.
> > > +		 */
> > > +		if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID &&
> > > +				refcount_read(&dp->dl_stid.sc_count) == 1 &&
> > > +				list_empty(&dp->dl_recall_lru))
> > > +			return 1;
> > >  		/*FALLTHRU*/
> > >  	default:
> > >  		return -1;
> > > -- 
> > > 2.17.2

Trond Myklebust April 18, 2019, 10:03 p.m. UTC | #4

On Thu, 2019-04-18 at 16:50 -0400, Scott Mayhew wrote:
> On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> 
> > On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> > > While trying to track down some issues involving large numbers of
> > > delegations being recalled/revoked, I caught the server setting
> > > SEQ4_STATUS_CB_PATH_DOWN while the client was actively responding
> > > to
> > > CB_RECALLs.  It turns out that the client had already done a
> > > TEST_STATEID and FREE_STATEID for a delegation being recalled by
> > > the
> > > time it received the CB_RECALL.
> > 
> > That's interesting, thanks!
> > 
> > This exception seems awfully narrow, though.
> > 
> > If we get back any NFS-level error at all, then I think the
> > callback
> > channel is working (am I wrong?)
> 
> Correct, if the client replies with either NFS4ERR_DELAY or
> NFS4ERR_BAD_STATEID, the server will retry 1 time (see dl_retries).
> After that, we fall thru and nfsd4_cb_recall_done() returns -1 which
> causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.

There is no handling of NFS4ERR_DELAY in nfsd4_cb_recall_done().

As far as I can see, therefore, if the client returns NFS4ERR_DELAY
(which it usually does if it is already in the process of returning the
delegation) then the recall will fail immediately.

> > and telling the client to set up a new
> > one is probably not going to help.  The best we can do is probably
> > just
> > give up
> 
> That's what the patch is essentially doing.  Or are you saying don't
> even bother with the checks but still return 1 so we don't set the
> SEQ4_STATUS_CB_PATH_DOWN flag?
> 
> > and let the client deal with the ensuing
> > RECALLABLE_STATE_REVOKED flag.
> 
> The client's already dealing with the RECALLABLE_STATE_REVOKED flag,
> that's why it sent a TEST_STATEID and FREE_STATEID before it got this
> particular CB_RECALL.  The idea behind the patch is to not give the
> state manager on the client additional work by setting CB_PATH_DOWN
> when
> the callback channel is clearly working...
> 

Either way, the Linux client will ignore any further sequence flags
until it is done with the recovery of the RECALLABLE_STATE_REVOKED
flag. The reason is that the flags are edge triggered (i.e. they don't
clear until the state changes), and so we need to be able to perform a
full recovery before we can check them again.

J. Bruce Fields April 18, 2019, 11:42 p.m. UTC | #5

On Thu, Apr 18, 2019 at 10:03:09PM +0000, Trond Myklebust wrote:
> On Thu, 2019-04-18 at 16:50 -0400, Scott Mayhew wrote:
> > On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> > 
> > > On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> > > > While trying to track down some issues involving large numbers of
> > > > delegations being recalled/revoked, I caught the server setting
> > > > SEQ4_STATUS_CB_PATH_DOWN while the client was actively responding
> > > > to
> > > > CB_RECALLs.  It turns out that the client had already done a
> > > > TEST_STATEID and FREE_STATEID for a delegation being recalled by
> > > > the
> > > > time it received the CB_RECALL.
> > > 
> > > That's interesting, thanks!
> > > 
> > > This exception seems awfully narrow, though.
> > > 
> > > If we get back any NFS-level error at all, then I think the
> > > callback
> > > channel is working (am I wrong?)
> > 
> > Correct, if the client replies with either NFS4ERR_DELAY or
> > NFS4ERR_BAD_STATEID, the server will retry 1 time (see dl_retries).
> > After that, we fall thru and nfsd4_cb_recall_done() returns -1 which
> > causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.
> 
> There is no handling of NFS4ERR_DELAY in nfsd4_cb_recall_done().
> 
> As far as I can see, therefore, if the client returns NFS4ERR_DELAY
> (which it usually does if it is already in the process of returning the
> delegation) then the recall will fail immediately.

We should fix that, though it doesn't sound like it matters much in that
particular case.

The success or failure of a recall isn't actually all that interesting,
all that matters is whether we get the eventual DELEGRETURN.

--b.

Scott Mayhew April 30, 2019, 6:46 p.m. UTC | #6

On Thu, 18 Apr 2019, Trond Myklebust wrote:

> On Thu, 2019-04-18 at 16:50 -0400, Scott Mayhew wrote:
> > On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> > 
> > > On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> > > > While trying to track down some issues involving large numbers of
> > > > delegations being recalled/revoked, I caught the server setting
> > > > SEQ4_STATUS_CB_PATH_DOWN while the client was actively responding
> > > > to
> > > > CB_RECALLs.  It turns out that the client had already done a
> > > > TEST_STATEID and FREE_STATEID for a delegation being recalled by
> > > > the
> > > > time it received the CB_RECALL.
> > > 
> > > That's interesting, thanks!
> > > 
> > > This exception seems awfully narrow, though.
> > > 
> > > If we get back any NFS-level error at all, then I think the
> > > callback
> > > channel is working (am I wrong?)
> > 
> > Correct, if the client replies with either NFS4ERR_DELAY or
> > NFS4ERR_BAD_STATEID, the server will retry 1 time (see dl_retries).
> > After that, we fall thru and nfsd4_cb_recall_done() returns -1 which
> > causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.
> 
> There is no handling of NFS4ERR_DELAY in nfsd4_cb_recall_done().
> 
> As far as I can see, therefore, if the client returns NFS4ERR_DELAY
> (which it usually does if it is already in the process of returning the
> delegation) then the recall will fail immediately.

You're right, I had NFS4ERR_DELAY on the brain because I was seeing
those periodically in conjunction with the BIND_CONN_TO_SESSION calls
that were occurring while handling the bogus CB_PATH_DOWN flags from the
server.

-Scott
> 
> > > and telling the client to set up a new
> > > one is probably not going to help.  The best we can do is probably
> > > just
> > > give up
> > 
> > That's what the patch is essentially doing.  Or are you saying don't
> > even bother with the checks but still return 1 so we don't set the
> > SEQ4_STATUS_CB_PATH_DOWN flag?
> > 
> > > and let the client deal with the ensuing
> > > RECALLABLE_STATE_REVOKED flag.
> > 
> > The client's already dealing with the RECALLABLE_STATE_REVOKED flag,
> > that's why it sent a TEST_STATEID and FREE_STATEID before it got this
> > particular CB_RECALL.  The idea behind the patch is to not give the
> > state manager on the client additional work by setting CB_PATH_DOWN
> > when
> > the callback channel is clearly working...
> > 
> 
> Either way, the Linux client will ignore any further sequence flags
> until it is done with the recovery of the RECALLABLE_STATE_REVOKED
> flag. The reason is that the flags are edge triggered (i.e. they don't
> clear until the state changes), and so we need to be able to perform a
> full recovery before we can check them again.
> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
> 
>

Scott Mayhew April 30, 2019, 6:58 p.m. UTC | #7

On Thu, 18 Apr 2019, J. Bruce Fields wrote:

> On Thu, Apr 18, 2019 at 04:50:24PM -0400, Scott Mayhew wrote:
> > On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> > 
> > > On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> > > > While trying to track down some issues involving large numbers of
> > > > delegations being recalled/revoked, I caught the server setting
> > > > SEQ4_STATUS_CB_PATH_DOWN while the client was actively responding to
> > > > CB_RECALLs.  It turns out that the client had already done a
> > > > TEST_STATEID and FREE_STATEID for a delegation being recalled by the
> > > > time it received the CB_RECALL.
> > > 
> > > That's interesting, thanks!
> > > 
> > > This exception seems awfully narrow, though.
> > > 
> > > If we get back any NFS-level error at all, then I think the callback
> > > channel is working (am I wrong?)
> > 
> > Correct, if the client replies with either NFS4ERR_DELAY or
> > NFS4ERR_BAD_STATEID, the server will retry 1 time (see dl_retries).
> > After that, we fall thru and nfsd4_cb_recall_done() returns -1 which
> > causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.
> > 
> > > and telling the client to set up a new
> > > one is probably not going to help.  The best we can do is probably just
> > > give up
> > 
> > That's what the patch is essentially doing.  Or are you saying don't
> > even bother with the checks but still return 1 so we don't set the
> > SEQ4_STATUS_CB_PATH_DOWN flag?
> 
> Right, I don't see any point returning -1 (which ends up setting
> CB_PATH_DOWN) in any case where we get an nfs-level error.  If the
> client got so far as returning an error, then the callback path is
> working.
> 
> I'm not sure exactly what errors *should* result in CB_PATH_DOWN,
> though.  ETIMEDOUT, ENOTCONN, EIO?

I'm not sure either.  Looking at
call_status/call_timeout/rpc_check_timeout, it looks to me like ENOTCONN
will be translated to ETIMEDOUT because nfsd4_run_cb_work sets the 
RPC_TASK_SOFTCONN flag in the call to rpc_call_async.

It looks like call_status can return EHOSTDOWN, ENETDOWN, EHOSTUNREACH,
ENETUNREACH, and EPERM... should those be handled as well?

-Scott

> And maybe we should be checking for
> those in nfsd4_cb_done, and do away with the convention that -1 means
> CB_PATH_DOWN.  I don't think there's a reason individual callback ops
> would need different rules for when to mark the callback channel down.
> 
> --b.
> 
> > 
> > > and let the client deal with the ensuing
> > > RECALLABLE_STATE_REVOKED flag.
> > 
> > The client's already dealing with the RECALLABLE_STATE_REVOKED flag,
> > that's why it sent a TEST_STATEID and FREE_STATEID before it got this
> > particular CB_RECALL.  The idea behind the patch is to not give the
> > state manager on the client additional work by setting CB_PATH_DOWN when
> > the callback channel is clearly working...
> > 
> > -Scott
> > > 
> > > --b.
> > > 
> > > > 
> > > > Signed-off-by: Scott Mayhew <smayhew@redhat.com>
> > > > ---
> > > >  fs/nfsd/nfs4state.c | 8 ++++++++
> > > >  1 file changed, 8 insertions(+)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index 6a45fb00c5fc..e88e429133a8 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > > > @@ -3958,6 +3958,14 @@ static int nfsd4_cb_recall_done(struct nfsd4_callback *cb,
> > > >  			rpc_delay(task, 2 * HZ);
> > > >  			return 0;
> > > >  		}
> > > > +		/*
> > > > +		 * Race: client may have done a FREE_STATEID before
> > > > +		 * receiving the CB_RECALL.
> > > > +		 */
> > > > +		if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID &&
> > > > +				refcount_read(&dp->dl_stid.sc_count) == 1 &&
> > > > +				list_empty(&dp->dl_recall_lru))
> > > > +			return 1;
> > > >  		/*FALLTHRU*/
> > > >  	default:
> > > >  		return -1;
> > > > -- 
> > > > 2.17.2

Trond Myklebust April 30, 2019, 7:03 p.m. UTC | #8

On Tue, 2019-04-30 at 14:58 -0400, Scott Mayhew wrote:
> On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> 
> > On Thu, Apr 18, 2019 at 04:50:24PM -0400, Scott Mayhew wrote:
> > > On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> > > 
> > > > On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> > > > > While trying to track down some issues involving large
> > > > > numbers of
> > > > > delegations being recalled/revoked, I caught the server
> > > > > setting
> > > > > SEQ4_STATUS_CB_PATH_DOWN while the client was actively
> > > > > responding to
> > > > > CB_RECALLs.  It turns out that the client had already done a
> > > > > TEST_STATEID and FREE_STATEID for a delegation being recalled
> > > > > by the
> > > > > time it received the CB_RECALL.
> > > > 
> > > > That's interesting, thanks!
> > > > 
> > > > This exception seems awfully narrow, though.
> > > > 
> > > > If we get back any NFS-level error at all, then I think the
> > > > callback
> > > > channel is working (am I wrong?)
> > > 
> > > Correct, if the client replies with either NFS4ERR_DELAY or
> > > NFS4ERR_BAD_STATEID, the server will retry 1 time (see
> > > dl_retries).
> > > After that, we fall thru and nfsd4_cb_recall_done() returns -1
> > > which
> > > causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.
> > > 
> > > > and telling the client to set up a new
> > > > one is probably not going to help.  The best we can do is
> > > > probably just
> > > > give up
> > > 
> > > That's what the patch is essentially doing.  Or are you saying
> > > don't
> > > even bother with the checks but still return 1 so we don't set
> > > the
> > > SEQ4_STATUS_CB_PATH_DOWN flag?
> > 
> > Right, I don't see any point returning -1 (which ends up setting
> > CB_PATH_DOWN) in any case where we get an nfs-level error.  If the
> > client got so far as returning an error, then the callback path is
> > working.
> > 
> > I'm not sure exactly what errors *should* result in CB_PATH_DOWN,
> > though.  ETIMEDOUT, ENOTCONN, EIO?
> 
> I'm not sure either.  Looking at
> call_status/call_timeout/rpc_check_timeout, it looks to me like
> ENOTCONN
> will be translated to ETIMEDOUT because nfsd4_run_cb_work sets the 
> RPC_TASK_SOFTCONN flag in the call to rpc_call_async.
> 
> It looks like call_status can return EHOSTDOWN, ENETDOWN,
> EHOSTUNREACH,
> ENETUNREACH, and EPERM... should those be handled as well?

Those errors should never be passed back to applications.

Scott Mayhew May 2, 2019, 11:35 a.m. UTC | #9

On Tue, 30 Apr 2019, Trond Myklebust wrote:

> On Tue, 2019-04-30 at 14:58 -0400, Scott Mayhew wrote:
> > On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> > 
> > > On Thu, Apr 18, 2019 at 04:50:24PM -0400, Scott Mayhew wrote:
> > > > On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> > > > 
> > > > > On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew wrote:
> > > > > > While trying to track down some issues involving large
> > > > > > numbers of
> > > > > > delegations being recalled/revoked, I caught the server
> > > > > > setting
> > > > > > SEQ4_STATUS_CB_PATH_DOWN while the client was actively
> > > > > > responding to
> > > > > > CB_RECALLs.  It turns out that the client had already done a
> > > > > > TEST_STATEID and FREE_STATEID for a delegation being recalled
> > > > > > by the
> > > > > > time it received the CB_RECALL.
> > > > > 
> > > > > That's interesting, thanks!
> > > > > 
> > > > > This exception seems awfully narrow, though.
> > > > > 
> > > > > If we get back any NFS-level error at all, then I think the
> > > > > callback
> > > > > channel is working (am I wrong?)
> > > > 
> > > > Correct, if the client replies with either NFS4ERR_DELAY or
> > > > NFS4ERR_BAD_STATEID, the server will retry 1 time (see
> > > > dl_retries).
> > > > After that, we fall thru and nfsd4_cb_recall_done() returns -1
> > > > which
> > > > causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.
> > > > 
> > > > > and telling the client to set up a new
> > > > > one is probably not going to help.  The best we can do is
> > > > > probably just
> > > > > give up
> > > > 
> > > > That's what the patch is essentially doing.  Or are you saying
> > > > don't
> > > > even bother with the checks but still return 1 so we don't set
> > > > the
> > > > SEQ4_STATUS_CB_PATH_DOWN flag?
> > > 
> > > Right, I don't see any point returning -1 (which ends up setting
> > > CB_PATH_DOWN) in any case where we get an nfs-level error.  If the
> > > client got so far as returning an error, then the callback path is
> > > working.
> > > 
> > > I'm not sure exactly what errors *should* result in CB_PATH_DOWN,
> > > though.  ETIMEDOUT, ENOTCONN, EIO?
> > 
> > I'm not sure either.  Looking at
> > call_status/call_timeout/rpc_check_timeout, it looks to me like
> > ENOTCONN
> > will be translated to ETIMEDOUT because nfsd4_run_cb_work sets the 
> > RPC_TASK_SOFTCONN flag in the call to rpc_call_async.
> > 
> > It looks like call_status can return EHOSTDOWN, ENETDOWN,
> > EHOSTUNREACH,
> > ENETUNREACH, and EPERM... should those be handled as well?
> 
> Those errors should never be passed back to applications.

I'm confused.  If call_status passes any of those errors to rpc_exit,
then I'll see them in rpc_call_done/nfsd4_cb_done, won't I?

-Scott

> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
> 
>

Trond Myklebust May 2, 2019, 11:49 a.m. UTC | #10

On Thu, 2019-05-02 at 07:35 -0400, Scott Mayhew wrote:
> On Tue, 30 Apr 2019, Trond Myklebust wrote:
> 
> > On Tue, 2019-04-30 at 14:58 -0400, Scott Mayhew wrote:
> > > On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> > > 
> > > > On Thu, Apr 18, 2019 at 04:50:24PM -0400, Scott Mayhew wrote:
> > > > > On Thu, 18 Apr 2019, J. Bruce Fields wrote:
> > > > > 
> > > > > > On Thu, Apr 18, 2019 at 09:24:00AM -0400, Scott Mayhew
> > > > > > wrote:
> > > > > > > While trying to track down some issues involving large
> > > > > > > numbers of
> > > > > > > delegations being recalled/revoked, I caught the server
> > > > > > > setting
> > > > > > > SEQ4_STATUS_CB_PATH_DOWN while the client was actively
> > > > > > > responding to
> > > > > > > CB_RECALLs.  It turns out that the client had already
> > > > > > > done a
> > > > > > > TEST_STATEID and FREE_STATEID for a delegation being
> > > > > > > recalled
> > > > > > > by the
> > > > > > > time it received the CB_RECALL.
> > > > > > 
> > > > > > That's interesting, thanks!
> > > > > > 
> > > > > > This exception seems awfully narrow, though.
> > > > > > 
> > > > > > If we get back any NFS-level error at all, then I think the
> > > > > > callback
> > > > > > channel is working (am I wrong?)
> > > > > 
> > > > > Correct, if the client replies with either NFS4ERR_DELAY or
> > > > > NFS4ERR_BAD_STATEID, the server will retry 1 time (see
> > > > > dl_retries).
> > > > > After that, we fall thru and nfsd4_cb_recall_done() returns
> > > > > -1
> > > > > which
> > > > > causes the SEQ4_STATUS_CB_PATH_DOWN flag to be set.
> > > > > 
> > > > > > and telling the client to set up a new
> > > > > > one is probably not going to help.  The best we can do is
> > > > > > probably just
> > > > > > give up
> > > > > 
> > > > > That's what the patch is essentially doing.  Or are you
> > > > > saying
> > > > > don't
> > > > > even bother with the checks but still return 1 so we don't
> > > > > set
> > > > > the
> > > > > SEQ4_STATUS_CB_PATH_DOWN flag?
> > > > 
> > > > Right, I don't see any point returning -1 (which ends up
> > > > setting
> > > > CB_PATH_DOWN) in any case where we get an nfs-level error.  If
> > > > the
> > > > client got so far as returning an error, then the callback path
> > > > is
> > > > working.
> > > > 
> > > > I'm not sure exactly what errors *should* result in
> > > > CB_PATH_DOWN,
> > > > though.  ETIMEDOUT, ENOTCONN, EIO?
> > > 
> > > I'm not sure either.  Looking at
> > > call_status/call_timeout/rpc_check_timeout, it looks to me like
> > > ENOTCONN
> > > will be translated to ETIMEDOUT because nfsd4_run_cb_work sets
> > > the 
> > > RPC_TASK_SOFTCONN flag in the call to rpc_call_async.
> > > 
> > > It looks like call_status can return EHOSTDOWN, ENETDOWN,
> > > EHOSTUNREACH,
> > > ENETUNREACH, and EPERM... should those be handled as well?
> > 
> > Those errors should never be passed back to applications.
> 
> I'm confused.  If call_status passes any of those errors to rpc_exit,
> then I'll see them in rpc_call_done/nfsd4_cb_done, won't I?
> 

If you ever see them in rpc_call_done/nfsd4_cb_done, then it will be
due to an RPC bug which will need to be fixed.

call_status() should be handling those errors by retrying, and if it
runs out of retry attempts, then it should be setting either an EIO
error or an ETIMEDOUT error, depending on which rpc_task flags have
been specified.

nfsd: CB_RECALL can race with FREE_STATEID

Commit Message

Comments

Patch