diff mbox

[v2] nfsd: nfsd4_cb_sequence_done() for processing more cb errors

Message ID 55817CCE.1030704@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Kinglong Mee June 17, 2015, 1:57 p.m. UTC
When testing pnfs layout, nfsd got error NFS4ERR_SEQ_MISORDERED.
It is caused by nfs return NFS4ERR_DELAY before validate_seqid(),
don't update the sequnce id, but nfsd updates the sequnce id !!!

According to RFC5661 20.9.3,
" If CB_SEQUENCE returns an error, then the state of the slot
(sequence ID, cached reply) MUST NOT change."

This patch introduce a new helper nfsd4_cb_sequence_done() for
processing more callback errors as client.

v2, 
thanks Christoph's advice of adding a helper process errors as client

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 fs/nfsd/nfs4callback.c | 122 +++++++++++++++++++++++++++++++++++++------------
 fs/nfsd/state.h        |   1 +
 2 files changed, 93 insertions(+), 30 deletions(-)

Comments

J. Bruce Fields June 23, 2015, 9:34 p.m. UTC | #1
On Wed, Jun 17, 2015 at 09:57:34PM +0800, Kinglong Mee wrote:
> When testing pnfs layout, nfsd got error NFS4ERR_SEQ_MISORDERED.
> It is caused by nfs return NFS4ERR_DELAY before validate_seqid(),
> don't update the sequnce id, but nfsd updates the sequnce id !!!
> 
> According to RFC5661 20.9.3,
> " If CB_SEQUENCE returns an error, then the state of the slot
> (sequence ID, cached reply) MUST NOT change."
> 
> This patch introduce a new helper nfsd4_cb_sequence_done() for
> processing more callback errors as client.
> 
> v2, 
> thanks Christoph's advice of adding a helper process errors as client

Thanks, could you rebase this on top of my for-4.2 branch?  (I already
took your earlier patch.)  Also:

> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
> ---
>  fs/nfsd/nfs4callback.c | 122 +++++++++++++++++++++++++++++++++++++------------
>  fs/nfsd/state.h        |   1 +
>  2 files changed, 93 insertions(+), 30 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> index ce8d3f2..6fce478 100644
> --- a/fs/nfsd/nfs4callback.c
> +++ b/fs/nfsd/nfs4callback.c
...
> @@ -874,6 +874,11 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
>  	struct nfs4_client *clp = cb->cb_clp;
>  	u32 minorversion = clp->cl_minorversion;
>  
> +	/*
> +	 * cb_seq_status is only set in decode_cb_sequence4res,
> +	 * and so will remain 1 if an rpc level failure occurs.
> +	 */
> +	cb->cb_seq_status = 1;

Isn't that a valid error code?  OK, I guess NFS4ERR_PERM isn't a legal
SEQUENCE return.  Still, -1 might make this a bit more obvious.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kinglong Mee June 24, 2015, 1:10 a.m. UTC | #2
On 6/24/2015 5:34 AM, J. Bruce Fields wrote:
> On Wed, Jun 17, 2015 at 09:57:34PM +0800, Kinglong Mee wrote:
>> When testing pnfs layout, nfsd got error NFS4ERR_SEQ_MISORDERED.
>> It is caused by nfs return NFS4ERR_DELAY before validate_seqid(),
>> don't update the sequnce id, but nfsd updates the sequnce id !!!
>>
>> According to RFC5661 20.9.3,
>> " If CB_SEQUENCE returns an error, then the state of the slot
>> (sequence ID, cached reply) MUST NOT change."
>>
>> This patch introduce a new helper nfsd4_cb_sequence_done() for
>> processing more callback errors as client.
>>
>> v2, 
>> thanks Christoph's advice of adding a helper process errors as client
> 
> Thanks, could you rebase this on top of my for-4.2 branch?  (I already
> took your earlier patch.)  Also:

Got it.

> 
>> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
>> ---
>>  fs/nfsd/nfs4callback.c | 122 +++++++++++++++++++++++++++++++++++++------------
>>  fs/nfsd/state.h        |   1 +
>>  2 files changed, 93 insertions(+), 30 deletions(-)
>>
>> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
>> index ce8d3f2..6fce478 100644
>> --- a/fs/nfsd/nfs4callback.c
>> +++ b/fs/nfsd/nfs4callback.c
> ...
>> @@ -874,6 +874,11 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
>>  	struct nfs4_client *clp = cb->cb_clp;
>>  	u32 minorversion = clp->cl_minorversion;
>>  
>> +	/*
>> +	 * cb_seq_status is only set in decode_cb_sequence4res,
>> +	 * and so will remain 1 if an rpc level failure occurs.
>> +	 */
>> +	cb->cb_seq_status = 1;
> 
> Isn't that a valid error code?  OK, I guess NFS4ERR_PERM isn't a legal
> SEQUENCE return.  Still, -1 might make this a bit more obvious.

No, error code is a negative number, 1 is safe.
Also, it is copied from client's error process in nfs41_sequence_done.

thanks,
Kinglong Mee
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
J. Bruce Fields June 24, 2015, 1:20 a.m. UTC | #3
On Wed, Jun 24, 2015 at 09:10:55AM +0800, Kinglong Mee wrote:
> On 6/24/2015 5:34 AM, J. Bruce Fields wrote:
> > On Wed, Jun 17, 2015 at 09:57:34PM +0800, Kinglong Mee wrote:
> >> When testing pnfs layout, nfsd got error NFS4ERR_SEQ_MISORDERED.
> >> It is caused by nfs return NFS4ERR_DELAY before validate_seqid(),
> >> don't update the sequnce id, but nfsd updates the sequnce id !!!
> >>
> >> According to RFC5661 20.9.3,
> >> " If CB_SEQUENCE returns an error, then the state of the slot
> >> (sequence ID, cached reply) MUST NOT change."
> >>
> >> This patch introduce a new helper nfsd4_cb_sequence_done() for
> >> processing more callback errors as client.
> >>
> >> v2, 
> >> thanks Christoph's advice of adding a helper process errors as client
> > 
> > Thanks, could you rebase this on top of my for-4.2 branch?  (I already
> > took your earlier patch.)  Also:
> 
> Got it.
> 
> > 
> >> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
> >> ---
> >>  fs/nfsd/nfs4callback.c | 122 +++++++++++++++++++++++++++++++++++++------------
> >>  fs/nfsd/state.h        |   1 +
> >>  2 files changed, 93 insertions(+), 30 deletions(-)
> >>
> >> diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
> >> index ce8d3f2..6fce478 100644
> >> --- a/fs/nfsd/nfs4callback.c
> >> +++ b/fs/nfsd/nfs4callback.c
> > ...
> >> @@ -874,6 +874,11 @@ static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
> >>  	struct nfs4_client *clp = cb->cb_clp;
> >>  	u32 minorversion = clp->cl_minorversion;
> >>  
> >> +	/*
> >> +	 * cb_seq_status is only set in decode_cb_sequence4res,
> >> +	 * and so will remain 1 if an rpc level failure occurs.
> >> +	 */
> >> +	cb->cb_seq_status = 1;
> > 
> > Isn't that a valid error code?  OK, I guess NFS4ERR_PERM isn't a legal
> > SEQUENCE return.  Still, -1 might make this a bit more obvious.
> 
> No, error code is a negative number, 1 is safe.
> Also, it is copied from client's error process in nfs41_sequence_done.

Doh, got it!

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index ce8d3f2..6fce478 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -435,12 +435,12 @@  static int decode_cb_sequence4resok(struct xdr_stream *xdr,
 	 */
 	status = 0;
 out:
-	if (status)
-		nfsd4_mark_cb_fault(cb->cb_clp, status);
+	cb->cb_seq_status = status;
 	return status;
 out_overflow:
 	print_overflow_msg(__func__, xdr);
-	return -EIO;
+	status = -EIO;
+	goto out;
 }
 
 static int decode_cb_sequence4res(struct xdr_stream *xdr,
@@ -451,8 +451,8 @@  static int decode_cb_sequence4res(struct xdr_stream *xdr,
 	if (cb->cb_minorversion == 0)
 		return 0;
 
-	status = decode_cb_op_status(xdr, OP_CB_SEQUENCE, &cb->cb_status);
-	if (unlikely(status || cb->cb_status))
+	status = decode_cb_op_status(xdr, OP_CB_SEQUENCE, &cb->cb_seq_status);
+	if (unlikely(status || cb->cb_seq_status))
 		return status;
 
 	return decode_cb_sequence4resok(xdr, cb);
@@ -526,7 +526,7 @@  static int nfs4_xdr_dec_cb_recall(struct rpc_rqst *rqstp,
 
 	if (cb != NULL) {
 		status = decode_cb_sequence4res(xdr, cb);
-		if (unlikely(status || cb->cb_status))
+		if (unlikely(status || cb->cb_seq_status))
 			return status;
 	}
 
@@ -616,7 +616,7 @@  static int nfs4_xdr_dec_cb_layout(struct rpc_rqst *rqstp,
 
 	if (cb) {
 		status = decode_cb_sequence4res(xdr, cb);
-		if (unlikely(status || cb->cb_status))
+		if (unlikely(status || cb->cb_seq_status))
 			return status;
 	}
 	return decode_cb_op_status(xdr, OP_CB_LAYOUTRECALL, &cb->cb_status);
@@ -874,6 +874,11 @@  static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
 	struct nfs4_client *clp = cb->cb_clp;
 	u32 minorversion = clp->cl_minorversion;
 
+	/*
+	 * cb_seq_status is only set in decode_cb_sequence4res,
+	 * and so will remain 1 if an rpc level failure occurs.
+	 */
+	cb->cb_seq_status = 1;
 	cb->cb_status = 0;
 	cb->cb_minorversion = minorversion;
 	if (minorversion) {
@@ -883,6 +888,84 @@  static void nfsd4_cb_prepare(struct rpc_task *task, void *calldata)
 	rpc_call_start(task);
 }
 
+static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_callback *cb)
+{
+	struct nfs4_client *clp = cb->cb_clp;
+	struct nfsd4_session *session = clp->cl_cb_session;
+	bool ret = true;
+
+	if (!clp->cl_minorversion) {
+		/*
+		 * If the backchannel connection was shut down while this
+		 * task was queued, we need to resubmit it after setting up
+		 * a new backchannel connection.
+		 *
+		 * Note that if we lost our callback connection permanently
+		 * the submission code will error out, so we don't need to
+		 * handle that case here.
+		 */
+		if (task->tk_flags & RPC_TASK_KILLED)
+			goto need_restart;
+
+		return true;
+	}
+
+	switch (cb->cb_seq_status) {
+	case 0:
+		/*
+		 * No need for lock, access serialized in nfsd4_cb_prepare
+		 *
+		 * RFC5661 20.9.3
+		 * If CB_SEQUENCE returns an error, then the state of the slot
+		 * (sequence ID, cached reply) MUST NOT change.
+		 */
+		++session->se_cb_seq_nr;
+		break;
+	case -ESERVERFAULT:
+		++session->se_cb_seq_nr;
+	case 1:
+	case -NFS4ERR_BADSESSION:
+		nfsd4_mark_cb_fault(cb->cb_clp, cb->cb_seq_status);
+		ret = false;
+		break;
+	case -NFS4ERR_DELAY:
+		if (!rpc_restart_call(task))
+			goto out;
+
+		rpc_delay(task, 2 * HZ);
+		return false;
+	case -NFS4ERR_BADSLOT:
+		goto retry_nowait;
+	case -NFS4ERR_SEQ_MISORDERED:
+		if (session->se_cb_seq_nr != 1) {
+			session->se_cb_seq_nr = 1;
+			goto retry_nowait;
+		}
+		break;
+	default:
+		dprintk("%s: unprocessed error %d\n", __func__,
+			cb->cb_seq_status);
+	}
+
+	clear_bit(0, &clp->cl_cb_slot_busy);
+	rpc_wake_up_next(&clp->cl_cb_waitq);
+	dprintk("%s: freed slot, new seqid=%d\n", __func__,
+		clp->cl_cb_session->se_cb_seq_nr);
+
+	if (task->tk_flags & RPC_TASK_KILLED)
+		goto need_restart;
+out:
+	return ret;
+retry_nowait:
+	if (rpc_restart_call_prepare(task))
+		ret = false;
+	goto out;
+need_restart:
+	task->tk_status = 0;
+	cb->cb_need_restart = true;
+	return false;
+}
+
 static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
 {
 	struct nfsd4_callback *cb = calldata;
@@ -891,30 +974,8 @@  static void nfsd4_cb_done(struct rpc_task *task, void *calldata)
 	dprintk("%s: minorversion=%d\n", __func__,
 		clp->cl_minorversion);
 
-	if (clp->cl_minorversion) {
-		/* No need for lock, access serialized in nfsd4_cb_prepare */
-		if (!task->tk_status)
-			++clp->cl_cb_session->se_cb_seq_nr;
-		clear_bit(0, &clp->cl_cb_slot_busy);
-		rpc_wake_up_next(&clp->cl_cb_waitq);
-		dprintk("%s: freed slot, new seqid=%d\n", __func__,
-			clp->cl_cb_session->se_cb_seq_nr);
-	}
-
-	/*
-	 * If the backchannel connection was shut down while this
-	 * task was queued, we need to resubmit it after setting up
-	 * a new backchannel connection.
-	 *
-	 * Note that if we lost our callback connection permanently
-	 * the submission code will error out, so we don't need to
-	 * handle that case here.
-	 */
-	if (task->tk_flags & RPC_TASK_KILLED) {
-		task->tk_status = 0;
-		cb->cb_need_restart = true;
+	if (!nfsd4_cb_sequence_done(task, cb))
 		return;
-	}
 
 	if (cb->cb_status) {
 		WARN_ON_ONCE(task->tk_status);
@@ -1090,6 +1151,7 @@  void nfsd4_init_cb(struct nfsd4_callback *cb, struct nfs4_client *clp,
 	cb->cb_msg.rpc_resp = cb;
 	cb->cb_ops = ops;
 	INIT_WORK(&cb->cb_work, nfsd4_run_cb_work);
+	cb->cb_seq_status = 1;
 	cb->cb_status = 0;
 	cb->cb_need_restart = false;
 }
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index dbc4f85..d975165 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -67,6 +67,7 @@  struct nfsd4_callback {
 	struct rpc_message cb_msg;
 	struct nfsd4_callback_ops *cb_ops;
 	struct work_struct cb_work;
+	int cb_seq_status;
 	int cb_status;
 	bool cb_need_restart;
 };