Message ID | 20240418141104.95269-1-olga.kornievskaia@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [1/1] SUNRPC: fix handling expired GSS context | expand |
On 18 Apr 2024, at 10:11, Olga Kornievskaia wrote: > From: Olga Kornievskaia <kolga@netapp.com> > > In the case where we have received a successful reply to an RPC request, > but while processing the reply the client in rpc_decode_header() finds > an expired context, the code ends up propagating the error to the caller > instead of getting a new context and retrying the request. > > To give more details, in rpc_decode_header() we call rpcauth_checkverf() > will call into the gss and internally will at some point call > gss_validate() which has a check if the current’s context lifetime > expired, and it would fail. The reason for the failure gets ‘scrubbed’ > and translated to EACCES so when we get back to rpc_decode_header() we > just go to “out_verifier” which for that error would get converted to > “out_garbage” (ie it’s treated as garballed reply) and the next > action is call_encode. Which (1) doesn’t reencode or re-send (not to > mention no upcall happens because context expires as that reason just > not known) and it again fails in the same decoding process. After > re-trying it 3 times the error is propagated back to the caller > (ie nfs4_write_done_cb() in the case a failing write). > > To fix this, instead we need to look to the case where the server > decides that context has expired and replies with an RPC auth error. > In that case, the rpc_decode_header() goes to "out_msg_denied" in that > we return EKEYREJECTED which in call_decode() is sent to “call_reserve” > which triggers an upcalls and a re-try of the operation. > > The proposed fix is in case of a failed rpc_decode_header() to check > if credentials were set to be invalid and use that as a proxy for > deciding that context has expired and then treat is same way as > receiving an auth error. > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com> > > --- This looks like a day-0 problem and should probably be back > ported to earlier kernels. Looks good to me. Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Ben
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 28f3749f6dc6..f19ad55017c9 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -2698,8 +2698,19 @@ rpc_decode_header(struct rpc_task *task, struct xdr_stream *xdr) goto out_msg_denied; error = rpcauth_checkverf(task, xdr); - if (error) + if (error) { + struct rpc_cred *cred = task->tk_rqstp->rq_cred; + + if (!test_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags)) { + rpcauth_invalcred(task); + if (!task->tk_cred_retry) + goto out_err; + task->tk_cred_retry--; + trace_rpc__stale_creds(task); + return -EKEYREJECTED; + } goto out_verifier; + } p = xdr_inline_decode(xdr, sizeof(*p)); if (!p)