diff mbox series

SUNRPC: fix sign error causing rpcsec_gss drops

Message ID 20211001135921.GC959@fieldses.org (mailing list archive)
State New, archived
Headers show
Series SUNRPC: fix sign error causing rpcsec_gss drops | expand

Commit Message

J. Bruce Fields Oct. 1, 2021, 1:59 p.m. UTC
From: "J. Bruce Fields" <bfields@redhat.com>

If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number
whenever sd_max is less than GSS_SEQ_WIN, and the comparison:

	seq_num <= sd->sd_max - GSS_SEQ_WIN

in gss_check_seq_num is pretty much always true, even when that's
clearly not what was intended.

This was causing pynfs to hang when using krb5, because pynfs uses zero
as the initial gss sequence number.  That's perfectly legal, but this
logic error causes knfsd to drop the rpc in that case.  Out-of-order
sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this.

Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints")
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 net/sunrpc/auth_gss/svcauth_gss.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Chuck Lever III Oct. 1, 2021, 4 p.m. UTC | #1
> On Oct 1, 2021, at 9:59 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
> 
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number
> whenever sd_max is less than GSS_SEQ_WIN, and the comparison:
> 
> 	seq_num <= sd->sd_max - GSS_SEQ_WIN
> 
> in gss_check_seq_num is pretty much always true, even when that's
> clearly not what was intended.
> 
> This was causing pynfs to hang when using krb5, because pynfs uses zero
> as the initial gss sequence number.  That's perfectly legal, but this
> logic error causes knfsd to drop the rpc in that case.  Out-of-order
> sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this.
> 
> Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints")
> Cc: stable@vger.kernel.org
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>

This will be included in the next NFSD 5.15-rc. Thanks!
See the for-next branch at

git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git


> ---
> net/sunrpc/auth_gss/svcauth_gss.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index 7dba6a9c213a..b87565b64928 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -645,7 +645,7 @@ static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci,
> 		}
> 		__set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win);
> 		goto ok;
> -	} else if (seq_num <= sd->sd_max - GSS_SEQ_WIN) {
> +	} else if (seq_num + GSS_SEQ_WIN <= sd->sd_max) {
> 		goto toolow;
> 	}
> 	if (__test_and_set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win))
> -- 
> 2.31.1
> 

--
Chuck Lever
Daniel Kobras Oct. 1, 2021, 4:50 p.m. UTC | #2
Hi Bruce!

Am 01.10.21 um 15:59 schrieb J. Bruce Fields:
> From: "J. Bruce Fields" <bfields@redhat.com>
> 
> If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number
> whenever sd_max is less than GSS_SEQ_WIN, and the comparison:
> 
> 	seq_num <= sd->sd_max - GSS_SEQ_WIN
> 
> in gss_check_seq_num is pretty much always true, even when that's
> clearly not what was intended.
> 
> This was causing pynfs to hang when using krb5, because pynfs uses zero
> as the initial gss sequence number.  That's perfectly legal, but this
> logic error causes knfsd to drop the rpc in that case.  Out-of-order
> sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this.
> 
> Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints")

I wonder about the Fixes tag: That changeset added tracepoints to the
exit path, but the buggy logic seems to have been present since the
pre-git ages. Or am I missing something about 10b9d99a3dbb? (This might
explain some reports of--as you stated elsewhere--"once in a blue moon
my krb5 mounts hang" we've investigated, albeit on kernels that predate
10b9d99a3dbb.)

Kind regards,

Daniel

> Cc: stable@vger.kernel.org
> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
> ---
>  net/sunrpc/auth_gss/svcauth_gss.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index 7dba6a9c213a..b87565b64928 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -645,7 +645,7 @@ static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci,
>  		}
>  		__set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win);
>  		goto ok;
> -	} else if (seq_num <= sd->sd_max - GSS_SEQ_WIN) {
> +	} else if (seq_num + GSS_SEQ_WIN <= sd->sd_max) {
>  		goto toolow;
>  	}
>  	if (__test_and_set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win))
>
J. Bruce Fields Oct. 1, 2021, 5:44 p.m. UTC | #3
On Fri, Oct 01, 2021 at 06:50:12PM +0200, Daniel Kobras wrote:
> Hi Bruce!
> 
> Am 01.10.21 um 15:59 schrieb J. Bruce Fields:
> > From: "J. Bruce Fields" <bfields@redhat.com>
> > 
> > If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number
> > whenever sd_max is less than GSS_SEQ_WIN, and the comparison:
> > 
> > 	seq_num <= sd->sd_max - GSS_SEQ_WIN
> > 
> > in gss_check_seq_num is pretty much always true, even when that's
> > clearly not what was intended.
> > 
> > This was causing pynfs to hang when using krb5, because pynfs uses zero
> > as the initial gss sequence number.  That's perfectly legal, but this
> > logic error causes knfsd to drop the rpc in that case.  Out-of-order
> > sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this.
> > 
> > Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints")
> 
> I wonder about the Fixes tag: That changeset added tracepoints to the
> exit path, but the buggy logic seems to have been present since the
> pre-git ages. Or am I missing something about 10b9d99a3dbb?

The relevant parts of 10b9d99a3dbb were:

 struct gss_svc_seq_data {
        /* highest seq number seen so far: */
-       int                     sd_max;
+       u32                     sd_max;

and

-gss_check_seq_num(struct rsc *rsci, int seq_num)
+static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci,
+                             u32 seq_num)

Together, they mean the comparison

	seq_num <= sd->sd_max - GSS_SEQ_WIN

in the case sd_max is zero, effectively ends up being

	seq_num <= 4294967168

instead of what was intended,

	seq_num <= -128

.

> (This might explain some reports of--as you stated elsewhere--"once in
> a blue moon my krb5 mounts hang" we've investigated, albeit on kernels
> that predate 10b9d99a3dbb.)

Sounds like it was something else, I'm afraid!

--b.
Daniel Kobras Oct. 4, 2021, 6:50 a.m. UTC | #4
Hi Bruce!

Am 01.10.21 um 19:44 schrieb J. Bruce Fields:
> On Fri, Oct 01, 2021 at 06:50:12PM +0200, Daniel Kobras wrote:
>> Am 01.10.21 um 15:59 schrieb J. Bruce Fields:
>>> From: "J. Bruce Fields" <bfields@redhat.com>
>>>
>>> If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number
>>> whenever sd_max is less than GSS_SEQ_WIN, and the comparison:
>>>
>>> 	seq_num <= sd->sd_max - GSS_SEQ_WIN
>>>
>>> in gss_check_seq_num is pretty much always true, even when that's
>>> clearly not what was intended.
>>>
>>> This was causing pynfs to hang when using krb5, because pynfs uses zero
>>> as the initial gss sequence number.  That's perfectly legal, but this
>>> logic error causes knfsd to drop the rpc in that case.  Out-of-order
>>> sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this.
>>>
>>> Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints")
>>
>> I wonder about the Fixes tag: That changeset added tracepoints to the
>> exit path, but the buggy logic seems to have been present since the
>> pre-git ages. Or am I missing something about 10b9d99a3dbb?
> 
> The relevant parts of 10b9d99a3dbb were:
> 
>  struct gss_svc_seq_data {
>         /* highest seq number seen so far: */
> -       int                     sd_max;
> +       u32                     sd_max;
> 
> and
> 
> -gss_check_seq_num(struct rsc *rsci, int seq_num)
> +static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci,
> +                             u32 seq_num)
> 
> Together, they mean the comparison
> 
> 	seq_num <= sd->sd_max - GSS_SEQ_WIN
> 
> in the case sd_max is zero, effectively ends up being
> 
> 	seq_num <= 4294967168
> 
> instead of what was intended,
> 
> 	seq_num <= -128
> 
> .

Sorry, indeed I had missed the change in signedness. Thanks for elaborating!

>> (This might explain some reports of--as you stated elsewhere--"once in
>> a blue moon my krb5 mounts hang" we've investigated, albeit on kernels
>> that predate 10b9d99a3dbb.)
> 
> Sounds like it was something else, I'm afraid!

Darn! ;-)

Kind regards,

Daniel
diff mbox series

Patch

diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 7dba6a9c213a..b87565b64928 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -645,7 +645,7 @@  static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci,
 		}
 		__set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win);
 		goto ok;
-	} else if (seq_num <= sd->sd_max - GSS_SEQ_WIN) {
+	} else if (seq_num + GSS_SEQ_WIN <= sd->sd_max) {
 		goto toolow;
 	}
 	if (__test_and_set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win))