Message ID | 20211001135921.GC959@fieldses.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | SUNRPC: fix sign error causing rpcsec_gss drops | expand |
> On Oct 1, 2021, at 9:59 AM, J. Bruce Fields <bfields@fieldses.org> wrote: > > From: "J. Bruce Fields" <bfields@redhat.com> > > If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number > whenever sd_max is less than GSS_SEQ_WIN, and the comparison: > > seq_num <= sd->sd_max - GSS_SEQ_WIN > > in gss_check_seq_num is pretty much always true, even when that's > clearly not what was intended. > > This was causing pynfs to hang when using krb5, because pynfs uses zero > as the initial gss sequence number. That's perfectly legal, but this > logic error causes knfsd to drop the rpc in that case. Out-of-order > sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this. > > Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints") > Cc: stable@vger.kernel.org > Signed-off-by: J. Bruce Fields <bfields@redhat.com> This will be included in the next NFSD 5.15-rc. Thanks! See the for-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git > --- > net/sunrpc/auth_gss/svcauth_gss.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c > index 7dba6a9c213a..b87565b64928 100644 > --- a/net/sunrpc/auth_gss/svcauth_gss.c > +++ b/net/sunrpc/auth_gss/svcauth_gss.c > @@ -645,7 +645,7 @@ static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci, > } > __set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win); > goto ok; > - } else if (seq_num <= sd->sd_max - GSS_SEQ_WIN) { > + } else if (seq_num + GSS_SEQ_WIN <= sd->sd_max) { > goto toolow; > } > if (__test_and_set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win)) > -- > 2.31.1 > -- Chuck Lever
Hi Bruce! Am 01.10.21 um 15:59 schrieb J. Bruce Fields: > From: "J. Bruce Fields" <bfields@redhat.com> > > If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number > whenever sd_max is less than GSS_SEQ_WIN, and the comparison: > > seq_num <= sd->sd_max - GSS_SEQ_WIN > > in gss_check_seq_num is pretty much always true, even when that's > clearly not what was intended. > > This was causing pynfs to hang when using krb5, because pynfs uses zero > as the initial gss sequence number. That's perfectly legal, but this > logic error causes knfsd to drop the rpc in that case. Out-of-order > sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this. > > Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints") I wonder about the Fixes tag: That changeset added tracepoints to the exit path, but the buggy logic seems to have been present since the pre-git ages. Or am I missing something about 10b9d99a3dbb? (This might explain some reports of--as you stated elsewhere--"once in a blue moon my krb5 mounts hang" we've investigated, albeit on kernels that predate 10b9d99a3dbb.) Kind regards, Daniel > Cc: stable@vger.kernel.org > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > --- > net/sunrpc/auth_gss/svcauth_gss.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c > index 7dba6a9c213a..b87565b64928 100644 > --- a/net/sunrpc/auth_gss/svcauth_gss.c > +++ b/net/sunrpc/auth_gss/svcauth_gss.c > @@ -645,7 +645,7 @@ static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci, > } > __set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win); > goto ok; > - } else if (seq_num <= sd->sd_max - GSS_SEQ_WIN) { > + } else if (seq_num + GSS_SEQ_WIN <= sd->sd_max) { > goto toolow; > } > if (__test_and_set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win)) >
On Fri, Oct 01, 2021 at 06:50:12PM +0200, Daniel Kobras wrote: > Hi Bruce! > > Am 01.10.21 um 15:59 schrieb J. Bruce Fields: > > From: "J. Bruce Fields" <bfields@redhat.com> > > > > If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number > > whenever sd_max is less than GSS_SEQ_WIN, and the comparison: > > > > seq_num <= sd->sd_max - GSS_SEQ_WIN > > > > in gss_check_seq_num is pretty much always true, even when that's > > clearly not what was intended. > > > > This was causing pynfs to hang when using krb5, because pynfs uses zero > > as the initial gss sequence number. That's perfectly legal, but this > > logic error causes knfsd to drop the rpc in that case. Out-of-order > > sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this. > > > > Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints") > > I wonder about the Fixes tag: That changeset added tracepoints to the > exit path, but the buggy logic seems to have been present since the > pre-git ages. Or am I missing something about 10b9d99a3dbb? The relevant parts of 10b9d99a3dbb were: struct gss_svc_seq_data { /* highest seq number seen so far: */ - int sd_max; + u32 sd_max; and -gss_check_seq_num(struct rsc *rsci, int seq_num) +static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci, + u32 seq_num) Together, they mean the comparison seq_num <= sd->sd_max - GSS_SEQ_WIN in the case sd_max is zero, effectively ends up being seq_num <= 4294967168 instead of what was intended, seq_num <= -128 . > (This might explain some reports of--as you stated elsewhere--"once in > a blue moon my krb5 mounts hang" we've investigated, albeit on kernels > that predate 10b9d99a3dbb.) Sounds like it was something else, I'm afraid! --b.
Hi Bruce! Am 01.10.21 um 19:44 schrieb J. Bruce Fields: > On Fri, Oct 01, 2021 at 06:50:12PM +0200, Daniel Kobras wrote: >> Am 01.10.21 um 15:59 schrieb J. Bruce Fields: >>> From: "J. Bruce Fields" <bfields@redhat.com> >>> >>> If sd_max is unsigned, then sd_max - GSS_SEQ_WIN is a very large number >>> whenever sd_max is less than GSS_SEQ_WIN, and the comparison: >>> >>> seq_num <= sd->sd_max - GSS_SEQ_WIN >>> >>> in gss_check_seq_num is pretty much always true, even when that's >>> clearly not what was intended. >>> >>> This was causing pynfs to hang when using krb5, because pynfs uses zero >>> as the initial gss sequence number. That's perfectly legal, but this >>> logic error causes knfsd to drop the rpc in that case. Out-of-order >>> sequence IDs in the first GSS_SEQ_WIN (128) calls will also cause this. >>> >>> Fixes: 10b9d99a3dbb ("SUNRPC: Augment server-side rpcgss tracepoints") >> >> I wonder about the Fixes tag: That changeset added tracepoints to the >> exit path, but the buggy logic seems to have been present since the >> pre-git ages. Or am I missing something about 10b9d99a3dbb? > > The relevant parts of 10b9d99a3dbb were: > > struct gss_svc_seq_data { > /* highest seq number seen so far: */ > - int sd_max; > + u32 sd_max; > > and > > -gss_check_seq_num(struct rsc *rsci, int seq_num) > +static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci, > + u32 seq_num) > > Together, they mean the comparison > > seq_num <= sd->sd_max - GSS_SEQ_WIN > > in the case sd_max is zero, effectively ends up being > > seq_num <= 4294967168 > > instead of what was intended, > > seq_num <= -128 > > . Sorry, indeed I had missed the change in signedness. Thanks for elaborating! >> (This might explain some reports of--as you stated elsewhere--"once in >> a blue moon my krb5 mounts hang" we've investigated, albeit on kernels >> that predate 10b9d99a3dbb.) > > Sounds like it was something else, I'm afraid! Darn! ;-) Kind regards, Daniel
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c index 7dba6a9c213a..b87565b64928 100644 --- a/net/sunrpc/auth_gss/svcauth_gss.c +++ b/net/sunrpc/auth_gss/svcauth_gss.c @@ -645,7 +645,7 @@ static bool gss_check_seq_num(const struct svc_rqst *rqstp, struct rsc *rsci, } __set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win); goto ok; - } else if (seq_num <= sd->sd_max - GSS_SEQ_WIN) { + } else if (seq_num + GSS_SEQ_WIN <= sd->sd_max) { goto toolow; } if (__test_and_set_bit(seq_num % GSS_SEQ_WIN, sd->sd_win))