diff mbox series

[RFC] fix krb5p mount not providing large enough buffer in rq_rcvsize

Message ID CAN-5tyHegg96s7mr1YeoPbVd0UA7_cd2GEPYNWx98uUcx-0ARw@mail.gmail.com (mailing list archive)
State New, archived
Headers show
Series [RFC] fix krb5p mount not providing large enough buffer in rq_rcvsize | expand

Commit Message

Olga Kornievskaia March 10, 2020, 6:58 p.m. UTC
Ever since commit 2c94b8eca1a26 "SUNRPC: Use au_rslack when computing
reply buffer size". It changed how "req->rq_rcvsize" is calculated. It
used to use au_cslack value which was nice and large and changed it to
au_rslack value which turns out to be too small.

Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
because client's receive buffer it too small.

For GSS, au_rslack is calculated from GSS_VERF_SLACK value which is
currently 100. And it's not enough. Changing it to 104 works and then
au_rslack is recalculated based on actual received mic.len and not
just the default buffer size.

I would like to propose to change it to something a little larger than
104, like 120 to give room if some other server might reply with
something even larger.

Thoughts? Will send an actual patch if no objections to this one.

Comments

Chuck Lever March 10, 2020, 7:57 p.m. UTC | #1
Hi Olga-

> On Mar 10, 2020, at 2:58 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> Ever since commit 2c94b8eca1a26 "SUNRPC: Use au_rslack when computing
> reply buffer size". It changed how "req->rq_rcvsize" is calculated. It
> used to use au_cslack value which was nice and large and changed it to
> au_rslack value which turns out to be too small.
> 
> Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
> because client's receive buffer it too small.

Can you be more specific? For instance, why is 100 bytes adequate for
Linux servers, but not OnTAP?

Is this explanation for the current value not correct?

  51 /* length of a krb5 verifier (48), plus data added before arguments when
  52  * using integrity (two 4-byte integers): */


> For GSS, au_rslack is calculated from GSS_VERF_SLACK value which is
> currently 100. And it's not enough. Changing it to 104 works and then
> au_rslack is recalculated based on actual received mic.len and not
> just the default buffer size.
> 
> I would like to propose to change it to something a little larger than
> 104, like 120 to give room if some other server might reply with
> something even larger.

Why does it need to be larger than 104?


> Thoughts? Will send an actual patch if no objections to this one.
> 
> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> index 24ca861..44ae6bc 100644
> --- a/net/sunrpc/auth_gss/auth_gss.c
> +++ b/net/sunrpc/auth_gss/auth_gss.c
> @@ -50,7 +50,7 @@
> #define GSS_CRED_SLACK         (RPC_MAX_AUTH_SIZE * 2)
> /* length of a krb5 verifier (48), plus data added before arguments when
>  * using integrity (two 4-byte integers): */
> -#define GSS_VERF_SLACK         100
> +#define GSS_VERF_SLACK         120
> 
> static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
> static DEFINE_SPINLOCK(gss_auth_hash_lock);

--
Chuck Lever
Olga Kornievskaia March 10, 2020, 9:07 p.m. UTC | #2
Hi Chuck,

On Tue, Mar 10, 2020 at 3:57 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> Hi Olga-
>
> > On Mar 10, 2020, at 2:58 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > Ever since commit 2c94b8eca1a26 "SUNRPC: Use au_rslack when computing
> > reply buffer size". It changed how "req->rq_rcvsize" is calculated. It
> > used to use au_cslack value which was nice and large and changed it to
> > au_rslack value which turns out to be too small.
> >
> > Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
> > because client's receive buffer it too small.
>
> Can you be more specific? For instance, why is 100 bytes adequate for
> Linux servers, but not OnTAP?

I don't know why Ontap sends more data than Linux server. The
opaque_len is just a lot larger. For the first message Linux
opaque_len is 120bytes and Ontap it's 206. So it could be for instance
for FSINFO that sends the file handle, for Netapp the file handle is
44bytes and for Linux it's only 28bytes.

> Is this explanation for the current value not correct?
>
>   51 /* length of a krb5 verifier (48), plus data added before arguments when
>   52  * using integrity (two 4-byte integers): */

I'm not sure what it is suppose to be. Isn't "data before arguments"
can vary in length and thus explain why linux and onto sizes are
different?
Looking at the network trace the krb5 verifier I see is 36bytes.

> > For GSS, au_rslack is calculated from GSS_VERF_SLACK value which is
> > currently 100. And it's not enough. Changing it to 104 works and then
> > au_rslack is recalculated based on actual received mic.len and not
> > just the default buffer size.
> >
> > I would like to propose to change it to something a little larger than
> > 104, like 120 to give room if some other server might reply with
> > something even larger.
>
> Why does it need to be larger than 104?

I don't know why 100 was chosen and given that I think arguments are
taken into the account and arguments can change. I think NetApp has
changed their file handle sizes (at some point, not in the near past
but i think so?). Perhaps they might want to do that again so the size
will change again.

Honestly, I would have like for 100 to be 200 to be safe.

>
>
> > Thoughts? Will send an actual patch if no objections to this one.
> >
> > diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> > index 24ca861..44ae6bc 100644
> > --- a/net/sunrpc/auth_gss/auth_gss.c
> > +++ b/net/sunrpc/auth_gss/auth_gss.c
> > @@ -50,7 +50,7 @@
> > #define GSS_CRED_SLACK         (RPC_MAX_AUTH_SIZE * 2)
> > /* length of a krb5 verifier (48), plus data added before arguments when
> >  * using integrity (two 4-byte integers): */
> > -#define GSS_VERF_SLACK         100
> > +#define GSS_VERF_SLACK         120
> >
> > static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
> > static DEFINE_SPINLOCK(gss_auth_hash_lock);
>
> --
> Chuck Lever
>
>
>
Chuck Lever March 10, 2020, 11:56 p.m. UTC | #3
> On Mar 10, 2020, at 5:07 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> Hi Chuck,
> 
> On Tue, Mar 10, 2020 at 3:57 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> Hi Olga-
>> 
>>> On Mar 10, 2020, at 2:58 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> Ever since commit 2c94b8eca1a26 "SUNRPC: Use au_rslack when computing
>>> reply buffer size". It changed how "req->rq_rcvsize" is calculated. It
>>> used to use au_cslack value which was nice and large and changed it to
>>> au_rslack value which turns out to be too small.
>>> 
>>> Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
>>> because client's receive buffer it too small.
>> 
>> Can you be more specific? For instance, why is 100 bytes adequate for
>> Linux servers, but not OnTAP?
> 
> I don't know why Ontap sends more data than Linux server.

Let's be sure we are fixing the right problem. Yes, au_rslack is
smaller in v5.1, and that results in a behavioral regression. But
exactly which part of the new calculation is incorrect is not yet
clear. Simply bumping GSS_VERF_SLACK could very well plaster over
the real problem.


> The opaque_len is just a lot larger. For the first message Linux
> opaque_len is 120bytes and Ontap it's 206. So it could be for instance
> for FSINFO that sends the file handle, for Netapp the file handle is
> 44bytes and for Linux it's only 28bytes.

The maximum filehandle size should already be accounted for in the
maxsize macro for FSINFO.

Is this problem evident only with NFSv3 plus krb5p?


>> Is this explanation for the current value not correct?
>> 
>>  51 /* length of a krb5 verifier (48), plus data added before arguments when
>>  52  * using integrity (two 4-byte integers): */
> 
> I'm not sure what it is suppose to be. Isn't "data before arguments"
> can vary in length and thus explain why linux and onto sizes are
> different?
> Looking at the network trace the krb5 verifier I see is 36bytes.

GSS_VERF_SLACK is only for the extra length added by GSS data. The
length of the RPC message itself is handled separately, see above.

Can you post a Wireshark dissection of the problematic FSINFO reply?
(Having a working reply from Linux and a failing reply from OnTAP
would be even better).


>>> For GSS, au_rslack is calculated from GSS_VERF_SLACK value which is
>>> currently 100. And it's not enough. Changing it to 104 works and then
>>> au_rslack is recalculated based on actual received mic.len and not
>>> just the default buffer size.

What are the computed au_ralign and au_rslack values after the first
successful operation?


>>> I would like to propose to change it to something a little larger than
>>> 104, like 120 to give room if some other server might reply with
>>> something even larger.
>> 
>> Why does it need to be larger than 104?
> 
> I don't know why 100 was chosen and given that I think arguments are
> taken into the account and arguments can change. I think NetApp has
> changed their file handle sizes (at some point, not in the near past
> but i think so?). Perhaps they might want to do that again so the size
> will change again.
> 
> Honestly, I would have like for 100 to be 200 to be safe.

To be safe, I would like to have a good understanding of the details,
rather than guessing at an arbitrary maximum value. Let's choose a
rational maximum and include a descriptive comment about why that value
is the best choice.


>>> Thoughts? Will send an actual patch if no objections to this one.
>>> 
>>> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
>>> index 24ca861..44ae6bc 100644
>>> --- a/net/sunrpc/auth_gss/auth_gss.c
>>> +++ b/net/sunrpc/auth_gss/auth_gss.c
>>> @@ -50,7 +50,7 @@
>>> #define GSS_CRED_SLACK         (RPC_MAX_AUTH_SIZE * 2)
>>> /* length of a krb5 verifier (48), plus data added before arguments when
>>> * using integrity (two 4-byte integers): */
>>> -#define GSS_VERF_SLACK         100
>>> +#define GSS_VERF_SLACK         120
>>> 
>>> static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
>>> static DEFINE_SPINLOCK(gss_auth_hash_lock);
>> 
>> --
>> Chuck Lever

--
Chuck Lever
Chuck Lever March 11, 2020, 2:57 p.m. UTC | #4
> On Mar 10, 2020, at 7:56 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Mar 10, 2020, at 5:07 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>> 
>> Hi Chuck,
>> 
>> On Tue, Mar 10, 2020 at 3:57 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>> 
>>> Hi Olga-
>>> 
>>>> On Mar 10, 2020, at 2:58 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>> 
>>>> Ever since commit 2c94b8eca1a26 "SUNRPC: Use au_rslack when computing
>>>> reply buffer size". It changed how "req->rq_rcvsize" is calculated. It
>>>> used to use au_cslack value which was nice and large and changed it to
>>>> au_rslack value which turns out to be too small.
>>>> 
>>>> Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
>>>> because client's receive buffer it too small.
>>> 
>>> Can you be more specific? For instance, why is 100 bytes adequate for
>>> Linux servers, but not OnTAP?
>> 
>> I don't know why Ontap sends more data than Linux server.
> 
> Let's be sure we are fixing the right problem. Yes, au_rslack is
> smaller in v5.1, and that results in a behavioral regression. But
> exactly which part of the new calculation is incorrect is not yet
> clear. Simply bumping GSS_VERF_SLACK could very well plaster over
> the real problem.
> 
> 
>> The opaque_len is just a lot larger. For the first message Linux
>> opaque_len is 120bytes and Ontap it's 206. So it could be for instance
>> for FSINFO that sends the file handle, for Netapp the file handle is
>> 44bytes and for Linux it's only 28bytes.
> 
> The maximum filehandle size should already be accounted for in the
> maxsize macro for FSINFO.
> 
> Is this problem evident only with NFSv3 plus krb5p?
> 
> 
>>> Is this explanation for the current value not correct?
>>> 
>>> 51 /* length of a krb5 verifier (48), plus data added before arguments when
>>> 52  * using integrity (two 4-byte integers): */
>> 
>> I'm not sure what it is suppose to be. Isn't "data before arguments"
>> can vary in length and thus explain why linux and onto sizes are
>> different?
>> Looking at the network trace the krb5 verifier I see is 36bytes.
> 
> GSS_VERF_SLACK is only for the extra length added by GSS data. The
> length of the RPC message itself is handled separately, see above.
> 
> Can you post a Wireshark dissection of the problematic FSINFO reply?
> (Having a working reply from Linux and a failing reply from OnTAP
> would be even better).
> 
> 
>>>> For GSS, au_rslack is calculated from GSS_VERF_SLACK value which is
>>>> currently 100. And it's not enough. Changing it to 104 works and then
>>>> au_rslack is recalculated based on actual received mic.len and not
>>>> just the default buffer size.
> 
> What are the computed au_ralign and au_rslack values after the first
> successful operation?
> 
> 
>>>> I would like to propose to change it to something a little larger than
>>>> 104, like 120 to give room if some other server might reply with
>>>> something even larger.
>>> 
>>> Why does it need to be larger than 104?
>> 
>> I don't know why 100 was chosen and given that I think arguments are
>> taken into the account and arguments can change. I think NetApp has
>> changed their file handle sizes (at some point, not in the near past
>> but i think so?). Perhaps they might want to do that again so the size
>> will change again.
>> 
>> Honestly, I would have like for 100 to be 200 to be safe.
> 
> To be safe, I would like to have a good understanding of the details,
> rather than guessing at an arbitrary maximum value. Let's choose a
> rational maximum and include a descriptive comment about why that value
> is the best choice.

As author of 2c94b8eca1a26 I'm interested in helping resolve this
issue. I've audited this code again, and reviewed the git log.

Interestingly, this commit:

commit adeb8133dd57f380e70a389a89a2ea3ae227f9e2
Author:     Olga Kornievskaia <aglo@citi.umich.edu>
AuthorDate: Mon Dec 4 20:22:34 2006 -0500
Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
CommitDate: Wed Dec 6 10:46:44 2006 -0500

    rpc: spkm3 update

changed GSS_VERF_SLACK from 56 to 100 without changing the documenting
comment or explaining the increase. That is the only change to
GSS_VERF_SLACK since 2006.

Also, au_rslack has always been set to GSS_VERF_SLACK. But I think
after 2c94b8eca1a26 ("SUNRPC: Use au_rslack when computing reply
buffer size"), GSS_VERF_SLACK is not the right symbolic constant to
use as an initial value of au_rslack.

Before that commit, rslack was not used to compute the receive buffer
slack, so the initial value was probably not interesting.

Since that commit, rslack is meant to be the size of GSS information
that _trails_ the RPC message payload. And ralign is intended to be
the size of the GSS information that _precedes_ that payload.


That doesn't address the problem of how to size the trailing GSS
information. I consulted RFC 2203 and 5403 hoping to find some
protocol-defined maximum for the size of the trailing GSS information
in integrity- and privacy-wrapped messages. Browsing through these
did not reveal any new wisdom (though I admit that I could have
misread these documents).

RFC 2203 Section 5.3.2.2 contains the structural definition of an
integrity-wrapped message:

      struct rpc_gss_integ_data {
          opaque databody_integ<>;
          opaque checksum<>;
      };

   The databody_integ field is created as follows.  A structure
   consisting of a sequence number followed by the procedure arguments
   is constructed. This is shown below as the type rpc_gss_data_t:

      struct rpc_gss_data_t {
          unsigned int seq_num;
          proc_req_arg_t arg;
      };

Note the use of empty angle brackets. These are variable-length opaques
with no pre-defined maximum size.

RFC 2203 Section 5.3.2.3 explains the construction of privacy-wrapped
messages:

   When data privacy is used, the request data is represented as
   follows:

      struct rpc_gss_priv_data {
          opaque databody_priv<>
      };

   The databody_priv field is created as follows.  The rpc_gss_data_t
   structure described earlier is constructed again in the same way as
   for the case of data integrity.  Next, the GSS_Wrap() call is invoked
   to encrypt the octet stream corresponding to the rpc_gss_data_t
   structure, using the same value for QOP (argument qop_req to
   GSS_Wrap()) as was used for the header checksum (in the verifier) and
   conf_req_flag (an argument to GSS_Wrap()) of TRUE.  The GSS_Wrap()
   call returns an opaque octet stream (representing the encrypted
   rpc_gss_data_t structure) and its length, and this is encoded as the
   databody_priv field. Since databody_priv has an XDR type of opaque,
   the length returned by GSS_Wrap() is encoded as the four octet
   length, followed by the encrypted octet stream (padded to a multiple
   of four octets).

And unfortunately this text is just as vague about the maximum size of
such messages. So:

x  The trailing information is not part of the RPC header verifier field,
so the use of GSS_VERF_SLACK is an arbitrary choice and not explanatory
as the initial setting for au_rslack. Pretty confusing, actually. I say
there should be a new symbolic constant defined for this value; maybe two:
one for integrity and one for privacy; but one large enough for either
is sufficient.

x  Also, there is no maximum size for these structures specified by the
protocol. Based on the RFCs, there is no way for the client to estimate
the initial reply size, and there is no way to sanity check the length
of the trailing GSS information in received GSS-wrapped messages, which
seems like a potential attack vector?

Thus GSS_VERF_SLACK is:
- a good symbolic initial value for au_ralign
- probably should be increased to RPC_MAX_AUTH_SIZE because that is the
largest size for that field according to RFC 5531 Section 8.2.
- probably should not be used as the initial value for au_rslack

Question is still: what maximum value is big enough to guarantee
interoperability? I've looked at libtirpc, no love there. Asking around
internally now.

I'm still interested in seeing what the Wireshark dissector says about
the failing OnTAP FSINFO response.


>>>> Thoughts? Will send an actual patch if no objections to this one.
>>>> 
>>>> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
>>>> index 24ca861..44ae6bc 100644
>>>> --- a/net/sunrpc/auth_gss/auth_gss.c
>>>> +++ b/net/sunrpc/auth_gss/auth_gss.c
>>>> @@ -50,7 +50,7 @@
>>>> #define GSS_CRED_SLACK         (RPC_MAX_AUTH_SIZE * 2)
>>>> /* length of a krb5 verifier (48), plus data added before arguments when
>>>> * using integrity (two 4-byte integers): */
>>>> -#define GSS_VERF_SLACK         100
>>>> +#define GSS_VERF_SLACK         120
>>>> 
>>>> static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
>>>> static DEFINE_SPINLOCK(gss_auth_hash_lock);
>>> 
>>> --
>>> Chuck Lever
> 
> --
> Chuck Lever

--
Chuck Lever
Chuck Lever March 12, 2020, 8:10 p.m. UTC | #5
> On Mar 12, 2020, at 3:17 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Tue, Mar 10, 2020 at 7:56 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> 
>> 
>>> On Mar 10, 2020, at 5:07 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> Hi Chuck,
>>> 
>>> On Tue, Mar 10, 2020 at 3:57 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>> 
>>>> Hi Olga-
>>>> 
>>>>> On Mar 10, 2020, at 2:58 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> 
>>>>> Ever since commit 2c94b8eca1a26 "SUNRPC: Use au_rslack when computing
>>>>> reply buffer size". It changed how "req->rq_rcvsize" is calculated. It
>>>>> used to use au_cslack value which was nice and large and changed it to
>>>>> au_rslack value which turns out to be too small.
>>>>> 
>>>>> Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
>>>>> because client's receive buffer it too small.
>>>> 
>>>> Can you be more specific? For instance, why is 100 bytes adequate for
>>>> Linux servers, but not OnTAP?
>>> 
>>> I don't know why Ontap sends more data than Linux server.
>> 
>> Let's be sure we are fixing the right problem. Yes, au_rslack is
>> smaller in v5.1, and that results in a behavioral regression. But
>> exactly which part of the new calculation is incorrect is not yet
>> clear. Simply bumping GSS_VERF_SLACK could very well plaster over
>> the real problem.
>> 
>> 
>>> The opaque_len is just a lot larger. For the first message Linux
>>> opaque_len is 120bytes and Ontap it's 206. So it could be for instance
>>> for FSINFO that sends the file handle, for Netapp the file handle is
>>> 44bytes and for Linux it's only 28bytes.
>> 
>> The maximum filehandle size should already be accounted for in the
>> maxsize macro for FSINFO.
>> 
>> Is this problem evident only with NFSv3 plus krb5p?
> 
> So far that seems to be the case. Every other version and security flavor works.
> 
>>>> Is this explanation for the current value not correct?
>>>> 
>>>> 51 /* length of a krb5 verifier (48), plus data added before arguments when
>>>> 52  * using integrity (two 4-byte integers): */
>>> 
>>> I'm not sure what it is suppose to be. Isn't "data before arguments"
>>> can vary in length and thus explain why linux and onto sizes are
>>> different?
>>> Looking at the network trace the krb5 verifier I see is 36bytes.
>> 
>> GSS_VERF_SLACK is only for the extra length added by GSS data. The
>> length of the RPC message itself is handled separately, see above.
>> 
>> Can you post a Wireshark dissection of the problematic FSINFO reply?
>> (Having a working reply from Linux and a failing reply from OnTAP
>> would be even better).
> 
> I'm attaching two files. I mount against linux and mount against ontap.
> 
> 
> 
> 
>>>>> For GSS, au_rslack is calculated from GSS_VERF_SLACK value which is
>>>>> currently 100. And it's not enough. Changing it to 104 works and then
>>>>> au_rslack is recalculated based on actual received mic.len and not
>>>>> just the default buffer size.
>> 
>> What are the computed au_ralign and au_rslack values after the first
>> successful operation?
> 
> With GSS_VERF_SLACK 100
> Linux run:
> 
> Mar 12 13:14:29 localhost kernel: AGLO: gss_create_new setting for
> auth=00000000e14fdc39 cslack=200 and rslack=25
> Mar 12 13:14:29 localhost kernel: AGLO: gss_create_new setting for
> auth=00000000e14fdc39 ralign=25
> Mar 12 13:14:29 localhost kernel: NFS call  fsinfo
> ... <gssd upcall>
> Mar 12 13:14:29 localhost kernel: AGLO: call_allocate
> auth=00000000e14fdc39 au_cslack=200 au_rslack=25 rq_rcvsize=256
> p_replen=35
> Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
> len=176 is ok offset=56 opaque=120
> Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> auth=00000000e14fdc39 resetting au_rslack=26
> Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> auth=00000000e14fdc39 resetting au_ralign=26
> Mar 12 13:14:29 localhost kernel: NFS reply fsinfo: 0
> 
> Ontap run:
> Mar 12 13:16:46 localhost kernel: AGLO: gss_create_new setting for
> auth=00000000e02d9e6e cslack=200 and rslack=25
> Mar 12 13:16:46 localhost kernel: AGLO: gss_create_new setting for
> auth=00000000e02d9e6e ralign=25
> Mar 12 13:16:46 localhost kernel: NFS call  fsinfo
> ... <gssd upcall>
> Mar 12 13:16:46 localhost kernel: AGLO: call_allocate
> auth=00000000e02d9e6e au_cslack=200 au_rslack=25 rq_rcvsize=256
> p_replen=35
> Mar 12 13:16:46 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
> len=256 too small offset=56 opaque=204
> Mar 12 13:16:46 localhost kernel: NFS reply fsinfo: -5
> 
> With GSS_VERF_SLACK 104
> Mar 12 13:33:23 localhost kernel: AGLO: gss_create_new setting for
> auth=000000004a545ea2 cslack=200 and rslack=26
> Mar 12 13:33:23 localhost kernel: AGLO: gss_create_new setting for
> auth=000000004a545ea2 ralign=26
> Mar 12 13:33:23 localhost kernel: NFS call  fsinfo
> ... <gssd upcall>
> Mar 12 13:33:23 localhost kernel: AGLO: call_allocate
> auth=000000004a545ea2 au_cslack=200 au_rslack=26 rq_rcvsize=260
> p_replen=35
> Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
> len=260 is ok offset=56 opaque=204
> Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> auth=000000004a545ea2 resetting au_rslack=26
> Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> auth=000000004a545ea2 resetting au_ralign=26
> Mar 12 13:33:23 localhost kernel: NFS reply fsinfo: 0
> 
> difference in actual packets in fsinfo is that ontap sends postattrs
> so that's 84bytes.
> 
>        req->rq_rcvsize = RPC_REPHDRSIZE + auth->au_rslack + \
>                        max_t(size_t, proc->p_replen, 2);
> 
> RPC_REPHDRSIZE is defined to be 4 (*4)  (it says it doesn't include
> the verifier ???)

> rslack needs to cover kerberos blob 25 (*4)  (but that's the kerberos
> part a part of the wrap and not the verifier)
> p_replen to cover fs_info args 35 (*4) (seems like the right number)
> 
> So we are missing the GSS to include the verifier and the kerberos
> blob of the wrapper (and lengths!!). Basically we need GSS_VERF_SLACK
> to cover 2 kerberos blobs (or more specifically KRB_TOKEN_CFX_GetMic
> 9*4 and KRB_TOKEN_CFS_WRAP 15*4 + 2 lengths before the kerberos blobs
> = 104 and we are only giving 100).

GSS_VERF_SLACK is also used for setting au_verfsize, so please don't
change its value. Define a new constant for initializing au_rslack.

Let's construct that constant using the KRB5_TOKEN constants you mention
here... include/linux/sunrpc/gss_krb5.h has

221 /*
222  * This compile-time check verifies that we will not exceed the
223  * slack space allotted by the client and server auth_gss code
224  * before they call gss_wrap().
225  */
226 #define GSS_KRB5_MAX_SLACK_NEEDED \
227         (GSS_KRB5_TOK_HDR_LEN     /* gss token header */         \
228         + GSS_KRB5_MAX_CKSUM_LEN  /* gss token checksum */       \
229         + GSS_KRB5_MAX_BLOCKSIZE  /* confounder */               \
230         + GSS_KRB5_MAX_BLOCKSIZE  /* possible padding */         \
231         + GSS_KRB5_TOK_HDR_LEN    /* encrypted hdr in v2 token */\
232         + GSS_KRB5_MAX_CKSUM_LEN  /* encryption hmac */          \
233         + 4 + 4                   /* RPC verifier */             \
234         + GSS_KRB5_TOK_HDR_LEN                                   \
235         + GSS_KRB5_MAX_CKSUM_LEN)

So this, or something like this, plus the comment below.


> The reason things work against linux is because it has a nice buffer
> of 84bytes of post attributes that it doesn't send.

Missing post-attributes makes sense. Thank you for the analysis.


> To address your later point that kerberos blob is encryption type
> depended and that once some other encryption is added to gss-kerberos
> that's larger than existing checksum then this value would need to be
> adjusted again.

> If you agree with my reasoning for the number then I'd like to send
> out a patch now.

The current numbers are based on the kernel GSS implementation supporting
only Kerberos with a narrow set of enctypes. That needs to be made clear
in a documenting comment.

The reason this has been bothersome is because the existing setting is
a magic number (100), and its documenting comment has been stale since
2006. Any proposed fix has to address the missing documentation.


>>>>> I would like to propose to change it to something a little larger than
>>>>> 104, like 120 to give room if some other server might reply with
>>>>> something even larger.
>>>> 
>>>> Why does it need to be larger than 104?
>>> 
>>> I don't know why 100 was chosen and given that I think arguments are
>>> taken into the account and arguments can change. I think NetApp has
>>> changed their file handle sizes (at some point, not in the near past
>>> but i think so?). Perhaps they might want to do that again so the size
>>> will change again.
>>> 
>>> Honestly, I would have like for 100 to be 200 to be safe.
>> 
>> To be safe, I would like to have a good understanding of the details,
>> rather than guessing at an arbitrary maximum value. Let's choose a
>> rational maximum and include a descriptive comment about why that value
>> is the best choice.
>> 
>> 
>>>>> Thoughts? Will send an actual patch if no objections to this one.
>>>>> 
>>>>> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
>>>>> index 24ca861..44ae6bc 100644
>>>>> --- a/net/sunrpc/auth_gss/auth_gss.c
>>>>> +++ b/net/sunrpc/auth_gss/auth_gss.c
>>>>> @@ -50,7 +50,7 @@
>>>>> #define GSS_CRED_SLACK         (RPC_MAX_AUTH_SIZE * 2)
>>>>> /* length of a krb5 verifier (48), plus data added before arguments when
>>>>> * using integrity (two 4-byte integers): */
>>>>> -#define GSS_VERF_SLACK         100
>>>>> +#define GSS_VERF_SLACK         120
>>>>> 
>>>>> static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
>>>>> static DEFINE_SPINLOCK(gss_auth_hash_lock);
>>>> 
>>>> --
>>>> Chuck Lever
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> <linux-v3-krb5p-mount.pcap.gz><ontap-v3-krb5-mount.pcap.gz>

--
Chuck Lever
Olga Kornievskaia March 20, 2020, 7:28 p.m. UTC | #6
On Thu, Mar 12, 2020 at 4:10 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
>
>
> > On Mar 12, 2020, at 3:17 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > On Tue, Mar 10, 2020 at 7:56 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >>
> >>
> >>> On Mar 10, 2020, at 5:07 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>
> >>> Hi Chuck,
> >>>
> >>> On Tue, Mar 10, 2020 at 3:57 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>>
> >>>> Hi Olga-
> >>>>
> >>>>> On Mar 10, 2020, at 2:58 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> >>>>>
> >>>>> Ever since commit 2c94b8eca1a26 "SUNRPC: Use au_rslack when computing
> >>>>> reply buffer size". It changed how "req->rq_rcvsize" is calculated. It
> >>>>> used to use au_cslack value which was nice and large and changed it to
> >>>>> au_rslack value which turns out to be too small.
> >>>>>
> >>>>> Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
> >>>>> because client's receive buffer it too small.
> >>>>
> >>>> Can you be more specific? For instance, why is 100 bytes adequate for
> >>>> Linux servers, but not OnTAP?
> >>>
> >>> I don't know why Ontap sends more data than Linux server.
> >>
> >> Let's be sure we are fixing the right problem. Yes, au_rslack is
> >> smaller in v5.1, and that results in a behavioral regression. But
> >> exactly which part of the new calculation is incorrect is not yet
> >> clear. Simply bumping GSS_VERF_SLACK could very well plaster over
> >> the real problem.
> >>
> >>
> >>> The opaque_len is just a lot larger. For the first message Linux
> >>> opaque_len is 120bytes and Ontap it's 206. So it could be for instance
> >>> for FSINFO that sends the file handle, for Netapp the file handle is
> >>> 44bytes and for Linux it's only 28bytes.
> >>
> >> The maximum filehandle size should already be accounted for in the
> >> maxsize macro for FSINFO.
> >>
> >> Is this problem evident only with NFSv3 plus krb5p?
> >
> > So far that seems to be the case. Every other version and security flavor works.
> >
> >>>> Is this explanation for the current value not correct?
> >>>>
> >>>> 51 /* length of a krb5 verifier (48), plus data added before arguments when
> >>>> 52  * using integrity (two 4-byte integers): */
> >>>
> >>> I'm not sure what it is suppose to be. Isn't "data before arguments"
> >>> can vary in length and thus explain why linux and onto sizes are
> >>> different?
> >>> Looking at the network trace the krb5 verifier I see is 36bytes.
> >>
> >> GSS_VERF_SLACK is only for the extra length added by GSS data. The
> >> length of the RPC message itself is handled separately, see above.
> >>
> >> Can you post a Wireshark dissection of the problematic FSINFO reply?
> >> (Having a working reply from Linux and a failing reply from OnTAP
> >> would be even better).
> >
> > I'm attaching two files. I mount against linux and mount against ontap.
> >
> >
> >
> >
> >>>>> For GSS, au_rslack is calculated from GSS_VERF_SLACK value which is
> >>>>> currently 100. And it's not enough. Changing it to 104 works and then
> >>>>> au_rslack is recalculated based on actual received mic.len and not
> >>>>> just the default buffer size.
> >>
> >> What are the computed au_ralign and au_rslack values after the first
> >> successful operation?
> >
> > With GSS_VERF_SLACK 100
> > Linux run:
> >
> > Mar 12 13:14:29 localhost kernel: AGLO: gss_create_new setting for
> > auth=00000000e14fdc39 cslack=200 and rslack=25
> > Mar 12 13:14:29 localhost kernel: AGLO: gss_create_new setting for
> > auth=00000000e14fdc39 ralign=25
> > Mar 12 13:14:29 localhost kernel: NFS call  fsinfo
> > ... <gssd upcall>
> > Mar 12 13:14:29 localhost kernel: AGLO: call_allocate
> > auth=00000000e14fdc39 au_cslack=200 au_rslack=25 rq_rcvsize=256
> > p_replen=35
> > Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
> > len=176 is ok offset=56 opaque=120
> > Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> > auth=00000000e14fdc39 resetting au_rslack=26
> > Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> > auth=00000000e14fdc39 resetting au_ralign=26
> > Mar 12 13:14:29 localhost kernel: NFS reply fsinfo: 0
> >
> > Ontap run:
> > Mar 12 13:16:46 localhost kernel: AGLO: gss_create_new setting for
> > auth=00000000e02d9e6e cslack=200 and rslack=25
> > Mar 12 13:16:46 localhost kernel: AGLO: gss_create_new setting for
> > auth=00000000e02d9e6e ralign=25
> > Mar 12 13:16:46 localhost kernel: NFS call  fsinfo
> > ... <gssd upcall>
> > Mar 12 13:16:46 localhost kernel: AGLO: call_allocate
> > auth=00000000e02d9e6e au_cslack=200 au_rslack=25 rq_rcvsize=256
> > p_replen=35
> > Mar 12 13:16:46 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
> > len=256 too small offset=56 opaque=204
> > Mar 12 13:16:46 localhost kernel: NFS reply fsinfo: -5
> >
> > With GSS_VERF_SLACK 104
> > Mar 12 13:33:23 localhost kernel: AGLO: gss_create_new setting for
> > auth=000000004a545ea2 cslack=200 and rslack=26
> > Mar 12 13:33:23 localhost kernel: AGLO: gss_create_new setting for
> > auth=000000004a545ea2 ralign=26
> > Mar 12 13:33:23 localhost kernel: NFS call  fsinfo
> > ... <gssd upcall>
> > Mar 12 13:33:23 localhost kernel: AGLO: call_allocate
> > auth=000000004a545ea2 au_cslack=200 au_rslack=26 rq_rcvsize=260
> > p_replen=35
> > Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
> > len=260 is ok offset=56 opaque=204
> > Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> > auth=000000004a545ea2 resetting au_rslack=26
> > Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv ****
> > auth=000000004a545ea2 resetting au_ralign=26
> > Mar 12 13:33:23 localhost kernel: NFS reply fsinfo: 0
> >
> > difference in actual packets in fsinfo is that ontap sends postattrs
> > so that's 84bytes.
> >
> >        req->rq_rcvsize = RPC_REPHDRSIZE + auth->au_rslack + \
> >                        max_t(size_t, proc->p_replen, 2);
> >
> > RPC_REPHDRSIZE is defined to be 4 (*4)  (it says it doesn't include
> > the verifier ???)
>
> > rslack needs to cover kerberos blob 25 (*4)  (but that's the kerberos
> > part a part of the wrap and not the verifier)
> > p_replen to cover fs_info args 35 (*4) (seems like the right number)
> >
> > So we are missing the GSS to include the verifier and the kerberos
> > blob of the wrapper (and lengths!!). Basically we need GSS_VERF_SLACK
> > to cover 2 kerberos blobs (or more specifically KRB_TOKEN_CFX_GetMic
> > 9*4 and KRB_TOKEN_CFS_WRAP 15*4 + 2 lengths before the kerberos blobs
> > = 104 and we are only giving 100).
>
> GSS_VERF_SLACK is also used for setting au_verfsize, so please don't
> change its value. Define a new constant for initializing au_rslack.
>
> Let's construct that constant using the KRB5_TOKEN constants you mention
> here... include/linux/sunrpc/gss_krb5.h has
>
> 221 /*
> 222  * This compile-time check verifies that we will not exceed the
> 223  * slack space allotted by the client and server auth_gss code
> 224  * before they call gss_wrap().
> 225  */
> 226 #define GSS_KRB5_MAX_SLACK_NEEDED \
> 227         (GSS_KRB5_TOK_HDR_LEN     /* gss token header */         \
> 228         + GSS_KRB5_MAX_CKSUM_LEN  /* gss token checksum */       \
> 229         + GSS_KRB5_MAX_BLOCKSIZE  /* confounder */               \
> 230         + GSS_KRB5_MAX_BLOCKSIZE  /* possible padding */         \
> 231         + GSS_KRB5_TOK_HDR_LEN    /* encrypted hdr in v2 token */\
> 232         + GSS_KRB5_MAX_CKSUM_LEN  /* encryption hmac */          \
> 233         + 4 + 4                   /* RPC verifier */             \
> 234         + GSS_KRB5_TOK_HDR_LEN                                   \
> 235         + GSS_KRB5_MAX_CKSUM_LEN)
>
> So this, or something like this, plus the comment below.
>

I'm not sure this particular structure can we re-used as it computes
something different (as it can be seen below what RFC structures are
for mic and wrap cksums).

I have 2 ways this can be solved. A simple patch that eliminates
quartering the rslack of the GSS_VERF_SLACK that produces too small of
a buffer. GSS_VERF_SLACK value in itself is a good approximation of
what Kerberos needs.

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 24ca861..e3b6ea2 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -1050,7 +1050,7 @@ static void gss_pipe_free(struct gss_pipe *p)
                goto err_put_mech;
        auth = &gss_auth->rpc_auth;
        auth->au_cslack = GSS_CRED_SLACK >> 2;
-       auth->au_rslack = GSS_VERF_SLACK >> 2;
+       auth->au_rslack = GSS_VERF_SLACK;
        auth->au_verfsize = GSS_VERF_SLACK >> 2;
        auth->au_ralign = GSS_VERF_SLACK >> 2;
        auth->au_flags = 0;


Or I can introduce a new variable but I'm having a hard time figuring
out what are the appropriate values for the checksums (I've added what
I'm getting on the wire):
diff --git a/include/linux/sunrpc/gss_krb5.h b/include/linux/sunrpc/gss_krb5.h
index c1d77dd..9ef77eb 100644
--- a/include/linux/sunrpc/gss_krb5.h
+++ b/include/linux/sunrpc/gss_krb5.h
@@ -234,6 +234,26 @@ enum seal_alg {
        + GSS_KRB5_TOK_HDR_LEN                                   \
        + GSS_KRB5_MAX_CKSUM_LEN)

+/* to decode a GSS wrap reply the RPC code allocates buffer size based
+ * on GSS_KRB5 tokens (mic, wrap) and the max operation reply size
+ * provide MIC and WRAP token sizes in bytes
+ */
+#define GSS_KRB5_TOKEN_GETMIC \
+       (2 /*KRB5_TOKEN_ID*/ \
+       + 1 /*KRB5_CTX_FLAGS*/ \
+       + 5 /*KRB5_MIC_FILLER*/ \
+       + 8 /*KRB5_CTX_SEQ*/ \
+       + 12 /*KRB5_MIC_CKSUM_LEN*/)
+
+#define GSS_KRB5_TOKEN_WRAP \
+       (2 /*KRB5_TOKEN_ID*/ \
+       + 1 /*KRB5_CTX_FLAGS*/ \
+       + 1 /*KRB5_WRAP_FILLER*/ \
+       + 2 /*KRB5_CTX_EC*/ \
+       + 2 /*KRB5_CTX_RRC*/ \
+       + 8 /*KRB5_CTX_SEQ*/ \
+       + 44 /*KRB5_WRAP_CKSUM_LEN */)
+
 u32
 make_checksum(struct krb5_ctx *kctx, char *header, int hdrlen,
                struct xdr_buf *body, int body_offset, u8 *cksumkey,
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 24ca861..d6a52dc 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -25,6 +25,7 @@
 #include <linux/workqueue.h>
 #include <linux/sunrpc/rpc_pipe_fs.h>
 #include <linux/sunrpc/gss_api.h>
+#include <linux/sunrpc/gss_krb5.h>
 #include <linux/uaccess.h>
 #include <linux/hashtable.h>

@@ -52,6 +53,12 @@
  * using integrity (two 4-byte integers): */
 #define GSS_VERF_SLACK         100

+#define GSS_RAU_SLACK \
+(4 /* GSS verf token length */ \
++ GSS_KRB5_TOKEN_GETMIC /* verifier's getmic */ \
++ 4 /* GSS warp token length */ \
++ GSS_KRB5_TOKEN_WRAP) /* wrap verifier */ \
+
 static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
 static DEFINE_SPINLOCK(gss_auth_hash_lock);

@@ -1050,7 +1057,7 @@ static void gss_pipe_free(struct gss_pipe *p)
                goto err_put_mech;
        auth = &gss_auth->rpc_auth;
        auth->au_cslack = GSS_CRED_SLACK >> 2;
-       auth->au_rslack = GSS_VERF_SLACK >> 2;
+       auth->au_rslack = GSS_RAU_SLACK;
        auth->au_verfsize = GSS_VERF_SLACK >> 2;
        auth->au_ralign = GSS_VERF_SLACK >> 2;
        auth->au_flags = 0;


Here's the KRB5 RFC structure for MIC:

 Octet no   Name        Description
         --------------------------------------------------------------
         0..1     TOK_ID     Identification field.  Tokens emitted by
                             GSS_GetMIC() contain the hex value 04 04
                             expressed in big-endian order in this
                             field.
         2        Flags      Attributes field, as described in section
                             4.2.2.
         3..7     Filler     Contains five octets of hex value FF.
         8..15    SND_SEQ    Sequence number field in clear text,
                             expressed in big-endian order.
         16..last SGN_CKSUM  Checksum of the "to-be-signed" data and
                             octet 0..15, as described in section 4.2.4.

for wrap:

Octet no   Name        Description
         --------------------------------------------------------------
          0..1     TOK_ID    Identification field.  Tokens emitted by
                             GSS_Wrap() contain the hex value 05 04
                             expressed in big-endian order in this
                             field.
          2        Flags     Attributes field, as described in section
                             4.2.2.
          3        Filler    Contains the hex value FF.
          4..5     EC        Contains the "extra count" field, in big-
                             endian order as described in section 4.2.3.
          6..7     RRC       Contains the "right rotation count" in big-
                             endian order, as described in section
                             4.2.5.
          8..15    SND_SEQ   Sequence number field in clear text,
                             expressed in big-endian order.
          16..last Data      Encrypted data for Wrap tokens with
                             confidentiality, or plaintext data followed
                             by the checksum for Wrap tokens without
                             confidentiality, as described in section
                             4.2.4.

Variable parts are cksum lengths based on the encryption used.

Current code has GSS_KRB5_MAX_CKSUM_LEN to be 20 (bytes?. Is that for
hmac signature) in WRAP cksum sent is 44bytes.

Looking for guidance here.

Thank you.

> > The reason things work against linux is because it has a nice buffer
> > of 84bytes of post attributes that it doesn't send.
>
> Missing post-attributes makes sense. Thank you for the analysis.
>
>
> > To address your later point that kerberos blob is encryption type
> > depended and that once some other encryption is added to gss-kerberos
> > that's larger than existing checksum then this value would need to be
> > adjusted again.
>
> > If you agree with my reasoning for the number then I'd like to send
> > out a patch now.
>
> The current numbers are based on the kernel GSS implementation supporting
> only Kerberos with a narrow set of enctypes. That needs to be made clear
> in a documenting comment.
>
> The reason this has been bothersome is because the existing setting is
> a magic number (100), and its documenting comment has been stale since
> 2006. Any proposed fix has to address the missing documentation.
>
>
> >>>>> I would like to propose to change it to something a little larger than
> >>>>> 104, like 120 to give room if some other server might reply with
> >>>>> something even larger.
> >>>>
> >>>> Why does it need to be larger than 104?
> >>>
> >>> I don't know why 100 was chosen and given that I think arguments are
> >>> taken into the account and arguments can change. I think NetApp has
> >>> changed their file handle sizes (at some point, not in the near past
> >>> but i think so?). Perhaps they might want to do that again so the size
> >>> will change again.
> >>>
> >>> Honestly, I would have like for 100 to be 200 to be safe.
> >>
> >> To be safe, I would like to have a good understanding of the details,
> >> rather than guessing at an arbitrary maximum value. Let's choose a
> >> rational maximum and include a descriptive comment about why that value
> >> is the best choice.
> >>
> >>
> >>>>> Thoughts? Will send an actual patch if no objections to this one.
> >>>>>
> >>>>> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> >>>>> index 24ca861..44ae6bc 100644
> >>>>> --- a/net/sunrpc/auth_gss/auth_gss.c
> >>>>> +++ b/net/sunrpc/auth_gss/auth_gss.c
> >>>>> @@ -50,7 +50,7 @@
> >>>>> #define GSS_CRED_SLACK         (RPC_MAX_AUTH_SIZE * 2)
> >>>>> /* length of a krb5 verifier (48), plus data added before arguments when
> >>>>> * using integrity (two 4-byte integers): */
> >>>>> -#define GSS_VERF_SLACK         100
> >>>>> +#define GSS_VERF_SLACK         120
> >>>>>
> >>>>> static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
> >>>>> static DEFINE_SPINLOCK(gss_auth_hash_lock);
> >>>>
> >>>> --
> >>>> Chuck Lever
> >>
> >> --
> >> Chuck Lever
> >>
> >>
> >>
> > <linux-v3-krb5p-mount.pcap.gz><ontap-v3-krb5-mount.pcap.gz>
>
> --
> Chuck Lever
>
>
>
Chuck Lever March 20, 2020, 8:18 p.m. UTC | #7
> On Mar 20, 2020, at 3:28 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Thu, Mar 12, 2020 at 4:10 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>> 
>> 
>> 
>>> On Mar 12, 2020, at 3:17 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>> 
>>> On Tue, Mar 10, 2020 at 7:56 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Mar 10, 2020, at 5:07 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>> 
>>>>> Hi Chuck,
>>>>> 
>>>>> On Tue, Mar 10, 2020 at 3:57 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>>> 
>>>>>> Hi Olga-
>>>>>> 
>>>>>>> On Mar 10, 2020, at 2:58 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
>>>>>>> 
>>>>>>> Ever since commit 2c94b8eca1a26 "SUNRPC: Use au_rslack when computing
>>>>>>> reply buffer size". It changed how "req->rq_rcvsize" is calculated. It
>>>>>>> used to use au_cslack value which was nice and large and changed it to
>>>>>>> au_rslack value which turns out to be too small.
>>>>>>> 
>>>>>>> Since 5.1, v3 mount with sec=krb5p fails against an Ontap server
>>>>>>> because client's receive buffer it too small.
>>>>>> 
>>>>>> Can you be more specific? For instance, why is 100 bytes adequate for
>>>>>> Linux servers, but not OnTAP?
>>>>> 
>>>>> I don't know why Ontap sends more data than Linux server.
>>>> 
>>>> Let's be sure we are fixing the right problem. Yes, au_rslack is
>>>> smaller in v5.1, and that results in a behavioral regression. But
>>>> exactly which part of the new calculation is incorrect is not yet
>>>> clear. Simply bumping GSS_VERF_SLACK could very well plaster over
>>>> the real problem.
>>>> 
>>>> 
>>>>> The opaque_len is just a lot larger. For the first message Linux
>>>>> opaque_len is 120bytes and Ontap it's 206. So it could be for instance
>>>>> for FSINFO that sends the file handle, for Netapp the file handle is
>>>>> 44bytes and for Linux it's only 28bytes.
>>>> 
>>>> The maximum filehandle size should already be accounted for in the
>>>> maxsize macro for FSINFO.
>>>> 
>>>> Is this problem evident only with NFSv3 plus krb5p?
>>> 
>>> So far that seems to be the case. Every other version and security flavor works.
>>> 
>>>>>> Is this explanation for the current value not correct?
>>>>>> 
>>>>>> 51 /* length of a krb5 verifier (48), plus data added before arguments when
>>>>>> 52  * using integrity (two 4-byte integers): */
>>>>> 
>>>>> I'm not sure what it is suppose to be. Isn't "data before arguments"
>>>>> can vary in length and thus explain why linux and onto sizes are
>>>>> different?
>>>>> Looking at the network trace the krb5 verifier I see is 36bytes.
>>>> 
>>>> GSS_VERF_SLACK is only for the extra length added by GSS data. The
>>>> length of the RPC message itself is handled separately, see above.
>>>> 
>>>> Can you post a Wireshark dissection of the problematic FSINFO reply?
>>>> (Having a working reply from Linux and a failing reply from OnTAP
>>>> would be even better).
>>> 
>>> I'm attaching two files. I mount against linux and mount against ontap.
>>> 
>>> 
>>> 
>>> 
>>>>>>> For GSS, au_rslack is calculated from GSS_VERF_SLACK value which is
>>>>>>> currently 100. And it's not enough. Changing it to 104 works and then
>>>>>>> au_rslack is recalculated based on actual received mic.len and not
>>>>>>> just the default buffer size.
>>>> 
>>>> What are the computed au_ralign and au_rslack values after the first
>>>> successful operation?
>>> 
>>> With GSS_VERF_SLACK 100
>>> Linux run:
>>> 
>>> Mar 12 13:14:29 localhost kernel: AGLO: gss_create_new setting for
>>> auth=00000000e14fdc39 cslack=200 and rslack=25
>>> Mar 12 13:14:29 localhost kernel: AGLO: gss_create_new setting for
>>> auth=00000000e14fdc39 ralign=25
>>> Mar 12 13:14:29 localhost kernel: NFS call  fsinfo
>>> ... <gssd upcall>
>>> Mar 12 13:14:29 localhost kernel: AGLO: call_allocate
>>> auth=00000000e14fdc39 au_cslack=200 au_rslack=25 rq_rcvsize=256
>>> p_replen=35
>>> Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
>>> len=176 is ok offset=56 opaque=120
>>> Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv ****
>>> auth=00000000e14fdc39 resetting au_rslack=26
>>> Mar 12 13:14:29 localhost kernel: AGLO: gss_unwrap_resp_priv ****
>>> auth=00000000e14fdc39 resetting au_ralign=26
>>> Mar 12 13:14:29 localhost kernel: NFS reply fsinfo: 0
>>> 
>>> Ontap run:
>>> Mar 12 13:16:46 localhost kernel: AGLO: gss_create_new setting for
>>> auth=00000000e02d9e6e cslack=200 and rslack=25
>>> Mar 12 13:16:46 localhost kernel: AGLO: gss_create_new setting for
>>> auth=00000000e02d9e6e ralign=25
>>> Mar 12 13:16:46 localhost kernel: NFS call  fsinfo
>>> ... <gssd upcall>
>>> Mar 12 13:16:46 localhost kernel: AGLO: call_allocate
>>> auth=00000000e02d9e6e au_cslack=200 au_rslack=25 rq_rcvsize=256
>>> p_replen=35
>>> Mar 12 13:16:46 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
>>> len=256 too small offset=56 opaque=204
>>> Mar 12 13:16:46 localhost kernel: NFS reply fsinfo: -5
>>> 
>>> With GSS_VERF_SLACK 104
>>> Mar 12 13:33:23 localhost kernel: AGLO: gss_create_new setting for
>>> auth=000000004a545ea2 cslack=200 and rslack=26
>>> Mar 12 13:33:23 localhost kernel: AGLO: gss_create_new setting for
>>> auth=000000004a545ea2 ralign=26
>>> Mar 12 13:33:23 localhost kernel: NFS call  fsinfo
>>> ... <gssd upcall>
>>> Mar 12 13:33:23 localhost kernel: AGLO: call_allocate
>>> auth=000000004a545ea2 au_cslack=200 au_rslack=26 rq_rcvsize=260
>>> p_replen=35
>>> Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv rcv_buf
>>> len=260 is ok offset=56 opaque=204
>>> Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv ****
>>> auth=000000004a545ea2 resetting au_rslack=26
>>> Mar 12 13:33:23 localhost kernel: AGLO: gss_unwrap_resp_priv ****
>>> auth=000000004a545ea2 resetting au_ralign=26
>>> Mar 12 13:33:23 localhost kernel: NFS reply fsinfo: 0
>>> 
>>> difference in actual packets in fsinfo is that ontap sends postattrs
>>> so that's 84bytes.
>>> 
>>>       req->rq_rcvsize = RPC_REPHDRSIZE + auth->au_rslack + \
>>>                       max_t(size_t, proc->p_replen, 2);
>>> 
>>> RPC_REPHDRSIZE is defined to be 4 (*4)  (it says it doesn't include
>>> the verifier ???)
>> 
>>> rslack needs to cover kerberos blob 25 (*4)  (but that's the kerberos
>>> part a part of the wrap and not the verifier)
>>> p_replen to cover fs_info args 35 (*4) (seems like the right number)
>>> 
>>> So we are missing the GSS to include the verifier and the kerberos
>>> blob of the wrapper (and lengths!!). Basically we need GSS_VERF_SLACK
>>> to cover 2 kerberos blobs (or more specifically KRB_TOKEN_CFX_GetMic
>>> 9*4 and KRB_TOKEN_CFS_WRAP 15*4 + 2 lengths before the kerberos blobs
>>> = 104 and we are only giving 100).
>> 
>> GSS_VERF_SLACK is also used for setting au_verfsize, so please don't
>> change its value. Define a new constant for initializing au_rslack.
>> 
>> Let's construct that constant using the KRB5_TOKEN constants you mention
>> here... include/linux/sunrpc/gss_krb5.h has
>> 
>> 221 /*
>> 222  * This compile-time check verifies that we will not exceed the
>> 223  * slack space allotted by the client and server auth_gss code
>> 224  * before they call gss_wrap().
>> 225  */
>> 226 #define GSS_KRB5_MAX_SLACK_NEEDED \
>> 227         (GSS_KRB5_TOK_HDR_LEN     /* gss token header */         \
>> 228         + GSS_KRB5_MAX_CKSUM_LEN  /* gss token checksum */       \
>> 229         + GSS_KRB5_MAX_BLOCKSIZE  /* confounder */               \
>> 230         + GSS_KRB5_MAX_BLOCKSIZE  /* possible padding */         \
>> 231         + GSS_KRB5_TOK_HDR_LEN    /* encrypted hdr in v2 token */\
>> 232         + GSS_KRB5_MAX_CKSUM_LEN  /* encryption hmac */          \
>> 233         + 4 + 4                   /* RPC verifier */             \
>> 234         + GSS_KRB5_TOK_HDR_LEN                                   \
>> 235         + GSS_KRB5_MAX_CKSUM_LEN)
>> 
>> So this, or something like this, plus the comment below.
>> 
> 
> I'm not sure this particular structure can we re-used as it computes
> something different (as it can be seen below what RFC structures are
> for mic and wrap cksums).
> 
> I have 2 ways this can be solved. A simple patch that eliminates
> quartering the rslack of the GSS_VERF_SLACK that produces too small of
> a buffer. GSS_VERF_SLACK value in itself is a good approximation of
> what Kerberos needs.
> 
> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> index 24ca861..e3b6ea2 100644
> --- a/net/sunrpc/auth_gss/auth_gss.c
> +++ b/net/sunrpc/auth_gss/auth_gss.c
> @@ -1050,7 +1050,7 @@ static void gss_pipe_free(struct gss_pipe *p)
>                goto err_put_mech;
>        auth = &gss_auth->rpc_auth;
>        auth->au_cslack = GSS_CRED_SLACK >> 2;
> -       auth->au_rslack = GSS_VERF_SLACK >> 2;
> +       auth->au_rslack = GSS_VERF_SLACK;
>        auth->au_verfsize = GSS_VERF_SLACK >> 2;
>        auth->au_ralign = GSS_VERF_SLACK >> 2;
>        auth->au_flags = 0;

The GSS_YADA_SLACK numbers are in bytes, these variables
are in XDR words. So ">> 2" is always the right thing to
do here. Not doing it for rslack would be confusing and
arbitrary.

(It really should be XDR_QUADLEN here, not a raw ">> 2").

It's simply happenstance that you get a value close to
what is needed when you take off the divide-by-four.

So this approach looks pretty hacky to me: it will confuse
anyone who looks at this code in the future. I don't feel
good about trading one magic number for another.


> Or I can introduce a new variable but I'm having a hard time figuring
> out what are the appropriate values for the checksums

IMO this is the correct approach, and one that mirrors the
approach taken with the _sz macros in nfs?xdr.c, (that is,
everywhere else where there is a requirement to estimate
the maximum size of an XDR data item).


> (I've added what
> I'm getting on the wire):
> diff --git a/include/linux/sunrpc/gss_krb5.h b/include/linux/sunrpc/gss_krb5.h
> index c1d77dd..9ef77eb 100644
> --- a/include/linux/sunrpc/gss_krb5.h
> +++ b/include/linux/sunrpc/gss_krb5.h
> @@ -234,6 +234,26 @@ enum seal_alg {
>        + GSS_KRB5_TOK_HDR_LEN                                   \
>        + GSS_KRB5_MAX_CKSUM_LEN)
> 
> +/* to decode a GSS wrap reply the RPC code allocates buffer size based
> + * on GSS_KRB5 tokens (mic, wrap) and the max operation reply size
> + * provide MIC and WRAP token sizes in bytes
> + */
> +#define GSS_KRB5_TOKEN_GETMIC \
> +       (2 /*KRB5_TOKEN_ID*/ \
> +       + 1 /*KRB5_CTX_FLAGS*/ \
> +       + 5 /*KRB5_MIC_FILLER*/ \
> +       + 8 /*KRB5_CTX_SEQ*/ \
> +       + 12 /*KRB5_MIC_CKSUM_LEN*/)
> +
> +#define GSS_KRB5_TOKEN_WRAP \
> +       (2 /*KRB5_TOKEN_ID*/ \
> +       + 1 /*KRB5_CTX_FLAGS*/ \
> +       + 1 /*KRB5_WRAP_FILLER*/ \
> +       + 2 /*KRB5_CTX_EC*/ \
> +       + 2 /*KRB5_CTX_RRC*/ \
> +       + 8 /*KRB5_CTX_SEQ*/ \
> +       + 44 /*KRB5_WRAP_CKSUM_LEN */)

These appear to be in bytes? The documenting comment looks
unfinished. But, I'm with you so far.


> +
> u32
> make_checksum(struct krb5_ctx *kctx, char *header, int hdrlen,
>                struct xdr_buf *body, int body_offset, u8 *cksumkey,
> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
> index 24ca861..d6a52dc 100644
> --- a/net/sunrpc/auth_gss/auth_gss.c
> +++ b/net/sunrpc/auth_gss/auth_gss.c
> @@ -25,6 +25,7 @@
> #include <linux/workqueue.h>
> #include <linux/sunrpc/rpc_pipe_fs.h>
> #include <linux/sunrpc/gss_api.h>
> +#include <linux/sunrpc/gss_krb5.h>
> #include <linux/uaccess.h>
> #include <linux/hashtable.h>
> 
> @@ -52,6 +53,12 @@
>  * using integrity (two 4-byte integers): */
> #define GSS_VERF_SLACK         100
> 
> +#define GSS_RAU_SLACK \
> +(4 /* GSS verf token length */ \
> ++ GSS_KRB5_TOKEN_GETMIC /* verifier's getmic */ \
> ++ 4 /* GSS warp token length */ \

"wrap" not "warp" ...?

> ++ GSS_KRB5_TOKEN_WRAP) /* wrap verifier */ \
> +
> static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
> static DEFINE_SPINLOCK(gss_auth_hash_lock);
> 
> @@ -1050,7 +1057,7 @@ static void gss_pipe_free(struct gss_pipe *p)
>                goto err_put_mech;
>        auth = &gss_auth->rpc_auth;
>        auth->au_cslack = GSS_CRED_SLACK >> 2;
> -       auth->au_rslack = GSS_VERF_SLACK >> 2;
> +       auth->au_rslack = GSS_RAU_SLACK;

I like the outlines of this approach. What's "RAU" ?
Maybe GSS_RESP_SLACK or GSS_REPL_SLACK might be clearer?

The slack variables are in XDR words. Should you have a
">> 2" or XDR_QUADLEN() here?


>        auth->au_verfsize = GSS_VERF_SLACK >> 2;
>        auth->au_ralign = GSS_VERF_SLACK >> 2;
>        auth->au_flags = 0;
> 
> 
> Here's the KRB5 RFC structure for MIC:

Is this RFC 4121? Just so we're on the same page.


> Octet no   Name        Description
>         --------------------------------------------------------------
>         0..1     TOK_ID     Identification field.  Tokens emitted by
>                             GSS_GetMIC() contain the hex value 04 04
>                             expressed in big-endian order in this
>                             field.
>         2        Flags      Attributes field, as described in section
>                             4.2.2.
>         3..7     Filler     Contains five octets of hex value FF.
>         8..15    SND_SEQ    Sequence number field in clear text,
>                             expressed in big-endian order.
>         16..last SGN_CKSUM  Checksum of the "to-be-signed" data and
>                             octet 0..15, as described in section 4.2.4.
> 
> for wrap:
> 
> Octet no   Name        Description
>         --------------------------------------------------------------
>          0..1     TOK_ID    Identification field.  Tokens emitted by
>                             GSS_Wrap() contain the hex value 05 04
>                             expressed in big-endian order in this
>                             field.
>          2        Flags     Attributes field, as described in section
>                             4.2.2.
>          3        Filler    Contains the hex value FF.
>          4..5     EC        Contains the "extra count" field, in big-
>                             endian order as described in section 4.2.3.
>          6..7     RRC       Contains the "right rotation count" in big-
>                             endian order, as described in section
>                             4.2.5.
>          8..15    SND_SEQ   Sequence number field in clear text,
>                             expressed in big-endian order.
>          16..last Data      Encrypted data for Wrap tokens with
>                             confidentiality, or plaintext data followed
>                             by the checksum for Wrap tokens without
>                             confidentiality, as described in section
>                             4.2.4.
> 
> Variable parts are cksum lengths based on the encryption used.

Exactly, so we want to know the largest possible checksum size
that will be needed, based on the set of supported KRB5 enctypes
the kernel can use. As far as I know, the encryption algorithms
do not alter the cleartext message size.

For now, just pick the biggest such number and make that a C
macro with a nice comment. We can embellish this with an
indirection later.


> Current code has GSS_KRB5_MAX_CKSUM_LEN to be 20 (bytes?. Is that for
> hmac signature) in WRAP cksum sent is 44bytes.
> 
> Looking for guidance here.

I'm not quite sure where to find a mapping of enctype to maximum
checksum size. Probably some code auditing needed, or maybe there
is an RFC for each enctype that can impart that size information.


> Thank you.
> 
>>> The reason things work against linux is because it has a nice buffer
>>> of 84bytes of post attributes that it doesn't send.
>> 
>> Missing post-attributes makes sense. Thank you for the analysis.
>> 
>> 
>>> To address your later point that kerberos blob is encryption type
>>> depended and that once some other encryption is added to gss-kerberos
>>> that's larger than existing checksum then this value would need to be
>>> adjusted again.
>> 
>>> If you agree with my reasoning for the number then I'd like to send
>>> out a patch now.
>> 
>> The current numbers are based on the kernel GSS implementation supporting
>> only Kerberos with a narrow set of enctypes. That needs to be made clear
>> in a documenting comment.
>> 
>> The reason this has been bothersome is because the existing setting is
>> a magic number (100), and its documenting comment has been stale since
>> 2006. Any proposed fix has to address the missing documentation.
>> 
>> 
>>>>>>> I would like to propose to change it to something a little larger than
>>>>>>> 104, like 120 to give room if some other server might reply with
>>>>>>> something even larger.
>>>>>> 
>>>>>> Why does it need to be larger than 104?
>>>>> 
>>>>> I don't know why 100 was chosen and given that I think arguments are
>>>>> taken into the account and arguments can change. I think NetApp has
>>>>> changed their file handle sizes (at some point, not in the near past
>>>>> but i think so?). Perhaps they might want to do that again so the size
>>>>> will change again.
>>>>> 
>>>>> Honestly, I would have like for 100 to be 200 to be safe.
>>>> 
>>>> To be safe, I would like to have a good understanding of the details,
>>>> rather than guessing at an arbitrary maximum value. Let's choose a
>>>> rational maximum and include a descriptive comment about why that value
>>>> is the best choice.
>>>> 
>>>> 
>>>>>>> Thoughts? Will send an actual patch if no objections to this one.
>>>>>>> 
>>>>>>> diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
>>>>>>> index 24ca861..44ae6bc 100644
>>>>>>> --- a/net/sunrpc/auth_gss/auth_gss.c
>>>>>>> +++ b/net/sunrpc/auth_gss/auth_gss.c
>>>>>>> @@ -50,7 +50,7 @@
>>>>>>> #define GSS_CRED_SLACK         (RPC_MAX_AUTH_SIZE * 2)
>>>>>>> /* length of a krb5 verifier (48), plus data added before arguments when
>>>>>>> * using integrity (two 4-byte integers): */
>>>>>>> -#define GSS_VERF_SLACK         100
>>>>>>> +#define GSS_VERF_SLACK         120
>>>>>>> 
>>>>>>> static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
>>>>>>> static DEFINE_SPINLOCK(gss_auth_hash_lock);
>>>>>> 
>>>>>> --
>>>>>> Chuck Lever
>>>> 
>>>> --
>>>> Chuck Lever
>>>> 
>>>> 
>>>> 
>>> <linux-v3-krb5p-mount.pcap.gz><ontap-v3-krb5-mount.pcap.gz>
>> 
>> --
>> Chuck Lever

--
Chuck Lever
diff mbox series

Patch

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 24ca861..44ae6bc 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -50,7 +50,7 @@ 
 #define GSS_CRED_SLACK         (RPC_MAX_AUTH_SIZE * 2)
 /* length of a krb5 verifier (48), plus data added before arguments when
  * using integrity (two 4-byte integers): */
-#define GSS_VERF_SLACK         100
+#define GSS_VERF_SLACK         120

 static DEFINE_HASHTABLE(gss_auth_hash_table, 4);
 static DEFINE_SPINLOCK(gss_auth_hash_lock);