diff mbox series

[RFC] cifs: fix signature check for error responses

Message ID 20220922223216.4730-1-ematsumiya@suse.de (mailing list archive)
State New, archived
Headers show
Series [RFC] cifs: fix signature check for error responses | expand

Commit Message

Enzo Matsumiya Sept. 22, 2022, 10:32 p.m. UTC
Hi all,

This patch is the (apparent) correct way to fix the issues regarding some
messages with invalid signatures. My previous patch
https://lore.kernel.org/linux-cifs/20220918235442.2981-1-ematsumiya@suse.de/
was wrong because a) it covered up the real issue (*), and b) I could never
again "reproduce" the "race" -- I had other patches in place when I tested it,
and (thus?) the system memory was not in "pristine" conditions, so it's very
likely that there's no race at all.

(*) Thanks a lot to Tom Talpey for pointing me to the right direction here.

Since it sucks to be wrong twice, I'm sending this one as RFC because I
wanted to confirm some things:

  1) Can I rely on the fact that status != STATUS_SUCCESS means no variable
     data in the message?  I could only infer from the spec, but not really
     confirm.
  
  2) (probably only in async cases) When the signature fails, for whatever
     reason, we don't take any action.  This doesn't seem right IMHO,
     e.g. if the message is spoofed, we show a warning that the signatures
     doesn't match, but I would expect at least for the operation to stop,
     or maybe even a complete dis/reconnect.

  3) For SMB1, I couldn't really use generic/465 as a real confirmation for
     this because it says "O_DIRECT is not supported".  From reading the
     code and 'man mount.cifs' I understood that this is supported, so what
     gives?  Worth noting that I don't follow/use/test SMB1 too much.
     The patch does work for other cases I tried though.

I hope someone can address my questions/concerts above, and please let me
know if the patch is technically and semantically correct.

Patch follows.


Enzo

--------
When verifying a response's signature, the computation will go through
the iov buffer (header + response structs) and the over the page data, to
verify any dynamic data appended to the message (usually on an SMB2 READ
response).

When the response is an error, however, specifically on async reads,
the page data is allocated before receiving the expected data.  Being
"valid" data (from the signature computation POV; non-NULL, >0 pages),
it's included in the parsing and generates an invalid signature for the
message.

Fix this by checking if the status is non-zero, and skip the page data
if it is.  The issue happens in all protocol versions, and this fix
applies to all.

This issue can be observed with xfstests generic/465.

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
---
 fs/cifs/cifsencrypt.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Tom Talpey Sept. 28, 2022, 3:05 p.m. UTC | #1
On 9/22/2022 6:32 PM, Enzo Matsumiya wrote:
> Hi all,
> 
> This patch is the (apparent) correct way to fix the issues regarding some
> messages with invalid signatures. My previous patch
> https://lore.kernel.org/linux-cifs/20220918235442.2981-1-ematsumiya@suse.de/
> was wrong because a) it covered up the real issue (*), and b) I could never
> again "reproduce" the "race" -- I had other patches in place when I tested it,
> and (thus?) the system memory was not in "pristine" conditions, so it's very
> likely that there's no race at all.
> 
> (*) Thanks a lot to Tom Talpey for pointing me to the right direction here.
> 
> Since it sucks to be wrong twice, I'm sending this one as RFC because I
> wanted to confirm some things:
> 
>    1) Can I rely on the fact that status != STATUS_SUCCESS means no variable
>       data in the message?  I could only infer from the spec, but not really
>       confirm.

Technically, I think the answer is "no" but practically speaking maybe
"yes".

Not all errors are actually errors however, e.g. STOPPED_ON_SYMLINK and
the additional error contexts which accompany it. We definitely don't
want to skip validation of those.

There's also STATUS_PENDING, but that's special because it isn't actually
a response, it's just a "hang on I'm busy" that carries no other
data.

Finally, MS-SMB2 lists a few others in section 2.2.2.2, all of which
need validation I think.

>    2) (probably only in async cases) When the signature fails, for whatever
>       reason, we don't take any action.  This doesn't seem right IMHO,
>       e.g. if the message is spoofed, we show a warning that the signatures
>       doesn't match, but I would expect at least for the operation to stop,
>       or maybe even a complete dis/reconnect.

I don't think so. The spec calls for dropping the message, and after
all it could have been simple packet corruption. The retries will sort
it out.

Absolutely positively do not print a message for each received packet,
good or bad.

>    3) For SMB1, I couldn't really use generic/465 as a real confirmation for
>       this because it says "O_DIRECT is not supported".  From reading the
>       code and 'man mount.cifs' I understood that this is supported, so what
>       gives?  Worth noting that I don't follow/use/test SMB1 too much.
>       The patch does work for other cases I tried though.

Honestly, I don't think we care. No amount of patching can possibly
save SMB1.

Tom.

> I hope someone can address my questions/concerts above, and please let me
> know if the patch is technically and semantically correct.
> 
> Patch follows.
> 
> 
> Enzo
> 
> --------
> When verifying a response's signature, the computation will go through
> the iov buffer (header + response structs) and the over the page data, to
> verify any dynamic data appended to the message (usually on an SMB2 READ
> response).
> 
> When the response is an error, however, specifically on async reads,
> the page data is allocated before receiving the expected data.  Being
> "valid" data (from the signature computation POV; non-NULL, >0 pages),
> it's included in the parsing and generates an invalid signature for the
> message.
> 
> Fix this by checking if the status is non-zero, and skip the page data
> if it is.  The issue happens in all protocol versions, and this fix
> applies to all.
> 
> This issue can be observed with xfstests generic/465.
> 
> Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
> ---
>   fs/cifs/cifsencrypt.c | 17 +++++++++++++++++
>   1 file changed, 17 insertions(+)
> 
> diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
> index 46f5718754f9..e3260bb436b3 100644
> --- a/fs/cifs/cifsencrypt.c
> +++ b/fs/cifs/cifsencrypt.c
> @@ -32,15 +32,28 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
>   	int rc;
>   	struct kvec *iov = rqst->rq_iov;
>   	int n_vec = rqst->rq_nvec;
> +	bool has_error = false;
>   
>   	/* iov[0] is actual data and not the rfc1002 length for SMB2+ */
>   	if (!is_smb1(server)) {
> +		struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base;
> +
>   		if (iov[0].iov_len <= 4)
>   			return -EIO;
> +
> +		if (shdr->Status != 0)
> +			has_error = true;
> +
>   		i = 0;
>   	} else {
> +		struct smb_hdr *hdr = (struct smb_hdr *)iov[1].iov_base;
> +
>   		if (n_vec < 2 || iov[0].iov_len != 4)
>   			return -EIO;
> +
> +		if (hdr->Status.CifsError != 0)
> +			has_error = true;
> +
>   		i = 1; /* skip rfc1002 length */
>   	}
>   
> @@ -61,6 +74,9 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
>   		}
>   	}
>   
> +	if (has_error)
> +		goto out_final;
> +
>   	/* now hash over the rq_pages array */
>   	for (i = 0; i < rqst->rq_npages; i++) {
>   		void *kaddr;
> @@ -81,6 +97,7 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
>   		kunmap(rqst->rq_pages[i]);
>   	}
>   
> +out_final:
>   	rc = crypto_shash_final(shash, signature);
>   	if (rc)
>   		cifs_dbg(VFS, "%s: Could not generate hash\n", __func__);
Enzo Matsumiya Sept. 28, 2022, 4:30 p.m. UTC | #2
On 09/28, Tom Talpey wrote:
>On 9/22/2022 6:32 PM, Enzo Matsumiya wrote:
>>Hi all,
>>
>>This patch is the (apparent) correct way to fix the issues regarding some
>>messages with invalid signatures. My previous patch
>>https://lore.kernel.org/linux-cifs/20220918235442.2981-1-ematsumiya@suse.de/
>>was wrong because a) it covered up the real issue (*), and b) I could never
>>again "reproduce" the "race" -- I had other patches in place when I tested it,
>>and (thus?) the system memory was not in "pristine" conditions, so it's very
>>likely that there's no race at all.
>>
>>(*) Thanks a lot to Tom Talpey for pointing me to the right direction here.
>>
>>Since it sucks to be wrong twice, I'm sending this one as RFC because I
>>wanted to confirm some things:
>>
>>   1) Can I rely on the fact that status != STATUS_SUCCESS means no variable
>>      data in the message?  I could only infer from the spec, but not really
>>      confirm.
>
>Technically, I think the answer is "no" but practically speaking maybe
>"yes".
>
>Not all errors are actually errors however, e.g. STOPPED_ON_SYMLINK and
>the additional error contexts which accompany it. We definitely don't
>want to skip validation of those.
>
>There's also STATUS_PENDING, but that's special because it isn't actually
>a response, it's just a "hang on I'm busy" that carries no other
>data.
>
>Finally, MS-SMB2 lists a few others in section 2.2.2.2, all of which
>need validation I think.

Sorry, I might have not been clear.  By "variable data" I actually meant
data that's not going to be allocated in the requests' iovs, but rather
in rqst->rq_pages.

The variable buffers that are already expected (by PDU/spec), are AFAICS
always allocated either through cifs*buf_get() or some kmalloc() variant,
i.e. non-HIGHMEM.

async IO, though, is the only place I can see that allocates
rqst->rq_pages through iov_iter_get_pages_alloc2(), which is the source
of my concern, and the reason of my patch.  Particularly, on async read
cases, the page data is pre-allocated with the size we *expect* to
receive/read from the server, but if there's an error in a READ
response, the response itself will contain no/incomplete data in the page
data, but since those pages are allocated, the signature computation will
account for it, and generate a mismatching signature.

I'm running more tests now and I'll make sure to test the other possible
errors in 2.2.2.2.

>>   2) (probably only in async cases) When the signature fails, for whatever
>>      reason, we don't take any action.  This doesn't seem right IMHO,
>>      e.g. if the message is spoofed, we show a warning that the signatures
>>      doesn't match, but I would expect at least for the operation to stop,
>>      or maybe even a complete dis/reconnect.
>
>I don't think so. The spec calls for dropping the message, and after
>all it could have been simple packet corruption. The retries will sort
>it out.
>
>Absolutely positively do not print a message for each received packet,
>good or bad.

For the good messages ok, of course, but for bad? Why not? Otherwise, e.g.,
spotting the misvalidation mentioned above would've taken much longer.

(btw I spotted that misvalidation while debugging my AES-GMAC series,
but the same behaviuour happens with AES-CMAC i.e. in current code)

>>   3) For SMB1, I couldn't really use generic/465 as a real confirmation for
>>      this because it says "O_DIRECT is not supported".  From reading the
>>      code and 'man mount.cifs' I understood that this is supported, so what
>>      gives?  Worth noting that I don't follow/use/test SMB1 too much.
>>      The patch does work for other cases I tried though.
>
>Honestly, I don't think we care. No amount of patching can possibly
>save SMB1.

Ack :)

>Tom.


Cheers,

Enzo

>>I hope someone can address my questions/concerts above, and please let me
>>know if the patch is technically and semantically correct.
>>
>>Patch follows.
>>
>>
>>Enzo
>>
>>--------
>>When verifying a response's signature, the computation will go through
>>the iov buffer (header + response structs) and the over the page data, to
>>verify any dynamic data appended to the message (usually on an SMB2 READ
>>response).
>>
>>When the response is an error, however, specifically on async reads,
>>the page data is allocated before receiving the expected data.  Being
>>"valid" data (from the signature computation POV; non-NULL, >0 pages),
>>it's included in the parsing and generates an invalid signature for the
>>message.
>>
>>Fix this by checking if the status is non-zero, and skip the page data
>>if it is.  The issue happens in all protocol versions, and this fix
>>applies to all.
>>
>>This issue can be observed with xfstests generic/465.
>>
>>Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
>>---
>>  fs/cifs/cifsencrypt.c | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>>diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
>>index 46f5718754f9..e3260bb436b3 100644
>>--- a/fs/cifs/cifsencrypt.c
>>+++ b/fs/cifs/cifsencrypt.c
>>@@ -32,15 +32,28 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
>>  	int rc;
>>  	struct kvec *iov = rqst->rq_iov;
>>  	int n_vec = rqst->rq_nvec;
>>+	bool has_error = false;
>>  	/* iov[0] is actual data and not the rfc1002 length for SMB2+ */
>>  	if (!is_smb1(server)) {
>>+		struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base;
>>+
>>  		if (iov[0].iov_len <= 4)
>>  			return -EIO;
>>+
>>+		if (shdr->Status != 0)
>>+			has_error = true;
>>+
>>  		i = 0;
>>  	} else {
>>+		struct smb_hdr *hdr = (struct smb_hdr *)iov[1].iov_base;
>>+
>>  		if (n_vec < 2 || iov[0].iov_len != 4)
>>  			return -EIO;
>>+
>>+		if (hdr->Status.CifsError != 0)
>>+			has_error = true;
>>+
>>  		i = 1; /* skip rfc1002 length */
>>  	}
>>@@ -61,6 +74,9 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
>>  		}
>>  	}
>>+	if (has_error)
>>+		goto out_final;
>>+
>>  	/* now hash over the rq_pages array */
>>  	for (i = 0; i < rqst->rq_npages; i++) {
>>  		void *kaddr;
>>@@ -81,6 +97,7 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
>>  		kunmap(rqst->rq_pages[i]);
>>  	}
>>+out_final:
>>  	rc = crypto_shash_final(shash, signature);
>>  	if (rc)
>>  		cifs_dbg(VFS, "%s: Could not generate hash\n", __func__);
Tom Talpey Sept. 28, 2022, 7:33 p.m. UTC | #3
On 9/28/2022 12:30 PM, Enzo Matsumiya wrote:
> On 09/28, Tom Talpey wrote:
>> On 9/22/2022 6:32 PM, Enzo Matsumiya wrote:
>>> Hi all,
>>>
>>> This patch is the (apparent) correct way to fix the issues regarding 
>>> some
>>> messages with invalid signatures. My previous patch
>>> https://lore.kernel.org/linux-cifs/20220918235442.2981-1-ematsumiya@suse.de/
>>> was wrong because a) it covered up the real issue (*), and b) I could 
>>> never
>>> again "reproduce" the "race" -- I had other patches in place when I 
>>> tested it,
>>> and (thus?) the system memory was not in "pristine" conditions, so 
>>> it's very
>>> likely that there's no race at all.
>>>
>>> (*) Thanks a lot to Tom Talpey for pointing me to the right direction 
>>> here.
>>>
>>> Since it sucks to be wrong twice, I'm sending this one as RFC because I
>>> wanted to confirm some things:
>>>
>>>   1) Can I rely on the fact that status != STATUS_SUCCESS means no 
>>> variable
>>>      data in the message?  I could only infer from the spec, but not 
>>> really
>>>      confirm.
>>
>> Technically, I think the answer is "no" but practically speaking maybe
>> "yes".
>>
>> Not all errors are actually errors however, e.g. STOPPED_ON_SYMLINK and
>> the additional error contexts which accompany it. We definitely don't
>> want to skip validation of those.
>>
>> There's also STATUS_PENDING, but that's special because it isn't actually
>> a response, it's just a "hang on I'm busy" that carries no other
>> data.
>>
>> Finally, MS-SMB2 lists a few others in section 2.2.2.2, all of which
>> need validation I think.
> 
> Sorry, I might have not been clear.  By "variable data" I actually meant
> data that's not going to be allocated in the requests' iovs, but rather
> in rqst->rq_pages.
> 
> The variable buffers that are already expected (by PDU/spec), are AFAICS
> always allocated either through cifs*buf_get() or some kmalloc() variant,
> i.e. non-HIGHMEM.
> 
> async IO, though, is the only place I can see that allocates
> rqst->rq_pages through iov_iter_get_pages_alloc2(), which is the source
> of my concern, and the reason of my patch.  Particularly, on async read
> cases, the page data is pre-allocated with the size we *expect* to
> receive/read from the server, but if there's an error in a READ
> response, the response itself will contain no/incomplete data in the page
> data, but since those pages are allocated, the signature computation will
> account for it, and generate a mismatching signature.

Aha - ok, got it. And I've been looking into those codepaths too because
the RDMA code fails badly when I try to optimize the completion path.
Right now, the code waits for ALL smbdirect packets to be sent before
returning, which is ridiculously inefficient. It does this in both
client and server sending. But if I remove the wait, nothing works as
it appears the queued buffers are either freed, or unwound from the
stack. Completely and totally bogus.

> I'm running more tests now and I'll make sure to test the other possible
> errors in 2.2.2.2.
> 
>>>   2) (probably only in async cases) When the signature fails, for 
>>> whatever
>>>      reason, we don't take any action.  This doesn't seem right IMHO,
>>>      e.g. if the message is spoofed, we show a warning that the 
>>> signatures
>>>      doesn't match, but I would expect at least for the operation to 
>>> stop,
>>>      or maybe even a complete dis/reconnect.
>>
>> I don't think so. The spec calls for dropping the message, and after
>> all it could have been simple packet corruption. The retries will sort
>> it out.
>>
>> Absolutely positively do not print a message for each received packet,
>> good or bad.
> 
> For the good messages ok, of course, but for bad? Why not? Otherwise, e.g.,
> spotting the misvalidation mentioned above would've taken much longer.

It's basic MITM. If there is such an attacker, it can trivially redirect
the client, by injecting an error undetected. Inspecting the payload
before doing the integrity check is just insecurely wrong.

Regarding the logging on signature validation, consider that an attacker
will therefore trivially be able to overrun the log, either DOSing the
system, or at the very least hiding other messages. Why alarm the
sysadmin, for something they are not likely to be able to act upon?
Bump a statistic, maybe.

To me, the kernel log is not informational, it's for critical actions.
For example, it requires root to even see it. Consider what you need to
tell such a person each time it happens. "Oops, I dropped a packet"
doesn't cut it. If it's a developer hint, then place it under CIFS FYI,
or some sort of opt-in diagnostic tracing channel.

Tom.

> (btw I spotted that misvalidation while debugging my AES-GMAC series,
> but the same behaviuour happens with AES-CMAC i.e. in current code)
> 
>>>   3) For SMB1, I couldn't really use generic/465 as a real 
>>> confirmation for
>>>      this because it says "O_DIRECT is not supported".  From reading the
>>>      code and 'man mount.cifs' I understood that this is supported, 
>>> so what
>>>      gives?  Worth noting that I don't follow/use/test SMB1 too much.
>>>      The patch does work for other cases I tried though.
>>
>> Honestly, I don't think we care. No amount of patching can possibly
>> save SMB1.
> 
> Ack :)
> 
>> Tom.
> 
> 
> Cheers,
> 
> Enzo
> 
>>> I hope someone can address my questions/concerts above, and please 
>>> let me
>>> know if the patch is technically and semantically correct.
>>>
>>> Patch follows.
>>>
>>>
>>> Enzo
>>>
>>> --------
>>> When verifying a response's signature, the computation will go through
>>> the iov buffer (header + response structs) and the over the page 
>>> data, to
>>> verify any dynamic data appended to the message (usually on an SMB2 READ
>>> response).
>>>
>>> When the response is an error, however, specifically on async reads,
>>> the page data is allocated before receiving the expected data.  Being
>>> "valid" data (from the signature computation POV; non-NULL, >0 pages),
>>> it's included in the parsing and generates an invalid signature for the
>>> message.
>>>
>>> Fix this by checking if the status is non-zero, and skip the page data
>>> if it is.  The issue happens in all protocol versions, and this fix
>>> applies to all.
>>>
>>> This issue can be observed with xfstests generic/465.
>>>
>>> Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
>>> ---
>>>  fs/cifs/cifsencrypt.c | 17 +++++++++++++++++
>>>  1 file changed, 17 insertions(+)
>>>
>>> diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
>>> index 46f5718754f9..e3260bb436b3 100644
>>> --- a/fs/cifs/cifsencrypt.c
>>> +++ b/fs/cifs/cifsencrypt.c
>>> @@ -32,15 +32,28 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
>>>      int rc;
>>>      struct kvec *iov = rqst->rq_iov;
>>>      int n_vec = rqst->rq_nvec;
>>> +    bool has_error = false;
>>>      /* iov[0] is actual data and not the rfc1002 length for SMB2+ */
>>>      if (!is_smb1(server)) {
>>> +        struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base;
>>> +
>>>          if (iov[0].iov_len <= 4)
>>>              return -EIO;
>>> +
>>> +        if (shdr->Status != 0)
>>> +            has_error = true;
>>> +
>>>          i = 0;
>>>      } else {
>>> +        struct smb_hdr *hdr = (struct smb_hdr *)iov[1].iov_base;
>>> +
>>>          if (n_vec < 2 || iov[0].iov_len != 4)
>>>              return -EIO;
>>> +
>>> +        if (hdr->Status.CifsError != 0)
>>> +            has_error = true;
>>> +
>>>          i = 1; /* skip rfc1002 length */
>>>      }
>>> @@ -61,6 +74,9 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
>>>          }
>>>      }
>>> +    if (has_error)
>>> +        goto out_final;
>>> +
>>>      /* now hash over the rq_pages array */
>>>      for (i = 0; i < rqst->rq_npages; i++) {
>>>          void *kaddr;
>>> @@ -81,6 +97,7 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
>>>          kunmap(rqst->rq_pages[i]);
>>>      }
>>> +out_final:
>>>      rc = crypto_shash_final(shash, signature);
>>>      if (rc)
>>>          cifs_dbg(VFS, "%s: Could not generate hash\n", __func__);
>
diff mbox series

Patch

diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
index 46f5718754f9..e3260bb436b3 100644
--- a/fs/cifs/cifsencrypt.c
+++ b/fs/cifs/cifsencrypt.c
@@ -32,15 +32,28 @@  int __cifs_calc_signature(struct smb_rqst *rqst,
 	int rc;
 	struct kvec *iov = rqst->rq_iov;
 	int n_vec = rqst->rq_nvec;
+	bool has_error = false;
 
 	/* iov[0] is actual data and not the rfc1002 length for SMB2+ */
 	if (!is_smb1(server)) {
+		struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base;
+
 		if (iov[0].iov_len <= 4)
 			return -EIO;
+
+		if (shdr->Status != 0)
+			has_error = true;
+
 		i = 0;
 	} else {
+		struct smb_hdr *hdr = (struct smb_hdr *)iov[1].iov_base;
+
 		if (n_vec < 2 || iov[0].iov_len != 4)
 			return -EIO;
+
+		if (hdr->Status.CifsError != 0)
+			has_error = true;
+
 		i = 1; /* skip rfc1002 length */
 	}
 
@@ -61,6 +74,9 @@  int __cifs_calc_signature(struct smb_rqst *rqst,
 		}
 	}
 
+	if (has_error)
+		goto out_final;
+
 	/* now hash over the rq_pages array */
 	for (i = 0; i < rqst->rq_npages; i++) {
 		void *kaddr;
@@ -81,6 +97,7 @@  int __cifs_calc_signature(struct smb_rqst *rqst,
 		kunmap(rqst->rq_pages[i]);
 	}
 
+out_final:
 	rc = crypto_shash_final(shash, signature);
 	if (rc)
 		cifs_dbg(VFS, "%s: Could not generate hash\n", __func__);