diff mbox series

[bpf-next,04/10] bpf: Make errors of bpf_tcp_check_syncookie distinguishable

Message ID 20211019144655.3483197-5-maximmi@nvidia.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Series New BPF helpers to accelerate synproxy

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
netdev/cover_letter success Series has a cover letter
netdev/fixes_present success Fixes tag not required for -next series
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for bpf-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 3 maintainers not CCed: llvm@lists.linux.dev brouer@redhat.com liuhangbin@gmail.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 11790 this patch: 11790
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success No Fixes tag
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 68 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 11421 this patch: 11421
netdev/header_inline success No static functions without inline keyword in header files
bpf/vmtest-bpf-next success VM_Test

Commit Message

Maxim Mikityanskiy Oct. 19, 2021, 2:46 p.m. UTC
bpf_tcp_check_syncookie returns errors when SYN cookie generation is
disabled (EINVAL) or when no cookies were recently generated (ENOENT).
The same error codes are used for other kinds of errors: invalid
parameters (EINVAL), invalid packet (EINVAL, ENOENT), bad cookie
(ENOENT). Such an overlap makes it impossible for a BPF program to
distinguish different cases that may require different handling.

For a BPF program that accelerates generating and checking SYN cookies,
typical logic looks like this (with current error codes annotated):

1. Drop invalid packets (EINVAL, ENOENT).

2. Drop packets with bad cookies (ENOENT).

3. Pass packets with good cookies (0).

4. Pass all packets when cookies are not in use (EINVAL, ENOENT).

The last point also matches the behavior of cookie_v4_check and
cookie_v6_check that skip all checks if cookie generation is disabled or
no cookies were recently generated. Overlapping error codes, however,
make it impossible to distinguish case 4 from cases 1 and 2.

The original commit message of commit 399040847084 ("bpf: add helper to
check for a valid SYN cookie") mentions another use case, though:
traffic classification, where it's important to distinguish new
connections from existing ones, and case 4 should be distinguishable
from case 3.

To match the requirements of both use cases, this patch reassigns error
codes of bpf_tcp_check_syncookie and adds missing documentation:

1. EINVAL: Invalid packets.

2. EACCES: Packets with bad cookies.

3. 0: Packets with good cookies.

4. ENOENT: Cookies are not in use.

This way all four cases are easily distinguishable.
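
As an illustration only, the four cases above could be handled by a caller roughly as follows. This is a userspace-compilable sketch, not part of the patch: the verdict enum and the function name are hypothetical, and dropping on any other error code is an assumption made for the example.

```c
#include <errno.h>

/* Hypothetical verdicts for a SYN-proxy-style program; these names are
 * illustrative and do not appear in the patch. */
enum syncookie_verdict {
	VERDICT_PASS,
	VERDICT_DROP,
};

/* Map a return value of bpf_tcp_check_syncookie(), using the error codes
 * proposed in this patch, to a verdict. */
static enum syncookie_verdict verdict_for_ret(long ret)
{
	switch (ret) {
	case 0:		/* case 3: good cookie */
		return VERDICT_PASS;
	case -ENOENT:	/* case 4: cookies are not in use */
		return VERDICT_PASS;
	case -EACCES:	/* case 2: bad cookie */
	case -EINVAL:	/* case 1: invalid packet or arguments */
	default:	/* any other error: dropping is an assumption here */
		return VERDICT_DROP;
	}
}
```

With the current error codes the -ENOENT and -EINVAL branches could not be split this way, because each of them covers more than one case.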

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
---
 include/uapi/linux/bpf.h       | 18 ++++++++++++++++--
 net/core/filter.c              |  6 +++---
 tools/include/uapi/linux/bpf.h | 18 ++++++++++++++++--
 3 files changed, 35 insertions(+), 7 deletions(-)

Comments

John Fastabend Oct. 20, 2021, 3:28 a.m. UTC | #1
Maxim Mikityanskiy wrote:
> bpf_tcp_check_syncookie returns errors when SYN cookie generation is
> disabled (EINVAL) or when no cookies were recently generated (ENOENT).
> The same error codes are used for other kinds of errors: invalid
> parameters (EINVAL), invalid packet (EINVAL, ENOENT), bad cookie
> (ENOENT). Such an overlap makes it impossible for a BPF program to
> distinguish different cases that may require different handling.

I'm not sure we can change these errors now. They are embedded in
the helper API. I think a BPF program could uncover the meaning
of the error anyways with some error path handling?

Anyways even if we do change these most of us who run programs
on multiple kernel versions would not be able to rely on them
being one way or the other easily.

> 
> For a BPF program that accelerates generating and checking SYN cookies,
> typical logic looks like this (with current error codes annotated):
> 
> 1. Drop invalid packets (EINVAL, ENOENT).
> 
> 2. Drop packets with bad cookies (ENOENT).
> 
> 3. Pass packets with good cookies (0).
> 
> 4. Pass all packets when cookies are not in use (EINVAL, ENOENT).
> 
> The last point also matches the behavior of cookie_v4_check and
> cookie_v6_check that skip all checks if cookie generation is disabled or
> no cookies were recently generated. Overlapping error codes, however,
> make it impossible to distinguish case 4 from cases 1 and 2.
> 
> The original commit message of commit 399040847084 ("bpf: add helper to
> check for a valid SYN cookie") mentions another use case, though:
> traffic classification, where it's important to distinguish new
> connections from existing ones, and case 4 should be distinguishable
> from case 3.
> 
> To match the requirements of both use cases, this patch reassigns error
> codes of bpf_tcp_check_syncookie and adds missing documentation:
> 
> 1. EINVAL: Invalid packets.
> 
> 2. EACCES: Packets with bad cookies.
> 
> 3. 0: Packets with good cookies.
> 
> 4. ENOENT: Cookies are not in use.
> 
> This way all four cases are easily distinguishable.
> 
> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>

At the very least this would need a Fixes tag and should be backported
as a bug. Then we at least have a chance that stable and LTS kernels
report the same thing.

[...]

> --- a/net/core/filter.c
> +++ b/net/core/filter.c
 
I'll take a stab at how a program can learn the error cause today.

BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len,
	   struct tcphdr *, th, u32, th_len)
{
#ifdef CONFIG_SYN_COOKIES
	u32 cookie;
	int ret;

// The BPF program should know if it passes bad values, and can check
	if (unlikely(!sk || th_len < sizeof(*th)))
		return -EINVAL;

// sk_protocol and sk_state are exposed in sk and can be read directly 
	/* sk_listener() allows TCP_NEW_SYN_RECV, which makes no sense here. */
	if (sk->sk_protocol != IPPROTO_TCP || sk->sk_state != TCP_LISTEN)
		return -EINVAL;

// This is a user space knob right? I think this is a misconfig user can
// check before loading a program with check_syncookie?
	if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies)
		return -EINVAL;

// We have th pointer can't we just check?
	if (!th->ack || th->rst || th->syn)
		return -ENOENT;

	if (tcp_synq_no_recent_overflow(sk))
		return -ENOENT;

	cookie = ntohl(th->ack_seq) - 1;

	switch (sk->sk_family) {
	case AF_INET:
// misconfiguration but can be checked.
		if (unlikely(iph_len < sizeof(struct iphdr)))
			return -EINVAL;

		ret = __cookie_v4_check((struct iphdr *)iph, th, cookie);
		break;

#if IS_BUILTIN(CONFIG_IPV6)
	case AF_INET6:
// misconfiguration can check as well
		if (unlikely(iph_len < sizeof(struct ipv6hdr)))
			return -EINVAL;

		ret = __cookie_v6_check((struct ipv6hdr *)iph, th, cookie);
		break;
#endif /* CONFIG_IPV6 */

	default:
		return -EPROTONOSUPPORT;
	}

	if (ret > 0)
		return 0;

	return -ENOENT;
#else
	return -ENOTSUPP;
#endif
}


So I guess my point is that we have all the fields, so we could write a
bit of BPF to find the error cause if necessary. That might be better
than changing the error codes and having to deal with the differences
between kernels. I do see how it would have been better to get the
errors correct in the first patch though :/
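
A rough sketch of that idea for the pre-patch -ENOENT case, written as plain C so it compiles outside the kernel. The enum and function names are hypothetical; the flag test mirrors the helper's own check in the listing above. From these inputs alone the last two causes are not told apart.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical causes of a pre-patch -ENOENT; names are illustrative. */
enum enoent_cause {
	CAUSE_NOT_ENOENT,
	CAUSE_BAD_TCP_FLAGS,
	CAUSE_BAD_COOKIE_OR_NO_OVERFLOW,
};

/* ack, rst and syn are the TCP header flags the program already has. */
static enum enoent_cause explain_enoent(long ret, bool ack, bool rst, bool syn)
{
	if (ret != -ENOENT)
		return CAUSE_NOT_ENOENT;

	/* Same condition as the helper: !th->ack || th->rst || th->syn */
	if (!ack || rst || syn)
		return CAUSE_BAD_TCP_FLAGS;

	/* What is left is either tcp_synq_no_recent_overflow() or a failed
	 * cookie check; these inputs cannot separate the two. */
	return CAUSE_BAD_COOKIE_OR_NO_OVERFLOW;
}
```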

By the way, I haven't got to the next set of patches with the
actual features, but why not push everything above this patch
as fixes in its own series? Then the fixes can get going while
we review the feature.

Thanks,
John
Maxim Mikityanskiy Oct. 20, 2021, 1:16 p.m. UTC | #2
On 2021-10-20 06:28, John Fastabend wrote:
> Maxim Mikityanskiy wrote:
>> bpf_tcp_check_syncookie returns errors when SYN cookie generation is
>> disabled (EINVAL) or when no cookies were recently generated (ENOENT).
>> The same error codes are used for other kinds of errors: invalid
>> parameters (EINVAL), invalid packet (EINVAL, ENOENT), bad cookie
>> (ENOENT). Such an overlap makes it impossible for a BPF program to
>> distinguish different cases that may require different handling.
> 
> I'm not sure we can change these errors now. They are embedded in
> the helper API. I think a BPF program could uncover the meaning
> of the error anyways with some error path handling?
> 
> Anyways even if we do change these most of us who run programs
> on multiple kernel versions would not be able to rely on them
> being one way or the other easily.

The thing is, the error codes aren't really documented:

  * 0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
  * error otherwise.

My patch doesn't break this assumption.

Practically speaking, there are two use cases of bpf_tcp_check_syncookie 
that I know about: traffic classification (find NEW ACK packets with the 
right cookie) and SYN flood protection.

For traffic classification, it's not important what error code we get. 
The logic for ACK packets is as follows:

1. Connection established => ESTABLISHED. Otherwise,

2. bpf_tcp_check_syncookie returns 0 => NEW. Otherwise,

3. INVALID (regardless of the specific error code).

My patch doesn't break this use case.
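
The three steps above can be sketched in plain C as follows. The state names and the function are hypothetical, and "established" stands for whatever connection lookup the classifier already performs.

```c
#include <stdbool.h>

/* Hypothetical classification states; names are illustrative only. */
enum ct_state {
	CT_ESTABLISHED,
	CT_NEW,
	CT_INVALID,
};

/* Classify an ACK packet. syncookie_ret is the value returned by
 * bpf_tcp_check_syncookie(); only zero versus non-zero matters here. */
static enum ct_state classify_ack(bool established, long syncookie_ret)
{
	if (established)
		return CT_ESTABLISHED;
	if (syncookie_ret == 0)
		return CT_NEW;
	return CT_INVALID;	/* regardless of the specific error code */
}
```

Since the result depends only on whether the helper returned 0, reassigning the negative codes leaves it unchanged.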

>>
>> For a BPF program that accelerates generating and checking SYN cookies,
>> typical logic looks like this (with current error codes annotated):
>>
>> 1. Drop invalid packets (EINVAL, ENOENT).
>>
>> 2. Drop packets with bad cookies (ENOENT).
>>
>> 3. Pass packets with good cookies (0).
>>
>> 4. Pass all packets when cookies are not in use (EINVAL, ENOENT).

Now that I'm reflecting on it again, it would make more sense to drop 
packets in case 4: it's a new packet, it's an ACK, and we don't expect 
any cookies.

>> The last point also matches the behavior of cookie_v4_check and
>> cookie_v6_check that skip all checks if cookie generation is disabled or
>> no cookies were recently generated. Overlapping error codes, however,
>> make it impossible to distinguish case 4 from cases 1 and 2.

If so, we don't strictly need to distinguish case 4 from 1 and 2. The 
logic for ACK packets is similar:

1. Connection established => XDP_PASS. Otherwise,

2. bpf_tcp_check_syncookie returns 0 => XDP_PASS. Otherwise,

3. XDP_DROP.

So, on one hand, it looks like both use cases can be implemented without 
this patch. On the other hand, changing the error codes to more 
meaningful ones shouldn't break existing programs and can have its 
benefits, for example, in debugging or in statistics counting.
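
A sketch of the statistics-counting benefit, assuming the error codes proposed in this patch. The action enum is a stand-in for XDP_DROP and XDP_PASS, and the struct and function names are hypothetical; a real XDP program would more likely keep such counters in a BPF map.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-ins for XDP_DROP / XDP_PASS; values are illustrative only. */
enum sketch_action {
	SKETCH_DROP,
	SKETCH_PASS,
};

/* Hypothetical per-cause counters. */
struct syncookie_stats {
	unsigned long good;
	unsigned long bad_cookie;
	unsigned long invalid;
	unsigned long not_in_use;
	unsigned long other;
};

/* The verdict depends only on ret == 0; the distinct error codes are used
 * purely for accounting. */
static enum sketch_action handle_ack(bool established, long ret,
				     struct syncookie_stats *stats)
{
	if (established)
		return SKETCH_PASS;

	switch (ret) {
	case 0:
		stats->good++;
		return SKETCH_PASS;
	case -EACCES:
		stats->bad_cookie++;
		break;
	case -EINVAL:
		stats->invalid++;
		break;
	case -ENOENT:
		stats->not_in_use++;
		break;
	default:
		stats->other++;
		break;
	}
	return SKETCH_DROP;
}
```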

>> The original commit message of commit 399040847084 ("bpf: add helper to
>> check for a valid SYN cookie") mentions another use case, though:
>> traffic classification, where it's important to distinguish new
>> connections from existing ones, and case 4 should be distinguishable
>> from case 3.
>>
>> To match the requirements of both use cases, this patch reassigns error
>> codes of bpf_tcp_check_syncookie and adds missing documentation:
>>
>> 1. EINVAL: Invalid packets.
>>
>> 2. EACCES: Packets with bad cookies.
>>
>> 3. 0: Packets with good cookies.
>>
>> 4. ENOENT: Cookies are not in use.
>>
>> This way all four cases are easily distinguishable.
>>
>> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
>> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> 
> At the very least this would need a Fixes tag and should be backported
> as a bug. Then we at least have a chance that stable and LTS kernels
> report the same thing.

That's a good idea.

> [...]
> 
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>   
> I'll take a stab at how a program can learn the error cause today.
> 
> BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len,
> 	   struct tcphdr *, th, u32, th_len)
> {
> #ifdef CONFIG_SYN_COOKIES
> 	u32 cookie;
> 	int ret;
> 
> // The BPF program should know if it passes bad values, and can check
> 	if (unlikely(!sk || th_len < sizeof(*th)))
> 		return -EINVAL;
> 
> // sk_protocol and sk_state are exposed in sk and can be read directly
> 	/* sk_listener() allows TCP_NEW_SYN_RECV, which makes no sense here. */
> 	if (sk->sk_protocol != IPPROTO_TCP || sk->sk_state != TCP_LISTEN)
> 		return -EINVAL;
> 
> // This is a user space knob right? I think this is a misconfig user can
> // check before loading a program with check_syncookie?

bpf_tcp_check_syncookie was initially introduced for the classification 
use case, to be able to classify new ACK packets with the right cookie 
as NEW. The XDP program classifies traffic regardless of whether SYN 
cookies are enabled. If we need to check the sysctl in userspace, it 
means we need two XDP programs (or additional trickery passing this 
value through a map).

> 	if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies)
> 		return -EINVAL;
> 
> // We have th pointer can't we just check?

Yes, most of the checks can be repeated in BPF, but it's obvious it's 
slower to do all the checks twice.

> 	if (!th->ack || th->rst || th->syn)
> 		return -ENOENT;
> 
> 	if (tcp_synq_no_recent_overflow(sk))
> 		return -ENOENT;

This specific check can't be done in BPF.

> 
> 	cookie = ntohl(th->ack_seq) - 1;
> 
> 	switch (sk->sk_family) {
> 	case AF_INET:
> // misconfiguration but can be checked.
> 		if (unlikely(iph_len < sizeof(struct iphdr)))
> 			return -EINVAL;
> 
> 		ret = __cookie_v4_check((struct iphdr *)iph, th, cookie);
> 		break;
> 
> #if IS_BUILTIN(CONFIG_IPV6)
> 	case AF_INET6:
> // misconfiguration can check as well
> 		if (unlikely(iph_len < sizeof(struct ipv6hdr)))
> 			return -EINVAL;
> 
> 		ret = __cookie_v6_check((struct ipv6hdr *)iph, th, cookie);
> 		break;
> #endif /* CONFIG_IPV6 */
> 
> 	default:
> 		return -EPROTONOSUPPORT;
> 	}
> 
> 	if (ret > 0)
> 		return 0;
> 
> 	return -ENOENT;
> #else
> 	return -ENOTSUPP;
> #endif
> }
> 
> 
> So I guess my point is that we have all the fields, so we could write a
> bit of BPF to find the error cause if necessary. That might be better
> than changing the error codes and having to deal with the differences
> between kernels. I do see how it would have been better to get the
> errors correct in the first patch though :/
> 
> By the way, I haven't got to the next set of patches with the
> actual features, but why not push everything above this patch
> as fixes in its own series? Then the fixes can get going while
> we review the feature.

OK, I'll respin the fixes separately, while the discussion on the 
approach to expose conntrack is going on.

Thanks for reviewing!

> 
> Thanks,
> John
>
Lorenz Bauer Oct. 20, 2021, 3:26 p.m. UTC | #3
On Wed, 20 Oct 2021 at 14:16, Maxim Mikityanskiy <maximmi@nvidia.com> wrote:
>
> On 2021-10-20 06:28, John Fastabend wrote:
> > Maxim Mikityanskiy wrote:
> >> bpf_tcp_check_syncookie returns errors when SYN cookie generation is
> >> disabled (EINVAL) or when no cookies were recently generated (ENOENT).
> >> The same error codes are used for other kinds of errors: invalid
> >> parameters (EINVAL), invalid packet (EINVAL, ENOENT), bad cookie
> >> (ENOENT). Such an overlap makes it impossible for a BPF program to
> >> distinguish different cases that may require different handling.
> >
> > I'm not sure we can change these errors now. They are embedded in
> > the helper API. I think a BPF program could uncover the meaning
> > of the error anyways with some error path handling?
> >
> > Anyways even if we do change these most of us who run programs
> > on multiple kernel versions would not be able to rely on them
> > being one way or the other easily.
>
> The thing is, the error codes aren't really documented:
>
>   * 0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
>   * error otherwise.

Yes, I kept this vague so that there is some wiggle room. FWIW your
proposed change would not break our BPF. Same for the examples
included in the kernel source itself. That is no guarantee of course.

Personally, I'm a bit on the fence regarding a backport of this.
Either this is a legitimate extension of the API and we don't
backport, or it's a bug (how?) and then we should backport.

Patch

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 6fc59d61937a..2f12b11f1259 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3545,8 +3545,22 @@  union bpf_attr {
  * 		*th* points to the start of the TCP header, while *th_len*
  * 		contains **sizeof**\ (**struct tcphdr**).
  * 	Return
- * 		0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
- * 		error otherwise.
+ *		0 if *iph* and *th* are a valid SYN cookie ACK.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EACCES** if the SYN cookie is not valid.
+ *
+ *		**-EINVAL** if the packet or input arguments are invalid.
+ *
+ *		**-ENOENT** if SYN cookies are not issued (no SYN flood, or SYN
+ *		cookies are disabled in sysctl).
+ *
+ *		**-EOPNOTSUPP** if the kernel configuration does not enable SYN
+ *		cookies (CONFIG_SYN_COOKIES is off).
+ *
+ *		**-EPROTONOSUPPORT** if the IP version is not 4 or 6 (or 6, but
+ *		CONFIG_IPV6 is disabled).
  *
  * long bpf_sysctl_get_name(struct bpf_sysctl *ctx, char *buf, size_t buf_len, u64 flags)
  *	Description
diff --git a/net/core/filter.c b/net/core/filter.c
index 2c5877b775d9..d04988e67640 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6709,10 +6709,10 @@  BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len
 		return -EINVAL;
 
 	if (!sock_net(sk)->ipv4.sysctl_tcp_syncookies)
-		return -EINVAL;
+		return -ENOENT;
 
 	if (!th->ack || th->rst || th->syn)
-		return -ENOENT;
+		return -EINVAL;
 
 	if (tcp_synq_no_recent_overflow(sk))
 		return -ENOENT;
@@ -6752,7 +6752,7 @@  BPF_CALL_5(bpf_tcp_check_syncookie, struct sock *, sk, void *, iph, u32, iph_len
 	if (ret > 0)
 		return 0;
 
-	return -ENOENT;
+	return -EACCES;
 #else
 	return -ENOTSUPP;
 #endif
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 6fc59d61937a..2f12b11f1259 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3545,8 +3545,22 @@  union bpf_attr {
  * 		*th* points to the start of the TCP header, while *th_len*
  * 		contains **sizeof**\ (**struct tcphdr**).
  * 	Return
- * 		0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
- * 		error otherwise.
+ *		0 if *iph* and *th* are a valid SYN cookie ACK.
+ *
+ *		On failure, the returned value is one of the following:
+ *
+ *		**-EACCES** if the SYN cookie is not valid.
+ *
+ *		**-EINVAL** if the packet or input arguments are invalid.
+ *
+ *		**-ENOENT** if SYN cookies are not issued (no SYN flood, or SYN
+ *		cookies are disabled in sysctl).
+ *
+ *		**-EOPNOTSUPP** if the kernel configuration does not enable SYN
+ *		cookies (CONFIG_SYN_COOKIES is off).
+ *
+ *		**-EPROTONOSUPPORT** if the IP version is not 4 or 6 (or 6, but
+ *		CONFIG_IPV6 is disabled).
  *
  * long bpf_sysctl_get_name(struct bpf_sysctl *ctx, char *buf, size_t buf_len, u64 flags)
  *	Description