Message ID | 20220221180358.169101-1-jakub@cloudflare.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] selftests/bpf: Fix implementation-defined behavior in sk_lookup test | expand |
On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote: > Shifting 16-bit type by 16 bits is implementation-defined for BPF > programs. > Don't rely on it in case it is causing the test failures we are > seeing on > s390x z15 target. > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from > remote_port in bpf_sk_lookup") > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> > --- > > I don't have a dev env for s390x/z15 set up yet, so can't definitely > confirm the fix. > That said, it seems worth fixing either way. > > tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > index bf5b7caefdd0..7d47276a8964 100644 > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = SERVER_A; > static const __u32 KEY_SERVER_B = SERVER_B; > > static const __u16 SRC_PORT = bpf_htons(8008); > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << 16); > static const __u32 SRC_IP4 = IP4(127, 0, 0, 2); > static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0, > 0x00000002); > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct bpf_sk_lookup *ctx) > > /* Load from remote_port field with zero padding (backward > compatibility) */ > val_u32 = *(__u32 *)&ctx->remote_port; > - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) > + if (val_u32 != SRC_PORT_U32) > return SK_DROP; > > /* Narrow loads from local_port field. Expect DST_PORT. */ Unfortunately this doesn't help with the s390 problem. I'll try to debug this.
On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote: > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote: > > Shifting 16-bit type by 16 bits is implementation-defined for BPF > > programs. > > Don't rely on it in case it is causing the test failures we are > > seeing on > > s390x z15 target. > > > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from > > remote_port in bpf_sk_lookup") > > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> > > --- > > > > I don't have a dev env for s390x/z15 set up yet, so can't > > definitely > > confirm the fix. > > That said, it seems worth fixing either way. > > > > tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > index bf5b7caefdd0..7d47276a8964 100644 > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = SERVER_A; > > static const __u32 KEY_SERVER_B = SERVER_B; > > > > static const __u16 SRC_PORT = bpf_htons(8008); > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << 16); > > static const __u32 SRC_IP4 = IP4(127, 0, 0, 2); > > static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0, > > 0x00000002); > > > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct bpf_sk_lookup > > *ctx) > > > > /* Load from remote_port field with zero padding (backward > > compatibility) */ > > val_u32 = *(__u32 *)&ctx->remote_port; > > - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) > > + if (val_u32 != SRC_PORT_U32) > > return SK_DROP; > > > > /* Narrow loads from local_port field. Expect DST_PORT. */ > > Unfortunately this doesn't help with the s390 problem. > I'll try to debug this. I have to admit I have a hard time wrapping my head around the requirements here. Based on the pre-9a69e2b385f4 code, do I understand correctly that for the following input Port: 0x1f48 SRC_PORT: 0x481f we expect the following results for different kinds of loads: Size Offset LE BE BPF_B 0 0x1f 0 BPF_B 1 0x48 0 BPF_B 2 0 0x48 BPF_B 3 0 0x1f BPF_H 0 0x481f 0 BPF_H 1 0 0x481f BPF_W 0 0x481f 0x481f and this is guaranteed by the struct bpf_sk_lookup ABI? Because then it looks as if 9a69e2b385f4 breaks it on big-endian as follows: Size Offset BE-9a69e2b385f4 BPF_B 0 0x48 BPF_B 1 0x1f BPF_B 2 0 BPF_B 3 0 BPF_H 0 0x481f BPF_H 1 0 BPF_W 0 0x481f0000 Or is the old behavior a bug and this new one is desirable? 9a69e2b385f4 has no Fixes: tag, so I assume that's the former :-( In which case, would it make sense to fix it by swapping remote_port and :16 in bpf_sk_lookup on big-endian? Best regards, Ilya
On Tue, 2022-02-22 at 01:43 +0100, Ilya Leoshkevich wrote: > On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote: > > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote: > > > Shifting 16-bit type by 16 bits is implementation-defined for BPF > > > programs. > > > Don't rely on it in case it is causing the test failures we are > > > seeing on > > > s390x z15 target. > > > > > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from > > > remote_port in bpf_sk_lookup") > > > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> > > > --- > > > > > > I don't have a dev env for s390x/z15 set up yet, so can't > > > definitely > > > confirm the fix. > > > That said, it seems worth fixing either way. > > > > > > tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > index bf5b7caefdd0..7d47276a8964 100644 > > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = SERVER_A; > > > static const __u32 KEY_SERVER_B = SERVER_B; > > > > > > static const __u16 SRC_PORT = bpf_htons(8008); > > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << 16); > > > static const __u32 SRC_IP4 = IP4(127, 0, 0, 2); > > > static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0, > > > 0x00000002); > > > > > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct bpf_sk_lookup > > > *ctx) > > > > > > /* Load from remote_port field with zero padding > > > (backward > > > compatibility) */ > > > val_u32 = *(__u32 *)&ctx->remote_port; > > > - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) > > > + if (val_u32 != SRC_PORT_U32) > > > return SK_DROP; > > > > > > /* Narrow loads from local_port field. Expect DST_PORT. > > > */ > > > > Unfortunately this doesn't help with the s390 problem. > > I'll try to debug this. > > I have to admit I have a hard time wrapping my head around the > requirements here. > > Based on the pre-9a69e2b385f4 code, do I understand correctly that > for the following input > > Port: 0x1f48 > SRC_PORT: 0x481f > > we expect the following results for different kinds of loads: > > Size Offset LE BE > BPF_B 0 0x1f 0 > BPF_B 1 0x48 0 > BPF_B 2 0 0x48 > BPF_B 3 0 0x1f > BPF_H 0 0x481f 0 > BPF_H 1 0 0x481f > BPF_W 0 0x481f 0x481f > > and this is guaranteed by the struct bpf_sk_lookup ABI? Because then > it > looks as if 9a69e2b385f4 breaks it on big-endian as follows: > > Size Offset BE-9a69e2b385f4 > BPF_B 0 0x48 > BPF_B 1 0x1f > BPF_B 2 0 > BPF_B 3 0 > BPF_H 0 0x481f > BPF_H 1 0 > BPF_W 0 0x481f0000 Sorry, I worded this incorrectly: 9a69e2b385f4 did not change the kernel behavior, the ABI is not broken and the old compiled code should continue to work. What the second table really shows are what the results should be according to the 9a69e2b385f4 struct bpf_sk_lookup definition, which I still think is broken on big-endian and needs to be adjusted to match the ABI. I noticed one other strange thing in the meantime: loads from *(__u32 *)&ctx->remote_port, *(__u16 *)&ctx->remote_port and *((__u16 *)&ctx->remote_port + 1) all produce 8008 on s390, which is clearly inconsistent. It looks as if convert_ctx_accesses() needs to be adjusted to handle combinations like ctx_field_size == 4 && size == 2 && target_size == 2. I will continue with this tomorrow. > Or is the old behavior a bug and this new one is desirable? > 9a69e2b385f4 has no Fixes: tag, so I assume that's the former :-( > > In which case, would it make sense to fix it by swapping remote_port > and :16 in bpf_sk_lookup on big-endian? > > Best regards, > Ilya
On Tue, Feb 22, 2022 at 03:22 AM +01, Ilya Leoshkevich wrote: > On Tue, 2022-02-22 at 01:43 +0100, Ilya Leoshkevich wrote: >> On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote: >> > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote: >> > > Shifting 16-bit type by 16 bits is implementation-defined for BPF >> > > programs. >> > > Don't rely on it in case it is causing the test failures we are >> > > seeing on >> > > s390x z15 target. >> > > >> > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from >> > > remote_port in bpf_sk_lookup") >> > > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> >> > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> >> > > --- >> > > >> > > I don't have a dev env for s390x/z15 set up yet, so can't >> > > definitely >> > > confirm the fix. >> > > That said, it seems worth fixing either way. >> > > >> > > tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 ++- >> > > 1 file changed, 2 insertions(+), 1 deletion(-) >> > > >> > > diff --git a/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > index bf5b7caefdd0..7d47276a8964 100644 >> > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = SERVER_A; >> > > static const __u32 KEY_SERVER_B = SERVER_B; >> > > >> > > static const __u16 SRC_PORT = bpf_htons(8008); >> > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << 16); >> > > static const __u32 SRC_IP4 = IP4(127, 0, 0, 2); >> > > static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0, >> > > 0x00000002); >> > > >> > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct bpf_sk_lookup >> > > *ctx) >> > > >> > > /* Load from remote_port field with zero padding >> > > (backward >> > > compatibility) */ >> > > val_u32 = *(__u32 *)&ctx->remote_port; >> > > - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) >> > > + if (val_u32 != SRC_PORT_U32) >> > > return SK_DROP; >> > > >> > > /* Narrow loads from local_port field. Expect DST_PORT. >> > > */ >> > >> > Unfortunately this doesn't help with the s390 problem. >> > I'll try to debug this. >> >> I have to admit I have a hard time wrapping my head around the >> requirements here. >> >> Based on the pre-9a69e2b385f4 code, do I understand correctly that >> for the following input >> >> Port: 0x1f48 >> SRC_PORT: 0x481f >> >> we expect the following results for different kinds of loads: >> >> Size Offset LE BE >> BPF_B 0 0x1f 0 >> BPF_B 1 0x48 0 >> BPF_B 2 0 0x48 >> BPF_B 3 0 0x1f >> BPF_H 0 0x481f 0 >> BPF_H 1 0 0x481f >> BPF_W 0 0x481f 0x481f >> >> and this is guaranteed by the struct bpf_sk_lookup ABI? Because then >> it >> looks as if 9a69e2b385f4 breaks it on big-endian as follows: >> >> Size Offset BE-9a69e2b385f4 >> BPF_B 0 0x48 >> BPF_B 1 0x1f >> BPF_B 2 0 >> BPF_B 3 0 >> BPF_H 0 0x481f >> BPF_H 1 0 >> BPF_W 0 0x481f0000 > > Sorry, I worded this incorrectly: 9a69e2b385f4 did not change the > kernel behavior, the ABI is not broken and the old compiled code should > continue to work. > What the second table really shows are what the results should be > according to the 9a69e2b385f4 struct bpf_sk_lookup definition, which I > still think is broken on big-endian and needs to be adjusted to match > the ABI. > > I noticed one other strange thing in the meantime: loads from > *(__u32 *)&ctx->remote_port, *(__u16 *)&ctx->remote_port and > *((__u16 *)&ctx->remote_port + 1) all produce 8008 on s390, which is > clearly inconsistent. It looks as if convert_ctx_accesses() needs to be > adjusted to handle combinations like ctx_field_size == 4 && size == 2 > && target_size == 2. I will continue with this tomorrow. > >> Or is the old behavior a bug and this new one is desirable? >> 9a69e2b385f4 has no Fixes: tag, so I assume that's the former :-( >> >> In which case, would it make sense to fix it by swapping remote_port >> and :16 in bpf_sk_lookup on big-endian? Thanks for looking into it. When it comes to requirements, my intention was to keep the same behavior as before the split up of the remote_port field in 9a69e2b385f4 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide"). 9a69e2b385f4 was supposed to be a formality, after a similar change in 4421a582718a ("bpf: Make dst_port field in struct bpf_sock 16-bit wide"), which went in earlier. In 4421a582718a I've provided a bit more context. I understand that the remote_port value, even before the type changed from u32 to u16, appeared to the BPF program as if laid out in memory like so: offsetof(struct bpf_sk_lookup, remote_port) +0 <port MSB> +1 <port LSB> +2 0x00 +3 0x00 Translating it to your handy table format, I expect should result in loads as so if port is 8008 = 0x1f48: Size Offset LE BE BPF_B 0 0x1f 0x1f BPF_B 1 0x48 0x48 BPF_B 2 0 0 BPF_B 3 0 0 BPF_H 0 0x481f 0x1f48 BPF_H 1 0 0 BPF_W 0 0x481f 0x1f480000 But since the fix does not work, there must be a mistake somewhere in my reasoning. I expect I should be able to get virtme for s390 working sometime this week to check my math. I've seen your collegue had some luck with it [1]. Looking forward to your findings. [1] https://github.com/cilium/ebpf/issues/86#issuecomment-623945549
On Tue, 2022-02-22 at 15:53 +0100, Jakub Sitnicki wrote: > On Tue, Feb 22, 2022 at 03:22 AM +01, Ilya Leoshkevich wrote: > > On Tue, 2022-02-22 at 01:43 +0100, Ilya Leoshkevich wrote: > > > On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote: > > > > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote: > > > > > Shifting 16-bit type by 16 bits is implementation-defined for > > > > > BPF > > > > > programs. > > > > > Don't rely on it in case it is causing the test failures we > > > > > are > > > > > seeing on > > > > > s390x z15 target. > > > > > > > > > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from > > > > > remote_port in bpf_sk_lookup") > > > > > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > > > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> > > > > > --- > > > > > > > > > > I don't have a dev env for s390x/z15 set up yet, so can't > > > > > definitely > > > > > confirm the fix. > > > > > That said, it seems worth fixing either way. > > > > > > > > > > tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 ++- > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git > > > > > a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > > > index bf5b7caefdd0..7d47276a8964 100644 > > > > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = SERVER_A; > > > > > static const __u32 KEY_SERVER_B = SERVER_B; > > > > > > > > > > static const __u16 SRC_PORT = bpf_htons(8008); > > > > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << 16); > > > > > static const __u32 SRC_IP4 = IP4(127, 0, 0, 2); > > > > > static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0, > > > > > 0x00000002); > > > > > > > > > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct > > > > > bpf_sk_lookup > > > > > *ctx) > > > > > > > > > > /* Load from remote_port field with zero padding > > > > > (backward > > > > > compatibility) */ > > > > > val_u32 = *(__u32 *)&ctx->remote_port; > > > > > - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) > > > > > + if (val_u32 != SRC_PORT_U32) > > > > > return SK_DROP; > > > > > > > > > > /* Narrow loads from local_port field. Expect > > > > > DST_PORT. > > > > > */ > > > > > > > > Unfortunately this doesn't help with the s390 problem. > > > > I'll try to debug this. > > > > > > I have to admit I have a hard time wrapping my head around the > > > requirements here. > > > > > > Based on the pre-9a69e2b385f4 code, do I understand correctly > > > that > > > for the following input > > > > > > Port: 0x1f48 > > > SRC_PORT: 0x481f > > > > > > we expect the following results for different kinds of loads: > > > > > > Size Offset LE BE > > > BPF_B 0 0x1f 0 > > > BPF_B 1 0x48 0 > > > BPF_B 2 0 0x48 > > > BPF_B 3 0 0x1f > > > BPF_H 0 0x481f 0 > > > BPF_H 1 0 0x481f > > > BPF_W 0 0x481f 0x481f > > > > > > and this is guaranteed by the struct bpf_sk_lookup ABI? Because > > > then > > > it > > > looks as if 9a69e2b385f4 breaks it on big-endian as follows: > > > > > > Size Offset BE-9a69e2b385f4 > > > BPF_B 0 0x48 > > > BPF_B 1 0x1f > > > BPF_B 2 0 > > > BPF_B 3 0 > > > BPF_H 0 0x481f > > > BPF_H 1 0 > > > BPF_W 0 0x481f0000 > > > > Sorry, I worded this incorrectly: 9a69e2b385f4 did not change the > > kernel behavior, the ABI is not broken and the old compiled code > > should > > continue to work. > > What the second table really shows are what the results should be > > according to the 9a69e2b385f4 struct bpf_sk_lookup definition, > > which I > > still think is broken on big-endian and needs to be adjusted to > > match > > the ABI. > > > > I noticed one other strange thing in the meantime: loads from > > *(__u32 *)&ctx->remote_port, *(__u16 *)&ctx->remote_port and > > *((__u16 *)&ctx->remote_port + 1) all produce 8008 on s390, which > > is > > clearly inconsistent. It looks as if convert_ctx_accesses() needs > > to be > > adjusted to handle combinations like ctx_field_size == 4 && size == > > 2 > > && target_size == 2. I will continue with this tomorrow. > > > > > Or is the old behavior a bug and this new one is desirable? > > > 9a69e2b385f4 has no Fixes: tag, so I assume that's the former :-( > > > > > > In which case, would it make sense to fix it by swapping > > > remote_port > > > and :16 in bpf_sk_lookup on big-endian? > > Thanks for looking into it. > > When it comes to requirements, my intention was to keep the same > behavior as before the split up of the remote_port field in > 9a69e2b385f4 > ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide"). > > 9a69e2b385f4 was supposed to be a formality, after a similar change > in > 4421a582718a ("bpf: Make dst_port field in struct bpf_sock 16-bit > wide"), which went in earlier. > > In 4421a582718a I've provided a bit more context. I understand that > the > remote_port value, even before the type changed from u32 to u16, > appeared to the BPF program as if laid out in memory like so: > > offsetof(struct bpf_sk_lookup, remote_port) +0 <port MSB> > +1 <port LSB> > +2 0x00 > +3 0x00 > > Translating it to your handy table format, I expect should result in > loads as so if port is 8008 = 0x1f48: > > Size Offset LE BE > BPF_B 0 0x1f 0x1f > BPF_B 1 0x48 0x48 > BPF_B 2 0 0 > BPF_B 3 0 0 > BPF_H 0 0x481f 0x1f48 > BPF_H 1 0 0 > BPF_W 0 0x481f 0x1f480000 Hmm, I think for big-endian the layout is different. If we look at test_sk_lookup.c from 9a69e2b385f4^: /* Narrow loads from remote_port field. Expect SRC_PORT. */ if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) || LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) || LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port, 3) != 0) return SK_DROP; LSB() on little-endian is just byte indexing, so it's indeed 1f,48,00,00. However, on big-endian it's indexing from the end, so it's 00,00,48,1f. > But since the fix does not work, there must be a mistake somewhere in > my > reasoning. > > I expect I should be able to get virtme for s390 working sometime > this > week to check my math. I've seen your collegue had some luck with it > [1]. Yeah, I think it should work. In the worst case it should be possible to tweak vmtest.sh to cross-compile and emulate s390. > Looking forward to your findings. > > [1] https://github.com/cilium/ebpf/issues/86#issuecomment-623945549
On Tue, Feb 22, 2022 at 06:42 PM +01, Ilya Leoshkevich wrote: > On Tue, 2022-02-22 at 15:53 +0100, Jakub Sitnicki wrote: >> On Tue, Feb 22, 2022 at 03:22 AM +01, Ilya Leoshkevich wrote: >> > On Tue, 2022-02-22 at 01:43 +0100, Ilya Leoshkevich wrote: >> > > On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote: >> > > > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote: >> > > > > Shifting 16-bit type by 16 bits is implementation-defined for >> > > > > BPF >> > > > > programs. >> > > > > Don't rely on it in case it is causing the test failures we >> > > > > are >> > > > > seeing on >> > > > > s390x z15 target. >> > > > > >> > > > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from >> > > > > remote_port in bpf_sk_lookup") >> > > > > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> >> > > > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> >> > > > > --- >> > > > > >> > > > > I don't have a dev env for s390x/z15 set up yet, so can't >> > > > > definitely >> > > > > confirm the fix. >> > > > > That said, it seems worth fixing either way. >> > > > > >> > > > > tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 ++- >> > > > > 1 file changed, 2 insertions(+), 1 deletion(-) >> > > > > >> > > > > diff --git >> > > > > a/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > > > index bf5b7caefdd0..7d47276a8964 100644 >> > > > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = SERVER_A; >> > > > > static const __u32 KEY_SERVER_B = SERVER_B; >> > > > > >> > > > > static const __u16 SRC_PORT = bpf_htons(8008); >> > > > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << 16); >> > > > > static const __u32 SRC_IP4 = IP4(127, 0, 0, 2); >> > > > > static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0, >> > > > > 0x00000002); >> > > > > >> > > > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct >> > > > > bpf_sk_lookup >> > > > > *ctx) >> > > > > >> > > > > /* Load from remote_port field with zero padding >> > > > > (backward >> > > > > compatibility) */ >> > > > > val_u32 = *(__u32 *)&ctx->remote_port; >> > > > > - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) >> > > > > + if (val_u32 != SRC_PORT_U32) >> > > > > return SK_DROP; >> > > > > >> > > > > /* Narrow loads from local_port field. Expect >> > > > > DST_PORT. >> > > > > */ >> > > > >> > > > Unfortunately this doesn't help with the s390 problem. >> > > > I'll try to debug this. >> > > >> > > I have to admit I have a hard time wrapping my head around the >> > > requirements here. >> > > >> > > Based on the pre-9a69e2b385f4 code, do I understand correctly >> > > that >> > > for the following input >> > > >> > > Port: 0x1f48 >> > > SRC_PORT: 0x481f >> > > >> > > we expect the following results for different kinds of loads: >> > > >> > > Size Offset LE BE >> > > BPF_B 0 0x1f 0 >> > > BPF_B 1 0x48 0 >> > > BPF_B 2 0 0x48 >> > > BPF_B 3 0 0x1f >> > > BPF_H 0 0x481f 0 >> > > BPF_H 1 0 0x481f >> > > BPF_W 0 0x481f 0x481f >> > > >> > > and this is guaranteed by the struct bpf_sk_lookup ABI? Because >> > > then >> > > it >> > > looks as if 9a69e2b385f4 breaks it on big-endian as follows: >> > > >> > > Size Offset BE-9a69e2b385f4 >> > > BPF_B 0 0x48 >> > > BPF_B 1 0x1f >> > > BPF_B 2 0 >> > > BPF_B 3 0 >> > > BPF_H 0 0x481f >> > > BPF_H 1 0 >> > > BPF_W 0 0x481f0000 >> > >> > Sorry, I worded this incorrectly: 9a69e2b385f4 did not change the >> > kernel behavior, the ABI is not broken and the old compiled code >> > should >> > continue to work. >> > What the second table really shows are what the results should be >> > according to the 9a69e2b385f4 struct bpf_sk_lookup definition, >> > which I >> > still think is broken on big-endian and needs to be adjusted to >> > match >> > the ABI. >> > >> > I noticed one other strange thing in the meantime: loads from >> > *(__u32 *)&ctx->remote_port, *(__u16 *)&ctx->remote_port and >> > *((__u16 *)&ctx->remote_port + 1) all produce 8008 on s390, which >> > is >> > clearly inconsistent. It looks as if convert_ctx_accesses() needs >> > to be >> > adjusted to handle combinations like ctx_field_size == 4 && size == >> > 2 >> > && target_size == 2. I will continue with this tomorrow. >> > >> > > Or is the old behavior a bug and this new one is desirable? >> > > 9a69e2b385f4 has no Fixes: tag, so I assume that's the former :-( >> > > >> > > In which case, would it make sense to fix it by swapping >> > > remote_port >> > > and :16 in bpf_sk_lookup on big-endian? >> >> Thanks for looking into it. >> >> When it comes to requirements, my intention was to keep the same >> behavior as before the split up of the remote_port field in >> 9a69e2b385f4 >> ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide"). >> >> 9a69e2b385f4 was supposed to be a formality, after a similar change >> in >> 4421a582718a ("bpf: Make dst_port field in struct bpf_sock 16-bit >> wide"), which went in earlier. >> >> In 4421a582718a I've provided a bit more context. I understand that >> the >> remote_port value, even before the type changed from u32 to u16, >> appeared to the BPF program as if laid out in memory like so: >> >> offsetof(struct bpf_sk_lookup, remote_port) +0 <port MSB> >> +1 <port LSB> >> +2 0x00 >> +3 0x00 >> >> Translating it to your handy table format, I expect should result in >> loads as so if port is 8008 = 0x1f48: >> >> Size Offset LE BE >> BPF_B 0 0x1f 0x1f >> BPF_B 1 0x48 0x48 >> BPF_B 2 0 0 >> BPF_B 3 0 0 >> BPF_H 0 0x481f 0x1f48 >> BPF_H 1 0 0 >> BPF_W 0 0x481f 0x1f480000 > > Hmm, I think for big-endian the layout is different. > If we look at test_sk_lookup.c from 9a69e2b385f4^: > > /* Narrow loads from remote_port field. Expect SRC_PORT. */ > if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) || > LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) || > LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port, 3) > != 0) > return SK_DROP; > > LSB() on little-endian is just byte indexing, so it's indeed > 1f,48,00,00. However, on big-endian it's indexing from the end, so > it's 00,00,48,1f. I understood that LSB() is indexing from the end on BE because SRC_PORT constant value differs on LE (= 0x481f) and BE (= 0x1f48) platforms, so LE BE SRC_PORT >> 0 1f 48 SRC_PORT >> 8 48 1f So on LE we first compare remote_port MSB, then LSB. While on BE we start with remote_port LSB, then MSB. But, now that you have pointed it out, I notice that sizeof(remote_port) has changed and from 4 to 2, and I can't see how LSB(…, 3) and LSB(…, 4) loads can keep working on big-endian. [...]
On Tue, 2022-02-22 at 19:24 +0100, Jakub Sitnicki wrote: > On Tue, Feb 22, 2022 at 06:42 PM +01, Ilya Leoshkevich wrote: > > On Tue, 2022-02-22 at 15:53 +0100, Jakub Sitnicki wrote: > > > On Tue, Feb 22, 2022 at 03:22 AM +01, Ilya Leoshkevich wrote: > > > > On Tue, 2022-02-22 at 01:43 +0100, Ilya Leoshkevich wrote: > > > > > On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote: > > > > > > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote: > > > > > > > Shifting 16-bit type by 16 bits is implementation-defined > > > > > > > for > > > > > > > BPF > > > > > > > programs. > > > > > > > Don't rely on it in case it is causing the test failures > > > > > > > we > > > > > > > are > > > > > > > seeing on > > > > > > > s390x z15 target. > > > > > > > > > > > > > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load > > > > > > > from > > > > > > > remote_port in bpf_sk_lookup") > > > > > > > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > > > > > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> > > > > > > > --- > > > > > > > > > > > > > > I don't have a dev env for s390x/z15 set up yet, so can't > > > > > > > definitely > > > > > > > confirm the fix. > > > > > > > That said, it seems worth fixing either way. > > > > > > > > > > > > > > tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 > > > > > > > ++- > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git > > > > > > > a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > > > > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > > > > > index bf5b7caefdd0..7d47276a8964 100644 > > > > > > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > > > > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c > > > > > > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = > > > > > > > SERVER_A; > > > > > > > static const __u32 KEY_SERVER_B = SERVER_B; > > > > > > > > > > > > > > static const __u16 SRC_PORT = bpf_htons(8008); > > > > > > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << > > > > > > > 16); > > > > > > > static const __u32 SRC_IP4 = IP4(127, 0, 0, 2); > > > > > > > static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0, > > > > > > > 0x00000002); > > > > > > > > > > > > > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct > > > > > > > bpf_sk_lookup > > > > > > > *ctx) > > > > > > > > > > > > > > /* Load from remote_port field with zero padding > > > > > > > (backward > > > > > > > compatibility) */ > > > > > > > val_u32 = *(__u32 *)&ctx->remote_port; > > > > > > > - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << > > > > > > > 16)) > > > > > > > + if (val_u32 != SRC_PORT_U32) > > > > > > > return SK_DROP; > > > > > > > > > > > > > > /* Narrow loads from local_port field. Expect > > > > > > > DST_PORT. > > > > > > > */ > > > > > > > > > > > > Unfortunately this doesn't help with the s390 problem. > > > > > > I'll try to debug this. > > > > > > > > > > I have to admit I have a hard time wrapping my head around > > > > > the > > > > > requirements here. > > > > > > > > > > Based on the pre-9a69e2b385f4 code, do I understand correctly > > > > > that > > > > > for the following input > > > > > > > > > > Port: 0x1f48 > > > > > SRC_PORT: 0x481f > > > > > > > > > > we expect the following results for different kinds of loads: > > > > > > > > > > Size Offset LE BE > > > > > BPF_B 0 0x1f 0 > > > > > BPF_B 1 0x48 0 > > > > > BPF_B 2 0 0x48 > > > > > BPF_B 3 0 0x1f > > > > > BPF_H 0 0x481f 0 > > > > > BPF_H 1 0 0x481f > > > > > BPF_W 0 0x481f 0x481f > > > > > > > > > > and this is guaranteed by the struct bpf_sk_lookup ABI? > > > > > Because > > > > > then > > > > > it > > > > > looks as if 9a69e2b385f4 breaks it on big-endian as follows: > > > > > > > > > > Size Offset BE-9a69e2b385f4 > > > > > BPF_B 0 0x48 > > > > > BPF_B 1 0x1f > > > > > BPF_B 2 0 > > > > > BPF_B 3 0 > > > > > BPF_H 0 0x481f > > > > > BPF_H 1 0 > > > > > BPF_W 0 0x481f0000 > > > > > > > > Sorry, I worded this incorrectly: 9a69e2b385f4 did not change > > > > the > > > > kernel behavior, the ABI is not broken and the old compiled > > > > code > > > > should > > > > continue to work. > > > > What the second table really shows are what the results should > > > > be > > > > according to the 9a69e2b385f4 struct bpf_sk_lookup definition, > > > > which I > > > > still think is broken on big-endian and needs to be adjusted to > > > > match > > > > the ABI. > > > > > > > > I noticed one other strange thing in the meantime: loads from > > > > *(__u32 *)&ctx->remote_port, *(__u16 *)&ctx->remote_port and > > > > *((__u16 *)&ctx->remote_port + 1) all produce 8008 on s390, > > > > which > > > > is > > > > clearly inconsistent. It looks as if convert_ctx_accesses() > > > > needs > > > > to be > > > > adjusted to handle combinations like ctx_field_size == 4 && > > > > size == > > > > 2 > > > > && target_size == 2. I will continue with this tomorrow. > > > > > > > > > Or is the old behavior a bug and this new one is desirable? > > > > > 9a69e2b385f4 has no Fixes: tag, so I assume that's the former > > > > > :-( > > > > > > > > > > In which case, would it make sense to fix it by swapping > > > > > remote_port > > > > > and :16 in bpf_sk_lookup on big-endian? > > > > > > Thanks for looking into it. > > > > > > When it comes to requirements, my intention was to keep the same > > > behavior as before the split up of the remote_port field in > > > 9a69e2b385f4 > > > ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit > > > wide"). > > > > > > 9a69e2b385f4 was supposed to be a formality, after a similar > > > change > > > in > > > 4421a582718a ("bpf: Make dst_port field in struct bpf_sock 16-bit > > > wide"), which went in earlier. > > > > > > In 4421a582718a I've provided a bit more context. I understand > > > that > > > the > > > remote_port value, even before the type changed from u32 to u16, > > > appeared to the BPF program as if laid out in memory like so: > > > > > > offsetof(struct bpf_sk_lookup, remote_port) +0 <port MSB> > > > +1 <port LSB> > > > +2 0x00 > > > +3 0x00 > > > > > > Translating it to your handy table format, I expect should result > > > in > > > loads as so if port is 8008 = 0x1f48: > > > > > > Size Offset LE BE > > > BPF_B 0 0x1f 0x1f > > > BPF_B 1 0x48 0x48 > > > BPF_B 2 0 0 > > > BPF_B 3 0 0 > > > BPF_H 0 0x481f 0x1f48 > > > BPF_H 1 0 0 > > > BPF_W 0 0x481f 0x1f480000 > > > > Hmm, I think for big-endian the layout is different. > > If we look at test_sk_lookup.c from 9a69e2b385f4^: > > > > /* Narrow loads from remote_port field. Expect SRC_PORT. */ > > if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) || > > LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) || > > LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port, > > 3) > > != 0) > > return SK_DROP; > > > > LSB() on little-endian is just byte indexing, so it's indeed > > 1f,48,00,00. However, on big-endian it's indexing from the end, so > > it's 00,00,48,1f. > > I understood that LSB() is indexing from the end on BE because > SRC_PORT > constant value differs on LE (= 0x481f) and BE (= 0x1f48) platforms, > so > > LE BE > SRC_PORT >> 0 1f 48 > SRC_PORT >> 8 48 1f > > So on LE we first compare remote_port MSB, then LSB. > While on BE we start with remote_port LSB, then MSB. > > But, now that you have pointed it out, I notice that > sizeof(remote_port) > has changed and from 4 to 2, and I can't see how LSB(…, 3) and LSB(…, > 4) > loads can keep working on big-endian. Oh, right - it should be 00,00,1f,48 on big-endian. Out-of-bounds LSB is indeed an issue. I've posted my current thoughts as an RFC series [1], this one is addressed in patch 3. [1] https://lore.kernel.org/bpf/20220222182559.2865596-1-iii@linux.ibm.com/ > > [...]
On Tue, Feb 22, 2022 at 10:51 PM +01, Ilya Leoshkevich wrote: > On Tue, 2022-02-22 at 19:24 +0100, Jakub Sitnicki wrote: >> On Tue, Feb 22, 2022 at 06:42 PM +01, Ilya Leoshkevich wrote: >> > On Tue, 2022-02-22 at 15:53 +0100, Jakub Sitnicki wrote: >> > > On Tue, Feb 22, 2022 at 03:22 AM +01, Ilya Leoshkevich wrote: >> > > > On Tue, 2022-02-22 at 01:43 +0100, Ilya Leoshkevich wrote: >> > > > > On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote: >> > > > > > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote: >> > > > > > > Shifting 16-bit type by 16 bits is implementation-defined >> > > > > > > for >> > > > > > > BPF >> > > > > > > programs. >> > > > > > > Don't rely on it in case it is causing the test failures >> > > > > > > we >> > > > > > > are >> > > > > > > seeing on >> > > > > > > s390x z15 target. >> > > > > > > >> > > > > > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load >> > > > > > > from >> > > > > > > remote_port in bpf_sk_lookup") >> > > > > > > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> >> > > > > > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> >> > > > > > > --- >> > > > > > > >> > > > > > > I don't have a dev env for s390x/z15 set up yet, so can't >> > > > > > > definitely >> > > > > > > confirm the fix. >> > > > > > > That said, it seems worth fixing either way. >> > > > > > > >> > > > > > > tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 >> > > > > > > ++- >> > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-) >> > > > > > > >> > > > > > > diff --git >> > > > > > > a/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > > > > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > > > > > index bf5b7caefdd0..7d47276a8964 100644 >> > > > > > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > > > > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c >> > > > > > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = >> > > > > > > SERVER_A; >> > > > > > > static const __u32 KEY_SERVER_B = SERVER_B; >> > > > > > > >> > > > > > > static const __u16 SRC_PORT = bpf_htons(8008); >> > > > > > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << >> > > > > > > 16); >> > > > > > > static const __u32 SRC_IP4 = IP4(127, 0, 0, 2); >> > > > > > > static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0, >> > > > > > > 0x00000002); >> > > > > > > >> > > > > > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct >> > > > > > > bpf_sk_lookup >> > > > > > > *ctx) >> > > > > > > >> > > > > > > /* Load from remote_port field with zero padding >> > > > > > > (backward >> > > > > > > compatibility) */ >> > > > > > > val_u32 = *(__u32 *)&ctx->remote_port; >> > > > > > > - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << >> > > > > > > 16)) >> > > > > > > + if (val_u32 != SRC_PORT_U32) >> > > > > > > return SK_DROP; >> > > > > > > >> > > > > > > /* Narrow loads from local_port field. Expect >> > > > > > > DST_PORT. >> > > > > > > */ >> > > > > > >> > > > > > Unfortunately this doesn't help with the s390 problem. >> > > > > > I'll try to debug this. >> > > > > >> > > > > I have to admit I have a hard time wrapping my head around >> > > > > the >> > > > > requirements here. >> > > > > >> > > > > Based on the pre-9a69e2b385f4 code, do I understand correctly >> > > > > that >> > > > > for the following input >> > > > > >> > > > > Port: 0x1f48 >> > > > > SRC_PORT: 0x481f >> > > > > >> > > > > we expect the following results for different kinds of loads: >> > > > > >> > > > > Size Offset LE BE >> > > > > BPF_B 0 0x1f 0 >> > > > > BPF_B 1 0x48 0 >> > > > > BPF_B 2 0 0x48 >> > > > > BPF_B 3 0 0x1f >> > > > > BPF_H 0 0x481f 0 >> > > > > BPF_H 1 0 0x481f >> > > > > BPF_W 0 0x481f 0x481f >> > > > > >> > > > > and this is guaranteed by the struct bpf_sk_lookup ABI? >> > > > > Because >> > > > > then >> > > > > it >> > > > > looks as if 9a69e2b385f4 breaks it on big-endian as follows: >> > > > > >> > > > > Size Offset BE-9a69e2b385f4 >> > > > > BPF_B 0 0x48 >> > > > > BPF_B 1 0x1f >> > > > > BPF_B 2 0 >> > > > > BPF_B 3 0 >> > > > > BPF_H 0 0x481f >> > > > > BPF_H 1 0 >> > > > > BPF_W 0 0x481f0000 >> > > > >> > > > Sorry, I worded this incorrectly: 9a69e2b385f4 did not change >> > > > the >> > > > kernel behavior, the ABI is not broken and the old compiled >> > > > code >> > > > should >> > > > continue to work. >> > > > What the second table really shows are what the results should >> > > > be >> > > > according to the 9a69e2b385f4 struct bpf_sk_lookup definition, >> > > > which I >> > > > still think is broken on big-endian and needs to be adjusted to >> > > > match >> > > > the ABI. >> > > > >> > > > I noticed one other strange thing in the meantime: loads from >> > > > *(__u32 *)&ctx->remote_port, *(__u16 *)&ctx->remote_port and >> > > > *((__u16 *)&ctx->remote_port + 1) all produce 8008 on s390, >> > > > which >> > > > is >> > > > clearly inconsistent. It looks as if convert_ctx_accesses() >> > > > needs >> > > > to be >> > > > adjusted to handle combinations like ctx_field_size == 4 && >> > > > size == >> > > > 2 >> > > > && target_size == 2. I will continue with this tomorrow. >> > > > >> > > > > Or is the old behavior a bug and this new one is desirable? >> > > > > 9a69e2b385f4 has no Fixes: tag, so I assume that's the former >> > > > > :-( >> > > > > >> > > > > In which case, would it make sense to fix it by swapping >> > > > > remote_port >> > > > > and :16 in bpf_sk_lookup on big-endian? >> > > >> > > Thanks for looking into it. >> > > >> > > When it comes to requirements, my intention was to keep the same >> > > behavior as before the split up of the remote_port field in >> > > 9a69e2b385f4 >> > > ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit >> > > wide"). >> > > >> > > 9a69e2b385f4 was supposed to be a formality, after a similar >> > > change >> > > in >> > > 4421a582718a ("bpf: Make dst_port field in struct bpf_sock 16-bit >> > > wide"), which went in earlier. >> > > >> > > In 4421a582718a I've provided a bit more context. I understand >> > > that >> > > the >> > > remote_port value, even before the type changed from u32 to u16, >> > > appeared to the BPF program as if laid out in memory like so: >> > > >> > > offsetof(struct bpf_sk_lookup, remote_port) +0 <port MSB> >> > > +1 <port LSB> >> > > +2 0x00 >> > > +3 0x00 >> > > >> > > Translating it to your handy table format, I expect should result >> > > in >> > > loads as so if port is 8008 = 0x1f48: >> > > >> > > Size Offset LE BE >> > > BPF_B 0 0x1f 0x1f >> > > BPF_B 1 0x48 0x48 >> > > BPF_B 2 0 0 >> > > BPF_B 3 0 0 >> > > BPF_H 0 0x481f 0x1f48 >> > > BPF_H 1 0 0 >> > > BPF_W 0 0x481f 0x1f480000 >> > >> > Hmm, I think for big-endian the layout is different. >> > If we look at test_sk_lookup.c from 9a69e2b385f4^: >> > >> > /* Narrow loads from remote_port field. Expect SRC_PORT. */ >> > if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) || >> > LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) || >> > LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port, >> > 3) >> > != 0) >> > return SK_DROP; >> > >> > LSB() on little-endian is just byte indexing, so it's indeed >> > 1f,48,00,00. However, on big-endian it's indexing from the end, so >> > it's 00,00,48,1f. >> >> I understood that LSB() is indexing from the end on BE because >> SRC_PORT >> constant value differs on LE (= 0x481f) and BE (= 0x1f48) platforms, >> so >> >> LE BE >> SRC_PORT >> 0 1f 48 >> SRC_PORT >> 8 48 1f >> >> So on LE we first compare remote_port MSB, then LSB. >> While on BE we start with remote_port LSB, then MSB. >> >> But, now that you have pointed it out, I notice that >> sizeof(remote_port) >> has changed and from 4 to 2, and I can't see how LSB(…, 3) and LSB(…, >> 4) >> loads can keep working on big-endian. > > Oh, right - it should be 00,00,1f,48 on big-endian. > Out-of-bounds LSB is indeed an issue. I've posted my current > thoughts as an RFC series [1], this one is addressed in patch 3. > > [1] > https://lore.kernel.org/bpf/20220222182559.2865596-1-iii@linux.ibm.com/ Sorry for the radio silience. I have patched my way through cross-compiling bpf selftests to s390, and can reproduce the sk_lookup test failure on big-endian. bpftool gen object/skeleton refusing to process BE BPF objects on an LE host got me stuck for a moment, but user mode qemu saved the day. FWIW, sock_fields test are also broken on BE, but failing silently, as I observed by instrumenting the read_sk_dst_port BPF prog with bpf_printk. I will be able check out the RFC series over this weekend.
diff --git a/tools/testing/selftests/bpf/progs/test_sk_lookup.c b/tools/testing/selftests/bpf/progs/test_sk_lookup.c index bf5b7caefdd0..7d47276a8964 100644 --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A = SERVER_A; static const __u32 KEY_SERVER_B = SERVER_B; static const __u16 SRC_PORT = bpf_htons(8008); +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U << 16); static const __u32 SRC_IP4 = IP4(127, 0, 0, 2); static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0, 0x00000002); @@ -421,7 +422,7 @@ int ctx_narrow_access(struct bpf_sk_lookup *ctx) /* Load from remote_port field with zero padding (backward compatibility) */ val_u32 = *(__u32 *)&ctx->remote_port; - if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) << 16)) + if (val_u32 != SRC_PORT_U32) return SK_DROP; /* Narrow loads from local_port field. Expect DST_PORT. */
Shifting 16-bit type by 16 bits is implementation-defined for BPF programs. Don't rely on it in case it is causing the test failures we are seeing on s390x z15 target. Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup") Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> --- I don't have a dev env for s390x/z15 set up yet, so can't definitely confirm the fix. That said, it seems worth fixing either way. tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)