Message ID | 20240731-selftest-sockmap-fixes-v2-0-08a0c73abed2@rbox.co (mailing list archive) |
---|---|
Headers | show |
Series | selftests/bpf: Various sockmap-related fixes | expand |
On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: > Series takes care of few bugs and missing features with the aim to improve > the test coverage of sockmap/sockhash. > > Last patch is a create_pair() rewrite making use of > __attribute__((cleanup)) to handle socket fd lifetime. > > Signed-off-by: Michal Luczaj <mhal@rbox.co> > --- Sorry for the long turn-around time. I have opened some kind of Pandora's box with a recent USO change and been battling a regression even since. Also it was CfP deadline week. I will run & review this today / tomorrow latest.
On 8/5/24 17:22, Jakub Sitnicki wrote: > On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: >> Series takes care of few bugs and missing features with the aim to improve >> the test coverage of sockmap/sockhash. >> >> Last patch is a create_pair() rewrite making use of >> __attribute__((cleanup)) to handle socket fd lifetime. >> >> Signed-off-by: Michal Luczaj <mhal@rbox.co> >> --- > > Sorry for the long turn-around time. > > I have opened some kind of Pandora's box with a recent USO change and > been battling a regression even since. Also it was CfP deadline week. > > I will run & review this today / tomorrow latest. Thanks for the update. But, really, as far as I'm concerned, no need for any rush.
On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: > Series takes care of few bugs and missing features with the aim to improve > the test coverage of sockmap/sockhash. > > Last patch is a create_pair() rewrite making use of > __attribute__((cleanup)) to handle socket fd lifetime. > > Signed-off-by: Michal Luczaj <mhal@rbox.co> > --- > Changes in v2: > - Rebase on bpf-next (Jakub) > - Use cleanup helpers from kernel's cleanup.h (Jakub) > - Fix subject of patch 3, rephrase patch 4, use correct prefix > - Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712@rbox.co > > Changes in v1: > - No declarations in function body (Jakub) > - Don't touch output arguments until function succeeds (Jakub) > - Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/ > > --- > Michal Luczaj (6): > selftests/bpf: Support more socket types in create_pair() > selftests/bpf: Socket pair creation, cleanups > selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible() > selftests/bpf: Honour the sotype of af_unix redir tests > selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected() > selftests/bpf: Introduce __attribute__((cleanup)) in create_pair() > > .../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++-- > .../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++------- > .../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++-------------- > 3 files changed, 124 insertions(+), 170 deletions(-) > --- > base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9 > change-id: 20240729-selftest-sockmap-fixes-bcca996e143b > > Best regards, Thanks again for these fixes. For the series: Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
On 8/6/24 14:01, Jakub Sitnicki wrote: > On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: >> Series takes care of few bugs and missing features with the aim to improve >> the test coverage of sockmap/sockhash. >> >> Last patch is a create_pair() rewrite making use of >> __attribute__((cleanup)) to handle socket fd lifetime. >> >> Signed-off-by: Michal Luczaj <mhal@rbox.co> >> --- >> Changes in v2: >> - Rebase on bpf-next (Jakub) >> - Use cleanup helpers from kernel's cleanup.h (Jakub) >> - Fix subject of patch 3, rephrase patch 4, use correct prefix >> - Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712@rbox.co >> >> Changes in v1: >> - No declarations in function body (Jakub) >> - Don't touch output arguments until function succeeds (Jakub) >> - Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/ >> >> --- >> Michal Luczaj (6): >> selftests/bpf: Support more socket types in create_pair() >> selftests/bpf: Socket pair creation, cleanups >> selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible() >> selftests/bpf: Honour the sotype of af_unix redir tests >> selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected() >> selftests/bpf: Introduce __attribute__((cleanup)) in create_pair() >> >> .../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++-- >> .../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++------- >> .../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++-------------- >> 3 files changed, 124 insertions(+), 170 deletions(-) >> --- >> base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9 >> change-id: 20240729-selftest-sockmap-fixes-bcca996e143b >> >> Best regards, > > Thanks again for these fixes. For the series: > > Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> > Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Great, thanks for the review. With this completed, I guess we can unwind the (mail) stack to [1]. Is that ingress-to-local et al. something you wanted to take care of yourself or can I give it a try? [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ Also, I've noticed patchwork still lists (besides this one) the old version of this series. Am I supposed to tell the bot to disregard it?
On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: > On 8/6/24 14:01, Jakub Sitnicki wrote: >> On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: >>> Series takes care of few bugs and missing features with the aim to improve >>> the test coverage of sockmap/sockhash. >>> >>> Last patch is a create_pair() rewrite making use of >>> __attribute__((cleanup)) to handle socket fd lifetime. >>> >>> Signed-off-by: Michal Luczaj <mhal@rbox.co> >>> --- >>> Changes in v2: >>> - Rebase on bpf-next (Jakub) >>> - Use cleanup helpers from kernel's cleanup.h (Jakub) >>> - Fix subject of patch 3, rephrase patch 4, use correct prefix >>> - Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712@rbox.co >>> >>> Changes in v1: >>> - No declarations in function body (Jakub) >>> - Don't touch output arguments until function succeeds (Jakub) >>> - Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/ >>> >>> --- >>> Michal Luczaj (6): >>> selftests/bpf: Support more socket types in create_pair() >>> selftests/bpf: Socket pair creation, cleanups >>> selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible() >>> selftests/bpf: Honour the sotype of af_unix redir tests >>> selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected() >>> selftests/bpf: Introduce __attribute__((cleanup)) in create_pair() >>> >>> .../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++-- >>> .../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++------- >>> .../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++-------------- >>> 3 files changed, 124 insertions(+), 170 deletions(-) >>> --- >>> base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9 >>> change-id: 20240729-selftest-sockmap-fixes-bcca996e143b >>> >>> Best regards, >> >> Thanks again for these fixes. For the series: >> >> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> >> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> > > Great, thanks for the review. With this completed, I guess we can unwind > the (mail) stack to [1]. Is that ingress-to-local et al. something you > wanted to take care of yourself or can I give it a try? I haven't stated any work on. You're welcome to tackle that. All I have is a toy test that I've used to generate the redirect matrix. Perhaps it can serve as inspiration: https://github.com/jsitnicki/sockmap-redir-matrix > > [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ > > Also, I've noticed patchwork still lists (besides this one) the old version > of this series. Am I supposed to tell the bot to disregard it? Only the maintainers can change patch set status in patchwork: https://docs.kernel.org/process/maintainer-netdev.html#updating-patch-status
On 8/6/24 19:45, Jakub Sitnicki wrote: > On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >> Great, thanks for the review. With this completed, I guess we can unwind >> the (mail) stack to [1]. Is that ingress-to-local et al. something you >> wanted to take care of yourself or can I give it a try? >> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ > > I haven't stated any work on. You're welcome to tackle that. > > All I have is a toy test that I've used to generate the redirect matrix. > Perhaps it can serve as inspiration: > > https://github.com/jsitnicki/sockmap-redir-matrix All right, please let me know if this is more or less what you meant and I'll post the whole series for a review (+patch to purge sockmap_listen of redir tests, fix misnomers). Mostly I've just copypasted your code (mangling it terribly along the way), so I feel silly claiming the authorship. Should I assign you as an author? Note that the patches are based on [2], which has not reached bpf-next (patchwork says: "Needs ACK"). [2] [PATCH bpf-next v2 0/6] selftests/bpf: Various sockmap-related fixes https://lore.kernel.org/bpf/20240731-selftest-sockmap-fixes-v2-0-08a0c73abed2@rbox.co/
On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: > On 8/6/24 19:45, Jakub Sitnicki wrote: >> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>> Great, thanks for the review. With this completed, I guess we can unwind >>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>> wanted to take care of yourself or can I give it a try? >>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >> >> I haven't stated any work on. You're welcome to tackle that. >> >> All I have is a toy test that I've used to generate the redirect matrix. >> Perhaps it can serve as inspiration: >> >> https://github.com/jsitnicki/sockmap-redir-matrix > > All right, please let me know if this is more or less what you meant and > I'll post the whole series for a review (+patch to purge sockmap_listen of > redir tests, fix misnomers). Mostly I've just copypasted your code > (mangling it terribly along the way), so I feel silly claiming the > authorship. Should I assign you as an author? Don't worry about it. I appreciate the help. I will take a look at the redirect tests this weekend. > Note that the patches are based on [2], which has not reached bpf-next > (patchwork says: "Needs ACK"). > > [2] [PATCH bpf-next v2 0/6] selftests/bpf: Various sockmap-related fixes > https://lore.kernel.org/bpf/20240731-selftest-sockmap-fixes-v2-0-08a0c73abed2@rbox.co/ Might have slipped throught the cracks... Andrii, Martin, The patch set still applies cleanly to bpf-next. Would you be able to a look at this series? Anything we need to do? Thanks, (the other) Jakub
On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: > On 8/6/24 19:45, Jakub Sitnicki wrote: >> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>> Great, thanks for the review. With this completed, I guess we can unwind >>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>> wanted to take care of yourself or can I give it a try? >>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >> >> I haven't stated any work on. You're welcome to tackle that. >> >> All I have is a toy test that I've used to generate the redirect matrix. >> Perhaps it can serve as inspiration: >> >> https://github.com/jsitnicki/sockmap-redir-matrix > > All right, please let me know if this is more or less what you meant and > I'll post the whole series for a review (+patch to purge sockmap_listen of > redir tests, fix misnomers). [...] Gave it a look as promised. It makes sense to me as well to put these tests in a new module. There will be some overlap with sockmap_listen, which has diverged from its inital scope, but we can dedup that later. One thought that I had is that it could make sense to test the not supported redirect combos (and expect an error). Sometimes folks make changes and enable some parts of the API by accient. Just a suggestion. This will be a nice improvement to the test coverage even without the negative tests. > Note that the patches are based on [2], which has not reached bpf-next > (patchwork says: "Needs ACK"). I think it might be fair to resend the series to attract the maintainers attention at this point. Thanks, Jakub
On 8/16/24 12:03 PM, Jakub Sitnicki wrote: > On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: >> On 8/6/24 19:45, Jakub Sitnicki wrote: >>> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>>> Great, thanks for the review. With this completed, I guess we can unwind >>>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>>> wanted to take care of yourself or can I give it a try? >>>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >>> >>> I haven't stated any work on. You're welcome to tackle that. >>> >>> All I have is a toy test that I've used to generate the redirect matrix. >>> Perhaps it can serve as inspiration: >>> >>> https://github.com/jsitnicki/sockmap-redir-matrix >> >> All right, please let me know if this is more or less what you meant and >> I'll post the whole series for a review (+patch to purge sockmap_listen of >> redir tests, fix misnomers). Mostly I've just copypasted your code >> (mangling it terribly along the way), so I feel silly claiming the >> authorship. Should I assign you as an author? > > Don't worry about it. I appreciate the help. > > I will take a look at the redirect tests this weekend. > >> Note that the patches are based on [2], which has not reached bpf-next >> (patchwork says: "Needs ACK"). >> >> [2] [PATCH bpf-next v2 0/6] selftests/bpf: Various sockmap-related fixes >> https://lore.kernel.org/bpf/20240731-selftest-sockmap-fixes-v2-0-08a0c73abed2@rbox.co/ > > Might have slipped throught the cracks... > > > Andrii, Martin, > > The patch set still applies cleanly to bpf-next. > > Would you be able to a look at this series? Anything we need to do? will take a look. no need to resend.
On 7/31/24 3:01 AM, Michal Luczaj wrote: > Series takes care of few bugs and missing features with the aim to improve > the test coverage of sockmap/sockhash. > > Last patch is a create_pair() rewrite making use of > __attribute__((cleanup)) to handle socket fd lifetime. Applied to bpf-next/net. Thanks.
On 8/19/24 22:05, Jakub Sitnicki wrote: > On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: >> On 8/6/24 19:45, Jakub Sitnicki wrote: >>> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>>> Great, thanks for the review. With this completed, I guess we can unwind >>>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>>> wanted to take care of yourself or can I give it a try? >>>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >>> >>> I haven't stated any work on. You're welcome to tackle that. >>> >>> All I have is a toy test that I've used to generate the redirect matrix. >>> Perhaps it can serve as inspiration: >>> >>> https://github.com/jsitnicki/sockmap-redir-matrix >> >> All right, please let me know if this is more or less what you meant and >> I'll post the whole series for a review (+patch to purge sockmap_listen of >> redir tests, fix misnomers). [...] > > Gave it a look as promised. It makes sense to me as well to put these > tests in a new module. There will be some overlap with sockmap_listen, > which has diverged from its inital scope, but we can dedup that later. > > One thought that I had is that it could make sense to test the not > supported redirect combos (and expect an error). Sometimes folks make > changes and enable some parts of the API by accient. All right, so I did what sockmap_listen does: check test_sockmap_listen.c:verdict_map[SK_PASS] to see if the redirect took place for a given combo. And that works well... except for skb/msg to ingress af_vsock. Even though this is unsupported and no redirect actually happens, verdict appears to be SK_PASS. Is this correct? Maybe I'm missing something, so below is a crude testcase I've cobbled together. And sorry for the delay, I was away from keyboard. Michal All error logs: ./test_progs:unix_vsock_redir_fail:1600: want pass=0 / drop=1, have 1 / 0 unix_vsock_redir_fail:FAIL:1600 diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c index 4ee1148d22be..e59e1654f110 100644 --- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c +++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c @@ -1561,6 +1561,45 @@ static void vsock_unix_redir_connectible(int sock_mapfd, int verd_mapfd, close(u1); } +static void unix_vsock_redir_fail(int sock_mapfd, int verd_mapfd) +{ + int v0, v1, u[2], pass, drop; + char a = 'a'; + + bpf_map_delete_elem(sock_mapfd, &(int){0}); + bpf_map_delete_elem(sock_mapfd, &(int){1}); + zero_verdict_count(verd_mapfd); + + if (socketpair(AF_UNIX, SOCK_STREAM, 0, u)) { + FAIL_ERRNO("socketpair(af_unix)"); + return; + } + + if (create_pair(AF_VSOCK, SOCK_STREAM, &v0, &v1)) + return; + + if (add_to_sockmap(sock_mapfd, v0, u[0])) + return; + + if (write(u[1], &a, sizeof(a)) != 1) { + FAIL_ERRNO("write()"); + return; + } + + errno = 0; + if (recv_timeout(v0, &a, sizeof(a), 0, 1) >= 0 || + recv_timeout(v1, &a, sizeof(a), 0, 1) >= 0 || + recv_timeout(u[0], &a, sizeof(a), 0, 1) >= 0 || + recv_timeout(u[1], &a, sizeof(a), 0, 1) >= 0) + FAIL("recv() returned >=0, errno=%d", errno); + + if (xbpf_map_lookup_elem(verd_mapfd, &(int){SK_PASS}, &pass) || + xbpf_map_lookup_elem(verd_mapfd, &(int){SK_DROP}, &drop)) + return; + if (pass != 0 || drop != 1) + FAIL("want pass=0 / drop=1, have %d / %d", pass, drop); +} + static void vsock_unix_skb_redir_connectible(struct test_sockmap_listen *skel, struct bpf_map *inner_map, int sotype) @@ -1582,6 +1621,23 @@ static void vsock_unix_skb_redir_connectible(struct test_sockmap_listen *skel, xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT); } +static void unix_vsock_redir(struct test_sockmap_listen *skel, struct bpf_map *inner_map) +{ + int verdict = bpf_program__fd(skel->progs.prog_skb_verdict); + int verdict_map = bpf_map__fd(skel->maps.verdict_map); + int sock_map = bpf_map__fd(inner_map); + int err; + + err = xbpf_prog_attach(verdict, sock_map, BPF_SK_SKB_VERDICT, 0); + if (err) + return; + + skel->bss->test_ingress = true; + unix_vsock_redir_fail(sock_map, verdict_map); + + xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT); +} + static void test_vsock_redir(struct test_sockmap_listen *skel, struct bpf_map *map) { const char *family_name, *map_name; @@ -1883,6 +1939,7 @@ void serial_test_sockmap_listen(void) test_unix_redir(skel, skel->maps.sock_map, SOCK_DGRAM); test_unix_redir(skel, skel->maps.sock_map, SOCK_STREAM); test_vsock_redir(skel, skel->maps.sock_map); + unix_vsock_redir(skel, skel->maps.sock_map); skel->bss->test_sockmap = false; run_tests(skel, skel->maps.sock_hash, AF_INET);
On 9/24/24 12:25, Michal Luczaj wrote: > On 8/19/24 22:05, Jakub Sitnicki wrote: >> On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: >>> On 8/6/24 19:45, Jakub Sitnicki wrote: >>>> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>>>> Great, thanks for the review. With this completed, I guess we can unwind >>>>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>>>> wanted to take care of yourself or can I give it a try? >>>>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >>>> >>>> I haven't stated any work on. You're welcome to tackle that. >>>> >>>> All I have is a toy test that I've used to generate the redirect matrix. >>>> Perhaps it can serve as inspiration: >>>> >>>> https://github.com/jsitnicki/sockmap-redir-matrix >>> >>> All right, please let me know if this is more or less what you meant and >>> I'll post the whole series for a review (+patch to purge sockmap_listen of >>> redir tests, fix misnomers). [...] >> >> Gave it a look as promised. It makes sense to me as well to put these >> tests in a new module. There will be some overlap with sockmap_listen, >> which has diverged from its inital scope, but we can dedup that later. >> >> One thought that I had is that it could make sense to test the not >> supported redirect combos (and expect an error). Sometimes folks make >> changes and enable some parts of the API by accient. > > All right, so I did what sockmap_listen does: check > test_sockmap_listen.c:verdict_map[SK_PASS] to see if the redirect took > place for a given combo. And that works well... except for skb/msg to > ingress af_vsock. Even though this is unsupported and no redirect > actually happens, verdict appears to be SK_PASS. Is this correct? Here's a follow up: my guess is that some checks are missing. I'm not sure if it's the best approach, but this fixes things for me: diff --git a/include/net/sock.h b/include/net/sock.h index c58ca8dd561b..c87295f3476d 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2715,6 +2715,11 @@ static inline bool sk_is_stream_unix(const struct sock *sk) return sk->sk_family == AF_UNIX && sk->sk_type == SOCK_STREAM; } +static inline bool sk_is_vsock(const struct sock *sk) +{ + return sk->sk_family == AF_VSOCK; +} + /** * sk_eat_skb - Release a skb if it is no longer needed * @sk: socket to eat this skb from diff --git a/net/core/sock_map.c b/net/core/sock_map.c index 242c91a6e3d3..07d6aa4e39ef 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -647,6 +647,8 @@ BPF_CALL_4(bpf_sk_redirect_map, struct sk_buff *, skb, sk = __sock_map_lookup_elem(map, key); if (unlikely(!sk || !sock_map_redirect_allowed(sk))) return SK_DROP; + if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk)) + return SK_DROP; skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS); return SK_PASS; @@ -675,6 +677,8 @@ BPF_CALL_4(bpf_msg_redirect_map, struct sk_msg *, msg, return SK_DROP; if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk)) return SK_DROP; + if (sk_is_vsock(sk)) + return SK_DROP; msg->flags = flags; msg->sk_redir = sk; @@ -1249,6 +1253,8 @@ BPF_CALL_4(bpf_sk_redirect_hash, struct sk_buff *, skb, sk = __sock_hash_lookup_elem(map, key); if (unlikely(!sk || !sock_map_redirect_allowed(sk))) return SK_DROP; + if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk)) + return SK_DROP; skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS); return SK_PASS; @@ -1277,6 +1283,8 @@ BPF_CALL_4(bpf_msg_redirect_hash, struct sk_msg *, msg, return SK_DROP; if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk)) return SK_DROP; + if (sk_is_vsock(sk)) + return SK_DROP; msg->flags = flags; msg->sk_redir = sk;
On Fri, Sep 27, 2024 at 12:54 AM +02, Michal Luczaj wrote: > On 9/24/24 12:25, Michal Luczaj wrote: >> On 8/19/24 22:05, Jakub Sitnicki wrote: >>> On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: >>>> On 8/6/24 19:45, Jakub Sitnicki wrote: >>>>> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>>>>> Great, thanks for the review. With this completed, I guess we can unwind >>>>>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>>>>> wanted to take care of yourself or can I give it a try? >>>>>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >>>>> >>>>> I haven't stated any work on. You're welcome to tackle that. >>>>> >>>>> All I have is a toy test that I've used to generate the redirect matrix. >>>>> Perhaps it can serve as inspiration: >>>>> >>>>> https://github.com/jsitnicki/sockmap-redir-matrix >>>> >>>> All right, please let me know if this is more or less what you meant and >>>> I'll post the whole series for a review (+patch to purge sockmap_listen of >>>> redir tests, fix misnomers). [...] >>> >>> Gave it a look as promised. It makes sense to me as well to put these >>> tests in a new module. There will be some overlap with sockmap_listen, >>> which has diverged from its inital scope, but we can dedup that later. >>> >>> One thought that I had is that it could make sense to test the not >>> supported redirect combos (and expect an error). Sometimes folks make >>> changes and enable some parts of the API by accient. >> >> All right, so I did what sockmap_listen does: check >> test_sockmap_listen.c:verdict_map[SK_PASS] to see if the redirect took >> place for a given combo. And that works well... except for skb/msg to >> ingress af_vsock. Even though this is unsupported and no redirect >> actually happens, verdict appears to be SK_PASS. Is this correct? > > Here's a follow up: my guess is that some checks are missing. I'm not sure > if it's the best approach, but this fixes things for me: So you have already found a bug with a negative test. Nice. Your patch makes sense to me. FYI, I've started a GH repo for anciallary materials for sockmap. Code samples, pointers to resources, a backlog of stuff to do (?). Inspired by the xdp-project repo: https://github.com/xdp-project/xdp-project We can move it to a dedicated project namespace, right now it's at: https://github.com/jsitnicki/sockmap-project/ Feel free to add stuff there. > diff --git a/include/net/sock.h b/include/net/sock.h > index c58ca8dd561b..c87295f3476d 100644 > --- a/include/net/sock.h > +++ b/include/net/sock.h > @@ -2715,6 +2715,11 @@ static inline bool sk_is_stream_unix(const struct sock *sk) > return sk->sk_family == AF_UNIX && sk->sk_type == SOCK_STREAM; > } > > +static inline bool sk_is_vsock(const struct sock *sk) > +{ > + return sk->sk_family == AF_VSOCK; > +} > + > /** > * sk_eat_skb - Release a skb if it is no longer needed > * @sk: socket to eat this skb from > diff --git a/net/core/sock_map.c b/net/core/sock_map.c > index 242c91a6e3d3..07d6aa4e39ef 100644 > --- a/net/core/sock_map.c > +++ b/net/core/sock_map.c > @@ -647,6 +647,8 @@ BPF_CALL_4(bpf_sk_redirect_map, struct sk_buff *, skb, > sk = __sock_map_lookup_elem(map, key); > if (unlikely(!sk || !sock_map_redirect_allowed(sk))) > return SK_DROP; > + if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk)) > + return SK_DROP; > > skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS); > return SK_PASS; > @@ -675,6 +677,8 @@ BPF_CALL_4(bpf_msg_redirect_map, struct sk_msg *, msg, > return SK_DROP; > if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk)) > return SK_DROP; > + if (sk_is_vsock(sk)) > + return SK_DROP; > > msg->flags = flags; > msg->sk_redir = sk; > @@ -1249,6 +1253,8 @@ BPF_CALL_4(bpf_sk_redirect_hash, struct sk_buff *, skb, > sk = __sock_hash_lookup_elem(map, key); > if (unlikely(!sk || !sock_map_redirect_allowed(sk))) > return SK_DROP; > + if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk)) > + return SK_DROP; > > skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS); > return SK_PASS; > @@ -1277,6 +1283,8 @@ BPF_CALL_4(bpf_msg_redirect_hash, struct sk_msg *, msg, > return SK_DROP; > if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk)) > return SK_DROP; > + if (sk_is_vsock(sk)) > + return SK_DROP; > > msg->flags = flags; > msg->sk_redir = sk;
On 9/27/24 11:15, Jakub Sitnicki wrote: > On Fri, Sep 27, 2024 at 12:54 AM +02, Michal Luczaj wrote: >> ... >> Here's a follow up: my guess is that some checks are missing. I'm not sure >> if it's the best approach, but this fixes things for me: > > So you have already found a bug with a negative test. Nice. > > Your patch makes sense to me. Great, I'll submit it properly. Another thing I've noticed is that unsupported (non-TCP) sk_msg redirects fail silently, i.e. send() is successful, then packet appears to be dropped, but because the BPF_SK_MSG_VERDICT program is never run, the verdict[SK_DROP] isn't updated. Is this by design? Also, for unsupported af_vsock sk_skb-to-ingress we hit the warning: [ 233.396654] rx_queue is empty, but rx_bytes is non-zero [ 233.396702] WARNING: CPU: 11 PID: 40601 at net/vmw_vsock/virtio_transport_common.c:589 virtio_transport_stream_dequeue+0x2e5/0x2f0 I'll try to fix that. Now, the series begin to grow long. Should the fixes come separately? > FYI, I've started a GH repo for anciallary materials for sockmap. > Code samples, pointers to resources, a backlog of stuff to do (?). > Inspired by the xdp-project repo: > > https://github.com/xdp-project/xdp-project > > We can move it to a dedicated project namespace, right now it's at: > > https://github.com/jsitnicki/sockmap-project/ > > Feel free to add stuff there. Not that I have anything to add, but sure, thanks :) Michal
I'm back after a short break. Sorry for delay. On Wed, Oct 02, 2024 at 10:27 AM +02, Michal Luczaj wrote: > On 9/27/24 11:15, Jakub Sitnicki wrote: >> On Fri, Sep 27, 2024 at 12:54 AM +02, Michal Luczaj wrote: >>> ... >>> Here's a follow up: my guess is that some checks are missing. I'm not sure >>> if it's the best approach, but this fixes things for me: >> >> So you have already found a bug with a negative test. Nice. >> >> Your patch makes sense to me. > > Great, I'll submit it properly. > > Another thing I've noticed is that unsupported (non-TCP) sk_msg redirects > fail silently, i.e. send() is successful, then packet appears to be > dropped, but because the BPF_SK_MSG_VERDICT program is never run, the > verdict[SK_DROP] isn't updated. Is this by design? That's curious. We don't override the proto::sendmsg callback for protocols which don't support sk_msg redirects, like UDP: https://elixir.bootlin.com/linux/v6.12-rc2/source/net/ipv4/udp_bpf.c#L114 The packet should get delivered to the peer socket as w/o sockmap. I will have to double check that. > Also, for unsupported af_vsock sk_skb-to-ingress we hit the warning: > > [ 233.396654] rx_queue is empty, but rx_bytes is non-zero > [ 233.396702] WARNING: CPU: 11 PID: 40601 at net/vmw_vsock/virtio_transport_common.c:589 virtio_transport_stream_dequeue+0x2e5/0x2f0 > > I'll try to fix that. Now, the series begin to grow long. Should the fixes > come separately? Thanks. And yes - if possible, better to push fixes separately. Because they go through the bpf tree, and they will still land in the upcoming -rc releases (and get backported). While improvements go through bpf-next. Of course that sometimes makes life more difficult if the improvements depend on some fixes... Not sure if anything from bpf-next gets backported if it has a Fixes tag. We can ask the stable kernel maintainers, if needed.
On 10/9/24 11:46, Jakub Sitnicki wrote: > That's curious. We don't override the proto::sendmsg callback for > protocols which don't support sk_msg redirects, like UDP: > > https://elixir.bootlin.com/linux/v6.12-rc2/source/net/ipv4/udp_bpf.c#L114 > > The packet should get delivered to the peer socket as w/o sockmap. > I will have to double check that. Ugh, no, you're right. I was checking the wrong queue all that time... Sorry for the confusion. > Thanks. And yes - if possible, better to push fixes separately. Because > they go through the bpf tree, and they will still land in the upcoming > -rc releases (and get backported). > > While improvements go through bpf-next. Of course that sometimes makes > life more difficult if the improvements depend on some fixes... I'm afraid that's the case for the redir selftest to run cleanly. Anyway, so those are the fixes mentioned, targeting bpf: https://lore.kernel.org/bpf/20241009-vsock-fixes-for-redir-v1-0-e455416f6d78@rbox.co/ > Not sure if anything from bpf-next gets backported if it has a Fixes > tag. We can ask the stable kernel maintainers, if needed.
Series takes care of few bugs and missing features with the aim to improve the test coverage of sockmap/sockhash. Last patch is a create_pair() rewrite making use of __attribute__((cleanup)) to handle socket fd lifetime. Signed-off-by: Michal Luczaj <mhal@rbox.co> --- Changes in v2: - Rebase on bpf-next (Jakub) - Use cleanup helpers from kernel's cleanup.h (Jakub) - Fix subject of patch 3, rephrase patch 4, use correct prefix - Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712@rbox.co Changes in v1: - No declarations in function body (Jakub) - Don't touch output arguments until function succeeds (Jakub) - Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/ --- Michal Luczaj (6): selftests/bpf: Support more socket types in create_pair() selftests/bpf: Socket pair creation, cleanups selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible() selftests/bpf: Honour the sotype of af_unix redir tests selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected() selftests/bpf: Introduce __attribute__((cleanup)) in create_pair() .../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++-- .../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++------- .../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++-------------- 3 files changed, 124 insertions(+), 170 deletions(-) --- base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9 change-id: 20240729-selftest-sockmap-fixes-bcca996e143b Best regards,