[v3,bpf-next,0/5] bpf-next: Add socket destroy capability

Message ID: 20230321184541.1857363-1-aditi.ghag@isovalent.com

Aditi Ghag March 21, 2023, 6:45 p.m. UTC
This patch series adds the capability to destroy sockets in BPF. We plan
to use the capability in Cilium to force client sockets to reconnect when
their remote load-balancing backends are deleted. The other use case is
on-the-fly policy enforcement, where existing socket connections that are
disallowed by policies need to be terminated.

The use cases and more details about the selected approach were
presented at LPC 2022 -
https://lpc.events/event/16/contributions/1358/.
RFC discussion -
https://lore.kernel.org/netdev/CABG=zsBEh-P4NXk23eBJw7eajB5YJeRS7oPXnTAzs=yob4EMoQ@mail.gmail.com/T/#u.
v2 patch series -
https://lore.kernel.org/bpf/20230223215311.926899-1-aditi.ghag@isovalent.com/T/#t

v3 highlights:
- Martin's review comments:
  - UDP iterator batching patch supports resume operation.
  - Removed "full_sock" check from the destroy kfunc.
  - Reset of metadata in case of rebatching.
- Extended selftests to cover cases for destroying listening sockets.
- Fixes for destroying listening TCP and UDP sockets.
- Stan's review:
  - Refactored selftests to use ASSERT_* in lieu of CHECK.
  - Free leaking afinfo in fini_udp.
- Restructured test cases per Andrii's comment.

Notes to the reviewers:

- There are two RFC commits for being able to destroy listening TCP and
  UDP sockets. The TCP commit isn't quite correct, as inet_unhash could
  be invoked from BPF context for cases other than the iterator.
  The UDP commit seems reasonable based on my understanding of the code,
  but it may lead to unintended behavior when there are sockets
  listening on a wildcard address and a specific address with a common
  port. I would appreciate insights into both commits, as I'm not
  intimately familiar with parts of the overall code path.

(The notes below are the same as in the v2 patch series.)
- I hit a snag while writing the kfunc where the verifier complained about
  the `sock_common` type passed from the TCP iterator. With kfuncs, there
  don't seem to be any options available to pass BTF type hints to the
  verifier (equivalent of `ARG_PTR_TO_BTF_ID_SOCK_COMMON`, as was the case
  with the helper). As a result, I changed the argument type of the
  bpf_sock_destroy kfunc to `sock_common`.

- The `vmlinux.h` import in the selftest prog unexpectedly led to libbpf
  failing to load the program. As it turns out, the libbpf kfunc-related
  code doesn't seem to handle the BTF `FWD` type for structs. I've attached
  debug information about the issue in case the loader logic can
  accommodate such gotchas, although the error in this case was specific
  to the test imports.

Aditi Ghag (5):
  bpf: Implement batching in UDP iterator
  bpf: Add bpf_sock_destroy kfunc
  [RFC] net: Skip taking lock in BPF context
  [RFC] udp: Fix destroying UDP listening sockets
  selftests/bpf: Add tests for bpf_sock_destroy

 include/net/udp.h                             |   1 +
 net/core/filter.c                             |  54 ++++
 net/ipv4/inet_hashtables.c                    |   9 +-
 net/ipv4/tcp.c                                |  16 +-
 net/ipv4/udp.c                                | 283 +++++++++++++++++-
 .../selftests/bpf/prog_tests/sock_destroy.c   | 190 ++++++++++++
 .../selftests/bpf/progs/sock_destroy_prog.c   | 151 ++++++++++
 7 files changed, 684 insertions(+), 20 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_destroy.c
 create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_prog.c