diff mbox series

[iproute2,v8,1/3] ss: add support for BPF socket-local storage

Message ID 20240214084235.25618-2-qde@naccy.de (mailing list archive)
State Superseded
Delegated to: David Ahern
Headers show
Series ss: pretty-printing BPF socket-local storage | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Quentin Deslandes Feb. 14, 2024, 8:42 a.m. UTC
While sock_diag is able to return BPF socket-local storage in response
to INET_DIAG_REQ_SK_BPF_STORAGES requests, ss doesn't request it.

This change introduces the --bpf-maps and --bpf-map-id= options to request
BPF socket-local storage for all SK_STORAGE maps, or only specific ones.

The bigger part of this change will check the requested map IDs and
ensure they are valid. The column COL_EXT is used to print the
socket-local data into.

When --bpf-maps is used, ss will send an empty
INET_DIAG_REQ_SK_BPF_STORAGES request, in return the kernel will send
all the BPF socket-local storage entries for a given socket. The BTF
data for each map is loaded on demand, as ss can't predict which map ID
are used.

When --bpf-map-id=ID is used, a file descriptor to the requested maps is
open to 1) ensure the map doesn't disappear before the data is printed,
and 2) ensure the map type is BPF_MAP_TYPE_SK_STORAGE. The BTF data for
each requested map is loaded before the request is sent to the kernel.

Co-developed-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Quentin Deslandes <qde@naccy.de>
---
 misc/ss.c | 266 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 263 insertions(+), 3 deletions(-)

Comments

David Ahern Feb. 18, 2024, 5:39 p.m. UTC | #1
On 2/14/24 1:42 AM, Quentin Deslandes wrote:
> +	if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
> +		fprintf(stderr, "ss: BPF map with ID %s has type '%s', expecting 'sk_storage'\n",
> +			optarg, libbpf_bpf_map_type_str(info.type));
> +		close(fd);
> +		return -1;
> +	}

ss.c: In function ‘bpf_map_opts_load_info’:
ss.c:3448:33: warning: implicit declaration of function
‘libbpf_bpf_map_type_str’ [-Wimplicit-function-declaration]
 3448 |                         optarg, libbpf_bpf_map_type_str(info.type));
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~
ss.c:3447:68: warning: format ‘%s’ expects argument of type ‘char *’,
but argument 4 has type ‘int’ [-Wformat=]
 3447 |                 fprintf(stderr, "ss: BPF map with ID %s has type
'%s', expecting 'sk_storage'\n",
      |                                                                   ~^
      |                                                                    |
      |
  char *
      |                                                                   %d
 3448 |                         optarg, libbpf_bpf_map_type_str(info.type));
      |                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                 |
      |                                 int
    CC       lnstat_util.o
    LINK     lnstat
    LINK     ss
/usr/bin/ld: ss.o: in function `main':


Ubuntu 22.04 has libbpf-0.5 installed. I suspect version hook is needed.
e.g., something like this (but with the relevant version numbers):

#if (LIBBPF_MAJOR_VERSION > 0) || (LIBBPF_MINOR_VERSION >= 7)
Quentin Deslandes Feb. 21, 2024, 9:51 a.m. UTC | #2
On 2024-02-18 18:39, David Ahern wrote:
> On 2/14/24 1:42 AM, Quentin Deslandes wrote:
>> +	if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
>> +		fprintf(stderr, "ss: BPF map with ID %s has type '%s', expecting 'sk_storage'\n",
>> +			optarg, libbpf_bpf_map_type_str(info.type));
>> +		close(fd);
>> +		return -1;
>> +	}
> 
> ss.c: In function ‘bpf_map_opts_load_info’:
> ss.c:3448:33: warning: implicit declaration of function
> ‘libbpf_bpf_map_type_str’ [-Wimplicit-function-declaration]
>  3448 |                         optarg, libbpf_bpf_map_type_str(info.type));
>       |                                 ^~~~~~~~~~~~~~~~~~~~~~~
> ss.c:3447:68: warning: format ‘%s’ expects argument of type ‘char *’,
> but argument 4 has type ‘int’ [-Wformat=]
>  3447 |                 fprintf(stderr, "ss: BPF map with ID %s has type
> '%s', expecting 'sk_storage'\n",
>       |                                                                   ~^
>       |                                                                    |
>       |
>   char *
>       |                                                                   %d
>  3448 |                         optarg, libbpf_bpf_map_type_str(info.type));
>       |                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>       |                                 |
>       |                                 int
>     CC       lnstat_util.o
>     LINK     lnstat
>     LINK     ss
> /usr/bin/ld: ss.o: in function `main':
> 
> 
> Ubuntu 22.04 has libbpf-0.5 installed. I suspect version hook is needed.
> e.g., something like this (but with the relevant version numbers):
> 
> #if (LIBBPF_MAJOR_VERSION > 0) || (LIBBPF_MINOR_VERSION >= 7)

After checking, all the libbpf symbols I use require at least libbpf-0.5, except
for this one which was introduced in libbpf-1.0. Hence, I will print the type ID
instead of the string. IMO printing the string representation of the type doesn't
add enough value to justify adding a version hook.

However, I see the minimum required version for libbpf is 0.1, but this
series requires 0.5. I would check the version in ss and #error if
the requirements are not met, but I'm not sure this is the right way to do it.
What do you think?

Regards,
Quentin Deslandes
Quentin Deslandes Feb. 21, 2024, 11 a.m. UTC | #3
On 2024-02-21 10:51, Quentin Deslandes wrote:
> On 2024-02-18 18:39, David Ahern wrote:
>> On 2/14/24 1:42 AM, Quentin Deslandes wrote:
>>> +	if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
>>> +		fprintf(stderr, "ss: BPF map with ID %s has type '%s', expecting 'sk_storage'\n",
>>> +			optarg, libbpf_bpf_map_type_str(info.type));
>>> +		close(fd);
>>> +		return -1;
>>> +	}
>>
>> ss.c: In function ‘bpf_map_opts_load_info’:
>> ss.c:3448:33: warning: implicit declaration of function
>> ‘libbpf_bpf_map_type_str’ [-Wimplicit-function-declaration]
>>  3448 |                         optarg, libbpf_bpf_map_type_str(info.type));
>>       |                                 ^~~~~~~~~~~~~~~~~~~~~~~
>> ss.c:3447:68: warning: format ‘%s’ expects argument of type ‘char *’,
>> but argument 4 has type ‘int’ [-Wformat=]
>>  3447 |                 fprintf(stderr, "ss: BPF map with ID %s has type
>> '%s', expecting 'sk_storage'\n",
>>       |                                                                   ~^
>>       |                                                                    |
>>       |
>>   char *
>>       |                                                                   %d
>>  3448 |                         optarg, libbpf_bpf_map_type_str(info.type));
>>       |                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>       |                                 |
>>       |                                 int
>>     CC       lnstat_util.o
>>     LINK     lnstat
>>     LINK     ss
>> /usr/bin/ld: ss.o: in function `main':
>>
>>
>> Ubuntu 22.04 has libbpf-0.5 installed. I suspect version hook is needed.
>> e.g., something like this (but with the relevant version numbers):
>>
>> #if (LIBBPF_MAJOR_VERSION > 0) || (LIBBPF_MINOR_VERSION >= 7)
> 
> After checking, all the libbpf symbols I use require at least libbpf-0.5, except
> for this one which was introduced in libbpf-1.0. Hence, I will print the type ID
> instead of the string. IMO printing the string representation of the type doesn't
> add enough value to justify adding a version hook.
> 
> However, I see the minimum required version for libbpf is 0.1, but this
> series requires 0.5. I would check the version in ss and #error if
> the requirements are not met, but I'm not sure this is the right way to do it.
> What do you think?

I've settled on a slightly different solution: the BPF socket-local code is gated
by ENABLE_BPF_SKSTORAGE_SUPPORT instead of HAVE_LIBBPF. If libbpf-0.5+ is available,
and HAVE_LIBBPF is defined, ENABLE_BPF_SKSTORAGE_SUPPORT will be defined in ss,
enabling this feature. If HAVE_LIBBPF is defined but libbpf-0.5+ is not available,
a compile warning will be printed, ENABLE_BPF_SKSTORAGE_SUPPORT won't be defined in
ss, and ss will be compiled without socket-local storage support.

This will ensure that:
- Features relying on HAVE_LIBBPF in ss don't have to comply with the same requirements
  as the BPF socket-local storage support (because iproute2 only requires libbpf-0.1+).
- This change won't prevent iproute2 as a whole from being compiled.

This seems much more reasonable than using #error and failing the whole build. I'll
send a v9 with these changes.

Regards,
Quentin Deslandes
David Ahern Feb. 21, 2024, 4:27 p.m. UTC | #4
On 2/21/24 4:00 AM, Quentin Deslandes wrote:
> This will ensure that: - Features relying on HAVE_LIBBPF in ss don't
> have to comply with the same requirements as the BPF socket-local
> storage support (because iproute2 only requires libbpf-0.1+). - This
> change won't prevent iproute2 as a whole from being compiled. This seems
> much more reasonable than using #error and failing the whole build. I'll
> send a v9 with these changes.

exactly, will take a look at the latest version in the next few days.
diff mbox series

Patch

diff --git a/misc/ss.c b/misc/ss.c
index 5296cabe..7f47b489 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -51,6 +51,11 @@ 
 #include <linux/tls.h>
 #include <linux/mptcp.h>
 
+#ifdef HAVE_LIBBPF
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+#endif
+
 #if HAVE_RPC
 #include <rpc/rpc.h>
 #include <rpc/xdr.h>
@@ -3380,6 +3385,209 @@  static void parse_diag_msg(struct nlmsghdr *nlh, struct sockstat *s)
 	memcpy(s->remote.data, r->id.idiag_dst, s->local.bytelen);
 }
 
+#ifdef HAVE_LIBBPF
+
+#define MAX_NR_BPF_MAP_ID_OPTS 32
+
+struct btf;
+
+static struct bpf_map_opts {
+	unsigned int nr_maps;
+	struct bpf_sk_storage_map_info {
+		unsigned int id;
+		int fd;
+	} maps[MAX_NR_BPF_MAP_ID_OPTS];
+	bool show_all;
+} bpf_map_opts;
+
+static void bpf_map_opts_mixed_error(void)
+{
+	fprintf(stderr,
+		"ss: --bpf-maps and --bpf-map-id cannot be used together\n");
+}
+
+static int bpf_map_opts_load_info(unsigned int map_id)
+{
+	struct bpf_map_info info = {};
+	uint32_t len = sizeof(info);
+	int fd;
+	int r;
+
+	if (bpf_map_opts.nr_maps == MAX_NR_BPF_MAP_ID_OPTS) {
+		fprintf(stderr,
+			"ss: too many (> %u) BPF socket-local storage maps found, skipping map ID %u\n",
+			MAX_NR_BPF_MAP_ID_OPTS, map_id);
+		return 0;
+	}
+
+	fd = bpf_map_get_fd_by_id(map_id);
+	if (fd < 0) {
+		if (errno == -ENOENT)
+			return 0;
+
+		fprintf(stderr, "ss: cannot get fd for BPF map ID %u%s\n",
+			map_id, errno == EPERM ?
+			": missing root permissions, CAP_BPF, or CAP_SYS_ADMIN" : "");
+		return -1;
+	}
+
+	r = bpf_obj_get_info_by_fd(fd, &info, &len);
+	if (r) {
+		fprintf(stderr, "ss: failed to get info for BPF map ID %u\n",
+			map_id);
+		close(fd);
+		return -1;
+	}
+
+	if (info.type != BPF_MAP_TYPE_SK_STORAGE) {
+		fprintf(stderr, "ss: BPF map with ID %s has type '%s', expecting 'sk_storage'\n",
+			optarg, libbpf_bpf_map_type_str(info.type));
+		close(fd);
+		return -1;
+	}
+
+	bpf_map_opts.maps[bpf_map_opts.nr_maps].id = map_id;
+	bpf_map_opts.maps[bpf_map_opts.nr_maps++].fd = fd;
+
+	return 0;
+}
+
+static struct bpf_sk_storage_map_info *bpf_map_opts_get_info(
+	unsigned int map_id)
+{
+	unsigned int i;
+	int r;
+
+	for (i = 0; i < bpf_map_opts.nr_maps; ++i) {
+		if (bpf_map_opts.maps[i].id == map_id)
+			return &bpf_map_opts.maps[i];
+	}
+
+	r = bpf_map_opts_load_info(map_id);
+	if (r)
+		return NULL;
+
+	return &bpf_map_opts.maps[bpf_map_opts.nr_maps - 1];
+}
+
+static int bpf_map_opts_add_id(const char *optarg)
+{
+	size_t optarg_len;
+	unsigned long id;
+	char *end;
+
+	if (bpf_map_opts.show_all) {
+		bpf_map_opts_mixed_error();
+		return -1;
+	}
+
+	optarg_len = strlen(optarg);
+	id = strtoul(optarg, &end, 0);
+	if (end != optarg + optarg_len || id == 0 || id >= UINT32_MAX) {
+		fprintf(stderr, "ss: invalid BPF map ID %s\n", optarg);
+		return -1;
+	}
+
+	/* Force lazy loading of the map's data. */
+	if (!bpf_map_opts_get_info(id))
+		return -1;
+
+	return 0;
+}
+
+static void bpf_map_opts_destroy(void)
+{
+	int i;
+
+	for (i = 0; i < bpf_map_opts.nr_maps; ++i)
+		close(bpf_map_opts.maps[i].fd);
+}
+
+static struct rtattr *bpf_map_opts_alloc_rta(void)
+{
+	struct rtattr *stgs_rta, *fd_rta;
+	size_t total_size;
+	unsigned int i;
+	void *buf;
+
+	/* If bpf_map_opts.show_all == true, we will send an empty message to
+	 * the kernel, which will return all the socket-local data attached to
+	 * a socket, no matter their map ID
+	 */
+	if (bpf_map_opts.show_all) {
+		total_size = RTA_LENGTH(0);
+	} else {
+		total_size = RTA_LENGTH(RTA_LENGTH(sizeof(int)) *
+					bpf_map_opts.nr_maps);
+	}
+
+	buf = malloc(total_size);
+	if (!buf)
+		return NULL;
+
+	stgs_rta = buf;
+	stgs_rta->rta_type = INET_DIAG_REQ_SK_BPF_STORAGES | NLA_F_NESTED;
+	stgs_rta->rta_len = total_size;
+
+	/* If inet_show_netlink() retries fetching socket data, nr_maps might
+	 * be different from 0, even with show_all == true, so we return early
+	 * to avoid inserting specific map IDs into the request.
+	 */
+	if (bpf_map_opts.show_all)
+		return stgs_rta;
+
+	buf = RTA_DATA(stgs_rta);
+	for (i = 0; i < bpf_map_opts.nr_maps; i++) {
+		int *fd;
+
+		fd_rta = buf;
+		fd_rta->rta_type = SK_DIAG_BPF_STORAGE_REQ_MAP_FD;
+		fd_rta->rta_len = RTA_LENGTH(sizeof(int));
+
+		fd = RTA_DATA(fd_rta);
+		*fd = bpf_map_opts.maps[i].fd;
+
+		buf += fd_rta->rta_len;
+	}
+
+	return stgs_rta;
+}
+
+static void show_sk_bpf_storages(struct rtattr *bpf_stgs)
+{
+	struct rtattr *tb[SK_DIAG_BPF_STORAGE_MAX + 1], *bpf_stg;
+	unsigned int rem;
+
+	for (bpf_stg = RTA_DATA(bpf_stgs), rem = RTA_PAYLOAD(bpf_stgs);
+		RTA_OK(bpf_stg, rem); bpf_stg = RTA_NEXT(bpf_stg, rem)) {
+
+		if ((bpf_stg->rta_type & NLA_TYPE_MASK) != SK_DIAG_BPF_STORAGE)
+			continue;
+
+		parse_rtattr_nested(tb, SK_DIAG_BPF_STORAGE_MAX,
+				    (struct rtattr *)bpf_stg);
+
+		if (tb[SK_DIAG_BPF_STORAGE_MAP_ID]) {
+			out(" map_id:%u",
+			    rta_getattr_u32(tb[SK_DIAG_BPF_STORAGE_MAP_ID]));
+		}
+	}
+}
+
+static bool bpf_map_opts_is_enabled(void)
+{
+	return bpf_map_opts.nr_maps || bpf_map_opts.show_all;
+}
+
+#else
+
+static bool bpf_map_opts_is_enabled(void)
+{
+	return false;
+}
+
+#endif
+
 static int inet_show_sock(struct nlmsghdr *nlh,
 			  struct sockstat *s)
 {
@@ -3387,8 +3595,9 @@  static int inet_show_sock(struct nlmsghdr *nlh,
 	struct inet_diag_msg *r = NLMSG_DATA(nlh);
 	unsigned char v6only = 0;
 
-	parse_rtattr(tb, INET_DIAG_MAX, (struct rtattr *)(r+1),
-		     nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)));
+	parse_rtattr_flags(tb, INET_DIAG_MAX, (struct rtattr *)(r+1),
+			   nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*r)),
+			   NLA_F_NESTED);
 
 	if (tb[INET_DIAG_PROTOCOL])
 		s->type = rta_getattr_u8(tb[INET_DIAG_PROTOCOL]);
@@ -3485,6 +3694,11 @@  static int inet_show_sock(struct nlmsghdr *nlh,
 	}
 	sctp_ino = s->ino;
 
+#ifdef HAVE_LIBBPF
+	if (tb[INET_DIAG_SK_BPF_STORAGES])
+		show_sk_bpf_storages(tb[INET_DIAG_SK_BPF_STORAGES]);
+#endif
+
 	return 0;
 }
 
@@ -3566,13 +3780,14 @@  static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
 {
 	struct sockaddr_nl nladdr = { .nl_family = AF_NETLINK };
 	DIAG_REQUEST(req, struct inet_diag_req_v2 r);
+	struct rtattr *bpf_rta = NULL;
 	char    *bc = NULL;
 	int	bclen;
 	__u32	proto;
 	struct msghdr msg;
 	struct rtattr rta_bc;
 	struct rtattr rta_proto;
-	struct iovec iov[5];
+	struct iovec iov[6];
 	int iovlen = 1;
 
 	if (family == PF_UNSPEC)
@@ -3625,6 +3840,20 @@  static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
 		iovlen += 2;
 	}
 
+#ifdef HAVE_LIBBPF
+	if (bpf_map_opts_is_enabled()) {
+		bpf_rta = bpf_map_opts_alloc_rta();
+		if (!bpf_rta) {
+			fprintf(stderr,
+				"ss: cannot alloc request for --bpf-map\n");
+			return -1;
+		}
+
+		iov[iovlen++] = (struct iovec){ bpf_rta, bpf_rta->rta_len };
+		req.nlh.nlmsg_len += bpf_rta->rta_len;
+	}
+#endif
+
 	msg = (struct msghdr) {
 		.msg_name = (void *)&nladdr,
 		.msg_namelen = sizeof(nladdr),
@@ -3633,10 +3862,13 @@  static int sockdiag_send(int family, int fd, int protocol, struct filter *f)
 	};
 
 	if (sendmsg(fd, &msg, 0) < 0) {
+		free(bpf_rta);
 		close(fd);
 		return -1;
 	}
 
+	free(bpf_rta);
+
 	return 0;
 }
 
@@ -5357,6 +5589,10 @@  static void _usage(FILE *dest)
 "       --tos           show tos and priority information\n"
 "       --cgroup        show cgroup information\n"
 "   -b, --bpf           show bpf filter socket information\n"
+#ifdef HAVE_LIBBPF
+"       --bpf-maps      show all BPF socket-local storage maps\n"
+"       --bpf-map-id=MAP-ID    show a BPF socket-local storage map\n"
+#endif
 "   -E, --events        continually display sockets as they are destroyed\n"
 "   -Z, --context       display task SELinux security contexts\n"
 "   -z, --contexts      display task and socket SELinux security contexts\n"
@@ -5482,6 +5718,9 @@  wrong_state:
 
 #define OPT_INET_SOCKOPT 262
 
+#define OPT_BPF_MAPS 263
+#define OPT_BPF_MAP_ID 264
+
 static const struct option long_opts[] = {
 	{ "numeric", 0, 0, 'n' },
 	{ "resolve", 0, 0, 'r' },
@@ -5527,6 +5766,10 @@  static const struct option long_opts[] = {
 	{ "mptcp", 0, 0, 'M' },
 	{ "oneline", 0, 0, 'O' },
 	{ "inet-sockopt", 0, 0, OPT_INET_SOCKOPT },
+#ifdef HAVE_LIBBPF
+	{ "bpf-maps", 0, 0, OPT_BPF_MAPS},
+	{ "bpf-map-id", 1, 0, OPT_BPF_MAP_ID},
+#endif
 	{ 0 }
 
 };
@@ -5732,6 +5975,19 @@  int main(int argc, char *argv[])
 		case OPT_INET_SOCKOPT:
 			show_inet_sockopt = 1;
 			break;
+#ifdef HAVE_LIBBPF
+		case OPT_BPF_MAPS:
+			if (bpf_map_opts.nr_maps) {
+				bpf_map_opts_mixed_error();
+				return -1;
+			}
+			bpf_map_opts.show_all = true;
+			break;
+		case OPT_BPF_MAP_ID:
+			if (bpf_map_opts_add_id(optarg))
+				exit(1);
+			break;
+#endif
 		case 'h':
 			help();
 		case '?':
@@ -5866,6 +6122,10 @@  int main(int argc, char *argv[])
 	if (show_processes || show_threads || show_proc_ctx || show_sock_ctx)
 		user_ent_destroy();
 
+#ifdef HAVE_LIBBPF
+	bpf_map_opts_destroy();
+#endif
+
 	render();
 
 	return 0;