From patchwork Wed May 3 22:53:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13230604
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com
Subject: [PATCH v7 bpf-next 01/10] bpf: tcp: Avoid taking fast sock lock in iterator
Date: Wed, 3 May 2023 22:53:42 +0000
Message-Id: <20230503225351.3700208-2-aditi.ghag@isovalent.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com>
References: <20230503225351.3700208-1-aditi.ghag@isovalent.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Previously, the BPF TCP iterator acquired the fast version of the sock
lock, which disables BH. This introduced a circular dependency with code
paths that later acquire the sockets hash table bucket lock.
Replace the fast version of the sock lock with the slow version, which
allows BPF programs executed from the iterator to destroy TCP listening
sockets using the bpf_sock_destroy kfunc (implemented in follow-up
commits).

Here is a stack trace that motivated this change:

```
lock_acquire+0xcd/0x330
_raw_spin_lock+0x33/0x40
------> sock lock acquired with BH enabled
sk_clone_lock+0x146/0x520
inet_csk_clone_lock+0x1b/0x110
tcp_create_openreq_child+0x22/0x3f0
tcp_v6_syn_recv_sock+0x96/0x940

lock_acquire+0xcd/0x330
_raw_spin_lock+0x33/0x40
------> Acquire (bucket) lhash2.lock (may cause deadlock if interrupted)
__inet_hash+0x4b/0x210
inet_csk_listen_start+0xe6/0x100
inet_listen+0x95/0x1d0
__sys_listen+0x69/0xb0
__x64_sys_listen+0x14/0x20
do_syscall_64+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc

lock_acquire+0xcd/0x330
_raw_spin_lock+0x33/0x40
------> Acquire (bucket) lhash2.lock
inet_unhash+0x9a/0x110
tcp_set_state+0x6a/0x210
tcp_abort+0x10d/0x200
bpf_prog_6793c5ca50c43c0d_iter_tcp6_server+0xa4/0xa9
bpf_iter_run_prog+0x1ff/0x340
------> Release (bucket) lhash2.lock
bpf_iter_tcp_seq_show+0xca/0x190
------> Acquire (bucket) lhash2.lock
------> sock lock acquired with BH disabled
bpf_seq_read+0x177/0x450
```

Acked-by: Stanislav Fomichev
Signed-off-by: Aditi Ghag
---
 net/ipv4/tcp_ipv4.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index ea370afa70ed..f2d370a9450f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2962,7 +2962,6 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
 	struct bpf_iter_meta meta;
 	struct bpf_prog *prog;
 	struct sock *sk = v;
-	bool slow;
 	uid_t uid;
 	int ret;
 
@@ -2970,7 +2969,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
 		return 0;
 
 	if (sk_fullsock(sk))
-		slow = lock_sock_fast(sk);
+		lock_sock(sk);
 
 	if (unlikely(sk_unhashed(sk))) {
 		ret = SEQ_SKIP;
@@ -2994,7 +2993,7 @@ static int bpf_iter_tcp_seq_show(struct seq_file *seq, void *v)
 
 unlock:
 	if (sk_fullsock(sk))
-		unlock_sock_fast(sk, slow);
+		release_sock(sk);
 	return ret;
 
 }

From patchwork Wed May 3 22:53:43 2023
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13230603
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com
Subject: [PATCH v7 bpf-next 02/10] udp: seq_file: Helper function to match socket attributes
Date: Wed, 3 May 2023 22:53:43 +0000
Message-Id: <20230503225351.3700208-3-aditi.ghag@isovalent.com>
In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com>
References: <20230503225351.3700208-1-aditi.ghag@isovalent.com>

This is a preparatory commit that refactors the code matching socket
attributes in iterators into a helper function, and uses it in the
proc fs iterator.
Signed-off-by: Aditi Ghag
---
 net/ipv4/udp.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c605d171eb2d..71e3fef44fd5 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2983,6 +2983,16 @@ EXPORT_SYMBOL(udp_prot);
 /* ------------------------------------------------------------------------ */
 #ifdef CONFIG_PROC_FS
 
+static unsigned short seq_file_family(const struct seq_file *seq);
+static bool seq_sk_match(struct seq_file *seq, const struct sock *sk)
+{
+	unsigned short family = seq_file_family(seq);
+
+	/* AF_UNSPEC is used as a match all */
+	return ((family == AF_UNSPEC || family == sk->sk_family) &&
+		net_eq(sock_net(sk), seq_file_net(seq)));
+}
+
 static struct udp_table *udp_get_table_afinfo(struct udp_seq_afinfo *afinfo,
 					      struct net *net)
 {
@@ -3013,10 +3023,7 @@ static struct sock *udp_get_first(struct seq_file *seq, int start)
 
 		spin_lock_bh(&hslot->lock);
 		sk_for_each(sk, &hslot->head) {
-			if (!net_eq(sock_net(sk), net))
-				continue;
-			if (afinfo->family == AF_UNSPEC ||
-			    sk->sk_family == afinfo->family)
+			if (seq_sk_match(seq, sk))
 				goto found;
 		}
 		spin_unlock_bh(&hslot->lock);
@@ -3040,9 +3047,7 @@ static struct sock *udp_get_next(struct seq_file *seq, struct sock *sk)
 
 	do {
 		sk = sk_next(sk);
-	} while (sk && (!net_eq(sock_net(sk), net) ||
-			(afinfo->family != AF_UNSPEC &&
-			 sk->sk_family != afinfo->family)));
+	} while (sk && !seq_sk_match(seq, sk));
 
 	if (!sk) {
 		udptable = udp_get_table_afinfo(afinfo, net);
@@ -3205,6 +3210,21 @@ static const struct seq_operations bpf_iter_udp_seq_ops = {
 };
 #endif
 
+static unsigned short seq_file_family(const struct seq_file *seq)
+{
+	const struct udp_seq_afinfo *afinfo;
+
+#ifdef CONFIG_BPF_SYSCALL
+	/* BPF iterator: bpf programs to filter sockets. */
+	if (seq->op == &bpf_iter_udp_seq_ops)
+		return AF_UNSPEC;
+#endif
+
+	/* Proc fs iterator */
+	afinfo = pde_data(file_inode(seq->file));
+	return afinfo->family;
+}
+
 const struct seq_operations udp_seq_ops = {
 	.start		= udp_seq_start,
 	.next		= udp_seq_next,

From patchwork Wed May 3 22:53:44 2023
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13230605
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau
Subject: [PATCH v7 bpf-next 03/10] bpf: udp: Encapsulate logic to get udp table
Date: Wed, 3 May 2023 22:53:44 +0000
Message-Id: <20230503225351.3700208-4-aditi.ghag@isovalent.com>
In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com>
References: <20230503225351.3700208-1-aditi.ghag@isovalent.com>

This is a preparatory commit that encapsulates the logic to get the udp
table in the iterator inside udp_get_table_afinfo, and renames the
function to udp_get_table_seq accordingly.

Suggested-by: Martin KaFai Lau
Signed-off-by: Aditi Ghag
---
 net/ipv4/udp.c | 35 ++++++++++++-----------------------
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 71e3fef44fd5..c426ebafeb13 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2993,9 +2993,16 @@ static bool seq_sk_match(struct seq_file *seq, const struct sock *sk)
 		net_eq(sock_net(sk), seq_file_net(seq)));
 }
 
-static struct udp_table *udp_get_table_afinfo(struct udp_seq_afinfo *afinfo,
-					      struct net *net)
+static struct udp_table *udp_get_table_seq(struct seq_file *seq,
+					   struct net *net)
 {
+	const struct udp_iter_state *state = seq->private;
+	const struct udp_seq_afinfo *afinfo;
+
+	if (state->bpf_seq_afinfo)
+		return net->ipv4.udp_table;
+
+	afinfo = pde_data(file_inode(seq->file));
 	return afinfo->udp_table ? : net->ipv4.udp_table;
 }
 
@@ -3003,16 +3010,10 @@ static struct sock *udp_get_first(struct seq_file *seq, int start)
 {
 	struct udp_iter_state *state = seq->private;
 	struct net *net = seq_file_net(seq);
-	struct udp_seq_afinfo *afinfo;
 	struct udp_table *udptable;
 	struct sock *sk;
 
-	if (state->bpf_seq_afinfo)
-		afinfo = state->bpf_seq_afinfo;
-	else
-		afinfo = pde_data(file_inode(seq->file));
-
-	udptable = udp_get_table_afinfo(afinfo, net);
+	udptable = udp_get_table_seq(seq, net);
 
 	for (state->bucket = start; state->bucket <= udptable->mask;
 	     ++state->bucket) {
@@ -3037,20 +3038,14 @@ static struct sock *udp_get_next(struct seq_file *seq, struct sock *sk)
 {
 	struct udp_iter_state *state = seq->private;
 	struct net *net = seq_file_net(seq);
-	struct udp_seq_afinfo *afinfo;
 	struct udp_table *udptable;
 
-	if (state->bpf_seq_afinfo)
-		afinfo = state->bpf_seq_afinfo;
-	else
-		afinfo = pde_data(file_inode(seq->file));
-
 	do {
 		sk = sk_next(sk);
 	} while (sk && !seq_sk_match(seq, sk));
 
 	if (!sk) {
-		udptable = udp_get_table_afinfo(afinfo, net);
+		udptable = udp_get_table_seq(seq, net);
 
 		if (state->bucket <= udptable->mask)
 			spin_unlock_bh(&udptable->hash[state->bucket].lock);
@@ -3096,15 +3091,9 @@ EXPORT_SYMBOL(udp_seq_next);
 void udp_seq_stop(struct seq_file *seq, void *v)
 {
 	struct udp_iter_state *state = seq->private;
-	struct udp_seq_afinfo *afinfo;
 	struct udp_table *udptable;
 
-	if (state->bpf_seq_afinfo)
-		afinfo = state->bpf_seq_afinfo;
-	else
-		afinfo = pde_data(file_inode(seq->file));
-
-	udptable = udp_get_table_afinfo(afinfo, seq_file_net(seq));
+	udptable = udp_get_table_seq(seq, seq_file_net(seq));
 
 	if (state->bucket <= udptable->mask)
 		spin_unlock_bh(&udptable->hash[state->bucket].lock);

From patchwork Wed May 3 22:53:45 2023
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13230606
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau
Subject: [PATCH v7 bpf-next 04/10] udp: seq_file: Remove bpf_seq_afinfo from udp_iter_state
Date: Wed, 3 May 2023 22:53:45 +0000
Message-Id: <20230503225351.3700208-5-aditi.ghag@isovalent.com>
In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com>
References: <20230503225351.3700208-1-aditi.ghag@isovalent.com>

This is a preparatory commit to remove the bpf_seq_afinfo field. The
field was previously shared between the proc fs and BPF UDP socket
iterators. As the follow-up commits will decouple the implementation for
the iterators, remove the field. As for the BPF socket iterator,
filtering of sockets is expected to be done in BPF programs.
Suggested-by: Martin KaFai Lau
Signed-off-by: Aditi Ghag
---
 include/net/udp.h |  1 -
 net/ipv4/udp.c    | 25 +++++--------------------
 2 files changed, 5 insertions(+), 21 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index de4b528522bb..5cad44318d71 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -437,7 +437,6 @@ struct udp_seq_afinfo {
 struct udp_iter_state {
 	struct seq_net_private  p;
 	int			bucket;
-	struct udp_seq_afinfo	*bpf_seq_afinfo;
 };
 
 void *udp_seq_start(struct seq_file *seq, loff_t *pos);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c426ebafeb13..9f8c1554a9e4 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2993,14 +2993,16 @@ static bool seq_sk_match(struct seq_file *seq, const struct sock *sk)
 		net_eq(sock_net(sk), seq_file_net(seq)));
 }
 
+static const struct seq_operations bpf_iter_udp_seq_ops;
 static struct udp_table *udp_get_table_seq(struct seq_file *seq,
 					   struct net *net)
 {
-	const struct udp_iter_state *state = seq->private;
 	const struct udp_seq_afinfo *afinfo;
 
-	if (state->bpf_seq_afinfo)
+#ifdef CONFIG_BPF_SYSCALL
+	if (seq->op == &bpf_iter_udp_seq_ops)
 		return net->ipv4.udp_table;
+#endif
 
 	afinfo = pde_data(file_inode(seq->file));
 	return afinfo->udp_table ? : net->ipv4.udp_table;
@@ -3424,28 +3426,11 @@ DEFINE_BPF_ITER_FUNC(udp, struct bpf_iter_meta *meta,
 
 static int bpf_iter_init_udp(void *priv_data, struct bpf_iter_aux_info *aux)
 {
-	struct udp_iter_state *st = priv_data;
-	struct udp_seq_afinfo *afinfo;
-	int ret;
-
-	afinfo = kmalloc(sizeof(*afinfo), GFP_USER | __GFP_NOWARN);
-	if (!afinfo)
-		return -ENOMEM;
-
-	afinfo->family = AF_UNSPEC;
-	afinfo->udp_table = NULL;
-	st->bpf_seq_afinfo = afinfo;
-	ret = bpf_iter_init_seq_net(priv_data, aux);
-	if (ret)
-		kfree(afinfo);
-	return ret;
+	return bpf_iter_init_seq_net(priv_data, aux);
 }
 
 static void bpf_iter_fini_udp(void *priv_data)
 {
-	struct udp_iter_state *st = priv_data;
-
-	kfree(st->bpf_seq_afinfo);
 	bpf_iter_fini_seq_net(priv_data);
 }
 

From patchwork Wed May 3 22:53:46 2023
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13230607
X-Patchwork-Delegate: bpf@iogearbox.net
From: Aditi Ghag
To: bpf@vger.kernel.org
Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau
Subject: [PATCH v7 bpf-next 05/10] bpf: udp: Implement batching for sockets iterator
Date: Wed, 3 May 2023 22:53:46 +0000
Message-Id: <20230503225351.3700208-6-aditi.ghag@isovalent.com>
In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com>
References: <20230503225351.3700208-1-aditi.ghag@isovalent.com>

Batch UDP sockets in the BPF iterator so that BPF/kernel helpers
executed in BPF programs can use overlapping locking semantics. This
allows the BPF socket destroy kfunc (introduced by follow-up patches) to
execute from BPF iterator programs.

Previously, BPF iterators acquired the sock lock and the sockets hash
table bucket lock while executing BPF programs. This prevented BPF
helpers that acquire these locks again from being executed from BPF
iterators. With the batching approach, we acquire a bucket lock, batch
all the bucket sockets, and then release the bucket lock. This enables
BPF or kernel helpers to skip sock locking when invoked in the supported
BPF contexts.

The batching logic is similar to the logic implemented in the TCP
iterator:
https://lore.kernel.org/bpf/20210701200613.1036157-1-kafai@fb.com/.
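
For illustration only (not part of this patch), the lock/batch/unlock
pattern described above can be sketched in user-space C. This is a
minimal sketch under stated assumptions: struct item, struct bucket,
struct batch, batch_fill(), batch_visit() and unlink_item() are
hypothetical names, a pthread mutex stands in for the bucket spinlock,
and an atomic counter stands in for the socket refcount. As in the
patch, a batch that turns out to be too small is grown to 1.5x the
number of entries seen and the bucket is retried once.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a socket, a hash bucket and the iterator batch. */
struct item {
	struct item *next;
	atomic_int refcnt;
};

struct bucket {
	pthread_mutex_t lock;	/* plays the role of the bucket lock */
	struct item *head;
};

struct batch {
	struct item **items;
	unsigned int end;	/* entries currently held */
	unsigned int max;	/* capacity of items[] */
};

/* Drop every reference the batch still holds. */
static void batch_put(struct batch *b)
{
	while (b->end) {
		int old = atomic_fetch_sub(&b->items[--b->end]->refcnt, 1);

		assert(old > 0);	/* we only drop references we took */
		(void)old;
	}
}

/* Snapshot the bucket under its lock. Returns 1 if the whole bucket was
 * captured, 0 if the batch is partial (allocation failed, or the bucket
 * grew again after the single retry).
 */
static int batch_fill(struct bucket *bkt, struct batch *b)
{
	int resized = 0;

	for (;;) {
		unsigned int seen = 0, new_max;
		struct item **tmp;
		struct item *it;

		batch_put(b);	/* forget any earlier, too-small attempt */

		pthread_mutex_lock(&bkt->lock);
		for (it = bkt->head; it; it = it->next, seen++) {
			if (b->end < b->max) {
				atomic_fetch_add(&it->refcnt, 1);
				b->items[b->end++] = it;
			}
		}
		pthread_mutex_unlock(&bkt->lock);

		if (b->end == seen)
			return 1;
		if (resized)
			return 0;

		/* Grow to 1.5x the entries seen, then retry once. */
		new_max = seen * 3 / 2;
		tmp = realloc(b->items, new_max * sizeof(*tmp));
		if (!tmp)
			return 0;
		b->items = tmp;
		b->max = new_max;
		resized = 1;
	}
}

typedef void (*visit_fn)(struct bucket *bkt, struct item *it);

/* Run fn on each batched entry with the bucket lock NOT held. */
static void batch_visit(struct bucket *bkt, struct batch *b, visit_fn fn)
{
	unsigned int i;

	for (i = 0; i < b->end; i++)
		fn(bkt, b->items[i]);
	batch_put(b);
}

/* Example visitor: re-takes the bucket lock to unlink the entry. */
static void unlink_item(struct bucket *bkt, struct item *victim)
{
	struct item **pp;

	pthread_mutex_lock(&bkt->lock);
	for (pp = &bkt->head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			break;
		}
	}
	pthread_mutex_unlock(&bkt->lock);
}
```

Because the visitor runs only after the bucket lock has been dropped, a
callback such as unlink_item() can take that lock itself without
deadlocking on a non-recursive mutex. The references taken while
batching keep each entry alive until batch_put() runs, which is the
role sock_hold()/sock_put() play in the patch.
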
Suggested-by: Martin KaFai Lau
Signed-off-by: Aditi Ghag
---
 net/ipv4/udp.c | 205 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 199 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 9f8c1554a9e4..150551acab9d 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -3148,6 +3148,143 @@ struct bpf_iter__udp {
 	int bucket __aligned(8);
 };
 
+struct bpf_udp_iter_state {
+	struct udp_iter_state state;
+	unsigned int cur_sk;
+	unsigned int end_sk;
+	unsigned int max_sk;
+	int offset;
+	struct sock **batch;
+	bool st_bucket_done;
+};
+
+static int bpf_iter_udp_realloc_batch(struct bpf_udp_iter_state *iter,
+				      unsigned int new_batch_sz);
+static struct sock *bpf_iter_udp_batch(struct seq_file *seq)
+{
+	struct bpf_udp_iter_state *iter = seq->private;
+	struct udp_iter_state *state = &iter->state;
+	struct net *net = seq_file_net(seq);
+	struct udp_table *udptable;
+	unsigned int batch_sks = 0;
+	bool resized = false;
+	struct sock *sk;
+
+	/* The current batch is done, so advance the bucket. */
+	if (iter->st_bucket_done) {
+		state->bucket++;
+		iter->offset = 0;
+	}
+
+	udptable = udp_get_table_seq(seq, net);
+
+again:
+	/* New batch for the next bucket.
+	 * Iterate over the hash table to find a bucket with sockets matching
+	 * the iterator attributes, and return the first matching socket from
+	 * the bucket. The remaining matched sockets from the bucket are batched
+	 * before releasing the bucket lock. This allows BPF programs that are
+	 * called in seq_show to acquire the bucket lock if needed.
+	 */
+	iter->cur_sk = 0;
+	iter->end_sk = 0;
+	iter->st_bucket_done = false;
+	batch_sks = 0;
+
+	for (; state->bucket <= udptable->mask; state->bucket++) {
+		struct udp_hslot *hslot2 = &udptable->hash2[state->bucket];
+
+		if (hlist_empty(&hslot2->head)) {
+			iter->offset = 0;
+			continue;
+		}
+
+		spin_lock_bh(&hslot2->lock);
+		udp_portaddr_for_each_entry(sk, &hslot2->head) {
+			if (seq_sk_match(seq, sk)) {
+				/* Resume from the last iterated socket at the
+				 * offset in the bucket before iterator was stopped.
+				 */
+				if (iter->offset) {
+					--iter->offset;
+					continue;
+				}
+				if (iter->end_sk < iter->max_sk) {
+					sock_hold(sk);
+					iter->batch[iter->end_sk++] = sk;
+				}
+				batch_sks++;
+			}
+		}
+		spin_unlock_bh(&hslot2->lock);
+
+		if (iter->end_sk)
+			break;
+
+		/* Reset the current bucket's offset before moving to the next bucket. */
+		iter->offset = 0;
+	}
+
+	/* All done: no batch made. */
+	if (!iter->end_sk)
+		return NULL;
+
+	if (iter->end_sk == batch_sks) {
+		/* Batching is done for the current bucket; return the first
+		 * socket to be iterated from the batch.
+		 */
+		iter->st_bucket_done = true;
+		goto done;
+	}
+	if (!resized && !bpf_iter_udp_realloc_batch(iter, batch_sks * 3 / 2)) {
+		resized = true;
+		/* After allocating a larger batch, retry one more time to grab
+		 * the whole bucket.
+		 */
+		state->bucket--;
+		goto again;
+	}
+done:
+	return iter->batch[0];
+}
+
+static void *bpf_iter_udp_seq_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct bpf_udp_iter_state *iter = seq->private;
+	struct sock *sk;
+
+	/* Whenever seq_next() is called, the iter->cur_sk is
+	 * done with seq_show(), so unref the iter->cur_sk.
+	 */
+	if (iter->cur_sk < iter->end_sk) {
+		sock_put(iter->batch[iter->cur_sk++]);
+		++iter->offset;
+	}
+
+	/* After updating iter->cur_sk, check if there are more sockets
+	 * available in the current bucket batch.
+	 */
+	if (iter->cur_sk < iter->end_sk)
+		sk = iter->batch[iter->cur_sk];
+	else
+		/* Prepare a new batch. */
+		sk = bpf_iter_udp_batch(seq);
+
+	++*pos;
+	return sk;
+}
+
+static void *bpf_iter_udp_seq_start(struct seq_file *seq, loff_t *pos)
+{
+	/* bpf iter does not support lseek, so it always
+	 * continue from where it was stop()-ped.
+	 */
+	if (*pos)
+		return bpf_iter_udp_batch(seq);
+
+	return SEQ_START_TOKEN;
+}
+
 static int udp_prog_seq_show(struct bpf_prog *prog, struct bpf_iter_meta *meta,
 			     struct udp_sock *udp_sk, uid_t uid, int bucket)
 {
@@ -3168,18 +3305,37 @@ static int bpf_iter_udp_seq_show(struct seq_file *seq, void *v)
 	struct bpf_prog *prog;
 	struct sock *sk = v;
 	uid_t uid;
+	int ret;
 
 	if (v == SEQ_START_TOKEN)
 		return 0;
 
+	lock_sock(sk);
+
+	if (unlikely(sk_unhashed(sk))) {
+		ret = SEQ_SKIP;
+		goto unlock;
+	}
+
 	uid = from_kuid_munged(seq_user_ns(seq), sock_i_uid(sk));
 	meta.seq = seq;
 	prog = bpf_iter_get_info(&meta, false);
-	return udp_prog_seq_show(prog, &meta, v, uid, state->bucket);
+	ret = udp_prog_seq_show(prog, &meta, v, uid, state->bucket);
+
+unlock:
+	release_sock(sk);
+	return ret;
+}
+
+static void bpf_iter_udp_put_batch(struct bpf_udp_iter_state *iter)
+{
+	while (iter->cur_sk < iter->end_sk)
+		sock_put(iter->batch[iter->cur_sk++]);
 }
 
 static void bpf_iter_udp_seq_stop(struct seq_file *seq, void *v)
 {
+	struct bpf_udp_iter_state *iter = seq->private;
 	struct bpf_iter_meta meta;
 	struct bpf_prog *prog;
 
@@ -3190,12 +3346,15 @@ static void bpf_iter_udp_seq_stop(struct seq_file *seq, void *v)
 		(void)udp_prog_seq_show(prog, &meta, v, 0, 0);
 	}
 
-	udp_seq_stop(seq, v);
+	if (iter->cur_sk < iter->end_sk) {
+		bpf_iter_udp_put_batch(iter);
+		iter->st_bucket_done = false;
+	}
 }
 
 static const struct seq_operations bpf_iter_udp_seq_ops = {
-	.start		= udp_seq_start,
-	.next		= udp_seq_next,
+	.start		= bpf_iter_udp_seq_start,
+	.next		= bpf_iter_udp_seq_next,
 	.stop		= bpf_iter_udp_seq_stop,
 	.show		= bpf_iter_udp_seq_show,
 };
@@ -3424,21 +3583,55 @@ static struct pernet_operations __net_initdata udp_sysctl_ops = {
 DEFINE_BPF_ITER_FUNC(udp, struct bpf_iter_meta *meta,
 		     struct udp_sock *udp_sk, uid_t uid, int bucket)
 
+static int bpf_iter_udp_realloc_batch(struct bpf_udp_iter_state *iter,
+				      unsigned int new_batch_sz)
+{
+	struct sock **new_batch;
+
+	new_batch = kvmalloc_array(new_batch_sz, sizeof(*new_batch),
+				   GFP_USER | __GFP_NOWARN);
+	if (!new_batch)
+		return -ENOMEM;
+
+	bpf_iter_udp_put_batch(iter);
+	kvfree(iter->batch);
+	iter->batch = new_batch;
+	iter->max_sk = new_batch_sz;
+
+	return 0;
+}
+
+#define INIT_BATCH_SZ 16
+
 static int bpf_iter_init_udp(void *priv_data, struct bpf_iter_aux_info *aux)
 {
-	return bpf_iter_init_seq_net(priv_data, aux);
+	struct bpf_udp_iter_state *iter = priv_data;
+	int ret;
+
+	ret = bpf_iter_init_seq_net(priv_data, aux);
+	if (ret)
+		return ret;
+
+	ret = bpf_iter_udp_realloc_batch(iter, INIT_BATCH_SZ);
+	if (ret)
+		bpf_iter_fini_seq_net(priv_data);
+
+	return ret;
 }
 
 static void bpf_iter_fini_udp(void *priv_data)
 {
+	struct bpf_udp_iter_state *iter = priv_data;
+
 	bpf_iter_fini_seq_net(priv_data);
+	kvfree(iter->batch);
 }
 
 static const struct bpf_iter_seq_info udp_seq_info = {
 	.seq_ops		= &bpf_iter_udp_seq_ops,
 	.init_seq_private	= bpf_iter_init_udp,
 	.fini_seq_private	= bpf_iter_fini_udp,
-	.seq_priv_size		= sizeof(struct udp_iter_state),
+	.seq_priv_size		= sizeof(struct bpf_udp_iter_state),
 };
 
 static struct bpf_iter_reg udp_reg_info = {

From patchwork Wed May 3 22:53:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aditi Ghag
X-Patchwork-Id: 13230609
X-Patchwork-Delegate: bpf@iogearbox.net
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 13ACBC7EE22 for ; Wed, 3 May 2023 22:54:12 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229780AbjECWyK (ORCPT ); Wed, 3 May 2023 18:54:10 -0400
Received: from
lindbergh.monkeyblade.net ([23.128.96.19]:48046 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229622AbjECWyF (ORCPT ); Wed, 3 May 2023 18:54:05 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A21244B7 for ; Wed, 3 May 2023 15:54:04 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1ab1ce53ca6so10093555ad.0 for ; Wed, 03 May 2023 15:54:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1683154444; x=1685746444; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=eTgKptWHJObnX/MfocLy8Ls0Zk4vGnGa6alxIVryx/k=; b=KWYTrYyX8I+7W3DUJiJ22aZuiTM+VS6LGyVSBYbwS+/GcQ9+Z9pAvrcNj192Bqw4Nm 57V/AJkmi//ejePadPZe/CQA+3j5oKoKek+T1GWx5pcsI7l8Uc7QBH0gA4k1k0RBs5uE FldI3A7kljEp9zQk2GMuIXLRmBpaiMUF+MWgZlDZc9cWi+yr+7iVl9uWmdoRPtzzMfWs b2NGc1TE7+dqOBPub+eIaat8gO2l+63rdD+S0+YoQLuSDWZfDFM0tBnyJ2PMddObW+5h 6P0uuOY9w4BAqIqVwUGxi8FwyUpdKVJqV2eXTo3opD+79ejp4GDdnwMBQe+hr1KGvt8J f6Iw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683154444; x=1685746444; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=eTgKptWHJObnX/MfocLy8Ls0Zk4vGnGa6alxIVryx/k=; b=UZDzCBNdMp0cFz4DeGfqQJdXbZ1oICVguMOzujEYiEcj7b04+W13i1n0XHcacRTcOI rY5rYWWDHdvvnBADa8Z1HKdtxUQXtWcWtSDnFudWrLO2IQ2gpzSSaRgqgGq2b2klaUDy MsVoVqFfc9NBGWgm6YIwjPWHmjvjkZeXYJS2CxSjI/l2bkpmOfwhIJNTgiGpreopxnoD trneaLMr6WkaK4mzBsQeG+24NxUrElaioaEq8Oh2oQhuaFHLYiSe/MA+0G11pvap9q6R WVL90Mz4u4XDbUx3kOADEmkYTjnavxX8f+kv0Sul7ifeDUbmVsnIW2PNLh275vm9nT2C Fscw== X-Gm-Message-State: AC+VfDzrRbiJDTiAy9+cp4K2MdiEXvOeI8MDZyB5pnZrLTd1b4TxlL4l vWSnps3WR2u4HsYBwr+oUcg+jq0uchO/700clm8= 
X-Google-Smtp-Source: ACHHUZ4dkVnFbUYBuslTsOWG1NCPuukk2M4sJTul2JjcEp63V6k2vehi4mscm3qUi/rmOlht6GjuEg== X-Received: by 2002:a17:903:110f:b0:1a9:91a1:57bd with SMTP id n15-20020a170903110f00b001a991a157bdmr1903728plh.34.1683154443985; Wed, 03 May 2023 15:54:03 -0700 (PDT) Received: from localhost.localdomain ([2604:1380:4611:8100::1]) by smtp.gmail.com with ESMTPSA id p2-20020a1709028a8200b001a641e4738asm2200443plo.1.2023.05.03.15.54.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 May 2023 15:54:03 -0700 (PDT) From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com Subject: [PATCH v7 bpf-next 06/10] bpf: Add bpf_sock_destroy kfunc Date: Wed, 3 May 2023 22:53:47 +0000 Message-Id: <20230503225351.3700208-7-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com> References: <20230503225351.3700208-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net The socket destroy kfunc is used to forcefully terminate sockets from certain BPF contexts. We plan to use the capability in Cilium to force client sockets to reconnect when their remote load-balancing backends are deleted. The other use case is on-the-fly policy enforcement where existing socket connections prevented by policies need to be forcefully terminated. The helper allows terminating sockets that may or may not be actively sending traffic. The helper is currently exposed to certain BPF iterators where users can filter, and terminate selected sockets. Additionally, the helper can only be called from these BPF contexts that ensure socket locking in order to allow synchronous execution of destroy helpers that also acquire socket locks. 
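The locking rule described above can be illustrated with a minimal userspace sketch. It is a model only, not the kernel implementation: `mock_sock`, `mock_lock`, `mock_abort`, `mock_iter_show` and the `in_bpf_ctx` flag are hypothetical stand-ins (the flag plays the role that `has_current_bpf_ctx()` plays in the diff). It shows why a destroy handler that normally takes the socket lock must skip it when the caller already holds it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct mock_sock {
	int lock_depth;	/* 1 while the socket lock is held, 0 otherwise */
	bool aborted;	/* set once the destroy handler has run */
};

/* Models has_current_bpf_ctx(): true while a BPF program is running. */
static bool in_bpf_ctx;

static void mock_lock(struct mock_sock *sk)
{
	/* Taking a non-recursive lock twice on the same path would deadlock. */
	assert(sk->lock_depth == 0);
	sk->lock_depth++;
}

static void mock_unlock(struct mock_sock *sk)
{
	assert(sk->lock_depth == 1);
	sk->lock_depth--;
}

/* Models a destroy handler: lock only when the caller has not done so. */
static int mock_abort(struct mock_sock *sk)
{
	if (!in_bpf_ctx)
		mock_lock(sk);

	assert(sk->lock_depth == 1);	/* the work always runs under the lock */
	sk->aborted = true;

	if (!in_bpf_ctx)
		mock_unlock(sk);
	return 0;
}

/* Models the iterator's show step: lock, run the program, unlock. */
static int mock_iter_show(struct mock_sock *sk)
{
	int ret;

	mock_lock(sk);
	in_bpf_ctx = true;
	ret = mock_abort(sk);	/* stands in for the program calling the kfunc */
	in_bpf_ctx = false;
	mock_unlock(sk);
	return ret;
}
```

In this model, calling `mock_abort()` directly takes and releases the lock itself, while calling it through `mock_iter_show()` reuses the lock the iterator already holds. Both paths leave the socket aborted with the lock released.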
The previous commit, which batches UDP sockets during iteration, makes it possible to invoke the destroy helper synchronously from BPF context: the destroy handler skips taking the socket lock because the iterator already holds it. TCP iterators already supported batching. Follow-up commits will ensure that the kfunc can only be called from programs with the `BPF_TRACE_ITER` attach type. The helper takes a `sock_common` argument, even though it expects a `sock` pointer and casts the argument to one. This enables the verifier to allow the sock_destroy kfunc to be called for TCP with `sock_common` and for UDP with `sock` structs. By comparison, BPF helpers enable this behavior with the `ARG_PTR_TO_BTF_ID_SOCK_COMMON` argument type. However, no such option is available in the verifier logic that handles kfuncs, where BTF types are inferred. Furthermore, as `sock_common` only has a subset of the fields of `sock`, casting a pointer to the latter type might not always be safe for certain sockets such as request sockets, but these have special handling in the diag_destroy handlers. Signed-off-by: Aditi Ghag --- net/core/filter.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp.c | 10 ++++++--- net/ipv4/udp.c | 6 +++-- 3 files changed, 68 insertions(+), 5 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index 727c5269867d..97d70b7959a1 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -11715,3 +11715,60 @@ static int __init bpf_kfunc_init(void) return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &bpf_kfunc_set_xdp); } late_initcall(bpf_kfunc_init); + +/* Disables missing prototype warnings */ +__diag_push(); +__diag_ignore_all("-Wmissing-prototypes", + "Global functions as their definitions will be in vmlinux BTF"); + +/* bpf_sock_destroy: Destroy the given socket with ECONNABORTED error code. + * + * The function expects a non-NULL pointer to a socket, and invokes the + * protocol specific socket destroy handlers.
+ * + * The helper can only be called from BPF contexts that have acquired the socket + * locks. + * + * Parameters: + * @sock: Pointer to socket to be destroyed + * + * Return: + * On error, may return -EOPNOTSUPP or -EINVAL. + * -EOPNOTSUPP if the protocol specific destroy handler is not supported. + * 0 otherwise + */ +__bpf_kfunc int bpf_sock_destroy(struct sock_common *sock) +{ + struct sock *sk = (struct sock *)sock; + + if (!sk) + return -EINVAL; + + /* The locking semantics that allow for synchronous execution of the + * destroy handlers are only supported for TCP and UDP. + * Supporting protocols will need to acquire lock_sock in the BPF context + * prior to invoking this kfunc. + */ + if (!sk->sk_prot->diag_destroy || (sk->sk_protocol != IPPROTO_TCP && + sk->sk_protocol != IPPROTO_UDP)) + return -EOPNOTSUPP; + + return sk->sk_prot->diag_destroy(sk, ECONNABORTED); +} + +__diag_pop() + +BTF_SET8_START(sock_destroy_kfunc_set) +BTF_ID_FLAGS(func, bpf_sock_destroy) +BTF_SET8_END(sock_destroy_kfunc_set) + +static const struct btf_kfunc_id_set bpf_sock_destroy_kfunc_set = { + .owner = THIS_MODULE, + .set = &sock_destroy_kfunc_set, +}; + +static int init_subsystem(void) +{ + return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_sock_destroy_kfunc_set); +} +late_initcall(init_subsystem); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 288693981b00..2259b4facc2f 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4679,8 +4679,10 @@ int tcp_abort(struct sock *sk, int err) return 0; } - /* Don't race with userspace socket closes such as tcp_close. */ - lock_sock(sk); + /* BPF context ensures sock locking. */ + if (!has_current_bpf_ctx()) + /* Don't race with userspace socket closes such as tcp_close.
*/ + lock_sock(sk); if (sk->sk_state == TCP_LISTEN) { tcp_set_state(sk, TCP_CLOSE); @@ -4702,9 +4704,11 @@ int tcp_abort(struct sock *sk, int err) } bh_unlock_sock(sk); + local_bh_enable(); tcp_write_queue_purge(sk); - release_sock(sk); + if (!has_current_bpf_ctx()) + release_sock(sk); return 0; } EXPORT_SYMBOL_GPL(tcp_abort); diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 150551acab9d..5f48cdf82a45 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2925,7 +2925,8 @@ EXPORT_SYMBOL(udp_poll); int udp_abort(struct sock *sk, int err) { - lock_sock(sk); + if (!has_current_bpf_ctx()) + lock_sock(sk); /* udp{v6}_destroy_sock() sets it under the sk lock, avoid racing * with close() @@ -2938,7 +2939,8 @@ int udp_abort(struct sock *sk, int err) __udp_disconnect(sk, 0); out: - release_sock(sk); + if (!has_current_bpf_ctx()) + release_sock(sk); return 0; } From patchwork Wed May 3 22:53:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13230608 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94F63C7EE26 for ; Wed, 3 May 2023 22:54:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229639AbjECWyL (ORCPT ); Wed, 3 May 2023 18:54:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48154 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229751AbjECWyG (ORCPT ); Wed, 3 May 2023 18:54:06 -0400 Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9942B4C27 for ; Wed, 3 May 2023 15:54:05 -0700 (PDT) Received: by mail-pl1-x632.google.com with SMTP id 
d9443c01a7336-1aaed87d8bdso34198955ad.3 for ; Wed, 03 May 2023 15:54:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1683154445; x=1685746445; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MgEqw/DaWrH7mQUZEkdtE0vOzpKLcnA0EDsa0jU2Swo=; b=AhIvQ4z9G+bRRwlR47L9AnU9VZF/v+Wz+t1CjQYCfJhxBw5tK2aSmvLA/50AQa5dpB Lg+v+zGJ+DedfUgKdG2xSPiwg7/vfMXvcYVV3vwhrst5aHV2EP9CtmZVt/6hYew7F6rd vQL80J5UnU+idft4KJsMV8Gh33RGT28gfYkzsS8oNXBIwso1Y3JZBE4xOWKK2JUY+ozx R3OfH0eJiXwIU4m35omDZEB5QPeUlxLGpYYF5XYi+UXuJFqVgwFfG6vMPHCaTpBiC0vV Jr+s361SMESRlXzDd1iUJuo3kAHjhAj0r7G/o67v/aSoHf+x4ltiyiSWaLVoab3BWQzh Du+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683154445; x=1685746445; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MgEqw/DaWrH7mQUZEkdtE0vOzpKLcnA0EDsa0jU2Swo=; b=fKE23MeWhdz8oMedkqUTVp1sMgbqUQOwG/6RSf+5EKYbUBCFLMMaFN3ZYDMZ38PhDS Jk6DOmKNuLAkpj/VvtJHhAojvM6nN7GufU03ZGNsxR1Yl3if4jaCL50aBtUppDlbzsDC 8AOtgI2r5306XnvMh2sb51oGFEgNlL0FUpsLIWmJMn6pj4HOhUnYweYisv6n5OBxzggt SVM5SGoEXpGbeAxWLuHbVcmw5xUZsVWOfO0yMoQGfRD5Sydc84p4zg6u4wKVdZq7NVZN ShDvzB7GKKnjYsxhHd1OJOPs2YqZezaYQs80F1QE4XkJTgRcBMyx6i1lzuINvS5TnJB8 LmCg== X-Gm-Message-State: AC+VfDypbJBCk1GAtK85Cph3EZNGv/a9kqn96JWvluzgVP/y2f0ZaZQK 6KxOi8oysIfc5mhQPMIpmfaz0t8RWkYgd1adLNA= X-Google-Smtp-Source: ACHHUZ6rIINg8Oo7+eMava/vo8SNaT9iNGwrOeSj+8q6srpmYsh82U5cRW4CCAdyWlMTczUTBpl5IA== X-Received: by 2002:a17:902:a40b:b0:1ab:94:1ee4 with SMTP id p11-20020a170902a40b00b001ab00941ee4mr1527712plq.2.1683154444803; Wed, 03 May 2023 15:54:04 -0700 (PDT) Received: from localhost.localdomain ([2604:1380:4611:8100::1]) by smtp.gmail.com with ESMTPSA id p2-20020a1709028a8200b001a641e4738asm2200443plo.1.2023.05.03.15.54.04 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 May 2023 15:54:04 -0700 (PDT) From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com Subject: [PATCH v7 bpf-next 07/10] selftests/bpf: Add helper to get port using getsockname Date: Wed, 3 May 2023 22:53:48 +0000 Message-Id: <20230503225351.3700208-8-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com> References: <20230503225351.3700208-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net The helper will be used to programmatically retrieve, and pass ports in userspace and kernel selftest programs. Suggested-by: Stanislav Fomichev Signed-off-by: Aditi Ghag Acked-by: Stanislav Fomichev --- tools/testing/selftests/bpf/network_helpers.c | 23 +++++++++++++++++++ tools/testing/selftests/bpf/network_helpers.h | 1 + 2 files changed, 24 insertions(+) diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index 596caa176582..a105c0cd008a 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -427,3 +427,26 @@ void close_netns(struct nstoken *token) close(token->orig_netns_fd); free(token); } + +int get_socket_local_port(int sock_fd) +{ + struct sockaddr_storage addr; + socklen_t addrlen = sizeof(addr); + int err; + + err = getsockname(sock_fd, (struct sockaddr *)&addr, &addrlen); + if (err < 0) + return err; + + if (addr.ss_family == AF_INET) { + struct sockaddr_in *sin = (struct sockaddr_in *)&addr; + + return sin->sin_port; + } else if (addr.ss_family == AF_INET6) { + struct sockaddr_in6 *sin = (struct sockaddr_in6 *)&addr; + + return sin->sin6_port; + } + + return -1; +} diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index 
f882c691b790..694185644da6 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -56,6 +56,7 @@ int fastopen_connect(int server_fd, const char *data, unsigned int data_len, int make_sockaddr(int family, const char *addr_str, __u16 port, struct sockaddr_storage *addr, socklen_t *len); char *ping_command(int family); +int get_socket_local_port(int sock_fd); struct nstoken; /** From patchwork Wed May 3 22:53:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13230610 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8CF6EC77B78 for ; Wed, 3 May 2023 22:54:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229748AbjECWyL (ORCPT ); Wed, 3 May 2023 18:54:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47968 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229758AbjECWyI (ORCPT ); Wed, 3 May 2023 18:54:08 -0400 Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9BB8586B5 for ; Wed, 3 May 2023 15:54:06 -0700 (PDT) Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1aad55244b7so42949235ad.2 for ; Wed, 03 May 2023 15:54:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1683154446; x=1685746446; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/kZMQO6MiHp39OPx5dM3ZCt2VOrOUpJym/PLANQdM58=; 
b=i8j645qvyGL5pval41LdrMquzDN5fP4CxWViDgjhGU6qhSe4bNJ5CBJeyOE5rawHWK MI7q9yRoMNHCs8z5JuUdoa3m3l6BV0obsoqWHJyebxzg2KCHLKWoHoa/e+Vw10pSHdPP ls9uMULclVcgyfjxNW4Fto/MNXp8vG0XWsf8/APT2p3Io9PiVZTsgahcp8nonwgKHAA/ TW/xFLrr8LazJrXDDMlbg8AQdyrFg34IBPC6NUoyiF/FWWOxb5qTwNy38OkoeeqX5XHx QD7WwwXnfwmPVwC79F1ZPKRkj0uj3/nux3Qgxv7MMLQmo7LpmzfZaNWGiGtkr8szgML4 ayIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683154446; x=1685746446; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/kZMQO6MiHp39OPx5dM3ZCt2VOrOUpJym/PLANQdM58=; b=Do7V9q5CrQkq0Cn6pXyye07+LetjVTEdlOFRxJ3ydxnZx5NNoLmQoSoKJTnKBLb3i+ z2qkqgKRNf2S2Igm71wYaORxnAvcSg+xOuA/ua6OD2Q7Kt2ZxxuP7OHpHfqlrDEfPxFS VRSiSBRG7GS9ESMnwUE/rj2z4ozkKboDQhfyobCW/tQF0VjHBs5VN3TgS5mLIDCfPhrh dBlZN6N/YbZbTEQxTGdXVgCDVn+ywydG67FRcRdaQtziuIxEBQsMv7vYscLAm0ONsw6L Rj91r2rFt3p242nxzJZJjeMBK3kXBbJnJWpOYTPtU0aSzgxiUoGu4QS7i29riGUIoVRR Ayuw== X-Gm-Message-State: AC+VfDxjwL9fsF1hSJO8q12jAcVUsGCfBeA+w7Y9KkBHWnrD0CeDsnXr Q3V5OR1ShFpYiboISicKecPaAQPTmY8bHDUuTmc= X-Google-Smtp-Source: ACHHUZ64DXeDHzeGQMxoj8rKjrPBIBzWVIcxHlBxUaUbEAZ02uRyFYbXr1ptmfHoRCeExgKt6pxLjw== X-Received: by 2002:a17:902:eac5:b0:1aa:fec9:5219 with SMTP id p5-20020a170902eac500b001aafec95219mr1721680pld.61.1683154445805; Wed, 03 May 2023 15:54:05 -0700 (PDT) Received: from localhost.localdomain ([2604:1380:4611:8100::1]) by smtp.gmail.com with ESMTPSA id p2-20020a1709028a8200b001a641e4738asm2200443plo.1.2023.05.03.15.54.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 May 2023 15:54:05 -0700 (PDT) From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com Subject: [PATCH v7 bpf-next 08/10] selftests/bpf: Test bpf_sock_destroy Date: Wed, 3 May 2023 22:53:49 +0000 Message-Id: <20230503225351.3700208-9-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 
In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com> References: <20230503225351.3700208-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net The test cases for destroying sockets mirror the intended usage of the bpf_sock_destroy kfunc with iterators. The destroy helpers set the `ECONNABORTED` error code, which the test code can validate on client sockets. However, UDP sockets have an overriding error code from the disconnect called during abort, so the error code validation is only done for TCP sockets. Signed-off-by: Aditi Ghag --- .../selftests/bpf/prog_tests/sock_destroy.c | 215 ++++++++++++++++++ .../selftests/bpf/progs/sock_destroy_prog.c | 145 ++++++++++++ 2 files changed, 360 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/sock_destroy.c create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_prog.c diff --git a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c new file mode 100644 index 000000000000..d5f76731b4a3 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c @@ -0,0 +1,215 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include + +#include "sock_destroy_prog.skel.h" +#include "network_helpers.h" + +#define TEST_NS "sock_destroy_netns" + +static void start_iter_sockets(struct bpf_program *prog) +{ + struct bpf_link *link; + char buf[50] = {}; + int iter_fd, len; + + link = bpf_program__attach_iter(prog, NULL); + if (!ASSERT_OK_PTR(link, "attach_iter")) + return; + + iter_fd = bpf_iter_create(bpf_link__fd(link)); + if (!ASSERT_GE(iter_fd, 0, "create_iter")) + goto free_link; + + while ((len = read(iter_fd, buf, sizeof(buf))) > 0) + ; + ASSERT_GE(len, 0, "read"); + + close(iter_fd); + +free_link: + bpf_link__destroy(link); +} + +static void test_tcp_client(struct sock_destroy_prog *skel) +{ + int serv = -1, clien = -1, n; + +
serv = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0); + if (!ASSERT_GE(serv, 0, "start_server")) + goto cleanup; + + clien = connect_to_fd(serv, 0); + if (!ASSERT_GE(clien, 0, "connect_to_fd")) + goto cleanup; + + serv = accept(serv, NULL, NULL); + if (!ASSERT_GE(serv, 0, "serv accept")) + goto cleanup; + + n = send(clien, "t", 1, 0); + if (!ASSERT_EQ(n, 1, "client send")) + goto cleanup; + + /* Run iterator program that destroys connected client sockets. */ + start_iter_sockets(skel->progs.iter_tcp6_client); + + n = send(clien, "t", 1, 0); + if (!ASSERT_LT(n, 0, "client_send on destroyed socket")) + goto cleanup; + ASSERT_EQ(errno, ECONNABORTED, "error code on destroyed socket"); + +cleanup: + if (clien != -1) + close(clien); + if (serv != -1) + close(serv); +} + +static void test_tcp_server(struct sock_destroy_prog *skel) +{ + int serv = -1, clien = -1, n, serv_port; + + serv = start_server(AF_INET6, SOCK_STREAM, NULL, 0, 0); + if (!ASSERT_GE(serv, 0, "start_server")) + goto cleanup; + serv_port = get_socket_local_port(serv); + if (!ASSERT_GE(serv_port, 0, "get_sock_local_port")) + goto cleanup; + skel->bss->serv_port = (__be16) serv_port; + + clien = connect_to_fd(serv, 0); + if (!ASSERT_GE(clien, 0, "connect_to_fd")) + goto cleanup; + + serv = accept(serv, NULL, NULL); + if (!ASSERT_GE(serv, 0, "serv accept")) + goto cleanup; + + n = send(clien, "t", 1, 0); + if (!ASSERT_EQ(n, 1, "client send")) + goto cleanup; + + /* Run iterator program that destroys server sockets. 
*/ + start_iter_sockets(skel->progs.iter_tcp6_server); + + n = send(clien, "t", 1, 0); + if (!ASSERT_LT(n, 0, "client_send on destroyed socket")) + goto cleanup; + ASSERT_EQ(errno, ECONNRESET, "error code on destroyed socket"); + +cleanup: + if (clien != -1) + close(clien); + if (serv != -1) + close(serv); +} + +static void test_udp_client(struct sock_destroy_prog *skel) +{ + int serv = -1, clien = -1, n = 0; + + serv = start_server(AF_INET6, SOCK_DGRAM, NULL, 0, 0); + if (!ASSERT_GE(serv, 0, "start_server")) + goto cleanup; + + clien = connect_to_fd(serv, 0); + if (!ASSERT_GE(clien, 0, "connect_to_fd")) + goto cleanup; + + n = send(clien, "t", 1, 0); + if (!ASSERT_EQ(n, 1, "client send")) + goto cleanup; + + /* Run iterator program that destroys sockets. */ + start_iter_sockets(skel->progs.iter_udp6_client); + + n = send(clien, "t", 1, 0); + if (!ASSERT_LT(n, 0, "client_send on destroyed socket")) + goto cleanup; + /* UDP sockets have an overriding error code after they are disconnected, + * so we don't check for ECONNABORTED error code. + */ + +cleanup: + if (clien != -1) + close(clien); + if (serv != -1) + close(serv); +} + +static void test_udp_server(struct sock_destroy_prog *skel) +{ + int *listen_fds = NULL, n, i, serv_port; + unsigned int num_listens = 5; + char buf[1]; + + /* Start reuseport servers. */ + listen_fds = start_reuseport_server(AF_INET6, SOCK_DGRAM, + "::1", 0, 0, num_listens); + if (!ASSERT_OK_PTR(listen_fds, "start_reuseport_server")) + goto cleanup; + serv_port = get_socket_local_port(listen_fds[0]); + if (!ASSERT_GE(serv_port, 0, "get_sock_local_port")) + goto cleanup; + skel->bss->serv_port = (__be16) serv_port; + + /* Run iterator program that destroys server sockets. 
*/ + start_iter_sockets(skel->progs.iter_udp6_server); + + for (i = 0; i < num_listens; ++i) { + n = read(listen_fds[i], buf, sizeof(buf)); + if (!ASSERT_EQ(n, -1, "read") || + !ASSERT_EQ(errno, ECONNABORTED, "error code on destroyed socket")) + break; + } + ASSERT_EQ(i, num_listens, "server socket"); + +cleanup: + free_fds(listen_fds, num_listens); +} + +void test_sock_destroy(void) +{ + struct sock_destroy_prog *skel; + struct nstoken *nstoken; + int cgroup_fd; + + skel = sock_destroy_prog__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open")) + return; + + cgroup_fd = test__join_cgroup("/sock_destroy"); + if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup")) + goto cleanup; + + skel->links.sock_connect = bpf_program__attach_cgroup( + skel->progs.sock_connect, cgroup_fd); + if (!ASSERT_OK_PTR(skel->links.sock_connect, "prog_attach")) + goto cleanup; + + SYS(cleanup, "ip netns add %s", TEST_NS); + SYS(cleanup, "ip -net %s link set dev lo up", TEST_NS); + + nstoken = open_netns(TEST_NS); + if (!ASSERT_OK_PTR(nstoken, "open_netns")) + goto cleanup; + + if (test__start_subtest("tcp_client")) + test_tcp_client(skel); + if (test__start_subtest("tcp_server")) + test_tcp_server(skel); + if (test__start_subtest("udp_client")) + test_udp_client(skel); + if (test__start_subtest("udp_server")) + test_udp_server(skel); + + +cleanup: + if (nstoken) + close_netns(nstoken); + SYS_NOFAIL("ip netns del " TEST_NS " &> /dev/null"); + if (cgroup_fd >= 0) + close(cgroup_fd); + sock_destroy_prog__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/sock_destroy_prog.c b/tools/testing/selftests/bpf/progs/sock_destroy_prog.c new file mode 100644 index 000000000000..9e0bf7a54cec --- /dev/null +++ b/tools/testing/selftests/bpf/progs/sock_destroy_prog.c @@ -0,0 +1,145 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "vmlinux.h" +#include +#include + +#include "bpf_tracing_net.h" + +__be16 serv_port = 0; + +int bpf_sock_destroy(struct sock_common *sk) __ksym; + +struct { + 
__uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} tcp_conn_sockets SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __type(value, __u64); +} udp_conn_sockets SEC(".maps"); + +SEC("cgroup/connect6") +int sock_connect(struct bpf_sock_addr *ctx) +{ + __u64 sock_cookie = 0; + int key = 0; + __u32 keyc = 0; + + if (ctx->family != AF_INET6 || ctx->user_family != AF_INET6) + return 1; + + sock_cookie = bpf_get_socket_cookie(ctx); + if (ctx->protocol == IPPROTO_TCP) + bpf_map_update_elem(&tcp_conn_sockets, &key, &sock_cookie, 0); + else if (ctx->protocol == IPPROTO_UDP) + bpf_map_update_elem(&udp_conn_sockets, &keyc, &sock_cookie, 0); + else + return 1; + + return 1; +} + +SEC("iter/tcp") +int iter_tcp6_client(struct bpf_iter__tcp *ctx) +{ + struct sock_common *sk_common = ctx->sk_common; + __u64 sock_cookie = 0; + __u64 *val; + int key = 0; + + if (!sk_common) + return 0; + + if (sk_common->skc_family != AF_INET6) + return 0; + + sock_cookie = bpf_get_socket_cookie(sk_common); + val = bpf_map_lookup_elem(&tcp_conn_sockets, &key); + if (!val) + return 0; + /* Destroy connected client sockets. */ + if (sock_cookie == *val) + bpf_sock_destroy(sk_common); + + return 0; +} + +SEC("iter/tcp") +int iter_tcp6_server(struct bpf_iter__tcp *ctx) +{ + struct sock_common *sk_common = ctx->sk_common; + const struct inet_connection_sock *icsk; + const struct inet_sock *inet; + struct tcp6_sock *tcp_sk; + __be16 srcp; + + if (!sk_common) + return 0; + + if (sk_common->skc_family != AF_INET6) + return 0; + + tcp_sk = bpf_skc_to_tcp6_sock(sk_common); + if (!tcp_sk) + return 0; + + icsk = &tcp_sk->tcp.inet_conn; + inet = &icsk->icsk_inet; + srcp = inet->inet_sport; + + /* Destroy server sockets. 
*/ + if (srcp == serv_port) + bpf_sock_destroy(sk_common); + + return 0; +} + + +SEC("iter/udp") +int iter_udp6_client(struct bpf_iter__udp *ctx) +{ + struct udp_sock *udp_sk = ctx->udp_sk; + struct sock *sk = (struct sock *) udp_sk; + __u64 sock_cookie = 0, *val; + int key = 0; + + if (!sk) + return 0; + + sock_cookie = bpf_get_socket_cookie(sk); + val = bpf_map_lookup_elem(&udp_conn_sockets, &key); + if (!val) + return 0; + /* Destroy connected client sockets. */ + if (sock_cookie == *val) + bpf_sock_destroy((struct sock_common *)sk); + + return 0; +} + +SEC("iter/udp") +int iter_udp6_server(struct bpf_iter__udp *ctx) +{ + struct udp_sock *udp_sk = ctx->udp_sk; + struct sock *sk = (struct sock *) udp_sk; + struct inet_sock *inet; + __be16 srcp; + + if (!sk) + return 0; + + inet = &udp_sk->inet; + srcp = inet->inet_sport; + if (srcp == serv_port) + bpf_sock_destroy((struct sock_common *)sk); + + return 0; +} + +char _license[] SEC("license") = "GPL"; From patchwork Wed May 3 22:53:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13230612 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A13E9C77B7F for ; Wed, 3 May 2023 22:54:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229622AbjECWyO (ORCPT ); Wed, 3 May 2023 18:54:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47968 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229755AbjECWyK (ORCPT ); Wed, 3 May 2023 18:54:10 -0400 Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E14B146A6 for ; Wed, 3 
May 2023 15:54:07 -0700 (PDT) Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1ab01bf474aso29856715ad.1 for ; Wed, 03 May 2023 15:54:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1683154447; x=1685746447; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7sG94dNySAdFrxzxkKgsn/Qvw18MkZf4kTSp4kQD6+s=; b=CvnszwoyFxeY+GLEFeiaV7VOPgWpuLDTADtKGy/ZMGNPrC9GSnIrtEXqWntj31H7/K auwB0PX5g42isYwp4By/5bUuRrsjVgecv/7uPFbR11H2scobgL3/bgguij9kfoqr8XfG Z8uzw8rqkgqUh9sawnooV0+trJHfEgDCT9FgiVOJ2Z0pA+yj3KOTBUlSJ/ydL487lJQ8 GnFTtiUsxrJqHSs1MAqpg0LqkNo188Ig2qQxoK2cQAYz9CNk0a/BCID9Q9wVPm4Sf7o0 xwmScL3qMakak5UuDpxPXvI7jQGUR1faTRxAjTfkMOqUcAu5LwETVN5yHus1nWCvLbN+ I0Tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683154447; x=1685746447; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7sG94dNySAdFrxzxkKgsn/Qvw18MkZf4kTSp4kQD6+s=; b=Ln/LGyKxg8vXxtwq0Sf04D1sJ6A3FS1cVcR+UDwK4uWDuz5pHYFTVy4N5nAUW0oZfn 17Sk0zZhpY9L3845BAaMcl7Ih58xtcFYy7oTfjPKa6PGAwSs28lNdpjHJ+c+EDGKDHya J2eOF9bOEaHVRZGtxytvoIM8KNIM40LRCK3rmIroce3t//0XLTfyFC3PDkQh8CKdTpcC cjm5iIjemrAC7O36QGAFpeCjMMN0RWPC9/zs0Mr76+zikUq2gGmgFKHU3iyLWzy5XlGh Ax3w5wJY7d1NBeul/1QiPj9o4KUAjzZ/l9FXjIpk9c0HQ373L6l3xLLLHDnOtkNp+K10 xgYg== X-Gm-Message-State: AC+VfDxB4PTtPC1aHhAs1LH4ebSXcxhW+5lZgifP9TZH9d6CoctFFoDy hjS+n/GQsJBuf2ldcrJPgUde93P927Jd1/0y8TY= X-Google-Smtp-Source: ACHHUZ5G4rXbW+XaTxat5gqK1Rv7+g8Z2YYF4Uldel6zmG+LOwOjtshiiRJBVDh6q6KyWpn1BNc3ew== X-Received: by 2002:a17:902:aa02:b0:1a0:50bd:31a8 with SMTP id be2-20020a170902aa0200b001a050bd31a8mr1504206plb.26.1683154446788; Wed, 03 May 2023 15:54:06 -0700 (PDT) Received: from localhost.localdomain ([2604:1380:4611:8100::1]) by smtp.gmail.com with ESMTPSA id 
p2-20020a1709028a8200b001a641e4738asm2200443plo.1.2023.05.03.15.54.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 May 2023 15:54:06 -0700 (PDT) From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau Subject: [PATCH v7 bpf-next 09/10] bpf: Add a kfunc filter function to 'struct btf_kfunc_id_set' Date: Wed, 3 May 2023 22:53:50 +0000 Message-Id: <20230503225351.3700208-10-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com> References: <20230503225351.3700208-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net

This commit adds the ability to restrict kfuncs to certain BPF program types, and thereby limits the bpf_sock_destroy kfunc to programs with attach type 'BPF_TRACE_ITER'.

Previous patches introduced the 'bpf_sock_destroy' kfunc, which can only be called from BPF (socket) iterator programs. The reason is that the kfunc requires lock_sock to have been taken from the BPF context before the kfunc is called.

To that end, the patch adds a callback filter to 'struct btf_kfunc_id_set'. The filter receives the prog and can inspect its properties. For the bpf_sock_destroy case, the provided callback filter checks the prog's `expected_attach_type` to decide whether access to the kfunc is allowed.
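The filter mechanism described above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the kernel code: `struct prog`, `struct hook_filter`, `iter_only_filter`, `hook_add_filter`, `hook_allows`, the `attach_type` enum values and `SOCK_DESTROY_ID` are hypothetical stand-ins for the kernel's types and BTF ids. It shows the two behaviours the patch relies on: a filter is registered at most once per hook (with a fixed upper bound on filters), and a lookup is rejected as soon as any registered filter returns non-zero.

```c
/* Userspace sketch only: simplified stand-ins for the kernel types. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define FILTER_MAX_CNT 16	/* same bound as BTF_KFUNC_FILTER_MAX_CNT */

/* Hypothetical attach types; the values are illustrative, not the UAPI ones. */
enum attach_type { ATTACH_TRACE_RAW_TP, ATTACH_TRACE_ITER };

/* Stand-in for struct bpf_prog: only the field the filter inspects. */
struct prog {
	enum attach_type expected_attach_type;
};

typedef int (*kfunc_filter_t)(const struct prog *prog, uint32_t kfunc_id);

/* Modelled on struct btf_kfunc_hook_filter from the patch. */
struct hook_filter {
	kfunc_filter_t filters[FILTER_MAX_CNT];
	uint32_t nr_filters;
};

/* Hypothetical id standing in for the BTF id of bpf_sock_destroy. */
#define SOCK_DESTROY_ID 42u

/* Same shape as tracing_iter_filter(): non-zero means "reject". */
static int iter_only_filter(const struct prog *prog, uint32_t kfunc_id)
{
	if (kfunc_id == SOCK_DESTROY_ID &&
	    prog->expected_attach_type != ATTACH_TRACE_ITER)
		return -EACCES;
	return 0;
}

/* Register a filter once per hook; return -E2BIG when the table is full. */
static int hook_add_filter(struct hook_filter *hf, kfunc_filter_t filter)
{
	uint32_t i;

	for (i = 0; i < hf->nr_filters; i++) {
		if (hf->filters[i] == filter)
			return 0;	/* already registered, nothing to add */
	}
	if (hf->nr_filters == FILTER_MAX_CNT)
		return -E2BIG;
	hf->filters[hf->nr_filters++] = filter;
	return 0;
}

/* Return 1 if every registered filter accepts the call, 0 otherwise. */
static int hook_allows(const struct hook_filter *hf, const struct prog *prog,
		       uint32_t kfunc_id)
{
	uint32_t i;

	for (i = 0; i < hf->nr_filters; i++) {
		if (hf->filters[i](prog, kfunc_id))
			return 0;
	}
	return 1;
}
```

In this model, a prog whose `expected_attach_type` is `ATTACH_TRACE_ITER` passes `hook_allows()` for `SOCK_DESTROY_ID`, while any other attach type is rejected; ids that the filter does not recognise are unaffected.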
Signed-off-by: Aditi Ghag Signed-off-by: Martin KaFai Lau --- include/linux/btf.h | 18 ++++++++----- kernel/bpf/btf.c | 59 +++++++++++++++++++++++++++++++++++-------- kernel/bpf/verifier.c | 7 ++--- net/core/filter.c | 9 +++++++ 4 files changed, 73 insertions(+), 20 deletions(-) diff --git a/include/linux/btf.h b/include/linux/btf.h index 495250162422..918a0b6379bd 100644 --- a/include/linux/btf.h +++ b/include/linux/btf.h @@ -99,10 +99,14 @@ struct btf_type; union bpf_attr; struct btf_show; struct btf_id_set; +struct bpf_prog; + +typedef int (*btf_kfunc_filter_t)(const struct bpf_prog *prog, u32 kfunc_id); struct btf_kfunc_id_set { struct module *owner; struct btf_id_set8 *set; + btf_kfunc_filter_t filter; }; struct btf_id_dtor_kfunc { @@ -482,7 +486,6 @@ static inline void *btf_id_set8_contains(const struct btf_id_set8 *set, u32 id) return bsearch(&id, set->pairs, set->cnt, sizeof(set->pairs[0]), btf_id_cmp_func); } -struct bpf_prog; struct bpf_verifier_log; #ifdef CONFIG_BPF_SYSCALL @@ -490,10 +493,10 @@ const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id); const char *btf_name_by_offset(const struct btf *btf, u32 offset); struct btf *btf_parse_vmlinux(void); struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog); -u32 *btf_kfunc_id_set_contains(const struct btf *btf, - enum bpf_prog_type prog_type, - u32 kfunc_btf_id); -u32 *btf_kfunc_is_modify_return(const struct btf *btf, u32 kfunc_btf_id); +u32 *btf_kfunc_id_set_contains(const struct btf *btf, u32 kfunc_btf_id, + const struct bpf_prog *prog); +u32 *btf_kfunc_is_modify_return(const struct btf *btf, u32 kfunc_btf_id, + const struct bpf_prog *prog); int register_btf_kfunc_id_set(enum bpf_prog_type prog_type, const struct btf_kfunc_id_set *s); int register_btf_fmodret_id_set(const struct btf_kfunc_id_set *kset); @@ -520,8 +523,9 @@ static inline const char *btf_name_by_offset(const struct btf *btf, return NULL; } static inline u32 *btf_kfunc_id_set_contains(const struct btf *btf, - 
enum bpf_prog_type prog_type, - u32 kfunc_btf_id) + u32 kfunc_btf_id, + struct bpf_prog *prog) + { return NULL; } diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 913b9d717a4a..c6dae44e236d 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -218,10 +218,17 @@ enum btf_kfunc_hook { enum { BTF_KFUNC_SET_MAX_CNT = 256, BTF_DTOR_KFUNC_MAX_CNT = 256, + BTF_KFUNC_FILTER_MAX_CNT = 16, +}; + +struct btf_kfunc_hook_filter { + btf_kfunc_filter_t filters[BTF_KFUNC_FILTER_MAX_CNT]; + u32 nr_filters; }; struct btf_kfunc_set_tab { struct btf_id_set8 *sets[BTF_KFUNC_HOOK_MAX]; + struct btf_kfunc_hook_filter hook_filters[BTF_KFUNC_HOOK_MAX]; }; struct btf_id_dtor_kfunc_tab { @@ -7720,9 +7727,12 @@ static int btf_check_kfunc_protos(struct btf *btf, u32 func_id, u32 func_flags) /* Kernel Function (kfunc) BTF ID set registration API */ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook, - struct btf_id_set8 *add_set) + const struct btf_kfunc_id_set *kset) { + struct btf_kfunc_hook_filter *hook_filter; + struct btf_id_set8 *add_set = kset->set; bool vmlinux_set = !btf_is_module(btf); + bool add_filter = !!kset->filter; struct btf_kfunc_set_tab *tab; struct btf_id_set8 *set; u32 set_cnt; @@ -7737,6 +7747,20 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook, return 0; tab = btf->kfunc_set_tab; + + if (tab && add_filter) { + int i; + + hook_filter = &tab->hook_filters[hook]; + for (i = 0; i < hook_filter->nr_filters; i++) { + if (hook_filter->filters[i] == kset->filter) + add_filter = false; + } + + if (add_filter && hook_filter->nr_filters == BTF_KFUNC_FILTER_MAX_CNT) + return -E2BIG; + } + if (!tab) { tab = kzalloc(sizeof(*tab), GFP_KERNEL | __GFP_NOWARN); if (!tab) @@ -7759,7 +7783,7 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook, */ if (!vmlinux_set) { tab->sets[hook] = add_set; - return 0; + goto do_add_filter; } /* In case of vmlinux sets, there may be more than one set being @@ -7801,6 
+7825,11 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook, sort(set->pairs, set->cnt, sizeof(set->pairs[0]), btf_id_cmp_func, NULL); +do_add_filter: + if (add_filter) { + hook_filter = &tab->hook_filters[hook]; + hook_filter->filters[hook_filter->nr_filters++] = kset->filter; + } return 0; end: btf_free_kfunc_set_tab(btf); @@ -7809,15 +7838,22 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook, static u32 *__btf_kfunc_id_set_contains(const struct btf *btf, enum btf_kfunc_hook hook, + const struct bpf_prog *prog, u32 kfunc_btf_id) { + struct btf_kfunc_hook_filter *hook_filter; struct btf_id_set8 *set; - u32 *id; + u32 *id, i; if (hook >= BTF_KFUNC_HOOK_MAX) return NULL; if (!btf->kfunc_set_tab) return NULL; + hook_filter = &btf->kfunc_set_tab->hook_filters[hook]; + for (i = 0; i < hook_filter->nr_filters; i++) { + if (hook_filter->filters[i](prog, kfunc_btf_id)) + return NULL; + } set = btf->kfunc_set_tab->sets[hook]; if (!set) return NULL; @@ -7870,23 +7906,25 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type) * protection for looking up a well-formed btf->kfunc_set_tab. 
*/ u32 *btf_kfunc_id_set_contains(const struct btf *btf, - enum bpf_prog_type prog_type, - u32 kfunc_btf_id) + u32 kfunc_btf_id, + const struct bpf_prog *prog) { + enum bpf_prog_type prog_type = resolve_prog_type(prog); enum btf_kfunc_hook hook; u32 *kfunc_flags; - kfunc_flags = __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_COMMON, kfunc_btf_id); + kfunc_flags = __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_COMMON, prog, kfunc_btf_id); if (kfunc_flags) return kfunc_flags; hook = bpf_prog_type_to_kfunc_hook(prog_type); - return __btf_kfunc_id_set_contains(btf, hook, kfunc_btf_id); + return __btf_kfunc_id_set_contains(btf, hook, prog, kfunc_btf_id); } -u32 *btf_kfunc_is_modify_return(const struct btf *btf, u32 kfunc_btf_id) +u32 *btf_kfunc_is_modify_return(const struct btf *btf, u32 kfunc_btf_id, + const struct bpf_prog *prog) { - return __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_FMODRET, kfunc_btf_id); + return __btf_kfunc_id_set_contains(btf, BTF_KFUNC_HOOK_FMODRET, prog, kfunc_btf_id); } static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook, @@ -7917,7 +7955,8 @@ static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook, goto err_out; } - ret = btf_populate_kfunc_set(btf, hook, kset->set); + ret = btf_populate_kfunc_set(btf, hook, kset); + err_out: btf_put(btf); return ret; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index d6db6de3e9ea..8d9519210935 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -10534,7 +10534,7 @@ static int fetch_kfunc_meta(struct bpf_verifier_env *env, *kfunc_name = func_name; func_proto = btf_type_by_id(desc_btf, func->type); - kfunc_flags = btf_kfunc_id_set_contains(desc_btf, resolve_prog_type(env->prog), func_id); + kfunc_flags = btf_kfunc_id_set_contains(desc_btf, func_id, env->prog); if (!kfunc_flags) { return -EACCES; } @@ -18526,7 +18526,8 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, * in the fmodret id set with the KF_SLEEPABLE flag. 
*/ else { - u32 *flags = btf_kfunc_is_modify_return(btf, btf_id); + u32 *flags = btf_kfunc_is_modify_return(btf, btf_id, + prog); if (flags && (*flags & KF_SLEEPABLE)) ret = 0; @@ -18554,7 +18555,7 @@ int bpf_check_attach_target(struct bpf_verifier_log *log, return -EINVAL; } ret = -EINVAL; - if (btf_kfunc_is_modify_return(btf, btf_id) || + if (btf_kfunc_is_modify_return(btf, btf_id, prog) || !check_attach_modify_return(addr, tname)) ret = 0; if (ret) { diff --git a/net/core/filter.c b/net/core/filter.c index 97d70b7959a1..20c603321325 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -11762,9 +11762,18 @@ BTF_SET8_START(sock_destroy_kfunc_set) BTF_ID_FLAGS(func, bpf_sock_destroy) BTF_SET8_END(sock_destroy_kfunc_set) +static int tracing_iter_filter(const struct bpf_prog *prog, u32 kfunc_id) +{ + if (btf_id_set8_contains(&sock_destroy_kfunc_set, kfunc_id) && + prog->expected_attach_type != BPF_TRACE_ITER) + return -EACCES; + return 0; +} + static const struct btf_kfunc_id_set bpf_sock_destroy_kfunc_set = { .owner = THIS_MODULE, .set = &sock_destroy_kfunc_set, + .filter = tracing_iter_filter, }; static int init_subsystem(void) From patchwork Wed May 3 22:53:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aditi Ghag X-Patchwork-Id: 13230611 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92C92C7EE25 for ; Wed, 3 May 2023 22:54:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229746AbjECWyN (ORCPT ); Wed, 3 May 2023 18:54:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48054 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229774AbjECWyK (ORCPT ); Wed, 3 May 2023 
18:54:10 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06B2B44B7 for ; Wed, 3 May 2023 15:54:08 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1ab05018381so33316875ad.2 for ; Wed, 03 May 2023 15:54:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=isovalent.com; s=google; t=1683154448; x=1685746448; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=avjys7QME+mclT8grC8JsMORlSrK23DSEmt8E1wDeZY=; b=QbNKIDHvDxSP9ng37nGB0Xf/c3db2BL9eIAzhFToX/Rdhr26Szab+KTcpWvilgEBG+ re532HK3AoXHtwPwBKQSeaA7rH+dgA9taEiLODkbr7pkbM8ZCeNpNxN7tU4PV65iDoLZ 4nRQ1/OzEEg10hlr5OEna4N0HGhjZAu+i914TFZR8BF5n2QWeCPmW2UUUJSzZ3FfV2Xf Dt3faQ4K2/JIFPrsDQECzzLB8yx2fhx4wTnpVTu5URCs/JtiB3GggT4Ji63u/53WIVtU F6YD8vs7kptlwe8/NbhTKYP7OuElteYfbLyVFj++0UM9ogWqtLCBD0QJ344CjFumazPi rszg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683154448; x=1685746448; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=avjys7QME+mclT8grC8JsMORlSrK23DSEmt8E1wDeZY=; b=Ik6KvR1pXN/xYXZRSr+Bn97o4jIqBp0q8TdX0YSJZncukqVurCHajvfFDZC5vnjUoV Ta64orWGquqDZdhyWTvQ8ahzG5V1XUKC30Nm0W8uLx3l7zZA+K/iIBYawcg06t5CLRYE U1KAPmGbdtVXXYY7k3WiCjSOqB+YEv8z2Lb+mT3VmW/zbP8umRG/cjjnvHi4d4fa8aJV KW8+nMFOAjfbc8Bp2P9oGdzvOHFbnaPIEkWf8Z6e/y045k3E0Xdj3dkAtwKBKko4oLhm clO43R0B+tu1E7wpWsIiUbKrO0w48ZQtpL1RNivbIGXsvtTcaTKThKVob3UsX87CM9hx iIbg== X-Gm-Message-State: AC+VfDxiEV7mSy3TMbLID7/KZqzxpG4mjUNm/EQtUA4/xXPId/l/9tjx 37nSLxDL/JsaI3GawmjC29FA5irJIbINuw+KypM= X-Google-Smtp-Source: ACHHUZ7uVC373u6dsI4hr7CsNZ90skCs69b8pKSs16vTc4mZvhv93qM8t+VC/lT5NN9zrRUzzbTIAw== X-Received: by 2002:a17:902:f68e:b0:1a2:58f1:5e1d with SMTP id 
l14-20020a170902f68e00b001a258f15e1dmr1938731plg.36.1683154448152; Wed, 03 May 2023 15:54:08 -0700 (PDT) Received: from localhost.localdomain ([2604:1380:4611:8100::1]) by smtp.gmail.com with ESMTPSA id p2-20020a1709028a8200b001a641e4738asm2200443plo.1.2023.05.03.15.54.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 May 2023 15:54:07 -0700 (PDT) From: Aditi Ghag To: bpf@vger.kernel.org Cc: kafai@fb.com, sdf@google.com, aditi.ghag@isovalent.com, Martin KaFai Lau Subject: [PATCH v7 bpf-next 10/10] selftests/bpf: Extend bpf_sock_destroy tests Date: Wed, 3 May 2023 22:53:51 +0000 Message-Id: <20230503225351.3700208-11-aditi.ghag@isovalent.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230503225351.3700208-1-aditi.ghag@isovalent.com> References: <20230503225351.3700208-1-aditi.ghag@isovalent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net

This commit adds a test case to verify that the bpf_sock_destroy kfunc cannot be called from programs whose attach type is not the BPF trace iterator. Unsupported programs that call the kfunc are rejected by the verifier.
Signed-off-by: Aditi Ghag
Signed-off-by: Martin KaFai Lau
---
 .../selftests/bpf/prog_tests/sock_destroy.c   |  2 ++
 .../bpf/progs/sock_destroy_prog_fail.c        | 22 +++++++++++++++++++
 2 files changed, 24 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c

diff --git a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
index d5f76731b4a3..8f7d745e55a1 100644
--- a/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
+++ b/tools/testing/selftests/bpf/prog_tests/sock_destroy.c
@@ -3,6 +3,7 @@
 #include

 #include "sock_destroy_prog.skel.h"
+#include "sock_destroy_prog_fail.skel.h"
 #include "network_helpers.h"

 #define TEST_NS "sock_destroy_netns"
@@ -204,6 +205,7 @@ void test_sock_destroy(void)
 	if (test__start_subtest("udp_server"))
 		test_udp_server(skel);

+	RUN_TESTS(sock_destroy_prog_fail);

 cleanup:
 	if (nstoken)
diff --git a/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
new file mode 100644
index 000000000000..dd6850b58e25
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/sock_destroy_prog_fail.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include
+#include
+
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+int bpf_sock_destroy(struct sock_common *sk) __ksym;
+
+SEC("tp_btf/tcp_destroy_sock")
+__failure __msg("calling kernel function bpf_sock_destroy is not allowed")
+int BPF_PROG(trace_tcp_destroy_sock, struct sock *sk)
+{
+	/* should not load */
+	bpf_sock_destroy((struct sock_common *)sk);
+
+	return 0;
+}
+