From patchwork Fri Aug 2 15:29:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Maguire
X-Patchwork-Id: 13751684
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alan Maguire
To: martin.lau@linux.dev
Cc: ast@kernel.org, daniel@iogearbox.net, eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me, haoluo@google.com, jolsa@kernel.org, davem@davemloft.net, edumazet@google.com, bpf@vger.kernel.org, Alan Maguire
Subject: [PATCH bpf-next 1/3] bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
Date: Fri, 2 Aug 2024 16:29:27 +0100
Message-ID: <20240802152929.2695863-2-alan.maguire@oracle.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20240802152929.2695863-1-alan.maguire@oracle.com>
References: <20240802152929.2695863-1-alan.maguire@oracle.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Currently the only opportunity to set sock ops flags dictating which callbacks fire for a socket is from within a TCP-BPF sockops program. This is problematic if the connection is already set up as there is no further chance to specify callbacks for that socket. Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt() to allow users to specify callbacks later, either via an iterator over sockets or via a socket-specific program triggered by a setsockopt() on the socket.

Previous discussion on this here [1].
[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/ Signed-off-by: Alan Maguire --- include/uapi/linux/bpf.h | 3 ++- net/core/filter.c | 16 ++++++++++++++++ tools/include/uapi/linux/bpf.h | 3 ++- 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 35bcf52dbc65..d4d7efc34e67 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -2851,7 +2851,7 @@ union bpf_attr { * **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**, * **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**, * **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**, - * **TCP_BPF_RTO_MIN**. + * **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**. * * **IPPROTO_IP**, which supports *optname* **IP_TOS**. * * **IPPROTO_IPV6**, which supports the following *optname*\ s: * **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**. @@ -7080,6 +7080,7 @@ enum { TCP_BPF_SYN = 1005, /* Copy the TCP header */ TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */ TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */ + TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */ }; enum { diff --git a/net/core/filter.c b/net/core/filter.c index 78a6f746ea0b..570ca3f12175 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5278,6 +5278,11 @@ static int bpf_sol_tcp_setsockopt(struct sock *sk, int optname, return -EINVAL; inet_csk(sk)->icsk_rto_min = timeout; break; + case TCP_BPF_SOCK_OPS_CB_FLAGS: + if (val & ~(BPF_SOCK_OPS_ALL_CB_FLAGS)) + return -EINVAL; + tp->bpf_sock_ops_cb_flags = val; + break; default: return -EINVAL; } @@ -5366,6 +5371,17 @@ static int sol_tcp_sockopt(struct sock *sk, int optname, if (*optlen < 1) return -EINVAL; break; + case TCP_BPF_SOCK_OPS_CB_FLAGS: + if (*optlen != sizeof(int)) + return -EINVAL; + if (getopt) { + struct tcp_sock *tp = tcp_sk(sk); + int val = READ_ONCE(tp->bpf_sock_ops_cb_flags); + + memcpy(optval, &val, *optlen); + return 0; + } + return 
bpf_sol_tcp_setsockopt(sk, optname, optval, *optlen); default: if (getopt) return -EINVAL; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 35bcf52dbc65..d4d7efc34e67 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -2851,7 +2851,7 @@ union bpf_attr { * **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**, * **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**, * **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**, - * **TCP_BPF_RTO_MIN**. + * **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**. * * **IPPROTO_IP**, which supports *optname* **IP_TOS**. * * **IPPROTO_IPV6**, which supports the following *optname*\ s: * **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**. @@ -7080,6 +7080,7 @@ enum { TCP_BPF_SYN = 1005, /* Copy the TCP header */ TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */ TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */ + TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Set TCP sock ops flags */ }; enum {

From patchwork Fri Aug 2 15:29:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Maguire
X-Patchwork-Id: 13751686
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alan Maguire
To: martin.lau@linux.dev
Cc: ast@kernel.org, daniel@iogearbox.net, eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me, haoluo@google.com, jolsa@kernel.org, davem@davemloft.net, edumazet@google.com, bpf@vger.kernel.org, Alan Maguire
Subject: [PATCH bpf-next 2/3] selftests/bpf: add tests for TCP_BPF_SOCK_OPS_CB_FLAGS
Date: Fri, 2 Aug 2024 16:29:28 +0100
Message-ID: <20240802152929.2695863-3-alan.maguire@oracle.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20240802152929.2695863-1-alan.maguire@oracle.com>
References: <20240802152929.2695863-1-alan.maguire@oracle.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Add tests to set/get TCP sockopt TCP_BPF_SOCK_OPS_CB_FLAGS via bpf_setsockopt() and also add a cgroup/setsockopt program that catches setsockopt() for this option and uses bpf_setsockopt() to set it. The latter allows us to support modifying sockops cb flags on a per-socket basis via setsockopt() without adding support into core setsockopt() itself.

Signed-off-by: Alan Maguire
--- .../selftests/bpf/prog_tests/setget_sockopt.c | 11 ++++++ .../selftests/bpf/progs/setget_sockopt.c | 37 +++++++++++++++++-- 2 files changed, 45 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c index 7d4a9b3d3722..b9c54217a489 100644 --- a/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c +++ b/tools/testing/selftests/bpf/prog_tests/setget_sockopt.c @@ -42,6 +42,7 @@ static int create_netns(void) static void test_tcp(int family) { struct setget_sockopt__bss *bss = skel->bss; + int cb_flags = BPF_SOCK_OPS_STATE_CB_FLAG | BPF_SOCK_OPS_RTO_CB_FLAG; int sfd, cfd; memset(bss, 0, sizeof(*bss)); @@ -56,6 +57,9 @@ static void test_tcp(int family) close(sfd); return; } + ASSERT_EQ(setsockopt(sfd, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS, + &cb_flags, sizeof(cb_flags)), + 0, "setsockopt cb_flags"); close(sfd); close(cfd); @@ -65,6 +69,8 @@ static void test_tcp(int family) ASSERT_EQ(bss->nr_passive, 1, "nr_passive"); ASSERT_EQ(bss->nr_socket_post_create, 2, "nr_socket_post_create"); ASSERT_EQ(bss->nr_binddev, 2, "nr_bind"); + ASSERT_GE(bss->nr_state, 1, "nr_state"); + ASSERT_EQ(bss->nr_setsockopt, 1, "nr_setsockopt"); } static void test_udp(int family) @@ -185,6 +191,11 @@ void test_setget_sockopt(void) if
(!ASSERT_OK_PTR(skel->links.socket_post_create, "attach_cgroup")) goto done; + skel->links.tcp_setsockopt = + bpf_program__attach_cgroup(skel->progs.tcp_setsockopt, cg_fd); + if (!ASSERT_OK_PTR(skel->links.tcp_setsockopt, "attach_setsockopt")) + goto done; + test_tcp(AF_INET6); test_tcp(AF_INET); test_udp(AF_INET6); diff --git a/tools/testing/selftests/bpf/progs/setget_sockopt.c b/tools/testing/selftests/bpf/progs/setget_sockopt.c index 60518aed1ffc..920af9e21e84 100644 --- a/tools/testing/selftests/bpf/progs/setget_sockopt.c +++ b/tools/testing/selftests/bpf/progs/setget_sockopt.c @@ -20,6 +20,8 @@ int nr_connect; int nr_binddev; int nr_socket_post_create; int nr_fin_wait1; +int nr_state; +int nr_setsockopt; struct sockopt_test { int opt; @@ -59,6 +61,8 @@ static const struct sockopt_test sol_tcp_tests[] = { { .opt = TCP_THIN_LINEAR_TIMEOUTS, .flip = 1, }, { .opt = TCP_USER_TIMEOUT, .new = 123400, .expected = 123400, }, { .opt = TCP_NOTSENT_LOWAT, .new = 1314, .expected = 1314, }, + { .opt = TCP_BPF_SOCK_OPS_CB_FLAGS, .new = BPF_SOCK_OPS_ALL_CB_FLAGS, + .expected = BPF_SOCK_OPS_ALL_CB_FLAGS, .restore = BPF_SOCK_OPS_STATE_CB_FLAG, }, { .opt = 0, }, }; @@ -124,6 +128,7 @@ static int bpf_test_sockopt_int(void *ctx, struct sock *sk, if (bpf_setsockopt(ctx, level, opt, &new, sizeof(new))) return 1; + if (bpf_getsockopt(ctx, level, opt, &tmp, sizeof(tmp)) || tmp != expected) return 1; @@ -384,11 +389,14 @@ int skops_sockopt(struct bpf_sock_ops *skops) nr_passive += !(bpf_test_sockopt(skops, sk) || test_tcp_maxseg(skops, sk) || test_tcp_saved_syn(skops, sk)); - bpf_sock_ops_cb_flags_set(skops, - skops->bpf_sock_ops_cb_flags | - BPF_SOCK_OPS_STATE_CB_FLAG); + + /* no need to set sockops cb flags here as sockopt + * tests and user-space originated setsockopt() will + * set flags to include BPF_SOCK_OPS_STATE_CB. 
+ */ break; case BPF_SOCK_OPS_STATE_CB: + nr_state++; if (skops->args[1] == BPF_TCP_CLOSE_WAIT) nr_fin_wait1 += !bpf_test_sockopt(skops, sk); break; @@ -397,4 +405,27 @@ int skops_sockopt(struct bpf_sock_ops *skops) return 1; } +SEC("cgroup/setsockopt") +int tcp_setsockopt(struct bpf_sockopt *ctx) +{ + struct bpf_sock *sk = ctx->sk; + __u8 *optval_end = ctx->optval_end; + __u8 *optval = ctx->optval; + int val = 0; + + if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS) + return 1; + if (optval + sizeof(int) > optval_end) + return 0; + if (ctx->optlen != sizeof(int)) + return 0; + val = *(int *)optval; + if (bpf_setsockopt(sk, ctx->level, ctx->optname, &val, sizeof(val))) + return 0; + nr_setsockopt++; + /* BPF has handled this no need to call "real" setsockopt() */ + ctx->optlen = -1; + return 1; +} + char _license[] SEC("license") = "GPL";

From patchwork Fri Aug 2 15:29:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alan Maguire
X-Patchwork-Id: 13751687
X-Patchwork-Delegate: bpf@iogearbox.net
From: Alan Maguire
To: martin.lau@linux.dev
Cc: ast@kernel.org, daniel@iogearbox.net, eddyz87@gmail.com, song@kernel.org, yonghong.song@linux.dev, john.fastabend@gmail.com, kpsingh@kernel.org, sdf@fomichev.me, haoluo@google.com, jolsa@kernel.org, davem@davemloft.net, edumazet@google.com, bpf@vger.kernel.org, Alan Maguire
Subject: [PATCH bpf-next 3/3] selftests/bpf: modify bpf_iter_setsockopt to test TCP_BPF_SOCK_OPS_CB_FLAGS
Date: Fri, 2 Aug 2024 16:29:29 +0100
Message-ID: <20240802152929.2695863-4-alan.maguire@oracle.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20240802152929.2695863-1-alan.maguire@oracle.com>
References: <20240802152929.2695863-1-alan.maguire@oracle.com>
Precedence: bulk
X-Mailing-List: bpf@vger.kernel.org

Add support to test bpf_setsockopt(.., TCP_BPF_SOCK_OPS_CB_FLAGS, ...) in BPF iterator context; use per-socket storage to store the new value and retrieve it in a cgroup/getsockopt program we attach to allow us to query TCP_BPF_SOCK_OPS_CB_FLAGS via getsockopt.

Signed-off-by: Alan Maguire
--- .../bpf/prog_tests/bpf_iter_setsockopt.c | 83 +++++++++++++------ .../selftests/bpf/progs/bpf_iter_setsockopt.c | 76 ++++++++++++++--- 2 files changed, 123 insertions(+), 36 deletions(-) diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c b/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c index 16bed9dd8e6a..42effafe8efe 100644 --- a/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c +++ b/tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c @@ -4,10 +4,13 @@ #include #include #include "network_helpers.h" +#include "cgroup_helpers.h" #include "bpf_dctcp.skel.h" #include "bpf_cubic.skel.h" #include "bpf_iter_setsockopt.skel.h" +#define TEST_CGROUP "/test-iter-setsockopt" + static int create_netns(void) { if (!ASSERT_OK(unshare(CLONE_NEWNET), "create netns")) @@ -32,17 +35,26 @@ static unsigned int set_bpf_cubic(int *fds, unsigned int nr_fds) return nr_fds; } -static unsigned int check_bpf_dctcp(int *fds, unsigned int nr_fds) +static unsigned int check_bpf_val(int *fds, unsigned int nr_fds, bool cong) { char tcp_cc[16]; - socklen_t optlen = sizeof(tcp_cc); + socklen_t cc_optlen = sizeof(tcp_cc); + int flags; + socklen_t flags_optlen = sizeof(flags); unsigned int i; for (i = 0; i < nr_fds; i++) { - if (getsockopt(fds[i], SOL_TCP, TCP_CONGESTION, - tcp_cc, &optlen) || - strcmp(tcp_cc, "bpf_dctcp")) - return i; + if (cong) { + if (getsockopt(fds[i], SOL_TCP, TCP_CONGESTION, + tcp_cc, &cc_optlen) || + strcmp(tcp_cc, "bpf_dctcp")) + return i; + } else { + if
(getsockopt(fds[i], SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS, + &flags, &flags_optlen) || + flags != BPF_SOCK_OPS_ALL_CB_FLAGS) + return i; + } } return nr_fds; @@ -102,7 +114,7 @@ static unsigned short get_local_port(int fd) } static void do_bpf_iter_setsockopt(struct bpf_iter_setsockopt *iter_skel, - bool random_retry) + bool random_retry, bool cong) { int *reuse_listen_fds = NULL, *accepted_fds = NULL, *est_fds = NULL; unsigned int nr_reuse_listens = 256, nr_est = 256; @@ -140,9 +152,16 @@ static void do_bpf_iter_setsockopt(struct bpf_iter_setsockopt *iter_skel, "get_local_port(reuse_listen_fds[0])")) goto done; - /* Run bpf tcp iter to switch from bpf_cubic to bpf_dctcp */ + /* Run bpf tcp iter to change tcp value: + * + * - If cong is true, switch from bpf_cubic to bpf_dctcp; + * - If cong is false, use bpf_setsockopt() to set TCP sockops flags. + */ + iter_skel->bss->random_retry = random_retry; - iter_fd = bpf_iter_create(bpf_link__fd(iter_skel->links.change_tcp_cc)); + iter_skel->bss->cong = cong; + + iter_fd = bpf_iter_create(bpf_link__fd(iter_skel->links.change_tcp_val)); if (!ASSERT_GE(iter_fd, 0, "create iter_fd")) goto done; @@ -152,22 +171,21 @@ static void do_bpf_iter_setsockopt(struct bpf_iter_setsockopt *iter_skel, if (!ASSERT_OK(err, "read iter error")) goto done; - /* Check reuseport listen fds for dctcp */ - ASSERT_EQ(check_bpf_dctcp(reuse_listen_fds, nr_reuse_listens), + /* Check reuseport listen fds */ + ASSERT_EQ(check_bpf_val(reuse_listen_fds, nr_reuse_listens, cong), nr_reuse_listens, - "check reuse_listen_fds dctcp"); - - /* Check non reuseport listen fd for dctcp */ - ASSERT_EQ(check_bpf_dctcp(&listen_fd, 1), 1, - "check listen_fd dctcp"); + "check reuse_listen_fds"); + /* Check non reuseport listen fd */ + ASSERT_EQ(check_bpf_val(&listen_fd, 1, cong), 1, + "check listen_fd"); - /* Check established fds for dctcp */ - ASSERT_EQ(check_bpf_dctcp(est_fds, nr_est), nr_est, - "check est_fds dctcp"); + /* Check established fds */ + 
ASSERT_EQ(check_bpf_val(est_fds, nr_est, cong), nr_est, + "check est_fds"); - /* Check accepted fds for dctcp */ - ASSERT_EQ(check_bpf_dctcp(accepted_fds, nr_est), nr_est, - "check accepted_fds dctcp"); + /* Check accepted fds */ + ASSERT_EQ(check_bpf_val(accepted_fds, nr_est, cong), nr_est, + "check accepted_fds"); done: if (iter_fd != -1) @@ -186,6 +204,8 @@ void serial_test_bpf_iter_setsockopt(void) struct bpf_dctcp *dctcp_skel = NULL; struct bpf_link *cubic_link = NULL; struct bpf_link *dctcp_link = NULL; + struct bpf_link *getsockopt_link = NULL; + int cgroup_fd; if (create_netns()) return; @@ -194,8 +214,9 @@ void serial_test_bpf_iter_setsockopt(void) iter_skel = bpf_iter_setsockopt__open_and_load(); if (!ASSERT_OK_PTR(iter_skel, "iter_skel")) return; - iter_skel->links.change_tcp_cc = bpf_program__attach_iter(iter_skel->progs.change_tcp_cc, NULL); - if (!ASSERT_OK_PTR(iter_skel->links.change_tcp_cc, "attach iter")) + iter_skel->links.change_tcp_val = bpf_program__attach_iter(iter_skel->progs.change_tcp_val, + NULL); + if (!ASSERT_OK_PTR(iter_skel->links.change_tcp_val, "attach iter")) goto done; /* Load bpf_cubic */ @@ -214,13 +235,23 @@ void serial_test_bpf_iter_setsockopt(void) if (!ASSERT_OK_PTR(dctcp_link, "dctcp_link")) goto done; - do_bpf_iter_setsockopt(iter_skel, true); - do_bpf_iter_setsockopt(iter_skel, false); + cgroup_fd = cgroup_setup_and_join(TEST_CGROUP); + if (!ASSERT_OK_FD(cgroup_fd, "cgroup switch")) + goto done; + getsockopt_link = bpf_program__attach_cgroup(iter_skel->progs._getsockopt, cgroup_fd); + if (!ASSERT_OK_PTR(getsockopt_link, "getsockopt prog")) + goto done; + do_bpf_iter_setsockopt(iter_skel, true, true); + do_bpf_iter_setsockopt(iter_skel, false, true); + do_bpf_iter_setsockopt(iter_skel, true, false); + do_bpf_iter_setsockopt(iter_skel, false, false); done: bpf_link__destroy(cubic_link); bpf_link__destroy(dctcp_link); + bpf_link__destroy(getsockopt_link); bpf_cubic__destroy(cubic_skel); bpf_dctcp__destroy(dctcp_skel); 
bpf_iter_setsockopt__destroy(iter_skel); + cleanup_cgroup_environment(); } diff --git a/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c b/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c index ec7f91850dec..60752a7ebdf8 100644 --- a/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c +++ b/tools/testing/selftests/bpf/progs/bpf_iter_setsockopt.c @@ -5,6 +5,13 @@ #include #include +struct { + __uint(type, BPF_MAP_TYPE_SK_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, int); +} sk_map SEC(".maps"); + #define bpf_tcp_sk(skc) ({ \ struct sock_common *_skc = skc; \ sk = NULL; \ @@ -21,6 +28,7 @@ unsigned short listen_hport = 0; char cubic_cc[TCP_CA_NAME_MAX] = "bpf_cubic"; char dctcp_cc[TCP_CA_NAME_MAX] = "bpf_dctcp"; bool random_retry = false; +bool cong = false; static bool tcp_cc_eq(const char *a, const char *b) { @@ -36,10 +44,32 @@ static bool tcp_cc_eq(const char *a, const char *b) return true; } +/* This program is used to intercept getsockopt() calls, providing + * the value of bpf_sock_ops_cb_flags for the socket; this value + * has been saved in per-socket storage earlier via the iterator + * program. 
+ */ +SEC("cgroup/getsockopt") +int _getsockopt(struct bpf_sockopt *ctx) +{ + struct bpf_sock *sk = ctx->sk; + int *optval = ctx->optval; + int *sk_storage = 0; + + if (!sk || ctx->level != SOL_TCP || ctx->optname != TCP_BPF_SOCK_OPS_CB_FLAGS) + return 1; + sk_storage = bpf_sk_storage_get(&sk_map, sk, 0, 0); + if (sk_storage) { + if (ctx->optval + sizeof(int) <= ctx->optval_end) + *optval = *sk_storage; + ctx->retval = 0; + } + return 1; +} + SEC("iter/tcp") -int change_tcp_cc(struct bpf_iter__tcp *ctx) +int change_tcp_val(struct bpf_iter__tcp *ctx) { - char cur_cc[TCP_CA_NAME_MAX]; struct tcp_sock *tp; struct sock *sk; @@ -54,17 +84,43 @@ int change_tcp_cc(struct bpf_iter__tcp *ctx) bpf_ntohs(sk->sk_dport) != listen_hport)) return 0; - if (bpf_getsockopt(tp, SOL_TCP, TCP_CONGESTION, - cur_cc, sizeof(cur_cc))) - return 0; + if (cong) { + char cur_cc[TCP_CA_NAME_MAX]; - if (!tcp_cc_eq(cur_cc, cubic_cc)) - return 0; + if (bpf_getsockopt(tp, SOL_TCP, TCP_CONGESTION, + cur_cc, sizeof(cur_cc))) + return 0; - if (random_retry && bpf_get_prandom_u32() % 4 == 1) - return 1; + if (!tcp_cc_eq(cur_cc, cubic_cc)) + return 0; + + if (random_retry && bpf_get_prandom_u32() % 4 == 1) + return 1; + + bpf_setsockopt(tp, SOL_TCP, TCP_CONGESTION, dctcp_cc, sizeof(dctcp_cc)); + } else { + int val, newval = BPF_SOCK_OPS_ALL_CB_FLAGS; + int *sk_storage; - bpf_setsockopt(tp, SOL_TCP, TCP_CONGESTION, dctcp_cc, sizeof(dctcp_cc)); + if (bpf_getsockopt(tp, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS, + &val, sizeof(val))) + return 0; + + if (val == newval) + return 0; + + if (random_retry && bpf_get_prandom_u32() % 4 == 1) + return 1; + + if (bpf_setsockopt(tp, SOL_TCP, TCP_BPF_SOCK_OPS_CB_FLAGS, + &newval, sizeof(newval))) + return 0; + /* store flags value for retrieval in cgroup/getsockopt prog */ + sk_storage = bpf_sk_storage_get(&sk_map, sk, 0, + BPF_SK_STORAGE_GET_F_CREATE); + if (sk_storage) + *sk_storage = newval; + } return 0; }
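
Reader's note on patch 1/3: the set/get semantics it adds for TCP_BPF_SOCK_OPS_CB_FLAGS can be modelled in plain user-space C, which may help when reasoning about what the selftests expect. The sketch below is not kernel code. The DEMO_* constants, struct demo_tcp_sock and both demo_* functions are hypothetical stand-ins, and the flag values are an assumption that mirrors the uapi header as I recall it (BPF_SOCK_OPS_ALL_CB_FLAGS being 0x7F); check include/uapi/linux/bpf.h before relying on them.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical local stand-ins for the BPF_SOCK_OPS_*_CB_FLAG values;
 * the numeric values are assumed, not taken from this series.
 */
enum {
	DEMO_RTO_CB_FLAG   = 1 << 0,
	DEMO_STATE_CB_FLAG = 1 << 2,
	DEMO_ALL_CB_FLAGS  = 0x7F,
};

/* Hypothetical stand-in for the one tcp_sock field the patch touches. */
struct demo_tcp_sock {
	int cb_flags;
};

/* Models the new setsockopt case: any bit outside the known mask is
 * rejected with -EINVAL and the stored flags are left unchanged.
 */
static int demo_set_cb_flags(struct demo_tcp_sock *tp, int val)
{
	if (val & ~DEMO_ALL_CB_FLAGS)
		return -EINVAL;
	tp->cb_flags = val;
	return 0;
}

/* Models the new getsockopt case: the buffer must be exactly
 * sizeof(int), after which the current flags are copied out.
 */
static int demo_get_cb_flags(const struct demo_tcp_sock *tp,
			     void *optval, int optlen)
{
	if (optlen != (int)sizeof(int))
		return -EINVAL;
	memcpy(optval, &tp->cb_flags, sizeof(int));
	return 0;
}
```

Calling demo_set_cb_flags() with DEMO_STATE_CB_FLAG | DEMO_RTO_CB_FLAG and reading the value back corresponds to what test_tcp() does through setsockopt() in patch 2/3. Passing a value with a bit above the mask (for example 0x80) should fail and leave the previously stored flags intact.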