From patchwork Thu Mar 6 03:54:29 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 14003798
X-Patchwork-Delegate: bpf@iogearbox.net
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
    Eduard Zingerman, Tejun Heo, Emil Tsalapatis, Barret Rhoden, Josh Don,
    Dohyun Kim, kkd@meta.com, kernel-team@meta.com
Subject: [PATCH bpf-next v5 1/3] selftests/bpf: Introduce cond_break_label
Date: Wed, 5 Mar 2025 19:54:29 -0800
Message-ID: <20250306035431.2186189-2-memxor@gmail.com>
In-Reply-To: <20250306035431.2186189-1-memxor@gmail.com>
References: <20250306035431.2186189-1-memxor@gmail.com>

Add a new cond_break_label macro that jumps to the specified label when
the cond_break termination check fires, and allows us to better handle
the uncontrolled termination of the loop.
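
For example, the forced termination of a loop can then be surfaced to
the caller instead of silently breaking out. A minimal sketch (not part
of this patch; try_thing() and the out label are placeholders, and
-ETIMEDOUT assumes an errno definition is in scope):

	while (!try_thing())		/* loop the verifier cannot bound */
		cond_break_label(out);	/* termination check fired: jump to out */
	return 0;
out:
	return -ETIMEDOUT;		/* report the forced exit */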
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 tools/testing/selftests/bpf/bpf_experimental.h | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h
index cd8ecd39c3f3..6535c8ae3c46 100644
--- a/tools/testing/selftests/bpf/bpf_experimental.h
+++ b/tools/testing/selftests/bpf/bpf_experimental.h
@@ -368,12 +368,12 @@ l_true: \
 	ret; \
 })
 
-#define cond_break \
+#define __cond_break(expr) \
 	({ __label__ l_break, l_continue; \
 	asm volatile goto("may_goto %l[l_break]" \
		      :::: l_break); \
 	goto l_continue; \
-	l_break: break; \
+	l_break: expr; \
 	l_continue:; \
 	})
 #else
@@ -392,7 +392,7 @@ l_true: \
 	ret; \
 })
 
-#define cond_break \
+#define __cond_break(expr) \
 	({ __label__ l_break, l_continue; \
 	asm volatile goto("1:.byte 0xe5; \
		      .byte 0; \
@@ -400,7 +400,7 @@ l_true: \
		      .short 0" \
		      :::: l_break); \
 	goto l_continue; \
-	l_break: break; \
+	l_break: expr; \
 	l_continue:; \
 	})
 #else
@@ -418,7 +418,7 @@ l_true: \
 	ret; \
 })
 
-#define cond_break \
+#define __cond_break(expr) \
 	({ __label__ l_break, l_continue; \
 	asm volatile goto("1:.byte 0xe5; \
		      .byte 0; \
@@ -426,12 +426,15 @@ l_true: \
		      .short 0" \
		      :::: l_break); \
 	goto l_continue; \
-	l_break: break; \
+	l_break: expr; \
 	l_continue:; \
 	})
 #endif
 #endif
 
+#define cond_break __cond_break(break)
+#define cond_break_label(label) __cond_break(goto label)
+
 #ifndef bpf_nop_mov
 #define bpf_nop_mov(var) \
	asm volatile("%[reg]=%[reg]"::[reg]"r"((short)var))

From patchwork Thu Mar 6 03:54:30 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 14003799
X-Patchwork-Delegate: bpf@iogearbox.net
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
    Eduard Zingerman, Tejun Heo, Emil Tsalapatis, Barret Rhoden, Josh Don,
    Dohyun Kim, kkd@meta.com, kernel-team@meta.com
Subject: [PATCH bpf-next v5 2/3] selftests/bpf: Introduce arena spin lock
Date: Wed, 5 Mar 2025 19:54:30 -0800
Message-ID: <20250306035431.2186189-3-memxor@gmail.com>
In-Reply-To: <20250306035431.2186189-1-memxor@gmail.com>
References: <20250306035431.2186189-1-memxor@gmail.com>

Implement the queued spin lock algorithm as a BPF program for lock words
living in the BPF arena. The algorithm is copied from
kernel/locking/qspinlock.c and adapted for BPF use.

We first implement abstract helpers for portable atomics and
acquire/release load instructions, relying on X86_64 presence to elide
expensive barriers and on implementation details of the JIT, and fall
back to slow but correct implementations elsewhere. When support for
acquire/release load/stores lands, we can improve this state.

Then, the qspinlock algorithm is adapted to remove dependence on
multi-word atomics due to lack of support in the BPF ISA. For instance,
xchg_tail cannot use a 16-bit xchg, and needs to be implemented as a
32-bit try_cmpxchg loop.

Loops which are seemingly infinite from the verifier's PoV are annotated
with the cond_break_label macro to return an error. Only 1024 CPUs
(NR_CPUS) are supported.

Note that the slow path is a global function, hence the verifier doesn't
know the return value's precision. The recommended way of usage is to
always test against zero for success, and not ret < 0 for error, as the
verifier would assume ret > 0 has not been accounted for. Add comments
in the function documentation about this quirk.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../selftests/bpf/bpf_arena_spin_lock.h | 512 ++++++++++++++++++
 tools/testing/selftests/bpf/bpf_atomic.h | 140 +++++
 2 files changed, 652 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/bpf_arena_spin_lock.h
 create mode 100644 tools/testing/selftests/bpf/bpf_atomic.h

diff --git a/tools/testing/selftests/bpf/bpf_arena_spin_lock.h b/tools/testing/selftests/bpf/bpf_arena_spin_lock.h
new file mode 100644
index 000000000000..3aca389ce424
--- /dev/null
+++ b/tools/testing/selftests/bpf/bpf_arena_spin_lock.h
@@ -0,0 +1,512 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates.
*/ +#ifndef BPF_ARENA_SPIN_LOCK_H +#define BPF_ARENA_SPIN_LOCK_H + +#include +#include +#include "bpf_atomic.h" + +#define arch_mcs_spin_lock_contended_label(l, label) smp_cond_load_acquire_label(l, VAL, label) +#define arch_mcs_spin_unlock_contended(l) smp_store_release((l), 1) + +#if defined(ENABLE_ATOMICS_TESTS) && defined(__BPF_FEATURE_ADDR_SPACE_CAST) + +#define EBUSY 16 +#define EOPNOTSUPP 95 +#define ETIMEDOUT 110 + +#ifndef __arena +#define __arena __attribute__((address_space(1))) +#endif + +extern unsigned long CONFIG_NR_CPUS __kconfig; + +#define arena_spinlock_t struct qspinlock +/* FIXME: Using typedef causes CO-RE relocation error */ +/* typedef struct qspinlock arena_spinlock_t; */ + +struct arena_mcs_spinlock { + struct arena_mcs_spinlock __arena *next; + int locked; + int count; +}; + +struct arena_qnode { + struct arena_mcs_spinlock mcs; +}; + +#define _Q_MAX_NODES 4 +#define _Q_PENDING_LOOPS 1 + +/* + * Bitfields in the atomic value: + * + * 0- 7: locked byte + * 8: pending + * 9-15: not used + * 16-17: tail index + * 18-31: tail cpu (+1) + */ +#define _Q_MAX_CPUS 1024 + +#define _Q_SET_MASK(type) (((1U << _Q_ ## type ## _BITS) - 1)\ + << _Q_ ## type ## _OFFSET) +#define _Q_LOCKED_OFFSET 0 +#define _Q_LOCKED_BITS 8 +#define _Q_LOCKED_MASK _Q_SET_MASK(LOCKED) + +#define _Q_PENDING_OFFSET (_Q_LOCKED_OFFSET + _Q_LOCKED_BITS) +#define _Q_PENDING_BITS 8 +#define _Q_PENDING_MASK _Q_SET_MASK(PENDING) + +#define _Q_TAIL_IDX_OFFSET (_Q_PENDING_OFFSET + _Q_PENDING_BITS) +#define _Q_TAIL_IDX_BITS 2 +#define _Q_TAIL_IDX_MASK _Q_SET_MASK(TAIL_IDX) + +#define _Q_TAIL_CPU_OFFSET (_Q_TAIL_IDX_OFFSET + _Q_TAIL_IDX_BITS) +#define _Q_TAIL_CPU_BITS (32 - _Q_TAIL_CPU_OFFSET) +#define _Q_TAIL_CPU_MASK _Q_SET_MASK(TAIL_CPU) + +#define _Q_TAIL_OFFSET _Q_TAIL_IDX_OFFSET +#define _Q_TAIL_MASK (_Q_TAIL_IDX_MASK | _Q_TAIL_CPU_MASK) + +#define _Q_LOCKED_VAL (1U << _Q_LOCKED_OFFSET) +#define _Q_PENDING_VAL (1U << _Q_PENDING_OFFSET) + +#define likely(x) __builtin_expect(!!(x), 1) +#define unlikely(x) __builtin_expect(!!(x), 0) + +struct arena_qnode __arena qnodes[_Q_MAX_CPUS][_Q_MAX_NODES]; + +static inline u32 encode_tail(int cpu, int idx) +{ + u32 tail; + + tail = (cpu + 1) << _Q_TAIL_CPU_OFFSET; + tail |= idx << _Q_TAIL_IDX_OFFSET; /* assume < 4 */ + + return tail; +} + +static inline struct arena_mcs_spinlock __arena *decode_tail(u32 tail) +{ + u32 cpu = (tail >> _Q_TAIL_CPU_OFFSET) - 1; + u32 idx = (tail & _Q_TAIL_IDX_MASK) >> _Q_TAIL_IDX_OFFSET; + + return &qnodes[cpu][idx].mcs; +} + +static inline +struct arena_mcs_spinlock __arena *grab_mcs_node(struct arena_mcs_spinlock __arena *base, int idx) +{ + return &((struct arena_qnode __arena *)base + idx)->mcs; +} + +#define _Q_LOCKED_PENDING_MASK (_Q_LOCKED_MASK | _Q_PENDING_MASK) + +/** + * xchg_tail - Put in the new queue tail code word & retrieve previous one + * @lock : Pointer to queued spinlock structure + * @tail : The new queue tail code word + * Return: The previous queue tail code word + * + * xchg(lock, tail) + * + * p,*,* -> n,*,* ; prev = xchg(lock, node) + */ +static __always_inline u32 xchg_tail(arena_spinlock_t __arena *lock, u32 tail) +{ + u32 old, new; + + old = atomic_read(&lock->val); + do { + new = (old & _Q_LOCKED_PENDING_MASK) | tail; + /* + * We can use relaxed semantics since the caller ensures that + * the MCS node is properly initialized before updating the + * tail. + */ + /* These loops are not expected to stall, but we still need to + * prove to the verifier they will terminate eventually. 
+ */ + cond_break_label(out); + } while (!atomic_try_cmpxchg_relaxed(&lock->val, &old, new)); + + return old; +out: + bpf_printk("RUNTIME ERROR: %s unexpected cond_break exit!!!", __func__); + return old; +} + +/** + * clear_pending - clear the pending bit. + * @lock: Pointer to queued spinlock structure + * + * *,1,* -> *,0,* + */ +static __always_inline void clear_pending(arena_spinlock_t __arena *lock) +{ + WRITE_ONCE(lock->pending, 0); +} + +/** + * clear_pending_set_locked - take ownership and clear the pending bit. + * @lock: Pointer to queued spinlock structure + * + * *,1,0 -> *,0,1 + * + * Lock stealing is not allowed if this function is used. + */ +static __always_inline void clear_pending_set_locked(arena_spinlock_t __arena *lock) +{ + WRITE_ONCE(lock->locked_pending, _Q_LOCKED_VAL); +} + +/** + * set_locked - Set the lock bit and own the lock + * @lock: Pointer to queued spinlock structure + * + * *,*,0 -> *,0,1 + */ +static __always_inline void set_locked(arena_spinlock_t __arena *lock) +{ + WRITE_ONCE(lock->locked, _Q_LOCKED_VAL); +} + +static __always_inline +u32 arena_fetch_set_pending_acquire(arena_spinlock_t __arena *lock) +{ + u32 old, new; + + old = atomic_read(&lock->val); + do { + new = old | _Q_PENDING_VAL; + /* + * These loops are not expected to stall, but we still need to + * prove to the verifier they will terminate eventually. + */ + cond_break_label(out); + } while (!atomic_try_cmpxchg_acquire(&lock->val, &old, new)); + + return old; +out: + bpf_printk("RUNTIME ERROR: %s unexpected cond_break exit!!!", __func__); + return old; +} + +/** + * arena_spin_trylock - try to acquire the queued spinlock + * @lock : Pointer to queued spinlock structure + * Return: 1 if lock acquired, 0 if failed + */ +static __always_inline int arena_spin_trylock(arena_spinlock_t __arena *lock) +{ + int val = atomic_read(&lock->val); + + if (unlikely(val)) + return 0; + + return likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)); +} + +__noinline +int arena_spin_lock_slowpath(arena_spinlock_t __arena __arg_arena *lock, u32 val) +{ + struct arena_mcs_spinlock __arena *prev, *next, *node0, *node; + int ret = -ETIMEDOUT; + u32 old, tail; + int idx; + + /* + * Wait for in-progress pending->locked hand-overs with a bounded + * number of spins so that we guarantee forward progress. + * + * 0,1,0 -> 0,0,1 + */ + if (val == _Q_PENDING_VAL) { + int cnt = _Q_PENDING_LOOPS; + val = atomic_cond_read_relaxed_label(&lock->val, + (VAL != _Q_PENDING_VAL) || !cnt--, + release_err); + } + + /* + * If we observe any contention; queue. + */ + if (val & ~_Q_LOCKED_MASK) + goto queue; + + /* + * trylock || pending + * + * 0,0,* -> 0,1,* -> 0,0,1 pending, trylock + */ + val = arena_fetch_set_pending_acquire(lock); + + /* + * If we observe contention, there is a concurrent locker. + * + * Undo and queue; our setting of PENDING might have made the + * n,0,0 -> 0,0,0 transition fail and it will now be waiting + * on @next to become !NULL. + */ + if (unlikely(val & ~_Q_LOCKED_MASK)) { + + /* Undo PENDING if we set it. */ + if (!(val & _Q_PENDING_MASK)) + clear_pending(lock); + + goto queue; + } + + /* + * We're pending, wait for the owner to go away. + * + * 0,1,1 -> *,1,0 + * + * this wait loop must be a load-acquire such that we match the + * store-release that clears the locked bit and create lock + * sequentiality; this is because not all + * clear_pending_set_locked() implementations imply full + * barriers. 
+ */ + if (val & _Q_LOCKED_MASK) + smp_cond_load_acquire_label(&lock->locked, !VAL, release_err); + + /* + * take ownership and clear the pending bit. + * + * 0,1,0 -> 0,0,1 + */ + clear_pending_set_locked(lock); + return 0; + + /* + * End of pending bit optimistic spinning and beginning of MCS + * queuing. + */ +queue: + node0 = &(qnodes[bpf_get_smp_processor_id()])[0].mcs; + idx = node0->count++; + tail = encode_tail(bpf_get_smp_processor_id(), idx); + + /* + * 4 nodes are allocated based on the assumption that there will not be + * nested NMIs taking spinlocks. That may not be true in some + * architectures even though the chance of needing more than 4 nodes + * will still be extremely unlikely. When that happens, we simply return + * an error. Original qspinlock has a trylock fallback in this case. + */ + if (unlikely(idx >= _Q_MAX_NODES)) { + ret = -EBUSY; + goto release_node_err; + } + + node = grab_mcs_node(node0, idx); + + /* + * Ensure that we increment the head node->count before initialising + * the actual node. If the compiler is kind enough to reorder these + * stores, then an IRQ could overwrite our assignments. + */ + barrier(); + + node->locked = 0; + node->next = NULL; + + /* + * We touched a (possibly) cold cacheline in the per-cpu queue node; + * attempt the trylock once more in the hope someone let go while we + * weren't watching. + */ + if (arena_spin_trylock(lock)) + goto release; + + /* + * Ensure that the initialisation of @node is complete before we + * publish the updated tail via xchg_tail() and potentially link + * @node into the waitqueue via WRITE_ONCE(prev->next, node) below. + */ + smp_wmb(); + + /* + * Publish the updated tail. + * We have already touched the queueing cacheline; don't bother with + * pending stuff. + * + * p,*,* -> n,*,* + */ + old = xchg_tail(lock, tail); + next = NULL; + + /* + * if there was a previous node; link it and wait until reaching the + * head of the waitqueue. + */ + if (old & _Q_TAIL_MASK) { + prev = decode_tail(old); + + /* Link @node into the waitqueue. */ + WRITE_ONCE(prev->next, node); + + arch_mcs_spin_lock_contended_label(&node->locked, release_node_err); + + /* + * While waiting for the MCS lock, the next pointer may have + * been set by another lock waiter. We cannot prefetch here + * due to lack of equivalent instruction in BPF ISA. + */ + next = READ_ONCE(node->next); + } + + /* + * we're at the head of the waitqueue, wait for the owner & pending to + * go away. + * + * *,x,y -> *,0,0 + * + * this wait loop must use a load-acquire such that we match the + * store-release that clears the locked bit and create lock + * sequentiality; this is because the set_locked() function below + * does not imply a full barrier. + */ + val = atomic_cond_read_acquire_label(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK), + release_node_err); + + /* + * claim the lock: + * + * n,0,0 -> 0,0,1 : lock, uncontended + * *,*,0 -> *,*,1 : lock, contended + * + * If the queue head is the only one in the queue (lock value == tail) + * and nobody is pending, clear the tail code and grab the lock. + * Otherwise, we only need to grab the lock. + */ + + /* + * In the PV case we might already have _Q_LOCKED_VAL set, because + * of lock stealing; therefore we must also allow: + * + * n,0,1 -> 0,0,1 + * + * Note: at this point: (val & _Q_PENDING_MASK) == 0, because of the + * above wait condition, therefore any concurrent setting of + * PENDING will make the uncontended transition fail. 
+ */ + if ((val & _Q_TAIL_MASK) == tail) { + if (atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL)) + goto release; /* No contention */ + } + + /* + * Either somebody is queued behind us or _Q_PENDING_VAL got set + * which will then detect the remaining tail and queue behind us + * ensuring we'll see a @next. + */ + set_locked(lock); + + /* + * contended path; wait for next if not observed yet, release. + */ + if (!next) + next = smp_cond_load_relaxed_label(&node->next, (VAL), release_node_err); + + arch_mcs_spin_unlock_contended(&next->locked); + +release:; + /* + * release the node + * + * Doing a normal dec vs this_cpu_dec is fine. An upper context always + * decrements count it incremented before returning, thus we're fine. + * For contexts interrupting us, they either observe our dec or not. + * Just ensure the compiler doesn't reorder this statement, as a + * this_cpu_dec implicitly implied that. + */ + barrier(); + node0->count--; + return 0; +release_node_err: + barrier(); + node0->count--; + goto release_err; +release_err: + return ret; +} + +/** + * arena_spin_lock - acquire a queued spinlock + * @lock: Pointer to queued spinlock structure + * + * On error, returned value will be negative. + * On success, zero is returned. + * + * The return value _must_ be tested against zero for success, + * instead of checking it against negative, for passing the + * BPF verifier. + * + * The user should do: + * if (arena_spin_lock(...) != 0) // failure + * or + * if (arena_spin_lock(...) == 0) // success + * or + * if (arena_spin_lock(...)) // failure + * or + * if (!arena_spin_lock(...)) // success + * instead of: + * if (arena_spin_lock(...) < 0) // failure + * + * The return value can still be inspected later. + */ +static __always_inline int arena_spin_lock(arena_spinlock_t __arena *lock) +{ + int val = 0; + + if (CONFIG_NR_CPUS > 1024) + return -EOPNOTSUPP; + + bpf_preempt_disable(); + if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL))) + return 0; + + val = arena_spin_lock_slowpath(lock, val); + /* FIXME: bpf_assert_range(-MAX_ERRNO, 0) once we have it working for all cases. */ + if (val) + bpf_preempt_enable(); + return val; +} + +/** + * arena_spin_unlock - release a queued spinlock + * @lock : Pointer to queued spinlock structure + */ +static __always_inline void arena_spin_unlock(arena_spinlock_t __arena *lock) +{ + /* + * unlock() needs release semantics: + */ + smp_store_release(&lock->locked, 0); + bpf_preempt_enable(); +} + +#define arena_spin_lock_irqsave(lock, flags) \ + ({ \ + int __ret; \ + bpf_local_irq_save(&(flags)); \ + __ret = arena_spin_lock((lock)); \ + if (__ret) \ + bpf_local_irq_restore(&(flags)); \ + (__ret); \ + }) + +#define arena_spin_unlock_irqrestore(lock, flags) \ + ({ \ + arena_spin_unlock((lock)); \ + bpf_local_irq_restore(&(flags)); \ + }) + +#endif + +#endif /* BPF_ARENA_SPIN_LOCK_H */ diff --git a/tools/testing/selftests/bpf/bpf_atomic.h b/tools/testing/selftests/bpf/bpf_atomic.h new file mode 100644 index 000000000000..a9674e544322 --- /dev/null +++ b/tools/testing/selftests/bpf/bpf_atomic.h @@ -0,0 +1,140 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. 
*/ +#ifndef BPF_ATOMIC_H +#define BPF_ATOMIC_H + +#include +#include +#include "bpf_experimental.h" + +extern bool CONFIG_X86_64 __kconfig __weak; + +/* + * __unqual_typeof(x) - Declare an unqualified scalar type, leaving + * non-scalar types unchanged, + * + * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char' + * is not type-compatible with 'signed char', and we define a separate case. + * + * This is copied verbatim from kernel's include/linux/compiler_types.h, but + * with default expression (for pointers) changed from (x) to (typeof(x)0). + * + * This is because LLVM has a bug where for lvalue (x), it does not get rid of + * an extra address_space qualifier, but does in case of rvalue (typeof(x)0). + * Hence, for pointers, we need to create an rvalue expression to get the + * desired type. See https://github.com/llvm/llvm-project/issues/53400. + */ +#define __scalar_type_to_expr_cases(type) \ + unsigned type : (unsigned type)0, signed type : (signed type)0 + +#define __unqual_typeof(x) \ + typeof(_Generic((x), \ + char: (char)0, \ + __scalar_type_to_expr_cases(char), \ + __scalar_type_to_expr_cases(short), \ + __scalar_type_to_expr_cases(int), \ + __scalar_type_to_expr_cases(long), \ + __scalar_type_to_expr_cases(long long), \ + default: (typeof(x))0)) + +/* No-op for BPF */ +#define cpu_relax() ({}) + +#define READ_ONCE(x) (*(volatile typeof(x) *)&(x)) + +#define WRITE_ONCE(x, val) ((*(volatile typeof(x) *)&(x)) = (val)) + +#define cmpxchg(p, old, new) __sync_val_compare_and_swap((p), old, new) + +#define try_cmpxchg(p, pold, new) \ + ({ \ + __unqual_typeof(*(pold)) __o = *(pold); \ + __unqual_typeof(*(p)) __r = cmpxchg(p, __o, new); \ + if (__r != __o) \ + *(pold) = __r; \ + __r == __o; \ + }) + +#define try_cmpxchg_relaxed(p, pold, new) try_cmpxchg(p, pold, new) + +#define try_cmpxchg_acquire(p, pold, new) try_cmpxchg(p, pold, new) + +#define smp_mb() \ + ({ \ + unsigned long __val; \ + __sync_fetch_and_add(&__val, 0); \ + }) + +#define smp_rmb() \ + ({ \ + if (!CONFIG_X86_64) \ + smp_mb(); \ + else \ + barrier(); \ + }) + +#define smp_wmb() \ + ({ \ + if (!CONFIG_X86_64) \ + smp_mb(); \ + else \ + barrier(); \ + }) + +/* Control dependency provides LOAD->STORE, provide LOAD->LOAD */ +#define smp_acquire__after_ctrl_dep() ({ smp_rmb(); }) + +#define smp_load_acquire(p) \ + ({ \ + __unqual_typeof(*(p)) __v = READ_ONCE(*(p)); \ + if (!CONFIG_X86_64) \ + smp_mb(); \ + barrier(); \ + __v; \ + }) + +#define smp_store_release(p, val) \ + ({ \ + if (!CONFIG_X86_64) \ + smp_mb(); \ + barrier(); \ + WRITE_ONCE(*(p), val); \ + }) + +#define smp_cond_load_relaxed_label(p, cond_expr, label) \ + ({ \ + typeof(p) __ptr = (p); \ + __unqual_typeof(*(p)) VAL; \ + for (;;) { \ + VAL = (__unqual_typeof(*(p)))READ_ONCE(*__ptr); \ + if (cond_expr) \ + break; \ + cond_break_label(label); \ + cpu_relax(); \ + } \ + (typeof(*(p)))VAL; \ + }) + +#define smp_cond_load_acquire_label(p, cond_expr, label) \ + ({ \ + __unqual_typeof(*p) __val = \ + smp_cond_load_relaxed_label(p, cond_expr, label); \ + smp_acquire__after_ctrl_dep(); \ + (typeof(*(p)))__val; \ + }) + +#define atomic_read(p) READ_ONCE((p)->counter) + +#define atomic_cond_read_relaxed_label(p, cond_expr, label) \ + smp_cond_load_relaxed_label(&(p)->counter, cond_expr, label) + +#define atomic_cond_read_acquire_label(p, cond_expr, label) \ + smp_cond_load_acquire_label(&(p)->counter, cond_expr, label) + +#define atomic_try_cmpxchg_relaxed(p, pold, new) \ + try_cmpxchg_relaxed(&(p)->counter, pold, new) + +#define 
atomic_try_cmpxchg_acquire(p, pold, new) \
+	try_cmpxchg_acquire(&(p)->counter, pold, new)
+
+#endif /* BPF_ATOMIC_H */
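
As a usage reference, a minimal sketch of a BPF program taking the arena
spin lock (illustrative only; the arena map definition is omitted and the
critical section is a placeholder, mirroring what the selftest added in
the next patch does):

	arena_spinlock_t __arena lock;	/* lock word placed in the BPF arena */
	int counter;

	SEC("tc")
	int prog(void *ctx)
	{
		unsigned long flags;

		/* Test the return value against zero: the slow path is a global
		 * function, so the verifier does not track its precise value.
		 */
		if (arena_spin_lock_irqsave(&lock, flags))
			return 1;
		counter++;		/* critical section */
		arena_spin_unlock_irqrestore(&lock, flags);
		return 0;
	}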
From patchwork Thu Mar 6 03:54:31 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 14003800
X-Patchwork-Delegate: bpf@iogearbox.net
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
    Eduard Zingerman, Tejun Heo, Emil Tsalapatis, Barret Rhoden, Josh Don,
    Dohyun Kim, kkd@meta.com, kernel-team@meta.com
Subject: [PATCH bpf-next v5 3/3] selftests/bpf: Add tests for arena spin lock
Date: Wed, 5 Mar 2025 19:54:31 -0800
Message-ID: <20250306035431.2186189-4-memxor@gmail.com>
In-Reply-To: <20250306035431.2186189-1-memxor@gmail.com>
References: <20250306035431.2186189-1-memxor@gmail.com>

Add some basic selftests for the qspinlock built over BPF arena, using
the cond_break_label macro.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../bpf/prog_tests/arena_spin_lock.c | 108 ++++++++++++++++++
 .../selftests/bpf/progs/arena_spin_lock.c | 51 +++++++++
 2 files changed, 159 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/arena_spin_lock.c
 create mode 100644 tools/testing/selftests/bpf/progs/arena_spin_lock.c

diff --git a/tools/testing/selftests/bpf/prog_tests/arena_spin_lock.c b/tools/testing/selftests/bpf/prog_tests/arena_spin_lock.c
new file mode 100644
index 000000000000..bc3616ba891c
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/arena_spin_lock.c
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+#include
+#include
+#include
+
+struct qspinlock { int val; };
+typedef struct qspinlock arena_spinlock_t;
+
+struct arena_qnode {
+	unsigned long next;
+	int count;
+	int locked;
+};
+
+#include "arena_spin_lock.skel.h"
+
+static long cpu;
+static int repeat;
+
+pthread_barrier_t barrier;
+
+static void *spin_lock_thread(void *arg)
+{
+	int err, prog_fd = *(u32 *)arg;
+	LIBBPF_OPTS(bpf_test_run_opts, topts,
+		.data_in = &pkt_v4,
+		.data_size_in = sizeof(pkt_v4),
+		.repeat = repeat,
+	);
+	cpu_set_t cpuset;
+
+	CPU_ZERO(&cpuset);
+	CPU_SET(__sync_fetch_and_add(&cpu, 1), &cpuset);
+	ASSERT_OK(pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset), "cpu affinity");
+
+	err = pthread_barrier_wait(&barrier);
+	if (err != PTHREAD_BARRIER_SERIAL_THREAD && err != 0)
+		ASSERT_FALSE(true, "pthread_barrier");
+
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "test_run err");
+	ASSERT_EQ((int)topts.retval, 0, "test_run retval");
+
+	pthread_exit(arg);
+}
+
+static void test_arena_spin_lock_size(int size)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, topts);
+	struct arena_spin_lock *skel;
+	pthread_t thread_id[16];
+	int prog_fd, i, err;
+	void *ret;
+
+	if (get_nprocs() < 2) {
+		test__skip();
+		return;
+	}
+
+	skel = arena_spin_lock__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "arena_spin_lock__open_and_load"))
+		return;
+	if (skel->data->test_skip == 2) {
+		test__skip();
+		goto end;
+	}
+	skel->bss->cs_count = size;
+	skel->bss->limit = repeat * 16;
+
+	ASSERT_OK(pthread_barrier_init(&barrier, NULL, 16), "barrier init");
+
+	prog_fd = bpf_program__fd(skel->progs.prog);
+	for (i = 0; i < 16; i++) {
+		err = pthread_create(&thread_id[i], NULL, &spin_lock_thread, &prog_fd);
+		if (!ASSERT_OK(err, "pthread_create"))
+			goto end_barrier;
+	}
+
+	for (i = 0; i < 16; i++) {
+		if (!ASSERT_OK(pthread_join(thread_id[i], &ret), "pthread_join"))
+			goto end_barrier;
+		if (!ASSERT_EQ(ret, &prog_fd, "ret == prog_fd"))
+			goto end_barrier;
+	}
+
+	ASSERT_EQ(skel->bss->counter, repeat * 16, "check counter value");
+
+end_barrier:
+	pthread_barrier_destroy(&barrier);
+end:
+	arena_spin_lock__destroy(skel);
+	return;
+}
+
+void test_arena_spin_lock(void)
+{
+	repeat = 1000;
+	if (test__start_subtest("arena_spin_lock_1"))
+		test_arena_spin_lock_size(1);
+	cpu = 0;
+	if (test__start_subtest("arena_spin_lock_1000"))
+		test_arena_spin_lock_size(1000);
+	cpu = 0;
+	repeat = 100;
+	if (test__start_subtest("arena_spin_lock_50000"))
+		test_arena_spin_lock_size(50000);
+}
diff --git a/tools/testing/selftests/bpf/progs/arena_spin_lock.c b/tools/testing/selftests/bpf/progs/arena_spin_lock.c
new file mode 100644
index 000000000000..c4500c37f85e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/arena_spin_lock.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025 Meta Platforms, Inc. and affiliates. */
+#include
+#include
+#include
+#include "bpf_misc.h"
+#include "bpf_arena_spin_lock.h"
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARENA);
+	__uint(map_flags, BPF_F_MMAPABLE);
+	__uint(max_entries, 100); /* number of pages */
+#ifdef __TARGET_ARCH_arm64
+	__ulong(map_extra, 0x1ull << 32); /* start of mmap() region */
+#else
+	__ulong(map_extra, 0x1ull << 44); /* start of mmap() region */
+#endif
+} arena SEC(".maps");
+
+int cs_count;
+
+#if defined(ENABLE_ATOMICS_TESTS) && defined(__BPF_FEATURE_ADDR_SPACE_CAST)
+arena_spinlock_t __arena lock;
+int test_skip = 1;
+#else
+int test_skip = 2;
+#endif
+
+int counter;
+int limit;
+
+SEC("tc")
+int prog(void *ctx)
+{
+	int ret = -2;
+
+#if defined(ENABLE_ATOMICS_TESTS) && defined(__BPF_FEATURE_ADDR_SPACE_CAST)
+	unsigned long flags;
+
+	if ((ret = arena_spin_lock_irqsave(&lock, flags)))
+		return ret;
+	if (counter != limit)
+		counter++;
+	bpf_repeat(cs_count);
+	ret = 0;
+	arena_spin_unlock_irqrestore(&lock, flags);
+#endif
+	return ret;
+}
+
+char _license[] SEC("license") = "GPL";