From patchwork Thu Feb 6 10:54:09 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962834
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Barret Rhoden, Linus Torvalds, Peter Zijlstra, Will Deacon,
 Waiman Long, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
 Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney", Tejun Heo,
 Josh Don, Dohyun Kim, linux-arm-kernel@lists.infradead.org,
 kernel-team@meta.com
Subject: [PATCH bpf-next v2 01/26] locking: Move MCS struct definition to public header
Date: Thu, 6 Feb 2025 02:54:09 -0800
Message-ID: <20250206105435.2159977-2-memxor@gmail.com>
In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com>
References: <20250206105435.2159977-1-memxor@gmail.com>

Move the definition of struct mcs_spinlock from the private
mcs_spinlock.h header in kernel/locking to the asm-generic
mcs_spinlock.h header, since subsequent commits will need to reference
it from the qspinlock.h header.
Reviewed-by: Barret Rhoden
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/asm-generic/mcs_spinlock.h | 6 ++++++
 kernel/locking/mcs_spinlock.h      | 6 ------
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/asm-generic/mcs_spinlock.h b/include/asm-generic/mcs_spinlock.h
index 10cd4ffc6ba2..39c94012b88a 100644
--- a/include/asm-generic/mcs_spinlock.h
+++ b/include/asm-generic/mcs_spinlock.h
@@ -1,6 +1,12 @@
 #ifndef __ASM_MCS_SPINLOCK_H
 #define __ASM_MCS_SPINLOCK_H
 
+struct mcs_spinlock {
+	struct mcs_spinlock *next;
+	int locked; /* 1 if lock acquired */
+	int count;  /* nesting count, see qspinlock.c */
+};
+
 /*
  * Architectures can define their own:
  *
diff --git a/kernel/locking/mcs_spinlock.h b/kernel/locking/mcs_spinlock.h
index 85251d8771d9..16160ca8907f 100644
--- a/kernel/locking/mcs_spinlock.h
+++ b/kernel/locking/mcs_spinlock.h
@@ -15,12 +15,6 @@
 
 #include
 
-struct mcs_spinlock {
-	struct mcs_spinlock *next;
-	int locked; /* 1 if lock acquired */
-	int count;  /* nesting count, see qspinlock.c */
-};
-
 #ifndef arch_mcs_spin_lock_contended
 /*
  * Using smp_cond_load_acquire() provides the acquire semantics
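[As an illustration, not part of the patch: with the definition in the
asm-generic header, any file that can see that header may now embed the
node type, which requires the complete structure rather than a forward
declaration. The consumer below is hypothetical.]

#include <asm-generic/mcs_spinlock.h>

/*
 * Hypothetical consumer: embedding (or taking sizeof()) an MCS node
 * needs the complete structure definition, which the asm-generic
 * header now provides.
 */
struct example_qnode {
	struct mcs_spinlock mcs;	/* queueing node */
	long payload[2];		/* extra per-node state */
};
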
From patchwork Thu Feb 6 10:54:10 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962837
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Barret Rhoden, Linus Torvalds, Peter Zijlstra, Will Deacon,
 Waiman Long, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
 Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney", Tejun Heo,
 Josh Don, Dohyun Kim, linux-arm-kernel@lists.infradead.org,
 kernel-team@meta.com
Subject: [PATCH bpf-next v2 02/26] locking: Move common qspinlock helpers to a private header
Date: Thu, 6 Feb 2025 02:54:10 -0800
Message-ID: <20250206105435.2159977-3-memxor@gmail.com>
In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com>
References: <20250206105435.2159977-1-memxor@gmail.com>

Move the qspinlock helper functions that encode and decode the tail
word, set and clear the pending and locked bits, and other
miscellaneous definitions and macros into a private header. To this
end, create a qspinlock.h header file in kernel/locking. Subsequent
commits will introduce a modified qspinlock slow path function, so
moving the shared code into a private header minimizes unnecessary
code duplication.

Reviewed-by: Barret Rhoden
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/locking/qspinlock.c | 193 +----------------------------------
 kernel/locking/qspinlock.h | 200 +++++++++++++++++++++++++++++++++++++
 2 files changed, 205 insertions(+), 188 deletions(-)
 create mode 100644 kernel/locking/qspinlock.h

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 7d96bed718e4..af8d122bb649 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -25,8 +25,9 @@
 #include
 
 /*
- * Include queued spinlock statistics code
+ * Include queued spinlock definitions and statistics code
  */
+#include "qspinlock.h"
 #include "qspinlock_stat.h"
 
 /*
@@ -67,36 +68,6 @@
  */
 #include "mcs_spinlock.h"
 
-#define MAX_NODES	4
-
-/*
- * On 64-bit architectures, the mcs_spinlock structure will be 16 bytes in
- * size and four of them will fit nicely in one 64-byte cacheline. For
- * pvqspinlock, however, we need more space for extra data. To accommodate
- * that, we insert two more long words to pad it up to 32 bytes. IOW, only
- * two of them can fit in a cacheline in this case. That is OK as it is rare
- * to have more than 2 levels of slowpath nesting in actual use. We don't
- * want to penalize pvqspinlocks to optimize for a rare case in native
- * qspinlocks.
- */
-struct qnode {
-	struct mcs_spinlock mcs;
-#ifdef CONFIG_PARAVIRT_SPINLOCKS
-	long reserved[2];
-#endif
-};
-
-/*
- * The pending bit spinning loop count.
- * This heuristic is used to limit the number of lockword accesses
- * made by atomic_cond_read_relaxed when waiting for the lock to
- * transition out of the "== _Q_PENDING_VAL" state. We don't spin
- * indefinitely because there's no guarantee that we'll make forward
- * progress.
- */
-#ifndef _Q_PENDING_LOOPS
-#define _Q_PENDING_LOOPS	1
-#endif
 
 /*
  * Per-CPU queue node structures; we can never have more than 4 nested
@@ -106,161 +77,7 @@ struct qnode {
  *
  * PV doubles the storage and uses the second cacheline for PV state.
  */
-static DEFINE_PER_CPU_ALIGNED(struct qnode, qnodes[MAX_NODES]);
-
-/*
- * We must be able to distinguish between no-tail and the tail at 0:0,
- * therefore increment the cpu number by one.
- */
-
-static inline __pure u32 encode_tail(int cpu, int idx)
-{
-	u32 tail;
-
-	tail  = (cpu + 1) << _Q_TAIL_CPU_OFFSET;
-	tail |= idx << _Q_TAIL_IDX_OFFSET; /* assume < 4 */
-
-	return tail;
-}
-
-static inline __pure struct mcs_spinlock *decode_tail(u32 tail)
-{
-	int cpu = (tail >> _Q_TAIL_CPU_OFFSET) - 1;
-	int idx = (tail & _Q_TAIL_IDX_MASK) >> _Q_TAIL_IDX_OFFSET;
-
-	return per_cpu_ptr(&qnodes[idx].mcs, cpu);
-}
-
-static inline __pure
-struct mcs_spinlock *grab_mcs_node(struct mcs_spinlock *base, int idx)
-{
-	return &((struct qnode *)base + idx)->mcs;
-}
-
-#define _Q_LOCKED_PENDING_MASK (_Q_LOCKED_MASK | _Q_PENDING_MASK)
-
-#if _Q_PENDING_BITS == 8
-/**
- * clear_pending - clear the pending bit.
- * @lock: Pointer to queued spinlock structure
- *
- * *,1,* -> *,0,*
- */
-static __always_inline void clear_pending(struct qspinlock *lock)
-{
-	WRITE_ONCE(lock->pending, 0);
-}
-
-/**
- * clear_pending_set_locked - take ownership and clear the pending bit.
- * @lock: Pointer to queued spinlock structure
- *
- * *,1,0 -> *,0,1
- *
- * Lock stealing is not allowed if this function is used.
- */
-static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
-{
-	WRITE_ONCE(lock->locked_pending, _Q_LOCKED_VAL);
-}
-
-/*
- * xchg_tail - Put in the new queue tail code word & retrieve previous one
- * @lock : Pointer to queued spinlock structure
- * @tail : The new queue tail code word
- * Return: The previous queue tail code word
- *
- * xchg(lock, tail), which heads an address dependency
- *
- * p,*,* -> n,*,* ; prev = xchg(lock, node)
- */
-static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
-{
-	/*
-	 * We can use relaxed semantics since the caller ensures that the
-	 * MCS node is properly initialized before updating the tail.
-	 */
-	return (u32)xchg_relaxed(&lock->tail,
-				 tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
-}
-
-#else /* _Q_PENDING_BITS == 8 */
-
-/**
- * clear_pending - clear the pending bit.
- * @lock: Pointer to queued spinlock structure
- *
- * *,1,* -> *,0,*
- */
-static __always_inline void clear_pending(struct qspinlock *lock)
-{
-	atomic_andnot(_Q_PENDING_VAL, &lock->val);
-}
-
-/**
- * clear_pending_set_locked - take ownership and clear the pending bit.
- * @lock: Pointer to queued spinlock structure
- *
- * *,1,0 -> *,0,1
- */
-static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
-{
-	atomic_add(-_Q_PENDING_VAL + _Q_LOCKED_VAL, &lock->val);
-}
-
-/**
- * xchg_tail - Put in the new queue tail code word & retrieve previous one
- * @lock : Pointer to queued spinlock structure
- * @tail : The new queue tail code word
- * Return: The previous queue tail code word
- *
- * xchg(lock, tail)
- *
- * p,*,* -> n,*,* ; prev = xchg(lock, node)
- */
-static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
-{
-	u32 old, new;
-
-	old = atomic_read(&lock->val);
-	do {
-		new = (old & _Q_LOCKED_PENDING_MASK) | tail;
-		/*
-		 * We can use relaxed semantics since the caller ensures that
-		 * the MCS node is properly initialized before updating the
-		 * tail.
-		 */
-	} while (!atomic_try_cmpxchg_relaxed(&lock->val, &old, new));
-
-	return old;
-}
-#endif /* _Q_PENDING_BITS == 8 */
-
-/**
- * queued_fetch_set_pending_acquire - fetch the whole lock value and set pending
- * @lock : Pointer to queued spinlock structure
- * Return: The previous lock value
- *
- * *,*,* -> *,1,*
- */
-#ifndef queued_fetch_set_pending_acquire
-static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock)
-{
-	return atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val);
-}
-#endif
-
-/**
- * set_locked - Set the lock bit and own the lock
- * @lock: Pointer to queued spinlock structure
- *
- * *,*,0 -> *,0,1
- */
-static __always_inline void set_locked(struct qspinlock *lock)
-{
-	WRITE_ONCE(lock->locked, _Q_LOCKED_VAL);
-}
-
+static DEFINE_PER_CPU_ALIGNED(struct qnode, qnodes[_Q_MAX_NODES]);
 
 /*
  * Generate the native code for queued_spin_unlock_slowpath(); provide NOPs for
@@ -410,7 +227,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * any MCS node. This is not the most elegant solution, but is
 	 * simple enough.
 	 */
-	if (unlikely(idx >= MAX_NODES)) {
+	if (unlikely(idx >= _Q_MAX_NODES)) {
 		lockevent_inc(lock_no_node);
 		while (!queued_spin_trylock(lock))
 			cpu_relax();
@@ -465,7 +282,7 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * head of the waitqueue.
 	 */
 	if (old & _Q_TAIL_MASK) {
-		prev = decode_tail(old);
+		prev = decode_tail(old, qnodes);
 
 		/* Link @node into the waitqueue. */
 		WRITE_ONCE(prev->next, node);
diff --git a/kernel/locking/qspinlock.h b/kernel/locking/qspinlock.h
new file mode 100644
index 000000000000..d4ceb9490365
--- /dev/null
+++ b/kernel/locking/qspinlock.h
@@ -0,0 +1,200 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Queued spinlock defines
+ *
+ * This file contains macro definitions and functions shared between different
+ * qspinlock slow path implementations.
+ */
+#ifndef __LINUX_QSPINLOCK_H
+#define __LINUX_QSPINLOCK_H
+
+#include
+#include
+#include
+#include
+
+#define _Q_MAX_NODES	4
+
+/*
+ * The pending bit spinning loop count.
+ * This heuristic is used to limit the number of lockword accesses
+ * made by atomic_cond_read_relaxed when waiting for the lock to
+ * transition out of the "== _Q_PENDING_VAL" state. We don't spin
+ * indefinitely because there's no guarantee that we'll make forward
+ * progress.
+ */
+#ifndef _Q_PENDING_LOOPS
+#define _Q_PENDING_LOOPS	1
+#endif
+
+/*
+ * On 64-bit architectures, the mcs_spinlock structure will be 16 bytes in
+ * size and four of them will fit nicely in one 64-byte cacheline. For
+ * pvqspinlock, however, we need more space for extra data. To accommodate
+ * that, we insert two more long words to pad it up to 32 bytes. IOW, only
+ * two of them can fit in a cacheline in this case. That is OK as it is rare
+ * to have more than 2 levels of slowpath nesting in actual use. We don't
+ * want to penalize pvqspinlocks to optimize for a rare case in native
+ * qspinlocks.
+ */
+struct qnode {
+	struct mcs_spinlock mcs;
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+	long reserved[2];
+#endif
+};
+
+/*
+ * We must be able to distinguish between no-tail and the tail at 0:0,
+ * therefore increment the cpu number by one.
+ */
+
+static inline __pure u32 encode_tail(int cpu, int idx)
+{
+	u32 tail;
+
+	tail  = (cpu + 1) << _Q_TAIL_CPU_OFFSET;
+	tail |= idx << _Q_TAIL_IDX_OFFSET; /* assume < 4 */
+
+	return tail;
+}
+
+static inline __pure struct mcs_spinlock *decode_tail(u32 tail, struct qnode *qnodes)
+{
+	int cpu = (tail >> _Q_TAIL_CPU_OFFSET) - 1;
+	int idx = (tail & _Q_TAIL_IDX_MASK) >> _Q_TAIL_IDX_OFFSET;
+
+	return per_cpu_ptr(&qnodes[idx].mcs, cpu);
+}
+
+static inline __pure
+struct mcs_spinlock *grab_mcs_node(struct mcs_spinlock *base, int idx)
+{
+	return &((struct qnode *)base + idx)->mcs;
+}
+
+#define _Q_LOCKED_PENDING_MASK (_Q_LOCKED_MASK | _Q_PENDING_MASK)
+
+#if _Q_PENDING_BITS == 8
+/**
+ * clear_pending - clear the pending bit.
+ * @lock: Pointer to queued spinlock structure
+ *
+ * *,1,* -> *,0,*
+ */
+static __always_inline void clear_pending(struct qspinlock *lock)
+{
+	WRITE_ONCE(lock->pending, 0);
+}
+
+/**
+ * clear_pending_set_locked - take ownership and clear the pending bit.
+ * @lock: Pointer to queued spinlock structure
+ *
+ * *,1,0 -> *,0,1
+ *
+ * Lock stealing is not allowed if this function is used.
+ */
+static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
+{
+	WRITE_ONCE(lock->locked_pending, _Q_LOCKED_VAL);
+}
+
+/*
+ * xchg_tail - Put in the new queue tail code word & retrieve previous one
+ * @lock : Pointer to queued spinlock structure
+ * @tail : The new queue tail code word
+ * Return: The previous queue tail code word
+ *
+ * xchg(lock, tail), which heads an address dependency
+ *
+ * p,*,* -> n,*,* ; prev = xchg(lock, node)
+ */
+static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
+{
+	/*
+	 * We can use relaxed semantics since the caller ensures that the
+	 * MCS node is properly initialized before updating the tail.
+	 */
+	return (u32)xchg_relaxed(&lock->tail,
+				 tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
+}
+
+#else /* _Q_PENDING_BITS == 8 */
+
+/**
+ * clear_pending - clear the pending bit.
+ * @lock: Pointer to queued spinlock structure
+ *
+ * *,1,* -> *,0,*
+ */
+static __always_inline void clear_pending(struct qspinlock *lock)
+{
+	atomic_andnot(_Q_PENDING_VAL, &lock->val);
+}
+
+/**
+ * clear_pending_set_locked - take ownership and clear the pending bit.
+ * @lock: Pointer to queued spinlock structure
+ *
+ * *,1,0 -> *,0,1
+ */
+static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
+{
+	atomic_add(-_Q_PENDING_VAL + _Q_LOCKED_VAL, &lock->val);
+}
+
+/**
+ * xchg_tail - Put in the new queue tail code word & retrieve previous one
+ * @lock : Pointer to queued spinlock structure
+ * @tail : The new queue tail code word
+ * Return: The previous queue tail code word
+ *
+ * xchg(lock, tail)
+ *
+ * p,*,* -> n,*,* ; prev = xchg(lock, node)
+ */
+static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
+{
+	u32 old, new;
+
+	old = atomic_read(&lock->val);
+	do {
+		new = (old & _Q_LOCKED_PENDING_MASK) | tail;
+		/*
+		 * We can use relaxed semantics since the caller ensures that
+		 * the MCS node is properly initialized before updating the
+		 * tail.
+		 */
+	} while (!atomic_try_cmpxchg_relaxed(&lock->val, &old, new));
+
+	return old;
+}
+#endif /* _Q_PENDING_BITS == 8 */
+
+/**
+ * queued_fetch_set_pending_acquire - fetch the whole lock value and set pending
+ * @lock : Pointer to queued spinlock structure
+ * Return: The previous lock value
+ *
+ * *,*,* -> *,1,*
+ */
+#ifndef queued_fetch_set_pending_acquire
+static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lock)
+{
+	return atomic_fetch_or_acquire(_Q_PENDING_VAL, &lock->val);
+}
+#endif
+
+/**
+ * set_locked - Set the lock bit and own the lock
+ * @lock: Pointer to queued spinlock structure
+ *
+ * *,*,0 -> *,0,1
+ */
+static __always_inline void set_locked(struct qspinlock *lock)
+{
+	WRITE_ONCE(lock->locked, _Q_LOCKED_VAL);
+}
+
+#endif /* __LINUX_QSPINLOCK_H */
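[As an aside, the tail code word handled by the moved encode_tail() and
decode_tail() helpers can be shown with a self-contained userspace
sketch. The TAIL_* constants below are stand-ins for the real _Q_TAIL_*
macros from qspinlock_types.h and assume the usual layout with an 8-bit
pending byte: node index in bits 16-17, cpu + 1 in bits 18 and up.]

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TAIL_IDX_OFFSET	16
#define TAIL_IDX_MASK	(0x3U << TAIL_IDX_OFFSET)
#define TAIL_CPU_OFFSET	18

static uint32_t encode_tail(int cpu, int idx)
{
	/* cpu + 1, so "no tail" (0) is distinct from cpu 0, idx 0 */
	return ((uint32_t)(cpu + 1) << TAIL_CPU_OFFSET) |
	       ((uint32_t)idx << TAIL_IDX_OFFSET);
}

static void decode_tail(uint32_t tail, int *cpu, int *idx)
{
	*cpu = (int)(tail >> TAIL_CPU_OFFSET) - 1;
	*idx = (tail & TAIL_IDX_MASK) >> TAIL_IDX_OFFSET;
}

int main(void)
{
	int cpu, idx;

	decode_tail(encode_tail(5, 2), &cpu, &idx);
	assert(cpu == 5 && idx == 2);
	printf("tail round-trip ok: cpu=%d idx=%d\n", cpu, idx);
	return 0;
}
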
From patchwork Thu Feb 6 10:54:11 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962835
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Barret Rhoden, Linus Torvalds, Peter Zijlstra, Will Deacon,
 Waiman Long, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
 Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney", Tejun Heo,
 Josh Don, Dohyun Kim, linux-arm-kernel@lists.infradead.org,
 kernel-team@meta.com
Subject: [PATCH bpf-next v2 03/26] locking: Allow obtaining result of arch_mcs_spin_lock_contended
Date: Thu, 6 Feb 2025 02:54:11 -0800
Message-ID: <20250206105435.2159977-4-memxor@gmail.com>
In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com>
References: <20250206105435.2159977-1-memxor@gmail.com>

To support upcoming changes that require inspecting the return value
once the conditional waiting loop in arch_mcs_spin_lock_contended
terminates, modify the macro to preserve the result of
smp_cond_load_acquire. This enables checking the return value as
needed, which will help disambiguate the MCS node's locked state in
future patches.

Reviewed-by: Barret Rhoden
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/locking/mcs_spinlock.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/locking/mcs_spinlock.h b/kernel/locking/mcs_spinlock.h
index 16160ca8907f..5c92ba199b90 100644
--- a/kernel/locking/mcs_spinlock.h
+++ b/kernel/locking/mcs_spinlock.h
@@ -24,9 +24,7 @@
  * spinning, and smp_cond_load_acquire() provides that behavior.
  */
 #define arch_mcs_spin_lock_contended(l)					\
-do {									\
-	smp_cond_load_acquire(l, VAL);					\
-} while (0)
+	smp_cond_load_acquire(l, VAL)
 #endif
 
 #ifndef arch_mcs_spin_unlock_contended
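[For illustration, a userspace model of the change; cond_load_acquire()
below is a stand-in for the kernel's smp_cond_load_acquire(), not the
real implementation. The point is that the macro is now an expression
whose value is the final load, so callers can inspect what the locked
word was set to.]

#include <assert.h>
#include <stdatomic.h>

/* Model: spin until *l becomes non-zero, return the observed value
 * with acquire ordering. */
static int cond_load_acquire(_Atomic int *l)
{
	int val;

	while (!(val = atomic_load_explicit(l, memory_order_acquire)))
		;
	return val;
}

/* After this patch, the macro yields the final loaded value. */
#define arch_mcs_spin_lock_contended(l) cond_load_acquire(l)

int main(void)
{
	_Atomic int locked = 2;	/* pretend a waker stored a special value */

	/* The caller can now distinguish how it was woken. */
	assert(arch_mcs_spin_lock_contended(&locked) == 2);
	return 0;
}
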
From patchwork Thu Feb 6 10:54:12 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962839
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Barret Rhoden, Linus Torvalds, Peter Zijlstra, Will Deacon,
 Waiman Long, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
 Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney", Tejun Heo,
 Josh Don, Dohyun Kim, linux-arm-kernel@lists.infradead.org,
 kernel-team@meta.com
Subject: [PATCH bpf-next v2 04/26] locking: Copy out qspinlock.c to rqspinlock.c
Date: Thu, 6 Feb 2025 02:54:12 -0800
Message-ID: <20250206105435.2159977-5-memxor@gmail.com>
In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com>
References: <20250206105435.2159977-1-memxor@gmail.com>

In preparation for introducing a new lock implementation, Resilient
Queued Spin Lock, or rqspinlock, begin by using the existing
qspinlock.c code as the base. Simply copy the code to a new file and
rename functions and variables from 'queued' to 'resilient_queued', so
that each subsequent commit clearly shows how and where the code is
being changed. The only change after the literal copy in this commit
is the renaming of functions where necessary.

Reviewed-by: Barret Rhoden
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/locking/rqspinlock.c | 410 ++++++++++++++++++++++++++++++++++++
 1 file changed, 410 insertions(+)
 create mode 100644 kernel/locking/rqspinlock.c

diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
new file mode 100644
index 000000000000..caaa7c9bbc79
--- /dev/null
+++ b/kernel/locking/rqspinlock.c
@@ -0,0 +1,410 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Resilient Queued Spin Lock
+ *
+ * (C) Copyright 2013-2015 Hewlett-Packard Development Company, L.P.
+ * (C) Copyright 2013-2014,2018 Red Hat, Inc.
+ * (C) Copyright 2015 Intel Corp.
+ * (C) Copyright 2015 Hewlett-Packard Enterprise Development LP
+ *
+ * Authors: Waiman Long
+ *	    Peter Zijlstra
+ */
+
+#ifndef _GEN_PV_LOCK_SLOWPATH
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * Include queued spinlock definitions and statistics code
+ */
+#include "qspinlock.h"
+#include "qspinlock_stat.h"
+
+/*
+ * The basic principle of a queue-based spinlock can best be understood
+ * by studying a classic queue-based spinlock implementation called the
+ * MCS lock. A copy of the original MCS lock paper ("Algorithms for Scalable
+ * Synchronization on Shared-Memory Multiprocessors by Mellor-Crummey and
+ * Scott") is available at
+ *
+ * https://bugzilla.kernel.org/show_bug.cgi?id=206115
+ *
+ * This queued spinlock implementation is based on the MCS lock, however to
+ * make it fit the 4 bytes we assume spinlock_t to be, and preserve its
+ * existing API, we must modify it somehow.
+ *
+ * In particular; where the traditional MCS lock consists of a tail pointer
+ * (8 bytes) and needs the next pointer (another 8 bytes) of its own node to
+ * unlock the next pending (next->locked), we compress both these: {tail,
+ * next->locked} into a single u32 value.
+ *
+ * Since a spinlock disables recursion of its own context and there is a limit
+ * to the contexts that can nest; namely: task, softirq, hardirq, nmi. As there
+ * are at most 4 nesting levels, it can be encoded by a 2-bit number. Now
+ * we can encode the tail by combining the 2-bit nesting level with the cpu
+ * number. With one byte for the lock value and 3 bytes for the tail, only a
+ * 32-bit word is now needed. Even though we only need 1 bit for the lock,
+ * we extend it to a full byte to achieve better performance for architectures
+ * that support atomic byte write.
+ *
+ * We also change the first spinner to spin on the lock bit instead of its
+ * node; whereby avoiding the need to carry a node from lock to unlock, and
+ * preserving existing lock API. This also makes the unlock code simpler and
+ * faster.
+ *
+ * N.B. The current implementation only supports architectures that allow
+ * atomic operations on smaller 8-bit and 16-bit data types.
+ *
+ */
+
+#include "mcs_spinlock.h"
+
+/*
+ * Per-CPU queue node structures; we can never have more than 4 nested
+ * contexts: task, softirq, hardirq, nmi.
+ *
+ * Exactly fits one 64-byte cacheline on a 64-bit architecture.
+ *
+ * PV doubles the storage and uses the second cacheline for PV state.
+ */
+static DEFINE_PER_CPU_ALIGNED(struct qnode, qnodes[_Q_MAX_NODES]);
+
+/*
+ * Generate the native code for resilient_queued_spin_unlock_slowpath();
+ * provide NOPs for all the PV callbacks.
+ */
+
+static __always_inline void __pv_init_node(struct mcs_spinlock *node) { }
+static __always_inline void __pv_wait_node(struct mcs_spinlock *node,
+					   struct mcs_spinlock *prev) { }
+static __always_inline void __pv_kick_node(struct qspinlock *lock,
+					   struct mcs_spinlock *node) { }
+static __always_inline u32  __pv_wait_head_or_lock(struct qspinlock *lock,
+						   struct mcs_spinlock *node)
+						   { return 0; }
+
+#define pv_enabled()		false
+
+#define pv_init_node		__pv_init_node
+#define pv_wait_node		__pv_wait_node
+#define pv_kick_node		__pv_kick_node
+#define pv_wait_head_or_lock	__pv_wait_head_or_lock
+
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+#define resilient_queued_spin_lock_slowpath	native_resilient_queued_spin_lock_slowpath
+#endif
+
+#endif /* _GEN_PV_LOCK_SLOWPATH */
+
+/**
+ * resilient_queued_spin_lock_slowpath - acquire the queued spinlock
+ * @lock: Pointer to queued spinlock structure
+ * @val: Current value of the queued spinlock 32-bit word
+ *
+ * (queue tail, pending bit, lock value)
+ *
+ *              fast     :    slow                                  :    unlock
+ *                       :                                          :
+ * uncontended  (0,0,0) -:--> (0,0,1) ------------------------------:--> (*,*,0)
+ *                       :       | ^--------.------.             /  :
+ *                       :       v           \      \            |  :
+ * pending               :    (0,1,1) +--> (0,1,0)   \           |  :
+ *                       :       | ^--'              |           |  :
+ *                       :       v                   |           |  :
+ * uncontended           :    (n,x,y) +--> (n,0,0) --'            |  :
+ *   queue               :       | ^--'                           |  :
+ *                       :       v                                |  :
+ * contended             :    (*,x,y) +--> (*,0,0) ---> (*,0,1) -'  :
+ *   queue               :         ^--'                             :
+ */
+void __lockfunc resilient_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
+{
+	struct mcs_spinlock *prev, *next, *node;
+	u32 old, tail;
+	int idx;
+
+	BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));
+
+	if (pv_enabled())
+		goto pv_queue;
+
+	if (virt_spin_lock(lock))
+		return;
+
+	/*
+	 * Wait for in-progress pending->locked hand-overs with a bounded
+	 * number of spins so that we guarantee forward progress.
+	 *
+	 * 0,1,0 -> 0,0,1
+	 */
+	if (val == _Q_PENDING_VAL) {
+		int cnt = _Q_PENDING_LOOPS;
+		val = atomic_cond_read_relaxed(&lock->val,
+					       (VAL != _Q_PENDING_VAL) || !cnt--);
+	}
+
+	/*
+	 * If we observe any contention; queue.
+	 */
+	if (val & ~_Q_LOCKED_MASK)
+		goto queue;
+
+	/*
+	 * trylock || pending
+	 *
+	 * 0,0,* -> 0,1,* -> 0,0,1 pending, trylock
+	 */
+	val = queued_fetch_set_pending_acquire(lock);
+
+	/*
+	 * If we observe contention, there is a concurrent locker.
+	 *
+	 * Undo and queue; our setting of PENDING might have made the
+	 * n,0,0 -> 0,0,0 transition fail and it will now be waiting
+	 * on @next to become !NULL.
+	 */
+	if (unlikely(val & ~_Q_LOCKED_MASK)) {
+
+		/* Undo PENDING if we set it. */
+		if (!(val & _Q_PENDING_MASK))
+			clear_pending(lock);
+
+		goto queue;
+	}
+
+	/*
+	 * We're pending, wait for the owner to go away.
+	 *
+	 * 0,1,1 -> *,1,0
+	 *
+	 * this wait loop must be a load-acquire such that we match the
+	 * store-release that clears the locked bit and create lock
+	 * sequentiality; this is because not all
+	 * clear_pending_set_locked() implementations imply full
+	 * barriers.
+	 */
+	if (val & _Q_LOCKED_MASK)
+		smp_cond_load_acquire(&lock->locked, !VAL);
+
+	/*
+	 * take ownership and clear the pending bit.
+	 *
+	 * 0,1,0 -> 0,0,1
+	 */
+	clear_pending_set_locked(lock);
+	lockevent_inc(lock_pending);
+	return;
+
+	/*
+	 * End of pending bit optimistic spinning and beginning of MCS
+	 * queuing.
+	 */
+queue:
+	lockevent_inc(lock_slowpath);
+pv_queue:
+	node = this_cpu_ptr(&qnodes[0].mcs);
+	idx = node->count++;
+	tail = encode_tail(smp_processor_id(), idx);
+
+	trace_contention_begin(lock, LCB_F_SPIN);
+
+	/*
+	 * 4 nodes are allocated based on the assumption that there will
+	 * not be nested NMIs taking spinlocks. That may not be true in
+	 * some architectures even though the chance of needing more than
+	 * 4 nodes will still be extremely unlikely. When that happens,
+	 * we fall back to spinning on the lock directly without using
+	 * any MCS node. This is not the most elegant solution, but is
+	 * simple enough.
+	 */
+	if (unlikely(idx >= _Q_MAX_NODES)) {
+		lockevent_inc(lock_no_node);
+		while (!queued_spin_trylock(lock))
+			cpu_relax();
+		goto release;
+	}
+
+	node = grab_mcs_node(node, idx);
+
+	/*
+	 * Keep counts of non-zero index values:
+	 */
+	lockevent_cond_inc(lock_use_node2 + idx - 1, idx);
+
+	/*
+	 * Ensure that we increment the head node->count before initialising
+	 * the actual node. If the compiler is kind enough to reorder these
+	 * stores, then an IRQ could overwrite our assignments.
+	 */
+	barrier();
+
+	node->locked = 0;
+	node->next = NULL;
+	pv_init_node(node);
+
+	/*
+	 * We touched a (possibly) cold cacheline in the per-cpu queue node;
+	 * attempt the trylock once more in the hope someone let go while we
+	 * weren't watching.
+	 */
+	if (queued_spin_trylock(lock))
+		goto release;
+
+	/*
+	 * Ensure that the initialisation of @node is complete before we
+	 * publish the updated tail via xchg_tail() and potentially link
+	 * @node into the waitqueue via WRITE_ONCE(prev->next, node) below.
+	 */
+	smp_wmb();
+
+	/*
+	 * Publish the updated tail.
+	 * We have already touched the queueing cacheline; don't bother with
+	 * pending stuff.
+	 *
+	 * p,*,* -> n,*,*
+	 */
+	old = xchg_tail(lock, tail);
+	next = NULL;
+
+	/*
+	 * if there was a previous node; link it and wait until reaching the
+	 * head of the waitqueue.
+	 */
+	if (old & _Q_TAIL_MASK) {
+		prev = decode_tail(old, qnodes);
+
+		/* Link @node into the waitqueue. */
+		WRITE_ONCE(prev->next, node);
+
+		pv_wait_node(node, prev);
+		arch_mcs_spin_lock_contended(&node->locked);
+
+		/*
+		 * While waiting for the MCS lock, the next pointer may have
+		 * been set by another lock waiter. We optimistically load
+		 * the next pointer & prefetch the cacheline for writing
+		 * to reduce latency in the upcoming MCS unlock operation.
+		 */
+		next = READ_ONCE(node->next);
+		if (next)
+			prefetchw(next);
+	}
+
+	/*
+	 * we're at the head of the waitqueue, wait for the owner & pending to
+	 * go away.
+	 *
+	 * *,x,y -> *,0,0
+	 *
+	 * this wait loop must use a load-acquire such that we match the
+	 * store-release that clears the locked bit and create lock
+	 * sequentiality; this is because the set_locked() function below
+	 * does not imply a full barrier.
+	 *
+	 * The PV pv_wait_head_or_lock function, if active, will acquire
+	 * the lock and return a non-zero value. So we have to skip the
+	 * atomic_cond_read_acquire() call. As the next PV queue head hasn't
+	 * been designated yet, there is no way for the locked value to become
+	 * _Q_SLOW_VAL. So both the set_locked() and the
+	 * atomic_cmpxchg_relaxed() calls will be safe.
+	 *
+	 * If PV isn't active, 0 will be returned instead.
+	 *
+	 */
+	if ((val = pv_wait_head_or_lock(lock, node)))
+		goto locked;
+
+	val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK));
+
+locked:
+	/*
+	 * claim the lock:
+	 *
+	 * n,0,0 -> 0,0,1 : lock, uncontended
+	 * *,*,0 -> *,*,1 : lock, contended
+	 *
+	 * If the queue head is the only one in the queue (lock value == tail)
+	 * and nobody is pending, clear the tail code and grab the lock.
+	 * Otherwise, we only need to grab the lock.
+	 */
+
+	/*
+	 * In the PV case we might already have _Q_LOCKED_VAL set, because
+	 * of lock stealing; therefore we must also allow:
+	 *
+	 * n,0,1 -> 0,0,1
+	 *
+	 * Note: at this point: (val & _Q_PENDING_MASK) == 0, because of the
+	 *       above wait condition, therefore any concurrent setting of
+	 *       PENDING will make the uncontended transition fail.
+	 */
+	if ((val & _Q_TAIL_MASK) == tail) {
+		if (atomic_try_cmpxchg_relaxed(&lock->val, &val, _Q_LOCKED_VAL))
+			goto release; /* No contention */
+	}
+
+	/*
+	 * Either somebody is queued behind us or _Q_PENDING_VAL got set
+	 * which will then detect the remaining tail and queue behind us
+	 * ensuring we'll see a @next.
+	 */
+	set_locked(lock);
+
+	/*
+	 * contended path; wait for next if not observed yet, release.
+	 */
+	if (!next)
+		next = smp_cond_load_relaxed(&node->next, (VAL));
+
+	arch_mcs_spin_unlock_contended(&next->locked);
+	pv_kick_node(lock, next);
+
+release:
+	trace_contention_end(lock, 0);
+
+	/*
+	 * release the node
+	 */
+	__this_cpu_dec(qnodes[0].mcs.count);
+}
+EXPORT_SYMBOL(resilient_queued_spin_lock_slowpath);
+
+/*
+ * Generate the paravirt code for resilient_queued_spin_unlock_slowpath().
+ */
+#if !defined(_GEN_PV_LOCK_SLOWPATH) && defined(CONFIG_PARAVIRT_SPINLOCKS)
+#define _GEN_PV_LOCK_SLOWPATH
+
+#undef  pv_enabled
+#define pv_enabled()	true
+
+#undef pv_init_node
+#undef pv_wait_node
+#undef pv_kick_node
+#undef pv_wait_head_or_lock
+
+#undef  resilient_queued_spin_lock_slowpath
+#define resilient_queued_spin_lock_slowpath	__pv_resilient_queued_spin_lock_slowpath
+
+#include "qspinlock_paravirt.h"
+#include "rqspinlock.c"
+
+bool nopvspin;
+static __init int parse_nopvspin(char *arg)
+{
+	nopvspin = true;
+	return 0;
+}
+early_param("nopvspin", parse_nopvspin);
+#endif
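[As a usage sketch only: the fast-path wrapper below is modeled on
queued_spin_lock() from include/asm-generic/qspinlock.h and shows how a
caller would fall back to the renamed slowpath. The wrapper name is
hypothetical; the series introduces the real rqspinlock entry points in
later patches.]

/* Modeled on queued_spin_lock(); "res_spin_lock_sketch" is made up. */
static __always_inline void res_spin_lock_sketch(struct qspinlock *lock)
{
	int val = 0;

	/* Fast path: uncontended 0,0,0 -> 0,0,1 */
	if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL)))
		return;

	/* Contended: val now holds the observed lock word */
	resilient_queued_spin_lock_slowpath(lock, val);
}
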

From patchwork Thu Feb 6 10:54:13 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962858
From: Kumar Kartikeya Dwivedi
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Barret Rhoden, Linus Torvalds, Peter Zijlstra, Will Deacon, Waiman Long,
    Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
    Eduard Zingerman, "Paul E. McKenney", Tejun Heo, Josh Don, Dohyun Kim,
    linux-arm-kernel@lists.infradead.org, kernel-team@meta.com
Subject: [PATCH bpf-next v2 05/26] rqspinlock: Add rqspinlock.h header
Date: Thu, 6 Feb 2025 02:54:13 -0800
Message-ID: <20250206105435.2159977-6-memxor@gmail.com>
In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com>
References: <20250206105435.2159977-1-memxor@gmail.com>

This header contains the public declarations usable in the rest of the
kernel for rqspinlock. Also type alias qspinlock to rqspinlock_t to
ensure consistent use of the new lock type. We want to remove dependence
on the qspinlock type in later patches, as we need to provide a
test-and-set fallback, hence we begin abstracting it away from now on.
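
For illustration, a hypothetical caller outside the locking core would
only see the alias; the example_acquire() wrapper below is not part of
the patch, only the header's type and declaration are:

	#include <asm-generic/rqspinlock.h>

	/*
	 * Hypothetical caller, illustration only: since the header merely
	 * forward-declares struct qspinlock, users handle the lock purely
	 * through rqspinlock_t pointers, so a later switch away from the
	 * qspinlock layout stays invisible at call sites like this one.
	 */
	static void example_acquire(rqspinlock_t *lock, u32 val)
	{
		resilient_queued_spin_lock_slowpath(lock, val);
	}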
Reviewed-by: Barret Rhoden
Signed-off-by: Kumar Kartikeya Dwivedi
---
 include/asm-generic/rqspinlock.h | 19 +++++++++++++++++++
 kernel/locking/rqspinlock.c      |  3 ++-
 2 files changed, 21 insertions(+), 1 deletion(-)
 create mode 100644 include/asm-generic/rqspinlock.h

diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
new file mode 100644
index 000000000000..54860b519571
--- /dev/null
+++ b/include/asm-generic/rqspinlock.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Resilient Queued Spin Lock
+ *
+ * (C) Copyright 2024 Meta Platforms, Inc. and affiliates.
+ *
+ * Authors: Kumar Kartikeya Dwivedi
+ */
+#ifndef __ASM_GENERIC_RQSPINLOCK_H
+#define __ASM_GENERIC_RQSPINLOCK_H
+
+#include
+
+struct qspinlock;
+typedef struct qspinlock rqspinlock_t;
+
+extern void resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val);
+
+#endif /* __ASM_GENERIC_RQSPINLOCK_H */
diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index caaa7c9bbc79..18eb9ef3e908 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -23,6 +23,7 @@
 #include
 #include
 #include
+#include

 /*
  * Include queued spinlock definitions and statistics code
@@ -127,7 +128,7 @@ static __always_inline u32 __pv_wait_head_or_lock(struct qspinlock *lock,
  * contended : (*,x,y) +--> (*,0,0) ---> (*,0,1) -' :
  * queue     :         ^--'                         :
  */
-void __lockfunc resilient_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
+void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
 {
 	struct mcs_spinlock *prev, *next, *node;
 	u32 old, tail;

From patchwork Thu Feb 6 10:54:14 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962838
McKenney" , Tejun Heo , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 06/26] rqspinlock: Drop PV and virtualization support Date: Thu, 6 Feb 2025 02:54:14 -0800 Message-ID: <20250206105435.2159977-7-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=6325; h=from:subject; bh=r7k8qGdKN/3/qwxHoOfh+ZQmucFzyerAvWUxzRBa3Nw=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRkILe/Rfrs9nCkS/jAFdrxKRcYv7G3/iAaCR9O 5XUAiaGJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZAAKCRBM4MiGSL8Ryt6cD/ 9T4mqw3kZqs2TP+tHKQuzgqUq8fvSH8yl7t36bQGme497vOAOYDUHMeNE4NNyj7xPtcdwP+75JT6jQ wb7DTCpWkXZuXyGsVYOqkRCTrIpl86UhY4KJl7kpw0Fu/l4dVh2eGgHzXhYTOAo3UfEW3sblt0q+J+ HtBSelQEJJ2OiEIXdozIXjKe1DouxA6jfr9ixQF5KRP3O4K0H2jeTFQw6ruH2RXH0V+42ZwyL/q4Sb j0fby9n8kP9ZDxMNUNWPVhnFWuMwb3b9rwsQZME31GLbIEi/IEz54iXXAjlwVeS8CtY3ZuiDqfn4LC 8vkJ/5biap6lg9ReRc9H8WCmVZuC1O18jeYfRVCk9BRgjFDmynobDC5PWqUEaxs/4weWUTipElhtHA 7rTqLuMsOilmIdBqGSY9cpHXiMj/9tfMraqZKztusV0dkACiFyAXpSgpqErFXM4z8J7b6tx0wu+gza wtTNXet8pwRgRUHVnbyGX81YgMCL5AQXwiOot10sXBM00IBvoKOrdjO2422YWhUOHy6sF6R5id4V9D gM4HflpQhz5PAIg3z56BGX8bWfOlfbWkA6/HEA5p+yyTD21iW/w4wyZuPUEWaHIhxlVrp81m7Ajv+d 8KZJHw5wvIc5BbaAV1NGrm4shlGp+OwQG7OPQO17A7Vq6h8oz8Q8NJUJBu4w== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_025446_767540_4DCCCA3E X-CRM114-Status: GOOD ( 18.53 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Changes to rqspinlock in subsequent commits will be algorithmic modifications, which won't remain in agreement with the implementations of paravirt spin lock and virt_spin_lock support. These future changes include measures for terminating waiting loops in slow path after a certain point. While using a fair lock like qspinlock directly inside virtual machines leads to suboptimal performance under certain conditions, we cannot use the existing virtualization support before we make it resilient as well. Therefore, drop it for now. Reviewed-by: Barret Rhoden Signed-off-by: Kumar Kartikeya Dwivedi --- kernel/locking/rqspinlock.c | 89 ------------------------------------- 1 file changed, 89 deletions(-) diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c index 18eb9ef3e908..52db60cd9691 100644 --- a/kernel/locking/rqspinlock.c +++ b/kernel/locking/rqspinlock.c @@ -11,8 +11,6 @@ * Peter Zijlstra */ -#ifndef _GEN_PV_LOCK_SLOWPATH - #include #include #include @@ -75,38 +73,9 @@ * contexts: task, softirq, hardirq, nmi. * * Exactly fits one 64-byte cacheline on a 64-bit architecture. - * - * PV doubles the storage and uses the second cacheline for PV state. */ static DEFINE_PER_CPU_ALIGNED(struct qnode, qnodes[_Q_MAX_NODES]); -/* - * Generate the native code for resilient_queued_spin_unlock_slowpath(); provide NOPs - * for all the PV callbacks. 
- */
-
-static __always_inline void __pv_init_node(struct mcs_spinlock *node) { }
-static __always_inline void __pv_wait_node(struct mcs_spinlock *node,
-					   struct mcs_spinlock *prev) { }
-static __always_inline void __pv_kick_node(struct qspinlock *lock,
-					   struct mcs_spinlock *node) { }
-static __always_inline u32 __pv_wait_head_or_lock(struct qspinlock *lock,
-						  struct mcs_spinlock *node)
-						  { return 0; }
-
-#define pv_enabled()		false
-
-#define pv_init_node		__pv_init_node
-#define pv_wait_node		__pv_wait_node
-#define pv_kick_node		__pv_kick_node
-#define pv_wait_head_or_lock	__pv_wait_head_or_lock
-
-#ifdef CONFIG_PARAVIRT_SPINLOCKS
-#define resilient_queued_spin_lock_slowpath	native_resilient_queued_spin_lock_slowpath
-#endif
-
-#endif /* _GEN_PV_LOCK_SLOWPATH */
-
 /**
  * resilient_queued_spin_lock_slowpath - acquire the queued spinlock
  * @lock: Pointer to queued spinlock structure
@@ -136,12 +105,6 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)

 	BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));

-	if (pv_enabled())
-		goto pv_queue;
-
-	if (virt_spin_lock(lock))
-		return;
-
 	/*
 	 * Wait for in-progress pending->locked hand-overs with a bounded
 	 * number of spins so that we guarantee forward progress.
@@ -212,7 +175,6 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
  */
 queue:
 	lockevent_inc(lock_slowpath);
-pv_queue:
 	node = this_cpu_ptr(&qnodes[0].mcs);
 	idx = node->count++;
 	tail = encode_tail(smp_processor_id(), idx);
@@ -251,7 +213,6 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)

 	node->locked = 0;
 	node->next = NULL;
-	pv_init_node(node);

 	/*
 	 * We touched a (possibly) cold cacheline in the per-cpu queue node;
@@ -288,7 +249,6 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)

 	/* Link @node into the waitqueue. */
 	WRITE_ONCE(prev->next, node);
-	pv_wait_node(node, prev);
 	arch_mcs_spin_lock_contended(&node->locked);

 	/*
@@ -312,23 +272,9 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
 	 * store-release that clears the locked bit and create lock
 	 * sequentiality; this is because the set_locked() function below
 	 * does not imply a full barrier.
-	 *
-	 * The PV pv_wait_head_or_lock function, if active, will acquire
-	 * the lock and return a non-zero value. So we have to skip the
-	 * atomic_cond_read_acquire() call. As the next PV queue head hasn't
-	 * been designated yet, there is no way for the locked value to become
-	 * _Q_SLOW_VAL. So both the set_locked() and the
-	 * atomic_cmpxchg_relaxed() calls will be safe.
-	 *
-	 * If PV isn't active, 0 will be returned instead.
-	 *
 	 */
-	if ((val = pv_wait_head_or_lock(lock, node)))
-		goto locked;
-
 	val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK));

-locked:
 	/*
 	 * claim the lock:
 	 *
@@ -341,11 +287,6 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
 	 */

 	/*
-	 * In the PV case we might already have _Q_LOCKED_VAL set, because
-	 * of lock stealing; therefore we must also allow:
-	 *
-	 * n,0,1 -> 0,0,1
-	 *
 	 * Note: at this point: (val & _Q_PENDING_MASK) == 0, because of the
 	 * above wait condition, therefore any concurrent setting of
 	 * PENDING will make the uncontended transition fail.
@@ -369,7 +310,6 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
 		next = smp_cond_load_relaxed(&node->next, (VAL));

 	arch_mcs_spin_unlock_contended(&next->locked);
-	pv_kick_node(lock, next);

 release:
 	trace_contention_end(lock, 0);
@@ -380,32 +320,3 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
 	__this_cpu_dec(qnodes[0].mcs.count);
 }
 EXPORT_SYMBOL(resilient_queued_spin_lock_slowpath);
-
-/*
- * Generate the paravirt code for resilient_queued_spin_unlock_slowpath().
- */
-#if !defined(_GEN_PV_LOCK_SLOWPATH) && defined(CONFIG_PARAVIRT_SPINLOCKS)
-#define _GEN_PV_LOCK_SLOWPATH
-
-#undef pv_enabled
-#define pv_enabled()	true
-
-#undef pv_init_node
-#undef pv_wait_node
-#undef pv_kick_node
-#undef pv_wait_head_or_lock
-
-#undef resilient_queued_spin_lock_slowpath
-#define resilient_queued_spin_lock_slowpath	__pv_resilient_queued_spin_lock_slowpath
-
-#include "qspinlock_paravirt.h"
-#include "rqspinlock.c"
-
-bool nopvspin;
-static __init int parse_nopvspin(char *arg)
-{
-	nopvspin = true;
-	return 0;
-}
-early_param("nopvspin", parse_nopvspin);
-#endif

From patchwork Thu Feb 6 10:54:15 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962859
McKenney" , Tejun Heo , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 07/26] rqspinlock: Add support for timeouts Date: Thu, 6 Feb 2025 02:54:15 -0800 Message-ID: <20250206105435.2159977-8-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=5055; h=from:subject; bh=+CI7RDk98Onq/s+jrg9zfcbapKjfoHyX6Gf/cOAFD9E=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRko56eerusglxFABuTrjI0BeTNdWet/EFfKlxx n6KicXiJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZAAKCRBM4MiGSL8RypHfD/ 9HHaEjSUQc5CtopQiXE6c0Vwac7KpTMW4wwJycT3bCu1KquGtKAhLC2KFqVX2wHK4bl/OLK9vxVSmT Y9HJRDo6MOTwwlMcflPNZ3sTdgVaOaqei2zDvi0fcuAdVHriJxcP1KZX+BOtNaX4oIBqMixlOnWT8n zk9/HR/PwfXvbdewufYQssneftjxoWOvuQQeFEhZlkhohK0wkeQGgYZ6x6dpnN/VU3PpQQVTbRQFDI q2nwG3DsniPReJ/kfk+KKkvpMIXwuCUQCswIjrMd6X+f8Gbx4HkDaoCK0jNJmLXz7x/qnXM7efk5aJ eNUrFK4ZNtavQLPKEs7/u1ksDubcrsL+PcEjSLPmM7S83+t6sEMMKte4jA9A3EhlIe3pOMdshKIwMZ ox0u8m96KQo3KwGtgfkm1xYVf6D/WMtXXhPzrlPQ9no39v7Bzx0BPhb1CwvoxQHGMvECHciFWXf1E1 lwy9QUGhK6aq/O7Wbf9dLtTyqUL22WhmoHXboRhy8E1Q5+h+jywgW08oMQ80fekxqIfrp7xeQuJxbs 6tUwh0BODBW6hRLhfkjt3Qu89gY3T/J/7YxiR5wbSV5A47TQLxdDH3gvALoDmdy9OiVYWlvfv6LgUD KhAG4jecaMw2SpR61+5VnBT3O5WvSYhyuAOcyVQLr68OWpdNT5LRRVSCKKDg== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_025448_063613_68D702EB X-CRM114-Status: GOOD ( 25.45 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Introduce policy macro RES_CHECK_TIMEOUT which can be used to detect when the timeout has expired for the slow path to return an error. It depends on being passed two variables initialized to 0: ts, ret. The 'ts' parameter is of type rqspinlock_timeout. This macro resolves to the (ret) expression so that it can be used in statements like smp_cond_load_acquire to break the waiting loop condition. The 'spin' member is used to amortize the cost of checking time by dispatching to the implementation every 64k iterations. The 'timeout_end' member is used to keep track of the timestamp that denotes the end of the waiting period. The 'ret' parameter denotes the status of the timeout, and can be checked in the slow path to detect timeouts after waiting loops. The 'duration' member is used to store the timeout duration for each waiting loop, that is passed down from the caller of the slow path function. Use the RES_INIT_TIMEOUT macro to initialize it. The default timeout value defined in the header (RES_DEF_TIMEOUT) is 0.5 seconds. This macro will be used as a condition for waiting loops in the slow path. Since each waiting loop applies a fresh timeout using the same rqspinlock_timeout, we add a new RES_RESET_TIMEOUT as well to ensure the values can be easily reinitialized to the default state. 
Reviewed-by: Barret Rhoden
Signed-off-by: Kumar Kartikeya Dwivedi
---
 include/asm-generic/rqspinlock.h |  8 +++++-
 kernel/locking/rqspinlock.c      | 46 +++++++++++++++++++++++++++++++-
 2 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index 54860b519571..c89733cbe643 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -10,10 +10,16 @@
 #define __ASM_GENERIC_RQSPINLOCK_H

 #include
+#include

 struct qspinlock;
 typedef struct qspinlock rqspinlock_t;

-extern void resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val);
+/*
+ * Default timeout for waiting loops is 0.5 seconds
+ */
+#define RES_DEF_TIMEOUT (NSEC_PER_SEC / 2)
+
+extern void resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout);

 #endif /* __ASM_GENERIC_RQSPINLOCK_H */
diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index 52db60cd9691..200454e9c636 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -6,9 +6,11 @@
 * (C) Copyright 2013-2014,2018 Red Hat, Inc.
 * (C) Copyright 2015 Intel Corp.
 * (C) Copyright 2015 Hewlett-Packard Enterprise Development LP
+ * (C) Copyright 2024 Meta Platforms, Inc. and affiliates.
 *
 * Authors: Waiman Long
 *          Peter Zijlstra
+ *          Kumar Kartikeya Dwivedi
 */

 #include
@@ -22,6 +24,7 @@
 #include
 #include
 #include
+#include

 /*
  * Include queued spinlock definitions and statistics code
@@ -68,6 +71,44 @@

 #include "mcs_spinlock.h"

+struct rqspinlock_timeout {
+	u64 timeout_end;
+	u64 duration;
+	u16 spin;
+};
+
+static noinline int check_timeout(struct rqspinlock_timeout *ts)
+{
+	u64 time = ktime_get_mono_fast_ns();
+
+	if (!ts->timeout_end) {
+		ts->timeout_end = time + ts->duration;
+		return 0;
+	}
+
+	if (time > ts->timeout_end)
+		return -ETIMEDOUT;
+
+	return 0;
+}
+
+#define RES_CHECK_TIMEOUT(ts, ret)                    \
+	({                                            \
+		if (!(ts).spin++)                     \
+			(ret) = check_timeout(&(ts)); \
+		(ret);                                \
+	})
+
+/*
+ * Initialize the 'duration' member with the chosen timeout.
+ */
+#define RES_INIT_TIMEOUT(ts, _timeout) ({ (ts).spin = 1; (ts).duration = _timeout; })
+
+/*
+ * We only need to reset 'timeout_end', 'spin' will just wrap around as necessary.
+ */
+#define RES_RESET_TIMEOUT(ts) ({ (ts).timeout_end = 0; })
+
 /*
  * Per-CPU queue node structures; we can never have more than 4 nested
  * contexts: task, softirq, hardirq, nmi.
@@ -97,14 +138,17 @@ static DEFINE_PER_CPU_ALIGNED(struct qnode, qnodes[_Q_MAX_NODES]);
  * contended : (*,x,y) +--> (*,0,0) ---> (*,0,1) -' :
  * queue     :         ^--'                         :
  */
-void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
+void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout)
 {
 	struct mcs_spinlock *prev, *next, *node;
+	struct rqspinlock_timeout ts;
 	u32 old, tail;
 	int idx;

 	BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));

+	RES_INIT_TIMEOUT(ts, timeout);
+
 	/*
 	 * Wait for in-progress pending->locked hand-overs with a bounded
 	 * number of spins so that we guarantee forward progress.

From patchwork Thu Feb 6 10:54:16 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962861
McKenney" , Tejun Heo , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 08/26] rqspinlock: Protect pending bit owners from stalls Date: Thu, 6 Feb 2025 02:54:16 -0800 Message-ID: <20250206105435.2159977-9-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4602; h=from:subject; bh=PZAstj2WMwdJ5zpDKugp3FElz+LV+4cvINkXq1LfzVc=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRlQkqPybkJ7GSLq/QqBmQtgES+UrYcxxOebbfU QQitnoeJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZQAKCRBM4MiGSL8RyrK9D/ 9TAnlSHxA1Q3uFCYcCdMIawjW1IJgTMIKG08AH5HflcEBfh12ex5nIURCNNtVmT8KNdznwDtmSyf9F Dz9E3Hb1omPeX7fZXLE7IFq92YLTS8ZhF954tYzs/ccHTnFSiRQUEYaPNN5BjAreKu7VsuNJTWWwh2 Zk18PTGBgcCoITYY2U9kPpt8H0ss5SjjE/v472ug3Sxr+ikSfRFCNfqHQlM07gdJeffwM2YhPLmyrA pS/ZOZBqjyv6FJgGZvfHKwzMoadC1/BvbPU7eQpAJasZGBSDhGpgWqGWPnrkd1Xx+Qpo5SaJTQv2Bf FVE3fsLPC8zh8SYzTkjlRiUuMQDkkUeC71FmooULzLbfpkqhhr5IlQIpcVaKafW3bGY7gkNkxgbUcR ZjGTHp06O6I30+WbvFpEoQXIBKla/RBxSi2qHXXdZ5xlHp/f0BpJfujOwZr5R+QHWNX4h8hvTigyRu JMiZjgK3a/DfYdZdkC7+3+Y4IWzCH5XFsthPvV6ZgI+HUdIaJu6gLuw1LHLahy0sUtSdl3j3nbkdnN nk5039QdF/msE7FiR8sEyIhyL9JjjIcbswOwvAx+cPa7i/XoU9RVLyZ+toJUpPQsXNfR7MRq1YEGFC OwvNNjfgANI5dQmuZkKEQfmCLhErAm7WdD82gJqews3JIUZe0bapo/LeUq6A== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_105450_176740_6C3191BF X-CRM114-Status: GOOD ( 19.62 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The pending bit is used to avoid queueing in case the lock is uncontended, and has demonstrated benefits for the 2 contender scenario, esp. on x86. In case the pending bit is acquired and we wait for the locked bit to disappear, we may get stuck due to the lock owner not making progress. Hence, this waiting loop must be protected with a timeout check. To perform a graceful recovery once we decide to abort our lock acquisition attempt in this case, we must unset the pending bit since we own it. All waiters undoing their changes and exiting gracefully allows the lock word to be restored to the unlocked state once all participants (owner, waiters) have been recovered, and the lock remains usable. Hence, set the pending bit back to zero before returning to the caller. Introduce a lockevent (rqspinlock_lock_timeout) to capture timeout event statistics. 
Reviewed-by: Barret Rhoden
Signed-off-by: Kumar Kartikeya Dwivedi
---
 include/asm-generic/rqspinlock.h  |  2 +-
 kernel/locking/lock_events_list.h |  5 +++++
 kernel/locking/rqspinlock.c       | 28 +++++++++++++++++++++++-----
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index c89733cbe643..0981162c8ac7 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -20,6 +20,6 @@ typedef struct qspinlock rqspinlock_t;
  */
 #define RES_DEF_TIMEOUT (NSEC_PER_SEC / 2)

-extern void resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout);
+extern int resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout);

 #endif /* __ASM_GENERIC_RQSPINLOCK_H */
diff --git a/kernel/locking/lock_events_list.h b/kernel/locking/lock_events_list.h
index 97fb6f3f840a..c5286249994d 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -49,6 +49,11 @@ LOCK_EVENT(lock_use_node4)	/* # of locking ops that use 4th percpu node */
 LOCK_EVENT(lock_no_node)	/* # of locking ops w/o using percpu node */
 #endif /* CONFIG_QUEUED_SPINLOCKS */

+/*
+ * Locking events for Resilient Queued Spin Lock
+ */
+LOCK_EVENT(rqspinlock_lock_timeout)	/* # of locking ops that timeout */
+
 /*
  * Locking events for rwsem
  */
diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index 200454e9c636..8e512feb37ce 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -138,12 +138,12 @@ static DEFINE_PER_CPU_ALIGNED(struct qnode, qnodes[_Q_MAX_NODES]);
  * contended : (*,x,y) +--> (*,0,0) ---> (*,0,1) -' :
  * queue     :         ^--'                         :
  */
-void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout)
+int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout)
 {
 	struct mcs_spinlock *prev, *next, *node;
 	struct rqspinlock_timeout ts;
+	int idx, ret = 0;
 	u32 old, tail;
-	int idx;

 	BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));

@@ -201,8 +201,25 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 * clear_pending_set_locked() implementations imply full
 	 * barriers.
 	 */
-	if (val & _Q_LOCKED_MASK)
-		smp_cond_load_acquire(&lock->locked, !VAL);
+	if (val & _Q_LOCKED_MASK) {
+		RES_RESET_TIMEOUT(ts);
+		smp_cond_load_acquire(&lock->locked, !VAL || RES_CHECK_TIMEOUT(ts, ret));
+	}
+
+	if (ret) {
+		/*
+		 * We waited for the locked bit to go back to 0, as the pending
+		 * waiter, but timed out. We need to clear the pending bit since
+		 * we own it. Once a stuck owner has been recovered, the lock
+		 * must be restored to a valid state, hence removing the pending
+		 * bit is necessary.
+		 *
+		 * *,1,* -> *,0,*
+		 */
+		clear_pending(lock);
+		lockevent_inc(rqspinlock_lock_timeout);
+		return ret;
+	}

 	/*
 	 * take ownership and clear the pending bit.
@@ -211,7 +228,7 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 */
 	clear_pending_set_locked(lock);
 	lockevent_inc(lock_pending);
-	return;
+	return 0;

 	/*
 	 * End of pending bit optimistic spinning and beginning of MCS
@@ -362,5 +379,6 @@ void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 * release the node
 	 */
 	__this_cpu_dec(qnodes[0].mcs.count);
+	return 0;
 }
 EXPORT_SYMBOL(resilient_queued_spin_lock_slowpath);

From patchwork Thu Feb 6 10:54:17 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962860
From: Kumar Kartikeya Dwivedi
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Barret Rhoden, Linus Torvalds, Peter Zijlstra, Will Deacon, Waiman Long,
    Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
    Eduard Zingerman, "Paul E. McKenney", Tejun Heo, Josh Don, Dohyun Kim,
    linux-arm-kernel@lists.infradead.org, kernel-team@meta.com
Subject: [PATCH bpf-next v2 09/26] rqspinlock: Protect waiters in queue from stalls
Date: Thu, 6 Feb 2025 02:54:17 -0800
Message-ID: <20250206105435.2159977-10-memxor@gmail.com>
In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com>
References: <20250206105435.2159977-1-memxor@gmail.com>

Implement the wait queue cleanup algorithm for rqspinlock. There are
three forms of waiters in the original queued spin lock algorithm: the
first is the waiter which acquires the pending bit and spins on the lock
word without forming a wait queue; the second is the head waiter, which
is the first waiter heading the wait queue; the third form is all the
non-head waiters queued behind the head, waiting to be signalled through
their MCS node to take over the responsibility of the head. In this
commit, we are concerned with the second and third kind.

First, we augment the waiting loop of the head of the wait queue with a
timeout. When this timeout happens, all waiters that are part of the
wait queue abort their lock acquisition attempts. This happens in three
steps. First, the head breaks out of its loop waiting for the pending
and locked bits to turn to 0, and non-head waiters break out of their
MCS node spin (more on that later). Next, every waiter (head or
non-head) checks whether it is also the tail waiter, in which case it
attempts to zero out the tail word, allowing a new queue to be built up
for this lock. If it succeeds, it has no one to signal next in the queue
to stop spinning. Otherwise, it signals the MCS node of the next waiter
to break out of its spin and try resetting the tail word back to 0. This
goes on until the tail waiter is found. In case of races, the new tail
will be responsible for performing the same task, as the old tail will
then fail to reset the tail word and wait for its next pointer to be
updated before it signals the new tail to do the same. Lastly, all of
these waiters release the rqnode and return to the caller.

This patch underscores the point that rqspinlock's timeout does not
apply to each waiter individually, and cannot be relied upon as an upper
bound. It is possible for the rqspinlock waiters to return early from a
failed lock acquisition attempt as soon as stalls are detected.

The head waiter cannot directly WRITE_ONCE the tail to zero, as it may
race with a concurrent xchg and a non-head waiter linking its MCS node
to the head's MCS node through 'prev->next' assignment.

Reviewed-by: Barret Rhoden
Signed-off-by: Kumar Kartikeya Dwivedi
---
 kernel/locking/rqspinlock.c | 42 +++++++++++++++++++++++++++++---
 kernel/locking/rqspinlock.h | 48 +++++++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+), 3 deletions(-)
 create mode 100644 kernel/locking/rqspinlock.h

diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index 8e512feb37ce..fdc20157d0c9 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -77,6 +77,8 @@ struct rqspinlock_timeout {
 	u16 spin;
 };

+#define RES_TIMEOUT_VAL	2
+
 static noinline int check_timeout(struct rqspinlock_timeout *ts)
 {
 	u64 time = ktime_get_mono_fast_ns();
@@ -305,12 +307,18 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 * head of the waitqueue.
 	 */
 	if (old & _Q_TAIL_MASK) {
+		int val;
+
 		prev = decode_tail(old, qnodes);

 		/* Link @node into the waitqueue. */
 		WRITE_ONCE(prev->next, node);

-		arch_mcs_spin_lock_contended(&node->locked);
+		val = arch_mcs_spin_lock_contended(&node->locked);
+		if (val == RES_TIMEOUT_VAL) {
+			ret = -EDEADLK;
+			goto waitq_timeout;
+		}

 		/*
 		 * While waiting for the MCS lock, the next pointer may have
@@ -334,7 +342,35 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 * sequentiality; this is because the set_locked() function below
 	 * does not imply a full barrier.
 	 */
-	val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK));
+	RES_RESET_TIMEOUT(ts);
+	val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK) ||
+				       RES_CHECK_TIMEOUT(ts, ret));
+
+waitq_timeout:
+	if (ret) {
+		/*
+		 * If the tail is still pointing to us, then we are the final waiter,
+		 * and are responsible for resetting the tail back to 0. Otherwise, if
+		 * the cmpxchg operation fails, we signal the next waiter to take exit
+		 * and try the same. For a waiter with tail node 'n':
+		 *
+		 * n,*,* -> 0,*,*
+		 *
+		 * When performing cmpxchg for the whole word (NR_CPUS > 16k), it is
+		 * possible locked/pending bits keep changing and we see failures even
+		 * when we remain the head of wait queue. However, eventually,
+		 * pending bit owner will unset the pending bit, and new waiters
+		 * will queue behind us. This will leave the lock owner in
+		 * charge, and it will eventually either set locked bit to 0, or
+		 * leave it as 1, allowing us to make progress.
+		 */
+		if (!try_cmpxchg_tail(lock, tail, 0)) {
+			next = smp_cond_load_relaxed(&node->next, VAL);
+			WRITE_ONCE(next->locked, RES_TIMEOUT_VAL);
+		}
+		lockevent_inc(rqspinlock_lock_timeout);
+		goto release;
+	}

 	/*
 	 * claim the lock:
@@ -379,6 +415,6 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 * release the node
 	 */
 	__this_cpu_dec(qnodes[0].mcs.count);
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(resilient_queued_spin_lock_slowpath);
diff --git a/kernel/locking/rqspinlock.h b/kernel/locking/rqspinlock.h
new file mode 100644
index 000000000000..3cec3a0f2d7e
--- /dev/null
+++ b/kernel/locking/rqspinlock.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Resilient Queued Spin Lock defines
+ *
+ * (C) Copyright 2024 Meta Platforms, Inc. and affiliates.
+ *
+ * Authors: Kumar Kartikeya Dwivedi
+ */
+#ifndef __LINUX_RQSPINLOCK_H
+#define __LINUX_RQSPINLOCK_H
+
+#include "qspinlock.h"
+
+/*
+ * try_cmpxchg_tail - Return result of cmpxchg of tail word with a new value
+ * @lock: Pointer to queued spinlock structure
+ * @tail: The tail to compare against
+ * @new_tail: The new queue tail code word
+ * Return: Bool to indicate whether the cmpxchg operation succeeded
+ *
+ * This is used by the head of the wait queue to clean up the queue.
+ * Provides relaxed ordering, since observers only rely on initialized
+ * state of the node which was made visible through the xchg_tail operation,
+ * i.e. through the smp_wmb preceding xchg_tail.
+ *
+ * We avoid using 16-bit cmpxchg, which is not available on all architectures.
+ */
+static __always_inline bool try_cmpxchg_tail(struct qspinlock *lock, u32 tail, u32 new_tail)
+{
+	u32 old, new;
+
+	old = atomic_read(&lock->val);
+	do {
+		/*
+		 * Is the tail part we compare to already stale? Fail.
+		 */
+		if ((old & _Q_TAIL_MASK) != tail)
+			return false;
+		/*
+		 * Encode latest locked/pending state for new tail.
+		 */
+		new = (old & _Q_LOCKED_PENDING_MASK) | new_tail;
+	} while (!atomic_try_cmpxchg_relaxed(&lock->val, &old, new));
+
+	return true;
+}
+
+#endif /* __LINUX_RQSPINLOCK_H */
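
To see how the cleanup cascades through the queue, consider three queued
waiters A (head) <- B <- C with the tail encoding C; a worked sketch of
the sequence when A's head wait times out (the waiter names are
illustrative):

	/*
	 * 1. A times out in atomic_cond_read_acquire() and takes the
	 *    waitq_timeout path with ret set.
	 * 2. A's try_cmpxchg_tail(lock, tail_A, 0) fails (the tail encodes
	 *    C), so A loads its next pointer (B), stores RES_TIMEOUT_VAL
	 *    into B's MCS node, releases its qnode, and returns an error.
	 * 3. B's arch_mcs_spin_lock_contended() returns RES_TIMEOUT_VAL, so
	 *    B takes the same waitq_timeout path and signals C in turn.
	 * 4. C's try_cmpxchg_tail(lock, tail_C, 0) succeeds: the tail is
	 *    cleared and a fresh queue can form behind the current owner.
	 */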
From patchwork Thu Feb 6 10:54:18 2025
McKenney" , Tejun Heo , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 10/26] rqspinlock: Protect waiters in trylock fallback from stalls Date: Thu, 6 Feb 2025 02:54:18 -0800 Message-ID: <20250206105435.2159977-11-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=1512; h=from:subject; bh=jiFbzaZcn62jcwbclsn6fFlBfAVbcfgq/5K/rfr88pM=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRlAFWO9LNwjO/6CG81gcHcw9dLmIxeb4fj+/Fr WCxwUJyJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZQAKCRBM4MiGSL8RylLCEA DDo4P5cRJ8lPeqvkLpxQQ1B7QXy+KIgoUK7g8esRrYEGzu3/eXsNpmYAoOvt3pEFrK41Z912PBSwiZ i494ekqsWp8O2NUu2LAySNMiQoLle49dVCJVCS6XExiZjShBBUkEtEuCd6nUmFQI75F1VAg1iHmFJB Cfib9a+3IucRelAmkgBnQ/O2d/fQo0v+SD0awcgBflP6r/Lt63AmQA4aZ4sFZ6racNxenALC+V6MfK eW1pM/lb2MDwv9GUrhhA/tvKIXCGXkR+t6Ujo9S300ORDUjNyeubMZtLOCzHkmeb6HyI/laIez2iWD Ek0KD+qMhLwDY6ZzHwu0bVuubuavObYEZQjiTGON1QtULb4JSDF56lM2C2rOn45iYri4ihN0gLv3Zs 2ViXlGn24tXGV6kxa8MyOwWHOShlj4BRY0MGSol1jU2UZbp2ztn5iujPHmtcZDFMbAAe8HF8eaAnJp EhjV0kpIyUMml54A8fDiGMgjt5jI0ciIUDnEYtzcDCC56XaCPRwRRYluCIYOk1O6sczyszhNNwEWsM +oYoNqL47WMzOJThezb9dy0n2ARWnKz5fdrzN1fZIqArDB6QziEdYH90bwROd3jhypRSPWfKyYUWg7 y0WljjCeSXbVoHZmX0zVKWnH/l+jCp7pdBirkN9WK6Cr5eBJYTtvKNDwUP/A== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_105452_183306_7F77578C X-CRM114-Status: GOOD ( 13.82 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org When we run out of maximum rqnodes, the original queued spin lock slow path falls back to a try lock. In such a case, we are again susceptible to stalls in case the lock owner fails to make progress. We use the timeout as a fallback to break out of this loop and return to the caller. This is a fallback for an extreme edge case, when on the same CPU we run out of all 4 qnodes. When could this happen? We are in slow path in task context, we get interrupted by an IRQ, which while in the slow path gets interrupted by an NMI, whcih in the slow path gets another nested NMI, which enters the slow path. All of the interruptions happen after node->count++. 
Reviewed-by: Barret Rhoden
Signed-off-by: Kumar Kartikeya Dwivedi
---
 kernel/locking/rqspinlock.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index fdc20157d0c9..df7adec59cec 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -255,8 +255,14 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 */
 	if (unlikely(idx >= _Q_MAX_NODES)) {
 		lockevent_inc(lock_no_node);
-		while (!queued_spin_trylock(lock))
+		RES_RESET_TIMEOUT(ts);
+		while (!queued_spin_trylock(lock)) {
+			if (RES_CHECK_TIMEOUT(ts, ret)) {
+				lockevent_inc(rqspinlock_lock_timeout);
+				break;
+			}
 			cpu_relax();
+		}
 		goto release;
 	}
From patchwork Thu Feb 6 10:54:19 2025
McKenney" , Tejun Heo , Barret Rhoden , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 11/26] rqspinlock: Add deadlock detection and recovery Date: Thu, 6 Feb 2025 02:54:19 -0800 Message-ID: <20250206105435.2159977-12-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=15389; h=from:subject; bh=msNcOGVmrnAfRu3MV0az2zqDUq7IHfqHVuBu+GMrk4A=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRlbGObmmTmVt1u04gEed3cMy7FmALO/uAVks+a TeeoBgaJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZQAKCRBM4MiGSL8RyiK0EA CnsJbt64sQX0kL016RYhlMZNffg/GcHvfv3Z9oFIBHQcHLm5bLwHfs4H4NTMyGbVhYbcOgTPdLdE2D DthUBlKJzI3++9fzS21vMoTArLRc+cWJMJlPcldmQQN9GEWs+b/JWYeCkyYbCSaTMqfXWCi/2rknpd IiM1cB9cu/wfEfje5N5LFnbXnCHoMreNOieQzxB0QIh/i86hmD+ykd3UqiUv760Z68PHozD3TfRpuT a7PXTCg4c8cYvwNs7CUHE2MF8SKHu2yYK7VA64+d6gkAsPGSctiNYcK9x0UYRHp/PZlfIbUrKsJWYW tTrmTMXAB0cNc3BBtMolkvMRTrXuMvMxootdKJZ810qc6RYfElFL7pzLsXWo8oB4OGyn3Gau3po3VG AJYUxKpTKF4k2WW+F+TbGolt6W/7v1jo2OXUzoMs/LFeFEB5XB16UoTp4po/kxAuX3GqLvlMtWsZcQ SG7GVcmsGx9lvto9F6yVjMvevUytjuT4u6A0cwgTSC4pDTlJXyDoN7VwX0iuTqnKddyh943RII4MuK d+rkjWBQWNMkwlqTLUJ4UxOusNS50rJNbWBpHmlApS0Stlf0zCOKqKcIVM+XepT1nadbnyrUe/Bhy8 FhajBZBdwockQWCjCt2Z1Mi6wiwkcY6rTQvd6J62f7BWXOkVhH+60sQr9qyQ== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_025454_119219_DF587B4A X-CRM114-Status: GOOD ( 39.39 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org While the timeout logic provides guarantees for the waiter's forward progress, the time until a stalling waiter unblocks can still be long. The default timeout of 1/2 sec can be excessively long for some use cases. Additionally, custom timeouts may exacerbate recovery time. Introduce logic to detect common cases of deadlocks and perform quicker recovery. This is done by dividing the time from entry into the locking slow path until the timeout into intervals of 1 ms. Then, after each interval elapses, deadlock detection is performed, while also polling the lock word to ensure we can quickly break out of the detection logic and proceed with lock acquisition. A 'held_locks' table is maintained per-CPU where the entry at the bottom denotes a lock being waited for or already taken. Entries coming before it denote locks that are already held. The current CPU's table can thus be looked at to detect AA deadlocks. The tables from other CPUs can be looked at to discover ABBA situations. Finally, when a matching entry for the lock being taken on the current CPU is found on some other CPU, a deadlock situation is detected. This function can take a long time, therefore the lock word is constantly polled in each loop iteration to ensure we can preempt detection and proceed with lock acquisition, using the is_lock_released check. We set 'spin' member of rqspinlock_timeout struct to 0 to trigger deadlock checks immediately to perform faster recovery. Note: Extending lock word size by 4 bytes to record owner CPU can allow faster detection for ABBA. 
Signed-off-by: Kumar Kartikeya Dwivedi
---
 include/asm-generic/rqspinlock.h |  83 +++++++++++++-
 kernel/locking/rqspinlock.c      | 183 ++++++++++++++++++++++++++++---
 2 files changed, 252 insertions(+), 14 deletions(-)

diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index 0981162c8ac7..c1dbd25287a1 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -11,15 +11,96 @@
 #include
 #include
+#include

 struct qspinlock;
 typedef struct qspinlock rqspinlock_t;

+extern int resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout);
+
 /*
  * Default timeout for waiting loops is 0.5 seconds
  */
 #define RES_DEF_TIMEOUT (NSEC_PER_SEC / 2)

-extern int resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout);
+#define RES_NR_HELD 32
+
+struct rqspinlock_held {
+	int cnt;
+	void *locks[RES_NR_HELD];
+};
+
+DECLARE_PER_CPU_ALIGNED(struct rqspinlock_held, rqspinlock_held_locks);
+
+static __always_inline void grab_held_lock_entry(void *lock)
+{
+	int cnt = this_cpu_inc_return(rqspinlock_held_locks.cnt);
+
+	if (unlikely(cnt > RES_NR_HELD)) {
+		/* Still keep the inc so we decrement later. */
+		return;
+	}
+
+	/*
+	 * Implied compiler barrier in per-CPU operations; otherwise we can have
+	 * the compiler reorder inc with write to table, allowing interrupts to
+	 * overwrite and erase our write to the table (as on interrupt exit it
+	 * will be reset to NULL).
+	 */
+	this_cpu_write(rqspinlock_held_locks.locks[cnt - 1], lock);
+}
+
+/*
+ * It is possible to run into misdetection scenarios of AA deadlocks on the same
+ * CPU, and missed ABBA deadlocks on remote CPUs when this function pops entries
+ * out of order (due to lock A, lock B, unlock A, unlock B) pattern. The correct
+ * logic to preserve right entries in the table would be to walk the array of
+ * held locks and swap and clear out-of-order entries, but that's too
+ * complicated and we don't have a compelling use case for out of order unlocking.
+ *
+ * Therefore, we simply don't support such cases and keep the logic simple here.
+ */
+static __always_inline void release_held_lock_entry(void)
+{
+	struct rqspinlock_held *rqh = this_cpu_ptr(&rqspinlock_held_locks);
+
+	if (unlikely(rqh->cnt > RES_NR_HELD))
+		goto dec;
+	WRITE_ONCE(rqh->locks[rqh->cnt - 1], NULL);
+dec:
+	this_cpu_dec(rqspinlock_held_locks.cnt);
+	/*
+	 * This helper is invoked when we unwind upon failing to acquire the
+	 * lock. Unlike the unlock path which constitutes a release store after
+	 * we clear the entry, we need to emit a write barrier here.
Otherwise, + * we may have a situation as follows: + * + * for lock B + * release_held_lock_entry + * + * try_cmpxchg_acquire for lock A + * grab_held_lock_entry + * + * Since these are attempts for different locks, no sequentiality is + * guaranteed and reordering may occur such that dec, inc are done + * before entry is overwritten. This permits a remote lock holder of + * lock B to now observe it as being attempted on this CPU, and may lead + * to misdetection. + * + * In case of unlock, we will always do a release on the lock word after + * releasing the entry, ensuring that other CPUs cannot hold the lock + * (and make conclusions about deadlocks) until the entry has been + * cleared on the local CPU, preventing any anomalies. Reordering is + * still possible there, but a remote CPU cannot observe a lock in our + * table which it is already holding, since visibility entails our + * release store for the said lock has not retired. + * + * We don't have a problem if the dec and WRITE_ONCE above get reordered + * with each other, we either notice an empty NULL entry on top (if dec + * succeeds WRITE_ONCE), or a potentially stale entry which cannot be + * observed (if dec precedes WRITE_ONCE). + */ + smp_wmb(); +} #endif /* __ASM_GENERIC_RQSPINLOCK_H */ diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c index df7adec59cec..42e8a56534b6 100644 --- a/kernel/locking/rqspinlock.c +++ b/kernel/locking/rqspinlock.c @@ -30,6 +30,7 @@ * Include queued spinlock definitions and statistics code */ #include "qspinlock.h" +#include "rqspinlock.h" #include "qspinlock_stat.h" /* @@ -74,16 +75,146 @@ struct rqspinlock_timeout { u64 timeout_end; u64 duration; + u64 cur; u16 spin; }; #define RES_TIMEOUT_VAL 2 -static noinline int check_timeout(struct rqspinlock_timeout *ts) +DEFINE_PER_CPU_ALIGNED(struct rqspinlock_held, rqspinlock_held_locks); + +static bool is_lock_released(rqspinlock_t *lock, u32 mask, struct rqspinlock_timeout *ts) +{ + if (!(atomic_read_acquire(&lock->val) & (mask))) + return true; + return false; +} + +static noinline int check_deadlock_AA(rqspinlock_t *lock, u32 mask, + struct rqspinlock_timeout *ts) +{ + struct rqspinlock_held *rqh = this_cpu_ptr(&rqspinlock_held_locks); + int cnt = min(RES_NR_HELD, rqh->cnt); + + /* + * Return an error if we hold the lock we are attempting to acquire. + * We'll iterate over max 32 locks; no need to do is_lock_released. + */ + for (int i = 0; i < cnt - 1; i++) { + if (rqh->locks[i] == lock) + return -EDEADLK; + } + return 0; +} + +/* + * This focuses on the most common case of ABBA deadlocks (or ABBA involving + * more locks, which reduce to ABBA). This is not exhaustive, and we rely on + * timeouts as the final line of defense. + */ +static noinline int check_deadlock_ABBA(rqspinlock_t *lock, u32 mask, + struct rqspinlock_timeout *ts) +{ + struct rqspinlock_held *rqh = this_cpu_ptr(&rqspinlock_held_locks); + int rqh_cnt = min(RES_NR_HELD, rqh->cnt); + void *remote_lock; + int cpu; + + /* + * Find the CPU holding the lock that we want to acquire. If there is a + * deadlock scenario, we will read a stable set on the remote CPU and + * find the target. This would be a constant time operation instead of + * O(NR_CPUS) if we could determine the owning CPU from a lock value, but + * that requires increasing the size of the lock word. 
+	 */
+	for_each_possible_cpu(cpu) {
+		struct rqspinlock_held *rqh_cpu = per_cpu_ptr(&rqspinlock_held_locks, cpu);
+		int real_cnt = READ_ONCE(rqh_cpu->cnt);
+		int cnt = min(RES_NR_HELD, real_cnt);
+
+		/*
+		 * Let's ensure to break out of this loop if the lock is available for
+		 * us to potentially acquire.
+		 */
+		if (is_lock_released(lock, mask, ts))
+			return 0;
+
+		/*
+		 * Skip ourselves, and CPUs whose count is less than 2, as they need at
+		 * least one held lock and one acquisition attempt (reflected as top
+		 * most entry) to participate in an ABBA deadlock.
+		 *
+		 * If cnt is more than RES_NR_HELD, it means the current lock being
+		 * acquired won't appear in the table, and other locks in the table are
+		 * already held, so we can't determine ABBA.
+		 */
+		if (cpu == smp_processor_id() || real_cnt < 2 || real_cnt > RES_NR_HELD)
+			continue;
+
+		/*
+		 * Obtain the entry at the top, this corresponds to the lock the
+		 * remote CPU is attempting to acquire in a deadlock situation,
+		 * and would be one of the locks we hold on the current CPU.
+		 */
+		remote_lock = READ_ONCE(rqh_cpu->locks[cnt - 1]);
+		/*
+		 * If it is NULL, we've raced and cannot determine a deadlock
+		 * conclusively, skip this CPU.
+		 */
+		if (!remote_lock)
+			continue;
+		/*
+		 * Find if the lock we're attempting to acquire is held by this CPU.
+		 * Don't consider the topmost entry, as that must be the latest lock
+		 * being held or acquired. For a deadlock, the target CPU must also
+		 * attempt to acquire a lock we hold, so for this search only 'cnt - 1'
+		 * entries are important.
+		 */
+		for (int i = 0; i < cnt - 1; i++) {
+			if (READ_ONCE(rqh_cpu->locks[i]) != lock)
+				continue;
+			/*
+			 * We found our lock as held on the remote CPU. Is the
+			 * acquisition attempt on the remote CPU for a lock held
+			 * by us? If so, we have a deadlock situation, and need
+			 * to recover.
+			 */
+			for (int i = 0; i < rqh_cnt - 1; i++) {
+				if (rqh->locks[i] == remote_lock)
+					return -EDEADLK;
+			}
+			/*
+			 * Inconclusive; retry again later.
+			 */
+			return 0;
+		}
+	}
+	return 0;
+}
+
+static noinline int check_deadlock(rqspinlock_t *lock, u32 mask,
+				   struct rqspinlock_timeout *ts)
+{
+	int ret;
+
+	ret = check_deadlock_AA(lock, mask, ts);
+	if (ret)
+		return ret;
+	ret = check_deadlock_ABBA(lock, mask, ts);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static noinline int check_timeout(rqspinlock_t *lock, u32 mask,
+				  struct rqspinlock_timeout *ts)
 {
 	u64 time = ktime_get_mono_fast_ns();
+	u64 prev = ts->cur;

 	if (!ts->timeout_end) {
+		ts->cur = time;
 		ts->timeout_end = time + ts->duration;
 		return 0;
 	}
@@ -91,20 +222,30 @@
 	if (time > ts->timeout_end)
 		return -ETIMEDOUT;

+	/*
+	 * A millisecond interval passed from last time? Trigger deadlock
+	 * checks.
+	 */
+	if (prev + NSEC_PER_MSEC < time) {
+		ts->cur = time;
+		return check_deadlock(lock, mask, ts);
+	}
+
 	return 0;
 }

-#define RES_CHECK_TIMEOUT(ts, ret)                    \
-	({                                            \
-		if (!(ts).spin++)                     \
-			(ret) = check_timeout(&(ts)); \
-		(ret);                                \
+#define RES_CHECK_TIMEOUT(ts, ret, mask)                              \
+	({                                                            \
+		if (!(ts).spin++)                                     \
+			(ret) = check_timeout((lock), (mask), &(ts)); \
+		(ret);                                                \
 	})

 /*
  * Initialize the 'duration' member with the chosen timeout.
+ * Set spin member to 0 to trigger AA/ABBA checks immediately.
  */
-#define RES_INIT_TIMEOUT(ts, _timeout) ({ (ts).spin = 1; (ts).duration = _timeout; })
+#define RES_INIT_TIMEOUT(ts, _timeout) ({ (ts).spin = 0; (ts).duration = _timeout; })

 /*
  * We only need to reset 'timeout_end', 'spin' will just wrap around as necessary.
@@ -192,6 +333,11 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 		goto queue;
 	}

+	/*
+	 * Grab an entry in the held locks array, to enable deadlock detection.
+	 */
+	grab_held_lock_entry(lock);
+
 	/*
 	 * We're pending, wait for the owner to go away.
 	 *
@@ -205,7 +351,7 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 */
 	if (val & _Q_LOCKED_MASK) {
 		RES_RESET_TIMEOUT(ts);
-		smp_cond_load_acquire(&lock->locked, !VAL || RES_CHECK_TIMEOUT(ts, ret));
+		smp_cond_load_acquire(&lock->locked, !VAL || RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_MASK));
 	}

 	if (ret) {
@@ -220,7 +366,7 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 		 */
 		clear_pending(lock);
 		lockevent_inc(rqspinlock_lock_timeout);
-		return ret;
+		goto err_release_entry;
 	}

 	/*
@@ -238,6 +384,11 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 */
 queue:
 	lockevent_inc(lock_slowpath);
+	/*
+	 * Grab deadlock detection entry for the queue path.
+	 */
+	grab_held_lock_entry(lock);
+
 	node = this_cpu_ptr(&qnodes[0].mcs);
 	idx = node->count++;
 	tail = encode_tail(smp_processor_id(), idx);
@@ -257,9 +408,9 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 		lockevent_inc(lock_no_node);
 		RES_RESET_TIMEOUT(ts);
 		while (!queued_spin_trylock(lock)) {
-			if (RES_CHECK_TIMEOUT(ts, ret)) {
+			if (RES_CHECK_TIMEOUT(ts, ret, ~0u)) {
 				lockevent_inc(rqspinlock_lock_timeout);
-				break;
+				goto err_release_node;
 			}
 			cpu_relax();
 		}
@@ -350,7 +501,7 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 */
 	RES_RESET_TIMEOUT(ts);
 	val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK) ||
-				       RES_CHECK_TIMEOUT(ts, ret));
+				       RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_PENDING_MASK));

 waitq_timeout:
 	if (ret) {
@@ -375,7 +526,7 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 			WRITE_ONCE(next->locked, RES_TIMEOUT_VAL);
 		}
 		lockevent_inc(rqspinlock_lock_timeout);
-		goto release;
+		goto err_release_node;
 	}

 	/*
@@ -422,5 +573,11 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 */
 	__this_cpu_dec(qnodes[0].mcs.count);
 	return ret;
+err_release_node:
+	trace_contention_end(lock, ret);
+	__this_cpu_dec(qnodes[0].mcs.count);
+err_release_entry:
+	release_held_lock_entry();
+	return ret;
 }
 EXPORT_SYMBOL(resilient_queued_spin_lock_slowpath);
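The invariant these hunks maintain is that every entry pushed by
grab_held_lock_entry() is popped exactly once on every failure path. A
condensed sketch of that pairing, with stubs standing in for the patch's
helpers (the acquisition function and its result here are placeholders, not
the patch's control flow):

#include <errno.h>
#include <stdbool.h>

/* Stubs that stand in for the patch's per-CPU table helpers. */
static void grab_held_lock_entry(void *lock)    { (void)lock; /* push entry */ }
static void release_held_lock_entry(void)       { /* pop entry */ }
static bool acquire_with_timeout(void *lock)    { (void)lock; return true; }

static int lock_slowpath_sketch(void *lock)
{
	/* Publish the acquisition attempt before spinning... */
	grab_held_lock_entry(lock);

	if (acquire_with_timeout(lock))
		return 0;	/* success: entry stays until unlock clears it */

	/* ...and unwind the entry on every error path before returning. */
	release_held_lock_entry();
	return -EDEADLK;
}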
From patchwork Thu Feb 6 10:54:20 2025
From: Kumar Kartikeya Dwivedi
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next v2 12/26] rqspinlock: Add a test-and-set fallback
Date: Thu, 6 Feb 2025 02:54:20 -0800
Message-ID: <20250206105435.2159977-13-memxor@gmail.com>

Include a test-and-set fallback when queued spinlock support is not
available. Introduce a rqspinlock type to act as a fallback when
qspinlock support is absent.

Include ifdef guards to ensure the slow path in this file is only
compiled when CONFIG_QUEUED_SPINLOCKS=y. Subsequent patches will add
further logic to ensure fallback to the test-and-set implementation
when queued spinlock support is unavailable on an architecture.
Signed-off-by: Kumar Kartikeya Dwivedi
---
 include/asm-generic/rqspinlock.h | 17 +++++++++++++++
 kernel/locking/rqspinlock.c      | 37 ++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index c1dbd25287a1..92e53b2aafb9 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -12,11 +12,28 @@
 #include
 #include
 #include
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include
+#endif
+
+struct rqspinlock {
+	union {
+		atomic_t val;
+		u32 locked;
+	};
+};

 struct qspinlock;
+#ifdef CONFIG_QUEUED_SPINLOCKS
 typedef struct qspinlock rqspinlock_t;
+#else
+typedef struct rqspinlock rqspinlock_t;
+#endif

+extern int resilient_tas_spin_lock(rqspinlock_t *lock, u64 timeout);
+#ifdef CONFIG_QUEUED_SPINLOCKS
 extern int resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout);
+#endif

 /*
  * Default timeout for waiting loops is 0.5 seconds
diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index 42e8a56534b6..ea034e80f855 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -21,7 +21,9 @@
 #include
 #include
 #include
+#ifdef CONFIG_QUEUED_SPINLOCKS
 #include
+#endif
 #include
 #include
 #include
@@ -29,8 +31,10 @@
 /*
  * Include queued spinlock definitions and statistics code
  */
+#ifdef CONFIG_QUEUED_SPINLOCKS
 #include "qspinlock.h"
 #include "rqspinlock.h"
+#endif
 #include "qspinlock_stat.h"

 /*
@@ -252,6 +256,37 @@ static noinline int check_timeout(rqspinlock_t *lock, u32 mask,
  */
 #define RES_RESET_TIMEOUT(ts) ({ (ts).timeout_end = 0; })

+/*
+ * Provide a test-and-set fallback for cases when queued spin lock support is
+ * absent from the architecture.
+ */
+int __lockfunc resilient_tas_spin_lock(rqspinlock_t *lock, u64 timeout)
+{
+	struct rqspinlock_timeout ts;
+	int val, ret = 0;
+
+	RES_INIT_TIMEOUT(ts, timeout);
+	grab_held_lock_entry(lock);
+retry:
+	val = atomic_read(&lock->val);
+
+	if (val || !atomic_try_cmpxchg(&lock->val, &val, 1)) {
+		if (RES_CHECK_TIMEOUT(ts, ret, ~0u)) {
+			lockevent_inc(rqspinlock_lock_timeout);
+			goto out;
+		}
+		cpu_relax();
+		goto retry;
+	}
+
+	return 0;
+out:
+	release_held_lock_entry();
+	return ret;
+}
+
+#ifdef CONFIG_QUEUED_SPINLOCKS
+
 /*
  * Per-CPU queue node structures; we can never have more than 4 nested
  * contexts: task, softirq, hardirq, nmi.
@@ -581,3 +616,5 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	return ret;
 }
 EXPORT_SYMBOL(resilient_queued_spin_lock_slowpath);
+
+#endif /* CONFIG_QUEUED_SPINLOCKS */
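A hypothetical caller of the fallback above might look like the following
kernel-context sketch; the 'my_lock' instance, the error-handling policy, and
the release store used for unlock are assumptions (later patches in the series
add proper lock/unlock wrappers), not part of this patch:

/* Illustrative only: exercising resilient_tas_spin_lock from kernel code. */
static rqspinlock_t my_lock;

static int do_locked_work(void)
{
	int ret;

	ret = resilient_tas_spin_lock(&my_lock, RES_DEF_TIMEOUT);
	if (ret)	/* -ETIMEDOUT or -EDEADLK: recover instead of hanging */
		return ret;

	/* ... critical section ... */

	/* Release the lock word; the series adds real unlock macros later. */
	smp_store_release(&my_lock.locked, 0);
	return 0;
}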
From patchwork Thu Feb 6 10:54:21 2025
McKenney" , Tejun Heo , Barret Rhoden , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 13/26] rqspinlock: Add basic support for CONFIG_PARAVIRT Date: Thu, 6 Feb 2025 02:54:21 -0800 Message-ID: <20250206105435.2159977-14-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3266; h=from:subject; bh=zXQSTFNS/c2U8KM3FgC7kOLmnF5sm9qLW6KNbuMrArs=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRlcUneWOnutzdUSHTyEeSH9ShV4gvRxjc/jId8 1YMMoZSJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZQAKCRBM4MiGSL8RyojREA CAY3vlIDeTRuUpYBE+KFscZYPr9H8/q9UrVdKjYtlSmP4kUMnaYDdROM+/Z1eAE/ooLx/8vdb6tzBt fc3B+DxVL7tQO7jgkFhs8TWzKKlBa8uCE08YnUR183pRuI7FMMaaHbBakUz76eIN1rHwUUZnragSZK FI/5+wGEKE4O3AFSc0b6AVE2v7Ac5Qz5hBBsks/FANKm3Hdx0yJa1axdExsWkJREbmmLTO1RDjCoQZ MVoAxyPIBWrHsWFp3WUr3AGAwW/LT5cz1ADzDNURb97YC4QLdg4MG3NfjpdyWd2jj378mOBYDMydU7 ifjlmuV1KLLhJCbrDsvImh8rxY4aiH1tOMPpfC/pCCARozVsCCt/0NqyneZEa6ebt3Dn6xqnWrSM0D P8kjr4HmLnUenqsX6gCzKA1pntH7LoAXkWLdXVX0Ekt+LIx9dUbZSsloW9AwF+9r/JRRMOBYnNcUDq YDOdMDyfNFfvCjHd9qxbMa/52CU9Tl7upwFr/RbcEK4zY0ZgsV0kmffC3TAPnB+MzzJZwwtdF0ajVp nFsUX1lkdCe1xJTlMFPMzPgqFnxoN8cIGLIKW7BMNq3lr6eU9cBn3XZGlQqrCRwdk/yLylSriVacdL F0y6Ga5GbeLS7YaM3SQtcTwKK5K5ODeXnTU57iP8D7BcG5/WmFu88d6x4oxw== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_105457_334371_0CFCC044 X-CRM114-Status: GOOD ( 16.90 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org We ripped out PV and virtualization related bits from rqspinlock in an earlier commit, however, a fair lock performs poorly within a virtual machine when the lock holder is preempted. As such, retain the virt_spin_lock fallback to test and set lock, but with timeout and deadlock detection. We can do this by simply depending on the resilient_tas_spin_lock implementation from the previous patch. We don't integrate support for CONFIG_PARAVIRT_SPINLOCKS yet, as that requires more involved algorithmic changes and introduces more complexity. It can be done when the need arises in the future. 
Signed-off-by: Kumar Kartikeya Dwivedi
---
 arch/x86/include/asm/rqspinlock.h | 29 +++++++++++++++++++++++++++++
 include/asm-generic/rqspinlock.h  | 14 ++++++++++++++
 kernel/locking/rqspinlock.c       |  3 +++
 3 files changed, 46 insertions(+)
 create mode 100644 arch/x86/include/asm/rqspinlock.h

diff --git a/arch/x86/include/asm/rqspinlock.h b/arch/x86/include/asm/rqspinlock.h
new file mode 100644
index 000000000000..cbd65212c177
--- /dev/null
+++ b/arch/x86/include/asm/rqspinlock.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_RQSPINLOCK_H
+#define _ASM_X86_RQSPINLOCK_H
+
+#include
+
+#ifdef CONFIG_PARAVIRT
+DECLARE_STATIC_KEY_FALSE(virt_spin_lock_key);
+
+#define resilient_virt_spin_lock_enabled resilient_virt_spin_lock_enabled
+static __always_inline bool resilient_virt_spin_lock_enabled(void)
+{
+	return static_branch_likely(&virt_spin_lock_key);
+}
+
+struct qspinlock;
+extern int resilient_tas_spin_lock(struct qspinlock *lock, u64 timeout);
+
+#define resilient_virt_spin_lock resilient_virt_spin_lock
+static inline int resilient_virt_spin_lock(struct qspinlock *lock, u64 timeout)
+{
+	return resilient_tas_spin_lock(lock, timeout);
+}
+
+#endif /* CONFIG_PARAVIRT */
+
+#include
+
+#endif /* _ASM_X86_RQSPINLOCK_H */
diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index 92e53b2aafb9..bbe049dcf70d 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -35,6 +35,20 @@ extern int resilient_tas_spin_lock(rqspinlock_t *lock, u64 timeout);
 extern int resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val, u64 timeout);
 #endif

+#ifndef resilient_virt_spin_lock_enabled
+static __always_inline bool resilient_virt_spin_lock_enabled(void)
+{
+	return false;
+}
+#endif
+
+#ifndef resilient_virt_spin_lock
+static __always_inline int resilient_virt_spin_lock(struct qspinlock *lock, u64 timeout)
+{
+	return 0;
+}
+#endif
+
 /*
  * Default timeout for waiting loops is 0.5 seconds
  */
diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index ea034e80f855..13d1759c9353 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -325,6 +325,9 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,

 	BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));

+	if (resilient_virt_spin_lock_enabled())
+		return resilient_virt_spin_lock(lock, timeout);
+
 	RES_INIT_TIMEOUT(ts, timeout);

 	/*
From patchwork Thu Feb 6 10:54:22 2025
From: Kumar Kartikeya Dwivedi
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next v2 14/26] rqspinlock: Add helper to print a splat on timeout or deadlock
Date: Thu, 6 Feb 2025 02:54:22 -0800
Message-ID: <20250206105435.2159977-15-memxor@gmail.com>

Whenever a timeout or a deadlock occurs, we want to print a message to
the dmesg console, including the CPU where the event occurred, the list
of locks in the held locks table, and the stack trace of the caller,
which allows determining where exactly in the slow path the waiter
timed out or detected a deadlock.

Splats are limited to at most one per CPU during machine uptime, and a
lock is acquired to ensure that no interleaving occurs when a
concurrent set of CPUs conflict and enter a deadlock situation and
start printing data.

Later patches will use this to inspect the return value of the
rqspinlock API and then report a violation if necessary.
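A hypothetical call site in a later patch might look like this sketch; the
wrapper name and the way 'ret' is produced are illustrative assumptions, and
the sketch assumes the helper is reachable from the acquisition path:

/* Illustrative only: how a lock wrapper could report a failed acquisition. */
static int res_spin_lock_sketch(rqspinlock_t *lock)
{
	int ret = 0;

	/* ... attempt the resilient acquisition, filling 'ret' ... */

	if (ret == -EDEADLK)
		rqspinlock_report_violation("AA or ABBA deadlock detected\n", lock);
	else if (ret == -ETIMEDOUT)
		rqspinlock_report_violation("lock acquisition timed out\n", lock);
	return ret;
}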
Signed-off-by: Kumar Kartikeya Dwivedi
---
 kernel/locking/rqspinlock.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index 13d1759c9353..93f928bc4e9c 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -196,6 +196,35 @@ static noinline int check_deadlock_ABBA(rqspinlock_t *lock, u32 mask,
 	return 0;
 }

+static DEFINE_PER_CPU(int, report_nest_cnt);
+static DEFINE_PER_CPU(bool, report_flag);
+static arch_spinlock_t report_lock;
+
+static void rqspinlock_report_violation(const char *s, void *lock)
+{
+	struct rqspinlock_held *rqh = this_cpu_ptr(&rqspinlock_held_locks);
+
+	if (this_cpu_inc_return(report_nest_cnt) != 1) {
+		this_cpu_dec(report_nest_cnt);
+		return;
+	}
+	if (this_cpu_read(report_flag))
+		goto end;
+	this_cpu_write(report_flag, true);
+	arch_spin_lock(&report_lock);
+
+	pr_err("CPU %d: %s", smp_processor_id(), s);
+	pr_info("Held locks: %d\n", rqh->cnt + 1);
+	pr_info("Held lock[%2d] = 0x%px\n", 0, lock);
+	for (int i = 0; i < min(RES_NR_HELD, rqh->cnt); i++)
+		pr_info("Held lock[%2d] = 0x%px\n", i + 1, rqh->locks[i]);
+	dump_stack();
+
+	arch_spin_unlock(&report_lock);
+end:
+	this_cpu_dec(report_nest_cnt);
+}
+
 static noinline int check_deadlock(rqspinlock_t *lock, u32 mask,
 				   struct rqspinlock_timeout *ts)
 {
McKenney" , Tejun Heo , Barret Rhoden , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 15/26] rqspinlock: Add macros for rqspinlock usage Date: Thu, 6 Feb 2025 02:54:23 -0800 Message-ID: <20250206105435.2159977-16-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3270; h=from:subject; bh=itMjJLB45N5MLJdgdgQDHOOS7VmFDUaHlV/BKezdr3U=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRmSdisG+iOEzxKLgDdJjXZN/mncOFg8pv5dO3d bZNJg+GJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZgAKCRBM4MiGSL8RyqPRD/ 9gG2FU1oHAJPUn4UJ3fwKsuQm4i95ajoiPlT/m5kT+DPF9gSa+xO4hKuGYliwEJvf+nuxUe7FMhCUo nbsTx4KiQesdJvFLBEyC8lMRvIh0qaD//aym3Fhb4D5zAgaUBULNbyDh5gj/mVvu5FLRbI+DT3LJmw r1bostFusx1GdFieKe+yhrTChSjrfqKbnpY9R/8w5UkGN/g+vLsyykEbwnbgHhL0Ycvk+yqbXaob++ hWud3ejpfkcyQRqDBeycTn+q3cMN4T33wASR6VfBgNAggDQplTQ0NL97l+Dn3qdgtj1L8+PzWkyEy7 uzYnLPsU1re215zb4P38apAspYJP0lf30MOhhmObWuxKLqtviREsn0uOBZNsmdc4R3mfydqsDy2K3i fXrNShGCY4YJFjLqUeWDnW9g2IHVA0Y3GNq49DhGgN8ZGExFHkkLHKfWWt1vd80tJx+3FcaAcbrHN6 jqnPSeoZw9H/Gv0GIoXD05hjLkUhPcZYt3gM2eUbE0LjEPwU3A1PAq0320Z5j0ZxsuHnEQt80v5ElR 1N2IG2VJNeILB6c4gNBsyn59IW7xvMo0y6wnEpgTgAJHBxYZqffi0Js1mYt58TEIo6iaH05Sdjgvz5 U5PZ/7rUM15c0KeEcHBAwD6NKswh6LfL192Y9X7h+EChg4E+eQnBkpyXm7gw== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_025459_461686_B9C405A1 X-CRM114-Status: GOOD ( 15.71 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Introduce helper macros that wrap around the rqspinlock slow path and provide an interface analogous to the raw_spin_lock API. Note that in case of error conditions, preemption and IRQ disabling is automatically unrolled before returning the error back to the caller. Ensure that in absence of CONFIG_QUEUED_SPINLOCKS support, we fallback to the test-and-set implementation. 
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/asm-generic/rqspinlock.h | 71 ++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index bbe049dcf70d..46119fc768b8 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -134,4 +134,75 @@ static __always_inline void release_held_lock_entry(void)
 	smp_wmb();
 }

+#ifdef CONFIG_QUEUED_SPINLOCKS
+
+/**
+ * res_spin_lock - acquire a queued spinlock
+ * @lock: Pointer to queued spinlock structure
+ */
+static __always_inline int res_spin_lock(rqspinlock_t *lock)
+{
+	int val = 0;
+
+	if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL))) {
+		grab_held_lock_entry(lock);
+		return 0;
+	}
+	return resilient_queued_spin_lock_slowpath(lock, val, RES_DEF_TIMEOUT);
+}
+
+#else
+
+#define res_spin_lock(lock) resilient_tas_spin_lock(lock, RES_DEF_TIMEOUT)
+
+#endif /* CONFIG_QUEUED_SPINLOCKS */
+
+static __always_inline void res_spin_unlock(rqspinlock_t *lock)
+{
+	struct rqspinlock_held *rqh = this_cpu_ptr(&rqspinlock_held_locks);
+
+	if (unlikely(rqh->cnt > RES_NR_HELD))
+		goto unlock;
+	WRITE_ONCE(rqh->locks[rqh->cnt - 1], NULL);
+unlock:
+	this_cpu_dec(rqspinlock_held_locks.cnt);
+	/*
+	 * Release barrier, ensures correct ordering. See release_held_lock_entry
+	 * for details. Perform release store instead of queued_spin_unlock,
+	 * since we use this function for test-and-set fallback as well. When we
+	 * have CONFIG_QUEUED_SPINLOCKS=n, we clear the full 4-byte lockword.
+	 */
+	smp_store_release(&lock->locked, 0);
+}
+
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#define raw_res_spin_lock_init(lock) ({ *(lock) = (rqspinlock_t)__ARCH_SPIN_LOCK_UNLOCKED; })
+#else
+#define raw_res_spin_lock_init(lock) ({ *(lock) = (rqspinlock_t){0}; })
+#endif
+
+#define raw_res_spin_lock(lock)                    \
+	({                                         \
+		int __ret;                         \
+		preempt_disable();                 \
+		__ret = res_spin_lock(lock);       \
+		if (__ret)                         \
+			preempt_enable();          \
+		__ret;                             \
+	})
+
+#define raw_res_spin_unlock(lock) ({ res_spin_unlock(lock); preempt_enable(); })
+
+#define raw_res_spin_lock_irqsave(lock, flags)     \
+	({                                         \
+		int __ret;                         \
+		local_irq_save(flags);             \
+		__ret = raw_res_spin_lock(lock);   \
+		if (__ret)                         \
+			local_irq_restore(flags);  \
+		__ret;                             \
+	})
+
+#define raw_res_spin_unlock_irqrestore(lock, flags) ({ raw_res_spin_unlock(lock); local_irq_restore(flags); })
+
 #endif /* __ASM_GENERIC_RQSPINLOCK_H */
From patchwork Thu Feb 6 10:54:24 2025

From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next v2 16/26] rqspinlock: Add locktorture support
Date: Thu, 6 Feb 2025 02:54:24 -0800
Message-ID: <20250206105435.2159977-17-memxor@gmail.com>

Introduce locktorture support for rqspinlock using the newly added
macros as the first in-kernel user and consumer.
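Once built in, the new ops can be exercised like the existing lock
types via locktorture's torture_type module parameter (illustrative
invocations, assuming the usual locktorture module interface):

	modprobe locktorture torture_type=raw_res_spin_lock
	modprobe locktorture torture_type=raw_res_spin_lock_irq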
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/locking/locktorture.c | 51 ++++++++++++++++++++++++++++++++
 kernel/locking/rqspinlock.c  |  1 +
 2 files changed, 52 insertions(+)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index cc33470f4de9..a055ff38d1f5 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -362,6 +362,56 @@ static struct lock_torture_ops raw_spin_lock_irq_ops = {
 	.name		= "raw_spin_lock_irq"
 };

+#include <asm/rqspinlock.h>
+static rqspinlock_t rqspinlock;
+
+static int torture_raw_res_spin_write_lock(int tid __maybe_unused)
+{
+	raw_res_spin_lock(&rqspinlock);
+	return 0;
+}
+
+static void torture_raw_res_spin_write_unlock(int tid __maybe_unused)
+{
+	raw_res_spin_unlock(&rqspinlock);
+}
+
+static struct lock_torture_ops raw_res_spin_lock_ops = {
+	.writelock	= torture_raw_res_spin_write_lock,
+	.write_delay	= torture_spin_lock_write_delay,
+	.task_boost	= torture_rt_boost,
+	.writeunlock	= torture_raw_res_spin_write_unlock,
+	.readlock	= NULL,
+	.read_delay	= NULL,
+	.readunlock	= NULL,
+	.name		= "raw_res_spin_lock"
+};
+
+static int torture_raw_res_spin_write_lock_irq(int tid __maybe_unused)
+{
+	unsigned long flags;
+
+	raw_res_spin_lock_irqsave(&rqspinlock, flags);
+	cxt.cur_ops->flags = flags;
+	return 0;
+}
+
+static void torture_raw_res_spin_write_unlock_irq(int tid __maybe_unused)
+{
+	raw_res_spin_unlock_irqrestore(&rqspinlock, cxt.cur_ops->flags);
+}
+
+static struct lock_torture_ops raw_res_spin_lock_irq_ops = {
+	.writelock	= torture_raw_res_spin_write_lock_irq,
+	.write_delay	= torture_spin_lock_write_delay,
+	.task_boost	= torture_rt_boost,
+	.writeunlock	= torture_raw_res_spin_write_unlock_irq,
+	.readlock	= NULL,
+	.read_delay	= NULL,
+	.readunlock	= NULL,
+	.name		= "raw_res_spin_lock_irq"
+};
+
 static DEFINE_RWLOCK(torture_rwlock);

 static int torture_rwlock_write_lock(int tid __maybe_unused)
@@ -1168,6 +1218,7 @@ static int __init lock_torture_init(void)
 		&lock_busted_ops, &spin_lock_ops, &spin_lock_irq_ops,
 		&raw_spin_lock_ops, &raw_spin_lock_irq_ops,
+		&raw_res_spin_lock_ops, &raw_res_spin_lock_irq_ops,
 		&rw_lock_ops, &rw_lock_irq_ops, &mutex_lock_ops,
 		&ww_mutex_lock_ops,
diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index 93f928bc4e9c..49b4f3c75a3e 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -86,6 +86,7 @@ struct rqspinlock_timeout {
 #define RES_TIMEOUT_VAL	2

 DEFINE_PER_CPU_ALIGNED(struct rqspinlock_held, rqspinlock_held_locks);
+EXPORT_SYMBOL_GPL(rqspinlock_held_locks);

 static bool is_lock_released(rqspinlock_t *lock, u32 mask, struct rqspinlock_timeout *ts)
 {
From patchwork Thu Feb 6 10:54:25 2025

From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next v2 17/26] rqspinlock: Hardcode cond_acquire loops to asm-generic implementation
Date: Thu, 6 Feb 2025 02:54:25 -0800
Message-ID: <20250206105435.2159977-18-memxor@gmail.com>

Currently, for rqspinlock usage, the implementations of
smp_cond_load_acquire (and thus atomic_cond_read_acquire) are
susceptible to stalls on arm64, because they do not guarantee that the
conditional expression will be repeatedly invoked if the address being
loaded from is not written to by other CPUs. When support for event
streams is absent (the event stream unblocks stuck WFE-based loops
every ~100us), we may end up being stuck forever. This is a problem for
us, as we need to repeatedly invoke RES_CHECK_TIMEOUT in the spin loop
to break out when the timeout expires.

Hardcode the implementation to the asm-generic version in rqspinlock.c
until support for smp_cond_load_acquire_timewait [0] lands upstream.
[0]: https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com

Cc: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/locking/rqspinlock.c | 41 ++++++++++++++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index 49b4f3c75a3e..b4cceeecf29c 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -325,6 +325,41 @@ int __lockfunc resilient_tas_spin_lock(rqspinlock_t *lock, u64 timeout)
  */
 static DEFINE_PER_CPU_ALIGNED(struct qnode, qnodes[_Q_MAX_NODES]);

+/*
+ * Hardcode smp_cond_load_acquire and atomic_cond_read_acquire implementations
+ * to the asm-generic implementation. In rqspinlock code, our conditional
+ * expression involves checking the value _and_ additionally a timeout. However,
+ * on arm64, the WFE-based implementation may never spin again if no stores
+ * occur to the locked byte in the lock word. As such, we may be stuck forever
+ * if event-stream based unblocking is not available on the platform for WFE
+ * spin loops (arch_timer_evtstrm_available).
+ *
+ * Once support for smp_cond_load_acquire_timewait [0] lands, we can drop this
+ * workaround.
+ *
+ * [0]: https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com
+ */
+#define res_smp_cond_load_relaxed(ptr, cond_expr) ({		\
+	typeof(ptr) __PTR = (ptr);				\
+	__unqual_scalar_typeof(*ptr) VAL;			\
+	for (;;) {						\
+		VAL = READ_ONCE(*__PTR);			\
+		if (cond_expr)					\
+			break;					\
+		cpu_relax();					\
+	}							\
+	(typeof(*ptr))VAL;					\
+})
+
+#define res_smp_cond_load_acquire(ptr, cond_expr) ({		\
+	__unqual_scalar_typeof(*ptr) _val;			\
+	_val = res_smp_cond_load_relaxed(ptr, cond_expr);	\
+	smp_acquire__after_ctrl_dep();				\
+	(typeof(*ptr))_val;					\
+})
+
+#define res_atomic_cond_read_acquire(v, c) res_smp_cond_load_acquire(&(v)->counter, (c))
+
 /**
  * resilient_queued_spin_lock_slowpath - acquire the queued spinlock
  * @lock: Pointer to queued spinlock structure
@@ -419,7 +454,7 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 	 */
 	if (val & _Q_LOCKED_MASK) {
 		RES_RESET_TIMEOUT(ts);
-		smp_cond_load_acquire(&lock->locked, !VAL || RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_MASK));
+		res_smp_cond_load_acquire(&lock->locked, !VAL || RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_MASK));
 	}

 	if (ret) {
@@ -568,8 +603,8 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
	 * does not imply a full barrier.
	 */
	RES_RESET_TIMEOUT(ts);
-	val = atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK) ||
-				       RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_PENDING_MASK));
+	val = res_atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_PENDING_MASK) ||
+					   RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_PENDING_MASK));

 waitq_timeout:
	if (ret) {
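To make the workaround concrete, the waiting loop in the slowpath above
expands (editor's sketch of the macro expansion, based on the macros in
this diff) to a plain load/relax loop that re-evaluates the timeout on
every iteration, instead of potentially parking in WFE:

	/* roughly what res_smp_cond_load_acquire(&lock->locked, ...) does: */
	for (;;) {
		u8 val = READ_ONCE(lock->locked);
		if (!val || RES_CHECK_TIMEOUT(ts, ret, _Q_LOCKED_MASK))
			break;		/* released, or timed out/deadlocked */
		cpu_relax();		/* never blocks, so cond_expr runs again */
	}
	smp_acquire__after_ctrl_dep();	/* upgrade to acquire ordering */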
From patchwork Thu Feb 6 10:54:26 2025

From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH bpf-next v2 18/26] rqspinlock: Add entry to Makefile, MAINTAINERS
Date: Thu, 6 Feb 2025 02:54:26 -0800
Message-ID: <20250206105435.2159977-19-memxor@gmail.com>

Ensure that rqspinlock is built when qspinlock support and the BPF
subsystem are enabled.
Also, add the file under the BPF MAINTAINERS entry so that all patches
changing code in the file end up Cc'ing bpf@vger and the
maintainers/reviewers.

Ensure that the rqspinlock code is only built when the BPF subsystem is
compiled in. Depending on queued spinlock support, we may or may not
end up building the queued spinlock slowpath, and instead fall back to
the test-and-set implementation.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 MAINTAINERS                | 3 +++
 include/asm-generic/Kbuild | 1 +
 kernel/locking/Makefile    | 1 +
 3 files changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 896a307fa065..4d81f3303c79 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4305,6 +4305,9 @@ F: include/uapi/linux/filter.h
 F: kernel/bpf/
 F: kernel/trace/bpf_trace.c
 F: lib/buildid.c
+F: arch/*/include/asm/rqspinlock.h
+F: include/asm-generic/rqspinlock.h
+F: kernel/locking/rqspinlock.c
 F: lib/test_bpf.c
 F: net/bpf/
 F: net/core/filter.c
diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild
index 1b43c3a77012..8675b7b4ad23 100644
--- a/include/asm-generic/Kbuild
+++ b/include/asm-generic/Kbuild
@@ -45,6 +45,7 @@ mandatory-y += pci.h
 mandatory-y += percpu.h
 mandatory-y += pgalloc.h
 mandatory-y += preempt.h
+mandatory-y += rqspinlock.h
 mandatory-y += runtime-const.h
 mandatory-y += rwonce.h
 mandatory-y += sections.h
diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
index 0db4093d17b8..5645e9029bc0 100644
--- a/kernel/locking/Makefile
+++ b/kernel/locking/Makefile
@@ -24,6 +24,7 @@ obj-$(CONFIG_SMP) += spinlock.o
 obj-$(CONFIG_LOCK_SPIN_ON_OWNER) += osq_lock.o
 obj-$(CONFIG_PROVE_LOCKING) += spinlock.o
 obj-$(CONFIG_QUEUED_SPINLOCKS) += qspinlock.o
+obj-$(CONFIG_BPF_SYSCALL) += rqspinlock.o
 obj-$(CONFIG_RT_MUTEXES) += rtmutex_api.o
 obj-$(CONFIG_PREEMPT_RT) += spinlock_rt.o ww_rt_mutex.o
 obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock.o
McKenney" , Tejun Heo , Barret Rhoden , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 19/26] bpf: Convert hashtab.c to rqspinlock Date: Thu, 6 Feb 2025 02:54:27 -0800 Message-ID: <20250206105435.2159977-20-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=10973; h=from:subject; bh=PXV2g5jkXA6M5CNQXPC4dpfD1jgCcit6rNK1L4ARIdA=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRmPzcSKNTT0EqqnnkfrvBHcqC+gjliccdRPRLE gq5UTVWJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZgAKCRBM4MiGSL8Ryk5KD/ 9ctjNzhAOgMhJSDZVpv3j4of82O3Xfa/GeQWEsteAu7YzSYq9AIIE3HFCAgpILNT3t5B88CX2Ed1lP x7dqTeC9Uk0p5aenpsjKIaVczLP6xpR/qmVo4I6h57OW1szo7kfpJPwNod+b01Lsjd/4XhfiP0XMTP G76JN02zA8AGTVzvJ8vGj1QdFNU9+tF03KJXh+Ai+Hy7bM/zZZnTDzBA7JGwE5Z3ySRB16mRXmSVLz CLGdo2Vx6qHCLVgZYsnC2f8JFgFXNNkkEcIDSP7o4OaZzrAd8KLs4rj30sgypf6XlKOaZP0cMAh97R rv9OgrXJGSWzBydhJ2Yo/U9qlaChNMWunq075WSaiww/hRbinOWydGQIl0BjHkhOsKVhBVaUhJT7mH 11W6qKAuXa0BEPoRunMVTqW01R4v/bj+YfT4PH+kIRjtYpePc7WvEnxYYr3HAhgiEiOzRNYV/nz7i1 6Qxz8MOJnYfqv5lWb482r1sfm0F+njVMg5corGbZpFCvMJMAbVl7JlJWn3KBt/rsn852KjsX0vk+KT Y4I+fnWdHm6XJ+zLmMdJjxipaCf42YtN3jaFQn/AfCmYRCm/oAaqCIpx1s2cTQi5Kpy0J3M4iGCVSm IfffEEuYwarej+1RmRfEc7KsRdw9g1C3XN/mK0+uQ97QaolvQ1fpjm1VZIXg== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_025505_062142_E5F28DA7 X-CRM114-Status: GOOD ( 22.48 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Convert hashtab.c from raw_spinlock to rqspinlock, and drop the hashed per-cpu counter crud from the code base which is no longer necessary. 
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/hashtab.c | 102 ++++++++++++++-----------------------------
 1 file changed, 32 insertions(+), 70 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 4a9eeb7aef85..9b394e147967 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -16,6 +16,7 @@
 #include "bpf_lru_list.h"
 #include "map_in_map.h"
 #include <linux/bpf_mem_alloc.h>
+#include <asm/rqspinlock.h>

 #define HTAB_CREATE_FLAG_MASK						\
	(BPF_F_NO_PREALLOC | BPF_F_NO_COMMON_LRU | BPF_F_NUMA_NODE |	\
@@ -78,7 +79,7 @@
  */
 struct bucket {
	struct hlist_nulls_head head;
-	raw_spinlock_t raw_lock;
+	rqspinlock_t raw_lock;
 };

 #define HASHTAB_MAP_LOCK_COUNT 8
@@ -104,8 +105,6 @@ struct bpf_htab {
	u32 n_buckets;	/* number of hash buckets */
	u32 elem_size;	/* size of each element in bytes */
	u32 hashrnd;
-	struct lock_class_key lockdep_key;
-	int __percpu *map_locked[HASHTAB_MAP_LOCK_COUNT];
 };

 /* each htab element is struct htab_elem + key + value */
@@ -140,45 +139,26 @@ static void htab_init_buckets(struct bpf_htab *htab)

	for (i = 0; i < htab->n_buckets; i++) {
		INIT_HLIST_NULLS_HEAD(&htab->buckets[i].head, i);
-		raw_spin_lock_init(&htab->buckets[i].raw_lock);
-		lockdep_set_class(&htab->buckets[i].raw_lock,
-				  &htab->lockdep_key);
+		raw_res_spin_lock_init(&htab->buckets[i].raw_lock);
		cond_resched();
	}
 }

-static inline int htab_lock_bucket(const struct bpf_htab *htab,
-				   struct bucket *b, u32 hash,
-				   unsigned long *pflags)
+static inline int htab_lock_bucket(struct bucket *b, unsigned long *pflags)
 {
	unsigned long flags;
+	int ret;

-	hash = hash & min_t(u32, HASHTAB_MAP_LOCK_MASK, htab->n_buckets - 1);
-
-	preempt_disable();
-	local_irq_save(flags);
-	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
-		__this_cpu_dec(*(htab->map_locked[hash]));
-		local_irq_restore(flags);
-		preempt_enable();
-		return -EBUSY;
-	}
-
-	raw_spin_lock(&b->raw_lock);
+	ret = raw_res_spin_lock_irqsave(&b->raw_lock, flags);
+	if (ret)
+		return ret;
	*pflags = flags;
-
	return 0;
 }

-static inline void htab_unlock_bucket(const struct bpf_htab *htab,
-				      struct bucket *b, u32 hash,
-				      unsigned long flags)
+static inline void htab_unlock_bucket(struct bucket *b, unsigned long flags)
 {
-	hash = hash & min_t(u32, HASHTAB_MAP_LOCK_MASK, htab->n_buckets - 1);
-	raw_spin_unlock(&b->raw_lock);
-	__this_cpu_dec(*(htab->map_locked[hash]));
-	local_irq_restore(flags);
-	preempt_enable();
+	raw_res_spin_unlock_irqrestore(&b->raw_lock, flags);
 }

 static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node);
@@ -483,14 +463,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
	bool percpu_lru = (attr->map_flags & BPF_F_NO_COMMON_LRU);
	bool prealloc = !(attr->map_flags & BPF_F_NO_PREALLOC);
	struct bpf_htab *htab;
-	int err, i;
+	int err;

	htab = bpf_map_area_alloc(sizeof(*htab), NUMA_NO_NODE);
	if (!htab)
		return ERR_PTR(-ENOMEM);

-	lockdep_register_key(&htab->lockdep_key);
-
	bpf_map_init_from_attr(&htab->map, attr);

	if (percpu_lru) {
@@ -536,15 +514,6 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
	if (!htab->buckets)
		goto free_elem_count;

-	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++) {
-		htab->map_locked[i] = bpf_map_alloc_percpu(&htab->map,
-							   sizeof(int),
-							   sizeof(int),
-							   GFP_USER);
-		if (!htab->map_locked[i])
-			goto free_map_locked;
-	}
-
	if (htab->map.map_flags & BPF_F_ZERO_SEED)
		htab->hashrnd = 0;
	else
@@ -607,15 +576,12 @@ static struct bpf_map *htab_map_alloc(union bpf_attr *attr)
 free_map_locked:
	if (htab->use_percpu_counter)
		percpu_counter_destroy(&htab->pcount);
-	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
-		free_percpu(htab->map_locked[i]);
	bpf_map_area_free(htab->buckets);
	bpf_mem_alloc_destroy(&htab->pcpu_ma);
	bpf_mem_alloc_destroy(&htab->ma);
 free_elem_count:
	bpf_map_free_elem_count(&htab->map);
 free_htab:
-	lockdep_unregister_key(&htab->lockdep_key);
	bpf_map_area_free(htab);
	return ERR_PTR(err);
 }
@@ -817,7 +783,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
	b = __select_bucket(htab, tgt_l->hash);
	head = &b->head;

-	ret = htab_lock_bucket(htab, b, tgt_l->hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
	if (ret)
		return false;

@@ -828,7 +794,7 @@ static bool htab_lru_map_delete_node(void *arg, struct bpf_lru_node *node)
			break;
		}

-	htab_unlock_bucket(htab, b, tgt_l->hash, flags);
+	htab_unlock_bucket(b, flags);

	if (l == tgt_l)
		check_and_free_fields(htab, l);
@@ -1147,7 +1113,7 @@ static long htab_map_update_elem(struct bpf_map *map, void *key, void *value,
		 */
	}

-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
	if (ret)
		return ret;

@@ -1198,7 +1164,7 @@ static long htab_map_update_elem(struct bpf_map *map, void *key, void *value,
			check_and_free_fields(htab, l_old);
		}
	}
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
	if (l_old) {
		if (old_map_ptr)
			map->ops->map_fd_put_ptr(map, old_map_ptr, true);
@@ -1207,7 +1173,7 @@ static long htab_map_update_elem(struct bpf_map *map, void *key, void *value,
	}
	return 0;
 err:
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
	return ret;
 }

@@ -1254,7 +1220,7 @@ static long htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value
	copy_map_value(&htab->map,
		       l_new->key + round_up(map->key_size, 8), value);

-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
	if (ret)
		goto err_lock_bucket;

@@ -1275,7 +1241,7 @@ static long htab_lru_map_update_elem(struct bpf_map *map, void *key, void *value
	ret = 0;

 err:
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);

 err_lock_bucket:
	if (ret)
@@ -1312,7 +1278,7 @@ static long __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
	b = __select_bucket(htab, hash);
	head = &b->head;

-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
	if (ret)
		return ret;

@@ -1337,7 +1303,7 @@ static long __htab_percpu_map_update_elem(struct bpf_map *map, void *key,
	}
	ret = 0;
 err:
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
	return ret;
 }

@@ -1378,7 +1344,7 @@ static long __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
			return -ENOMEM;
	}

-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
	if (ret)
		goto err_lock_bucket;

@@ -1402,7 +1368,7 @@ static long __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
	}
	ret = 0;
 err:
-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
 err_lock_bucket:
	if (l_new) {
		bpf_map_dec_elem_count(&htab->map);
@@ -1444,7 +1410,7 @@ static long htab_map_delete_elem(struct bpf_map *map, void *key)
	b = __select_bucket(htab, hash);
	head = &b->head;

-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
	if (ret)
		return ret;

@@ -1454,7 +1420,7 @@ static long htab_map_delete_elem(struct bpf_map *map, void *key)
	else
		ret = -ENOENT;

-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);

	if (l)
		free_htab_elem(htab, l);
@@ -1480,7 +1446,7 @@ static long htab_lru_map_delete_elem(struct bpf_map *map, void *key)
	b = __select_bucket(htab, hash);
	head = &b->head;

-	ret = htab_lock_bucket(htab, b, hash, &flags);
+	ret = htab_lock_bucket(b, &flags);
	if (ret)
		return ret;

@@ -1491,7 +1457,7 @@ static long htab_lru_map_delete_elem(struct bpf_map *map, void *key)
	else
		ret = -ENOENT;

-	htab_unlock_bucket(htab, b, hash, flags);
+	htab_unlock_bucket(b, flags);
	if (l)
		htab_lru_push_free(htab, l);
	return ret;
@@ -1558,7 +1524,6 @@ static void htab_map_free_timers_and_wq(struct bpf_map *map)
 static void htab_map_free(struct bpf_map *map)
 {
	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
-	int i;

	/* bpf_free_used_maps() or close(map_fd) will trigger this map_free callback.
	 * bpf_free_used_maps() is called after bpf prog is no longer executing.
@@ -1583,9 +1548,6 @@ static void htab_map_free(struct bpf_map *map)
	bpf_mem_alloc_destroy(&htab->ma);
	if (htab->use_percpu_counter)
		percpu_counter_destroy(&htab->pcount);
-	for (i = 0; i < HASHTAB_MAP_LOCK_COUNT; i++)
-		free_percpu(htab->map_locked[i]);
-	lockdep_unregister_key(&htab->lockdep_key);
	bpf_map_area_free(htab);
 }

@@ -1628,7 +1590,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
	b = __select_bucket(htab, hash);
	head = &b->head;

-	ret = htab_lock_bucket(htab, b, hash, &bflags);
+	ret = htab_lock_bucket(b, &bflags);
	if (ret)
		return ret;

@@ -1665,7 +1627,7 @@ static int __htab_map_lookup_and_delete_elem(struct bpf_map *map, void *key,
			hlist_nulls_del_rcu(&l->hash_node);

 out_unlock:
-	htab_unlock_bucket(htab, b, hash, bflags);
+	htab_unlock_bucket(b, bflags);

	if (l) {
		if (is_lru_map)
@@ -1787,7 +1749,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
	head = &b->head;
	/* do not grab the lock unless need it (bucket_cnt > 0). */
	if (locked) {
-		ret = htab_lock_bucket(htab, b, batch, &flags);
+		ret = htab_lock_bucket(b, &flags);
		if (ret) {
			rcu_read_unlock();
			bpf_enable_instrumentation();
@@ -1810,7 +1772,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
		/* Note that since bucket_cnt > 0 here, it is implicit
		 * that the locked was grabbed, so release it.
		 */
-		htab_unlock_bucket(htab, b, batch, flags);
+		htab_unlock_bucket(b, flags);
		rcu_read_unlock();
		bpf_enable_instrumentation();
		goto after_loop;
@@ -1821,7 +1783,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
		/* Note that since bucket_cnt > 0 here, it is implicit
		 * that the locked was grabbed, so release it.
		 */
-		htab_unlock_bucket(htab, b, batch, flags);
+		htab_unlock_bucket(b, flags);
		rcu_read_unlock();
		bpf_enable_instrumentation();
		kvfree(keys);
@@ -1884,7 +1846,7 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
		dst_val += value_size;
	}

-	htab_unlock_bucket(htab, b, batch, flags);
+	htab_unlock_bucket(b, flags);
	locked = false;

	while (node_to_free) {
McKenney" , Tejun Heo , Barret Rhoden , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 20/26] bpf: Convert percpu_freelist.c to rqspinlock Date: Thu, 6 Feb 2025 02:54:28 -0800 Message-ID: <20250206105435.2159977-21-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=6512; h=from:subject; bh=b7GJYiFRKmgt+Gua6MyRj5cuN5rmsZksWMg2VkyS3aw=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRm4A+EBJCWxvZBf7f2Rwf9fNAzPFNCmT/crlsa NhTJVJuJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZgAKCRBM4MiGSL8RysBTD/ 9cLKt+PR4fSl5mi6p7a7E2k65BK0pYhLW8IoR7EvD1rs1MUM8TYi9Wd0P3u5ARJSKgUZRovio/UJsb MCLHj+03gAh32u7M8XtrbyRGGjWp81sskv3umm0S5W6qW7GNEzdDfhDCVgZGxTaPwghKEcP6GNkC5D 3LWP4b9pp2XSrw5PDT7EN54Ds1FfjGWg6awZXbJcWgVmmS4522IVKIgAgnotntrcI70ccUJoxtdyUD ADWxNVhu3snrVyFYlCSn80qYS6o0ZBYVjqh5K1pU/GnUahNHcT7iZbSHN3HH7/pZhLFbvphUDLtImg bi4WsQVeTVKTqXtm6o/FeA/7+P+pIhRSeOynMeOZT9EqTagpNVaaptRj5HZMuXJWY5qUIHnY+c8P0v o1VHWya9TBJyOJM37cFthbRx9BxyN7uDd3fsaqDwGs+p+NFzUAh5Yyx3n6dkBHQI+zFF2mryGLzXTR h/vu/u/DQSfC2693zDq/2G2Nq8GCT4nwzP2057XK4Fhewd7/9rFPbG2Azm6f192ZUVX2yNhrwmZcFt jjpQ4CaSn5LXmu4ocoOo3rDF0aGwvP5Sb/Jn9vS78QfBYpWYEJxBrVtRMyeKfPjo4jEinqb40wB4vk h5S6Uj1SdTQN/i3IBzw259gce+jCSWN1SQaz6Dgy7OdNHF2kxY9Ea5mBCF8A== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_105505_991351_CFD829F2 X-CRM114-Status: GOOD ( 20.56 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Convert the percpu_freelist.c code to use rqspinlock, and remove the extralist fallback and trylock-based acquisitions to avoid deadlocks. Key thing to note is the retained while (true) loop to search through other CPUs when failing to push a node due to locking errors. This retains the behavior of the old code, where it would keep trying until it would be able to successfully push the node back into the freelist of a CPU. Technically, we should start iteration for this loop from raw_smp_processor_id() + 1, but to avoid hitting the edge of nr_cpus, we skip execution in the loop body instead. 
Signed-off-by: Kumar Kartikeya Dwivedi
---
 kernel/bpf/percpu_freelist.c | 113 ++++++++---------------------
 kernel/bpf/percpu_freelist.h |   4 +-
 2 files changed, 27 insertions(+), 90 deletions(-)

diff --git a/kernel/bpf/percpu_freelist.c b/kernel/bpf/percpu_freelist.c
index 034cf87b54e9..632762b57299 100644
--- a/kernel/bpf/percpu_freelist.c
+++ b/kernel/bpf/percpu_freelist.c
@@ -14,11 +14,9 @@ int pcpu_freelist_init(struct pcpu_freelist *s)
 	for_each_possible_cpu(cpu) {
 		struct pcpu_freelist_head *head = per_cpu_ptr(s->freelist, cpu);
 
-		raw_spin_lock_init(&head->lock);
+		raw_res_spin_lock_init(&head->lock);
 		head->first = NULL;
 	}
-	raw_spin_lock_init(&s->extralist.lock);
-	s->extralist.first = NULL;
 	return 0;
 }
 
@@ -34,58 +32,39 @@ static inline void pcpu_freelist_push_node(struct pcpu_freelist_head *head,
 	WRITE_ONCE(head->first, node);
 }
 
-static inline void ___pcpu_freelist_push(struct pcpu_freelist_head *head,
+static inline bool ___pcpu_freelist_push(struct pcpu_freelist_head *head,
 					 struct pcpu_freelist_node *node)
 {
-	raw_spin_lock(&head->lock);
-	pcpu_freelist_push_node(head, node);
-	raw_spin_unlock(&head->lock);
-}
-
-static inline bool pcpu_freelist_try_push_extra(struct pcpu_freelist *s,
-						struct pcpu_freelist_node *node)
-{
-	if (!raw_spin_trylock(&s->extralist.lock))
+	if (raw_res_spin_lock(&head->lock))
 		return false;
-
-	pcpu_freelist_push_node(&s->extralist, node);
-	raw_spin_unlock(&s->extralist.lock);
+	pcpu_freelist_push_node(head, node);
+	raw_res_spin_unlock(&head->lock);
 	return true;
 }
 
-static inline void ___pcpu_freelist_push_nmi(struct pcpu_freelist *s,
-					     struct pcpu_freelist_node *node)
+void __pcpu_freelist_push(struct pcpu_freelist *s,
+			  struct pcpu_freelist_node *node)
 {
-	int cpu, orig_cpu;
+	struct pcpu_freelist_head *head;
+	int cpu;
 
-	orig_cpu = raw_smp_processor_id();
-	while (1) {
-		for_each_cpu_wrap(cpu, cpu_possible_mask, orig_cpu) {
-			struct pcpu_freelist_head *head;
+	if (___pcpu_freelist_push(this_cpu_ptr(s->freelist), node))
+		return;
 
+	while (true) {
+		for_each_cpu_wrap(cpu, cpu_possible_mask, raw_smp_processor_id()) {
+			if (cpu == raw_smp_processor_id())
+				continue;
 			head = per_cpu_ptr(s->freelist, cpu);
-			if (raw_spin_trylock(&head->lock)) {
-				pcpu_freelist_push_node(head, node);
-				raw_spin_unlock(&head->lock);
-				return;
-			}
-		}
-
-		/* cannot lock any per cpu lock, try extralist */
-		if (pcpu_freelist_try_push_extra(s, node))
+			if (raw_res_spin_lock(&head->lock))
+				continue;
+			pcpu_freelist_push_node(head, node);
+			raw_res_spin_unlock(&head->lock);
 			return;
+		}
 	}
 }
 
-void __pcpu_freelist_push(struct pcpu_freelist *s,
-			  struct pcpu_freelist_node *node)
-{
-	if (in_nmi())
-		___pcpu_freelist_push_nmi(s, node);
-	else
-		___pcpu_freelist_push(this_cpu_ptr(s->freelist), node);
-}
-
 void pcpu_freelist_push(struct pcpu_freelist *s,
 			struct pcpu_freelist_node *node)
 {
@@ -120,71 +99,29 @@ void pcpu_freelist_populate(struct pcpu_freelist *s, void *buf, u32 elem_size,
 
 static struct pcpu_freelist_node *___pcpu_freelist_pop(struct pcpu_freelist *s)
 {
+	struct pcpu_freelist_node *node = NULL;
 	struct pcpu_freelist_head *head;
-	struct pcpu_freelist_node *node;
 	int cpu;
 
 	for_each_cpu_wrap(cpu, cpu_possible_mask, raw_smp_processor_id()) {
 		head = per_cpu_ptr(s->freelist, cpu);
 		if (!READ_ONCE(head->first))
 			continue;
-		raw_spin_lock(&head->lock);
+		if (raw_res_spin_lock(&head->lock))
+			continue;
 		node = head->first;
 		if (node) {
 			WRITE_ONCE(head->first, node->next);
-			raw_spin_unlock(&head->lock);
+			raw_res_spin_unlock(&head->lock);
 			return node;
 		}
-		raw_spin_unlock(&head->lock);
+		raw_res_spin_unlock(&head->lock);
 	}
-
-	/* per cpu lists are all empty, try extralist */
-	if (!READ_ONCE(s->extralist.first))
-		return NULL;
-	raw_spin_lock(&s->extralist.lock);
-	node = s->extralist.first;
-	if (node)
-		WRITE_ONCE(s->extralist.first, node->next);
-	raw_spin_unlock(&s->extralist.lock);
-	return node;
-}
-
-static struct pcpu_freelist_node *
-___pcpu_freelist_pop_nmi(struct pcpu_freelist *s)
-{
-	struct pcpu_freelist_head *head;
-	struct pcpu_freelist_node *node;
-	int cpu;
-
-	for_each_cpu_wrap(cpu, cpu_possible_mask, raw_smp_processor_id()) {
-		head = per_cpu_ptr(s->freelist, cpu);
-		if (!READ_ONCE(head->first))
-			continue;
-		if (raw_spin_trylock(&head->lock)) {
-			node = head->first;
-			if (node) {
-				WRITE_ONCE(head->first, node->next);
-				raw_spin_unlock(&head->lock);
-				return node;
-			}
-			raw_spin_unlock(&head->lock);
-		}
-	}
-
-	/* cannot pop from per cpu lists, try extralist */
-	if (!READ_ONCE(s->extralist.first) || !raw_spin_trylock(&s->extralist.lock))
-		return NULL;
-	node = s->extralist.first;
-	if (node)
-		WRITE_ONCE(s->extralist.first, node->next);
-	raw_spin_unlock(&s->extralist.lock);
 	return node;
 }
 
 struct pcpu_freelist_node *__pcpu_freelist_pop(struct pcpu_freelist *s)
 {
-	if (in_nmi())
-		return ___pcpu_freelist_pop_nmi(s);
 	return ___pcpu_freelist_pop(s);
 }
 
diff --git a/kernel/bpf/percpu_freelist.h b/kernel/bpf/percpu_freelist.h
index 3c76553cfe57..914798b74967 100644
--- a/kernel/bpf/percpu_freelist.h
+++ b/kernel/bpf/percpu_freelist.h
@@ -5,15 +5,15 @@
 #define __PERCPU_FREELIST_H__
 
 #include <linux/spinlock.h>
 #include <linux/percpu.h>
+#include <asm/rqspinlock.h>
 
 struct pcpu_freelist_head {
 	struct pcpu_freelist_node *first;
-	raw_spinlock_t lock;
+	rqspinlock_t lock;
 };
 
 struct pcpu_freelist {
 	struct pcpu_freelist_head __percpu *freelist;
-	struct pcpu_freelist_head extralist;
 };
 
 struct pcpu_freelist_node {

From patchwork Thu Feb 6 10:54:29 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962902
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Linus Torvalds, Peter Zijlstra, Will Deacon, Waiman Long,
    Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney",
McKenney" , Tejun Heo , Barret Rhoden , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 21/26] bpf: Convert lpm_trie.c to rqspinlock Date: Thu, 6 Feb 2025 02:54:29 -0800 Message-ID: <20250206105435.2159977-22-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=3797; h=from:subject; bh=7psEOc+yd4qPNsI4KRxI79thdXuYqSsZXOHrdwDSR0g=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRnw6DiRxO330wCUMflYYf1S6rgXtusJs7sBci+ PMosJk6JAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZwAKCRBM4MiGSL8Rypr6D/ 0XU5AM+X7xrBbdXvcrdW6JR+XH2p7xDA8af4vJJme6W9rY9nqzM+aeJbS2fsqkaBOjvGhsrm6KSEvU h5nXKuyn5/iiNQyeA2eBNyNmynnhzv6V3d0A7NJwIL2gaLY06/KM6un/G5Zg7met6nyIaESxte16Zt Tj5PkEGuw2YWZIcBVnQHV1DCr9Cc2YERSgY31ol47Dox5OeIfV2Y+ur+/UmtHmdvZ5+g/plLlYxItA 3Mzw1TkFtwsYfMPR4b35Lo1IhYbORKAi5irpZ7F2x9vPeQ2b3wSFo+UjysbmMLgCp4k+cpHnPU+p16 ETzQsMxtY4lYN51QY3p+/yat8sRLixm842KUr+YDb385pgKtUe5qoiH+F47vXvN87HX4t1ptQ2/H7R gu1PEJN4dsT8I3PQzCGdlQoj17l0rZMaNn3acsxv41JTaPrIZgwx7fpOoB8x903xvDrzFJXqGVrwsG ibywFOJqfclF2qvVUO3ygHfXLFXMLus0EHad+VZzDPc0+wy53OQPcURCkt5UiaTENbTuGscfNR9GHT PtQksRTWgfDQgzrDx4EEkFvZ8uKGPHwtH3SAxFXLdTMysqVjm14CT2if0p9yCYswzH/yX3dA7ImI7q xLGCvaXy82YcKX7auIkK8zCMUWSP5gJB30nAtkG1jSBLXHv9M0FKHlbziHLA== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_025507_351644_74AA437C X-CRM114-Status: GOOD ( 18.18 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Convert all LPM trie usage of raw_spinlock to rqspinlock. Note that rcu_dereference_protected in trie_delete_elem is switched over to plain rcu_dereference, the RCU read lock should be held from BPF program side or eBPF syscall path, and the trie->lock is just acquired before the dereference. It is not clear the reason the protected variant was used from the commit history, but the above reasoning makes sense so switch over. 
Signed-off-by: Kumar Kartikeya Dwivedi
---
 kernel/bpf/lpm_trie.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
index e8a772e64324..be66d7e520e0 100644
--- a/kernel/bpf/lpm_trie.c
+++ b/kernel/bpf/lpm_trie.c
@@ -15,6 +15,7 @@
 #include <net/ipv6.h>
 #include <uapi/linux/btf.h>
 #include <linux/btf_ids.h>
+#include <asm/rqspinlock.h>
 #include <linux/bpf_mem_alloc.h>
 
 /* Intermediate node */
@@ -36,7 +37,7 @@ struct lpm_trie {
 	size_t				n_entries;
 	size_t				max_prefixlen;
 	size_t				data_size;
-	raw_spinlock_t			lock;
+	rqspinlock_t			lock;
 };
 
 /* This trie implements a longest prefix match algorithm that can be used to
@@ -342,7 +343,9 @@ static long trie_update_elem(struct bpf_map *map,
 	if (!new_node)
 		return -ENOMEM;
 
-	raw_spin_lock_irqsave(&trie->lock, irq_flags);
+	ret = raw_res_spin_lock_irqsave(&trie->lock, irq_flags);
+	if (ret)
+		goto out_free;
 
 	new_node->prefixlen = key->prefixlen;
 	RCU_INIT_POINTER(new_node->child[0], NULL);
@@ -356,8 +359,7 @@ static long trie_update_elem(struct bpf_map *map,
 	 */
 	slot = &trie->root;
 
-	while ((node = rcu_dereference_protected(*slot,
-					lockdep_is_held(&trie->lock)))) {
+	while ((node = rcu_dereference(*slot))) {
 		matchlen = longest_prefix_match(trie, node, key);
 
 		if (node->prefixlen != matchlen ||
@@ -442,8 +444,8 @@ static long trie_update_elem(struct bpf_map *map,
 		rcu_assign_pointer(*slot, im_node);
 
 out:
-	raw_spin_unlock_irqrestore(&trie->lock, irq_flags);
-
+	raw_res_spin_unlock_irqrestore(&trie->lock, irq_flags);
+out_free:
 	if (ret)
 		bpf_mem_cache_free(&trie->ma, new_node);
 	bpf_mem_cache_free_rcu(&trie->ma, free_node);
@@ -467,7 +469,9 @@ static long trie_delete_elem(struct bpf_map *map, void *_key)
 	if (key->prefixlen > trie->max_prefixlen)
 		return -EINVAL;
 
-	raw_spin_lock_irqsave(&trie->lock, irq_flags);
+	ret = raw_res_spin_lock_irqsave(&trie->lock, irq_flags);
+	if (ret)
+		return ret;
 
 	/* Walk the tree looking for an exact key/length match and keeping
 	 * track of the path we traverse.  We will need to know the node
@@ -478,8 +482,7 @@ static long trie_delete_elem(struct bpf_map *map, void *_key)
 	trim = &trie->root;
 	trim2 = trim;
 	parent = NULL;
-	while ((node = rcu_dereference_protected(
-		       *trim, lockdep_is_held(&trie->lock)))) {
+	while ((node = rcu_dereference(*trim))) {
 		matchlen = longest_prefix_match(trie, node, key);
 
 		if (node->prefixlen != matchlen ||
@@ -543,7 +546,7 @@ static long trie_delete_elem(struct bpf_map *map, void *_key)
 	free_node = node;
 
 out:
-	raw_spin_unlock_irqrestore(&trie->lock, irq_flags);
+	raw_res_spin_unlock_irqrestore(&trie->lock, irq_flags);
 
 	bpf_mem_cache_free_rcu(&trie->ma, free_parent);
 	bpf_mem_cache_free_rcu(&trie->ma, free_node);
@@ -592,7 +595,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr)
 			  offsetof(struct bpf_lpm_trie_key_u8, data);
 	trie->max_prefixlen = trie->data_size * 8;
 
-	raw_spin_lock_init(&trie->lock);
+	raw_res_spin_lock_init(&trie->lock);
 
 	/* Allocate intermediate and leaf nodes from the same allocator */
 	leaf_size = sizeof(struct lpm_trie_node) + trie->data_size +

From patchwork Thu Feb 6 10:54:30 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962906
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Linus Torvalds, Peter Zijlstra, Will Deacon, Waiman Long,
    Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney",
McKenney" , Tejun Heo , Barret Rhoden , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 22/26] bpf: Introduce rqspinlock kfuncs Date: Thu, 6 Feb 2025 02:54:30 -0800 Message-ID: <20250206105435.2159977-23-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=5071; h=from:subject; bh=U/xq8I6EBeEZJrIOy3nlwnF6weOW6zwHeOpg6R+a+Dg=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRnGE3AZgPoVzGUtERg5LFCbq+DcCX9yF59ua8N n7+cbDKJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZwAKCRBM4MiGSL8RylLaD/ 0VpOorKWab0lqsTn/JbqVIfhX2mJDl4AszKROjW7ZbWS/ibRaNsagLSEBcUQD0xf70owiN9Yu/4znW OJaUpWXS7tNbAxPV2AbNa13K/5M2I9XFuZ6Ma44gw7XUBL2+eLtpDnsloEntdH23CIdCBlFgVoMhZi /9B64BlcRvucuNxfRyundNxbRTbW+WL+gdtObdVpvurkEZPU7XSKLpbhrzZvQ4wxTGIf/25YvUVXE4 S4KSqSp0B49pkPN4G9xW/jIyjgX3WLAwBlhBHZ6f15+/NQ+pg/hwN9hKNNehLoCE2vircPueBHstEE KegTCjgg9BggHBOklhXRKvmGFOY2CVPkx96cbQhQZG615Mp2ODEKab08GpR6au3L0Lg1QT3JQbi+tU DKBdLQMl1MjDsazZBZ1VwoZL4CT5etgBB13PQWNVWlid9dl1osLUMQE9UJ/QAqUtZVfz88GpkskFmi ol2YvLHaQVsp/3n/N56wQok42wK/y+P/xvxYD1rz0ExOWlNuLbEBjbjgFZsEQ1KAlLg2XUnKJ4yMrB aBVOWnLHU2q3mFDWBNz/iUcF0U3KO0efHZsiu7NYItC7HDIgvosgX8QFMNIjDG1EfiEFvUcisgaxYH JcTYydI991Jqy6AeM9QpCB99Wg4k4tVqvmj18ICTymjWeNb7929I4r8679rA== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_025508_622750_00D90245 X-CRM114-Status: GOOD ( 19.46 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Introduce four new kfuncs, bpf_res_spin_lock, and bpf_res_spin_unlock, and their irqsave/irqrestore variants, which wrap the rqspinlock APIs. bpf_res_spin_lock returns a conditional result, depending on whether the lock was acquired (NULL is returned when lock acquisition succeeds, non-NULL upon failure). The memory pointed to by the returned pointer upon failure can be dereferenced after the NULL check to obtain the error code. Instead of using the old bpf_spin_lock type, introduce a new type with the same layout, and the same alignment, but a different name to avoid type confusion. Preemption is disabled upon successful lock acquisition, however IRQs are not. Special kfuncs can be introduced later to allow disabling IRQs when taking a spin lock. Resilient locks are safe against AA deadlocks, hence not disabling IRQs currently does not allow violation of kernel safety. __irq_flag annotation is used to accept IRQ flags for the IRQ-variants, with the same semantics as existing bpf_local_irq_{save, restore}. These kfuncs will require additional verifier-side support in subsequent commits, to allow programs to hold multiple locks at the same time. 
Signed-off-by: Kumar Kartikeya Dwivedi
---
 include/asm-generic/rqspinlock.h |  7 +++
 include/linux/bpf.h              |  1 +
 kernel/locking/rqspinlock.c      | 78 ++++++++++++++++++++++++++++++++
 3 files changed, 86 insertions(+)

diff --git a/include/asm-generic/rqspinlock.h b/include/asm-generic/rqspinlock.h
index 46119fc768b8..8249c2da09ad 100644
--- a/include/asm-generic/rqspinlock.h
+++ b/include/asm-generic/rqspinlock.h
@@ -23,6 +23,13 @@ struct rqspinlock {
 	};
 };
 
+/* Even though this is same as struct rqspinlock, we need to emit a distinct
+ * type in BTF for BPF programs.
+ */
+struct bpf_res_spin_lock {
+	u32 val;
+};
+
 struct qspinlock;
 #ifdef CONFIG_QUEUED_SPINLOCKS
 typedef struct qspinlock rqspinlock_t;
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f3f50e29d639..35af09ee6a2c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -30,6 +30,7 @@
 #include <linux/static_call.h>
 #include <linux/memcontrol.h>
 #include <linux/cfi.h>
+#include <asm/rqspinlock.h>
 
 struct bpf_verifier_env;
 struct bpf_verifier_log;
diff --git a/kernel/locking/rqspinlock.c b/kernel/locking/rqspinlock.c
index b4cceeecf29c..d05333203671 100644
--- a/kernel/locking/rqspinlock.c
+++ b/kernel/locking/rqspinlock.c
@@ -15,6 +15,8 @@
 #include <linux/smp.h>
 #include <linux/bug.h>
+#include <linux/bpf.h>
+#include <linux/btf_ids.h>
 #include <linux/cpumask.h>
 #include <linux/percpu.h>
 #include <linux/prefetch.h>
@@ -686,3 +688,79 @@ int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val,
 EXPORT_SYMBOL(resilient_queued_spin_lock_slowpath);
 
 #endif /* CONFIG_QUEUED_SPINLOCKS */
+
+__bpf_kfunc_start_defs();
+
+#define REPORT_STR(ret) ({ ret == -ETIMEDOUT ? "Timeout detected" : "AA or ABBA deadlock detected"; })
+
+__bpf_kfunc int bpf_res_spin_lock(struct bpf_res_spin_lock *lock)
+{
+	int ret;
+
+	BUILD_BUG_ON(sizeof(rqspinlock_t) != sizeof(struct bpf_res_spin_lock));
+	BUILD_BUG_ON(__alignof__(rqspinlock_t) != __alignof__(struct bpf_res_spin_lock));
+
+	preempt_disable();
+	ret = res_spin_lock((rqspinlock_t *)lock);
+	if (unlikely(ret)) {
+		preempt_enable();
+		rqspinlock_report_violation(REPORT_STR(ret), lock);
+		return ret;
+	}
+	return 0;
+}
+
+__bpf_kfunc void bpf_res_spin_unlock(struct bpf_res_spin_lock *lock)
+{
+	res_spin_unlock((rqspinlock_t *)lock);
+	preempt_enable();
+}
+
+__bpf_kfunc int bpf_res_spin_lock_irqsave(struct bpf_res_spin_lock *lock, unsigned long *flags__irq_flag)
+{
+	u64 *ptr = (u64 *)flags__irq_flag;
+	unsigned long flags;
+	int ret;
+
+	preempt_disable();
+	local_irq_save(flags);
+	ret = res_spin_lock((rqspinlock_t *)lock);
+	if (unlikely(ret)) {
+		local_irq_restore(flags);
+		preempt_enable();
+		rqspinlock_report_violation(REPORT_STR(ret), lock);
+		return ret;
+	}
+	*ptr = flags;
+	return 0;
+}
+
+__bpf_kfunc void bpf_res_spin_unlock_irqrestore(struct bpf_res_spin_lock *lock, unsigned long *flags__irq_flag)
+{
+	u64 *ptr = (u64 *)flags__irq_flag;
+	unsigned long flags = *ptr;
+
+	res_spin_unlock((rqspinlock_t *)lock);
+	local_irq_restore(flags);
+	preempt_enable();
+}
+
+__bpf_kfunc_end_defs();
+
+BTF_KFUNCS_START(rqspinlock_kfunc_ids)
+BTF_ID_FLAGS(func, bpf_res_spin_lock, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_res_spin_unlock)
+BTF_ID_FLAGS(func, bpf_res_spin_lock_irqsave, KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_res_spin_unlock_irqrestore)
+BTF_KFUNCS_END(rqspinlock_kfunc_ids)
+
+static const struct btf_kfunc_id_set rqspinlock_kfunc_set = {
+	.owner = THIS_MODULE,
+	.set = &rqspinlock_kfunc_ids,
+};
+
+static __init int rqspinlock_register_kfuncs(void)
+{
+	return register_btf_kfunc_id_set(BPF_PROG_TYPE_UNSPEC, &rqspinlock_kfunc_set);
+}
+late_initcall(rqspinlock_register_kfuncs);

From patchwork Thu Feb 6 10:54:31 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962907
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Linus Torvalds, Peter Zijlstra, Will Deacon, Waiman Long,
    Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney",
    Tejun Heo, Barret Rhoden, Josh Don, Dohyun Kim,
    linux-arm-kernel@lists.infradead.org, kernel-team@meta.com
Subject: [PATCH bpf-next v2 23/26] bpf: Handle allocation failure in acquire_lock_state
Date: Thu, 6 Feb 2025 02:54:31 -0800
Message-ID: <20250206105435.2159977-24-memxor@gmail.com>
In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com>

The acquire_lock_state function needs to handle a possible NULL value
returned by acquire_reference_state, and return -ENOMEM in that case.
Fixes: 769b0f1c8214 ("bpf: Refactor {acquire,release}_reference_state")
Signed-off-by: Kumar Kartikeya Dwivedi
---
 kernel/bpf/verifier.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9971c03adfd5..d6999d085c7d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1501,6 +1501,8 @@ static int acquire_lock_state(struct bpf_verifier_env *env, int insn_idx, enum r
 	struct bpf_reference_state *s;
 
 	s = acquire_reference_state(env, insn_idx);
+	if (!s)
+		return -ENOMEM;
 	s->type = type;
 	s->id = id;
 	s->ptr = ptr;

From patchwork Thu Feb 6 10:54:32 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962912
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Linus Torvalds, Peter Zijlstra, Will Deacon, Waiman Long,
    Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney",
McKenney" , Tejun Heo , Barret Rhoden , Josh Don , Dohyun Kim , linux-arm-kernel@lists.infradead.org, kernel-team@meta.com Subject: [PATCH bpf-next v2 24/26] bpf: Implement verifier support for rqspinlock Date: Thu, 6 Feb 2025 02:54:32 -0800 Message-ID: <20250206105435.2159977-25-memxor@gmail.com> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com> References: <20250206105435.2159977-1-memxor@gmail.com> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=28449; h=from:subject; bh=5HGNbFTsma4FLf8lx8r/wItLdfJjrA3dk1lFblbB38k=; b=owEBbQKS/ZANAwAIAUzgyIZIvxHKAcsmYgBnpJRnui/XZ+ItES1wv9fM/NvhMZxRq5RhAoodvGS2 ZRuffZCJAjMEAAEIAB0WIQRLvip+Buz51YI8YRFM4MiGSL8RygUCZ6SUZwAKCRBM4MiGSL8RysFCD/ 4gVuhWUETLd0qMcRO+R4F5+AN7eVpScQItfzUD/5wd0Q+cj3KteCL9DvZqOyGrY3QAIR9RZYqSaNDL O/mIcF2Pvh6Cd4XW0HW9DX1kZTMtaNAX5y3Gf4MN5ec+3TRpze/Akrti+WPYsH6c2uLrYBQWRxTn1n 2nmXOnx8l0z9/hL0IiTo9+NU8hXMfYKyowF3WlbLb+qXzVAGbBgY8ujOAxHz7ChloIpJJycv6bt2px X+0gww/ScR2tiGeGccRHwjsv9A5aZwmf1rEgR+JxNWUDVmw5Q1bxLfcENHtdkmaY2LUSJRhbC2azIk Mhp1Qxtm4cKs6uDuXxuD6BC31t9Wh1sjAXDUHkkVzvmozKjq+iJaJfmq9lBIKIxm+F4VQ0IJeIu+ht ITxR8QbwqKVAaSvF4dtUZVj/rt0YUiXr3qhkIJkeJhVFHEIGyPLT8vAn1UxElhe76/uoOf0cEi/Fap Mu9etskGVBz22A+IZ3TPuAW/fi1WDtm/ORrB0Zppv7MWJt7QtJCxeRUdjMZUW19IozwqeH8oSDnccR Z6lXr96lJ79xOgOwGOPUFv9vqUOPLF2PHh7pe2TML+07+Q6rcQrxCQ51c2qz/+mhe7r7mGe3DzoB24 Y2voqsR/rq+xaXg/ZTLcx0T4rvjTs+U7b+M/9ldEAImGeezuk/XkRjwJxhSA== X-Developer-Key: i=memxor@gmail.com; a=openpgp; fpr=4BBE2A7E06ECF9D5823C61114CE0C88648BF11CA X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20250206_105511_766690_1412C68D X-CRM114-Status: GOOD ( 29.74 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Introduce verifier-side support for rqspinlock kfuncs. The first step is allowing bpf_res_spin_lock type to be defined in map values and allocated objects, so BTF-side is updated with a new BPF_RES_SPIN_LOCK field to recognize and validate. Any object cannot have both bpf_spin_lock and bpf_res_spin_lock, only one of them (and at most one of them per-object, like before) must be present. The bpf_res_spin_lock can also be used to protect objects that require lock protection for their kfuncs, like BPF rbtree and linked list. The verifier plumbing to simulate success and failure cases when calling the kfuncs is done by pushing a new verifier state to the verifier state stack which will verify the failure case upon calling the kfunc. The path where success is indicated creates all lock reference state and IRQ state (if necessary for irqsave variants). In the case of failure, the state clears the registers r0-r5, sets the return value, and skips kfunc processing, proceeding to the next instruction. When marking the return value for success case, the value is marked as 0, and for the failure case as [-MAX_ERRNO, -1]. Then, in the program, whenever user checks the return value as 'if (ret)' or 'if (ret < 0)' the verifier never traverses such branches for success cases, and would be aware that the lock is not held in such cases. We push the kfunc state in check_kfunc_call whenever rqspinlock kfuncs are invoked. We introduce a kfunc_class state to avoid mixing lock irqrestore kfuncs with IRQ state created by bpf_local_irq_save. 
Signed-off-by: Kumar Kartikeya Dwivedi
Acked-by: Eduard Zingerman
---
 include/linux/bpf.h          |   9 ++
 include/linux/bpf_verifier.h |  17 ++-
 kernel/bpf/btf.c             |  26 ++++-
 kernel/bpf/syscall.c         |   6 +-
 kernel/bpf/verifier.c        | 219 ++++++++++++++++++++++++++++-------
 5 files changed, 232 insertions(+), 45 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 35af09ee6a2c..91dddf7396f9 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -205,6 +205,7 @@ enum btf_field_type {
 	BPF_REFCOUNT  = (1 << 9),
 	BPF_WORKQUEUE = (1 << 10),
 	BPF_UPTR      = (1 << 11),
+	BPF_RES_SPIN_LOCK = (1 << 12),
 };
 
 typedef void (*btf_dtor_kfunc_t)(void *);
@@ -240,6 +241,7 @@ struct btf_record {
 	u32 cnt;
 	u32 field_mask;
 	int spin_lock_off;
+	int res_spin_lock_off;
 	int timer_off;
 	int wq_off;
 	int refcount_off;
@@ -315,6 +317,8 @@ static inline const char *btf_field_type_name(enum btf_field_type type)
 	switch (type) {
 	case BPF_SPIN_LOCK:
 		return "bpf_spin_lock";
+	case BPF_RES_SPIN_LOCK:
+		return "bpf_res_spin_lock";
 	case BPF_TIMER:
 		return "bpf_timer";
 	case BPF_WORKQUEUE:
@@ -347,6 +351,8 @@ static inline u32 btf_field_type_size(enum btf_field_type type)
 	switch (type) {
 	case BPF_SPIN_LOCK:
 		return sizeof(struct bpf_spin_lock);
+	case BPF_RES_SPIN_LOCK:
+		return sizeof(struct bpf_res_spin_lock);
 	case BPF_TIMER:
 		return sizeof(struct bpf_timer);
 	case BPF_WORKQUEUE:
@@ -377,6 +383,8 @@ static inline u32 btf_field_type_align(enum btf_field_type type)
 	switch (type) {
 	case BPF_SPIN_LOCK:
 		return __alignof__(struct bpf_spin_lock);
+	case BPF_RES_SPIN_LOCK:
+		return __alignof__(struct bpf_res_spin_lock);
 	case BPF_TIMER:
 		return __alignof__(struct bpf_timer);
 	case BPF_WORKQUEUE:
@@ -420,6 +428,7 @@ static inline void bpf_obj_init_field(const struct btf_field *field, void *addr)
 	case BPF_RB_ROOT:
 		/* RB_ROOT_CACHED 0-inits, no need to do anything after memset */
 	case BPF_SPIN_LOCK:
+	case BPF_RES_SPIN_LOCK:
 	case BPF_TIMER:
 	case BPF_WORKQUEUE:
 	case BPF_KPTR_UNREF:
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 32c23f2a3086..ed444e44f524 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -115,6 +115,15 @@ struct bpf_reg_state {
 			int depth:30;
 		} iter;
 
+		/* For irq stack slots */
+		struct {
+			enum {
+				IRQ_KFUNC_IGNORE,
+				IRQ_NATIVE_KFUNC,
+				IRQ_LOCK_KFUNC,
+			} kfunc_class;
+		} irq;
+
 		/* Max size from any of the above. */
 		struct {
 			unsigned long raw1;
@@ -255,9 +264,11 @@ struct bpf_reference_state {
 	 * default to pointer reference on zero initialization of a state.
 	 */
 	enum ref_state_type {
-		REF_TYPE_PTR	= 1,
-		REF_TYPE_IRQ	= 2,
-		REF_TYPE_LOCK	= 3,
+		REF_TYPE_PTR		= (1 << 1),
+		REF_TYPE_IRQ		= (1 << 2),
+		REF_TYPE_LOCK		= (1 << 3),
+		REF_TYPE_RES_LOCK	= (1 << 4),
+		REF_TYPE_RES_LOCK_IRQ	= (1 << 5),
 	} type;
 	/* Track each reference created with a unique id, even if the same
 	 * instruction creates the reference multiple times (eg, via CALL).
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 9433b6467bbe..aba6183253ea 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3480,6 +3480,15 @@ static int btf_get_field_type(const struct btf *btf, const struct btf_type *var_
 			goto end;
 		}
 	}
+	if (field_mask & BPF_RES_SPIN_LOCK) {
+		if (!strcmp(name, "bpf_res_spin_lock")) {
+			if (*seen_mask & BPF_RES_SPIN_LOCK)
+				return -E2BIG;
+			*seen_mask |= BPF_RES_SPIN_LOCK;
+			type = BPF_RES_SPIN_LOCK;
+			goto end;
+		}
+	}
 	if (field_mask & BPF_TIMER) {
 		if (!strcmp(name, "bpf_timer")) {
 			if (*seen_mask & BPF_TIMER)
@@ -3658,6 +3667,7 @@ static int btf_find_field_one(const struct btf *btf,
 
 	switch (field_type) {
 	case BPF_SPIN_LOCK:
+	case BPF_RES_SPIN_LOCK:
 	case BPF_TIMER:
 	case BPF_WORKQUEUE:
 	case BPF_LIST_NODE:
@@ -3951,6 +3961,7 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 		return ERR_PTR(-ENOMEM);
 
 	rec->spin_lock_off = -EINVAL;
+	rec->res_spin_lock_off = -EINVAL;
 	rec->timer_off = -EINVAL;
 	rec->wq_off = -EINVAL;
 	rec->refcount_off = -EINVAL;
@@ -3978,6 +3989,11 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 			/* Cache offset for faster lookup at runtime */
 			rec->spin_lock_off = rec->fields[i].offset;
 			break;
+		case BPF_RES_SPIN_LOCK:
+			WARN_ON_ONCE(rec->spin_lock_off >= 0);
+			/* Cache offset for faster lookup at runtime */
+			rec->res_spin_lock_off = rec->fields[i].offset;
+			break;
 		case BPF_TIMER:
 			WARN_ON_ONCE(rec->timer_off >= 0);
 			/* Cache offset for faster lookup at runtime */
@@ -4021,9 +4037,15 @@ struct btf_record *btf_parse_fields(const struct btf *btf, const struct btf_type
 		rec->cnt++;
 	}
 
+	if (rec->spin_lock_off >= 0 && rec->res_spin_lock_off >= 0) {
+		ret = -EINVAL;
+		goto end;
+	}
+
 	/* bpf_{list_head, rb_node} require bpf_spin_lock */
 	if ((btf_record_has_field(rec, BPF_LIST_HEAD) ||
-	     btf_record_has_field(rec, BPF_RB_ROOT)) && rec->spin_lock_off < 0) {
+	     btf_record_has_field(rec, BPF_RB_ROOT)) &&
+	    (rec->spin_lock_off < 0 && rec->res_spin_lock_off < 0)) {
 		ret = -EINVAL;
 		goto end;
 	}
@@ -5636,7 +5658,7 @@ btf_parse_struct_metas(struct bpf_verifier_log *log, struct btf *btf)
 
 		type = &tab->types[tab->cnt];
 		type->btf_id = i;
-		record = btf_parse_fields(btf, t, BPF_SPIN_LOCK | BPF_LIST_HEAD | BPF_LIST_NODE |
+		record = btf_parse_fields(btf, t, BPF_SPIN_LOCK | BPF_RES_SPIN_LOCK | BPF_LIST_HEAD | BPF_LIST_NODE |
 					  BPF_RB_ROOT | BPF_RB_NODE | BPF_REFCOUNT |
 					  BPF_KPTR, t->size);
 		/* The record cannot be unset, treat it as an error if so */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index c420edbfb7c8..054707215d28 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -648,6 +648,7 @@ void btf_record_free(struct btf_record *rec)
 		case BPF_RB_ROOT:
 		case BPF_RB_NODE:
 		case BPF_SPIN_LOCK:
+		case BPF_RES_SPIN_LOCK:
 		case BPF_TIMER:
 		case BPF_REFCOUNT:
 		case BPF_WORKQUEUE:
@@ -700,6 +701,7 @@ struct btf_record *btf_record_dup(const struct btf_record *rec)
 		case BPF_RB_ROOT:
 		case BPF_RB_NODE:
 		case BPF_SPIN_LOCK:
+		case BPF_RES_SPIN_LOCK:
 		case BPF_TIMER:
 		case BPF_REFCOUNT:
 		case BPF_WORKQUEUE:
@@ -777,6 +779,7 @@ void bpf_obj_free_fields(const struct btf_record *rec, void *obj)
 
 		switch (fields[i].type) {
 		case BPF_SPIN_LOCK:
+		case BPF_RES_SPIN_LOCK:
 			break;
 		case BPF_TIMER:
 			bpf_timer_cancel_and_free(field_ptr);
@@ -1203,7 +1206,7 @@ static int map_check_btf(struct bpf_map *map, struct bpf_token *token,
 		return -EINVAL;
 
 	map->record = btf_parse_fields(btf, value_type,
-				       BPF_SPIN_LOCK | BPF_TIMER | BPF_KPTR | BPF_LIST_HEAD |
+				       BPF_SPIN_LOCK | BPF_RES_SPIN_LOCK | BPF_TIMER | BPF_KPTR | BPF_LIST_HEAD |
 				       BPF_RB_ROOT | BPF_REFCOUNT | BPF_WORKQUEUE |
 				       BPF_UPTR, map->value_size);
 	if (!IS_ERR_OR_NULL(map->record)) {
@@ -1222,6 +1225,7 @@ static int map_check_btf(struct bpf_map *map, struct bpf_token *token,
 			case 0:
 				continue;
 			case BPF_SPIN_LOCK:
+			case BPF_RES_SPIN_LOCK:
 				if (map->map_type != BPF_MAP_TYPE_HASH &&
 				    map->map_type != BPF_MAP_TYPE_ARRAY &&
 				    map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d6999d085c7d..294761dd0072 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -456,7 +456,7 @@ static bool subprog_is_exc_cb(struct bpf_verifier_env *env, int subprog)
 
 static bool reg_may_point_to_spin_lock(const struct bpf_reg_state *reg)
 {
-	return btf_record_has_field(reg_btf_record(reg), BPF_SPIN_LOCK);
+	return btf_record_has_field(reg_btf_record(reg), BPF_SPIN_LOCK | BPF_RES_SPIN_LOCK);
 }
 
 static bool type_is_rdonly_mem(u32 type)
@@ -1148,7 +1148,8 @@ static int release_irq_state(struct bpf_verifier_state *state, int id);
 
 static int mark_stack_slot_irq_flag(struct bpf_verifier_env *env,
 				    struct bpf_kfunc_call_arg_meta *meta,
-				    struct bpf_reg_state *reg, int insn_idx)
+				    struct bpf_reg_state *reg, int insn_idx,
+				    int kfunc_class)
 {
 	struct bpf_func_state *state = func(env, reg);
 	struct bpf_stack_state *slot;
@@ -1170,6 +1171,7 @@ static int mark_stack_slot_irq_flag(struct bpf_verifier_env *env,
 	st->type = PTR_TO_STACK; /* we don't have dedicated reg type */
 	st->live |= REG_LIVE_WRITTEN;
 	st->ref_obj_id = id;
+	st->irq.kfunc_class = kfunc_class;
 
 	for (i = 0; i < BPF_REG_SIZE; i++)
 		slot->slot_type[i] = STACK_IRQ_FLAG;
@@ -1178,7 +1180,8 @@ static int mark_stack_slot_irq_flag(struct bpf_verifier_env *env,
 	return 0;
 }
 
-static int unmark_stack_slot_irq_flag(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+static int unmark_stack_slot_irq_flag(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
+				      int kfunc_class)
 {
 	struct bpf_func_state *state = func(env, reg);
 	struct bpf_stack_state *slot;
@@ -1192,6 +1195,15 @@ static int unmark_stack_slot_irq_flag(struct bpf_verifier_env *env, struct bpf_r
 	slot = &state->stack[spi];
 	st = &slot->spilled_ptr;
 
+	if (kfunc_class != IRQ_KFUNC_IGNORE && st->irq.kfunc_class != kfunc_class) {
+		const char *flag_kfunc = st->irq.kfunc_class == IRQ_NATIVE_KFUNC ? "native" : "lock";
+		const char *used_kfunc = kfunc_class == IRQ_NATIVE_KFUNC ? "native" : "lock";
+
+		verbose(env, "irq flag acquired by %s kfuncs cannot be restored with %s kfuncs\n",
+			flag_kfunc, used_kfunc);
+		return -EINVAL;
+	}
+
 	err = release_irq_state(env->cur_state, st->ref_obj_id);
 	WARN_ON_ONCE(err && err != -EACCES);
 	if (err) {
@@ -1591,7 +1603,7 @@ static struct bpf_reference_state *find_lock_state(struct bpf_verifier_state *st
 	for (i = 0; i < state->acquired_refs; i++) {
 		struct bpf_reference_state *s = &state->refs[i];
 
-		if (s->type != type)
+		if (!(s->type & type))
 			continue;
 
 		if (s->id == id && s->ptr == ptr)
@@ -7985,6 +7997,12 @@ static int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg
 	return err;
 }
 
+enum {
+	PROCESS_SPIN_LOCK = (1 << 0),
+	PROCESS_RES_LOCK  = (1 << 1),
+	PROCESS_LOCK_IRQ  = (1 << 2),
+};
+
 /* Implementation details:
  * bpf_map_lookup returns PTR_TO_MAP_VALUE_OR_NULL.
  * bpf_obj_new returns PTR_TO_BTF_ID | MEM_ALLOC | PTR_MAYBE_NULL.
@@ -8007,30 +8025,33 @@ static int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg
  * env->cur_state->active_locks remembers which map value element or allocated
  * object got locked and clears it after bpf_spin_unlock.
  */
-static int process_spin_lock(struct bpf_verifier_env *env, int regno,
-			     bool is_lock)
+static int process_spin_lock(struct bpf_verifier_env *env, int regno, int flags)
 {
+	bool is_lock = flags & PROCESS_SPIN_LOCK, is_res_lock = flags & PROCESS_RES_LOCK;
+	const char *lock_str = is_res_lock ? "bpf_res_spin" : "bpf_spin";
 	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
 	struct bpf_verifier_state *cur = env->cur_state;
 	bool is_const = tnum_is_const(reg->var_off);
+	bool is_irq = flags & PROCESS_LOCK_IRQ;
 	u64 val = reg->var_off.value;
 	struct bpf_map *map = NULL;
 	struct btf *btf = NULL;
 	struct btf_record *rec;
+	u32 spin_lock_off;
 	int err;
 
 	if (!is_const) {
 		verbose(env,
-			"R%d doesn't have constant offset. bpf_spin_lock has to be at the constant offset\n",
-			regno);
+			"R%d doesn't have constant offset. %s_lock has to be at the constant offset\n",
+			regno, lock_str);
 		return -EINVAL;
 	}
 	if (reg->type == PTR_TO_MAP_VALUE) {
 		map = reg->map_ptr;
 		if (!map->btf) {
 			verbose(env,
-				"map '%s' has to have BTF in order to use bpf_spin_lock\n",
-				map->name);
+				"map '%s' has to have BTF in order to use %s_lock\n",
+				map->name, lock_str);
 			return -EINVAL;
 		}
 	} else {
@@ -8038,36 +8059,53 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno,
 	}
 
 	rec = reg_btf_record(reg);
-	if (!btf_record_has_field(rec, BPF_SPIN_LOCK)) {
-		verbose(env, "%s '%s' has no valid bpf_spin_lock\n", map ? "map" : "local",
-			map ? map->name : "kptr");
+	if (!btf_record_has_field(rec, is_res_lock ? BPF_RES_SPIN_LOCK : BPF_SPIN_LOCK)) {
+		verbose(env, "%s '%s' has no valid %s_lock\n", map ? "map" : "local",
+			map ? map->name : "kptr", lock_str);
 		return -EINVAL;
 	}
-	if (rec->spin_lock_off != val + reg->off) {
-		verbose(env, "off %lld doesn't point to 'struct bpf_spin_lock' that is at %d\n",
-			val + reg->off, rec->spin_lock_off);
rec->res_spin_lock_off : rec->spin_lock_off; + if (spin_lock_off != val + reg->off) { + verbose(env, "off %lld doesn't point to 'struct %s_lock' that is at %d\n", + val + reg->off, lock_str, spin_lock_off); return -EINVAL; } if (is_lock) { void *ptr; + int type; if (map) ptr = map; else ptr = btf; - if (cur->active_locks) { - verbose(env, - "Locking two bpf_spin_locks are not allowed\n"); - return -EINVAL; + if (!is_res_lock && cur->active_locks) { + if (find_lock_state(env->cur_state, REF_TYPE_LOCK, 0, NULL)) { + verbose(env, + "Locking two bpf_spin_locks are not allowed\n"); + return -EINVAL; + } + } else if (is_res_lock) { + if (find_lock_state(env->cur_state, REF_TYPE_RES_LOCK, reg->id, ptr)) { + verbose(env, "Acquiring the same lock again, AA deadlock detected\n"); + return -EINVAL; + } } - err = acquire_lock_state(env, env->insn_idx, REF_TYPE_LOCK, reg->id, ptr); + + if (is_res_lock && is_irq) + type = REF_TYPE_RES_LOCK_IRQ; + else if (is_res_lock) + type = REF_TYPE_RES_LOCK; + else + type = REF_TYPE_LOCK; + err = acquire_lock_state(env, env->insn_idx, type, reg->id, ptr); if (err < 0) { verbose(env, "Failed to acquire lock state\n"); return err; } } else { void *ptr; + int type; if (map) ptr = map; @@ -8075,12 +8113,18 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno, ptr = btf; if (!cur->active_locks) { - verbose(env, "bpf_spin_unlock without taking a lock\n"); + verbose(env, "%s_unlock without taking a lock\n", lock_str); return -EINVAL; } - if (release_lock_state(env->cur_state, REF_TYPE_LOCK, reg->id, ptr)) { - verbose(env, "bpf_spin_unlock of different lock\n"); + if (is_res_lock && is_irq) + type = REF_TYPE_RES_LOCK_IRQ; + else if (is_res_lock) + type = REF_TYPE_RES_LOCK; + else + type = REF_TYPE_LOCK; + if (release_lock_state(cur, type, reg->id, ptr)) { + verbose(env, "%s_unlock of different lock\n", lock_str); return -EINVAL; } @@ -9391,11 +9435,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, return -EACCES; } if (meta->func_id == BPF_FUNC_spin_lock) { - err = process_spin_lock(env, regno, true); + err = process_spin_lock(env, regno, PROCESS_SPIN_LOCK); if (err) return err; } else if (meta->func_id == BPF_FUNC_spin_unlock) { - err = process_spin_lock(env, regno, false); + err = process_spin_lock(env, regno, 0); if (err) return err; } else { @@ -11274,7 +11318,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn regs[BPF_REG_0].map_uid = meta.map_uid; regs[BPF_REG_0].type = PTR_TO_MAP_VALUE | ret_flag; if (!type_may_be_null(ret_flag) && - btf_record_has_field(meta.map_ptr->record, BPF_SPIN_LOCK)) { + btf_record_has_field(meta.map_ptr->record, BPF_SPIN_LOCK | BPF_RES_SPIN_LOCK)) { regs[BPF_REG_0].id = ++env->id_gen; } break; @@ -11446,10 +11490,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn /* mark_btf_func_reg_size() is used when the reg size is determined by * the BTF func_proto's return value size and argument. 
*/ -static void mark_btf_func_reg_size(struct bpf_verifier_env *env, u32 regno, - size_t reg_size) +static void __mark_btf_func_reg_size(struct bpf_verifier_env *env, struct bpf_reg_state *regs, + u32 regno, size_t reg_size) { - struct bpf_reg_state *reg = &cur_regs(env)[regno]; + struct bpf_reg_state *reg = ®s[regno]; if (regno == BPF_REG_0) { /* Function return value */ @@ -11467,6 +11511,12 @@ static void mark_btf_func_reg_size(struct bpf_verifier_env *env, u32 regno, } } +static void mark_btf_func_reg_size(struct bpf_verifier_env *env, u32 regno, + size_t reg_size) +{ + return __mark_btf_func_reg_size(env, cur_regs(env), regno, reg_size); +} + static bool is_kfunc_acquire(struct bpf_kfunc_call_arg_meta *meta) { return meta->kfunc_flags & KF_ACQUIRE; @@ -11604,6 +11654,7 @@ enum { KF_ARG_RB_ROOT_ID, KF_ARG_RB_NODE_ID, KF_ARG_WORKQUEUE_ID, + KF_ARG_RES_SPIN_LOCK_ID, }; BTF_ID_LIST(kf_arg_btf_ids) @@ -11613,6 +11664,7 @@ BTF_ID(struct, bpf_list_node) BTF_ID(struct, bpf_rb_root) BTF_ID(struct, bpf_rb_node) BTF_ID(struct, bpf_wq) +BTF_ID(struct, bpf_res_spin_lock) static bool __is_kfunc_ptr_arg_type(const struct btf *btf, const struct btf_param *arg, int type) @@ -11661,6 +11713,11 @@ static bool is_kfunc_arg_wq(const struct btf *btf, const struct btf_param *arg) return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_WORKQUEUE_ID); } +static bool is_kfunc_arg_res_spin_lock(const struct btf *btf, const struct btf_param *arg) +{ + return __is_kfunc_ptr_arg_type(btf, arg, KF_ARG_RES_SPIN_LOCK_ID); +} + static bool is_kfunc_arg_callback(struct bpf_verifier_env *env, const struct btf *btf, const struct btf_param *arg) { @@ -11732,6 +11789,7 @@ enum kfunc_ptr_arg_type { KF_ARG_PTR_TO_MAP, KF_ARG_PTR_TO_WORKQUEUE, KF_ARG_PTR_TO_IRQ_FLAG, + KF_ARG_PTR_TO_RES_SPIN_LOCK, }; enum special_kfunc_type { @@ -11768,6 +11826,10 @@ enum special_kfunc_type { KF_bpf_iter_num_new, KF_bpf_iter_num_next, KF_bpf_iter_num_destroy, + KF_bpf_res_spin_lock, + KF_bpf_res_spin_unlock, + KF_bpf_res_spin_lock_irqsave, + KF_bpf_res_spin_unlock_irqrestore, }; BTF_SET_START(special_kfunc_set) @@ -11846,6 +11908,10 @@ BTF_ID(func, bpf_local_irq_restore) BTF_ID(func, bpf_iter_num_new) BTF_ID(func, bpf_iter_num_next) BTF_ID(func, bpf_iter_num_destroy) +BTF_ID(func, bpf_res_spin_lock) +BTF_ID(func, bpf_res_spin_unlock) +BTF_ID(func, bpf_res_spin_lock_irqsave) +BTF_ID(func, bpf_res_spin_unlock_irqrestore) static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta) { @@ -11939,6 +12005,9 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env, if (is_kfunc_arg_irq_flag(meta->btf, &args[argno])) return KF_ARG_PTR_TO_IRQ_FLAG; + if (is_kfunc_arg_res_spin_lock(meta->btf, &args[argno])) + return KF_ARG_PTR_TO_RES_SPIN_LOCK; + if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) { if (!btf_type_is_struct(ref_t)) { verbose(env, "kernel function %s args#%d pointer type %s %s is not supported\n", @@ -12046,13 +12115,19 @@ static int process_irq_flag(struct bpf_verifier_env *env, int regno, struct bpf_kfunc_call_arg_meta *meta) { struct bpf_reg_state *regs = cur_regs(env), *reg = ®s[regno]; + int err, kfunc_class = IRQ_NATIVE_KFUNC; bool irq_save; - int err; - if (meta->func_id == special_kfunc_list[KF_bpf_local_irq_save]) { + if (meta->func_id == special_kfunc_list[KF_bpf_local_irq_save] || + meta->func_id == special_kfunc_list[KF_bpf_res_spin_lock_irqsave]) { irq_save = true; - } else if (meta->func_id == special_kfunc_list[KF_bpf_local_irq_restore]) { + if (meta->func_id == 
special_kfunc_list[KF_bpf_res_spin_lock_irqsave]) + kfunc_class = IRQ_LOCK_KFUNC; + } else if (meta->func_id == special_kfunc_list[KF_bpf_local_irq_restore] || + meta->func_id == special_kfunc_list[KF_bpf_res_spin_unlock_irqrestore]) { irq_save = false; + if (meta->func_id == special_kfunc_list[KF_bpf_res_spin_unlock_irqrestore]) + kfunc_class = IRQ_LOCK_KFUNC; } else { verbose(env, "verifier internal error: unknown irq flags kfunc\n"); return -EFAULT; @@ -12068,7 +12143,7 @@ static int process_irq_flag(struct bpf_verifier_env *env, int regno, if (err) return err; - err = mark_stack_slot_irq_flag(env, meta, reg, env->insn_idx); + err = mark_stack_slot_irq_flag(env, meta, reg, env->insn_idx, kfunc_class); if (err) return err; } else { @@ -12082,7 +12157,7 @@ static int process_irq_flag(struct bpf_verifier_env *env, int regno, if (err) return err; - err = unmark_stack_slot_irq_flag(env, reg); + err = unmark_stack_slot_irq_flag(env, reg, kfunc_class); if (err) return err; } @@ -12209,7 +12284,8 @@ static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_ if (!env->cur_state->active_locks) return -EINVAL; - s = find_lock_state(env->cur_state, REF_TYPE_LOCK, id, ptr); + s = find_lock_state(env->cur_state, REF_TYPE_LOCK | REF_TYPE_RES_LOCK | REF_TYPE_RES_LOCK_IRQ, + id, ptr); if (!s) { verbose(env, "held lock and object are not in the same allocation\n"); return -EINVAL; @@ -12245,9 +12321,18 @@ static bool is_bpf_graph_api_kfunc(u32 btf_id) btf_id == special_kfunc_list[KF_bpf_refcount_acquire_impl]; } +static bool is_bpf_res_spin_lock_kfunc(u32 btf_id) +{ + return btf_id == special_kfunc_list[KF_bpf_res_spin_lock] || + btf_id == special_kfunc_list[KF_bpf_res_spin_unlock] || + btf_id == special_kfunc_list[KF_bpf_res_spin_lock_irqsave] || + btf_id == special_kfunc_list[KF_bpf_res_spin_unlock_irqrestore]; +} + static bool kfunc_spin_allowed(u32 btf_id) { - return is_bpf_graph_api_kfunc(btf_id) || is_bpf_iter_num_api_kfunc(btf_id); + return is_bpf_graph_api_kfunc(btf_id) || is_bpf_iter_num_api_kfunc(btf_id) || + is_bpf_res_spin_lock_kfunc(btf_id); } static bool is_sync_callback_calling_kfunc(u32 btf_id) @@ -12679,6 +12764,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ case KF_ARG_PTR_TO_CONST_STR: case KF_ARG_PTR_TO_WORKQUEUE: case KF_ARG_PTR_TO_IRQ_FLAG: + case KF_ARG_PTR_TO_RES_SPIN_LOCK: break; default: WARN_ON_ONCE(1); @@ -12977,6 +13063,28 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_ if (ret < 0) return ret; break; + case KF_ARG_PTR_TO_RES_SPIN_LOCK: + { + int flags = PROCESS_RES_LOCK; + + if (reg->type != PTR_TO_MAP_VALUE && reg->type != (PTR_TO_BTF_ID | MEM_ALLOC)) { + verbose(env, "arg#%d doesn't point to map value or allocated object\n", i); + return -EINVAL; + } + + if (!is_bpf_res_spin_lock_kfunc(meta->func_id)) + return -EFAULT; + if (meta->func_id == special_kfunc_list[KF_bpf_res_spin_lock] || + meta->func_id == special_kfunc_list[KF_bpf_res_spin_lock_irqsave]) + flags |= PROCESS_SPIN_LOCK; + if (meta->func_id == special_kfunc_list[KF_bpf_res_spin_lock_irqsave] || + meta->func_id == special_kfunc_list[KF_bpf_res_spin_unlock_irqrestore]) + flags |= PROCESS_LOCK_IRQ; + ret = process_spin_lock(env, regno, flags); + if (ret < 0) + return ret; + break; + } } } @@ -13062,6 +13170,33 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, insn_aux->is_iter_next = is_iter_next_kfunc(&meta); + if (!insn->off && + (insn->imm == special_kfunc_list[KF_bpf_res_spin_lock] 
|| + insn->imm == special_kfunc_list[KF_bpf_res_spin_lock_irqsave])) { + struct bpf_verifier_state *branch; + struct bpf_reg_state *regs; + + branch = push_stack(env, env->insn_idx + 1, env->insn_idx, false); + if (!branch) { + verbose(env, "failed to push state for failed lock acquisition\n"); + return -ENOMEM; + } + + regs = branch->frame[branch->curframe]->regs; + + /* Clear r0-r5 registers in forked state */ + for (i = 0; i < CALLER_SAVED_REGS; i++) + mark_reg_not_init(env, regs, caller_saved[i]); + + mark_reg_unknown(env, regs, BPF_REG_0); + err = __mark_reg_s32_range(env, regs, BPF_REG_0, -MAX_ERRNO, -1); + if (err) { + verbose(env, "failed to mark s32 range for retval in forked state for lock\n"); + return err; + } + __mark_btf_func_reg_size(env, regs, BPF_REG_0, sizeof(u32)); + } + if (is_kfunc_destructive(&meta) && !capable(CAP_SYS_BOOT)) { verbose(env, "destructive kfunc calls require CAP_SYS_BOOT capability\n"); return -EACCES; @@ -13232,6 +13367,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, if (btf_type_is_scalar(t)) { mark_reg_unknown(env, regs, BPF_REG_0); + if (meta.btf == btf_vmlinux && (meta.func_id == special_kfunc_list[KF_bpf_res_spin_lock] || + meta.func_id == special_kfunc_list[KF_bpf_res_spin_lock_irqsave])) + __mark_reg_const_zero(env, ®s[BPF_REG_0]); mark_btf_func_reg_size(env, BPF_REG_0, t->size); } else if (btf_type_is_ptr(t)) { ptr_type = btf_type_skip_modifiers(desc_btf, t->type, &ptr_type_id); @@ -18114,7 +18252,8 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old, case STACK_IRQ_FLAG: old_reg = &old->stack[spi].spilled_ptr; cur_reg = &cur->stack[spi].spilled_ptr; - if (!check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) + if (!check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap) || + old_reg->irq.kfunc_class != cur_reg->irq.kfunc_class) return false; break; case STACK_MISC: @@ -18158,6 +18297,8 @@ static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *c case REF_TYPE_IRQ: break; case REF_TYPE_LOCK: + case REF_TYPE_RES_LOCK: + case REF_TYPE_RES_LOCK_IRQ: if (old->refs[i].ptr != cur->refs[i].ptr) return false; break; @@ -19491,7 +19632,7 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env, } } - if (btf_record_has_field(map->record, BPF_SPIN_LOCK)) { + if (btf_record_has_field(map->record, BPF_SPIN_LOCK | BPF_RES_SPIN_LOCK)) { if (prog_type == BPF_PROG_TYPE_SOCKET_FILTER) { verbose(env, "socket filter progs cannot use bpf_spin_lock yet\n"); return -EINVAL; From patchwork Thu Feb 6 10:54:33 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kumar Kartikeya Dwivedi X-Patchwork-Id: 13962913 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5CCCBC02194 for ; Thu, 6 Feb 2025 11:32:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: 
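
As a minimal sketch of the contract the forked state above encodes (not part of
the patch; the struct, map, and program names are invented for illustration,
and the extern kfunc declarations follow the style used in progs/irq.c): the
acquisition kfuncs return 0 on success or a negative errno in the range
[-MAX_ERRNO, -1] on failure, and the verifier treats the lock as held only on
the success branch, so programs must branch on the return value before touching
the critical section.

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

extern int bpf_res_spin_lock(struct bpf_res_spin_lock *lock) __weak __ksym;
extern void bpf_res_spin_unlock(struct bpf_res_spin_lock *lock) __weak __ksym;

/* Hypothetical map value carrying a resilient lock. */
struct elem {
	struct bpf_res_spin_lock lock;
	long counter;
};

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1);
	__type(key, int);
	__type(value, struct elem);
} emap SEC(".maps");

SEC("tc")
int update_counter(struct __sk_buff *ctx)
{
	struct elem *e = bpf_map_lookup_elem(&emap, &(int){0});

	if (!e)
		return 0;
	/* r0 is 0 on success or a negative errno (e.g. -EDEADLK, -ETIMEDOUT);
	 * the lock is only considered held on the zero branch.
	 */
	if (bpf_res_spin_lock(&e->lock))
		return 0;
	e->counter++;
	bpf_res_spin_unlock(&e->lock);
	return 0;
}

char _license[] SEC("license") = "GPL";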
From patchwork Thu Feb 6 10:54:33 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962913
From: Kumar Kartikeya Dwivedi
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Linus Torvalds, Peter Zijlstra, Will Deacon, Waiman Long,
    Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney", Tejun Heo,
    Barret Rhoden, Josh Don, Dohyun Kim,
    linux-arm-kernel@lists.infradead.org, kernel-team@meta.com
Subject: [PATCH bpf-next v2 25/26] bpf: Maintain FIFO property for rqspinlock unlock
Date: Thu, 6 Feb 2025 02:54:33 -0800
Message-ID: <20250206105435.2159977-26-memxor@gmail.com>
In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com>
References: <20250206105435.2159977-1-memxor@gmail.com>

Since out-of-order unlocks are unsupported for rqspinlock, and the irqsave
variants already enforce strict FIFO ordering, make the same change for the
normal non-irqsave variants, so that FIFO ordering is enforced there as well.
Two new verifier state fields (active_lock_id, active_lock_ptr) denote the top
of the held-lock stack, and prev_id/prev_ptr are recomputed whenever the
topmost entry is popped through an unlock.
Take special care to make these fields part of the state comparison in
refsafe.

Signed-off-by: Kumar Kartikeya Dwivedi
---
 include/linux/bpf_verifier.h |  3 +++
 kernel/bpf/verifier.c        | 33 ++++++++++++++++++++++++++++-----
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index ed444e44f524..92cd2289b743 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -269,6 +269,7 @@ struct bpf_reference_state {
 		REF_TYPE_LOCK		= (1 << 3),
 		REF_TYPE_RES_LOCK	= (1 << 4),
 		REF_TYPE_RES_LOCK_IRQ	= (1 << 5),
+		REF_TYPE_LOCK_MASK	= REF_TYPE_LOCK | REF_TYPE_RES_LOCK | REF_TYPE_RES_LOCK_IRQ,
 	} type;
 	/* Track each reference created with a unique id, even if the same
 	 * instruction creates the reference multiple times (eg, via CALL).
@@ -435,6 +436,8 @@ struct bpf_verifier_state {
 	u32 active_locks;
 	u32 active_preempt_locks;
 	u32 active_irq_id;
+	u32 active_lock_id;
+	void *active_lock_ptr;
 	bool active_rcu_lock;
 
 	bool speculative;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 294761dd0072..9cac6ea4f844 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1421,6 +1421,8 @@ static int copy_reference_state(struct bpf_verifier_state *dst, const struct bpf
 	dst->active_preempt_locks = src->active_preempt_locks;
 	dst->active_rcu_lock = src->active_rcu_lock;
 	dst->active_irq_id = src->active_irq_id;
+	dst->active_lock_id = src->active_lock_id;
+	dst->active_lock_ptr = src->active_lock_ptr;
 	return 0;
 }
 
@@ -1520,6 +1522,8 @@ static int acquire_lock_state(struct bpf_verifier_env *env, int insn_idx, enum r
 	s->ptr = ptr;
 
 	state->active_locks++;
+	state->active_lock_id = id;
+	state->active_lock_ptr = ptr;
 	return 0;
 }
 
@@ -1559,16 +1563,24 @@ static void release_reference_state(struct bpf_verifier_state *state, int idx)
 
 static int release_lock_state(struct bpf_verifier_state *state, int type, int id, void *ptr)
 {
+	void *prev_ptr = NULL;
+	u32 prev_id = 0;
 	int i;
 
 	for (i = 0; i < state->acquired_refs; i++) {
-		if (state->refs[i].type != type)
-			continue;
-		if (state->refs[i].id == id && state->refs[i].ptr == ptr) {
+		if (state->refs[i].type == type && state->refs[i].id == id &&
+		    state->refs[i].ptr == ptr) {
 			release_reference_state(state, i);
 			state->active_locks--;
+			/* Reassign active lock (id, ptr). */
+			state->active_lock_id = prev_id;
+			state->active_lock_ptr = prev_ptr;
 			return 0;
 		}
+		if (state->refs[i].type & REF_TYPE_LOCK_MASK) {
+			prev_id = state->refs[i].id;
+			prev_ptr = state->refs[i].ptr;
+		}
 	}
 	return -EINVAL;
 }
@@ -8123,6 +8135,14 @@ static int process_spin_lock(struct bpf_verifier_env *env, int regno, int flags)
 			type = REF_TYPE_RES_LOCK;
 		else
 			type = REF_TYPE_LOCK;
+		if (!find_lock_state(cur, type, reg->id, ptr)) {
+			verbose(env, "%s_unlock of different lock\n", lock_str);
+			return -EINVAL;
+		}
+		if (reg->id != cur->active_lock_id || ptr != cur->active_lock_ptr) {
+			verbose(env, "%s_unlock cannot be out of order\n", lock_str);
+			return -EINVAL;
+		}
 		if (release_lock_state(cur, type, reg->id, ptr)) {
 			verbose(env, "%s_unlock of different lock\n", lock_str);
 			return -EINVAL;
@@ -12284,8 +12304,7 @@ static int check_reg_allocation_locked(struct bpf_verifier_env *env, struct bpf_
 	if (!env->cur_state->active_locks)
 		return -EINVAL;
 
-	s = find_lock_state(env->cur_state, REF_TYPE_LOCK | REF_TYPE_RES_LOCK | REF_TYPE_RES_LOCK_IRQ,
-			    id, ptr);
+	s = find_lock_state(env->cur_state, REF_TYPE_LOCK_MASK, id, ptr);
 	if (!s) {
 		verbose(env, "held lock and object are not in the same allocation\n");
 		return -EINVAL;
	}
@@ -18288,6 +18307,10 @@ static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *c
 	if (!check_ids(old->active_irq_id, cur->active_irq_id, idmap))
 		return false;
 
+	if (!check_ids(old->active_lock_id, cur->active_lock_id, idmap) ||
+	    old->active_lock_ptr != cur->active_lock_ptr)
+		return false;
+
 	for (i = 0; i < old->acquired_refs; i++) {
 		if (!check_ids(old->refs[i].id, cur->refs[i].id, idmap) ||
 		    old->refs[i].type != cur->refs[i].type)
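
A minimal sketch of the ordering rule this patch enforces (illustrative only;
lock1/lock2 mirror the .data.OO1/.data.OO2 locks in the selftests added by the
next patch): nested resilient locks must now be released in the reverse of
their acquisition order even for the non-irqsave variants, and swapping the
final two unlocks trips the new out-of-order error.

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>

extern int bpf_res_spin_lock(struct bpf_res_spin_lock *lock) __weak __ksym;
extern void bpf_res_spin_unlock(struct bpf_res_spin_lock *lock) __weak __ksym;

struct bpf_res_spin_lock lock1 __hidden SEC(".data.OO1");
struct bpf_res_spin_lock lock2 __hidden SEC(".data.OO2");

SEC("tc")
int fifo_unlock_order(struct __sk_buff *ctx)
{
	if (bpf_res_spin_lock(&lock1))
		return 0;
	if (bpf_res_spin_lock(&lock2)) {
		bpf_res_spin_unlock(&lock1);
		return 0;
	}
	/* Accepted: the most recently taken lock is released first. */
	bpf_res_spin_unlock(&lock2);
	bpf_res_spin_unlock(&lock1);
	/* Releasing lock1 before lock2 here would instead be rejected with
	 * "bpf_res_spin_unlock cannot be out of order".
	 */
	return 0;
}

char _license[] SEC("license") = "GPL";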
From patchwork Thu Feb 6 10:54:34 2025
X-Patchwork-Submitter: Kumar Kartikeya Dwivedi
X-Patchwork-Id: 13962914
From: Kumar Kartikeya Dwivedi
To: bpf@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Linus Torvalds, Peter Zijlstra, Will Deacon, Waiman Long,
    Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    Martin KaFai Lau, Eduard Zingerman, "Paul E. McKenney", Tejun Heo,
    Barret Rhoden, Josh Don, Dohyun Kim,
    linux-arm-kernel@lists.infradead.org, kernel-team@meta.com
Subject: [PATCH bpf-next v2 26/26] selftests/bpf: Add tests for rqspinlock
Date: Thu, 6 Feb 2025 02:54:34 -0800
Message-ID: <20250206105435.2159977-27-memxor@gmail.com>
In-Reply-To: <20250206105435.2159977-1-memxor@gmail.com>
References: <20250206105435.2159977-1-memxor@gmail.com>

Introduce selftests that trigger AA and ABBA deadlocks, and test the edge case
where the held-locks table runs out of entries, since we then fall back to the
timeout as the final line of defense. Also exercise the verifier's AA detection
where applicable.

Signed-off-by: Kumar Kartikeya Dwivedi
---
 .../selftests/bpf/prog_tests/res_spin_lock.c  |  99 +++++++
 tools/testing/selftests/bpf/progs/irq.c       |  53 ++++
 .../selftests/bpf/progs/res_spin_lock.c       | 143 ++++++++++
 .../selftests/bpf/progs/res_spin_lock_fail.c  | 244 ++++++++++++++++++
 4 files changed, 539 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/res_spin_lock.c
 create mode 100644 tools/testing/selftests/bpf/progs/res_spin_lock.c
 create mode 100644 tools/testing/selftests/bpf/progs/res_spin_lock_fail.c

diff --git a/tools/testing/selftests/bpf/prog_tests/res_spin_lock.c b/tools/testing/selftests/bpf/prog_tests/res_spin_lock.c
new file mode 100644
index 000000000000..5a46b3e4a842
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/res_spin_lock.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <test_progs.h>
+#include <network_helpers.h>
+
+#include "res_spin_lock.skel.h"
+#include "res_spin_lock_fail.skel.h"
+
+static void test_res_spin_lock_failure(void)
+{
+	RUN_TESTS(res_spin_lock_fail);
+}
+
+static volatile int skip;
+
+static void *spin_lock_thread(void *arg)
+{
+	int err, prog_fd = *(u32 *) arg;
+	LIBBPF_OPTS(bpf_test_run_opts, topts,
+		    .data_in = &pkt_v4,
+		    .data_size_in = sizeof(pkt_v4),
+		    .repeat = 10000,
+	);
+
+	while (!READ_ONCE(skip)) {
+		err = bpf_prog_test_run_opts(prog_fd, &topts);
+		ASSERT_OK(err, "test_run");
+		ASSERT_OK(topts.retval, "test_run retval");
+	}
+	pthread_exit(arg);
+}
+
+static void test_res_spin_lock_success(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, topts,
+		    .data_in = &pkt_v4,
+		    .data_size_in = sizeof(pkt_v4),
+		    .repeat = 1,
+	);
+	struct res_spin_lock *skel;
+	pthread_t thread_id[16];
+	int prog_fd, i, err;
+	void *ret;
+
+	skel = res_spin_lock__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "res_spin_lock__open_and_load"))
+		return;
+	/* AA deadlock */
+	prog_fd = bpf_program__fd(skel->progs.res_spin_lock_test);
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "error");
+	ASSERT_OK(topts.retval, "retval");
+
+	prog_fd = bpf_program__fd(skel->progs.res_spin_lock_test_held_lock_max);
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "error");
+	ASSERT_OK(topts.retval, "retval");
+
+	/* Multi-threaded ABBA deadlock. */
+
+	prog_fd = bpf_program__fd(skel->progs.res_spin_lock_test_AB);
+	for (i = 0; i < 16; i++) {
+		int err;
+
+		err = pthread_create(&thread_id[i], NULL, &spin_lock_thread, &prog_fd);
+		if (!ASSERT_OK(err, "pthread_create"))
+			goto end;
+	}
+
+	topts.repeat = 1000;
+	int fd = bpf_program__fd(skel->progs.res_spin_lock_test_BA);
+	while (!topts.retval && !err && !READ_ONCE(skel->bss->err)) {
+		err = bpf_prog_test_run_opts(fd, &topts);
+	}
+
+	WRITE_ONCE(skip, true);
+
+	for (i = 0; i < 16; i++) {
+		if (!ASSERT_OK(pthread_join(thread_id[i], &ret), "pthread_join"))
+			goto end;
+		if (!ASSERT_EQ(ret, &prog_fd, "ret == prog_fd"))
+			goto end;
+	}
+
+	ASSERT_EQ(READ_ONCE(skel->bss->err), -EDEADLK, "timeout err");
+	ASSERT_OK(err, "err");
+	ASSERT_EQ(topts.retval, -EDEADLK, "timeout");
+end:
+	res_spin_lock__destroy(skel);
+	return;
+}
+
+void test_res_spin_lock(void)
+{
+	if (test__start_subtest("res_spin_lock_success"))
+		test_res_spin_lock_success();
+	if (test__start_subtest("res_spin_lock_failure"))
+		test_res_spin_lock_failure();
+}
diff --git a/tools/testing/selftests/bpf/progs/irq.c b/tools/testing/selftests/bpf/progs/irq.c
index b0b53d980964..3d4fee83a5be 100644
--- a/tools/testing/selftests/bpf/progs/irq.c
+++ b/tools/testing/selftests/bpf/progs/irq.c
@@ -11,6 +11,9 @@ extern void bpf_local_irq_save(unsigned long *) __weak __ksym;
 extern void bpf_local_irq_restore(unsigned long *) __weak __ksym;
 extern int bpf_copy_from_user_str(void *dst, u32 dst__sz, const void *unsafe_ptr__ign, u64 flags) __weak __ksym;
 
+struct bpf_res_spin_lock lockA __hidden SEC(".data.A");
+struct bpf_res_spin_lock lockB __hidden SEC(".data.B");
+
 SEC("?tc")
 __failure __msg("arg#0 doesn't point to an irq flag on stack")
 int irq_save_bad_arg(struct __sk_buff *ctx)
@@ -441,4 +444,54 @@ int irq_ooo_refs_array(struct __sk_buff *ctx)
 	return 0;
 }
 
+SEC("?tc")
+__failure __msg("cannot restore irq state out of order")
+int irq_ooo_lock_cond_inv(struct __sk_buff *ctx)
+{
+	unsigned long flags1, flags2;
+
+	if (bpf_res_spin_lock_irqsave(&lockA, &flags1))
+		return 0;
+	if (bpf_res_spin_lock_irqsave(&lockB, &flags2)) {
+		bpf_res_spin_unlock_irqrestore(&lockA, &flags1);
+		return 0;
+	}
+
+	bpf_res_spin_unlock_irqrestore(&lockB, &flags1);
+	bpf_res_spin_unlock_irqrestore(&lockA, &flags2);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("function calls are not allowed")
+int irq_wrong_kfunc_class_1(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+
+	if (bpf_res_spin_lock_irqsave(&lockA, &flags1))
+		return 0;
+	/* For now, bpf_local_irq_restore is not allowed in critical section,
+	 * but this test ensures error will be caught with kfunc_class when it's
+	 * opened up. Tested by temporarily permitting this kfunc in critical
+	 * section.
+	 */
+	bpf_local_irq_restore(&flags1);
+	bpf_res_spin_unlock_irqrestore(&lockA, &flags1);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("function calls are not allowed")
+int irq_wrong_kfunc_class_2(struct __sk_buff *ctx)
+{
+	unsigned long flags1, flags2;
+
+	bpf_local_irq_save(&flags1);
+	if (bpf_res_spin_lock_irqsave(&lockA, &flags2))
+		return 0;
+	bpf_local_irq_restore(&flags2);
+	bpf_res_spin_unlock_irqrestore(&lockA, &flags1);
+	return 0;
+}
+
 char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/res_spin_lock.c b/tools/testing/selftests/bpf/progs/res_spin_lock.c
new file mode 100644
index 000000000000..f68aa2ccccc2
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/res_spin_lock.c
@@ -0,0 +1,143 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+#define EDEADLK 35
+#define ETIMEDOUT 110
+
+struct arr_elem {
+	struct bpf_res_spin_lock lock;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 64);
+	__type(key, int);
+	__type(value, struct arr_elem);
+} arrmap SEC(".maps");
+
+struct bpf_res_spin_lock lockA __hidden SEC(".data.A");
+struct bpf_res_spin_lock lockB __hidden SEC(".data.B");
+
+SEC("tc")
+int res_spin_lock_test(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem1, *elem2;
+	int r;
+
+	elem1 = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem1)
+		return -1;
+	elem2 = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem2)
+		return -1;
+
+	r = bpf_res_spin_lock(&elem1->lock);
+	if (r)
+		return r;
+	if (!bpf_res_spin_lock(&elem2->lock)) {
+		bpf_res_spin_unlock(&elem2->lock);
+		bpf_res_spin_unlock(&elem1->lock);
+		return -1;
+	}
+	bpf_res_spin_unlock(&elem1->lock);
+	return 0;
+}
+
+SEC("tc")
+int res_spin_lock_test_AB(struct __sk_buff *ctx)
+{
+	int r;
+
+	r = bpf_res_spin_lock(&lockA);
+	if (r)
+		return !r;
+	/* Only unlock if we took the lock. */
+	if (!bpf_res_spin_lock(&lockB))
+		bpf_res_spin_unlock(&lockB);
+	bpf_res_spin_unlock(&lockA);
+	return 0;
+}
+
+int err;
+
+SEC("tc")
+int res_spin_lock_test_BA(struct __sk_buff *ctx)
+{
+	int r;
+
+	r = bpf_res_spin_lock(&lockB);
+	if (r)
+		return !r;
+	if (!bpf_res_spin_lock(&lockA))
+		bpf_res_spin_unlock(&lockA);
+	else
+		err = -EDEADLK;
+	bpf_res_spin_unlock(&lockB);
+	return err ?: 0;
+}
+
+SEC("tc")
+int res_spin_lock_test_held_lock_max(struct __sk_buff *ctx)
+{
+	struct bpf_res_spin_lock *locks[48] = {};
+	struct arr_elem *e;
+	u64 time_beg, time;
+	int ret = 0, i;
+
+	_Static_assert(ARRAY_SIZE(((struct rqspinlock_held){}).locks) == 32,
+		       "RES_NR_HELD assumed to be 32");
+
+	for (i = 0; i < 34; i++) {
+		int key = i;
+
+		/* We cannot pass in i as it will get spilled/filled by the compiler
+		 * and lose bounds in verifier state.
+		 */
+		e = bpf_map_lookup_elem(&arrmap, &key);
+		if (!e)
+			return 1;
+		locks[i] = &e->lock;
+	}
+
+	for (; i < 48; i++) {
+		int key = i - 2;
+
+		/* We cannot pass in i as it will get spilled/filled by the compiler
+		 * and lose bounds in verifier state.
+		 */
+		e = bpf_map_lookup_elem(&arrmap, &key);
+		if (!e)
+			return 1;
+		locks[i] = &e->lock;
+	}
+
+	time_beg = bpf_ktime_get_ns();
+	for (i = 0; i < 34; i++) {
+		if (bpf_res_spin_lock(locks[i]))
+			goto end;
+	}
+
+	/* Trigger AA, after exhausting entries in the held lock table. This
+	 * time, only the timeout can save us, as AA detection won't succeed.
+	 */
+	if (!bpf_res_spin_lock(locks[34])) {
+		bpf_res_spin_unlock(locks[34]);
+		ret = 1;
+		goto end;
+	}
+
+end:
+	for (i = i - 1; i >= 0; i--)
+		bpf_res_spin_unlock(locks[i]);
+	time = bpf_ktime_get_ns() - time_beg;
+	/* Time spent should be easily above our limit (1/2 s), since AA
+	 * detection won't be expedited due to lack of held lock entry.
+	 */
+	return ret ?: (time > 1000000000 / 2 ? 0 : 1);
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/res_spin_lock_fail.c b/tools/testing/selftests/bpf/progs/res_spin_lock_fail.c
new file mode 100644
index 000000000000..3222e9283c78
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/res_spin_lock_fail.c
@@ -0,0 +1,244 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+
+struct arr_elem {
+	struct bpf_res_spin_lock lock;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, int);
+	__type(value, struct arr_elem);
+} arrmap SEC(".maps");
+
+long value;
+
+struct bpf_spin_lock lock __hidden SEC(".data.A");
+struct bpf_res_spin_lock res_lock __hidden SEC(".data.B");
+
+SEC("?tc")
+__failure __msg("point to map value or allocated object")
+int res_spin_lock_arg(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	bpf_res_spin_lock((struct bpf_res_spin_lock *)bpf_core_cast(&elem->lock, struct __sk_buff));
+	bpf_res_spin_lock(&elem->lock);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("AA deadlock detected")
+int res_spin_lock_AA(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	bpf_res_spin_lock(&elem->lock);
+	bpf_res_spin_lock(&elem->lock);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("AA deadlock detected")
+int res_spin_lock_cond_AA(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	if (bpf_res_spin_lock(&elem->lock))
+		return 0;
+	bpf_res_spin_lock(&elem->lock);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("unlock of different lock")
+int res_spin_lock_mismatch_1(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	if (bpf_res_spin_lock(&elem->lock))
+		return 0;
+	bpf_res_spin_unlock(&res_lock);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("unlock of different lock")
+int res_spin_lock_mismatch_2(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	if (bpf_res_spin_lock(&res_lock))
+		return 0;
+	bpf_res_spin_unlock(&elem->lock);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("unlock of different lock")
+int res_spin_lock_irq_mismatch_1(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+	unsigned long f1;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	bpf_local_irq_save(&f1);
+	if (bpf_res_spin_lock(&res_lock))
+		return 0;
+	bpf_res_spin_unlock_irqrestore(&res_lock, &f1);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("unlock of different lock")
+int res_spin_lock_irq_mismatch_2(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+	unsigned long f1;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	if (bpf_res_spin_lock_irqsave(&res_lock, &f1))
+		return 0;
+	bpf_res_spin_unlock(&res_lock);
+	return 0;
+}
+
+SEC("?tc")
+__success
+int res_spin_lock_ooo(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	if (bpf_res_spin_lock(&res_lock))
+		return 0;
+	if (bpf_res_spin_lock(&elem->lock)) {
+		bpf_res_spin_unlock(&res_lock);
+		return 0;
+	}
+	bpf_res_spin_unlock(&elem->lock);
+	bpf_res_spin_unlock(&res_lock);
+	return 0;
+}
+
+SEC("?tc")
+__success
+int res_spin_lock_ooo_irq(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+	unsigned long f1, f2;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	if (bpf_res_spin_lock_irqsave(&res_lock, &f1))
+		return 0;
+	if (bpf_res_spin_lock_irqsave(&elem->lock, &f2)) {
+		bpf_res_spin_unlock_irqrestore(&res_lock, &f1);
+		/* We won't have an unreleased IRQ flag error here. */
+		return 0;
+	}
+	bpf_res_spin_unlock_irqrestore(&elem->lock, &f2);
+	bpf_res_spin_unlock_irqrestore(&res_lock, &f1);
+	return 0;
+}
+
+struct bpf_res_spin_lock lock1 __hidden SEC(".data.OO1");
+struct bpf_res_spin_lock lock2 __hidden SEC(".data.OO2");
+
+SEC("?tc")
+__failure __msg("bpf_res_spin_unlock cannot be out of order")
+int res_spin_lock_ooo_unlock(struct __sk_buff *ctx)
+{
+	if (bpf_res_spin_lock(&lock1))
+		return 0;
+	if (bpf_res_spin_lock(&lock2)) {
+		bpf_res_spin_unlock(&lock1);
+		return 0;
+	}
+	bpf_res_spin_unlock(&lock1);
+	bpf_res_spin_unlock(&lock2);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("off 1 doesn't point to 'struct bpf_res_spin_lock' that is at 0")
+int res_spin_lock_bad_off(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem)
+		return 0;
+	bpf_res_spin_lock((void *)&elem->lock + 1);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("R1 doesn't have constant offset. bpf_res_spin_lock has to be at the constant offset")
+int res_spin_lock_var_off(struct __sk_buff *ctx)
+{
+	struct arr_elem *elem;
+	u64 val = value;
+
+	elem = bpf_map_lookup_elem(&arrmap, &(int){0});
+	if (!elem) {
+		// FIXME: Only inline assembly use in assert macro doesn't emit
+		// BTF definition.
+		bpf_throw(0);
+		return 0;
+	}
+	bpf_assert_range(val, 0, 40);
+	bpf_res_spin_lock((void *)&value + val);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("map 'res_spin.bss' has no valid bpf_res_spin_lock")
+int res_spin_lock_no_lock_map(struct __sk_buff *ctx)
+{
+	bpf_res_spin_lock((void *)&value + 1);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("local 'kptr' has no valid bpf_res_spin_lock")
+int res_spin_lock_no_lock_kptr(struct __sk_buff *ctx)
+{
+	struct { int i; } *p = bpf_obj_new(typeof(*p));
+
+	if (!p)
+		return 0;
+	bpf_res_spin_lock((void *)p);
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";