From patchwork Fri Jan 13 18:44:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mateusz Guzik X-Patchwork-Id: 13101497 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5B05AC3DA78 for ; Fri, 13 Jan 2023 18:47:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=xubKYOaJVAdAB0JsaBsKykdOHW+nXnwkyV4BoDC4hUs=; b=x0QulFSoqz06sd IcjboE1uS0FbkmHonR0ct5+M/A4brukwxD+jwWVHEJ57/7bYY9jMcU2JbE1+T2DrJsKluzE5rg8mK YVmbosfyzzsyGd/cSB+lpuUgS/1sWu4txd+ThXszoz1GTRRuea5hz5se3PbJHfLQFgwRvDwO8Xx5u NJ6w9rIr90cADMaqf3RrwR+Zrv9U92PZqQnAc3SvPx0Qz6EoPnwBpfSlLM8baFZbg3X3cUP+ZgDs2 U09HTuQzlaEtm9dnvFRxUvU9W4Fo6BGuB6lFOEVpQYaERJ2Sz3eeNeud8EbQrzXyoxof3aswVx82Y rT8xRkKqR0N0L6kaoWYA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1pGP4I-0046Sf-TJ; Fri, 13 Jan 2023 18:46:23 +0000 Received: from mail-ed1-x531.google.com ([2a00:1450:4864:20::531]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1pGP32-0045l2-7r for linux-arm-kernel@lists.infradead.org; Fri, 13 Jan 2023 18:45:05 +0000 Received: by mail-ed1-x531.google.com with SMTP id i9so32396419edj.4 for ; Fri, 13 Jan 2023 10:45:03 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=fVyfE7TORjs2ncDiYR6RHn/voCN67+L9RGjNyyFWon4=; b=gub/9QAk7eF+UCFfx0DndHNotZYTPE//gNHNzxMJu9KVlzlge37X8QnM4vz9XgVoGW NTkTX0Jrvy0X1iLCKx3dtMJEfx1P9RPnrqj7SnNoT3zAc9U0LsO7GYYH1r76YnEOuuVV S0Dmt6PeQLQxQlI3Km6j9RPdxR3Fde4aD7mC5AA83q3moLDMKWIU4il2nMi73hJVfLuw N6PrrdTTg6pe6P62Hz/uBnY9fNKwaBHvRi2vPXov2BNekdhY5fwYHwVsLVFpdrYgkM0N uTHOmA+zJH5s9vTAh+oiVpU46/5NyuCKWF5Rvim+ioS1Vw+147q2Xkc11aL/mVGJa9uV h0lg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fVyfE7TORjs2ncDiYR6RHn/voCN67+L9RGjNyyFWon4=; b=3ElfbMZkqFkyLa2JJhQvHrDROMkomAxp5RkJF9405kIJjy4gtbsCeUjpajuiEuYQvO T8ex+vQWSqFOiBAgqcZtIoNjLD5YGtQ0PJE/IDXp1CKjZVyz2BBDOX2hLjHe/4MkzxWC TQ4gO5pQfDboPpzXDAT4JxM+7CvHAp2qpmUEEm7FVCSG/mKQuErbtovMR2jezwfQF7OL 82yBz7VXrbxfSSy/mSsBAeb6v1m1b3AJ7etkr9l5mjURoTApgSAl1dBgcYpxAewUkPjg o4HwGsuyJby5UCKRobiPQGwkeKP8A/bYhElxF2DEFzZmIWVk0LSc7+xyZK4XkyMrl3w3 Jp6w== X-Gm-Message-State: AFqh2kpiojsSxSjb9v+25ZwR4dpHB7SyjJYQa7J6B9dv3ByT/IkZdVwK E4VFKspNv2b0taMexdHv6p8= X-Google-Smtp-Source: AMrXdXs9Nq/D7npk4F9SpsYBT4FccC0XrYrklNLShn5a9UNkqQ5v0jyGHNqsroeXFPFHmdioCAUrzg== X-Received: by 2002:a05:6402:3227:b0:48e:ac4e:7bfa with SMTP id g39-20020a056402322700b0048eac4e7bfamr847292eda.2.1673635501957; Fri, 13 Jan 2023 10:45:01 -0800 (PST) Received: from f.. (cst-prg-72-175.cust.vodafone.cz. 
[46.135.72.175]) by smtp.gmail.com with ESMTPSA id m17-20020a50ef11000000b0049c4e3d4139sm889873eds.89.2023.01.13.10.45.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 13 Jan 2023 10:45:01 -0800 (PST)
From: Mateusz Guzik
To: torvalds@linux-foundation.org
Cc: catalin.marinas@arm.com, jan.glauber@gmail.com,
 linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-fsdevel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 mjguzik@gmail.com, mpe@ellerman.id.au, tony.luck@intel.com,
 viro@zeniv.linux.org.uk, will@kernel.org
Subject: [PATCH] lockref: stop doing cpu_relax in the cmpxchg loop
Date: Fri, 13 Jan 2023 19:44:47 +0100
Message-Id: <20230113184447.1707316-1-mjguzik@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To:
References:
MIME-Version: 1.0
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3
X-CRM114-CacheID: sfid-20230113_104504_322067_22976507
X-CRM114-Status: GOOD ( 20.95 )
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.34
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: "linux-arm-kernel"
Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org

On the x86-64 architecture even a failing cmpxchg grants exclusive
access to the cacheline, making it preferable to retry the failed op
immediately instead of stalling with the pause instruction.

To illustrate the impact, below are benchmark results obtained by
running various will-it-scale tests on top of the 6.2-rc3 kernel and
Cascade Lake (2 sockets * 24 cores * 2 threads) CPU.

All results in ops/s. Note there is some variance in re-runs, but the
code is consistently faster when contention is present.

open3 ("Same file open/close"):
proc          stock       no-pause
   1         805603         814942       (+1%)
   2        1054980        1054781       (-0%)
   8        1544802        1822858       (+18%)
  24        1191064        2199665       (+84%)
  48         851582        1469860       (+72%)
  96         609481        1427170       (+134%)

fstat2 ("Same file fstat"):
proc          stock       no-pause
   1        3013872        3047636       (+1%)
   2        4284687        4400421       (+2%)
   8        3257721        5530156       (+69%)
  24        2239819        5466127       (+144%)
  48        1701072        5256609       (+209%)
  96        1269157        6649326       (+423%)

Additionally, a kernel with a private patch to help access()
scalability:

access2 ("Same file access"):
proc          stock        patched      patched+nopause
  24        2378041        2005501        5370335       (-15% / +125%)

That is, fixing the problems in access() itself *reduces* scalability
once the cacheline ping-pong only happens in lockref with the pause
instruction still in place.

Note that the fstat and access benchmarks are not currently integrated
into will-it-scale, but interested parties can find them in pull
requests to said project.

The code at hand has a rather tortured history. The first modification
showed up in d472d9d98b463dd7 ("lockref: Relax in cmpxchg loop"),
written with Itanium in mind. Later it got patched up to use an
arch-dependent macro to stop doing it on s390, where it caused a
significant regression. Said macro underwent revisions and was
ultimately eliminated, going back to cpu_relax.

While I intended to only remove cpu_relax for x86-64, I got the
following comment from Linus:

> I would actually prefer just removing it entirely and see if somebody
> else hollers. You have the numbers to prove it hurts on real hardware,
> and I don't think we have any numbers to the contrary.
>
> So I think it's better to trust the numbers and remove it as a
> failure, than say "let's just remove it on x86-64 and leave everybody
> else with the potentially broken code"

Additionally, Will Deacon (maintainer of the arm64 port, one of the
architectures previously benchmarked):

> So, from the arm64 side of the fence, I'm perfectly happy just removing
> the cpu_relax() calls from lockref.

As such, come back full circle in history and whack it altogether.

Signed-off-by: Mateusz Guzik
---
 lib/lockref.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/lib/lockref.c b/lib/lockref.c
index 45e93ece8ba0..2afe4c5d8919 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -23,7 +23,6 @@
 		}								\
 		if (!--retry)							\
 			break;							\
-		cpu_relax();							\
 	}									\
 } while (0)
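
For illustration only, and not part of the patch: a minimal userspace
sketch of a bounded compare-and-swap retry loop of the kind this change
touches, written with C11 atomics. Everything named sketch_* or SKETCH_*
is hypothetical, as are the 64-bit layout (lock bit above a 32-bit
count) and the retry budget of 100; none of it is taken from
lib/lockref.c. The point it tries to show is the loop shape after the
change: a failed compare-exchange is retried immediately, with no pause
between attempts.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a combined lock + refcount word:
 * bit 32 means "lock held", the low 32 bits hold the count.
 * Count overflow is ignored in this sketch.
 */
struct sketch_lockref {
	_Atomic uint64_t lock_count;
};

#define SKETCH_LOCKED	(UINT64_C(1) << 32)	/* assumed lock bit */
#define SKETCH_RETRIES	100			/* assumed retry budget */

/*
 * Try to increment the count without taking the lock.
 * Returns true on success. Returns false if the lock is held or the
 * retry budget is exhausted; a real caller would then fall back to a
 * locked slow path (not shown here).
 */
static bool sketch_lockref_get(struct sketch_lockref *ref)
{
	uint64_t old = atomic_load_explicit(&ref->lock_count,
					    memory_order_relaxed);
	int retry = SKETCH_RETRIES;

	while (!(old & SKETCH_LOCKED)) {
		uint64_t next = old + 1;

		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_strong_explicit(
				&ref->lock_count, &old, next,
				memory_order_relaxed, memory_order_relaxed))
			return true;
		if (!--retry)
			break;
		/* No pause/cpu_relax() equivalent: retry right away. */
	}
	return false;
}
```

A caller would initialise lock_count with atomic_init() and treat a
false return as "take the lock and update the count under it".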