From patchwork Tue May 28 20:42:57 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mateusz Guzik <mjguzik@gmail.com>
X-Patchwork-Id: 13677393
From: Mateusz Guzik <mjguzik@gmail.com>
To: dennis@kernel.org
Cc: tj@kernel.org, hughd@google.com, akpm@linux-foundation.org,
    vbabka@suse.cz, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Mateusz Guzik <mjguzik@gmail.com>
Subject: [PATCH v4] percpu_counter: add a cmpxchg-based _add_batch variant
Date: Tue, 28 May 2024 22:42:57 +0200
Message-ID: <20240528204257.434817-1-mjguzik@gmail.com>
X-Mailer: git-send-email 2.43.0
Interrupt disable/enable trips are quite expensive on x86-64 compared to
a mere cmpxchg (note: no lock prefix!) and percpu counters are used
quite often.

With this change I get a bump of 1% ops/s for negative path lookups,
plugged into will-it-scale:

void testcase(unsigned long long *iterations, unsigned long nr)
{
        while (1) {
                int fd = open("/tmp/nonexistent", O_RDONLY);
                assert(fd == -1);

                (*iterations)++;
        }
}

The win would be higher if it were not for other slowdowns, but one has
to start somewhere.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dennis Zhou <dennis@kernel.org>
---
v4:
- fix a misplaced paren in unlikely(), reported by lkp:
  https://lore.kernel.org/oe-lkp/ZlZAbkjOylfZC5Os@snowbird/T/#t
v3:
- add a missing word to the new comment
v2:
- dodge preemption
- use this_cpu_try_cmpxchg
- keep the old variant depending on CONFIG_HAVE_CMPXCHG_LOCAL

 lib/percpu_counter.c | 44 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 39 insertions(+), 5 deletions(-)

diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index 44dd133594d4..51bc5246986d 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -73,17 +73,50 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 amount)
 EXPORT_SYMBOL(percpu_counter_set);
 
 /*
- * local_irq_save() is needed to make the function irq safe:
- * - The slow path would be ok as protected by an irq-safe spinlock.
- * - this_cpu_add would be ok as it is irq-safe by definition.
- * But:
- * The decision slow path/fast path and the actual update must be atomic, too.
+ * Add to a counter while respecting batch size.
+ *
+ * There are 2 implementations, both dealing with the following problem:
+ *
+ * The decision slow path/fast path and the actual update must be atomic.
  * Otherwise a call in process context could check the current values and
  * decide that the fast path can be used. If now an interrupt occurs before
  * the this_cpu_add(), and the interrupt updates this_cpu(*fbc->counters),
  * then the this_cpu_add() that is executed after the interrupt has completed
  * can produce values larger than "batch" or even overflows.
  */
+#ifdef CONFIG_HAVE_CMPXCHG_LOCAL
+/*
+ * Safety against interrupts is achieved in 2 ways:
+ * 1. the fast path uses local cmpxchg (note: no lock prefix)
+ * 2. the slow path operates with interrupts disabled
+ */
+void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
+{
+        s64 count;
+        unsigned long flags;
+
+        count = this_cpu_read(*fbc->counters);
+        do {
+                if (unlikely(abs(count + amount) >= batch)) {
+                        raw_spin_lock_irqsave(&fbc->lock, flags);
+                        /*
+                         * Note: by now we might have migrated to another CPU
+                         * or the value might have changed.
+                         */
+                        count = __this_cpu_read(*fbc->counters);
+                        fbc->count += count + amount;
+                        __this_cpu_sub(*fbc->counters, count);
+                        raw_spin_unlock_irqrestore(&fbc->lock, flags);
+                        return;
+                }
+        } while (!this_cpu_try_cmpxchg(*fbc->counters, &count, count + amount));
+}
+#else
+/*
+ * local_irq_save() is used to make the function irq safe:
+ * - The slow path would be ok as protected by an irq-safe spinlock.
+ * - this_cpu_add would be ok as it is irq-safe by definition.
+ */
 void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
 {
         s64 count;
@@ -101,6 +134,7 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
         }
         local_irq_restore(flags);
 }
+#endif
 EXPORT_SYMBOL(percpu_counter_add_batch);
 
 /*
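
The fast path above is the usual optimistic read/try-cmpxchg pattern. For
reference, here is a minimal userspace model of the same control flow, with
hypothetical names: a C11 compare-exchange and a pthread mutex stand in for
the kernel's local (no lock prefix) cmpxchg and the irq-safe spinlock, and a
single shared slot stands in for the per-CPU counters, so it models only the
logic, not the irq/migration safety or the performance:

#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>
#include <pthread.h>

/* Hypothetical stand-ins for one per-CPU slot plus the global count. */
struct counter {
        _Atomic int64_t pcpu;           /* models this_cpu(*fbc->counters) */
        int64_t count;                  /* models fbc->count */
        pthread_mutex_t lock;           /* models fbc->lock */
};

static void counter_add_batch(struct counter *c, int64_t amount, int32_t batch)
{
        int64_t cur = atomic_load_explicit(&c->pcpu, memory_order_relaxed);

        do {
                if (llabs(cur + amount) >= batch) {
                        /* slow path: fold the local value into the global count */
                        pthread_mutex_lock(&c->lock);
                        cur = atomic_load_explicit(&c->pcpu, memory_order_relaxed);
                        c->count += cur + amount;
                        atomic_fetch_sub_explicit(&c->pcpu, cur, memory_order_relaxed);
                        pthread_mutex_unlock(&c->lock);
                        return;
                }
                /* fast path: on failure, cur is refreshed and we recheck the batch */
        } while (!atomic_compare_exchange_weak(&c->pcpu, &cur, cur + amount));
}

The property being mirrored is that the batch check and the update are one
atomic step: either the compare-exchange succeeds against exactly the value
that passed the check, or the loop retries with the fresh value.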
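
For context on how this function is reached: percpu_counter_add() forwards to
percpu_counter_add_batch() with the default percpu_counter_batch, so existing
callers pick up the new fast path unchanged. An illustrative caller (the
counter name is made up):

        struct percpu_counter nr_events;

        if (percpu_counter_init(&nr_events, 0, GFP_KERNEL))
                return -ENOMEM;

        percpu_counter_add_batch(&nr_events, 1, 32);    /* within batch: fast path */
        pr_info("total: %lld\n", percpu_counter_sum(&nr_events));

        percpu_counter_destroy(&nr_events);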