From patchwork Mon May 20 07:14:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mateusz Guzik
X-Patchwork-Id: 13668063
From: Mateusz Guzik
To: dennis@kernel.org
Cc: tj@kernel.org, hughd@google.com, akpm@linux-foundation.org,
 vbabka@suse.cz, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 Mateusz Guzik
Subject: [PATCH v2] percpu_counter: add a cmpxchg-based _add_batch variant
Date: Mon, 20 May 2024 09:14:07 +0200
Message-ID: <20240520071407.201618-1-mjguzik@gmail.com>
X-Mailer: git-send-email 2.43.0
Sender: owner-linux-mm@kvack.org
Precedence: bulk

This was "percpu_counter: reimplement _add_batch with __this_cpu_cmpxchg".

I chatted with vbabka a little bit and he pointed me at mod_zone_state,
which does the same thing I needed except dodges preemption -- turns out
cmpxchg with a gs-prefixed argument is safe here.

================ cut here ================

Interrupt disable/enable trips are quite expensive on x86-64 compared to
a mere cmpxchg (note: no lock prefix!) and percpu counters are used
quite often.

With this change I get a bump of 1% ops/s for negative path lookups,
plugged into will-it-scale:

void testcase(unsigned long long *iterations, unsigned long nr)
{
	while (1) {
		int fd = open("/tmp/nonexistent", O_RDONLY);
		assert(fd == -1);

		(*iterations)++;
	}
}

The win would be higher if it was not for other slowdowns, but one has
to start somewhere.

v2:
- dodge preemption
- use this_cpu_try_cmpxchg
- keep the old variant depending on CONFIG_HAVE_CMPXCHG_LOCAL

Signed-off-by: Mateusz Guzik
Acked-by: Vlastimil Babka
---
 lib/percpu_counter.c | 44 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 39 insertions(+), 5 deletions(-)

diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c
index 44dd133594d4..80ec2ffc981a 100644
--- a/lib/percpu_counter.c
+++ b/lib/percpu_counter.c
@@ -73,17 +73,50 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 amount)
 EXPORT_SYMBOL(percpu_counter_set);
 
 /*
- * local_irq_save() is needed to make the function irq safe:
- * - The slow path would be ok as protected by an irq-safe spinlock.
- * - this_cpu_add would be ok as it is irq-safe by definition.
- * But:
- * The decision slow path/fast path and the actual update must be atomic, too.
+ * Add to a counter while respecting batch size.
+ *
+ * There are 2 implementations, both dealing with the following problem:
+ *
+ * The decision slow path/fast path and the actual update must be atomic.
  * Otherwise a call in process context could check the current values and
  * decide that the fast path can be used. If now an interrupt occurs before
  * the this_cpu_add(), and the interrupt updates this_cpu(*fbc->counters),
  * then the this_cpu_add() that is executed after the interrupt has completed
  * can produce values larger than "batch" or even overflows.
  */
+#ifdef CONFIG_HAVE_CMPXCHG_LOCAL
+/*
+ * Safety against interrupts is achieved in 2 ways:
+ * 1. the fast path uses local cmpxchg (note: no lock prefix)
+ * 2. the slow path operates with interrupts disabled
+ */
+void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
+{
+	s64 count;
+	unsigned long flags;
+
+	count = this_cpu_read(*fbc->counters);
+	do {
+		if (unlikely(abs(count + amount) >= batch)) {
+			raw_spin_lock_irqsave(&fbc->lock, flags);
+			/*
+			 * Note: by now might have migrated to another CPU or
+			 * the value might have changed.
+			 */
+			count = __this_cpu_read(*fbc->counters);
+			fbc->count += count + amount;
+			__this_cpu_sub(*fbc->counters, count);
+			raw_spin_unlock_irqrestore(&fbc->lock, flags);
+			return;
+		}
+	} while (!this_cpu_try_cmpxchg(*fbc->counters, &count, count + amount));
+}
+#else
+/*
+ * local_irq_save() is used to make the function irq safe:
+ * - The slow path would be ok as protected by an irq-safe spinlock.
+ * - this_cpu_add would be ok as it is irq-safe by definition.
+ */
 void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
 {
 	s64 count;
@@ -101,6 +134,7 @@ void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
 	}
 	local_irq_restore(flags);
 }
+#endif
 EXPORT_SYMBOL(percpu_counter_add_batch);
 
 /*
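
For readers who want to poke at the control flow outside the kernel, here
is a rough userspace model of the cmpxchg-based variant. It is only a
sketch under stated assumptions: struct model_counter and model_add_batch
are made-up names, a single C11 atomic stands in for this CPU's
*fbc->counters slot, atomic_compare_exchange_weak() stands in for
this_cpu_try_cmpxchg(), and the spinlock plus interrupt disabling of the
slow path is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct percpu_counter on a single CPU. */
struct model_counter {
	int64_t count;		/* global value; the kernel guards it with fbc->lock */
	_Atomic int64_t local;	/* stand-in for this CPU's *fbc->counters slot */
};

static void model_add_batch(struct model_counter *c, int64_t amount,
			    int32_t batch)
{
	int64_t count = atomic_load(&c->local);

	do {
		if (llabs(count + amount) >= batch) {
			/*
			 * Slow path. The kernel takes fbc->lock with
			 * interrupts disabled here; this model does not.
			 * Re-read the slot, fold it into the global value
			 * and reset it.
			 */
			count = atomic_load(&c->local);
			c->count += count + amount;
			atomic_fetch_sub(&c->local, count);
			return;
		}
		/*
		 * Fast path. If the slot changed since it was read, the
		 * exchange fails, count is refreshed with the current
		 * value and the batch check is redone.
		 */
	} while (!atomic_compare_exchange_weak(&c->local, &count,
					       count + amount));
}
```

With a batch of 32, a hundred calls adding 1 each leave 96 in the global
count and 4 in the local slot: every 32nd call goes down the slow path
and folds the slot into the global value.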