From patchwork Fri Apr 12 09:24:39 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peng Zhang
X-Patchwork-Id: 13627432
From: Peng Zhang <zhangpeng362@huawei.com>
Subject: [RFC PATCH 1/3] Lazy percpu counters
Date: Fri, 12 Apr 2024 17:24:39 +0800
Message-ID: <20240412092441.3112481-2-zhangpeng362@huawei.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20240412092441.3112481-1-zhangpeng362@huawei.com>
References: <20240412092441.3112481-1-zhangpeng362@huawei.com>

From: Kent Overstreet

This patch adds lib/lazy-percpu-counter.c, which implements counters
that start out as atomics, but lazily switch to percpu mode if the
update rate crosses some threshold (arbitrarily set at 256 per second).

Signed-off-by: Kent Overstreet
Signed-off-by: Suren Baghdasaryan
Signed-off-by: ZhangPeng
---
 include/linux/lazy-percpu-counter.h | 82 +++++++++++++++++++++++++++++
 lib/Makefile                        |  2 +-
 lib/lazy-percpu-counter.c           | 82 +++++++++++++++++++++++++++++
 3 files changed, 165 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/lazy-percpu-counter.h
 create mode 100644 lib/lazy-percpu-counter.c

diff --git a/include/linux/lazy-percpu-counter.h b/include/linux/lazy-percpu-counter.h
new file mode 100644
index 000000000000..281b8dd88cb2
--- /dev/null
+++ b/include/linux/lazy-percpu-counter.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Lazy percpu counters:
+ * (C) 2022 Kent Overstreet
+ *
+ * Lazy percpu counters start out in atomic mode, then switch to percpu mode
+ * if the update rate crosses some threshold.
+ *
+ * This means we don't have to decide between low memory overhead atomic
+ * counters and higher performance percpu counters - we can have our cake and
+ * eat it, too!
+ *
+ * Internally we use an atomic64_t, where the low bit indicates whether we're
+ * in percpu mode, and the high 8 bits are a secondary counter that's
+ * incremented when the counter is modified - meaning 55 bits of precision
+ * are available for the counter itself.
+ */
+
+#ifndef _LINUX_LAZY_PERCPU_COUNTER_H
+#define _LINUX_LAZY_PERCPU_COUNTER_H
+
+#include <linux/atomic.h>
+#include <linux/percpu.h>
+
+struct lazy_percpu_counter {
+	atomic64_t v;
+	unsigned long last_wrap;
+};
+
+void lazy_percpu_counter_exit(struct lazy_percpu_counter *c);
+void lazy_percpu_counter_add_slowpath(struct lazy_percpu_counter *c, s64 i);
+
+/*
+ * We use the high bits of the atomic counter for a secondary counter, which
+ * is incremented every time the counter is touched. When the secondary
+ * counter wraps, we check the time the counter last wrapped, and if it was
+ * recent enough, that means the update frequency has crossed our threshold
+ * and we switch to percpu mode:
+ */
+#define COUNTER_MOD_BITS	8
+#define COUNTER_MOD_MASK	~(~0ULL >> COUNTER_MOD_BITS)
+#define COUNTER_MOD_BITS_START	(64 - COUNTER_MOD_BITS)
+
+/*
+ * We use the low bit of the counter to indicate whether we're in atomic mode
+ * (low bit clear), or percpu mode (low bit set, counter is a pointer to the
+ * actual percpu counters):
+ */
+#define COUNTER_IS_PCPU_BIT	1
+
+static inline u64 __percpu *lazy_percpu_counter_is_pcpu(u64 v)
+{
+	if (!(v & COUNTER_IS_PCPU_BIT))
+		return NULL;
+
+	v ^= COUNTER_IS_PCPU_BIT;
+	return (u64 __percpu *)(unsigned long)v;
+}
+
+/**
+ * lazy_percpu_counter_add - Add a value to a lazy_percpu_counter
+ * @c: counter to modify
+ * @i: value to add
+ */
+static inline void lazy_percpu_counter_add(struct lazy_percpu_counter *c, s64 i)
+{
+	u64 v = atomic64_read(&c->v);
+	u64 __percpu *pcpu_v = lazy_percpu_counter_is_pcpu(v);
+
+	if (likely(pcpu_v))
+		this_cpu_add(*pcpu_v, i);
+	else
+		lazy_percpu_counter_add_slowpath(c, i);
+}
+
+static inline void lazy_percpu_counter_sub(struct lazy_percpu_counter *c, s64 i)
+{
+	lazy_percpu_counter_add(c, -i);
+}
+
+#endif /* _LINUX_LAZY_PERCPU_COUNTER_H */
diff --git a/lib/Makefile b/lib/Makefile
index 2f4e17bfb299..7afa0c3e7cc7 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -46,7 +46,7 @@ obj-y += bcd.o sort.o parser.o debug_locks.o random32.o \
	 bust_spinlocks.o kasprintf.o bitmap.o scatterlist.o \
	 list_sort.o uuid.o iov_iter.o clz_ctz.o \
	 bsearch.o find_bit.o llist.o lwq.o memweight.o kfifo.o \
-	 percpu-refcount.o rhashtable.o base64.o \
+	 percpu-refcount.o lazy-percpu-counter.o rhashtable.o base64.o \
	 once.o refcount.o rcuref.o usercopy.o errseq.o bucket_locks.o \
	 generic-radix-tree.o bitmap-str.o
 obj-$(CONFIG_STRING_KUNIT_TEST) += string_kunit.o
diff --git a/lib/lazy-percpu-counter.c b/lib/lazy-percpu-counter.c
new file mode 100644
index 000000000000..e1914207214d
--- /dev/null
+++ b/lib/lazy-percpu-counter.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/atomic.h>
+#include <linux/gfp.h>
+#include <linux/jiffies.h>
+#include <linux/lazy-percpu-counter.h>
+#include <linux/percpu.h>
+
+static inline s64 lazy_percpu_counter_atomic_val(s64 v)
+{
+	/* Ensure output is sign extended properly: */
+	return (v << COUNTER_MOD_BITS) >>
+		(COUNTER_MOD_BITS + COUNTER_IS_PCPU_BIT);
+}
+
+static void lazy_percpu_counter_switch_to_pcpu(struct lazy_percpu_counter *c)
+{
+	u64 __percpu *pcpu_v = alloc_percpu_gfp(u64, GFP_ATOMIC|__GFP_NOWARN);
+	u64 old, new, v;
+
+	if (!pcpu_v)
+		return;
+
+	preempt_disable();
+	v = atomic64_read(&c->v);
+	do {
+		if (lazy_percpu_counter_is_pcpu(v)) {
+			/* Raced with another switch; drop our allocation: */
+			free_percpu(pcpu_v);
+			goto out;
+		}
+
+		old = v;
+		new = (unsigned long)pcpu_v | COUNTER_IS_PCPU_BIT;
+
+		/* Seed this CPU's counter with the accumulated atomic value: */
+		*this_cpu_ptr(pcpu_v) = lazy_percpu_counter_atomic_val(v);
+	} while ((v = atomic64_cmpxchg(&c->v, old, new)) != old);
+out:
+	preempt_enable();
+}
+
+/**
+ * lazy_percpu_counter_exit - Free resources associated with a
+ * lazy_percpu_counter
+ * @c: counter to exit
+ */
+void lazy_percpu_counter_exit(struct lazy_percpu_counter *c)
+{
+	free_percpu(lazy_percpu_counter_is_pcpu(atomic64_read(&c->v)));
+}
+EXPORT_SYMBOL_GPL(lazy_percpu_counter_exit);
+
+void lazy_percpu_counter_add_slowpath(struct lazy_percpu_counter *c, s64 i)
+{
+	u64 atomic_i;
+	u64 old, v = atomic64_read(&c->v);
+	u64 __percpu *pcpu_v;
+
+	/* Shift @i past the pcpu bit and bump the secondary (mod) counter: */
+	atomic_i = i << COUNTER_IS_PCPU_BIT;
+	atomic_i &= ~COUNTER_MOD_MASK;
+	atomic_i |= 1ULL << COUNTER_MOD_BITS_START;
+
+	do {
+		pcpu_v = lazy_percpu_counter_is_pcpu(v);
+		if (pcpu_v) {
+			this_cpu_add(*pcpu_v, i);
+			return;
+		}
+
+		old = v;
+	} while ((v = atomic64_cmpxchg(&c->v, old, old + atomic_i)) != old);
+
+	if (unlikely(!(v & COUNTER_MOD_MASK))) {
+		/*
+		 * The secondary counter wrapped: if it last wrapped less than
+		 * a second ago, the update rate has crossed our threshold and
+		 * we switch to percpu mode.
+		 */
+		unsigned long now = jiffies;
+
+		if (c->last_wrap &&
+		    unlikely(time_after(c->last_wrap + HZ, now)))
+			lazy_percpu_counter_switch_to_pcpu(c);
+		else
+			c->last_wrap = now;
+	}
+}
+EXPORT_SYMBOL(lazy_percpu_counter_add_slowpath);
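
For reference, here is a minimal caller sketch (not part of this patch;
struct foo and its functions are hypothetical). A lazy_percpu_counter is
valid once zeroed - atomic mode, value 0, never wrapped - so kzalloc() is
sufficient initialization; adds go through lazy_percpu_counter_add()/_sub(),
and lazy_percpu_counter_exit() must be called before freeing the counter so
the percpu allocation, if the counter ever switched modes, is released:

	#include <linux/lazy-percpu-counter.h>
	#include <linux/slab.h>

	struct foo {
		struct lazy_percpu_counter events;
	};

	static struct foo *foo_alloc(void)
	{
		/* Zeroed counter == atomic mode, value 0: */
		return kzalloc(sizeof(struct foo), GFP_KERNEL);
	}

	static void foo_event(struct foo *f)
	{
		/*
		 * An atomic64 cmpxchg loop while the counter is cold; a
		 * single this_cpu_add() once frequent updates (more than
		 * 256/sec) have switched it to percpu mode:
		 */
		lazy_percpu_counter_add(&f->events, 1);
	}

	static void foo_free(struct foo *f)
	{
		/* Frees the percpu allocation, if any: */
		lazy_percpu_counter_exit(&f->events);
		kfree(f);
	}

Note that this patch provides no read-side helper; reading a counter that
has gone percpu would require summing the per-CPU values.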