From patchwork Mon May 1 16:54:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13227643
Date: Mon, 1 May 2023 09:54:17 -0700
In-Reply-To: <20230501165450.15352-1-surenb@google.com>
Mime-Version: 1.0
References: <20230501165450.15352-1-surenb@google.com>
X-Mailer: git-send-email 2.40.1.495.gc816e09b53d-goog
Message-ID: <20230501165450.15352-8-surenb@google.com>
Subject: [PATCH 07/40] Lazy percpu counters
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: kent.overstreet@linux.dev, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, roman.gushchin@linux.dev, mgorman@suse.de,
    dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com,
    corbet@lwn.net, void@manifault.com, peterz@infradead.org,
    juri.lelli@redhat.com, ldufour@linux.ibm.com, catalin.marinas@arm.com,
    will@kernel.org, arnd@arndb.de, tglx@linutronix.de, mingo@redhat.com,
    dave.hansen@linux.intel.com, x86@kernel.org, peterx@redhat.com,
    david@redhat.com, axboe@kernel.dk, mcgrof@kernel.org,
    masahiroy@kernel.org, nathan@kernel.org,
    dennis@kernel.org, tj@kernel.org, muchun.song@linux.dev,
    rppt@kernel.org, paulmck@kernel.org, pasha.tatashin@soleen.com,
    yosryahmed@google.com, yuzhao@google.com, dhowells@redhat.com,
    hughd@google.com, andreyknvl@gmail.com, keescook@chromium.org,
    ndesaulniers@google.com, gregkh@linuxfoundation.org,
    ebiggers@google.com, ytcoode@gmail.com, vincent.guittot@linaro.org,
    dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
    bristot@redhat.com, vschneid@redhat.com, cl@linux.com,
    penberg@kernel.org, iamjoonsoo.kim@lge.com, 42.hyeyoo@gmail.com,
    glider@google.com, elver@google.com, dvyukov@google.com,
    shakeelb@google.com, songmuchun@bytedance.com, jbaron@akamai.com,
    rientjes@google.com, minchan@google.com, kaleshsingh@google.com,
    surenb@google.com, kernel-team@android.com, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, iommu@lists.linux.dev,
    linux-arch@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-modules@vger.kernel.org,
    kasan-dev@googlegroups.com, cgroups@vger.kernel.org

From: Kent Overstreet <kent.overstreet@linux.dev>

This patch adds lib/lazy-percpu-counter.c, which implements counters
that start out as atomics, but lazily switch to percpu
mode if the update rate crosses some threshold (arbitrarily set at 256
per second).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 include/linux/lazy-percpu-counter.h | 102 ++++++++++++++++++++++
 lib/Kconfig                         |   3 +
 lib/Makefile                        |   2 +
 lib/lazy-percpu-counter.c           | 128 ++++++++++++++++++++++++++++
 4 files changed, 235 insertions(+)
 create mode 100644 include/linux/lazy-percpu-counter.h
 create mode 100644 lib/lazy-percpu-counter.c

diff --git a/include/linux/lazy-percpu-counter.h b/include/linux/lazy-percpu-counter.h
new file mode 100644
index 000000000000..45ca9e2ce58b
--- /dev/null
+++ b/include/linux/lazy-percpu-counter.h
@@ -0,0 +1,102 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Lazy percpu counters:
+ * (C) 2022 Kent Overstreet
+ *
+ * Lazy percpu counters start out in atomic mode, then switch to percpu mode if
+ * the update rate crosses some threshold.
+ *
+ * This means we don't have to decide between low memory overhead atomic
+ * counters and higher performance percpu counters - we can have our cake and
+ * eat it, too!
+ *
+ * Internally we use an atomic64_t, where the low bit indicates whether we're in
+ * percpu mode, and the high 8 bits are a secondary counter that's incremented
+ * when the counter is modified - meaning 55 bits of precision are available for
+ * the counter itself.
+ */
+
+#ifndef _LINUX_LAZY_PERCPU_COUNTER_H
+#define _LINUX_LAZY_PERCPU_COUNTER_H
+
+#include <linux/atomic.h>
+#include <linux/percpu.h>
+
+struct lazy_percpu_counter {
+	atomic64_t v;
+	unsigned long last_wrap;
+};
+
+void lazy_percpu_counter_exit(struct lazy_percpu_counter *c);
+void lazy_percpu_counter_add_slowpath(struct lazy_percpu_counter *c, s64 i);
+void lazy_percpu_counter_add_slowpath_noupgrade(struct lazy_percpu_counter *c, s64 i);
+s64 lazy_percpu_counter_read(struct lazy_percpu_counter *c);
+
+/*
+ * We use the high bits of the atomic counter for a secondary counter, which is
+ * incremented every time the counter is touched. When the secondary counter
+ * wraps, we check the time the counter last wrapped, and if it was recent
+ * enough that means the update frequency has crossed our threshold and we
+ * switch to percpu mode:
+ */
+#define COUNTER_MOD_BITS		8
+#define COUNTER_MOD_MASK		~(~0ULL >> COUNTER_MOD_BITS)
+#define COUNTER_MOD_BITS_START		(64 - COUNTER_MOD_BITS)
+
+/*
+ * We use the low bit of the counter to indicate whether we're in atomic mode
+ * (low bit clear), or percpu mode (low bit set, counter is a pointer to actual
+ * percpu counters:
+ */
+#define COUNTER_IS_PCPU_BIT		1
+
+static inline u64 __percpu *lazy_percpu_counter_is_pcpu(u64 v)
+{
+	if (!(v & COUNTER_IS_PCPU_BIT))
+		return NULL;
+
+	v ^= COUNTER_IS_PCPU_BIT;
+	return (u64 __percpu *)(unsigned long)v;
+}
+
+/**
+ * lazy_percpu_counter_add: Add a value to a lazy_percpu_counter
+ *
+ * @c: counter to modify
+ * @i: value to add
+ */
+static inline void lazy_percpu_counter_add(struct lazy_percpu_counter *c, s64 i)
+{
+	u64 v = atomic64_read(&c->v);
+	u64 __percpu *pcpu_v = lazy_percpu_counter_is_pcpu(v);
+
+	if (likely(pcpu_v))
+		this_cpu_add(*pcpu_v, i);
+	else
+		lazy_percpu_counter_add_slowpath(c, i);
+}
+
+/**
+ * lazy_percpu_counter_add_noupgrade: Add a value to a lazy_percpu_counter,
+ * without upgrading to percpu mode
+ *
+ * @c: counter to modify
+ * @i: value to add
+ */
+static inline void lazy_percpu_counter_add_noupgrade(struct lazy_percpu_counter *c, s64 i)
+{
+	u64 v = atomic64_read(&c->v);
+	u64 __percpu *pcpu_v = lazy_percpu_counter_is_pcpu(v);
+
+	if (likely(pcpu_v))
+		this_cpu_add(*pcpu_v, i);
+	else
+		lazy_percpu_counter_add_slowpath_noupgrade(c, i);
+}
+
+static inline void lazy_percpu_counter_sub(struct lazy_percpu_counter *c, s64 i)
+{
+	lazy_percpu_counter_add(c, -i);
+}
+
+#endif /* _LINUX_LAZY_PERCPU_COUNTER_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index 5c2da561c516..7380292a8fcd 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -505,6 +505,9 @@ config ASSOCIATIVE_ARRAY
 	  for more information.
 
+config LAZY_PERCPU_COUNTER
+	bool
+
 config HAS_IOMEM
 	bool
 	depends on !NO_IOMEM
diff --git a/lib/Makefile b/lib/Makefile
index 876fcdeae34e..293a0858a3f8 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -164,6 +164,8 @@ obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o
 obj-$(CONFIG_DEBUG_LIST) += list_debug.o
 obj-$(CONFIG_DEBUG_OBJECTS) += debugobjects.o
 
+obj-$(CONFIG_LAZY_PERCPU_COUNTER) += lazy-percpu-counter.o
+
 obj-$(CONFIG_BITREVERSE) += bitrev.o
 obj-$(CONFIG_LINEAR_RANGES) += linear_ranges.o
 obj-$(CONFIG_PACKING) += packing.o
diff --git a/lib/lazy-percpu-counter.c b/lib/lazy-percpu-counter.c
new file mode 100644
index 000000000000..4f4e32c2dc09
--- /dev/null
+++ b/lib/lazy-percpu-counter.c
@@ -0,0 +1,128 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/atomic.h>
+#include <linux/gfp.h>
+#include <linux/jiffies.h>
+#include <linux/lazy-percpu-counter.h>
+#include <linux/percpu.h>
+
+static inline s64 lazy_percpu_counter_atomic_val(s64 v)
+{
+	/* Ensure output is sign extended properly: */
+	return (v << COUNTER_MOD_BITS) >>
+		(COUNTER_MOD_BITS + COUNTER_IS_PCPU_BIT);
+}
+
+static void lazy_percpu_counter_switch_to_pcpu(struct lazy_percpu_counter *c)
+{
+	u64 __percpu *pcpu_v = alloc_percpu_gfp(u64, GFP_ATOMIC|__GFP_NOWARN);
+	u64 old, new, v;
+
+	if (!pcpu_v)
+		return;
+
+	preempt_disable();
+	v = atomic64_read(&c->v);
+	do {
+		if (lazy_percpu_counter_is_pcpu(v)) {
+			free_percpu(pcpu_v);
+			preempt_enable();
+			return;
+		}
+
+		old = v;
+		new = (unsigned long)pcpu_v | 1;
+
+		*this_cpu_ptr(pcpu_v) = lazy_percpu_counter_atomic_val(v);
+	} while ((v = atomic64_cmpxchg(&c->v, old, new)) != old);
+	preempt_enable();
+}
+
+/**
+ * lazy_percpu_counter_exit: Free resources associated with a
+ * lazy_percpu_counter
+ *
+ * @c: counter to exit
+ */
+void lazy_percpu_counter_exit(struct lazy_percpu_counter *c)
+{
+	free_percpu(lazy_percpu_counter_is_pcpu(atomic64_read(&c->v)));
+}
+EXPORT_SYMBOL_GPL(lazy_percpu_counter_exit);
+
+/**
+ * lazy_percpu_counter_read: Read current value of a lazy_percpu_counter
+ *
+ * @c: counter to read
+ */
+s64 lazy_percpu_counter_read(struct lazy_percpu_counter *c)
+{
+	s64 v = atomic64_read(&c->v);
+	u64 __percpu *pcpu_v = lazy_percpu_counter_is_pcpu(v);
+
+	if (pcpu_v) {
+		int cpu;
+
+		v = 0;
+		for_each_possible_cpu(cpu)
+			v += *per_cpu_ptr(pcpu_v, cpu);
+	} else {
+		v = lazy_percpu_counter_atomic_val(v);
+	}
+
+	return v;
+}
+EXPORT_SYMBOL_GPL(lazy_percpu_counter_read);
+
+void lazy_percpu_counter_add_slowpath(struct lazy_percpu_counter *c, s64 i)
+{
+	u64 atomic_i;
+	u64 old, v = atomic64_read(&c->v);
+	u64 __percpu *pcpu_v;
+
+	atomic_i = i << COUNTER_IS_PCPU_BIT;
+	atomic_i &= ~COUNTER_MOD_MASK;
+	atomic_i |= 1ULL << COUNTER_MOD_BITS_START;
+
+	do {
+		pcpu_v = lazy_percpu_counter_is_pcpu(v);
+		if (pcpu_v) {
+			this_cpu_add(*pcpu_v, i);
+			return;
+		}
+
+		old = v;
+	} while ((v = atomic64_cmpxchg(&c->v, old, old + atomic_i)) != old);
+
+	if (unlikely(!(v & COUNTER_MOD_MASK))) {
+		unsigned long now = jiffies;
+
+		if (c->last_wrap &&
+		    unlikely(time_after(c->last_wrap + HZ, now)))
+			lazy_percpu_counter_switch_to_pcpu(c);
+		else
+			c->last_wrap = now;
+	}
+}
+EXPORT_SYMBOL(lazy_percpu_counter_add_slowpath);
+
+void lazy_percpu_counter_add_slowpath_noupgrade(struct lazy_percpu_counter *c, s64 i)
+{
+	u64 atomic_i;
+	u64 old, v = atomic64_read(&c->v);
+	u64 __percpu *pcpu_v;
+
+	atomic_i = i << COUNTER_IS_PCPU_BIT;
+	atomic_i &= ~COUNTER_MOD_MASK;
+
+	do {
+		pcpu_v = lazy_percpu_counter_is_pcpu(v);
+		if (pcpu_v) {
+			this_cpu_add(*pcpu_v, i);
+			return;
+		}
+
+		old = v;
+	} while ((v = atomic64_cmpxchg(&c->v, old, old + atomic_i)) != old);
+}
+EXPORT_SYMBOL(lazy_percpu_counter_add_slowpath_noupgrade);