From patchwork Tue Oct 19 10:25:23 2021
X-Patchwork-Submitter: Marco Elver
X-Patchwork-Id: 12569407
Date: Tue, 19 Oct 2021 12:25:23 +0200
Message-Id: <20211019102524.2807208-1-elver@google.com>
Subject: [PATCH 1/2] kfence: always use static branches to guard kfence_alloc()
From: Marco Elver
To: elver@google.com, Andrew Morton
Cc: Alexander Potapenko, Dmitry Vyukov, Jann Horn, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, kasan-dev@googlegroups.com
Regardless of the KFENCE mode (CONFIG_KFENCE_STATIC_KEYS: either using
static keys to gate allocations, or using a simple dynamic branch),
always use a static branch to avoid the dynamic branch in kfence_alloc()
if KFENCE was disabled at boot. In particular, for
CONFIG_KFENCE_STATIC_KEYS=n this now avoids the dynamic branch entirely
when KFENCE was disabled at boot.

To simplify, also unify the location where kfence_allocation_gate is
read-checked to just be inline in kfence_alloc().
Signed-off-by: Marco Elver
---
 include/linux/kfence.h | 21 +++++++++++----------
 mm/kfence/core.c       | 16 +++++++---------
 2 files changed, 18 insertions(+), 19 deletions(-)

diff --git a/include/linux/kfence.h b/include/linux/kfence.h
index 3fe6dd8a18c1..4b5e3679a72c 100644
--- a/include/linux/kfence.h
+++ b/include/linux/kfence.h
@@ -14,6 +14,9 @@
 
 #ifdef CONFIG_KFENCE
 
+#include <linux/atomic.h>
+#include <linux/static_key.h>
+
 /*
  * We allocate an even number of pages, as it simplifies calculations to map
  * address to metadata indices; effectively, the very first page serves as an
@@ -22,13 +25,8 @@
 #define KFENCE_POOL_SIZE ((CONFIG_KFENCE_NUM_OBJECTS + 1) * 2 * PAGE_SIZE)
 extern char *__kfence_pool;
 
-#ifdef CONFIG_KFENCE_STATIC_KEYS
-#include <linux/static_key.h>
 DECLARE_STATIC_KEY_FALSE(kfence_allocation_key);
-#else
-#include <linux/atomic.h>
 extern atomic_t kfence_allocation_gate;
-#endif
 
 /**
  * is_kfence_address() - check if an address belongs to KFENCE pool
@@ -116,13 +114,16 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags);
  */
 static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
 {
-#ifdef CONFIG_KFENCE_STATIC_KEYS
-	if (static_branch_unlikely(&kfence_allocation_key))
+#if defined(CONFIG_KFENCE_STATIC_KEYS) || CONFIG_KFENCE_SAMPLE_INTERVAL == 0
+	if (!static_branch_unlikely(&kfence_allocation_key))
+		return NULL;
 #else
-	if (unlikely(!atomic_read(&kfence_allocation_gate)))
+	if (!static_branch_likely(&kfence_allocation_key))
+		return NULL;
 #endif
-		return __kfence_alloc(s, size, flags);
-	return NULL;
+	if (likely(atomic_read(&kfence_allocation_gate)))
+		return NULL;
+	return __kfence_alloc(s, size, flags);
 }
 
 /**
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 802905b1c89b..09945784df9e 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -104,10 +104,11 @@ struct kfence_metadata kfence_metadata[CONFIG_KFENCE_NUM_OBJECTS];
 static struct list_head kfence_freelist = LIST_HEAD_INIT(kfence_freelist);
 static DEFINE_RAW_SPINLOCK(kfence_freelist_lock); /* Lock protecting freelist. */
 
-#ifdef CONFIG_KFENCE_STATIC_KEYS
-/* The static key to set up a KFENCE allocation. */
+/*
+ * The static key to set up a KFENCE allocation; or if static keys are not used
+ * to gate allocations, to avoid a load and compare if KFENCE is disabled.
+ */
 DEFINE_STATIC_KEY_FALSE(kfence_allocation_key);
-#endif
 
 /* Gates the allocation, ensuring only one succeeds in a given period. */
 atomic_t kfence_allocation_gate = ATOMIC_INIT(1);
@@ -774,6 +775,8 @@ void __init kfence_init(void)
 		return;
 	}
 
+	if (!IS_ENABLED(CONFIG_KFENCE_STATIC_KEYS))
+		static_branch_enable(&kfence_allocation_key);
 	WRITE_ONCE(kfence_enabled, true);
 	queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
 	pr_info("initialized - using %lu bytes for %d objects at 0x%p-0x%p\n", KFENCE_POOL_SIZE,
@@ -866,12 +869,7 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
 		return NULL;
 	}
 
-	/*
-	 * allocation_gate only needs to become non-zero, so it doesn't make
-	 * sense to continue writing to it and pay the associated contention
-	 * cost, in case we have a large number of concurrent allocations.
-	 */
-	if (atomic_read(&kfence_allocation_gate) || atomic_inc_return(&kfence_allocation_gate) > 1)
+	if (atomic_inc_return(&kfence_allocation_gate) > 1)
 		return NULL;
 #ifdef CONFIG_KFENCE_STATIC_KEYS
 	/*

From patchwork Tue Oct 19 10:25:24 2021
X-Patchwork-Submitter: Marco Elver
X-Patchwork-Id: 12569409
Date: Tue, 19 Oct 2021 12:25:24 +0200
In-Reply-To: <20211019102524.2807208-1-elver@google.com>
Message-Id: <20211019102524.2807208-2-elver@google.com>
References: <20211019102524.2807208-1-elver@google.com>
Subject: [PATCH 2/2] kfence: default to dynamic branch instead of static keys mode
From: Marco Elver
To: elver@google.com, Andrew Morton
Cc: Alexander Potapenko, Dmitry Vyukov, Jann Horn, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, kasan-dev@googlegroups.com

We have observed that on very large machines with newer CPUs, the static
key/branch switching delay is on the order of milliseconds. This is due
to the required broadcast IPIs, which simply do not scale well to
hundreds of CPUs (cores). If done too frequently, this can adversely
affect tail latencies of various workloads.

One workaround is to increase the sample interval to several seconds,
while decreasing sampled allocation coverage, but the problem still
exists and could still increase tail latencies.

As already noted in the Kconfig help text, there are trade-offs: at
lower sample intervals the dynamic branch results in better performance;
however, at very large sample intervals, the static keys mode can result
in better performance -- careful benchmarking is recommended.
Our initial benchmarking showed that with large enough sample intervals
and workloads stressing the allocator, the static keys mode was slightly
better. Evaluating and observing the possible system-wide side-effects
of the broadcast IPIs induced by static key switching, however, was a
blind spot (in particular on large machines with hundreds of cores).

Therefore, a major downside of the static keys mode is, unfortunately,
that performance is hard to predict on new system architectures and
topologies, and it is equally hard to draw conclusions about the
performance of new workloads from a limited set of benchmarks.

Most distributions will simply select the defaults, while targeting a
large variety of different workloads and system architectures. As such,
the better default is CONFIG_KFENCE_STATIC_KEYS=n, and re-enabling it is
only recommended after careful evaluation.

For reference, on x86-64 the condition in kfence_alloc() generates
exactly 2 instructions in the kmem_cache_alloc() fast-path:

| ...
| cmpl   $0x0,0x1a8021c(%rip)        # ffffffff82d560d0
| je     ffffffff812d6003
| ...

which, given kfence_allocation_gate is infrequently modified, should be
well predicted by most CPUs.

Signed-off-by: Marco Elver
---
 Documentation/dev-tools/kfence.rst | 12 ++++++++----
 lib/Kconfig.kfence                 | 26 +++++++++++++++-----------
 2 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/Documentation/dev-tools/kfence.rst b/Documentation/dev-tools/kfence.rst
index d45f952986ae..ac6b89d1a8c3 100644
--- a/Documentation/dev-tools/kfence.rst
+++ b/Documentation/dev-tools/kfence.rst
@@ -231,10 +231,14 @@ Guarded allocations are set up based on the sample interval. After expiration
 of the sample interval, the next allocation through the main allocator (SLAB or
 SLUB) returns a guarded allocation from the KFENCE object pool (allocation
 sizes up to PAGE_SIZE are supported). At this point, the timer is reset, and
-the next allocation is set up after the expiration of the interval. To "gate" a
-KFENCE allocation through the main allocator's fast-path without overhead,
-KFENCE relies on static branches via the static keys infrastructure. The static
-branch is toggled to redirect the allocation to KFENCE.
+the next allocation is set up after the expiration of the interval.
+
+When using ``CONFIG_KFENCE_STATIC_KEYS=y``, KFENCE allocations are "gated"
+through the main allocator's fast-path by relying on static branches via the
+static keys infrastructure. The static branch is toggled to redirect the
+allocation to KFENCE. Depending on sample interval, target workloads, and
+system architecture, this may perform better than the simple dynamic branch.
+Careful benchmarking is recommended.
 
 KFENCE objects each reside on a dedicated page, at either the left or right
 page boundaries selected at random. The pages to the left and right of the
diff --git a/lib/Kconfig.kfence b/lib/Kconfig.kfence
index e641add33947..912f252a41fc 100644
--- a/lib/Kconfig.kfence
+++ b/lib/Kconfig.kfence
@@ -25,17 +25,6 @@ menuconfig KFENCE
 
 if KFENCE
 
-config KFENCE_STATIC_KEYS
-	bool "Use static keys to set up allocations"
-	default y
-	depends on JUMP_LABEL # To ensure performance, require jump labels
-	help
-	  Use static keys (static branches) to set up KFENCE allocations. Using
-	  static keys is normally recommended, because it avoids a dynamic
-	  branch in the allocator's fast path. However, with very low sample
-	  intervals, or on systems that do not support jump labels, a dynamic
-	  branch may still be an acceptable performance trade-off.
-
 config KFENCE_SAMPLE_INTERVAL
 	int "Default sample interval in milliseconds"
 	default 100
@@ -56,6 +45,21 @@ config KFENCE_NUM_OBJECTS
 	  pages are required; with one containing the object and two adjacent
 	  ones used as guard pages.
 
+config KFENCE_STATIC_KEYS
+	bool "Use static keys to set up allocations" if EXPERT
+	depends on JUMP_LABEL
+	help
+	  Use static keys (static branches) to set up KFENCE allocations. This
+	  option is only recommended when using very large sample intervals, or
+	  performance has carefully been evaluated with this option.
+
+	  Using static keys comes with trade-offs that need to be carefully
+	  evaluated given target workloads and system architectures. Notably,
+	  enabling and disabling static keys invoke IPI broadcasts, the latency
+	  and impact of which is much harder to predict than a dynamic branch.
+
+	  Say N if you are unsure.
+
 config KFENCE_STRESS_TEST_FAULTS
 	int "Stress testing of fault handling and error reporting" if EXPERT
 	default 0