From patchwork Mon Sep 7 13:40:54 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marco Elver
X-Patchwork-Id: 11761063
Date: Mon, 7 Sep 2020 15:40:54 +0200
In-Reply-To: <20200907134055.2878499-1-elver@google.com>
Message-Id: <20200907134055.2878499-10-elver@google.com>
References: <20200907134055.2878499-1-elver@google.com>
X-Mailer: git-send-email 2.28.0.526.ge36021eeef-goog
Subject: [PATCH RFC 09/10] kfence, Documentation: add KFENCE documentation
From: Marco Elver
To: elver@google.com, glider@google.com, akpm@linux-foundation.org,
 catalin.marinas@arm.com, cl@linux.com, rientjes@google.com,
 iamjoonsoo.kim@lge.com, mark.rutland@arm.com, penberg@kernel.org
X-Mailman-Version: 2.1.29
Precedence: list
Cc: linux-doc@vger.kernel.org, peterz@infradead.org,
 dave.hansen@linux.intel.com, linux-mm@kvack.org, edumazet@google.com,
 hpa@zytor.com, will@kernel.org, corbet@lwn.net, x86@kernel.org,
 kasan-dev@googlegroups.com, mingo@redhat.com,
 linux-arm-kernel@lists.infradead.org, aryabinin@virtuozzo.com,
 keescook@chromium.org, paulmck@kernel.org, jannh@google.com,
 andreyknvl@google.com, cai@lca.pw, luto@kernel.org, tglx@linutronix.de,
 dvyukov@google.com, gregkh@linuxfoundation.org,
 linux-kernel@vger.kernel.org, bp@alien8.de
Sender: "linux-arm-kernel"
Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Add KFENCE documentation in dev-tools/kfence.rst, and add to index.

Co-developed-by: Alexander Potapenko
Signed-off-by: Alexander Potapenko
Signed-off-by: Marco Elver
---
 Documentation/dev-tools/index.rst  |   1 +
 Documentation/dev-tools/kfence.rst | 285 +++++++++++++++++++++++++++++
 2 files changed, 286 insertions(+)
 create mode 100644 Documentation/dev-tools/kfence.rst

diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index f7809c7b1ba9..1b1cf4f5c9d9 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -22,6 +22,7 @@ whole; patches welcome!
    ubsan
    kmemleak
    kcsan
+   kfence
    gdb-kernel-debugging
    kgdb
    kselftest
diff --git a/Documentation/dev-tools/kfence.rst b/Documentation/dev-tools/kfence.rst
new file mode 100644
index 000000000000..254f4f089104
--- /dev/null
+++ b/Documentation/dev-tools/kfence.rst
@@ -0,0 +1,285 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel Electric-Fence (KFENCE)
+==============================
+
+Kernel Electric-Fence (KFENCE) is a low-overhead sampling-based memory safety
+error detector. KFENCE detects heap out-of-bounds access, use-after-free, and
+invalid-free errors.
+
+KFENCE is designed to be enabled in production kernels, and has near zero
+performance overhead. Compared to KASAN, KFENCE trades precision for
+performance. The main motivation behind KFENCE's design is that with enough
+total uptime KFENCE will detect bugs in code paths not typically exercised by
+non-production test workloads. One way to quickly achieve a large enough total
+uptime is to deploy the tool across a large fleet of machines.
+
+Usage
+-----
+
+To enable KFENCE, configure the kernel with::
+
+    CONFIG_KFENCE=y
+
+KFENCE provides several other configuration options to customize behaviour (see
+the respective help text in ``lib/Kconfig.kfence`` for more info).
+
+Tuning performance
+~~~~~~~~~~~~~~~~~~
+
+The most important parameter is KFENCE's sample interval, which can be set via
+the kernel boot parameter ``kfence.sample_interval`` in milliseconds. The
+sample interval determines the frequency with which heap allocations will be
+guarded by KFENCE. The default is configurable via the Kconfig option
+``CONFIG_KFENCE_SAMPLE_INTERVAL``. Setting ``kfence.sample_interval=0``
+disables KFENCE.
+
+With the Kconfig option ``CONFIG_KFENCE_NUM_OBJECTS`` (default 255), the number
+of available guarded objects can be controlled. Each object requires 2 pages,
+one for the object itself and the other one used as a guard page; object pages
+are interleaved with guard pages, and every object page is therefore surrounded
+by two guard pages.
+
+The total memory dedicated to the KFENCE memory pool can be computed as::
+
+    ( #objects + 1 ) * 2 * PAGE_SIZE
+
+Using the default config, and assuming a page size of 4 KiB, results in
+dedicating 2 MiB to the KFENCE memory pool.
+
+Error reports
+~~~~~~~~~~~~~
+
+A typical out-of-bounds access looks like this::
+
+    ==================================================================
+    BUG: KFENCE: out-of-bounds in test_out_of_bounds_read+0xa3/0x22b
+
+    Out-of-bounds access at 0xffffffffb672efff (left of kfence-#17):
+     test_out_of_bounds_read+0xa3/0x22b
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    kfence-#17 [0xffffffffb672f000-0xffffffffb672f01f, size=32, cache=kmalloc-32] allocated in:
+     __kfence_alloc+0x42d/0x4c0
+     __kmalloc+0x133/0x200
+     test_alloc+0xf3/0x25b
+     test_out_of_bounds_read+0x98/0x22b
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 107 Comm: kunit_try_catch Not tainted 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+The header of the report provides a short summary of the function involved in
+the access. It is followed by more detailed information about the access and
+its origin.
+
+Use-after-free accesses are reported as::
+
+    ==================================================================
+    BUG: KFENCE: use-after-free in test_use_after_free_read+0xb3/0x143
+
+    Use-after-free access at 0xffffffffb673dfe0:
+     test_use_after_free_read+0xb3/0x143
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    kfence-#24 [0xffffffffb673dfe0-0xffffffffb673dfff, size=32, cache=kmalloc-32] allocated in:
+     __kfence_alloc+0x277/0x4c0
+     __kmalloc+0x133/0x200
+     test_alloc+0xf3/0x25b
+     test_use_after_free_read+0x76/0x143
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+    freed in:
+     kfence_guarded_free+0x158/0x380
+     __kfence_free+0x38/0xc0
+     test_use_after_free_read+0xa8/0x143
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 109 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+KFENCE also reports on invalid frees, such as double-frees::
+
+    ==================================================================
+    BUG: KFENCE: invalid free in test_double_free+0xdc/0x171
+
+    Invalid free of 0xffffffffb6741000:
+     test_double_free+0xdc/0x171
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    kfence-#26 [0xffffffffb6741000-0xffffffffb674101f, size=32, cache=kmalloc-32] allocated in:
+     __kfence_alloc+0x42d/0x4c0
+     __kmalloc+0x133/0x200
+     test_alloc+0xf3/0x25b
+     test_double_free+0x76/0x171
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+    freed in:
+     kfence_guarded_free+0x158/0x380
+     __kfence_free+0x38/0xc0
+     test_double_free+0xa8/0x171
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 111 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+KFENCE also uses pattern-based redzones on the other side of an object's guard
+page, to detect out-of-bounds writes on the unprotected side of the object.
+These are reported on frees::
+
+    ==================================================================
+    BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0xef/0x184
+
+    Detected corrupted memory at 0xffffffffb6797ff9 [ 0xac . . . . . . ]:
+     test_kmalloc_aligned_oob_write+0xef/0x184
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    kfence-#69 [0xffffffffb6797fb0-0xffffffffb6797ff8, size=73, cache=kmalloc-96] allocated in:
+     __kfence_alloc+0x277/0x4c0
+     __kmalloc+0x133/0x200
+     test_alloc+0xf3/0x25b
+     test_kmalloc_aligned_oob_write+0x57/0x184
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 120 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+For such errors, the address where the corruption occurred as well as the
+corrupt bytes are shown.
+
+And finally, KFENCE may also report on invalid accesses to any protected page
+where it was not possible to determine an associated object, e.g. if adjacent
+object pages had not yet been allocated::
+
+    ==================================================================
+    BUG: KFENCE: invalid access in test_invalid_access+0x26/0xe0
+
+    Invalid access at 0xffffffffb670b00a:
+     test_invalid_access+0x26/0xe0
+     kunit_try_run_case+0x51/0x85
+     kunit_generic_run_threadfn_adapter+0x16/0x30
+     kthread+0x137/0x160
+     ret_from_fork+0x22/0x30
+
+    CPU: 4 PID: 124 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
+    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
+    ==================================================================
+
+DebugFS interface
+~~~~~~~~~~~~~~~~~
+
+Some debugging information is exposed via debugfs:
+
+* The file ``/sys/kernel/debug/kfence/stats`` provides runtime statistics.
+
+* The file ``/sys/kernel/debug/kfence/objects`` provides a list of objects
+  allocated via KFENCE, including those already freed but protected.
+
+Implementation Details
+----------------------
+
+Guarded allocations are set up based on the sample interval. After expiration
+of the sample interval, a guarded allocation from the KFENCE object pool is
+returned to the main allocator (SLAB or SLUB). At this point, the timer is
+reset, and the next allocation is set up after the expiration of the interval.
+To "gate" a KFENCE allocation through the main allocator's fast-path without
+overhead, KFENCE relies on static branches via the static keys infrastructure.
+The static branch is toggled to redirect the allocation to KFENCE.
+
+KFENCE objects each reside on a dedicated page, at either the left or right
+page boundaries selected at random. The pages to the left and right of the
+object page are "guard pages", whose attributes are changed to a protected
+state, and cause page faults on any attempted access. Such page faults are then
+intercepted by KFENCE, which handles the fault gracefully by reporting an
+out-of-bounds access. The side opposite of an object's guard page is used as a
+pattern-based redzone, to detect out-of-bounds writes on the unprotected side of
+the object on frees (for special alignment and size combinations, both sides of
+the object are redzoned).
+
+KFENCE also uses pattern-based redzones on the other side of an object's guard
+page, to detect out-of-bounds writes on the unprotected side of the object;
+these are reported on frees.
+
+The following figure illustrates the page layout::
+
+    ---+-----------+-----------+-----------+-----------+-----------+---
+       | xxxxxxxxx | O :       | xxxxxxxxx |       : O | xxxxxxxxx |
+       | xxxxxxxxx | B :       | xxxxxxxxx |       : B | xxxxxxxxx |
+       | x GUARD x | J : RED-  | x GUARD x | RED-  : J | x GUARD x |
+       | xxxxxxxxx | E :  ZONE | xxxxxxxxx |  ZONE : E | xxxxxxxxx |
+       | xxxxxxxxx | C :       | xxxxxxxxx |       : C | xxxxxxxxx |
+       | xxxxxxxxx | T :       | xxxxxxxxx |       : T | xxxxxxxxx |
+    ---+-----------+-----------+-----------+-----------+-----------+---
+
+Upon deallocation of a KFENCE object, the object's page is again protected and
+the object is marked as freed. Any further access to the object causes a fault
+and KFENCE reports a use-after-free access. Freed objects are inserted at the
+tail of KFENCE's freelist, so that the least recently freed objects are reused
+first, and the chances of detecting use-after-frees of recently freed objects
+are increased.
+
+Interface
+---------
+
+The following describes the functions which are used by allocators as well as
+page handling code to set up and deal with KFENCE allocations.
+
+.. kernel-doc:: include/linux/kfence.h
+   :functions: is_kfence_address
+               kfence_shutdown_cache
+               kfence_alloc kfence_free
+               kfence_ksize kfence_object_start
+               kfence_handle_page_fault
+
+Related Tools
+-------------
+
+In userspace, a similar approach is taken by `GWP-ASan
+`_. GWP-ASan also relies on guard pages and
+a sampling strategy to detect memory unsafety bugs at scale. KFENCE's design is
+directly influenced by GWP-ASan, and can be seen as its kernel sibling. Another
+similar but non-sampling approach, that also inspired the name "KFENCE", can be
+found in the userspace `Electric Fence Malloc Debugger
+`_.
+
+In the kernel, several tools exist to debug memory access errors, and in
+particular KASAN can detect all bug classes that KFENCE can detect. While KASAN
+is more precise, relying on compiler instrumentation, this comes at a
+performance cost. We want to highlight that KASAN and KFENCE are complementary,
+with different target environments. For instance, KASAN is the better
+debugging aid where a simple reproducer exists: due to the lower chance to
+detect the error, it would require more effort using KFENCE to debug.
+Deployments at scale, however, would benefit from using KFENCE to discover bugs
+due to code paths not exercised by test cases or fuzzers.
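
The pool-size formula under "Tuning performance" and the object ranges printed
in the sample reports can be cross-checked with a small userspace C sketch. It
is not taken from the KFENCE sources: the helper names, the fixed 4 KiB page
size, and the 8-byte alignment used in the usage note are assumptions made for
illustration only. The right-edge placement rule (object end pushed towards the
following guard page, then rounded down to the alignment) is just one reading
that happens to be consistent with the addresses shown in the reports.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as in the documentation's worked example. */
#define SKETCH_PAGE_SIZE ((uint64_t)4096)

/* Pool size per the documented formula: (#objects + 1) * 2 * PAGE_SIZE. */
static inline uint64_t kfence_pool_bytes(uint64_t num_objects)
{
    return (num_objects + 1) * 2 * SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical helper: start address of an object placed at the right edge
 * of its page, rounded down to a power-of-two alignment.
 */
static inline uint64_t place_right(uint64_t page_base, uint64_t size,
                                   uint64_t align)
{
    assert(size > 0 && size <= SKETCH_PAGE_SIZE);
    assert(align != 0 && (align & (align - 1)) == 0);
    return (page_base + SKETCH_PAGE_SIZE - size) & ~(align - 1);
}

/*
 * Hypothetical helper: bytes left between the end of a right-placed object
 * and the following guard page, i.e. the span a pattern-based redzone would
 * have to cover on that side.
 */
static inline uint64_t right_redzone_bytes(uint64_t page_base, uint64_t size,
                                           uint64_t align)
{
    uint64_t start = place_right(page_base, size, align);

    return page_base + SKETCH_PAGE_SIZE - (start + size);
}
```

With the default of 255 objects, kfence_pool_bytes(255) gives 2 MiB, matching
the text. Taking the page base of kfence-#69 from the memory corruption report
and a size of 73, place_right() with an assumed alignment of 8 gives
0xffffffffb6797fb0, and right_redzone_bytes() gives 7, which lines up with the
reported corruption address 0xffffffffb6797ff9 being the first byte past the
object.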