Message ID: 20200915132046.3332537-1-elver@google.com (mailing list archive)
Series: KFENCE: A low-overhead sampling-based memory safety error detector
On Tue, Sep 15, 2020 at 3:20 PM Marco Elver <elver@google.com> wrote:
>
> This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a low-overhead sampling-based memory safety error detector of heap use-after-free, invalid-free, and out-of-bounds access errors. This series enables KFENCE for the x86 and arm64 architectures, and adds KFENCE hooks to the SLAB and SLUB allocators.
>
> KFENCE is designed to be enabled in production kernels, and has near zero performance overhead. Compared to KASAN, KFENCE trades performance for precision. The main motivation behind KFENCE's design is that with enough total uptime KFENCE will detect bugs in code paths not typically exercised by non-production test workloads. One way to quickly achieve a large enough total uptime is when the tool is deployed across a large fleet of machines.
>
> KFENCE objects each reside on a dedicated page, at either the left or right page boundary. The pages to the left and right of the object page are "guard pages", whose attributes are changed to a protected state, causing page faults on any attempted access. Such page faults are then intercepted by KFENCE, which handles the fault gracefully by reporting a memory access error.
>
> Guarded allocations are set up based on a sample interval (can be set via kfence.sample_interval). After expiration of the sample interval, the next allocation through the main allocator (SLAB or SLUB) returns a guarded allocation from the KFENCE object pool. At this point, the timer is reset, and the next allocation is set up after the expiration of the interval.
>
> To enable/disable a KFENCE allocation through the main allocator's fast path without overhead, KFENCE relies on static branches via the static keys infrastructure. The static branch is toggled to redirect the allocation to KFENCE.
>
> The KFENCE memory pool is of fixed size, and if the pool is exhausted no further KFENCE allocations occur. The default config is conservative with only 255 objects, resulting in a pool size of 2 MiB (with 4 KiB pages).
>
> We have verified by running synthetic benchmarks (sysbench I/O, hackbench) that a kernel with KFENCE is performance-neutral compared to a non-KFENCE baseline kernel.
>
> KFENCE is inspired by GWP-ASan [1], a userspace tool with similar properties. The name "KFENCE" is a homage to the Electric Fence Malloc Debugger [2].
>
> For more details, see Documentation/dev-tools/kfence.rst added in the series -- also viewable here:
>
>   https://raw.githubusercontent.com/google/kasan/kfence/Documentation/dev-tools/kfence.rst
>
> [1] http://llvm.org/docs/GwpAsan.html
> [2] https://linux.die.net/man/3/efence

I see all of my comments from v1 are resolved. So this is:

Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

for the series.

> v2:
> * Various comment/documentation changes (see details in patches).
> * Various smaller fixes (see details in patches).
> * Change all reports to reference the kfence object, "kfence-#nn".
> * Skip allocation/free internals stack trace.
> * Rework KMEMLEAK compatibility patch.
>
> RFC/v1: https://lkml.kernel.org/r/20200907134055.2878499-1-elver@google.com
>
> Alexander Potapenko (6):
>   mm: add Kernel Electric-Fence infrastructure
>   x86, kfence: enable KFENCE for x86
>   mm, kfence: insert KFENCE hooks for SLAB
>   mm, kfence: insert KFENCE hooks for SLUB
>   kfence, kasan: make KFENCE compatible with KASAN
>   kfence, kmemleak: make KFENCE compatible with KMEMLEAK
>
> Marco Elver (4):
>   arm64, kfence: enable KFENCE for ARM64
>   kfence, lockdep: make KFENCE compatible with lockdep
>   kfence, Documentation: add KFENCE documentation
>   kfence: add test suite
>
>  Documentation/dev-tools/index.rst  |   1 +
>  Documentation/dev-tools/kfence.rst | 291 +++++++++++
>  MAINTAINERS                        |  11 +
>  arch/arm64/Kconfig                 |   1 +
>  arch/arm64/include/asm/kfence.h    |  39 ++
>  arch/arm64/mm/fault.c              |   4 +
>  arch/x86/Kconfig                   |   2 +
>  arch/x86/include/asm/kfence.h      |  60 +++
>  arch/x86/mm/fault.c                |   4 +
>  include/linux/kfence.h             | 174 +++++++
>  init/main.c                        |   2 +
>  kernel/locking/lockdep.c           |   8 +
>  lib/Kconfig.debug                  |   1 +
>  lib/Kconfig.kfence                 |  78 +++
>  mm/Makefile                        |   1 +
>  mm/kasan/common.c                  |   7 +
>  mm/kfence/Makefile                 |   6 +
>  mm/kfence/core.c                   | 733 +++++++++++++++++++++++++++
>  mm/kfence/kfence.h                 | 102 ++++
>  mm/kfence/kfence_test.c            | 777 +++++++++++++++++++++++++++++
>  mm/kfence/report.c                 | 219 ++++++++
>  mm/kmemleak.c                      |   6 +
>  mm/slab.c                          |  46 +-
>  mm/slab_common.c                   |   6 +-
>  mm/slub.c                          |  72 ++-
>  25 files changed, 2619 insertions(+), 32 deletions(-)
>  create mode 100644 Documentation/dev-tools/kfence.rst
>  create mode 100644 arch/arm64/include/asm/kfence.h
>  create mode 100644 arch/x86/include/asm/kfence.h
>  create mode 100644 include/linux/kfence.h
>  create mode 100644 lib/Kconfig.kfence
>  create mode 100644 mm/kfence/Makefile
>  create mode 100644 mm/kfence/core.c
>  create mode 100644 mm/kfence/kfence.h
>  create mode 100644 mm/kfence/kfence_test.c
>  create mode 100644 mm/kfence/report.c
>
> --
> 2.28.0.618.gf4bc123cb7-goog
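To make the guard-page scheme described in the cover letter concrete, here is a rough userspace analogy in the spirit of Electric Fence / GWP-ASan. It is not code from the series (KFENCE changes kernel page-table attributes and intercepts the resulting page fault instead), but it shows why the very first out-of-bounds byte past the object is caught:

/*
 * Userspace analogy of the guard-page scheme: place an "object" flush against
 * the end of its own page and map the neighbouring page PROT_NONE, so any
 * access past the object faults immediately.
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);

	/* Two pages: one for the object, one acting as the right guard page. */
	char *base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return 1;

	/* Protect the second page: any access to it now raises SIGSEGV. */
	if (mprotect(base + page, page, PROT_NONE))
		return 1;

	/* Place a 64-byte "object" right at the page boundary. */
	char *obj = base + page - 64;
	memset(obj, 0, 64);          /* in-bounds access: fine */
	printf("in-bounds access ok\n");

	obj[64] = 'x';               /* out-of-bounds: faults on the guard page */
	return 0;
}

In this toy version the process simply dies with SIGSEGV; KFENCE instead intercepts the fault, reports the memory access error, and lets the kernel continue.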
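The static-keys mechanism the cover letter refers to boils down to a branch in the allocator fast path that is patched out while KFENCE is idle. Below is a minimal kernel-style sketch of that gate, not the exact hook added by the series; the identifiers kfence_allocation_key, __kfence_alloc() and toggle_allocation_gate() are used here purely for illustration.

#include <linux/jump_label.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

/* Provided by the KFENCE core (mm/kfence/core.c in the series). */
void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags);

/* Disarmed by default: the branch below is patched to fall straight through. */
DEFINE_STATIC_KEY_FALSE(kfence_allocation_key);

/*
 * Called from the SLAB/SLUB fast path. While the gate is disarmed this is a
 * single patched-out branch, i.e. effectively free. Once armed, the next
 * allocation is redirected to the KFENCE pool; disarming the gate again after
 * that allocation (not shown here) restarts the sampling cycle.
 */
static __always_inline void *kfence_alloc(struct kmem_cache *s, size_t size,
					  gfp_t flags)
{
	if (static_branch_unlikely(&kfence_allocation_key))
		return __kfence_alloc(s, size, flags);
	return NULL;	/* caller falls back to the regular slab allocation */
}

/* Work fired after kfence.sample_interval elapses: re-arm the gate. */
static void toggle_allocation_gate(struct work_struct *work)
{
	static_branch_enable(&kfence_allocation_key);
}

As an aside, the quoted 2 MiB default pool is consistent with each of the 255 objects getting a dedicated data page plus interleaved guard pages: (255 + 1) * 2 pages * 4 KiB = 2 MiB. The exact pool layout is defined in the series itself; this is just the arithmetic behind the quoted numbers.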
On Tue, 2020-09-15 at 15:20 +0200, Marco Elver wrote:
> This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a low-overhead sampling-based memory safety error detector of heap use-after-free, invalid-free, and out-of-bounds access errors. This series enables KFENCE for the x86 and arm64 architectures, and adds KFENCE hooks to the SLAB and SLUB allocators.
>
> [...]

Does anybody else grow tired of all those different *imperfect* versions of in-kernel memory safety error detectors? KASAN-generic, KFENCE, KASAN-tag-based, etc. Then we have old things like page_poison, SLUB debugging, debug_pagealloc, etc., which are pretty much inefficient at detecting bugs these days compared to KASAN. Can't we work towards having a single implementation and clean up all this mess?
On Fri, 18 Sep 2020 at 13:17, Qian Cai <cai@redhat.com> wrote:
>
> On Tue, 2020-09-15 at 15:20 +0200, Marco Elver wrote:
> > [...]
>
> Does anybody else grow tired of all those different *imperfect* versions of in-kernel memory safety error detectors? KASAN-generic, KFENCE, KASAN-tag-based, etc. Then we have old things like page_poison, SLUB debugging, debug_pagealloc, etc., which are pretty much inefficient at detecting bugs these days compared to KASAN. Can't we work towards having a single implementation and clean up all this mess?

If you have suggestions on how to get a zero-overhead, precise ("perfect") memory safety error detector without new hardware extensions, we're open to them -- many people over many years have researched this problem, and while we're making progress for C (and C++), the fact remains that what you're asking for is likely impossible.
This might be useful background: https://arxiv.org/pdf/1802.09517.pdf

The fact remains that requirements and environments vary across applications and use cases. Maybe for one use case (debugging, test environments) normal KASAN is just fine. But that doesn't work for production, where we want maximum performance. MTE will get us closer (no silicon yet, and ARM64-only for now), but depending on the implementation it might come with small overheads, although these should be acceptable for most environments given the increasing processing power modern CPUs deliver.

Yet for other environments, where even a small performance regression is unacceptable, and where it's infeasible to capture in tests what the workloads actually execute, KFENCE is a very attractive option.

There have also been discussions on using Rust in the kernel [1], but that is just not feasible for core kernel code in the near future (and even then, you'll still need dynamic error detection tools for all the unsafe bits, of which there are many in an OS kernel).

[1] https://lwn.net/Articles/829858/

Thanks,
-- Marco