mbox series

[mm,v3,00/34] kasan: switch tag-based modes to stack ring from per-object metadata

Message ID cover.1662411799.git.andreyknvl@google.com (mailing list archive)
Headers show
Series kasan: switch tag-based modes to stack ring from per-object metadata | expand

Message

andrey.konovalov@linux.dev Sept. 5, 2022, 9:05 p.m. UTC
From: Andrey Konovalov <andreyknvl@google.com>

This series makes the tag-based KASAN modes use a ring buffer for storing
stack depot handles for alloc/free stack traces for slab objects instead
of per-object metadata. This ring buffer is referred to as the stack ring.

On each alloc/free of a slab object, the tagged address of the object and
the current stack trace are recorded in the stack ring.

On each bug report, if the accessed address belongs to a slab object, the
stack ring is scanned for matching entries. The newest entries are used to
print the alloc/free stack traces in the report: one entry for alloc and
one for free.

The advantages of this approach over storing stack trace handles in
per-object metadata with the tag-based KASAN modes:

- Allows to find relevant stack traces for use-after-free bugs without
  using quarantine for freed memory. (Currently, if the object was
  reallocated multiple times, the report contains the latest alloc/free
  stack traces, not necessarily the ones relevant to the buggy allocation.)
- Allows to better identify and mark use-after-free bugs, effectively
  making the CONFIG_KASAN_TAGS_IDENTIFY functionality always-on.
- Has fixed memory overhead.

The disadvantage:

- If the affected object was allocated/freed long before the bug happened
  and the stack trace events were purged from the stack ring, the report
  will have no stack traces.

Discussion
==========

The proposed implementation of the stack ring uses a single ring buffer for
the whole kernel. This might lead to contention due to atomic accesses to
the ring buffer index on multicore systems.

At this point, it is unknown whether the performance impact from this
contention would be significant compared to the slowdown introduced by
collecting stack traces due to the planned changes to the latter part,
see the section below.

For now, the proposed implementation is deemed to be good enough, but this
might need to be revisited once the stack collection becomes faster.

A considered alternative is to keep a separate ring buffer for each CPU
and then iterate over all of them when printing a bug report. This approach
requires somehow figuring out which of the stack rings has the freshest
stack traces for an object if multiple stack rings have them.

Further plans
=============

This series is a part of an effort to make KASAN stack trace collection
suitable for production. This requires stack trace collection to be fast
and memory-bounded.

The planned steps are:

1. Speed up stack trace collection (potentially, by using SCS;
   patches on-hold until steps #2 and #3 are completed).
2. Keep stack trace handles in the stack ring (this series).
3. Add a memory-bounded mode to stack depot or provide an alternative
   memory-bounded stack storage.
4. Potentially, implement stack trace collection sampling to minimize
   the performance impact.

Thanks!

---

Changes v2->v3:
- Addressed Marco's comments, see the last 3 patches for list of changes.

Changes v1->v2:
- Rework synchronization in the stack ring implementation.
- Dynamically allocate stack ring based on the kasan.stack_ring_size
  command-line parameter.
- Multiple less significant changes, see the notes in patches for details.

Andrey Konovalov (34):
  kasan: check KASAN_NO_FREE_META in __kasan_metadata_size
  kasan: rename kasan_set_*_info to kasan_save_*_info
  kasan: move is_kmalloc check out of save_alloc_info
  kasan: split save_alloc_info implementations
  kasan: drop CONFIG_KASAN_TAGS_IDENTIFY
  kasan: introduce kasan_print_aux_stacks
  kasan: introduce kasan_get_alloc_track
  kasan: introduce kasan_init_object_meta
  kasan: clear metadata functions for tag-based modes
  kasan: move kasan_get_*_meta to generic.c
  kasan: introduce kasan_requires_meta
  kasan: introduce kasan_init_cache_meta
  kasan: drop CONFIG_KASAN_GENERIC check from kasan_init_cache_meta
  kasan: only define kasan_metadata_size for Generic mode
  kasan: only define kasan_never_merge for Generic mode
  kasan: only define metadata offsets for Generic mode
  kasan: only define metadata structs for Generic mode
  kasan: only define kasan_cache_create for Generic mode
  kasan: pass tagged pointers to kasan_save_alloc/free_info
  kasan: move kasan_get_alloc/free_track definitions
  kasan: cosmetic changes in report.c
  kasan: use virt_addr_valid in kasan_addr_to_page/slab
  kasan: use kasan_addr_to_slab in print_address_description
  kasan: make kasan_addr_to_page static
  kasan: simplify print_report
  kasan: introduce complete_report_info
  kasan: fill in cache and object in complete_report_info
  kasan: rework function arguments in report.c
  kasan: introduce kasan_complete_mode_report_info
  kasan: implement stack ring for tag-based modes
  kasan: support kasan.stacktrace for SW_TAGS
  kasan: dynamically allocate stack ring entries
  kasan: better identify bug types for tag-based modes
  kasan: add another use-after-free test

 Documentation/dev-tools/kasan.rst |  17 ++-
 include/linux/kasan.h             |  55 ++++------
 include/linux/slab.h              |   2 +-
 lib/Kconfig.kasan                 |   8 --
 lib/test_kasan.c                  |  24 ++++
 mm/kasan/common.c                 | 175 +++---------------------------
 mm/kasan/generic.c                | 154 ++++++++++++++++++++++++--
 mm/kasan/hw_tags.c                |  39 +------
 mm/kasan/kasan.h                  | 171 ++++++++++++++++++++---------
 mm/kasan/report.c                 | 117 +++++++++-----------
 mm/kasan/report_generic.c         |  45 +++++++-
 mm/kasan/report_tags.c            | 123 ++++++++++++++++-----
 mm/kasan/sw_tags.c                |   5 +-
 mm/kasan/tags.c                   | 141 +++++++++++++++++++-----
 14 files changed, 642 insertions(+), 434 deletions(-)

Comments

Andrey Konovalov Sept. 11, 2022, 11:50 a.m. UTC | #1
On Mon, Sep 5, 2022 at 11:05 PM <andrey.konovalov@linux.dev> wrote:
>
> From: Andrey Konovalov <andreyknvl@google.com>
>
> This series makes the tag-based KASAN modes use a ring buffer for storing
> stack depot handles for alloc/free stack traces for slab objects instead
> of per-object metadata. This ring buffer is referred to as the stack ring.
>
> On each alloc/free of a slab object, the tagged address of the object and
> the current stack trace are recorded in the stack ring.
>
> On each bug report, if the accessed address belongs to a slab object, the
> stack ring is scanned for matching entries. The newest entries are used to
> print the alloc/free stack traces in the report: one entry for alloc and
> one for free.
>
> The advantages of this approach over storing stack trace handles in
> per-object metadata with the tag-based KASAN modes:
>
> - Allows to find relevant stack traces for use-after-free bugs without
>   using quarantine for freed memory. (Currently, if the object was
>   reallocated multiple times, the report contains the latest alloc/free
>   stack traces, not necessarily the ones relevant to the buggy allocation.)
> - Allows to better identify and mark use-after-free bugs, effectively
>   making the CONFIG_KASAN_TAGS_IDENTIFY functionality always-on.
> - Has fixed memory overhead.
>
> The disadvantage:
>
> - If the affected object was allocated/freed long before the bug happened
>   and the stack trace events were purged from the stack ring, the report
>   will have no stack traces.
>
> Discussion
> ==========
>
> The proposed implementation of the stack ring uses a single ring buffer for
> the whole kernel. This might lead to contention due to atomic accesses to
> the ring buffer index on multicore systems.
>
> At this point, it is unknown whether the performance impact from this
> contention would be significant compared to the slowdown introduced by
> collecting stack traces due to the planned changes to the latter part,
> see the section below.
>
> For now, the proposed implementation is deemed to be good enough, but this
> might need to be revisited once the stack collection becomes faster.
>
> A considered alternative is to keep a separate ring buffer for each CPU
> and then iterate over all of them when printing a bug report. This approach
> requires somehow figuring out which of the stack rings has the freshest
> stack traces for an object if multiple stack rings have them.
>
> Further plans
> =============
>
> This series is a part of an effort to make KASAN stack trace collection
> suitable for production. This requires stack trace collection to be fast
> and memory-bounded.
>
> The planned steps are:
>
> 1. Speed up stack trace collection (potentially, by using SCS;
>    patches on-hold until steps #2 and #3 are completed).
> 2. Keep stack trace handles in the stack ring (this series).
> 3. Add a memory-bounded mode to stack depot or provide an alternative
>    memory-bounded stack storage.
> 4. Potentially, implement stack trace collection sampling to minimize
>    the performance impact.
>
> Thanks!

Hi Andrew,

Could you consider picking up this series into mm?

Most of the patches have a Reviewed-by tag from Marco, and I've
addressed the last few comments he had in v3.

Thanks!
Marco Elver Sept. 12, 2022, 9:39 a.m. UTC | #2
On Sun, 11 Sept 2022 at 13:50, Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
> On Mon, Sep 5, 2022 at 11:05 PM <andrey.konovalov@linux.dev> wrote:
> >
> > From: Andrey Konovalov <andreyknvl@google.com>
> >
> > This series makes the tag-based KASAN modes use a ring buffer for storing
> > stack depot handles for alloc/free stack traces for slab objects instead
> > of per-object metadata. This ring buffer is referred to as the stack ring.
> >
> > On each alloc/free of a slab object, the tagged address of the object and
> > the current stack trace are recorded in the stack ring.
> >
> > On each bug report, if the accessed address belongs to a slab object, the
> > stack ring is scanned for matching entries. The newest entries are used to
> > print the alloc/free stack traces in the report: one entry for alloc and
> > one for free.
> >
> > The advantages of this approach over storing stack trace handles in
> > per-object metadata with the tag-based KASAN modes:
> >
> > - Allows to find relevant stack traces for use-after-free bugs without
> >   using quarantine for freed memory. (Currently, if the object was
> >   reallocated multiple times, the report contains the latest alloc/free
> >   stack traces, not necessarily the ones relevant to the buggy allocation.)
> > - Allows to better identify and mark use-after-free bugs, effectively
> >   making the CONFIG_KASAN_TAGS_IDENTIFY functionality always-on.
> > - Has fixed memory overhead.
> >
> > The disadvantage:
> >
> > - If the affected object was allocated/freed long before the bug happened
> >   and the stack trace events were purged from the stack ring, the report
> >   will have no stack traces.
> >
> > Discussion
> > ==========
> >
> > The proposed implementation of the stack ring uses a single ring buffer for
> > the whole kernel. This might lead to contention due to atomic accesses to
> > the ring buffer index on multicore systems.
> >
> > At this point, it is unknown whether the performance impact from this
> > contention would be significant compared to the slowdown introduced by
> > collecting stack traces due to the planned changes to the latter part,
> > see the section below.
> >
> > For now, the proposed implementation is deemed to be good enough, but this
> > might need to be revisited once the stack collection becomes faster.
> >
> > A considered alternative is to keep a separate ring buffer for each CPU
> > and then iterate over all of them when printing a bug report. This approach
> > requires somehow figuring out which of the stack rings has the freshest
> > stack traces for an object if multiple stack rings have them.
> >
> > Further plans
> > =============
> >
> > This series is a part of an effort to make KASAN stack trace collection
> > suitable for production. This requires stack trace collection to be fast
> > and memory-bounded.
> >
> > The planned steps are:
> >
> > 1. Speed up stack trace collection (potentially, by using SCS;
> >    patches on-hold until steps #2 and #3 are completed).
> > 2. Keep stack trace handles in the stack ring (this series).
> > 3. Add a memory-bounded mode to stack depot or provide an alternative
> >    memory-bounded stack storage.
> > 4. Potentially, implement stack trace collection sampling to minimize
> >    the performance impact.
> >
> > Thanks!
>
> Hi Andrew,
>
> Could you consider picking up this series into mm?
>
> Most of the patches have a Reviewed-by tag from Marco, and I've
> addressed the last few comments he had in v3.
>
> Thanks!

I see them in -next, so they've been picked up?

FWIW, my concerns have been addressed, so for patches that don't yet
have my Reviewed:


Acked-by: Marco Elver <elver@google.com>
Andrew Morton Sept. 12, 2022, 8:06 p.m. UTC | #3
On Mon, 12 Sep 2022 11:39:07 +0200 Marco Elver <elver@google.com> wrote:

>
> ...
>
> > Hi Andrew,
> >
> > Could you consider picking up this series into mm?
> >
> > Most of the patches have a Reviewed-by tag from Marco, and I've
> > addressed the last few comments he had in v3.
> >
> > Thanks!
> 
> I see them in -next, so they've been picked up?

yup.

> FWIW, my concerns have been addressed, so for patches that don't yet
> have my Reviewed:
> 
> 
> Acked-by: Marco Elver <elver@google.com>

Updated, thanks.
Yu Zhao Sept. 19, 2022, 8:07 a.m. UTC | #4
On Mon, Sep 12, 2022 at 2:06 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 12 Sep 2022 11:39:07 +0200 Marco Elver <elver@google.com> wrote:
>
> >
> > ...
> >
> > > Hi Andrew,
> > >
> > > Could you consider picking up this series into mm?
> > >
> > > Most of the patches have a Reviewed-by tag from Marco, and I've
> > > addressed the last few comments he had in v3.
> > >
> > > Thanks!
> >
> > I see them in -next, so they've been picked up?
>
> yup.
>
> > FWIW, my concerns have been addressed, so for patches that don't yet
> > have my Reviewed:
> >
> >
> > Acked-by: Marco Elver <elver@google.com>
>
> Updated, thanks.

Hit the following with the latest mm-unstable. Please take a look. Thanks.

BUG: rwlock bad magic on CPU#0, swapper/0, ffffffdc589d8218
CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc3-lockdep+ #36
Call trace:
 dump_backtrace+0xfc/0x14c
 show_stack+0x24/0x58
 dump_stack_lvl+0x7c/0xa0
 dump_stack+0x18/0x44
 rwlock_bug+0x88/0x8c
 do_raw_read_unlock+0x7c/0x90
 _raw_read_unlock_irqrestore+0x54/0xa0
 save_stack_info+0x100/0x118
 kasan_save_alloc_info+0x20/0x2c
 __kasan_slab_alloc+0x90/0x94
 early_kmem_cache_node_alloc+0x8c/0x1a8
 __kmem_cache_create+0x1ac/0x338
 create_boot_cache+0xac/0xec
 kmem_cache_init+0x8c/0x174
 mm_init+0x3c/0x78
 start_kernel+0x188/0x49c
Andrey Konovalov Sept. 20, 2022, 6:59 p.m. UTC | #5
On Mon, Sep 19, 2022 at 10:08 AM Yu Zhao <yuzhao@google.com> wrote:
>
> Hit the following with the latest mm-unstable. Please take a look. Thanks.
>
> BUG: rwlock bad magic on CPU#0, swapper/0, ffffffdc589d8218
> CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc3-lockdep+ #36
> Call trace:
>  dump_backtrace+0xfc/0x14c
>  show_stack+0x24/0x58
>  dump_stack_lvl+0x7c/0xa0
>  dump_stack+0x18/0x44
>  rwlock_bug+0x88/0x8c
>  do_raw_read_unlock+0x7c/0x90
>  _raw_read_unlock_irqrestore+0x54/0xa0
>  save_stack_info+0x100/0x118
>  kasan_save_alloc_info+0x20/0x2c
>  __kasan_slab_alloc+0x90/0x94
>  early_kmem_cache_node_alloc+0x8c/0x1a8
>  __kmem_cache_create+0x1ac/0x338
>  create_boot_cache+0xac/0xec
>  kmem_cache_init+0x8c/0x174
>  mm_init+0x3c/0x78
>  start_kernel+0x188/0x49c

Hi Yu,

Just mailed a fix.

Thank you for the report!