Message ID | 20230614095158.1133673-1-elver@google.com (mailing list archive) |
---|---|
State | New |
Series | kasan: add support for kasan.fault=panic_on_write |
On Wed, Jun 14, 2023 at 11:52 AM Marco Elver <elver@google.com> wrote:
>
> KASAN's boot time kernel parameter 'kasan.fault=' currently supports
> 'report' and 'panic', which results in either only reporting bugs or
> also panicking on reports.
>
> However, some users may wish to have more control over when KASAN
> reports result in a kernel panic: in particular, KASAN reported invalid
> _writes_ are of special interest, because they have greater potential to
> corrupt random kernel memory or be more easily exploited.
>
> To panic on invalid writes only, introduce 'kasan.fault=panic_on_write',
> which allows users to choose to continue running on invalid reads, but
> panic only on invalid writes.
>
> Signed-off-by: Marco Elver <elver@google.com>

Reviewed-by: Alexander Potapenko <glider@google.com>

Thanks!

On Wed, Jun 14, 2023 at 11:52 AM Marco Elver <elver@google.com> wrote:
>
> @@ -597,7 +614,11 @@ void kasan_report_async(void)
>  	pr_err("Asynchronous fault: no details available\n");
>  	pr_err("\n");
>  	dump_stack_lvl(KERN_ERR);
> -	end_report(&flags, NULL);
> +	/*
> +	 * Conservatively set is_write=true, because no details are available.
> +	 * In this mode, kasan.fault=panic_on_write is like kasan.fault=panic.
> +	 */
> +	end_report(&flags, NULL, true);

Hi Marco,

When asymm mode is enabled, kasan_report_async should only be called
for read accesses. I think we could check the mode and panic
accordingly.

Please also update the documentation to describe the flag behavior wrt
async/asymm modes.

On a related note, it looks like we have a typo in KASAN
documentation: it states that asymm mode detects reads synchronously,
and writes - asynchronously. Should be the reverse.

Thanks!

On Tue, 20 Jun 2023 at 12:57, Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
> On Wed, Jun 14, 2023 at 11:52 AM Marco Elver <elver@google.com> wrote:
> >
> > @@ -597,7 +614,11 @@ void kasan_report_async(void)
> >  	pr_err("Asynchronous fault: no details available\n");
> >  	pr_err("\n");
> >  	dump_stack_lvl(KERN_ERR);
> > -	end_report(&flags, NULL);
> > +	/*
> > +	 * Conservatively set is_write=true, because no details are available.
> > +	 * In this mode, kasan.fault=panic_on_write is like kasan.fault=panic.
> > +	 */
> > +	end_report(&flags, NULL, true);
>
> Hi Marco,
>
> When asymm mode is enabled, kasan_report_async should only be called
> for read accesses. I think we could check the mode and panic
> accordingly.

How do we check the mode, and how do we prove it's only called for read
accesses?

> Please also update the documentation to describe the flag behavior wrt
> async/asymm modes.

Will do.

> On a related note, it looks like we have a typo in KASAN
> documentation: it states that asymm mode detects reads synchronously,
> and writes - asynchronously. Should be the reverse.

This says the documentation is correct, and it's actually called for
writes: https://docs.kernel.org/arm64/memory-tagging-extension.html#tag-check-faults

Who is right?

On Tue, Jun 20, 2023 at 1:33 PM Marco Elver <elver@google.com> wrote:
>
> > On a related note, it looks like we have a typo in KASAN
> > documentation: it states that asymm mode detects reads synchronously,
> > and writes - asynchronously. Should be the reverse.
>
> This says the documentation is correct, and it's actually called for
> writes: https://docs.kernel.org/arm64/memory-tagging-extension.html#tag-check-faults
>
> Who is right?

Ah, right. I did a quick google to check when I was writing the
response and found this: https://lwn.net/Articles/882963/. But looks
like that cover letter is wrong and the documentation is right. I
wonder what the point of the asymmetric mode is then.

So the current code that you have should work perfectly. The only
change I'd like to see is in the documentation.

Thanks!

On Tue, Jun 20, 2023 at 01:45PM +0200, Andrey Konovalov wrote:
> On Tue, Jun 20, 2023 at 1:33 PM Marco Elver <elver@google.com> wrote:
> >
> > > On a related note, it looks like we have a typo in KASAN
> > > documentation: it states that asymm mode detects reads synchronously,
> > > and writes - asynchronously. Should be the reverse.
> >
> > This says the documentation is correct, and it's actually called for
> > writes: https://docs.kernel.org/arm64/memory-tagging-extension.html#tag-check-faults
> >
> > Who is right?
>
> Ah, right. I did a quick google to check when I was writing the
> response and found this: https://lwn.net/Articles/882963/. But looks
> like that cover letter is wrong and the documentation is right. I
> wonder what the point of the asymmetric mode is then.

Maybe not as strong, but asymm mode makes sense from a microarch point
of view, where writes are always committed into a store buffer, but
reads can only commit when the data (incl. tag) is available.

> So the current code that you have should work perfectly. The only
> change I'd like to see is in the documentation.

Something like this (or more?)

diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index 7f37a46af574..3c58392d931e 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -135,6 +135,8 @@ disabling KASAN altogether or controlling its features:
   fault occurs, the information is stored in hardware (in the TFSR_EL1
   register for arm64). The kernel periodically checks the hardware and
   only reports tag faults during these checks.
+  Note that ``kasan.fault=panic_on_write`` results in panic for all
+  asynchronously checked accesses.
   Asymmetric mode: a bad access is detected synchronously on reads and
   asynchronously on writes.

On Tue, Jun 20, 2023 at 1:51 PM Marco Elver <elver@google.com> wrote:
>
> > Ah, right. I did a quick google to check when I was writing the
> > response and found this: https://lwn.net/Articles/882963/. But looks
> > like that cover letter is wrong and the documentation is right. I
> > wonder what the point of the asymmetric mode is then.
>
> Maybe not as strong, but asymm mode makes sense from a microarch point
> of view, where writes are always committed into a store buffer, but
> reads can only commit when the data (incl. tag) is available.

Yeah, I get that it can be a bit better than async with a similar
slowdown, but there's little value in catching only reads from the
security standpoint.

> > So the current code that you have should work perfectly. The only
> > change I'd like to see is in the documentation.
>
> Something like this (or more?)
>
> diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
> index 7f37a46af574..3c58392d931e 100644
> --- a/Documentation/dev-tools/kasan.rst
> +++ b/Documentation/dev-tools/kasan.rst
> @@ -135,6 +135,8 @@ disabling KASAN altogether or controlling its features:
>    fault occurs, the information is stored in hardware (in the TFSR_EL1
>    register for arm64). The kernel periodically checks the hardware and
>    only reports tag faults during these checks.
> +  Note that ``kasan.fault=panic_on_write`` results in panic for all
> +  asynchronously checked accesses.
>    Asymmetric mode: a bad access is detected synchronously on reads and
>    asynchronously on writes.

Could you move this to the section that describes the kasan.fault
flag? This seems more consistent.

Thanks!

On Tue, Jun 20, 2023 at 03:56PM +0200, Andrey Konovalov wrote:
...
> Could you move this to the section that describes the kasan.fault
> flag? This seems more consistent.

Like this?

diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index 7f37a46af574..f4acf9c2e90f 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -110,7 +110,9 @@ parameter can be used to control panic and reporting behaviour:
 - ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
   to only print a KASAN report, panic the kernel, or panic the kernel on
   invalid writes only (default: ``report``). The panic happens even if
-  ``kasan_multi_shot`` is enabled.
+  ``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of
+  Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on
+  asynchronously checked accesses (including reads).
 
 Software and Hardware Tag-Based KASAN modes (see the section about various
 modes below) support altering stack trace collection behavior:

On Tue, Jun 20, 2023 at 4:49 PM Marco Elver <elver@google.com> wrote:
>
> On Tue, Jun 20, 2023 at 03:56PM +0200, Andrey Konovalov wrote:
> ...
> > Could you move this to the section that describes the kasan.fault
> > flag? This seems more consistent.
>
> Like this?
>
>
> diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
> index 7f37a46af574..f4acf9c2e90f 100644
> --- a/Documentation/dev-tools/kasan.rst
> +++ b/Documentation/dev-tools/kasan.rst
> @@ -110,7 +110,9 @@ parameter can be used to control panic and reporting behaviour:
>  - ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
>    to only print a KASAN report, panic the kernel, or panic the kernel on
>    invalid writes only (default: ``report``). The panic happens even if
> -  ``kasan_multi_shot`` is enabled.
> +  ``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of
> +  Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on
> +  asynchronously checked accesses (including reads).
>
>  Software and Hardware Tag-Based KASAN modes (see the section about various
>  modes below) support altering stack trace collection behavior:

Yes, this looks great! Thanks!

On Tue, Jun 20, 2023 at 06:27PM +0200, Andrey Konovalov wrote:
> On Tue, Jun 20, 2023 at 4:49 PM Marco Elver <elver@google.com> wrote:
> >
> > On Tue, Jun 20, 2023 at 03:56PM +0200, Andrey Konovalov wrote:
> > ...
> > > Could you move this to the section that describes the kasan.fault
> > > flag? This seems more consistent.
> >
> > Like this?
> >
> >
> > diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
> > index 7f37a46af574..f4acf9c2e90f 100644
> > --- a/Documentation/dev-tools/kasan.rst
> > +++ b/Documentation/dev-tools/kasan.rst
> > @@ -110,7 +110,9 @@ parameter can be used to control panic and reporting behaviour:
> >  - ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
> >    to only print a KASAN report, panic the kernel, or panic the kernel on
> >    invalid writes only (default: ``report``). The panic happens even if
> > -  ``kasan_multi_shot`` is enabled.
> > +  ``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of
> > +  Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on
> > +  asynchronously checked accesses (including reads).
> >
> >  Software and Hardware Tag-Based KASAN modes (see the section about various
> >  modes below) support altering stack trace collection behavior:
>
> Yes, this looks great! Thanks!

The patch here is already in mm-stable (which I recall doesn't do
rebases?), so I sent

  https://lkml.kernel.org/r/ZJHfL6vavKUZ3Yd8@elver.google.com

to be used as a fixup or just added to mm-stable by Andrew at one point
or another as well.

Thanks,
-- Marco

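
The documentation note agreed on in the thread above can be read as a small decision table. What follows is a minimal userspace sketch, not kernel code: `hw_tags_mode`, `checked_async()` and `panics_with_panic_on_write()` are hypothetical names used only for illustration. It assumes, as discussed in the thread, that asynchronous reports carry no access details and are therefore treated as writes, and that asymmetric mode checks reads synchronously and writes asynchronously.

```c
#include <stdbool.h>

/* Hypothetical model of the Hardware Tag-Based KASAN checking modes. */
enum hw_tags_mode { MODE_SYNC, MODE_ASYNC, MODE_ASYMM };

/* Is a bad access of this kind detected asynchronously in this mode? */
bool checked_async(enum hw_tags_mode mode, bool is_write)
{
	switch (mode) {
	case MODE_SYNC:
		return false;
	case MODE_ASYNC:
		return true;
	case MODE_ASYMM:
		return is_write;	/* reads sync, writes async */
	}
	return false;
}

/* Would kasan.fault=panic_on_write panic for this bad access? */
bool panics_with_panic_on_write(enum hw_tags_mode mode, bool is_write)
{
	/* Asynchronous reports have no details, so they count as writes. */
	if (checked_async(mode, is_write))
		return true;
	return is_write;
}
```

Under these assumptions asymmetric mode needs no special handling: its synchronous reports are reads and its asynchronous reports are writes, so only asynchronous mode ends up panicking on reads.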
diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index e66916a483cd..7f37a46af574 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -107,9 +107,10 @@ effectively disables ``panic_on_warn`` for KASAN reports.
 Alternatively, independent of ``panic_on_warn``, the ``kasan.fault=`` boot
 parameter can be used to control panic and reporting behaviour:
 
-- ``kasan.fault=report`` or ``=panic`` controls whether to only print a KASAN
-  report or also panic the kernel (default: ``report``). The panic happens even
-  if ``kasan_multi_shot`` is enabled.
+- ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
+  to only print a KASAN report, panic the kernel, or panic the kernel on
+  invalid writes only (default: ``report``). The panic happens even if
+  ``kasan_multi_shot`` is enabled.
 
 Software and Hardware Tag-Based KASAN modes (see the section about various
 modes below) support altering stack trace collection behavior:
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 892a9dc9d4d3..f8ac4d0c9848 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -43,6 +43,7 @@ enum kasan_arg_fault {
 	KASAN_ARG_FAULT_DEFAULT,
 	KASAN_ARG_FAULT_REPORT,
 	KASAN_ARG_FAULT_PANIC,
+	KASAN_ARG_FAULT_PANIC_ON_WRITE,
 };
 
 static enum kasan_arg_fault kasan_arg_fault __ro_after_init = KASAN_ARG_FAULT_DEFAULT;
@@ -57,6 +58,8 @@ static int __init early_kasan_fault(char *arg)
 		kasan_arg_fault = KASAN_ARG_FAULT_REPORT;
 	else if (!strcmp(arg, "panic"))
 		kasan_arg_fault = KASAN_ARG_FAULT_PANIC;
+	else if (!strcmp(arg, "panic_on_write"))
+		kasan_arg_fault = KASAN_ARG_FAULT_PANIC_ON_WRITE;
 	else
 		return -EINVAL;
 
@@ -211,7 +214,7 @@ static void start_report(unsigned long *flags, bool sync)
 	pr_err("==================================================================\n");
 }
 
-static void end_report(unsigned long *flags, void *addr)
+static void end_report(unsigned long *flags, void *addr, bool is_write)
 {
 	if (addr)
 		trace_error_report_end(ERROR_DETECTOR_KASAN,
@@ -220,8 +223,18 @@ static void end_report(unsigned long *flags, void *addr)
 	spin_unlock_irqrestore(&report_lock, *flags);
 	if (!test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags))
 		check_panic_on_warn("KASAN");
-	if (kasan_arg_fault == KASAN_ARG_FAULT_PANIC)
+	switch (kasan_arg_fault) {
+	case KASAN_ARG_FAULT_DEFAULT:
+	case KASAN_ARG_FAULT_REPORT:
+		break;
+	case KASAN_ARG_FAULT_PANIC:
 		panic("kasan.fault=panic set ...\n");
+		break;
+	case KASAN_ARG_FAULT_PANIC_ON_WRITE:
+		if (is_write)
+			panic("kasan.fault=panic_on_write set ...\n");
+		break;
+	}
 	add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
 	lockdep_on();
 	report_suppress_stop();
@@ -536,7 +549,11 @@ void kasan_report_invalid_free(void *ptr, unsigned long ip, enum kasan_report_ty
 
 	print_report(&info);
 
-	end_report(&flags, ptr);
+	/*
+	 * Invalid free is considered a "write" since the allocator's metadata
+	 * updates involves writes.
+	 */
+	end_report(&flags, ptr, true);
 }
 
 /*
@@ -571,7 +588,7 @@ bool kasan_report(unsigned long addr, size_t size, bool is_write,
 
 	print_report(&info);
 
-	end_report(&irq_flags, ptr);
+	end_report(&irq_flags, ptr, is_write);
 
 out:
 	user_access_restore(ua_flags);
@@ -597,7 +614,11 @@ void kasan_report_async(void)
 	pr_err("Asynchronous fault: no details available\n");
 	pr_err("\n");
 	dump_stack_lvl(KERN_ERR);
-	end_report(&flags, NULL);
+	/*
+	 * Conservatively set is_write=true, because no details are available.
+	 * In this mode, kasan.fault=panic_on_write is like kasan.fault=panic.
+	 */
+	end_report(&flags, NULL, true);
 }
 #endif /* CONFIG_KASAN_HW_TAGS */
 

KASAN's boot time kernel parameter 'kasan.fault=' currently supports
'report' and 'panic', which results in either only reporting bugs or
also panicking on reports.

However, some users may wish to have more control over when KASAN
reports result in a kernel panic: in particular, KASAN reported invalid
_writes_ are of special interest, because they have greater potential to
corrupt random kernel memory or be more easily exploited.

To panic on invalid writes only, introduce 'kasan.fault=panic_on_write',
which allows users to choose to continue running on invalid reads, but
panic only on invalid writes.

Signed-off-by: Marco Elver <elver@google.com>
---
 Documentation/dev-tools/kasan.rst |  7 ++++---
 mm/kasan/report.c                 | 31 ++++++++++++++++++++++++++-----
 2 files changed, 30 insertions(+), 8 deletions(-)
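
The behaviour described in the commit message can be modelled outside the kernel. This is a minimal userspace sketch under stated assumptions: `parse_fault_arg()` and `should_panic()` are hypothetical stand-ins for the patch's `early_kasan_fault()` and the `switch` in `end_report()`. They return values instead of calling `panic()`, and the parser returns -1 rather than `-EINVAL`.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of the patch's enum kasan_arg_fault. */
enum fault_arg {
	FAULT_DEFAULT,
	FAULT_REPORT,
	FAULT_PANIC,
	FAULT_PANIC_ON_WRITE,
};

/* Map a kasan.fault= value to a setting; 0 on success, -1 if unknown. */
int parse_fault_arg(const char *arg, enum fault_arg *out)
{
	if (!arg || !out)
		return -1;

	if (!strcmp(arg, "report"))
		*out = FAULT_REPORT;
	else if (!strcmp(arg, "panic"))
		*out = FAULT_PANIC;
	else if (!strcmp(arg, "panic_on_write"))
		*out = FAULT_PANIC_ON_WRITE;
	else
		return -1;

	return 0;
}

/* Decide whether a report should end in a panic for the given setting. */
bool should_panic(enum fault_arg fault, bool is_write)
{
	switch (fault) {
	case FAULT_DEFAULT:
	case FAULT_REPORT:
		return false;
	case FAULT_PANIC:
		return true;
	case FAULT_PANIC_ON_WRITE:
		return is_write;	/* invalid reads keep running */
	}
	return false;
}
```

In the patch itself, callers that cannot tell reads from writes (invalid frees and asynchronous reports) pass `is_write` as true, so in this model they would call `should_panic(fault, true)`.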