kasan: add support for kasan.fault=panic_on_write

Message ID 20230614095158.1133673-1-elver@google.com
State New

Commit Message

Marco Elver June 14, 2023, 9:51 a.m. UTC

KASAN's boot time kernel parameter 'kasan.fault=' currently supports
'report' and 'panic', which results in either only reporting bugs or
also panicking on reports.

However, some users may wish to have more control over when KASAN
reports result in a kernel panic: in particular, KASAN reported invalid
_writes_ are of special interest, because they have greater potential to
corrupt random kernel memory or be more easily exploited.

To panic on invalid writes only, introduce 'kasan.fault=panic_on_write',
which allows users to choose to continue running on invalid reads, but
panic only on invalid writes.

Signed-off-by: Marco Elver <elver@google.com>
---
 Documentation/dev-tools/kasan.rst |  7 ++++---
 mm/kasan/report.c                 | 31 ++++++++++++++++++++++++++-----
 2 files changed, 30 insertions(+), 8 deletions(-)
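
The behaviour described in the commit message can be sketched as a small user-space decision function. This is only an illustration of the three settings, not the kernel code itself: the names `fault_policy` and `should_panic` are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space mirror of the three kasan.fault= settings. */
enum fault_policy {
	FAULT_REPORT,         /* kasan.fault=report (the default) */
	FAULT_PANIC,          /* kasan.fault=panic */
	FAULT_PANIC_ON_WRITE, /* kasan.fault=panic_on_write */
};

/* Returns true if a report for the given access should end in a panic. */
static bool should_panic(enum fault_policy policy, bool is_write)
{
	switch (policy) {
	case FAULT_REPORT:
		return false;    /* only print the report */
	case FAULT_PANIC:
		return true;     /* panic on every report */
	case FAULT_PANIC_ON_WRITE:
		return is_write; /* keep running after invalid reads */
	}
	assert(0 && "unknown fault policy");
	return false;
}
```

With this model, `should_panic(FAULT_PANIC_ON_WRITE, false)` is false and `should_panic(FAULT_PANIC_ON_WRITE, true)` is true, which matches the intent of continuing on invalid reads while panicking on invalid writes.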

Comments

Alexander Potapenko June 14, 2023, 10:06 a.m. UTC | #1
On Wed, Jun 14, 2023 at 11:52 AM Marco Elver <elver@google.com> wrote:
>
> KASAN's boot time kernel parameter 'kasan.fault=' currently supports
> 'report' and 'panic', which results in either only reporting bugs or
> also panicking on reports.
>
> However, some users may wish to have more control over when KASAN
> reports result in a kernel panic: in particular, KASAN reported invalid
> _writes_ are of special interest, because they have greater potential to
> corrupt random kernel memory or be more easily exploited.
>
> To panic on invalid writes only, introduce 'kasan.fault=panic_on_write',
> which allows users to choose to continue running on invalid reads, but
> panic only on invalid writes.
>
> Signed-off-by: Marco Elver <elver@google.com>
Reviewed-by: Alexander Potapenko <glider@google.com>

Thanks!

Andrey Konovalov June 20, 2023, 10:57 a.m. UTC | #2
On Wed, Jun 14, 2023 at 11:52 AM Marco Elver <elver@google.com> wrote:
>
> @@ -597,7 +614,11 @@ void kasan_report_async(void)
>         pr_err("Asynchronous fault: no details available\n");
>         pr_err("\n");
>         dump_stack_lvl(KERN_ERR);
> -       end_report(&flags, NULL);
> +       /*
> +        * Conservatively set is_write=true, because no details are available.
> +        * In this mode, kasan.fault=panic_on_write is like kasan.fault=panic.
> +        */
> +       end_report(&flags, NULL, true);

Hi Marco,

When asymm mode is enabled, kasan_report_async should only be called
for read accesses. I think we could check the mode and panic
accordingly.

Please also update the documentation to describe the flag behavior wrt
async/asymm modes.

On a related note, it looks like we have a typo in KASAN
documentation: it states that asymm mode detects reads synchronously,
and writes - asynchronously. Should be the reverse.

Thanks!

Marco Elver June 20, 2023, 11:32 a.m. UTC | #3
On Tue, 20 Jun 2023 at 12:57, Andrey Konovalov <andreyknvl@gmail.com> wrote:
>
> On Wed, Jun 14, 2023 at 11:52 AM Marco Elver <elver@google.com> wrote:
> >
> > @@ -597,7 +614,11 @@ void kasan_report_async(void)
> >         pr_err("Asynchronous fault: no details available\n");
> >         pr_err("\n");
> >         dump_stack_lvl(KERN_ERR);
> > -       end_report(&flags, NULL);
> > +       /*
> > +        * Conservatively set is_write=true, because no details are available.
> > +        * In this mode, kasan.fault=panic_on_write is like kasan.fault=panic.
> > +        */
> > +       end_report(&flags, NULL, true);
>
> Hi Marco,
>
> When asymm mode is enabled, kasan_report_async should only be called
> for read accesses. I think we could check the mode and panic
> accordingly.

How do we check the mode, and how do we prove it's only called for
read accesses?

> Please also update the documentation to describe the flag behavior wrt
> async/asymm modes.

Will do.

> On a related note, it looks like we have a typo in KASAN
> documentation: it states that asymm mode detects reads synchronously,
> and writes - asynchronously. Should be the reverse.

This says the documentation is correct, and it's actually called for
writes: https://docs.kernel.org/arm64/memory-tagging-extension.html#tag-check-faults

Who is right?

Andrey Konovalov June 20, 2023, 11:45 a.m. UTC | #4
On Tue, Jun 20, 2023 at 1:33 PM Marco Elver <elver@google.com> wrote:
>
> > On a related note, it looks like we have a typo in KASAN
> > documentation: it states that asymm mode detects reads synchronously,
> > and writes - asynchronously. Should be the reverse.
>
> This says the documentation is correct, and it's actually called for
> writes: https://docs.kernel.org/arm64/memory-tagging-extension.html#tag-check-faults
>
> Who is right?

Ah, right. I did a quick google to check when I was writing the
response and found this: https://lwn.net/Articles/882963/. But looks
like that cover letter is wrong and the documentation is right. I
wonder what the point of the asymmetric mode is then.

So the current code that you have should work perfectly. The only
change I'd like to see is in the documentation.

Thanks!

Marco Elver June 20, 2023, 11:51 a.m. UTC | #5
On Tue, Jun 20, 2023 at 01:45PM +0200, Andrey Konovalov wrote:
> On Tue, Jun 20, 2023 at 1:33 PM Marco Elver <elver@google.com> wrote:
> >
> > > On a related note, it looks like we have a typo in KASAN
> > > documentation: it states that asymm mode detects reads synchronously,
> > > and writes - asynchronously. Should be the reverse.
> >
> > This says the documentation is correct, and it's actually called for
> > writes: https://docs.kernel.org/arm64/memory-tagging-extension.html#tag-check-faults
> >
> > Who is right?
> 
> Ah, right. I did a quick google to check when I was writing the
> response and found this: https://lwn.net/Articles/882963/. But looks
> like that cover letter is wrong and the documentation is right. I
> wonder what the point of the asymmetric mode is then.

Maybe not as strong, but asymm mode makes sense from a microarch point
of view, where writes are always committed into a store buffer, but
reads can only commit when the data (incl. tag) is available.

> So the current code that you have should work perfectly. The only
> change I'd like to see is in the documentation.

Something like this (or more?)

diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index 7f37a46af574..3c58392d931e 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -135,6 +135,8 @@ disabling KASAN altogether or controlling its features:
   fault occurs, the information is stored in hardware (in the TFSR_EL1
   register for arm64). The kernel periodically checks the hardware and
   only reports tag faults during these checks.
+  Note that ``kasan.fault=panic_on_write`` results in panic for all
+  asynchronously checked accesses.
   Asymmetric mode: a bad access is detected synchronously on reads and
   asynchronously on writes.

Andrey Konovalov June 20, 2023, 1:56 p.m. UTC | #6
On Tue, Jun 20, 2023 at 1:51 PM Marco Elver <elver@google.com> wrote:
>
> > Ah, right. I did a quick google to check when I was writing the
> > response and found this: https://lwn.net/Articles/882963/. But looks
> > like that cover letter is wrong and the documentation is right. I
> > wonder what the point of the asymmetric mode is then.
>
> Maybe not as strong, but asymm mode makes sense from a microarch point
> of view, where writes are always committed into a store buffer, but
> reads can only commit when the data (incl. tag) is available.

Yeah, I get that it can be a bit better than async with a similar
slowdown, but there's little value in catching only reads from the
security standpoint.

> > So the current code that you have should work perfectly. The only
> > change I'd like to see is in the documentation.
>
> Something like this (or more?)
>
> diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
> index 7f37a46af574..3c58392d931e 100644
> --- a/Documentation/dev-tools/kasan.rst
> +++ b/Documentation/dev-tools/kasan.rst
> @@ -135,6 +135,8 @@ disabling KASAN altogether or controlling its features:
>    fault occurs, the information is stored in hardware (in the TFSR_EL1
>    register for arm64). The kernel periodically checks the hardware and
>    only reports tag faults during these checks.
> +  Note that ``kasan.fault=panic_on_write`` results in panic for all
> +  asynchronously checked accesses.
>    Asymmetric mode: a bad access is detected synchronously on reads and
>    asynchronously on writes.

Could you move this to the section that describes the kasan.fault
flag? This seems more consistent.

Thanks!

Marco Elver June 20, 2023, 2:48 p.m. UTC | #7
On Tue, Jun 20, 2023 at 03:56PM +0200, Andrey Konovalov wrote:
...
> Could you move this to the section that describes the kasan.fault
> flag? This seems more consistent.

Like this?


diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index 7f37a46af574..f4acf9c2e90f 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -110,7 +110,9 @@ parameter can be used to control panic and reporting behaviour:
 - ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
   to only print a KASAN report, panic the kernel, or panic the kernel on
   invalid writes only (default: ``report``). The panic happens even if
-  ``kasan_multi_shot`` is enabled.
+  ``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of
+  Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on
+  asynchronously checked accesses (including reads).
 
 Software and Hardware Tag-Based KASAN modes (see the section about various
 modes below) support altering stack trace collection behavior:

Andrey Konovalov June 20, 2023, 4:27 p.m. UTC | #8
On Tue, Jun 20, 2023 at 4:49 PM Marco Elver <elver@google.com> wrote:
>
> On Tue, Jun 20, 2023 at 03:56PM +0200, Andrey Konovalov wrote:
> ...
> > Could you move this to the section that describes the kasan.fault
> > flag? This seems more consistent.
>
> Like this?
>
>
> diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
> index 7f37a46af574..f4acf9c2e90f 100644
> --- a/Documentation/dev-tools/kasan.rst
> +++ b/Documentation/dev-tools/kasan.rst
> @@ -110,7 +110,9 @@ parameter can be used to control panic and reporting behaviour:
>  - ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
>    to only print a KASAN report, panic the kernel, or panic the kernel on
>    invalid writes only (default: ``report``). The panic happens even if
> -  ``kasan_multi_shot`` is enabled.
> +  ``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of
> +  Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on
> +  asynchronously checked accesses (including reads).
>
>  Software and Hardware Tag-Based KASAN modes (see the section about various
>  modes below) support altering stack trace collection behavior:

Yes, this looks great! Thanks!

Marco Elver June 20, 2023, 5:23 p.m. UTC | #9
On Tue, Jun 20, 2023 at 06:27PM +0200, Andrey Konovalov wrote:
> On Tue, Jun 20, 2023 at 4:49 PM Marco Elver <elver@google.com> wrote:
> >
> > On Tue, Jun 20, 2023 at 03:56PM +0200, Andrey Konovalov wrote:
> > ...
> > > Could you move this to the section that describes the kasan.fault
> > > flag? This seems more consistent.
> >
> > Like this?
> >
> >
> > diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
> > index 7f37a46af574..f4acf9c2e90f 100644
> > --- a/Documentation/dev-tools/kasan.rst
> > +++ b/Documentation/dev-tools/kasan.rst
> > @@ -110,7 +110,9 @@ parameter can be used to control panic and reporting behaviour:
> >  - ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
> >    to only print a KASAN report, panic the kernel, or panic the kernel on
> >    invalid writes only (default: ``report``). The panic happens even if
> > -  ``kasan_multi_shot`` is enabled.
> > +  ``kasan_multi_shot`` is enabled. Note that when using asynchronous mode of
> > +  Hardware Tag-Based KASAN, ``kasan.fault=panic_on_write`` always panics on
> > +  asynchronously checked accesses (including reads).
> >
> >  Software and Hardware Tag-Based KASAN modes (see the section about various
> >  modes below) support altering stack trace collection behavior:
> 
> Yes, this looks great! Thanks!

The patch here is already in mm-stable (which I recall doesn't do
rebases?), so I sent

 https://lkml.kernel.org/r/ZJHfL6vavKUZ3Yd8@elver.google.com

to be used as a fixup or just added to mm-stable by Andrew at one point
or another as well.

Thanks,
-- Marco
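
The outcome of the discussion above can be summarised as a small user-space model. It assumes what the thread settled on: in asymmetric mode reads are checked synchronously and writes asynchronously, and asynchronous reports are conservatively treated as writes because no access details are available. The names `tag_check_mode`, `checked_async`, and `panic_on_write_panics` are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the Hardware Tag-Based KASAN checking modes. */
enum tag_check_mode { MODE_SYNC, MODE_ASYNC, MODE_ASYMM };

/* True if this access type is checked asynchronously in the given mode. */
static bool checked_async(enum tag_check_mode mode, bool is_write)
{
	switch (mode) {
	case MODE_SYNC:
		return false;    /* reads and writes checked synchronously */
	case MODE_ASYNC:
		return true;     /* reads and writes checked asynchronously */
	case MODE_ASYMM:
		return is_write; /* reads synchronous, writes asynchronous */
	}
	assert(0 && "unknown tag check mode");
	return false;
}

/* True if kasan.fault=panic_on_write would panic on this bad access. */
static bool panic_on_write_panics(enum tag_check_mode mode, bool is_write)
{
	/* Async reports carry no details, so they are treated as writes. */
	if (checked_async(mode, is_write))
		return true;
	return is_write;
}
```

Under these assumptions, asynchronous mode panics on every reported access, including reads, while synchronous and asymmetric modes panic only on writes. That is why the conservative `is_write=true` in `kasan_report_async()` needed a documentation note rather than a code change.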
Patch

diff --git a/Documentation/dev-tools/kasan.rst b/Documentation/dev-tools/kasan.rst
index e66916a483cd..7f37a46af574 100644
--- a/Documentation/dev-tools/kasan.rst
+++ b/Documentation/dev-tools/kasan.rst
@@ -107,9 +107,10 @@  effectively disables ``panic_on_warn`` for KASAN reports.
 Alternatively, independent of ``panic_on_warn``, the ``kasan.fault=`` boot
 parameter can be used to control panic and reporting behaviour:
 
-- ``kasan.fault=report`` or ``=panic`` controls whether to only print a KASAN
-  report or also panic the kernel (default: ``report``). The panic happens even
-  if ``kasan_multi_shot`` is enabled.
+- ``kasan.fault=report``, ``=panic``, or ``=panic_on_write`` controls whether
+  to only print a KASAN report, panic the kernel, or panic the kernel on
+  invalid writes only (default: ``report``). The panic happens even if
+  ``kasan_multi_shot`` is enabled.
 
 Software and Hardware Tag-Based KASAN modes (see the section about various
 modes below) support altering stack trace collection behavior:
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 892a9dc9d4d3..f8ac4d0c9848 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -43,6 +43,7 @@  enum kasan_arg_fault {
 	KASAN_ARG_FAULT_DEFAULT,
 	KASAN_ARG_FAULT_REPORT,
 	KASAN_ARG_FAULT_PANIC,
+	KASAN_ARG_FAULT_PANIC_ON_WRITE,
 };
 
 static enum kasan_arg_fault kasan_arg_fault __ro_after_init = KASAN_ARG_FAULT_DEFAULT;
@@ -57,6 +58,8 @@  static int __init early_kasan_fault(char *arg)
 		kasan_arg_fault = KASAN_ARG_FAULT_REPORT;
 	else if (!strcmp(arg, "panic"))
 		kasan_arg_fault = KASAN_ARG_FAULT_PANIC;
+	else if (!strcmp(arg, "panic_on_write"))
+		kasan_arg_fault = KASAN_ARG_FAULT_PANIC_ON_WRITE;
 	else
 		return -EINVAL;
 
@@ -211,7 +214,7 @@  static void start_report(unsigned long *flags, bool sync)
 	pr_err("==================================================================\n");
 }
 
-static void end_report(unsigned long *flags, void *addr)
+static void end_report(unsigned long *flags, void *addr, bool is_write)
 {
 	if (addr)
 		trace_error_report_end(ERROR_DETECTOR_KASAN,
@@ -220,8 +223,18 @@  static void end_report(unsigned long *flags, void *addr)
 	spin_unlock_irqrestore(&report_lock, *flags);
 	if (!test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags))
 		check_panic_on_warn("KASAN");
-	if (kasan_arg_fault == KASAN_ARG_FAULT_PANIC)
+	switch (kasan_arg_fault) {
+	case KASAN_ARG_FAULT_DEFAULT:
+	case KASAN_ARG_FAULT_REPORT:
+		break;
+	case KASAN_ARG_FAULT_PANIC:
 		panic("kasan.fault=panic set ...\n");
+		break;
+	case KASAN_ARG_FAULT_PANIC_ON_WRITE:
+		if (is_write)
+			panic("kasan.fault=panic_on_write set ...\n");
+		break;
+	}
 	add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
 	lockdep_on();
 	report_suppress_stop();
@@ -536,7 +549,11 @@  void kasan_report_invalid_free(void *ptr, unsigned long ip, enum kasan_report_ty
 
 	print_report(&info);
 
-	end_report(&flags, ptr);
+	/*
+	 * Invalid free is considered a "write" since the allocator's metadata
+	 * updates involves writes.
+	 */
+	end_report(&flags, ptr, true);
 }
 
 /*
@@ -571,7 +588,7 @@  bool kasan_report(unsigned long addr, size_t size, bool is_write,
 
 	print_report(&info);
 
-	end_report(&irq_flags, ptr);
+	end_report(&irq_flags, ptr, is_write);
 
 out:
 	user_access_restore(ua_flags);
@@ -597,7 +614,11 @@  void kasan_report_async(void)
 	pr_err("Asynchronous fault: no details available\n");
 	pr_err("\n");
 	dump_stack_lvl(KERN_ERR);
-	end_report(&flags, NULL);
+	/*
+	 * Conservatively set is_write=true, because no details are available.
+	 * In this mode, kasan.fault=panic_on_write is like kasan.fault=panic.
+	 */
+	end_report(&flags, NULL, true);
 }
 #endif /* CONFIG_KASAN_HW_TAGS */
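
For readers who want to experiment with the parameter parsing outside the kernel, the following is a minimal user-space sketch modelled on the `early_kasan_fault()` hunk above. The function name `parse_fault_arg`, the `fault_arg` enum, and the output-pointer interface are assumptions made for this example; the kernel version stores the result in a static variable instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space counterpart of enum kasan_arg_fault. */
enum fault_arg { ARG_DEFAULT, ARG_REPORT, ARG_PANIC, ARG_PANIC_ON_WRITE };

/*
 * Parses the value part of "kasan.fault=<value>".
 * Returns 0 and stores the result in *out on success, or -EINVAL
 * (leaving *out untouched) if the value is missing or unknown.
 */
static int parse_fault_arg(const char *arg, enum fault_arg *out)
{
	assert(out != NULL);

	if (!arg)
		return -EINVAL;

	/* strcmp() matches whole strings, so "panic" never shadows
	 * "panic_on_write" regardless of the order of the checks. */
	if (!strcmp(arg, "report"))
		*out = ARG_REPORT;
	else if (!strcmp(arg, "panic"))
		*out = ARG_PANIC;
	else if (!strcmp(arg, "panic_on_write"))
		*out = ARG_PANIC_ON_WRITE;
	else
		return -EINVAL;

	return 0;
}
```

Calling `parse_fault_arg("panic_on_write", &v)` returns 0 and sets `v` to `ARG_PANIC_ON_WRITE`, while an unrecognised value such as `"panic_on_read"` returns `-EINVAL` and leaves `v` unchanged.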