[v3,5/6] kasan: Unset panic_on_warn before calling panic()

Message ID	20200116012321.26254-6-keescook@chromium.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=6pDc=3F=lists.openwall.com=kernel-hardening-return-17565-patchwork-kernel-hardening=patchwork.kernel.org@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B27512084D Mailing-List: contact kernel-hardening-help@lists.openwall.com; run by ezmlm Precedence: bulk From: Kees Cook <keescook@chromium.org> To: Andrew Morton <akpm@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org>, Andrey Ryabinin <aryabinin@virtuozzo.com>, Elena Petrova <lenaptr@google.com>, Alexander Potapenko <glider@google.com>, Dan Carpenter <dan.carpenter@oracle.com>, "Gustavo A. R. Silva" <gustavo@embeddedor.com>, Arnd Bergmann <arnd@arndb.de>, Ard Biesheuvel <ard.biesheuvel@linaro.org>, kasan-dev@googlegroups.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-hardening@lists.openwall.com, syzkaller@googlegroups.com Subject: [PATCH v3 5/6] kasan: Unset panic_on_warn before calling panic() Date: Wed, 15 Jan 2020 17:23:20 -0800 Message-Id: <20200116012321.26254-6-keescook@chromium.org> In-Reply-To: <20200116012321.26254-1-keescook@chromium.org> References: <20200116012321.26254-1-keescook@chromium.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	ubsan: Split out bounds checker \| expand [v3,0/6] ubsan: Split out bounds checker [v3,1/6] ubsan: Add trap instrumentation option [v3,2/6] ubsan: Split "bounds" checker from other options [v3,3/6] lkdtm/bugs: Add arithmetic overflow and array bounds checks [v3,4/6] ubsan: Check panic_on_warn [v3,5/6] kasan: Unset panic_on_warn before calling panic() [v3,6/6] ubsan: Include bug type in report header

Kees Cook Jan. 16, 2020, 1:23 a.m. UTC

As done in the full WARN() handler, panic_on_warn needs to be cleared
before calling panic() to avoid recursive panics.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 mm/kasan/report.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Dmitry Vyukov Jan. 16, 2020, 5:23 a.m. UTC | #1

On Thu, Jan 16, 2020 at 2:24 AM Kees Cook <keescook@chromium.org> wrote:
>
> As done in the full WARN() handler, panic_on_warn needs to be cleared
> before calling panic() to avoid recursive panics.
>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  mm/kasan/report.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> index 621782100eaa..844554e78893 100644
> --- a/mm/kasan/report.c
> +++ b/mm/kasan/report.c
> @@ -92,8 +92,16 @@ static void end_report(unsigned long *flags)
>         pr_err("==================================================================\n");
>         add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
>         spin_unlock_irqrestore(&report_lock, *flags);
> -       if (panic_on_warn)
> +       if (panic_on_warn) {
> +               /*
> +                * This thread may hit another WARN() in the panic path.
> +                * Resetting this prevents additional WARN() from panicking the
> +                * system on this thread.  Other threads are blocked by the
> +                * panic_mutex in panic().

I don't understand part about other threads.
Other threads are not necessary inside of panic(). And in fact since
we reset panic_on_warn, they will not get there even if they should.
If I am reading this correctly, once one thread prints a warning and
is going to panic, other threads may now print infinite amounts of
warning and proceed past them freely. Why is this the behavior we
want?

> +                */
> +               panic_on_warn = 0;
>                 panic("panic_on_warn set ...\n");
> +       }
>         kasan_enable_current();
>  }

Kees Cook Jan. 16, 2020, 11:49 p.m. UTC | #2

On Thu, Jan 16, 2020 at 06:23:01AM +0100, Dmitry Vyukov wrote:
> On Thu, Jan 16, 2020 at 2:24 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > As done in the full WARN() handler, panic_on_warn needs to be cleared
> > before calling panic() to avoid recursive panics.
> >
> > Signed-off-by: Kees Cook <keescook@chromium.org>
> > ---
> >  mm/kasan/report.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> > index 621782100eaa..844554e78893 100644
> > --- a/mm/kasan/report.c
> > +++ b/mm/kasan/report.c
> > @@ -92,8 +92,16 @@ static void end_report(unsigned long *flags)
> >         pr_err("==================================================================\n");
> >         add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> >         spin_unlock_irqrestore(&report_lock, *flags);
> > -       if (panic_on_warn)
> > +       if (panic_on_warn) {
> > +               /*
> > +                * This thread may hit another WARN() in the panic path.
> > +                * Resetting this prevents additional WARN() from panicking the
> > +                * system on this thread.  Other threads are blocked by the
> > +                * panic_mutex in panic().
> 
> I don't understand part about other threads.
> Other threads are not necessary inside of panic(). And in fact since
> we reset panic_on_warn, they will not get there even if they should.
> If I am reading this correctly, once one thread prints a warning and
> is going to panic, other threads may now print infinite amounts of
> warning and proceed past them freely. Why is this the behavior we
> want?

AIUI, the issue is the current thread hitting another WARN and blocking
on trying to call panic again. WARNs encountered during the execution of
panic() need to not attempt to call panic() again.

-Kees

> 
> > +                */
> > +               panic_on_warn = 0;
> >                 panic("panic_on_warn set ...\n");
> > +       }
> >         kasan_enable_current();
> >  }

Dmitry Vyukov Jan. 17, 2020, 9:54 a.m. UTC | #3

On Fri, Jan 17, 2020 at 12:49 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, Jan 16, 2020 at 06:23:01AM +0100, Dmitry Vyukov wrote:
> > On Thu, Jan 16, 2020 at 2:24 AM Kees Cook <keescook@chromium.org> wrote:
> > >
> > > As done in the full WARN() handler, panic_on_warn needs to be cleared
> > > before calling panic() to avoid recursive panics.
> > >
> > > Signed-off-by: Kees Cook <keescook@chromium.org>
> > > ---
> > >  mm/kasan/report.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> > > index 621782100eaa..844554e78893 100644
> > > --- a/mm/kasan/report.c
> > > +++ b/mm/kasan/report.c
> > > @@ -92,8 +92,16 @@ static void end_report(unsigned long *flags)
> > >         pr_err("==================================================================\n");
> > >         add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> > >         spin_unlock_irqrestore(&report_lock, *flags);
> > > -       if (panic_on_warn)
> > > +       if (panic_on_warn) {
> > > +               /*
> > > +                * This thread may hit another WARN() in the panic path.
> > > +                * Resetting this prevents additional WARN() from panicking the
> > > +                * system on this thread.  Other threads are blocked by the
> > > +                * panic_mutex in panic().
> >
> > I don't understand part about other threads.
> > Other threads are not necessary inside of panic(). And in fact since
> > we reset panic_on_warn, they will not get there even if they should.
> > If I am reading this correctly, once one thread prints a warning and
> > is going to panic, other threads may now print infinite amounts of
> > warning and proceed past them freely. Why is this the behavior we
> > want?
>
> AIUI, the issue is the current thread hitting another WARN and blocking
> on trying to call panic again. WARNs encountered during the execution of
> panic() need to not attempt to call panic() again.

Yes, but the variable is global and affects other threads and the
comment talks about other threads, and that's the part I am confused
about (for both comment wording and the actual behavior). For the
"same thread hitting another warning" case we need a per-task flag or
something.

> -Kees
>
> >
> > > +                */
> > > +               panic_on_warn = 0;
> > >                 panic("panic_on_warn set ...\n");
> > > +       }
> > >         kasan_enable_current();
> > >  }
>
> --
> Kees Cook

Kees Cook Jan. 17, 2020, 9:20 p.m. UTC | #4

On Fri, Jan 17, 2020 at 10:54:36AM +0100, Dmitry Vyukov wrote:
> On Fri, Jan 17, 2020 at 12:49 AM Kees Cook <keescook@chromium.org> wrote:
> >
> > On Thu, Jan 16, 2020 at 06:23:01AM +0100, Dmitry Vyukov wrote:
> > > On Thu, Jan 16, 2020 at 2:24 AM Kees Cook <keescook@chromium.org> wrote:
> > > >
> > > > As done in the full WARN() handler, panic_on_warn needs to be cleared
> > > > before calling panic() to avoid recursive panics.
> > > >
> > > > Signed-off-by: Kees Cook <keescook@chromium.org>
> > > > ---
> > > >  mm/kasan/report.c | 10 +++++++++-
> > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> > > > index 621782100eaa..844554e78893 100644
> > > > --- a/mm/kasan/report.c
> > > > +++ b/mm/kasan/report.c
> > > > @@ -92,8 +92,16 @@ static void end_report(unsigned long *flags)
> > > >         pr_err("==================================================================\n");
> > > >         add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> > > >         spin_unlock_irqrestore(&report_lock, *flags);
> > > > -       if (panic_on_warn)
> > > > +       if (panic_on_warn) {
> > > > +               /*
> > > > +                * This thread may hit another WARN() in the panic path.
> > > > +                * Resetting this prevents additional WARN() from panicking the
> > > > +                * system on this thread.  Other threads are blocked by the
> > > > +                * panic_mutex in panic().
> > >
> > > I don't understand part about other threads.
> > > Other threads are not necessary inside of panic(). And in fact since
> > > we reset panic_on_warn, they will not get there even if they should.
> > > If I am reading this correctly, once one thread prints a warning and
> > > is going to panic, other threads may now print infinite amounts of
> > > warning and proceed past them freely. Why is this the behavior we
> > > want?
> >
> > AIUI, the issue is the current thread hitting another WARN and blocking
> > on trying to call panic again. WARNs encountered during the execution of
> > panic() need to not attempt to call panic() again.
> 
> Yes, but the variable is global and affects other threads and the
> comment talks about other threads, and that's the part I am confused
> about (for both comment wording and the actual behavior). For the
> "same thread hitting another warning" case we need a per-task flag or
> something.

This is duplicating the common panic-on-warn logic (see the generic bug
code), so I'd like to just have the same behavior between the three
implementations of panic-on-warn (generic bug, kasan, ubsan), and then
work to merge them into a common handler, and then perhaps fix the
details of the behavior. I think it's more correct to allow the panicing
thread to complete than to care about what the other threads are doing.
Right now, a WARN within the panic code will either a) hang the machine,
or b) not panic, allowing the rest of the threads to continue, maybe
then hitting other WARNs and hanging. The generic bug code does not
suffer from this.

-Kees

> 
> > -Kees
> >
> > >
> > > > +                */
> > > > +               panic_on_warn = 0;
> > > >                 panic("panic_on_warn set ...\n");
> > > > +       }
> > > >         kasan_enable_current();
> > > >  }
> >
> > --
> > Kees Cook

Dmitry Vyukov Jan. 18, 2020, 9:19 a.m. UTC | #5

On Fri, Jan 17, 2020 at 10:20 PM Kees Cook <keescook@chromium.org> wrote:
>
> On Fri, Jan 17, 2020 at 10:54:36AM +0100, Dmitry Vyukov wrote:
> > On Fri, Jan 17, 2020 at 12:49 AM Kees Cook <keescook@chromium.org> wrote:
> > >
> > > On Thu, Jan 16, 2020 at 06:23:01AM +0100, Dmitry Vyukov wrote:
> > > > On Thu, Jan 16, 2020 at 2:24 AM Kees Cook <keescook@chromium.org> wrote:
> > > > >
> > > > > As done in the full WARN() handler, panic_on_warn needs to be cleared
> > > > > before calling panic() to avoid recursive panics.
> > > > >
> > > > > Signed-off-by: Kees Cook <keescook@chromium.org>
> > > > > ---
> > > > >  mm/kasan/report.c | 10 +++++++++-
> > > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/mm/kasan/report.c b/mm/kasan/report.c
> > > > > index 621782100eaa..844554e78893 100644
> > > > > --- a/mm/kasan/report.c
> > > > > +++ b/mm/kasan/report.c
> > > > > @@ -92,8 +92,16 @@ static void end_report(unsigned long *flags)
> > > > >         pr_err("==================================================================\n");
> > > > >         add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
> > > > >         spin_unlock_irqrestore(&report_lock, *flags);
> > > > > -       if (panic_on_warn)
> > > > > +       if (panic_on_warn) {
> > > > > +               /*
> > > > > +                * This thread may hit another WARN() in the panic path.
> > > > > +                * Resetting this prevents additional WARN() from panicking the
> > > > > +                * system on this thread.  Other threads are blocked by the
> > > > > +                * panic_mutex in panic().
> > > >
> > > > I don't understand part about other threads.
> > > > Other threads are not necessary inside of panic(). And in fact since
> > > > we reset panic_on_warn, they will not get there even if they should.
> > > > If I am reading this correctly, once one thread prints a warning and
> > > > is going to panic, other threads may now print infinite amounts of
> > > > warning and proceed past them freely. Why is this the behavior we
> > > > want?
> > >
> > > AIUI, the issue is the current thread hitting another WARN and blocking
> > > on trying to call panic again. WARNs encountered during the execution of
> > > panic() need to not attempt to call panic() again.
> >
> > Yes, but the variable is global and affects other threads and the
> > comment talks about other threads, and that's the part I am confused
> > about (for both comment wording and the actual behavior). For the
> > "same thread hitting another warning" case we need a per-task flag or
> > something.
>
> This is duplicating the common panic-on-warn logic (see the generic bug
> code), so I'd like to just have the same behavior between the three
> implementations of panic-on-warn (generic bug, kasan, ubsan), and then
> work to merge them into a common handler, and then perhaps fix the
> details of the behavior. I think it's more correct to allow the panicing
> thread to complete than to care about what the other threads are doing.
> Right now, a WARN within the panic code will either a) hang the machine,
> or b) not panic, allowing the rest of the threads to continue, maybe
> then hitting other WARNs and hanging. The generic bug code does not
> suffer from this.

I see. Then:

Acked-by: Dmitry Vyukov <dvyukov@google.com>

[v3,5/6] kasan: Unset panic_on_warn before calling panic()

Commit Message

Comments

Patch