mbox series

[next,v2,0/6] exit: Put an upper limit on how often we can oops

Message ID 20221109194404.gonna.558-kees@kernel.org (mailing list archive)
Headers show
Series exit: Put an upper limit on how often we can oops | expand

Message

Kees Cook Nov. 9, 2022, 8 p.m. UTC
Hi,

This builds on Jann's v1 patch[1]. Changes in v2:
- move sysctl into kernel/exit.c (where it belongs)
- expand Documentation slightly

New stuff in v2:
- expose oops_count to sysfs
- consolidate panic_on_warn usage
- introduce warn_limit
- expose warn_count to sysfs

[1] https://lore.kernel.org/lkml/20221107201317.324457-1-jannh@google.com

Jann Horn (1):
  exit: Put an upper limit on how often we can oops

Kees Cook (5):
  panic: Separate sysctl logic from CONFIG_SMP
  exit: Expose "oops_count" to sysfs
  panic: Consolidate open-coded panic_on_warn checks
  panic: Introduce warn_limit
  panic: Expose "warn_count" to sysfs

 .../ABI/testing/sysfs-kernel-oops_count       |  6 ++
 .../ABI/testing/sysfs-kernel-warn_count       |  6 ++
 Documentation/admin-guide/sysctl/kernel.rst   | 17 ++++++
 MAINTAINERS                                   |  2 +
 include/linux/panic.h                         |  1 +
 kernel/exit.c                                 | 60 +++++++++++++++++++
 kernel/kcsan/report.c                         |  3 +-
 kernel/panic.c                                | 44 +++++++++++++-
 kernel/sched/core.c                           |  3 +-
 lib/ubsan.c                                   |  3 +-
 mm/kasan/report.c                             |  4 +-
 mm/kfence/report.c                            |  3 +-
 12 files changed, 139 insertions(+), 13 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-oops_count
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-warn_count

Comments

Luis Chamberlain Nov. 9, 2022, 9:16 p.m. UTC | #1
On Wed, Nov 09, 2022 at 12:00:43PM -0800, Kees Cook wrote:
> Hi,
> 
> This builds on Jann's v1 patch[1]. Changes in v2:
> - move sysctl into kernel/exit.c (where it belongs)
> - expand Documentation slightly
> 
> New stuff in v2:
> - expose oops_count to sysfs
> - consolidate panic_on_warn usage
> - introduce warn_limit
> - expose warn_count to sysfs
> 
> [1] https://lore.kernel.org/lkml/20221107201317.324457-1-jannh@google.com
> 
> Jann Horn (1):
>   exit: Put an upper limit on how often we can oops
> 
> Kees Cook (5):
>   panic: Separate sysctl logic from CONFIG_SMP
>   exit: Expose "oops_count" to sysfs
>   panic: Consolidate open-coded panic_on_warn checks
>   panic: Introduce warn_limit
>   panic: Expose "warn_count" to sysfs

For all:

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>

  Luis
Marco Elver Nov. 14, 2022, 9:48 a.m. UTC | #2
On Wed, 9 Nov 2022 at 21:00, Kees Cook <keescook@chromium.org> wrote:
>
> Like oops_limit, add warn_limit for limiting the number of warnings when
> panic_on_warn is not set.
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: Eric Biggers <ebiggers@google.com>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: tangmeng <tangmeng@uniontech.com>
> Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
> Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: linux-doc@vger.kernel.org
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  Documentation/admin-guide/sysctl/kernel.rst |  9 +++++++++
>  kernel/panic.c                              | 13 +++++++++++++
>  2 files changed, 22 insertions(+)
>
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index 09f3fb2f8585..c385d5319cdf 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -1508,6 +1508,15 @@ entry will default to 2 instead of 0.
>  2 Unprivileged calls to ``bpf()`` are disabled
>  = =============================================================
>
> +
> +warn_limit
> +==========
> +
> +Number of kernel warnings after which the kernel should panic when
> +``panic_on_warn`` is not set. Setting this to 0 or 1 has the same effect
> +as setting ``panic_on_warn=1``.
> +
> +
>  watchdog
>  ========
>
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 3afd234767bc..b235fa4a6fc8 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -58,6 +58,7 @@ bool crash_kexec_post_notifiers;
>  int panic_on_warn __read_mostly;
>  unsigned long panic_on_taint;
>  bool panic_on_taint_nousertaint = false;
> +static unsigned int warn_limit __read_mostly = 10000;
>
>  int panic_timeout = CONFIG_PANIC_TIMEOUT;
>  EXPORT_SYMBOL_GPL(panic_timeout);
> @@ -88,6 +89,13 @@ static struct ctl_table kern_panic_table[] = {
>                 .extra2         = SYSCTL_ONE,
>         },
>  #endif
> +       {
> +               .procname       = "warn_limit",
> +               .data           = &warn_limit,
> +               .maxlen         = sizeof(warn_limit),
> +               .mode           = 0644,
> +               .proc_handler   = proc_douintvec,
> +       },
>         { }
>  };
>
> @@ -203,8 +211,13 @@ static void panic_print_sys_info(bool console_flush)
>
>  void check_panic_on_warn(const char *reason)
>  {
> +       static atomic_t warn_count = ATOMIC_INIT(0);
> +
>         if (panic_on_warn)
>                 panic("%s: panic_on_warn set ...\n", reason);
> +
> +       if (atomic_inc_return(&warn_count) >= READ_ONCE(warn_limit))
> +               panic("Warned too often (warn_limit is %d)", warn_limit);

Shouldn't this also include the "reason", like above? (Presumably a
warning had just been generated to console so the reason is easy
enough to infer from the log, although in that case "reason" also
seems redundant above.)
Kees Cook Nov. 17, 2022, 11:27 p.m. UTC | #3
On Mon, Nov 14, 2022 at 10:48:38AM +0100, Marco Elver wrote:
> On Wed, 9 Nov 2022 at 21:00, Kees Cook <keescook@chromium.org> wrote:
> >
> > Like oops_limit, add warn_limit for limiting the number of warnings when
> > panic_on_warn is not set.
> >
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> > Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> > Cc: Eric Biggers <ebiggers@google.com>
> > Cc: Huang Ying <ying.huang@intel.com>
> > Cc: Petr Mladek <pmladek@suse.com>
> > Cc: tangmeng <tangmeng@uniontech.com>
> > Cc: "Guilherme G. Piccoli" <gpiccoli@igalia.com>
> > Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
> > Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > Cc: linux-doc@vger.kernel.org
> > Signed-off-by: Kees Cook <keescook@chromium.org>
> > ---
> >  Documentation/admin-guide/sysctl/kernel.rst |  9 +++++++++
> >  kernel/panic.c                              | 13 +++++++++++++
> >  2 files changed, 22 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> > index 09f3fb2f8585..c385d5319cdf 100644
> > --- a/Documentation/admin-guide/sysctl/kernel.rst
> > +++ b/Documentation/admin-guide/sysctl/kernel.rst
> > @@ -1508,6 +1508,15 @@ entry will default to 2 instead of 0.
> >  2 Unprivileged calls to ``bpf()`` are disabled
> >  = =============================================================
> >
> > +
> > +warn_limit
> > +==========
> > +
> > +Number of kernel warnings after which the kernel should panic when
> > +``panic_on_warn`` is not set. Setting this to 0 or 1 has the same effect
> > +as setting ``panic_on_warn=1``.
> > +
> > +
> >  watchdog
> >  ========
> >
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index 3afd234767bc..b235fa4a6fc8 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -58,6 +58,7 @@ bool crash_kexec_post_notifiers;
> >  int panic_on_warn __read_mostly;
> >  unsigned long panic_on_taint;
> >  bool panic_on_taint_nousertaint = false;
> > +static unsigned int warn_limit __read_mostly = 10000;
> >
> >  int panic_timeout = CONFIG_PANIC_TIMEOUT;
> >  EXPORT_SYMBOL_GPL(panic_timeout);
> > @@ -88,6 +89,13 @@ static struct ctl_table kern_panic_table[] = {
> >                 .extra2         = SYSCTL_ONE,
> >         },
> >  #endif
> > +       {
> > +               .procname       = "warn_limit",
> > +               .data           = &warn_limit,
> > +               .maxlen         = sizeof(warn_limit),
> > +               .mode           = 0644,
> > +               .proc_handler   = proc_douintvec,
> > +       },
> >         { }
> >  };
> >
> > @@ -203,8 +211,13 @@ static void panic_print_sys_info(bool console_flush)
> >
> >  void check_panic_on_warn(const char *reason)
> >  {
> > +       static atomic_t warn_count = ATOMIC_INIT(0);
> > +
> >         if (panic_on_warn)
> >                 panic("%s: panic_on_warn set ...\n", reason);
> > +
> > +       if (atomic_inc_return(&warn_count) >= READ_ONCE(warn_limit))
> > +               panic("Warned too often (warn_limit is %d)", warn_limit);
> 
> Shouldn't this also include the "reason", like above? (Presumably a
> warning had just been generated to console so the reason is easy
> enough to infer from the log, although in that case "reason" also
> seems redundant above.)

Yeah, that makes sense. I had been thinking that since it was an action
due to repeated prior actions, the current "reason" didn't matter here.
But thinking about it more, I see what you mean. :)