[bpf-next] bpf: Improve bpf_probe_write_user() warning message

Message ID 20241126165414.1378338-1-elver@google.com (mailing list archive)
State New
Series [bpf-next] bpf: Improve bpf_probe_write_user() warning message

Commit Message

Marco Elver Nov. 26, 2024, 4:52 p.m. UTC
The warning message for bpf_probe_write_user() was introduced in
96ae52279594 ("bpf: Add bpf_probe_write_user BPF helper to be called in
tracers"), with the following in the commit message:

    Given this feature is meant for experiments, and it has a risk of
    crashing the system, and running programs, we print a warning on
    when a proglet that attempts to use this helper is installed,
    along with the pid and process name.

In the 8 years since 96ae52279594, bpf_probe_write_user() has found
successful applications beyond experiments [1, 2], for which there are
no other good alternatives. Its intended purpose of "experiments" does
not stop Hyrum's law, and there are likely many more users depending
on this helper: "[..] it does not matter what you promise [..] all
observable behaviors of your system will be depended on by somebody."

As such, the warning message can be improved:

1. The ominous "helper that may corrupt user memory!" offers no real
   benefit, and has been found to cause confusion when the system
   administrator is loading programs with valid use cases.  Remove it.
   No information is lost: administrators who know that their system
   should not load eBPF programs using bpf_probe_write_user() know
   what to look for.

2. If multiple programs with bpf_probe_write_user() are loaded by the
   same task/PID consecutively, only print the message once. If another
   task loads a program with the helper, the message is printed once
   more, and so on. This also makes rate limiting unnecessary.

3. Every printk line needs to be concluded with "\n" to be flushed. With
   the old version the warning message only appeared after any following
   printk. Fix this.
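
The behaviour described in point 2 can be modelled in userspace roughly
as follows. This is only a sketch: should_warn(), count_warnings() and
the plain int PIDs are illustrative names invented here, not kernel API,
and the model ignores concurrency (the patch itself uses READ_ONCE() and
WRITE_ONCE() on the static variable).

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace model of "warn once per consecutive loader".
 * should_warn() and loader_pid are hypothetical names for illustration.
 */
static bool should_warn(int loader_pid)
{
	/* -1 never matches a real PID, so the first caller always warns. */
	static int last_warn_pid = -1;

	if (last_warn_pid == loader_pid)
		return false;	/* same task as the previous load: stay quiet */

	last_warn_pid = loader_pid;
	return true;
}

/* Returns how many warnings a sequence of program loads would print. */
static int count_warnings(const int *pids, int n)
{
	int warnings = 0;

	for (int i = 0; i < n; i++) {
		if (should_warn(pids[i])) {
			/* Trailing "\n" so the line is emitted immediately. */
			printf("[%d] is installing a program with bpf_probe_write_user\n",
			       pids[i]);
			warnings++;
		}
	}
	return warnings;
}
```

Only the most recent PID is remembered, so two tasks that alternate
loads would each trigger the message every time; the scheme suppresses
repeats from a single task loading many programs in a row.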

Link: https://lore.kernel.org/lkml/20240404190146.1898103-1-elver@google.com/ [1]
Link: https://lore.kernel.org/r/lkml/CAAn3qOUMD81-vxLLfep0H6rRd74ho2VaekdL4HjKq+Y1t9KdXQ@mail.gmail.com/ [2]
Signed-off-by: Marco Elver <elver@google.com>
---
 kernel/trace/bpf_trace.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Andrii Nakryiko Nov. 26, 2024, 9:32 p.m. UTC | #1
On Tue, Nov 26, 2024 at 8:54 AM Marco Elver <elver@google.com> wrote:
>
> The warning message for bpf_probe_write_user() was introduced in
> 96ae52279594 ("bpf: Add bpf_probe_write_user BPF helper to be called in
> tracers"), with the following in the commit message:
>
>     Given this feature is meant for experiments, and it has a risk of
>     crashing the system, and running programs, we print a warning on
>     when a proglet that attempts to use this helper is installed,
>     along with the pid and process name.
>
> After 8 years since 96ae52279594, bpf_probe_write_user() has found
> successful applications beyond experiments [1, 2], with no other good
> alternatives. Despite its intended purpose for "experiments", that
> doesn't stop Hyrum's law, and there are likely many more users depending
> on this helper: "[..] it does not matter what you promise [..] all
> observable behaviors of your system will be depended on by somebody."
>
> As such, the warning message can be improved:
>
> 1. The ominous "helper that may corrupt user memory!" offers no real
>    benefit, and has been found to lead to confusion where the system
>    administrator is loading programs with valid use cases.  Remove it.
>    No information is lost, and administrators who know their system
>    should not load eBPF programs that use bpf_probe_write_user() know
>    what they are looking for.
>
> 2. If multiple programs with bpf_probe_write_user() are loaded by the
>    same task/PID consecutively, only print the message once. If another
>    task loads a program with the helper, the message is printed once
>    more, and so on. This also makes the need for rate limiting
>    redundant.
>
> 3. Every printk line needs to be concluded with "\n" to be flushed. With
>    the old version the warning message only appeared after any following
>    printk. Fix this.
>
> Link: https://lore.kernel.org/lkml/20240404190146.1898103-1-elver@google.com/ [1]
> Link: https://lore.kernel.org/r/lkml/CAAn3qOUMD81-vxLLfep0H6rRd74ho2VaekdL4HjKq+Y1t9KdXQ@mail.gmail.com/ [2]
> Signed-off-by: Marco Elver <elver@google.com>
> ---
>  kernel/trace/bpf_trace.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 630b763e5240..0ead3d66f8db 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -359,11 +359,16 @@ static const struct bpf_func_proto bpf_probe_write_user_proto = {
>
>  static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
>  {
> +       static pid_t last_warn_pid = -1;
> +
>         if (!capable(CAP_SYS_ADMIN))
>                 return NULL;
>
> -       pr_warn_ratelimited("%s[%d] is installing a program with bpf_probe_write_user helper that may corrupt user memory!",
> -                           current->comm, task_pid_nr(current));
> +       if (READ_ONCE(last_warn_pid) != task_pid_nr(current)) {
> +               pr_warn("%s[%d] is installing a program with bpf_probe_write_user\n",
> +                       current->comm, task_pid_nr(current));
> +               WRITE_ONCE(last_warn_pid, task_pid_nr(current));
> +       }

should we just drop this warning altogether? After all, we can call
crash_kexec() without any warnings, if we have the right capabilities.
bpf_probe_write_user() is much less destructive and at worst will
cause memory corruption within a single process (assuming
CAP_SYS_ADMIN, of course). If yes, I think we should drop
bpf_get_probe_write_proto() function altogether and refactor
bpf_tracing_func_proto() to have
bpf_token_capable(CAP_SYS_ADMIN)-guarded section, just like
bpf_base_func_proto() has.

>
>         return &bpf_probe_write_user_proto;
>  }
> --
> 2.47.0.338.g60cca15819-goog
>
Alexei Starovoitov Nov. 27, 2024, 12:52 a.m. UTC | #2
On Tue, Nov 26, 2024 at 1:32 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Nov 26, 2024 at 8:54 AM Marco Elver <elver@google.com> wrote:
> >
> > The warning message for bpf_probe_write_user() was introduced in
> > 96ae52279594 ("bpf: Add bpf_probe_write_user BPF helper to be called in
> > tracers"), with the following in the commit message:
> >
> >     Given this feature is meant for experiments, and it has a risk of
> >     crashing the system, and running programs, we print a warning on
> >     when a proglet that attempts to use this helper is installed,
> >     along with the pid and process name.
> >
> > After 8 years since 96ae52279594, bpf_probe_write_user() has found
> > successful applications beyond experiments [1, 2], with no other good
> > alternatives. Despite its intended purpose for "experiments", that
> > doesn't stop Hyrum's law, and there are likely many more users depending
> > on this helper: "[..] it does not matter what you promise [..] all
> > observable behaviors of your system will be depended on by somebody."
> >
> > As such, the warning message can be improved:
> >
> > 1. The ominous "helper that may corrupt user memory!" offers no real
> >    benefit, and has been found to lead to confusion where the system
> >    administrator is loading programs with valid use cases.  Remove it.
> >    No information is lost, and administrators who know their system
> >    should not load eBPF programs that use bpf_probe_write_user() know
> >    what they are looking for.
> >
> > 2. If multiple programs with bpf_probe_write_user() are loaded by the
> >    same task/PID consecutively, only print the message once. If another
> >    task loads a program with the helper, the message is printed once
> >    more, and so on. This also makes the need for rate limiting
> >    redundant.
> >
> > 3. Every printk line needs to be concluded with "\n" to be flushed. With
> >    the old version the warning message only appeared after any following
> >    printk. Fix this.
> >
> > Link: https://lore.kernel.org/lkml/20240404190146.1898103-1-elver@google.com/ [1]
> > Link: https://lore.kernel.org/r/lkml/CAAn3qOUMD81-vxLLfep0H6rRd74ho2VaekdL4HjKq+Y1t9KdXQ@mail.gmail.com/ [2]
> > Signed-off-by: Marco Elver <elver@google.com>
> > ---
> >  kernel/trace/bpf_trace.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 630b763e5240..0ead3d66f8db 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -359,11 +359,16 @@ static const struct bpf_func_proto bpf_probe_write_user_proto = {
> >
> >  static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
> >  {
> > +       static pid_t last_warn_pid = -1;
> > +
> >         if (!capable(CAP_SYS_ADMIN))
> >                 return NULL;
> >
> > -       pr_warn_ratelimited("%s[%d] is installing a program with bpf_probe_write_user helper that may corrupt user memory!",
> > -                           current->comm, task_pid_nr(current));
> > +       if (READ_ONCE(last_warn_pid) != task_pid_nr(current)) {
> > +               pr_warn("%s[%d] is installing a program with bpf_probe_write_user\n",
> > +                       current->comm, task_pid_nr(current));
> > +               WRITE_ONCE(last_warn_pid, task_pid_nr(current));
> > +       }
>
> should we just drop this warning altogether? After all, we can call
> crash_kexec() without any warnings, if we have the right capabilities.
> bpf_probe_write_user() is much less destructive and at worst will
> cause memory corruption within a single process (assuming
> CAP_SYS_ADMIN, of course). If yes, I think we should drop
> bpf_get_probe_write_proto() function altogether and refactor
> bpf_tracing_func_proto() to have
> bpf_token_capable(CAP_SYS_ADMIN)-guarded section, just like
> bpf_base_func_proto() has.

+1
Let's just remove this warn. It didn't stop anyone from using it so far.
Marco Elver Nov. 27, 2024, 10:39 a.m. UTC | #3
On Tue, 26 Nov 2024 at 22:32, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
[...]
> should we just drop this warning altogether? After all, we can call

I'm in favour.

> crash_kexec() without any warnings, if we have the right capabilities.
> bpf_probe_write_user() is much less destructive and at worst will
> cause memory corruption within a single process (assuming
> CAP_SYS_ADMIN, of course). If yes, I think we should drop
> bpf_get_probe_write_proto() function altogether and refactor
> bpf_tracing_func_proto() to have
> bpf_token_capable(CAP_SYS_ADMIN)-guarded section, just like
> bpf_base_func_proto() has.

I'll do that too, but as a separate patch 2/2, since that simplifies
backporting the removal of the warning to older kernels.
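
The refactoring discussed above (dropping the dedicated getter in favour
of a capability-guarded section in the lookup function) might have
roughly the following shape. This is a userspace sketch under stated
assumptions, not the kernel's code: enum helper_id, struct helper_proto,
lookup_helper() and the has_sys_admin flag are all hypothetical
stand-ins, and the real bpf_tracing_func_proto() and
bpf_token_capable() have different signatures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
enum helper_id { HELPER_GET_PID, HELPER_PROBE_WRITE_USER, HELPER_UNKNOWN };

struct helper_proto { const char *name; };

static const struct helper_proto get_pid_proto = { "get_pid" };
static const struct helper_proto probe_write_user_proto = { "probe_write_user" };

/* has_sys_admin models the outcome of a CAP_SYS_ADMIN check. */
static const struct helper_proto *lookup_helper(enum helper_id id,
						bool has_sys_admin)
{
	/* Helpers available to every loader. */
	switch (id) {
	case HELPER_GET_PID:
		return &get_pid_proto;
	default:
		break;
	}

	/* Everything below needs the capability; no warning is printed. */
	if (!has_sys_admin)
		return NULL;

	switch (id) {
	case HELPER_PROBE_WRITE_USER:
		return &probe_write_user_proto;
	default:
		return NULL;
	}
}
```

Grouping privileged helpers behind one check keeps the capability test
in a single place, so adding another privileged helper only means adding
a case to the second switch.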

Patch

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 630b763e5240..0ead3d66f8db 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -359,11 +359,16 @@  static const struct bpf_func_proto bpf_probe_write_user_proto = {
 
 static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
 {
+	static pid_t last_warn_pid = -1;
+
 	if (!capable(CAP_SYS_ADMIN))
 		return NULL;
 
-	pr_warn_ratelimited("%s[%d] is installing a program with bpf_probe_write_user helper that may corrupt user memory!",
-			    current->comm, task_pid_nr(current));
+	if (READ_ONCE(last_warn_pid) != task_pid_nr(current)) {
+		pr_warn("%s[%d] is installing a program with bpf_probe_write_user\n",
+			current->comm, task_pid_nr(current));
+		WRITE_ONCE(last_warn_pid, task_pid_nr(current));
+	}
 
 	return &bpf_probe_write_user_proto;
 }