diff mbox series

[bpf-next,v2,3/5] tcp: Use skb__nullable in trace_tcp_send_reset

Message ID 20240905075622.66819-4-lulie@linux.alibaba.com (mailing list archive)
State Superseded
Headers show
Series bpf: Allow skb dynptr for tp_btf | expand

Commit Message

Philo Lu Sept. 5, 2024, 7:56 a.m. UTC
Replace skb with skb__nullable as the argument name. The suffix tells
bpf verifier through btf that the arg could be NULL and should be
checked in tp_btf prog.

For now, this is the only nullable argument in tcp tracepoints.

Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
---
 include/trace/events/tcp.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

Alexei Starovoitov Sept. 6, 2024, 12:26 a.m. UTC | #1
On Thu, Sep 5, 2024 at 12:56 AM Philo Lu <lulie@linux.alibaba.com> wrote:
>
> Replace skb with skb__nullable as the argument name. The suffix tells
> bpf verifier through btf that the arg could be NULL and should be
> checked in tp_btf prog.
>
> For now, this is the only nullable argument in tcp tracepoints.
>
> Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
> ---
>  include/trace/events/tcp.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> index 1c8bd8e186b89..a27c4b619dffd 100644
> --- a/include/trace/events/tcp.h
> +++ b/include/trace/events/tcp.h
> @@ -91,10 +91,10 @@ DEFINE_RST_REASON(FN, FN)
>  TRACE_EVENT(tcp_send_reset,
>
>         TP_PROTO(const struct sock *sk,
> -                const struct sk_buff *skb,
> +                const struct sk_buff *skb__nullable,
>                  const enum sk_rst_reason reason),
>
> -       TP_ARGS(sk, skb, reason),
> +       TP_ARGS(sk, skb__nullable, reason),

netdev folks pls ack this patch.

Yes, it's a bit of a whack a mole and eventually we can get rid of it
with a smarter verifier (likely) or smarter objtool (unlikely).
Long term we should be able to analyze body of TP_fast_assign
automatically and conclude whether it's handling NULL for pointer
arguments or not. bpf verifier can easily do it for bpf code.
We just need to compile TP_fast_assign() as a tiny bpf snippet.
This is work in progress.

For now we have to use a suffix like __nullable.
Jakub Kicinski Sept. 6, 2024, 10:23 p.m. UTC | #2
On Thu, 5 Sep 2024 17:26:42 -0700 Alexei Starovoitov wrote:
> Yes, it's a bit of a whack a mole and eventually we can get rid of it
> with a smarter verifier (likely) or smarter objtool (unlikely).
> Long term we should be able to analyze body of TP_fast_assign
> automatically and conclude whether it's handling NULL for pointer
> arguments or not. bpf verifier can easily do it for bpf code.
> We just need to compile TP_fast_assign() as a tiny bpf snippet.
> This is work in progress.

Can we not wait for that work to conclude, then? AFAIU this whole
patch set is just a minor quality of life improvement for BPF progs
at the expense of carrying questionable changes upstream.
I don't see the urgency.
Alexei Starovoitov Sept. 6, 2024, 10:41 p.m. UTC | #3
On Fri, Sep 6, 2024 at 3:23 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 5 Sep 2024 17:26:42 -0700 Alexei Starovoitov wrote:
> > Yes, it's a bit of a whack a mole and eventually we can get rid of it
> > with a smarter verifier (likely) or smarter objtool (unlikely).
> > Long term we should be able to analyze body of TP_fast_assign
> > automatically and conclude whether it's handling NULL for pointer
> > arguments or not. bpf verifier can easily do it for bpf code.
> > We just need to compile TP_fast_assign() as a tiny bpf snippet.
> > This is work in progress.
>
> Can we not wait for that work to conclude, then? AFAIU this whole
> patch set is just a minor quality of life improvement for BPF progs
> at the expense of carrying questionable changes upstream.
> I don't see the urgency.

The urgency is now because the situation is dire.
The verifier assumes that skb is not null and will remove
if (!skb) check assuming that it's a dead code.
This patch set adds trusted stuff and fixes this issue too
which is the more important part.

Also it's not clear how long it will take to do 'dual compile'.
It's showing promise atm, but timelines are not certain.
If you recall I pushed for 'dual compile' for the last several years and
only now we found time to work on it.
Jakub Kicinski Sept. 6, 2024, 10:57 p.m. UTC | #4
On Fri, 6 Sep 2024 15:41:47 -0700 Alexei Starovoitov wrote:
> The urgency is now because the situation is dire.
> The verifier assumes that skb is not null and will remove
> if (!skb) check assuming that it's a dead code.

Meaning verifier currently isn't ready for patch 4?
Or we can crash 6.11-rc6 by attaching to a trace_tcp_send_reset()
and doing
	printf("%d\n", skb->len);
?
Alexei Starovoitov Sept. 6, 2024, 11:22 p.m. UTC | #5
On Fri, Sep 6, 2024 at 3:57 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 6 Sep 2024 15:41:47 -0700 Alexei Starovoitov wrote:
> > The urgency is now because the situation is dire.
> > The verifier assumes that skb is not null and will remove
> > if (!skb) check assuming that it's a dead code.
>
> Meaning verifier currently isn't ready for patch 4?
> Or we can crash 6.11-rc6 by attaching to a trace_tcp_send_reset()
> and doing
>         printf("%d\n", skb->len);
> ?

depends on the prog type and how it's attached, but yes :(
Without Philo's patches.

It was reported here:
https://lore.kernel.org/bpf/ZrCZS6nisraEqehw@jlelli-thinkpadt14gen4.remote.csb/

Jiri did the analysis. These files would need to be annotated:
include/trace/events/afs.h
include/trace/events/cachefiles.h
include/trace/events/ext4.h
include/trace/events/fib.h
include/trace/events/filelock.h
include/trace/events/host1x.h
include/trace/events/huge_memory.h
include/trace/events/kmem.h
include/trace/events/netfs.h
include/trace/events/power.h
include/trace/events/qdisc.h
include/trace/events/rxrpc.h
include/trace/events/sched.h
include/trace/events/sunrpc.h
include/trace/events/tcp.h
include/trace/events/tegra_apb_dma.h
include/trace/events/timer_migration.h
include/trace/events/writeback.h

which is 18 out of 160.

All other options are worse.
Jakub Kicinski Sept. 7, 2024, 12:17 a.m. UTC | #6
On Fri, 6 Sep 2024 16:22:12 -0700 Alexei Starovoitov wrote:
> > On Fri, 6 Sep 2024 15:41:47 -0700 Alexei Starovoitov wrote:  
> > > The urgency is now because the situation is dire.
> > > The verifier assumes that skb is not null and will remove
> > > if (!skb) check assuming that it's a dead code.  
> >
> > Meaning verifier currently isn't ready for patch 4?
> > Or we can crash 6.11-rc6 by attaching to a trace_tcp_send_reset()
> > and doing
> >         printf("%d\n", skb->len);
> > ?  
> 
> depends on the prog type and how it's attached, but yes :(

I see :( Thought this is just needed for patch 4.
In this case no objections "from networking perspective":

Acked-by: Jakub Kicinski <kuba@kernel.org>

although it feels more like a general tracing question.
diff mbox series

Patch

diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
index 1c8bd8e186b89..a27c4b619dffd 100644
--- a/include/trace/events/tcp.h
+++ b/include/trace/events/tcp.h
@@ -91,10 +91,10 @@  DEFINE_RST_REASON(FN, FN)
 TRACE_EVENT(tcp_send_reset,
 
 	TP_PROTO(const struct sock *sk,
-		 const struct sk_buff *skb,
+		 const struct sk_buff *skb__nullable,
 		 const enum sk_rst_reason reason),
 
-	TP_ARGS(sk, skb, reason),
+	TP_ARGS(sk, skb__nullable, reason),
 
 	TP_STRUCT__entry(
 		__field(const void *, skbaddr)
@@ -106,7 +106,7 @@  TRACE_EVENT(tcp_send_reset,
 	),
 
 	TP_fast_assign(
-		__entry->skbaddr = skb;
+		__entry->skbaddr = skb__nullable;
 		__entry->skaddr = sk;
 		/* Zero means unknown state. */
 		__entry->state = sk ? sk->sk_state : 0;
@@ -118,13 +118,13 @@  TRACE_EVENT(tcp_send_reset,
 			const struct inet_sock *inet = inet_sk(sk);
 
 			TP_STORE_ADDR_PORTS(__entry, inet, sk);
-		} else if (skb) {
-			const struct tcphdr *th = (const struct tcphdr *)skb->data;
+		} else if (skb__nullable) {
+			const struct tcphdr *th = (const struct tcphdr *)skb__nullable->data;
 			/*
 			 * We should reverse the 4-tuple of skb, so later
 			 * it can print the right flow direction of rst.
 			 */
-			TP_STORE_ADDR_PORTS_SKB(skb, th, entry->daddr, entry->saddr);
+			TP_STORE_ADDR_PORTS_SKB(skb__nullable, th, entry->daddr, entry->saddr);
 		}
 		__entry->reason = reason;
 	),