[v1,bpf] net: Annotate rx_sk with __nullable for trace_kfree_skb.

Message ID 20250201001425.42377-1-kuniyu@amazon.com (mailing list archive)
State Superseded
Delegated to: BPF
Series [v1,bpf] net: Annotate rx_sk with __nullable for trace_kfree_skb.

Checks

Context Check Description
bpf/vmtest-bpf-PR success PR summary
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success Errors and warnings before: 26 (+1) this patch: 26 (+1)
netdev/cc_maintainers fail 1 blamed authors not CCed: hawk@kernel.org; 5 maintainers not CCed: mathieu.desnoyers@efficios.com linux-trace-kernel@vger.kernel.org rostedt@goodmis.org hawk@kernel.org mhiramat@kernel.org
netdev/build_clang success Errors and warnings before: 10 this patch: 10
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 11 this patch: 11
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 19 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-10 success Logs for aarch64-gcc / veristat-kernel
bpf/vmtest-bpf-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-11 success Logs for aarch64-gcc / veristat-meta
bpf/vmtest-bpf-VM_Test-12 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-VM_Test-18 success Logs for s390x-gcc / veristat-meta
bpf/vmtest-bpf-VM_Test-13 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-VM_Test-17 success Logs for s390x-gcc / veristat-kernel
bpf/vmtest-bpf-VM_Test-20 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-19 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-21 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-VM_Test-22 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-27 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-30 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-28 success Logs for x86_64-gcc / veristat-kernel / x86_64-gcc veristat_kernel
bpf/vmtest-bpf-VM_Test-29 success Logs for x86_64-gcc / veristat-meta / x86_64-gcc veristat_meta
bpf/vmtest-bpf-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-35 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-31 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-VM_Test-36 success Logs for x86_64-llvm-17 / veristat-kernel
bpf/vmtest-bpf-VM_Test-37 success Logs for x86_64-llvm-17 / veristat-meta
bpf/vmtest-bpf-VM_Test-39 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-VM_Test-38 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-44 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-45 success Logs for x86_64-llvm-18 / veristat-kernel
bpf/vmtest-bpf-VM_Test-46 success Logs for x86_64-llvm-18 / veristat-meta
bpf/vmtest-bpf-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-14 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-VM_Test-23 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-26 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-34 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-43 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-42 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-15 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc

Commit Message

Kuniyuki Iwashima Feb. 1, 2025, 12:14 a.m. UTC
Yan Zhai reported that a BPF prog attached to trace_kfree_skb could
trigger a null-ptr-deref [0] if the prog does not check whether rx_sk
is NULL.

Commit c53795d48ee8 ("net: add rx_sk to trace_kfree_skb") added
rx_sk to trace_kfree_skb, but rx_sk is optional and can be NULL.

Let's add the __nullable suffix to rx_sk so that the BPF verifier
rejects a prog that dereferences it without a NULL check.

With this patch, loading such a prog fails:

  libbpf: prog 'drop': -- BEGIN PROG LOAD LOG --
  0: R1=ctx() R10=fp0
  ; int BPF_PROG(drop, struct sk_buff *skb, void *location, @ kfree_skb_sk_null.bpf.c:21
  0: (79) r3 = *(u64 *)(r1 +24)
  func 'kfree_skb' arg3 has btf_id 5253 type STRUCT 'sock'
  1: R1=ctx() R3_w=trusted_ptr_or_null_sock(id=1)
  ; bpf_printk("sk: %d, %d\n", sk, sk->__sk_common.skc_family); @ kfree_skb_sk_null.bpf.c:24
  1: (69) r4 = *(u16 *)(r3 +16)
  R3 invalid mem access 'trusted_ptr_or_null_'
  processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
  -- END PROG LOAD LOG --

Note that this fix requires commit 8aeaed21befc ("bpf: Support
__nullable argument suffix for tp_btf").

[0]:
BUG: kernel NULL pointer dereference, address: 0000000000000010
 PF: supervisor read access in kernel mode
 PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
PREEMPT SMP
CPU: 6 UID: 0 PID: 348 Comm: sshd Not tainted 6.12.11 #206
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:bpf_prog_5e21a6db8fcff1aa_drop+0x10/0x2d
Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 48 8b 57 18 <48> 0f b7 4a 10 48 bf 0c 4f e2 c1 ad 90 ff ff be 0c 00 00 00 e8 0f
RSP: 0018:ffffa86640b53da8 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffffa866402d1000 RCX: 0000000000000002
RDX: 0000000000000000 RSI: ffffa866402d1048 RDI: ffffa86640b53dc8
RBP: ffffa86640b53da8 R08: 0000000000000000 R09: 9c908cd09b9c8c91
R10: ffff90adc056b540 R11: 0000000000000002 R12: 0000000000000000
R13: ffffa86640b53e88 R14: 0000000000000800 R15: fffffffffffffffe
FS:  00007f2a27c2b480(0000) GS:ffff90b0efd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000100e69004 CR4: 00000000001726f0
Call Trace:
 <TASK>
 ? __die+0x1f/0x60
 ? page_fault_oops+0x148/0x420
 ? search_bpf_extables+0x5b/0x70
 ? fixup_exception+0x27/0x2c0
 ? exc_page_fault+0x75/0x170
 ? asm_exc_page_fault+0x22/0x30
 ? bpf_prog_5e21a6db8fcff1aa_drop+0x10/0x2d
 bpf_trace_run4+0x68/0xd0
 ? unix_stream_connect+0x1f4/0x6f0
 sk_skb_reason_drop+0x90/0x120
 unix_stream_connect+0x1f4/0x6f0
 __sys_connect+0x7f/0xb0
 __x64_sys_connect+0x14/0x20
 do_syscall_64+0x47/0xc30
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f2a27f296a0
Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d 41 ff 0c 00 00 74 17 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 54
RSP: 002b:00007ffe29274f58 EFLAGS: 00000202 ORIG_RAX: 000000000000002a

Fixes: c53795d48ee8 ("net: add rx_sk to trace_kfree_skb")
Reported-by: Yan Zhai <yan@cloudflare.com>
Closes: https://lore.kernel.org/netdev/Z50zebTRzI962e6X@debian.debian/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 include/trace/events/skb.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
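
The fault address in the oops (0x10) and the rejected load in the verifier log (`r3 +16`) point at the same read: `sk->__sk_common.skc_family` at offset 16 from a NULL rx_sk. The user-space sketch below illustrates that arithmetic and the NULL check a prog would need. The `mock_*` types and both helper functions are hypothetical stand-ins, not kernel definitions; the mock layout is only assumed to mirror the offset reported in the log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the first fields of the socket's common part.
 * Only the offset of skc_family matters for this illustration. */
struct mock_sock_common {
	uint64_t skc_addrpair;  /* 8 bytes */
	uint32_t skc_hash;      /* 4 bytes */
	uint32_t skc_portpair;  /* 4 bytes */
	uint16_t skc_family;    /* offset 16 in this mock */
};

struct mock_sock {
	struct mock_sock_common __sk_common;
};

/* Compile-time check that the mock matches the "+16" seen in the log. */
_Static_assert(offsetof(struct mock_sock, __sk_common.skc_family) == 16,
	       "skc_family is expected at offset 16 in the mock layout");

/* Address an unguarded read of skc_family touches when rx_sk is NULL. */
uintptr_t unguarded_fault_address(void)
{
	return (uintptr_t)0 + offsetof(struct mock_sock, __sk_common.skc_family);
}

/* Guarded access: returns the family, or -1 when rx_sk is NULL. */
int rx_sk_family(const struct mock_sock *rx_sk)
{
	if (!rx_sk)
		return -1;
	return rx_sk->__sk_common.skc_family;
}
```

In a real tp_btf prog the equivalent guard would be an `if (!sk)` test before the `sk->__sk_common.skc_family` read; once the argument is marked as possibly NULL, the verifier requires that check before it accepts the dereference.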

Comments

Martin KaFai Lau Feb. 1, 2025, 2:19 a.m. UTC | #1
On 1/31/25 4:14 PM, Kuniyuki Iwashima wrote:
> Yan Zhai reported a BPF prog could trigger a null-ptr-deref [0]
> in trace_kfree_skb if the prog does not check if rx_sk is NULL.
> 
> Commit c53795d48ee8 ("net: add rx_sk to trace_kfree_skb") added
> rx_sk to trace_kfree_skb, but rx_sk is optional and could be NULL.
> 
> Let's add __nullable suffix to rx_sk to let the BPF verifier
> validate such a prog and prevent the issue.
> 
> Now we fail to load such a prog:
> 
>    libbpf: prog 'drop': -- BEGIN PROG LOAD LOG --
>    0: R1=ctx() R10=fp0
>    ; int BPF_PROG(drop, struct sk_buff *skb, void *location, @ kfree_skb_sk_null.bpf.c:21
>    0: (79) r3 = *(u64 *)(r1 +24)
>    func 'kfree_skb' arg3 has btf_id 5253 type STRUCT 'sock'
>    1: R1=ctx() R3_w=trusted_ptr_or_null_sock(id=1)
>    ; bpf_printk("sk: %d, %d\n", sk, sk->__sk_common.skc_family); @ kfree_skb_sk_null.bpf.c:24
>    1: (69) r4 = *(u16 *)(r3 +16)
>    R3 invalid mem access 'trusted_ptr_or_null_'
>    processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>    -- END PROG LOAD LOG --
> 
> Note this fix requires commit 8aeaed21befc ("bpf: Support
> __nullable argument suffix for tp_btf").

I believe the current way is to add kfree_skb to the raw_tp_null_args[],
https://lore.kernel.org/all/20241213221929.3495062-3-memxor@gmail.com/

cc: Kumar

Kuniyuki Iwashima Feb. 1, 2025, 2:34 a.m. UTC | #2
From: Martin KaFai Lau <martin.lau@linux.dev>
Date: Fri, 31 Jan 2025 18:19:22 -0800
> On 1/31/25 4:14 PM, Kuniyuki Iwashima wrote:
> > Yan Zhai reported a BPF prog could trigger a null-ptr-deref [0]
> > in trace_kfree_skb if the prog does not check if rx_sk is NULL.
> > 
> > Commit c53795d48ee8 ("net: add rx_sk to trace_kfree_skb") added
> > rx_sk to trace_kfree_skb, but rx_sk is optional and could be NULL.
> > 
> > Let's add __nullable suffix to rx_sk to let the BPF verifier
> > validate such a prog and prevent the issue.
> > 
> > Now we fail to load such a prog:
> > 
> >    libbpf: prog 'drop': -- BEGIN PROG LOAD LOG --
> >    0: R1=ctx() R10=fp0
> >    ; int BPF_PROG(drop, struct sk_buff *skb, void *location, @ kfree_skb_sk_null.bpf.c:21
> >    0: (79) r3 = *(u64 *)(r1 +24)
> >    func 'kfree_skb' arg3 has btf_id 5253 type STRUCT 'sock'
> >    1: R1=ctx() R3_w=trusted_ptr_or_null_sock(id=1)
> >    ; bpf_printk("sk: %d, %d\n", sk, sk->__sk_common.skc_family); @ kfree_skb_sk_null.bpf.c:24
> >    1: (69) r4 = *(u16 *)(r3 +16)
> >    R3 invalid mem access 'trusted_ptr_or_null_'
> >    processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >    -- END PROG LOAD LOG --
> > 
> > Note this fix requires commit 8aeaed21befc ("bpf: Support
> > __nullable argument suffix for tp_btf").
> 
> I believe the current way is to add kfree_skb to the raw_tp_null_args[],
> https://lore.kernel.org/all/20241213221929.3495062-3-memxor@gmail.com/

Oh, this is nice, thanks Martin!

I was wondering if other explicitly NULL-able args should be renamed,
but it looks like this series fixed them all.

Will post this as v2.

---8<---
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 9de6acddd479..c3223e0db2f5 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6507,6 +6507,8 @@ static const struct bpf_raw_tp_null_args raw_tp_null_args[] = {
 	/* rxrpc */
 	{ "rxrpc_recvdata", 0x1 },
 	{ "rxrpc_resend", 0x10 },
+	/* skb */
+	{ "kfree_skb", 0x1000 },
 	/* sunrpc */
 	{ "xs_stream_read_data", 0x1 },
 	/* ... from xprt_cong_event event class */
---8<---
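
The mask values in the hunk above are consistent with one 4-bit nibble per tracepoint argument, where the lowest bit of the nibble marks the argument as possibly NULL: rx_sk is the fourth argument (index 3, loaded from ctx offset 24 in the verifier log), which gives 0x1 << 12 = 0x1000. The sketch below works through that arithmetic in user space. The helper names are hypothetical, and the encoding is an assumption inferred from the entries shown and the linked series, not a copy of the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: 4 bits per argument, lowest bit = "may be NULL". */
#define MAY_BE_NULL_BIT 0x1ULL

/* Mask bit that marks argument `arg` (0-based) as possibly NULL. */
uint64_t raw_tp_null_bit(unsigned int arg)
{
	return MAY_BE_NULL_BIT << (arg * 4);
}

/* True if `mask` marks argument `arg` (0-based) as possibly NULL. */
bool raw_tp_arg_may_be_null(uint64_t mask, unsigned int arg)
{
	return (mask & raw_tp_null_bit(arg)) != 0;
}
```

Under this reading, `raw_tp_null_bit(3)` yields 0x1000 for kfree_skb's rx_sk, while the neighbouring entries 0x1 and 0x10 mark argument 0 and argument 1 respectively.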

Thanks!

Yan Zhai Feb. 1, 2025, 2:40 a.m. UTC | #3
On Fri, Jan 31, 2025 at 6:14 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> Yan Zhai reported a BPF prog could trigger a null-ptr-deref [0]
> in trace_kfree_skb if the prog does not check if rx_sk is NULL.
>
> Commit c53795d48ee8 ("net: add rx_sk to trace_kfree_skb") added
> rx_sk to trace_kfree_skb, but rx_sk is optional and could be NULL.
>
> Let's add __nullable suffix to rx_sk to let the BPF verifier
> validate such a prog and prevent the issue.
>
> Now we fail to load such a prog:
>
>   libbpf: prog 'drop': -- BEGIN PROG LOAD LOG --
>   0: R1=ctx() R10=fp0
>   ; int BPF_PROG(drop, struct sk_buff *skb, void *location, @ kfree_skb_sk_null.bpf.c:21
>   0: (79) r3 = *(u64 *)(r1 +24)
>   func 'kfree_skb' arg3 has btf_id 5253 type STRUCT 'sock'
>   1: R1=ctx() R3_w=trusted_ptr_or_null_sock(id=1)
>   ; bpf_printk("sk: %d, %d\n", sk, sk->__sk_common.skc_family); @ kfree_skb_sk_null.bpf.c:24
>   1: (69) r4 = *(u16 *)(r3 +16)
>   R3 invalid mem access 'trusted_ptr_or_null_'
>   processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
>   -- END PROG LOAD LOG --
>
> Note this fix requires commit 8aeaed21befc ("bpf: Support
> __nullable argument suffix for tp_btf").
>
> [0]:
> BUG: kernel NULL pointer dereference, address: 0000000000000010
>  PF: supervisor read access in kernel mode
>  PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> PREEMPT SMP
> CPU: 6 UID: 0 PID: 348 Comm: sshd Not tainted 6.12.11 #206
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> RIP: 0010:bpf_prog_5e21a6db8fcff1aa_drop+0x10/0x2d
> Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 48 8b 57 18 <48> 0f b7 4a 10 48 bf 0c 4f e2 c1 ad 90 ff ff be 0c 00 00 00 e8 0f
> RSP: 0018:ffffa86640b53da8 EFLAGS: 00010202
> RAX: 0000000000000001 RBX: ffffa866402d1000 RCX: 0000000000000002
> RDX: 0000000000000000 RSI: ffffa866402d1048 RDI: ffffa86640b53dc8
> RBP: ffffa86640b53da8 R08: 0000000000000000 R09: 9c908cd09b9c8c91
> R10: ffff90adc056b540 R11: 0000000000000002 R12: 0000000000000000
> R13: ffffa86640b53e88 R14: 0000000000000800 R15: fffffffffffffffe
> FS:  00007f2a27c2b480(0000) GS:ffff90b0efd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000010 CR3: 0000000100e69004 CR4: 00000000001726f0
> Call Trace:
>  <TASK>
>  ? __die+0x1f/0x60
>  ? page_fault_oops+0x148/0x420
>  ? search_bpf_extables+0x5b/0x70
>  ? fixup_exception+0x27/0x2c0
>  ? exc_page_fault+0x75/0x170
>  ? asm_exc_page_fault+0x22/0x30
>  ? bpf_prog_5e21a6db8fcff1aa_drop+0x10/0x2d
>  bpf_trace_run4+0x68/0xd0
>  ? unix_stream_connect+0x1f4/0x6f0
>  sk_skb_reason_drop+0x90/0x120
>  unix_stream_connect+0x1f4/0x6f0
>  __sys_connect+0x7f/0xb0
>  __x64_sys_connect+0x14/0x20
>  do_syscall_64+0x47/0xc30
>  entry_SYSCALL_64_after_hwframe+0x4b/0x53
> RIP: 0033:0x7f2a27f296a0
> Code: 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d 41 ff 0c 00 00 74 17 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 54
> RSP: 002b:00007ffe29274f58 EFLAGS: 00000202 ORIG_RAX: 000000000000002a
>
> Fixes: c53795d48ee8 ("net: add rx_sk to trace_kfree_skb")
> Reported-by: Yan Zhai <yan@cloudflare.com>
> Closes: https://lore.kernel.org/netdev/Z50zebTRzI962e6X@debian.debian/
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> ---
>  include/trace/events/skb.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
> index b877133cd93a..8bf0e61b8549 100644
> --- a/include/trace/events/skb.h
> +++ b/include/trace/events/skb.h
> @@ -24,9 +24,9 @@ DEFINE_DROP_REASON(FN, FN)
>  TRACE_EVENT(kfree_skb,
>
>         TP_PROTO(struct sk_buff *skb, void *location,
> -                enum skb_drop_reason reason, struct sock *rx_sk),
> +                enum skb_drop_reason reason, struct sock *rx_sk__nullable),
>
> -       TP_ARGS(skb, location, reason, rx_sk),
> +       TP_ARGS(skb, location, reason, rx_sk__nullable),
>
>         TP_STRUCT__entry(
>                 __field(void *,         skbaddr)
> @@ -39,7 +39,7 @@ TRACE_EVENT(kfree_skb,
>         TP_fast_assign(
>                 __entry->skbaddr = skb;
>                 __entry->location = location;
> -               __entry->rx_sk = rx_sk;
> +               __entry->rx_sk = rx_sk__nullable;
>                 __entry->protocol = ntohs(skb->protocol);
>                 __entry->reason = reason;
>         ),
> --
> 2.39.5 (Apple Git-154)
>

Tested-by: Yan Zhai <yan@cloudflare.com>

Yan Zhai Feb. 1, 2025, 2:41 a.m. UTC | #4
On Fri, Jan 31, 2025 at 8:19 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 1/31/25 4:14 PM, Kuniyuki Iwashima wrote:
> > Yan Zhai reported a BPF prog could trigger a null-ptr-deref [0]
> > in trace_kfree_skb if the prog does not check if rx_sk is NULL.
> >
> > Commit c53795d48ee8 ("net: add rx_sk to trace_kfree_skb") added
> > rx_sk to trace_kfree_skb, but rx_sk is optional and could be NULL.
> >
> > Let's add __nullable suffix to rx_sk to let the BPF verifier
> > validate such a prog and prevent the issue.
> >
> > Now we fail to load such a prog:
> >
> >    libbpf: prog 'drop': -- BEGIN PROG LOAD LOG --
> >    0: R1=ctx() R10=fp0
> >    ; int BPF_PROG(drop, struct sk_buff *skb, void *location, @ kfree_skb_sk_null.bpf.c:21
> >    0: (79) r3 = *(u64 *)(r1 +24)
> >    func 'kfree_skb' arg3 has btf_id 5253 type STRUCT 'sock'
> >    1: R1=ctx() R3_w=trusted_ptr_or_null_sock(id=1)
> >    ; bpf_printk("sk: %d, %d\n", sk, sk->__sk_common.skc_family); @ kfree_skb_sk_null.bpf.c:24
> >    1: (69) r4 = *(u16 *)(r3 +16)
> >    R3 invalid mem access 'trusted_ptr_or_null_'
> >    processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
> >    -- END PROG LOAD LOG --
> >
> > Note this fix requires commit 8aeaed21befc ("bpf: Support
> > __nullable argument suffix for tp_btf").
>
> I believe the current way is to add kfree_skb to the raw_tp_null_args[],
> https://lore.kernel.org/all/20241213221929.3495062-3-memxor@gmail.com/
>
Nice to learn the trick. Thanks Martin!

Yan

> cc: Kumar
>

Patch

diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index b877133cd93a..8bf0e61b8549 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -24,9 +24,9 @@  DEFINE_DROP_REASON(FN, FN)
 TRACE_EVENT(kfree_skb,
 
 	TP_PROTO(struct sk_buff *skb, void *location,
-		 enum skb_drop_reason reason, struct sock *rx_sk),
+		 enum skb_drop_reason reason, struct sock *rx_sk__nullable),
 
-	TP_ARGS(skb, location, reason, rx_sk),
+	TP_ARGS(skb, location, reason, rx_sk__nullable),
 
 	TP_STRUCT__entry(
 		__field(void *,		skbaddr)
@@ -39,7 +39,7 @@  TRACE_EVENT(kfree_skb,
 	TP_fast_assign(
 		__entry->skbaddr = skb;
 		__entry->location = location;
-		__entry->rx_sk = rx_sk;
+		__entry->rx_sk = rx_sk__nullable;
 		__entry->protocol = ntohs(skb->protocol);
 		__entry->reason = reason;
 	),
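
The rename in this patch only has an effect because the argument name itself carries the annotation: the verifier side (commit 8aeaed21befc, per the commit message) recognises tp_btf arguments whose name ends in `__nullable`. A minimal user-space sketch of that kind of suffix test follows. Both function names are hypothetical and this is not the kernel's implementation, only an illustration of the naming convention.

```c
#include <stdbool.h>
#include <string.h>

/* True if `name` ends with `suffix`. */
bool name_has_suffix(const char *name, const char *suffix)
{
	size_t nlen = strlen(name);
	size_t slen = strlen(suffix);

	if (nlen < slen)
		return false;
	return strcmp(name + (nlen - slen), suffix) == 0;
}

/* True if a tracepoint argument name is annotated as possibly NULL. */
bool arg_is_nullable(const char *arg_name)
{
	return name_has_suffix(arg_name, "__nullable");
}
```

With this convention, `rx_sk__nullable` is treated as possibly NULL while `rx_sk` and `skb` are not, which is why the patch renames the argument in TP_PROTO, TP_ARGS, and TP_fast_assign together.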