Message ID | 20220818165906.64450-1-toke@redhat.com (mailing list archive) |
---|---|
Headers | show |
Series | A couple of small refactorings of BPF program call sites | expand |
On 08/18, Toke H�iland-J�rgensen wrote: > Stanislav suggested[0] that these small refactorings could be split out > from the > XDP queueing RFC series and merged separately. The first change is a small > repacking of struct softnet_data, the others change the BPF call sites to > support full 64-bit values as arguments to bpf_redirect_map() and as the > return > value of a BPF program, relying on the fact that BPF registers are always > 64-bit > wide to maintain backwards compatibility. > Please see the individual patches for details. > [0] > https://lore.kernel.org/r/CAKH8qBtdnku7StcQ-SamadvAF==DRuLLZO94yOR1WJ9Bg=uX1w@mail.gmail.com Looks like a nice cleanup to me: Reviewed-by: Stanislav Fomichev <sdf@google.com> Can you share more on this comment? /* For some architectures, we need to do modulus in 32-bit width */ Some - which ones? And why do they need it to be 32-bit? > Kumar Kartikeya Dwivedi (1): > bpf: Use 64-bit return value for bpf_prog_run > Toke H�iland-J�rgensen (2): > dev: Move received_rps counter next to RPS members in softnet data > bpf: Expand map key argument of bpf_redirect_map to u64 > include/linux/bpf-cgroup.h | 12 +++++----- > include/linux/bpf.h | 16 ++++++------- > include/linux/filter.h | 46 +++++++++++++++++++------------------- > include/linux/netdevice.h | 2 +- > include/uapi/linux/bpf.h | 2 +- > kernel/bpf/cgroup.c | 12 +++++----- > kernel/bpf/core.c | 14 ++++++------ > kernel/bpf/cpumap.c | 4 ++-- > kernel/bpf/devmap.c | 4 ++-- > kernel/bpf/offload.c | 4 ++-- > kernel/bpf/verifier.c | 2 +- > net/bpf/test_run.c | 21 +++++++++-------- > net/core/filter.c | 4 ++-- > net/packet/af_packet.c | 7 ++++-- > net/xdp/xskmap.c | 4 ++-- > 15 files changed, 80 insertions(+), 74 deletions(-) > -- > 2.37.2
On Fri, 19 Aug 2022 at 00:29, <sdf@google.com> wrote: > > On 08/18, Toke H�iland-J�rgensen wrote: > > Stanislav suggested[0] that these small refactorings could be split out > > from the > > XDP queueing RFC series and merged separately. The first change is a small > > repacking of struct softnet_data, the others change the BPF call sites to > > support full 64-bit values as arguments to bpf_redirect_map() and as the > > return > > value of a BPF program, relying on the fact that BPF registers are always > > 64-bit > > wide to maintain backwards compatibility. > > > Please see the individual patches for details. > > > [0] > > https://lore.kernel.org/r/CAKH8qBtdnku7StcQ-SamadvAF==DRuLLZO94yOR1WJ9Bg=uX1w@mail.gmail.com > > Looks like a nice cleanup to me: > Reviewed-by: Stanislav Fomichev <sdf@google.com> > > Can you share more on this comment? > > /* For some architectures, we need to do modulus in 32-bit width */ > > Some - which ones? And why do they need it to be 32-bit? It was a fix for i386, I saw a kernel test robot error when it was sitting in Toke's tree. See https://lore.kernel.org/all/202203140800.8pr81INh-lkp@intel.com. Once bpf_prog_run_clear_cb starts returning a 64-bit value, we now need to save it in 32-bit before doing the modulus. > > > Kumar Kartikeya Dwivedi (1): > > bpf: Use 64-bit return value for bpf_prog_run > > > Toke H�iland-J�rgensen (2): > > dev: Move received_rps counter next to RPS members in softnet data > > bpf: Expand map key argument of bpf_redirect_map to u64 > > > include/linux/bpf-cgroup.h | 12 +++++----- > > include/linux/bpf.h | 16 ++++++------- > > include/linux/filter.h | 46 +++++++++++++++++++------------------- > > include/linux/netdevice.h | 2 +- > > include/uapi/linux/bpf.h | 2 +- > > kernel/bpf/cgroup.c | 12 +++++----- > > kernel/bpf/core.c | 14 ++++++------ > > kernel/bpf/cpumap.c | 4 ++-- > > kernel/bpf/devmap.c | 4 ++-- > > kernel/bpf/offload.c | 4 ++-- > > kernel/bpf/verifier.c | 2 +- > > net/bpf/test_run.c | 21 +++++++++-------- > > net/core/filter.c | 4 ++-- > > net/packet/af_packet.c | 7 ++++-- > > net/xdp/xskmap.c | 4 ++-- > > 15 files changed, 80 insertions(+), 74 deletions(-) > > > -- > > 2.37.2 >
On 8/18/22 6:59 PM, Toke Høiland-Jørgensen wrote: > Stanislav suggested[0] that these small refactorings could be split out from the > XDP queueing RFC series and merged separately. The first change is a small > repacking of struct softnet_data, the others change the BPF call sites to > support full 64-bit values as arguments to bpf_redirect_map() and as the return > value of a BPF program, relying on the fact that BPF registers are always 64-bit > wide to maintain backwards compatibility. > > Please see the individual patches for details. > > [0] https://lore.kernel.org/r/CAKH8qBtdnku7StcQ-SamadvAF==DRuLLZO94yOR1WJ9Bg=uX1w@mail.gmail.com > > Kumar Kartikeya Dwivedi (1): > bpf: Use 64-bit return value for bpf_prog_run > > Toke Høiland-Jørgensen (2): > dev: Move received_rps counter next to RPS members in softnet data > bpf: Expand map key argument of bpf_redirect_map to u64 Looks like this series throws NULL pointer derefs in the CI. I just reran it and same result whereas various other bpf-next targeted patches CI seems green and w/o below panic ... perhaps an issue in last patch; please investigate. https://github.com/kernel-patches/bpf/runs/7982907380?check_suite_focus=true [...] #231 verif_scale_strobemeta:OK #232 verif_scale_strobemeta_bpf_loop:OK #233 verif_scale_strobemeta_nounroll1:OK #234 verif_scale_strobemeta_nounroll2:OK #235 verif_scale_strobemeta_subprogs:OK #236 verif_scale_sysctl_loop1:OK #237 verif_scale_sysctl_loop2:OK #238 verif_scale_xdp_loop:OK #239 verif_stats:OK #240 verif_twfw:OK [ 828.755223] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 828.755223] #PF: supervisor instruction fetch in kernel mode [ 828.755223] #PF: error_code(0x0010) - not-present page [ 828.755223] PGD 0 P4D 0 [ 828.755223] Oops: 0010 [#1] PREEMPT SMP NOPTI [ 828.755223] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G OE 5.19.0-g3141b1878b85 #1 [ 828.755223] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 828.755223] RIP: 0010:0x0 [ 828.755223] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [ 828.755223] RSP: 0018:ffffbccd400b3ea8 EFLAGS: 00000046 [ 828.755223] RAX: 0000000000000002 RBX: ffff9e79f9d1efc0 RCX: 000000000000000a [ 828.755223] RDX: 0000000000000000 RSI: 000000d2fe34b800 RDI: ffff9e79f9d1efc0 [ 828.755223] RBP: 000000d2fe34b800 R08: 0000000000000000 R09: 0000000000000000 [ 828.755223] R10: 00000000000f4240 R11: ffffffffb5062510 R12: 0000000000000015 [ 828.755223] R13: 7fffffffffffffff R14: 0000000000000004 R15: 000000d2fe34b800 [ 828.767338] FS: 0000000000000000(0000) GS:ffff9e79f9d00000(0000) knlGS:0000000000000000 [ 828.767338] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 828.767338] CR2: ffffffffffffffd6 CR3: 0000000100970000 CR4: 00000000000006e0 [ 828.767338] Call Trace: [ 828.767338] <TASK> [ 828.767338] tick_nohz_idle_stop_tick+0x1da/0x380 [ 828.767338] do_idle+0xe6/0x280 [ 828.767338] cpu_startup_entry+0x19/0x20 [ 828.767338] start_secondary+0x8f/0x90 [ 828.767338] secondary_startup_64_no_verify+0xe1/0xeb [ 828.767338] </TASK> [ 828.767338] Modules linked in: bpf_testmod(OE) [last unloaded: bpf_testmod(OE)] [ 828.767338] CR2: 0000000000000000 [ 828.767338] ---[ end trace 0000000000000000 ]--- [ 828.758172] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 828.758172] #PF: supervisor instruction fetch in kernel mode [ 828.758172] #PF: error_code(0x0010) - not-present page [ 828.758172] PGD 0 P4D 0 [ 828.767338] RIP: 0010:0x0 [ 828.758172] Oops: 0010 [#2] PREEMPT SMP NOPTI [ 828.758172] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G D OE 5.19.0-g3141b1878b85 #1 [ 828.758172] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 828.767338] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [ 828.767338] RSP: 0018:ffffbccd400b3ea8 EFLAGS: 00000046 [ 828.758172] RIP: 0010:0x0 [ 828.767338] [ 828.758172] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [ 828.767338] RAX: 0000000000000002 RBX: ffff9e79f9d1efc0 RCX: 000000000000000a [ 828.767338] RDX: 0000000000000000 RSI: 000000d2fe34b800 RDI: ffff9e79f9d1efc0 [ 828.767338] RBP: 000000d2fe34b800 R08: 0000000000000000 R09: 0000000000000000 [ 828.758172] RSP: 0018:ffffbccd400cbea0 EFLAGS: 00000046 [ 828.767338] R10: 00000000000f4240 R11: ffffffffb5062510 R12: 0000000000000015 [ 828.767338] R13: 7fffffffffffffff R14: 0000000000000004 R15: 000000d2fe34b800 [ 828.758172] [ 828.758172] RAX: 0000000000000005 RBX: ffff9e79f9ddefc0 RCX: 000000000000000a [ 828.758172] RDX: 0000000000000000 RSI: 000000c0f6c7a580 RDI: ffff9e79f9ddefc0 [ 828.767338] FS: 0000000000000000(0000) GS:ffff9e79f9d00000(0000) knlGS:0000000000000000 [ 828.758172] RBP: 0000000000000013 R08: 7fffffffffffffff R09: 000000c0f6b86340 [ 828.767338] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 828.758172] R10: 00000000000f4240 R11: ffffffffb5062510 R12: 0000000000000000 [ 828.758172] R13: 0000000000000000 R14: 000000c0f6bc6d28 R15: 0000000000000000 [ 828.758172] FS: 0000000000000000(0000) GS:ffff9e79f9dc0000(0000) knlGS:0000000000000000 [ 828.767338] CR2: ffffffffffffffd6 CR3: 0000000100970000 CR4: 00000000000006e0 [ 828.758172] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 828.767338] Kernel panic - not syncing: Fatal exception [ 828.758172] CR2: ffffffffffffffd6 CR3: 000000009a836000 CR4: 00000000000006e0 [ 828.758172] Call Trace: [ 828.758172] <TASK> [ 828.758172] tick_nohz_restart_sched_tick+0x6b/0x90 [ 828.758172] tick_nohz_idle_exit+0xfc/0x150 [ 828.758172] do_idle+0x23c/0x280 [ 828.758172] cpu_startup_entry+0x19/0x20 [ 828.758172] start_secondary+0x8f/0x90 [ 828.758172] secondary_startup_64_no_verify+0xe1/0xeb [ 828.758172] </TASK> [...]
On Wed, 24 Aug 2022 at 00:42, Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 8/18/22 6:59 PM, Toke Høiland-Jørgensen wrote: > > Stanislav suggested[0] that these small refactorings could be split out from the > > XDP queueing RFC series and merged separately. The first change is a small > > repacking of struct softnet_data, the others change the BPF call sites to > > support full 64-bit values as arguments to bpf_redirect_map() and as the return > > value of a BPF program, relying on the fact that BPF registers are always 64-bit > > wide to maintain backwards compatibility. > > > > Please see the individual patches for details. > > > > [0] https://lore.kernel.org/r/CAKH8qBtdnku7StcQ-SamadvAF==DRuLLZO94yOR1WJ9Bg=uX1w@mail.gmail.com > > > > Kumar Kartikeya Dwivedi (1): > > bpf: Use 64-bit return value for bpf_prog_run > > > > Toke Høiland-Jørgensen (2): > > dev: Move received_rps counter next to RPS members in softnet data > > bpf: Expand map key argument of bpf_redirect_map to u64 > > Looks like this series throws NULL pointer derefs in the CI. I just reran it and > same result whereas various other bpf-next targeted patches CI seems green and w/o > below panic ... perhaps an issue in last patch; please investigate. Was it only occurring with LLVM before, or with GCC too? > > https://github.com/kernel-patches/bpf/runs/7982907380?check_suite_focus=true > I've been trying to reproduce this for a day with no luck. First I did it with GCC, then I noticed that the CI is only red for LLVM, so I also tried with LLVM 16. I'll keep trying, but just wanted to update the thread. Also, would there be a way to look at logs of the past runs (that you saw and then triggered this failing run again)? Maybe their splat has some difference which might provide more clues.