Message ID | 20250210134550.3189616-2-zhangmingyi5@huawei.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Series | bpf-next: Introduced to support the ULP to get or set sockets |
On 2/10/25 5:45 AM, zhangmingyi wrote:
> From: Mingyi Zhang <zhangmingyi5@huawei.com>
>
> Note that tcp_getsockopt and tcp_setsockopt support TCP_ULP, while
> bpf_getsockopt and bpf_setsockopt do not support TCP_ULP.
> I think we can add the handling of this case.

The commit message should talk about the "bool load" related changes in v2 and
why it is needed.

The subject line is confusing. How about,
"Support TCP_ULP in bpf_get/setsockopt"

The code changes lgtm.

>
> We want to call bpf_setsockopt to replace the kernel module in the TCP_ULP
> case. The purpose is to customize the behavior in connect and sendmsg.
> We have an open source community project kmesh (kmesh.net). Based on
> this, we refer to some processes of tcp fastopen to implement delayed
> connect and perform HTTP DNAT when sendmsg. In this case, we need to parse
> HTTP packets in the bpf program and set TCP_ULP for the specified socket.
Hello,

kernel test robot noticed "BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c" on:

commit: 8f510de3f26b2fabaf47eacd59053469e9c32754 ("[PATCH v2 1/2] bpf-next: Introduced to support the ULP to get or set sockets")
url: https://github.com/intel-lab-lkp/linux/commits/zhangmingyi/bpf-next-Introduced-to-support-the-ULP-to-get-or-set-sockets/20250210-215203
base: https://git.kernel.org/cgit/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/all/20250210134550.3189616-2-zhangmingyi5@huawei.com/
patch subject: [PATCH v2 1/2] bpf-next: Introduced to support the ULP to get or set sockets

in testcase: trinity
version: trinity-i386-abe9de86-1_20230429
with following parameters:

	runtime: 300s
	group: group-03
	nr_groups: 5

config: i386-randconfig-054-20250212
compiler: gcc-12
test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G

(please refer to attached dmesg/kmsg for entire log/backtrace)

+-----------------------------------------------------------------------------+------------+------------+
|                                                                             | 9b6cdaf2ac | 8f510de3f2 |
+-----------------------------------------------------------------------------+------------+------------+
| BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c | 0          | 6          |
+-----------------------------------------------------------------------------+------------+------------+

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202502140959.f66e2ba6-lkp@intel.com

[ 71.099773][ T3759] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:562
[ 71.101798][ T3759] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 3759, name: trinity-c4
[ 71.103659][ T3759] preempt_count: 0, expected: 0
[ 71.104658][ T3759] RCU nest depth: 1, expected: 0
[ 71.105669][ T3759] 2 locks held by trinity-c4/3759:
[ 71.106777][ T3759] #0: ecffcd80 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock (include/net/sock.h:1625)
[ 71.108460][ T3759] #1: c3500498 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire (include/linux/rcupdate.h:336)
[ 71.110397][ T3759] CPU: 1 UID: 65534 PID: 3759 Comm: trinity-c4 Tainted: G T 6.14.0-rc1-00030-g8f510de3f26b #1 8ad64aae41fa4cb8babad52c8f50e0a7d5e34569
[ 71.110406][ T3759] Tainted: [T]=RANDSTRUCT
[ 71.110407][ T3759] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 71.110410][ T3759] Call Trace:
[ 71.110416][ T3759] dump_stack_lvl (lib/dump_stack.c:123)
[ 71.110423][ T3759] dump_stack (lib/dump_stack.c:130)
[ 71.110428][ T3759] __might_resched (kernel/sched/core.c:8767)
[ 71.110440][ T3759] __might_sleep (kernel/sched/core.c:8696 (discriminator 17))
[ 71.110446][ T3759] __mutex_lock (include/linux/kernel.h:73 kernel/locking/mutex.c:562 kernel/locking/mutex.c:730)
[ 71.110452][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335)
[ 71.110462][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323)
[ 71.110470][ T3759] ? lock_sock_nested (net/core/sock.c:3653)
[ 71.110481][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783)
[ 71.110486][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
[ 71.110494][ T3759] tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
[ 71.110505][ T3759] tcp_set_ulp (net/ipv4/tcp_ulp.c:140 net/ipv4/tcp_ulp.c:166)
[ 71.110513][ T3759] do_tcp_setsockopt (net/ipv4/tcp.c:3747)
[ 71.110534][ T3759] tcp_setsockopt (net/ipv4/tcp.c:4032)
[ 71.110542][ T3759] ? sock_common_recvmsg (net/core/sock.c:3833)
[ 71.110548][ T3759] sock_common_setsockopt (net/core/sock.c:3838)
[ 71.110561][ T3759] do_sock_setsockopt (net/socket.c:2298)
[ 71.110577][ T3759] __sys_setsockopt (net/socket.c:2323)
[ 71.110592][ T3759] __ia32_sys_setsockopt (net/socket.c:2326)
[ 71.110599][ T3759] ia32_sys_call (kbuild/obj/consumer/i386-randconfig-054-20250212/./arch/x86/include/generated/asm/syscalls_32.h:367)
[ 71.110607][ T3759] do_int80_syscall_32 (arch/x86/entry/common.c:165 arch/x86/entry/common.c:339)
[ 71.110616][ T3759] entry_INT80_32 (arch/x86/entry/entry_32.S:942)
[ 71.110621][ T3759] EIP: 0xb4014092
[ 71.110626][ T3759] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 e9 80 ff ff ff ff a3 f8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80 <c3> 8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	00 e9                	add    %ch,%cl
   4:	90                   	nop
   5:	ff                   	(bad)
   6:	ff                   	(bad)
   7:	ff                   	(bad)
   8:	ff a3 24 00 00 00    	jmp    *0x24(%rbx)
   e:	68 30 00 00 00       	push   $0x30
  13:	e9 80 ff ff ff       	jmp    0xffffffffffffff98
  18:	ff a3 f8 ff ff ff    	jmp    *-0x8(%rbx)
  1e:	66 90                	xchg   %ax,%ax
	...
  28:	cd 80                	int    $0x80
  2a:*	c3                   	ret		<-- trapping instruction
  2b:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
  32:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  38:	8b 1c 24             	mov    (%rsp),%ebx
  3b:	c3                   	ret
  3c:	8d                   	.byte 0x8d
  3d:	b4 26                	mov    $0x26,%ah
	...

Code starting with the faulting instruction
===========================================
   0:	c3                   	ret
   1:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
   8:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
   e:	8b 1c 24             	mov    (%rsp),%ebx
  11:	c3                   	ret
  12:	8d                   	.byte 0x8d
  13:	b4 26                	mov    $0x26,%ah
	...
[ 71.110630][ T3759] EAX: ffffffda EBX: 00000134 ECX: 00000006 EDX: 0000001f
[ 71.110634][ T3759] ESI: 08fee650 EDI: 00000004 EBP: 000012cf ESP: bfc1c538
[ 71.110638][ T3759] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296
[ 71.182507][ T3759]
[ 71.182999][ T3759] =============================
[ 71.183907][ T3759] [ BUG: Invalid wait context ]
[ 71.184819][ T3759] 6.14.0-rc1-00030-g8f510de3f26b #1 Tainted: G W T
[ 71.186327][ T3759] -----------------------------
[ 71.187265][ T3759] trinity-c4/3759 is trying to lock:
[ 71.188287][ T3759] c37b35e0 (tcpv4_prot_mutex){....}-{4:4}, at: tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
[ 71.189847][ T3759] other info that might help us debug this:
[ 71.191018][ T3759] context-{5:5}
[ 71.191678][ T3759] 2 locks held by trinity-c4/3759:
[ 71.192635][ T3759] #0: ecffcd80 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock (include/net/sock.h:1625)
[ 71.194220][ T3759] #1: c3500498 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire (include/linux/rcupdate.h:336)
[ 71.196078][ T3759] stack backtrace:
[ 71.196797][ T3759] CPU: 0 UID: 65534 PID: 3759 Comm: trinity-c4 Tainted: G W T 6.14.0-rc1-00030-g8f510de3f26b #1 8ad64aae41fa4cb8babad52c8f50e0a7d5e34569
[ 71.196807][ T3759] Tainted: [W]=WARN, [T]=RANDSTRUCT
[ 71.196809][ T3759] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 71.196812][ T3759] Call Trace:
[ 71.196818][ T3759] dump_stack_lvl (lib/dump_stack.c:123)
[ 71.196825][ T3759] dump_stack (lib/dump_stack.c:130)
[ 71.196830][ T3759] __lock_acquire (kernel/locking/lockdep.c:4830 kernel/locking/lockdep.c:4900 kernel/locking/lockdep.c:5178)
[ 71.196840][ T3759] lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853)
[ 71.196846][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
[ 71.196856][ T3759] ? __schedule (kernel/sched/core.c:5380)
[ 71.196866][ T3759] __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:730)
[ 71.196872][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
[ 71.196878][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335)
[ 71.196885][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323)
[ 71.196889][ T3759] ? lock_sock_nested (net/core/sock.c:3653)
[ 71.196898][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783)
[ 71.196904][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
[ 71.196909][ T3759] tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
[ 71.196916][ T3759] tcp_set_ulp (net/ipv4/tcp_ulp.c:140 net/ipv4/tcp_ulp.c:166)
[ 71.196923][ T3759] do_tcp_setsockopt (net/ipv4/tcp.c:3747)
[ 71.196934][ T3759] tcp_setsockopt (net/ipv4/tcp.c:4032)
[ 71.196939][ T3759] ? sock_common_recvmsg (net/core/sock.c:3833)
[ 71.196946][ T3759] sock_common_setsockopt (net/core/sock.c:3838)
[ 71.196952][ T3759] do_sock_setsockopt (net/socket.c:2298)
[ 71.196961][ T3759] __sys_setsockopt (net/socket.c:2323)
[ 71.196967][ T3759] __ia32_sys_setsockopt (net/socket.c:2326)
[ 71.196972][ T3759] ia32_sys_call (kbuild/obj/consumer/i386-randconfig-054-20250212/./arch/x86/include/generated/asm/syscalls_32.h:367)
[ 71.196979][ T3759] do_int80_syscall_32 (arch/x86/entry/common.c:165 arch/x86/entry/common.c:339)
[ 71.196985][ T3759] entry_INT80_32 (arch/x86/entry/entry_32.S:942)
[ 71.196990][ T3759] EIP: 0xb4014092
[ 71.196995][ T3759] Code: 00 00 00 e9 90 ff ff ff ff a3 24 00 00 00 68 30 00 00 00 e9 80 ff ff ff ff a3 f8 ff ff ff 66 90 00 00 00 00 00 00 00 00 cd 80 <c3> 8d b4 26 00 00 00 00 8d b6 00 00 00 00 8b 1c 24 c3 8d b4 26 00
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	00 e9                	add    %ch,%cl
   4:	90                   	nop
   5:	ff                   	(bad)
   6:	ff                   	(bad)
   7:	ff                   	(bad)
   8:	ff a3 24 00 00 00    	jmp    *0x24(%rbx)
   e:	68 30 00 00 00       	push   $0x30
  13:	e9 80 ff ff ff       	jmp    0xffffffffffffff98
  18:	ff a3 f8 ff ff ff    	jmp    *-0x8(%rbx)
  1e:	66 90                	xchg   %ax,%ax
	...
  28:	cd 80                	int    $0x80
  2a:*	c3                   	ret		<-- trapping instruction
  2b:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
  32:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  38:	8b 1c 24             	mov    (%rsp),%ebx
  3b:	c3                   	ret
  3c:	8d                   	.byte 0x8d
  3d:	b4 26                	mov    $0x26,%ah
	...

Code starting with the faulting instruction
===========================================
   0:	c3                   	ret
   1:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
   8:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
   e:	8b 1c 24             	mov    (%rsp),%ebx
  11:	c3                   	ret
  12:	8d                   	.byte 0x8d
  13:	b4 26                	mov    $0x26,%ah
	...
[ 71.196999][ T3759] EAX: ffffffda EBX: 00000134 ECX: 00000006 EDX: 0000001f
[ 71.197004][ T3759] ESI: 08fee650 EDI: 00000004 EBP: 000012cf ESP: bfc1c538
[ 71.197008][ T3759] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000296

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250214/202502140959.f66e2ba6-lkp@intel.com
On 2/13/25 6:13 PM, kernel test robot wrote:
> [ 71.182999][ T3759] =============================
> [ 71.183907][ T3759] [ BUG: Invalid wait context ]
> [ 71.184819][ T3759] 6.14.0-rc1-00030-g8f510de3f26b #1 Tainted: G W T
> [ 71.186327][ T3759] -----------------------------
> [ 71.187265][ T3759] trinity-c4/3759 is trying to lock:
> [ 71.188287][ T3759] c37b35e0 (tcpv4_prot_mutex){....}-{4:4}, at: tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [ 71.189847][ T3759] other info that might help us debug this:
> [ 71.191018][ T3759] context-{5:5}
> [ 71.191678][ T3759] 2 locks held by trinity-c4/3759:
> [ 71.192635][ T3759] #0: ecffcd80 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock (include/net/sock.h:1625)
> [ 71.194220][ T3759] #1: c3500498 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire (include/linux/rcupdate.h:336)
> [ 71.196078][ T3759] stack backtrace:
> [ 71.196797][ T3759] CPU: 0 UID: 65534 PID: 3759 Comm: trinity-c4 Tainted: G W T 6.14.0-rc1-00030-g8f510de3f26b #1 8ad64aae41fa4cb8babad52c8f50e0a7d5e34569
> [ 71.196807][ T3759] Tainted: [W]=WARN, [T]=RANDSTRUCT
> [ 71.196809][ T3759] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 71.196812][ T3759] Call Trace:
> [ 71.196818][ T3759] dump_stack_lvl (lib/dump_stack.c:123)
> [ 71.196825][ T3759] dump_stack (lib/dump_stack.c:130)
> [ 71.196830][ T3759] __lock_acquire (kernel/locking/lockdep.c:4830 kernel/locking/lockdep.c:4900 kernel/locking/lockdep.c:5178)
> [ 71.196840][ T3759] lock_acquire (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:5853)
> [ 71.196846][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [ 71.196856][ T3759] ? __schedule (kernel/sched/core.c:5380)
> [ 71.196866][ T3759] __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:730)
> [ 71.196872][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [ 71.196878][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335)
> [ 71.196885][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323)
> [ 71.196889][ T3759] ? lock_sock_nested (net/core/sock.c:3653)
> [ 71.196898][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783)

This is probably because __tcp_set_ulp is now under the rcu_read_lock() in patch 1.

Even fixing patch 1 will not be enough. The bpf cgrp prog (e.g. sockops) cannot
sleep now, so it still cannot call bpf_setsockopt(TCP_ULP, "tls") which will
take a mutex. This is a blocker :(

> [ 71.196904][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [ 71.196909][ T3759] tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> [ 71.196916][ T3759] tcp_set_ulp (net/ipv4/tcp_ulp.c:140 net/ipv4/tcp_ulp.c:166)
> [ 71.196923][ T3759] do_tcp_setsockopt (net/ipv4/tcp.c:3747)
> [ 71.196934][ T3759] tcp_setsockopt (net/ipv4/tcp.c:4032)
> [ 71.196939][ T3759] ? sock_common_recvmsg (net/core/sock.c:3833)
> [ 71.196946][ T3759] sock_common_setsockopt (net/core/sock.c:3838)
> [ 71.196952][ T3759] do_sock_setsockopt (net/socket.c:2298)
> [ 71.196961][ T3759] __sys_setsockopt (net/socket.c:2323)
> [ 71.196967][ T3759] __ia32_sys_setsockopt (net/socket.c:2326)
> [ 71.196972][ T3759] ia32_sys_call (kbuild/obj/consumer/i386-randconfig-054-20250212/./arch/x86/include/generated/asm/syscalls_32.h:367)
> [ 71.196979][ T3759] do_int80_syscall_32 (arch/x86/entry/common.c:165 arch/x86/entry/common.c:339)
> [ 71.196985][ T3759] entry_INT80_32 (arch/x86/entry/entry_32.S:942)
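The ordering problem described above can be modelled in a few lines of user-space C. This is only a sketch, not kernel code: every `model_*` name is hypothetical, the RCU read-side section is reduced to a counter (mirroring the "RCU nest depth: 1, expected: 0" line in the report), and the mutex taken in tls_init() is reduced to a check that the counter is zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "RCU nest depth" printed in the report. */
static int model_rcu_depth;

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);
	model_rcu_depth--;
}

/* Models the might_sleep() check in mutex_lock(): sleeping is only
 * acceptable when no RCU read-side section is open. */
static bool model_mutex_lock_ok(void)
{
	return model_rcu_depth == 0;
}

/* Models an ULP init callback such as tls_init(), which takes a mutex. */
static bool model_ulp_init(void)
{
	return model_mutex_lock_ok();
}

/* Ordering in v2 of the patch: the init callback runs while the RCU
 * read-side section is still open. */
bool model_set_ulp_v2(void)
{
	bool ok;

	model_rcu_read_lock();
	ok = model_ulp_init();
	model_rcu_read_unlock();
	return ok;
}

/* Ordering before the patch: RCU covers only the lookup, and the init
 * callback runs after the section has been closed. */
bool model_set_ulp_base(void)
{
	model_rcu_read_lock();
	/* lookup and try_module_get() would happen here */
	model_rcu_read_unlock();
	return model_ulp_init();
}
```

In this model `model_set_ulp_v2()` reports a violation and `model_set_ulp_base()` does not. It does not capture the second point raised above: a non-sleepable BPF program would still be unable to take the mutex even with the pre-patch ordering.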
On Thu, 13 Feb 2025 22:23:39 -0800 Martin KaFai Lau wrote:
> On 2/13/25 6:13 PM, kernel test robot wrote:
> > [ 71.196846][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> > [ 71.196856][ T3759] ? __schedule (kernel/sched/core.c:5380)
> > [ 71.196866][ T3759] __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:730)
> > [ 71.196872][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
> > [ 71.196878][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335)
> > [ 71.196885][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323)
> > [ 71.196889][ T3759] ? lock_sock_nested (net/core/sock.c:3653)
> > [ 71.196898][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783)
>
> This is probably because __tcp_set_ulp is now under the rcu_read_lock() in patch 1.
>
> Even fixing patch 1 will not be enough. The bpf cgrp prog (e.g. sockops) cannot
> sleep now, so it still cannot call bpf_setsockopt(TCP_ULP, "tls") which will
> take a mutex. This is a blocker :(

Oh, kbuild bot was nice enough to CC netdev, it wasn't CCed on
the submission.

I'd really rather we didn't allow setting ULP from BPF unless there
is a strong and clear use case. The ULP configuration and stacking
is a source of many bugs. And the use case here AFAIU is to allow
attaching some ULP from an OOT module to a socket, which I think
won't make core BPF folks happy either, right?
On 2/14/25 1:20 PM, Jakub Kicinski wrote:
> On Thu, 13 Feb 2025 22:23:39 -0800 Martin KaFai Lau wrote:
>> On 2/13/25 6:13 PM, kernel test robot wrote:
>>> [ 71.196846][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
>>> [ 71.196856][ T3759] ? __schedule (kernel/sched/core.c:5380)
>>> [ 71.196866][ T3759] __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:730)
>>> [ 71.196872][ T3759] ? tls_init (net/tls/tls_main.c:934 net/tls/tls_main.c:993)
>>> [ 71.196878][ T3759] ? rcu_read_unlock (include/linux/rcupdate.h:335)
>>> [ 71.196885][ T3759] ? mark_held_locks (kernel/locking/lockdep.c:4323)
>>> [ 71.196889][ T3759] ? lock_sock_nested (net/core/sock.c:3653)
>>> [ 71.196898][ T3759] mutex_lock_nested (kernel/locking/mutex.c:783)
>>
>> This is probably because __tcp_set_ulp is now under the rcu_read_lock() in patch 1.
>>
>> Even fixing patch 1 will not be enough. The bpf cgrp prog (e.g. sockops) cannot
>> sleep now, so it still cannot call bpf_setsockopt(TCP_ULP, "tls") which will
>> take a mutex. This is a blocker :(
>
> Oh, kbuild bot was nice enough to CC netdev, it wasn't CCed on
> the submission.

Ah. I also didn't notice netdev was not cc-ed. will pay attention in the future.

>
> I'd really rather we didn't allow setting ULP from BPF unless there
> is a strong and clear use case. The ULP configuration and stacking
> is a source of many bugs. And the use case here AFAIU is to allow
> attaching some ULP from an OOT module to a socket, which I think
> won't make core BPF folks happy either, right?

If the in-tree ulp does not work, there is little reason to do it for the
out-of-tree module only.

My question on the ulp use case went to silence in v1, so we can assume it
is out-of-tree ulp only. I also asked to replace the "smc" ulp testing with
a more real "tls" ulp testing to see how it goes first. It does not work
as the bot reported it.
diff --git a/include/net/tcp.h b/include/net/tcp.h
index e9b37b76e894..f26e92099b86 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2582,7 +2582,7 @@ struct tcp_ulp_ops {
 };
 int tcp_register_ulp(struct tcp_ulp_ops *type);
 void tcp_unregister_ulp(struct tcp_ulp_ops *type);
-int tcp_set_ulp(struct sock *sk, const char *name);
+int tcp_set_ulp(struct sock *sk, const char *name, bool load);
 void tcp_get_available_ulp(char *buf, size_t len);
 void tcp_cleanup_ulp(struct sock *sk);
 void tcp_update_ulp(struct sock *sk, struct proto *p,
diff --git a/net/core/filter.c b/net/core/filter.c
index 713d6f454df3..bdb5c43d6fb0 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5380,6 +5380,7 @@ static int sol_tcp_sockopt(struct sock *sk, int optname,
 	case TCP_CONGESTION:
 		return sol_tcp_sockopt_congestion(sk, optval, optlen, getopt);
 	case TCP_SAVED_SYN:
+	case TCP_ULP:
 		if (*optlen < 1)
 			return -EINVAL;
 		break;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0d704bda6c41..88ccd0e211f9 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3744,7 +3744,7 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
 		name[val] = 0;
 
 		sockopt_lock_sock(sk);
-		err = tcp_set_ulp(sk, name);
+		err = tcp_set_ulp(sk, name, !has_current_bpf_ctx());
 		sockopt_release_sock(sk);
 		return err;
 	}
diff --git a/net/ipv4/tcp_ulp.c b/net/ipv4/tcp_ulp.c
index 2aa442128630..c1c39dbef417 100644
--- a/net/ipv4/tcp_ulp.c
+++ b/net/ipv4/tcp_ulp.c
@@ -33,10 +33,7 @@ static struct tcp_ulp_ops *tcp_ulp_find(const char *name)
 
 static const struct tcp_ulp_ops *__tcp_ulp_find_autoload(const char *name)
 {
-	const struct tcp_ulp_ops *ulp = NULL;
-
-	rcu_read_lock();
-	ulp = tcp_ulp_find(name);
+	const struct tcp_ulp_ops *ulp = tcp_ulp_find(name);
 
 #ifdef CONFIG_MODULES
 	if (!ulp && capable(CAP_NET_ADMIN)) {
@@ -46,10 +43,6 @@ static const struct tcp_ulp_ops *__tcp_ulp_find_autoload(const char *name)
 		ulp = tcp_ulp_find(name);
 	}
 #endif
-	if (!ulp || !try_module_get(ulp->owner))
-		ulp = NULL;
-
-	rcu_read_unlock();
 	return ulp;
 }
 
@@ -154,15 +147,24 @@ static int __tcp_set_ulp(struct sock *sk, const struct tcp_ulp_ops *ulp_ops)
 	return err;
 }
 
-int tcp_set_ulp(struct sock *sk, const char *name)
+int tcp_set_ulp(struct sock *sk, const char *name, bool load)
 {
 	const struct tcp_ulp_ops *ulp_ops;
+	int err = 0;
 
 	sock_owned_by_me(sk);
 
-	ulp_ops = __tcp_ulp_find_autoload(name);
-	if (!ulp_ops)
-		return -ENOENT;
+	rcu_read_lock();
+	if (!load)
+		ulp_ops = tcp_ulp_find(name);
+	else
+		ulp_ops = __tcp_ulp_find_autoload(name);
+
+	if (!ulp_ops || !try_module_get(ulp_ops->owner))
+		err = -ENOENT;
+	else
+		err = __tcp_set_ulp(sk, ulp_ops);
 
-	return __tcp_set_ulp(sk, ulp_ops);
+	rcu_read_unlock();
+	return err;
 }
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index fd021cf8286e..fb936d280b83 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1776,7 +1776,7 @@ int mptcp_subflow_create_socket(struct sock *sk, unsigned short family,
 	sf->sk->sk_net_refcnt = 1;
 	get_net_track(net, &sf->sk->ns_tracker, GFP_KERNEL);
 	sock_inuse_add(net, 1);
-	err = tcp_set_ulp(sf->sk, "mptcp");
+	err = tcp_set_ulp(sf->sk, "mptcp", true);
 	if (err)
 		goto err_free;
 
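
To make the effect of the new `load` argument concrete, here is a small user-space model of the lookup path in the diff above. It is a sketch under stated assumptions, not the kernel implementation: the `model_*` names and the table contents are hypothetical, module autoloading is reduced to flipping a flag, and the CAP_NET_ADMIN check and module reference counting are left out. In the patch, do_tcp_setsockopt() passes `!has_current_bpf_ctx()` as `load`, so a call that originates from a BPF program only sees ULPs that are already registered, presumably because loading a module can sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct tcp_ulp_ops; only the lookup state matters. */
struct model_ulp {
	const char *name;
	bool registered;	/* already on the registered list */
	bool loadable;		/* a module exists that autoload could bring in */
};

/* Assumed starting state: "mptcp" is built in, "tls" is a module not yet loaded. */
static struct model_ulp model_ulps[] = {
	{ "mptcp", true,  false },
	{ "tls",   false, true  },
};

#define MODEL_NR_ULPS (sizeof(model_ulps) / sizeof(model_ulps[0]))

/* Models tcp_ulp_find(): only looks at what is already registered. */
static struct model_ulp *model_ulp_find(const char *name)
{
	for (size_t i = 0; i < MODEL_NR_ULPS; i++)
		if (model_ulps[i].registered &&
		    strcmp(model_ulps[i].name, name) == 0)
			return &model_ulps[i];
	return NULL;
}

/* Models __tcp_ulp_find_autoload(): falls back to "loading" the module. */
static struct model_ulp *model_ulp_find_autoload(const char *name)
{
	struct model_ulp *ulp = model_ulp_find(name);

	if (ulp)
		return ulp;
	for (size_t i = 0; i < MODEL_NR_ULPS; i++) {
		if (model_ulps[i].loadable &&
		    strcmp(model_ulps[i].name, name) == 0) {
			/* stands in for request_module() succeeding */
			model_ulps[i].registered = true;
			return &model_ulps[i];
		}
	}
	return NULL;
}

/* Models the lookup half of tcp_set_ulp(sk, name, load). */
int model_set_ulp(const char *name, bool load)
{
	struct model_ulp *ulp;

	assert(name != NULL);
	ulp = load ? model_ulp_find_autoload(name) : model_ulp_find(name);
	return ulp ? 0 : -ENOENT;
}
```

With this starting state, `model_set_ulp("tls", false)` fails with `-ENOENT` until some caller with `load == true` has brought the module in, which is the behaviour the reviewer asks the commit message to explain.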