Message ID | 20240405122123.4156104-1-leitao@debian.org (mailing list archive) |
---|---|
Headers | show |
Series | wifi: Un-embed ath10k and ath11k dummy netdev | expand |
Breno Leitao <leitao@debian.org> writes: > struct net_device shouldn't be embedded into any structure, instead, > the owner should use the private space to embed their state into > net_device. > > This patch set fixes the problem above for ath10k and ath11k. This also > fixes the conversion of qtnfmac driver to the new helper. > > This patch set depends on a series that is still under review: > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t > > If it helps, I've pushed the tree to > https://github.com/leitao/linux/tree/wireless-dummy > > PS: Due to lack of hardware, unfortunately all these patches are > compiled tested only. > > Breno Leitao (3): > wifi: qtnfmac: Use netdev dummy allocator helper > wifi: ath10k: allocate dummy net_device dynamically > wifi: ath11k: allocate dummy net_device dynamically Thanks for setting up the branch, that makes the testing very easy. I now tested the branch using the commit below with ath11k WCN6855 hw2.0 on an x86 box: 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically But unfortunately it crashes, the stack trace below. I can easily test your branches, just let me know what to test. A direct 'git pull' command is the best. [ 238.886002] rmmod ath11k_pci [ 239.530328] ------------[ cut here ]------------ [ 239.530433] kernel BUG at net/core/dev.c:11066! [ 239.530621] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN [ 239.530709] CPU: 5 PID: 1668 Comm: rmmod Not tainted 6.9.0-rc2+ #1367 [ 239.530762] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021 [ 239.530848] RIP: 0010:free_netdev (net/core/dev.c:11066 (discriminator 1)) [ 239.530893] Code: 08 84 d2 0f 85 3c 01 00 00 0f b7 83 3a 03 00 00 48 29 c3 48 89 df e8 27 10 21 fe 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 44 0f b6 25 fd 91 53 02 41 80 fc 01 0f 87 1f 14 94 00 41 83 All code ======== 0: 08 84 d2 0f 85 3c 01 or %al,0x13c850f(%rdx,%rdx,8) 7: 00 00 add %al,(%rax) 9: 0f b7 83 3a 03 00 00 movzwl 0x33a(%rbx),%eax 10: 48 29 c3 sub %rax,%rbx 13: 48 89 df mov %rbx,%rdi 16: e8 27 10 21 fe call 0xfffffffffe211042 1b: 48 83 c4 10 add $0x10,%rsp 1f: 5b pop %rbx 20: 41 5c pop %r12 22: 41 5d pop %r13 24: 41 5e pop %r14 26: 41 5f pop %r15 28: 5d pop %rbp 29: c3 ret 2a:* 0f 0b ud2 <-- trapping instruction 2c: 44 0f b6 25 fd 91 53 movzbl 0x25391fd(%rip),%r12d # 0x2539231 33: 02 34: 41 80 fc 01 cmp $0x1,%r12b 38: 0f 87 1f 14 94 00 ja 0x94145d 3e: 41 rex.B 3f: 83 .byte 0x83 Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 44 0f b6 25 fd 91 53 movzbl 0x25391fd(%rip),%r12d # 0x2539207 9: 02 a: 41 80 fc 01 cmp $0x1,%r12b e: 0f 87 1f 14 94 00 ja 0x941433 14: 41 rex.B 15: 83 .byte 0x83 [ 239.530977] RSP: 0018:ffffc90002dffb70 EFLAGS: 00010202 [ 239.531023] RAX: 0000000000000005 RBX: ffff88810c819000 RCX: 0000000000000000 [ 239.531072] RDX: 1ffff110219032d1 RSI: ffffffff87e79780 RDI: 0000000000000000 [ 239.531120] RBP: ffffc90002dffba8 R08: 0000000000000001 R09: 0000000000000001 [ 239.531169] R10: ffffffff897e3f17 R11: 00000000000000bb R12: ffff88810c8194e0 [ 239.531224] R13: ffff88810c819178 R14: dffffc0000000000 R15: ffff88810c819688 [ 239.531274] FS: 00007f462c4c4740(0000) GS:ffff888232c00000(0000) knlGS:0000000000000000 [ 239.531326] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 239.531371] CR2: 000055fc93483308 CR3: 0000000113a99004 CR4: 00000000003706f0 [ 239.531420] Call Trace: [ 239.531484] <TASK> [ 239.531550] ? show_regs (arch/x86/kernel/dumpstack.c:479) [ 239.531592] ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447) [ 239.531630] ? do_trap (arch/x86/kernel/traps.c:114 arch/x86/kernel/traps.c:155) [ 239.531669] ? do_error_trap (./arch/x86/include/asm/traps.h:58 arch/x86/kernel/traps.c:176) [ 239.531708] ? free_netdev (net/core/dev.c:11066 (discriminator 1)) [ 239.531749] ? handle_invalid_op (arch/x86/kernel/traps.c:214) [ 239.531789] ? free_netdev (net/core/dev.c:11066 (discriminator 1)) [ 239.531829] ? exc_invalid_op (arch/x86/kernel/traps.c:267) [ 239.531868] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621) [ 239.531910] ? free_netdev (net/core/dev.c:11066 (discriminator 1)) [ 239.531952] ath11k_pcic_free_irq (drivers/net/wireless/ath/ath11k/pcic.c:312 (discriminator 2) drivers/net/wireless/ath/ath11k/pcic.c:334 (discriminator 2)) ath11k [ 239.532029] ath11k_pci_remove (drivers/net/wireless/ath/ath11k/pci.c:478 drivers/net/wireless/ath/ath11k/pci.c:987) ath11k_pci [ 239.532075] pci_device_remove (./include/linux/pm_runtime.h:129 drivers/pci/pci-driver.c:475) [ 239.532116] device_remove (drivers/base/dd.c:569) [ 239.532155] device_release_driver_internal (drivers/base/dd.c:1272 drivers/base/dd.c:1293) [ 239.532198] ? __kasan_check_read (mm/kasan/shadow.c:32) [ 239.532241] driver_detach (drivers/base/dd.c:1357) [ 239.532281] bus_remove_driver (drivers/base/bus.c:736) [ 239.532322] driver_unregister (drivers/base/driver.c:275) [ 239.532362] pci_unregister_driver (./include/linux/spinlock.h:351 drivers/pci/pci-driver.c:85 drivers/pci/pci-driver.c:1467) [ 239.532402] ? find_module_all (kernel/module/main.c:357 (discriminator 1)) [ 239.532443] ath11k_pci_exit (drivers/net/wireless/ath/ath11k/pci.c:1069) ath11k_pci [ 239.532543] __do_sys_delete_module (kernel/module/main.c:756) [ 239.532584] ? __kasan_slab_free (mm/kasan/common.c:274) [ 239.532625] ? module_flags (kernel/module/main.c:700) [ 239.532666] ? kmem_cache_free (mm/slub.c:4280 (discriminator 3) mm/slub.c:4344 (discriminator 3)) [ 239.533366] ? __fput (fs/file_table.c:436) [ 239.534132] ? debug_smp_processor_id (lib/smp_processor_id.c:61) [ 239.534855] __x64_sys_delete_module (kernel/module/main.c:698) [ 239.535620] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) [ 239.536334] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) [ 239.537628] RIP: 0033:0x7f462c611c8b [ 239.538510] Code: 73 01 c3 48 8b 0d 05 c2 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 c1 0c 00 f7 d8 64 89 01 48 All code ======== 0: 73 01 jae 0x3 2: c3 ret 3: 48 8b 0d 05 c2 0c 00 mov 0xcc205(%rip),%rcx # 0xcc20f a: f7 d8 neg %eax c: 64 89 01 mov %eax,%fs:(%rcx) f: 48 83 c8 ff or $0xffffffffffffffff,%rax 13: c3 ret 14: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1) 1b: 00 00 00 1e: 90 nop 1f: f3 0f 1e fa endbr64 23: b8 b0 00 00 00 mov $0xb0,%eax 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 ret 33: 48 8b 0d d5 c1 0c 00 mov 0xcc1d5(%rip),%rcx # 0xcc20f 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W Code starting with the faulting instruction =========================================== 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 ret 9: 48 8b 0d d5 c1 0c 00 mov 0xcc1d5(%rip),%rcx # 0xcc1e5 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W [ 239.540022] RSP: 002b:00007fff08acd828 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ 239.540791] RAX: ffffffffffffffda RBX: 0000555f9f0647d0 RCX: 00007f462c611c8b [ 239.541556] RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000555f9f064838 [ 239.542319] RBP: 00007fff08acd888 R08: 0000000000000000 R09: 0000000000000000 [ 239.543087] R10: 00007f462c68dac0 R11: 0000000000000206 R12: 00007fff08acda60 [ 239.543859] R13: 00007fff08acdeb7 R14: 0000555f9f0632a0 R15: 0000555f9f0647d0 [ 239.544615] </TASK> [ 239.545337] Modules linked in: ath11k_pci(-) ath11k mac80211 libarc4 cfg80211 qmi_helpers qrtr_mhi mhi qrtr nvme nvme_core [ 239.546153] ---[ end trace 0000000000000000 ]--- [ 239.568717] RIP: 0010:free_netdev (net/core/dev.c:11066 (discriminator 1)) [ 239.569476] Code: 08 84 d2 0f 85 3c 01 00 00 0f b7 83 3a 03 00 00 48 29 c3 48 89 df e8 27 10 21 fe 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 44 0f b6 25 fd 91 53 02 41 80 fc 01 0f 87 1f 14 94 00 41 83 All code ======== 0: 08 84 d2 0f 85 3c 01 or %al,0x13c850f(%rdx,%rdx,8) 7: 00 00 add %al,(%rax) 9: 0f b7 83 3a 03 00 00 movzwl 0x33a(%rbx),%eax 10: 48 29 c3 sub %rax,%rbx 13: 48 89 df mov %rbx,%rdi 16: e8 27 10 21 fe call 0xfffffffffe211042 1b: 48 83 c4 10 add $0x10,%rsp 1f: 5b pop %rbx 20: 41 5c pop %r12 22: 41 5d pop %r13 24: 41 5e pop %r14 26: 41 5f pop %r15 28: 5d pop %rbp 29: c3 ret 2a:* 0f 0b ud2 <-- trapping instruction 2c: 44 0f b6 25 fd 91 53 movzbl 0x25391fd(%rip),%r12d # 0x2539231 33: 02 34: 41 80 fc 01 cmp $0x1,%r12b 38: 0f 87 1f 14 94 00 ja 0x94145d 3e: 41 rex.B 3f: 83 .byte 0x83 Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 44 0f b6 25 fd 91 53 movzbl 0x25391fd(%rip),%r12d # 0x2539207 9: 02 a: 41 80 fc 01 cmp $0x1,%r12b e: 0f 87 1f 14 94 00 ja 0x941433 14: 41 rex.B 15: 83 .byte 0x83 [ 239.571082] RSP: 0018:ffffc90002dffb70 EFLAGS: 00010202 [ 239.571929] RAX: 0000000000000005 RBX: ffff88810c819000 RCX: 0000000000000000 [ 239.572970] RDX: 1ffff110219032d1 RSI: ffffffff87e79780 RDI: 0000000000000000 [ 239.573792] RBP: ffffc90002dffba8 R08: 0000000000000001 R09: 0000000000000001 [ 239.574600] R10: ffffffff897e3f17 R11: 00000000000000bb R12: ffff88810c8194e0 [ 239.575368] R13: ffff88810c819178 R14: dffffc0000000000 R15: ffff88810c819688 [ 239.576136] FS: 00007f462c4c4740(0000) GS:ffff888231c00000(0000) knlGS:0000000000000000 [ 239.576909] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 239.577723] CR2: 0000557ff5c54118 CR3: 0000000113a99003 CR4: 00000000003706f0 [ 239.578508] Kernel panic - not syncing: Fatal exception [ 239.579273] Kernel Offset: 0x3e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 239.608975] Rebooting in 10 seconds..
Hello Kalle, On Fri, Apr 05, 2024 at 06:15:05PM +0300, Kalle Valo wrote: > Breno Leitao <leitao@debian.org> writes: > > > struct net_device shouldn't be embedded into any structure, instead, > > the owner should use the private space to embed their state into > > net_device. > > > > This patch set fixes the problem above for ath10k and ath11k. This also > > fixes the conversion of qtnfmac driver to the new helper. > > > > This patch set depends on a series that is still under review: > > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t > > > > If it helps, I've pushed the tree to > > https://github.com/leitao/linux/tree/wireless-dummy > > > > PS: Due to lack of hardware, unfortunately all these patches are > > compiled tested only. > > > > Breno Leitao (3): > > wifi: qtnfmac: Use netdev dummy allocator helper > > wifi: ath10k: allocate dummy net_device dynamically > > wifi: ath11k: allocate dummy net_device dynamically > > Thanks for setting up the branch, that makes the testing very easy. I > now tested the branch using the commit below with ath11k WCN6855 hw2.0 > on an x86 box: > > 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically > > But unfortunately it crashes, the stack trace below. I can easily test > your branches, just let me know what to test. A direct 'git pull' > command is the best. Thanks for the test. Reading the issue, I am afraid that freeing netdev explicitly (free_netdev()) might not be the best approach at the exit path. I would like to try to leverage the ->needs_free_netdev netdev mechanism to do the clean-up, if that makes sense. I've updated the ath11k patch, and I am curious if that is what we want. Would you mind testing a net patch I have, please? https://github.com/leitao/linux/tree/wireless-dummy_v2 PS: I didn't updated the other drivers (ath10k, qtnfmac, etc). Thank you!
Breno Leitao <leitao@debian.org> writes: > Hello Kalle, > > On Fri, Apr 05, 2024 at 06:15:05PM +0300, Kalle Valo wrote: >> Breno Leitao <leitao@debian.org> writes: >> >> > struct net_device shouldn't be embedded into any structure, instead, >> > the owner should use the private space to embed their state into >> > net_device. >> > >> > This patch set fixes the problem above for ath10k and ath11k. This also >> > fixes the conversion of qtnfmac driver to the new helper. >> > >> > This patch set depends on a series that is still under review: >> > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t >> > >> > If it helps, I've pushed the tree to >> > https://github.com/leitao/linux/tree/wireless-dummy >> > >> > PS: Due to lack of hardware, unfortunately all these patches are >> > compiled tested only. >> > >> > Breno Leitao (3): >> > wifi: qtnfmac: Use netdev dummy allocator helper >> > wifi: ath10k: allocate dummy net_device dynamically >> > wifi: ath11k: allocate dummy net_device dynamically >> >> Thanks for setting up the branch, that makes the testing very easy. I >> now tested the branch using the commit below with ath11k WCN6855 hw2.0 >> on an x86 box: >> >> 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically >> >> But unfortunately it crashes, the stack trace below. I can easily test >> your branches, just let me know what to test. A direct 'git pull' >> command is the best. > > Thanks for the test. > > Reading the issue, I am afraid that freeing netdev explicitly > (free_netdev()) might not be the best approach at the exit path. > > I would like to try to leverage the ->needs_free_netdev netdev > mechanism to do the clean-up, if that makes sense. I've updated the > ath11k patch, and I am curious if that is what we want. > > Would you mind testing a net patch I have, please? > > https://github.com/leitao/linux/tree/wireless-dummy_v2 I tested this again with my WCN6855 hw2.0 x86 test box on this commit: a87674ac820e wifi: ath11k: allocate dummy net_device dynamically It passes my tests and doesn't crash, but I see this kmemleak warning a lot: unreferenced object 0xffff888127109400 (size 128): comm "insmod", pid 2813, jiffies 4294926528 hex dump (first 32 bytes): d0 93 d5 0a 81 88 ff ff d0 93 d5 0a 81 88 ff ff ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 870e4f12): [<ffffffff99bcd375>] kmemleak_alloc+0x45/0x80 [<ffffffff975707a8>] kmalloc_trace+0x278/0x2c0 [<ffffffff992904c5>] __hw_addr_create+0x55/0x260 [<ffffffff992909cb>] __hw_addr_add_ex+0x2fb/0x6d0 [<ffffffff99294004>] dev_addr_init+0x144/0x230 [<ffffffff992629ee>] alloc_netdev_mqs+0x12e/0xfe0 [<ffffffff992638c5>] alloc_netdev_dummy+0x25/0x30 [<ffffffffc0b6b0cd>] ath11k_pcic_ext_irq_config+0x1ad/0xc10 [ath11k] [<ffffffffc0b6c431>] ath11k_pcic_config_irq+0x2f1/0x4b0 [ath11k] [<ffffffffc0cb8314>] ath11k_pci_probe+0x874/0x1210 [ath11k_pci] [<ffffffff97febf06>] local_pci_probe+0xd6/0x180 [<ffffffff97feefaa>] pci_call_probe+0x15a/0x400 [<ffffffff97ff03d6>] pci_device_probe+0xa6/0x100 [<ffffffff98abe315>] really_probe+0x1d5/0x920 [<ffffffff98abed48>] __driver_probe_device+0x2e8/0x3f0 [<ffffffff98abee9a>] driver_probe_device+0x4a/0x140
On Mon, Apr 08, 2024 at 07:43:42PM +0300, Kalle Valo wrote: > Breno Leitao <leitao@debian.org> writes: > > On Fri, Apr 05, 2024 at 06:15:05PM +0300, Kalle Valo wrote: > >> Breno Leitao <leitao@debian.org> writes: > >> > >> > struct net_device shouldn't be embedded into any structure, instead, > >> > the owner should use the private space to embed their state into > >> > net_device. > >> > > >> > This patch set fixes the problem above for ath10k and ath11k. This also > >> > fixes the conversion of qtnfmac driver to the new helper. > >> > > >> > This patch set depends on a series that is still under review: > >> > https://lore.kernel.org/all/20240404114854.2498663-1-leitao@debian.org/#t > >> > > >> > If it helps, I've pushed the tree to > >> > https://github.com/leitao/linux/tree/wireless-dummy > >> > > >> > PS: Due to lack of hardware, unfortunately all these patches are > >> > compiled tested only. > >> > > >> > Breno Leitao (3): > >> > wifi: qtnfmac: Use netdev dummy allocator helper > >> > wifi: ath10k: allocate dummy net_device dynamically > >> > wifi: ath11k: allocate dummy net_device dynamically > >> > >> Thanks for setting up the branch, that makes the testing very easy. I > >> now tested the branch using the commit below with ath11k WCN6855 hw2.0 > >> on an x86 box: > >> > >> 5be9a125d8e7 wifi: ath11k: allocate dummy net_device dynamically > >> > >> But unfortunately it crashes, the stack trace below. I can easily test > >> your branches, just let me know what to test. A direct 'git pull' > >> command is the best. > > > > Thanks for the test. > > > > Reading the issue, I am afraid that freeing netdev explicitly > > (free_netdev()) might not be the best approach at the exit path. > > > > I would like to try to leverage the ->needs_free_netdev netdev > > mechanism to do the clean-up, if that makes sense. I've updated the > > ath11k patch, and I am curious if that is what we want. > > > > Would you mind testing a net patch I have, please? > > > > https://github.com/leitao/linux/tree/wireless-dummy_v2 > > I tested this again with my WCN6855 hw2.0 x86 test box on this commit: > > a87674ac820e wifi: ath11k: allocate dummy net_device dynamically > > It passes my tests and doesn't crash, but I see this kmemleak warning a > lot: Thanks Kalle, that was helpful. The device is not being clean-up automatically. Chatting with Jakub, he suggested coming back to the original approach, but, adding a additional patch, at the free_netdev(). Would you mind running another test, please? https://github.com/leitao/linux/tree/wireless-dummy_v3 The branch above is basically the original branch (as in this patch series), with this additional patch: Author: Breno Leitao <leitao@debian.org> Date: Mon Apr 8 11:37:32 2024 -0700 net: free_netdev: exit earlier if dummy For dummy devices, exit earlier at free_netdev() instead of executing the whole function. This is necessary, because dummy devices are special, and shouldn't have the second part of the function executed. Otherwise reg_state, which is NETREG_DUMMY, will be overwritten and there will be no way to identify that this is a dummy device. Also, this device do not need the final put_device(), since dummy devices are not registered (through register_netdevice()), where the device reference is increased (at netdev_register_kobject()/device_add()). Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> diff --git a/net/core/dev.c b/net/core/dev.c index 2b82bd1cd2f8..5d2cb97d0ae6 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -11058,7 +11058,8 @@ void free_netdev(struct net_device *dev) dev->xdp_bulkq = NULL; /* Compatibility with error handling in drivers */ - if (dev->reg_state == NETREG_UNINITIALIZED) { + if (dev->reg_state == NETREG_UNINITIALIZED || + dev->reg_state == NETREG_DUMMY) { netdev_freemem(dev); return; }
Breno Leitao <leitao@debian.org> writes: >> > Reading the issue, I am afraid that freeing netdev explicitly >> > (free_netdev()) might not be the best approach at the exit path. >> > >> > I would like to try to leverage the ->needs_free_netdev netdev >> > mechanism to do the clean-up, if that makes sense. I've updated the >> > ath11k patch, and I am curious if that is what we want. >> > >> > Would you mind testing a net patch I have, please? >> > >> > https://github.com/leitao/linux/tree/wireless-dummy_v2 >> >> I tested this again with my WCN6855 hw2.0 x86 test box on this commit: >> >> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically >> >> It passes my tests and doesn't crash, but I see this kmemleak warning a >> lot: > > Thanks Kalle, that was helpful. The device is not being clean-up > automatically. > > Chatting with Jakub, he suggested coming back to the original approach, > but, adding a additional patch, at the free_netdev(). > > Would you mind running another test, please? > > https://github.com/leitao/linux/tree/wireless-dummy_v3 > > The branch above is basically the original branch (as in this patch > series), with this additional patch: > > Author: Breno Leitao <leitao@debian.org> > Date: Mon Apr 8 11:37:32 2024 -0700 > > net: free_netdev: exit earlier if dummy I tested with the same ath11k hardware and this one passes all my (simple) ath11k tests, no issues found. I used this commit: 1c10aebaa8ce net: free_netdev: exit earlier if dummy
On Tue, Apr 09, 2024 at 01:03:21PM +0300, Kalle Valo wrote: > Breno Leitao <leitao@debian.org> writes: > > >> > Reading the issue, I am afraid that freeing netdev explicitly > >> > (free_netdev()) might not be the best approach at the exit path. > >> > > >> > I would like to try to leverage the ->needs_free_netdev netdev > >> > mechanism to do the clean-up, if that makes sense. I've updated the > >> > ath11k patch, and I am curious if that is what we want. > >> > > >> > Would you mind testing a net patch I have, please? > >> > > >> > https://github.com/leitao/linux/tree/wireless-dummy_v2 > >> > >> I tested this again with my WCN6855 hw2.0 x86 test box on this commit: > >> > >> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically > >> > >> It passes my tests and doesn't crash, but I see this kmemleak warning a > >> lot: > > > > Thanks Kalle, that was helpful. The device is not being clean-up > > automatically. > > > > Chatting with Jakub, he suggested coming back to the original approach, > > but, adding a additional patch, at the free_netdev(). > > > > Would you mind running another test, please? > > > > https://github.com/leitao/linux/tree/wireless-dummy_v3 > > > > The branch above is basically the original branch (as in this patch > > series), with this additional patch: > > > > Author: Breno Leitao <leitao@debian.org> > > Date: Mon Apr 8 11:37:32 2024 -0700 > > > > net: free_netdev: exit earlier if dummy > > I tested with the same ath11k hardware and this one passes all my > (simple) ath11k tests, no issues found. I used this commit: > > 1c10aebaa8ce net: free_netdev: exit earlier if dummy Thank you so much for the test. I will respin a v2 of this patchset with the additional patch included.
Breno Leitao <leitao@debian.org> writes: > On Tue, Apr 09, 2024 at 01:03:21PM +0300, Kalle Valo wrote: > >> Breno Leitao <leitao@debian.org> writes: >> >> >> > Reading the issue, I am afraid that freeing netdev explicitly >> >> > (free_netdev()) might not be the best approach at the exit path. >> >> > >> >> > I would like to try to leverage the ->needs_free_netdev netdev >> >> > mechanism to do the clean-up, if that makes sense. I've updated the >> >> > ath11k patch, and I am curious if that is what we want. >> >> > >> >> > Would you mind testing a net patch I have, please? >> >> > >> >> > https://github.com/leitao/linux/tree/wireless-dummy_v2 >> >> >> >> I tested this again with my WCN6855 hw2.0 x86 test box on this commit: >> >> >> >> a87674ac820e wifi: ath11k: allocate dummy net_device dynamically >> >> >> >> It passes my tests and doesn't crash, but I see this kmemleak warning a >> >> lot: >> > >> > Thanks Kalle, that was helpful. The device is not being clean-up >> > automatically. >> > >> > Chatting with Jakub, he suggested coming back to the original approach, >> > but, adding a additional patch, at the free_netdev(). >> > >> > Would you mind running another test, please? >> > >> > https://github.com/leitao/linux/tree/wireless-dummy_v3 >> > >> > The branch above is basically the original branch (as in this patch >> > series), with this additional patch: >> > >> > Author: Breno Leitao <leitao@debian.org> >> > Date: Mon Apr 8 11:37:32 2024 -0700 >> > >> > net: free_netdev: exit earlier if dummy >> >> I tested with the same ath11k hardware and this one passes all my >> (simple) ath11k tests, no issues found. I used this commit: >> >> 1c10aebaa8ce net: free_netdev: exit earlier if dummy > > Thank you so much for the test. > > I will respin a v2 of this patchset with the additional patch included. Sounds good. Feel free to add: Tested-by: Kalle Valo <kvalo@kernel.org>