Message ID | 20241011131843.2931995-1-edumazet@google.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] netdevsim: use cond_resched() in nsim_dev_trap_report_work() | expand |
On Fri, 11 Oct 2024 13:18:43 +0000 Eric Dumazet wrote: > --- a/drivers/net/netdevsim/dev.c > +++ b/drivers/net/netdevsim/dev.c > @@ -848,11 +848,12 @@ static void nsim_dev_trap_report_work(struct work_struct *work) > continue; > > nsim_dev_trap_report(nsim_dev_port); > + cond_resched(); > } > devl_unlock(priv_to_devlink(nsim_dev)); > - > - schedule_delayed_work(&nsim_dev->trap_data->trap_report_dw, > - msecs_to_jiffies(NSIM_TRAP_REPORT_INTERVAL_MS)); > + queue_delayed_work(system_unbound_wq, > + &nsim_dev->trap_data->trap_report_dw, > + msecs_to_jiffies(NSIM_TRAP_REPORT_INTERVAL_MS)); Makes sense, there's one more place which queues this work, in case we couldn't grab the lock. Should it also be converted?
On Fri, Oct 11, 2024 at 5:59 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Fri, 11 Oct 2024 13:18:43 +0000 Eric Dumazet wrote: > > --- a/drivers/net/netdevsim/dev.c > > +++ b/drivers/net/netdevsim/dev.c > > @@ -848,11 +848,12 @@ static void nsim_dev_trap_report_work(struct work_struct *work) > > continue; > > > > nsim_dev_trap_report(nsim_dev_port); > > + cond_resched(); > > } > > devl_unlock(priv_to_devlink(nsim_dev)); > > - > > - schedule_delayed_work(&nsim_dev->trap_data->trap_report_dw, > > - msecs_to_jiffies(NSIM_TRAP_REPORT_INTERVAL_MS)); > > + queue_delayed_work(system_unbound_wq, > > + &nsim_dev->trap_data->trap_report_dw, > > + msecs_to_jiffies(NSIM_TRAP_REPORT_INTERVAL_MS)); > > Makes sense, there's one more place which queues this work, in case we > couldn't grab the lock. Should it also be converted? Right of course, I will send a v2.
diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c index 92a7a36b93ac0cc1b02a551b974fb390254ac484..2f98443230895e8b6ee4cc36d5a2add8c2c0a00e 100644 --- a/drivers/net/netdevsim/dev.c +++ b/drivers/net/netdevsim/dev.c @@ -848,11 +848,12 @@ static void nsim_dev_trap_report_work(struct work_struct *work) continue; nsim_dev_trap_report(nsim_dev_port); + cond_resched(); } devl_unlock(priv_to_devlink(nsim_dev)); - - schedule_delayed_work(&nsim_dev->trap_data->trap_report_dw, - msecs_to_jiffies(NSIM_TRAP_REPORT_INTERVAL_MS)); + queue_delayed_work(system_unbound_wq, + &nsim_dev->trap_data->trap_report_dw, + msecs_to_jiffies(NSIM_TRAP_REPORT_INTERVAL_MS)); } static int nsim_dev_traps_init(struct devlink *devlink) @@ -907,8 +908,9 @@ static int nsim_dev_traps_init(struct devlink *devlink) INIT_DELAYED_WORK(&nsim_dev->trap_data->trap_report_dw, nsim_dev_trap_report_work); - schedule_delayed_work(&nsim_dev->trap_data->trap_report_dw, - msecs_to_jiffies(NSIM_TRAP_REPORT_INTERVAL_MS)); + queue_delayed_work(system_unbound_wq, + &nsim_dev->trap_data->trap_report_dw, + msecs_to_jiffies(NSIM_TRAP_REPORT_INTERVAL_MS)); return 0;
I am still seeing many syzbot reports hinting that syzbot might fool nsim_dev_trap_report_work() with hundreds of ports [1] Lets use cond_resched(), and system_unbound_wq instead of implicit system_wq. [1] INFO: task syz-executor:20633 blocked for more than 143 seconds. Not tainted 6.12.0-rc2-syzkaller-00205-g1d227fcc7222 #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor state:D stack:25856 pid:20633 tgid:20633 ppid:1 flags:0x00004006 ... NMI backtrace for cpu 1 CPU: 1 UID: 0 PID: 16760 Comm: kworker/1:0 Not tainted 6.12.0-rc2-syzkaller-00205-g1d227fcc7222 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: events nsim_dev_trap_report_work RIP: 0010:__sanitizer_cov_trace_pc+0x0/0x70 kernel/kcov.c:210 Code: 89 fb e8 23 00 00 00 48 8b 3d 04 fb 9c 0c 48 89 de 5b e9 c3 c7 5d 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <f3> 0f 1e fa 48 8b 04 24 65 48 8b 0c 25 c0 d7 03 00 65 8b 15 60 f0 RSP: 0018:ffffc90000a187e8 EFLAGS: 00000246 RAX: 0000000000000100 RBX: ffffc90000a188e0 RCX: ffff888027d3bc00 RDX: ffff888027d3bc00 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88804a2e6000 R08: ffffffff8a4bc495 R09: ffffffff89da3577 R10: 0000000000000004 R11: ffffffff8a4bc2b0 R12: dffffc0000000000 R13: ffff88806573b503 R14: dffffc0000000000 R15: ffff8880663cca00 FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc90a747f98 CR3: 000000000e734000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 000000000000002b DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: <NMI> </NMI> <TASK> __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382 spin_unlock_bh include/linux/spinlock.h:396 [inline] nsim_dev_trap_report drivers/net/netdevsim/dev.c:820 [inline] nsim_dev_trap_report_work+0x75d/0xaa0 drivers/net/netdevsim/dev.c:850 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Fixes: ba5e1272142d ("netdevsim: avoid potential loop in nsim_dev_trap_report_work()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jiri@nvidia.com> --- drivers/net/netdevsim/dev.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)