Message ID | 20230720121829.566974-1-jiri@resnulli.us (mailing list archive) |
---|---|
Series | devlink: introduce dump selector attr and use it for per-instance dumps |
I'll take this through our nightly and will report back tomorrow.
Thu, Jul 20, 2023 at 03:55:00PM CEST, petrm@nvidia.com wrote:
>I'll take this through our nightly and will report back tomorrow.
Sure. I ran mlxsw regression with this already, no issues.
Jiri Pirko <jiri@resnulli.us> writes:

> Thu, Jul 20, 2023 at 03:55:00PM CEST, petrm@nvidia.com wrote:
>
>>I'll take this through our nightly and will report back tomorrow.
>
> Sure. I ran mlxsw regression with this already, no issues.

You started it on one machine and it went well for a while. But it's
getting a stream of these splats right now:

INFO - INFO - [ 4155.564670] rcu: INFO: rcu_preempt self-detected stall on CPU
INFO - INFO - [ 4155.571093] rcu: 7-....: (99998 ticks this GP) idle=ac7c/1/0x4000000000000000 softirq=86447/86447 fqs=25001
INFO - INFO - [ 4155.582077] rcu: (t=100015 jiffies g=289809 q=1459 ncpus=8)
INFO - INFO - [ 4155.588398] CPU: 7 PID: 38940 Comm: ip Not tainted 6.5.0-rc1jiri+ #1
INFO - INFO - [ 4155.595497] Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 01/06/2019
INFO - INFO - [ 4155.604915] RIP: 0010:__netlink_lookup+0xca/0x150
INFO - INFO - [ 4155.610171] Code: 00 00 48 89 c7 48 83 cf 01 48 8b 10 48 83 e2 fe 48 0f 44 d7 f6 c2 01 75 5a 0f b7 4b 16 44 8b 44 24 08 49 89 c9 49 f7 d9 eb 08 <48> 8b 12 f6 c2 01 75 41 4a 8d 34 0a 44 39 86 e8 02 00 00 75 eb 48
INFO - INFO - [ 4155.631156] RSP: 0018:ffffbea7ca41b760 EFLAGS: 00000213
INFO - INFO - [ 4155.636992] RAX: ffffa048c25120b0 RBX: ffffa048c01e4000 RCX: 0000000000000400
INFO - INFO - [ 4155.644964] RDX: ffffa048c6d4b400 RSI: ffffa048c6d4b000 RDI: ffffa048c25120b1
INFO - INFO - [ 4155.652935] RBP: ffffa048c2512000 R08: 00000000888fe595 R09: fffffffffffffc00
INFO - INFO - [ 4155.660906] R10: 00000000302e3030 R11: 0000006900030008 R12: ffffa048c9205900
INFO - INFO - [ 4155.668879] R13: 00000000888fe595 R14: 0000000000000001 R15: ffffa048c01e4000
INFO - INFO - [ 4155.676850] FS: 00007f2155bcf800(0000) GS:ffffa04c2fdc0000(0000) knlGS:0000000000000000
INFO - INFO - [ 4155.685890] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
INFO - INFO - [ 4155.692307] CR2: 00000000004e4140 CR3: 000000014c919005 CR4: 00000000003706e0
INFO - INFO - [ 4155.700279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
INFO - INFO - [ 4155.708249] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
INFO - INFO - [ 4155.716220] Call Trace:
INFO - INFO - [ 4155.718948] <IRQ>
INFO - INFO - [ 4155.721190] ? rcu_dump_cpu_stacks+0xea/0x170
INFO - INFO - [ 4155.726057] ? rcu_sched_clock_irq+0x53b/0x10b0
INFO - INFO - [ 4155.731116] ? update_load_avg+0x54/0x280
INFO - INFO - [ 4155.735593] ? notifier_call_chain+0x5a/0xc0
INFO - INFO - [ 4155.740361] ? timekeeping_update+0xaf/0x280
INFO - INFO - [ 4155.745130] ? timekeeping_advance+0x374/0x590
INFO - INFO - [ 4155.750093] ? update_process_times+0x74/0xb0
INFO - INFO - [ 4155.754957] ? tick_sched_handle+0x33/0x50
INFO - INFO - [ 4155.759529] ? tick_sched_timer+0x6b/0x80
INFO - INFO - [ 4155.763995] ? tick_sched_do_timer+0x80/0x80
INFO - INFO - [ 4155.768762] ? __hrtimer_run_queues+0x10f/0x2a0
INFO - INFO - [ 4155.773820] ? hrtimer_interrupt+0xf8/0x230
INFO - INFO - [ 4155.778492] ? __sysvec_apic_timer_interrupt+0x52/0x120
INFO - INFO - [ 4155.784327] ? sysvec_apic_timer_interrupt+0x6d/0x90
INFO - INFO - [ 4155.789874] </IRQ>
INFO - INFO - [ 4155.792211] <TASK>
INFO - INFO - [ 4155.794549] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
INFO - INFO - [ 4155.800485] ? __netlink_lookup+0xca/0x150
INFO - INFO - [ 4155.805059] netlink_unicast+0x132/0x390
INFO - INFO - [ 4155.809437] rtnl_getlink+0x36d/0x410
INFO - INFO - [ 4155.813532] rtnetlink_rcv_msg+0x14f/0x3b0
INFO - INFO - [ 4155.818106] ? __alloc_pages+0x17c/0x290
INFO - INFO - [ 4155.822485] ? rtnl_calcit.isra.0+0x140/0x140
INFO - INFO - [ 4155.827348] netlink_rcv_skb+0x58/0x100
INFO - INFO - [ 4155.831631] netlink_unicast+0x23c/0x390
INFO - INFO - [ 4155.836010] netlink_sendmsg+0x214/0x470
INFO - INFO - [ 4155.840390] ? netlink_unicast+0x390/0x390
INFO - INFO - [ 4155.844963] ____sys_sendmsg+0x16a/0x260
INFO - INFO - [ 4155.849345] ___sys_sendmsg+0x9a/0xe0
INFO - INFO - [ 4155.853437] __sys_sendmsg+0x7a/0xc0
INFO - INFO - [ 4155.857428] do_syscall_64+0x38/0x80
INFO - INFO - [ 4155.861419] entry_SYSCALL_64_after_hwframe+0x63/0xcd

BTW, while for core patches, any machine pass is usually a good
predictor of a full regression pass, that's not always the case.
There's a reason we run on about 15 machines plus simulation. Even if
this had "no issues", there would be value in getting a full
regression run.

I'm pulling this from the nightly again.
I see that this patchset got moved to "changes requested" in patchwork.
Why exactly? There was no comment so far. Petr's splat is clearly not
caused by this patchset.

Should I resubmit?

Thanks!

Thu, Jul 20, 2023 at 02:18:18PM CEST, jiri@resnulli.us wrote:
>From: Jiri Pirko <jiri@nvidia.com>
>
>Motivation:
>
>For SFs, one devlink instance per SF is created. There might be
>thousands of these on a single host. When a user needs to know the
>port handle for a specific SF, he needs to dump all devlink ports on
>the host, which does not scale well.
>
>Solution:
>
>Allow the user to pass a devlink handle alongside the dump command
>and dump only objects which are under the selected devlink instance.
>
>Introduce a new attr, DEVLINK_ATTR_DUMP_SELECTOR, to nest the selection
>attributes. This way userspace can use maxattr to tell whether the dump
>selector is supported by the kernel or not.
>
>Assemble a netlink policy for the selector attribute. If the user
>passes an attr unknown to the kernel, netlink validation errors out.
>
>Example:
>$ devlink port show
>auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
>auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false
>
>$ devlink port show auxiliary/mlx5_core.eth.0
>auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
>
>$ devlink port show auxiliary/mlx5_core.eth.1
>auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false
>
>This is done in patch #10.
>
>Dependency:
>
>Parsing of DEVLINK_ATTR_DUMP_SELECTOR is best done once at the
>beginning of a dump. Unfortunately, it is not possible to define
>start() and done() callbacks for netlink small ops, so all commands
>that use the instance iterator for dumpit are converted to split ops.
>This is done in patches #1-9.
>
>Extension:
>
>Patch #11 extends the selector with a port index for health reporter
>dumping.
>
>v1->v2:
>- the original single patch (patch #10) was extended to a patchset
>
>Jiri Pirko (11):
>  devlink: parse linecard attr in doit() callbacks
>  devlink: parse rate attrs in doit() callbacks
>  devlink: introduce __devlink_nl_pre_doit() with internal flags as
>    function arg
>  devlink: convert port get command to split ops
>  devlink: convert health reporter get command to split ops
>  devlink: convert param get command to split ops
>  devlink: convert trap get command to split ops
>  devlink: introduce set of macros and use it for split ops definitions
>  devlink: convert rest of the iterator dumpit commands to split ops
>  devlink: introduce dump selector attr and use it for per-instance
>    dumps
>  devlink: extend health reporter dump selector by port index
>
> include/uapi/linux/devlink.h |   2 +
> net/devlink/devl_internal.h  |  42 +++---
> net/devlink/health.c         |  21 ++-
> net/devlink/leftover.c       | 211 ++++++++--------------------
> net/devlink/netlink.c        | 263 ++++++++++++++++++++++++++++++-----
> 5 files changed, 333 insertions(+), 206 deletions(-)
>
>--
>2.41.0
>
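For readers not familiar with the split-ops conversion mentioned in the quoted
Dependency section, the following is a minimal, hypothetical sketch of what it
enables: a generic netlink split op can register start() and done() callbacks
next to dumpit(), which small ops cannot. The callback names and stub bodies
below are assumptions made for illustration only; they are not code from the
patchset.

#include <net/genetlink.h>
#include <net/devlink.h>

/* Hypothetical callbacks; the real ones live in net/devlink/. start() is
 * where per-dump state (such as a parsed dump selector) could be set up
 * once, and done() is where it would be released.
 */
static int devlink_nl_dump_start_sketch(struct netlink_callback *cb)
{
	return 0; /* e.g. parse DEVLINK_ATTR_DUMP_SELECTOR once per dump */
}

static int devlink_nl_dumpit_sketch(struct sk_buff *msg,
				    struct netlink_callback *cb)
{
	return msg->len; /* e.g. iterate devlink instances and fill msg */
}

static int devlink_nl_dump_done_sketch(struct netlink_callback *cb)
{
	return 0; /* e.g. free per-dump state set up in start() */
}

/* A split op carries the dump callbacks explicitly; GENL_CMD_CAP_DUMP
 * marks this entry as the dump half of the command.
 */
static const struct genl_split_ops devlink_nl_ops_sketch[] = {
	{
		.cmd	= DEVLINK_CMD_PORT_GET,
		.start	= devlink_nl_dump_start_sketch,
		.dumpit	= devlink_nl_dumpit_sketch,
		.done	= devlink_nl_dump_done_sketch,
		.flags	= GENL_CMD_CAP_DUMP,
	},
};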
On Tue, 2023-07-25 at 10:07 +0200, Jiri Pirko wrote:
> I see that this patchset got moved to "changes requested" in patchwork.
> Why exactly? There was no comment so far. Petr's splat is clearly not
> caused by this patchset.

Quickly skimming over the series I agree the reported splat looks
possibly more related to core netlink and/or rhashtable but it would
help if you could express your reasoning on the splat itself, possibly
with a decoded back-trace.

Thanks!

Paolo
>
Tue, Jul 25, 2023 at 10:36:31AM CEST, pabeni@redhat.com wrote:
>On Tue, 2023-07-25 at 10:07 +0200, Jiri Pirko wrote:
>> I see that this patchset got moved to "changes requested" in patchwork.
>> Why exactly? There was no comment so far. Petr's splat is clearly not
>> caused by this patchset.
>
>Quickly skimming over the series I agree the reported splat looks
>possibly more related to core netlink and/or rhashtable but it would
>help if you could express your reasoning on the splat itself, possibly
>with a decoded back-trace.
>

Well, since I'm touching only devlink, this is unrelated to this
patchset. I don't see why I would need to care about the splat.
I will repost, if that is ok.

>Thanks!
>
>Paolo
>>
>
On Tue, 2023-07-25 at 17:29 +0200, Jiri Pirko wrote:
> Tue, Jul 25, 2023 at 10:36:31AM CEST, pabeni@redhat.com wrote:
> > On Tue, 2023-07-25 at 10:07 +0200, Jiri Pirko wrote:
> > > I see that this patchset got moved to "changes requested" in patchwork.
> > > Why exactly? There was no comment so far. Petr's splat is clearly not
> > > caused by this patchset.
> >
> > Quickly skimming over the series I agree the reported splat looks
> > possibly more related to core netlink and/or rhashtable but it would
> > help if you could express your reasoning on the splat itself, possibly
> > with a decoded back-trace.
> >
>
> Well, since I'm touching only devlink, this is unrelated to this
> patchset. I don't see why I would need to care about the splat.
> I will repost, if that is ok.

Not needed, the series is again in 'new' status.

Cheers,

Paolo
From: Jiri Pirko <jiri@nvidia.com>

Motivation:

For SFs, one devlink instance per SF is created. There might be
thousands of these on a single host. When a user needs to know the
port handle for a specific SF, he needs to dump all devlink ports on
the host, which does not scale well.

Solution:

Allow the user to pass a devlink handle alongside the dump command
and dump only objects which are under the selected devlink instance.

Introduce a new attr, DEVLINK_ATTR_DUMP_SELECTOR, to nest the selection
attributes. This way userspace can use maxattr to tell whether the dump
selector is supported by the kernel or not.

Assemble a netlink policy for the selector attribute. If the user
passes an attr unknown to the kernel, netlink validation errors out.

Example:
$ devlink port show
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

$ devlink port show auxiliary/mlx5_core.eth.0
auxiliary/mlx5_core.eth.0/65535: type eth netdev eth2 flavour physical port 0 splittable false

$ devlink port show auxiliary/mlx5_core.eth.1
auxiliary/mlx5_core.eth.1/131071: type eth netdev eth3 flavour physical port 1 splittable false

This is done in patch #10.

Dependency:

Parsing of DEVLINK_ATTR_DUMP_SELECTOR is best done once at the
beginning of a dump. Unfortunately, it is not possible to define
start() and done() callbacks for netlink small ops, so all commands
that use the instance iterator for dumpit are converted to split ops.
This is done in patches #1-9.

Extension:

Patch #11 extends the selector with a port index for health reporter
dumping.

v1->v2:
- the original single patch (patch #10) was extended to a patchset

Jiri Pirko (11):
  devlink: parse linecard attr in doit() callbacks
  devlink: parse rate attrs in doit() callbacks
  devlink: introduce __devlink_nl_pre_doit() with internal flags as
    function arg
  devlink: convert port get command to split ops
  devlink: convert health reporter get command to split ops
  devlink: convert param get command to split ops
  devlink: convert trap get command to split ops
  devlink: introduce set of macros and use it for split ops definitions
  devlink: convert rest of the iterator dumpit commands to split ops
  devlink: introduce dump selector attr and use it for per-instance
    dumps
  devlink: extend health reporter dump selector by port index

 include/uapi/linux/devlink.h |   2 +
 net/devlink/devl_internal.h  |  42 +++---
 net/devlink/health.c         |  21 ++-
 net/devlink/leftover.c       | 211 ++++++++--------------------
 net/devlink/netlink.c        | 263 ++++++++++++++++++++++++++++++-----
 5 files changed, 333 insertions(+), 206 deletions(-)
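To make the selector mechanism above more concrete, here is a minimal,
hypothetical sketch of what a nested policy for DEVLINK_ATTR_DUMP_SELECTOR
could look like. DEVLINK_ATTR_DUMP_SELECTOR is the attribute this series adds;
the particular attributes nested inside it here (bus name, device name) and
the array names are illustrative assumptions, not the exact policy from
patch #10.

#include <net/netlink.h>
#include <net/devlink.h>

/* Hypothetical nested policy: which attributes a dump selector may carry.
 * Anything outside this set is rejected by netlink validation, so userspace
 * cannot silently pass an unsupported selector.
 */
static const struct nla_policy
devlink_dump_selector_nl_policy_sketch[DEVLINK_ATTR_DEV_NAME + 1] = {
	[DEVLINK_ATTR_BUS_NAME]	= { .type = NLA_NUL_STRING },
	[DEVLINK_ATTR_DEV_NAME]	= { .type = NLA_NUL_STRING },
};

/* Hypothetical top-level policy entry: the selector is a nested attribute,
 * so userspace can probe maxattr to learn whether the kernel supports it.
 * DEVLINK_ATTR_DUMP_SELECTOR is introduced by this series, not by mainline
 * headers.
 */
static const struct nla_policy
devlink_dump_nl_policy_sketch[DEVLINK_ATTR_MAX + 1] = {
	[DEVLINK_ATTR_DUMP_SELECTOR] =
		NLA_POLICY_NESTED(devlink_dump_selector_nl_policy_sketch),
};

In this sketch, a request like `devlink port show auxiliary/mlx5_core.eth.0`
would translate to a dump message whose DEVLINK_ATTR_DUMP_SELECTOR nest
carries that bus/device handle, so the kernel walks only the matching devlink
instance instead of all of them.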