
[net-next] net: Add lockdep asserts to ____napi_schedule().

Message ID YitkzkjU5zng7jAM@linutronix.de (mailing list archive)
State Accepted
Commit fbd9a2ceba5c74bbfa19cf257ae4b4b2c820860d
Delegated to: Netdev Maintainers
Series: [net-next] net: Add lockdep asserts to ____napi_schedule().

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 17759 this patch: 17759
netdev/cc_maintainers warning 4 maintainers not CCed: mingo@redhat.com longman@redhat.com boqun.feng@gmail.com will@kernel.org
netdev/build_clang success Errors and warnings before: 3927 this patch: 3927
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 17372 this patch: 17372
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 36 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Sebastian Andrzej Siewior March 11, 2022, 3:03 p.m. UTC
____napi_schedule() needs to be invoked with interrupts disabled due to
__raise_softirq_irqoff() (in order not to corrupt the per-CPU list).
____napi_schedule() also needs to be invoked from interrupt context so
that the raised softirq is processed when the interrupt context is left.

Add lockdep asserts for both conditions.
Since this is the second place that needs the irq/softirq check, provide
a generic lockdep_assert_softirq_will_run() which is used by both
callers.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
This is my todo from
   https://lkml.kernel.org/r/20220203170901.52ccfd09@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com

- It was suggested to add this to __napi_schedule() and
  __napi_schedule_irqoff(), but both end up here in ____napi_schedule().

- It was suggested to do lockdep_assert_softirq_will_run(). Done, plus
  the other caller has been moved over to it.

While adding this, I stumbled over lockdep_assert_in_softirq(). It
really special-cases things and builds upon assumptions; it took me a
while to figure out what is going on. I would suggest removing it and,
as a replacement, adding lockdep annotations (as in spin_acquire())
around the `napi_alloc_cache' access, which is the thing the annotation
cares about. Then lockdep_assert_in_softirq() can be replaced with a
might_lock(), so in the end we know why we do what we do. Lockdep will
yell if the lock has been observed in-hardirq and in-softirq without
interrupts disabled.
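A rough, untested sketch of what I mean (the map name is made up; only
the lockdep primitives themselves are real):

#include <linux/lockdep.h>

/* Made-up map covering the per-CPU napi_alloc_cache. */
static struct lockdep_map napi_alloc_cache_dep_map =
	STATIC_LOCKDEP_MAP_INIT("napi_alloc_cache", &napi_alloc_cache_dep_map);

static void napi_alloc_cache_access(void)
{
	/* Claim the cache like a lock for lockdep's benefit ... */
	lock_map_acquire(&napi_alloc_cache_dep_map);
	/* ... this_cpu_ptr(&napi_alloc_cache) would be used here ... */
	lock_map_release(&napi_alloc_cache_dep_map);
}

The current lockdep_assert_in_softirq() user would then do the same
acquire/release pair, which is effectively the might_lock() idea.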
Sounds good?

---
 include/linux/lockdep.h | 7 +++++++
 net/core/dev.c          | 5 ++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

Comments

patchwork-bot+netdevbpf@kernel.org March 14, 2022, 10:40 a.m. UTC | #1
Hello:

This patch was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Fri, 11 Mar 2022 16:03:42 +0100 you wrote:
> ____napi_schedule() needs to be invoked with interrupts disabled due to
> __raise_softirq_irqoff() (in order not to corrupt the per-CPU list).
> ____napi_schedule() also needs to be invoked from interrupt context so
> that the raised softirq is processed when the interrupt context is left.
> 
> Add lockdep asserts for both conditions.
> Since this is the second place that needs the irq/softirq check, provide
> a generic lockdep_assert_softirq_will_run() which is used by both
> callers.
> 
> [...]

Here is the summary with links:
  - [net-next] net: Add lockdep asserts to ____napi_schedule().
    https://git.kernel.org/netdev/net-next/c/fbd9a2ceba5c

You are awesome, thank you!
Saeed Mahameed March 17, 2022, 7:21 p.m. UTC | #2
On 11 Mar 16:03, Sebastian Andrzej Siewior wrote:
>____napi_schedule() needs to be invoked with interrupts disabled due to
>__raise_softirq_irqoff() (in order not to corrupt the per-CPU list).

This patch is likely causing the following call trace when RPS is enabled:

  [  690.429122] WARNING: CPU: 0 PID: 0 at net/core/dev.c:4268 rps_trigger_softirq+0x21/0xb0
  [  690.431236] Modules linked in: bonding ib_ipoib ipip tunnel4 geneve ib_umad ip6_gre ip6_tunnel tunnel6 rdma_ucm nf_tables ip_gre gre mlx5_ib ib_uverbs mlx5_core iptable_raw openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink rpcrdma ib_iser xt_addrtype libiscsi scsi_transport_iscsi rdma_cm iw_cm iptable_nat nf_nat ib_cm br_netfilter ib_core overlay fuse [last unloaded: ib_uverbs]
  [  690.439693] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc7_net_next_4303f9c #1
  [  690.441709] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  [  690.444587] RIP: 0010:rps_trigger_softirq+0x21/0xb0
  [  690.445971] Code: ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 b1 ea 21 01 53 48 89 fb 85 c0 74 1b 65 8b 05 16 56 71 7e a9 00 ff 0f 00 75 02 <0f> 0b 65 8b 05 4e 5f 72 7e 85 c0 74 5b 48 8b 83 e0 01 00 00 f6 c4
  [  690.450682] RSP: 0018:ffffffff82803e70 EFLAGS: 00010046
  [  690.452106] RAX: 0000000000000001 RBX: ffff88852ca3d400 RCX: ffff88852ca3d540
  [  690.453972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88852ca3d400
  [  690.455860] RBP: 0000000000000000 R08: ffff88852ca3d400 R09: 0000000000000001
  [  690.457684] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
  [  690.459548] R13: ffff88852ca3d540 R14: ffffffff82829628 R15: 0000000000000000
  [  690.461429] FS:  0000000000000000(0000) GS:ffff88852ca00000(0000) knlGS:0000000000000000
  [  690.463653] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  690.465180] CR2: 00007ff718200b98 CR3: 000000013b4de003 CR4: 0000000000370eb0
  [  690.467022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [  690.468915] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [  690.470742] Call Trace:
  [  690.471544]  <TASK>
  [  690.472242]  flush_smp_call_function_queue+0xe5/0x1e0
  [  690.473639]  flush_smp_call_function_from_idle+0x5f/0xa0


For some reason - which I haven't looked into yet -
net_rps_send_ipi() will eventually call ____napi_schedule()
only after enabling IRQs; see net_rps_action_and_irq_enable().
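The path in question, abridged from net/core/dev.c of this kernel (the
comments are mine):

static void rps_trigger_softirq(void *data)
{
	struct softnet_data *sd = data;

	/* Runs as a csd callback; when invoked from the idle loop
	 * neither hardirq_count() nor softirq_count() is set, so
	 * lockdep_assert_softirq_will_run() fires (dev.c:4268). */
	____napi_schedule(sd, &sd->backlog);
	sd->received_rps++;
}

static void net_rps_send_ipi(struct softnet_data *remsd)
{
#ifdef CONFIG_RPS
	while (remsd) {
		struct softnet_data *next = remsd->rps_ipi_next;

		if (cpu_online(remsd->cpu))
			smp_call_function_single_async(remsd->cpu,
						       &remsd->csd);
		remsd = next;
	}
#endif
}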
Jason A. Donenfeld March 18, 2022, 1:48 a.m. UTC | #3
Hi Sebastian,

On Fri, Mar 11, 2022 at 04:03:42PM +0100, Sebastian Andrzej Siewior wrote:
> ____napi_schedule() needs to be invoked with interrupts disabled due to
> __raise_softirq_irqoff() (in order not to corrupt the per-CPU list).
> ____napi_schedule() also needs to be invoked from interrupt context so
> that the raised softirq is processed when the interrupt context is left.
> 
> Add lockdep asserts for both conditions.
> Since this is the second place that needs the irq/softirq check, provide
> a generic lockdep_assert_softirq_will_run() which is used by both
> callers.

I stumbled upon this commit when noticing a new failure in WireGuard's
test suite:

[    1.338823] ------------[ cut here ]------------
[    1.339289] WARNING: CPU: 0 PID: 11 at ../../../../../../../../net/core/dev.c:4268 __napi_schedule+0xa1/0x300
[    1.340222] CPU: 0 PID: 11 Comm: kworker/0:1 Not tainted 5.17.0-rc8-debug+ #1
[    1.340896] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS d55cb5a 04/01/2014
[    1.341669] Workqueue: wg-crypt-wg0 wg_packet_decrypt_worker
[    1.342207] RIP: 0010:__napi_schedule+0xa1/0x300
[    1.342655] Code: c0 03 0f b6 14 11 38 d0 7c 08 84 d2 0f 85 eb 01 00 00 8b 05 cd a9 0d 01 85 c0 74 1f 65 8b 05 d6 87 7d 7e a9 00 ff 0f 00 75 02 <0f> 0b 65 8b 05 96 8e 7d 7e 85 c0 0f 84 86 01 00 00 4c 8d 73 10 be
[    1.344366] RSP: 0018:ffff888004bc7c98 EFLAGS: 00010046
[    1.344861] RAX: 0000000080000000 RBX: ffff888007570750 RCX: 1ffffffff05251e5
[    1.345532] RDX: 0000000000000000 RSI: ffffffff822e1060 RDI: ffffffff8244c700
[    1.346189] RBP: ffff888036400000 R08: 0000000000000001 R09: ffff888007570767
[    1.346847] R10: ffffed1000eae0ec R11: 0000000000000001 R12: 0000000000000200
[    1.347504] R13: 00000000000364c0 R14: ffff8880078231c0 R15: ffff888007570750
[    1.348193] FS:  0000000000000000(0000) GS:ffff888036400000(0000) knlGS:0000000000000000
[    1.348973] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.349505] CR2: 00007ffec7b8ed3c CR3: 0000000002625005 CR4: 0000000000370eb0
[    1.350207] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.350921] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.351587] Call Trace:
[    1.351822]  <TASK>
[    1.352026]  ? napi_schedule_prep+0x37/0x90
[    1.352417]  wg_packet_decrypt_worker+0x2ac/0x470
[    1.352859]  ? lock_is_held_type+0xd7/0x130
[    1.353251]  process_one_work+0x839/0x1380
[    1.353651]  ? rcu_read_unlock+0x40/0x40
[    1.354023]  ? pwq_dec_nr_in_flight+0x230/0x230
[    1.354448]  ? __rwlock_init+0x140/0x140
[    1.354826]  worker_thread+0x593/0xf60
[    1.355180]  ? process_one_work+0x1380/0x1380
[    1.355593]  ? process_one_work+0x1380/0x1380
[    1.356002]  kthread+0x262/0x300
[    1.356308]  ? kthread_exit+0xc0/0xc0
[    1.356656]  ret_from_fork+0x1f/0x30
[    1.357011]  </TASK>

Sounds like wg_packet_decrypt_worker() might be doing something wrong? I
vaguely recall a thread where you started looking into some things there
that seemed non-optimal, but I didn't realize there were correctness
issues. If your patch is correct, and wg_packet_decrypt_worker() is
wrong, do you have a concrete idea of how we should approach fixing
wireguard? Or do you want to send a patch for that?

Jason
Sebastian Andrzej Siewior March 18, 2022, 10:05 a.m. UTC | #4
On 2022-03-17 12:21:45 [-0700], Saeed Mahameed wrote:
> On 11 Mar 16:03, Sebastian Andrzej Siewior wrote:
> > ____napi_schedule() needs to be invoked with interrupts disabled due to
> > __raise_softirq_irqoff() (in order not to corrupt the per-CPU list).
> 
> This patch is likely causing the following call trace when RPS is enabled:
> 
>  [  690.429122] WARNING: CPU: 0 PID: 0 at net/core/dev.c:4268 rps_trigger_softirq+0x21/0xb0
>  [  690.431236] Modules linked in: bonding ib_ipoib ipip tunnel4 geneve ib_umad ip6_gre ip6_tunnel tunnel6 rdma_ucm nf_tables ip_gre gre mlx5_ib ib_uverbs mlx5_core iptable_raw openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink rpcrdma ib_iser xt_addrtype libiscsi scsi_transport_iscsi rdma_cm iw_cm iptable_nat nf_nat ib_cm br_netfilter ib_core overlay fuse [last unloaded: ib_uverbs]
>  [  690.439693] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc7_net_next_4303f9c #1
>  [  690.441709] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>  [  690.444587] RIP: 0010:rps_trigger_softirq+0x21/0xb0
>  [  690.445971] Code: ff 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 05 b1 ea 21 01 53 48 89 fb 85 c0 74 1b 65 8b 05 16 56 71 7e a9 00 ff 0f 00 75 02 <0f> 0b 65 8b 05 4e 5f 72 7e 85 c0 74 5b 48 8b 83 e0 01 00 00 f6 c4
>  [  690.450682] RSP: 0018:ffffffff82803e70 EFLAGS: 00010046
>  [  690.452106] RAX: 0000000000000001 RBX: ffff88852ca3d400 RCX: ffff88852ca3d540
>  [  690.453972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88852ca3d400
>  [  690.455860] RBP: 0000000000000000 R08: ffff88852ca3d400 R09: 0000000000000001
>  [  690.457684] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
>  [  690.459548] R13: ffff88852ca3d540 R14: ffffffff82829628 R15: 0000000000000000
>  [  690.461429] FS:  0000000000000000(0000) GS:ffff88852ca00000(0000) knlGS:0000000000000000
>  [  690.463653] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  [  690.465180] CR2: 00007ff718200b98 CR3: 000000013b4de003 CR4: 0000000000370eb0
>  [  690.467022] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  [  690.468915] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  [  690.470742] Call Trace:
>  [  690.471544]  <TASK>
>  [  690.472242]  flush_smp_call_function_queue+0xe5/0x1e0
>  [  690.473639]  flush_smp_call_function_from_idle+0x5f/0xa0
> 
> 
> For some reason - which I haven't looked into yet -
> net_rps_send_ipi() will eventually call ____napi_schedule()
> only after enabling IRQs; see net_rps_action_and_irq_enable().

Perfect. There is a do_softirq() in flush_smp_call_function_from_idle(),
so it is fine.
PeterZ, any idea how to silence lockdep here? Playing with the
preemption counter would do the trick… I'm worried that by disabling BH
unconditionally here it will need to be done by the upper caller, and
this in turn will force a BH flush on PREEMPT_RT. While that looks
harmless in the idle case, it looks bad for migration_cpu_stop().
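For reference, this is roughly the idle path (kernel/smp.c of this
kernel, abridged), which is why the raised softirq is not lost even
though the assert triggers:

void flush_smp_call_function_from_idle(void)
{
	unsigned long flags;

	if (llist_empty(this_cpu_ptr(&call_single_queue)))
		return;

	local_irq_save(flags);
	/* Runs the queued csd callbacks, e.g. rps_trigger_softirq(). */
	flush_smp_call_function_queue(true);
	if (local_softirq_pending())
		do_softirq();	/* ... and the raised NET_RX runs here. */

	local_irq_restore(flags);
}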

Sebastian
Sebastian Andrzej Siewior March 18, 2022, 10:57 a.m. UTC | #5
On 2022-03-17 19:48:51 [-0600], Jason A. Donenfeld wrote:
> Hi Sebastian,
Hi Jason,

> I stumbled upon this commit when noticing a new failure in WireGuard's
> test suite:
…
> [    1.339289] WARNING: CPU: 0 PID: 11 at ../../../../../../../../net/core/dev.c:4268 __napi_schedule+0xa1/0x300
…
> [    1.352417]  wg_packet_decrypt_worker+0x2ac/0x470
…
> Sounds like wg_packet_decrypt_worker() might be doing something wrong? I
> vaguely recall a thread where you started looking into some things there
> that seemed non-optimal, but I didn't realize there were correctness
> issues. If your patch is correct, and wg_packet_decrypt_worker() is
> wrong, do you have a concrete idea of how we should approach fixing
> wireguard? Or do you want to send a patch for that?

In your case it is "okay" since ptr_ring_consume_bh() will do a BH
disable/enable which forces the softirq to run. It is just not obvious.
What about the following:

diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
index 7b8df406c7737..26ffa3afa542e 100644
--- a/drivers/net/wireguard/receive.c
+++ b/drivers/net/wireguard/receive.c
@@ -502,15 +502,21 @@ void wg_packet_decrypt_worker(struct work_struct *work)
 	struct crypt_queue *queue = container_of(work, struct multicore_worker,
 						 work)->ptr;
 	struct sk_buff *skb;
+	unsigned int packets = 0;
 
-	while ((skb = ptr_ring_consume_bh(&queue->ring)) != NULL) {
+	local_bh_disable();
+	while ((skb = ptr_ring_consume(&queue->ring)) != NULL) {
 		enum packet_state state =
 			likely(decrypt_packet(skb, PACKET_CB(skb)->keypair)) ?
 				PACKET_STATE_CRYPTED : PACKET_STATE_DEAD;
 		wg_queue_enqueue_per_peer_rx(skb, state);
-		if (need_resched())
+		if (!(++packets % 4)) {
+			local_bh_enable();
 			cond_resched();
+			local_bh_disable();
+		}
 	}
+	local_bh_enable();
 }
 
 static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)

It would decrypt 4 packets in a row, and then after local_bh_enable() it
would invoke wg_packet_rx_poll() (assuming so, since it is the only NAPI
handler in wireguard), after which it attempts cond_resched() and then
continues with the next batch.
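For comparison, the detail that makes the current code "okay" is buried
in the _bh variant (include/linux/ptr_ring.h, abridged; the comment is
mine):

static inline void *ptr_ring_consume_bh(struct ptr_ring *r)
{
	void *ptr;

	spin_lock_bh(&r->consumer_lock);
	ptr = __ptr_ring_consume(r);
	/* spin_unlock_bh() ends with local_bh_enable(): on the next
	 * loop iteration this runs the NET_RX softirq raised in between
	 * by wg_queue_enqueue_per_peer_rx(). */
	spin_unlock_bh(&r->consumer_lock);

	return ptr;
}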

> Jason

Sebastian
Jason A. Donenfeld March 18, 2022, 6:19 p.m. UTC | #6
Hi Sebastian,

On Fri, Mar 18, 2022 at 4:57 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
> > Hi Sebastian,
> Hi Jason,
>
> > I stumbled upon this commit when noticing a new failure in WireGuard's
> > test suite:
> …
> > [    1.339289] WARNING: CPU: 0 PID: 11 at ../../../../../../../../net/core/dev.c:4268 __napi_schedule+0xa1/0x300
> …
> > [    1.352417]  wg_packet_decrypt_worker+0x2ac/0x470
> …
> > Sounds like wg_packet_decrypt_worker() might be doing something wrong? I
> > vaguely recall a thread where you started looking into some things there
> > that seemed non-optimal, but I didn't realize there were correctness
> > issues. If your patch is correct, and wg_packet_decrypt_worker() is
> > wrong, do you have a concrete idea of how we should approach fixing
> > wireguard? Or do you want to send a patch for that?
>
> In your case it is "okay" since ptr_ring_consume_bh() will do a BH
> disable/enable which forces the softirq to run. It is just not obvious.

In that case, isn't the lockdep assertion you added wrong and shouldn't
it be reverted? If correct code is hitting it, something seems wrong...

> What
> about the following:
>
> diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
> index 7b8df406c7737..26ffa3afa542e 100644
> --- a/drivers/net/wireguard/receive.c
> +++ b/drivers/net/wireguard/receive.c
> @@ -502,15 +502,21 @@ void wg_packet_decrypt_worker(struct work_struct *work)
>         struct crypt_queue *queue = container_of(work, struct multicore_worker,
>                                                  work)->ptr;
>         struct sk_buff *skb;
> +       unsigned int packets = 0;
>
> -       while ((skb = ptr_ring_consume_bh(&queue->ring)) != NULL) {
> +       local_bh_disable();
> +       while ((skb = ptr_ring_consume(&queue->ring)) != NULL) {
>                 enum packet_state state =
>                         likely(decrypt_packet(skb, PACKET_CB(skb)->keypair)) ?
>                                 PACKET_STATE_CRYPTED : PACKET_STATE_DEAD;
>                 wg_queue_enqueue_per_peer_rx(skb, state);
> -               if (need_resched())
> +               if (!(++packets % 4)) {
> +                       local_bh_enable();
>                         cond_resched();
> +                       local_bh_disable();
> +               }
>         }
> +       local_bh_enable();
>  }
>
>  static void wg_packet_consume_data(struct wg_device *wg, struct sk_buff *skb)
>
> It would decrypt 4 packets in a row, and then after local_bh_enable() it
> would invoke wg_packet_rx_poll() (assuming so, since it is the only NAPI
> handler in wireguard), after which it attempts cond_resched() and then
> continues with the next batch.

I'm willing to consider batching and all sorts of heuristics in there,
though probably for 5.19 rather than 5.18. Indeed there's some
interesting optimization work to be done. But if you want to propose a
change like this, can you send some benchmarks with it, preferably
taken with something like flent so we can see if it negatively affects
latency?

Regards,
Jason
Jakub Kicinski March 18, 2022, 6:59 p.m. UTC | #7
On Fri, 18 Mar 2022 12:19:45 -0600 Jason A. Donenfeld wrote:
> > In your case it is "okay" since ptr_ring_consume_bh() will do a BH
> > disable/enable which forces the softirq to run. It is just not obvious.
> 
> In that case, isn't the lockdep assertion you added wrong and shouldn't
> it be reverted? If correct code is hitting it, something seems wrong...

FWIW I'd lean towards a revert as well; I can't think of a simple
fix that won't require workarounds in callers.
Jason A. Donenfeld March 19, 2022, 12:41 a.m. UTC | #8
Hey Jakub,

On Fri, Mar 18, 2022 at 12:59 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 18 Mar 2022 12:19:45 -0600 Jason A. Donenfeld wrote:
> > > In your case it is "okay" since ptr_ring_consume_bh() will do a BH
> > > disable/enable which forces the softirq to run. It is just not obvious.
> >
> > In that case, isn't the lockdep assertion you added wrong and shouldn't
> > it be reverted? If correct code is hitting it, something seems wrong...
>
> FWIW I'd lean towards a revert as well; I can't think of a simple
> fix that won't require workarounds in callers.

I just got an email from syzbot about this too:
https://lore.kernel.org/wireguard/0000000000000eaff805da869d5b@google.com/

And no word from Sebastian, and now it's the weekend, so I suspect
you're probably right. I'll send in a revert.

Jason

Patch

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 467b94257105e..0cc65d2167015 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -329,6 +329,12 @@  extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
 
 #define lockdep_assert_none_held_once()		\
 	lockdep_assert_once(!current->lockdep_depth)
+/*
+ * Ensure that softirq is handled within the callchain and not delayed and
+ * handled by chance.
+ */
+#define lockdep_assert_softirq_will_run()	\
+	lockdep_assert_once(hardirq_count() | softirq_count())
 
 #define lockdep_recursing(tsk)	((tsk)->lockdep_recursion)
 
@@ -414,6 +420,7 @@  extern int lockdep_is_held(const void *);
 #define lockdep_assert_held_read(l)		do { (void)(l); } while (0)
 #define lockdep_assert_held_once(l)		do { (void)(l); } while (0)
 #define lockdep_assert_none_held_once()	do { } while (0)
+#define lockdep_assert_softirq_will_run()	do { } while (0)
 
 #define lockdep_recursing(tsk)			(0)
 
diff --git a/net/core/dev.c b/net/core/dev.c
index ba69ddf85af6b..dbda85879f6c1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4267,6 +4267,9 @@  static inline void ____napi_schedule(struct softnet_data *sd,
 {
 	struct task_struct *thread;
 
+	lockdep_assert_softirq_will_run();
+	lockdep_assert_irqs_disabled();
+
 	if (test_bit(NAPI_STATE_THREADED, &napi->state)) {
 		/* Paired with smp_mb__before_atomic() in
 		 * napi_enable()/dev_set_threaded().
@@ -4874,7 +4877,7 @@  int __netif_rx(struct sk_buff *skb)
 {
 	int ret;
 
-	lockdep_assert_once(hardirq_count() | softirq_count());
+	lockdep_assert_softirq_will_run();
 
 	trace_netif_rx_entry(skb);
 	ret = netif_rx_internal(skb);
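
For reference, a minimal sketch of a caller that satisfies both new
asserts (the driver and its handler are hypothetical; only the NAPI API
is real):

#include <linux/interrupt.h>
#include <linux/netdevice.h>

static irqreturn_t foo_irq_handler(int irq, void *dev_id)
{
	struct napi_struct *napi = dev_id;

	/* Hard-IRQ context: interrupts are disabled and hardirq_count()
	 * is non-zero, so the NET_RX softirq raised below is guaranteed
	 * to be processed on irq_exit(). Both asserts hold. */
	if (napi_schedule_prep(napi))
		__napi_schedule(napi);

	return IRQ_HANDLED;
}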