diff mbox series

[net] net: core: avoid napi_disable to cause deadlock

Message ID 20210318080450.38893-1-ljp@linux.ibm.com (mailing list archive)
State Rejected
Delegated to: Netdev Maintainers
Headers show
Series [net] net: core: avoid napi_disable to cause deadlock | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers success CCed 11 of 11 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 10 this patch: 10
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 24 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 10 this patch: 10
netdev/header_inline success Link

Commit Message

Lijun Pan March 18, 2021, 8:04 a.m. UTC
There are chances that napi_disable is called twice by NIC driver.
This could generate deadlock. For example,
the first napi_disable will spin until NAPI_STATE_SCHED is cleared
by napi_complete_done, then set it again.
When napi_disable is called the second time, it will loop infinitely
because no dev->poll will be running to clear NAPI_STATE_SCHED.

CPU0				CPU1
 napi_disable
  test_and_set_bit
  (napi_complete_done clears
   NAPI_STATE_SCHED, ret 0,
   and set NAPI_STATE_SCHED)
				napi_disable
				 test_and_set_bit
				 (ret 1 and loop infinitely because
				  no napi instance is scheduled to
				  clear NAPI_STATE_SCHED bit)

Checking the napi state bit to make sure if napi is already disabled,
exit the call early enough to avoid spinning infinitely.

Fixes: bea3348eef27 ("[NET]: Make NAPI polling independent of struct net_device objects.")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
---
 net/core/dev.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

Comments

Eric Dumazet March 18, 2021, 9:22 a.m. UTC | #1
On Thu, Mar 18, 2021 at 9:04 AM Lijun Pan <ljp@linux.ibm.com> wrote:
>
> There are chances that napi_disable is called twice by NIC driver.


???

Please fix the buggy driver, or explain why it can not be fixed.


> This could generate deadlock. For example,
> the first napi_disable will spin until NAPI_STATE_SCHED is cleared
> by napi_complete_done, then set it again.
> When napi_disable is called the second time, it will loop infinitely
> because no dev->poll will be running to clear NAPI_STATE_SCHED.
>
> CPU0                            CPU1
>  napi_disable
>   test_and_set_bit
>   (napi_complete_done clears
>    NAPI_STATE_SCHED, ret 0,
>    and set NAPI_STATE_SCHED)
>                                 napi_disable
>                                  test_and_set_bit
>                                  (ret 1 and loop infinitely because
>                                   no napi instance is scheduled to
>                                   clear NAPI_STATE_SCHED bit)
>
> Checking the napi state bit to make sure if napi is already disabled,
> exit the call early enough to avoid spinning infinitely.
>
> Fixes: bea3348eef27 ("[NET]: Make NAPI polling independent of struct net_device objects.")
> Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
> ---
>  net/core/dev.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
David Miller March 18, 2021, 6:23 p.m. UTC | #2
From: Eric Dumazet <edumazet@google.com>
Date: Thu, 18 Mar 2021 10:22:23 +0100

> On Thu, Mar 18, 2021 at 9:04 AM Lijun Pan <ljp@linux.ibm.com> wrote:
>>
>> There are chances that napi_disable is called twice by NIC driver.
> 
> 
> ???
> 
> Please fix the buggy driver, or explain why it can not be fixed.

Agreed,.
diff mbox series

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 6c5967e80132..eb3c0ddd4fd7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6809,6 +6809,24 @@  EXPORT_SYMBOL(netif_napi_add);
 void napi_disable(struct napi_struct *n)
 {
 	might_sleep();
+
+	/* make sure napi_disable() runs only once,
+	 * When napi is disabled, the state bits are like:
+	 * NAPI_STATE_SCHED (set by previous napi_disable)
+	 * NAPI_STATE_NPSVC (set by previous napi_disable)
+	 * NAPI_STATE_DISABLE (cleared by previous napi_disable)
+	 * NAPI_STATE_PREFER_BUSY_POLL (cleared by previous napi_complete_done)
+	 * NAPI_STATE_MISSED (cleared by previous napi_complete_done)
+	 */
+
+	if (napi_disable_pending(n))
+		return;
+	if (test_bit(NAPI_STATE_SCHED, &n->state) &&
+	    test_bit(NAPI_STATE_NPSVC, &n->state) &&
+	    !test_bit(NAPI_STATE_MISSED, &n->state) &&
+	    !test_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state))
+		return;
+
 	set_bit(NAPI_STATE_DISABLE, &n->state);
 
 	while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))