
[2/4] amd-xgbe: Fix NETDEV WATCHDOG transmit queue timeout warning

Message ID 20210212180010.221129-3-Shyam-sundar.S-k@amd.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series Bug fixes to amd-xgbe driver

Checks

Context                          Check    Description
netdev/cover_letter              success  Link
netdev/fixes_present             success  Link
netdev/patch_count               success  Link
netdev/tree_selection            success  Guessed tree name to be net-next
netdev/subject_prefix            warning  Target tree name not specified in the subject
netdev/cc_maintainers            success  CCed 4 of 4 maintainers
netdev/source_inline             success  Was 0 now: 0
netdev/verify_signedoff          success  Link
netdev/module_param              success  Was 0 now: 0
netdev/build_32bit               success  Errors and warnings before: 0 this patch: 0
netdev/kdoc                      success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes              success  Link
netdev/checkpatch                success  total: 0 errors, 0 warnings, 0 checks, 14 lines checked
netdev/build_allmodconfig_warn   success  Errors and warnings before: 0 this patch: 0
netdev/header_inline             success  Link
netdev/stable                    success  Stable not CCed

Commit Message

Shyam Sundar S K Feb. 12, 2021, 6 p.m. UTC
Current driver calls the netif_carrier_off() during the later point in
time to tear down the link which causes the netdev watchdog to timeout.

Calling netif_carrier_off() immediately after netif_tx_stop_all_queues()
would avoids the warning.

 ------------[ cut here ]------------
 NETDEV WATCHDOG: enp3s0f2 (amd-xgbe): transmit queue 0 timed out
 WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x20d/0x220
 Modules linked in: amd_xgbe(E)  amd-xgbe 0000:03:00.2 enp3s0f2: Link is Down
 CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E
 Hardware name: AMD Bilby-RV2/Bilby-RV2, BIOS RBB1202A 10/18/2019
 RIP: 0010:dev_watchdog+0x20d/0x220
 Code: 00 49 63 4e e0 eb 92 4c 89 e7 c6 05 c6 e2 c1 00 01 e8 e7 ce fc ff 89 d9 48
 RSP: 0018:ffff90cfc28c3e88 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
 RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff90cfc28d63c0
 RBP: ffff90cfb977845c R08: 0000000000000050 R09: 0000000000196018
 R10: ffff90cfc28c3ef8 R11: 0000000000000000 R12: ffff90cfb9778000
 R13: 0000000000000003 R14: ffff90cfb9778480 R15: 0000000000000010
 FS:  0000000000000000(0000) GS:ffff90cfc28c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f240ff2d9d0 CR3: 00000001e3e0a000 CR4: 00000000003406e0
 Call Trace:
  <IRQ>
  ? pfifo_fast_reset+0x100/0x100
  call_timer_fn+0x2b/0x130
  run_timer_softirq+0x3e8/0x440
  ? enqueue_hrtimer+0x39/0x90

Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
---
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c  | 1 +
 drivers/net/ethernet/amd/xgbe/xgbe-mdio.c | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

Comments

Tom Lendacky Feb. 12, 2021, 6:48 p.m. UTC | #1
On 2/12/21 12:00 PM, Shyam Sundar S K wrote:
> Current driver calls the netif_carrier_off() during the later point in
> time to tear down the link which causes the netdev watchdog to timeout.

This is a bit confusing...  how about:

The current driver calls netif_carrier_off() late in the link tear down 
which can result in a netdev watchdog timeout.

> 
> Calling netif_carrier_off() immediately after netif_tx_stop_all_queues()
> would avoids the warning.

s/would//

> 
>   ------------[ cut here ]------------
>   NETDEV WATCHDOG: enp3s0f2 (amd-xgbe): transmit queue 0 timed out
>   WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x20d/0x220
>   Modules linked in: amd_xgbe(E)  amd-xgbe 0000:03:00.2 enp3s0f2: Link is Down
>   CPU: 3 PID: 0 Comm: swapper/3 Tainted: G            E
>   Hardware name: AMD Bilby-RV2/Bilby-RV2, BIOS RBB1202A 10/18/2019
>   RIP: 0010:dev_watchdog+0x20d/0x220
>   Code: 00 49 63 4e e0 eb 92 4c 89 e7 c6 05 c6 e2 c1 00 01 e8 e7 ce fc ff 89 d9 48
>   RSP: 0018:ffff90cfc28c3e88 EFLAGS: 00010286
>   RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
>   RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff90cfc28d63c0
>   RBP: ffff90cfb977845c R08: 0000000000000050 R09: 0000000000196018
>   R10: ffff90cfc28c3ef8 R11: 0000000000000000 R12: ffff90cfb9778000
>   R13: 0000000000000003 R14: ffff90cfb9778480 R15: 0000000000000010
>   FS:  0000000000000000(0000) GS:ffff90cfc28c0000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00007f240ff2d9d0 CR3: 00000001e3e0a000 CR4: 00000000003406e0
>   Call Trace:
>    <IRQ>
>    ? pfifo_fast_reset+0x100/0x100
>    call_timer_fn+0x2b/0x130
>    run_timer_softirq+0x3e8/0x440
>    ? enqueue_hrtimer+0x39/0x90
> 
> Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>

Same comment about Co-developed-by: here as previous patch.

With the above comments addressed,

Acked-by: Tom Lendacky <thomas.lendacky@amd.com>

> ---
>   drivers/net/ethernet/amd/xgbe/xgbe-drv.c  | 1 +
>   drivers/net/ethernet/amd/xgbe/xgbe-mdio.c | 1 -
>   2 files changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> index 2709a2db5657..395eb0b52680 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> @@ -1368,6 +1368,7 @@ static void xgbe_stop(struct xgbe_prv_data *pdata)
>   		return;
>   
>   	netif_tx_stop_all_queues(netdev);
> +	netif_carrier_off(pdata->netdev);
>   
>   	xgbe_stop_timers(pdata);
>   	flush_workqueue(pdata->dev_workqueue);
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
> index 93ef5a30cb8d..19ee4db0156d 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
> @@ -1396,7 +1396,6 @@ static void xgbe_phy_stop(struct xgbe_prv_data *pdata)
>   	pdata->phy_if.phy_impl.stop(pdata);
>   
>   	pdata->phy.link = 0;
> -	netif_carrier_off(pdata->netdev);
>   
>   	xgbe_phy_adjust_link(pdata);
>   }
>

Patch

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index 2709a2db5657..395eb0b52680 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1368,6 +1368,7 @@  static void xgbe_stop(struct xgbe_prv_data *pdata)
 		return;
 
 	netif_tx_stop_all_queues(netdev);
+	netif_carrier_off(pdata->netdev);
 
 	xgbe_stop_timers(pdata);
 	flush_workqueue(pdata->dev_workqueue);
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
index 93ef5a30cb8d..19ee4db0156d 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-mdio.c
@@ -1396,7 +1396,6 @@  static void xgbe_phy_stop(struct xgbe_prv_data *pdata)
 	pdata->phy_if.phy_impl.stop(pdata);
 
 	pdata->phy.link = 0;
-	netif_carrier_off(pdata->netdev);
 
 	xgbe_phy_adjust_link(pdata);
 }
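
The ordering issue described in the commit message can be illustrated with a small userspace model. This is only a sketch, not driver code: the struct, its fields, the tick values, and the `model_*` functions are all hypothetical names invented for illustration. It assumes the watchdog reports a timeout only while the carrier is still up, a transmit queue is stopped, and more than the watchdog timeout has elapsed since the last transmit, which is roughly what dev_watchdog() checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the state the TX watchdog inspects. */
struct model_dev {
	bool carrier_ok;              /* cleared by netif_carrier_off() */
	bool queue_stopped;           /* set by netif_tx_stop_all_queues() */
	unsigned long trans_start;    /* tick of the last transmit */
	unsigned long watchdog_timeo; /* timeout, in ticks */
};

/* Returns true if the modelled watchdog would warn at tick `now`. */
static bool model_watchdog_fires(const struct model_dev *dev, unsigned long now)
{
	assert(dev != NULL);

	/* With the carrier down, stopped queues are not treated as hung. */
	if (!dev->carrier_ok)
		return false;

	return dev->queue_stopped &&
	       (now - dev->trans_start) > dev->watchdog_timeo;
}

/*
 * Models the stop path. With carrier_off_early set, the carrier is dropped
 * right after the queues are stopped (the patched order). Otherwise it is
 * dropped only after `teardown_ticks` of teardown work (the old order).
 * Returns true if the watchdog would have warned during teardown.
 */
static bool model_stop_triggers_warning(bool carrier_off_early,
					 unsigned long teardown_ticks)
{
	struct model_dev dev = {
		.carrier_ok = true,
		.queue_stopped = false,
		.trans_start = 0,
		.watchdog_timeo = 5, /* assumed value for the model */
	};
	unsigned long now;

	dev.queue_stopped = true;        /* netif_tx_stop_all_queues() */
	if (carrier_off_early)
		dev.carrier_ok = false;  /* netif_carrier_off() moved here */

	/* Timers stopped, workqueue flushed, PHY stopped, ... */
	for (now = 1; now <= teardown_ticks; now++) {
		if (model_watchdog_fires(&dev, now))
			return true;
	}

	dev.carrier_ok = false;          /* late netif_carrier_off() */
	return false;
}
```

In this model, `model_stop_triggers_warning(false, 10)` reports a warning because the teardown outlasts the assumed 5-tick timeout while the carrier is still up, whereas `model_stop_triggers_warning(true, 10)` does not.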