[net-next] r8169: disable ASPM in case of tx timeout

Message ID 1847c5aa-39ff-4574-b1c5-38ac5f16e594@gmail.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [net-next] r8169: disable ASPM in case of tx timeout

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           success  CCed 7 of 7 maintainers
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               warning  WARNING: line length of 84 exceeds 80 columns; WARNING: line length of 94 exceeds 80 columns
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Heiner Kallweit Dec. 28, 2022, 9:30 p.m. UTC
There are still single reports of systems where ASPM incompatibilities
cause tx timeouts. It's not clear whom to blame, so let's disable
ASPM in case of a tx timeout.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
---
 drivers/net/ethernet/realtek/r8169_main.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Stephen Hemminger Dec. 28, 2022, 10:05 p.m. UTC | #1
On Wed, 28 Dec 2022 22:30:56 +0100
Heiner Kallweit <hkallweit1@gmail.com> wrote:

> There are still single reports of systems where ASPM incompatibilities
> cause tx timeouts. It's not clear whom to blame, so let's disable
> ASPM in case of a tx timeout.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Wouldn't a log message be appropriate here?

	netdev_WARN_ONCE(tp->dev, "ASPM disabled on Tx timeout\n");

Heiner Kallweit Dec. 29, 2022, 11:30 a.m. UTC | #2
On 28.12.2022 23:05, Stephen Hemminger wrote:
> On Wed, 28 Dec 2022 22:30:56 +0100
> Heiner Kallweit <hkallweit1@gmail.com> wrote:
> 
>> There are still single reports of systems where ASPM incompatibilities
>> cause tx timeouts. It's not clear whom to blame, so let's disable
>> ASPM in case of a tx timeout.
>>
>> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
> 
> Wouldn't a log message be appropriate here?
> 
> 	netdev_WARN_ONCE(tp->dev, "ASPM disabled on Tx timeout\n");

Right, that's something I could add. Message will be printed only
if return code of pci_disable_link_state() indicates success.
And I'd use netdev_warn_once() instead of netdev_WARN_ONCE(),
because net core prints a stack trace already in case of tx timeout.
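
For illustration only (this is not part of the posted patch), the flow
described above can be modeled in userspace roughly as follows. It is a
sketch under stated assumptions: struct model_priv, the model_* helpers and
the aspm_disable_ret field are hypothetical stand-ins, not driver or kernel
APIs. In the driver the corresponding calls would be pci_disable_link_state()
and netdev_warn_once(), and the kernel's "once" state is kept per call site
rather than per device as it is here.

```c
/* Userspace model of the proposed rtl_task() flow; NOT kernel code. */
#include <stdbool.h>
#include <stdio.h>

enum rtl_flag {
	RTL_FLAG_TASK_ENABLED = 0,
	RTL_FLAG_TASK_RESET_PENDING,
	RTL_FLAG_TASK_TX_TIMEOUT,
	RTL_FLAG_MAX
};

/* Hypothetical stand-in for struct rtl8169_private. */
struct model_priv {
	unsigned long flags;
	int aspm_disable_ret;	/* value the stubbed PCI call returns */
	bool warned;		/* models the "once" state (assumption) */
	int warnings;		/* number of messages actually printed */
	int resets;		/* number of simulated chip resets */
};

/* Non-atomic stand-in for test_and_clear_bit(). */
static bool model_test_and_clear_bit(int nr, unsigned long *addr)
{
	bool was_set = (*addr >> nr) & 1UL;

	*addr &= ~(1UL << nr);
	return was_set;
}

/* Stand-in for pci_disable_link_state(): 0 means success. */
static int model_disable_link_state(struct model_priv *tp)
{
	return tp->aspm_disable_ret;
}

/* Stand-in for netdev_warn_once(): prints at most one message. */
static void model_warn_once(struct model_priv *tp, const char *msg)
{
	if (tp->warned)
		return;
	tp->warned = true;
	tp->warnings++;
	fputs(msg, stderr);
}

void model_rtl_task(struct model_priv *tp)
{
	if (!(tp->flags & (1UL << RTL_FLAG_TASK_ENABLED)))
		return;

	if (model_test_and_clear_bit(RTL_FLAG_TASK_TX_TIMEOUT, &tp->flags)) {
		/* Log only if disabling ASPM actually succeeded. */
		if (!model_disable_link_state(tp))
			model_warn_once(tp, "ASPM disabled on Tx timeout\n");
		goto reset;
	}

	if (model_test_and_clear_bit(RTL_FLAG_TASK_RESET_PENDING, &tp->flags)) {
reset:
		tp->resets++;	/* rtl_reset_work() + netif_wake_queue() */
	}
}
```

In this model a tx timeout always ends in a reset, whether or not ASPM
could be disabled; the message depends only on the return code.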

Patch

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index a9dcc98b6..7b58da9aa 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -576,6 +576,7 @@  struct rtl8169_tc_offsets {
 enum rtl_flag {
 	RTL_FLAG_TASK_ENABLED = 0,
 	RTL_FLAG_TASK_RESET_PENDING,
+	RTL_FLAG_TASK_TX_TIMEOUT,
 	RTL_FLAG_MAX
 };
 
@@ -3931,7 +3932,7 @@  static void rtl8169_tx_timeout(struct net_device *dev, unsigned int txqueue)
 {
 	struct rtl8169_private *tp = netdev_priv(dev);
 
-	rtl_schedule_task(tp, RTL_FLAG_TASK_RESET_PENDING);
+	rtl_schedule_task(tp, RTL_FLAG_TASK_TX_TIMEOUT);
 }
 
 static int rtl8169_tx_map(struct rtl8169_private *tp, const u32 *opts, u32 len,
@@ -4532,7 +4533,14 @@  static void rtl_task(struct work_struct *work)
 	    !test_bit(RTL_FLAG_TASK_ENABLED, tp->wk.flags))
 		goto out_unlock;
 
+	if (test_and_clear_bit(RTL_FLAG_TASK_TX_TIMEOUT, tp->wk.flags)) {
+		/* ASPM compatibility issues are a typical reason for tx timeouts */
+		pci_disable_link_state(tp->pci_dev, PCIE_LINK_STATE_L1 | PCIE_LINK_STATE_L0S);
+		goto reset;
+	}
+
 	if (test_and_clear_bit(RTL_FLAG_TASK_RESET_PENDING, tp->wk.flags)) {
+reset:
 		rtl_reset_work(tp);
 		netif_wake_queue(tp->dev);
 	}