e1000e: Link flap workaround option for false IRP events

Message ID	20250226194422.1030419-1-mpearson-lenovo@squebb.ca (mailing list archive)
State	Changes Requested
Delegated to:	Netdev Maintainers
Headers	show Received: from fout-b1-smtp.messagingengine.com (fout-b1-smtp.messagingengine.com [202.12.124.144]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4FBD6243952; Wed, 26 Feb 2025 19:44:31 +0000 (UTC) Feedback-ID: ibe194615:Fastmail From: Mark Pearson <mpearson-lenovo@squebb.ca> To: mpearson-lenovo@squebb.ca Cc: anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com, andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] e1000e: Link flap workaround option for false IRP events Date: Wed, 26 Feb 2025 14:44:12 -0500 Message-ID: <20250226194422.1030419-1-mpearson-lenovo@squebb.ca> In-Reply-To: <mpearson-lenovo@squebb.ca> References: <mpearson-lenovo@squebb.ca> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	e1000e: Link flap workaround option for false IRP events \| expand e1000e: Link flap workaround option for false IRP events

Message ID

20250226194422.1030419-1-mpearson-lenovo@squebb.ca (mailing list archive)

State

Changes Requested

Delegated to:

Netdev Maintainers

Headers

Feedback-ID: ibe194615:Fastmail
From: Mark Pearson <mpearson-lenovo@squebb.ca>
To: mpearson-lenovo@squebb.ca
Cc: anthony.l.nguyen@intel.com,
	przemyslaw.kitszel@intel.com,
	andrew+netdev@lunn.ch,
	davem@davemloft.net,
	edumazet@google.com,
	kuba@kernel.org,
	pabeni@redhat.com,
	intel-wired-lan@lists.osuosl.org,
	netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH] e1000e: Link flap workaround option for false IRP events
Date: Wed, 26 Feb 2025 14:44:12 -0500
Message-ID: <20250226194422.1030419-1-mpearson-lenovo@squebb.ca>
In-Reply-To: <mpearson-lenovo@squebb.ca>
References: <mpearson-lenovo@squebb.ca>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

e1000e: Link flap workaround option for false IRP events | expand

Context	Check	Description
netdev/series_format	warning	Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection	success	Guessed tree name to be net-next
netdev/ynl	success	Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 0 this patch: 0
netdev/build_tools	success	No tools touched, skip
netdev/cc_maintainers	success	CCed 8 of 8 maintainers
netdev/build_clang	success	Errors and warnings before: 0 this patch: 0
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	fail	Found: 'module_param' was: 0 now: 1
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 0 this patch: 0
netdev/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 31 lines checked
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	success	Errors and warnings before: 35 this patch: 35
netdev/source_inline	success	Was 0 now: 0
netdev/contest	success	net-next-2025-02-26--21-00 (tests: 895)

Context

Check

Description

netdev/series_format

warning

Single patches do not need cover letters; Target tree name not specified in the subject

netdev/tree_selection

success

Guessed tree name to be net-next

netdev/ynl

success

Generated files up to date; no warnings/errors; no diff in generated;

netdev/fixes_present

success

Fixes tag not required for -next series

netdev/header_inline

success

No static functions without inline keyword in header files

netdev/build_32bit

success

Errors and warnings before: 0 this patch: 0

netdev/build_tools

success

No tools touched, skip

netdev/cc_maintainers

success

CCed 8 of 8 maintainers

netdev/build_clang

success

Errors and warnings before: 0 this patch: 0

netdev/verify_signedoff

success

Signed-off-by tag matches author and committer

netdev/deprecated_api

fail

Found: 'module_param' was: 0 now: 1

netdev/check_selftest

success

No net selftest shell script

netdev/verify_fixes

success

No Fixes tag

netdev/build_allmodconfig_warn

success

Errors and warnings before: 0 this patch: 0

netdev/checkpatch

success

total: 0 errors, 0 warnings, 0 checks, 31 lines checked

netdev/build_clang_rust

success

No Rust files in patch. Skipping build

netdev/kdoc

success

Errors and warnings before: 35 this patch: 35

netdev/source_inline

success

Was 0 now: 0

netdev/contest

success

net-next-2025-02-26--21-00 (tests: 895)

Commit Message

Mark Pearson Feb. 26, 2025, 7:44 p.m. UTC

Issue is seen on some Lenovo desktop workstations where there
is a false IRP event which triggers a link flap.
Condition is rare and only seen on networks where link speed
may differ along the path between nodes (e.g 10M/100M)

Intel are not able to determine root cause but provided a
workaround that does fix the issue. Tested extensively at Lenovo.

Adding a module option to enable this workaround for users
who are impacted by this issue.

Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Andrew Lunn Feb. 26, 2025, 10:52 p.m. UTC | #1

On Wed, Feb 26, 2025 at 02:44:12PM -0500, Mark Pearson wrote:
> Issue is seen on some Lenovo desktop workstations where there
> is a false IRP event which triggers a link flap.
> Condition is rare and only seen on networks where link speed
> may differ along the path between nodes (e.g 10M/100M)
> 
> Intel are not able to determine root cause but provided a
> workaround that does fix the issue. Tested extensively at Lenovo.
> 
> Adding a module option to enable this workaround for users
> who are impacted by this issue.

Why is a module option needed? Does the workaround itself introduce
issues? Please describe those issues?

In general, module options are not liked. So please include in the
commit message why a module option is the only option.
 
> Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 286155efcedf..06774fb4b2dd 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -37,6 +37,10 @@ static int debug = -1;
>  module_param(debug, int, 0);
>  MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
>  
> +static int false_irp_workaround;
> +module_param(false_irp_workaround, int, 0);
> +MODULE_PARM_DESC(false_irp_workaround, "Enable workaround for rare false IRP event causing link flap");
> +
>  static const struct e1000_info *e1000_info_tbl[] = {
>  	[board_82571]		= &e1000_82571_info,
>  	[board_82572]		= &e1000_82572_info,
> @@ -1757,6 +1761,21 @@ static irqreturn_t e1000_intr_msi(int __always_unused irq, void *data)
>  	/* read ICR disables interrupts using IAM */
>  	if (icr & E1000_ICR_LSC) {
>  		hw->mac.get_link_status = true;
> +
> +		/*
> +		 * False IRP workaround
> +		 * Issue seen on Lenovo P5 and P7 workstations where if there
> +		 * are different link speeds in the network a false IRP event
> +		 * is received, leading to a link flap.
> +		 * Intel unable to determine root cause. This read prevents
> +		 * the issue occurring
> +		 */
> +		if (false_irp_workaround) {
> +			u16 phy_data;
> +
> +			e1e_rphy(hw, PHY_REG(772, 26), &phy_data);

Please add some #define for these magic numbers, so we have some idea
what PHY register you are actually reading. That in itself might help
explain how the workaround actually works.

    Andrew

---
pw-bot: cr

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 286155efcedf..06774fb4b2dd 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -37,6 +37,10 @@  static int debug = -1;
 module_param(debug, int, 0);
 MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
 
+static int false_irp_workaround;
+module_param(false_irp_workaround, int, 0);
+MODULE_PARM_DESC(false_irp_workaround, "Enable workaround for rare false IRP event causing link flap");
+
 static const struct e1000_info *e1000_info_tbl[] = {
 	[board_82571]		= &e1000_82571_info,
 	[board_82572]		= &e1000_82572_info,
@@ -1757,6 +1761,21 @@  static irqreturn_t e1000_intr_msi(int __always_unused irq, void *data)
 	/* read ICR disables interrupts using IAM */
 	if (icr & E1000_ICR_LSC) {
 		hw->mac.get_link_status = true;
+
+		/*
+		 * False IRP workaround
+		 * Issue seen on Lenovo P5 and P7 workstations where if there
+		 * are different link speeds in the network a false IRP event
+		 * is received, leading to a link flap.
+		 * Intel unable to determine root cause. This read prevents
+		 * the issue occurring
+		 */
+		if (false_irp_workaround) {
+			u16 phy_data;
+
+			e1e_rphy(hw, PHY_REG(772, 26), &phy_data);
+		}
+
 		/* ICH8 workaround-- Call gig speed drop workaround on cable
 		 * disconnect (LSC) before accessing any PHY registers
 		 */

e1000e: Link flap workaround option for false IRP events

Checks

Commit Message

Comments

Patch