diff mbox series

net: phy: disable eee due to errata on various KSZ switches

Message ID 20241004213235.3353398-1-tharvey@gateworks.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series net: phy: disable eee due to errata on various KSZ switches | expand

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Tim Harvey Oct. 4, 2024, 9:32 p.m. UTC
The well-known errata regarding EEE not being functional on various KSZ
switches has been refactored a few times. Recently the refactoring has
excluded several switches that the errata should also apply to.

Disable EEE for these switches based on errata.

Signed-off-by: Tim Harvey <tharvey@gateworks.com>
---
 drivers/net/dsa/microchip/ksz_common.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

Comments

Andrew Lunn Oct. 5, 2024, 4:46 p.m. UTC | #1
On Fri, Oct 04, 2024 at 02:32:35PM -0700, Tim Harvey wrote:
> The well-known errata regarding EEE not being functional on various KSZ
> switches has been refactored a few times. Recently the refactoring has
> excluded several switches that the errata should also apply to.

Does the commit message say why?

Does this need a Fixes: tag?

	Andrew
Tim Harvey Oct. 7, 2024, 4:38 p.m. UTC | #2
On Sat, Oct 5, 2024 at 9:46 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Fri, Oct 04, 2024 at 02:32:35PM -0700, Tim Harvey wrote:
> > The well-known errata regarding EEE not being functional on various KSZ
> > switches has been refactored a few times. Recently the refactoring has
> > excluded several switches that the errata should also apply to.
>
> Does the commit message say why?
>
> Does this need a Fixes: tag?
>

Hi Andrew,

Good question. I couldn't really figure out what fixes tag would be
appropriate as this code has changed a few times and broken in strange
ways. Here's a history as best I can tell:

The original workaround for the errata was applied with a register
write to manually disable the EEE feature in MMD 7:60 which was being
applied for KSZ9477/KSZ9897/KSZ9567 switch ID's.

Then came commit ("26dd2974c5b5 net: phy: micrel: Move KSZ9477 errata
fixes to PHY driver") and commit ("6068e6d7ba50 net: dsa: microchip:
remove KSZ9477 PHY errata handling") which moved the errata from the
switch driver to the PHY driver but only for PHY_ID_KSZ9477 (PHY ID)
however that PHY code was dead code because an entry was never added
for PHY_ID_KSZ9477 via MODULE_DEVICE_TABLE. So even if we add a
'Fixes: 6068e6d7ba50' it would not be fixed.

This was apparently realized much later and commit ("54a4e5c16382 net:
phy: micrel: add Microchip KSZ 9477 to the device table") added the
PHY_ID_KSZ9477 to the PHY driver. I believe the code was proper at
this point.

Later commit ("6149db4997f5 net: phy: micrel: fix KSZ9477 PHY issues
after suspend/resume") breaks this again for all but KSZ9897 by only
applying the errata for that PHY ID.

The most recent time this was affected was with commit ("08c6d8bae48c
net: phy: Provide Module 4 KSZ9477 errata (DS80000754C)") which
removes the blatant register write to MMD 7:60 and replaces it by
setting phydev->eee_broken_modes = -1 so that the generic phy-c45 code
disables EEE but this is only done for the KSZ9477_CHIP_ID (Switch ID)
so its still broken at this point for the other switches that have
this errata.

So at this point, the only commit that my patch would apply over is
the most recent 08c6d8bae48c but that wouldn't fix any of the previous
issues and it would be unclear what switch was broken at what point in
time.

Best Regards,

Tim
Andrew Lunn Oct. 7, 2024, 4:57 p.m. UTC | #3
On Mon, Oct 07, 2024 at 09:38:59AM -0700, Tim Harvey wrote:
> On Sat, Oct 5, 2024 at 9:46 AM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Fri, Oct 04, 2024 at 02:32:35PM -0700, Tim Harvey wrote:
> > > The well-known errata regarding EEE not being functional on various KSZ
> > > switches has been refactored a few times. Recently the refactoring has
> > > excluded several switches that the errata should also apply to.
> >
> > Does the commit message say why?
> >
> > Does this need a Fixes: tag?
> >
> 
> Hi Andrew,
> 
> Good question. I couldn't really figure out what fixes tag would be
> appropriate as this code has changed a few times and broken in strange
> ways. Here's a history as best I can tell:
> 
> The original workaround for the errata was applied with a register
> write to manually disable the EEE feature in MMD 7:60 which was being
> applied for KSZ9477/KSZ9897/KSZ9567 switch ID's.
> 
> Then came commit ("26dd2974c5b5 net: phy: micrel: Move KSZ9477 errata
> fixes to PHY driver") and commit ("6068e6d7ba50 net: dsa: microchip:
> remove KSZ9477 PHY errata handling") which moved the errata from the
> switch driver to the PHY driver but only for PHY_ID_KSZ9477 (PHY ID)
> however that PHY code was dead code because an entry was never added
> for PHY_ID_KSZ9477 via MODULE_DEVICE_TABLE. So even if we add a
> 'Fixes: 6068e6d7ba50' it would not be fixed.
> 
> This was apparently realized much later and commit ("54a4e5c16382 net:
> phy: micrel: add Microchip KSZ 9477 to the device table") added the
> PHY_ID_KSZ9477 to the PHY driver. I believe the code was proper at
> this point.
> 
> Later commit ("6149db4997f5 net: phy: micrel: fix KSZ9477 PHY issues
> after suspend/resume") breaks this again for all but KSZ9897 by only
> applying the errata for that PHY ID.
> 
> The most recent time this was affected was with commit ("08c6d8bae48c
> net: phy: Provide Module 4 KSZ9477 errata (DS80000754C)") which
> removes the blatant register write to MMD 7:60 and replaces it by
> setting phydev->eee_broken_modes = -1 so that the generic phy-c45 code
> disables EEE but this is only done for the KSZ9477_CHIP_ID (Switch ID)
> so its still broken at this point for the other switches that have
> this errata.
> 
> So at this point, the only commit that my patch would apply over is
> the most recent 08c6d8bae48c but that wouldn't fix any of the previous
> issues and it would be unclear what switch was broken at what point in
> time.

O.K, so its a mess :-(

Lets look at this from a different direction. Which stable kernels do
you actually care about? Is 6.6 enough for you? Do you need 6.1? 4.19?

You should use a Fixed tag which goes back far enough for you. The
patch itself might not apply that far back, but once you get it merged
and backported as far as it does go, you can submit ported versions
for older kernel, referencing the original fix commit hash number.

	Andrew
Tim Harvey Oct. 7, 2024, 5:05 p.m. UTC | #4
On Mon, Oct 7, 2024 at 9:57 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Mon, Oct 07, 2024 at 09:38:59AM -0700, Tim Harvey wrote:
> > On Sat, Oct 5, 2024 at 9:46 AM Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > On Fri, Oct 04, 2024 at 02:32:35PM -0700, Tim Harvey wrote:
> > > > The well-known errata regarding EEE not being functional on various KSZ
> > > > switches has been refactored a few times. Recently the refactoring has
> > > > excluded several switches that the errata should also apply to.
> > >
> > > Does the commit message say why?
> > >
> > > Does this need a Fixes: tag?
> > >
> >
> > Hi Andrew,
> >
> > Good question. I couldn't really figure out what fixes tag would be
> > appropriate as this code has changed a few times and broken in strange
> > ways. Here's a history as best I can tell:
> >
> > The original workaround for the errata was applied with a register
> > write to manually disable the EEE feature in MMD 7:60 which was being
> > applied for KSZ9477/KSZ9897/KSZ9567 switch ID's.
> >
> > Then came commit ("26dd2974c5b5 net: phy: micrel: Move KSZ9477 errata
> > fixes to PHY driver") and commit ("6068e6d7ba50 net: dsa: microchip:
> > remove KSZ9477 PHY errata handling") which moved the errata from the
> > switch driver to the PHY driver but only for PHY_ID_KSZ9477 (PHY ID)
> > however that PHY code was dead code because an entry was never added
> > for PHY_ID_KSZ9477 via MODULE_DEVICE_TABLE. So even if we add a
> > 'Fixes: 6068e6d7ba50' it would not be fixed.
> >
> > This was apparently realized much later and commit ("54a4e5c16382 net:
> > phy: micrel: add Microchip KSZ 9477 to the device table") added the
> > PHY_ID_KSZ9477 to the PHY driver. I believe the code was proper at
> > this point.
> >
> > Later commit ("6149db4997f5 net: phy: micrel: fix KSZ9477 PHY issues
> > after suspend/resume") breaks this again for all but KSZ9897 by only
> > applying the errata for that PHY ID.
> >
> > The most recent time this was affected was with commit ("08c6d8bae48c
> > net: phy: Provide Module 4 KSZ9477 errata (DS80000754C)") which
> > removes the blatant register write to MMD 7:60 and replaces it by
> > setting phydev->eee_broken_modes = -1 so that the generic phy-c45 code
> > disables EEE but this is only done for the KSZ9477_CHIP_ID (Switch ID)
> > so its still broken at this point for the other switches that have
> > this errata.
> >
> > So at this point, the only commit that my patch would apply over is
> > the most recent 08c6d8bae48c but that wouldn't fix any of the previous
> > issues and it would be unclear what switch was broken at what point in
> > time.
>
> O.K, so its a mess :-(
>
> Lets look at this from a different direction. Which stable kernels do
> you actually care about? Is 6.6 enough for you? Do you need 6.1? 4.19?
>
> You should use a Fixed tag which goes back far enough for you. The
> patch itself might not apply that far back, but once you get it merged
> and backported as far as it does go, you can submit ported versions
> for older kernel, referencing the original fix commit hash number.
>

Yes... a mess. The difficulty everyone likely has with something like
this is they really can only test what they have as it's not super
clear each switch has for a PHY ID. It also wasn't really obvious to
me what switches had this errata (I just went through every switch in
the list of supported models in enum ksz_model and looked up their
errata which fortunately was public).

6.6 is good enough for me. I will resubmit with a fixes for the most
recent commit as well as include the above history in the commit as
well.

Best Regards,

Tim
diff mbox series

Patch

diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index b074b4bb0629..d2bd82a1067c 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -2578,11 +2578,19 @@  static u32 ksz_get_phy_flags(struct dsa_switch *ds, int port)
 		if (!port)
 			return MICREL_KSZ8_P1_ERRATA;
 		break;
+	case KSZ8795_CHIP_ID:
+	case KSZ8794_CHIP_ID:
+	case KSZ8765_CHIP_ID:
+		/* KSZ87xx DS80000687C Module 2 */
+	case KSZ9896_CHIP_ID:
+		/* KSZ9896 Errata DS80000757A Module 2 */
+	case KSZ9897_CHIP_ID:
+		/* KSZ9897 Errata DS00002394C Module 4 */
+	case KSZ9567_CHIP_ID:
+		/* KSZ9567 Errata DS80000756A Module 4 */
 	case KSZ9477_CHIP_ID:
-		/* KSZ9477 Errata DS80000754C
-		 *
-		 * Module 4: Energy Efficient Ethernet (EEE) feature select must
-		 * be manually disabled
+		/* KSZ9477 Errata DS80000754C Module 4 */
+		/* Energy Efficient Ethernet (EEE) feature select must be manually disabled
 		 *   The EEE feature is enabled by default, but it is not fully
 		 *   operational. It must be manually disabled through register
 		 *   controls. If not disabled, the PHY ports can auto-negotiate