diff mbox series

[net-next,v3,4/9] net: phy: micrel: apply resume errata workaround for ksz8873 and ksz8863

Message ID 20210526043037.9830-5-o.rempel@pengutronix.de (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series provide cable test support for the ksz886x switch | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers warning 1 maintainers not CCed: hkallweit1@gmail.com
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 34 lines checked
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link

Commit Message

Oleksij Rempel May 26, 2021, 4:30 a.m. UTC
The ksz8873 and ksz8863 switches are affected by following errata:

| "Receiver error in 100BASE-TX mode following Soft Power Down"
|
| Some KSZ8873 devices may exhibit receiver errors after transitioning
| from Soft Power Down mode to Normal mode, as controlled by register 195
| (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
| blocks may not start up properly, causing the PHY to miss data and
| exhibit erratic behavior. The problem may appear on either port 1 or
| port 2, or both ports. The problem occurs only for 100BASE-TX, not
| 10BASE-T.
|
| END USER IMPLICATIONS
| When the failure occurs, the following symptoms are seen on the affected
| port(s):
| - The port is able to link
| - LED0 blinks, even when there is no traffic
| - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
|   Errors, Rx CRC Errors, Rx Alignment Errors)
| - Only a small fraction of packets is correctly received and forwarded
|   through the switch. Most packets are dropped due to receive errors.
|
| The failing condition cannot be corrected by the following:
| - Removing and reconnecting the cable
| - Hardware reset
| - Software Reset and PCS Reset bits in register 67 (0x43)
|
| Work around:
| The problem can be corrected by setting and then clearing the Port Power
| Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
| separately for each affected port after returning from Soft Power Down
| Mode to Normal Mode. The following procedure will ensure no further
| issues due to this erratum. To enter Soft Power Down Mode, set register
| 195 (0xC3), bits [1:0] = 10.
|
| To exit Soft Power Down Mode, follow these steps:
| 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
| 2. Wait 1ms minimum
| 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
| 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
| 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
| 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode

This patch implements steps 2...6 of the suggested workaround. The first
step needs to be implemented in the switch driver.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
 drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

Comments

Vladimir Oltean May 26, 2021, 10:43 p.m. UTC | #1
On Wed, May 26, 2021 at 06:30:32AM +0200, Oleksij Rempel wrote:
> The ksz8873 and ksz8863 switches are affected by following errata:
> 
> | "Receiver error in 100BASE-TX mode following Soft Power Down"
> |
> | Some KSZ8873 devices may exhibit receiver errors after transitioning
> | from Soft Power Down mode to Normal mode, as controlled by register 195
> | (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
> | blocks may not start up properly, causing the PHY to miss data and
> | exhibit erratic behavior. The problem may appear on either port 1 or
> | port 2, or both ports. The problem occurs only for 100BASE-TX, not
> | 10BASE-T.
> |
> | END USER IMPLICATIONS
> | When the failure occurs, the following symptoms are seen on the affected
> | port(s):
> | - The port is able to link
> | - LED0 blinks, even when there is no traffic
> | - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
> |   Errors, Rx CRC Errors, Rx Alignment Errors)
> | - Only a small fraction of packets is correctly received and forwarded
> |   through the switch. Most packets are dropped due to receive errors.
> |
> | The failing condition cannot be corrected by the following:
> | - Removing and reconnecting the cable
> | - Hardware reset
> | - Software Reset and PCS Reset bits in register 67 (0x43)
> |
> | Work around:
> | The problem can be corrected by setting and then clearing the Port Power
> | Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
> | separately for each affected port after returning from Soft Power Down
> | Mode to Normal Mode. The following procedure will ensure no further
> | issues due to this erratum. To enter Soft Power Down Mode, set register
> | 195 (0xC3), bits [1:0] = 10.
> |
> | To exit Soft Power Down Mode, follow these steps:
> | 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
> | 2. Wait 1ms minimum
> | 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
> | 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
> | 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
> | 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode
> 
> This patch implements steps 2...6 of the suggested workaround. The first
> step needs to be implemented in the switch driver.

Am I right in understanding that register 195 (0xc3) is not a port register?

To hit the erratum, you have to enter Soft Power Down in the first place,
presumably by writing register 0xc3 from somewhere, right?

Where does Linux write this register from?

Once we find that place that enters/exits Soft Power Down mode, can't we
just toggle the Port Power Down bits for each port, exactly like the ERR
workaround says, instead of fooling around with a PHY driver?

> 
> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> ---
>  drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> index 227d88db7d27..f03188ed953a 100644
> --- a/drivers/net/phy/micrel.c
> +++ b/drivers/net/phy/micrel.c
> @@ -1048,6 +1048,26 @@ static int ksz8873mll_config_aneg(struct phy_device *phydev)
>  	return 0;
>  }
>  
> +static int ksz886x_resume(struct phy_device *phydev)
> +{
> +	int ret;
> +
> +	/* Apply errata workaround for KSZ8863 and KSZ8873:
> +	 * Receiver error in 100BASE-TX mode following Soft Power Down
> +	 *
> +	 * When exiting Soft Power Down mode, the receiver blocks may not start
> +	 * up properly, causing the PHY to miss data and exhibit erratic
> +	 * behavior.
> +	 */
> +	usleep_range(1000, 2000);
> +
> +	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
> +	if (ret)
> +		return ret;
> +
> +	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
> +}
> +
>  static int kszphy_get_sset_count(struct phy_device *phydev)
>  {
>  	return ARRAY_SIZE(kszphy_hw_stats);
> @@ -1401,7 +1421,7 @@ static struct phy_driver ksphy_driver[] = {
>  	/* PHY_BASIC_FEATURES */
>  	.config_init	= kszphy_config_init,
>  	.suspend	= genphy_suspend,
> -	.resume		= genphy_resume,
> +	.resume		= ksz886x_resume,

Are you able to explain the relation between the call paths of
phy_resume() and the lifetime of the Soft Power Down setting of the
switch? How do we know that the PHYs are resumed after the switch has
exited Soft Power Down mode?

>  }, {
>  	.name		= "Micrel KSZ87XX Switch",
>  	/* PHY_BASIC_FEATURES */
> -- 
> 2.29.2
>
Oleksij Rempel June 10, 2021, 11:49 a.m. UTC | #2
On Thu, May 27, 2021 at 01:43:29AM +0300, Vladimir Oltean wrote:
> On Wed, May 26, 2021 at 06:30:32AM +0200, Oleksij Rempel wrote:
> > The ksz8873 and ksz8863 switches are affected by following errata:
> > 
> > | "Receiver error in 100BASE-TX mode following Soft Power Down"
> > |
> > | Some KSZ8873 devices may exhibit receiver errors after transitioning
> > | from Soft Power Down mode to Normal mode, as controlled by register 195
> > | (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
> > | blocks may not start up properly, causing the PHY to miss data and
> > | exhibit erratic behavior. The problem may appear on either port 1 or
> > | port 2, or both ports. The problem occurs only for 100BASE-TX, not
> > | 10BASE-T.
> > |
> > | END USER IMPLICATIONS
> > | When the failure occurs, the following symptoms are seen on the affected
> > | port(s):
> > | - The port is able to link
> > | - LED0 blinks, even when there is no traffic
> > | - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
> > |   Errors, Rx CRC Errors, Rx Alignment Errors)
> > | - Only a small fraction of packets is correctly received and forwarded
> > |   through the switch. Most packets are dropped due to receive errors.
> > |
> > | The failing condition cannot be corrected by the following:
> > | - Removing and reconnecting the cable
> > | - Hardware reset
> > | - Software Reset and PCS Reset bits in register 67 (0x43)
> > |
> > | Work around:
> > | The problem can be corrected by setting and then clearing the Port Power
> > | Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
> > | separately for each affected port after returning from Soft Power Down
> > | Mode to Normal Mode. The following procedure will ensure no further
> > | issues due to this erratum. To enter Soft Power Down Mode, set register
> > | 195 (0xC3), bits [1:0] = 10.
> > |
> > | To exit Soft Power Down Mode, follow these steps:
> > | 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
> > | 2. Wait 1ms minimum
> > | 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
> > | 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
> > | 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
> > | 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode
> > 
> > This patch implements steps 2...6 of the suggested workaround. The first
> > step needs to be implemented in the switch driver.
> 
> Am I right in understanding that register 195 (0xc3) is not a port register?
> 
> To hit the erratum, you have to enter Soft Power Down in the first place,
> presumably by writing register 0xc3 from somewhere, right?
> 
> Where does Linux write this register from?
> 
> Once we find that place that enters/exits Soft Power Down mode, can't we
> just toggle the Port Power Down bits for each port, exactly like the ERR
> workaround says, instead of fooling around with a PHY driver?

The KSZ8873 switch is using multiple register mappings.
https://ww1.microchip.com/downloads/en/DeviceDoc/00002348A.pdf
Page 38:
"The MIIM interface is used to access the MII PHY registers defined in
this section. The SPI, I2C, and SMI interfaces can also be used to access
some of these registers. The latter three interfaces use a different
mapping mechanism than the MIIM interface."

This PHY driver is able to work directly over MIIM (MDIO). Or work with DSA over
integrated register translation mapping.

> > 
> > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > ---
> >  drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
> >  1 file changed, 21 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> > index 227d88db7d27..f03188ed953a 100644
> > --- a/drivers/net/phy/micrel.c
> > +++ b/drivers/net/phy/micrel.c
> > @@ -1048,6 +1048,26 @@ static int ksz8873mll_config_aneg(struct phy_device *phydev)
> >  	return 0;
> >  }
> >  
> > +static int ksz886x_resume(struct phy_device *phydev)
> > +{
> > +	int ret;
> > +
> > +	/* Apply errata workaround for KSZ8863 and KSZ8873:
> > +	 * Receiver error in 100BASE-TX mode following Soft Power Down
> > +	 *
> > +	 * When exiting Soft Power Down mode, the receiver blocks may not start
> > +	 * up properly, causing the PHY to miss data and exhibit erratic
> > +	 * behavior.
> > +	 */
> > +	usleep_range(1000, 2000);
> > +
> > +	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > +}
> > +
> >  static int kszphy_get_sset_count(struct phy_device *phydev)
> >  {
> >  	return ARRAY_SIZE(kszphy_hw_stats);
> > @@ -1401,7 +1421,7 @@ static struct phy_driver ksphy_driver[] = {
> >  	/* PHY_BASIC_FEATURES */
> >  	.config_init	= kszphy_config_init,
> >  	.suspend	= genphy_suspend,
> > -	.resume		= genphy_resume,
> > +	.resume		= ksz886x_resume,
> 
> Are you able to explain the relation between the call paths of
> phy_resume() and the lifetime of the Soft Power Down setting of the
> switch? How do we know that the PHYs are resumed after the switch has
> exited Soft Power Down mode?

The MII_BMCRs BMCR_PDOWN bit is mapped to the "register 29 (0x1D), bit
[3]" for the PHY0 and to "register 45 (0x2D), bit [3]" for the PHY1.

I assume, I'll need to add this comments to the commit message. Or do
you have other suggestions on how this should be implemented?

Regards,
Oleksij
Vladimir Oltean June 10, 2021, 1:04 p.m. UTC | #3
On Thu, Jun 10, 2021 at 01:49:20PM +0200, Oleksij Rempel wrote:
> On Thu, May 27, 2021 at 01:43:29AM +0300, Vladimir Oltean wrote:
> > On Wed, May 26, 2021 at 06:30:32AM +0200, Oleksij Rempel wrote:
> > > The ksz8873 and ksz8863 switches are affected by following errata:
> > > 
> > > | "Receiver error in 100BASE-TX mode following Soft Power Down"
> > > |
> > > | Some KSZ8873 devices may exhibit receiver errors after transitioning
> > > | from Soft Power Down mode to Normal mode, as controlled by register 195
> > > | (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
> > > | blocks may not start up properly, causing the PHY to miss data and
> > > | exhibit erratic behavior. The problem may appear on either port 1 or
> > > | port 2, or both ports. The problem occurs only for 100BASE-TX, not
> > > | 10BASE-T.
> > > |
> > > | END USER IMPLICATIONS
> > > | When the failure occurs, the following symptoms are seen on the affected
> > > | port(s):
> > > | - The port is able to link
> > > | - LED0 blinks, even when there is no traffic
> > > | - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
> > > |   Errors, Rx CRC Errors, Rx Alignment Errors)
> > > | - Only a small fraction of packets is correctly received and forwarded
> > > |   through the switch. Most packets are dropped due to receive errors.
> > > |
> > > | The failing condition cannot be corrected by the following:
> > > | - Removing and reconnecting the cable
> > > | - Hardware reset
> > > | - Software Reset and PCS Reset bits in register 67 (0x43)
> > > |
> > > | Work around:
> > > | The problem can be corrected by setting and then clearing the Port Power
> > > | Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
> > > | separately for each affected port after returning from Soft Power Down
> > > | Mode to Normal Mode. The following procedure will ensure no further
> > > | issues due to this erratum. To enter Soft Power Down Mode, set register
> > > | 195 (0xC3), bits [1:0] = 10.
> > > |
> > > | To exit Soft Power Down Mode, follow these steps:
> > > | 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
> > > | 2. Wait 1ms minimum
> > > | 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
> > > | 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
> > > | 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
> > > | 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode
> > > 
> > > This patch implements steps 2...6 of the suggested workaround. The first
> > > step needs to be implemented in the switch driver.
> > 
> > Am I right in understanding that register 195 (0xc3) is not a port register?
> > 
> > To hit the erratum, you have to enter Soft Power Down in the first place,
> > presumably by writing register 0xc3 from somewhere, right?
> > 
> > Where does Linux write this register from?
> > 
> > Once we find that place that enters/exits Soft Power Down mode, can't we
> > just toggle the Port Power Down bits for each port, exactly like the ERR
> > workaround says, instead of fooling around with a PHY driver?
> 
> The KSZ8873 switch is using multiple register mappings.
> https://ww1.microchip.com/downloads/en/DeviceDoc/00002348A.pdf
> Page 38:
> "The MIIM interface is used to access the MII PHY registers defined in
> this section. The SPI, I2C, and SMI interfaces can also be used to access
> some of these registers. The latter three interfaces use a different
> mapping mechanism than the MIIM interface."
> 
> This PHY driver is able to work directly over MIIM (MDIO). Or work with DSA over
> integrated register translation mapping.

This doesn't answer my question of where is the Soft Power Down mode enabled.

> > > 
> > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > > ---
> > >  drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
> > >  1 file changed, 21 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> > > index 227d88db7d27..f03188ed953a 100644
> > > --- a/drivers/net/phy/micrel.c
> > > +++ b/drivers/net/phy/micrel.c
> > > @@ -1048,6 +1048,26 @@ static int ksz8873mll_config_aneg(struct phy_device *phydev)
> > >  	return 0;
> > >  }
> > >  
> > > +static int ksz886x_resume(struct phy_device *phydev)
> > > +{
> > > +	int ret;
> > > +
> > > +	/* Apply errata workaround for KSZ8863 and KSZ8873:
> > > +	 * Receiver error in 100BASE-TX mode following Soft Power Down
> > > +	 *
> > > +	 * When exiting Soft Power Down mode, the receiver blocks may not start
> > > +	 * up properly, causing the PHY to miss data and exhibit erratic
> > > +	 * behavior.
> > > +	 */
> > > +	usleep_range(1000, 2000);
> > > +
> > > +	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > > +}
> > > +
> > >  static int kszphy_get_sset_count(struct phy_device *phydev)
> > >  {
> > >  	return ARRAY_SIZE(kszphy_hw_stats);
> > > @@ -1401,7 +1421,7 @@ static struct phy_driver ksphy_driver[] = {
> > >  	/* PHY_BASIC_FEATURES */
> > >  	.config_init	= kszphy_config_init,
> > >  	.suspend	= genphy_suspend,
> > > -	.resume		= genphy_resume,
> > > +	.resume		= ksz886x_resume,
> > 
> > Are you able to explain the relation between the call paths of
> > phy_resume() and the lifetime of the Soft Power Down setting of the
> > switch? How do we know that the PHYs are resumed after the switch has
> > exited Soft Power Down mode?
> 
> The MII_BMCRs BMCR_PDOWN bit is mapped to the "register 29 (0x1D), bit
> [3]" for the PHY0 and to "register 45 (0x2D), bit [3]" for the PHY1.
> 
> I assume, I'll need to add this comments to the commit message. Or do
> you have other suggestions on how this should be implemented?

According to "3.2 Power Management" in the datasheet you shared:

There are 5 (five) operation modes under the power management function,
which is controlled by two bits in Register 195 (0xC3) and one bit in
Register 29 (0x1D), 45 (0x2D) as shown below:

Register 195 bit[1:0] = 00 Normal Operation Mode
Register 195 bit[1:0] = 01 Energy Detect Mode
Register 195 bit[1:0] = 10 Soft Power Down Mode
Register 195 bit[1:0] = 11 Power Saving Mode
Register 29, 45 bit 3 = 1 Port Based Power Down Mode

3.2.4 SOFT POWER DOWN MODE

The soft power down mode is entered by setting bit[1:0]=10 in register
195. When KSZ8873MLL/FLL/RLL is in this mode, all PLL clocks are
disabled, the PHY and the MAC are off, all internal registers values
will not change. When the host set bit[1:0]=00 in register 195, this
device will be back from current soft power down mode to normal
operation mode.

3.2.5 PORT-BASED POWER DOWN MODE

In addition, the KSZ8873MLL/FLL/RLL features a per-port power down mode.
To save power, a PHY port that is not in use can be powered down via
port control register 29 or 45 bit 3, or MIIM PHY register. It saves
about 15 mA per port.



From the above I understand that the first 4 power management modes are
global, and the 5th isn't.

You've explained how the PHY driver enters port-based power down mode.
But the ERR describes an issue being triggered by a global power down
mode. What you are describing is not what the ERR text is describing.

Excuse my perhaps stupid question, but have you triggered the issue
described by the erratum? Does this patch fix that? Where is the disconnect?
Oleksij Rempel June 10, 2021, 1:25 p.m. UTC | #4
On Thu, Jun 10, 2021 at 04:04:45PM +0300, Vladimir Oltean wrote:
> On Thu, Jun 10, 2021 at 01:49:20PM +0200, Oleksij Rempel wrote:
> > On Thu, May 27, 2021 at 01:43:29AM +0300, Vladimir Oltean wrote:
> > > On Wed, May 26, 2021 at 06:30:32AM +0200, Oleksij Rempel wrote:
> > > > The ksz8873 and ksz8863 switches are affected by following errata:
> > > > 
> > > > | "Receiver error in 100BASE-TX mode following Soft Power Down"
> > > > |
> > > > | Some KSZ8873 devices may exhibit receiver errors after transitioning
> > > > | from Soft Power Down mode to Normal mode, as controlled by register 195
> > > > | (0xC3) bits [1:0]. When exiting Soft Power Down mode, the receiver
> > > > | blocks may not start up properly, causing the PHY to miss data and
> > > > | exhibit erratic behavior. The problem may appear on either port 1 or
> > > > | port 2, or both ports. The problem occurs only for 100BASE-TX, not
> > > > | 10BASE-T.
> > > > |
> > > > | END USER IMPLICATIONS
> > > > | When the failure occurs, the following symptoms are seen on the affected
> > > > | port(s):
> > > > | - The port is able to link
> > > > | - LED0 blinks, even when there is no traffic
> > > > | - The MIB counters indicate receive errors (Rx Fragments, Rx Symbol
> > > > |   Errors, Rx CRC Errors, Rx Alignment Errors)
> > > > | - Only a small fraction of packets is correctly received and forwarded
> > > > |   through the switch. Most packets are dropped due to receive errors.
> > > > |
> > > > | The failing condition cannot be corrected by the following:
> > > > | - Removing and reconnecting the cable
> > > > | - Hardware reset
> > > > | - Software Reset and PCS Reset bits in register 67 (0x43)
> > > > |
> > > > | Work around:
> > > > | The problem can be corrected by setting and then clearing the Port Power
> > > > | Down bits (registers 29 (0x1D) and 45 (0x2D), bit 3). This must be done
> > > > | separately for each affected port after returning from Soft Power Down
> > > > | Mode to Normal Mode. The following procedure will ensure no further
> > > > | issues due to this erratum. To enter Soft Power Down Mode, set register
> > > > | 195 (0xC3), bits [1:0] = 10.
> > > > |
> > > > | To exit Soft Power Down Mode, follow these steps:
> > > > | 1. Set register 195 (0xC3), bits [1:0] = 00 // Exit soft power down mode
> > > > | 2. Wait 1ms minimum
> > > > | 3. Set register 29 (0x1D), bit [3] = 1 // Enter PHY port 1 power down mode
> > > > | 4. Set register 29 (0x1D), bit [3] = 0 // Exit PHY port 1 power down mode
> > > > | 5. Set register 45 (0x2D), bit [3] = 1 // Enter PHY port 2 power down mode
> > > > | 6. Set register 45 (0x2D), bit [3] = 0 // Exit PHY port 2 power down mode
> > > > 
> > > > This patch implements steps 2...6 of the suggested workaround. The first
> > > > step needs to be implemented in the switch driver.
> > > 
> > > Am I right in understanding that register 195 (0xc3) is not a port register?
> > > 
> > > To hit the erratum, you have to enter Soft Power Down in the first place,
> > > presumably by writing register 0xc3 from somewhere, right?
> > > 
> > > Where does Linux write this register from?
> > > 
> > > Once we find that place that enters/exits Soft Power Down mode, can't we
> > > just toggle the Port Power Down bits for each port, exactly like the ERR
> > > workaround says, instead of fooling around with a PHY driver?
> > 
> > The KSZ8873 switch is using multiple register mappings.
> > https://ww1.microchip.com/downloads/en/DeviceDoc/00002348A.pdf
> > Page 38:
> > "The MIIM interface is used to access the MII PHY registers defined in
> > this section. The SPI, I2C, and SMI interfaces can also be used to access
> > some of these registers. The latter three interfaces use a different
> > mapping mechanism than the MIIM interface."
> > 
> > This PHY driver is able to work directly over MIIM (MDIO). Or work with DSA over
> > integrated register translation mapping.
> 
> This doesn't answer my question of where is the Soft Power Down mode enabled.
> 
> > > > 
> > > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > > > ---
> > > >  drivers/net/phy/micrel.c | 22 +++++++++++++++++++++-
> > > >  1 file changed, 21 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> > > > index 227d88db7d27..f03188ed953a 100644
> > > > --- a/drivers/net/phy/micrel.c
> > > > +++ b/drivers/net/phy/micrel.c
> > > > @@ -1048,6 +1048,26 @@ static int ksz8873mll_config_aneg(struct phy_device *phydev)
> > > >  	return 0;
> > > >  }
> > > >  
> > > > +static int ksz886x_resume(struct phy_device *phydev)
> > > > +{
> > > > +	int ret;
> > > > +
> > > > +	/* Apply errata workaround for KSZ8863 and KSZ8873:
> > > > +	 * Receiver error in 100BASE-TX mode following Soft Power Down
> > > > +	 *
> > > > +	 * When exiting Soft Power Down mode, the receiver blocks may not start
> > > > +	 * up properly, causing the PHY to miss data and exhibit erratic
> > > > +	 * behavior.
> > > > +	 */
> > > > +	usleep_range(1000, 2000);
> > > > +
> > > > +	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
> > > > +}
> > > > +
> > > >  static int kszphy_get_sset_count(struct phy_device *phydev)
> > > >  {
> > > >  	return ARRAY_SIZE(kszphy_hw_stats);
> > > > @@ -1401,7 +1421,7 @@ static struct phy_driver ksphy_driver[] = {
> > > >  	/* PHY_BASIC_FEATURES */
> > > >  	.config_init	= kszphy_config_init,
> > > >  	.suspend	= genphy_suspend,
> > > > -	.resume		= genphy_resume,
> > > > +	.resume		= ksz886x_resume,
> > > 
> > > Are you able to explain the relation between the call paths of
> > > phy_resume() and the lifetime of the Soft Power Down setting of the
> > > switch? How do we know that the PHYs are resumed after the switch has
> > > exited Soft Power Down mode?
> > 
> > The MII_BMCRs BMCR_PDOWN bit is mapped to the "register 29 (0x1D), bit
> > [3]" for the PHY0 and to "register 45 (0x2D), bit [3]" for the PHY1.
> > 
> > I assume, I'll need to add this comments to the commit message. Or do
> > you have other suggestions on how this should be implemented?
> 
> According to "3.2 Power Management" in the datasheet you shared:
> 
> There are 5 (five) operation modes under the power management function,
> which is controlled by two bits in Register 195 (0xC3) and one bit in
> Register 29 (0x1D), 45 (0x2D) as shown below:
> 
> Register 195 bit[1:0] = 00 Normal Operation Mode
> Register 195 bit[1:0] = 01 Energy Detect Mode
> Register 195 bit[1:0] = 10 Soft Power Down Mode
> Register 195 bit[1:0] = 11 Power Saving Mode
> Register 29, 45 bit 3 = 1 Port Based Power Down Mode
> 
> 3.2.4 SOFT POWER DOWN MODE
> 
> The soft power down mode is entered by setting bit[1:0]=10 in register
> 195. When KSZ8873MLL/FLL/RLL is in this mode, all PLL clocks are
> disabled, the PHY and the MAC are off, all internal registers values
> will not change. When the host set bit[1:0]=00 in register 195, this
> device will be back from current soft power down mode to normal
> operation mode.
> 
> 3.2.5 PORT-BASED POWER DOWN MODE
> 
> In addition, the KSZ8873MLL/FLL/RLL features a per-port power down mode.
> To save power, a PHY port that is not in use can be powered down via
> port control register 29 or 45 bit 3, or MIIM PHY register. It saves
> about 15 mA per port.
> 
> 
> 
> From the above I understand that the first 4 power management modes are
> global, and the 5th isn't.
> 
> You've explained how the PHY driver enters port-based power down mode.
> But the ERR describes an issue being triggered by a global power down
> mode. What you are describing is not what the ERR text is describing.
> 
> Excuse my perhaps stupid question, but have you triggered the issue
> described by the erratum? Does this patch fix that? Where is the disconnect?

Yes, this issue was seen  at some early point of development (back in 2019)
reproducible on system start. Where switch was in some default state or
on a state configured by the bootloader. I didn't tried to reproduce it
now. With other words, there is no need to provide global power
management by the DSA driver to trigger it.

Regards,
Oleksij
Vladimir Oltean June 10, 2021, 6:18 p.m. UTC | #5
On Thu, Jun 10, 2021 at 03:25:05PM +0200, Oleksij Rempel wrote:
> Yes, this issue was seen  at some early point of development (back in 2019)
> reproducible on system start. Where switch was in some default state or
> on a state configured by the bootloader. I didn't tried to reproduce it
> now. With other words, there is no need to provide global power
> management by the DSA driver to trigger it.

If you're sure about that then add it to the commit message or comments,
since this is not what the ERR description says.
diff mbox series

Patch

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index 227d88db7d27..f03188ed953a 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -1048,6 +1048,26 @@  static int ksz8873mll_config_aneg(struct phy_device *phydev)
 	return 0;
 }
 
+static int ksz886x_resume(struct phy_device *phydev)
+{
+	int ret;
+
+	/* Apply errata workaround for KSZ8863 and KSZ8873:
+	 * Receiver error in 100BASE-TX mode following Soft Power Down
+	 *
+	 * When exiting Soft Power Down mode, the receiver blocks may not start
+	 * up properly, causing the PHY to miss data and exhibit erratic
+	 * behavior.
+	 */
+	usleep_range(1000, 2000);
+
+	ret = phy_set_bits(phydev, MII_BMCR, BMCR_PDOWN);
+	if (ret)
+		return ret;
+
+	return phy_clear_bits(phydev, MII_BMCR, BMCR_PDOWN);
+}
+
 static int kszphy_get_sset_count(struct phy_device *phydev)
 {
 	return ARRAY_SIZE(kszphy_hw_stats);
@@ -1401,7 +1421,7 @@  static struct phy_driver ksphy_driver[] = {
 	/* PHY_BASIC_FEATURES */
 	.config_init	= kszphy_config_init,
 	.suspend	= genphy_suspend,
-	.resume		= genphy_resume,
+	.resume		= ksz886x_resume,
 }, {
 	.name		= "Micrel KSZ87XX Switch",
 	/* PHY_BASIC_FEATURES */