Message ID | 20230530122621.2142192-1-lukma@denx.de (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC] net: dsa: slave: Advertise correct EEE capabilities at slave PHY setup | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > One can disable in device tree advertising of EEE capabilities of PHY > when 'eee-broken-100tx' property is present in DTS. > > With DSA switch it also may happen that one would need to disable EEE due > to some network issues. > > Corresponding switch DTS description: > > switch@0 { > ports { > port@0 { > reg = <0>; > label = "lan1"; > phy-handle = <&switchphy0>; > }; > } > mdio { > switchphy0: switchphy@0 { > reg = <0>; > eee-broken-100tx; > }; > }; > > This patch adjusts the content of MDIO_AN_EEE_ADV in MDIO_MMD_AN "device" > so the phydev->eee_broken_modes are taken into account from the start of > the slave PHYs. This should be handled by phylib today in recent kernels without the need for any patch (as I describe below, because the config_aneg PHY method should be programming it.) Are you seeing a problem with it in 6.4-rc? > As a result the 'ethtool --show-eee lan1' shows that EEE is not supported > from the outset. > > Questions: > > - Is the genphy_config_eee_advert() appropriate to be used here? > As I found this issue on 5.15 kernel, it looks like mainline now uses > PHY features for handle EEE (but the aforementioned function is still > present in newest mainline - v6.4-rc1). > > - I've also observed strange behaviour for EEE capability register: > Why the value in MDIO_MMD_PCS device; reg MDIO_PCS_EEE_ABLE is somewhat > "volatile" - in a sense that when I use: > ethtool --set-eee lan2 eee off > > It is cleared by PHY itself to 0x0 (from 0x2) and turning it on again is > not working. > > Is this expected? Or am I missing something? No - this register is supposed to report the capabilities of the PHY, and bits 1..15 should be read-only, and as they report the capabilities they should be fixed. Writing to bit 1 of this register will therefore be ignored. It sounds like your PHY has some odd behaviour - maybe someone misinterpreted 802.3 45.2.3.9? > Signed-off-by: Lukasz Majewski <lukma@denx.de> > --- > net/dsa/slave.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/net/dsa/slave.c b/net/dsa/slave.c > index 353d8fff3166..712923c7d4e2 100644 > --- a/net/dsa/slave.c > +++ b/net/dsa/slave.c > @@ -2247,6 +2247,7 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev) > phylink_destroy(dp->pl); > } > > + genphy_config_eee_advert(slave_dev->phydev); No network driver (which includes DSA) should be calling any function starting genphy_*. These functions are purely for phylib or phy drivers to use, and no one else. genphy_config_eee_advert() is a deprecated function (see commit 5827b168125d ("net: phy: c45: migrate to genphy_c45_write_eee_adv()") and thus should not be used. genphy_c45_write_eee_adv() is called by genphy_c45_an_config_eee_aneg() which will in turn be called by genphy_config_aneg() for a clause 22 PHY, or by genphy_c45_an_config_aneg() for a clause 45 PHY. These will write the EEE advertisement mask to the PHY's AN MMD. So, EEE should be handled by phylib according to the firmware settings. The only thing that network drivers that use phylib have to deal with is setting their hardware for the LPI timeout and enabling/disabling the timeout as necessary.
On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > One can disable in device tree advertising of EEE capabilities of PHY > when 'eee-broken-100tx' property is present in DTS. > > With DSA switch it also may happen that one would need to disable EEE due > to some network issues. Is EEE actually broken in the MAC/PHY combination? You should not be using this DT option for configuration. It is there because there is some hardware which is truly broken, and needs EEE turned off. If EEE does work, but you need to turn it off because of latency etc, then please use ethtool. Andrew,
Hi Andrew, > On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > > One can disable in device tree advertising of EEE capabilities of > > PHY when 'eee-broken-100tx' property is present in DTS. > > > > With DSA switch it also may happen that one would need to disable > > EEE due to some network issues. > > Is EEE actually broken in the MAC/PHY combination? > Problem is that when I connect on this project some non-manageable switches (which by default have EEE enabled), then I observe very rare and sporadic link loss and reconnection. Disabling EEE solves the problem. > You should not be using this DT option for configuration. It is there > because there is some hardware which is truly broken, and needs EEE > turned off. Yes, I do think that the above sentence sums up my use case. > > If EEE does work, but you need to turn it off because of latency etc, > then please use ethtool. > Yes, correct - it is possible to disable the EEE with ethtool --set-eee lan2 eee off However, as I've stated in the mail, I cannot re-enable EEE once disabled with: ethtool --set-eee lan2 eee on ethtool --show-eee lan2 EEE Settings for lan2: EEE status: not supported As the capability register shows value of 0. > Andrew, Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
Hi Russell, > On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > > One can disable in device tree advertising of EEE capabilities of > > PHY when 'eee-broken-100tx' property is present in DTS. > > > > With DSA switch it also may happen that one would need to disable > > EEE due to some network issues. > > > > Corresponding switch DTS description: > > > > switch@0 { > > ports { > > port@0 { > > reg = <0>; > > label = "lan1"; > > phy-handle = <&switchphy0>; > > }; > > } > > mdio { > > switchphy0: switchphy@0 { > > reg = <0>; > > eee-broken-100tx; > > }; > > }; > > > > This patch adjusts the content of MDIO_AN_EEE_ADV in MDIO_MMD_AN > > "device" so the phydev->eee_broken_modes are taken into account > > from the start of the slave PHYs. > > This should be handled by phylib today in recent kernels without the > need for any patch (as I describe below, because the config_aneg PHY > method should be programming it.) Are you seeing a problem with it > in 6.4-rc? Unfortunately, for this project I use LTS 5.15.z kernel. My impression is that the mv88e6xxx driver is not handling EEE setup during initialization (even with v6.4-rc). I've tried to replace genphy_config_eee_advert() with phy_init_eee, but it lacks the part to program PCS advertise registers. > > > As a result the 'ethtool --show-eee lan1' shows that EEE is not > > supported from the outset. > > > > Questions: > > > > - Is the genphy_config_eee_advert() appropriate to be used here? > > As I found this issue on 5.15 kernel, it looks like mainline now > > uses PHY features for handle EEE (but the aforementioned function > > is still present in newest mainline - v6.4-rc1). > > > > - I've also observed strange behaviour for EEE capability register: > > Why the value in MDIO_MMD_PCS device; reg MDIO_PCS_EEE_ABLE is > > somewhat "volatile" - in a sense that when I use: > > ethtool --set-eee lan2 eee off > > > > It is cleared by PHY itself to 0x0 (from 0x2) and turning it on > > again is not working. > > > > Is this expected? Or am I missing something? > > No - this register is supposed to report the capabilities of the PHY, > and bits 1..15 should be read-only, and as they report the > capabilities they should be fixed. Writing to bit 1 of this register > will therefore be ignored. It sounds like your PHY has some odd > behaviour - maybe someone misinterpreted 802.3 45.2.3.9? > It is a good question. Or maybe after EEE disabling I read some wrong data (however, up till this moment bit offsets and values seems reasonable). > > Signed-off-by: Lukasz Majewski <lukma@denx.de> > > --- > > net/dsa/slave.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/net/dsa/slave.c b/net/dsa/slave.c > > index 353d8fff3166..712923c7d4e2 100644 > > --- a/net/dsa/slave.c > > +++ b/net/dsa/slave.c > > @@ -2247,6 +2247,7 @@ static int dsa_slave_phy_setup(struct > > net_device *slave_dev) phylink_destroy(dp->pl); > > } > > > > + genphy_config_eee_advert(slave_dev->phydev); > > No network driver (which includes DSA) should be calling any function > starting genphy_*. These functions are purely for phylib or phy > drivers to use, and no one else. As stated before, it looks like some PHY "update" in respect of EEE is not done when DSA framework creates phydevs for slave ports. > > genphy_config_eee_advert() is a deprecated function (see commit > 5827b168125d ("net: phy: c45: migrate to genphy_c45_write_eee_adv()") > and thus should not be used. Ok. > > genphy_c45_write_eee_adv() is called by > genphy_c45_an_config_eee_aneg() which will in turn be called by > genphy_config_aneg() for a clause 22 PHY, or by > genphy_c45_an_config_aneg() for a clause 45 PHY. These will write the > EEE advertisement mask to the PHY's AN MMD. > Ok. > So, EEE should be handled by phylib according to the firmware > settings. I also would expect, that phy core code parses DTS properties and then phydev->eee_broken_mode is used to mask EEE advertisement during PHY initialization and startup. > The only thing that network drivers that use phylib have to > deal with is setting their hardware for the LPI timeout and > enabling/disabling the timeout as necessary. > Yes. I do agree. Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
On Tue, May 30, 2023 at 04:07:43PM +0200, Lukasz Majewski wrote: > Hi Russell, > > > On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > > > One can disable in device tree advertising of EEE capabilities of > > > PHY when 'eee-broken-100tx' property is present in DTS. > > > > > > With DSA switch it also may happen that one would need to disable > > > EEE due to some network issues. > > > > > > Corresponding switch DTS description: > > > > > > switch@0 { > > > ports { > > > port@0 { > > > reg = <0>; > > > label = "lan1"; > > > phy-handle = <&switchphy0>; > > > }; > > > } > > > mdio { > > > switchphy0: switchphy@0 { > > > reg = <0>; > > > eee-broken-100tx; > > > }; > > > }; > > > > > > This patch adjusts the content of MDIO_AN_EEE_ADV in MDIO_MMD_AN > > > "device" so the phydev->eee_broken_modes are taken into account > > > from the start of the slave PHYs. > > > > This should be handled by phylib today in recent kernels without the > > need for any patch (as I describe below, because the config_aneg PHY > > method should be programming it.) Are you seeing a problem with it > > in 6.4-rc? > > Unfortunately, for this project I use LTS 5.15.z kernel. > > My impression is that the mv88e6xxx driver is not handling EEE setup > during initialization (even with v6.4-rc). > > I've tried to replace genphy_config_eee_advert() with phy_init_eee, but > it lacks the part to program PCS advertise registers. Firstly, I would advise backporting the EEE changes. The older EEE implementation was IMHO not particularly good (I think you can find a record in the archives of me stating that the old interfaces were just too quirky.) Secondly, even if you program the PHY for EEE, unless you have something like an Atheros AR803x PHY with its SmartEEE, EEE needs the support of both the PHY and the MAC to which its connected to in order to work. It's the MAC which is the "client" which says to the PHY "I'm idle" and when both ends tell their PHYs that they're idle, the media link can then drop into the low power state. The 88e6xxx internal PHYs will communicate their EEE negotiation state back to the MACs, but for an external PHY, that won't happen, and there is no code in the 88e6xxx driver to configure the MAC to program the MAC to do EEE. So, I'm wondering what's actually going on here... can you give any more details about the hardware setup?
On Tue, May 30, 2023 at 04:07:43PM +0200, Lukasz Majewski wrote: > Hi Russell, > > > On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > > > One can disable in device tree advertising of EEE capabilities of > > > PHY when 'eee-broken-100tx' property is present in DTS. > > > > > > With DSA switch it also may happen that one would need to disable > > > EEE due to some network issues. > > > > > > Corresponding switch DTS description: > > > > > > switch@0 { > > > ports { > > > port@0 { > > > reg = <0>; > > > label = "lan1"; > > > phy-handle = <&switchphy0>; > > > }; > > > } > > > mdio { > > > switchphy0: switchphy@0 { > > > reg = <0>; > > > eee-broken-100tx; > > > }; > > > }; > > > > > > This patch adjusts the content of MDIO_AN_EEE_ADV in MDIO_MMD_AN > > > "device" so the phydev->eee_broken_modes are taken into account > > > from the start of the slave PHYs. > > > > This should be handled by phylib today in recent kernels without the > > need for any patch (as I describe below, because the config_aneg PHY > > method should be programming it.) Are you seeing a problem with it > > in 6.4-rc? > > Unfortunately, for this project I use LTS 5.15.z kernel. > > My impression is that the mv88e6xxx driver is not handling EEE setup > during initialization (even with v6.4-rc). In general, nearly every driver gets EEE wrong. I have a patchset which basically rewrites EEE. It has been posted as RFC a couple times, and i plan to start posting it for merging this week. But as a result, don't expect EEE to actually work with any LTS kernel. Andrew
> So, I'm wondering what's actually going on here... can you give > any more details about the hardware setup? And what switch it actually is. I've not looked in too much detail, but i think different switch families have different EEE capabilities. But in general, as Russell pointed out, there is no MAC support for EEE in the mv88e6xxx driver. Andrew
Hi Andrew, > On Tue, May 30, 2023 at 04:07:43PM +0200, Lukasz Majewski wrote: > > Hi Russell, > > > > > On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > > > > One can disable in device tree advertising of EEE capabilities > > > > of PHY when 'eee-broken-100tx' property is present in DTS. > > > > > > > > With DSA switch it also may happen that one would need to > > > > disable EEE due to some network issues. > > > > > > > > Corresponding switch DTS description: > > > > > > > > switch@0 { > > > > ports { > > > > port@0 { > > > > reg = <0>; > > > > label = "lan1"; > > > > phy-handle = <&switchphy0>; > > > > }; > > > > } > > > > mdio { > > > > switchphy0: switchphy@0 { > > > > reg = <0>; > > > > eee-broken-100tx; > > > > }; > > > > }; > > > > > > > > This patch adjusts the content of MDIO_AN_EEE_ADV in MDIO_MMD_AN > > > > "device" so the phydev->eee_broken_modes are taken into account > > > > from the start of the slave PHYs. > > > > > > This should be handled by phylib today in recent kernels without > > > the need for any patch (as I describe below, because the > > > config_aneg PHY method should be programming it.) Are you seeing > > > a problem with it in 6.4-rc? > > > > Unfortunately, for this project I use LTS 5.15.z kernel. > > > > My impression is that the mv88e6xxx driver is not handling EEE setup > > during initialization (even with v6.4-rc). > > In general, nearly every driver gets EEE wrong. Ach... I see :/ > I have a patchset > which basically rewrites EEE. Ok. > It has been posted as RFC a couple > times, and i plan to start posting it for merging this week. Ok. :-) > > But as a result, don't expect EEE to actually work with any LTS > kernel. Then, I think that it would be best to use the above "hack" until your patch set is not reviewed and merged. After that, when customer will mover forward with LTS kernel, I can test the EEE on the proper HW. > > Andrew Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
On Tue, May 30, 2023 at 04:26:49PM +0200, Andrew Lunn wrote: > > So, I'm wondering what's actually going on here... can you give > > any more details about the hardware setup? > > And what switch it actually is. I've not looked in too much detail, > but i think different switch families have different EEE capabilities. > But in general, as Russell pointed out, there is no MAC support for > EEE in the mv88e6xxx driver. ... except for the built-in PHYs, which if they successfully negotiate EEE, that status is communicated back to the MAC in that one sees MV88E6352_PORT_STS_EEE set, which results in the MAC being able to signal LPI to the PHY... and I've stuck a 'scope on the PHY media-side signals in the past and have seen that activity does stop without there needing to be any help from the driver for this. At least reading the information I have for the 88E6352, there is no configuration of LPI timers, nor any seperate LPI enable. If EEE is enabled at the MAC, then LPI will be signalled according to whatever Marvell decided would be appropriate. For an external PHY that the PPU is not polling, the only way that we'd have EEE functional is if we forced EEE in port control register 1 on switches that support those bits. In other words setting both the EEE and FORCE_EEE bits...
Hi Andrew, > > So, I'm wondering what's actually going on here... can you give > > any more details about the hardware setup? > > And what switch it actually is. It is mv88e6071. > I've not looked in too much detail, > but i think different switch families have different EEE capabilities. Yes, some (like b53) have the ability to disable EEE in the HW. The above one from Marvell seems to have EEE always enabled (in silicon) and the only possibility is to not advertise it [*]. > But in general, as Russell pointed out, there is no MAC support for > EEE in the mv88e6xxx driver. I may be wrong, but aren't we accessing this switch PHYs via c45 ? (MDIO_MMD_PCS devices and e.g. MDIO_PCS_EEE_ABLE registers)? > > Andrew [*] - maybe it is possible via some "reserved" registers. However, there are "register pages" for EEE, which allow tuning timers. Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
Hi Russell, > On Tue, May 30, 2023 at 04:07:43PM +0200, Lukasz Majewski wrote: > > Hi Russell, > > > > > On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > > > > One can disable in device tree advertising of EEE capabilities > > > > of PHY when 'eee-broken-100tx' property is present in DTS. > > > > > > > > With DSA switch it also may happen that one would need to > > > > disable EEE due to some network issues. > > > > > > > > Corresponding switch DTS description: > > > > > > > > switch@0 { > > > > ports { > > > > port@0 { > > > > reg = <0>; > > > > label = "lan1"; > > > > phy-handle = <&switchphy0>; > > > > }; > > > > } > > > > mdio { > > > > switchphy0: switchphy@0 { > > > > reg = <0>; > > > > eee-broken-100tx; > > > > }; > > > > }; > > > > > > > > This patch adjusts the content of MDIO_AN_EEE_ADV in MDIO_MMD_AN > > > > "device" so the phydev->eee_broken_modes are taken into account > > > > from the start of the slave PHYs. > > > > > > This should be handled by phylib today in recent kernels without > > > the need for any patch (as I describe below, because the > > > config_aneg PHY method should be programming it.) Are you seeing > > > a problem with it in 6.4-rc? > > > > Unfortunately, for this project I use LTS 5.15.z kernel. > > > > My impression is that the mv88e6xxx driver is not handling EEE setup > > during initialization (even with v6.4-rc). > > > > I've tried to replace genphy_config_eee_advert() with phy_init_eee, > > but it lacks the part to program PCS advertise registers. > > Firstly, I would advise backporting the EEE changes. The older EEE > implementation was IMHO not particularly good (I think you can find > a record in the archives of me stating that the old interfaces were > just too quirky.) Ok. > > Secondly, even if you program the PHY for EEE, unless you have > something like an Atheros AR803x PHY with its SmartEEE, EEE needs > the support of both the PHY and the MAC to which its connected to > in order to work. It's the MAC which is the "client" which says > to the PHY "I'm idle" and when both ends tell their PHYs that > they're idle, the media link can then drop into the low power > state. For now - I just wanted to disable the advertisement of EEE (in my use case EEE can be sacrifice for reliability) (However, thanks for detailed insights). > > The 88e6xxx internal PHYs will communicate their EEE negotiation > state back to the MACs, Yes, internal PHYs are used for mv88e6071. > but for an external PHY, that won't happen, > and there is no code in the 88e6xxx driver to configure the MAC to > program the MAC to do EEE. Ok. I see. > > So, I'm wondering what's actually going on here... can you give > any more details about the hardware setup? > I'm using standard mv88e6071 setup (it is recognized as fixed-link PHY from imx8 fec side). Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
On Tue, May 30, 2023 at 04:47:31PM +0200, Lukasz Majewski wrote: > Hi Andrew, > > > > So, I'm wondering what's actually going on here... can you give > > > any more details about the hardware setup? > > > > And what switch it actually is. > > It is mv88e6071. > > > I've not looked in too much detail, > > but i think different switch families have different EEE capabilities. > > Yes, some (like b53) have the ability to disable EEE in the HW. > > The above one from Marvell seems to have EEE always enabled (in silicon) > and the only possibility is to not advertise it [*]. Right, and that tells the remote end "we don't support EEE" so the remote end should then disable EEE support. Meanwhile the local MAC will _still_ signal LPI towards its PHY. I have no idea whether the PHY will pass that LPI signal onwards to the media in that case, or if it prevents entering low power mode. It would be interesting to connect two of these switches together, put a 'scope on the signals between the PHY and the media isolation transformer, and see whether it's entering low power mode, comparing when EEE is successfully negotiated vs not negotiated. My suspicion would be that in the case where the MAC always signals LPI to the PHY, the result of negotiation won't make a blind bit of difference. > > But in general, as Russell pointed out, there is no MAC support for > > EEE in the mv88e6xxx driver. > > I may be wrong, but aren't we accessing this switch PHYs via c45 ? > (MDIO_MMD_PCS devices and e.g. MDIO_PCS_EEE_ABLE registers)? As I've said - EEE is a MAC-to-MAC thing. The PHYs do the capability negotiation and handle the media dependent part of EEE. However, it's the MACs that signal to the PHY "I'm idle, please enter low power mode" and when both ends that they're idle, the media link only then drops into low power mode. This is the basic high-level operation of EEE in an 802.3 compliant system. As I've also said, there are PHYs out there which do their own thing as an "enhancement" to allow MACs that aren't EEE capable to gain *some* of the power savings from EEE (and I previously noted one such example.) The PHY EEE configuration is always done via Clause 45 - either through proper clause 45 cycles on the MDIO bus, or through the MMD access through a couple of clause 22 registers. There aren't the registers in the clause 22 address space for EEE. The MDIO_PCS_EEE_ABLE registers describe what the capabilities of the PHY is to the management software (in this case phylib). These are not supposed to change. The advertisements are programmed via the autonegotiation MMD register set. There's some additional configuration bits in the PHY which control whether the clock to the MAC is stopped when entering EEE low-power mode. However, even with all that, the MAC is still what is involved in giving the PHY permission to enter EEE low-power mode. The broad outline sequence in an 802.3 compliant setup is: - Whenever the MAC sends a packet, it resets the LPI timer. - When LPI timer expires, MAC signals to PHY that it can enter low-power mode. - When the PHY at both ends both agree that they have permission from their respective MACs to enter low power mode, they initiate the process to put the media into low power mode. - If the PHY has been given permission from management software to stop clock, the PHY will stop the clock to the MAC. - When the MAC has a packet to send, the MAC stops signalling low-power mode to the PHY. - The PHY restores the clock if it was stopped, and wakes up the link, thereby causing the remote PHY to also wake up. - Normal operation resumes. 802.3 EEE is not a PHY-to-PHY thing, it's MAC-to-MAC.
> > But as a result, don't expect EEE to actually work with any LTS > > kernel. > > Then, I think that it would be best to use the above "hack" until your > patch set is not reviewed and merged. After that, when customer will > mover forward with LTS kernel, I can test the EEE on the proper HW. Just to be clear, Since EEE never really worked, i doubt these patchset will fulfil the stable rules. They are not minimal fixes, but pretty much a re-write. So you will need to wait for the LTS released December 2023. Or you do your own backport. Andrew
Hi Lukasz, On Tue, May 30, 2023 at 03:40:39PM +0200, Lukasz Majewski wrote: > Hi Andrew, > > > On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > > > One can disable in device tree advertising of EEE capabilities of > > > PHY when 'eee-broken-100tx' property is present in DTS. > > > > > > With DSA switch it also may happen that one would need to disable > > > EEE due to some network issues. > > > > Is EEE actually broken in the MAC/PHY combination? > > > > Problem is that when I connect on this project some non-manageable > switches (which by default have EEE enabled), then I observe very rare > and sporadic link loss and reconnection. The interesting question is, do other link partner or local system is broken? In some cases, not proper tx-timer was triggering this kind of symptoms. And timer configuration may depend on the link speed. So, driver may be need to take care of this. > Disabling EEE solves the problem. > > > You should not be using this DT option for configuration. It is there > > because there is some hardware which is truly broken, and needs EEE > > turned off. > > Yes, I do think that the above sentence sums up my use case. As Andrew already described, current linux kernel EEE support is not in the best shape, it is hard to see the difference between broken HW and SW. > > If EEE does work, but you need to turn it off because of latency etc, > > then please use ethtool. > > > > Yes, correct - it is possible to disable the EEE with > > ethtool --set-eee lan2 eee off > > However, as I've stated in the mail, I cannot re-enable EEE once > disabled with: > > ethtool --set-eee lan2 eee on > > ethtool --show-eee lan2 > EEE Settings for lan2: > EEE status: not supported > > > As the capability register shows value of 0. Some PHYs indeed have this issues: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/phy/micrel.c?h=v6.4-rc4#n1402 In case of your older kernel version, you will need to fake access to the EEE caps register. Regards, Oleksij
Hi Russell, > On Tue, May 30, 2023 at 04:26:49PM +0200, Andrew Lunn wrote: > > > So, I'm wondering what's actually going on here... can you give > > > any more details about the hardware setup? > > > > And what switch it actually is. I've not looked in too much detail, > > but i think different switch families have different EEE > > capabilities. But in general, as Russell pointed out, there is no > > MAC support for EEE in the mv88e6xxx driver. > > ... except for the built-in PHYs, This is my case. > which if they successfully negotiate > EEE, that status is communicated back to the MAC in that one sees > MV88E6352_PORT_STS_EEE I cannot find this register in my documentation. > set, which results in the MAC being able to > signal LPI to the PHY... and I've stuck a 'scope on the PHY media-side > signals in the past and have seen that activity does stop without > there needing to be any help from the driver for this. > > At least reading the information I have for the 88E6352, there is no > configuration of LPI timers, nor any seperate LPI enable. If EEE is > enabled at the MAC, then LPI will be signalled according to whatever > Marvell decided would be appropriate. And this knowledge is not disclosed to public. > > For an external PHY that the PPU is not polling, the only way that > we'd have EEE functional is if we forced EEE in port control register > 1 on switches that support those bits. In other words setting both the > EEE and FORCE_EEE bits... > Are those bits available in c45 standard? Or are they SoC (IC) specific? In my case I do have only two c45 registers disclosed (i.e. described) for mv88e6071 SoC in the documentation. Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
On Wed, May 31, 2023 at 10:16:40AM +0200, Lukasz Majewski wrote: > Hi Russell, > > > On Tue, May 30, 2023 at 04:26:49PM +0200, Andrew Lunn wrote: > > > > So, I'm wondering what's actually going on here... can you give > > > > any more details about the hardware setup? > > > > > > And what switch it actually is. I've not looked in too much detail, > > > but i think different switch families have different EEE > > > capabilities. But in general, as Russell pointed out, there is no > > > MAC support for EEE in the mv88e6xxx driver. > > > > ... except for the built-in PHYs, > > This is my case. > > > which if they successfully negotiate > > EEE, that status is communicated back to the MAC in that one sees > > MV88E6352_PORT_STS_EEE > > I cannot find this register in my documentation. $ grep MV88E6352_PORT_STS_EEE drivers/net/dsa/mv88e6xxx/port.h It's a register bit, not a register. > > set, which results in the MAC being able to > > signal LPI to the PHY... and I've stuck a 'scope on the PHY media-side > > signals in the past and have seen that activity does stop without > > there needing to be any help from the driver for this. > > > > At least reading the information I have for the 88E6352, there is no > > configuration of LPI timers, nor any seperate LPI enable. If EEE is > > enabled at the MAC, then LPI will be signalled according to whatever > > Marvell decided would be appropriate. > > And this knowledge is not disclosed to public. > > > > > For an external PHY that the PPU is not polling, the only way that > > we'd have EEE functional is if we forced EEE in port control register > > 1 on switches that support those bits. In other words setting both the > > EEE and FORCE_EEE bits... > > > > Are those bits available in c45 standard? Or are they SoC (IC) specific? I'm talking about bits in the switch port control register, not in the PHY. Again, grepping the above file will tell you which bits and their location.
Hi Russell, > On Tue, May 30, 2023 at 04:47:31PM +0200, Lukasz Majewski wrote: > > Hi Andrew, > > > > > > So, I'm wondering what's actually going on here... can you give > > > > any more details about the hardware setup? > > > > > > And what switch it actually is. > > > > It is mv88e6071. > > > > > I've not looked in too much detail, > > > but i think different switch families have different EEE > > > capabilities. > > > > Yes, some (like b53) have the ability to disable EEE in the HW. > > > > The above one from Marvell seems to have EEE always enabled (in > > silicon) and the only possibility is to not advertise it [*]. > > Right, and that tells the remote end "we don't support EEE" so the > remote end should then disable EEE support. > > Meanwhile the local MAC will _still_ signal LPI towards its PHY. I > have no idea whether the PHY will pass that LPI signal onwards to > the media in that case, or if it prevents entering low power mode. > > It would be interesting to connect two of these switches together, > put a 'scope on the signals between the PHY and the media isolation > transformer, and see whether it's entering low power mode, > comparing when EEE is successfully negotiated vs not negotiated. > > My suspicion would be that in the case where the MAC always signals > LPI to the PHY, the result of negotiation won't make a blind bit of > difference. > > > > But in general, as Russell pointed out, there is no MAC support > > > for EEE in the mv88e6xxx driver. > > > > I may be wrong, but aren't we accessing this switch PHYs via c45 ? > > (MDIO_MMD_PCS devices and e.g. MDIO_PCS_EEE_ABLE registers)? > > As I've said - EEE is a MAC-to-MAC thing. The PHYs do the capability > negotiation and handle the media dependent part of EEE. However, it's > the MACs that signal to the PHY "I'm idle, please enter low power > mode" and when both ends that they're idle, the media link only then > drops into low power mode. This is the basic high-level operation of > EEE in an 802.3 compliant system. > > As I've also said, there are PHYs out there which do their own thing > as an "enhancement" to allow MACs that aren't EEE capable to gain > *some* of the power savings from EEE (and I previously noted one > such example.) > > The PHY EEE configuration is always done via Clause 45 - either > through proper clause 45 cycles on the MDIO bus, or through the MMD > access through a couple of clause 22 registers. There aren't the > registers in the clause 22 address space for EEE. > > The MDIO_PCS_EEE_ABLE registers describe what the capabilities of the > PHY is to the management software (in this case phylib). These are not > supposed to change. The advertisements are programmed via the > autonegotiation MMD register set. There's some additional > configuration bits in the PHY which control whether the clock to the > MAC is stopped when entering EEE low-power mode. > > However, even with all that, the MAC is still what is involved in > giving the PHY permission to enter EEE low-power mode. > > The broad outline sequence in an 802.3 compliant setup is: > > - Whenever the MAC sends a packet, it resets the LPI timer. > - When LPI timer expires, MAC signals to PHY that it can enter > low-power mode. > - When the PHY at both ends both agree that they have permission from > their respective MACs to enter low power mode, they initiate the > process to put the media into low power mode. > - If the PHY has been given permission from management software to > stop clock, the PHY will stop the clock to the MAC. > - When the MAC has a packet to send, the MAC stops signalling > low-power mode to the PHY. > - The PHY restores the clock if it was stopped, and wakes up the link, > thereby causing the remote PHY to also wake up. > - Normal operation resumes. > > 802.3 EEE is not a PHY-to-PHY thing, it's MAC-to-MAC. > Thanks for the detailed explanation. With "switch" setup - where I do have MAC from imx8 (fec driver) connected to e.g. mv88e6071 with "fixed-link", I do guess that the EEE management is done solely in mv88e6071? In other words - the mv88e6071 solely decides if its internal PHY shall signal EEE to the peer switch. Just for the record - the mv88e6071 has a "register space" where one can tune assertion and wakeup timers. Unfortunately, there is no "bit" setting to disable EEE in the HW. Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
Hi Andrew, > > > But as a result, don't expect EEE to actually work with any LTS > > > kernel. > > > > Then, I think that it would be best to use the above "hack" until > > your patch set is not reviewed and merged. After that, when > > customer will mover forward with LTS kernel, I can test the EEE on > > the proper HW. > > Just to be clear, Since EEE never really worked, i doubt these > patchset will fulfil the stable rules. They are not minimal fixes, but > pretty much a re-write. So you will need to wait for the LTS released > December 2023. Or you do your own backport. I will backport (if requred) those patches to 6.x LTS. > > Andrew Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
Hi Oleksij, > Hi Lukasz, > > On Tue, May 30, 2023 at 03:40:39PM +0200, Lukasz Majewski wrote: > > Hi Andrew, > > > > > On Tue, May 30, 2023 at 02:26:21PM +0200, Lukasz Majewski wrote: > > > > One can disable in device tree advertising of EEE capabilities > > > > of PHY when 'eee-broken-100tx' property is present in DTS. > > > > > > > > With DSA switch it also may happen that one would need to > > > > disable EEE due to some network issues. > > > > > > Is EEE actually broken in the MAC/PHY combination? > > > > > > > Problem is that when I connect on this project some non-manageable > > switches (which by default have EEE enabled), then I observe very > > rare and sporadic link loss and reconnection. > > The interesting question is, do other link partner or local system is > broken? I was able to reproduce customer issue on my local setup with this switch. Moreover, only two devices were connected to this switch - mv88e6071 and the host PC. > In some cases, not proper tx-timer was triggering this kind of > symptoms. And timer configuration may depend on the link speed. So, > driver may be need to take care of this. The speed is only 100 Mbps, full duplex (as this switch only supports speed up till 100Mbps). > > > Disabling EEE solves the problem. > > > > > You should not be using this DT option for configuration. It is > > > there because there is some hardware which is truly broken, and > > > needs EEE turned off. > > > > Yes, I do think that the above sentence sums up my use case. > > As Andrew already described, current linux kernel EEE support is not > in the best shape, it is hard to see the difference between broken HW > and SW. Ok. > > > > If EEE does work, but you need to turn it off because of latency > > > etc, then please use ethtool. > > > > > > > Yes, correct - it is possible to disable the EEE with > > > > ethtool --set-eee lan2 eee off > > > > However, as I've stated in the mail, I cannot re-enable EEE once > > disabled with: > > > > ethtool --set-eee lan2 eee on > > > > ethtool --show-eee lan2 > > EEE Settings for lan2: > > EEE status: not supported > > > > > > As the capability register shows value of 0. > > Some PHYs indeed have this issues: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/phy/micrel.c?h=v6.4-rc4#n1402 > > In case of your older kernel version, you will need to fake access to > the EEE caps register. Thanks for pointing this out - I will add this functionality when re-enabling of EEE will be required via ethtool. > > Regards, > Oleksij Best regards, Lukasz Majewski -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@denx.de
> Thanks for the detailed explanation. > > With "switch" setup - where I do have MAC from imx8 (fec driver) > connected to e.g. mv88e6071 with "fixed-link", I do guess that the EEE > management is done solely in mv88e6071? So you have the MAC-MAC connection? No back to back PHYs in the middle. If there is no PHY, linux will not do anything with EEE. What happens will depend on the reset defaults of the switch. For the FEC phy_eee_init() will return false, so i expect EEE is disabled. > In other words - the mv88e6071 solely decides if its internal PHY shall > signal EEE to the peer switch. Handling EEE in the mv88e6xxx driver is something on my todo list. But i don't expect it to happen soon. And before it will happen we actually need to decide how the user API should work. Andrew
diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 353d8fff3166..712923c7d4e2 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2247,6 +2247,7 @@ static int dsa_slave_phy_setup(struct net_device *slave_dev) phylink_destroy(dp->pl); } + genphy_config_eee_advert(slave_dev->phydev); return ret; }
One can disable in device tree advertising of EEE capabilities of PHY when 'eee-broken-100tx' property is present in DTS. With DSA switch it also may happen that one would need to disable EEE due to some network issues. Corresponding switch DTS description: switch@0 { ports { port@0 { reg = <0>; label = "lan1"; phy-handle = <&switchphy0>; }; } mdio { switchphy0: switchphy@0 { reg = <0>; eee-broken-100tx; }; }; This patch adjusts the content of MDIO_AN_EEE_ADV in MDIO_MMD_AN "device" so the phydev->eee_broken_modes are taken into account from the start of the slave PHYs. As a result the 'ethtool --show-eee lan1' shows that EEE is not supported from the outset. Questions: - Is the genphy_config_eee_advert() appropriate to be used here? As I found this issue on 5.15 kernel, it looks like mainline now uses PHY features for handle EEE (but the aforementioned function is still present in newest mainline - v6.4-rc1). - I've also observed strange behaviour for EEE capability register: Why the value in MDIO_MMD_PCS device; reg MDIO_PCS_EEE_ABLE is somewhat "volatile" - in a sense that when I use: ethtool --set-eee lan2 eee off It is cleared by PHY itself to 0x0 (from 0x2) and turning it on again is not working. Is this expected? Or am I missing something? Signed-off-by: Lukasz Majewski <lukma@denx.de> --- net/dsa/slave.c | 1 + 1 file changed, 1 insertion(+)