Message ID | 20240708075023.14893-4-brgl@bgdev.pl (mailing list archive) |
---|---|
State | Accepted |
Commit | 708405f3e56e3bee086f46a43398ee300c36a1d8 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: phy: aquantia: enable support for aqr115c | expand |
Hi Bartosz, On 08/07/2024 08:50, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > When the PHY is first coming up (or resuming from suspend), it's > possible that although the FW status shows as running, we still see > zeroes in the GLOBAL_CFG set of registers and cannot determine available > modes. Since all models support 10M, add a poll and wait the config to > become available. > > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > --- > drivers/net/phy/aquantia/aquantia_main.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c > index 974795bd0860..2c8ba2725a91 100644 > --- a/drivers/net/phy/aquantia/aquantia_main.c > +++ b/drivers/net/phy/aquantia/aquantia_main.c > @@ -652,7 +652,13 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) > unsigned long *possible = phydev->possible_interfaces; > unsigned int serdes_mode, rate_adapt; > phy_interface_t interface; > - int i, val; > + int i, val, ret; > + > + ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, > + VEND1_GLOBAL_CFG_10M, val, val != 0, > + 1000, 100000, false); > + if (ret) > + return ret; > > /* Walk the media-speed configuration registers to determine which > * host-side serdes modes may be used by the PHY depending on the With the current -next and mainline we are seeing the following issue on our Tegra234 Jetson AGX Orin platform ... Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110 tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110) We have tracked it down to this change and looks like our PHY does not support 10M ... $ ethtool eth0 Settings for eth0: Supported ports: [ ] Supported link modes: 100baseT/Full 1000baseT/Full 10000baseT/Full 1000baseKX/Full 10000baseKX4/Full 10000baseKR/Full 2500baseT/Full 5000baseT/Full The following fixes this for this platform ... diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c index d12e35374231..0b2db486d8bd 100644 --- a/drivers/net/phy/aquantia/aquantia_main.c +++ b/drivers/net/phy/aquantia/aquantia_main.c @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) int i, val, ret; ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, - VEND1_GLOBAL_CFG_10M, val, val != 0, + VEND1_GLOBAL_CFG_100M, val, val != 0, 1000, 100000, false); if (ret) return ret; However, I am not sure if this is guaranteed to work for all? On a related note, we had also found an issue with this PHY where the PHY reset is not working quite correctly. What we see is that when polling for the firmware ID in aqr107_wait_reset_complete() is that value in the firware ID registers transitions from 0 to 0xffff and then to the firmware ID. We have been testing the following change to fix this ... diff --git a/drivers/net/phy/aquantia/aquantia.h b/drivers/net/phy/aquantia/aquantia.h index 2465345081f8..278e3b167c58 100644 --- a/drivers/net/phy/aquantia/aquantia.h +++ b/drivers/net/phy/aquantia/aquantia.h @@ -20,6 +20,7 @@ #define VEND1_GLOBAL_FW_ID 0x0020 #define VEND1_GLOBAL_FW_ID_MAJOR GENMASK(15, 8) #define VEND1_GLOBAL_FW_ID_MINOR GENMASK(7, 0) +#define VEND1_GLOBAL_FW_ID_MASK GENMASK(15, 0) #define VEND1_GLOBAL_MAILBOX_INTERFACE1 0x0200 #define VEND1_GLOBAL_MAILBOX_INTERFACE1_EXECUTE BIT(15) diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c index 0b2db486d8bd..5023fd70050d 100644 --- a/drivers/net/phy/aquantia/aquantia_main.c +++ b/drivers/net/phy/aquantia/aquantia_main.c @@ -447,7 +447,9 @@ int aqr_wait_reset_complete(struct phy_device *phydev) int val; return phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, - VEND1_GLOBAL_FW_ID, val, val != 0, + VEND1_GLOBAL_FW_ID, val, + ((val & VEND1_GLOBAL_FW_ID_MASK) != 0 && + (val & VEND1_GLOBAL_FW_ID_MASK) != VEND1_GLOBAL_FW_ID_MASK), 20000, 2000000, false); } I have not tried the resume use-case, but curious if this may also help here? Cheers Jon
On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote: > > Hi Bartosz, > > On 08/07/2024 08:50, Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > > > When the PHY is first coming up (or resuming from suspend), it's > > possible that although the FW status shows as running, we still see > > zeroes in the GLOBAL_CFG set of registers and cannot determine available > > modes. Since all models support 10M, add a poll and wait the config to > > become available. > > > > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > --- > > drivers/net/phy/aquantia/aquantia_main.c | 8 +++++++- > > 1 file changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c > > index 974795bd0860..2c8ba2725a91 100644 > > --- a/drivers/net/phy/aquantia/aquantia_main.c > > +++ b/drivers/net/phy/aquantia/aquantia_main.c > > @@ -652,7 +652,13 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) > > unsigned long *possible = phydev->possible_interfaces; > > unsigned int serdes_mode, rate_adapt; > > phy_interface_t interface; > > - int i, val; > > + int i, val, ret; > > + > > + ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, > > + VEND1_GLOBAL_CFG_10M, val, val != 0, > > + 1000, 100000, false); > > + if (ret) > > + return ret; > > > > /* Walk the media-speed configuration registers to determine which > > * host-side serdes modes may be used by the PHY depending on the > > > With the current -next and mainline we are seeing the following issue on > our Tegra234 Jetson AGX Orin platform ... > > Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110 > tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110) > > > We have tracked it down to this change and looks like our PHY does not > support 10M ... > > $ ethtool eth0 > Settings for eth0: > Supported ports: [ ] > Supported link modes: 100baseT/Full > 1000baseT/Full > 10000baseT/Full > 1000baseKX/Full > 10000baseKX4/Full > 10000baseKR/Full > 2500baseT/Full > 5000baseT/Full > > The following fixes this for this platform ... > > diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c > index d12e35374231..0b2db486d8bd 100644 > --- a/drivers/net/phy/aquantia/aquantia_main.c > +++ b/drivers/net/phy/aquantia/aquantia_main.c > @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) > int i, val, ret; > > ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, > - VEND1_GLOBAL_CFG_10M, val, val != 0, > + VEND1_GLOBAL_CFG_100M, val, val != 0, > 1000, 100000, false); > if (ret) > return ret; > > > However, I am not sure if this is guaranteed to work for all? Ah cr*p. No, I don't think it is. We should take the first supported mode for a given PHY I think. > > On a related note, we had also found an issue with this PHY where > the PHY reset is not working quite correctly. What we see is that > when polling for the firmware ID in aqr107_wait_reset_complete() > is that value in the firware ID registers transitions from 0 to > 0xffff and then to the firmware ID. We have been testing the > following change to fix this ... > > diff --git a/drivers/net/phy/aquantia/aquantia.h b/drivers/net/phy/aquantia/aquantia.h > index 2465345081f8..278e3b167c58 100644 > --- a/drivers/net/phy/aquantia/aquantia.h > +++ b/drivers/net/phy/aquantia/aquantia.h > @@ -20,6 +20,7 @@ > #define VEND1_GLOBAL_FW_ID 0x0020 > #define VEND1_GLOBAL_FW_ID_MAJOR GENMASK(15, 8) > #define VEND1_GLOBAL_FW_ID_MINOR GENMASK(7, 0) > +#define VEND1_GLOBAL_FW_ID_MASK GENMASK(15, 0) > > #define VEND1_GLOBAL_MAILBOX_INTERFACE1 0x0200 > #define VEND1_GLOBAL_MAILBOX_INTERFACE1_EXECUTE BIT(15) > diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c > index 0b2db486d8bd..5023fd70050d 100644 > --- a/drivers/net/phy/aquantia/aquantia_main.c > +++ b/drivers/net/phy/aquantia/aquantia_main.c > @@ -447,7 +447,9 @@ int aqr_wait_reset_complete(struct phy_device *phydev) > int val; > > return phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, > - VEND1_GLOBAL_FW_ID, val, val != 0, > + VEND1_GLOBAL_FW_ID, val, > + ((val & VEND1_GLOBAL_FW_ID_MASK) != 0 && > + (val & VEND1_GLOBAL_FW_ID_MASK) != VEND1_GLOBAL_FW_ID_MASK), > 20000, 2000000, false); > } > > I have not tried the resume use-case, but curious if this may > also help here? > Interesting. Unfortunately this doesn't help with the suspend/resume use-case if I revert my offending patch. Bart > Cheers > Jon > > -- > nvpublic
On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote: > > On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote: > > > > > > With the current -next and mainline we are seeing the following issue on > > our Tegra234 Jetson AGX Orin platform ... > > > > Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110 > > tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110) > > > > > > We have tracked it down to this change and looks like our PHY does not > > support 10M ... > > > > $ ethtool eth0 > > Settings for eth0: > > Supported ports: [ ] > > Supported link modes: 100baseT/Full > > 1000baseT/Full > > 10000baseT/Full > > 1000baseKX/Full > > 10000baseKX4/Full > > 10000baseKR/Full > > 2500baseT/Full > > 5000baseT/Full > > > > The following fixes this for this platform ... > > > > diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c > > index d12e35374231..0b2db486d8bd 100644 > > --- a/drivers/net/phy/aquantia/aquantia_main.c > > +++ b/drivers/net/phy/aquantia/aquantia_main.c > > @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) > > int i, val, ret; > > > > ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, > > - VEND1_GLOBAL_CFG_10M, val, val != 0, > > + VEND1_GLOBAL_CFG_100M, val, val != 0, > > 1000, 100000, false); > > if (ret) > > return ret; > > > > > > However, I am not sure if this is guaranteed to work for all? > > Ah cr*p. No, I don't think it is. We should take the first supported > mode for a given PHY I think. > TBH I only observed the issue on AQR115C. I don't have any other model to test with. Is it fine to fix it by implementing aqr115_fill_interface_modes() that would first wait for this register to return non-0 and then call aqr107_fill_interface_modes()? Bartosz
On 18/07/2024 14:29, Bartosz Golaszewski wrote: > On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote: >> >> On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote: >>> >>> >>> With the current -next and mainline we are seeing the following issue on >>> our Tegra234 Jetson AGX Orin platform ... >>> >>> Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110 >>> tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110) >>> >>> >>> We have tracked it down to this change and looks like our PHY does not >>> support 10M ... >>> >>> $ ethtool eth0 >>> Settings for eth0: >>> Supported ports: [ ] >>> Supported link modes: 100baseT/Full >>> 1000baseT/Full >>> 10000baseT/Full >>> 1000baseKX/Full >>> 10000baseKX4/Full >>> 10000baseKR/Full >>> 2500baseT/Full >>> 5000baseT/Full >>> >>> The following fixes this for this platform ... >>> >>> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c >>> index d12e35374231..0b2db486d8bd 100644 >>> --- a/drivers/net/phy/aquantia/aquantia_main.c >>> +++ b/drivers/net/phy/aquantia/aquantia_main.c >>> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) >>> int i, val, ret; >>> >>> ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, >>> - VEND1_GLOBAL_CFG_10M, val, val != 0, >>> + VEND1_GLOBAL_CFG_100M, val, val != 0, >>> 1000, 100000, false); >>> if (ret) >>> return ret; >>> >>> >>> However, I am not sure if this is guaranteed to work for all? >> >> Ah cr*p. No, I don't think it is. We should take the first supported >> mode for a given PHY I think. >> > > TBH I only observed the issue on AQR115C. I don't have any other model > to test with. Is it fine to fix it by implementing > aqr115_fill_interface_modes() that would first wait for this register > to return non-0 and then call aqr107_fill_interface_modes()? I am doing a bit more testing. We have seen a few issues with this PHY driver and so I am wondering if we also need something similar for the AQR113C variant too. Interestingly, the product brief for these PHYs [0] do show that both the AQR113C and AQR115C both support 10M. So I wonder if it is our ethernet controller that is not supporting 10M? I will check on this too. Jon [0] https://www.marvell.com/content/dam/marvell/en/public-collateral/transceivers/marvell-phys-transceivers-aqrate-gen4-product-brief.pdf
On Thu, Jul 18, 2024 at 4:08 PM Jon Hunter <jonathanh@nvidia.com> wrote: > > > On 18/07/2024 14:29, Bartosz Golaszewski wrote: > > On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote: > >> > >> On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote: > >>> > >>> > >>> With the current -next and mainline we are seeing the following issue on > >>> our Tegra234 Jetson AGX Orin platform ... > >>> > >>> Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110 > >>> tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110) > >>> > >>> > >>> We have tracked it down to this change and looks like our PHY does not > >>> support 10M ... > >>> > >>> $ ethtool eth0 > >>> Settings for eth0: > >>> Supported ports: [ ] > >>> Supported link modes: 100baseT/Full > >>> 1000baseT/Full > >>> 10000baseT/Full > >>> 1000baseKX/Full > >>> 10000baseKX4/Full > >>> 10000baseKR/Full > >>> 2500baseT/Full > >>> 5000baseT/Full > >>> > >>> The following fixes this for this platform ... > >>> > >>> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c > >>> index d12e35374231..0b2db486d8bd 100644 > >>> --- a/drivers/net/phy/aquantia/aquantia_main.c > >>> +++ b/drivers/net/phy/aquantia/aquantia_main.c > >>> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) > >>> int i, val, ret; > >>> > >>> ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, > >>> - VEND1_GLOBAL_CFG_10M, val, val != 0, > >>> + VEND1_GLOBAL_CFG_100M, val, val != 0, > >>> 1000, 100000, false); > >>> if (ret) > >>> return ret; > >>> > >>> > >>> However, I am not sure if this is guaranteed to work for all? > >> > >> Ah cr*p. No, I don't think it is. We should take the first supported > >> mode for a given PHY I think. > >> > > > > TBH I only observed the issue on AQR115C. I don't have any other model > > to test with. Is it fine to fix it by implementing > > aqr115_fill_interface_modes() that would first wait for this register > > to return non-0 and then call aqr107_fill_interface_modes()? > > I am doing a bit more testing. We have seen a few issues with this PHY > driver and so I am wondering if we also need something similar for the > AQR113C variant too. > > Interestingly, the product brief for these PHYs [0] do show that both > the AQR113C and AQR115C both support 10M. So I wonder if it is our > ethernet controller that is not supporting 10M? I will check on this too. > Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it should support 10M. In fact all AQR PHYs should hence my initial change. Bart > Jon > > [0] > https://www.marvell.com/content/dam/marvell/en/public-collateral/transceivers/marvell-phys-transceivers-aqrate-gen4-product-brief.pdf > > > -- > nvpublic
On 18/07/2024 15:13, Bartosz Golaszewski wrote: > On Thu, Jul 18, 2024 at 4:08 PM Jon Hunter <jonathanh@nvidia.com> wrote: >> >> >> On 18/07/2024 14:29, Bartosz Golaszewski wrote: >>> On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote: >>>> >>>> On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote: >>>>> >>>>> >>>>> With the current -next and mainline we are seeing the following issue on >>>>> our Tegra234 Jetson AGX Orin platform ... >>>>> >>>>> Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110 >>>>> tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110) >>>>> >>>>> >>>>> We have tracked it down to this change and looks like our PHY does not >>>>> support 10M ... >>>>> >>>>> $ ethtool eth0 >>>>> Settings for eth0: >>>>> Supported ports: [ ] >>>>> Supported link modes: 100baseT/Full >>>>> 1000baseT/Full >>>>> 10000baseT/Full >>>>> 1000baseKX/Full >>>>> 10000baseKX4/Full >>>>> 10000baseKR/Full >>>>> 2500baseT/Full >>>>> 5000baseT/Full >>>>> >>>>> The following fixes this for this platform ... >>>>> >>>>> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c >>>>> index d12e35374231..0b2db486d8bd 100644 >>>>> --- a/drivers/net/phy/aquantia/aquantia_main.c >>>>> +++ b/drivers/net/phy/aquantia/aquantia_main.c >>>>> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) >>>>> int i, val, ret; >>>>> >>>>> ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, >>>>> - VEND1_GLOBAL_CFG_10M, val, val != 0, >>>>> + VEND1_GLOBAL_CFG_100M, val, val != 0, >>>>> 1000, 100000, false); >>>>> if (ret) >>>>> return ret; >>>>> >>>>> >>>>> However, I am not sure if this is guaranteed to work for all? >>>> >>>> Ah cr*p. No, I don't think it is. We should take the first supported >>>> mode for a given PHY I think. >>>> >>> >>> TBH I only observed the issue on AQR115C. I don't have any other model >>> to test with. Is it fine to fix it by implementing >>> aqr115_fill_interface_modes() that would first wait for this register >>> to return non-0 and then call aqr107_fill_interface_modes()? >> >> I am doing a bit more testing. We have seen a few issues with this PHY >> driver and so I am wondering if we also need something similar for the >> AQR113C variant too. >> >> Interestingly, the product brief for these PHYs [0] do show that both >> the AQR113C and AQR115C both support 10M. So I wonder if it is our >> ethernet controller that is not supporting 10M? I will check on this too. >> > > Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it > should support 10M. In fact all AQR PHYs should hence my initial > change. Yes we have an AQR113C. I agree it should support this, but for whatever reason this is not advertised. I do see that 10M is advertised as supported by the network ... Link partner advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full My PC that is on the same network supports 10M, but just not this Tegra device. I am checking to see if this is expected for this device. Jon
On Thu, Jul 18, 2024 at 4:49 PM Jon Hunter <jonathanh@nvidia.com> wrote: > > > On 18/07/2024 15:13, Bartosz Golaszewski wrote: > > On Thu, Jul 18, 2024 at 4:08 PM Jon Hunter <jonathanh@nvidia.com> wrote: > >> > >> > >> On 18/07/2024 14:29, Bartosz Golaszewski wrote: > >>> On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote: > >>>> > >>>> On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote: > >>>>> > >>>>> > >>>>> With the current -next and mainline we are seeing the following issue on > >>>>> our Tegra234 Jetson AGX Orin platform ... > >>>>> > >>>>> Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110 > >>>>> tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110) > >>>>> > >>>>> > >>>>> We have tracked it down to this change and looks like our PHY does not > >>>>> support 10M ... > >>>>> > >>>>> $ ethtool eth0 > >>>>> Settings for eth0: > >>>>> Supported ports: [ ] > >>>>> Supported link modes: 100baseT/Full > >>>>> 1000baseT/Full > >>>>> 10000baseT/Full > >>>>> 1000baseKX/Full > >>>>> 10000baseKX4/Full > >>>>> 10000baseKR/Full > >>>>> 2500baseT/Full > >>>>> 5000baseT/Full > >>>>> > >>>>> The following fixes this for this platform ... > >>>>> > >>>>> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c > >>>>> index d12e35374231..0b2db486d8bd 100644 > >>>>> --- a/drivers/net/phy/aquantia/aquantia_main.c > >>>>> +++ b/drivers/net/phy/aquantia/aquantia_main.c > >>>>> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) > >>>>> int i, val, ret; > >>>>> > >>>>> ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, > >>>>> - VEND1_GLOBAL_CFG_10M, val, val != 0, > >>>>> + VEND1_GLOBAL_CFG_100M, val, val != 0, > >>>>> 1000, 100000, false); > >>>>> if (ret) > >>>>> return ret; > >>>>> > >>>>> > >>>>> However, I am not sure if this is guaranteed to work for all? > >>>> > >>>> Ah cr*p. No, I don't think it is. We should take the first supported > >>>> mode for a given PHY I think. > >>>> > >>> > >>> TBH I only observed the issue on AQR115C. I don't have any other model > >>> to test with. Is it fine to fix it by implementing > >>> aqr115_fill_interface_modes() that would first wait for this register > >>> to return non-0 and then call aqr107_fill_interface_modes()? > >> > >> I am doing a bit more testing. We have seen a few issues with this PHY > >> driver and so I am wondering if we also need something similar for the > >> AQR113C variant too. > >> > >> Interestingly, the product brief for these PHYs [0] do show that both > >> the AQR113C and AQR115C both support 10M. So I wonder if it is our > >> ethernet controller that is not supporting 10M? I will check on this too. > >> > > > > Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it > > should support 10M. In fact all AQR PHYs should hence my initial > > change. > > > Yes we have an AQR113C. I agree it should support this, but for whatever > reason this is not advertised. I do see that 10M is advertised as > supported by the network ... > > Link partner advertised link modes: 10baseT/Half 10baseT/Full > 100baseT/Half 100baseT/Full > 1000baseT/Full > > My PC that is on the same network supports 10M, but just not this Tegra > device. I am checking to see if this is expected for this device. > I sent a patch for you to test. I think that even if it doesn't fully fix the issue you're observing, it's worth picking it up as it reduces the impact of the workaround I introduced. I'll be off next week so I'm sending it quickly with the hope it will be useful. Bart
On 18/07/2024 15:59, Bartosz Golaszewski wrote: ... >>>>> TBH I only observed the issue on AQR115C. I don't have any other model >>>>> to test with. Is it fine to fix it by implementing >>>>> aqr115_fill_interface_modes() that would first wait for this register >>>>> to return non-0 and then call aqr107_fill_interface_modes()? >>>> >>>> I am doing a bit more testing. We have seen a few issues with this PHY >>>> driver and so I am wondering if we also need something similar for the >>>> AQR113C variant too. >>>> >>>> Interestingly, the product brief for these PHYs [0] do show that both >>>> the AQR113C and AQR115C both support 10M. So I wonder if it is our >>>> ethernet controller that is not supporting 10M? I will check on this too. >>>> >>> >>> Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it >>> should support 10M. In fact all AQR PHYs should hence my initial >>> change. >> >> >> Yes we have an AQR113C. I agree it should support this, but for whatever >> reason this is not advertised. I do see that 10M is advertised as >> supported by the network ... >> >> Link partner advertised link modes: 10baseT/Half 10baseT/Full >> 100baseT/Half 100baseT/Full >> 1000baseT/Full >> >> My PC that is on the same network supports 10M, but just not this Tegra >> device. I am checking to see if this is expected for this device. >> > > I sent a patch for you to test. I think that even if it doesn't fully > fix the issue you're observing, it's worth picking it up as it reduces > the impact of the workaround I introduced. Thanks! I will test this tonight. > I'll be off next week so I'm sending it quickly with the hope it will be useful. OK thanks for letting me know. Another thought I had, which is also quite timely, is that I have recently been testing a patch [0] as I found that this actually resolves an issue where we occasionally see our device fail to get an IP address. This was sent out over a year ago and sadly we failed to follow up :-( Russell was concerned if this would make the function that was being changed fail if it did not have the link (if I am understanding the comments correctly). However, looking at the code now, I see that the aqr107_read_status() function checks if '!phydev->link' before we poll the TX ready status, and so I am wondering if this change is OK? From my testing it does work. I would be interested to know if this may also resolve your issue? With this change [0] I have been able to do 500 boots on our board and verify that the ethernet controller is able to get an IP address every time. Without this change it would fail to get an IP address anywhere from 1-100 boots typically. I will test your patch in the same way, but I am wondering if both are trying to address the same sort of issue? Cheers Jon [0] https://lore.kernel.org/linux-tegra/20230628124326.55732-3-ruppala@nvidia.com/#t
On Thu, Jul 18, 2024 at 7:42 PM Jon Hunter <jonathanh@nvidia.com> wrote: > > > On 18/07/2024 15:59, Bartosz Golaszewski wrote: > > ... > > >>>>> TBH I only observed the issue on AQR115C. I don't have any other model > >>>>> to test with. Is it fine to fix it by implementing > >>>>> aqr115_fill_interface_modes() that would first wait for this register > >>>>> to return non-0 and then call aqr107_fill_interface_modes()? > >>>> > >>>> I am doing a bit more testing. We have seen a few issues with this PHY > >>>> driver and so I am wondering if we also need something similar for the > >>>> AQR113C variant too. > >>>> > >>>> Interestingly, the product brief for these PHYs [0] do show that both > >>>> the AQR113C and AQR115C both support 10M. So I wonder if it is our > >>>> ethernet controller that is not supporting 10M? I will check on this too. > >>>> > >>> > >>> Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it > >>> should support 10M. In fact all AQR PHYs should hence my initial > >>> change. > >> > >> > >> Yes we have an AQR113C. I agree it should support this, but for whatever > >> reason this is not advertised. I do see that 10M is advertised as > >> supported by the network ... > >> > >> Link partner advertised link modes: 10baseT/Half 10baseT/Full > >> 100baseT/Half 100baseT/Full > >> 1000baseT/Full > >> > >> My PC that is on the same network supports 10M, but just not this Tegra > >> device. I am checking to see if this is expected for this device. > >> > > > > I sent a patch for you to test. I think that even if it doesn't fully > > fix the issue you're observing, it's worth picking it up as it reduces > > the impact of the workaround I introduced. > > > Thanks! I will test this tonight. > > > I'll be off next week so I'm sending it quickly with the hope it will be useful. > > > OK thanks for letting me know. > > Another thought I had, which is also quite timely, is that I have > recently been testing a patch [0] as I found that this actually resolves > an issue where we occasionally see our device fail to get an IP address. > > This was sent out over a year ago and sadly we failed to follow up :-( > > Russell was concerned if this would make the function that was being > changed fail if it did not have the link (if I am understanding the > comments correctly). However, looking at the code now, I see that the > aqr107_read_status() function checks if '!phydev->link' before we poll > the TX ready status, and so I am wondering if this change is OK? From my > testing it does work. I would be interested to know if this may also > resolve your issue? > > With this change [0] I have been able to do 500 boots on our board and > verify that the ethernet controller is able to get an IP address every > time. Without this change it would fail to get an IP address anywhere > from 1-100 boots typically. > > I will test your patch in the same way, but I am wondering if both are > trying to address the same sort of issue? > The patch you linked does not fix the suspend/resume either. :( Bartosz > Cheers > Jon > > [0] > https://lore.kernel.org/linux-tegra/20230628124326.55732-3-ruppala@nvidia.com/#t > > -- > nvpublic
> Interestingly, the product brief for these PHYs [0] do show that both the > AQR113C and AQR115C both support 10M. So I wonder if it is our ethernet > controller that is not supporting 10M? I will check on this too. The PHY should enumerate what it supports, look at genphy_read_abilities(), which reads the Basic Mode Status Register, and there are bits in there which indicate if the PHY supports 10Half and 10Full. The MAC should also be indicating what it support, and ethtool will be showing you the combination of the two. Andrew
On 18/07/2024 20:05, Bartosz Golaszewski wrote: > On Thu, Jul 18, 2024 at 7:42 PM Jon Hunter <jonathanh@nvidia.com> wrote: >> >> >> On 18/07/2024 15:59, Bartosz Golaszewski wrote: >> >> ... >> >>>>>>> TBH I only observed the issue on AQR115C. I don't have any other model >>>>>>> to test with. Is it fine to fix it by implementing >>>>>>> aqr115_fill_interface_modes() that would first wait for this register >>>>>>> to return non-0 and then call aqr107_fill_interface_modes()? >>>>>> >>>>>> I am doing a bit more testing. We have seen a few issues with this PHY >>>>>> driver and so I am wondering if we also need something similar for the >>>>>> AQR113C variant too. >>>>>> >>>>>> Interestingly, the product brief for these PHYs [0] do show that both >>>>>> the AQR113C and AQR115C both support 10M. So I wonder if it is our >>>>>> ethernet controller that is not supporting 10M? I will check on this too. >>>>>> >>>>> >>>>> Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it >>>>> should support 10M. In fact all AQR PHYs should hence my initial >>>>> change. >>>> >>>> >>>> Yes we have an AQR113C. I agree it should support this, but for whatever >>>> reason this is not advertised. I do see that 10M is advertised as >>>> supported by the network ... >>>> >>>> Link partner advertised link modes: 10baseT/Half 10baseT/Full >>>> 100baseT/Half 100baseT/Full >>>> 1000baseT/Full >>>> >>>> My PC that is on the same network supports 10M, but just not this Tegra >>>> device. I am checking to see if this is expected for this device. >>>> >>> >>> I sent a patch for you to test. I think that even if it doesn't fully >>> fix the issue you're observing, it's worth picking it up as it reduces >>> the impact of the workaround I introduced. >> >> >> Thanks! I will test this tonight. >> >>> I'll be off next week so I'm sending it quickly with the hope it will be useful. >> >> >> OK thanks for letting me know. >> >> Another thought I had, which is also quite timely, is that I have >> recently been testing a patch [0] as I found that this actually resolves >> an issue where we occasionally see our device fail to get an IP address. >> >> This was sent out over a year ago and sadly we failed to follow up :-( >> >> Russell was concerned if this would make the function that was being >> changed fail if it did not have the link (if I am understanding the >> comments correctly). However, looking at the code now, I see that the >> aqr107_read_status() function checks if '!phydev->link' before we poll >> the TX ready status, and so I am wondering if this change is OK? From my >> testing it does work. I would be interested to know if this may also >> resolve your issue? >> >> With this change [0] I have been able to do 500 boots on our board and >> verify that the ethernet controller is able to get an IP address every >> time. Without this change it would fail to get an IP address anywhere >> from 1-100 boots typically. >> >> I will test your patch in the same way, but I am wondering if both are >> trying to address the same sort of issue? >> > > The patch you linked does not fix the suspend/resume either. :( Thanks for testing! I have verified that the patch you sent resolves the issue introduced by this patch for Tegra. And likewise this patch does not resolve the long-standing issue (not related to this change) that we have been observing. Cheers Jon
diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c index 974795bd0860..2c8ba2725a91 100644 --- a/drivers/net/phy/aquantia/aquantia_main.c +++ b/drivers/net/phy/aquantia/aquantia_main.c @@ -652,7 +652,13 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev) unsigned long *possible = phydev->possible_interfaces; unsigned int serdes_mode, rate_adapt; phy_interface_t interface; - int i, val; + int i, val, ret; + + ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1, + VEND1_GLOBAL_CFG_10M, val, val != 0, + 1000, 100000, false); + if (ret) + return ret; /* Walk the media-speed configuration registers to determine which * host-side serdes modes may be used by the PHY depending on the