diff mbox series

[v2] ARM: imx: allow to disable board specific PHY fixups

Message ID 20200329110457.4113-1-o.rempel@pengutronix.de (mailing list archive)
State New, archived
Headers show
Series [v2] ARM: imx: allow to disable board specific PHY fixups | expand

Commit Message

Oleksij Rempel March 29, 2020, 11:04 a.m. UTC
All PHY fixups located in imx and mxs machine code are PHY and/or board
specific. Never the less, they are applied to all boards independent on
how related PHY is actually connected. As result:
- we have boards with wrong PHY defaults which are not overwritten or
  not properly handled by PHY drivers.
- Some PHY driver changes was never tested and bugs was never detected
  due the fixups.
- Same PHY specific errata was fixed by SoC specific fixup, so the same
  issues should be investigated again after switching to different SoC
  on same board.

Since removing this fixups will brake may existing boards, we'll provide a
Kconfig option which can be used by kernel developers and system integrators.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
 arch/arm/mach-imx/mach-imx6q.c  |  3 ++-
 arch/arm/mach-imx/mach-imx6sx.c |  3 ++-
 arch/arm/mach-imx/mach-imx7d.c  |  3 ++-
 arch/arm/mach-mxs/mach-mxs.c    |  3 ++-
 drivers/net/phy/Kconfig         | 16 ++++++++++++++++
 5 files changed, 24 insertions(+), 4 deletions(-)

Comments

Andrew Lunn March 29, 2020, 3:08 p.m. UTC | #1
On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:

Hi Oleksij

> +config DEPRECATED_PHY_FIXUPS
> +	bool "Enable deprecated PHY fixups"
> +	default y
> +	---help---
> +	  In the early days it was common practice to configure PHYs by adding a
> +	  phy_register_fixup*() in the machine code. This practice turned out to
> +	  be potentially dangerous, because:
> +	  - it affects all PHYs in the system
> +	  - these register changes are usually not preserved during PHY reset
> +	    or suspend/resume cycle.
> +	  - it complicates debugging, since these configuration changes were not
> +	    done by the actual PHY driver.
> +	  This option allows to disable all fixups which are identified as
> +	  potentially harmful and give the developers a chance to implement the
> +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
> +	  related PHY drivers.

This appears to be an IMX only problem. Everybody else seems to of got
this right. There is no need to bother everybody with this new
option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
the name.

Having said that, i'm not sure this is the best solution. You cannot
build one kernel which runs on all machines. Did you consider some
sort of DT property to disable these fixup? What other ideas did you
have before deciding on this solution?

     Andrew
Oleksij Rempel March 30, 2020, 5:26 a.m. UTC | #2
Hi Andrew,

On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:
> On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
> 
> Hi Oleksij
> 
> > +config DEPRECATED_PHY_FIXUPS
> > +	bool "Enable deprecated PHY fixups"
> > +	default y
> > +	---help---
> > +	  In the early days it was common practice to configure PHYs by adding a
> > +	  phy_register_fixup*() in the machine code. This practice turned out to
> > +	  be potentially dangerous, because:
> > +	  - it affects all PHYs in the system
> > +	  - these register changes are usually not preserved during PHY reset
> > +	    or suspend/resume cycle.
> > +	  - it complicates debugging, since these configuration changes were not
> > +	    done by the actual PHY driver.
> > +	  This option allows to disable all fixups which are identified as
> > +	  potentially harmful and give the developers a chance to implement the
> > +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
> > +	  related PHY drivers.
> 
> This appears to be an IMX only problem. Everybody else seems to of got
> this right. There is no need to bother everybody with this new
> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
> the name.

Actually, all fixups seems to do wring thing:
arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_register_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,

Increased MII drive strength. Should be probably enabled by the PHY
driver.

arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
arch/arm/mach-imx/mach-imx6sx.c:40:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffff,
arch/arm/mach-imx/mach-imx6ul.c:47:		phy_register_fixup_for_uid(PHY_ID_KSZ8081, MICREL_PHY_ID_MASK,
arch/arm/mach-imx/mach-imx7d.c:54:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffff,
arch/arm/mach-imx/mach-imx7d.c:56:		phy_register_fixup_for_uid(PHY_ID_BCM54220, 0xffffffff,
arch/arm/mach-mxs/mach-mxs.c:262:		phy_register_fixup_for_uid(PHY_ID_KSZ8051, MICREL_PHY_ID_MASK,

Fix in some random manner PHY specific errata, enable clock output and
configure the clock skew.

arch/arm/mach-orion5x/dns323-setup.c:645:		phy_register_fixup_for_uid(MARVELL_PHY_ID_88E1118,

Enable LED. Should be done in DT if supported.

arch/powerpc/platforms/85xx/mpc85xx_mds.c:305:		phy_register_fixup_for_id(phy_id, mpc8568_fixup_125_clock);
arch/powerpc/platforms/85xx/mpc85xx_mds.c:306:		phy_register_fixup_for_id(phy_id, mpc8568_mds_phy_fixups);
arch/powerpc/platforms/85xx/mpc85xx_mds.c:311:		phy_register_fixup_for_id(phy_id, mpc8568_mds_phy_fixups);

Fix in some random manner PHY specific errata, enable clock output and
configure the clock skew.

drivers/net/ethernet/dnet.c:818:	err = phy_register_fixup_for_uid(0x01410cc0, 0xfffffff0,

Enable LED. Should be done in DT if supported.

drivers/net/usb/lan78xx.c:2071:		ret = phy_register_fixup_for_uid(PHY_KSZ9031RNX, 0xfffffff0,
drivers/net/usb/lan78xx.c:2078:		ret = phy_register_fixup_for_uid(PHY_LAN8835, 0xfffffff0,

enable clock output and configure the clock skew.

As we can see, all of used fixups seem to be wrong. For example micrel
PHY errata should be fixed in one place for all devices. Not only for
some iMX6 SoC. I used this option for iMX, because i can test it.

> There is no need to bother everybody with this new
> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
> the name.

A lot of work is needed to fix all of them. I just do not have enough
time to do it.

> Having said that, i'm not sure this is the best solution. You cannot
> build one kernel which runs on all machines.  Did you consider some
> sort of DT property to disable these fixup? What other ideas did you
> have before deciding on this solution?

As soon as all PHY driver support all needed bits used in this fixups,
we can use drivers on top of fixups. Since changes made by fixups will
be overwritten by PHY drivers any way. The Kconfig option is needed only for
developers who has enough resource to investigate this issues and
mainline needed changes.

Regards,
Oleksij
Florian Fainelli March 30, 2020, 5:33 p.m. UTC | #3
On 3/29/2020 10:26 PM, Oleksij Rempel wrote:
> Hi Andrew,
> 
> On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:
>> On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
>>
>> Hi Oleksij
>>
>>> +config DEPRECATED_PHY_FIXUPS
>>> +	bool "Enable deprecated PHY fixups"
>>> +	default y
>>> +	---help---
>>> +	  In the early days it was common practice to configure PHYs by adding a
>>> +	  phy_register_fixup*() in the machine code. This practice turned out to
>>> +	  be potentially dangerous, because:
>>> +	  - it affects all PHYs in the system
>>> +	  - these register changes are usually not preserved during PHY reset
>>> +	    or suspend/resume cycle.
>>> +	  - it complicates debugging, since these configuration changes were not
>>> +	    done by the actual PHY driver.
>>> +	  This option allows to disable all fixups which are identified as
>>> +	  potentially harmful and give the developers a chance to implement the
>>> +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
>>> +	  related PHY drivers.
>>
>> This appears to be an IMX only problem. Everybody else seems to of got
>> this right. There is no need to bother everybody with this new
>> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
>> the name.
> 
> Actually, all fixups seems to do wring thing:
> arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_register_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,
> 
> Increased MII drive strength. Should be probably enabled by the PHY
> driver.
> 
> arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
> arch/arm/mach-imx/mach-imx6sx.c:40:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffff,
> arch/arm/mach-imx/mach-imx6ul.c:47:		phy_register_fixup_for_uid(PHY_ID_KSZ8081, MICREL_PHY_ID_MASK,
> arch/arm/mach-imx/mach-imx7d.c:54:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffff,
> arch/arm/mach-imx/mach-imx7d.c:56:		phy_register_fixup_for_uid(PHY_ID_BCM54220, 0xffffffff,
> arch/arm/mach-mxs/mach-mxs.c:262:		phy_register_fixup_for_uid(PHY_ID_KSZ8051, MICREL_PHY_ID_MASK,
> 
> Fix in some random manner PHY specific errata, enable clock output and
> configure the clock skew.
> 
> arch/arm/mach-orion5x/dns323-setup.c:645:		phy_register_fixup_for_uid(MARVELL_PHY_ID_88E1118,
> 
> Enable LED. Should be done in DT if supported.
> 
> arch/powerpc/platforms/85xx/mpc85xx_mds.c:305:		phy_register_fixup_for_id(phy_id, mpc8568_fixup_125_clock);
> arch/powerpc/platforms/85xx/mpc85xx_mds.c:306:		phy_register_fixup_for_id(phy_id, mpc8568_mds_phy_fixups);
> arch/powerpc/platforms/85xx/mpc85xx_mds.c:311:		phy_register_fixup_for_id(phy_id, mpc8568_mds_phy_fixups);
> 
> Fix in some random manner PHY specific errata, enable clock output and
> configure the clock skew.
> 
> drivers/net/ethernet/dnet.c:818:	err = phy_register_fixup_for_uid(0x01410cc0, 0xfffffff0,
> 
> Enable LED. Should be done in DT if supported.
> 
> drivers/net/usb/lan78xx.c:2071:		ret = phy_register_fixup_for_uid(PHY_KSZ9031RNX, 0xfffffff0,
> drivers/net/usb/lan78xx.c:2078:		ret = phy_register_fixup_for_uid(PHY_LAN8835, 0xfffffff0,
> 
> enable clock output and configure the clock skew.
> 
> As we can see, all of used fixups seem to be wrong. For example micrel
> PHY errata should be fixed in one place for all devices. Not only for
> some iMX6 SoC. I used this option for iMX, because i can test it.

"wrong" is a bit general without really trying to understand the history
of where this came from. Historically, those platforms were not DT
enabled for a while (except PPC) and there was no way to pass platform
specific to the PHY driver so the only way to key off specific
board/platform desired behavior was to register a PHY fixup.

There are also various configuration which are really policies (as in
policy vs. mechanism separation) for the board such as configuring LEDs
in a certain way etc. Very quickly you start putting more of that policy
into DT which is just frowned upon, unless there is a good abstraction
model whereby an Ethernet Device Tree node can also be a LED provider.

> 
>> There is no need to bother everybody with this new
>> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
>> the name.
> 
> A lot of work is needed to fix all of them. I just do not have enough
> time to do it.
> 
>> Having said that, i'm not sure this is the best solution. You cannot
>> build one kernel which runs on all machines.  Did you consider some
>> sort of DT property to disable these fixup? What other ideas did you
>> have before deciding on this solution?
> 
> As soon as all PHY driver support all needed bits used in this fixups,
> we can use drivers on top of fixups. Since changes made by fixups will
> be overwritten by PHY drivers any way. The Kconfig option is needed only for
> developers who has enough resource to investigate this issues and
> mainline needed changes.

We all know this is not going to happen, if people cared so much about
fixing such a problem, it would have been solved by now. If you do care
about IMX though, then please work on removing the fixups, but do not
introduce yet another config that you are guaranteed is going to be
turned on by default, thus creating another test matrix with no real value.
Russell King (Oracle) March 30, 2020, 5:41 p.m. UTC | #4
On Mon, Mar 30, 2020 at 10:33:03AM -0700, Florian Fainelli wrote:
> 
> 
> On 3/29/2020 10:26 PM, Oleksij Rempel wrote:
> > Hi Andrew,
> > 
> > On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:
> >> On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
> >>
> >> Hi Oleksij
> >>
> >>> +config DEPRECATED_PHY_FIXUPS
> >>> +	bool "Enable deprecated PHY fixups"
> >>> +	default y
> >>> +	---help---
> >>> +	  In the early days it was common practice to configure PHYs by adding a
> >>> +	  phy_register_fixup*() in the machine code. This practice turned out to
> >>> +	  be potentially dangerous, because:
> >>> +	  - it affects all PHYs in the system
> >>> +	  - these register changes are usually not preserved during PHY reset
> >>> +	    or suspend/resume cycle.
> >>> +	  - it complicates debugging, since these configuration changes were not
> >>> +	    done by the actual PHY driver.
> >>> +	  This option allows to disable all fixups which are identified as
> >>> +	  potentially harmful and give the developers a chance to implement the
> >>> +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
> >>> +	  related PHY drivers.
> >>
> >> This appears to be an IMX only problem. Everybody else seems to of got
> >> this right. There is no need to bother everybody with this new
> >> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
> >> the name.
> > 
> > Actually, all fixups seems to do wring thing:
> > arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_register_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,
> > 
> > Increased MII drive strength. Should be probably enabled by the PHY
> > driver.
> > 
> > arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> > arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> > arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> > arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,

As far as I'm concerned, the AR8035 fixup is there with good reason.
It's not just "random" but is required to make the AR8035 usable with
the iMX6 SoCs.  Not because of a board level thing, but because it's
required for the AR8035 to be usable with an iMX6 SoC.

So, having it registered by the iMX6 SoC code is entirely logical and
correct.

That's likely true of the AR8031 situation as well.

I can't speak for any of the others.
Marc Kleine-Budde March 31, 2020, 7:47 a.m. UTC | #5
On 3/30/20 7:41 PM, Russell King - ARM Linux admin wrote:
>>> arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
>>> arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
>>> arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
>>> arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
> 
> As far as I'm concerned, the AR8035 fixup is there with good reason.
> It's not just "random" but is required to make the AR8035 usable with
> the iMX6 SoCs.  Not because of a board level thing, but because it's
> required for the AR8035 to be usable with an iMX6 SoC.

Is this still ture, if the AR8035 is attached to a switch behind an iMX6?

regards,
Marc
Russell King (Oracle) March 31, 2020, 7:54 a.m. UTC | #6
On Tue, Mar 31, 2020 at 09:47:19AM +0200, Marc Kleine-Budde wrote:
> On 3/30/20 7:41 PM, Russell King - ARM Linux admin wrote:
> >>> arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> >>> arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> >>> arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> >>> arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
> > 
> > As far as I'm concerned, the AR8035 fixup is there with good reason.
> > It's not just "random" but is required to make the AR8035 usable with
> > the iMX6 SoCs.  Not because of a board level thing, but because it's
> > required for the AR8035 to be usable with an iMX6 SoC.
> 
> Is this still ture, if the AR8035 is attached to a switch behind an iMX6?

Do you know of such a setup, or are you talking about theoretical
situations?
Marc Kleine-Budde March 31, 2020, 8 a.m. UTC | #7
On 3/31/20 9:54 AM, Russell King - ARM Linux admin wrote:
> On Tue, Mar 31, 2020 at 09:47:19AM +0200, Marc Kleine-Budde wrote:
>> On 3/30/20 7:41 PM, Russell King - ARM Linux admin wrote:
>>>>> arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
>>>>> arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
>>>>> arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
>>>>> arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
>>>
>>> As far as I'm concerned, the AR8035 fixup is there with good reason.
>>> It's not just "random" but is required to make the AR8035 usable with
>>> the iMX6 SoCs.  Not because of a board level thing, but because it's
>>> required for the AR8035 to be usable with an iMX6 SoC.
>>
>> Is this still ture, if the AR8035 is attached to a switch behind an iMX6?
> 
> Do you know of such a setup, or are you talking about theoretical
> situations?

Granted, not for the AR8035, but for one of the KSZ Phys. This is why
Oleksij started looking into this issue in the first place.

regards,
Marc
Philippe Schenker March 31, 2020, 8:06 a.m. UTC | #8
On Mon, 2020-03-30 at 18:41 +0100, Russell King - ARM Linux admin wrote:
> On Mon, Mar 30, 2020 at 10:33:03AM -0700, Florian Fainelli wrote:
> > 
> > On 3/29/2020 10:26 PM, Oleksij Rempel wrote:
> > > Hi Andrew,
> > > 
> > > On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:
> > > > On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
> > > > 
> > > > Hi Oleksij
> > > > 
> > > > > +config DEPRECATED_PHY_FIXUPS
> > > > > +	bool "Enable deprecated PHY fixups"
> > > > > +	default y
> > > > > +	---help---
> > > > > +	  In the early days it was common practice to configure
> > > > > PHYs by adding a
> > > > > +	  phy_register_fixup*() in the machine code. This
> > > > > practice turned out to
> > > > > +	  be potentially dangerous, because:
> > > > > +	  - it affects all PHYs in the system
> > > > > +	  - these register changes are usually not preserved
> > > > > during PHY reset
> > > > > +	    or suspend/resume cycle.
> > > > > +	  - it complicates debugging, since these configuration
> > > > > changes were not
> > > > > +	    done by the actual PHY driver.
> > > > > +	  This option allows to disable all fixups which are
> > > > > identified as
> > > > > +	  potentially harmful and give the developers a chance
> > > > > to implement the
> > > > > +	  proper configuration via the device tree (e.g.: phy-
> > > > > mode) and/or the
> > > > > +	  related PHY drivers.
> > > > 
> > > > This appears to be an IMX only problem. Everybody else seems to
> > > > of got
> > > > this right. There is no need to bother everybody with this new
> > > > option. Please put this in arch/arm/mach-mxs/Kconfig and have
> > > > IMX in
> > > > the name.
> > > 
> > > Actually, all fixups seems to do wring thing:
> > > arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_regi
> > > ster_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,
> > > 
> > > Increased MII drive strength. Should be probably enabled by the
> > > PHY
> > > driver.
> > > 
> > > arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fix
> > > up_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> > > arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fix
> > > up_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> > > arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fix
> > > up_for_uid(PHY_ID_AR8031, 0xffffffef,
> > > arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fix
> > > up_for_uid(PHY_ID_AR8035, 0xffffffef,
> 
> As far as I'm concerned, the AR8035 fixup is there with good reason.
> It's not just "random" but is required to make the AR8035 usable with
> the iMX6 SoCs.  Not because of a board level thing, but because it's
> required for the AR8035 to be usable with an iMX6 SoC.
> 
> So, having it registered by the iMX6 SoC code is entirely logical and
> correct.
> 
> That's likely true of the AR8031 situation as well.
> 
> I can't speak for any of the others.

I can speak for the KSZ9031/KSZ9021 for those PHYs the fixup is solely
to add the TXC delay that, according to RGMII v1.3 spec should have been
done by hardware, and as i.MX6 as well as those PHYs do not have a
specific register to add that delay, the skew get set so the timing is
working somehow with the i.MX6 processor.

I'm fine when you want to remove those fixups. Please CC me because
we're relying on those at the moment. I would just put them into
devicetree.

Philippe
Russell King (Oracle) March 31, 2020, 8:19 a.m. UTC | #9
On Tue, Mar 31, 2020 at 10:00:12AM +0200, Marc Kleine-Budde wrote:
> On 3/31/20 9:54 AM, Russell King - ARM Linux admin wrote:
> > On Tue, Mar 31, 2020 at 09:47:19AM +0200, Marc Kleine-Budde wrote:
> >> On 3/30/20 7:41 PM, Russell King - ARM Linux admin wrote:
> >>>>> arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> >>>>> arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> >>>>> arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> >>>>> arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
> >>>
> >>> As far as I'm concerned, the AR8035 fixup is there with good reason.
> >>> It's not just "random" but is required to make the AR8035 usable with
> >>> the iMX6 SoCs.  Not because of a board level thing, but because it's
> >>> required for the AR8035 to be usable with an iMX6 SoC.
> >>
> >> Is this still ture, if the AR8035 is attached to a switch behind an iMX6?
> > 
> > Do you know of such a setup, or are you talking about theoretical
> > situations?
> 
> Granted, not for the AR8035, but for one of the KSZ Phys. This is why
> Oleksij started looking into this issue in the first place.

Maybe there's an easy solution to this - check whether the PHY being
fixed up is connected to the iMX6 SoC:

static bool phy_connected_to(struct phy_device *phydev,
			     const struct of_device_id *matches)
{
	struct device_node *np, *phy_np;

	for_each_matching_node(np, matches) {
		phy_np = of_parse_phandle(np, "phy-handle", 0);
		if (!phy_np)
			phy_np = of_parse_phandle(np, "phy", 0);
		if (!phy_np)
			phy_np = of_parse_phandle(np, "phy-device", 0);
		if (phy_np && phydev->mdio.dev.of_node == phy_np) {
			of_node_put(phy_np);
			of_node_put(np);
			return true;
		}
		of_node_put(phy_np);
	}
	return false;
}

static struct of_device_id imx_fec_ids[] = {
	{ .compatible = "fsl,imx6q-fec", },
	...
	{ },
};

static bool phy_connected_to_fec(struct phy_device *phydev)
{
	return phy_connected_to(phydev, imx_fec_ids);
}

and then in the fixups:

	if (!phy_connected_to_fec(phydev))
		return 0;

?
David Jander March 31, 2020, 8:44 a.m. UTC | #10
On Mon, 30 Mar 2020 18:41:14 +0100
Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote:

> On Mon, Mar 30, 2020 at 10:33:03AM -0700, Florian Fainelli wrote:
> > 
> > 
> > On 3/29/2020 10:26 PM, Oleksij Rempel wrote:  
> > > Hi Andrew,
> > > 
> > > On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:  
> > >> On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
> > >>
> > >> Hi Oleksij
> > >>  
> > >>> +config DEPRECATED_PHY_FIXUPS
> > >>> +	bool "Enable deprecated PHY fixups"
> > >>> +	default y
> > >>> +	---help---
> > >>> +	  In the early days it was common practice to configure PHYs by adding a
> > >>> +	  phy_register_fixup*() in the machine code. This practice turned out to
> > >>> +	  be potentially dangerous, because:
> > >>> +	  - it affects all PHYs in the system
> > >>> +	  - these register changes are usually not preserved during PHY reset
> > >>> +	    or suspend/resume cycle.
> > >>> +	  - it complicates debugging, since these configuration changes were not
> > >>> +	    done by the actual PHY driver.
> > >>> +	  This option allows to disable all fixups which are identified as
> > >>> +	  potentially harmful and give the developers a chance to implement the
> > >>> +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
> > >>> +	  related PHY drivers.  
> > >>
> > >> This appears to be an IMX only problem. Everybody else seems to of got
> > >> this right. There is no need to bother everybody with this new
> > >> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
> > >> the name.  
> > > 
> > > Actually, all fixups seems to do wring thing:
> > > arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_register_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,
> > > 
> > > Increased MII drive strength. Should be probably enabled by the PHY
> > > driver.
> > > 
> > > arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> > > arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> > > arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> > > arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,  
> 
> As far as I'm concerned, the AR8035 fixup is there with good reason.
> It's not just "random" but is required to make the AR8035 usable with
> the iMX6 SoCs.  Not because of a board level thing, but because it's
> required for the AR8035 to be usable with an iMX6 SoC.

I have checked with the datasheet of the AR8035, and AFAICS, what the code
does is this:

 - Disable the SmartEEE feature of the phy. The comment in the code implies
   that for some reason it doesn't work, but the reason itself is not given.
   Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
   setting. There is no reason to believe this problem is specific to the
   i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
   that via the DT. Once that is done, it has no place here.

 - Set the external clock output to 125MHz. This is needed because the i.MX6
   needs a 125MHz reference clock input. But it is not a requirement to use
   this output. It is perfectly fine and possible to design a board that uses
   an external oscillator for this. It is also possible that an i.MX6 design
   has such a phy connected to a MAC behind a switch or some other interface.
   Independent of i.MX6 this setting can also be necessary for other hardware
   designs, based on different SoC's. In summary, this is a feature of the
   specific hardware design at hand, and has nothing to do with the i.MX6
   specifically. This should definitely be exposed through the DT and not be
   here.

 - Enable TXC delay. To clarify, the RGMII specification version 1 specified
   that the RXC and TXC traces should be routed long enough to introduce a
   certain delay to the clock signal, or the delay should be introduced via
   other means. In a later version of the spec, a provision was given for MAC
   or PHY devices to generate this delay internally. The i.MX6 MAC interface
   is unable to generate the required delay internally, so it has to be taken
   care of either by the board layout, or by the PHY device. This is the
   crucial point: The amount of delay set by the PHY delay register depends on
   the board layout. It should NEVER be hard-coded in SoC setup code. The
   correct way is to specify it in the DT. Needless to say that this too,
   isn't i.MX6-specific.

> So, having it registered by the iMX6 SoC code is entirely logical and
> correct.

I'm afraid I don't agree. See above. This code really should never have been
here. It is not i.MX6-specific as I pointed out above, nor is it necessarily
applicable to all i.MX6 boards that use those phy devices.

> That's likely true of the AR8031 situation as well.
> 
> I can't speak for any of the others.

Best regards,
Russell King (Oracle) March 31, 2020, 9:36 a.m. UTC | #11
On Tue, Mar 31, 2020 at 10:44:59AM +0200, David Jander wrote:
> On Mon, 30 Mar 2020 18:41:14 +0100
> Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote:
> 
> > On Mon, Mar 30, 2020 at 10:33:03AM -0700, Florian Fainelli wrote:
> > > 
> > > 
> > > On 3/29/2020 10:26 PM, Oleksij Rempel wrote:  
> > > > Hi Andrew,
> > > > 
> > > > On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:  
> > > >> On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
> > > >>
> > > >> Hi Oleksij
> > > >>  
> > > >>> +config DEPRECATED_PHY_FIXUPS
> > > >>> +	bool "Enable deprecated PHY fixups"
> > > >>> +	default y
> > > >>> +	---help---
> > > >>> +	  In the early days it was common practice to configure PHYs by adding a
> > > >>> +	  phy_register_fixup*() in the machine code. This practice turned out to
> > > >>> +	  be potentially dangerous, because:
> > > >>> +	  - it affects all PHYs in the system
> > > >>> +	  - these register changes are usually not preserved during PHY reset
> > > >>> +	    or suspend/resume cycle.
> > > >>> +	  - it complicates debugging, since these configuration changes were not
> > > >>> +	    done by the actual PHY driver.
> > > >>> +	  This option allows to disable all fixups which are identified as
> > > >>> +	  potentially harmful and give the developers a chance to implement the
> > > >>> +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
> > > >>> +	  related PHY drivers.  
> > > >>
> > > >> This appears to be an IMX only problem. Everybody else seems to of got
> > > >> this right. There is no need to bother everybody with this new
> > > >> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
> > > >> the name.  
> > > > 
> > > > Actually, all fixups seems to do wring thing:
> > > > arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_register_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,
> > > > 
> > > > Increased MII drive strength. Should be probably enabled by the PHY
> > > > driver.
> > > > 
> > > > arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> > > > arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> > > > arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> > > > arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,  
> > 
> > As far as I'm concerned, the AR8035 fixup is there with good reason.
> > It's not just "random" but is required to make the AR8035 usable with
> > the iMX6 SoCs.  Not because of a board level thing, but because it's
> > required for the AR8035 to be usable with an iMX6 SoC.
> 
> I have checked with the datasheet of the AR8035, and AFAICS, what the code
> does is this:
> 
>  - Disable the SmartEEE feature of the phy. The comment in the code implies
>    that for some reason it doesn't work, but the reason itself is not given.
>    Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
>    setting. There is no reason to believe this problem is specific to the
>    i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
>    that via the DT. Once that is done, it has no place here.
> 
>  - Set the external clock output to 125MHz. This is needed because the i.MX6
>    needs a 125MHz reference clock input. But it is not a requirement to use
>    this output. It is perfectly fine and possible to design a board that uses
>    an external oscillator for this. It is also possible that an i.MX6 design
>    has such a phy connected to a MAC behind a switch or some other interface.
>    Independent of i.MX6 this setting can also be necessary for other hardware
>    designs, based on different SoC's. In summary, this is a feature of the
>    specific hardware design at hand, and has nothing to do with the i.MX6
>    specifically. This should definitely be exposed through the DT and not be
>    here.
> 
>  - Enable TXC delay. To clarify, the RGMII specification version 1 specified
>    that the RXC and TXC traces should be routed long enough to introduce a
>    certain delay to the clock signal, or the delay should be introduced via
>    other means. In a later version of the spec, a provision was given for MAC
>    or PHY devices to generate this delay internally. The i.MX6 MAC interface
>    is unable to generate the required delay internally, so it has to be taken
>    care of either by the board layout, or by the PHY device. This is the
>    crucial point: The amount of delay set by the PHY delay register depends on
>    the board layout. It should NEVER be hard-coded in SoC setup code. The
>    correct way is to specify it in the DT. Needless to say that this too,
>    isn't i.MX6-specific.
> 
> > So, having it registered by the iMX6 SoC code is entirely logical and
> > correct.
> 
> I'm afraid I don't agree. See above. This code really should never have been
> here. It is not i.MX6-specific as I pointed out above, nor is it necessarily
> applicable to all i.MX6 boards that use those phy devices.

Then we will have to agree to disagree, sorry.
Andrew Lunn March 31, 2020, 12:54 p.m. UTC | #12
>  - Disable the SmartEEE feature of the phy. The comment in the code implies
>    that for some reason it doesn't work, but the reason itself is not given.
>    Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
>    setting. There is no reason to believe this problem is specific to the
>    i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
>    that via the DT. Once that is done, it has no place here.

The device tree properties are defined:

bindings/net/ethernet-phy.yaml:  eee-broken-100tx:
bindings/net/ethernet-phy.yaml:  eee-broken-1000t:
bindings/net/ethernet-phy.yaml:  eee-broken-10gt:
bindings/net/ethernet-phy.yaml:  eee-broken-1000kx:
bindings/net/ethernet-phy.yaml:  eee-broken-10gkx4:
bindings/net/ethernet-phy.yaml:  eee-broken-10gkr:

And there is a helper:

void of_set_phy_eee_broken(struct phy_device *phydev)

>  - Enable TXC delay. To clarify, the RGMII specification version 1 specified
>    that the RXC and TXC traces should be routed long enough to introduce a
>    certain delay to the clock signal, or the delay should be introduced via
>    other means. In a later version of the spec, a provision was given for MAC
>    or PHY devices to generate this delay internally. The i.MX6 MAC interface
>    is unable to generate the required delay internally, so it has to be taken
>    care of either by the board layout, or by the PHY device. This is the
>    crucial point: The amount of delay set by the PHY delay register depends on
>    the board layout. It should NEVER be hard-coded in SoC setup code. The
>    correct way is to specify it in the DT. Needless to say that this too,
>    isn't i.MX6-specific.

Correct:

      # RX and TX delays are added by the MAC when required
      - rgmii

      # RGMII with internal RX and TX delays provided by the PHY,
      # the MAC should not add the RX or TX delays in this case
      - rgmii-id

      # RGMII with internal RX delay provided by the PHY, the MAC
      # should not add an RX delay in this case
      - rgmii-rxid

      # RGMII with internal TX delay provided by the PHY, the MAC
      # should not add an TX delay in this case
      - rgmii-txid

The needed properties exist.

I think part of the issue is that there are iMX6 board which are not
device tree?

       Andrew
Oleksij Rempel March 31, 2020, 1:45 p.m. UTC | #13
Hi Russell,

On Mon, Mar 30, 2020 at 06:41:14PM +0100, Russell King - ARM Linux admin wrote:
> On Mon, Mar 30, 2020 at 10:33:03AM -0700, Florian Fainelli wrote:
> > 
> > 
> > On 3/29/2020 10:26 PM, Oleksij Rempel wrote:
> > > Hi Andrew,
> > > 
> > > On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:
> > >> On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
> > >>
> > >> Hi Oleksij
> > >>
> > >>> +config DEPRECATED_PHY_FIXUPS
> > >>> +	bool "Enable deprecated PHY fixups"
> > >>> +	default y
> > >>> +	---help---
> > >>> +	  In the early days it was common practice to configure PHYs by adding a
> > >>> +	  phy_register_fixup*() in the machine code. This practice turned out to
> > >>> +	  be potentially dangerous, because:
> > >>> +	  - it affects all PHYs in the system
> > >>> +	  - these register changes are usually not preserved during PHY reset
> > >>> +	    or suspend/resume cycle.
> > >>> +	  - it complicates debugging, since these configuration changes were not
> > >>> +	    done by the actual PHY driver.
> > >>> +	  This option allows to disable all fixups which are identified as
> > >>> +	  potentially harmful and give the developers a chance to implement the
> > >>> +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
> > >>> +	  related PHY drivers.
> > >>
> > >> This appears to be an IMX only problem. Everybody else seems to of got
> > >> this right. There is no need to bother everybody with this new
> > >> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
> > >> the name.
> > > 
> > > Actually, all fixups seems to do wring thing:
> > > arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_register_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,
> > > 
> > > Increased MII drive strength. Should be probably enabled by the PHY
> > > driver.
> > > 
> > > arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> > > arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> > > arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> > > arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
> 
> As far as I'm concerned, the AR8035 fixup is there with good reason.
> It's not just "random" but is required to make the AR8035 usable with
> the iMX6 SoCs.  Not because of a board level thing, but because it's
> required for the AR8035 to be usable with an iMX6 SoC.
> 
> So, having it registered by the iMX6 SoC code is entirely logical and
> correct.
> 
> That's likely true of the AR8031 situation as well.
> 
> I can't speak for any of the others.

OK, let's analyze it step by step:
--------------------------------------------------------------------------------
arch/arm/mach-imx/mach-imx6q.c

The AR8035 fixup is doing following configurations:
- disable SmartEEE with following description:
  /* Ar803x phy SmartEEE feature cause link status generates glitch,
   * which cause ethernet link down/up issue, so disable SmartEEE

- enable clock output from PHY, configures it to 125Mhz and configures
  clock skew. See the comment provided in the source code:
  * Enable 125MHz clock from CLK_25M on the AR8031.  This
  * is fed in to the IMX6 on the ENET_REF_CLK (V22) pad.
  * Also, introduce a tx clock delay.
  *
  * This is the same as is the AR8031 fixup.

- powers on the PHY. Probably to make sure the clock output will run
  before FEC is probed to avoid clock glitches.

The AR8031 fixup only enables clock output of PHY, configures it to
125Mhz, and configures clock skew. The PHY not powered and although it
supports SmartEEE, it's not disabled. Let's assume the fixup author did
the correct configuration and SmartEEE is working without problems.

The KSZ9031rn fixup is configuring only the clock skew. Never the less,
this PHY is not able to provide a stable 125Mhz clock for the FEC, it's not
recommended to use. See:
| http://ww1.microchip.com/downloads/en/DeviceDoc/80000692D.pdf
| Module 2: Duty cycle variation for optional 125MHz reference output clock

The KSZ9021rn fixup is configuring clock skews.

Summary:
- SmartEEE is a PHY errata or bad configuration. It is present in both
  PHYs AR8035 and AR8031, and should be disabled via DT in the PHY driver.

- Clock skew configuration is board specific and this fixup will apply
  it even if the PHY is not connected to the FEC. For example boards
  with additional NIC connected to the PCIe or a switch connected to the
  FEC.
  For the clock skew configuration we already have RGMII_ID*, which
  can be configured by devicetree and/or by ethernet driver directly,
  if no devicetree can be used. For example by USB Ethernet adapter.
  All affected PHYs in this fixup series already support clock skew
  configuration by PHY drivers. See latest versions of at803x.c and
  micrel.c. It means, configurations relying on these fixups (and not
  providing correct devicetree with proper phy-mode) will break sooner or
  later, or already did and are already fixed.

- 125Mhz is a bigger issue:
  - It is board specific. Not all boards use PHYs as a clock source for the
    FEC. The following function is proof of my claim. It is responsible to
    configure iMX6Q to use own clock provider:
    arch/arm/mach-imx/mach-imx6q.c
|    static void __init imx6q_1588_init(void)
|	np = of_find_compatible_node(NULL, NULL, "fsl,imx6q-fec");
|	if (!np) {
|		pr_warn("%s: failed to find fec node\n", __func__);
|		return;
|	}
|
|	ptp_clk = of_clk_get(np, 2);
|	if (IS_ERR(ptp_clk)) {
|		pr_warn("%s: failed to get ptp clock\n", __func__);
|		goto put_node;
|	}
|
|	enet_ref = clk_get_sys(NULL, "enet_ref");
|	if (IS_ERR(enet_ref)) {
|		pr_warn("%s: failed to get enet clock\n", __func__);
|		goto put_ptp_clk;
|	}
|
|	/*
|	 * If enet_ref from ANATOP/CCM is the PTP clock source, we need to
|	 * set bit IOMUXC_GPR1[21].  Or the PTP clock must be from pad
|	 * (external OSC), and we need to clear the bit.
|	 */
|	clksel = clk_is_match(ptp_clk, enet_ref) ?
|				IMX6Q_GPR1_ENET_CLK_SEL_ANATOP :
|				IMX6Q_GPR1_ENET_CLK_SEL_PAD;
|	gpr = syscon_regmap_lookup_by_compatible("fsl,imx6q-iomuxc-gpr");
|	if (!IS_ERR(gpr))
|		regmap_update_bits(gpr, IOMUXC_GPR1,
|				IMX6Q_GPR1_ENET_CLK_SEL_MASK,
|				clksel);

  - The at803x PHY driver already provides following devicetree bindings to
    to configure it as clock provider:
    qca,clk-out-frequency
    qca,keep-pll-enabled
    It is kind of replacement of the clock framework.
  - the micrel PHY is configured, in most cases, by bootstrap pins, but does
    not guarantee a glitch free clock, since the PHY can be suspended no
    matter if other devices are using the clock provided by this PHY or
    not.
  - In current case we have the FEC driver which is using clock framework,
    properly, requesting clock source and even trying to disable it for
    power management. But the clock providers are _NOT_ implementing clock
    framework and do not care about proper glitch free clock initialization
    and power management. Implementing proper clock support in the PHY drivers
    will most probably break all boards relying on these fixups.

--------------------------------------------------------------------------------
arch/arm/mach-imx/mach-imx6sx.c
- this fixup is configuring RGMII signal voltage to 1V8 and clock skews.
  Both configurations are supported by the at803x driver:
  vddio-regulator
  RGMII-ID*

- This fixup is missing SmartEEE workaround.

--------------------------------------------------------------------------------
arch/arm/mach-imx/mach-imx6ul.c
static int ksz8081_phy_fixup(struct phy_device *dev)
{
	if (dev && dev->interface == PHY_INTERFACE_MODE_MII) {
		phy_write(dev, 0x1f, 0x8110);
		phy_write(dev, 0x16, 0x201);
	} else if (dev && dev->interface == PHY_INTERFACE_MODE_RMII) {
		phy_write(dev, 0x1f, 0x8190);
		phy_write(dev, 0x16, 0x202);
	}

This fixup gives me headaches:
http://ww1.microchip.com/downloads/en/DeviceDoc/KSZ8081RNA_RND.pdf

- 0x1f is PHY Control 2 register and it is changing mainly only one
  BIT(7) against default/reset values. This bit is configuring the
  clock frequency of XTAL attached to the PHY.
  Looking into the documentation shows that the meaning of this bit
  depends on the exact PHY variant (1 means 50MHz on the RNA and 25MHz
  on the RND variant). This fixup doesn't take this into account. :(

- 0x16 is Operation Mode Strap Override
  this fixup is changing two 3 bits:
  - BIT(9) If bit is ‘1’, PHY Address 0 is non-broadcast.
    This bit is overwritten by the micrel driver, see
    kszphy_config_init()
 - BIT(1) are overwriting boot strap configuration for RMII mode.
 - BIT(0) is reserved. Since this PHY support only RMII mode and has not
   enough pins for MII mode, the benefit of this configuration is
   questionable.

The patch introduced this fixup is trying to fix the imx6ul evk board.
According to this devicetree:

|arch/arm/boot/dts/imx6ul-14x14-evk.dtsi
|&fec1 {
|	pinctrl-names = "default";
|	pinctrl-0 = <&pinctrl_enet1>;
|	phy-mode = "rmii";
|	phy-handle = <&ethphy0>;
|	phy-supply = <&reg_peri_3v3>;
|	status = "okay";
|};
|
|&fec2 {
|	pinctrl-names = "default";
|	pinctrl-0 = <&pinctrl_enet2>;
|	phy-mode = "rmii";
|	phy-handle = <&ethphy1>;
|	phy-supply = <&reg_peri_3v3>;
|	status = "okay";
|
|	mdio {
|		#address-cells = <1>;
|		#size-cells = <0>;
|
|		ethphy0: ethernet-phy@2 {
|			reg = <2>;
|			micrel,led-mode = <1>;
|			clocks = <&clks IMX6UL_CLK_ENET_REF>;
|			clock-names = "rmii-ref";
|		};
|
|		ethphy1: ethernet-phy@1 {
|			reg = <1>;
|			micrel,led-mode = <1>;
|			clocks = <&clks IMX6UL_CLK_ENET2_REF>;
|			clock-names = "rmii-ref";
|		};
|	};
|};

This PHYs have proper clock configuration and can be used only in RGMII
mode. So, this fixup should be removed any way.

--------------------------------------------------------------------------------
arch/arm/mach-imx/mach-imx7d.c
Both fixups added by following commit:
|commit 69f9c5047d04945693ecc1bdfdb8a3dc2a1f48cf
|Author: Fugang Duan <b38611@freescale.com>
|Date:   Mon Sep 7 10:55:00 2015 +0800
|
|    ARM: imx: add enet init for i.MX7D platform
|
|    Add enet phy fixup, clock source init for i.MX7D platform.
|
|    Signed-off-by: Fugang Duan <B38611@freescale.com>
|    Signed-off-by: Shawn Guo <shawnguo@kernel.org>

- the ar8031 fixup is configuring RGMII signal voltage to 1V8 and clock skews.
  Both configurations are supported by the at803x driver:
  vddio-regulator
  RGMII-ID*
- this time SmartEEE workaround is included
- the bcm54220 fixup is configuring clock skews. No driver is available
  for the bcm54220 PHY. This values are probably not overwritten by any
  other driver.

--------------------------------------------------------------------------------
arch/arm/mach-mxs/mach-mxs.c
This fixup was added by following commit:
|commit 3143bbb42b3d27a5f799c97c84fb7a4a1de88f91
|Author: Shawn Guo <shawn.guo@linaro.org>
|Date:   Sat Jul 7 23:12:03 2012 +0800
|
|    ARM: mxs: convert apx4devkit board to device tree
|
|    Tested-by: Lauri Hintsala <lauri.hintsala@bluegiga.com>
|    Signed-off-by: Shawn Guo <shawn.guo@linaro.org>

This is the first fixup and this series which is not modifying the PHY
registers directly, but set the legacy board file specific flags for the
phy. The board specific XTAL frequency is reported to the PHY by
setting MICREL_PHY_50MHZ_CLK flag and used by micrel driver.

--------------------------------------------------------------------------------
arch/arm/mach-orion5x/dns323-setup.c
This fixup was added by following commit:
|commit 6e2daa49420777190c133d7097dd8d5c05b475ac
|Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|Date:   Mon Jun 21 13:21:53 2010 +1000
|
|    [ARM] orion5x: Base support for DNS-323 rev C1
|
|    This patch adds the base support for this new HW revision to the existing
|    dns323-setup.c file. The SoC seems to be the same as rev B1, the GPIOs
|    are all wired differently though and the fan control isn't i2c based
|    anymore.
|
|    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
|    Signed-off-by: Nicolas Pitre <nico@fluxnic.net>


This is related to a single board and do not modifies PHY registers. The
LED configuration is requested per MARVELL_PHY_M1118_DNS323_LEDS flag by
the marvell PHY driver.

--------------------------------------------------------------------------------
arch/powerpc/platforms/85xx/mpc85xx_mds.c
This fixup was added by following commit:
|commit 94833a42765509a7aa95ed1ba7b227ead3c29c08
|Author: Andy Fleming <afleming@freescale.com>
|Date:   Fri May 2 18:56:41 2008 -0500
|
|    [POWERPC] 85xx: Add 8568 PHY workarounds to board code
|
|    The 8568 MDS needs some configuration changes to the PHY in order to
|    work properly.  These are done in the firmware, normally, but Linux
|    shouldn't need to rely on the firmware running such things (someone
|    could disable the PHY support in the firmware to save space, for instance).
|
|    Signed-off-by: Andy Fleming <afleming@freescale.com>
|    Signed-off-by: Kumar Gala <galak@kernel.crashing.org>

This fixup is hard to analyze. I was not able to find a documentation or
related driver for the Marvell 88E1111 PHY. It seems to enable 125Mhz clock as
previous fixups did.

--------------------------------------------------------------------------------
drivers/net/ethernet/dnet.c
This fixup was added by following commit:
|commit 4796417417a62e2ae83d92cb92e1ecf9ec67b5f5
|Author: Ilya Yanok <yanok@emcraft.com>
|Date:   Wed Mar 11 23:26:02 2009 -0700
|
|    dnet: Dave DNET ethernet controller driver (updated)
|
|    Driver for Dave DNET ethernet controller found on Dave/DENX QongEVB-LITE
|    FPGA. Heavily based on Dave sources, I've just adopted it to current
|    kernel version and done some code cleanup.
|
|    Signed-off-by: Ilya Yanok <yanok@emcraft.com>
|    Signed-off-by: David S. Miller <davem@davemloft.net>

Same PHY as in previous case (Marvell 88E1111). The comment statement is:
/* For Neptune board: LINK1000 as Link LED and TX as activity LED */

--------------------------------------------------------------------------------
drivers/net/usb/lan78xx.c
This driver provides two fixups, added by following commit:
|commit 02dc1f3d613d5a859513d7416c9aca370425a7e0
|Author: Woojung Huh <woojung.huh@microchip.com>
|Date:   Wed Dec 7 20:26:25 2016 +0000
|
|    lan78xx: add LAN7801 MAC only support
|
|    Add LAN7801 MAC only support with phy fixup functions.
|
|    Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
|    Signed-off-by: David S. Miller <davem@davemloft.net>


These fixups are related to KSZ9031rnx and LAN8835 PHYs, are configuring clock
skews.

Please note: The KSZ9031 fixup is used on imx6q boards. What will happen
if we attach this adapter to a imx6q based system? Or what will happen
with all this usb ethernet adapters with atheros or micrel PHYs attached
to imx6 based system?

Regards,
Oleksij & Marc
Russell King (Oracle) March 31, 2020, 2:08 p.m. UTC | #14
On Tue, Mar 31, 2020 at 03:45:20PM +0200, Oleksij Rempel wrote:
> Hi Russell,
> 
> On Mon, Mar 30, 2020 at 06:41:14PM +0100, Russell King - ARM Linux admin wrote:
> > On Mon, Mar 30, 2020 at 10:33:03AM -0700, Florian Fainelli wrote:
> > > 
> > > 
> > > On 3/29/2020 10:26 PM, Oleksij Rempel wrote:
> > > > Hi Andrew,
> > > > 
> > > > On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:
> > > >> On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
> > > >>
> > > >> Hi Oleksij
> > > >>
> > > >>> +config DEPRECATED_PHY_FIXUPS
> > > >>> +	bool "Enable deprecated PHY fixups"
> > > >>> +	default y
> > > >>> +	---help---
> > > >>> +	  In the early days it was common practice to configure PHYs by adding a
> > > >>> +	  phy_register_fixup*() in the machine code. This practice turned out to
> > > >>> +	  be potentially dangerous, because:
> > > >>> +	  - it affects all PHYs in the system
> > > >>> +	  - these register changes are usually not preserved during PHY reset
> > > >>> +	    or suspend/resume cycle.
> > > >>> +	  - it complicates debugging, since these configuration changes were not
> > > >>> +	    done by the actual PHY driver.
> > > >>> +	  This option allows to disable all fixups which are identified as
> > > >>> +	  potentially harmful and give the developers a chance to implement the
> > > >>> +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
> > > >>> +	  related PHY drivers.
> > > >>
> > > >> This appears to be an IMX only problem. Everybody else seems to of got
> > > >> this right. There is no need to bother everybody with this new
> > > >> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
> > > >> the name.
> > > > 
> > > > Actually, all fixups seems to do wring thing:
> > > > arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_register_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,
> > > > 
> > > > Increased MII drive strength. Should be probably enabled by the PHY
> > > > driver.
> > > > 
> > > > arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> > > > arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> > > > arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> > > > arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
> > 
> > As far as I'm concerned, the AR8035 fixup is there with good reason.
> > It's not just "random" but is required to make the AR8035 usable with
> > the iMX6 SoCs.  Not because of a board level thing, but because it's
> > required for the AR8035 to be usable with an iMX6 SoC.
> > 
> > So, having it registered by the iMX6 SoC code is entirely logical and
> > correct.
> > 
> > That's likely true of the AR8031 situation as well.
> > 
> > I can't speak for any of the others.
> 
> OK, let's analyze it step by step:
> --------------------------------------------------------------------------------
> arch/arm/mach-imx/mach-imx6q.c
> 
> The AR8035 fixup is doing following configurations:
> - disable SmartEEE with following description:
>   /* Ar803x phy SmartEEE feature cause link status generates glitch,
>    * which cause ethernet link down/up issue, so disable SmartEEE
> 
> - enable clock output from PHY, configures it to 125Mhz and configures
>   clock skew. See the comment provided in the source code:
>   * Enable 125MHz clock from CLK_25M on the AR8031.  This
>   * is fed in to the IMX6 on the ENET_REF_CLK (V22) pad.
>   * Also, introduce a tx clock delay.
>   *
>   * This is the same as is the AR8031 fixup.
> 
> - powers on the PHY. Probably to make sure the clock output will run
>   before FEC is probed to avoid clock glitches.
> 
> The AR8031 fixup only enables clock output of PHY, configures it to
> 125Mhz, and configures clock skew. The PHY not powered and although it
> supports SmartEEE, it's not disabled. Let's assume the fixup author did
> the correct configuration and SmartEEE is working without problems.

I'm not arguing as a random third party.  I am the fixup author.

SmartEEE on the Atheros PHYs is enabled by default in the hardware,
and is a non-IEEE 802.3 approved hack to try to provide lower power
utilisation.  However, it has been observed to cause ethernet
corruption on SolidRun boards when connected to _some_ switches.
It appears that the combination of Atheros SmartEEE and some switches
introduces this problem.  This has been looked at by _three_ different
people.

The way SmartEEE works is very different from IEEE 802.3 EEE. The EEE
is terminated at the PHY, and the Ethernet controller is supposed to
know nothing about it.  If the link is in low power mode, then if the
MAC wants to start transmitting, the PHY has to buffer the packet,
wake the link up, and then pass the packet on.  There are configurable
delays in the AR8035, and we've tried adjusting those with no success.

This has nothing to do with anything at board level as far as anyone
can work out.

So, it seems entirely reasonable that the same problem would afflict
other iMX6 designs using the AR8035.  Indeed, it already does - the
SolidRun platforms have been through several different design
iterations, including different board layouts, and they _all_ exhibit
the same issue wrt SmartEEE using any of the iMX6 SoCs.

There is no published information from the manufacturer that suggests
that this is an Errata - if there were, then SolidRun being one of
their customers would have had that information.

Didn't bother to read the rest of the email, too long.
Russell King (Oracle) March 31, 2020, 3:15 p.m. UTC | #15
On Tue, Mar 31, 2020 at 02:54:33PM +0200, Andrew Lunn wrote:
> >  - Disable the SmartEEE feature of the phy. The comment in the code implies
> >    that for some reason it doesn't work, but the reason itself is not given.
> >    Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
> >    setting. There is no reason to believe this problem is specific to the
> >    i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
> >    that via the DT. Once that is done, it has no place here.
> 
> The device tree properties are defined:
> 
> bindings/net/ethernet-phy.yaml:  eee-broken-100tx:
> bindings/net/ethernet-phy.yaml:  eee-broken-1000t:
> bindings/net/ethernet-phy.yaml:  eee-broken-10gt:
> bindings/net/ethernet-phy.yaml:  eee-broken-1000kx:
> bindings/net/ethernet-phy.yaml:  eee-broken-10gkx4:
> bindings/net/ethernet-phy.yaml:  eee-broken-10gkr:
> 
> And there is a helper:
> 
> void of_set_phy_eee_broken(struct phy_device *phydev)

Disabling the advertisement may solve it, but that is not known.
What the quirk is doing is disabling the SmartEEE feature only
(which is where the PHY handles the EEE so-called "transparently"
to the MAC).

It's all very well waving arms years later and saying we don't
like code that was merged, but unless someone can prove that an
alternative way is better and doesn't regress anything, there
won't be a way forward.

> >  - Enable TXC delay. To clarify, the RGMII specification version 1 specified
> >    that the RXC and TXC traces should be routed long enough to introduce a
> >    certain delay to the clock signal, or the delay should be introduced via
> >    other means. In a later version of the spec, a provision was given for MAC
> >    or PHY devices to generate this delay internally. The i.MX6 MAC interface
> >    is unable to generate the required delay internally, so it has to be taken
> >    care of either by the board layout, or by the PHY device. This is the
> >    crucial point: The amount of delay set by the PHY delay register depends on
> >    the board layout. It should NEVER be hard-coded in SoC setup code. The
> >    correct way is to specify it in the DT. Needless to say that this too,
> >    isn't i.MX6-specific.
> 
> Correct:
> 
>       # RX and TX delays are added by the MAC when required
>       - rgmii
> 
>       # RGMII with internal RX and TX delays provided by the PHY,
>       # the MAC should not add the RX or TX delays in this case
>       - rgmii-id
> 
>       # RGMII with internal RX delay provided by the PHY, the MAC
>       # should not add an RX delay in this case
>       - rgmii-rxid
> 
>       # RGMII with internal TX delay provided by the PHY, the MAC
>       # should not add an TX delay in this case
>       - rgmii-txid
> 
> The needed properties exist.
> 
> I think part of the issue is that there are iMX6 board which are not
> device tree?

I think it's historical - iMX6 never used to be able to enumerate
anything on the MDIO bus, so the only way to configure stuff on the
PHY was via quirks.  That seems to have changed in v3.17-rc1 without
anyone noticing, which happened after the SolidRun support was merged
(v3.14-rc1).  So, not surprisingly, SolidRun platforms don't make use
of the DT properties - quite how one is supposed to know about this
stuff, I've no idea (short of following almost every damn subsystem
mailing list and reading tonnes of email - that's highly impractical.)
Vladimir Oltean March 31, 2020, 3:40 p.m. UTC | #16
Hi Russell,

On Tue, 31 Mar 2020 at 18:15, Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> On Tue, Mar 31, 2020 at 02:54:33PM +0200, Andrew Lunn wrote:
> > >  - Disable the SmartEEE feature of the phy. The comment in the code implies
> > >    that for some reason it doesn't work, but the reason itself is not given.
> > >    Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
> > >    setting. There is no reason to believe this problem is specific to the
> > >    i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
> > >    that via the DT. Once that is done, it has no place here.
> >
> > The device tree properties are defined:
> >
> > bindings/net/ethernet-phy.yaml:  eee-broken-100tx:
> > bindings/net/ethernet-phy.yaml:  eee-broken-1000t:
> > bindings/net/ethernet-phy.yaml:  eee-broken-10gt:
> > bindings/net/ethernet-phy.yaml:  eee-broken-1000kx:
> > bindings/net/ethernet-phy.yaml:  eee-broken-10gkx4:
> > bindings/net/ethernet-phy.yaml:  eee-broken-10gkr:
> >
> > And there is a helper:
> >
> > void of_set_phy_eee_broken(struct phy_device *phydev)
>
> Disabling the advertisement may solve it, but that is not known.
> What the quirk is doing is disabling the SmartEEE feature only
> (which is where the PHY handles the EEE so-called "transparently"
> to the MAC).
>
> It's all very well waving arms years later and saying we don't
> like code that was merged, but unless someone can prove that an
> alternative way is better and doesn't regress anything, there
> won't be a way forward.
>

For what it's worth, your position on these device tree bindings for
broken EEE seems to have changed from the one that you expressed in
this thread:
https://www.spinics.net/lists/arm-kernel/msg703453.html
To quote from that:

> > There is no "advertisement of SmartEEE" - it's just EEE.  That is
> > because as far as the link partner is concerned, SmartEEE is just
> > EEE.
> > [...]
> >
> > Otherwise, using the existing "eee-broken-*" properties to disable the
> > link modes where EEE fails would be the correct way forward, and should
> > be used in preference to disabling SmartEEE.
> >
> > However, no one has mentioned what the problem that is trying to be
> > addressed.  Is it data corruption?  Is it that the link fails?  Is it
> > lost packets?  Is it that the MAC supports EEE?  I think there needs to
> > be some better understanding of the problem at hand before trying to
> > address it.

Regards,
-Vladimir
David Jander March 31, 2020, 3:41 p.m. UTC | #17
Dear Russell,

On Tue, 31 Mar 2020 10:36:49 +0100
Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote:

> On Tue, Mar 31, 2020 at 10:44:59AM +0200, David Jander wrote:
> > On Mon, 30 Mar 2020 18:41:14 +0100
> > Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote:
> >   
> > > On Mon, Mar 30, 2020 at 10:33:03AM -0700, Florian Fainelli wrote:  
> > > > 
> > > > 
> > > > On 3/29/2020 10:26 PM, Oleksij Rempel wrote:    
> > > > > Hi Andrew,
> > > > > 
> > > > > On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:    
> > > > >> On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
> > > > >>
> > > > >> Hi Oleksij
> > > > >>    
> > > > >>> +config DEPRECATED_PHY_FIXUPS
> > > > >>> +	bool "Enable deprecated PHY fixups"
> > > > >>> +	default y
> > > > >>> +	---help---
> > > > >>> +	  In the early days it was common practice to configure PHYs by adding a
> > > > >>> +	  phy_register_fixup*() in the machine code. This practice turned out to
> > > > >>> +	  be potentially dangerous, because:
> > > > >>> +	  - it affects all PHYs in the system
> > > > >>> +	  - these register changes are usually not preserved during PHY reset
> > > > >>> +	    or suspend/resume cycle.
> > > > >>> +	  - it complicates debugging, since these configuration changes were not
> > > > >>> +	    done by the actual PHY driver.
> > > > >>> +	  This option allows to disable all fixups which are identified as
> > > > >>> +	  potentially harmful and give the developers a chance to implement the
> > > > >>> +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
> > > > >>> +	  related PHY drivers.    
> > > > >>
> > > > >> This appears to be an IMX only problem. Everybody else seems to of got
> > > > >> this right. There is no need to bother everybody with this new
> > > > >> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
> > > > >> the name.    
> > > > > 
> > > > > Actually, all fixups seems to do wring thing:
> > > > > arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_register_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,
> > > > > 
> > > > > Increased MII drive strength. Should be probably enabled by the PHY
> > > > > driver.
> > > > > 
> > > > > arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> > > > > arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> > > > > arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> > > > > arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,    
> > > 
> > > As far as I'm concerned, the AR8035 fixup is there with good reason.
> > > It's not just "random" but is required to make the AR8035 usable with
> > > the iMX6 SoCs.  Not because of a board level thing, but because it's
> > > required for the AR8035 to be usable with an iMX6 SoC.  
> > 
> > I have checked with the datasheet of the AR8035, and AFAICS, what the code
> > does is this:
> > 
> >  - Disable the SmartEEE feature of the phy. The comment in the code implies
> >    that for some reason it doesn't work, but the reason itself is not given.
> >    Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
> >    setting. There is no reason to believe this problem is specific to the
> >    i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
> >    that via the DT. Once that is done, it has no place here.
> > 
> >  - Set the external clock output to 125MHz. This is needed because the i.MX6
> >    needs a 125MHz reference clock input. But it is not a requirement to use
> >    this output. It is perfectly fine and possible to design a board that uses
> >    an external oscillator for this. It is also possible that an i.MX6 design
> >    has such a phy connected to a MAC behind a switch or some other interface.
> >    Independent of i.MX6 this setting can also be necessary for other hardware
> >    designs, based on different SoC's. In summary, this is a feature of the
> >    specific hardware design at hand, and has nothing to do with the i.MX6
> >    specifically. This should definitely be exposed through the DT and not be
> >    here.
> > 
> >  - Enable TXC delay. To clarify, the RGMII specification version 1 specified
> >    that the RXC and TXC traces should be routed long enough to introduce a
> >    certain delay to the clock signal, or the delay should be introduced via
> >    other means. In a later version of the spec, a provision was given for MAC
> >    or PHY devices to generate this delay internally. The i.MX6 MAC interface
> >    is unable to generate the required delay internally, so it has to be taken
> >    care of either by the board layout, or by the PHY device. This is the
> >    crucial point: The amount of delay set by the PHY delay register depends on
> >    the board layout. It should NEVER be hard-coded in SoC setup code. The
> >    correct way is to specify it in the DT. Needless to say that this too,
> >    isn't i.MX6-specific.
> >   
> > > So, having it registered by the iMX6 SoC code is entirely logical and
> > > correct.  
> > 
> > I'm afraid I don't agree. See above. This code really should never have been
> > here. It is not i.MX6-specific as I pointed out above, nor is it necessarily
> > applicable to all i.MX6 boards that use those phy devices.  
> 
> Then we will have to agree to disagree, sorry.

Please forgive me if I am appearing a bit stubborn.
If it is not too much to ask, I would really like to know where my reasoning
is wrong?
Maybe you can explain to me how to solve the following real-life conflict that
this introduces:

Suppose we have a board with an i.MX6Q and a KSZ9031 connected to it. Suppose
I now take a USB stick with a LAN7800 ethernet chip and a KSZ9031 PHY. These
USB sticks do exist, and it does not seem unthinkable to me that one would
connect them to such an i.MX6 system in order to get a second LAN port.

AFAICS there is a reasonable chance this combination might not work, and for
some very obscure reason on top of that:

There are two places a fixup gets registered for the phy:

 drivers/net/usb/lan78xx.c:2019

and:

 arch/arm/mach-imx/mach-imx6q.c:64

Both of these fix-ups write different clock- and signal pad skews to any
ksz9031 phy... but there are now two.

So there is one driver, one set of fixups (one overwrites the other) but two
different instances of hardware requiring different settings.

I get that it is possibly harder to repair the USB case, but for SoC's we have
platform drivers that support device-trees almost everywhere nowadays.
Device-tree nodes are unique for each instance of a device, so there would be
the most logical place to fix these cases. In fact, the needed definitions are
already in place. The only thing that needs to be done is remove the fixup
from the SoC code and patch the affected DTS files. If the USB driver cannot
be repaired or will be repaired at a later time, apply fixups only for non-DT
devices. Problem solved. Right?

Maybe, if the kernel is modular and the lan7800 driver is actually a module
loaded much later than the SoC FEC driver, it will work fine if the fixups are
applied only once per device (I must admit that I don't know that for sure).
But if they are both built-in drivers, it may cause hardware to malfunction in
hard to debug ways.

I must also admit that I don't have the required USB stick to test my
hypothesis, so I cannot provide any hard proof that it will malfunction. Maybe
I am making a mistake in this reasoning, in which case I am sorry. Would be
cool to have an explanation as to why I am wrong here though...

Best regards,
Russell King (Oracle) March 31, 2020, 3:53 p.m. UTC | #18
On Tue, Mar 31, 2020 at 05:41:03PM +0200, David Jander wrote:
> 
> Dear Russell,
> 
> On Tue, 31 Mar 2020 10:36:49 +0100
> Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote:
> 
> > On Tue, Mar 31, 2020 at 10:44:59AM +0200, David Jander wrote:
> > > On Mon, 30 Mar 2020 18:41:14 +0100
> > > Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote:
> > >   
> > > > On Mon, Mar 30, 2020 at 10:33:03AM -0700, Florian Fainelli wrote:  
> > > > > 
> > > > > 
> > > > > On 3/29/2020 10:26 PM, Oleksij Rempel wrote:    
> > > > > > Hi Andrew,
> > > > > > 
> > > > > > On Sun, Mar 29, 2020 at 05:08:54PM +0200, Andrew Lunn wrote:    
> > > > > >> On Sun, Mar 29, 2020 at 01:04:57PM +0200, Oleksij Rempel wrote:
> > > > > >>
> > > > > >> Hi Oleksij
> > > > > >>    
> > > > > >>> +config DEPRECATED_PHY_FIXUPS
> > > > > >>> +	bool "Enable deprecated PHY fixups"
> > > > > >>> +	default y
> > > > > >>> +	---help---
> > > > > >>> +	  In the early days it was common practice to configure PHYs by adding a
> > > > > >>> +	  phy_register_fixup*() in the machine code. This practice turned out to
> > > > > >>> +	  be potentially dangerous, because:
> > > > > >>> +	  - it affects all PHYs in the system
> > > > > >>> +	  - these register changes are usually not preserved during PHY reset
> > > > > >>> +	    or suspend/resume cycle.
> > > > > >>> +	  - it complicates debugging, since these configuration changes were not
> > > > > >>> +	    done by the actual PHY driver.
> > > > > >>> +	  This option allows to disable all fixups which are identified as
> > > > > >>> +	  potentially harmful and give the developers a chance to implement the
> > > > > >>> +	  proper configuration via the device tree (e.g.: phy-mode) and/or the
> > > > > >>> +	  related PHY drivers.    
> > > > > >>
> > > > > >> This appears to be an IMX only problem. Everybody else seems to of got
> > > > > >> this right. There is no need to bother everybody with this new
> > > > > >> option. Please put this in arch/arm/mach-mxs/Kconfig and have IMX in
> > > > > >> the name.    
> > > > > > 
> > > > > > Actually, all fixups seems to do wring thing:
> > > > > > arch/arm/mach-davinci/board-dm644x-evm.c:915:		phy_register_fixup_for_uid(LXT971_PHY_ID, LXT971_PHY_MASK,
> > > > > > 
> > > > > > Increased MII drive strength. Should be probably enabled by the PHY
> > > > > > driver.
> > > > > > 
> > > > > > arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> > > > > > arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> > > > > > arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> > > > > > arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,    
> > > > 
> > > > As far as I'm concerned, the AR8035 fixup is there with good reason.
> > > > It's not just "random" but is required to make the AR8035 usable with
> > > > the iMX6 SoCs.  Not because of a board level thing, but because it's
> > > > required for the AR8035 to be usable with an iMX6 SoC.  
> > > 
> > > I have checked with the datasheet of the AR8035, and AFAICS, what the code
> > > does is this:
> > > 
> > >  - Disable the SmartEEE feature of the phy. The comment in the code implies
> > >    that for some reason it doesn't work, but the reason itself is not given.
> > >    Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
> > >    setting. There is no reason to believe this problem is specific to the
> > >    i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
> > >    that via the DT. Once that is done, it has no place here.
> > > 
> > >  - Set the external clock output to 125MHz. This is needed because the i.MX6
> > >    needs a 125MHz reference clock input. But it is not a requirement to use
> > >    this output. It is perfectly fine and possible to design a board that uses
> > >    an external oscillator for this. It is also possible that an i.MX6 design
> > >    has such a phy connected to a MAC behind a switch or some other interface.
> > >    Independent of i.MX6 this setting can also be necessary for other hardware
> > >    designs, based on different SoC's. In summary, this is a feature of the
> > >    specific hardware design at hand, and has nothing to do with the i.MX6
> > >    specifically. This should definitely be exposed through the DT and not be
> > >    here.
> > > 
> > >  - Enable TXC delay. To clarify, the RGMII specification version 1 specified
> > >    that the RXC and TXC traces should be routed long enough to introduce a
> > >    certain delay to the clock signal, or the delay should be introduced via
> > >    other means. In a later version of the spec, a provision was given for MAC
> > >    or PHY devices to generate this delay internally. The i.MX6 MAC interface
> > >    is unable to generate the required delay internally, so it has to be taken
> > >    care of either by the board layout, or by the PHY device. This is the
> > >    crucial point: The amount of delay set by the PHY delay register depends on
> > >    the board layout. It should NEVER be hard-coded in SoC setup code. The
> > >    correct way is to specify it in the DT. Needless to say that this too,
> > >    isn't i.MX6-specific.
> > >   
> > > > So, having it registered by the iMX6 SoC code is entirely logical and
> > > > correct.  
> > > 
> > > I'm afraid I don't agree. See above. This code really should never have been
> > > here. It is not i.MX6-specific as I pointed out above, nor is it necessarily
> > > applicable to all i.MX6 boards that use those phy devices.  
> > 
> > Then we will have to agree to disagree, sorry.
> 
> Please forgive me if I am appearing a bit stubborn.
> If it is not too much to ask, I would really like to know where my reasoning
> is wrong?
> Maybe you can explain to me how to solve the following real-life conflict that
> this introduces:
> 
> Suppose we have a board with an i.MX6Q and a KSZ9031 connected to it. Suppose
> I now take a USB stick with a LAN7800 ethernet chip and a KSZ9031 PHY. These
> USB sticks do exist, and it does not seem unthinkable to me that one would
> connect them to such an i.MX6 system in order to get a second LAN port.

Thanks.  I've already covered how this can be delt with in some code
I've posted in this thread.  Therefore, I have nothing further to add
to this point, apart from pointing out that I've provided a solution
so as far as I'm concerned, it's entirely solvable, and warrants no
further argument.

Maybe a discussion about solutions would be appropriate, but merely
re-raising the same point while ignoring proposed solutions is not
a productive way forward.
Russell King (Oracle) March 31, 2020, 5:03 p.m. UTC | #19
On Tue, Mar 31, 2020 at 10:44:59AM +0200, David Jander wrote:
> I have checked with the datasheet of the AR8035, and AFAICS, what the code
> does is this:
> 
>  - Disable the SmartEEE feature of the phy. The comment in the code implies
>    that for some reason it doesn't work, but the reason itself is not given.
>    Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
>    setting. There is no reason to believe this problem is specific to the
>    i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
>    that via the DT. Once that is done, it has no place here.
> 
>  - Set the external clock output to 125MHz. This is needed because the i.MX6
>    needs a 125MHz reference clock input. But it is not a requirement to use
>    this output. It is perfectly fine and possible to design a board that uses
>    an external oscillator for this. It is also possible that an i.MX6 design
>    has such a phy connected to a MAC behind a switch or some other interface.
>    Independent of i.MX6 this setting can also be necessary for other hardware
>    designs, based on different SoC's. In summary, this is a feature of the
>    specific hardware design at hand, and has nothing to do with the i.MX6
>    specifically. This should definitely be exposed through the DT and not be
>    here.
> 
>  - Enable TXC delay. To clarify, the RGMII specification version 1 specified
>    that the RXC and TXC traces should be routed long enough to introduce a
>    certain delay to the clock signal, or the delay should be introduced via
>    other means. In a later version of the spec, a provision was given for MAC
>    or PHY devices to generate this delay internally. The i.MX6 MAC interface
>    is unable to generate the required delay internally, so it has to be taken
>    care of either by the board layout, or by the PHY device. This is the
>    crucial point: The amount of delay set by the PHY delay register depends on
>    the board layout. It should NEVER be hard-coded in SoC setup code. The
>    correct way is to specify it in the DT. Needless to say that this too,
>    isn't i.MX6-specific.

Let's say this is simple to do, shall we?

So, if I disable the call to ar8031_phy_fixup() from ar8035_phy_fixup(),
and add the following to the imx6qdl-sr-som.dtsi fragment:

&fec {
...
        phy-handle = <&phy>;

        mdio {
                #address-cells = <1>;
                #size-cells = <0>;

                phy: ethernet-phy@0 {
                        reg = <0>;
                        qca,clk-out-frequency = <125000000>;
                };
        };
};

Note that phy-mode is already RGMII-ID.  This should work, right?

The link still comes up, which is good, but the PHY registers for
the clock output are wrong.

MMD 3 register 0x8016 contains 0xb282, not 0xb29a which it has
_with_ the quirk - and thus the above clock frequency stated in
DT is not being selected.  Forcing this register to the right
value restores networking.

Yes, the PHY driver is being used:

Qualcomm Atheros AR8035 2188000.ethernet-1:00: attached PHY driver [Qualcomm Atheros AR8035] (mii_bus:phy_addr=2188000.ethernet-1:00, irq=POLL)

So that's not the problem.

Adding some debug shows that the phy_device that is being used is
the correct one:

Qualcomm Atheros AR8035 2188000.ethernet-1:00: node=/soc/aips-bus@2100000/ethernet@2188000/mdio/ethernet-phy@0

and it is correctly parsing the clk-out-frequency property:

Qualcomm Atheros AR8035 2188000.ethernet-1:00: cof=0 125000000

When we get to attaching the PHY however:

Qualcomm Atheros AR8035 2188000.ethernet-1:00: clk_25m_mask=0004 clk_25m_reg=0000

which is just wrong.  That's because:

                if (at803x_match_phy_id(phydev, ATH8030_PHY_ID) ||
                    at803x_match_phy_id(phydev, ATH8035_PHY_ID)) {
                        priv->clk_25m_reg &= ~AT8035_CLK_OUT_MASK;
                        priv->clk_25m_mask &= ~AT8035_CLK_OUT_MASK;
                }

is patently untested - those "~" should not be there.  These masks
are one-bits-set for the values that comprise the fields, not
zero-bits-set.

So, I see a patch series is going to be necessary to fix the cockup(s)
in the PHY driver before we can do anything with DT files.
Oleksij Rempel March 31, 2020, 5:16 p.m. UTC | #20
On Tue, Mar 31, 2020 at 06:03:00PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Mar 31, 2020 at 10:44:59AM +0200, David Jander wrote:
> > I have checked with the datasheet of the AR8035, and AFAICS, what the code
> > does is this:
> > 
> >  - Disable the SmartEEE feature of the phy. The comment in the code implies
> >    that for some reason it doesn't work, but the reason itself is not given.
> >    Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
> >    setting. There is no reason to believe this problem is specific to the
> >    i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
> >    that via the DT. Once that is done, it has no place here.
> > 
> >  - Set the external clock output to 125MHz. This is needed because the i.MX6
> >    needs a 125MHz reference clock input. But it is not a requirement to use
> >    this output. It is perfectly fine and possible to design a board that uses
> >    an external oscillator for this. It is also possible that an i.MX6 design
> >    has such a phy connected to a MAC behind a switch or some other interface.
> >    Independent of i.MX6 this setting can also be necessary for other hardware
> >    designs, based on different SoC's. In summary, this is a feature of the
> >    specific hardware design at hand, and has nothing to do with the i.MX6
> >    specifically. This should definitely be exposed through the DT and not be
> >    here.
> > 
> >  - Enable TXC delay. To clarify, the RGMII specification version 1 specified
> >    that the RXC and TXC traces should be routed long enough to introduce a
> >    certain delay to the clock signal, or the delay should be introduced via
> >    other means. In a later version of the spec, a provision was given for MAC
> >    or PHY devices to generate this delay internally. The i.MX6 MAC interface
> >    is unable to generate the required delay internally, so it has to be taken
> >    care of either by the board layout, or by the PHY device. This is the
> >    crucial point: The amount of delay set by the PHY delay register depends on
> >    the board layout. It should NEVER be hard-coded in SoC setup code. The
> >    correct way is to specify it in the DT. Needless to say that this too,
> >    isn't i.MX6-specific.
> 
> Let's say this is simple to do, shall we?
> 
> So, if I disable the call to ar8031_phy_fixup() from ar8035_phy_fixup(),
> and add the following to the imx6qdl-sr-som.dtsi fragment:
> 
> &fec {
> ...
>         phy-handle = <&phy>;
> 
>         mdio {
>                 #address-cells = <1>;
>                 #size-cells = <0>;
> 
>                 phy: ethernet-phy@0 {
>                         reg = <0>;
>                         qca,clk-out-frequency = <125000000>;
>                 };
>         };
> };
> 
> Note that phy-mode is already RGMII-ID.  This should work, right?
> 
> The link still comes up, which is good, but the PHY registers for
> the clock output are wrong.
> 
> MMD 3 register 0x8016 contains 0xb282, not 0xb29a which it has
> _with_ the quirk - and thus the above clock frequency stated in
> DT is not being selected.  Forcing this register to the right
> value restores networking.
> 
> Yes, the PHY driver is being used:
> 
> Qualcomm Atheros AR8035 2188000.ethernet-1:00: attached PHY driver [Qualcomm Atheros AR8035] (mii_bus:phy_addr=2188000.ethernet-1:00, irq=POLL)
> 
> So that's not the problem.
> 
> Adding some debug shows that the phy_device that is being used is
> the correct one:
> 
> Qualcomm Atheros AR8035 2188000.ethernet-1:00: node=/soc/aips-bus@2100000/ethernet@2188000/mdio/ethernet-phy@0
> 
> and it is correctly parsing the clk-out-frequency property:
> 
> Qualcomm Atheros AR8035 2188000.ethernet-1:00: cof=0 125000000
> 
> When we get to attaching the PHY however:
> 
> Qualcomm Atheros AR8035 2188000.ethernet-1:00: clk_25m_mask=0004 clk_25m_reg=0000
> 
> which is just wrong.  That's because:
> 
>                 if (at803x_match_phy_id(phydev, ATH8030_PHY_ID) ||
>                     at803x_match_phy_id(phydev, ATH8035_PHY_ID)) {
>                         priv->clk_25m_reg &= ~AT8035_CLK_OUT_MASK;
>                         priv->clk_25m_mask &= ~AT8035_CLK_OUT_MASK;
>                 }
> 
> is patently untested - those "~" should not be there.  These masks
> are one-bits-set for the values that comprise the fields, not
> zero-bits-set.
> 
> So, I see a patch series is going to be necessary to fix the cockup(s)
> in the PHY driver before we can do anything with DT files.

I'm glad you found this issues :D I made a patch to fix it last week.
And it was a reason to send a patch for disabling _all_ fixups :)

Regards,
Oleksij
Russell King (Oracle) March 31, 2020, 5:46 p.m. UTC | #21
On Tue, Mar 31, 2020 at 07:16:59PM +0200, Oleksij Rempel wrote:
> On Tue, Mar 31, 2020 at 06:03:00PM +0100, Russell King - ARM Linux admin wrote:
> > On Tue, Mar 31, 2020 at 10:44:59AM +0200, David Jander wrote:
> > > I have checked with the datasheet of the AR8035, and AFAICS, what the code
> > > does is this:
> > > 
> > >  - Disable the SmartEEE feature of the phy. The comment in the code implies
> > >    that for some reason it doesn't work, but the reason itself is not given.
> > >    Anyway, disabling SmartEEE should IMHO opinion be controlled by a DT
> > >    setting. There is no reason to believe this problem is specific to the
> > >    i.MX6. Besides, it is a feature of the phy, so it seems logical to expose
> > >    that via the DT. Once that is done, it has no place here.
> > > 
> > >  - Set the external clock output to 125MHz. This is needed because the i.MX6
> > >    needs a 125MHz reference clock input. But it is not a requirement to use
> > >    this output. It is perfectly fine and possible to design a board that uses
> > >    an external oscillator for this. It is also possible that an i.MX6 design
> > >    has such a phy connected to a MAC behind a switch or some other interface.
> > >    Independent of i.MX6 this setting can also be necessary for other hardware
> > >    designs, based on different SoC's. In summary, this is a feature of the
> > >    specific hardware design at hand, and has nothing to do with the i.MX6
> > >    specifically. This should definitely be exposed through the DT and not be
> > >    here.
> > > 
> > >  - Enable TXC delay. To clarify, the RGMII specification version 1 specified
> > >    that the RXC and TXC traces should be routed long enough to introduce a
> > >    certain delay to the clock signal, or the delay should be introduced via
> > >    other means. In a later version of the spec, a provision was given for MAC
> > >    or PHY devices to generate this delay internally. The i.MX6 MAC interface
> > >    is unable to generate the required delay internally, so it has to be taken
> > >    care of either by the board layout, or by the PHY device. This is the
> > >    crucial point: The amount of delay set by the PHY delay register depends on
> > >    the board layout. It should NEVER be hard-coded in SoC setup code. The
> > >    correct way is to specify it in the DT. Needless to say that this too,
> > >    isn't i.MX6-specific.
> > 
> > Let's say this is simple to do, shall we?
> > 
> > So, if I disable the call to ar8031_phy_fixup() from ar8035_phy_fixup(),
> > and add the following to the imx6qdl-sr-som.dtsi fragment:
> > 
> > &fec {
> > ...
> >         phy-handle = <&phy>;
> > 
> >         mdio {
> >                 #address-cells = <1>;
> >                 #size-cells = <0>;
> > 
> >                 phy: ethernet-phy@0 {
> >                         reg = <0>;
> >                         qca,clk-out-frequency = <125000000>;
> >                 };
> >         };
> > };
> > 
> > Note that phy-mode is already RGMII-ID.  This should work, right?
> > 
> > The link still comes up, which is good, but the PHY registers for
> > the clock output are wrong.
> > 
> > MMD 3 register 0x8016 contains 0xb282, not 0xb29a which it has
> > _with_ the quirk - and thus the above clock frequency stated in
> > DT is not being selected.  Forcing this register to the right
> > value restores networking.
> > 
> > Yes, the PHY driver is being used:
> > 
> > Qualcomm Atheros AR8035 2188000.ethernet-1:00: attached PHY driver [Qualcomm Atheros AR8035] (mii_bus:phy_addr=2188000.ethernet-1:00, irq=POLL)
> > 
> > So that's not the problem.
> > 
> > Adding some debug shows that the phy_device that is being used is
> > the correct one:
> > 
> > Qualcomm Atheros AR8035 2188000.ethernet-1:00: node=/soc/aips-bus@2100000/ethernet@2188000/mdio/ethernet-phy@0
> > 
> > and it is correctly parsing the clk-out-frequency property:
> > 
> > Qualcomm Atheros AR8035 2188000.ethernet-1:00: cof=0 125000000
> > 
> > When we get to attaching the PHY however:
> > 
> > Qualcomm Atheros AR8035 2188000.ethernet-1:00: clk_25m_mask=0004 clk_25m_reg=0000
> > 
> > which is just wrong.  That's because:
> > 
> >                 if (at803x_match_phy_id(phydev, ATH8030_PHY_ID) ||
> >                     at803x_match_phy_id(phydev, ATH8035_PHY_ID)) {
> >                         priv->clk_25m_reg &= ~AT8035_CLK_OUT_MASK;
> >                         priv->clk_25m_mask &= ~AT8035_CLK_OUT_MASK;
> >                 }
> > 
> > is patently untested - those "~" should not be there.  These masks
> > are one-bits-set for the values that comprise the fields, not
> > zero-bits-set.
> > 
> > So, I see a patch series is going to be necessary to fix the cockup(s)
> > in the PHY driver before we can do anything with DT files.
> 
> I'm glad you found this issues :D I made a patch to fix it last week.
> And it was a reason to send a patch for disabling _all_ fixups :)

So I'm wasting my time creating a patch right now, and this patch to
fix an obvious problem has not been picked up into -net yet, and isn't
part of the 5.6 kernel.

Okay, I'll look a this again once 5.7 is out; I've wasted enough time
on this already.
Oleksij Rempel April 1, 2020, 6:33 a.m. UTC | #22
On Tue, Mar 31, 2020 at 09:19:18AM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Mar 31, 2020 at 10:00:12AM +0200, Marc Kleine-Budde wrote:
> > On 3/31/20 9:54 AM, Russell King - ARM Linux admin wrote:
> > > On Tue, Mar 31, 2020 at 09:47:19AM +0200, Marc Kleine-Budde wrote:
> > >> On 3/30/20 7:41 PM, Russell King - ARM Linux admin wrote:
> > >>>>> arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
> > >>>>> arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
> > >>>>> arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
> > >>>>> arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
> > >>>
> > >>> As far as I'm concerned, the AR8035 fixup is there with good reason.
> > >>> It's not just "random" but is required to make the AR8035 usable with
> > >>> the iMX6 SoCs.  Not because of a board level thing, but because it's
> > >>> required for the AR8035 to be usable with an iMX6 SoC.
> > >>
> > >> Is this still ture, if the AR8035 is attached to a switch behind an iMX6?
> > > 
> > > Do you know of such a setup, or are you talking about theoretical
> > > situations?
> > 
> > Granted, not for the AR8035, but for one of the KSZ Phys. This is why
> > Oleksij started looking into this issue in the first place.
> 
> Maybe there's an easy solution to this - check whether the PHY being
> fixed up is connected to the iMX6 SoC:
> 
> static bool phy_connected_to(struct phy_device *phydev,
> 			     const struct of_device_id *matches)
> {
> 	struct device_node *np, *phy_np;
> 
> 	for_each_matching_node(np, matches) {
> 		phy_np = of_parse_phandle(np, "phy-handle", 0);
> 		if (!phy_np)
> 			phy_np = of_parse_phandle(np, "phy", 0);
> 		if (!phy_np)
> 			phy_np = of_parse_phandle(np, "phy-device", 0);
> 		if (phy_np && phydev->mdio.dev.of_node == phy_np) {
> 			of_node_put(phy_np);
> 			of_node_put(np);
> 			return true;
> 		}
> 		of_node_put(phy_np);
> 	}
> 	return false;
> }
> 
> static struct of_device_id imx_fec_ids[] = {
> 	{ .compatible = "fsl,imx6q-fec", },
> 	...
> 	{ },
> };
> 
> static bool phy_connected_to_fec(struct phy_device *phydev)
> {
> 	return phy_connected_to(phydev, imx_fec_ids);
> }
> 
> and then in the fixups:
> 
> 	if (!phy_connected_to_fec(phydev))
> 		return 0;
> 

Ok, i see. We will limit fixup impact to some specific devicetree nodes.
And if we wont to disable fixup completely, some special devicetree binding will
be needed. Correct? Is this acceptable mainline way?
For the usb ethernet fixups we will need some thing similar.

Regards,
Oleksij
Florian Fainelli April 1, 2020, 5:10 p.m. UTC | #23
On 3/31/2020 11:33 PM, Oleksij Rempel wrote:
> On Tue, Mar 31, 2020 at 09:19:18AM +0100, Russell King - ARM Linux admin wrote:
>> On Tue, Mar 31, 2020 at 10:00:12AM +0200, Marc Kleine-Budde wrote:
>>> On 3/31/20 9:54 AM, Russell King - ARM Linux admin wrote:
>>>> On Tue, Mar 31, 2020 at 09:47:19AM +0200, Marc Kleine-Budde wrote:
>>>>> On 3/30/20 7:41 PM, Russell King - ARM Linux admin wrote:
>>>>>>>> arch/arm/mach-imx/mach-imx6q.c:167:		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
>>>>>>>> arch/arm/mach-imx/mach-imx6q.c:169:		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
>>>>>>>> arch/arm/mach-imx/mach-imx6q.c:171:		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffef,
>>>>>>>> arch/arm/mach-imx/mach-imx6q.c:173:		phy_register_fixup_for_uid(PHY_ID_AR8035, 0xffffffef,
>>>>>>
>>>>>> As far as I'm concerned, the AR8035 fixup is there with good reason.
>>>>>> It's not just "random" but is required to make the AR8035 usable with
>>>>>> the iMX6 SoCs.  Not because of a board level thing, but because it's
>>>>>> required for the AR8035 to be usable with an iMX6 SoC.
>>>>>
>>>>> Is this still ture, if the AR8035 is attached to a switch behind an iMX6?
>>>>
>>>> Do you know of such a setup, or are you talking about theoretical
>>>> situations?
>>>
>>> Granted, not for the AR8035, but for one of the KSZ Phys. This is why
>>> Oleksij started looking into this issue in the first place.
>>
>> Maybe there's an easy solution to this - check whether the PHY being
>> fixed up is connected to the iMX6 SoC:
>>
>> static bool phy_connected_to(struct phy_device *phydev,
>> 			     const struct of_device_id *matches)
>> {
>> 	struct device_node *np, *phy_np;
>>
>> 	for_each_matching_node(np, matches) {
>> 		phy_np = of_parse_phandle(np, "phy-handle", 0);
>> 		if (!phy_np)
>> 			phy_np = of_parse_phandle(np, "phy", 0);
>> 		if (!phy_np)
>> 			phy_np = of_parse_phandle(np, "phy-device", 0);
>> 		if (phy_np && phydev->mdio.dev.of_node == phy_np) {
>> 			of_node_put(phy_np);
>> 			of_node_put(np);
>> 			return true;
>> 		}
>> 		of_node_put(phy_np);
>> 	}
>> 	return false;
>> }
>>
>> static struct of_device_id imx_fec_ids[] = {
>> 	{ .compatible = "fsl,imx6q-fec", },
>> 	...
>> 	{ },
>> };
>>
>> static bool phy_connected_to_fec(struct phy_device *phydev)
>> {
>> 	return phy_connected_to(phydev, imx_fec_ids);
>> }
>>
>> and then in the fixups:
>>
>> 	if (!phy_connected_to_fec(phydev))
>> 		return 0;
>>
> 
> Ok, i see. We will limit fixup impact to some specific devicetree nodes.
> And if we wont to disable fixup completely, some special devicetree binding will
> be needed. Correct? Is this acceptable mainline way?
> For the usb ethernet fixups we will need some thing similar.

It looks like IMX is using phy_register_fixup_for_uid() which is not
able to scope the fixup against a specific MDIO bus controller name, I
would suggest we introduce one or two variants of that function in order
to allow specifying the scope against a MDIO bus controller name, and
another variant which can take a comparison function, such that the
logic that Russell has suggested above could be passed a as callback to
a new function: phy_register_fixup_cmp() or whatever is an appropriate
name. Internally, those functions would ideal all be implemented by the
same core function which is able to use any key/value as a match.
diff mbox series

Patch

diff --git a/arch/arm/mach-imx/mach-imx6q.c b/arch/arm/mach-imx/mach-imx6q.c
index edd26e0ffeec..aabf0d8c23a9 100644
--- a/arch/arm/mach-imx/mach-imx6q.c
+++ b/arch/arm/mach-imx/mach-imx6q.c
@@ -162,7 +162,8 @@  static int ar8035_phy_fixup(struct phy_device *dev)
 
 static void __init imx6q_enet_phy_init(void)
 {
-	if (IS_BUILTIN(CONFIG_PHYLIB)) {
+	if (IS_BUILTIN(CONFIG_PHYLIB) &&
+	    IS_BUILTIN(CONFIG_DEPRECATED_PHY_FIXUPS)) {
 		phy_register_fixup_for_uid(PHY_ID_KSZ9021, MICREL_PHY_ID_MASK,
 				ksz9021rn_phy_fixup);
 		phy_register_fixup_for_uid(PHY_ID_KSZ9031, MICREL_PHY_ID_MASK,
diff --git a/arch/arm/mach-imx/mach-imx6sx.c b/arch/arm/mach-imx/mach-imx6sx.c
index d5310bf307ff..fdd9bef27625 100644
--- a/arch/arm/mach-imx/mach-imx6sx.c
+++ b/arch/arm/mach-imx/mach-imx6sx.c
@@ -35,7 +35,8 @@  static int ar8031_phy_fixup(struct phy_device *dev)
 #define PHY_ID_AR8031   0x004dd074
 static void __init imx6sx_enet_phy_init(void)
 {
-	if (IS_BUILTIN(CONFIG_PHYLIB))
+	if (IS_BUILTIN(CONFIG_PHYLIB) &&
+	    IS_BUILTIN(CONFIG_DEPRECATED_PHY_FIXUPS))
 		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffff,
 					   ar8031_phy_fixup);
 }
diff --git a/arch/arm/mach-imx/mach-imx7d.c b/arch/arm/mach-imx/mach-imx7d.c
index ebb27592a9f7..1d3d67c247a3 100644
--- a/arch/arm/mach-imx/mach-imx7d.c
+++ b/arch/arm/mach-imx/mach-imx7d.c
@@ -49,7 +49,8 @@  static int bcm54220_phy_fixup(struct phy_device *dev)
 
 static void __init imx7d_enet_phy_init(void)
 {
-	if (IS_BUILTIN(CONFIG_PHYLIB)) {
+	if (IS_BUILTIN(CONFIG_PHYLIB) &&
+	    IS_BUILTIN(CONFIG_DEPRECATED_PHY_FIXUPS)) {
 		phy_register_fixup_for_uid(PHY_ID_AR8031, 0xffffffff,
 					   ar8031_phy_fixup);
 		phy_register_fixup_for_uid(PHY_ID_BCM54220, 0xffffffff,
diff --git a/arch/arm/mach-mxs/mach-mxs.c b/arch/arm/mach-mxs/mach-mxs.c
index c109f47e9cbc..b4b631242080 100644
--- a/arch/arm/mach-mxs/mach-mxs.c
+++ b/arch/arm/mach-mxs/mach-mxs.c
@@ -257,7 +257,8 @@  static void __init apx4devkit_init(void)
 {
 	enable_clk_enet_out();
 
-	if (IS_BUILTIN(CONFIG_PHYLIB))
+	if (IS_BUILTIN(CONFIG_PHYLIB) &&
+	    IS_BUILTIN(CONFIG_DEPRECATED_PHY_FIXUPS))
 		phy_register_fixup_for_uid(PHY_ID_KSZ8051, MICREL_PHY_ID_MASK,
 					   apx4devkit_phy_fixup);
 }
diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 9dabe03a668c..f54428ddf058 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -249,6 +249,22 @@  config LED_TRIGGER_PHY
 		<Speed in megabits>Mbps OR <Speed in gigabits>Gbps OR link
 		for any speed known to the PHY.
 
+config DEPRECATED_PHY_FIXUPS
+	bool "Enable deprecated PHY fixups"
+	default y
+	---help---
+	  In the early days it was common practice to configure PHYs by adding a
+	  phy_register_fixup*() in the machine code. This practice turned out to
+	  be potentially dangerous, because:
+	  - it affects all PHYs in the system
+	  - these register changes are usually not preserved during PHY reset
+	    or suspend/resume cycle.
+	  - it complicates debugging, since these configuration changes were not
+	    done by the actual PHY driver.
+	  This option allows to disable all fixups which are identified as
+	  potentially harmful and give the developers a chance to implement the
+	  proper configuration via the device tree (e.g.: phy-mode) and/or the
+	  related PHY drivers.
 
 comment "MII PHY device drivers"