[RFC,net-next,0/2] net: dsa: bcm_sf2: Utilize PHYLINK for all ports

Message ID	20220725214942.97207-1-f.fainelli@gmail.com (mailing list archive)
Headers	show Return-Path: <netdev-owner@kernel.org> From: Florian Fainelli <f.fainelli@gmail.com> To: netdev@vger.kernel.org Cc: Florian Fainelli <f.fainelli@gmail.com>, Andrew Lunn <andrew@lunn.ch>, Vivien Didelot <vivien.didelot@gmail.com>, Vladimir Oltean <olteanv@gmail.com>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, linux-kernel@vger.kernel.org (open list) Subject: [RFC net-next 0/2] net: dsa: bcm_sf2: Utilize PHYLINK for all ports Date: Mon, 25 Jul 2022 14:49:40 -0700 Message-Id: <20220725214942.97207-1-f.fainelli@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	net: dsa: bcm_sf2: Utilize PHYLINK for all ports \| expand [RFC,net-next,0/2] net: dsa: bcm_sf2: Utilize PHYLINK for all ports [RFC,net-next,1/2] net: dsa: bcm_sf2: Introduce helper for port override offset [RFC,net-next,2/2] net: dsa: bcm_sf2: Have PHYLINK configure CPU/IMP port(s)

Florian Fainelli July 25, 2022, 9:49 p.m. UTC

Hi all,

Although this should work for all devices, since most DTBs on the
platforms where bcm_sf2 is use do not populate a 'fixed-link' property
for their CPU ports, but rely on the Ethernet controller DT node doing
that, we will not be registering a 'fixed-link' instance for CPU ports.

This still works because the switch matches the configuration of the
Ethernet controller, but on BCM4908 where we want to force 2GBits/sec,
that I cannot test, not so sure.

So as of now, this series does not produce register for register
compatile changes.

Florian Fainelli (2):
  net: dsa: bcm_sf2: Introduce helper for port override offset
  net: dsa: bcm_sf2: Have PHYLINK configure CPU/IMP port(s)

 drivers/net/dsa/bcm_sf2.c | 130 ++++++++++++++++++--------------------
 1 file changed, 63 insertions(+), 67 deletions(-)

Vladimir Oltean July 26, 2022, 11:23 a.m. UTC | #1

Hi Florian,

On Mon, Jul 25, 2022 at 02:49:40PM -0700, Florian Fainelli wrote:
> Hi all,
> 
> Although this should work for all devices, since most DTBs on the
> platforms where bcm_sf2 is use do not populate a 'fixed-link' property
> for their CPU ports, but rely on the Ethernet controller DT node doing
> that, we will not be registering a 'fixed-link' instance for CPU ports.
> 
> This still works because the switch matches the configuration of the
> Ethernet controller, but on BCM4908 where we want to force 2GBits/sec,
> that I cannot test, not so sure.
> 
> So as of now, this series does not produce register for register
> compatile changes.

My understanding of this change set is that it stops overriding the link
status of IMP ports from dsa_switch_ops :: setup (unconditionally) and
it moves it to phylink_mac_link_up(). But the latter may not be called
when you lack a fixed-link, and yet the IMP port(s) still work(s).

This begs the natural question, is overriding the link status ever needed?
It was done since the initial commit for this driver, 246d7f773c13c.

Florian Fainelli July 26, 2022, 4:14 p.m. UTC | #2

On 7/26/22 04:23, Vladimir Oltean wrote:
> Hi Florian,
> 
> On Mon, Jul 25, 2022 at 02:49:40PM -0700, Florian Fainelli wrote:
>> Hi all,
>>
>> Although this should work for all devices, since most DTBs on the
>> platforms where bcm_sf2 is use do not populate a 'fixed-link' property
>> for their CPU ports, but rely on the Ethernet controller DT node doing
>> that, we will not be registering a 'fixed-link' instance for CPU ports.
>>
>> This still works because the switch matches the configuration of the
>> Ethernet controller, but on BCM4908 where we want to force 2GBits/sec,
>> that I cannot test, not so sure.
>>
>> So as of now, this series does not produce register for register
>> compatile changes.
> 
> My understanding of this change set is that it stops overriding the link
> status of IMP ports from dsa_switch_ops :: setup (unconditionally) and
> it moves it to phylink_mac_link_up(). But the latter may not be called
> when you lack a fixed-link, and yet the IMP port(s) still work(s).
> 
> This begs the natural question, is overriding the link status ever needed?

It was until we started to unconditionally reset the switch using the "external" reset method as opposed to the "internal" reset method which turned out not to be functional:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eee87e4377a4b86dc2eea0ade162b0dc33f40576

At any rate (no pun intended), 4908 will want a 2GBit/sec IMP port to be set-up and we have no way to do that other than by forcing that setting, either through the bcm_sf2_imp_setup() method or via a hack to the mac_link_up() callback. This is kind of orthogonal in the sense that there is no "official" support for speed 2000 mbits/sec anyway in the emulated SW PHY, PHYLINK or anywhere in between but if we want to fully transition over to PHYLINK to configure all ports, which is absolutely the goal, we will need to find a solution one way or another.

I would prefer if also we sort of "transferred" the 'fixed-link' parameters from the DSA Ethernet controller attached to the CPU port onto the PHYLINK instance of the CPU port in the switch as they ought to be strictly identical otherwise it just won't work. This would ensure that we continue to force the link and it would make me sleep better a night to know that the IMP port is operating strictly the same way it was. My script compares register values before/after for the registers that are static and this was flagged as a difference.

Vladimir Oltean July 27, 2022, 1:29 a.m. UTC | #3

On Tue, Jul 26, 2022 at 09:14:17AM -0700, Florian Fainelli wrote:
> > This begs the natural question, is overriding the link status ever needed?
> 
> It was until we started to unconditionally reset the switch using the
> "external" reset method as opposed to the "internal" reset method
> which turned out not to be functional:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eee87e4377a4b86dc2eea0ade162b0dc33f40576

Ok, I see.

> At any rate (no pun intended), 4908 will want a 2GBit/sec IMP port to
> be set-up and we have no way to do that other than by forcing that
> setting, either through the bcm_sf2_imp_setup() method or via a hack
> to the mac_link_up() callback. This is kind of orthogonal in the sense
> that there is no "official" support for speed 2000 mbits/sec anyway in
> the emulated SW PHY, PHYLINK or anywhere in between but if we want to
> fully transition over to PHYLINK to configure all ports, which is
> absolutely the goal, we will need to find a solution one way or
> another.

So I made some tests with speed = <2000>; in the device tree and in a
way I'm more confused than when I started. I was expecting phylink_validate()
to somehow fail but this isn't at all what happened. Instead everything
seems to work just fine, minus some ergonomic details (some prints).

So in the case of a fixed-link, phylink_validate() is actually called
twice, once directly from phylink_create() and once almost immediately
afterwards from phylink_parse_fixedlink(). Both validations are of the
inquisitive kind rather than the confrontational kind, i.e. their return
value isn't checked, and "pl->supported"/"pl->link_config.advertising"
are initially filled by phylink with all ones, in order for the driver
to reduce this to all link mode bits that are supported.
Minor side note, this second validation done during fixed-link parsing
is redundant IMO, because nothing relevant inside the arguments that we
pass to pl->mac_ops->validate() will have changed in any way between the
calls.

Anyway, if phylink_validate() is never going to confront us about the
pl->supported link mode mask becoming zero, you might wonder why it
calls even inquisitively in the first place.

Essentially phylink_parse_fixedlink() just wants to print in case it's
using a link speed that isn't supported by the driver. To do that, it
calls phy_lookup_setting() where one of the arguments is pl->supported
itself. But in our case, there is no link mode for speed 2000, although
that shouldn't matter, since no Ethernet PHY sees or needs to advertise
this speed, so phy_lookup_setting() finds nothing. I suspect this is
largely due to historical reasons, where the link modes were the common
denominator at the level of the driver visible phylink_validate() API.
Today we may simply extend config->mac_capabilities and forgo adding
bogus link modes just for this to work.

Curiously, even if we go to the extra lengths of silencing phylink's
"fixed link not recognised" warning, nothing seems to be broken even if
we don't do that.

Immediately after pl->supported has been populated by the inquisitive
phylink_validate(), phylink clears it (which means that the pl->supported
variable used above could have very well been just a temporary on-stack
variable), and just populates some fields.
Namely the pause fields, and a *single* speed, corresponding to "s"
(what phy_lookup_setting() found).

	linkmode_zero(pl->supported);
	phylink_set(pl->supported, MII);
	phylink_set(pl->supported, Pause);
	phylink_set(pl->supported, Asym_Pause);
	phylink_set(pl->supported, Autoneg);
	if (s) {
		__set_bit(s->bit, pl->supported);
		__set_bit(s->bit, pl->link_config.lp_advertising);
	} else {
		phylink_warn(pl, "fixed link %s duplex %dMbps not recognised\n",
			     pl->link_config.duplex == DUPLEX_FULL ? "full" : "half",
			     pl->link_config.speed);
	}

Why phylink even bothers to keep the speed-related linkmode in
pl->supported, if it won't use it anywhere further, I can't answer.
I can even delete the "if (s) ... else ..." block altogether and nothing
seems to be adversely impacted.

In any case, the short version of the code walkthrough is that phylink
can apparently operate in fixed-link mode with a pl->supported and
pl->link_config.lp_advertising mask of link modes that doesn't contain
any speed, and this won't generate any error, although I'm not completely
sure it was intended either.

> I would prefer if also we sort of "transferred" the 'fixed-link'
> parameters from the DSA Ethernet controller attached to the CPU port
> onto the PHYLINK instance of the CPU port in the switch as they ought
> to be strictly identical otherwise it just won't work. This would
> ensure that we continue to force the link and it would make me sleep
> better a night to know that the IMP port is operating strictly the
> same way it was. My script compares register values before/after for
> the registers that are static and this was flagged as a difference.

There are several problems with transferring the parameters. Most
obvious derives from what we discussed about speed = <2000> just above:
the DSA master won't have it, either, because it's a non-standard speed.
Additionally, the DSA master may be missing the phy-mode too.

Second has to do with how we transfer the phy-mode assuming it isn't
missing on the master. RGMII modes are clearly problematic precisely
because we have so many driver interpretations of what they mean.
But "mii" and "rmii" aren't all that clear-cut either. Do we translate
into "mii" and "rmii" for DSA, or "rev-mii" and "rev-rmii"?
bcm_sf2 understands "rev-mii", but mv88e6xxx doesn't.

Florian Fainelli July 27, 2022, 2:17 a.m. UTC | #4

On 7/26/2022 6:29 PM, Vladimir Oltean wrote:
> On Tue, Jul 26, 2022 at 09:14:17AM -0700, Florian Fainelli wrote:
>>> This begs the natural question, is overriding the link status ever needed?
>>
>> It was until we started to unconditionally reset the switch using the
>> "external" reset method as opposed to the "internal" reset method
>> which turned out not to be functional:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=eee87e4377a4b86dc2eea0ade162b0dc33f40576
> 
> Ok, I see.
> 
>> At any rate (no pun intended), 4908 will want a 2GBit/sec IMP port to
>> be set-up and we have no way to do that other than by forcing that
>> setting, either through the bcm_sf2_imp_setup() method or via a hack
>> to the mac_link_up() callback. This is kind of orthogonal in the sense
>> that there is no "official" support for speed 2000 mbits/sec anyway in
>> the emulated SW PHY, PHYLINK or anywhere in between but if we want to
>> fully transition over to PHYLINK to configure all ports, which is
>> absolutely the goal, we will need to find a solution one way or
>> another.
> 
> So I made some tests with speed = <2000>; in the device tree and in a
> way I'm more confused than when I started. I was expecting phylink_validate()
> to somehow fail but this isn't at all what happened. Instead everything
> seems to work just fine, minus some ergonomic details (some prints).
> 
> So in the case of a fixed-link, phylink_validate() is actually called
> twice, once directly from phylink_create() and once almost immediately
> afterwards from phylink_parse_fixedlink(). Both validations are of the
> inquisitive kind rather than the confrontational kind, i.e. their return
> value isn't checked, and "pl->supported"/"pl->link_config.advertising"
> are initially filled by phylink with all ones, in order for the driver
> to reduce this to all link mode bits that are supported.
> Minor side note, this second validation done during fixed-link parsing
> is redundant IMO, because nothing relevant inside the arguments that we
> pass to pl->mac_ops->validate() will have changed in any way between the
> calls.
> 
> Anyway, if phylink_validate() is never going to confront us about the
> pl->supported link mode mask becoming zero, you might wonder why it
> calls even inquisitively in the first place.
> 
> Essentially phylink_parse_fixedlink() just wants to print in case it's
> using a link speed that isn't supported by the driver. To do that, it
> calls phy_lookup_setting() where one of the arguments is pl->supported
> itself. But in our case, there is no link mode for speed 2000, although
> that shouldn't matter, since no Ethernet PHY sees or needs to advertise
> this speed, so phy_lookup_setting() finds nothing. I suspect this is
> largely due to historical reasons, where the link modes were the common
> denominator at the level of the driver visible phylink_validate() API.
> Today we may simply extend config->mac_capabilities and forgo adding
> bogus link modes just for this to work.
> 
> Curiously, even if we go to the extra lengths of silencing phylink's
> "fixed link not recognised" warning, nothing seems to be broken even if
> we don't do that.
> 
> Immediately after pl->supported has been populated by the inquisitive
> phylink_validate(), phylink clears it (which means that the pl->supported
> variable used above could have very well been just a temporary on-stack
> variable), and just populates some fields.
> Namely the pause fields, and a *single* speed, corresponding to "s"
> (what phy_lookup_setting() found).
> 
> 	linkmode_zero(pl->supported);
> 	phylink_set(pl->supported, MII);
> 	phylink_set(pl->supported, Pause);
> 	phylink_set(pl->supported, Asym_Pause);
> 	phylink_set(pl->supported, Autoneg);
> 	if (s) {
> 		__set_bit(s->bit, pl->supported);
> 		__set_bit(s->bit, pl->link_config.lp_advertising);
> 	} else {
> 		phylink_warn(pl, "fixed link %s duplex %dMbps not recognised\n",
> 			     pl->link_config.duplex == DUPLEX_FULL ? "full" : "half",
> 			     pl->link_config.speed);
> 	}
> 
> Why phylink even bothers to keep the speed-related linkmode in
> pl->supported, if it won't use it anywhere further, I can't answer.
> I can even delete the "if (s) ... else ..." block altogether and nothing
> seems to be adversely impacted.
> 
> In any case, the short version of the code walkthrough is that phylink
> can apparently operate in fixed-link mode with a pl->supported and
> pl->link_config.lp_advertising mask of link modes that doesn't contain
> any speed, and this won't generate any error, although I'm not completely
> sure it was intended either.

OK, well maybe we need to syszbot the crap out of PHYLINK at some point, 
kunit anyone?

> 
>> I would prefer if also we sort of "transferred" the 'fixed-link'
>> parameters from the DSA Ethernet controller attached to the CPU port
>> onto the PHYLINK instance of the CPU port in the switch as they ought
>> to be strictly identical otherwise it just won't work. This would
>> ensure that we continue to force the link and it would make me sleep
>> better a night to know that the IMP port is operating strictly the
>> same way it was. My script compares register values before/after for
>> the registers that are static and this was flagged as a difference.
> 
> There are several problems with transferring the parameters. Most
> obvious derives from what we discussed about speed = <2000> just above:
> the DSA master won't have it, either, because it's a non-standard speed.
> Additionally, the DSA master may be missing the phy-mode too.
> 
> Second has to do with how we transfer the phy-mode assuming it isn't
> missing on the master. RGMII modes are clearly problematic precisely
> because we have so many driver interpretations of what they mean.
> But "mii" and "rmii" aren't all that clear-cut either. Do we translate
> into "mii" and "rmii" for DSA, or "rev-mii" and "rev-rmii"?
> bcm_sf2 understands "rev-mii", but mv88e6xxx doesn't.

Yep, you have me convinced. I suppose the course of action for me is to 
update the DTSes to also include a fixed-link property and phy-mode 
property in the CPU node, even if that duplicates what the Ethernet 
controller node already has, and then given a cycle or two, merge this 
patch series.

Vladimir Oltean July 27, 2022, 1:23 p.m. UTC | #5

On Tue, Jul 26, 2022 at 07:17:24PM -0700, Florian Fainelli wrote:
> > > I would prefer if also we sort of "transferred" the 'fixed-link'
> > > parameters from the DSA Ethernet controller attached to the CPU port
> > > onto the PHYLINK instance of the CPU port in the switch as they ought
> > > to be strictly identical otherwise it just won't work. This would
> > > ensure that we continue to force the link and it would make me sleep
> > > better a night to know that the IMP port is operating strictly the
> > > same way it was. My script compares register values before/after for
> > > the registers that are static and this was flagged as a difference.
> > 
> > There are several problems with transferring the parameters. Most
> > obvious derives from what we discussed about speed = <2000> just above:
> > the DSA master won't have it, either, because it's a non-standard speed.
> > Additionally, the DSA master may be missing the phy-mode too.
> > 
> > Second has to do with how we transfer the phy-mode assuming it isn't
> > missing on the master. RGMII modes are clearly problematic precisely
> > because we have so many driver interpretations of what they mean.
> > But "mii" and "rmii" aren't all that clear-cut either. Do we translate
> > into "mii" and "rmii" for DSA, or "rev-mii" and "rev-rmii"?
> > bcm_sf2 understands "rev-mii", but mv88e6xxx doesn't.
> 
> Yep, you have me convinced. I suppose the course of action for me is to
> update the DTSes to also include a fixed-link property and phy-mode property
> in the CPU node, even if that duplicates what the Ethernet controller node
> already has, and then given a cycle or two, merge this patch series.

I don't want to say that the 'peeking at the master' technique can't be
used at all, maybe if it would only come down to internal ports, and you
could give me a guarantee that the master does have a valid DT description
for your SoCs, we could consider doing this as a transition thing for
bcm_sf2 (on non-4908, see more below).

Having said that, I looked again at my analysis for bcm_sf2:
https://patchwork.kernel.org/project/netdevbpf/patch/20220723164635.1621911-1-vladimir.oltean@nxp.com/

and I notice that I said that arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi
*does* have a valid (complete at least) OF node description for the CPU port:

	port@8 {
		reg = <8>;
		phy-mode = "internal";
		ethernet = <&enet>;

		fixed-link {
			speed = <1000>;
			full-duplex;
		};
	};

but I didn't have the information that the speed is wrong (you said you
want it to operate at 2G).

Additionally (and curiously), the "enet" node lacks any link
description:

	enet: ethernet@2000 {
		compatible = "brcm,bcm4908-enet";
		reg = <0x2000 0x1000>;

		interrupts = <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>,
			     <GIC_SPI 87 IRQ_TYPE_LEVEL_HIGH>;
		interrupt-names = "rx", "tx";
	};

My question is, do you use the DTS file as seen in the mainline kernel
at all, or is it just for some sort of reference?

In case the 4908 really does have a fixed-link at 1000 provided by the
bootloader, you should be free to move the IMP setup code to
phylink_mac_link_up(), because phylink does have what it needs to run,
and just hack it to 2G for this SoC due to driver level knowledge,
despite what phylink thinks is going on. This can take place indefinitely.
In the long term, I don't know what else to suggest beyond fixing the
device tree to report the proper speed, sorry.

[RFC,net-next,0/2] net: dsa: bcm_sf2: Utilize PHYLINK for all ports

Message

Comments