[net-next] net: pcs: lynxi: fully reconfigure if link is down

Message ID e9831ec99acd5a8ab03c76fce87fa750c7041e60.1692273723.git.daniel@makrotopia.org (mailing list archive)
State Superseded

Commit Message

Daniel Golle Aug. 17, 2023, 12:04 p.m. UTC
On MT7988, when switching from 10GBase-R/5GBase-R/USXGMII to one of the
interface modes provided by mtk-pcs-lynxi, we need to make sure to
always perform a full configuration of the PHYA.
The idea behind not doing that was mostly to prevent an existing link
from going down needlessly. Hence we can just always perform a full
configuration in case the link is down.

Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
 drivers/net/pcs/pcs-mtk-lynxi.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Russell King (Oracle) Aug. 17, 2023, 1:03 p.m. UTC | #1
On Thu, Aug 17, 2023 at 01:04:06PM +0100, Daniel Golle wrote:
> On MT7988, when switching from 10GBase-R/5GBase-R/USXGMII to one of the
> interface modes provided by mtk-pcs-lynxi, we need to make sure to
> always perform a full configuration of the PHYA.
> The idea behind not doing that was mostly to prevent an existing link
> from going down needlessly. Hence we can just always perform a full
> configuration in case the link is down.

And this is racy - because in the case with inband signalling, the link
can come up between reading the status and acting on it. It could even
be already up, but the link status indicates it is not. Lastly, reading
the BMSR has side effects: the link status bit latches low until a read.

Basically, do not read the BMSR here, it's buggy to read it any place
other than pcs_get_state.
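
The latching behaviour can be illustrated with a small user-space model.
This is only a sketch, not driver code: struct model_bmsr and the
model_*() helpers are hypothetical names. It assumes a latch-low link
status bit, where a link drop is held until the next read and the read
itself clears the latch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a latch-low link status bit; not driver code. */
struct model_bmsr {
	bool link_now;     /* instantaneous link state */
	bool latched_low;  /* link has dropped since the last read */
};

/* Hardware side: the link changes state. */
static void model_set_link(struct model_bmsr *r, bool up)
{
	assert(r);
	if (!up)
		r->latched_low = true;
	r->link_now = up;
}

/* Software side: a read returns the latched value and clears the latch. */
static bool model_read_lstatus(struct model_bmsr *r)
{
	bool val;

	assert(r);
	val = r->latched_low ? false : r->link_now;
	r->latched_low = false;
	return val;
}
```

In this model, a drop followed by a recovery makes the first read report
"down" although the link is already up again, and that read consumes the
event, so a later reader never sees it.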

I think what we need to do instead is:

1) mtk_mac_select_pcs() returns the SGMII PCS or NULL. Presumably this
   is the driver which supports 10GBase-R/5GBase-R/USXGMII, and thus
   this returns NULL for 10GBase-R/5GBase-R/USXGMII.

   Phylink doesn't cater for mac_select_pcs() returning non-NULL for
   some modes and NULL for others, mainly because the presence of a PCS
   _used_ to cause phylink to change its behaviour (see
   https://lore.kernel.org/netdev/YZRLQqLblRurUd4V@shell.armlinux.org.uk/).
   That has now changed (we've got rid of the legacy stuff at last!) so
   there is no technical reason not to now allow that.

   Vladimir did have some arguments for not allowing it when we had the
   phylink_set_pcs() interface:
   https://lore.kernel.org/netdev/20211123181515.qqo7e4xbuu2ntwgt@skbuf/
   I'm assuming that your requirement now provides sufficient
   justification for allowing this.

   There is one bug that does need fixing first:
   phylink_change_inband_advert() checks pl->pcs->neg_mode without
   first checking whether pl->pcs is non-NULL.

   To allow this, phylink_major_config() needs:

   	pcs_changed = pcs && pl->pcs != pcs;

   to become:

   	pcs_changed = pl->pcs != pcs;

2) with (1) solved, there are a couple of callbacks that can be used to
   solve this - I think pcs_disable() is the one you want, which will
   be called when we switch to a mode where _this_ PCS will no longer
   be used (thus you can reset mpcs->interface to _NA, ready for when
   it is next brought into use.)
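
The effect of the pcs_changed expression in (1) can be seen in a small
stand-alone model. It is an illustration only: struct model_pcs and the
helper names are hypothetical and this is not phylink code. The two
checks disagree in exactly one case, switching from a PCS to no PCS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct phylink_pcs. */
struct model_pcs {
	int id;
};

/* Current check: a NULL new PCS never counts as a change. */
static bool pcs_changed_old(const struct model_pcs *cur,
			    const struct model_pcs *next)
{
	return next && cur != next;
}

/* Proposed check: going from a PCS to no PCS is a change as well. */
static bool pcs_changed_new(const struct model_pcs *cur,
			    const struct model_pcs *next)
{
	return cur != next;
}

/* The case where the two checks disagree: PCS -> no PCS. */
static void pcs_changed_demo(void)
{
	struct model_pcs sgmii = { .id = 1 };

	assert(!pcs_changed_old(&sgmii, NULL));
	assert(pcs_changed_new(&sgmii, NULL));
}
```

Only with the second form would the outgoing PCS be treated as replaced,
which is the precondition for the callback mentioned in (2).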

Would that work for you?
Denis Kirjanov Aug. 17, 2023, 1:13 p.m. UTC | #2
On 8/17/23 15:04, Daniel Golle wrote:
> On MT7988, when switching from 10GBase-R/5GBase-R/USXGMII to one of the
> interface modes provided by mtk-pcs-lynxi, we need to make sure to
> always perform a full configuration of the PHYA.
> The idea behind not doing that was mostly to prevent an existing link
> from going down needlessly. Hence we can just always perform a full
> configuration in case the link is down.
> 
> Signed-off-by: Daniel Golle <daniel@makrotopia.org>
> ---
>  drivers/net/pcs/pcs-mtk-lynxi.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/pcs/pcs-mtk-lynxi.c b/drivers/net/pcs/pcs-mtk-lynxi.c
> index b0f3ede945d96..788c2ccde064e 100644
> --- a/drivers/net/pcs/pcs-mtk-lynxi.c
> +++ b/drivers/net/pcs/pcs-mtk-lynxi.c
> @@ -108,8 +108,8 @@ static int mtk_pcs_lynxi_config(struct phylink_pcs *pcs, unsigned int neg_mode,
>  				bool permit_pause_to_mac)
>  {
>  	struct mtk_pcs_lynxi *mpcs = pcs_to_mtk_pcs_lynxi(pcs);
> -	bool mode_changed = false, changed;
> -	unsigned int rgc3, sgm_mode, bmcr;
> +	bool mode_changed = false, changed, link;
> +	unsigned int bm, rgc3, sgm_mode, bmcr;
>  	int advertise, link_timer;
>  
>  	advertise = phylink_mii_c22_pcs_encode_advertisement(interface,
> @@ -117,6 +117,10 @@ static int mtk_pcs_lynxi_config(struct phylink_pcs *pcs, unsigned int neg_mode,
>  	if (advertise < 0)
>  		return advertise;
>  
> +	/* Check if link is currently up */
> +	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
> +	link = !!(FIELD_GET(SGMII_BMSR, bm) & BMSR_LSTATUS);
> +
>  	/* Clearing IF_MODE_BIT0 switches the PCS to BASE-X mode, and
>  	 * we assume that fixes it's speed at bitrate = line rate (in
>  	 * other words, 1000Mbps or 2500Mbps).
> @@ -137,7 +141,10 @@ static int mtk_pcs_lynxi_config(struct phylink_pcs *pcs, unsigned int neg_mode,
>  		bmcr = 0;
>  	}
>  
> -	if (mpcs->interface != interface) {
> +	/* Do a full reconfiguration only if the link is down or the interface
> +	 * mode has changed
> +	 */
> +	if (mpcs->interface != interface || !link) {

btw, is it thread-safe to check the mpcs->interface member?
I had a quick look, and phylink_pcs_config can be invoked from different places,
and the code below does the following assignment:
mpcs->interface = interface;



>  		link_timer = phylink_get_link_timer_ns(interface);
>  		if (link_timer < 0)
>  			return link_timer;
Russell King (Oracle) Aug. 17, 2023, 1:30 p.m. UTC | #3
On Thu, Aug 17, 2023 at 04:13:44PM +0300, Denis Kirjanov wrote:
> btw, is it thread-safe to check the mpcs->interface member?
> I had a quick look, and phylink_pcs_config can be invoked from different places,
> and the code below does the following assignment:
> mpcs->interface = interface;

pcs_config is called from:
- phylink_change_inband_advert() while holding pl->state_mutex
- phylink_major_config()

phylink_major_config() is called from:
- phylink_resolve() while holding pl->state_mutex
- phylink_ethtool_ksettings_set() while holding pl->state_mutex
- phylink_mac_initial_config()

phylink_mac_initial_config() is called from:
- phylink_resume()
- phylink_start()

In both cases, these are called without holding pl->state_mutex, so
one may think that they could be unsafe. In both cases, however,
pl->phylink_disable_state is non-zero.

phylink_change_inband_advert() will check whether
PHYLINK_DISABLE_STOPPED is set, and not proceed to call pcs_config()
in that case. In the phylink_resume() case, userspace isn't running,
so there can't be a call to phylink_change_inband_advert().

So, one can assume (as must be the case of course, we're fiddling
with hardware) that pcs_config() will always be run without another
thread also running it for the same hardware. Two overlapping
pcs_config() calls would lead to an indeterminate hardware state.
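
The gating described above can be sketched as a single-threaded model.
It is not phylink code: struct model_phylink, MODEL_DISABLE_STOPPED and
the model_*() helpers are hypothetical stand-ins, and the model only
shows that the inband-advert path refuses to configure while the stopped
flag is set, which is the window in which the start path configures
without the mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PHYLINK_DISABLE_STOPPED flag. */
#define MODEL_DISABLE_STOPPED 0x1UL

struct model_phylink {
	unsigned long disable_state;  /* non-zero while the link is disabled */
	int pcs_config_calls;         /* counts calls into the PCS config path */
};

static void model_pcs_config(struct model_phylink *pl)
{
	pl->pcs_config_calls++;
}

/* Inband-advert path: bail out while the stopped flag is set. */
static bool model_change_inband_advert(struct model_phylink *pl)
{
	if (pl->disable_state & MODEL_DISABLE_STOPPED)
		return false;
	model_pcs_config(pl);
	return true;
}

/* Start path: configure while still marked stopped, then clear the flag. */
static void model_start(struct model_phylink *pl)
{
	assert(pl->disable_state != 0);
	model_pcs_config(pl);
	pl->disable_state &= ~MODEL_DISABLE_STOPPED;
}
```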
Daniel Golle Aug. 17, 2023, 3:11 p.m. UTC | #4
Hi Russell,

On Thu, Aug 17, 2023 at 02:03:40PM +0100, Russell King (Oracle) wrote:
> On Thu, Aug 17, 2023 at 01:04:06PM +0100, Daniel Golle wrote:
> > On MT7988, when switching from 10GBase-R/5GBase-R/USXGMII to one of the
> > interface modes provided by mtk-pcs-lynxi, we need to make sure to
> > always perform a full configuration of the PHYA.
> > The idea behind not doing that was mostly to prevent an existing link
> > from going down needlessly. Hence we can just always perform a full
> > configuration in case the link is down.
> 
> And this is racy - because in the case with inband signalling, the link
> can come up between reading the status and acting on it. It could even
> be already up, but the link status indicates it is not. Lastly, reading
> the BMSR has side effects: the link status bit latches low until a read.
> 
> Basically, do not read the BMSR here, it's buggy to read it any place
> other than pcs_get_state.
> 
> I think what we need to do instead is:
> 
> 1) mtk_mac_select_pcs() returns the SGMII PCS or NULL. Presumably this
>    is the driver which supports 10GBase-R/5GBase-R/USXGMII, and thus
>    this returns NULL for 10GBase-R/5GBase-R/USXGMII.
> 
>    Phylink doesn't cater for mac_select_pcs() returning non-NULL for
>    some modes and NULL for others, mainly because the presence of a PCS
>    _used_ to cause phylink to change its behaviour (see
>    https://lore.kernel.org/netdev/YZRLQqLblRurUd4V@shell.armlinux.org.uk/).
>    That has now changed (we've got rid of the legacy stuff at last!) so
>    there is no technical reason not to now allow that.
> 
>    Vladimir did have some arguments for not allowing it when we had the
>    phylink_set_pcs() interface:
>    https://lore.kernel.org/netdev/20211123181515.qqo7e4xbuu2ntwgt@skbuf/
>    I'm assuming that your requirement now provides sufficient
>    justification for allowing this.
> 
>    There is one bug that does need fixing first:
>    phylink_change_inband_advert() checks pl->pcs->neg_mode without
>    first checking whether pl->pcs is non-NULL.
> 
>    To allow this, phylink_major_config() needs:
> 
>    	pcs_changed = pcs && pl->pcs != pcs;
> 
>    to become:
> 
>    	pcs_changed = pl->pcs != pcs;
> 
> 2) with (1) solved, there are a couple of callbacks that can be used to
>    solve this - I think pcs_disable() is the one you want, which will
>    be called when we switch to a mode where _this_ PCS will no longer
>    be used (thus you can reset mpcs->interface to _NA, ready for when
>    it is next brought into use.)
> 
> Would that work for you?

Yes, and that actually even makes things much easier.
The case of mtk_mac_select_pcs() returning NULL is not even relevant:
In case of the interface being 10GBase-R, 5GBase-R or USXGMII
mtk_mac_select_pcs() will return a pointer to the USXGMII PCS instance[1].

Hence simply implementing .pcs_disable already resolves the issue.
I will post a patch doing that instead which replaces this patch.
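
Why resetting the cached interface is enough can be shown with a
stand-alone model. This is a sketch under assumptions, not the follow-up
patch: enum model_interface, struct model_lynxi and the model_lynxi_*()
helpers are hypothetical names that only mirror the
"mpcs->interface != interface" shortcut in the config path.

```c
#include <assert.h>

/* Hypothetical stand-ins for phy_interface_t values. */
enum model_interface { MODEL_IF_NA, MODEL_IF_SGMII, MODEL_IF_2500BASEX };

struct model_lynxi {
	enum model_interface interface;  /* last mode this PCS was set up for */
	int full_configs;                /* how often the full setup path ran */
};

/* Mirrors the shortcut: only a changed mode triggers the full setup. */
static void model_lynxi_config(struct model_lynxi *mpcs,
			       enum model_interface interface)
{
	assert(interface != MODEL_IF_NA);
	if (mpcs->interface != interface) {
		mpcs->full_configs++;
		mpcs->interface = interface;
	}
}

/* Sketch of a disable hook: forget the cached mode so that the next
 * config call takes the full path again.
 */
static void model_lynxi_disable(struct model_lynxi *mpcs)
{
	mpcs->interface = MODEL_IF_NA;
}
```

In this model, configuring SGMII twice runs the full path once; after a
disable, the next SGMII config runs it again even though the requested
mode is unchanged.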


Thank you for reviewing!


Daniel


[1]: https://github.com/dangowrt/linux/commit/c81d14e214c8bbbab81fd6d6d49e6f7b87015e1e#diff-6f8a141b53de471a9fe00ac68f8c82b9dda3bad057c160327d6bfe1b0b9c8b23R550
Patch

diff --git a/drivers/net/pcs/pcs-mtk-lynxi.c b/drivers/net/pcs/pcs-mtk-lynxi.c
index b0f3ede945d96..788c2ccde064e 100644
--- a/drivers/net/pcs/pcs-mtk-lynxi.c
+++ b/drivers/net/pcs/pcs-mtk-lynxi.c
@@ -108,8 +108,8 @@  static int mtk_pcs_lynxi_config(struct phylink_pcs *pcs, unsigned int neg_mode,
 				bool permit_pause_to_mac)
 {
 	struct mtk_pcs_lynxi *mpcs = pcs_to_mtk_pcs_lynxi(pcs);
-	bool mode_changed = false, changed;
-	unsigned int rgc3, sgm_mode, bmcr;
+	bool mode_changed = false, changed, link;
+	unsigned int bm, rgc3, sgm_mode, bmcr;
 	int advertise, link_timer;
 
 	advertise = phylink_mii_c22_pcs_encode_advertisement(interface,
@@ -117,6 +117,10 @@  static int mtk_pcs_lynxi_config(struct phylink_pcs *pcs, unsigned int neg_mode,
 	if (advertise < 0)
 		return advertise;
 
+	/* Check if link is currently up */
+	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
+	link = !!(FIELD_GET(SGMII_BMSR, bm) & BMSR_LSTATUS);
+
 	/* Clearing IF_MODE_BIT0 switches the PCS to BASE-X mode, and
 	 * we assume that fixes it's speed at bitrate = line rate (in
 	 * other words, 1000Mbps or 2500Mbps).
@@ -137,7 +141,10 @@  static int mtk_pcs_lynxi_config(struct phylink_pcs *pcs, unsigned int neg_mode,
 		bmcr = 0;
 	}
 
-	if (mpcs->interface != interface) {
+	/* Do a full reconfiguration only if the link is down or the interface
+	 * mode has changed
+	 */
+	if (mpcs->interface != interface || !link) {
 		link_timer = phylink_get_link_timer_ns(interface);
 		if (link_timer < 0)
 			return link_timer;