Message ID | 1c378be54d0fb76117f6d72dadd4a43a9950f0dc.1720105125.git.daniel@makrotopia.org (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v4] net: dsa: mt7530: fix impossible MDIO address and issue warning | expand |
On Thu, 4 Jul 2024 16:08:22 +0100 Daniel Golle wrote:
> +return (phy_addr - MT7530_NUM_PORTS & ~MT7530_NUM_PORTS) + MT7530_NUM_PORTS & PHY_MAX_ADDR - 1;
This is still unindented! :)
Also GCC doesn't trust you with removal of the parenthesis:
../drivers/net/dsa/mt7530-mdio.c:149:18: warning: suggest parentheses around ‘-’ in operand of ‘&’ [-Wparentheses]
149 | return (phy_addr - MT7530_NUM_PORTS & ~MT7530_NUM_PORTS) + MT7530_NUM_PORTS & PHY_MAX_ADDR - 1;
../drivers/net/dsa/mt7530-mdio.c:149:58: warning: suggest parentheses around ‘+’ in operand of ‘&’ [-Wparentheses]
149 | return (phy_addr - MT7530_NUM_PORTS & ~MT7530_NUM_PORTS) + MT7530_NUM_PORTS & PHY_MAX_ADDR - 1;
even your you're correct.
On Thu, Jul 04, 2024 at 04:08:22PM +0100, Daniel Golle wrote: > The MDIO address of the MT7530 and MT7531 switch ICs can be configured > using bootstrap pins. However, there are only 4 possible options for the > switch itself: 7, 15, 23 and 31. As in MediaTek's SDK the address of the > switch is wrongly stated in the device tree as 0 (while in reality it is > 31), warn the user about such broken device tree and make a good guess > what was actually intended. Zero is the MDIO broadcast address. Doesn't the switch respond to it, or what's exactly the problem? > > This is imporant also to not break compatibility with older Device Trees important > as with commit 868ff5f4944a ("net: dsa: mt7530-mdio: read PHY address of > switch from device tree") the address in device tree will be taken into > account, while before it was hard-coded to 0x1f. > > Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") I fail to understand the logic behind blaming this commit. There was no observable issue prior to 868ff5f4944a ("net: dsa: mt7530-mdio: read PHY address of switch from device tree"), was there? That's what 'git bisect' with a broken device tree would point towards? > Signed-off-by: Daniel Golle <daniel@makrotopia.org> > Reviewed-by: Andrew Lunn <andrew@lunn.ch> > Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> > Reviewed-by: Arınç ÜNAL <arinc.unal@arinc9.com> > --- > Changes since v3 [3]: > - simplify calculation of correct address > > Changes since v2 [2]: > - use macros instead of magic numbers > - introduce helper functions > - register new device on MDIO bus instead of messing with the address > and schedule delayed_work to unregister the "wrong" device. > This is a slightly different approach than suggested by Russell, but > imho makes things much easier than keeping the "wrong" device and > having to deal with keeping the removal of both devices linked. > - improve comments > > Changes since v1 [1]: > - use FW_WARN as suggested. > - fix build on net tree which doesn't have 'mdiodev' as member of the > priv struct. Imho including this patch as fix makes sense to warn > users about broken firmware, even if the change introducing the > actual breakage is only present in net-next for now. > > [1]: https://patchwork.kernel.org/project/netdevbpf/patch/e615351aefba25e990215845e4812e6cb8153b28.1714433716.git.daniel@makrotopia.org/ > [2]: https://patchwork.kernel.org/project/netdevbpf/patch/11f5f127d0350e72569c36f9060b6e642dfaddbb.1714514208.git.daniel@makrotopia.org/ > [3]: https://patchwork.kernel.org/project/netdevbpf/patch/7e3fed489c0bbca84a386b1077c61589030ff4ab.1719963228.git.daniel@makrotopia.org/ > > drivers/net/dsa/mt7530-mdio.c | 91 +++++++++++++++++++++++++++++++++++ > 1 file changed, 91 insertions(+) > > diff --git a/drivers/net/dsa/mt7530-mdio.c b/drivers/net/dsa/mt7530-mdio.c > index 51df42ccdbe6..2037ed944801 100644 > --- a/drivers/net/dsa/mt7530-mdio.c > +++ b/drivers/net/dsa/mt7530-mdio.c > @@ -11,6 +11,7 @@ > #include <linux/regmap.h> > #include <linux/reset.h> > #include <linux/regulator/consumer.h> > +#include <linux/workqueue.h> > #include <net/dsa.h> > > #include "mt7530.h" > @@ -136,6 +137,92 @@ static const struct of_device_id mt7530_of_match[] = { > }; > MODULE_DEVICE_TABLE(of, mt7530_of_match); > > +static int > +mt7530_correct_addr(int phy_addr) The prototype fits onto a single line. > +{ > + /* The corrected address is calculated as stated below: > + * 0~6, 31 -> 31 > + * 7~14 -> 7 > + * 15~22 -> 15 > + * 23~30 -> 23 > + */ > +return (phy_addr - MT7530_NUM_PORTS & ~MT7530_NUM_PORTS) + MT7530_NUM_PORTS & PHY_MAX_ADDR - 1; In addition to being weirdly indented and having difficult to follow logic.. Why not opt for the simple, self-documenting variant below? switch (phy_addr) { case 0 ... 6: case 31: return 31; case 7 ... 14: return 7; case 15 ... 22: return 15; case 23 ... 30: return 23; default: return -EINVAL ??? } > +} > + > +static bool > +mt7530_is_invalid_addr(int phy_addr) > +{ > + /* Only MDIO bus addresses 7, 15, 23, and 31 are valid options, > + * which all have the least significant three bits set. Check > + * for this. > + */ > + return (phy_addr & MT7530_NUM_PORTS) != MT7530_NUM_PORTS; Why not implement this in terms of phy_addr != mt7530_correct_addr(phy_addr)? > +} > + > +struct remove_impossible_priv { > + struct delayed_work remove_impossible_work; > + struct mdio_device *mdiodev; > +}; > + > +static void > +mt7530_remove_impossible(struct work_struct *work) Fits onto a single line. > +{ > + struct remove_impossible_priv *priv = container_of(work, struct remove_impossible_priv, > + remove_impossible_work.work); > + struct mdio_device *mdiodev = priv->mdiodev; > + > + mdio_device_remove(mdiodev); > + mdio_device_free(mdiodev); > + kfree(priv); > +} > + > +static int > +mt7530_reregister(struct mdio_device *mdiodev) > +{ > + /* If the address in DT must be wrong, make a good guess about > + * the most likely intention, issue a warning, register a new > + * MDIO device at the correct address and schedule the removal > + * of the device having an impossible address. > + */ > + struct fwnode_handle *fwnode = dev_fwnode(&mdiodev->dev); > + int corrected_addr = mt7530_correct_addr(mdiodev->addr); > + struct remove_impossible_priv *rem_priv; > + struct mdio_device *new_mdiodev; > + int ret; > + > + rem_priv = kmalloc(sizeof(*rem_priv), GFP_KERNEL); > + if (!rem_priv) > + return -ENOMEM; > + > + new_mdiodev = mdio_device_create(mdiodev->bus, corrected_addr); > + if (IS_ERR(new_mdiodev)) { > + ret = PTR_ERR(new_mdiodev); > + goto out_free_work; > + } > + device_set_node(&new_mdiodev->dev, fwnode); > + > + ret = mdio_device_register(new_mdiodev); > + if (WARN_ON(ret)) > + goto out_free_dev; > + > + dev_warn(&mdiodev->dev, FW_WARN > + "impossible switch MDIO address in device tree, assuming %d\n", > + corrected_addr); > + > + /* schedule impossible device for removal from mdio bus */ > + rem_priv->mdiodev = mdiodev; > + INIT_DELAYED_WORK(&rem_priv->remove_impossible_work, mt7530_remove_impossible); > + schedule_delayed_work(&rem_priv->remove_impossible_work, 0); What makes it so that mt7530_remove_impossible() is actually guaranteed to run after the probing of the mdio_device @ the incorrect address _finishes_? mdio_device_remove() will not work on a device which has probing in progress, will it? There's also the more straightforward option of fixing up priv->mdiodev->addr in mt7530.c to be something like priv->switch_base, which is derived from priv->mdiodev->addr, with a fallback to 0x1f if the latter is zero, and a FW_WARN().
Hi Vladimir, On Thu, Jul 04, 2024 at 08:16:04PM +0300, Vladimir Oltean wrote: > On Thu, Jul 04, 2024 at 04:08:22PM +0100, Daniel Golle wrote: > > The MDIO address of the MT7530 and MT7531 switch ICs can be configured > > using bootstrap pins. However, there are only 4 possible options for the > > switch itself: 7, 15, 23 and 31. As in MediaTek's SDK the address of the > > switch is wrongly stated in the device tree as 0 (while in reality it is > > 31), warn the user about such broken device tree and make a good guess > > what was actually intended. > > Zero is the MDIO broadcast address. Doesn't the switch respond to it, or > what's exactly the problem? No, MT7530 main device (ie. the switch itself, not the built-in PHYs which on MT7530 can also be exposed on the same bus) only responds to address 31 (default), 7, 15 or 23 (the latter 3 via non-default bootstrap configuration). MT7531 always uses address 31 by default and also doesn't respond on address 0. See also https://lkml.org/lkml/2024/5/31/236 > > > > > This is imporant also to not break compatibility with older Device Trees > > important Well spotted, will fix. > > > as with commit 868ff5f4944a ("net: dsa: mt7530-mdio: read PHY address of > > switch from device tree") the address in device tree will be taken into > > account, while before it was hard-coded to 0x1f. > > > > Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") > > I fail to understand the logic behind blaming this commit. There was no > observable issue prior to 868ff5f4944a ("net: dsa: mt7530-mdio: read PHY > address of switch from device tree"), was there? Please see the lengthy debate here: https://lore.kernel.org/linux-arm-kernel/af561268-9793-4b5d-aa0f-d09698fd6fb0@arinc9.com/T/#mc967f795a062f6aaedea7375a3be104266e88cc4 I should have provided a reference to that in the commit message or cover letter. > That's what 'git bisect' with a broken device tree would point towards? Yes, exactly. > > > Signed-off-by: Daniel Golle <daniel@makrotopia.org> > > Reviewed-by: Andrew Lunn <andrew@lunn.ch> > > Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> > > Reviewed-by: Arınç ÜNAL <arinc.unal@arinc9.com> > > --- > > Changes since v3 [3]: > > - simplify calculation of correct address > > > > Changes since v2 [2]: > > - use macros instead of magic numbers > > - introduce helper functions > > - register new device on MDIO bus instead of messing with the address > > and schedule delayed_work to unregister the "wrong" device. > > This is a slightly different approach than suggested by Russell, but > > imho makes things much easier than keeping the "wrong" device and > > having to deal with keeping the removal of both devices linked. > > - improve comments > > > > Changes since v1 [1]: > > - use FW_WARN as suggested. > > - fix build on net tree which doesn't have 'mdiodev' as member of the > > priv struct. Imho including this patch as fix makes sense to warn > > users about broken firmware, even if the change introducing the > > actual breakage is only present in net-next for now. > > > > [1]: https://patchwork.kernel.org/project/netdevbpf/patch/e615351aefba25e990215845e4812e6cb8153b28.1714433716.git.daniel@makrotopia.org/ > > [2]: https://patchwork.kernel.org/project/netdevbpf/patch/11f5f127d0350e72569c36f9060b6e642dfaddbb.1714514208.git.daniel@makrotopia.org/ > > [3]: https://patchwork.kernel.org/project/netdevbpf/patch/7e3fed489c0bbca84a386b1077c61589030ff4ab.1719963228.git.daniel@makrotopia.org/ > > > > drivers/net/dsa/mt7530-mdio.c | 91 +++++++++++++++++++++++++++++++++++ > > 1 file changed, 91 insertions(+) > > > > diff --git a/drivers/net/dsa/mt7530-mdio.c b/drivers/net/dsa/mt7530-mdio.c > > index 51df42ccdbe6..2037ed944801 100644 > > --- a/drivers/net/dsa/mt7530-mdio.c > > +++ b/drivers/net/dsa/mt7530-mdio.c > > @@ -11,6 +11,7 @@ > > #include <linux/regmap.h> > > #include <linux/reset.h> > > #include <linux/regulator/consumer.h> > > +#include <linux/workqueue.h> > > #include <net/dsa.h> > > > > #include "mt7530.h" > > @@ -136,6 +137,92 @@ static const struct of_device_id mt7530_of_match[] = { > > }; > > MODULE_DEVICE_TABLE(of, mt7530_of_match); > > > > +static int > > +mt7530_correct_addr(int phy_addr) > > The prototype fits onto a single line. > > > +{ > > + /* The corrected address is calculated as stated below: > > + * 0~6, 31 -> 31 > > + * 7~14 -> 7 > > + * 15~22 -> 15 > > + * 23~30 -> 23 > > + */ > > +return (phy_addr - MT7530_NUM_PORTS & ~MT7530_NUM_PORTS) + MT7530_NUM_PORTS & PHY_MAX_ADDR - 1; > > In addition to being weirdly indented and having difficult to follow > logic.. Why not opt for the simple, self-documenting variant below? > > switch (phy_addr) { > case 0 ... 6: > case 31: > return 31; > case 7 ... 14: > return 7; > case 15 ... 22: > return 15; > case 23 ... 30: > return 23; > default: > return -EINVAL ??? > } > > > +} > > + > > +static bool > > +mt7530_is_invalid_addr(int phy_addr) > > +{ > > + /* Only MDIO bus addresses 7, 15, 23, and 31 are valid options, > > + * which all have the least significant three bits set. Check > > + * for this. > > + */ > > + return (phy_addr & MT7530_NUM_PORTS) != MT7530_NUM_PORTS; > > Why not implement this in terms of phy_addr != mt7530_correct_addr(phy_addr)? > > > +} > > + > > +struct remove_impossible_priv { > > + struct delayed_work remove_impossible_work; > > + struct mdio_device *mdiodev; > > +}; > > + > > +static void > > +mt7530_remove_impossible(struct work_struct *work) > > Fits onto a single line. > > > +{ > > + struct remove_impossible_priv *priv = container_of(work, struct remove_impossible_priv, > > + remove_impossible_work.work); > > + struct mdio_device *mdiodev = priv->mdiodev; > > + > > + mdio_device_remove(mdiodev); > > + mdio_device_free(mdiodev); > > + kfree(priv); > > +} > > + > > +static int > > +mt7530_reregister(struct mdio_device *mdiodev) > > +{ > > + /* If the address in DT must be wrong, make a good guess about > > + * the most likely intention, issue a warning, register a new > > + * MDIO device at the correct address and schedule the removal > > + * of the device having an impossible address. > > + */ > > + struct fwnode_handle *fwnode = dev_fwnode(&mdiodev->dev); > > + int corrected_addr = mt7530_correct_addr(mdiodev->addr); > > + struct remove_impossible_priv *rem_priv; > > + struct mdio_device *new_mdiodev; > > + int ret; > > + > > + rem_priv = kmalloc(sizeof(*rem_priv), GFP_KERNEL); > > + if (!rem_priv) > > + return -ENOMEM; > > + > > + new_mdiodev = mdio_device_create(mdiodev->bus, corrected_addr); > > + if (IS_ERR(new_mdiodev)) { > > + ret = PTR_ERR(new_mdiodev); > > + goto out_free_work; > > + } > > + device_set_node(&new_mdiodev->dev, fwnode); > > + > > + ret = mdio_device_register(new_mdiodev); > > + if (WARN_ON(ret)) > > + goto out_free_dev; > > + > > + dev_warn(&mdiodev->dev, FW_WARN > > + "impossible switch MDIO address in device tree, assuming %d\n", > > + corrected_addr); > > + > > + /* schedule impossible device for removal from mdio bus */ > > + rem_priv->mdiodev = mdiodev; > > + INIT_DELAYED_WORK(&rem_priv->remove_impossible_work, mt7530_remove_impossible); > > + schedule_delayed_work(&rem_priv->remove_impossible_work, 0); > > What makes it so that mt7530_remove_impossible() is actually guaranteed > to run after the probing of the mdio_device @ the incorrect address > _finishes_? mdio_device_remove() will not work on a device which has > probing in progress, will it? > > There's also the more straightforward option of fixing up priv->mdiodev->addr > in mt7530.c to be something like priv->switch_base, which is derived > from priv->mdiodev->addr, with a fallback to 0x1f if the latter is zero, > and a FW_WARN().
On 04/07/2024 20:48, Daniel Golle wrote: > Hi Vladimir, > > On Thu, Jul 04, 2024 at 08:16:04PM +0300, Vladimir Oltean wrote: >> On Thu, Jul 04, 2024 at 04:08:22PM +0100, Daniel Golle wrote: >>> The MDIO address of the MT7530 and MT7531 switch ICs can be configured >>> using bootstrap pins. However, there are only 4 possible options for the >>> switch itself: 7, 15, 23 and 31. As in MediaTek's SDK the address of the >>> switch is wrongly stated in the device tree as 0 (while in reality it is >>> 31), warn the user about such broken device tree and make a good guess >>> what was actually intended. >> >> Zero is the MDIO broadcast address. Doesn't the switch respond to it, or >> what's exactly the problem? > > No, MT7530 main device (ie. the switch itself, not the built-in PHYs > which on MT7530 can also be exposed on the same bus) only responds to > address 31 (default), 7, 15 or 23 (the latter 3 via non-default > bootstrap configuration). > > MT7531 always uses address 31 by default and also doesn't respond on > address 0. > > See also https://lkml.org/lkml/2024/5/31/236 To address my incorrect "0x0 is just another PHY address" remark there; in 22.2.4.5.5 of IEEE Std 802.3-2022, it is described that a PHY that is connected to the station management entity via the mechanical interface defined in 22.6 shall always respond to transactions addressed to PHY Address zero <00000>. A station management entity that is attached to multiple PHYs has to have prior knowledge of the appropriate PHY Address for each PHY. The MT7530 switch has the function to make its PHYs appear on the MDIO bus which the switch also listens on. This feature is controlled by the relevant bootstrap pin or by modifying the relevant bit on the modifiable trap register. The MT7530 DSA subdriver currently configures the modifiable trap register to enable this function. So one of the switch PHYs listens at the PHY address 0x0. I don't know whether the switch would respond to transactions addressed to the PHY address 0x0, if this function was disabled. Finding this out doesn't seem too relevant to this topic. >>> as with commit 868ff5f4944a ("net: dsa: mt7530-mdio: read PHY address of >>> switch from device tree") the address in device tree will be taken into >>> account, while before it was hard-coded to 0x1f. >>> >>> Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") >> >> I fail to understand the logic behind blaming this commit. There was no >> observable issue prior to 868ff5f4944a ("net: dsa: mt7530-mdio: read PHY >> address of switch from device tree"), was there? > > Please see the lengthy debate here: > > https://lore.kernel.org/linux-arm-kernel/af561268-9793-4b5d-aa0f-d09698fd6fb0@arinc9.com/T/#mc967f795a062f6aaedea7375a3be104266e88cc4 This thread may not directly answer the question. The understanding Daniel and I have come to is that the fact that the issue appeared after commit 868ff5f4944a ("net: dsa: mt7530-mdio: read PHY address of switch from device tree") doesn't make it the commit to blame. But rather, the commit that introduced a hardcoded 0x1f PHY address which, in result, allowed broken device trees to work is to blame. > > I should have provided a reference to that in the commit message or > cover letter. I agree, it would be a good idea to put it in the commit message. Arınç
diff --git a/drivers/net/dsa/mt7530-mdio.c b/drivers/net/dsa/mt7530-mdio.c index 51df42ccdbe6..2037ed944801 100644 --- a/drivers/net/dsa/mt7530-mdio.c +++ b/drivers/net/dsa/mt7530-mdio.c @@ -11,6 +11,7 @@ #include <linux/regmap.h> #include <linux/reset.h> #include <linux/regulator/consumer.h> +#include <linux/workqueue.h> #include <net/dsa.h> #include "mt7530.h" @@ -136,6 +137,92 @@ static const struct of_device_id mt7530_of_match[] = { }; MODULE_DEVICE_TABLE(of, mt7530_of_match); +static int +mt7530_correct_addr(int phy_addr) +{ + /* The corrected address is calculated as stated below: + * 0~6, 31 -> 31 + * 7~14 -> 7 + * 15~22 -> 15 + * 23~30 -> 23 + */ +return (phy_addr - MT7530_NUM_PORTS & ~MT7530_NUM_PORTS) + MT7530_NUM_PORTS & PHY_MAX_ADDR - 1; +} + +static bool +mt7530_is_invalid_addr(int phy_addr) +{ + /* Only MDIO bus addresses 7, 15, 23, and 31 are valid options, + * which all have the least significant three bits set. Check + * for this. + */ + return (phy_addr & MT7530_NUM_PORTS) != MT7530_NUM_PORTS; +} + +struct remove_impossible_priv { + struct delayed_work remove_impossible_work; + struct mdio_device *mdiodev; +}; + +static void +mt7530_remove_impossible(struct work_struct *work) +{ + struct remove_impossible_priv *priv = container_of(work, struct remove_impossible_priv, + remove_impossible_work.work); + struct mdio_device *mdiodev = priv->mdiodev; + + mdio_device_remove(mdiodev); + mdio_device_free(mdiodev); + kfree(priv); +} + +static int +mt7530_reregister(struct mdio_device *mdiodev) +{ + /* If the address in DT must be wrong, make a good guess about + * the most likely intention, issue a warning, register a new + * MDIO device at the correct address and schedule the removal + * of the device having an impossible address. + */ + struct fwnode_handle *fwnode = dev_fwnode(&mdiodev->dev); + int corrected_addr = mt7530_correct_addr(mdiodev->addr); + struct remove_impossible_priv *rem_priv; + struct mdio_device *new_mdiodev; + int ret; + + rem_priv = kmalloc(sizeof(*rem_priv), GFP_KERNEL); + if (!rem_priv) + return -ENOMEM; + + new_mdiodev = mdio_device_create(mdiodev->bus, corrected_addr); + if (IS_ERR(new_mdiodev)) { + ret = PTR_ERR(new_mdiodev); + goto out_free_work; + } + device_set_node(&new_mdiodev->dev, fwnode); + + ret = mdio_device_register(new_mdiodev); + if (WARN_ON(ret)) + goto out_free_dev; + + dev_warn(&mdiodev->dev, FW_WARN + "impossible switch MDIO address in device tree, assuming %d\n", + corrected_addr); + + /* schedule impossible device for removal from mdio bus */ + rem_priv->mdiodev = mdiodev; + INIT_DELAYED_WORK(&rem_priv->remove_impossible_work, mt7530_remove_impossible); + schedule_delayed_work(&rem_priv->remove_impossible_work, 0); + + return -EFAULT; + +out_free_dev: + mdio_device_free(new_mdiodev); +out_free_work: + kfree(rem_priv); + return ret; +} + static int mt7530_probe(struct mdio_device *mdiodev) { @@ -144,6 +231,10 @@ mt7530_probe(struct mdio_device *mdiodev) struct device_node *dn; int ret; + /* Check and if needed correct the MDIO address of the switch */ + if (mt7530_is_invalid_addr(mdiodev->addr)) + return mt7530_reregister(mdiodev); + dn = mdiodev->dev.of_node; priv = devm_kzalloc(&mdiodev->dev, sizeof(*priv), GFP_KERNEL);