Message ID | 20211101182859.24073-1-grygorii.strashko@ti.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC] net: phy/mdio: enable mmd indirect access through phy_mii_ioctl() | expand |
On Mon, Nov 01, 2021 at 08:28:59PM +0200, Grygorii Strashko wrote: > This patch enables access to C22 PHY MMD address space through I'm not sure the terminology is correct here. I think it actually enables access to C45 address space, making use of C45 over C22. > phy_mii_ioctl() SIOCGMIIREG/SIOCSMIIREG IOCTLs. It checks if R/W request is > received with C45 flag enabled while MDIO bus doesn't support C45 and, in > this case, tries to treat prtad as PHY MMD selector and use MMD API. > > With this change it's possible to r/w PHY MMD registers with phytool, for > example, before: > > phytool read eth0/0x1f:0/0x32 > 0xffea > > after: > phytool read eth0/0x1f:0/0x32 > 0x00d1 > @@ -300,8 +300,19 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd) > prtad = mii_data->phy_id; > devad = mii_data->reg_num; > } > - mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, > - devad); > + if (mdio_phy_id_is_c45(mii_data->phy_id) && > + phydev->mdio.bus->probe_capabilities <= MDIOBUS_C22) { > + phy_lock_mdio_bus(phydev); > + > + mii_data->val_out = __mmd_phy_read(phydev->mdio.bus, > + mdio_phy_id_devad(mii_data->phy_id), > + prtad, > + mii_data->reg_num); > + > + phy_unlock_mdio_bus(phydev); > + } else { > + mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, devad); > + } The layering look wrong here. You are trying to perform MDIO bus operation here, so it seems odd to perform __mmd_phy_read(). I wonder if it would be cleaner to move C45 over C22 down into the mdiobus_ level API? Andrew
On Mon, Nov 01, 2021 at 08:33:50PM +0100, Andrew Lunn wrote: > On Mon, Nov 01, 2021 at 08:28:59PM +0200, Grygorii Strashko wrote: > > This patch enables access to C22 PHY MMD address space through > > I'm not sure the terminology is correct here. I think it actually > enables access to C45 address space, making use of C45 over C22. I agree. > > phy_mii_ioctl() SIOCGMIIREG/SIOCSMIIREG IOCTLs. It checks if R/W request is > > received with C45 flag enabled while MDIO bus doesn't support C45 and, in > > this case, tries to treat prtad as PHY MMD selector and use MMD API. > > > > With this change it's possible to r/w PHY MMD registers with phytool, for > > example, before: > > > > phytool read eth0/0x1f:0/0x32 > > 0xffea > > > > after: > > phytool read eth0/0x1f:0/0x32 > > 0x00d1 > > @@ -300,8 +300,19 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd) > > prtad = mii_data->phy_id; > > devad = mii_data->reg_num; > > } > > - mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, > > - devad); > > + if (mdio_phy_id_is_c45(mii_data->phy_id) && > > + phydev->mdio.bus->probe_capabilities <= MDIOBUS_C22) { > > + phy_lock_mdio_bus(phydev); > > + > > + mii_data->val_out = __mmd_phy_read(phydev->mdio.bus, > > + mdio_phy_id_devad(mii_data->phy_id), > > + prtad, > > + mii_data->reg_num); > > + > > + phy_unlock_mdio_bus(phydev); > > + } else { > > + mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, devad); > > + } > > The layering look wrong here. You are trying to perform MDIO bus > operation here, so it seems odd to perform __mmd_phy_read(). I wonder > if it would be cleaner to move C45 over C22 down into the mdiobus_ > level API? The use of the indirect registers is specific to PHYs, and we already know that various PHYs don't support indirect access, and some emulate access to the EEE registers - both of which are handled at the PHY driver level. Pushing indirect MMD access down into the MDIO bus layer means we need to have some way to deal with that. Alternatively, if we're just thinking about moving, e.g.: if (phydev->is_c45) { val = __mdiobus_c45_read(phydev->mdio.bus, phydev->mdio.addr, devad, regnum); } else { struct mii_bus *bus = phydev->mdio.bus; int phy_addr = phydev->mdio.addr; mmd_phy_indirect(bus, phy_addr, devad, regnum); /* Read the content of the MMD's selected register */ val = __mdiobus_read(bus, phy_addr, MII_MMD_DATA); } We would need to have some way to deal with that "is_c45" flag at mdio device level (maybe moving it to the mdio_device structure). Still doesn't help the "phy driver emulates the accesses" problem though.
> The use of the indirect registers is specific to PHYs, and we already > know that various PHYs don't support indirect access, and some emulate > access to the EEE registers - both of which are handled at the PHY > driver level. That is actually an interesting point. Should the ioctl call actually use the PHY driver read_mmd and write_mmd? Or should it go direct to the bus? realtek uses MII_MMD_DATA for something to do with suspend, and hence it uses genphy_write_mmd_unsupported(), or it has its own function emulating MMD operations. So maybe the ioctl handler actually needs to use __phy_read_mmd() if there is a phy at the address, rather than go direct to the bus? Or maybe we should just say no, you should do this all from userspace, by implementing C45 over C22 in userspace, the ioctl allows that, the kernel does not need to be involved. Andrew
On Tue, Nov 02, 2021 at 01:49:42AM +0100, Andrew Lunn wrote: > > The use of the indirect registers is specific to PHYs, and we already > > know that various PHYs don't support indirect access, and some emulate > > access to the EEE registers - both of which are handled at the PHY > > driver level. > > That is actually an interesting point. Should the ioctl call actually > use the PHY driver read_mmd and write_mmd? Or should it go direct to > the bus? realtek uses MII_MMD_DATA for something to do with suspend, > and hence it uses genphy_write_mmd_unsupported(), or it has its own > function emulating MMD operations. > > So maybe the ioctl handler actually needs to use __phy_read_mmd() if > there is a phy at the address, rather than go direct to the bus? > > Or maybe we should just say no, you should do this all from userspace, > by implementing C45 over C22 in userspace, the ioctl allows that, the > kernel does not need to be involved. Yes and no. There's a problem accessing anything that involves some kind of indirect or paged access with the current API - you can only do one access under the bus lock at a time, which makes the whole thing unreliable. We've accepted that unreliability on the grounds that this interface is for debugging only, so if it does go wrong, you get to keep all the pieces! The paged access case is really no different from the indirect C45 case. They're both exactly the same type of indirect access, just using different registers. That said, the MII ioctls are designed to be a bus level thing - you can address anything on the MII bus with them. Pushing the ioctl up to the PHY layer means we need to find the right phy device to operate on. What if we attempt a C45 access at an address that there isn't a phy device? For example, what would be the effect of trying a C45 indirect access to a DSA switch? Personally, my feeling would be that if we want to solve this, we need to solve this properly - we need to revise the interface so it's possible to request the kernel to perform a group of MII operations, so that userspace can safely access any paged/indirect register. With that solved, there will be no issue with requiring userspace to know what it's doing with indirect C45 accesses.
On Tue, Nov 02, 2021 at 12:39:52PM +0000, Russell King (Oracle) wrote: > On Tue, Nov 02, 2021 at 01:49:42AM +0100, Andrew Lunn wrote: > > > The use of the indirect registers is specific to PHYs, and we already > > > know that various PHYs don't support indirect access, and some emulate > > > access to the EEE registers - both of which are handled at the PHY > > > driver level. > > > > That is actually an interesting point. Should the ioctl call actually > > use the PHY driver read_mmd and write_mmd? Or should it go direct to > > the bus? realtek uses MII_MMD_DATA for something to do with suspend, > > and hence it uses genphy_write_mmd_unsupported(), or it has its own > > function emulating MMD operations. > > > > So maybe the ioctl handler actually needs to use __phy_read_mmd() if > > there is a phy at the address, rather than go direct to the bus? > > > > Or maybe we should just say no, you should do this all from userspace, > > by implementing C45 over C22 in userspace, the ioctl allows that, the > > kernel does not need to be involved. > > Yes and no. There's a problem accessing anything that involves some kind > of indirect or paged access with the current API - you can only do one > access under the bus lock at a time, which makes the whole thing > unreliable. We've accepted that unreliability on the grounds that this > interface is for debugging only, so if it does go wrong, you get to keep > all the pieces! Agreed. > That said, the MII ioctls are designed to be a bus level thing - you can > address anything on the MII bus with them. Pushing the ioctl up to the > PHY layer means we need to find the right phy device to operate on. What > if we attempt a C45 access at an address that there isn't a phy device? Yes, i think we need to keep with, this API is for MDIO bus access. If you want to do C45 over C22, you need to do it in user space, since that builds on top of basic MDIO bus accesses. > Personally, my feeling would be that if we want to solve this, we need > to solve this properly - we need to revise the interface so it's > possible to request the kernel to perform a group of MII operations, so > that userspace can safely access any paged/indirect register. With that > solved, there will be no issue with requiring userspace to know what > it's doing with indirect C45 accesses. I'm against that. It opens up an API to allow user space drivers, which i have always pushed back against. The current API is good enough you can use it for debug, but at the same time it is sufficiently broken that anybody trying to do user space drivers over it is asking for trouble. That seems like a good balance to me. Andrew
On 02/11/2021 14:39, Russell King (Oracle) wrote: > On Tue, Nov 02, 2021 at 01:49:42AM +0100, Andrew Lunn wrote: >>> The use of the indirect registers is specific to PHYs, and we already >>> know that various PHYs don't support indirect access, and some emulate >>> access to the EEE registers - both of which are handled at the PHY >>> driver level. >> >> That is actually an interesting point. Should the ioctl call actually >> use the PHY driver read_mmd and write_mmd? Or should it go direct to >> the bus? realtek uses MII_MMD_DATA for something to do with suspend, >> and hence it uses genphy_write_mmd_unsupported(), or it has its own >> function emulating MMD operations. >> >> So maybe the ioctl handler actually needs to use __phy_read_mmd() if >> there is a phy at the address, rather than go direct to the bus? >> >> Or maybe we should just say no, you should do this all from userspace, >> by implementing C45 over C22 in userspace, the ioctl allows that, the >> kernel does not need to be involved. > > Yes and no. There's a problem accessing anything that involves some kind > of indirect or paged access with the current API - you can only do one > access under the bus lock at a time, which makes the whole thing > unreliable. We've accepted that unreliability on the grounds that this > interface is for debugging only, so if it does go wrong, you get to keep > all the pieces! Right, MMD indirect access is 4 MDIO bus transactions. > > The paged access case is really no different from the indirect C45 case. > They're both exactly the same type of indirect access, just using > different registers. > > That said, the MII ioctls are designed to be a bus level thing - you can > address anything on the MII bus with them. Pushing the ioctl up to the > PHY layer means we need to find the right phy device to operate on. The phy_read_mmd/__phy_read_mmd() was the first thing i considered, but rejected exactly because of the possibility to access any MDIO device through this ioctls. in general, it can be called with check (mii->phy_id = pl->phydev->mdio.addr) > What > if we attempt a C45 access at an address that there isn't a phy device? > > For example, what would be the effect of trying a C45 indirect access to > a DSA switch? in case, C22/C22 MMD It will fail to read, seems no issues, and phytool will just return 0xfffb. First, there seems was previous attempt to do the same [1]. Also, there is some historical ... mess in this area :( There are: - generic_mii_ioctl() - 33 users (2005, it's older), uses struct mii_if_info - mdio_mii_ioctl() - 7 users (2009), uses struct mdio_if_info - phy_mii_ioctl() - 29 users, including phylink (2005), need PHY to get MDIO bus - phy_do_ioctl()->phy_mii_ioctl() - 10 users (2020) - phy_do_ioctl_running()->phy_mii_ioctl() - 22 users (2020) - phylink_mii_ioctl() (also calls phy_mii_ioctl(), but only for SIOCSHWTSTAMP) - 9 users, including DSA (2017) need PHY to get MDIO bus, also uses PHY for c45 detection, but any phy_id can be passed. - SIOCSMIIREG custom implementation - 32 users > > Personally, my feeling would be that if we want to solve this, we need > to solve this properly - we need to revise the interface so it's > possible to request the kernel to perform a group of MII operations, so > that userspace can safely access any paged/indirect register. With that > solved, there will be no issue with requiring userspace to know what > it's doing with indirect C45 accesses. > It would require MDIO bus lock, which is not a solution, or some sort of batch processing, like for mmd: w reg1 val1 w reg2 val2 w reg1 val3 r reg2 What Kernel interface do you have in mind? Sry, but I have to note that demand for this become terribly high, min two pings in months [1] https://www.spinics.net/lists/netdev/msg653629.html
On Tue, Nov 02, 2021 at 07:19:46PM +0200, Grygorii Strashko wrote: > It would require MDIO bus lock, which is not a solution, > or some sort of batch processing, like for mmd: > w reg1 val1 > w reg2 val2 > w reg1 val3 > r reg2 > > What Kernel interface do you have in mind? That is roughly what I was thinking, but Andrew has basically said no to it. > Sry, but I have to note that demand for this become terribly high, min two pings in months Feel free to continue demanding it, but it seems that at least two of the phylib maintainers are in agreement that providing generic emulation of C45 accesses in kernel space is just not going to happen.
Hi Russell, Andrew, Thanks a lot for you comments. On 02/11/2021 19:41, Russell King (Oracle) wrote: > On Tue, Nov 02, 2021 at 07:19:46PM +0200, Grygorii Strashko wrote: >> It would require MDIO bus lock, which is not a solution, >> or some sort of batch processing, like for mmd: >> w reg1 val1 >> w reg2 val2 >> w reg1 val3 >> r reg2 >> >> What Kernel interface do you have in mind? > > That is roughly what I was thinking, but Andrew has basically said no > to it. > >> Sry, but I have to note that demand for this become terribly high, min two pings in months > > Feel free to continue demanding it, but it seems that at least two of > the phylib maintainers are in agreement that providing generic > emulation of C45 accesses in kernel space is just not going to happen. > not ready to give up. One more idea how about mdiobus_get_phy(), so we can search for PHY and if present try to use proper API phy_read/phy_write_mmd?
On 02/11/2021 20:37, Grygorii Strashko wrote: > Hi Russell, Andrew, > > Thanks a lot for you comments. > > On 02/11/2021 19:41, Russell King (Oracle) wrote: >> On Tue, Nov 02, 2021 at 07:19:46PM +0200, Grygorii Strashko wrote: >>> It would require MDIO bus lock, which is not a solution, >>> or some sort of batch processing, like for mmd: >>> w reg1 val1 >>> w reg2 val2 >>> w reg1 val3 >>> r reg2 >>> >>> What Kernel interface do you have in mind? >> >> That is roughly what I was thinking, but Andrew has basically said no >> to it. >> >>> Sry, but I have to note that demand for this become terribly high, min two pings in months >> >> Feel free to continue demanding it, but it seems that at least two of >> the phylib maintainers are in agreement that providing generic >> emulation of C45 accesses in kernel space is just not going to happen. >> > > not ready to give up. > > One more idea how about mdiobus_get_phy(), so we can search for PHY and > if present try to use proper API phy_read/phy_write_mmd? > > smth like below diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index a3bfb156c83d..8ebe59b5647d 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -285,6 +285,7 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd) u16 val = mii_data->val_in; bool change_autoneg = false; int prtad, devad; + struct phy_device *phydev_rq = phydev; switch (cmd) { case SIOCGMIIPHY: @@ -300,8 +301,18 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd) prtad = mii_data->phy_id; devad = mii_data->reg_num; } - mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, - devad); + + if (prtad != phydev->mdio.addr) + phydev_rq = mdiobus_get_phy(phydev->mdio.bus, prtad); + + if (!phydev_rq) { + mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, devad); + } else if (mdio_phy_id_is_c45(mii_data->phy_id) && !phydev->is_c45) { + mii_data->val_out = phy_read_mmd(phydev_rq, mdio_phy_id_devad(mii_data->phy_id), mii_data->reg_num); + } else { + mii_data->val_out = phy_read(phydev_rq, mii_data->reg_num); + } + return 0; case SIOCSMIIREG:
On 11/2/21 1:13 PM, Andrew Lunn wrote: > On Tue, Nov 02, 2021 at 12:39:52PM +0000, Russell King (Oracle) wrote: >> On Tue, Nov 02, 2021 at 01:49:42AM +0100, Andrew Lunn wrote: >> > > The use of the indirect registers is specific to PHYs, and we already >> > > know that various PHYs don't support indirect access, and some emulate >> > > access to the EEE registers - both of which are handled at the PHY >> > > driver level. >> > >> > That is actually an interesting point. Should the ioctl call actually >> > use the PHY driver read_mmd and write_mmd? Or should it go direct to >> > the bus? realtek uses MII_MMD_DATA for something to do with suspend, >> > and hence it uses genphy_write_mmd_unsupported(), or it has its own >> > function emulating MMD operations. >> > >> > So maybe the ioctl handler actually needs to use __phy_read_mmd() if >> > there is a phy at the address, rather than go direct to the bus? >> > >> > Or maybe we should just say no, you should do this all from userspace, >> > by implementing C45 over C22 in userspace, the ioctl allows that, the >> > kernel does not need to be involved. >> >> Yes and no. There's a problem accessing anything that involves some kind >> of indirect or paged access with the current API - you can only do one >> access under the bus lock at a time, which makes the whole thing >> unreliable. We've accepted that unreliability on the grounds that this >> interface is for debugging only, so if it does go wrong, you get to keep >> all the pieces! > > Agreed. > >> That said, the MII ioctls are designed to be a bus level thing - you can >> address anything on the MII bus with them. Pushing the ioctl up to the >> PHY layer means we need to find the right phy device to operate on. What >> if we attempt a C45 access at an address that there isn't a phy device? > > Yes, i think we need to keep with, this API is for MDIO bus access. If > you want to do C45 over C22, you need to do it in user space, since > that builds on top of basic MDIO bus accesses. > >> Personally, my feeling would be that if we want to solve this, we need >> to solve this properly - we need to revise the interface so it's >> possible to request the kernel to perform a group of MII operations, so >> that userspace can safely access any paged/indirect register. With that >> solved, there will be no issue with requiring userspace to know what >> it's doing with indirect C45 accesses. > > I'm against that. It opens up an API to allow user space drivers, > which i have always pushed back against. The current API is good > enough you can use it for debug, but at the same time it is > sufficiently broken that anybody trying to do user space drivers over > it is asking for trouble. That seems like a good balance to me. I have not found this to be the case. As soon as you need to access something using phylink, the emulated registers make the ioctls useless (especially because there may be multiple phy-like devices for one interface). Currently, I use [1] to debug phys without having to worry about what phylink is trying to simulate. I would much prefer something like what regmap does: reads are allowed through debugfs, but writes require editing the kernel source. This allows useful debugging, while making zero-edit userspace drivers impossible. And fundamentally, you can always load a module which lets you do your driver development from userspace anyway (as [1] demonstrates, though I personally compile it in). I don't really understand why we have to have worse debug tools to prevent something which is trivial to implement anyway. --Sean [1] https://github.com/wkz/mdio-tools
> @@ -300,8 +301,18 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd) > prtad = mii_data->phy_id; > devad = mii_data->reg_num; > } > - mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, > - devad); > + > + if (prtad != phydev->mdio.addr) > + phydev_rq = mdiobus_get_phy(phydev->mdio.bus, prtad); > + > + if (!phydev_rq) { > + mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, devad); > + } else if (mdio_phy_id_is_c45(mii_data->phy_id) && !phydev->is_c45) { > + mii_data->val_out = phy_read_mmd(phydev_rq, mdio_phy_id_devad(mii_data->phy_id), mii_data->reg_num); > + } else { > + mii_data->val_out = phy_read(phydev_rq, mii_data->reg_num); > + } > + One thing i don't like about this is you have little idea what it has actually done. If you pass a C45 address, i expect a C45 access. If i pass a C22 i expect a C22 access. What i find interesting is that you and the other resent requester are using the same user space tool. If you implement C45 over C22 in that tool, you get your solution, and it will work for older kernels as well. Also, given the diverse implementations of this IOTCL, it probably works for more drivers than just those using phy_mii_ioctl(). Andrew
On 02/11/2021 23:46, Andrew Lunn wrote: >> @@ -300,8 +301,18 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd) >> prtad = mii_data->phy_id; >> devad = mii_data->reg_num; >> } >> - mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, >> - devad); >> + >> + if (prtad != phydev->mdio.addr) >> + phydev_rq = mdiobus_get_phy(phydev->mdio.bus, prtad); >> + >> + if (!phydev_rq) { >> + mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, devad); >> + } else if (mdio_phy_id_is_c45(mii_data->phy_id) && !phydev->is_c45) { >> + mii_data->val_out = phy_read_mmd(phydev_rq, mdio_phy_id_devad(mii_data->phy_id), mii_data->reg_num); >> + } else { >> + mii_data->val_out = phy_read(phydev_rq, mii_data->reg_num); >> + } >> + > > One thing i don't like about this is you have little idea what it has > actually done. > > If you pass a C45 address, i expect a C45 access. If i pass a C22 i > expect a C22 access. I might be doing smth wrong and that's why it's RFC. I wanted to understand if i hook into the kernel side first correctly, so if above doesn't violate PHYs/mdiodev access any more there seems reason try to continue. > > What i find interesting is that you and the other resent requester are > using the same user space tool. If you implement C45 over C22 in that > tool, you get your solution, and it will work for older kernels as > well. Also, given the diverse implementations of this IOTCL, it > probably works for more drivers than just those using phy_mii_ioctl(). Do you mean change uapi, like add mdio_phy_id_is_c45_over_c22() and flag #define MDIO_PHY_ID_C45_OVER_C22 0x4000? Thank you for your comments and patience.
On Tue, Nov 02, 2021 at 03:46:13PM -0400, Sean Anderson wrote: > I have not found this to be the case. As soon as you need to access > something using phylink, the emulated registers make the ioctls useless > (especially because there may be multiple phy-like devices for one > interface). I think you're fundamentally misunderstanding something there. If there is a PHY present, phylink presents no different an interface from phylib - it does no emulation what so ever, and you can access any address. I use this on Macchiatobin when researching the 88x3310 PHY. I have a tool that allows me to view part of the register set in any MMD in almost real-time - and I can access either of the two PHYs on the xmdio bus from either of their network interfaces. Same for the clause 22 mdio bus. There is no emulation in this case, and you get full access to the MDIO/XMDIO bus just like via phylib. There is absolutely no difference. If there is no PHY connected, then phylink will emulate the accesses in just the same way as the fixed-phy support emulates accesses, and in a bug-compatible way with fixed-phy. It only emulates for PHY address 0. As there is no PHY, there is no MII bus known to phylink, so it there is no MII bus for phylink to pass any non-zero address on to. Split PCS support is relatively new, and this brings with it a whole host of issues: 1) the PCS may not be on a MII bus, and may not even have a PHY-like set of registers. How do we export that kind of setup through the MII ioctls? 2) when we have a copper SFP plugged in with its own PHY, we export it through the MII ioctls because phylink now has a PHY (so it falls in the "PHY present" case above). If we also have a PCS on a MII bus, we now have two completely different MII buses. Which MII bus do we export? 3) in the non-SFP case, the PHY and PCS may be sitting on different MII buses. Again, which MII bus do we export? The MII ioctls have no way to indicate which MII bus should be accessed. We can't just look at the address - what if the PHY and PCS are at the same address but on different buses? We may have cases where the PHY and PCS are sitting on the same MII bus - and in that case, phylink does not restrict whether you can access the PCS through the MII ioctls. Everything other case is "complicated" and unless we can come up with a sane way to fit everything into two or more buses into these antequated ioctls that are designed for a single MII bus, it's probably best not to even bodge something at the phylink level - it probably makes more sense for the network driver to do it. After all, the network driver probably has more knowledge about the hardware around it than phylink does.
> > What i find interesting is that you and the other resent requester are > > using the same user space tool. If you implement C45 over C22 in that > > tool, you get your solution, and it will work for older kernels as > > well. Also, given the diverse implementations of this IOTCL, it > > probably works for more drivers than just those using phy_mii_ioctl(). > > Do you mean change uapi, like > add mdio_phy_id_is_c45_over_c22() and > flag #define MDIO_PHY_ID_C45_OVER_C22 0x4000? No, i mean user space implements C45 over C22. Make phytool write MII_MMD_CTRL and MII_MMD_DATA to perform a C45 over C22. Andrew
On 03/11/2021 02:27, Andrew Lunn wrote: >>> What i find interesting is that you and the other resent requester are >>> using the same user space tool. If you implement C45 over C22 in that >>> tool, you get your solution, and it will work for older kernels as >>> well. Also, given the diverse implementations of this IOTCL, it >>> probably works for more drivers than just those using phy_mii_ioctl(). >> >> Do you mean change uapi, like >> add mdio_phy_id_is_c45_over_c22() and >> flag #define MDIO_PHY_ID_C45_OVER_C22 0x4000? > > No, i mean user space implements C45 over C22. Make phytool write > MII_MMD_CTRL and MII_MMD_DATA to perform a C45 over C22. Now I give up - as mentioned there is now way to sync User space vs Kernel MMD transactions and so no way to get trusted results. Thanks all for participating in discussion.
On Wed, Nov 03, 2021 at 08:42:07PM +0200, Grygorii Strashko wrote: > > > On 03/11/2021 02:27, Andrew Lunn wrote: > > > > What i find interesting is that you and the other resent requester are > > > > using the same user space tool. If you implement C45 over C22 in that > > > > tool, you get your solution, and it will work for older kernels as > > > > well. Also, given the diverse implementations of this IOTCL, it > > > > probably works for more drivers than just those using phy_mii_ioctl(). > > > > > > Do you mean change uapi, like > > > add mdio_phy_id_is_c45_over_c22() and > > > flag #define MDIO_PHY_ID_C45_OVER_C22 0x4000? > > > > No, i mean user space implements C45 over C22. Make phytool write > > MII_MMD_CTRL and MII_MMD_DATA to perform a C45 over C22. > > Now I give up - as mentioned there is now way to sync User space vs Kernel > MMD transactions and so no way to get trusted results. Except that it will probably work 99% of the time, which is enough for a debug tool. phylib is pretty idle most of the time, it just polls the PHY once a second to see if the link status has changed. And does not poll at all if interrupts are wired up. And you can always do a read three times and see if you get the same answer, or do a write followed by a read to see if the write actually happened correctly, or corrupted some other register. Andrew
On Wed, Nov 03, 2021 at 20:36, Andrew Lunn <andrew@lunn.ch> wrote: > On Wed, Nov 03, 2021 at 08:42:07PM +0200, Grygorii Strashko wrote: >> >> >> On 03/11/2021 02:27, Andrew Lunn wrote: >> > > > What i find interesting is that you and the other resent requester are >> > > > using the same user space tool. If you implement C45 over C22 in that >> > > > tool, you get your solution, and it will work for older kernels as >> > > > well. Also, given the diverse implementations of this IOTCL, it >> > > > probably works for more drivers than just those using phy_mii_ioctl(). >> > > >> > > Do you mean change uapi, like >> > > add mdio_phy_id_is_c45_over_c22() and >> > > flag #define MDIO_PHY_ID_C45_OVER_C22 0x4000? >> > >> > No, i mean user space implements C45 over C22. Make phytool write >> > MII_MMD_CTRL and MII_MMD_DATA to perform a C45 over C22. >> >> Now I give up - as mentioned there is now way to sync User space vs Kernel >> MMD transactions and so no way to get trusted results. Except that there is a way: https://github.com/wkz/mdio-tools I can see that Sean has already mentioned it in the other branch of the thread (thanks for the plug :)). I have posted it to netdev before: https://lore.kernel.org/netdev/C42DZQLTPHM5.2THDSRK84BI3T@wkz-x280/ It allows you to send an entire "MDIO program" to the kernel, where mdio-netlink will (1) lock the bus, (2) run your program in a small VM, and (3) unlock the bus. There are currently two programs in the project: - `mdio`: A basic register peek/poke program that uses the mdio-netlink module in the kernel to do its thing. The source is structured in such a way that custom access modes can be easily added. Today there are accessors for C22 PHYs, C45 MMDs, Marvell Alaska paged PHYs, Marvell LinkStreet switches, and XRS700x switches. - `mvls`: Specialized read-only debug tool for Marvell LinkStreet switches. This does _not_ rely on the mdio-netlink kernel module, instead it uses the standard devlink API. Let's you dump the VTU/ATU etc. You could easily add a new addressing mode to `mdio` to do C45-over-C22 accesses. Would that work for you Grygorii? > Except that it will probably work 99% of the time, which is enough for > a debug tool. Why though, why would we not build something that is rock solid? Ever since ce69e2162f15, the flood gates are open. Any vendor can implement mdio-netlink on their own, or just download mine. We are only punishing ourselves at this point. > phylib is pretty idle most of the time, it just polls > the PHY once a second to see if the link status has changed. And does > not poll at all if interrupts are wired up. And you can always do a > read three times and see if you get the same answer, or do a write > followed by a read to see if the write actually happened correctly, or > corrupted some other register. As a staunch opponent of Vendor SDKs myself, I get where you are coming from - really. I guess I just don't have the brain power to operate this kind of Rube Goldberg machinery and debug my problems at the same time. We have a mutex right there already, let's use it! I'll shut up now - I just had to make one final appeal :)
On Thu, Nov 04, 2021 at 12:17:47PM +0100, Tobias Waldekranz wrote: > On Wed, Nov 03, 2021 at 20:36, Andrew Lunn <andrew@lunn.ch> wrote: > > On Wed, Nov 03, 2021 at 08:42:07PM +0200, Grygorii Strashko wrote: > >> > >> > >> On 03/11/2021 02:27, Andrew Lunn wrote: > >> > > > What i find interesting is that you and the other resent requester are > >> > > > using the same user space tool. If you implement C45 over C22 in that > >> > > > tool, you get your solution, and it will work for older kernels as > >> > > > well. Also, given the diverse implementations of this IOTCL, it > >> > > > probably works for more drivers than just those using phy_mii_ioctl(). > >> > > > >> > > Do you mean change uapi, like > >> > > add mdio_phy_id_is_c45_over_c22() and > >> > > flag #define MDIO_PHY_ID_C45_OVER_C22 0x4000? > >> > > >> > No, i mean user space implements C45 over C22. Make phytool write > >> > MII_MMD_CTRL and MII_MMD_DATA to perform a C45 over C22. > >> > >> Now I give up - as mentioned there is now way to sync User space vs Kernel > >> MMD transactions and so no way to get trusted results. > > Except that there is a way: https://github.com/wkz/mdio-tools I'm guessing that this hasn't had much in the way of review, as it has a nice exploitable bug - you really want "pc" to be unsigned in mdio_nl_eval(), otherwise one can write a branch instruction that makes "pc" negative. Also it looks like one can easily exploit this to trigger any of your BUG_ON()/BUG() statements, thereby crashing while holding the MDIO bus lock causing a denial of service attack. I also see nothing that protects against any user on a system being able to use this interface, so the exploits above can be triggered by any user. Moreover, this lack of protection means any user on the system can use this interface to write to a PHY. Given that some PHYs today contain firmware, this gives anyone access to reprogram the PHY firmware, possibly introducing malicious firmware. I hope no one is using this module in a production environment.
On Thu, Nov 04, 2021 at 12:35:17PM +0000, Russell King (Oracle) wrote: > On Thu, Nov 04, 2021 at 12:17:47PM +0100, Tobias Waldekranz wrote: > > Except that there is a way: https://github.com/wkz/mdio-tools > > I'm guessing that this hasn't had much in the way of review, as it has > a nice exploitable bug - you really want "pc" to be unsigned in > mdio_nl_eval(), otherwise one can write a branch instruction that makes > "pc" negative. > > Also it looks like one can easily exploit this to trigger any of your > BUG_ON()/BUG() statements, thereby crashing while holding the MDIO bus > lock causing a denial of service attack. > > I also see nothing that protects against any user on a system being > able to use this interface, so the exploits above can be triggered by > any user. Moreover, this lack of protection means any user on the > system can use this interface to write to a PHY. > > Given that some PHYs today contain firmware, this gives anyone access > to reprogram the PHY firmware, possibly introducing malicious firmware. > > I hope no one is using this module in a production environment. It also leaks the reference count on the MDIO bus class device. mdio_find_bus(), rather class_find_device_by_name() takes a reference on the struct device that you never drop. See the documentation for class_find_device() for the statement about this: * Note, you will need to drop the reference with put_device() after use. Of course, mdio_find_bus() documentation should _really_ have mentioned this fact too.
On Thu, Nov 04, 2021 at 12:35, "Russell King (Oracle)" <linux@armlinux.org.uk> wrote: > On Thu, Nov 04, 2021 at 12:17:47PM +0100, Tobias Waldekranz wrote: >> On Wed, Nov 03, 2021 at 20:36, Andrew Lunn <andrew@lunn.ch> wrote: >> > On Wed, Nov 03, 2021 at 08:42:07PM +0200, Grygorii Strashko wrote: >> >> >> >> >> >> On 03/11/2021 02:27, Andrew Lunn wrote: >> >> > > > What i find interesting is that you and the other resent requester are >> >> > > > using the same user space tool. If you implement C45 over C22 in that >> >> > > > tool, you get your solution, and it will work for older kernels as >> >> > > > well. Also, given the diverse implementations of this IOTCL, it >> >> > > > probably works for more drivers than just those using phy_mii_ioctl(). >> >> > > >> >> > > Do you mean change uapi, like >> >> > > add mdio_phy_id_is_c45_over_c22() and >> >> > > flag #define MDIO_PHY_ID_C45_OVER_C22 0x4000? >> >> > >> >> > No, i mean user space implements C45 over C22. Make phytool write >> >> > MII_MMD_CTRL and MII_MMD_DATA to perform a C45 over C22. >> >> >> >> Now I give up - as mentioned there is now way to sync User space vs Kernel >> >> MMD transactions and so no way to get trusted results. >> >> Except that there is a way: https://github.com/wkz/mdio-tools > > I'm guessing that this hasn't had much in the way of review, as it has > a nice exploitable bug - you really want "pc" to be unsigned in > mdio_nl_eval(), otherwise one can write a branch instruction that makes > "pc" negative. You are quite right, it never got that far as it was NAKed on principle before that. I welcome the review, this is one of the reasons why I would love to have it in mainline. Alternatively, if someone has a better idea, I wouldn't mind adapting mdio-tools to whatever that interface would be. I agree that there should be much more rigorous checks around the modification of the PC. I will get on that. > Also it looks like one can easily exploit this to trigger any of your > BUG_ON()/BUG() statements, thereby crashing while holding the MDIO bus > lock causing a denial of service attack. The idea is that this is pre-validated in mdio_nl_validate_insn. Each instruction lists their acceptable argument types in mdio_nl_op_protos. > I also see nothing that protects against any user on a system being > able to use this interface, so the exploits above can be triggered by > any user. Moreover, this lack of protection means any user on the > system can use this interface to write to a PHY. I was under the impression that specifying GENL_ADMIN_PERM in the `struct genl_ops` would require the caller to hold CAP_NET_ADMIN? > Given that some PHYs today contain firmware, this gives anyone access > to reprogram the PHY firmware, possibly introducing malicious firmware. > > I hope no one is using this module in a production environment. Thanks for your review.
On Thu, Nov 04, 2021 at 12:40, "Russell King (Oracle)" <linux@armlinux.org.uk> wrote: > On Thu, Nov 04, 2021 at 12:35:17PM +0000, Russell King (Oracle) wrote: >> On Thu, Nov 04, 2021 at 12:17:47PM +0100, Tobias Waldekranz wrote: >> > Except that there is a way: https://github.com/wkz/mdio-tools >> >> I'm guessing that this hasn't had much in the way of review, as it has >> a nice exploitable bug - you really want "pc" to be unsigned in >> mdio_nl_eval(), otherwise one can write a branch instruction that makes >> "pc" negative. >> >> Also it looks like one can easily exploit this to trigger any of your >> BUG_ON()/BUG() statements, thereby crashing while holding the MDIO bus >> lock causing a denial of service attack. >> >> I also see nothing that protects against any user on a system being >> able to use this interface, so the exploits above can be triggered by >> any user. Moreover, this lack of protection means any user on the >> system can use this interface to write to a PHY. >> >> Given that some PHYs today contain firmware, this gives anyone access >> to reprogram the PHY firmware, possibly introducing malicious firmware. >> >> I hope no one is using this module in a production environment. > > It also leaks the reference count on the MDIO bus class device. > mdio_find_bus(), rather class_find_device_by_name() takes a reference > on the struct device that you never drop. See the documentation for > class_find_device() for the statement about this: > > * Note, you will need to drop the reference with put_device() after use. Ahh interesting. Thanks! > Of course, mdio_find_bus() documentation should _really_ have mentioned > this fact too. Yeah, that would have been helpful.
On 11/2/21 7:38 PM, Russell King (Oracle) wrote: > On Tue, Nov 02, 2021 at 03:46:13PM -0400, Sean Anderson wrote: >> I have not found this to be the case. As soon as you need to access >> something using phylink, the emulated registers make the ioctls useless >> (especially because there may be multiple phy-like devices for one >> interface). > > I think you're fundamentally misunderstanding something there. > > If there is a PHY present, phylink presents no different an interface > from phylib - it does no emulation what so ever, and you can access any > address. I use this on Macchiatobin when researching the 88x3310 PHY. I > have a tool that allows me to view part of the register set in any MMD > in almost real-time - and I can access either of the two PHYs on the > xmdio bus from either of their network interfaces. Same for the clause > 22 mdio bus. There is no emulation in this case, and you get full > access to the MDIO/XMDIO bus just like via phylib. There is absolutely > no difference. > > If there is no PHY connected, then phylink will emulate the accesses > in just the same way as the fixed-phy support emulates accesses, and > in a bug-compatible way with fixed-phy. It only emulates for PHY > address 0. As there is no PHY, there is no MII bus known to phylink, > so it there is no MII bus for phylink to pass any non-zero address on > to. > > Split PCS support is relatively new, and this brings with it a whole > host of issues: > > 1) the PCS may not be on a MII bus, and may not even have a PHY-like > set of registers. How do we export that kind of setup through the > MII ioctls? > > 2) when we have a copper SFP plugged in with its own PHY, we export it > through the MII ioctls because phylink now has a PHY (so it falls > in the "PHY present" case above). If we also have a PCS on a MII > bus, we now have two completely different MII buses. Which MII bus > do we export? > > 3) in the non-SFP case, the PHY and PCS may be sitting on different > MII buses. Again, which MII bus do we export? > > The MII ioctls have no way to indicate which MII bus should be > accessed. We can't just look at the address - what if the PHY and PCS > are at the same address but on different buses? > > We may have cases where the PHY and PCS are sitting on the same MII bus > - and in that case, phylink does not restrict whether you can access > the PCS through the MII ioctls. > > Everything other case is "complicated" and unless we can come up with > a sane way to fit everything into two or more buses into these > antequated ioctls that are designed for a single MII bus, it's probably > best not to even bodge something at the phylink level - it probably > makes more sense for the network driver to do it. After all, the > network driver probably has more knowledge about the hardware around it > than phylink does. I am specifically objecting to the statement > The current API is good enough you can use it for debug Because for debugging purposes, the current API is simply inadequate. As you note above, there are many cases where there is no obvious mapping between a single network interface and a single PHY on a single MDIO bus. For this reason, it is necessary to allow userspace access to any address on any MDIO bus for debugging. Even a read-only debugfs interface would be useful, but from what I can tell, such patches have been NAK'd. I find this very frustrating. I have no opinion on the proposed patch above (due to the ioctl interface's more fundamental issues, which you note). You will continue to get patches trying to extend MDIO access until there is better debug access. --Sean
diff --git a/drivers/net/phy/phy-core.c b/drivers/net/phy/phy-core.c index 2870c33b8975..2c83a121a5fa 100644 --- a/drivers/net/phy/phy-core.c +++ b/drivers/net/phy/phy-core.c @@ -457,6 +457,28 @@ static void mmd_phy_indirect(struct mii_bus *bus, int phy_addr, int devad, devad | MII_MMD_CTRL_NOINCR); } +int __mmd_phy_read(struct mii_bus *bus, int phy_addr, int devad, u32 regnum) +{ + int retval; + + mmd_phy_indirect(bus, phy_addr, devad, regnum); + + /* Read the content of the MMD's selected register */ + retval = __mdiobus_read(bus, phy_addr, MII_MMD_DATA); + + return retval; +} + +int __mmd_phy_write(struct mii_bus *bus, int phy_addr, int devad, u32 regnum, u16 val) +{ + mmd_phy_indirect(bus, phy_addr, devad, regnum); + + /* Write the data into MMD's selected register */ + __mdiobus_write(bus, phy_addr, MII_MMD_DATA, val); + + return 0; +} + /** * __phy_read_mmd - Convenience function for reading a register * from an MMD on a given PHY. @@ -482,10 +504,7 @@ int __phy_read_mmd(struct phy_device *phydev, int devad, u32 regnum) struct mii_bus *bus = phydev->mdio.bus; int phy_addr = phydev->mdio.addr; - mmd_phy_indirect(bus, phy_addr, devad, regnum); - - /* Read the content of the MMD's selected register */ - val = __mdiobus_read(bus, phy_addr, MII_MMD_DATA); + val = __mmd_phy_read(bus, phy_addr, devad, regnum); } return val; } @@ -538,10 +557,7 @@ int __phy_write_mmd(struct phy_device *phydev, int devad, u32 regnum, u16 val) struct mii_bus *bus = phydev->mdio.bus; int phy_addr = phydev->mdio.addr; - mmd_phy_indirect(bus, phy_addr, devad, regnum); - - /* Write the data into MMD's selected register */ - __mdiobus_write(bus, phy_addr, MII_MMD_DATA, val); + __mmd_phy_write(bus, phy_addr, devad, regnum, val); ret = 0; } diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index a3bfb156c83d..212ec5954b95 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -300,8 +300,19 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd) prtad = mii_data->phy_id; devad = mii_data->reg_num; } - mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, - devad); + if (mdio_phy_id_is_c45(mii_data->phy_id) && + phydev->mdio.bus->probe_capabilities <= MDIOBUS_C22) { + phy_lock_mdio_bus(phydev); + + mii_data->val_out = __mmd_phy_read(phydev->mdio.bus, + mdio_phy_id_devad(mii_data->phy_id), + prtad, + mii_data->reg_num); + + phy_unlock_mdio_bus(phydev); + } else { + mii_data->val_out = mdiobus_read(phydev->mdio.bus, prtad, devad); + } return 0; case SIOCSMIIREG: @@ -351,7 +362,19 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd) } } - mdiobus_write(phydev->mdio.bus, prtad, devad, val); + if (mdio_phy_id_is_c45(mii_data->phy_id) && + phydev->mdio.bus->probe_capabilities <= MDIOBUS_C22) { + phy_lock_mdio_bus(phydev); + + __mmd_phy_write(phydev->mdio.bus, mdio_phy_id_devad(mii_data->phy_id), + prtad, + mii_data->reg_num, + val); + + phy_unlock_mdio_bus(phydev); + } else { + mdiobus_write(phydev->mdio.bus, prtad, devad, val); + } if (prtad == phydev->mdio.addr && devad == MII_BMCR && diff --git a/include/linux/phy.h b/include/linux/phy.h index 96e43fbb2dd8..f6032c1708e6 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -1114,12 +1114,14 @@ int phy_read_mmd(struct phy_device *phydev, int devad, u32 regnum); * from an MMD on a given PHY. */ int __phy_read_mmd(struct phy_device *phydev, int devad, u32 regnum); +int __mmd_phy_read(struct mii_bus *bus, int phy_addr, int devad, u32 regnum); /* * phy_write_mmd - Convenience function for writing a register * on an MMD on a given PHY. */ int phy_write_mmd(struct phy_device *phydev, int devad, u32 regnum, u16 val); +int __mmd_phy_write(struct mii_bus *bus, int phy_addr, int devad, u32 regnum, u16 val); /* * __phy_write_mmd - Convenience function for writing a register
This patch enables access to C22 PHY MMD address space through phy_mii_ioctl() SIOCGMIIREG/SIOCSMIIREG IOCTLs. It checks if R/W request is received with C45 flag enabled while MDIO bus doesn't support C45 and, in this case, tries to treat prtad as PHY MMD selector and use MMD API. With this change it's possible to r/w PHY MMD registers with phytool, for example, before: phytool read eth0/0x1f:0/0x32 0xffea after: phytool read eth0/0x1f:0/0x32 0x00d1 This feature is very useful for various PHY issues debugging (now it's required to modify phy code to collect MMD regs dump). The patch is marked as RFC as it possible that I've missed something and such feature already present in Kernel, but I just can't find it. It also doesn't cover phylink. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> --- drivers/net/phy/phy-core.c | 32 ++++++++++++++++++++++++-------- drivers/net/phy/phy.c | 29 ++++++++++++++++++++++++++--- include/linux/phy.h | 2 ++ 3 files changed, 52 insertions(+), 11 deletions(-)