Message ID | 20220228233057.1140817-1-pgwipeout@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | [v1] net: phy: fix motorcomm module automatic loading |
On Mon, Feb 28, 2022 at 06:30:57PM -0500, Peter Geis wrote:
> The sentinel compatible entry whitespace causes automatic module loading
> to fail with certain userspace utilities. Fix this by removing the
> whitespace and sentinel comment, which is unnecessary.

Umm. How does it fail?

>  static const struct mdio_device_id __maybe_unused motorcomm_tbl[] = {
>  	{ PHY_ID_MATCH_EXACT(PHY_ID_YT8511) },
> -	{ /* sentinal */ }
> +	{}

These two should be 100% identical in terms of the object code produced,
and thus should have no bearing on the ability for the module to be
loaded.

Have you investigated the differences in the produced object code?

If not, please do so, and describe what they were. Thanks.

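The claim that both sentinel spellings yield the same data can be checked in userspace. The following is a minimal sketch, not kernel code: it assumes a stand-in copy of struct mdio_device_id, assumes PHY_ID_YT8511 is 0x0000010a, and assumes GCC or Clang (the empty initializer {} is a GNU extension before C23). The helper tables_identical() is a name invented for this illustration.

```c
#include <assert.h>   /* for caller-side checks such as assert(tables_identical()) */
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct mdio_device_id (assumption). */
struct mdio_device_id {
	uint32_t phy_id;
	uint32_t phy_id_mask;
};

/* Assumed value of the YT8511 PHY ID; taken as an assumption, not from the thread. */
#define PHY_ID_YT8511 0x0000010aU

/* Table terminated by a sentinel that carries a comment, as in the original driver. */
static const struct mdio_device_id tbl_commented[] = {
	{ .phy_id = PHY_ID_YT8511, .phy_id_mask = 0xffffffffU },
	{ /* sentinel */ }
};

/* Table terminated by a bare sentinel, as in the proposed patch. */
static const struct mdio_device_id tbl_bare[] = {
	{ .phy_id = PHY_ID_YT8511, .phy_id_mask = 0xffffffffU },
	{}
};

/* Returns 1 when both tables have the same size and the same bytes, else 0. */
int tables_identical(void)
{
	if (sizeof(tbl_commented) != sizeof(tbl_bare))
		return 0;
	return memcmp(tbl_commented, tbl_bare, sizeof(tbl_bare)) == 0;
}
```

Comments are stripped before the compiler parses the initializer, so both tables end in the same all-zero entry. Comparing the built .o or .ko files directly (as was later done in this thread) remains the authoritative check.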
On Mon, Feb 28, 2022 at 7:14 PM Russell King (Oracle) <linux@armlinux.org.uk> wrote:
>
> On Mon, Feb 28, 2022 at 06:30:57PM -0500, Peter Geis wrote:
> > The sentinel compatible entry whitespace causes automatic module loading
> > to fail with certain userspace utilities. Fix this by removing the
> > whitespace and sentinel comment, which is unnecessary.
>
> Umm. How does it fail?

It simply does not auto load the module by device id match.
Manually loading the module after the fact works fine.

>
> >  static const struct mdio_device_id __maybe_unused motorcomm_tbl[] = {
> >  	{ PHY_ID_MATCH_EXACT(PHY_ID_YT8511) },
> > -	{ /* sentinal */ }
> > +	{}
>
> These two should be 100% identical in terms of the object code produced,
> and thus should have no bearing on the ability for the module to be
> loaded.
>
> Have you investigated the differences in the produced object code?

Yes, you are correct, I just compared the produced files and they are identical.
This patch can get dropped then.
I'm curious now why it seemed to make a difference.

I am not familiar enough with how the various userspace elements
decide to match the modules to determine exactly why this is failing.
It seems to be hit or miss if userspace decides to auto load this, for
instance Ubuntu 20.04 was happy to load my kernel module built with
the arm64 official toolchain, but Manjaro will not load their self
built kernel module.
I originally suspected it was due to the manufacturer id being all zeros.
Unless there's some weird compiler optimization that I'm not seeing in
my configuration.

Any ideas would be appreciated.
Thanks!

> If not, please do so, and describe what they were. Thanks.
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

On Mon, Feb 28, 2022 at 7:44 PM Peter Geis <pgwipeout@gmail.com> wrote:
>
> On Mon, Feb 28, 2022 at 7:14 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Mon, Feb 28, 2022 at 06:30:57PM -0500, Peter Geis wrote:
> > > The sentinel compatible entry whitespace causes automatic module loading
> > > to fail with certain userspace utilities. Fix this by removing the
> > > whitespace and sentinel comment, which is unnecessary.
> >
> > Umm. How does it fail?
>
> It simply does not auto load the module by device id match.
> Manually loading the module after the fact works fine.
>
> >
> > >  static const struct mdio_device_id __maybe_unused motorcomm_tbl[] = {
> > >  	{ PHY_ID_MATCH_EXACT(PHY_ID_YT8511) },
> > > -	{ /* sentinal */ }
> > > +	{}
> >
> > These two should be 100% identical in terms of the object code produced,
> > and thus should have no bearing on the ability for the module to be
> > loaded.
> >
> > Have you investigated the differences in the produced object code?
>
> Yes, you are correct, I just compared the produced files and they are identical.
> This patch can get dropped then.
> I'm curious now why it seemed to make a difference.
>
> I am not familiar enough with how the various userspace elements
> decide to match the modules to determine exactly why this is failing.
> It seems to be hit or miss if userspace decides to auto load this, for
> instance Ubuntu 20.04 was happy to load my kernel module built with
> the arm64 official toolchain, but Manjaro will not load their self
> built kernel module.
> I originally suspected it was due to the manufacturer id being all zeros.
> Unless there's some weird compiler optimization that I'm not seeing in
> my configuration.
>
> Any ideas would be appreciated.
> Thanks!

Good Morning,

After testing various configurations I found what is actually
happening here. When libphy is built in but the phy drivers are
modules and not available in the initrd, the generic phy driver binds
here. This allows the phy to come up but it is not functional. It also
prevents the module driver from binding when it becomes available.

https://elixir.bootlin.com/linux/v5.18-rc4/source/drivers/net/phy/phy_device.c#L1383

It seems there is an implicit dependency between phy_device and the
device specific drivers that isn't realized in the configuration. I
can think of a few ways to fix this, but I think the simplest is to
make the device specific drivers have a kconfig dependency on libphy
(which builds phy_device). This means that the only time the device
specific phy drivers can be modules is if libphy is as well, otherwise
if libphy is built in, the device specific drivers would need to be as
well.

There are more elegant and complicated solutions I can think of here,
such as breaking out the generic driver as a module or having a
device-tree flag that annotates that we need a device specific driver.

This isn't realized with most of the common devices such as the
Realtek driver because they have cross dependencies that ensure they
are likely to be built in.

Very Respectfully,
Peter Geis

>
> > If not, please do so, and describe what they were. Thanks.
> >
> > --
> > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

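The binding order described in this message can be modelled in a few lines. This is a heavily simplified, hypothetical userspace model of the behaviour as reported here, not the kernel's data structures or the code at the linked location; struct model_phy, model_attach() and model_module_probe() are names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a PHY device: driver is NULL while nothing is bound. */
struct model_phy {
	const char *driver;
};

/* Attach step run by the MAC driver: fall back to the generic driver
 * when no device specific driver has been bound yet. */
void model_attach(struct model_phy *phy)
{
	assert(phy != NULL);
	if (phy->driver == NULL)
		phy->driver = "Generic PHY";
}

/* A driver module arriving later can only bind a device that has no
 * driver yet. Returns 1 if it bound, 0 if the device was already taken. */
int model_module_probe(struct model_phy *phy, const char *name)
{
	assert(phy != NULL);
	if (phy->driver != NULL)
		return 0; /* already bound: the specific driver is skipped */
	phy->driver = name;
	return 1;
}
```

In this model, calling model_attach() before model_module_probe() leaves the device on the generic driver for good, which mirrors the reported symptom. Reversing the order lets the specific driver bind, which is what happens when the driver is built in or already loaded at attach time.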
> Good Morning,
>
> After testing various configurations I found what is actually
> happening here. When libphy is built in but the phy drivers are
> modules and not available in the initrd, the generic phy driver binds
> here. This allows the phy to come up but it is not functional.

What MAC are you using?

Why is your interface being brought up by the initramfs? Are you using
NFS root from within the initramfs?

What normally happens is that the kernel loads, maybe with the MAC
driver and phylib loading, as part of the initramfs. The other modules
in the initramfs allow the root filesystem to be found, mounted, and
pivoted into it. The MAC driver is then brought up by the initscripts,
which causes phylib to request the needed PHY driver modules; they load
and all is good.

If you are using NFS root, then the load of the PHY driver happens
earlier, inside the initramfs. If this is your situation, maybe the
correct fix is to teach the initramfs tools to include the PHY drivers
when NFS root is being used?

    Andrew

On Sat, Apr 30, 2022 at 11:52 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > Good Morning,
> >
> > After testing various configurations I found what is actually
> > happening here. When libphy is built in but the phy drivers are
> > modules and not available in the initrd, the generic phy driver binds
> > here. This allows the phy to come up but it is not functional.
>
> What MAC are you using?

Specifically Motorcomm, but I've discovered it can happen with any of
the phy drivers with the right kconfig.

>
> Why is you interface being brought up by the initramfs? Are you using
> NFS root from within the initramfs?

This was discovered with embedded programming. It's common to have a
small initramfs, or forgo an initramfs altogether. Another cause is a
mismatch in kernel config where phylib is built in because of a
dependency, but the rest of the phy drivers are modular.
The key is:
- phylib is built in
- ethernet driver is built in
- the phy driver is a module
- modules aren't available at probe time (for any reason).

In this case phylib assumes there is no driver, when the vast majority
of phys now have device specific drivers. This seems to be an unsafe
assumption, as it means there is now an implicit dependency between the
device specific phy drivers and phylib. It just so happens to work
simply because both broadcom and realtek, some of the more common
phys, have explicit dependencies elsewhere that cause them to be built
in as well.

>
> What normally happens is that the kernel loads, maybe with the MAC
> driver and phylib loading, as part of the initramfs. The other modules
> in the initramfs allow the root filesystem to be found, mounted, and
> pivoted into it. The MAC driver is then brought up by the initscripts,
> which causes phylib to request the needed PHY driver modules, it loads
> and all is good.
>
> If you are using NFS root, then the load of the PHY driver happens
> earlier, inside the initramfs. If this is you situation, maybe the
> correct fix is to teach the initramfs tools to include the PHY drivers
> when NFS root is being used?
>
>     Andrew

On Sat, Apr 30, 2022 at 12:31:27PM -0400, Peter Geis wrote:
> On Sat, Apr 30, 2022 at 11:52 AM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > Good Morning,
> > >
> > > After testing various configurations I found what is actually
> > > happening here. When libphy is built in but the phy drivers are
> > > modules and not available in the initrd, the generic phy driver binds
> > > here. This allows the phy to come up but it is not functional.
> >
> > What MAC are you using?
>
> Specifically Motorcomm, but I've discovered it can happen with any of
> the phy drivers with the right kconfig.
>
> >
> > Why is you interface being brought up by the initramfs? Are you using
> > NFS root from within the initramfs?
>
> This was discovered with embedded programming. It's common to have a
> small initramfs, or forgo an initramfs altogether.

Yes, I do that all the time. But then it is up to me to ensure I have
all the code I need built into the kernel.

> Another cause is a
> mismatch in kernel config where phylib is built in because of a
> dependency, but the rest of the phy drivers are modular.
> The key is:
> - phylib is built in
> - ethernet driver is built in
> - the phy driver is a module
> - modules aren't available at probe time (for any reason).

This 'for any reason' is what I'm trying to get at. It is not the
kernel which builds the initramfs. It is not the kernel's problem if the
modules it needs are missing; it is my fault for not telling the
initramfs tools to include the modules needed to actually boot the
machine.

    Andrew

On 30.04.2022 18:31, Peter Geis wrote:
> On Sat, Apr 30, 2022 at 11:52 AM Andrew Lunn <andrew@lunn.ch> wrote:
>>
>>> Good Morning,
>>>
>>> After testing various configurations I found what is actually
>>> happening here. When libphy is built in but the phy drivers are
>>> modules and not available in the initrd, the generic phy driver binds
>>> here. This allows the phy to come up but it is not functional.
>>
>> What MAC are you using?
>
> Specifically Motorcomm, but I've discovered it can happen with any of
> the phy drivers with the right kconfig.
>
>>
>> Why is you interface being brought up by the initramfs? Are you using
>> NFS root from within the initramfs?
>
> This was discovered with embedded programming. It's common to have a
> small initramfs, or forgo an initramfs altogether. Another cause is a
> mismatch in kernel config where phylib is built in because of a
> dependency, but the rest of the phy drivers are modular.
> The key is:
> - phylib is built in
> - ethernet driver is built in
> - the phy driver is a module
> - modules aren't available at probe time (for any reason).
>
> In this case phylib assumes there is no driver, when the vast majority
> of phys now have device specific drivers.It seems this is an unsafe
> assumption as this means there is now an implicit dependency of the
> device specific phy drivers and phylib. It just so happens to work
> simply because both broadcom and realtek, some of the more common
> phys, have explicit dependencies elsewhere that cause them to be built
> in as well.
>

Because you mention the realtek phy driver:
Users reported issues similar to yours when the r8169 MAC driver is
built-in (or the r8169 module is in the initramfs) but the realtek phy
driver is not.
There's no direct code dependency between r8169 and the realtek phy
driver, therefore initramfs-creating tools sometimes failed to include
the phy driver in the initramfs automatically. To mitigate this, r8169
has the following:
MODULE_SOFTDEP("pre: realtek");

This isn't strictly needed, but some initramfs-creating tools consider
such soft dependencies when checking what should be included in the
initramfs. If some other MAC is used with a Realtek PHY, then you may
still see the described issue.

As Andrew wrote: Ultimately it's a userspace responsibility to ensure
that all needed modules are included in the initramfs.

>>
>> What normally happens is that the kernel loads, maybe with the MAC
>> driver and phylib loading, as part of the initramfs. The other modules
>> in the initramfs allow the root filesystem to be found, mounted, and
>> pivoted into it. The MAC driver is then brought up by the initscripts,
>> which causes phylib to request the needed PHY driver modules, it loads
>> and all is good.
>>
>> If you are using NFS root, then the load of the PHY driver happens
>> earlier, inside the initramfs. If this is you situation, maybe the
>> correct fix is to teach the initramfs tools to include the PHY drivers
>> when NFS root is being used?
>>
>>     Andrew

On Sat, Apr 30, 2022 at 12:31:27PM -0400, Peter Geis wrote:
> On Sat, Apr 30, 2022 at 11:52 AM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > Good Morning,
> > >
> > > After testing various configurations I found what is actually
> > > happening here. When libphy is built in but the phy drivers are
> > > modules and not available in the initrd, the generic phy driver binds
> > > here. This allows the phy to come up but it is not functional.
> >
> > What MAC are you using?
>
> Specifically Motorcomm, but I've discovered it can happen with any of
> the phy drivers with the right kconfig.
>
> >
> > Why is you interface being brought up by the initramfs? Are you using
> > NFS root from within the initramfs?
>
> This was discovered with embedded programming. It's common to have a
> small initramfs, or forgo an initramfs altogether.

If you're talking about embedded, it makes more sense to have the PHY
drivers built-in. They will take up less text and data space that way.
Typically, PHY drivers have very small amounts of text and data, and
both of these end up being rounded up to a page size when loaded as a
module.

> Another cause is a
> mismatch in kernel config where phylib is built in because of a
> dependency, but the rest of the phy drivers are modular.
> The key is:
> - phylib is built in
> - ethernet driver is built in
> - the phy driver is a module
> - modules aren't available at probe time (for any reason).

This is why many ethernet drivers connect with their PHY in their
.ndo_open method, rather than at probe time.

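The page-rounding point made in this reply can be illustrated with a little arithmetic. This is a rough sketch only: it assumes 4 KiB pages and that a module's text and data are each rounded up separately, and the helper names and any sizes fed to them are invented for illustration rather than measured from a real driver.

```c
#include <assert.h>   /* for caller-side checks */
#include <stddef.h>

/* Assumed page size; real systems may use a different value. */
#define MODEL_PAGE_SIZE 4096u

/* Round a section size up to a whole number of pages. */
size_t page_round_up(size_t bytes)
{
	return (bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE * MODEL_PAGE_SIZE;
}

/* Memory taken by a module whose text and data are rounded up separately. */
size_t module_footprint(size_t text, size_t data)
{
	return page_round_up(text) + page_round_up(data);
}

/* Memory taken when the same text and data are linked into the kernel image,
 * ignoring alignment padding for simplicity. */
size_t builtin_footprint(size_t text, size_t data)
{
	return text + data;
}
```

For a hypothetical driver with 1500 bytes of text and 300 bytes of data, module_footprint(1500, 300) gives 8192 bytes, while builtin_footprint(1500, 300) gives 1800 bytes.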
diff --git a/drivers/net/phy/motorcomm.c b/drivers/net/phy/motorcomm.c
index 7e6ac2c5e27e..1e3a4eed39bc 100644
--- a/drivers/net/phy/motorcomm.c
+++ b/drivers/net/phy/motorcomm.c
@@ -131,7 +131,7 @@ MODULE_LICENSE("GPL");
 
 static const struct mdio_device_id __maybe_unused motorcomm_tbl[] = {
 	{ PHY_ID_MATCH_EXACT(PHY_ID_YT8511) },
-	{ /* sentinal */ }
+	{}
 };
 
 MODULE_DEVICE_TABLE(mdio, motorcomm_tbl);

The sentinel compatible entry whitespace causes automatic module loading
to fail with certain userspace utilities. Fix this by removing the
whitespace and sentinel comment, which is unnecessary.

Fixes: 48e8c6f1612b ("net: phy: add driver for Motorcomm yt8511 phy")
Signed-off-by: Peter Geis <pgwipeout@gmail.com>
---
 drivers/net/phy/motorcomm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)