diff mbox series

[v1] net: phy: fix motorcomm module automatic loading

Message ID 20220228233057.1140817-1-pgwipeout@gmail.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           warning  Target tree name not specified in the subject
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           success  CCed 7 of 7 maintainers
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/verify_fixes             fail     Problems with Fixes tag: 1
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0
netdev/tree_selection           success  Guessing tree name failed - patch did not apply

Commit Message

Peter Geis Feb. 28, 2022, 11:30 p.m. UTC
The sentinel compatible entry whitespace causes automatic module loading
to fail with certain userspace utilities. Fix this by removing the
whitespace and sentinel comment, which is unnecessary.

Fixes: 48e8c6f1612b ("net: phy: add driver for Motorcomm yt8511 phy")

Signed-off-by: Peter Geis <pgwipeout@gmail.com>
---
 drivers/net/phy/motorcomm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Russell King (Oracle) March 1, 2022, 12:14 a.m. UTC | #1
On Mon, Feb 28, 2022 at 06:30:57PM -0500, Peter Geis wrote:
> The sentinel compatible entry whitespace causes automatic module loading
> to fail with certain userspace utilities. Fix this by removing the
> whitespace and sentinel comment, which is unnecessary.

Umm. How does it fail?

>  static const struct mdio_device_id __maybe_unused motorcomm_tbl[] = {
>  	{ PHY_ID_MATCH_EXACT(PHY_ID_YT8511) },
> -	{ /* sentinal */ }
> +	{}

These two should be 100% identical in terms of the object code produced,
and thus should have no bearing on the ability for the module to be
loaded.

Have you investigated the differences in the produced object code?
If not, please do so, and describe what they were. Thanks.
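
The claim that both spellings of the sentinel produce the same object code can be checked outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct mdio_id_model` is a hypothetical stand-in for `struct mdio_device_id`, and `MODEL_PHY_ID` is an assumed value used only for illustration. Empty braces are a GNU extension before C23, which is what the kernel build relies on.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct mdio_device_id. */
struct mdio_id_model {
	uint32_t phy_id;
	uint32_t phy_id_mask;
};

/* Assumed PHY ID, for illustration only. */
#define MODEL_PHY_ID 0x0000010a

/* Table terminated by an empty initializer that contains a comment. */
static const struct mdio_id_model with_comment[] = {
	{ MODEL_PHY_ID, 0xffffffff },
	{ /* sentinel */ }
};

/* Table terminated by a bare empty initializer. */
static const struct mdio_id_model without_comment[] = {
	{ MODEL_PHY_ID, 0xffffffff },
	{}
};

/* Returns 1 when both tables have the same size and the same bytes. */
int tables_identical(void)
{
	return sizeof(with_comment) == sizeof(without_comment) &&
	       memcmp(with_comment, without_comment, sizeof(with_comment)) == 0;
}

/* Returns 1 when the terminating entry is all zeros. */
int sentinel_is_zero(void)
{
	static const struct mdio_id_model zero;

	return memcmp(&without_comment[1], &zero, sizeof(zero)) == 0;
}
```

Comments are stripped before the compiler sees the initializer, so both tables end in the same zero-filled entry and both functions return 1.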

Peter Geis March 1, 2022, 12:44 a.m. UTC | #2
On Mon, Feb 28, 2022 at 7:14 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Mon, Feb 28, 2022 at 06:30:57PM -0500, Peter Geis wrote:
> > The sentinel compatible entry whitespace causes automatic module loading
> > to fail with certain userspace utilities. Fix this by removing the
> > whitespace and sentinel comment, which is unnecessary.
>
> Umm. How does it fail?

It simply does not auto load the module by device id match.
Manually loading the module after the fact works fine.

>
> >  static const struct mdio_device_id __maybe_unused motorcomm_tbl[] = {
> >       { PHY_ID_MATCH_EXACT(PHY_ID_YT8511) },
> > -     { /* sentinal */ }
> > +     {}
>
> These two should be 100% identical in terms of the object code produced,
> and thus should have no bearing on the ability for the module to be
> loaded.
>
> Have you investigated the differences in the produced object code?

Yes, you are correct, I just compared the produced files and they are identical.
This patch can get dropped then.
I'm curious now why it seemed to make a difference.

I am not familiar enough with how the various userspace elements
decide to match the modules to determine exactly why this is failing.
It seems to be hit or miss if userspace decides to auto load this, for
instance Ubuntu 20.04 was happy to load my kernel module built with
the arm64 official toolchain, but Manjaro will not load their self
built kernel module.
I originally suspected it was due to the manufacturer id being all zeros.
Unless there's some weird compiler optimization that I'm not seeing in
my configuration.

Any ideas would be appreciated.
Thanks!

> If not, please do so, and describe what they were. Thanks.
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Peter Geis April 29, 2022, 12:51 p.m. UTC | #3
On Mon, Feb 28, 2022 at 7:44 PM Peter Geis <pgwipeout@gmail.com> wrote:
>
> On Mon, Feb 28, 2022 at 7:14 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Mon, Feb 28, 2022 at 06:30:57PM -0500, Peter Geis wrote:
> > > The sentinel compatible entry whitespace causes automatic module loading
> > > to fail with certain userspace utilities. Fix this by removing the
> > > whitespace and sentinel comment, which is unnecessary.
> >
> > Umm. How does it fail?
>
> It simply does not auto load the module by device id match.
> Manually loading the module after the fact works fine.
>
> >
> > >  static const struct mdio_device_id __maybe_unused motorcomm_tbl[] = {
> > >       { PHY_ID_MATCH_EXACT(PHY_ID_YT8511) },
> > > -     { /* sentinal */ }
> > > +     {}
> >
> > These two should be 100% identical in terms of the object code produced,
> > and thus should have no bearing on the ability for the module to be
> > loaded.
> >
> > Have you investigated the differences in the produced object code?
>
> Yes, you are correct, I just compared the produced files and they are identical.
> This patch can get dropped then.
> I'm curious now why it seemed to make a difference.
>
> I am not familiar enough with how the various userspace elements
> decide to match the modules to determine exactly why this is failing.
> It seems to be hit or miss if userspace decides to auto load this, for
> instance Ubuntu 20.04 was happy to load my kernel module built with
> the arm64 official toolchain, but Manjaro will not load their self
> built kernel module.
> I originally suspected it was due to the manufacturer id being all zeros.
> Unless there's some weird compiler optimization that I'm not seeing in
> my configuration.
>
> Any ideas would be appreciated.
> Thanks!

Good Morning,

After testing various configurations I found what is actually
happening here. When libphy is built in but the phy drivers are
modules and not available in the initrd, the generic phy driver binds
here. This allows the phy to come up but it is not functional. It also
prevents the module driver from binding when it becomes available.
https://elixir.bootlin.com/linux/v5.18-rc4/source/drivers/net/phy/phy_device.c#L1383

It seems there is an implicit dependency between phy_device and the
device specific drivers that isn't realized in the configuration.

I can think of a few ways to fix this, but I think the simplest is to
make the device specific drivers have a kconfig dependency on libphy
(which builds phy_device). This means that the only time the device
specific phy drivers can be modules is if libphy is as well, otherwise
if libphy is built in, the device specific drivers would need to be as
well. There are more elegant and complicated solutions I can think of
here, such as breaking out the generic driver as a module or having a
device-tree flag that annotates that we need a device specific driver.

This isn't realized with most of the common devices such as the
Realtek driver because they have cross dependencies that ensure they
are likely to be built in.
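
The ordering problem described above can be sketched as a small userspace model. This is a simplified illustration under stated assumptions, not the code in phy_device.c: `struct phy_model`, `attach_phy()` and `load_specific_driver()` are hypothetical names, and the only behaviour modelled is "attach falls back to the generic driver when nothing is bound" and "a late driver cannot bind a device that already has one".

```c
/* Hypothetical model of which driver ends up bound to a PHY. */
enum phy_driver_kind { DRV_NONE, DRV_GENERIC, DRV_SPECIFIC };

struct phy_model {
	enum phy_driver_kind driver; /* DRV_NONE until something binds */
};

/* Attach: if no driver has bound yet, fall back to the generic one. */
void attach_phy(struct phy_model *phy)
{
	if (phy->driver == DRV_NONE)
		phy->driver = DRV_GENERIC;
}

/*
 * A device specific driver becoming available: it can only bind a PHY
 * that has no driver yet. Returns 1 if it bound, 0 otherwise.
 */
int load_specific_driver(struct phy_model *phy)
{
	if (phy->driver != DRV_NONE)
		return 0;
	phy->driver = DRV_SPECIFIC;
	return 1;
}

/* The failing case: attach happens before the module is available. */
enum phy_driver_kind driver_when_module_arrives_late(void)
{
	struct phy_model phy = { DRV_NONE };

	attach_phy(&phy);
	load_specific_driver(&phy);
	return phy.driver;
}

/* The working case: the specific driver binds before attach. */
enum phy_driver_kind driver_when_module_arrives_first(void)
{
	struct phy_model phy = { DRV_NONE };

	load_specific_driver(&phy);
	attach_phy(&phy);
	return phy.driver;
}
```

In this model the late-module case ends with `DRV_GENERIC` and the early-module case with `DRV_SPECIFIC`, which matches the behaviour reported in this message.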

Very Respectfully,
Peter Geis


>
> > If not, please do so, and describe what they were. Thanks.
> >
> > --
> > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> > FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Andrew Lunn April 30, 2022, 3:52 p.m. UTC | #4
> Good Morning,
> 
> After testing various configurations I found what is actually
> happening here. When libphy is built in but the phy drivers are
> modules and not available in the initrd, the generic phy driver binds
> here. This allows the phy to come up but it is not functional.

What MAC are you using?

Why is your interface being brought up by the initramfs? Are you using
NFS root from within the initramfs?

What normally happens is that the kernel loads, maybe with the MAC
driver and phylib loading, as part of the initramfs. The other modules
in the initramfs allow the root filesystem to be found, mounted, and
pivoted into it. The MAC driver is then brought up by the initscripts,
which causes phylib to request the needed PHY driver modules, it loads
and all is good.

If you are using NFS root, then the load of the PHY driver happens
earlier, inside the initramfs. If this is your situation, maybe the
correct fix is to teach the initramfs tools to include the PHY drivers
when NFS root is being used?

     Andrew

Peter Geis April 30, 2022, 4:31 p.m. UTC | #5
On Sat, Apr 30, 2022 at 11:52 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > Good Morning,
> >
> > After testing various configurations I found what is actually
> > happening here. When libphy is built in but the phy drivers are
> > modules and not available in the initrd, the generic phy driver binds
> > here. This allows the phy to come up but it is not functional.
>
> What MAC are you using?

Specifically Motorcomm, but I've discovered it can happen with any of
the phy drivers with the right kconfig.

>
> Why is your interface being brought up by the initramfs? Are you using
> NFS root from within the initramfs?

This was discovered with embedded programming. It's common to have a
small initramfs, or forgo an initramfs altogether. Another cause is a
mismatch in kernel config where phylib is built in because of a
dependency, but the rest of the phy drivers are modular.
The key is:
- phylib is built in
- ethernet driver is built in
- the phy driver is a module
- modules aren't available at probe time (for any reason).

In this case phylib assumes there is no driver, when the vast majority
of phys now have device specific drivers. It seems this is an unsafe
assumption as this means there is now an implicit dependency of the
device specific phy drivers and phylib. It just so happens to work
simply because both broadcom and realtek, some of the more common
phys, have explicit dependencies elsewhere that cause them to be built
in as well.

>
> What normally happens is that the kernel loads, maybe with the MAC
> driver and phylib loading, as part of the initramfs. The other modules
> in the initramfs allow the root filesystem to be found, mounted, and
> pivoted into it. The MAC driver is then brought up by the initscripts,
> which causes phylib to request the needed PHY driver modules, it loads
> and all is good.
>
> If you are using NFS root, then the load of the PHY driver happens
> earlier, inside the initramfs. If this is your situation, maybe the
> correct fix is to teach the initramfs tools to include the PHY drivers
> when NFS root is being used?
>
>      Andrew

Andrew Lunn April 30, 2022, 4:48 p.m. UTC | #6
On Sat, Apr 30, 2022 at 12:31:27PM -0400, Peter Geis wrote:
> On Sat, Apr 30, 2022 at 11:52 AM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > Good Morning,
> > >
> > > After testing various configurations I found what is actually
> > > happening here. When libphy is built in but the phy drivers are
> > > modules and not available in the initrd, the generic phy driver binds
> > > here. This allows the phy to come up but it is not functional.
> >
> > What MAC are you using?
> 
> Specifically Motorcomm, but I've discovered it can happen with any of
> the phy drivers with the right kconfig.
> 
> >
> > Why is your interface being brought up by the initramfs? Are you using
> > NFS root from within the initramfs?
> 
> This was discovered with embedded programming. It's common to have a
> small initramfs, or forgo an initramfs altogether.

Yes, I do that all the time. But then it is up to me to ensure I have
all the code I need built into the kernel.

> Another cause is a
> mismatch in kernel config where phylib is built in because of a
> dependency, but the rest of the phy drivers are modular.
> The key is:
> - phylib is built in
> - ethernet driver is built in
> - the phy driver is a module
> - modules aren't available at probe time (for any reason).

This 'for any reason' is what I'm trying to get at. It is not the
kernel which builds the initramfs. It is not the kernel's problem if
the modules it needs are missing, it is my fault for not telling the
initramfs tools to include the modules needed to actually boot the
machine.

	 Andrew

Heiner Kallweit April 30, 2022, 8:19 p.m. UTC | #7
On 30.04.2022 18:31, Peter Geis wrote:
> On Sat, Apr 30, 2022 at 11:52 AM Andrew Lunn <andrew@lunn.ch> wrote:
>>
>>> Good Morning,
>>>
>>> After testing various configurations I found what is actually
>>> happening here. When libphy is built in but the phy drivers are
>>> modules and not available in the initrd, the generic phy driver binds
>>> here. This allows the phy to come up but it is not functional.
>>
>> What MAC are you using?
> 
> Specifically Motorcomm, but I've discovered it can happen with any of
> the phy drivers with the right kconfig.
> 
>>
>> Why is your interface being brought up by the initramfs? Are you using
>> NFS root from within the initramfs?
> 
> This was discovered with embedded programming. It's common to have a
> small initramfs, or forgo an initramfs altogether. Another cause is a
> mismatch in kernel config where phylib is built in because of a
> dependency, but the rest of the phy drivers are modular.
> The key is:
> - phylib is built in
> - ethernet driver is built in
> - the phy driver is a module
> - modules aren't available at probe time (for any reason).
> 
> In this case phylib assumes there is no driver, when the vast majority
> of phys now have device specific drivers. It seems this is an unsafe
> assumption as this means there is now an implicit dependency of the
> device specific phy drivers and phylib. It just so happens to work
> simply because both broadcom and realtek, some of the more common
> phys, have explicit dependencies elsewhere that cause them to be built
> in as well.
> 
Because you mention the realtek phy driver:
Users reported issues similar to yours when the r8169 MAC driver is built-in
(or the r8169 module is in the initramfs) but the realtek phy driver is not.
There's no direct code dependency between r8169 and the realtek phy driver,
therefore initramfs-creating tools sometimes failed to include the phy
driver in the initramfs automatically. To mitigate this, r8169 has the following:
MODULE_SOFTDEP("pre: realtek");
This isn't strictly needed but some initramfs-creating tools consider
such soft dependencies when checking what should be included in initramfs.
If some other MAC is used with a Realtek PHY, then you may still see the
described issue.
As Andrew wrote: Eventually it's a userspace responsibility to ensure that
all needed modules are included in initramfs.

>>
>> What normally happens is that the kernel loads, maybe with the MAC
>> driver and phylib loading, as part of the initramfs. The other modules
>> in the initramfs allow the root filesystem to be found, mounted, and
>> pivoted into it. The MAC driver is then brought up by the initscripts,
>> which causes phylib to request the needed PHY driver modules, it loads
>> and all is good.
>>
>> If you are using NFS root, then the load of the PHY driver happens
>> earlier, inside the initramfs. If this is your situation, maybe the
>> correct fix is to teach the initramfs tools to include the PHY drivers
>> when NFS root is being used?
>>
>>      Andrew

Russell King (Oracle) May 1, 2022, 6:22 a.m. UTC | #8
On Sat, Apr 30, 2022 at 12:31:27PM -0400, Peter Geis wrote:
> On Sat, Apr 30, 2022 at 11:52 AM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > Good Morning,
> > >
> > > After testing various configurations I found what is actually
> > > happening here. When libphy is built in but the phy drivers are
> > > modules and not available in the initrd, the generic phy driver binds
> > > here. This allows the phy to come up but it is not functional.
> >
> > What MAC are you using?
> 
> Specifically Motorcomm, but I've discovered it can happen with any of
> the phy drivers with the right kconfig.
> 
> >
> > Why is your interface being brought up by the initramfs? Are you using
> > NFS root from within the initramfs?
> 
> This was discovered with embedded programming. It's common to have a
> small initramfs, or forgo an initramfs altogether.

If you're talking about embedded, it makes more sense to have the PHY
drivers built-in. They will take up less text and data space that way.
Typically, PHY drivers have very small amounts of text and data, and
both of these end up being rounded up to a page size when loaded as a
module.
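
The rounding cost can be put into numbers with a short sketch. The page size and the section sizes in the usage note are assumptions chosen for illustration, not measurements of any real PHY driver, and `module_footprint()` is a hypothetical helper that only models "text and data are each rounded up to a whole page".

```c
#include <stddef.h>

/* Assumed page size; many systems use 4 KiB, but this is configurable. */
#define MODEL_PAGE_SIZE 4096u

/* Round a byte count up to the next multiple of the page size. */
size_t round_up_to_page(size_t bytes)
{
	return (bytes + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE * MODEL_PAGE_SIZE;
}

/* Memory used when text and data are each rounded up separately. */
size_t module_footprint(size_t text, size_t data)
{
	return round_up_to_page(text) + round_up_to_page(data);
}
```

With a hypothetical driver of 2048 bytes of text and 512 bytes of data, `module_footprint(2048, 512)` gives 8192 bytes, against 2560 bytes if the same code and data were simply linked into the kernel image.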

> Another cause is a
> mismatch in kernel config where phylib is built in because of a
> dependency, but the rest of the phy drivers are modular.
> The key is:
> - phylib is built in
> - ethernet driver is built in
> - the phy driver is a module
> - modules aren't available at probe time (for any reason).

This is why many ethernet drivers connect with their PHY in their
.ndo_open method, rather than at probe time.
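
Why attaching in .ndo_open helps can be sketched with a toy boot timeline. This is a hypothetical userspace model, not driver code: all names are illustrative, and it assumes the PHY driver module becomes loadable somewhere between MAC probe and the interface being opened.

```c
/* Hypothetical model of when a MAC driver connects to its PHY. */
enum attach_point { ATTACH_AT_PROBE, ATTACH_AT_OPEN };
enum bound_driver { BOUND_NONE, BOUND_GENERIC, BOUND_SPECIFIC };

struct boot_model {
	int module_loaded;       /* is the PHY driver module available yet? */
	enum bound_driver bound; /* which driver the PHY ended up with */
};

/* Attach uses the specific driver if present, else the generic one. */
static void attach(struct boot_model *b)
{
	if (b->bound != BOUND_NONE)
		return;
	b->bound = b->module_loaded ? BOUND_SPECIFIC : BOUND_GENERIC;
}

/* Timeline: MAC probe, modules become loadable, interface is opened. */
enum bound_driver simulate_boot(enum attach_point when)
{
	struct boot_model b = { 0, BOUND_NONE };

	if (when == ATTACH_AT_PROBE)
		attach(&b);      /* no modules reachable yet */

	b.module_loaded = 1;     /* assumed: root fs mounted, module loaded */

	if (when == ATTACH_AT_OPEN)
		attach(&b);      /* specific driver is available by now */

	return b.bound;
}
```

In this model `simulate_boot(ATTACH_AT_PROBE)` ends with the generic driver, while `simulate_boot(ATTACH_AT_OPEN)` ends with the device specific one.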

Patch

diff --git a/drivers/net/phy/motorcomm.c b/drivers/net/phy/motorcomm.c
index 7e6ac2c5e27e..1e3a4eed39bc 100644
--- a/drivers/net/phy/motorcomm.c
+++ b/drivers/net/phy/motorcomm.c
@@ -131,7 +131,7 @@  MODULE_LICENSE("GPL");
 
 static const struct mdio_device_id __maybe_unused motorcomm_tbl[] = {
 	{ PHY_ID_MATCH_EXACT(PHY_ID_YT8511) },
-	{ /* sentinal */ }
+	{}
 };
 
 MODULE_DEVICE_TABLE(mdio, motorcomm_tbl);