Message ID | 20240925160026.84091-1-marex@denx.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | soc: imx8m: Probe the SoC driver late | expand |
On Wed, Sep 25, 2024, at 16:00, Marek Vasut wrote: > With driver_async_probe=* on kernel command line, the following trace is > produced because on i.MX8M Plus hardware because the soc-imx8m.c driver > calls of_clk_get_by_name() which returns -EPROBE_DEFER because the clock > driver is not yet probed. This was not detected during regular testing > without driver_async_probe. > > Attempt to fix it by probing the SoC driver late, but I don't think that > is the correct approach here. I think the correct fix would be to propagate the -EPROBE_DEFER and return that from imx8_soc_init(), so it gets retried again after the clock driver. Arnd
On 9/25/24 6:04 PM, Arnd Bergmann wrote: > On Wed, Sep 25, 2024, at 16:00, Marek Vasut wrote: >> With driver_async_probe=* on kernel command line, the following trace is >> produced because on i.MX8M Plus hardware because the soc-imx8m.c driver >> calls of_clk_get_by_name() which returns -EPROBE_DEFER because the clock >> driver is not yet probed. This was not detected during regular testing >> without driver_async_probe. >> >> Attempt to fix it by probing the SoC driver late, but I don't think that >> is the correct approach here. > > I think the correct fix would be to propagate the -EPROBE_DEFER > and return that from imx8_soc_init(), so it gets retried again > after the clock driver. I already tried that, but if I return -EPROBE_DEFER from device_initcall, it doesn't get retriggered . I suspect EPROBE_DEFER works only for proper drivers ?
On Wed, Sep 25, 2024, at 16:09, Marek Vasut wrote: > On 9/25/24 6:04 PM, Arnd Bergmann wrote: >> On Wed, Sep 25, 2024, at 16:00, Marek Vasut wrote: >>> With driver_async_probe=* on kernel command line, the following trace is >>> produced because on i.MX8M Plus hardware because the soc-imx8m.c driver >>> calls of_clk_get_by_name() which returns -EPROBE_DEFER because the clock >>> driver is not yet probed. This was not detected during regular testing >>> without driver_async_probe. >>> >>> Attempt to fix it by probing the SoC driver late, but I don't think that >>> is the correct approach here. >> >> I think the correct fix would be to propagate the -EPROBE_DEFER >> and return that from imx8_soc_init(), so it gets retried again >> after the clock driver. > I already tried that, but if I return -EPROBE_DEFER from > device_initcall, it doesn't get retriggered . I suspect EPROBE_DEFER > works only for proper drivers ? Right, of course. And unfortunately it can't just register to the fsl,imx8mm-anatop/fsl,imx8mm-ocotp/... nodes because they all have a driver already. On the other hand, making it a late_initcall() defeats the purpose of the driver because then it can't be used by other drivers with soc_device_match(), resulting in incorrect behavior when another driver relies on this to enable a chip revision specific workaround. Arnd
On 9/25/24 6:31 PM, Arnd Bergmann wrote: > On Wed, Sep 25, 2024, at 16:09, Marek Vasut wrote: >> On 9/25/24 6:04 PM, Arnd Bergmann wrote: >>> On Wed, Sep 25, 2024, at 16:00, Marek Vasut wrote: >>>> With driver_async_probe=* on kernel command line, the following trace is >>>> produced because on i.MX8M Plus hardware because the soc-imx8m.c driver >>>> calls of_clk_get_by_name() which returns -EPROBE_DEFER because the clock >>>> driver is not yet probed. This was not detected during regular testing >>>> without driver_async_probe. >>>> >>>> Attempt to fix it by probing the SoC driver late, but I don't think that >>>> is the correct approach here. >>> >>> I think the correct fix would be to propagate the -EPROBE_DEFER >>> and return that from imx8_soc_init(), so it gets retried again >>> after the clock driver. >> I already tried that, but if I return -EPROBE_DEFER from >> device_initcall, it doesn't get retriggered . I suspect EPROBE_DEFER >> works only for proper drivers ? > > Right, of course. And unfortunately it can't just register to > the fsl,imx8mm-anatop/fsl,imx8mm-ocotp/... nodes because they > all have a driver already. > > On the other hand, making it a late_initcall() defeats the > purpose of the driver because then it can't be used by other > drivers with soc_device_match(), resulting in incorrect > behavior when another driver relies on this to enable > a chip revision specific workaround. I know, I am unsure how to proceed with this. Convert this to platform_driver or some such and probe it early maybe ?
On Wed, Sep 25, 2024 at 10:07 AM Marek Vasut <marex@denx.de> wrote: > > On 9/25/24 6:31 PM, Arnd Bergmann wrote: > > On Wed, Sep 25, 2024, at 16:09, Marek Vasut wrote: > >> On 9/25/24 6:04 PM, Arnd Bergmann wrote: > >>> On Wed, Sep 25, 2024, at 16:00, Marek Vasut wrote: > >>>> With driver_async_probe=* on kernel command line, the following trace is > >>>> produced because on i.MX8M Plus hardware because the soc-imx8m.c driver > >>>> calls of_clk_get_by_name() which returns -EPROBE_DEFER because the clock > >>>> driver is not yet probed. This was not detected during regular testing > >>>> without driver_async_probe. > >>>> > >>>> Attempt to fix it by probing the SoC driver late, but I don't think that > >>>> is the correct approach here. > >>> > >>> I think the correct fix would be to propagate the -EPROBE_DEFER > >>> and return that from imx8_soc_init(), so it gets retried again > >>> after the clock driver. > >> I already tried that, but if I return -EPROBE_DEFER from > >> device_initcall, it doesn't get retriggered . I suspect EPROBE_DEFER > >> works only for proper drivers ? > > > > Right, of course. And unfortunately it can't just register to > > the fsl,imx8mm-anatop/fsl,imx8mm-ocotp/... nodes because they > > all have a driver already. Arnd, Can't we change this to add a platform device and a platform driver in the initcall? And then the driver can return -EPROBE_DEFER if it can't get the clock yet? > > > > On the other hand, making it a late_initcall() defeats the > > purpose of the driver because then it can't be used by other > > drivers with soc_device_match(), resulting in incorrect > > behavior when another driver relies on this to enable > > a chip revision specific workaround. We could have soc_device_match() return -EPROBE_DEFER if no soc device has been registered yet. For cases where it's already working without any changes, we shouldn't see any new -EPROBE_DEFER return values. But for cases like what Marek is trying to do, it should work properly. He might have to fix bad driver code where they remap the error instead of returning it as is. On a tangential note, the soc framework seems to be yet another framework violating the bus vs class thing. If it's a bus, then you need to have a probe. Otherwise, just make it a class. Might be too much to fix at this point, but might be good to keep this in mind if you plan to write more frameworks or redo soc framework at some point :) See Slide 20: https://lpc.events/event/18/contributions/1734/ > I know, I am unsure how to proceed with this. Convert this to > platform_driver or some such and probe it early maybe ? Marek, Thanks for trying out drive_async_probe=* and trying to fix boards/drivers. The issue you are facing and the proper way to handle it was covered in my talk at LPC 2024 in Slide 18: https://lpc.events/event/18/contributions/1734/ All the benefits of fw_devlink are only present for drivers that use a device-driver model. And yes, in this case we should convert this driver into a platform device/driver if it's possible. Thanks, Saravana
On Wed, Sep 25, 2024, at 18:48, Saravana Kannan wrote: > On Wed, Sep 25, 2024 at 10:07 AM Marek Vasut <marex@denx.de> wrote: >> > >> > Right, of course. And unfortunately it can't just register to >> > the fsl,imx8mm-anatop/fsl,imx8mm-ocotp/... nodes because they >> > all have a driver already. > > Can't we change this to add a platform device and a platform driver in > the initcall? And then the driver can return -EPROBE_DEFER if it can't > get the clock yet? Yes, good idea. So the initcall would still use of_match_node() to see if wants to be loaded and then either bail early or call platform_create_bundle() to register the driver and the device. >> > On the other hand, making it a late_initcall() defeats the >> > purpose of the driver because then it can't be used by other >> > drivers with soc_device_match(), resulting in incorrect >> > behavior when another driver relies on this to enable >> > a chip revision specific workaround. > > We could have soc_device_match() return -EPROBE_DEFER if no soc device > has been registered yet. > > For cases where it's already working without any changes, we shouldn't > see any new -EPROBE_DEFER return values. But for cases like what Marek > is trying to do, it should work properly. He might have to fix bad > driver code where they remap the error instead of returning it as is. Right. > On a tangential note, the soc framework seems to be yet another > framework violating the bus vs class thing. If it's a bus, then you > need to have a probe. Otherwise, just make it a class. Might be too > much to fix at this point, but might be good to keep this in mind if > you plan to write more frameworks or redo soc framework at some point > :) > > See Slide 20: > https://lpc.events/event/18/contributions/1734/ Very useful, I don't think I've seen this explained like this in the past. It's probably not easy to change now since I'm sure there is existing userspace looking at /sys/bus/soc, but I can at least make sure I'll follow these when reviewing new bus_type or class submissions. Arnd
On Wed, Sep 25, 2024 at 1:00 PM Arnd Bergmann <arnd@arndb.de> wrote: > > On Wed, Sep 25, 2024, at 18:48, Saravana Kannan wrote: > > On Wed, Sep 25, 2024 at 10:07 AM Marek Vasut <marex@denx.de> wrote: > >> > > >> > Right, of course. And unfortunately it can't just register to > >> > the fsl,imx8mm-anatop/fsl,imx8mm-ocotp/... nodes because they > >> > all have a driver already. > > > > Can't we change this to add a platform device and a platform driver in > > the initcall? And then the driver can return -EPROBE_DEFER if it can't > > get the clock yet? > > Yes, good idea. So the initcall would still use of_match_node() > to see if wants to be loaded and then either bail early or > call platform_create_bundle() to register the driver and the > device. > > >> > On the other hand, making it a late_initcall() defeats the > >> > purpose of the driver because then it can't be used by other > >> > drivers with soc_device_match(), resulting in incorrect > >> > behavior when another driver relies on this to enable > >> > a chip revision specific workaround. > > > > We could have soc_device_match() return -EPROBE_DEFER if no soc device > > has been registered yet. > > > > For cases where it's already working without any changes, we shouldn't > > see any new -EPROBE_DEFER return values. But for cases like what Marek > > is trying to do, it should work properly. He might have to fix bad > > driver code where they remap the error instead of returning it as is. > > Right. Sweet! Marek, now you know what to do :) > > > On a tangential note, the soc framework seems to be yet another > > framework violating the bus vs class thing. If it's a bus, then you > > need to have a probe. Otherwise, just make it a class. Might be too > > much to fix at this point, but might be good to keep this in mind if > > you plan to write more frameworks or redo soc framework at some point > > :) > > > > See Slide 20: > > https://lpc.events/event/18/contributions/1734/ > > Very useful, I don't think I've seen this explained like this > in the past. It's probably not easy to change now since I'm > sure there is existing userspace looking at /sys/bus/soc, but > I can at least make sure I'll follow these when reviewing new > bus_type or class submissions. Great to hear it was useful! -Saravana
On 9/25/24 8:48 PM, Saravana Kannan wrote: Hi, >>> On the other hand, making it a late_initcall() defeats the >>> purpose of the driver because then it can't be used by other >>> drivers with soc_device_match(), resulting in incorrect >>> behavior when another driver relies on this to enable >>> a chip revision specific workaround. > > We could have soc_device_match() return -EPROBE_DEFER if no soc device > has been registered yet. > > For cases where it's already working without any changes, we shouldn't > see any new -EPROBE_DEFER return values. But for cases like what Marek > is trying to do, it should work properly. He might have to fix bad > driver code where they remap the error instead of returning it as is. I sent out V2. > On a tangential note, the soc framework seems to be yet another > framework violating the bus vs class thing. If it's a bus, then you > need to have a probe. Otherwise, just make it a class. Might be too > much to fix at this point, but might be good to keep this in mind if > you plan to write more frameworks or redo soc framework at some point > :) > > See Slide 20: > https://lpc.events/event/18/contributions/1734/ > >> I know, I am unsure how to proceed with this. Convert this to >> platform_driver or some such and probe it early maybe ? > > Marek, > > Thanks for trying out drive_async_probe=* and trying to fix boards/drivers. That was one of the most useful things I picked at LPC this year, thanks. > The issue you are facing and the proper way to handle it was covered > in my talk at LPC 2024 in Slide 18: > https://lpc.events/event/18/contributions/1734/ > > All the benefits of fw_devlink are only present for drivers that use a > device-driver model. And yes, in this case we should convert this > driver into a platform device/driver if it's possible. Done in V2, thanks.
diff --git a/drivers/soc/imx/soc-imx8m.c b/drivers/soc/imx/soc-imx8m.c index fe111bae38c8e..04d6ab1073b25 100644 --- a/drivers/soc/imx/soc-imx8m.c +++ b/drivers/soc/imx/soc-imx8m.c @@ -251,6 +251,6 @@ static int __init imx8_soc_init(void) kfree(soc_dev_attr); return ret; } -device_initcall(imx8_soc_init); +late_initcall(imx8_soc_init); MODULE_DESCRIPTION("NXP i.MX8M SoC driver"); MODULE_LICENSE("GPL");
With driver_async_probe=* on kernel command line, the following trace is produced because on i.MX8M Plus hardware because the soc-imx8m.c driver calls of_clk_get_by_name() which returns -EPROBE_DEFER because the clock driver is not yet probed. This was not detected during regular testing without driver_async_probe. Attempt to fix it by probing the SoC driver late, but I don't think that is the correct approach here. " ------------[ cut here ]------------ WARNING: CPU: 1 PID: 1 at drivers/soc/imx/soc-imx8m.c:115 imx8mm_soc_revision+0xdc/0x180 CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-next-20240924-00002-g2062bb554dea #603 Hardware name: DH electronics i.MX8M Plus DHCOM Premium Developer Kit (3) (DT) pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : imx8mm_soc_revision+0xdc/0x180 lr : imx8mm_soc_revision+0xd0/0x180 sp : ffff8000821fbcc0 x29: ffff8000821fbce0 x28: 0000000000000000 x27: ffff800081810120 x26: ffff8000818a9970 x25: 0000000000000006 x24: 0000000000824311 x23: ffff8000817f42c8 x22: ffff0000df8be210 x21: fffffffffffffdfb x20: ffff800082780000 x19: 0000000000000001 x18: ffffffffffffffff x17: ffff800081fff418 x16: ffff8000823e1000 x15: ffff0000c03b65e8 x14: ffff0000c00051b0 x13: ffff800082790000 x12: 0000000000000801 x11: ffff80008278ffff x10: ffff80008209d3a6 x9 : ffff80008062e95c x8 : ffff8000821fb9a0 x7 : 0000000000000000 x6 : 00000000000080e3 x5 : ffff0000df8c03d8 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000000 x1 : fffffffffffffdfb x0 : fffffffffffffdfb Call trace: imx8mm_soc_revision+0xdc/0x180 imx8_soc_init+0xb0/0x1e0 do_one_initcall+0x94/0x1a8 kernel_init_freeable+0x240/0x2a8 kernel_init+0x28/0x140 ret_from_fork+0x10/0x20 ---[ end trace 0000000000000000 ]--- SoC: i.MX8MP revision 1.1 " Signed-off-by: Marek Vasut <marex@denx.de> --- Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Fabio Estevam <festevam@gmail.com> Cc: Jeff Johnson <quic_jjohnson@quicinc.com> Cc: Neil Armstrong <neil.armstrong@linaro.org> Cc: Pengutronix Kernel Team <kernel@pengutronix.de> Cc: Saravana Kannan <saravanak@google.com> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Shawn Guo <shawnguo@kernel.org> Cc: imx@lists.linux.dev Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org --- drivers/soc/imx/soc-imx8m.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)