Message ID | 20151020154656.GY32532@n2100.arm.linux.org.uk (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On 10/20/2015 8:46 AM, Russell King - ARM Linux wrote: > On Mon, Oct 19, 2015 at 06:21:40PM +0200, Geert Uytterhoeven wrote: >> Hi Russell, >> >> On Mon, Oct 19, 2015 at 5:35 PM, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >>>>> What you can do is print those devices which have failed to probe at >>>>> late_initcall() time - possibly augmenting that with reports from >>>>> subsystems showing what resources are not available, but that's only >>>>> a guide, because of the "it might or might not be in a kernel module" >>>>> problem. >>>> >>>> Well, adding those reports would give you a changelog similar to the >>>> one in this series... >>> >>> I'm not sure about that, because what I was thinking of is adding >>> a flag which would be set at late_initcall() time prior to running >>> a final round of deferred device probing. >> >> Which round is the final round? >> That's the one which didn't manage to bind any new devices to drivers, >> which is something you only know _after_ the round has been run. >> >> So I think we need one extra round to handle this. >> >>> This flag would then be used in a deferred_warn() printk function >>> which would normally be silent, but when this flag is set, it would >>> print the reason for the deferral - and this would replace (or be >>> added) to the subsystems and drivers which return -EPROBE_DEFER. >>> >>> That has the effect of hiding all the deferrals up until just before >>> launching into userspace, which should then acomplish two things - >>> firstly, getting rid of the rather useless deferred messages up to >>> that point, and secondly printing the reason why the remaining >>> deferrals are happening. >>> >>> That should be a small number of new lines plus a one-line change >>> in subsystems and drivers. >> >> Apart from the extra round we probably can't get rid of, that sounds OK to me. > > Something like this. I haven't put a lot of effort into it to change all > the places which return an -EPROBE_DEFER, and it also looks like we need > some helpers to report when we have only an device_node (or should that > be fwnode?) See the commented out of_warn_deferred() in > drivers/gpio/gpiolib-of.c. Adding this stuff in the subsystems searching > for resources should make debugging why things are getting deferred easier. > > We could make driver_deferred_probe_report something that can be > deactivated again after the last deferred probe run, and provide the > user with a knob that they can turn it back on again. > > I've tried this out on two of my platforms, including forcing > driver_deferred_probe_report to be enabled, and I get exactly one > deferred probe, so not a particularly good test. > > The patch won't apply as-is to mainline for all files; it's based on my > tree which has some 360 additional patches (which seems to be about > normal for my tree now.) I like the concept (I have been thinking along similar lines lately). But I think this might make the console messages more confusing than they are now. The problem is that debug, warn, and error messages come from a somewhat random set of locations at the moment. Some come from the driver probe routines and some come from the subsystems that the probe routines call. So the patch is suppressing some messages, but not others. One thing that seemed pretty obvious from the patches is that the current probe routines are somewhat inconsistent in terms of messages, and that there is room for a set of best practices for messaging. That is on my long term wish list, but I'm not sure I'll ever chase after those windmills. A couple of specific comments below. > > drivers/base/dd.c | 29 +++++++++++++++++++++++++++++ > drivers/base/power/domain.c | 7 +++++-- > drivers/clk/clkdev.c | 9 ++++++++- > drivers/gpio/gpiolib-of.c | 5 +++++ > drivers/gpu/drm/bridge/dw_hdmi.c | 2 +- > drivers/gpu/drm/exynos/exynos_drm_dsi.c | 2 +- > drivers/gpu/drm/imx/imx-ldb.c | 5 +++-- > drivers/gpu/drm/msm/dsi/dsi.c | 2 +- > drivers/gpu/drm/msm/msm_drv.c | 3 ++- > drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 3 ++- > drivers/of/irq.c | 5 ++++- > drivers/pci/host/pci-mvebu.c | 1 + > drivers/pinctrl/core.c | 5 +++-- > drivers/pinctrl/devicetree.c | 4 ++-- > drivers/regulator/core.c | 5 +++-- > include/linux/device.h | 1 + > 16 files changed, 71 insertions(+), 17 deletions(-) > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c > index be0eb4639128..bb12224f2901 100644 > --- a/drivers/base/dd.c > +++ b/drivers/base/dd.c > @@ -129,7 +129,29 @@ void driver_deferred_probe_del(struct device *dev) > mutex_unlock(&deferred_probe_mutex); > } > > +static bool driver_deferred_probe_report; > + > +/** > + * dev_warn_deferred() - report why a probe has been deferred > + */ > +void dev_warn_deferred(struct device *dev, const char *fmt, ...) > +{ > + if (driver_deferred_probe_report) { > + struct va_format vaf; > + va_list ap; > + > + va_start(ap, fmt); > + vaf.fmt = fmt; > + vaf.va = ≈ > + > + dev_warn(dev, "deferring probe: %pV", &vaf); > + va_end(ap); > + } > +} > +EXPORT_SYMBOL_GPL(dev_warn_deferred); The places where dev_warn_deferred() replaces dev_dbg(), we lose the ability to turn on debugging and observe the driver reporting the specific reason the deferral is occurring. So it would be useful to add an "else dev_dbg()" in dev_warn_deferred() to retain that capability. > + > static bool driver_deferred_probe_enable = false; > + > /** > * driver_deferred_probe_trigger() - Kick off re-probing deferred devices > * > @@ -188,6 +210,13 @@ static int deferred_probe_initcall(void) > driver_deferred_probe_trigger(); Couldn't you put the "driver_deferred_probe_report = true" here? And then not add another round of probes. > /* Sort as many dependencies as possible before exiting initcalls */ > flush_workqueue(deferred_wq); > + > + /* Now one final round, reporting any devices that remain deferred */ > + driver_deferred_probe_report = true; > + driver_deferred_probe_trigger(); > + /* Sort as many dependencies as possible before exiting initcalls */ > + flush_workqueue(deferred_wq); > + > return 0; > } < snip > -Frank -- To unsubscribe from this list: send the line "unsubscribe dmaengine" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 20, 2015 at 08:58:19PM -0700, Frank Rowand wrote: > On 10/20/2015 8:46 AM, Russell King - ARM Linux wrote: > > On Mon, Oct 19, 2015 at 06:21:40PM +0200, Geert Uytterhoeven wrote: > >> Hi Russell, > >> > >> On Mon, Oct 19, 2015 at 5:35 PM, Russell King - ARM Linux > >> <linux@arm.linux.org.uk> wrote: > >>>>> What you can do is print those devices which have failed to probe at > >>>>> late_initcall() time - possibly augmenting that with reports from > >>>>> subsystems showing what resources are not available, but that's only > >>>>> a guide, because of the "it might or might not be in a kernel module" > >>>>> problem. > >>>> > >>>> Well, adding those reports would give you a changelog similar to the > >>>> one in this series... > >>> > >>> I'm not sure about that, because what I was thinking of is adding > >>> a flag which would be set at late_initcall() time prior to running > >>> a final round of deferred device probing. > >> > >> Which round is the final round? > >> That's the one which didn't manage to bind any new devices to drivers, > >> which is something you only know _after_ the round has been run. > >> > >> So I think we need one extra round to handle this. > >> > >>> This flag would then be used in a deferred_warn() printk function > >>> which would normally be silent, but when this flag is set, it would > >>> print the reason for the deferral - and this would replace (or be > >>> added) to the subsystems and drivers which return -EPROBE_DEFER. > >>> > >>> That has the effect of hiding all the deferrals up until just before > >>> launching into userspace, which should then acomplish two things - > >>> firstly, getting rid of the rather useless deferred messages up to > >>> that point, and secondly printing the reason why the remaining > >>> deferrals are happening. > >>> > >>> That should be a small number of new lines plus a one-line change > >>> in subsystems and drivers. > >> > >> Apart from the extra round we probably can't get rid of, that sounds OK to me. > > > > Something like this. I haven't put a lot of effort into it to change all > > the places which return an -EPROBE_DEFER, and it also looks like we need > > some helpers to report when we have only an device_node (or should that > > be fwnode?) See the commented out of_warn_deferred() in > > drivers/gpio/gpiolib-of.c. Adding this stuff in the subsystems searching > > for resources should make debugging why things are getting deferred easier. > > > > We could make driver_deferred_probe_report something that can be > > deactivated again after the last deferred probe run, and provide the > > user with a knob that they can turn it back on again. > > > > I've tried this out on two of my platforms, including forcing > > driver_deferred_probe_report to be enabled, and I get exactly one > > deferred probe, so not a particularly good test. > > > > The patch won't apply as-is to mainline for all files; it's based on my > > tree which has some 360 additional patches (which seems to be about > > normal for my tree now.) > > I like the concept (I have been thinking along similar lines lately). > But I think this might make the console messages more confusing than > they are now. If messages end up being given from the subsystem rather than the driver, surely they become more consistent? > The problem is that debug, warn, and error messages > come from a somewhat random set of locations at the moment. Some > come from the driver probe routines and some come from the subsystems > that the probe routines call. So the patch is suppressing some > messages, but not others. The patch is not complete (read the description above). > > +void dev_warn_deferred(struct device *dev, const char *fmt, ...) > > +{ > > + if (driver_deferred_probe_report) { > > + struct va_format vaf; > > + va_list ap; > > + > > + va_start(ap, fmt); > > + vaf.fmt = fmt; > > + vaf.va = ≈ > > + > > + dev_warn(dev, "deferring probe: %pV", &vaf); > > + va_end(ap); > > + } > > +} > > +EXPORT_SYMBOL_GPL(dev_warn_deferred); > > The places where dev_warn_deferred() replaces dev_dbg(), we lose the > ability to turn on debugging and observe the driver reporting the > specific reason the deferral is occurring. So it would be useful to > add an "else dev_dbg()" in dev_warn_deferred() to retain that capability. That's a possibility. > > > + > > static bool driver_deferred_probe_enable = false; > > + > > /** > > * driver_deferred_probe_trigger() - Kick off re-probing deferred devices > > * > > @@ -188,6 +210,13 @@ static int deferred_probe_initcall(void) > > driver_deferred_probe_trigger(); > > Couldn't you put the "driver_deferred_probe_report = true" here? And then > not add another round of probes. The idea is not to report anything for drivers that were deferred during the normal bootup. The above is part of the normal bootup, and the deferred activity should not be warned about. If we have any devices still deferring after _this_ round, that must indicate that some resource they want is not available, and that should be warned about. Of course, modules can defer too - and I made some suggestions in my waffle above the patch about that.
On 10/21/2015 1:18 AM, Russell King - ARM Linux wrote: > On Tue, Oct 20, 2015 at 08:58:19PM -0700, Frank Rowand wrote: >> On 10/20/2015 8:46 AM, Russell King - ARM Linux wrote: < snip > >>> + >>> static bool driver_deferred_probe_enable = false; >>> + >>> /** >>> * driver_deferred_probe_trigger() - Kick off re-probing deferred devices >>> * >>> @@ -188,6 +210,13 @@ static int deferred_probe_initcall(void) >>> driver_deferred_probe_trigger(); >> >> Couldn't you put the "driver_deferred_probe_report = true" here? And then >> not add another round of probes. > > The idea is not to report anything for drivers that were deferred > during the normal bootup. The above is part of the normal bootup, > and the deferred activity should not be warned about. The above is currently the last point for probe to succeed or defer (until possibly, as you mentioned, module loading resolves the defer). If a probe defers above, it will defer again below. The set of defers should be exactly the same above and below. > > If we have any devices still deferring after _this_ round, that must > indicate that some resource they want is not available, and that > should be warned about. > > Of course, modules can defer too - and I made some suggestions in my > waffle above the patch about that. > < adding back trimmed, for fuller context > >>> /* Sort as many dependencies as possible before exiting initcalls */ >>> flush_workqueue(deferred_wq); >>> + >>> + /* Now one final round, reporting any devices that remain deferred */ >>> + driver_deferred_probe_report = true; >>> + driver_deferred_probe_trigger(); >>> + /* Sort as many dependencies as possible before exiting initcalls */ >>> + flush_workqueue(deferred_wq); >>> + >>> return 0; >>> } -- To unsubscribe from this list: send the line "unsubscribe dmaengine" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/21/2015 06:36 PM, Frank Rowand wrote: > On 10/21/2015 1:18 AM, Russell King - ARM Linux wrote: >> On Tue, Oct 20, 2015 at 08:58:19PM -0700, Frank Rowand wrote: >>> On 10/20/2015 8:46 AM, Russell King - ARM Linux wrote: > > < snip > > >>>> + >>>> static bool driver_deferred_probe_enable = false; >>>> + >>>> /** >>>> * driver_deferred_probe_trigger() - Kick off re-probing deferred devices >>>> * >>>> @@ -188,6 +210,13 @@ static int deferred_probe_initcall(void) >>>> driver_deferred_probe_trigger(); >>> >>> Couldn't you put the "driver_deferred_probe_report = true" here? And then >>> not add another round of probes. >> >> The idea is not to report anything for drivers that were deferred >> during the normal bootup. The above is part of the normal bootup, >> and the deferred activity should not be warned about. > > The above is currently the last point for probe to succeed or defer > (until possibly, as you mentioned, module loading resolves the defer). > If a probe defers above, it will defer again below. The set of defers > should be exactly the same above and below. > Unfortunately this is not "the last point for probe to succeed or defer". There are still a bunch of drivers in Kernel which will be probed at late_initcall() level. (like ./drivers/net/ethernet/ti/cpsw.c => late_initcall(cpsw_init); Yes - they probably need to be updated to use module_init(), but that's what we have now). Those drivers will re-trigger deferred device probing if their probe succeeded. As result, it is impossible to say when will it happen the "final round of deferred device probing" :( and final list of drivers which was "deferred forever" will be know only when kernel exits to User space ("deferred forever" - before loading modules). May be, we also can consider adding debug_fs entry which can be used to display actual state of deferred_probe_pending_list? >> >> If we have any devices still deferring after _this_ round, that must >> indicate that some resource they want is not available, and that >> should be warned about. >> >> Of course, modules can defer too - and I made some suggestions in my >> waffle above the patch about that. >> > > < adding back trimmed, for fuller context > > >>>> /* Sort as many dependencies as possible before exiting initcalls */ >>>> flush_workqueue(deferred_wq); >>>> + >>>> + /* Now one final round, reporting any devices that remain deferred */ >>>> + driver_deferred_probe_report = true; >>>> + driver_deferred_probe_trigger(); >>>> + /* Sort as many dependencies as possible before exiting initcalls */ >>>> + flush_workqueue(deferred_wq); >>>> + >>>> return 0; >>>> }
On Wed, Oct 21, 2015 at 07:55:29PM +0300, Grygorii Strashko wrote: > On 10/21/2015 06:36 PM, Frank Rowand wrote: > > The above is currently the last point for probe to succeed or defer > > (until possibly, as you mentioned, module loading resolves the defer). > > If a probe defers above, it will defer again below. The set of defers > > should be exactly the same above and below. > > > > Unfortunately this is not "the last point for probe to succeed or defer". Of course it isn't. Being pedantic, there's actually no such thing, because the point that the kernel as finished booting can never actually be determined with things like modules being present. That's something I've acknowledged from the start of this. > There are still a bunch of drivers in Kernel which will be probed at late_initcall() level. > (like ./drivers/net/ethernet/ti/cpsw.c => late_initcall(cpsw_init); > Yes - they probably need to be updated to use module_init(), but that's what > we have now). Those drivers will re-trigger deferred device probing if their > probe succeeded. Maybe this particular late_initcall() which triggers off the deferred probing should be moved to its own really_late_initcall() which happens as the very last thing - I think this is intended to run after everything else has had a chance to probe once. > As result, it is impossible to say when will it happen the > "final round of deferred device probing" :( and final list of drivers > which was "deferred forever" will be know only when kernel exits to > User space ("deferred forever" - before loading modules). > > May be, we also can consider adding debug_fs entry which can be used to > display actual state of deferred_probe_pending_list? There are complaints in this thread about the existing deferred probing implementation being hard to debug - where it's known that a device has deferred, but it's not known why that happened. That would be solved by my proposal, as this final round of probing before entering userspace after _all_ normal device probes have been attempted once and then we've tried to satisfy the deferred probe (okay, that's what it's _supposed_ to be - and as it takes three lines to write it, you'll excuse me if I just use the abbreviated "final round of deferred probe" which is much shorter - but remember that the long version is what I actually mean) would produce a list of not only the devices that failed to probe, but also the cause of the deferred probes. My proposal would ensure that subsystems are happier to add these prints, because in the normal scenario where we have deferred probing, we're not littering the console log with lots of useless failure messages which make people stop and think "now did device X probe?" It also means scripts in our boot farms can more effectively analyse the log and determine whether the boot was actually successful and contained no errors. Merely printing the list of devices which have been deferred is next to useless. The next question will always be "why did device X defer?" and if that can't be answered, it means people having to spend a long time adding lots of printks to the kernel at lots of -EPROBE_DEFER returning sites or in the relevant drivers, tracing through the code back towards the -EPROBE_DEFER sites to try and track it down.
On 10/21/2015 9:55 AM, Grygorii Strashko wrote: > On 10/21/2015 06:36 PM, Frank Rowand wrote: >> On 10/21/2015 1:18 AM, Russell King - ARM Linux wrote: >>> On Tue, Oct 20, 2015 at 08:58:19PM -0700, Frank Rowand wrote: >>>> On 10/20/2015 8:46 AM, Russell King - ARM Linux wrote: >> >> < snip > >> >>>>> + >>>>> static bool driver_deferred_probe_enable = false; >>>>> + >>>>> /** >>>>> * driver_deferred_probe_trigger() - Kick off re-probing deferred devices >>>>> * >>>>> @@ -188,6 +210,13 @@ static int deferred_probe_initcall(void) >>>>> driver_deferred_probe_trigger(); >>>> >>>> Couldn't you put the "driver_deferred_probe_report = true" here? And then >>>> not add another round of probes. >>> >>> The idea is not to report anything for drivers that were deferred >>> during the normal bootup. The above is part of the normal bootup, >>> and the deferred activity should not be warned about. >> >> The above is currently the last point for probe to succeed or defer >> (until possibly, as you mentioned, module loading resolves the defer). >> If a probe defers above, it will defer again below. The set of defers >> should be exactly the same above and below. >> > > Unfortunately this is not "the last point for probe to succeed or defer". > There are still a bunch of drivers in Kernel which will be probed at late_initcall() level. > (like ./drivers/net/ethernet/ti/cpsw.c => late_initcall(cpsw_init); Yes, cpsw_init() should _not_ be a late_initcall. This is yet another example of playing games with ordering probes that we have been trying to eliminate. Thanks for pointing out one of the resulting problems this causes for the deferred probe mechanism. > Yes - they probably need to be updated to use module_init(), but that's what > we have now). Those drivers will re-trigger deferred device probing if their > probe succeeded. Yes, if cpsw_init() leads to a successful probe, then deferred device probing will be re-triggered. I do not know if cpsw_init() will be called before or after deferred_probe_initcall(). The general initcall mechanism does not provide any ordering guarantees between the two functions because they are at the same initcall level. > > As result, it is impossible to say when will it happen the > "final round of deferred device probing" :( and final list of drivers which > was "deferred forever" will be know only when kernel exits to User space > ("deferred forever" - before loading modules). > > May be, we also can consider adding debug_fs entry which can be used to display > actual state of deferred_probe_pending_list? > >>> >>> If we have any devices still deferring after _this_ round, that must >>> indicate that some resource they want is not available, and that >>> should be warned about. >>> >>> Of course, modules can defer too - and I made some suggestions in my >>> waffle above the patch about that. >>> >> >> < adding back trimmed, for fuller context > >> >>>>> /* Sort as many dependencies as possible before exiting initcalls */ >>>>> flush_workqueue(deferred_wq); >>>>> + >>>>> + /* Now one final round, reporting any devices that remain deferred */ >>>>> + driver_deferred_probe_report = true; >>>>> + driver_deferred_probe_trigger(); >>>>> + /* Sort as many dependencies as possible before exiting initcalls */ >>>>> + flush_workqueue(deferred_wq); >>>>> + >>>>> return 0; >>>>> } > > -- To unsubscribe from this list: send the line "unsubscribe dmaengine" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Russell, On 10/21/2015 08:20 PM, Russell King - ARM Linux wrote: > On Wed, Oct 21, 2015 at 07:55:29PM +0300, Grygorii Strashko wrote: >> On 10/21/2015 06:36 PM, Frank Rowand wrote: >>> The above is currently the last point for probe to succeed or defer >>> (until possibly, as you mentioned, module loading resolves the defer). >>> If a probe defers above, it will defer again below. The set of defers >>> should be exactly the same above and below. >>> >> >> Unfortunately this is not "the last point for probe to succeed or defer". > > Of course it isn't. Being pedantic, there's actually no such thing, > because the point that the kernel as finished booting can never actually > be determined with things like modules being present. That's something > I've acknowledged from the start of this. > >> There are still a bunch of drivers in Kernel which will be probed at late_initcall() level. >> (like ./drivers/net/ethernet/ti/cpsw.c => late_initcall(cpsw_init); >> Yes - they probably need to be updated to use module_init(), but that's what >> we have now). Those drivers will re-trigger deferred device probing if their >> probe succeeded. > > Maybe this particular late_initcall() which triggers off the deferred > probing should be moved to its own really_late_initcall() which happens > as the very last thing - I think this is intended to run after everything > else has had a chance to probe once. > >> As result, it is impossible to say when will it happen the >> "final round of deferred device probing" :( and final list of drivers >> which was "deferred forever" will be know only when kernel exits to >> User space ("deferred forever" - before loading modules). >> >> May be, we also can consider adding debug_fs entry which can be used to >> display actual state of deferred_probe_pending_list? > > There are complaints in this thread about the existing deferred probing > implementation being hard to debug - where it's known that a device > has deferred, but it's not known why that happened. > > That would be solved by my proposal, as this final round of probing > before entering userspace after _all_ normal device probes have been > attempted once and then we've tried to satisfy the deferred probe > (okay, that's what it's _supposed_ to be - and as it takes three lines > to write it, you'll excuse me if I just use the abbreviated "final > round of deferred probe" which is much shorter - but remember that > the long version is what I actually mean) would produce a list of > not only the devices that failed to probe, but also the cause of the > deferred probes. > > My proposal would ensure that subsystems are happier to add these > prints, because in the normal scenario where we have deferred probing, > we're not littering the console log with lots of useless failure > messages which make people stop and think "now did device X probe?" > It also means scripts in our boot farms can more effectively analyse > the log and determine whether the boot was actually successful and > contained no errors. > > Merely printing the list of devices which have been deferred is next > to useless. The next question will always be "why did device X defer?" > and if that can't be answered, it means people having to spend a long > time adding lots of printks to the kernel at lots of -EPROBE_DEFER > returning sites or in the relevant drivers, tracing through the code > back towards the -EPROBE_DEFER sites to try and track it down. > I perfectly understand your proposal and spent a lot of time trying to debug such kind issues also (and using printks). But I worry a bit (and that my main point) about these few additional rounds of deferred device probing which I have right now and which allows some of drivers to finish, finally, their probes successfully. With proposed change I'll get more messages in boot log, but some of them will belong to drivers which have been probed successfully and so, they will be not really useful. As result, I think, the most important thing is to identify (or create) some point during kernel boot when it will be possible to say that all built-in drivers (at least) finish their probes 100% (done or defer). Might be do_initcalls() can be updated (smth like this): static void __init do_initcalls(void) { int level; for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++) do_initcall_level(level); + wait_for_device_probe(); + /* Now one final round, reporting any devices that remain deferred */ + driver_deferred_probe_report = true; + driver_deferred_probe_trigger(); + wait_for_device_probe(); } Also, in my opinion, it will be useful if this debugging feature will be optional. Thanks.
On Wed, Oct 21, 2015 at 09:13:48PM +0300, Grygorii Strashko wrote: > But I worry a bit (and that my main point) about these few additional > rounds of deferred device probing which I have right now and which allows > some of drivers to finish, finally, their probes successfully. > With proposed change I'll get more messages in boot log, but some of > them will belong to drivers which have been probed successfully and so, > they will be not really useful. Then you haven't properly understood my proposal. I want to get rid of all the "X deferred its probing" messages up until the point that we set the "please report deferred probes" flag. That _should_ mean that all the deferred probing that goes on becomes _totally_ silent and becomes hidden (unless you really want to see it, in which case we can make a debug option which turns it on) up until we're at the point where we want to enter userspace. At that point, we then report into the kernel log which devices are still deferring and, via appropriately placed dev_warn_deferred(), the reasons why the devices are being deferred. So, gone will be all the messages earlier in the log about device X not having a GPIO/clock/whatever because the device providing the GPIO/clock/whatever hasn't been probed. If everything is satisfied by the time we run this last round (again, I'm not using a three line sentence to describe exactly what I mean, I'm sure you know by now... oops, I just did) then the kernel will report nothing about any deferrals. That's _got_ to be an improvement. > > As result, I think, the most important thing is to identify (or create) > some point during kernel boot when it will be possible to say that all > built-in drivers (at least) finish their probes 100% (done or defer). > > Might be do_initcalls() can be updated (smth like this): > static void __init do_initcalls(void) > { > int level; > > for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++) > do_initcall_level(level); > > + wait_for_device_probe(); > + /* Now one final round, reporting any devices that remain deferred */ > + driver_deferred_probe_report = true; > + driver_deferred_probe_trigger(); > + wait_for_device_probe(); > } > > Also, in my opinion, it will be useful if this debugging feature will be > optional. I wonder why you want it optional... so I'm going to guess and cover both cases I can think of below to head off another round of reply on this point (sorry if this sucks eggs.) I don't see it as being optional, because it's going to be cheap to run in the case of a system which has very few or no errors - which is what you should have for production systems, right? Remember, only devices and drivers that are present and have been probed once get added to the deferred probe list, not devices for which their drivers are modules.
On 10/21/2015 09:02 PM, Frank Rowand wrote: > On 10/21/2015 9:55 AM, Grygorii Strashko wrote: >> On 10/21/2015 06:36 PM, Frank Rowand wrote: >>> On 10/21/2015 1:18 AM, Russell King - ARM Linux wrote: >>>> On Tue, Oct 20, 2015 at 08:58:19PM -0700, Frank Rowand wrote: >>>>> On 10/20/2015 8:46 AM, Russell King - ARM Linux wrote: >>> >>> < snip > >>> >>>>>> + >>>>>> static bool driver_deferred_probe_enable = false; >>>>>> + >>>>>> /** >>>>>> * driver_deferred_probe_trigger() - Kick off re-probing deferred devices >>>>>> * >>>>>> @@ -188,6 +210,13 @@ static int deferred_probe_initcall(void) >>>>>> driver_deferred_probe_trigger(); >>>>> >>>>> Couldn't you put the "driver_deferred_probe_report = true" here? And then >>>>> not add another round of probes. >>>> >>>> The idea is not to report anything for drivers that were deferred >>>> during the normal bootup. The above is part of the normal bootup, >>>> and the deferred activity should not be warned about. >>> >>> The above is currently the last point for probe to succeed or defer >>> (until possibly, as you mentioned, module loading resolves the defer). >>> If a probe defers above, it will defer again below. The set of defers >>> should be exactly the same above and below. >>> >> >> Unfortunately this is not "the last point for probe to succeed or defer". >> There are still a bunch of drivers in Kernel which will be probed at late_initcall() level. >> (like ./drivers/net/ethernet/ti/cpsw.c => late_initcall(cpsw_init); > > Yes, cpsw_init() should _not_ be a late_initcall. This is yet another > example of playing games with ordering probes that we have been trying > to eliminate. yes, we're trying to solve such issues and have all TI's drivers initialized from module_init() level, but as usual this process is not so fast. You know, some times ago there was no other way to solve boot ordering issues, but only to play with init levels :) And, as result, right now in drivers/ and sound/ folders there are >77 occurrences of late_initcall(). > > Thanks for pointing out one of the resulting problems this causes for the > deferred probe mechanism. > >> Yes - they probably need to be updated to use module_init(), but that's what >> we have now). Those drivers will re-trigger deferred device probing if their >> probe succeeded. > > Yes, if cpsw_init() leads to a successful probe, then deferred device probing > will be re-triggered. I do not know if cpsw_init() will be called before or > after deferred_probe_initcall(). The general initcall mechanism does not > provide any ordering guarantees between the two functions because they are > at the same initcall level. It will be called after and it will re-triggered deferred device probing. Now ordering of init calls will be specified by drivers/Makefile which itself is funny thing. > >> >> As result, it is impossible to say when will it happen the >> "final round of deferred device probing" :( and final list of drivers which >> was "deferred forever" will be know only when kernel exits to User space >> ("deferred forever" - before loading modules). >> >> May be, we also can consider adding debug_fs entry which can be used to display >> actual state of deferred_probe_pending_list? >> >>>> >>>> If we have any devices still deferring after _this_ round, that must >>>> indicate that some resource they want is not available, and that >>>> should be warned about. >>>> >>>> Of course, modules can defer too - and I made some suggestions in my >>>> waffle above the patch about that. >>>> >>> >>> < adding back trimmed, for fuller context > >>> >>>>>> /* Sort as many dependencies as possible before exiting initcalls */ >>>>>> flush_workqueue(deferred_wq); >>>>>> + >>>>>> + /* Now one final round, reporting any devices that remain deferred */ >>>>>> + driver_deferred_probe_report = true; >>>>>> + driver_deferred_probe_trigger(); >>>>>> + /* Sort as many dependencies as possible before exiting initcalls */ >>>>>> + flush_workqueue(deferred_wq); >>>>>> + >>>>>> return 0; >>>>>> } >> >> >
On Wed, Oct 21, 2015 at 08:36:23AM -0700, Frank Rowand wrote: > On 10/21/2015 1:18 AM, Russell King - ARM Linux wrote: > > On Tue, Oct 20, 2015 at 08:58:19PM -0700, Frank Rowand wrote: > >> On 10/20/2015 8:46 AM, Russell King - ARM Linux wrote: > > < snip > > > >>> + > >>> static bool driver_deferred_probe_enable = false; > >>> + > >>> /** > >>> * driver_deferred_probe_trigger() - Kick off re-probing deferred devices > >>> * > >>> @@ -188,6 +210,13 @@ static int deferred_probe_initcall(void) > >>> driver_deferred_probe_trigger(); > >> > >> Couldn't you put the "driver_deferred_probe_report = true" here? And then > >> not add another round of probes. > > > > The idea is not to report anything for drivers that were deferred > > during the normal bootup. The above is part of the normal bootup, > > and the deferred activity should not be warned about. > > The above is currently the last point for probe to succeed or defer > (until possibly, as you mentioned, module loading resolves the defer). > If a probe defers above, it will defer again below. The set of defers > should be exactly the same above and below. Why should it? Isn't this late_initcall() the first opportunity that deferred devices get to be re-probed from their first set of attempts via the drivers having their initcalls called? If what you're saying is true, what's the point of this late_initcall()? <re-cut again, I've no idea why you keep adding it back>
On 10/21/2015 1:35 PM, Russell King - ARM Linux wrote: > On Wed, Oct 21, 2015 at 08:36:23AM -0700, Frank Rowand wrote: >> On 10/21/2015 1:18 AM, Russell King - ARM Linux wrote: >>> On Tue, Oct 20, 2015 at 08:58:19PM -0700, Frank Rowand wrote: >>>> On 10/20/2015 8:46 AM, Russell King - ARM Linux wrote: >> >> < snip > >> >>>>> + >>>>> static bool driver_deferred_probe_enable = false; >>>>> + >>>>> /** >>>>> * driver_deferred_probe_trigger() - Kick off re-probing deferred devices >>>>> * >>>>> @@ -188,6 +210,13 @@ static int deferred_probe_initcall(void) >>>>> driver_deferred_probe_trigger(); >>>> >>>> Couldn't you put the "driver_deferred_probe_report = true" here? And then >>>> not add another round of probes. >>> >>> The idea is not to report anything for drivers that were deferred >>> during the normal bootup. The above is part of the normal bootup, >>> and the deferred activity should not be warned about. >> >> The above is currently the last point for probe to succeed or defer >> (until possibly, as you mentioned, module loading resolves the defer). >> If a probe defers above, it will defer again below. The set of defers >> should be exactly the same above and below. > > Why should it? My assertion was incorrect. A probe in the deferral processing can result in the driver being placed on the new deferred list, then when a later probe of another deferred driver succeeds, the first driver will be moved to the active deferred list, and might succeed on the second probe attempt (or with the current messages would result in a second set of deferred messages). So yes, placing "driver_deferred_probe_report = true" where your patch put it and running through the deferred probe processing again is correct. -Frank -- To unsubscribe from this list: send the line "unsubscribe dmaengine" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Oct 20, 2015 at 04:46:56PM +0100, Russell King - ARM Linux wrote: > Something like this. I haven't put a lot of effort into it to change all > the places which return an -EPROBE_DEFER, and it also looks like we need > some helpers to report when we have only an device_node (or should that > be fwnode?) See the commented out of_warn_deferred() in > drivers/gpio/gpiolib-of.c. Adding this stuff in the subsystems searching > for resources should make debugging why things are getting deferred easier. Yeah, plus I'd expect it to also result in better error reporting overall if the subsystems are able to report when they fail to get something rather than just returning an error to the driver. > +/** > + * dev_warn_deferred() - report why a probe has been deferred > + */ > +void dev_warn_deferred(struct device *dev, const char *fmt, ...) > +{ > + if (driver_deferred_probe_report) { > + struct va_format vaf; > + va_list ap; > + > + va_start(ap, fmt); > + vaf.fmt = fmt; > + vaf.va = ≈ > + > + dev_warn(dev, "deferring probe: %pV", &vaf); > + va_end(ap); > + } > +} > +EXPORT_SYMBOL_GPL(dev_warn_deferred); I'm not currently able to think of a nice way of writing this but I think what I'd really like to see from a driver point of view is something which decays into dev_err() if it's a non-deferral error. That way drivers can have minimal log and return error handling code and we will still get the output sensibly. The best I can think of is something like void dev_warn_deferred(struct device *dev, int err, const char *fmt, ...) which requires the caller to pass in err twice to get it logged. That's not a thing of beauty but it gets the job done... but perhaps your original interface is better, it's a bit cleaner.
diff --git a/drivers/base/dd.c b/drivers/base/dd.c index be0eb4639128..bb12224f2901 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -129,7 +129,29 @@ void driver_deferred_probe_del(struct device *dev) mutex_unlock(&deferred_probe_mutex); } +static bool driver_deferred_probe_report; + +/** + * dev_warn_deferred() - report why a probe has been deferred + */ +void dev_warn_deferred(struct device *dev, const char *fmt, ...) +{ + if (driver_deferred_probe_report) { + struct va_format vaf; + va_list ap; + + va_start(ap, fmt); + vaf.fmt = fmt; + vaf.va = ≈ + + dev_warn(dev, "deferring probe: %pV", &vaf); + va_end(ap); + } +} +EXPORT_SYMBOL_GPL(dev_warn_deferred); + static bool driver_deferred_probe_enable = false; + /** * driver_deferred_probe_trigger() - Kick off re-probing deferred devices * @@ -188,6 +210,13 @@ static int deferred_probe_initcall(void) driver_deferred_probe_trigger(); /* Sort as many dependencies as possible before exiting initcalls */ flush_workqueue(deferred_wq); + + /* Now one final round, reporting any devices that remain deferred */ + driver_deferred_probe_report = true; + driver_deferred_probe_trigger(); + /* Sort as many dependencies as possible before exiting initcalls */ + flush_workqueue(deferred_wq); + return 0; } late_initcall(deferred_probe_initcall); diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 16550c63d611..9f4d687d7268 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -1997,8 +1997,8 @@ int genpd_dev_pm_attach(struct device *dev) pd = of_genpd_get_from_provider(&pd_args); if (IS_ERR(pd)) { - dev_dbg(dev, "%s() failed to find PM domain: %ld\n", - __func__, PTR_ERR(pd)); + dev_warn_deferred(dev, "%s() failed to find PM domain: %ld\n", + __func__, PTR_ERR(pd)); of_node_put(dev->of_node); return -EPROBE_DEFER; } @@ -2026,6 +2026,9 @@ int genpd_dev_pm_attach(struct device *dev) ret = pm_genpd_poweron(pd); out: + if (ret) + dev_warn_deferred(dev, "%s() deferring probe: %d\n", + __func__, ret); return ret ? -EPROBE_DEFER : 0; } EXPORT_SYMBOL_GPL(genpd_dev_pm_attach); diff --git a/drivers/clk/clkdev.c b/drivers/clk/clkdev.c index 779b6ff0c7ad..66f4212c63fd 100644 --- a/drivers/clk/clkdev.c +++ b/drivers/clk/clkdev.c @@ -201,7 +201,14 @@ struct clk *clk_get(struct device *dev, const char *con_id) if (dev) { clk = __of_clk_get_by_name(dev->of_node, dev_id, con_id); - if (!IS_ERR(clk) || PTR_ERR(clk) == -EPROBE_DEFER) + if (IS_ERR(clk) && PTR_ERR(clk) == -EPROBE_DEFER) { + if (dev) + dev_warn_deferred(dev, + "unable to locate clock for connection %s\n", + con_id); + return clk; + } + if (!IS_ERR(clk)) return clk; } diff --git a/drivers/gpio/gpiolib-of.c b/drivers/gpio/gpiolib-of.c index fa6e3c8823d6..36f09ab1c215 100644 --- a/drivers/gpio/gpiolib-of.c +++ b/drivers/gpio/gpiolib-of.c @@ -101,6 +101,11 @@ struct gpio_desc *of_get_named_gpiod_flags(struct device_node *np, pr_debug("%s: parsed '%s' property of node '%s[%d]' - status (%d)\n", __func__, propname, np->full_name, index, PTR_ERR_OR_ZERO(gg_data.out_gpio)); + +// if (gg_data.out_gpio == -EPROBE_DEFER) +// of_warn_deferred(np, "%s: unable to locate GPIO for %s[%d]\n", +// __func__, propname, index); + return gg_data.out_gpio; } diff --git a/drivers/gpu/drm/bridge/dw_hdmi.c b/drivers/gpu/drm/bridge/dw_hdmi.c index cb8764eecd70..088f5dd58424 100644 --- a/drivers/gpu/drm/bridge/dw_hdmi.c +++ b/drivers/gpu/drm/bridge/dw_hdmi.c @@ -1785,7 +1785,7 @@ int dw_hdmi_bind(struct device *dev, struct device *master, hdmi->ddc = of_find_i2c_adapter_by_node(ddc_node); of_node_put(ddc_node); if (!hdmi->ddc) { - dev_dbg(hdmi->dev, "failed to read ddc node\n"); + dev_warn_deferred(hdmi->dev, "failed to read ddc node\n"); return -EPROBE_DEFER; } diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c b/drivers/gpu/drm/exynos/exynos_drm_dsi.c index 12b03b364703..3155798d8245 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c @@ -1899,7 +1899,7 @@ static int exynos_dsi_probe(struct platform_device *pdev) ret = devm_regulator_bulk_get(dev, ARRAY_SIZE(dsi->supplies), dsi->supplies); if (ret) { - dev_info(dev, "failed to get regulators: %d\n", ret); + dev_warn_deferred(dev, "failed to get regulators: %d\n", ret); return -EPROBE_DEFER; } diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c index abacc8f67469..0b57054c886a 100644 --- a/drivers/gpu/drm/imx/imx-ldb.c +++ b/drivers/gpu/drm/imx/imx-ldb.c @@ -595,8 +595,9 @@ static int imx_ldb_bind(struct device *dev, struct device *master, void *data) else return -EPROBE_DEFER; if (!channel->panel) { - dev_err(dev, "panel not found: %s\n", - remote->full_name); + dev_warn_deferred(dev, + "panel not found: %s\n", + remote->full_name); return -EPROBE_DEFER; } } diff --git a/drivers/gpu/drm/msm/dsi/dsi.c b/drivers/gpu/drm/msm/dsi/dsi.c index 6edcd6f57e70..3ba94a2bca65 100644 --- a/drivers/gpu/drm/msm/dsi/dsi.c +++ b/drivers/gpu/drm/msm/dsi/dsi.c @@ -42,7 +42,7 @@ static int dsi_get_phy(struct msm_dsi *msm_dsi) of_node_put(phy_node); if (!phy_pdev || !msm_dsi->phy) { - dev_err(&pdev->dev, "%s: phy driver is not ready\n", __func__); + dev_warn_deferred(&pdev->dev, "%s: phy driver is not ready\n", __func__); return -EPROBE_DEFER; } diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index 0339c5d82d37..e1cfcd38c0dd 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -1117,7 +1117,8 @@ static int msm_pdev_probe(struct platform_device *pdev) dev = bus_find_device_by_name(&platform_bus_type, NULL, devnames[i]); if (!dev) { - dev_info(&pdev->dev, "still waiting for %s\n", devnames[i]); + dev_warn_deferred(&pdev->dev, "waiting for %s\n", + devnames[i]); return -EPROBE_DEFER; } diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c index 48cb19949ca3..bbf36f68d4e0 100644 --- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c +++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c @@ -600,7 +600,8 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, unsigned int index) if (!IS_ERR(clk)) { rcrtc->extclock = clk; } else if (PTR_ERR(rcrtc->clock) == -EPROBE_DEFER) { - dev_info(rcdu->dev, "can't get external clock %u\n", index); + dev_warn_deferred(rcdu->dev, "can't get external clock %u\n", + index); return -EPROBE_DEFER; } diff --git a/drivers/of/irq.c b/drivers/of/irq.c index 55317fa9c9dc..2056bb9e4c43 100644 --- a/drivers/of/irq.c +++ b/drivers/of/irq.c @@ -404,8 +404,11 @@ int of_irq_get(struct device_node *dev, int index) return rc; domain = irq_find_host(oirq.np); - if (!domain) + if (!domain) { + dev_warn_deferred(dev, "%s() failed to locate IRQ domain\n", + __func__); return -EPROBE_DEFER; + } return irq_create_of_mapping(&oirq); } diff --git a/drivers/pci/host/pci-mvebu.c b/drivers/pci/host/pci-mvebu.c index 0e9b82095dc9..b49ae4822a5b 100644 --- a/drivers/pci/host/pci-mvebu.c +++ b/drivers/pci/host/pci-mvebu.c @@ -1203,6 +1203,7 @@ static int mvebu_pcie_parse_port(struct mvebu_pcie *pcie, reset_gpio = of_get_named_gpio_flags(child, "reset-gpios", 0, &flags); if (reset_gpio == -EPROBE_DEFER) { + dev_warn_deferred(dev, "unable to find reset gpio\n"); ret = reset_gpio; goto err; } diff --git a/drivers/pinctrl/core.c b/drivers/pinctrl/core.c index 9638a00c67c2..299aae3bca14 100644 --- a/drivers/pinctrl/core.c +++ b/drivers/pinctrl/core.c @@ -741,8 +741,9 @@ static int add_setting(struct pinctrl *p, struct pinctrl_map const *map) * OK let us guess that the driver is not there yet, and * let's defer obtaining this pinctrl handle to later... */ - dev_info(p->dev, "unknown pinctrl device %s in map entry, deferring probe", - map->ctrl_dev_name); + dev_warn_deferred(p->dev, + "unknown pinctrl device %s in map entry, deferring probe", + map->ctrl_dev_name); return -EPROBE_DEFER; } diff --git a/drivers/pinctrl/devicetree.c b/drivers/pinctrl/devicetree.c index fe04e748dfe4..358f946471c9 100644 --- a/drivers/pinctrl/devicetree.c +++ b/drivers/pinctrl/devicetree.c @@ -115,8 +115,8 @@ static int dt_to_map_one_config(struct pinctrl *p, const char *statename, for (;;) { np_pctldev = of_get_next_parent(np_pctldev); if (!np_pctldev || of_node_is_root(np_pctldev)) { - dev_info(p->dev, "could not find pctldev for node %s, deferring probe\n", - np_config->full_name); + dev_warn_deferred(p->dev, "could not find pctldev for node %s, deferring probe\n", + np_config->full_name); of_node_put(np_pctldev); /* OK let's just assume this will appear later then */ return -EPROBE_DEFER; diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index 8a34f6acc801..8d8ea0518283 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1410,8 +1410,9 @@ static int regulator_resolve_supply(struct regulator_dev *rdev) if (have_full_constraints()) { r = dummy_regulator_rdev; } else { - dev_err(dev, "Failed to resolve %s-supply for %s\n", - rdev->supply_name, rdev->desc->name); + dev_warn_deferred(dev, + "Failed to resolve %s-supply for %s\n", + rdev->supply_name, rdev->desc->name); return -EPROBE_DEFER; } } diff --git a/include/linux/device.h b/include/linux/device.h index 5d7bc6349930..5050ce7d73b3 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -1087,6 +1087,7 @@ extern void device_shutdown(void); /* debugging and troubleshooting/diagnostic helpers. */ extern const char *dev_driver_string(const struct device *dev); +void dev_warn_deferred(struct device *dev, const char *fmt, ...); #ifdef CONFIG_PRINTK