Message ID | 20220526034609.480766-2-saravanak@google.com (mailing list archive) |
---|---|
State | Superseded, archived |
Series | [v1,1/2] driver core: Fix wait_for_device_probe() & deferred_probe_timeout interaction |

On Wed, May 25, 2022 at 8:46 PM Saravana Kannan <saravanak@google.com> wrote:
>
> Mounting NFS rootfs was timing out when deferred_probe_timeout was
> non-zero [1]. This was because the ip_auto_config() initcall timed out
> waiting for the network interfaces to show up when
> deferred_probe_timeout was non-zero. While ip_auto_config() calls
> wait_for_device_probe() to make sure any currently running deferred
> probe work or asynchronous probe finishes, that wasn't sufficient to
> account for devices being deferred until deferred_probe_timeout.
>
> Commit 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits
> until the deferred_probe_timeout fires") tried to fix that by making
> sure wait_for_device_probe() waits for deferred_probe_timeout to expire
> before returning.
>
> However, if wait_for_device_probe() is called from the kernel_init()
> context:
>
> - Before deferred_probe_initcall() [2], it causes the boot process to
>   hang due to a deadlock.
>
> - After deferred_probe_initcall() [3], it blocks kernel_init() from
>   continuing till deferred_probe_timeout expires and defeats the point
>   of deferred_probe_timeout, which is trying to wait for userspace to
>   load modules.
>
> Neither of these is good. So revert the changes to
> wait_for_device_probe().
>
> [1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
> [2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
> [3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/
>
> Cc: John Stultz <jstultz@google.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
> Cc: Basil Eljuse <Basil.Eljuse@arm.com>
> Cc: Ferry Toth <fntoth@gmail.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Anders Roxell <anders.roxell@linaro.org>
> Cc: linux-pm@vger.kernel.org
> Fixes: 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Saravana Kannan <saravanak@google.com>

Apologies for the trouble this caused. Thanks for chasing it down!

Acked-by: John Stultz <jstultz@google.com>

thanks
-john

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 2fc8507f59ee..91f63cd33b12 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -263,7 +263,6 @@ int driver_deferred_probe_timeout;
 #endif
 
 EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);
-static DECLARE_WAIT_QUEUE_HEAD(probe_timeout_waitqueue);
 
 static int __init deferred_probe_timeout_setup(char *str)
 {
@@ -318,7 +317,6 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
 	list_for_each_entry(p, &deferred_probe_pending_list, deferred_probe)
 		dev_info(p->device, "deferred probe pending\n");
 	mutex_unlock(&deferred_probe_mutex);
-	wake_up_all(&probe_timeout_waitqueue);
 }
 static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
 
@@ -736,9 +734,6 @@ int driver_probe_done(void)
  */
 void wait_for_device_probe(void)
 {
-	/* wait for probe timeout */
-	wait_event(probe_timeout_waitqueue, !driver_deferred_probe_timeout);
-
 	/* wait for the deferred probe workqueue to finish */
 	flush_work(&deferred_probe_work);
 

Mounting NFS rootfs was timing out when deferred_probe_timeout was
non-zero [1]. This was because the ip_auto_config() initcall timed out
waiting for the network interfaces to show up when
deferred_probe_timeout was non-zero. While ip_auto_config() calls
wait_for_device_probe() to make sure any currently running deferred
probe work or asynchronous probe finishes, that wasn't sufficient to
account for devices being deferred until deferred_probe_timeout.

Commit 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") tried to fix that by making
sure wait_for_device_probe() waits for deferred_probe_timeout to expire
before returning.

However, if wait_for_device_probe() is called from the kernel_init()
context:

- Before deferred_probe_initcall() [2], it causes the boot process to
  hang due to a deadlock.

- After deferred_probe_initcall() [3], it blocks kernel_init() from
  continuing till deferred_probe_timeout expires and defeats the point
  of deferred_probe_timeout, which is trying to wait for userspace to
  load modules.

Neither of these is good. So revert the changes to
wait_for_device_probe().

[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
[2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
[3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/

Cc: John Stultz <jstultz@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Basil Eljuse <Basil.Eljuse@arm.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: linux-pm@vger.kernel.org
Fixes: 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Saravana Kannan <saravanak@google.com>

---
 drivers/base/dd.c | 5 -----
 1 file changed, 5 deletions(-)