diff mbox series

init: Add support for rootwait timeout parameter

Message ID 20230526130716.2932507-1-loic.poulain@linaro.org (mailing list archive)
State Under Review
Headers show
Series init: Add support for rootwait timeout parameter | expand

Commit Message

Loic Poulain May 26, 2023, 1:07 p.m. UTC
Add an optional timeout arg to 'rootwait' as the maximum time in
seconds to wait for the root device to show up before attempting
forced mount of the root filesystem.

This can be helpful to force boot failure and restart in case the
root device does not show up in time, allowing the bootloader to
take any appropriate measures (e.g. recovery, A/B switch, retry...).

In success case, mounting happens as soon as the root device is ready,
contrary to the existing 'rootdelay' parameter (unconditional delay).

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
---
 .../admin-guide/kernel-parameters.txt         |  4 ++++
 init/do_mounts.c                              | 19 +++++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

Comments

Christian Brauner May 30, 2023, 9:45 a.m. UTC | #1
On Fri, May 26, 2023 at 03:07:16PM +0200, Loic Poulain wrote:
> Add an optional timeout arg to 'rootwait' as the maximum time in
> seconds to wait for the root device to show up before attempting
> forced mount of the root filesystem.
> 
> This can be helpful to force boot failure and restart in case the
> root device does not show up in time, allowing the bootloader to
> take any appropriate measures (e.g. recovery, A/B switch, retry...).
> 
> In success case, mounting happens as soon as the root device is ready,
> contrary to the existing 'rootdelay' parameter (unconditional delay).
> 
> Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
> ---

Not terribly opposed and not terribly convinced yet.
So, we have rootdelay= with a timeout parameter that allows to specify a
delay before attempting to mount the root device. And we have rootwait
currently as an indefinite wait. Adding a timeout for rootwait doesn't
seem crazy and is backwards compatible. But there's no mention of any
concrete users or use-case for this which is usually preferable. If this
is just "could be useful for someone eventually" it's way less desirable
to merge this than when it's "here's a/multiple user/users"... So I
would love to see a use-case described here.

And this is only useful if there isn't an early userspace init that
parses and manages root=. So we need to hit prepare_namespaces() as a
rootwait timeout isn't meaningful if this is done by and early init in
the initramfs for example.
Loic Poulain May 30, 2023, 11:23 a.m. UTC | #2
Hi Christian,

On Tue, 30 May 2023 at 11:45, Christian Brauner <brauner@kernel.org> wrote:
>
> On Fri, May 26, 2023 at 03:07:16PM +0200, Loic Poulain wrote:
> > Add an optional timeout arg to 'rootwait' as the maximum time in
> > seconds to wait for the root device to show up before attempting
> > forced mount of the root filesystem.
> >
> > This can be helpful to force boot failure and restart in case the
> > root device does not show up in time, allowing the bootloader to
> > take any appropriate measures (e.g. recovery, A/B switch, retry...).
> >
> > In success case, mounting happens as soon as the root device is ready,
> > contrary to the existing 'rootdelay' parameter (unconditional delay).
> >
> > Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
> > ---
>
> Not terribly opposed and not terribly convinced yet.
> So, we have rootdelay= with a timeout parameter that allows to specify a
> delay before attempting to mount the root device. And we have rootwait
> currently as an indefinite wait. Adding a timeout for rootwait doesn't
> seem crazy and is backwards compatible. But there's no mention of any
> concrete users or use-case for this which is usually preferable. If this
> is just "could be useful for someone eventually" it's way less desirable
> to merge this than when it's "here's a/multiple user/users"... So I
> would love to see a use-case described here.

I can integrate the following use case into a v2 if you think it makes sense:

In case of device mapper usage for the root filesystem (e.g.
root=/dev/dm-0), if the mapper is not able to create the virtual block
for any reasons (wrong arguments, bad dm-verity signature, etc), the
`rootwait` parameter will cause the kernel to wait forever. Adding a
timeout allows it to detect the 'error' (panic) and reset the device
after a few seconds, the bootloader can then decide to mark this
non-bootable partition/parameter and fallback to another partition
(A/B case) or into a recovery mode.

But it's not specific to device mapper, if a eMMC/SDCARD is not
detected at boot time because of hardware or software problems (e.g.
updated with a bad devicetree), it could be desirable to panic/reboot
instead of waiting for something that will never happen.

>
> And this is only useful if there isn't an early userspace init that
> parses and manages root=. So we need to hit prepare_namespaces() as a
> rootwait timeout isn't meaningful if this is done by and early init in
> the initramfs for example.

Indeed, and I do not use initramfs in the above use case, the mapped
device is created directly from the kernel (thanks to dm-mod.create=),
mostly for boot time optimization reason, and this is for the same
reason that rootdelay does not fit.

Regards,
Loic
Christian Brauner May 30, 2023, 12:39 p.m. UTC | #3
On Tue, May 30, 2023 at 01:23:50PM +0200, Loic Poulain wrote:
> Hi Christian,
> 
> On Tue, 30 May 2023 at 11:45, Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Fri, May 26, 2023 at 03:07:16PM +0200, Loic Poulain wrote:
> > > Add an optional timeout arg to 'rootwait' as the maximum time in
> > > seconds to wait for the root device to show up before attempting
> > > forced mount of the root filesystem.
> > >
> > > This can be helpful to force boot failure and restart in case the
> > > root device does not show up in time, allowing the bootloader to
> > > take any appropriate measures (e.g. recovery, A/B switch, retry...).
> > >
> > > In success case, mounting happens as soon as the root device is ready,
> > > contrary to the existing 'rootdelay' parameter (unconditional delay).
> > >
> > > Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
> > > ---
> >
> > Not terribly opposed and not terribly convinced yet.
> > So, we have rootdelay= with a timeout parameter that allows to specify a
> > delay before attempting to mount the root device. And we have rootwait
> > currently as an indefinite wait. Adding a timeout for rootwait doesn't
> > seem crazy and is backwards compatible. But there's no mention of any
> > concrete users or use-case for this which is usually preferable. If this
> > is just "could be useful for someone eventually" it's way less desirable
> > to merge this than when it's "here's a/multiple user/users"... So I
> > would love to see a use-case described here.
> 
> I can integrate the following use case into a v2 if you think it makes sense:

Yes, please.
Christoph Hellwig May 30, 2023, 2:56 p.m. UTC | #4
This clashes a bit with my big rework in this area in the
"fix the name_to_dev_t mess" series. I need to resend that series
anyway, should I just include a rebased version of this patch?
Christian Brauner May 30, 2023, 3:43 p.m. UTC | #5
On Tue, May 30, 2023 at 07:56:57AM -0700, Christoph Hellwig wrote:
> This clashes a bit with my big rework in this area in the
> "fix the name_to_dev_t mess" series. I need to resend that series
> anyway, should I just include a rebased version of this patch?

Sure, if this makes things easier for you then definitely.
Christoph Hellwig May 31, 2023, 5:54 a.m. UTC | #6
On Tue, May 30, 2023 at 05:43:53PM +0200, Christian Brauner wrote:
> On Tue, May 30, 2023 at 07:56:57AM -0700, Christoph Hellwig wrote:
> > This clashes a bit with my big rework in this area in the
> > "fix the name_to_dev_t mess" series. I need to resend that series
> > anyway, should I just include a rebased version of this patch?
> 
> Sure, if this makes things easier for you then definitely.

I have missed you had more comments that need a respin.  So maybe
Loic can just do the rebase and send it out with a note for the
baseline?  I plan to resend my series later today.
Christian Brauner May 31, 2023, 7:48 a.m. UTC | #7
On Tue, May 30, 2023 at 10:54:24PM -0700, Christoph Hellwig wrote:
> On Tue, May 30, 2023 at 05:43:53PM +0200, Christian Brauner wrote:
> > On Tue, May 30, 2023 at 07:56:57AM -0700, Christoph Hellwig wrote:
> > > This clashes a bit with my big rework in this area in the
> > > "fix the name_to_dev_t mess" series. I need to resend that series
> > > anyway, should I just include a rebased version of this patch?
> > 
> > Sure, if this makes things easier for you then definitely.
> 
> I have missed you had more comments that need a respin.  So maybe
> Loic can just do the rebase and send it out with a note for the
> baseline?  I plan to resend my series later today.

Sure, that works too!
Loic Poulain May 31, 2023, 7:57 a.m. UTC | #8
On Wed, 31 May 2023 at 07:54, Christoph Hellwig <hch@infradead.org> wrote:
>
> On Tue, May 30, 2023 at 05:43:53PM +0200, Christian Brauner wrote:
> > On Tue, May 30, 2023 at 07:56:57AM -0700, Christoph Hellwig wrote:
> > > This clashes a bit with my big rework in this area in the
> > > "fix the name_to_dev_t mess" series. I need to resend that series
> > > anyway, should I just include a rebased version of this patch?
> >
> > Sure, if this makes things easier for you then definitely.
>
> I have missed you had more comments that need a respin.  So maybe
> Loic can just do the rebase and send it out with a note for the
> baseline?  I plan to resend my series later today.

Can do that if it helps, please CC me.

Loic
diff mbox series

Patch

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 9e5bab29685f..6e351d4c84a5 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5465,6 +5465,10 @@ 
 			Useful for devices that are detected asynchronously
 			(e.g. USB and MMC devices).
 
+	rootwait=	[KNL] Maximum time (in seconds) to wait for root device
+			to show up before attempting to mount the root
+			filesystem.
+
 	rproc_mem=nn[KMG][@address]
 			[KNL,ARM,CMA] Remoteproc physical memory block.
 			Memory area to be used by remote processor image,
diff --git a/init/do_mounts.c b/init/do_mounts.c
index 811e94daf0a8..942458e7d1c0 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -18,6 +18,7 @@ 
 #include <linux/slab.h>
 #include <linux/ramfs.h>
 #include <linux/shmem_fs.h>
+#include <linux/ktime.h>
 
 #include <linux/nfs_fs.h>
 #include <linux/nfs_fs_sb.h>
@@ -306,12 +307,20 @@  static int __init rootwait_setup(char *str)
 {
 	if (*str)
 		return 0;
-	root_wait = 1;
+	root_wait = -1;
 	return 1;
 }
 
 __setup("rootwait", rootwait_setup);
 
+static int __init rootwait_timeout_setup(char *str)
+{
+	root_wait = simple_strtoul(str, NULL, 0);
+	return 1;
+}
+
+__setup("rootwait=", rootwait_timeout_setup);
+
 static char * __initdata root_mount_data;
 static int __init root_data_setup(char *str)
 {
@@ -633,11 +642,17 @@  void __init prepare_namespace(void)
 
 	/* wait for any asynchronous scanning to complete */
 	if ((ROOT_DEV == 0) && root_wait) {
+		const ktime_t end = ktime_add_ms(ktime_get_raw(), root_wait * MSEC_PER_SEC);
+
 		printk(KERN_INFO "Waiting for root device %s...\n",
 			saved_root_name);
 		while (driver_probe_done() != 0 ||
-			(ROOT_DEV = name_to_dev_t(saved_root_name)) == 0)
+			(ROOT_DEV = name_to_dev_t(saved_root_name)) == 0) {
 			msleep(5);
+
+			if (root_wait > 0 && ktime_after(ktime_get_raw(), end))
+				break;
+		}
 		async_synchronize_full();
 	}