Message ID | 20250320-initrd-erofs-v1-1-35bbb293468a@cyberus-technology.de (mailing list archive) |
---|---|
State | New |
Series | initrd: support erofs as initrd |
On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
>
> Add erofs detection to the initrd mount code. This allows systems to
> boot from an erofs-based initrd in the same way as they can boot from
> a squashfs initrd.
>
> Just as squashfs initrds, erofs images as initrds are a good option
> for systems that are memory-constrained.
>
> Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>

>  #include "do_mounts.h"
>  #include "../fs/squashfs/squashfs_fs.h"
> +#include "../fs/erofs/erofs_fs.h"

This is getting really unpleasant...

Folks, could we do something similar to initcalls - add a section
(.init.text.rd_detect?) with array of pointers to __init functions
that would be called by that thing in turn?  With filesystems that
want to add that kind of stuff being able to do something like

static int __init detect_minix(struct file *file, void *buf, loff_t *pos, int start_block)
{
	struct minix_super_block *minixsb = buf;
	initrd_fill_buffer(file, buf, pos, (start_block + 1) * BLOCK_SIZE);
	if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
	    minixsb->s_magic == MINIX_SUPER_MAGIC2) {
		printk(KERN_NOTICE
		       "RAMDISK: Minix filesystem found at block %d\n",
		       start_block);
		return minixsb->s_nzones << minixsb->s_log_zone_size;
	}
	return -1;
}

initrd_detect(detect_minix);

with the latter emitting a pointer to detect_minix into that new
section?

initrd_fill_buffer() would be something along the lines of

	if (*pos != wanted) {
		*pos = wanted;
		kernel_read(file, buf, 512, pos);
	}

I mean, we can keep adding those pieces there, but...

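
The registration scheme proposed above can be tried out in userspace. What follows is only a sketch under stated assumptions: it relies on GCC or Clang targeting ELF, where the linker provides `__start_<name>` and `__stop_<name>` symbols for a section whose name is a valid C identifier. The detector signature, the section name `rd_detect`, the function `identify_image()` and both "toy" formats are made up for illustration; none of them is an existing kernel interface, and the in-kernel variant would take a `struct file *` and live in an `__init` section instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 1024

/* Hypothetical detector signature: image size in blocks, or -1 if no match. */
typedef int (*rd_detect_fn)(const uint8_t *img, size_t len, int start_block);

/* Emit a pointer to fn into the "rd_detect" section (GCC/Clang on ELF). */
#define initrd_detect(fn) \
	static rd_detect_fn rd_detect_##fn \
	__attribute__((used, section("rd_detect"))) = fn

/* Section bounds, generated by the linker for the "rd_detect" section. */
extern rd_detect_fn __start_rd_detect[];
extern rd_detect_fn __stop_rd_detect[];

/* Toy format A (made up): "TOYA" at the start block, then a size byte. */
static int detect_toy_a(const uint8_t *img, size_t len, int start_block)
{
	size_t off = (size_t)start_block * BLOCK_SIZE;

	if (len < off + 5 || memcmp(img + off, "TOYA", 4) != 0)
		return -1;
	return img[off + 4];
}
initrd_detect(detect_toy_a);

/* Toy format B (made up): "TOYB" one block after the start block. */
static int detect_toy_b(const uint8_t *img, size_t len, int start_block)
{
	size_t off = ((size_t)start_block + 1) * BLOCK_SIZE;

	if (len < off + 5 || memcmp(img + off, "TOYB", 4) != 0)
		return -1;
	return img[off + 4];
}
initrd_detect(detect_toy_b);

/* Call every registered detector in turn; the first match wins. */
int identify_image(const uint8_t *img, size_t len, int start_block)
{
	rd_detect_fn *fn;

	assert(img != NULL && start_block >= 0);
	for (fn = __start_rd_detect; fn < __stop_rd_detect; fn++) {
		int nblocks = (*fn)(img, len, start_block);

		if (nblocks >= 0)
			return nblocks;
	}
	return -1;
}
```

With this shape, supporting another format means adding one detector function and one `initrd_detect()` line next to the filesystem code; the loop in `identify_image()` and its include list stay untouched.
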
We've been trying to kill off initrd in favor of initramfs for about two decades. I don't think adding new file system support to it is helpful.
Hi Christoph,

On 2025/3/21 13:01, Christoph Hellwig wrote:
> We've been trying to kill off initrd in favor of initramfs for about
> two decades.  I don't think adding new file system support to it is
> helpful.
>

Disclaimer: I don't know the background of this effort so
more background might be helpful.

Two years ago, I considered using EROFS + FSDAX to use the initrd
image from the bootloader directly, avoiding both the original initrd
double caching issue (which is what initramfs was proposed to
resolve) and the unnecessary tmpfs unpack overhead of initramfs:
https://lore.kernel.org/r/ZXgNQ85PdUKrQU1j@infradead.org

Also, EROFS supports xattrs (which the cpio format doesn't support),
so the following potential work is no longer needed, although I
don't have any interest in following it either:
https://lore.kernel.org/r/20190523121803.21638-1-roberto.sassu@huawei.com

Anyway, personally I have neither the time nor any input on those.

Thanks,
Gao Xiang

On Fri, Mar 21, 2025 at 02:08:26AM +0000, Al Viro wrote:
> On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> > From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> >
> > Add erofs detection to the initrd mount code. This allows systems to
> > boot from an erofs-based initrd in the same way as they can boot from
> > a squashfs initrd.
> >
> > Just as squashfs initrds, erofs images as initrds are a good option
> > for systems that are memory-constrained.
> >
> > Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
>
> >  #include "do_mounts.h"
> >  #include "../fs/squashfs/squashfs_fs.h"
> > +#include "../fs/erofs/erofs_fs.h"
>
> This is getting really unpleasant...
>
> Folks, could we do something similar to initcalls - add a section
> (.init.text.rd_detect?) with array of pointers to __init functions
> that would be called by that thing in turn?  With filesystems that
> want to add that kind of stuff being able to do something like
>
> static int __init detect_minix(struct file *file, void *buf, loff_t *pos, int start_block)
> {
> 	struct minix_super_block *minixsb = buf;
> 	initrd_fill_buffer(file, buf, pos, (start_block + 1) * BLOCK_SIZE);
> 	if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
> 	    minixsb->s_magic == MINIX_SUPER_MAGIC2) {
> 		printk(KERN_NOTICE
> 		       "RAMDISK: Minix filesystem found at block %d\n",
> 		       start_block);
> 		return minixsb->s_nzones << minixsb->s_log_zone_size;
> 	}
> 	return -1;
> }
>
> initrd_detect(detect_minix);
>
> with the latter emitting a pointer to detect_minix into that new
> section?
>
> initrd_fill_buffer() would be something along the lines of
>
> 	if (*pos != wanted) {
> 		*pos = wanted;
> 		kernel_read(file, buf, 512, pos);
> 	}
>
> I mean, we can keep adding those pieces there, but...

Very much agreed.

On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
>
> Add erofs detection to the initrd mount code. This allows systems to
> boot from an erofs-based initrd in the same way as they can boot from
> a squashfs initrd.
>
> Just as squashfs initrds, erofs images as initrds are a good option
> for systems that are memory-constrained.

I think this can be valuable, and I know we've had discussions about this
before, but please include a detailed rationale for using erofs and say
which project(s) would use this. Then I don't think there's a reason
per se not to do it.

On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
>
> Add erofs detection to the initrd mount code. This allows systems to
> boot from an erofs-based initrd in the same way as they can boot from
> a squashfs initrd.
>
> Just as squashfs initrds, erofs images as initrds are a good option
> for systems that are memory-constrained.
>
> Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> ---
>  init/do_mounts_rd.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c
> index ac021ae6e6fa78c7b7828a78ab2fa3af3611bef3..7c3f8b45b5ed2eea3c534d7f2e65608542009df5 100644
> --- a/init/do_mounts_rd.c
> +++ b/init/do_mounts_rd.c
> @@ -11,6 +11,7 @@
>
>  #include "do_mounts.h"
>  #include "../fs/squashfs/squashfs_fs.h"
> +#include "../fs/erofs/erofs_fs.h"
>
>  #include <linux/decompress/generic.h>
>
> @@ -47,6 +48,7 @@ static int __init crd_load(decompress_fn deco);
>   *	romfs
>   *	cramfs
>   *	squashfs
> + *	erofs
>   *	gzip
>   *	bzip2
>   *	lzma
> @@ -63,6 +65,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
>  	struct romfs_super_block *romfsb;
>  	struct cramfs_super *cramfsb;
>  	struct squashfs_super_block *squashfsb;
> +	struct erofs_super_block *erofsb;
>  	int nblocks = -1;
>  	unsigned char *buf;
>  	const char *compress_name;
> @@ -77,6 +80,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
>  	romfsb = (struct romfs_super_block *) buf;
>  	cramfsb = (struct cramfs_super *) buf;
>  	squashfsb = (struct squashfs_super_block *) buf;
> +	erofsb = (struct erofs_super_block *) buf;
>  	memset(buf, 0xe5, size);
>
>  	/*
> @@ -165,6 +169,21 @@ identify_ramdisk_image(struct file *file, loff_t pos,
>  		goto done;
>  	}
>
> +	/* Try erofs */
> +	pos = (start_block * BLOCK_SIZE) + EROFS_SUPER_OFFSET;
> +	kernel_read(file, buf, size, &pos);
> +
> +	if (erofsb->magic == EROFS_SUPER_MAGIC_V1) {

le32_to_cpu(erofsb->magic)

> +		printk(KERN_NOTICE
> +		       "RAMDISK: erofs filesystem found at block %d\n",
> +		       start_block);
> +
> +		nblocks = ((erofsb->blocks << erofsb->blkszbits) + BLOCK_SIZE - 1)
> +			>> BLOCK_SIZE_BITS;

le32_to_cpu(erofsb->blocks)

> +
> +		goto done;
> +	}
> +
>  	printk(KERN_NOTICE
>  	       "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
>  	       start_block);
>

This seems to be broken for cramfs and minix, too.

Thomas

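
The two `le32_to_cpu()` remarks above can be illustrated outside the kernel. This is a minimal userspace sketch, not the kernel fix: `get_le32()` stands in for `le32_to_cpu()`, and `struct toy_erofs_sb` is a hypothetical, simplified view that holds only the three fields the hunk reads, not the real `struct erofs_super_block` layout. `erofs_nblocks()` is likewise a made-up name. The sketch also widens the shift to 64 bits, which is an addition beyond what the review asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE_BITS 10
#define BLOCK_SIZE (1 << BLOCK_SIZE_BITS)
#define EROFS_SUPER_MAGIC_V1 0xE0F5E1E2u

/* Decode a little-endian 32-bit value byte by byte, so the result is the
 * same on little-endian and big-endian hosts (what le32_to_cpu() ensures). */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Hypothetical, simplified superblock view; NOT the real on-disk layout. */
struct toy_erofs_sb {
	uint8_t magic[4];	/* little-endian on disk */
	uint8_t blocks[4];	/* little-endian on disk */
	uint8_t blkszbits;	/* log2 of the filesystem block size */
};

/* Image size in 1 KiB ramdisk blocks (rounded up), or -1 if not erofs. */
int64_t erofs_nblocks(const struct toy_erofs_sb *sb)
{
	uint64_t bytes;

	assert(sb != NULL);
	if (get_le32(sb->magic) != EROFS_SUPER_MAGIC_V1)
		return -1;
	if (sb->blkszbits >= 32)	/* implausible value; avoid overflow */
		return -1;

	/* Widen before shifting so large images do not wrap at 32 bits. */
	bytes = (uint64_t)get_le32(sb->blocks) << sb->blkszbits;
	return (int64_t)((bytes + BLOCK_SIZE - 1) >> BLOCK_SIZE_BITS);
}
```

For example, a superblock with 3 blocks of 4096 bytes (`blkszbits` of 12) describes 12288 bytes, which is 12 ramdisk blocks of 1 KiB.
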
Hi Al!

On Fri, 2025-03-21 at 02:08 +0000, Al Viro wrote:
> On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> > From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> >
> > Add erofs detection to the initrd mount code. This allows systems to
> > boot from an erofs-based initrd in the same way as they can boot from
> > a squashfs initrd.
> >
> > Just as squashfs initrds, erofs images as initrds are a good option
> > for systems that are memory-constrained.
> >
> > Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
>
> >  #include "do_mounts.h"
> >  #include "../fs/squashfs/squashfs_fs.h"
> > +#include "../fs/erofs/erofs_fs.h"
>
> This is getting really unpleasant...
>
> Folks, could we do something similar to initcalls - add a section
> (.init.text.rd_detect?) with array of pointers to __init functions
> that would be called by that thing in turn?  With filesystems that
> want to add that kind of stuff being able to do something like

That's a great suggestion! I wonder whether it's possible to restructure the
code further, so it does something along these lines:

1. Try to detect whether it's a (compressed) initramfs -> if so, extract and
   be done.
2. Copy the initrd onto /dev/ram and just try to mount it (without the manual
   fs detection).

As far as I understand it, the kernel should be able to figure out by itself
what filesystem it is? If so, this removes the manual fs detection and just
allows all filesystem images to be used as initrd.

I'm a noob in this part of the code base, so feel free to tell me that/why
this is not possible. :)

Julian

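
Step 1 of the flow suggested above amounts to a check on the first bytes of the image. The following is a rough userspace sketch of that idea, not existing kernel code: `classify_initrd()` and the enum are hypothetical names, and only gzip, xz and zstd are recognized here as examples of compressed payloads. A compressed blob would still have to be decompressed and checked again, since it may contain either a cpio archive or a filesystem image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classification used only in this sketch. */
enum initrd_kind {
	INITRD_CPIO,		/* uncompressed cpio archive: unpack it */
	INITRD_COMPRESSED,	/* decompress first, then classify again */
	INITRD_FS_IMAGE		/* anything else: copy to /dev/ram and try to mount */
};

enum initrd_kind classify_initrd(const uint8_t *buf, size_t len)
{
	static const uint8_t gzip_magic[2] = { 0x1f, 0x8b };
	static const uint8_t xz_magic[6] = { 0xfd, '7', 'z', 'X', 'Z', 0x00 };
	static const uint8_t zstd_magic[4] = { 0x28, 0xb5, 0x2f, 0xfd };

	assert(buf != NULL);

	/* cpio "newc" archives start with ASCII "070701" ("070702" with CRC). */
	if (len >= 6 && (memcmp(buf, "070701", 6) == 0 ||
			 memcmp(buf, "070702", 6) == 0))
		return INITRD_CPIO;

	if (len >= sizeof gzip_magic &&
	    memcmp(buf, gzip_magic, sizeof gzip_magic) == 0)
		return INITRD_COMPRESSED;
	if (len >= sizeof xz_magic &&
	    memcmp(buf, xz_magic, sizeof xz_magic) == 0)
		return INITRD_COMPRESSED;
	if (len >= sizeof zstd_magic &&
	    memcmp(buf, zstd_magic, sizeof zstd_magic) == 0)
		return INITRD_COMPRESSED;

	/* No archive or compression magic: leave it to the mount code. */
	return INITRD_FS_IMAGE;
}
```

Note that this sketch does not answer how large the ramdisk has to be; in the current code that size comes from the per-filesystem detection, which is the part step 2 would drop.
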
On Fri, 2025-03-21 at 13:27 +0800, Gao Xiang wrote:
> Hi Christoph,
>
> On 2025/3/21 13:01, Christoph Hellwig wrote:
> > We've been trying to kill off initrd in favor of initramfs for about
> > two decades.  I don't think adding new file system support to it is
> > helpful.
> >
>
> Disclaimer: I don't know the background of this effort so
> more background might be helpful.

So erofs came up in an effort to improve the experience for users of NixOS on
smaller systems. We use erofs a lot and some people in the community just
consider it a "better" cpio at this point. A great property is that the contents
stay compressed in memory and there is no need to unpack anything at boot.
Others like that the rootfs is read-only by default. In short: erofs is a great
fit.

Of course there are some solutions to using erofs images at boot now:
https://github.com/containers/initoverlayfs

But this adds yet another step in the already complex boot process and feels
like a hack. It would be nice to just use erofs images as initrd. The other
building block to this is automatically sizing /dev/ram0:

https://lkml.org/lkml/2025/3/20/1296

I didn't pack both patches into one series, because I thought enabling erofs
itself would be less controversial and is already useful on its own. The
autosizing of /dev/ram is probably more involved than my RFC patch. I'm hoping
for some input on how to do it right. :)

>
> Two years ago, I considered using EROFS + FSDAX to use the initrd
> image from the bootloader directly, avoiding both the original initrd
> double caching issue (which is what initramfs was proposed to
> resolve) and the unnecessary tmpfs unpack overhead of initramfs:
> https://lore.kernel.org/r/ZXgNQ85PdUKrQU1j@infradead.org
>
> Also, EROFS supports xattrs (which the cpio format doesn't support),
> so the following potential work is no longer needed, although I
> don't have any interest in following it either:
> https://lore.kernel.org/r/20190523121803.21638-1-roberto.sassu@huawei.com

Thanks for the pointers!

Julian

On Fri, 2025-03-21 at 10:16 +0100, Thomas Weißschuh wrote:
> On Thu, Mar 20, 2025 at 08:28:23PM +0100, Julian Stecklina via B4 Relay wrote:
> > From: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> >
> > Add erofs detection to the initrd mount code. This allows systems to
> > boot from an erofs-based initrd in the same way as they can boot from
> > a squashfs initrd.
> >
> > Just as squashfs initrds, erofs images as initrds are a good option
> > for systems that are memory-constrained.
> >
> > Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>
> > ---
> >  init/do_mounts_rd.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c
> > index ac021ae6e6fa78c7b7828a78ab2fa3af3611bef3..7c3f8b45b5ed2eea3c534d7f2e65608542009df5 100644
> > --- a/init/do_mounts_rd.c
> > +++ b/init/do_mounts_rd.c
> > @@ -11,6 +11,7 @@
> >
> >  #include "do_mounts.h"
> >  #include "../fs/squashfs/squashfs_fs.h"
> > +#include "../fs/erofs/erofs_fs.h"
> >
> >  #include <linux/decompress/generic.h>
> >
> > @@ -47,6 +48,7 @@ static int __init crd_load(decompress_fn deco);
> >   *	romfs
> >   *	cramfs
> >   *	squashfs
> > + *	erofs
> >   *	gzip
> >   *	bzip2
> >   *	lzma
> > @@ -63,6 +65,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
> >  	struct romfs_super_block *romfsb;
> >  	struct cramfs_super *cramfsb;
> >  	struct squashfs_super_block *squashfsb;
> > +	struct erofs_super_block *erofsb;
> >  	int nblocks = -1;
> >  	unsigned char *buf;
> >  	const char *compress_name;
> > @@ -77,6 +80,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
> >  	romfsb = (struct romfs_super_block *) buf;
> >  	cramfsb = (struct cramfs_super *) buf;
> >  	squashfsb = (struct squashfs_super_block *) buf;
> > +	erofsb = (struct erofs_super_block *) buf;
> >  	memset(buf, 0xe5, size);
> >
> >  	/*
> > @@ -165,6 +169,21 @@ identify_ramdisk_image(struct file *file, loff_t pos,
> >  		goto done;
> >  	}
> >
> > +	/* Try erofs */
> > +	pos = (start_block * BLOCK_SIZE) + EROFS_SUPER_OFFSET;
> > +	kernel_read(file, buf, size, &pos);
> > +
> > +	if (erofsb->magic == EROFS_SUPER_MAGIC_V1) {
>
> le32_to_cpu(erofsb->magic)
>
> > +		printk(KERN_NOTICE
> > +		       "RAMDISK: erofs filesystem found at block %d\n",
> > +		       start_block);
> > +
> > +		nblocks = ((erofsb->blocks << erofsb->blkszbits) + BLOCK_SIZE - 1)
> > +			>> BLOCK_SIZE_BITS;
>
> le32_to_cpu(erofsb->blocks)
>
> > +
> > +		goto done;
> > +	}
> > +
> >  	printk(KERN_NOTICE
> >  	       "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
> >  	       start_block);
> >
>
> This seems to be broken for cramfs and minix, too.

Great observation. Will fix!

Julian

Hi Julian,

On 2025/3/21 21:17, Julian Stecklina wrote:
> On Fri, 2025-03-21 at 13:27 +0800, Gao Xiang wrote:
>> Hi Christoph,
>>
>> On 2025/3/21 13:01, Christoph Hellwig wrote:
>>> We've been trying to kill off initrd in favor of initramfs for about
>>> two decades.  I don't think adding new file system support to it is
>>> helpful.
>>>
>>
>> Disclaimer: I don't know the background of this effort so
>> more background might be helpful.
>
> So erofs came up in an effort to improve the experience for users of NixOS on
> smaller systems. We use erofs a lot and some people in the community just
> consider it a "better" cpio at this point. A great property is that the contents
> stay compressed in memory and there is no need to unpack anything at boot.
> Others like that the rootfs is read-only by default. In short: erofs is a great
> fit.
>
> Of course there are some solutions to using erofs images at boot now:
> https://github.com/containers/initoverlayfs
>
> But this adds yet another step in the already complex boot process and feels
> like a hack. It would be nice to just use erofs images as initrd. The other
> building block to this is automatically sizing /dev/ram0:
>
> https://lkml.org/lkml/2025/3/20/1296
>
> I didn't pack both patches into one series, because I thought enabling erofs
> itself would be less controversial and is already useful on its own. The
> autosizing of /dev/ram is probably more involved than my RFC patch. I'm hoping
> for some input on how to do it right. :)

Ok, my own thought is that the cpio format is somewhat inflexible. It
seems that the main original reason for introducing initramfs and
cpio was to avoid double caching, but that can be resolved with FSDAX
now, and initdax totally avoids unnecessary cpio parsing and unpacking.

The cpio format is much like tar: it lacks basic features like random
access and xattrs, which are useful for some use cases, as I mentioned
before.

The initrd image can even be compressed as a whole and decompressed
in the current initramfs way. If you really need on-demand
decompression, you could leave some files compressed, since EROFS
supports per-inode compression, but those files are still double
cached, since data has to be uncompressed in FSDAX mode to support
mmap. You could leave rarely used files compressed.

Thanks,
Gao Xiang

diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c
index ac021ae6e6fa78c7b7828a78ab2fa3af3611bef3..7c3f8b45b5ed2eea3c534d7f2e65608542009df5 100644
--- a/init/do_mounts_rd.c
+++ b/init/do_mounts_rd.c
@@ -11,6 +11,7 @@
 
 #include "do_mounts.h"
 #include "../fs/squashfs/squashfs_fs.h"
+#include "../fs/erofs/erofs_fs.h"
 
 #include <linux/decompress/generic.h>
 
@@ -47,6 +48,7 @@ static int __init crd_load(decompress_fn deco);
  *	romfs
  *	cramfs
  *	squashfs
+ *	erofs
  *	gzip
  *	bzip2
  *	lzma
@@ -63,6 +65,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
 	struct romfs_super_block *romfsb;
 	struct cramfs_super *cramfsb;
 	struct squashfs_super_block *squashfsb;
+	struct erofs_super_block *erofsb;
 	int nblocks = -1;
 	unsigned char *buf;
 	const char *compress_name;
@@ -77,6 +80,7 @@ identify_ramdisk_image(struct file *file, loff_t pos,
 	romfsb = (struct romfs_super_block *) buf;
 	cramfsb = (struct cramfs_super *) buf;
 	squashfsb = (struct squashfs_super_block *) buf;
+	erofsb = (struct erofs_super_block *) buf;
 	memset(buf, 0xe5, size);
 
 	/*
@@ -165,6 +169,21 @@ identify_ramdisk_image(struct file *file, loff_t pos,
 		goto done;
 	}
 
+	/* Try erofs */
+	pos = (start_block * BLOCK_SIZE) + EROFS_SUPER_OFFSET;
+	kernel_read(file, buf, size, &pos);
+
+	if (erofsb->magic == EROFS_SUPER_MAGIC_V1) {
+		printk(KERN_NOTICE
+		       "RAMDISK: erofs filesystem found at block %d\n",
+		       start_block);
+
+		nblocks = ((erofsb->blocks << erofsb->blkszbits) + BLOCK_SIZE - 1)
+			>> BLOCK_SIZE_BITS;
+
+		goto done;
+	}
+
 	printk(KERN_NOTICE
 	       "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
 	       start_block);