Message ID | 1518813234-5874-2-git-send-email-takondra@cisco.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 02/16/18 12:33, Taras Kondratiuk wrote: > Many of the Linux security/integrity features are dependent on file > metadata, stored as extended attributes (xattrs), for making decisions. > These features need to be initialized during initcall and enabled as > early as possible for complete security coverage. > > Initramfs (tmpfs) supports xattrs, but newc CPIO archive format does not > support including them into the archive. > > This patch describes "extended" newc format (newcx) that is based on > newc and has following changes: > - extended attributes support > - increased size of filesize to support files >4GB > - increased mtime field size to have 64 bits of seconds and added a > field for nanoseconds > - removed unused checksum field > If you are going to implement a new, non-backwards-compatible format, you shouldn't replicate the mistakes of the current format. Specifically: 1. The use of ASCII-encoded fixed-length numbers is an idiotic legacy from an era before there were any portable way of dealing with numbers with prespecified endianness. If you are going to use ASCII, make them delimited so that they don't have fixed limits, or just use binary. The cpio header isn't fixed size, so that argument goes away, in fact the only way to determine the end of the header is to scan forward. 2. Alignment sensitivity! Because there is no header length information, the above scan tells you where the header ends, but there is padding before the data, and the size of that padding is only defined by alignment. 3. Inband encoding of EOF: if you actually have a filename "TRAILER!!!" you have problems. But first, before you define a whole new format for which no tools exist (you will have to work with the maintainers of the GNU tools to add support) you should see how complex it would be to support the POSIX tar/pax format, which already has all the features you are seeking, and by now is well-supported. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/16/2018 02:59 PM, H. Peter Anvin wrote: > On 02/16/18 12:33, Taras Kondratiuk wrote: >> Many of the Linux security/integrity features are dependent on file >> metadata, stored as extended attributes (xattrs), for making decisions. >> These features need to be initialized during initcall and enabled as >> early as possible for complete security coverage. >> >> Initramfs (tmpfs) supports xattrs, but newc CPIO archive format does not >> support including them into the archive. >> >> This patch describes "extended" newc format (newcx) that is based on >> newc and has following changes: >> - extended attributes support >> - increased size of filesize to support files >4GB >> - increased mtime field size to have 64 bits of seconds and added a >> field for nanoseconds >> - removed unused checksum field >> > > If you are going to implement a new, non-backwards-compatible format, > you shouldn't replicate the mistakes of the current format. Specifically: So rather than make minimal changes to the existing format and continue to support the existing format (sharing as much code as possible), you recommend gratuitous aesthetic changes? > 1. The use of ASCII-encoded fixed-length numbers is an idiotic legacy > from an era before there were any portable way of dealing with numbers > with prespecified endianness. It lets encoders and decoders easily share code with the existing cpio format, which we still intend to be able to read and write. > If you are going to use ASCII, make them > delimited so that they don't have fixed limits, or just use binary. When it's gzipped this accomplishes what? (Other than being gratuitously different from the previous iteration?) > The cpio header isn't fixed size, so that argument goes away, in fact > the only way to determine the end of the header is to scan forward. > > 2. Alignment sensitivity! Because there is no header length > information, the above scan tells you where the header ends, but there > is padding before the data, and the size of that padding is only defined > by alignment. Again, these are minimal changes to the existing cpio format. You're complaining about _cpio_, and that the new stuff isn't _different_ enough from it. > 3. Inband encoding of EOF: if you actually have a filename "TRAILER!!!" > you have problems. Been there, done that: http://lkml.iu.edu/hypermail/linux/kernel/1801.3/01791.html > But first, before you define a whole new format for which no tools exist > (you will have to work with the maintainers of the GNU tools to add > support) No, he's been working with the maintainer of toybox to add support (for about a year now), which gets him the Android command line. And the kernel has its own built-in tool to generate cpio images anyway. Why would anyone care what the GNU project thinks? > you should see how complex it would be to support the POSIX > tar/pax format, That argument was had (at length) when initramfs went in over a decade ago. There are links in Documentation/filesystems/ramfs-rootfs-initramfs.txt to the mailing list entries about it. > which already has all the features you are seeking, and > by now is well-supported. So... tar wasn't well-supported 15 years ago? (Hasn't the kernel source always been distributed via tarball back since 0.0.1?) You're suggesting having a whole second codepath that shares no code with the existing cpio extractor. Are you suggesting abandoning support for the existing initramfs.cpio.gz file format? Rob -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 16 Feb 2018, Rob Landley wrote: > > On 02/16/2018 02:59 PM, H. Peter Anvin wrote: >> On 02/16/18 12:33, Taras Kondratiuk wrote: >>> Many of the Linux security/integrity features are dependent on file >>> metadata, stored as extended attributes (xattrs), for making decisions. >>> These features need to be initialized during initcall and enabled as >>> early as possible for complete security coverage. >>> >>> Initramfs (tmpfs) supports xattrs, but newc CPIO archive format does not >>> support including them into the archive. >>> >>> This patch describes "extended" newc format (newcx) that is based on >>> newc and has following changes: >>> - extended attributes support >>> - increased size of filesize to support files >4GB >>> - increased mtime field size to have 64 bits of seconds and added a >>> field for nanoseconds >>> - removed unused checksum field >>> >> >> If you are going to implement a new, non-backwards-compatible format, >> you shouldn't replicate the mistakes of the current format. Specifically: > > So rather than make minimal changes to the existing format and continue to > support the existing format (sharing as much code as possible), you recommend > gratuitous aesthetic changes? > >> 1. The use of ASCII-encoded fixed-length numbers is an idiotic legacy >> from an era before there were any portable way of dealing with numbers >> with prespecified endianness. > > It lets encoders and decoders easily share code with the existing cpio format, > which we still intend to be able to read and write. > >> If you are going to use ASCII, make them >> delimited so that they don't have fixed limits, or just use binary. > > When it's gzipped this accomplishes what? (Other than being gratuitously > different from the previous iteration?) > >> The cpio header isn't fixed size, so that argument goes away, in fact >> the only way to determine the end of the header is to scan forward. >> >> 2. Alignment sensitivity! Because there is no header length >> information, the above scan tells you where the header ends, but there >> is padding before the data, and the size of that padding is only defined >> by alignment. > > Again, these are minimal changes to the existing cpio format. You're complaining > about _cpio_, and that the new stuff isn't _different_ enough from it. > >> 3. Inband encoding of EOF: if you actually have a filename "TRAILER!!!" >> you have problems. > > Been there, done that: > > http://lkml.iu.edu/hypermail/linux/kernel/1801.3/01791.html > >> But first, before you define a whole new format for which no tools exist >> (you will have to work with the maintainers of the GNU tools to add >> support) > > No, he's been working with the maintainer of toybox to add support (for about a > year now), which gets him the Android command line. And the kernel has its own > built-in tool to generate cpio images anyway. > > Why would anyone care what the GNU project thinks? In our internal use of this patch series we do use gnu cpio to create initramfs.cpio. And reference to gnu cpio patch that supports newcx format is posted in description for this serieis: https://raw.githubusercontent.com/victorkamensky/initramfs-xattrs-poky/rocko/meta/recipes-extended/cpio/cpio-2.12/cpio-xattrs.patch Whether GNU cpio maintainers will accept it is different matter. We will try, but we need to start somewhere and agree on new format first. Thanks, Victor >> you should see how complex it would be to support the POSIX >> tar/pax format, > > That argument was had (at length) when initramfs went in over a decade ago. > There are links in Documentation/filesystems/ramfs-rootfs-initramfs.txt to the > mailing list entries about it. > >> which already has all the features you are seeking, and >> by now is well-supported. > > So... tar wasn't well-supported 15 years ago? (Hasn't the kernel source always > been distributed via tarball back since 0.0.1?) > > You're suggesting having a whole second codepath that shares no code with the > existing cpio extractor. Are you suggesting abandoning support for the existing > initramfs.cpio.gz file format? > > Rob > -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On February 16, 2018 1:47:35 PM PST, Victor Kamensky <kamensky@cisco.com> wrote: > > >On Fri, 16 Feb 2018, Rob Landley wrote: > >> >> On 02/16/2018 02:59 PM, H. Peter Anvin wrote: >>> On 02/16/18 12:33, Taras Kondratiuk wrote: >>>> Many of the Linux security/integrity features are dependent on file >>>> metadata, stored as extended attributes (xattrs), for making >decisions. >>>> These features need to be initialized during initcall and enabled >as >>>> early as possible for complete security coverage. >>>> >>>> Initramfs (tmpfs) supports xattrs, but newc CPIO archive format >does not >>>> support including them into the archive. >>>> >>>> This patch describes "extended" newc format (newcx) that is based >on >>>> newc and has following changes: >>>> - extended attributes support >>>> - increased size of filesize to support files >4GB >>>> - increased mtime field size to have 64 bits of seconds and added a >>>> field for nanoseconds >>>> - removed unused checksum field >>>> >>> >>> If you are going to implement a new, non-backwards-compatible >format, >>> you shouldn't replicate the mistakes of the current format. >Specifically: >> >> So rather than make minimal changes to the existing format and >continue to >> support the existing format (sharing as much code as possible), you >recommend >> gratuitous aesthetic changes? >> >>> 1. The use of ASCII-encoded fixed-length numbers is an idiotic >legacy >>> from an era before there were any portable way of dealing with >numbers >>> with prespecified endianness. >> >> It lets encoders and decoders easily share code with the existing >cpio format, >> which we still intend to be able to read and write. >> >>> If you are going to use ASCII, make them >>> delimited so that they don't have fixed limits, or just use binary. >> >> When it's gzipped this accomplishes what? (Other than being >gratuitously >> different from the previous iteration?) >> >>> The cpio header isn't fixed size, so that argument goes away, in >fact >>> the only way to determine the end of the header is to scan forward. >>> >>> 2. Alignment sensitivity! Because there is no header length >>> information, the above scan tells you where the header ends, but >there >>> is padding before the data, and the size of that padding is only >defined >>> by alignment. >> >> Again, these are minimal changes to the existing cpio format. You're >complaining >> about _cpio_, and that the new stuff isn't _different_ enough from >it. >> >>> 3. Inband encoding of EOF: if you actually have a filename >"TRAILER!!!" >>> you have problems. >> >> Been there, done that: >> >> http://lkml.iu.edu/hypermail/linux/kernel/1801.3/01791.html >> >>> But first, before you define a whole new format for which no tools >exist >>> (you will have to work with the maintainers of the GNU tools to add >>> support) >> >> No, he's been working with the maintainer of toybox to add support >(for about a >> year now), which gets him the Android command line. And the kernel >has its own >> built-in tool to generate cpio images anyway. >> >> Why would anyone care what the GNU project thinks? > >In our internal use of this patch series we do use gnu cpio >to create initramfs.cpio. > >And reference to gnu cpio patch that supports newcx format is >posted in description for this serieis: > >https://raw.githubusercontent.com/victorkamensky/initramfs-xattrs-poky/rocko/meta/recipes-extended/cpio/cpio-2.12/cpio-xattrs.patch > >Whether GNU cpio maintainers will accept it is different matter. >We will try, but we need to start somewhere and agree on >new format first. > >Thanks, >Victor > >>> you should see how complex it would be to support the POSIX >>> tar/pax format, >> >> That argument was had (at length) when initramfs went in over a >decade ago. >> There are links in >Documentation/filesystems/ramfs-rootfs-initramfs.txt to the >> mailing list entries about it. >> >>> which already has all the features you are seeking, and >>> by now is well-supported. >> >> So... tar wasn't well-supported 15 years ago? (Hasn't the kernel >source always >> been distributed via tarball back since 0.0.1?) >> >> You're suggesting having a whole second codepath that shares no code >with the >> existing cpio extractor. Are you suggesting abandoning support for >the existing >> initramfs.cpio.gz file format? >> >> Rob >> Introducing new, incompatible data formats is an inherently *very* costly operation; unfortunately many engineers don't seem to have a good grip of just *how* expensive it is (see "silly embedded nonsense hacks", "too little, too soon".) Cpio itself is a great horror show of just how bad this gets: a bunch of minor tweaks without finding underlying design bugs resulting in a ton of mutually incompatible formats. "They are almost the same" doesn't help: they are still incompatible. Introducing a new incompatible data format without strong justification is engineering malpractice. Doing it under the non-justification of expedience ("oh, we can share most of the code") is aggravated engineering malpractice. It is entirely possible that the modern posix tar/pax format is too complex to be practical in this case – that would be justifying a new format. But then you are taking the fundamental cost of breakage, and then the new format definitely should not be replicating known defects of another format and without at least some thought about how to avoid it in the future.
Quoting hpa@zytor.com (2018-02-16 16:00:36) > On February 16, 2018 1:47:35 PM PST, Victor Kamensky <kamensky@cisco.com> wrote: > > > > > >On Fri, 16 Feb 2018, Rob Landley wrote: > > > >> > >> On 02/16/2018 02:59 PM, H. Peter Anvin wrote: > >>> On 02/16/18 12:33, Taras Kondratiuk wrote: > >>>> Many of the Linux security/integrity features are dependent on file > >>>> metadata, stored as extended attributes (xattrs), for making > >decisions. > >>>> These features need to be initialized during initcall and enabled > >as > >>>> early as possible for complete security coverage. > >>>> > >>>> Initramfs (tmpfs) supports xattrs, but newc CPIO archive format > >does not > >>>> support including them into the archive. > >>>> > >>>> This patch describes "extended" newc format (newcx) that is based > >on > >>>> newc and has following changes: > >>>> - extended attributes support > >>>> - increased size of filesize to support files >4GB > >>>> - increased mtime field size to have 64 bits of seconds and added a > >>>> field for nanoseconds > >>>> - removed unused checksum field > >>>> > >>> > >>> If you are going to implement a new, non-backwards-compatible > >format, > >>> you shouldn't replicate the mistakes of the current format. > >Specifically: > >> > >> So rather than make minimal changes to the existing format and > >continue to > >> support the existing format (sharing as much code as possible), you > >recommend > >> gratuitous aesthetic changes? > >> > >>> 1. The use of ASCII-encoded fixed-length numbers is an idiotic > >legacy > >>> from an era before there were any portable way of dealing with > >numbers > >>> with prespecified endianness. > >> > >> It lets encoders and decoders easily share code with the existing > >cpio format, > >> which we still intend to be able to read and write. > >> > >>> If you are going to use ASCII, make them > >>> delimited so that they don't have fixed limits, or just use binary. > >> > >> When it's gzipped this accomplishes what? (Other than being > >gratuitously > >> different from the previous iteration?) > >> > >>> The cpio header isn't fixed size, so that argument goes away, in > >fact > >>> the only way to determine the end of the header is to scan forward. > >>> > >>> 2. Alignment sensitivity! Because there is no header length > >>> information, the above scan tells you where the header ends, but > >there > >>> is padding before the data, and the size of that padding is only > >defined > >>> by alignment. > >> > >> Again, these are minimal changes to the existing cpio format. You're > >complaining > >> about _cpio_, and that the new stuff isn't _different_ enough from > >it. > >> > >>> 3. Inband encoding of EOF: if you actually have a filename > >"TRAILER!!!" > >>> you have problems. > >> > >> Been there, done that: > >> > >> http://lkml.iu.edu/hypermail/linux/kernel/1801.3/01791.html > >> > >>> But first, before you define a whole new format for which no tools > >exist > >>> (you will have to work with the maintainers of the GNU tools to add > >>> support) > >> > >> No, he's been working with the maintainer of toybox to add support > >(for about a > >> year now), which gets him the Android command line. And the kernel > >has its own > >> built-in tool to generate cpio images anyway. > >> > >> Why would anyone care what the GNU project thinks? > > > >In our internal use of this patch series we do use gnu cpio > >to create initramfs.cpio. > > > >And reference to gnu cpio patch that supports newcx format is > >posted in description for this serieis: > > > >https://raw.githubusercontent.com/victorkamensky/initramfs-xattrs-poky/rocko/meta/recipes-extended/cpio/cpio-2.12/cpio-xattrs.patch > > > >Whether GNU cpio maintainers will accept it is different matter. > >We will try, but we need to start somewhere and agree on > >new format first. > > > >Thanks, > >Victor > > > >>> you should see how complex it would be to support the POSIX > >>> tar/pax format, > >> > >> That argument was had (at length) when initramfs went in over a > >decade ago. > >> There are links in > >Documentation/filesystems/ramfs-rootfs-initramfs.txt to the > >> mailing list entries about it. > >> > >>> which already has all the features you are seeking, and > >>> by now is well-supported. > >> > >> So... tar wasn't well-supported 15 years ago? (Hasn't the kernel > >source always > >> been distributed via tarball back since 0.0.1?) > >> > >> You're suggesting having a whole second codepath that shares no code > >with the > >> existing cpio extractor. Are you suggesting abandoning support for > >the existing > >> initramfs.cpio.gz file format? > >> > >> Rob > >> > > Introducing new, incompatible data formats is an inherently *very* costly operation; unfortunately many engineers don't seem to have a good grip of just *how* expensive it is (see "silly embedded nonsense hacks", "too little, too soon".) > > Cpio itself is a great horror show of just how bad this gets: a bunch of minor tweaks without finding underlying design bugs resulting in a ton of mutually incompatible formats. "They are almost the same" doesn't help: they are still incompatible. > > Introducing a new incompatible data format without strong justification is engineering malpractice. Doing it under the non-justification of expedience ("oh, we can share most of the code") is aggravated engineering malpractice. > > It is entirely possible that the modern posix tar/pax format is too complex to be practical in this case – that would be justifying a new format. But then you are taking the fundamental cost of breakage, and then the new format definitely should not be replicating known defects of another format and without at least some thought about how to avoid it in the future. I do understand a cost of adding a new format and I'd be very happy not to do it if there is a better option. I did consider using tar/pax, but looks like it was already discussed in 2001 between you and Al Viro [1] and tar was rejected. My main tar concerns: - ustar+pax header is *huge*. E.g. directory entry in archive: pax 1536 bytes vs cpio <200 bytes. Overall compressed initramfs size increase is not significant though. - pax is not a strict format. E.g. xattrs may be stored under different names: SHCILY.xattr (GNU tar, star) vs LIBARCHIVE.xattr (libarchive). I'm not sure which option is better. Adding tar to the kernel or adding new cpio format into several tools (GNU cpio, libarchive, busybox, toybox) will result in approximately the same amount of code. It would be nice to get Al Viro's thoughts on this. [1] https://web.archive.org/web/20060909041730/http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1638.html -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/16/2018 06:00 PM, hpa@zytor.com wrote: > Introducing new, incompatible data formats is an inherently *very* > costly operation; unfortunately many engineers don't seem to have a good grip > of just *how* expensive it is (see "silly embedded nonsense hacks", "too > little, too soon".) So your argument is we should use the _existing_ cpio format that supports xattrs? You keep bringing up the embedded world as a thing you don't understand and is thus bad. I remember when you dismissed "I would like to constrain my cross-compiling dependencies to a minimal set" as a... what did you call it, a silly academic exercise? (Googles...) https://lkml.org/lkml/2008/2/15/548 > Cpio itself is a great horror show of just how bad this gets: That's not what you said last time? http://lkml.iu.edu/hypermail/linux/kernel/0112.2/1540.html > Introducing a new incompatible data format without strong justification Here's you suggesting a new format when initramfs first went in, because you disliked _both_ tar and cpio: http://lkml.iu.edu/hypermail/linux/kernel/0112.2/1587.html Seriously, there is a "why cpio rather than tar" section of https://www.kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt with links to the messages. (www.uwsg became lkml in the links, I should submit a patch fixing that, it redirected 6 months ago...) We've _had_ this argument already. You are not bringing up _new_ arguments. This patch set is because people want xattrs in initramfs. I still don't personally understand why they want this, but they do. We need to still support the existing file format for the forseeable future, and we might as well fix y2038 while we're there (treating it as unsigned buys us a lot of decades, but as long as we're bumping the version number anyway...). Otherwise it tries to be the minimal set of changes to get us there. (My first stab at this was dealing with sparse files, but runs of zeroes gzip pretty well and tmpfs could always make itself sparse after the fact...) > Doing it under the non-justification of expedience ("oh, we can share most> > of the code") is aggravated engineering malpractice. Coming from the guy who added perl as a build dependency to every project he maintained simultaneously (the linux kernel, your bootloader, klibc), that seems a lot more like an opinion than an objective metric. > It is entirely possible that the modern posix tar/pax format is too complex In the link above you declared it too complex in 2001. Partly because the gnu tar and pax formats aren't really the same format. > to be practical in this case – that would be justifying a new format. But > then you are taking the fundamental cost of breakage, and then the new format > definitely should not be replicating known defects of another format and > without at least some thought about how to avoid it in the future. Didn't Linus flame more than one developer for ripping things out and replacing them with a new untested thing rather than leaving a trail of breadcrumbs from a known working thing to another known working thing? Or has the right way to do it changed since the 2.5 development cycle? Strangely the poor souls suffering under the burden of cpio to use initramfs today haven't been screaming out their agony in a detectable way. (They're mad the kernel doesn't give better feedback about why init failed to launch and it either paniced or fell through to the fallback ROOT=, my patch to make devtmpfs_mount work for initramfs was trying to fix the "you pointed the kernel at a root filesystem directory which it cpio'd up but there was /dev/console in it so your init has no stdin/stdout/stderr and dies immediately because of it" problem. And the recent thread about "please don't add a third knob to make initramfs be tmpfs instead of ramfs" was another corner case of that). And I have half an INITRAMFS_VERBOSE patch around here somewhere to printk() a lot more status (and I need to update the initramfs documentation I wrote to help people have an easier time using it...) But that's not about archive format. That's kernel userspace bringup being persnickety. The silent majority you speak for on this archive format issue is pretty darn silent. Was this recorded as a problem for you before somebody suggested changing it? I tend to be public about https://twitter.com/landley/status/964620648050982912 and collect links to other people's concerns when I notice... Or is this just your opinion? Rob -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2018-02-16 at 12:59 -0800, H. Peter Anvin wrote: > On 02/16/18 12:33, Taras Kondratiuk wrote: > > Many of the Linux security/integrity features are dependent on file > > metadata, stored as extended attributes (xattrs), for making decisions. > > These features need to be initialized during initcall and enabled as > > early as possible for complete security coverage. > > > > Initramfs (tmpfs) supports xattrs, but newc CPIO archive format does not > > support including them into the archive. > > > > This patch describes "extended" newc format (newcx) that is based on > > newc and has following changes: > > - extended attributes support > > - increased size of filesize to support files >4GB > > - increased mtime field size to have 64 bits of seconds and added a > > field for nanoseconds > > - removed unused checksum field > > > > If you are going to implement a new, non-backwards-compatible format, > you shouldn't replicate the mistakes of the current format. Specifically: > > 1. The use of ASCII-encoded fixed-length numbers is an idiotic legacy > from an era before there were any portable way of dealing with numbers > with prespecified endianness. If you are going to use ASCII, make them > delimited so that they don't have fixed limits, or just use binary. > > The cpio header isn't fixed size, so that argument goes away, in fact > the only way to determine the end of the header is to scan forward. > > 2. Alignment sensitivity! Because there is no header length > information, the above scan tells you where the header ends, but there > is padding before the data, and the size of that padding is only defined > by alignment. > > 3. Inband encoding of EOF: if you actually have a filename "TRAILER!!!" > you have problems. > > But first, before you define a whole new format for which no tools exist > (you will have to work with the maintainers of the GNU tools to add > support) you should see how complex it would be to support the POSIX > tar/pax format, which already has all the features you are seeking, and > by now is well-supported. The discussion about including xattrs in the initramfs didn't start yesterday. It's been on the list of measurement/appraisal gaps that need to be closed for years. Initially I planned on using tar, but at the 2014 Kernel Summit I spoke with Al at length. At the time, he was very clear that tar is unnecessarily overly complicated and recommended extending CPIO. I took his advice. Unfortunately, as soon as I posted an initial patch set to include xattrs in CPIO, all of the problems with CPIO had to be addressed before defining a new CPIO number. Unfortunately, this wasn't the only measurement/appraisal gap that needed to be addressed. I've been working on closing other gaps. I'm really happy that someone has taken the time to work on this. Instead of derailing their attempt of extending CPIO to include xattrs, I'd appreciate your making constructive suggestions. Mimi -- To unsubscribe from this list: send the line "unsubscribe linux-security-module" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On February 17, 2018 4:15:12 PM PST, Mimi Zohar <zohar@linux.vnet.ibm.com> wrote: >On Fri, 2018-02-16 at 12:59 -0800, H. Peter Anvin wrote: >> On 02/16/18 12:33, Taras Kondratiuk wrote: >> > Many of the Linux security/integrity features are dependent on file >> > metadata, stored as extended attributes (xattrs), for making >decisions. >> > These features need to be initialized during initcall and enabled >as >> > early as possible for complete security coverage. >> > >> > Initramfs (tmpfs) supports xattrs, but newc CPIO archive format >does not >> > support including them into the archive. >> > >> > This patch describes "extended" newc format (newcx) that is based >on >> > newc and has following changes: >> > - extended attributes support >> > - increased size of filesize to support files >4GB >> > - increased mtime field size to have 64 bits of seconds and added a >> > field for nanoseconds >> > - removed unused checksum field >> > >> >> If you are going to implement a new, non-backwards-compatible format, >> you shouldn't replicate the mistakes of the current format. >Specifically: >> >> 1. The use of ASCII-encoded fixed-length numbers is an idiotic legacy >> from an era before there were any portable way of dealing with >numbers >> with prespecified endianness. If you are going to use ASCII, make >them >> delimited so that they don't have fixed limits, or just use binary. >> >> The cpio header isn't fixed size, so that argument goes away, in fact >> the only way to determine the end of the header is to scan forward. >> >> 2. Alignment sensitivity! Because there is no header length >> information, the above scan tells you where the header ends, but >there >> is padding before the data, and the size of that padding is only >defined >> by alignment. >> >> 3. Inband encoding of EOF: if you actually have a filename >"TRAILER!!!" >> you have problems. >> >> But first, before you define a whole new format for which no tools >exist >> (you will have to work with the maintainers of the GNU tools to add >> support) you should see how complex it would be to support the POSIX >> tar/pax format, which already has all the features you are seeking, >and >> by now is well-supported. > >The discussion about including xattrs in the initramfs didn't start >yesterday. It's been on the list of measurement/appraisal gaps that >need to be closed for years. Initially I planned on using tar, but at >the 2014 Kernel Summit I spoke with Al at length. At the time, he was >very clear that tar is unnecessarily overly complicated and >recommended extending CPIO. > >I took his advice. Unfortunately, as soon as I posted an initial >patch set to include xattrs in CPIO, all of the problems with CPIO had >to be addressed before defining a new CPIO number. Unfortunately, >this wasn't the only measurement/appraisal gap that needed to be >addressed. I've been working on closing other gaps. > >I'm really happy that someone has taken the time to work on this. > Instead of derailing their attempt of extending CPIO to include >xattrs, I'd appreciate your making constructive suggestions. > >Mimi I'm not trying to derail anything, but I do want to see it done well if it is going to be done. I have several ideas already, but I may not have a chance to write them down properly until after the weekend due to family obligations. The assessment of pax/tar is useful; it should be added to the patch documentation in a future set.
On February 17, 2018 4:15:12 PM PST, Mimi Zohar <zohar@linux.vnet.ibm.com> wrote: >On Fri, 2018-02-16 at 12:59 -0800, H. Peter Anvin wrote: >> On 02/16/18 12:33, Taras Kondratiuk wrote: >> > Many of the Linux security/integrity features are dependent on file >> > metadata, stored as extended attributes (xattrs), for making >decisions. >> > These features need to be initialized during initcall and enabled >as >> > early as possible for complete security coverage. >> > >> > Initramfs (tmpfs) supports xattrs, but newc CPIO archive format >does not >> > support including them into the archive. >> > >> > This patch describes "extended" newc format (newcx) that is based >on >> > newc and has following changes: >> > - extended attributes support >> > - increased size of filesize to support files >4GB >> > - increased mtime field size to have 64 bits of seconds and added a >> > field for nanoseconds >> > - removed unused checksum field >> > >> >> If you are going to implement a new, non-backwards-compatible format, >> you shouldn't replicate the mistakes of the current format. >Specifically: >> >> 1. The use of ASCII-encoded fixed-length numbers is an idiotic legacy >> from an era before there were any portable way of dealing with >numbers >> with prespecified endianness. If you are going to use ASCII, make >them >> delimited so that they don't have fixed limits, or just use binary. >> >> The cpio header isn't fixed size, so that argument goes away, in fact >> the only way to determine the end of the header is to scan forward. >> >> 2. Alignment sensitivity! Because there is no header length >> information, the above scan tells you where the header ends, but >there >> is padding before the data, and the size of that padding is only >defined >> by alignment. >> >> 3. Inband encoding of EOF: if you actually have a filename >"TRAILER!!!" >> you have problems. >> >> But first, before you define a whole new format for which no tools >exist >> (you will have to work with the maintainers of the GNU tools to add >> support) you should see how complex it would be to support the POSIX >> tar/pax format, which already has all the features you are seeking, >and >> by now is well-supported. > >The discussion about including xattrs in the initramfs didn't start >yesterday. It's been on the list of measurement/appraisal gaps that >need to be closed for years. Initially I planned on using tar, but at >the 2014 Kernel Summit I spoke with Al at length. At the time, he was >very clear that tar is unnecessarily overly complicated and >recommended extending CPIO. > >I took his advice. Unfortunately, as soon as I posted an initial >patch set to include xattrs in CPIO, all of the problems with CPIO had >to be addressed before defining a new CPIO number. Unfortunately, >this wasn't the only measurement/appraisal gap that needed to be >addressed. I've been working on closing other gaps. > >I'm really happy that someone has taken the time to work on this. > Instead of derailing their attempt of extending CPIO to include >xattrs, I'd appreciate your making constructive suggestions. > >Mimi Do you have a description of the gaps you have identified?
On Sat, 2018-02-17 at 16:26 -0800, hpa@zytor.com wrote:
> Do you have a description of the gaps you have identified?
Probably the 2016 Linux Security Summit (LSS) integrity status update
has the best list.
http://events17.linuxfoundation.org/sites/events/files/slides/LSS2016-
LinuxIntegritySubsystemStatus.pdf
Mimi
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/early-userspace/buffer-format.txt b/Documentation/early-userspace/buffer-format.txt index e1fd7f9dad16..8349b2f28b4b 100644 --- a/Documentation/early-userspace/buffer-format.txt +++ b/Documentation/early-userspace/buffer-format.txt @@ -24,6 +24,7 @@ grammar, where: + indicates concatenation GZIP() indicates the gzip(1) of the operand ALGN(n) means padding with null bytes to an n-byte boundary + [n] means size of field is n bytes initramfs := ("\0" | cpio_archive | cpio_gzip_archive)* @@ -31,20 +32,33 @@ grammar, where: cpio_archive := cpio_file* + (<nothing> | cpio_trailer) - cpio_file := ALGN(4) + cpio_header + filename + "\0" + ALGN(4) + data + cpio_file := (cpio_newc_file | cpio_newcx_file) + + cpio_newc_file := ALGN(4) + cpio_newc_header + filename + "\0" + \ + ALGN(4) + data + + cpio_newcx_file := ALGN(4) + cpio_newcx_header + filename + "\0" + \ + ALGN(4) + xattrs + ALGN(4) + data + + xattrs := xattr_entry* + + xattr_entry := xattr_size[8] + xattr_name + "\0" + xattr_value cpio_trailer := ALGN(4) + cpio_header + "TRAILER!!!\0" + ALGN(4) In human terms, the initramfs buffer contains a collection of -compressed and/or uncompressed cpio archives (in the "newc" or "crc" -formats); arbitrary amounts zero bytes (for padding) can be added -between members. +compressed and/or uncompressed cpio archives; arbitrary amounts +zero bytes (for padding) can be added between members. The cpio "TRAILER!!!" entry (cpio end-of-archive) is optional, but is not ignored; see "handling of hard links" below. -The structure of the cpio_header is as follows (all fields contain +xattr_size is a total size of xattr_entry including 8 bytes of +xattr_size. xattr_size has the same hexadecimal ASCII encoding as other +fields of cpio header (see below). + +The structure of the cpio_newc_header is as follows (all fields contain hexadecimal ASCII numbers fully padded with '0' on the left to the full width of the field, for example, the integer 4780 is represented by the ASCII string "000012ac"): @@ -81,6 +95,28 @@ algorithm used. If the filename is "TRAILER!!!" this is actually an end-of-archive marker; the c_filesize for an end-of-archive marker must be zero. +"Extended" newc format (newcx) +"newcx" cpio format extends "newc" by increasing size of some fields +and adding extended attributes support. cpio_newcx_header structure: + +Field name Field size Meaning +c_magic 6 bytes The string "070703" +c_ino 8 bytes File inode number +c_mode 8 bytes File mode and permissions +c_uid 8 bytes File uid +c_gid 8 bytes File gid +c_nlink 8 bytes Number of links +c_mtime 16 bytes Modification time (seconds) +c_mtime_nsec 8 bytes Modification time (nanoseconds) +c_filesize 16 bytes Size of data field +c_maj 8 bytes Major part of file device number +c_min 8 bytes Minor part of file device number +c_rmaj 8 bytes Major part of device node reference +c_rmin 8 bytes Minor part of device node reference +c_namesize 8 bytes Length of filename, including final \0 +c_xattrs_size 8 bytes Size of xattrs field + +Most of the fields match cpio_newc_header. c_chksum field is dropped. *** Handling of hard links