[v2,0/2] x86/kexec: UKI Support

Message ID	20230911052535.335770-1-kernel@jfarr.cc (mailing list archive)
Headers	show Return-Path: <keyrings-owner@vger.kernel.org> Feedback-ID: i0fc947c4:Fastmail From: Jan Hendrik Farr <kernel@jfarr.cc> To: linux-kernel@vger.kernel.org Cc: kexec@lists.infradead.org, x86@kernel.org, tglx@linutronix.de, dhowells@redhat.com, vgoyal@redhat.com, keyrings@vger.kernel.org, akpm@linux-foundation.org, bhe@redhat.com, bhelgaas@google.com, bluca@debian.org, lennart@poettering.net, Jan Hendrik Farr <kernel@jfarr.cc> Subject: [PATCH v2 0/2] x86/kexec: UKI Support Date: Mon, 11 Sep 2023 07:25:33 +0200 Message-Id: <20230911052535.335770-1-kernel@jfarr.cc> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	x86/kexec: UKI Support \| expand [v2,0/2] x86/kexec: UKI Support [v2,1/2] move pefile_parse_binary to its own file [v2,2/2] x86/kexec: UKI support

Jan Hendrik Farr Sept. 11, 2023, 5:25 a.m. UTC

Hello,

this patch (v2) implements UKI support for kexec_file_load. It will require
support in the kexec-tools userspace utility. For testing purposes the
following can be used: https://github.com/Cydox/kexec-test/

Creating UKIs for testing can be done with ukify (included in systemd),
sbctl, and mkinitcpio, etc.

There has been discussion on this topic in an issue on GitHub that is linked
below for reference.

Changes for v2:
- .cmdline section is now optional
- moving pefile_parse_binary is now in a separate commit for clarity
- parse_pefile.c is now in /lib instead of arch/x86/kernel (not sure if
  this is the best location, but it definetly shouldn't have been in an
  architecture specific location)
- parse_pefile.h is now in include/kernel instead of architecture
  specific location
- if initrd or cmdline is manually supplied EPERM is returned instead of
  being silently ignored
- formatting tweaks


Some links:
- Related discussion: https://github.com/systemd/systemd/issues/28538
- Documentation of UKIs: https://uapi-group.org/specifications/specs/unified_kernel_image/

Jan Hendrik Farr (2):
  move pefile_parse_binary to its own file
  x86/kexec: UKI support

 arch/x86/include/asm/kexec-uki.h       |   7 ++
 arch/x86/kernel/Makefile               |   1 +
 arch/x86/kernel/kexec-uki.c            | 126 +++++++++++++++++++++++++
 arch/x86/kernel/machine_kexec_64.c     |   2 +
 crypto/asymmetric_keys/mscode_parser.c |   2 +-
 crypto/asymmetric_keys/verify_pefile.c | 110 +++------------------
 crypto/asymmetric_keys/verify_pefile.h |  16 ----
 include/linux/parse_pefile.h           |  32 +++++++
 lib/Makefile                           |   3 +
 lib/parse_pefile.c                     | 109 +++++++++++++++++++++
 10 files changed, 292 insertions(+), 116 deletions(-)
 create mode 100644 arch/x86/include/asm/kexec-uki.h
 create mode 100644 arch/x86/kernel/kexec-uki.c
 create mode 100644 include/linux/parse_pefile.h
 create mode 100644 lib/parse_pefile.c

Baoquan He Sept. 12, 2023, 1:03 a.m. UTC | #1

Add Philipp to CC as he is also investigating UKI

On 09/11/23 at 07:25am, Jan Hendrik Farr wrote:
> Hello,
> 
> this patch (v2) implements UKI support for kexec_file_load. It will require
> support in the kexec-tools userspace utility. For testing purposes the
> following can be used: https://github.com/Cydox/kexec-test/
> 
> Creating UKIs for testing can be done with ukify (included in systemd),
> sbctl, and mkinitcpio, etc.

This is awesome work, Jan, thanks.

By the way, could you provide detailed steps about how to test this
patchset so that people interested can give it a shot?

> 
> There has been discussion on this topic in an issue on GitHub that is linked
> below for reference.
> 
> Changes for v2:
> - .cmdline section is now optional
> - moving pefile_parse_binary is now in a separate commit for clarity
> - parse_pefile.c is now in /lib instead of arch/x86/kernel (not sure if
>   this is the best location, but it definetly shouldn't have been in an
>   architecture specific location)
> - parse_pefile.h is now in include/kernel instead of architecture
>   specific location
> - if initrd or cmdline is manually supplied EPERM is returned instead of
>   being silently ignored
> - formatting tweaks
> 
> 
> Some links:
> - Related discussion: https://github.com/systemd/systemd/issues/28538
> - Documentation of UKIs: https://uapi-group.org/specifications/specs/unified_kernel_image/
> 
> Jan Hendrik Farr (2):
>   move pefile_parse_binary to its own file
>   x86/kexec: UKI support
> 
>  arch/x86/include/asm/kexec-uki.h       |   7 ++
>  arch/x86/kernel/Makefile               |   1 +
>  arch/x86/kernel/kexec-uki.c            | 126 +++++++++++++++++++++++++
>  arch/x86/kernel/machine_kexec_64.c     |   2 +
>  crypto/asymmetric_keys/mscode_parser.c |   2 +-
>  crypto/asymmetric_keys/verify_pefile.c | 110 +++------------------
>  crypto/asymmetric_keys/verify_pefile.h |  16 ----
>  include/linux/parse_pefile.h           |  32 +++++++
>  lib/Makefile                           |   3 +
>  lib/parse_pefile.c                     | 109 +++++++++++++++++++++
>  10 files changed, 292 insertions(+), 116 deletions(-)
>  create mode 100644 arch/x86/include/asm/kexec-uki.h
>  create mode 100644 arch/x86/kernel/kexec-uki.c
>  create mode 100644 include/linux/parse_pefile.h
>  create mode 100644 lib/parse_pefile.c
> 
> -- 
> 2.40.1
>

Jan Hendrik Farr Sept. 12, 2023, 7:25 p.m. UTC | #2

> By the way, could you provide detailed steps about how to test this
> patchset so that people interested can give it a shot?

Sure.

0. Build and run your kernel with my patches.
1. You're gonna need a UKI to kexec. I personally use sbctl or ukify to create them.

sbctl command example (assumes you have the cmdline you want saved in a file called cmdline):
sudo sbctl bundle -k /boot/vmlinuz-6.5.2 -f /boot/initrd.img-6.5.2 -c ./cmdline -s ./uki.efi

ukify command example:
sudo python3 ukify.py build --linux=/boot/vmlinuz-6.5.2 --initrd=/boot/initrd.img-6.5.2 --cmdline <cmdline as string or a filename>

2. If you are running in lockdown mode you'll have to sign the UKI. You can use sbctl, pesign, or sbsign for example.
3. Compile kexec-test (see links below). Simple "gcc main.c -o kexec-test" should work
4. Do the kexec load: ./kexec-text <path to uki>
(this is equivalent to "kexec -a -l <path to UKI>", however that currently complains about not recognizing the format)
5. At this point it's useful to check if the loading succeeded with: "cat /sys/kernel/kexec_loaded" (should return "1")
6. Do a kexec reboot. If you are running systemd, the best way is with: "systemctl kexec". Otherwise you can try "kexec -e", however this will not shut all your services down

If anyone has problems please feel free to ask.

Links:
sbctl: https://github.com/Foxboron/sbctl
ukify: https://github.com/systemd/systemd/tree/main/src/ukify
kexec-test: https://github.com/Cydox/kexec-test/

Philipp Rudo Sept. 13, 2023, 2 p.m. UTC | #3

Hi Jan,

All in all the code looks fine to me. Nevertheless I don't think UKI support
should be added at the moment. This is because IMHO you basically reinterpret
the kexec_file systemcall and thus add a new uapi to the kernel. Once
introduced it is extremely hard to remove or change an uapi again. The problem
I see is that the spec you based your work on is in such a poor shape that I
don't feel comfortable to add any new uapi based on it.

For example there are two definitions for the UKI which contradict each other.
The dedicated one [1] you have cited earlier and the one in the BLS for type #2
entries [2]. In [1] the .linux and .initrd sections are mandatory and the
.osrel and .cmdline sections are optional while in [2] it is the other way
round. Which definition should the kernel follow?

Furthermore, I absolutely don't understand how the spec should be read. All
the spec does is defining some file formats. There is no word about which
component in the boot chain is supposed to handle them and what exactly this
component is supposed to do with it. But that is crucial if we want to add UKI
support for kexec as the kexec systemcall will replace the stub. So we need to
know what tasks the stub is supposed to perform. Currently this is only some
implementation detail of the systemd-stub [3] that can change any moment and I
strongly oppose to base any uapi on it.

In the end the only benefit this series brings is to extend the signature
checking on the whole UKI except of just the kernel image. Everything else can
also be done in user space. Compared to the problems described above this is a
very small gain for me.

Until the spec got fixed I don't see a chance to add UKI support for kexec.

Thanks
Philipp

[1] https://uapi-group.org/specifications/specs/unified_kernel_image/
[2] https://uapi-group.org/specifications/specs/boot_loader_specification/#type-2-efi-unified-kernel-images
[3] https://www.freedesktop.org/software/systemd/man/systemd-stub.html

On Mon, 11 Sep 2023 07:25:33 +0200
Jan Hendrik Farr <kernel@jfarr.cc> wrote:

> Hello,
> 
> this patch (v2) implements UKI support for kexec_file_load. It will require
> support in the kexec-tools userspace utility. For testing purposes the
> following can be used: https://github.com/Cydox/kexec-test/
> 
> Creating UKIs for testing can be done with ukify (included in systemd),
> sbctl, and mkinitcpio, etc.
> 
> There has been discussion on this topic in an issue on GitHub that is linked
> below for reference.
> 
> Changes for v2:
> - .cmdline section is now optional
> - moving pefile_parse_binary is now in a separate commit for clarity
> - parse_pefile.c is now in /lib instead of arch/x86/kernel (not sure if
>   this is the best location, but it definetly shouldn't have been in an
>   architecture specific location)
> - parse_pefile.h is now in include/kernel instead of architecture
>   specific location
> - if initrd or cmdline is manually supplied EPERM is returned instead of
>   being silently ignored
> - formatting tweaks
> 
> 
> Some links:
> - Related discussion: https://github.com/systemd/systemd/issues/28538
> - Documentation of UKIs: https://uapi-group.org/specifications/specs/unified_kernel_image/
> 
> Jan Hendrik Farr (2):
>   move pefile_parse_binary to its own file
>   x86/kexec: UKI support
> 
>  arch/x86/include/asm/kexec-uki.h       |   7 ++
>  arch/x86/kernel/Makefile               |   1 +
>  arch/x86/kernel/kexec-uki.c            | 126 +++++++++++++++++++++++++
>  arch/x86/kernel/machine_kexec_64.c     |   2 +
>  crypto/asymmetric_keys/mscode_parser.c |   2 +-
>  crypto/asymmetric_keys/verify_pefile.c | 110 +++------------------
>  crypto/asymmetric_keys/verify_pefile.h |  16 ----
>  include/linux/parse_pefile.h           |  32 +++++++
>  lib/Makefile                           |   3 +
>  lib/parse_pefile.c                     | 109 +++++++++++++++++++++
>  10 files changed, 292 insertions(+), 116 deletions(-)
>  create mode 100644 arch/x86/include/asm/kexec-uki.h
>  create mode 100644 arch/x86/kernel/kexec-uki.c
>  create mode 100644 include/linux/parse_pefile.h
>  create mode 100644 lib/parse_pefile.c
>

Jan Hendrik Farr Sept. 13, 2023, 2:42 p.m. UTC | #4

On Wed, Sep 13, 2023, at 4:00 PM, Philipp Rudo wrote:
> Hi Jan,
>
> All in all the code looks fine to me. Nevertheless I don't think UKI
> support should be added at the moment. This is because IMHO you
> basically reinterpret the kexec_file systemcall and thus add a new
> uapi to the kernel. Once introduced it is extremely hard to remove or
> change an uapi again. The problem I see is that the spec you based
> your work on is in such a poor shape that I don't feel comfortable to
> add any new uapi based on it.
>
> For example there are two definitions for the UKI which contradict
> each other. The dedicated one [1] you have cited earlier and the one
> in the BLS for type #2 entries [2]. In [1] the .linux and .initrd
> sections are mandatory and the .osrel and .cmdline sections are
> optional while in [2] it is the other way round. Which definition
> should the kernel follow?
>
> Furthermore, I absolutely don't understand how the spec should be
> read. All the spec does is defining some file formats. There is no
> word about which component in the boot chain is supposed to handle
> them and what exactly this component is supposed to do with it. But
> that is crucial if we want to add UKI support for kexec as the kexec
> systemcall will replace the stub. So we need to know what tasks the
> stub is supposed to perform. Currently this is only some
> implementation detail of the systemd-stub [3] that can change any
> moment and I strongly oppose to base any uapi on it.
>

I think I have to agree with you on this one.

I kinda experienced this first hand. My initial patch required a kernel
command line, because the spec in [1] didn't say it was optional. Then
I got a response saying that it's actually optional. And the spec got
changed in PR [2] (I have a WIP v3 that allows a manual cmdline if the
.cmdline section is empty as explained by Lennart in [2] and now
also in the spec since [3]). So there's already 2 changes in the spec
in the last few days related to my patch. That shows that the spec is
not stable.

What's even worse IMO is that the reason given in [2] to change the spec
is simply that systemd-stub has a different behavior than the spec and
therefore the spec should be updated. At this point it's not really a
specification but simply a flawed documentation of the behavior of systemd-
stub / ukify.

I also discovered that ukify treats the initrd as optional [4]. So is
this the next change in the spec?

In [5] Luca writes:
> [...] we fully intend for the UKI format to be an open and stable
> specification, that anybody can support and rely on.
But that is unfortunately not where the format is at this point.

What is annoying though is where this leaves a user that actually
wants this feature. They can carry a patch or they might have to wait
a long time.

Can you indicate what it would take for the kernel community to consider
this spec as stable enough?

> In the end the only benefit this series brings is to extend the
> signature checking on the whole UKI except of just the kernel image.
> Everything else can also be done in user space. Compared to the
> problems described above this is a very small gain for me.

Correct. That is the benefit of pulling the UKI apart in the
kernel. However having to sign the kernel inside the UKI defeats
the whole point.

[1] https://uapi-group.org/specifications/specs/unified_kernel_image/
[2] https://github.com/uapi-group/specifications/pull/72
[3] https://github.com/uapi-group/specifications/pull/73
[4] https://github.com/uapi-group/specifications/issues/74
[5] https://github.com/systemd/systemd/issues/28538

Lennart Poettering Sept. 14, 2023, 9:32 a.m. UTC | #5

On Mi, 13.09.23 16:00, Philipp Rudo (prudo@redhat.com) wrote:

> For example there are two definitions for the UKI which contradict each other.
> The dedicated one [1] you have cited earlier and the one in the BLS for type #2
> entries [2]. In [1] the .linux and .initrd sections are mandatory and the
> .osrel and .cmdline sections are optional while in [2] it is the other way
> round. Which definition should the kernel follow?
>
> Furthermore, I absolutely don't understand how the spec should be read. All
> the spec does is defining some file formats. There is no word about which
> component in the boot chain is supposed to handle them and what exactly this
> component is supposed to do with it. But that is crucial if we want to add UKI
> support for kexec as the kexec systemcall will replace the stub. So we need to
> know what tasks the stub is supposed to perform. Currently this is only some
> implementation detail of the systemd-stub [3] that can change any moment and I
> strongly oppose to base any uapi on it.
>
> In the end the only benefit this series brings is to extend the signature
> checking on the whole UKI except of just the kernel image. Everything else can
> also be done in user space. Compared to the problems described above this is a
> very small gain for me.
>
> Until the spec got fixed I don't see a chance to add UKI support for kexec.

So that spec is initially just a generalization of what
systemd-stub/systemd-boot/ukify does. The descrepancies between the
cited specs mostly come from the that generalization. If you want to
enumerate kernels and order them the ".osrel" stuff for example is
necessary, hence the boot loader spec really wants it. If you don't
care about the boot loader spec though and just want to register the
kernel UKI PE directly in BootXXX efi vars for example, then there's
no need to include .osrel. That all said we should certainly make the
two specs align better, and clarify the situation. Suggestions/patches
more than welcome.

Ultimately, I think a spec written as description with a single
implementation in mind (i.e. systemd) is a generally a bad spec. Hence
if kexec in the Linux kernel wants to add support for it, that'd be
great but I'd see that as an opportunity to adjust the spec to the
needs of the Linux kernel in this area, so that it reflects well more
than just one backend implementation.

Hence, seeing the spec as set in stone and as inherently low quality
is the wrong way to see it I am sure. Instead, the goal here is to
adjust the spec to make it work really nicely for *both* systemd and
the kernel.

Lennart

--
Lennart Poettering, Berlin

Jarkko Sakkinen Sept. 14, 2023, 12:26 p.m. UTC | #6

On Thu Sep 14, 2023 at 12:32 PM EEST, Lennart Poettering wrote:
> On Mi, 13.09.23 16:00, Philipp Rudo (prudo@redhat.com) wrote:
>
> > For example there are two definitions for the UKI which contradict each other.
> > The dedicated one [1] you have cited earlier and the one in the BLS for type #2
> > entries [2]. In [1] the .linux and .initrd sections are mandatory and the
> > .osrel and .cmdline sections are optional while in [2] it is the other way
> > round. Which definition should the kernel follow?
> >
> > Furthermore, I absolutely don't understand how the spec should be read. All
> > the spec does is defining some file formats. There is no word about which
> > component in the boot chain is supposed to handle them and what exactly this
> > component is supposed to do with it. But that is crucial if we want to add UKI
> > support for kexec as the kexec systemcall will replace the stub. So we need to
> > know what tasks the stub is supposed to perform. Currently this is only some
> > implementation detail of the systemd-stub [3] that can change any moment and I
> > strongly oppose to base any uapi on it.
> >
> > In the end the only benefit this series brings is to extend the signature
> > checking on the whole UKI except of just the kernel image. Everything else can
> > also be done in user space. Compared to the problems described above this is a
> > very small gain for me.
> >
> > Until the spec got fixed I don't see a chance to add UKI support for kexec.
>
> So that spec is initially just a generalization of what
> systemd-stub/systemd-boot/ukify does. The descrepancies between the
> cited specs mostly come from the that generalization. If you want to
> enumerate kernels and order them the ".osrel" stuff for example is
> necessary, hence the boot loader spec really wants it. If you don't
> care about the boot loader spec though and just want to register the
> kernel UKI PE directly in BootXXX efi vars for example, then there's
> no need to include .osrel. That all said we should certainly make the
> two specs align better, and clarify the situation. Suggestions/patches
> more than welcome.
>
> Ultimately, I think a spec written as description with a single
> implementation in mind (i.e. systemd) is a generally a bad spec. Hence
> if kexec in the Linux kernel wants to add support for it, that'd be
> great but I'd see that as an opportunity to adjust the spec to the
> needs of the Linux kernel in this area, so that it reflects well more
> than just one backend implementation.
>
> Hence, seeing the spec as set in stone and as inherently low quality
> is the wrong way to see it I am sure. Instead, the goal here is to
> adjust the spec to make it work really nicely for *both* systemd and
> the kernel.

Bringing better backing story [1] would also help the spec. Immeditaly
when there's some reflection surface, also the possible faults it the
spec become more apparent. Also this makes spec refinement less boring,
which can be boring and tedious if you write it isolated by yourself or
in a small group :-)

I need to check if I could with some effort extend my current testing
environment for UKI [2]. Need to study this better at some point.

> Lennart
>
> --
> Lennart Poettering, Berlin

[1] https://social.kernel.org/notice/AZklKOsIYBZXDL9Bya
[2] https://github.com/jarkkojs/buildroot-tpmdd/compare/master...linux-6.5.y

BR, JKarkko

Jarkko Sakkinen Sept. 14, 2023, 3:33 p.m. UTC | #7

On Thu Sep 14, 2023 at 3:26 PM EEST, Jarkko Sakkinen wrote:
> On Thu Sep 14, 2023 at 12:32 PM EEST, Lennart Poettering wrote:
> > On Mi, 13.09.23 16:00, Philipp Rudo (prudo@redhat.com) wrote:
> >
> > > For example there are two definitions for the UKI which contradict each other.
> > > The dedicated one [1] you have cited earlier and the one in the BLS for type #2
> > > entries [2]. In [1] the .linux and .initrd sections are mandatory and the
> > > .osrel and .cmdline sections are optional while in [2] it is the other way
> > > round. Which definition should the kernel follow?
> > >
> > > Furthermore, I absolutely don't understand how the spec should be read. All
> > > the spec does is defining some file formats. There is no word about which
> > > component in the boot chain is supposed to handle them and what exactly this
> > > component is supposed to do with it. But that is crucial if we want to add UKI
> > > support for kexec as the kexec systemcall will replace the stub. So we need to
> > > know what tasks the stub is supposed to perform. Currently this is only some
> > > implementation detail of the systemd-stub [3] that can change any moment and I
> > > strongly oppose to base any uapi on it.
> > >
> > > In the end the only benefit this series brings is to extend the signature
> > > checking on the whole UKI except of just the kernel image. Everything else can
> > > also be done in user space. Compared to the problems described above this is a
> > > very small gain for me.
> > >
> > > Until the spec got fixed I don't see a chance to add UKI support for kexec.
> >
> > So that spec is initially just a generalization of what
> > systemd-stub/systemd-boot/ukify does. The descrepancies between the
> > cited specs mostly come from the that generalization. If you want to
> > enumerate kernels and order them the ".osrel" stuff for example is
> > necessary, hence the boot loader spec really wants it. If you don't
> > care about the boot loader spec though and just want to register the
> > kernel UKI PE directly in BootXXX efi vars for example, then there's
> > no need to include .osrel. That all said we should certainly make the
> > two specs align better, and clarify the situation. Suggestions/patches
> > more than welcome.
> >
> > Ultimately, I think a spec written as description with a single
> > implementation in mind (i.e. systemd) is a generally a bad spec. Hence
> > if kexec in the Linux kernel wants to add support for it, that'd be
> > great but I'd see that as an opportunity to adjust the spec to the
> > needs of the Linux kernel in this area, so that it reflects well more
> > than just one backend implementation.
> >
> > Hence, seeing the spec as set in stone and as inherently low quality
> > is the wrong way to see it I am sure. Instead, the goal here is to
> > adjust the spec to make it work really nicely for *both* systemd and
> > the kernel.
>
> Bringing better backing story [1] would also help the spec. Immeditaly
> when there's some reflection surface, also the possible faults it the
> spec become more apparent. Also this makes spec refinement less boring,
> which can be boring and tedious if you write it isolated by yourself or
> in a small group :-)
>
> I need to check if I could with some effort extend my current testing
> environment for UKI [2]. Need to study this better at some point.
>
> > Lennart
> >
> > --
> > Lennart Poettering, Berlin
>
> [1] https://social.kernel.org/notice/AZklKOsIYBZXDL9Bya
> [2] https://github.com/jarkkojs/buildroot-tpmdd/compare/master...linux-6.5.y
>
> BR, JKarkko

I need to revisit one some months olds patch set from Matthew Garrett.
It was about encrypted hibernate.

I don't recall exactly what was the problem with PCR sealing but want
to check. This is not hunch that this would affect the current patch
set in review. Just want to revisit to remember why it did not go
through in the end.

BTW, would not be a bad idea to extend CC list to at least Matthew and
James Bottomley on this patch.

BR, Jarkko

Jan Hendrik Farr Sept. 14, 2023, 4:11 p.m. UTC | #8

> BTW, would not be a bad idea to extend CC list to at least Matthew and
> James Bottomley on this patch.

Sure. Added Matthew and James in CC

Also, I already made some minor changes. cmdline is now used from the syscall if there is no .cmdline section included in the UKI. find_section now returns the section_header as an ERR_PTR. You can find them in the uki-v3-wip branch at https://github.com/Cydox/linux/commits/uki-v3-wip

Philipp Rudo Sept. 14, 2023, 6:51 p.m. UTC | #9

Hi Lennart,

On Thu, 14 Sep 2023 11:32:20 +0200
Lennart Poettering <mzxreary@0pointer.de> wrote:

> On Mi, 13.09.23 16:00, Philipp Rudo (prudo@redhat.com) wrote:
> 
> > For example there are two definitions for the UKI which contradict each other.
> > The dedicated one [1] you have cited earlier and the one in the BLS for type #2
> > entries [2]. In [1] the .linux and .initrd sections are mandatory and the
> > .osrel and .cmdline sections are optional while in [2] it is the other way
> > round. Which definition should the kernel follow?
> >
> > Furthermore, I absolutely don't understand how the spec should be read. All
> > the spec does is defining some file formats. There is no word about which
> > component in the boot chain is supposed to handle them and what exactly this
> > component is supposed to do with it. But that is crucial if we want to add UKI
> > support for kexec as the kexec systemcall will replace the stub. So we need to
> > know what tasks the stub is supposed to perform. Currently this is only some
> > implementation detail of the systemd-stub [3] that can change any moment and I
> > strongly oppose to base any uapi on it.
> >
> > In the end the only benefit this series brings is to extend the signature
> > checking on the whole UKI except of just the kernel image. Everything else can
> > also be done in user space. Compared to the problems described above this is a
> > very small gain for me.
> >
> > Until the spec got fixed I don't see a chance to add UKI support for kexec.  
> 
> So that spec is initially just a generalization of what
> systemd-stub/systemd-boot/ukify does. The descrepancies between the
> cited specs mostly come from the that generalization. If you want to
> enumerate kernels and order them the ".osrel" stuff for example is
> necessary, hence the boot loader spec really wants it. If you don't
> care about the boot loader spec though and just want to register the
> kernel UKI PE directly in BootXXX efi vars for example, then there's
> no need to include .osrel. That all said we should certainly make the
> two specs align better, and clarify the situation. Suggestions/patches
> more than welcome.

Thanks for the explanation. I know that writing a spec isn't easy and
no matter how hard you try it will never be "perfect". That's why I
think it's important to have a proper Introduction at the beginning of
the spec that gives a context to the reader on the problem the spec is
trying to solve, the different use cases/environments considered
while writing the spec, the different components involved and how they
interact with each other, which limitations there are, etc.. The BLS
sort of has it (if you also consider the "Additional discussion") but
for the UKI the introduction basically boils down to "it's a file that
contains stuff", which isn't very helpful. Having such a context not
only makes it easier to understand where such contradictions come from
but also help to prevent problems where people from different
backgrounds interpret the spec differently because they look at it from
their context.

An other thing I don't understand is, why the Extension Images (I
assume describe the "Companion Files" in the systmd-stub man pages) are
a separate spec. With the initrd and cmdline being part of the UKI and
thus fixed you take away a lot of flexibility users have today. These
extensions bring back part of the flexibility which IMHO is needed by
general purpose distros as for them a simple one-size-fits-all solution
doesn't work. That's why for me both belong together and thus should be
described in the same spec.

The extensions are also a good example why you need to properly define
the different components and their responsibility. In a secureboot
environment these extensions need to be signed and verified during
boot. But wich component is responsible to check the signature? Is it
the stub? The kernel? or even the initrd? If you don't define that in
the spec you will eventually end up in situations where nobody checks
the signature because everybody is sure it's "someone else's job".

> Ultimately, I think a spec written as description with a single
> implementation in mind (i.e. systemd) is a generally a bad spec. Hence
> if kexec in the Linux kernel wants to add support for it, that'd be
> great but I'd see that as an opportunity to adjust the spec to the
> needs of the Linux kernel in this area, so that it reflects well more
> than just one backend implementation.

Fully agree. I must admit my first mail sounds pretty negative. But I
don't oppose the UKI. All I wanted to say that at the moment the spec
doesn't work for the kernel. But it can (and should) be fixed.

In this context I hope it is also clear to you that when more and more
people rely on the spec you need a more formal process when including
changes. Especially when the change might break the implementation of
others. So no more making the .cmdline optional and allowing it to be
overwritten all on the same day.

Having that said, what does "local override" exactly mean? Does that
mean a distro can allow a user to freely choose the cmdline without
checking any signatures? I.e. does that mean we can get rid of this
https://github.com/systemd/systemd/issues/24539

> Hence, seeing the spec as set in stone and as inherently low quality
> is the wrong way to see it I am sure. Instead, the goal here is to
> adjust the spec to make it work really nicely for *both* systemd and
> the kernel.

Sorry, I never wanted to intend that the spec inherently low quality.
Just that it doesn't meat my expectations, yet. But that is fine. The
spec isn't even a year old and there's only a single implementation,
yet. So it's more documentation rather than a spec.

Furthermore I don't expect the spec to be ever "set in stone". A spec
always needs to adjust to an ever changing world. If it doesn't it's
dead. But once other people rely on it you mustn't break backward
compatibility. Meaning the more people rely on it the more careful you
have to be which changes you make.

Anyway, I hope I could make clear what my pain points are and by
that help to make it work for both sides.

Thanks
Philipp

> Lennart
> 
> --
> Lennart Poettering, Berlin
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>

Philipp Rudo Sept. 14, 2023, 7:09 p.m. UTC | #10

Hi Jan,

On Wed, 13 Sep 2023 16:42:33 +0200
"Jan Hendrik Farr" <kernel@jfarr.cc> wrote:

> On Wed, Sep 13, 2023, at 4:00 PM, Philipp Rudo wrote:

[...]

> In [5] Luca writes:
> > [...] we fully intend for the UKI format to be an open and stable
> > specification, that anybody can support and rely on.  
> But that is unfortunately not where the format is at this point.
> 
> What is annoying though is where this leaves a user that actually
> wants this feature. They can carry a patch or they might have to wait
> a long time.
> 
> Can you indicate what it would take for the kernel community to consider
> this spec as stable enough?

I don't think there is a good answer to that question. In fact I
believe if you ask 10 people from the community you will get 20+
different answers.

My guess is that either (1) the spec is moved to some official standard
committee where people spend decades to polish it before it makes it
into the kernel or (2) there's a big flamewar on LKML until Linus had
enough and passes his judgment on it. So definitely (2) ;-)

Thanks
Philipp

> 
> 
> > In the end the only benefit this series brings is to extend the
> > signature checking on the whole UKI except of just the kernel image.
> > Everything else can also be done in user space. Compared to the
> > problems described above this is a very small gain for me.  
> 
> Correct. That is the benefit of pulling the UKI apart in the
> kernel. However having to sign the kernel inside the UKI defeats
> the whole point.
> 
> 
> [1] https://uapi-group.org/specifications/specs/unified_kernel_image/
> [2] https://github.com/uapi-group/specifications/pull/72
> [3] https://github.com/uapi-group/specifications/pull/73
> [4] https://github.com/uapi-group/specifications/issues/74
> [5] https://github.com/systemd/systemd/issues/28538
>

Jan Hendrik Farr Sept. 14, 2023, 9:04 p.m. UTC | #11

On Thu, Sep 14, 2023, at 8:51 PM, Philipp Rudo wrote:
> [...]
>
> In this context I hope it is also clear to you that when more and more
> people rely on the spec you need a more formal process when including
> changes. Especially when the change might break the implementation of
> others. So no more making the .cmdline optional and allowing it to be
> overwritten all on the same day.
>
> Having that said, what does "local override" exactly mean? Does that
> mean a distro can allow a user to freely choose the cmdline without
> checking any signatures?

The behavior of systemd-stub is to allow the bootloader (or whatever
called sd-stub) supplied cmdline when there is no .cmdline section in
the UKI. That's how I understand "local override" here. For WIP v3 of
this patch the behavior is to use the cmdline supplied by userspace to
the kexec_file_load syscall if no .cmdline section is in the UKI.

empty .cmdline section -> empty cmdline always passed to kernel
.cmdline section -> use bootloader/user supplied cmdline (which would
be empty by default)

This setup does not make sense for a locked down / secure system though.

Maybe the word "override" is not ideal. There is nothing actually being
overridden as there is no cmdline in the UKI in the first place.

sd-stub also allows the bootloader supplied cmdline if not using secure
boot. So maybe the kernel could allow user supplied cmdline if not in
lockdown mode for kexec maybe? If not in lockdown mode somebody can just
kexec an unsigned kernel + unsigned cmdline using the kexec_load syscall
anyways. For this case the word "override" makes sense.

The logic for all of this in sd-stub is in [1].

> I.e. does that mean we can get rid of this
>      https://github.com/systemd/systemd/issues/24539

This is a different usecase IMO.

>> Hence, seeing the spec as set in stone and as inherently low quality
>> is the wrong way to see it I am sure. Instead, the goal here is to
>> adjust the spec to make it work really nicely for *both* systemd and
>> the kernel.
>
> Sorry, I never wanted to intend that the spec inherently low quality.
> Just that it doesn't meat my expectations, yet. But that is fine. The
> spec isn't even a year old and there's only a single implementation,
> yet. So it's more documentation rather than a spec.

Let's make it happen.

[1] https://github.com/systemd/systemd/blob/5898cef22a35ceefa068d5f46929eced2baab0ed/src/boot/efi/stub.c#L140

Jarkko Sakkinen Sept. 14, 2023, 9:14 p.m. UTC | #12

On Thu Sep 14, 2023 at 7:11 PM EEST, Jan Hendrik Farr wrote:
> > BTW, would not be a bad idea to extend CC list to at least Matthew and
> > James Bottomley on this patch.
>
> Sure. Added Matthew and James in CC
>
> Also, I already made some minor changes. cmdline is now used from the
> syscall if there is no .cmdline section included in the UKI.
> find_section now returns the section_header as an ERR_PTR. You can
> find them in the uki-v3-wip branch at
> https://github.com/Cydox/linux/commits/uki-v3-wip

Hey, I discussed about IKU at the Linux Linux security module
maintainers monthly meeting and we concluded that it would be nice if
this had a spread to linux-integrity and linux-security-module mailing
lists.

It is x86 feature at this point but obviously that will work as
reference model to other architectures too.  So it would be nice
if those mailing lists would be also included to the loop.

I do not have time to check if this is relevant but this I think
the last version seen of encrypted hibernate:

https://lore.kernel.org/linux-integrity/20221111231636.3748636-1-evgreen@chromium.org/

Just adding it as a reference since I mentioned it earlier.

I'm on holiday for the next week but will look forward to the next
version after I'm back.

BR, Jarkko

Philipp Rudo Sept. 18, 2023, 3:36 p.m. UTC | #13

Hi Jan,

On Thu, 14 Sep 2023 23:04:32 +0200
"Jan Hendrik Farr" <kernel@jfarr.cc> wrote:

> On Thu, Sep 14, 2023, at 8:51 PM, Philipp Rudo wrote:
> > [...]
> >
> > In this context I hope it is also clear to you that when more and more
> > people rely on the spec you need a more formal process when including
> > changes. Especially when the change might break the implementation of
> > others. So no more making the .cmdline optional and allowing it to be
> > overwritten all on the same day.
> >
> > Having that said, what does "local override" exactly mean? Does that
> > mean a distro can allow a user to freely choose the cmdline without
> > checking any signatures?  
> 
> The behavior of systemd-stub is to allow the bootloader (or whatever
> called sd-stub) supplied cmdline when there is no .cmdline section in
> the UKI. That's how I understand "local override" here. For WIP v3 of
> this patch the behavior is to use the cmdline supplied by userspace to
> the kexec_file_load syscall if no .cmdline section is in the UKI.
> 
> empty .cmdline section -> empty cmdline always passed to kernel
> .cmdline section -> use bootloader/user supplied cmdline (which would
> be empty by default)
> 
> This setup does not make sense for a locked down / secure system though.
> 
> Maybe the word "override" is not ideal. There is nothing actually being
> overridden as there is no cmdline in the UKI in the first place.
> 
> sd-stub also allows the bootloader supplied cmdline if not using secure
> boot. So maybe the kernel could allow user supplied cmdline if not in
> lockdown mode for kexec maybe? If not in lockdown mode somebody can just
> kexec an unsigned kernel + unsigned cmdline using the kexec_load syscall
> anyways. For this case the word "override" makes sense.
> 
> The logic for all of this in sd-stub is in [1].
> 
> > I.e. does that mean we can get rid of this
> >      https://github.com/systemd/systemd/issues/24539  
> 
> This is a different usecase IMO.

Yeah, I expected that. The whole question was meant to be rhetorical.
The point I wanted to make was that when a spec uses terms like "local
override" it needs to explain what it means.

Thanks
Philipp

> >> Hence, seeing the spec as set in stone and as inherently low quality
> >> is the wrong way to see it I am sure. Instead, the goal here is to
> >> adjust the spec to make it work really nicely for *both* systemd and
> >> the kernel.  
> >
> > Sorry, I never wanted to intend that the spec inherently low quality.
> > Just that it doesn't meat my expectations, yet. But that is fine. The
> > spec isn't even a year old and there's only a single implementation,
> > yet. So it's more documentation rather than a spec.  
> 
> Let's make it happen.
> 
> 
> [1] https://github.com/systemd/systemd/blob/5898cef22a35ceefa068d5f46929eced2baab0ed/src/boot/efi/stub.c#L140
>

Dave Young Sept. 20, 2023, 7:43 a.m. UTC | #14

> > In the end the only benefit this series brings is to extend the
> > signature checking on the whole UKI except of just the kernel image.
> > Everything else can also be done in user space. Compared to the
> > problems described above this is a very small gain for me.
>
> Correct. That is the benefit of pulling the UKI apart in the
> kernel. However having to sign the kernel inside the UKI defeats
> the whole point.


Pingfan added the zboot load support in kexec-tools, I know that he is
trying to sign the zboot image and the inside kernel twice. So
probably there are some common areas which can be discussed.
Added Ard and Pingfan in cc.
http://lists.infradead.org/pipermail/kexec/2023-August/027674.html


Thanks
Dave

Dave Young Sept. 20, 2023, 8:40 a.m. UTC | #15

On Wed, 20 Sept 2023 at 15:43, Dave Young <dyoung@redhat.com> wrote:
>
> > > In the end the only benefit this series brings is to extend the
> > > signature checking on the whole UKI except of just the kernel image.
> > > Everything else can also be done in user space. Compared to the
> > > problems described above this is a very small gain for me.
> >
> > Correct. That is the benefit of pulling the UKI apart in the
> > kernel. However having to sign the kernel inside the UKI defeats
> > the whole point.
>
>
> Pingfan added the zboot load support in kexec-tools, I know that he is
> trying to sign the zboot image and the inside kernel twice. So
> probably there are some common areas which can be discussed.
> Added Ard and Pingfan in cc.
> http://lists.infradead.org/pipermail/kexec/2023-August/027674.html
>

Here is another thread of the initial try in kernel with a few more
options eg. some fake efi service helpers.
https://lore.kernel.org/linux-arm-kernel/ZBvKSis+dfnqa+Vz@piliu.users.ipa.redhat.com/T/#m42abb0ad3c10126b8b3bfae8a596deb707d6f76e

>
> Thanks
> Dave

Ard Biesheuvel Sept. 20, 2023, 10:50 a.m. UTC | #16

On Wed, 20 Sept 2023 at 08:40, Dave Young <dyoung@redhat.com> wrote:
>
> On Wed, 20 Sept 2023 at 15:43, Dave Young <dyoung@redhat.com> wrote:
> >
> > > > In the end the only benefit this series brings is to extend the
> > > > signature checking on the whole UKI except of just the kernel image.
> > > > Everything else can also be done in user space. Compared to the
> > > > problems described above this is a very small gain for me.
> > >
> > > Correct. That is the benefit of pulling the UKI apart in the
> > > kernel. However having to sign the kernel inside the UKI defeats
> > > the whole point.
> >
> >
> > Pingfan added the zboot load support in kexec-tools, I know that he is
> > trying to sign the zboot image and the inside kernel twice. So
> > probably there are some common areas which can be discussed.
> > Added Ard and Pingfan in cc.
> > http://lists.infradead.org/pipermail/kexec/2023-August/027674.html
> >
>
> Here is another thread of the initial try in kernel with a few more
> options eg. some fake efi service helpers.
> https://lore.kernel.org/linux-arm-kernel/ZBvKSis+dfnqa+Vz@piliu.users.ipa.redhat.com/T/#m42abb0ad3c10126b8b3bfae8a596deb707d6f76e
>

Currently, UKI's external interface is defined in terms of EFI
services, i.e., it is an executable PE/COFF binary that encapsulates
all the logic that performs the unpacking of the individual sections,
and loads the kernel as a PE/COFF binary as well (i.e., via
LoadImage/StartImage)

As soon as we add support to Linux to unpack a UKI and boot the
encapsulated kernel using a boot protocol other than EFI, we are
painting ourselves into a corner, severely limiting the freedom of the
UKI effort to make changes to the interfaces that were implementation
details up to this point.

It also means that UKI handling in kexec will need to be taught about
every individual architecture again, which is something we are trying
to avoid with EFI support in general. Breaking the abstraction like
this lets the cat out of the bag, and will add yet another variation
of kexec that we will need to support and maintain forever.

So the only way to do this properly and portably is to implement the
minimal set of EFI boot services [0] that Linux actually needs to run
its EFI stub (which is mostly identical to the set that UKI relies on
afaict), and expose them to the kexec image as it is being loaded.
This is not as bad as it sounds - I have some Rust code that could be
used as an inspiration [1] and which could be reused and shared
between architectures.

This would also reduce/remove the need for a purgatory: loading a EFI
binary in this way would run it up to the point were it calls
ExitBootServices(), and the actual kexec would invoke the image as if
it was returning from ExitBootServices().

The only fundamental problem here is the need to allocate large chunks
of physical memory, which would need some kind of CMA support, I
imagine?

Maybe we should do a BoF at LPC to discuss this further?

[0] this is not as bad as it sounds: beyond a protocol database, a
heap allocator and a memory map, there is actually very little needed
to boot Linux via the EFI stub (although UKI needs
LoadImage/StartImage as well)

[1] https://github.com/ardbiesheuvel/efilite

Dave Young Sept. 20, 2023, 12:07 p.m. UTC | #17

On Wed, 20 Sept 2023 at 18:50, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 20 Sept 2023 at 08:40, Dave Young <dyoung@redhat.com> wrote:
> >
> > On Wed, 20 Sept 2023 at 15:43, Dave Young <dyoung@redhat.com> wrote:
> > >
> > > > > In the end the only benefit this series brings is to extend the
> > > > > signature checking on the whole UKI except of just the kernel image.
> > > > > Everything else can also be done in user space. Compared to the
> > > > > problems described above this is a very small gain for me.
> > > >
> > > > Correct. That is the benefit of pulling the UKI apart in the
> > > > kernel. However having to sign the kernel inside the UKI defeats
> > > > the whole point.
> > >
> > >
> > > Pingfan added the zboot load support in kexec-tools, I know that he is
> > > trying to sign the zboot image and the inside kernel twice. So
> > > probably there are some common areas which can be discussed.
> > > Added Ard and Pingfan in cc.
> > > http://lists.infradead.org/pipermail/kexec/2023-August/027674.html
> > >
> >
> > Here is another thread of the initial try in kernel with a few more
> > options eg. some fake efi service helpers.
> > https://lore.kernel.org/linux-arm-kernel/ZBvKSis+dfnqa+Vz@piliu.users.ipa.redhat.com/T/#m42abb0ad3c10126b8b3bfae8a596deb707d6f76e
> >
>

Ard, thanks for the comments.

> Currently, UKI's external interface is defined in terms of EFI
> services, i.e., it is an executable PE/COFF binary that encapsulates
> all the logic that performs the unpacking of the individual sections,
> and loads the kernel as a PE/COFF binary as well (i.e., via
> LoadImage/StartImage)
>
> As soon as we add support to Linux to unpack a UKI and boot the
> encapsulated kernel using a boot protocol other than EFI, we are
> painting ourselves into a corner, severely limiting the freedom of the
> UKI effort to make changes to the interfaces that were implementation
> details up to this point.

Agreed, it seems UKI is more flexible and complex than the zboot,
we do need to carefully think about a better solution.

>
> It also means that UKI handling in kexec will need to be taught about
> every individual architecture again, which is something we are trying
> to avoid with EFI support in general. Breaking the abstraction like
> this lets the cat out of the bag, and will add yet another variation
> of kexec that we will need to support and maintain forever.
>
> So the only way to do this properly and portably is to implement the
> minimal set of EFI boot services [0] that Linux actually needs to run
> its EFI stub (which is mostly identical to the set that UKI relies on
> afaict), and expose them to the kexec image as it is being loaded.
> This is not as bad as it sounds - I have some Rust code that could be
> used as an inspiration [1] and which could be reused and shared
> between architectures.

Great!

>
> This would also reduce/remove the need for a purgatory: loading a EFI
> binary in this way would run it up to the point were it calls
> ExitBootServices(), and the actual kexec would invoke the image as if
> it was returning from ExitBootServices().
>
> The only fundamental problem here is the need to allocate large chunks
> of physical memory, which would need some kind of CMA support, I
> imagine?

Hmm, I thought that your idea is to write the efi stub code in "purgatory"
so kexec can jump to it while rebooting then it will be able to access the
whole usable memory, but it seems you want an efi app run under linux
and somehow provide services to kexec?  My EFI knowledge is incomplete
and outdated,  If my understanding of your proposal is true how can it keep
running after switching to the new kernel stub?

>
> Maybe we should do a BoF at LPC to discuss this further?

It does deserve more discussion, unfortunately I will not be able to join LPC,
Philipp Rudo (cced) planned attend the conf, so I think you guys can
discuss together with
other people interested. I think I will watch the recordings or
joining virtually if possible.

>
> [0] this is not as bad as it sounds: beyond a protocol database, a
> heap allocator and a memory map, there is actually very little needed
> to boot Linux via the EFI stub (although UKI needs
> LoadImage/StartImage as well)
>
> [1] https://github.com/ardbiesheuvel/efilite
>

Dave Young Sept. 20, 2023, 12:18 p.m. UTC | #18

On Wed, 20 Sept 2023 at 20:07, Dave Young <dyoung@redhat.com> wrote:
>
> On Wed, 20 Sept 2023 at 18:50, Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Wed, 20 Sept 2023 at 08:40, Dave Young <dyoung@redhat.com> wrote:
> > >
> > > On Wed, 20 Sept 2023 at 15:43, Dave Young <dyoung@redhat.com> wrote:
> > > >
> > > > > > In the end the only benefit this series brings is to extend the
> > > > > > signature checking on the whole UKI except of just the kernel image.
> > > > > > Everything else can also be done in user space. Compared to the
> > > > > > problems described above this is a very small gain for me.
> > > > >
> > > > > Correct. That is the benefit of pulling the UKI apart in the
> > > > > kernel. However having to sign the kernel inside the UKI defeats
> > > > > the whole point.
> > > >
> > > >
> > > > Pingfan added the zboot load support in kexec-tools, I know that he is
> > > > trying to sign the zboot image and the inside kernel twice. So
> > > > probably there are some common areas which can be discussed.
> > > > Added Ard and Pingfan in cc.
> > > > http://lists.infradead.org/pipermail/kexec/2023-August/027674.html
> > > >
> > >
> > > Here is another thread of the initial try in kernel with a few more
> > > options eg. some fake efi service helpers.
> > > https://lore.kernel.org/linux-arm-kernel/ZBvKSis+dfnqa+Vz@piliu.users.ipa.redhat.com/T/#m42abb0ad3c10126b8b3bfae8a596deb707d6f76e
> > >
> >
>
> Ard, thanks for the comments.
>
> > Currently, UKI's external interface is defined in terms of EFI
> > services, i.e., it is an executable PE/COFF binary that encapsulates
> > all the logic that performs the unpacking of the individual sections,
> > and loads the kernel as a PE/COFF binary as well (i.e., via
> > LoadImage/StartImage)
> >
> > As soon as we add support to Linux to unpack a UKI and boot the
> > encapsulated kernel using a boot protocol other than EFI, we are
> > painting ourselves into a corner, severely limiting the freedom of the
> > UKI effort to make changes to the interfaces that were implementation
> > details up to this point.
>
> Agreed, it seems UKI is more flexible and complex than the zboot,
> we do need to carefully think about a better solution.
>
> >
> > It also means that UKI handling in kexec will need to be taught about
> > every individual architecture again, which is something we are trying
> > to avoid with EFI support in general. Breaking the abstraction like
> > this lets the cat out of the bag, and will add yet another variation
> > of kexec that we will need to support and maintain forever.
> >
> > So the only way to do this properly and portably is to implement the
> > minimal set of EFI boot services [0] that Linux actually needs to run
> > its EFI stub (which is mostly identical to the set that UKI relies on
> > afaict), and expose them to the kexec image as it is being loaded.
> > This is not as bad as it sounds - I have some Rust code that could be
> > used as an inspiration [1] and which could be reused and shared
> > between architectures.
>
> Great!
>
> >
> > This would also reduce/remove the need for a purgatory: loading a EFI
> > binary in this way would run it up to the point were it calls
> > ExitBootServices(), and the actual kexec would invoke the image as if
> > it was returning from ExitBootServices().
> >
> > The only fundamental problem here is the need to allocate large chunks
> > of physical memory, which would need some kind of CMA support, I
> > imagine?
>
> Hmm, I thought that your idea is to write the efi stub code in "purgatory"
> so kexec can jump to it while rebooting then it will be able to access the
> whole usable memory, but it seems you want an efi app run under linux
> and somehow provide services to kexec?  My EFI knowledge is incomplete
> and outdated,  If my understanding of your proposal is true how can it keep
> running after switching to the new kernel stub?

Oops,  please ignore the quick reply and questioins, I apparently
forgot that this is the kexec loading
phase instead of the rebooting phase.  Yes as you said CMA might be
the only choice
for that proposal.

>
> >
> > Maybe we should do a BoF at LPC to discuss this further?
>
> It does deserve more discussion, unfortunately I will not be able to join LPC,
> Philipp Rudo (cced) planned attend the conf, so I think you guys can
> discuss together with
> other people interested. I think I will watch the recordings or
> joining virtually if possible.
>
> >
> > [0] this is not as bad as it sounds: beyond a protocol database, a
> > heap allocator and a memory map, there is actually very little needed
> > to boot Linux via the EFI stub (although UKI needs
> > LoadImage/StartImage as well)
> >
> > [1] https://github.com/ardbiesheuvel/efilite
> >

Jan Hendrik Farr Sept. 20, 2023, 10:02 p.m. UTC | #19

Hi Ard, Greetings from Delft


On 20 10:50:31, Ard Biesheuvel wrote:
> On Wed, 20 Sept 2023 at 08:40, Dave Young <dyoung@redhat.com> wrote:
> >
> > On Wed, 20 Sept 2023 at 15:43, Dave Young <dyoung@redhat.com> wrote:
> > >
> > > > > In the end the only benefit this series brings is to extend the
> > > > > signature checking on the whole UKI except of just the kernel image.
> > > > > Everything else can also be done in user space. Compared to the
> > > > > problems described above this is a very small gain for me.
> > > >
> > > > Correct. That is the benefit of pulling the UKI apart in the
> > > > kernel. However having to sign the kernel inside the UKI defeats
> > > > the whole point.
> > >
> > >
> > > Pingfan added the zboot load support in kexec-tools, I know that he is
> > > trying to sign the zboot image and the inside kernel twice. So
> > > probably there are some common areas which can be discussed.
> > > Added Ard and Pingfan in cc.
> > > http://lists.infradead.org/pipermail/kexec/2023-August/027674.html
> > >
> >
> > Here is another thread of the initial try in kernel with a few more
> > options eg. some fake efi service helpers.
> > https://lore.kernel.org/linux-arm-kernel/ZBvKSis+dfnqa+Vz@piliu.users.ipa.redhat.com/T/#m42abb0ad3c10126b8b3bfae8a596deb707d6f76e
> >
> 
> Currently, UKI's external interface is defined in terms of EFI
> services, i.e., it is an executable PE/COFF binary that encapsulates
> all the logic that performs the unpacking of the individual sections,
> and loads the kernel as a PE/COFF binary as well (i.e., via
> LoadImage/StartImage)
>
> As soon as we add support to Linux to unpack a UKI and boot the
> encapsulated kernel using a boot protocol other than EFI, we are
> painting ourselves into a corner, severely limiting the freedom of the
> UKI effort to make changes to the interfaces that were implementation
> details up to this point.

While this was true at some point, it's not anymore. The intention is
for UKIs to be a stable file format that can be used outside of
systemd-stub. They are no longer just defined by their interface to
UEFI. While the spec will need work it can be found at [1]. This patch
depends on the UKI containing the linux bzimage, initrd, and cmdline in
the sections with those names. We can depend on this in the future.

There is some discussions around supporting more of the UKI features in
the future (TPM PCR signatures, etc). See [2].

> It also means that UKI handling in kexec will need to be taught about
> every individual architecture again, which is something we are trying
> to avoid with EFI support in general. Breaking the abstraction like
> this lets the cat out of the bag, and will add yet another variation
> of kexec that we will need to support and maintain forever.

Yes this would require more work for each architecture that wants to
kexec UKIs (arm64 would be next).

However I think the support required would be way lower than all the
other kexec loaders. I would suggest that this loader is actually really
simple (check the code in the uki_load function in
arch/x86/kernel/kexec-uki.c). All of the heavy lifting is actually done
by the existing bzimage loader. The UKI loader just pulls appart the UKI
and calls the existing bzimage loader, it's really simple. It's like an add-on
to the current loader.

> So the only way to do this properly and portably is to implement the
> minimal set of EFI boot services [0] that Linux actually needs to run
> its EFI stub (which is mostly identical to the set that UKI relies on
> afaict), and expose them to the kexec image as it is being loaded.
> This is not as bad as it sounds - I have some Rust code that could be
> used as an inspiration [1] and which could be reused and shared
> between architectures.

This would be a very cool thing in general though and would open a lot
of possibilities. But how much support will this require? Soon people
will want more than just the minimal set of EFI services for booting
Linux. I do like this idea though and would probably be one of the
people wanting more of the EFI services supported.

> This would also reduce/remove the need for a purgatory: loading a EFI
> binary in this way would run it up to the point were it calls
> ExitBootServices(), and the actual kexec would invoke the image as if
> it was returning from ExitBootServices().
> 
> The only fundamental problem here is the need to allocate large chunks
> of physical memory, which would need some kind of CMA support, I
> imagine?
> 
> Maybe we should do a BoF at LPC to discuss this further?

I definetly won't be at LPC, is it possible to join virtually?

> 
> [0] this is not as bad as it sounds: beyond a protocol database, a
> heap allocator and a memory map, there is actually very little needed
> to boot Linux via the EFI stub (although UKI needs
> LoadImage/StartImage as well)
> 
> [1] https://github.com/ardbiesheuvel/efilite


[1] https://uapi-group.org/specifications/specs/unified_kernel_image/
[2] https://github.com/systemd/systemd/issues/28538

Dave Young Sept. 21, 2023, 3:14 a.m. UTC | #20

On Wed, 20 Sept 2023 at 20:18, Dave Young <dyoung@redhat.com> wrote:
>
> On Wed, 20 Sept 2023 at 20:07, Dave Young <dyoung@redhat.com> wrote:
> >
> > On Wed, 20 Sept 2023 at 18:50, Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Wed, 20 Sept 2023 at 08:40, Dave Young <dyoung@redhat.com> wrote:
> > > >
> > > > On Wed, 20 Sept 2023 at 15:43, Dave Young <dyoung@redhat.com> wrote:
> > > > >
> > > > > > > In the end the only benefit this series brings is to extend the
> > > > > > > signature checking on the whole UKI except of just the kernel image.
> > > > > > > Everything else can also be done in user space. Compared to the
> > > > > > > problems described above this is a very small gain for me.
> > > > > >
> > > > > > Correct. That is the benefit of pulling the UKI apart in the
> > > > > > kernel. However having to sign the kernel inside the UKI defeats
> > > > > > the whole point.
> > > > >
> > > > >
> > > > > Pingfan added the zboot load support in kexec-tools, I know that he is
> > > > > trying to sign the zboot image and the inside kernel twice. So
> > > > > probably there are some common areas which can be discussed.
> > > > > Added Ard and Pingfan in cc.
> > > > > http://lists.infradead.org/pipermail/kexec/2023-August/027674.html
> > > > >
> > > >
> > > > Here is another thread of the initial try in kernel with a few more
> > > > options eg. some fake efi service helpers.
> > > > https://lore.kernel.org/linux-arm-kernel/ZBvKSis+dfnqa+Vz@piliu.users.ipa.redhat.com/T/#m42abb0ad3c10126b8b3bfae8a596deb707d6f76e
> > > >
> > >
> >
> > Ard, thanks for the comments.
> >
> > > Currently, UKI's external interface is defined in terms of EFI
> > > services, i.e., it is an executable PE/COFF binary that encapsulates
> > > all the logic that performs the unpacking of the individual sections,
> > > and loads the kernel as a PE/COFF binary as well (i.e., via
> > > LoadImage/StartImage)
> > >
> > > As soon as we add support to Linux to unpack a UKI and boot the
> > > encapsulated kernel using a boot protocol other than EFI, we are
> > > painting ourselves into a corner, severely limiting the freedom of the
> > > UKI effort to make changes to the interfaces that were implementation
> > > details up to this point.
> >
> > Agreed, it seems UKI is more flexible and complex than the zboot,
> > we do need to carefully think about a better solution.
> >
> > >
> > > It also means that UKI handling in kexec will need to be taught about
> > > every individual architecture again, which is something we are trying
> > > to avoid with EFI support in general. Breaking the abstraction like
> > > this lets the cat out of the bag, and will add yet another variation
> > > of kexec that we will need to support and maintain forever.
> > >
> > > So the only way to do this properly and portably is to implement the
> > > minimal set of EFI boot services [0] that Linux actually needs to run
> > > its EFI stub (which is mostly identical to the set that UKI relies on
> > > afaict), and expose them to the kexec image as it is being loaded.
> > > This is not as bad as it sounds - I have some Rust code that could be
> > > used as an inspiration [1] and which could be reused and shared
> > > between architectures.
> >
> > Great!
> >
> > >
> > > This would also reduce/remove the need for a purgatory: loading a EFI
> > > binary in this way would run it up to the point were it calls
> > > ExitBootServices(), and the actual kexec would invoke the image as if
> > > it was returning from ExitBootServices().
> > >
> > > The only fundamental problem here is the need to allocate large chunks
> > > of physical memory, which would need some kind of CMA support, I
> > > imagine?
> >
> > Hmm, I thought that your idea is to write the efi stub code in "purgatory"
> > so kexec can jump to it while rebooting then it will be able to access the
> > whole usable memory, but it seems you want an efi app run under linux
> > and somehow provide services to kexec?  My EFI knowledge is incomplete
> > and outdated,  If my understanding of your proposal is true how can it keep
> > running after switching to the new kernel stub?
>
> Oops,  please ignore the quick reply and questioins, I apparently
> forgot that this is the kexec loading
> phase instead of the rebooting phase.  Yes as you said CMA might be
> the only choice
> for that proposal.

Ok, refreshed my memory with a brief discussion with Pingfan.  My
understanding of the flow is like below:

1. kexec loads the UKI image itself without any parsing of the internal files.
2. reboot|crash -> jump to the fake stub
3. the stub parse the UKI internal kernel/initrd/cmdline
4. boot into the new kernel

With above code flow, it is still not clear for me where do we need
the large chunk
of memory for the stub.
in step 1 kexec loading does not need physical continuous big chunks as
while jumping to the stub in step 2 the UKI image will be relocated again.
the stub does need some page table setups but that is already done in current
kexec code.

Other than the boot service implementation, another issue here from my mind is
the memory map information should be passed to the stub use.  Taking x86 as an
example kexec will pass the raw e820 table to 2nd kernel according to the x86
boot protocol.  Now if we move to a chain load case, there will be similar
requirement for the stub.

Another thing is in case the stub can be more complex in the future, how can
we debug it.  The boot service conout is not usable, and since graphic kms switc
hing, I'm not sure the boot framebuffer will be usable as well.  Probably the
only way is to add serial output support.  Anyway this is something could be
hard to handle.

>
> >
> > >
> > > Maybe we should do a BoF at LPC to discuss this further?
> >
> > It does deserve more discussion, unfortunately I will not be able to join LPC,
> > Philipp Rudo (cced) planned attend the conf, so I think you guys can
> > discuss together with
> > other people interested. I think I will watch the recordings or
> > joining virtually if possible.
> >
> > >
> > > [0] this is not as bad as it sounds: beyond a protocol database, a
> > > heap allocator and a memory map, there is actually very little needed
> > > to boot Linux via the EFI stub (although UKI needs
> > > LoadImage/StartImage as well)
> > >
> > > [1] https://github.com/ardbiesheuvel/efilite
> > >

Philipp Rudo Sept. 25, 2023, 2:36 p.m. UTC | #21

Hi Jan,

On Thu, 21 Sep 2023 00:02:25 +0200
Jan Hendrik Farr <kernel@jfarr.cc> wrote:

[...]

> > Maybe we should do a BoF at LPC to discuss this further?  
> 
> I definetly won't be at LPC, is it possible to join virtually?

Yes, LPC will be hybrid again this year. Virtual access costs $50
although you can apply for an 50% discount when you are a
non-professional.

https://lpc.events/event/17/page/212-attend

Thanks
Philipp

[v2,0/2] x86/kexec: UKI Support

Message

Comments