Message ID | 20200816001949.595424-1-marmarek@invisiblethingslab.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | efi: discover ESRT table on Xen PV too | expand |
Hi Marek, On Sun, 16 Aug 2020 at 02:20, Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> wrote: > > In case of Xen PV dom0, Xen passes along info about system tables (see > arch/x86/xen/efi.c), but not the memory map from EFI. This makes sense > as it is Xen responsible for managing physical memory address space. > In this case, it doesn't make sense to condition using ESRT table on > availability of EFI memory map, as it isn't Linux kernel responsible for > it. Skip this part on Xen PV (let Xen do the right thing if it deems > necessary) and use ESRT table normally. > > This is a requirement for using fwupd in PV dom0 to update UEFI using > capsules. > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> > --- > drivers/firmware/efi/esrt.c | 47 ++++++++++++++++++++----------------- > 1 file changed, 25 insertions(+), 22 deletions(-) > > diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c > index d5915272141f..5c49f2aaa4b1 100644 > --- a/drivers/firmware/efi/esrt.c > +++ b/drivers/firmware/efi/esrt.c > @@ -245,36 +245,38 @@ void __init efi_esrt_init(void) > int rc; > phys_addr_t end; > > - if (!efi_enabled(EFI_MEMMAP)) > + if (!efi_enabled(EFI_MEMMAP) && !efi_enabled(EFI_PARAVIRT)) > return; > > pr_debug("esrt-init: loading.\n"); > if (!esrt_table_exists()) > return; > > - rc = efi_mem_desc_lookup(efi.esrt, &md); > - if (rc < 0 || > - (!(md.attribute & EFI_MEMORY_RUNTIME) && > - md.type != EFI_BOOT_SERVICES_DATA && > - md.type != EFI_RUNTIME_SERVICES_DATA)) { > - pr_warn("ESRT header is not in the memory map.\n"); > - return; > - } > + if (efi_enabled(EFI_MEMMAP)) { > + rc = efi_mem_desc_lookup(efi.esrt, &md); > + if (rc < 0 || > + (!(md.attribute & EFI_MEMORY_RUNTIME) && > + md.type != EFI_BOOT_SERVICES_DATA && > + md.type != EFI_RUNTIME_SERVICES_DATA)) { > + pr_warn("ESRT header is not in the memory map.\n"); > + return; > + } > > - max = efi_mem_desc_end(&md); > - if (max < efi.esrt) { > - pr_err("EFI memory descriptor is invalid. (esrt: %p max: %p)\n", > - (void *)efi.esrt, (void *)max); > - return; > - } > + max = efi_mem_desc_end(&md); > + if (max < efi.esrt) { > + pr_err("EFI memory descriptor is invalid. (esrt: %p max: %p)\n", > + (void *)efi.esrt, (void *)max); > + return; > + } > > - size = sizeof(*esrt); > - max -= efi.esrt; > + size = sizeof(*esrt); > + max -= efi.esrt; > > - if (max < size) { > - pr_err("ESRT header doesn't fit on single memory map entry. (size: %zu max: %zu)\n", > - size, max); > - return; > + if (max < size) { > + pr_err("ESRT header doesn't fit on single memory map entry. (size: %zu max: %zu)\n", > + size, max); > + return; > + } > } > > va = early_memremap(efi.esrt, size); > @@ -331,7 +333,8 @@ void __init efi_esrt_init(void) > > end = esrt_data + size; > pr_info("Reserving ESRT space from %pa to %pa.\n", &esrt_data, &end); > - if (md.type == EFI_BOOT_SERVICES_DATA) > + > + if (efi_enabled(EFI_MEMMAP) && md.type == EFI_BOOT_SERVICES_DATA) > efi_mem_reserve(esrt_data, esrt_data_size); > This does not look correct to me. Why doesn't the region need to be reserved on a Xen boot? The OS may overwrite it otherwise. > pr_debug("esrt-init: loaded.\n"); > -- > 2.25.4 >
On Sun, Aug 16, 2020 at 02:19:49AM +0200, Marek Marczykowski-Górecki wrote: > In case of Xen PV dom0, Xen passes along info about system tables (see > arch/x86/xen/efi.c), but not the memory map from EFI. I think that's because the memory map returned by XENMEM_machine_memory_map is in e820 form, and doesn't contain the required information about the EFI regions due to the translation done by efi_arch_process_memory_map in Xen? > This makes sense > as it is Xen responsible for managing physical memory address space. > In this case, it doesn't make sense to condition using ESRT table on > availability of EFI memory map, as it isn't Linux kernel responsible for > it. PV dom0 is kind of special in that regard as it can create mappings to (almost) any MMIO regions, and hence can change it's memory map substantially. > Skip this part on Xen PV (let Xen do the right thing if it deems > necessary) and use ESRT table normally. Maybe it would be better to introduce a new hypercall (or add a parameter to XENMEM_machine_memory_map) in order to be able to fetch the EFI memory map? That should allow a PV dom0 to check the ESRT is correct and thus not diverge from bate metal. > > This is a requirement for using fwupd in PV dom0 to update UEFI using > capsules. > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> > --- > drivers/firmware/efi/esrt.c | 47 ++++++++++++++++++++----------------- > 1 file changed, 25 insertions(+), 22 deletions(-) > > diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c > index d5915272141f..5c49f2aaa4b1 100644 > --- a/drivers/firmware/efi/esrt.c > +++ b/drivers/firmware/efi/esrt.c > @@ -245,36 +245,38 @@ void __init efi_esrt_init(void) > int rc; > phys_addr_t end; > > - if (!efi_enabled(EFI_MEMMAP)) > + if (!efi_enabled(EFI_MEMMAP) && !efi_enabled(EFI_PARAVIRT)) > return; > > pr_debug("esrt-init: loading.\n"); > if (!esrt_table_exists()) > return; > > - rc = efi_mem_desc_lookup(efi.esrt, &md); > - if (rc < 0 || > - (!(md.attribute & EFI_MEMORY_RUNTIME) && > - md.type != EFI_BOOT_SERVICES_DATA && > - md.type != EFI_RUNTIME_SERVICES_DATA)) { > - pr_warn("ESRT header is not in the memory map.\n"); > - return; > - } > + if (efi_enabled(EFI_MEMMAP)) { > + rc = efi_mem_desc_lookup(efi.esrt, &md); > + if (rc < 0 || > + (!(md.attribute & EFI_MEMORY_RUNTIME) && > + md.type != EFI_BOOT_SERVICES_DATA && > + md.type != EFI_RUNTIME_SERVICES_DATA)) { > + pr_warn("ESRT header is not in the memory map.\n"); > + return; > + } Here you blindly trust the data in the ESRT in the PV case, without checking it matches the regions on the memory map, which could lead to errors if ESRT turns to be wrong. Thanks, Roger.
On Mon, Aug 17, 2020 at 10:16:07AM +0200, Ard Biesheuvel wrote: > > @@ -331,7 +333,8 @@ void __init efi_esrt_init(void) > > > > end = esrt_data + size; > > pr_info("Reserving ESRT space from %pa to %pa.\n", &esrt_data, &end); > > - if (md.type == EFI_BOOT_SERVICES_DATA) > > + > > + if (efi_enabled(EFI_MEMMAP) && md.type == EFI_BOOT_SERVICES_DATA) > > efi_mem_reserve(esrt_data, esrt_data_size); > > > > This does not look correct to me. Why doesn't the region need to be > reserved on a Xen boot? The OS may overwrite it otherwise. In case of Xen, it is Xen responsibility to do that. Otherwise even if dom0 would not use it, Xen could allocate that physical memory to another guest.
On Mon, Aug 17, 2020 at 11:00:13AM +0200, Roger Pau Monné wrote: > On Sun, Aug 16, 2020 at 02:19:49AM +0200, Marek Marczykowski-Górecki wrote: > > In case of Xen PV dom0, Xen passes along info about system tables (see > > arch/x86/xen/efi.c), but not the memory map from EFI. > > I think that's because the memory map returned by > XENMEM_machine_memory_map is in e820 form, and doesn't contain the > required information about the EFI regions due to the translation done > by efi_arch_process_memory_map in Xen? Yes, I think so. > > This makes sense > > as it is Xen responsible for managing physical memory address space. > > In this case, it doesn't make sense to condition using ESRT table on > > availability of EFI memory map, as it isn't Linux kernel responsible for > > it. > > PV dom0 is kind of special in that regard as it can create mappings to > (almost) any MMIO regions, and hence can change it's memory map > substantially. Do you mean PV dom0 should receive full EFI memory map? Jan already objected this as it would be a layering violation. > > Skip this part on Xen PV (let Xen do the right thing if it deems > > necessary) and use ESRT table normally. > > Maybe it would be better to introduce a new hypercall (or add a > parameter to XENMEM_machine_memory_map) in order to be able to fetch > the EFI memory map? > > That should allow a PV dom0 to check the ESRT is correct and thus not > diverge from bate metal. Note the EFI memory map there is used not just to check things, but to actually modify it to reserve the region. I think that's rather Xen responsibility, not dom0. See the comment from Ard. > > > > This is a requirement for using fwupd in PV dom0 to update UEFI using > > capsules. > > > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> > > --- > > drivers/firmware/efi/esrt.c | 47 ++++++++++++++++++++----------------- > > 1 file changed, 25 insertions(+), 22 deletions(-) > > > > diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c > > index d5915272141f..5c49f2aaa4b1 100644 > > --- a/drivers/firmware/efi/esrt.c > > +++ b/drivers/firmware/efi/esrt.c > > @@ -245,36 +245,38 @@ void __init efi_esrt_init(void) > > int rc; > > phys_addr_t end; > > > > - if (!efi_enabled(EFI_MEMMAP)) > > + if (!efi_enabled(EFI_MEMMAP) && !efi_enabled(EFI_PARAVIRT)) > > return; > > > > pr_debug("esrt-init: loading.\n"); > > if (!esrt_table_exists()) > > return; > > > > - rc = efi_mem_desc_lookup(efi.esrt, &md); > > - if (rc < 0 || > > - (!(md.attribute & EFI_MEMORY_RUNTIME) && > > - md.type != EFI_BOOT_SERVICES_DATA && > > - md.type != EFI_RUNTIME_SERVICES_DATA)) { > > - pr_warn("ESRT header is not in the memory map.\n"); > > - return; > > - } > > + if (efi_enabled(EFI_MEMMAP)) { > > + rc = efi_mem_desc_lookup(efi.esrt, &md); > > + if (rc < 0 || > > + (!(md.attribute & EFI_MEMORY_RUNTIME) && > > + md.type != EFI_BOOT_SERVICES_DATA && > > + md.type != EFI_RUNTIME_SERVICES_DATA)) { > > + pr_warn("ESRT header is not in the memory map.\n"); > > + return; > > + } > > Here you blindly trust the data in the ESRT in the PV case, without > checking it matches the regions on the memory map, which could lead to > errors if ESRT turns to be wrong. I don't think checking merely if ESRT lives somewhere in EFI_{BOOT,RUNTIME}_SERVICES_DATA area guarantees its correctness. On the other hand, this seems to be done to prevent overwriting that memory with something else (see that in case of EFI_BOOT_SERVICES_DATA it is later marked as reserved. I think it should be rather done by Xen, not dom0. Either by moving this whole check into dom0 (for this table only? or perhaps for other tables too?), or by simply reserving the whole EFI_BOOT_SERVICES_DATA (like the /mapbs boot switch does). Then, I think dom0 could use e820 map to verify that this is reserved, but do not modify the map anymore.
On Tue, Aug 18, 2020 at 02:01:35PM +0200, Marek Marczykowski-Górecki wrote: > On Mon, Aug 17, 2020 at 11:00:13AM +0200, Roger Pau Monné wrote: > > On Sun, Aug 16, 2020 at 02:19:49AM +0200, Marek Marczykowski-Górecki wrote: > > > In case of Xen PV dom0, Xen passes along info about system tables (see > > > arch/x86/xen/efi.c), but not the memory map from EFI. > > > > I think that's because the memory map returned by > > XENMEM_machine_memory_map is in e820 form, and doesn't contain the > > required information about the EFI regions due to the translation done > > by efi_arch_process_memory_map in Xen? > > Yes, I think so. > > > > This makes sense > > > as it is Xen responsible for managing physical memory address space. > > > In this case, it doesn't make sense to condition using ESRT table on > > > availability of EFI memory map, as it isn't Linux kernel responsible for > > > it. > > > > PV dom0 is kind of special in that regard as it can create mappings to > > (almost) any MMIO regions, and hence can change it's memory map > > substantially. > > Do you mean PV dom0 should receive full EFI memory map? Jan already > objected this as it would be a layering violation. dom0 is already capable of getting the native e820 memory map using XENMEM_machine_memory_map, I'm not sure why allowing to return the memory map in EFI form would be any different (or a layering violation in the PV dom0 case). Do you have a reference to that thread? I certainly missed it. For PVH dom0 we could consider something different, since in that case there's a guest memory map which could likely be returned in EFI format and with the EFI regions if required. > > > Skip this part on Xen PV (let Xen do the right thing if it deems > > > necessary) and use ESRT table normally. > > > > Maybe it would be better to introduce a new hypercall (or add a > > parameter to XENMEM_machine_memory_map) in order to be able to fetch > > the EFI memory map? > > > > That should allow a PV dom0 to check the ESRT is correct and thus not > > diverge from bate metal. > > Note the EFI memory map there is used not just to check things, but to > actually modify it to reserve the region. I think that's rather Xen > responsibility, not dom0. See the comment from Ard. But that would modify Linux copy of the memory map, which is fine? My understanding of EFI is limited, but I don't think such changes are feed back into EFI, so Linux is completely free to do whatever it wants with it's copy of the EFI memory map. > > > > > > This is a requirement for using fwupd in PV dom0 to update UEFI using > > > capsules. > > > > > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> > > > --- > > > drivers/firmware/efi/esrt.c | 47 ++++++++++++++++++++----------------- > > > 1 file changed, 25 insertions(+), 22 deletions(-) > > > > > > diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c > > > index d5915272141f..5c49f2aaa4b1 100644 > > > --- a/drivers/firmware/efi/esrt.c > > > +++ b/drivers/firmware/efi/esrt.c > > > @@ -245,36 +245,38 @@ void __init efi_esrt_init(void) > > > int rc; > > > phys_addr_t end; > > > > > > - if (!efi_enabled(EFI_MEMMAP)) > > > + if (!efi_enabled(EFI_MEMMAP) && !efi_enabled(EFI_PARAVIRT)) > > > return; > > > > > > pr_debug("esrt-init: loading.\n"); > > > if (!esrt_table_exists()) > > > return; > > > > > > - rc = efi_mem_desc_lookup(efi.esrt, &md); > > > - if (rc < 0 || > > > - (!(md.attribute & EFI_MEMORY_RUNTIME) && > > > - md.type != EFI_BOOT_SERVICES_DATA && > > > - md.type != EFI_RUNTIME_SERVICES_DATA)) { > > > - pr_warn("ESRT header is not in the memory map.\n"); > > > - return; > > > - } > > > + if (efi_enabled(EFI_MEMMAP)) { > > > + rc = efi_mem_desc_lookup(efi.esrt, &md); > > > + if (rc < 0 || > > > + (!(md.attribute & EFI_MEMORY_RUNTIME) && > > > + md.type != EFI_BOOT_SERVICES_DATA && > > > + md.type != EFI_RUNTIME_SERVICES_DATA)) { > > > + pr_warn("ESRT header is not in the memory map.\n"); > > > + return; > > > + } > > > > Here you blindly trust the data in the ESRT in the PV case, without > > checking it matches the regions on the memory map, which could lead to > > errors if ESRT turns to be wrong. > > I don't think checking merely if ESRT lives somewhere in > EFI_{BOOT,RUNTIME}_SERVICES_DATA area guarantees its correctness. > > On the other hand, this seems to be done to prevent overwriting that > memory with something else (see that in case of EFI_BOOT_SERVICES_DATA > it is later marked as reserved. I think it should be rather done by Xen, > not dom0. Forcing Xen to do all those checks seems quite a tedious work, and in fact the memory map that dom0 has might be more complete than the one Xen is able to construct, as Xen doesn't have an AML parser so it's not able to get all the possible info from ACPI. Thanks, Roger.
On Tue, Aug 18, 2020 at 02:47:10PM +0200, Roger Pau Monné wrote: > On Tue, Aug 18, 2020 at 02:01:35PM +0200, Marek Marczykowski-Górecki wrote: > > Do you mean PV dom0 should receive full EFI memory map? Jan already > > objected this as it would be a layering violation. > > dom0 is already capable of getting the native e820 memory map using > XENMEM_machine_memory_map, I'm not sure why allowing to return the > memory map in EFI form would be any different (or a layering > violation in the PV dom0 case). > > Do you have a reference to that thread? I certainly missed it. See this thread: http://markmail.org/message/nrrvuau5whebksy2 > For PVH dom0 we could consider something different, since in that case > there's a guest memory map which could likely be returned in EFI > format and with the EFI regions if required. > > > > > Skip this part on Xen PV (let Xen do the right thing if it deems > > > > necessary) and use ESRT table normally. > > > > > > Maybe it would be better to introduce a new hypercall (or add a > > > parameter to XENMEM_machine_memory_map) in order to be able to fetch > > > the EFI memory map? > > > > > > That should allow a PV dom0 to check the ESRT is correct and thus not > > > diverge from bate metal. > > > > Note the EFI memory map there is used not just to check things, but to > > actually modify it to reserve the region. I think that's rather Xen > > responsibility, not dom0. See the comment from Ard. > > But that would modify Linux copy of the memory map, which is fine? My > understanding of EFI is limited, but I don't think such changes are > feed back into EFI, so Linux is completely free to do whatever it > wants with it's copy of the EFI memory map. Yes, but the thing is to make sure Xen doesn't use that memory, not only dom0. See below. > > > > + if (efi_enabled(EFI_MEMMAP)) { > > > > + rc = efi_mem_desc_lookup(efi.esrt, &md); > > > > + if (rc < 0 || > > > > + (!(md.attribute & EFI_MEMORY_RUNTIME) && > > > > + md.type != EFI_BOOT_SERVICES_DATA && > > > > + md.type != EFI_RUNTIME_SERVICES_DATA)) { > > > > + pr_warn("ESRT header is not in the memory map.\n"); > > > > + return; > > > > + } > > > > > > Here you blindly trust the data in the ESRT in the PV case, without > > > checking it matches the regions on the memory map, which could lead to > > > errors if ESRT turns to be wrong. > > > > I don't think checking merely if ESRT lives somewhere in > > EFI_{BOOT,RUNTIME}_SERVICES_DATA area guarantees its correctness. > > > > On the other hand, this seems to be done to prevent overwriting that > > memory with something else (see that in case of EFI_BOOT_SERVICES_DATA > > it is later marked as reserved. I think it should be rather done by Xen, > > not dom0. > > Forcing Xen to do all those checks seems quite a tedious work, and in > fact the memory map that dom0 has might be more complete than the one > Xen is able to construct, as Xen doesn't have an AML parser so it's > not able to get all the possible info from ACPI. Let me draw the picture from the beginning. EFI memory map contains various memory regions. Some of them are marked as not needed after ExitBootServices() call (done in Xen before launching dom0). This includes EFI_BOOT_SERVICES_DATA and EFI_BOOT_SERVICES_CODE. EFI SystemTable contains pointers to various ConfigurationTables - physical addresses (at least in this case). Xen does interpret some of them, but not ESRT. Xen pass the whole (address of) SystemTable to Linux dom0 (at least in PV case). Xen doesn't do anything about tables it doesn't understand. Now, the code in Linux takes the (ESRT) table address early and checks the memory map for it. We have 3 cases: - it points at area marked as neither EFI_*_SERVICES_DATA, nor with EFI_MEMORY_RUNTIME attribute -> Linux refuse to use it - it points to EFI_RUNTIME_SERVICES_DATA or with EFI_MEMORY_RUNTIME attribute - Linux uses the table; memory map already says the area belongs to EFI and the OS should not use it for something else - it points to EFI_BOOT_SERVICES_DATA - Linux mark the area as reserved to not release it after calling ExitBootServices() The problematic is the third case - at the time when Linux dom0 is run, ExitBootServices() was already called and EFI_BOOT_SERVICES_* memory was already released. It could be already used for something else (for example Xen could overwrite it while loading dom0). Note the problematic case should be the most common - UEFI specification says "The ESRT shall be stored in memory of type EfiBootServicesData" (chapter 22.3 of UEFI Spec v2.6). For this reason, to use ESRT in dom0, Xen should do something about it before ExitBootServices() call. While analyzing all the EFI tables is probably not a viable option, it can do some simple action: - retains all the EFI_BOOT_SERVICES_* areas - there is already code for that, controlled with /mapbs boot switch (to xen.efi, would need another option for multiboot2+efi) - have a list of tables to retain - since Xen already do analyze some of the ConfigurationTables, it can also have a list of those to preserve even if they live in EFI_BOOT_SERVICES_DATA. In this case, while Xen doesn't need to parse the whole table, it need to parse it's header to get the table size - to reserve that memory and not reuse it after ExitBootServices(). I think the second solution is slightly more elegant.
On Tue, Aug 18, 2020 at 05:00:20PM +0200, Marek Marczykowski-Górecki wrote: > On Tue, Aug 18, 2020 at 02:47:10PM +0200, Roger Pau Monné wrote: > > On Tue, Aug 18, 2020 at 02:01:35PM +0200, Marek Marczykowski-Górecki wrote: > > > Do you mean PV dom0 should receive full EFI memory map? Jan already > > > objected this as it would be a layering violation. > > > > dom0 is already capable of getting the native e820 memory map using > > XENMEM_machine_memory_map, I'm not sure why allowing to return the > > memory map in EFI form would be any different (or a layering > > violation in the PV dom0 case). > > > > Do you have a reference to that thread? I certainly missed it. > > See this thread: http://markmail.org/message/nrrvuau5whebksy2 > > > For PVH dom0 we could consider something different, since in that case > > there's a guest memory map which could likely be returned in EFI > > format and with the EFI regions if required. > > > > > > > Skip this part on Xen PV (let Xen do the right thing if it deems > > > > > necessary) and use ESRT table normally. > > > > > > > > Maybe it would be better to introduce a new hypercall (or add a > > > > parameter to XENMEM_machine_memory_map) in order to be able to fetch > > > > the EFI memory map? > > > > > > > > That should allow a PV dom0 to check the ESRT is correct and thus not > > > > diverge from bate metal. > > > > > > Note the EFI memory map there is used not just to check things, but to > > > actually modify it to reserve the region. I think that's rather Xen > > > responsibility, not dom0. See the comment from Ard. > > > > But that would modify Linux copy of the memory map, which is fine? My > > understanding of EFI is limited, but I don't think such changes are > > feed back into EFI, so Linux is completely free to do whatever it > > wants with it's copy of the EFI memory map. > > Yes, but the thing is to make sure Xen doesn't use that memory, not only > dom0. See below. > > > > > > + if (efi_enabled(EFI_MEMMAP)) { > > > > > + rc = efi_mem_desc_lookup(efi.esrt, &md); > > > > > + if (rc < 0 || > > > > > + (!(md.attribute & EFI_MEMORY_RUNTIME) && > > > > > + md.type != EFI_BOOT_SERVICES_DATA && > > > > > + md.type != EFI_RUNTIME_SERVICES_DATA)) { > > > > > + pr_warn("ESRT header is not in the memory map.\n"); > > > > > + return; > > > > > + } > > > > > > > > Here you blindly trust the data in the ESRT in the PV case, without > > > > checking it matches the regions on the memory map, which could lead to > > > > errors if ESRT turns to be wrong. > > > > > > I don't think checking merely if ESRT lives somewhere in > > > EFI_{BOOT,RUNTIME}_SERVICES_DATA area guarantees its correctness. > > > > > > On the other hand, this seems to be done to prevent overwriting that > > > memory with something else (see that in case of EFI_BOOT_SERVICES_DATA > > > it is later marked as reserved. I think it should be rather done by Xen, > > > not dom0. > > > > Forcing Xen to do all those checks seems quite a tedious work, and in > > fact the memory map that dom0 has might be more complete than the one > > Xen is able to construct, as Xen doesn't have an AML parser so it's > > not able to get all the possible info from ACPI. > > Let me draw the picture from the beginning. Thanks, greatly appreciated. > EFI memory map contains various memory regions. Some of them are marked > as not needed after ExitBootServices() call (done in Xen before > launching dom0). This includes EFI_BOOT_SERVICES_DATA and > EFI_BOOT_SERVICES_CODE. > > EFI SystemTable contains pointers to various ConfigurationTables - > physical addresses (at least in this case). Xen does interpret some of > them, but not ESRT. Xen pass the whole (address of) SystemTable to Linux > dom0 (at least in PV case). Xen doesn't do anything about tables it > doesn't understand. > > Now, the code in Linux takes the (ESRT) table address early and checks > the memory map for it. We have 3 cases: > - it points at area marked as neither EFI_*_SERVICES_DATA, nor with > EFI_MEMORY_RUNTIME attribute -> Linux refuse to use it > - it points to EFI_RUNTIME_SERVICES_DATA or with EFI_MEMORY_RUNTIME > attribute - Linux uses the table; memory map already says the area > belongs to EFI and the OS should not use it for something else > - it points to EFI_BOOT_SERVICES_DATA - Linux mark the area as reserved > to not release it after calling ExitBootServices() > > The problematic is the third case - at the time when Linux dom0 is run, > ExitBootServices() was already called and EFI_BOOT_SERVICES_* memory was > already released. It could be already used for something else (for > example Xen could overwrite it while loading dom0). > > Note the problematic case should be the most common - UEFI specification > says "The ESRT shall be stored in memory of type EfiBootServicesData" > (chapter 22.3 of UEFI Spec v2.6). > > For this reason, to use ESRT in dom0, Xen should do something about it > before ExitBootServices() call. While analyzing all the EFI tables is > probably not a viable option, it can do some simple action: > - retains all the EFI_BOOT_SERVICES_* areas - there is already code > for that, controlled with /mapbs boot switch (to xen.efi, would need > another option for multiboot2+efi) > - have a list of tables to retain - since Xen already do analyze some > of the ConfigurationTables, it can also have a list of those to > preserve even if they live in EFI_BOOT_SERVICES_DATA. In this case, > while Xen doesn't need to parse the whole table, it need to parse it's > header to get the table size - to reserve that memory and not reuse > it after ExitBootServices(). Xen seems to already contain skeleton XEN_EFI_query_capsule_capabilities and XEN_EFI_update_capsule hypercalls which is what should be used in order to perform the updates? So yes, I agree Xen should make sure the region of the table is not freed when exiting boot services, and that dom0 can access it. I guess we should move the checks done by Linux to Xen, and then only provide the ESRT table to dom0 if the checks (now done by Xen) pass. It might be helpful to see the whole picture here with the hooks to perform the updates also implemented, as those are missing in Xen (and Linux?). That would give a clearer view of what you are trying to achieve IMO. Thanks, Roger.
On Tue, Aug 18, 2020 at 07:21:14PM +0200, Roger Pau Monné wrote: > > Let me draw the picture from the beginning. > > Thanks, greatly appreciated. > > > EFI memory map contains various memory regions. Some of them are marked > > as not needed after ExitBootServices() call (done in Xen before > > launching dom0). This includes EFI_BOOT_SERVICES_DATA and > > EFI_BOOT_SERVICES_CODE. > > > > EFI SystemTable contains pointers to various ConfigurationTables - > > physical addresses (at least in this case). Xen does interpret some of > > them, but not ESRT. Xen pass the whole (address of) SystemTable to Linux > > dom0 (at least in PV case). Xen doesn't do anything about tables it > > doesn't understand. > > > > Now, the code in Linux takes the (ESRT) table address early and checks > > the memory map for it. We have 3 cases: > > - it points at area marked as neither EFI_*_SERVICES_DATA, nor with > > EFI_MEMORY_RUNTIME attribute -> Linux refuse to use it > > - it points to EFI_RUNTIME_SERVICES_DATA or with EFI_MEMORY_RUNTIME > > attribute - Linux uses the table; memory map already says the area > > belongs to EFI and the OS should not use it for something else > > - it points to EFI_BOOT_SERVICES_DATA - Linux mark the area as reserved > > to not release it after calling ExitBootServices() > > > > The problematic is the third case - at the time when Linux dom0 is run, > > ExitBootServices() was already called and EFI_BOOT_SERVICES_* memory was > > already released. It could be already used for something else (for > > example Xen could overwrite it while loading dom0). > > > > Note the problematic case should be the most common - UEFI specification > > says "The ESRT shall be stored in memory of type EfiBootServicesData" > > (chapter 22.3 of UEFI Spec v2.6). > > > > For this reason, to use ESRT in dom0, Xen should do something about it > > before ExitBootServices() call. While analyzing all the EFI tables is > > probably not a viable option, it can do some simple action: > > - retains all the EFI_BOOT_SERVICES_* areas - there is already code > > for that, controlled with /mapbs boot switch (to xen.efi, would need > > another option for multiboot2+efi) > > - have a list of tables to retain - since Xen already do analyze some > > of the ConfigurationTables, it can also have a list of those to > > preserve even if they live in EFI_BOOT_SERVICES_DATA. In this case, > > while Xen doesn't need to parse the whole table, it need to parse it's > > header to get the table size - to reserve that memory and not reuse > > it after ExitBootServices(). > > Xen seems to already contain skeleton > XEN_EFI_query_capsule_capabilities and XEN_EFI_update_capsule > hypercalls which is what should be used in order to perform the > updates? I think those covers only runtime service calls similarly named. But you need also ESRT table to collect info about devices that you can even attempt to update. TBH, I'm not sure if those runtime services are really needed. I think Norbert succeeded UEFI update from within Xen PV dom0 with just access to the ESRT table, but without those services. > So yes, I agree Xen should make sure the region of the table is not > freed when exiting boot services, and that dom0 can access it. I > guess we should move the checks done by Linux to Xen, and then only > provide the ESRT table to dom0 if the checks (now done by Xen) pass. Yes, something like this. But note currently in the (PV) dom0 case, Xen provides dom0 with a pointer to the whole SystemTable, not individual ConfigurationTables. Making it filter what dom0 gets would require Xen to re-construct the whole thing with just those elements that are desired. Not exactly sure if worth the effort given the privilege dom0 has. BTW How does it look in PVH dom0 case? Does it also get unmodified host EFI SystemTable? In that case, it would be more tricky, because (IIUC) physical addresses (like the one for ESRT table) are not meaningful to PVH dom0. > It might be helpful to see the whole picture here with the hooks to > perform the updates also implemented, as those are missing in Xen (and > Linux?). That would give a clearer view of what you are trying to > achieve IMO. Norbert, can you shed some light on this process? While those two runtime services seems relevant, I see also an update process involving simply dropping some file into ESP (/boot/efi). I'm not sure if some runtime services were involved.
On 18.08.2020 14:47, Roger Pau Monné wrote: > On Tue, Aug 18, 2020 at 02:01:35PM +0200, Marek Marczykowski-Górecki wrote: >> On Mon, Aug 17, 2020 at 11:00:13AM +0200, Roger Pau Monné wrote: >>> On Sun, Aug 16, 2020 at 02:19:49AM +0200, Marek Marczykowski-Górecki wrote: >>>> In case of Xen PV dom0, Xen passes along info about system tables (see >>>> arch/x86/xen/efi.c), but not the memory map from EFI. >>> >>> I think that's because the memory map returned by >>> XENMEM_machine_memory_map is in e820 form, and doesn't contain the >>> required information about the EFI regions due to the translation done >>> by efi_arch_process_memory_map in Xen? >> >> Yes, I think so. >> >>>> This makes sense >>>> as it is Xen responsible for managing physical memory address space. >>>> In this case, it doesn't make sense to condition using ESRT table on >>>> availability of EFI memory map, as it isn't Linux kernel responsible for >>>> it. >>> >>> PV dom0 is kind of special in that regard as it can create mappings to >>> (almost) any MMIO regions, and hence can change it's memory map >>> substantially. >> >> Do you mean PV dom0 should receive full EFI memory map? Jan already >> objected this as it would be a layering violation. > > dom0 is already capable of getting the native e820 memory map using > XENMEM_machine_memory_map, I'm not sure why allowing to return the > memory map in EFI form would be any different (or a layering > violation in the PV dom0 case). The EFI memory map exposes more information than the E820 one, and this extra information should remain private to Xen if at all possible. I actually think that exposing the raw E820 map was a layering violation, too. Instead hypercalls should have been added for the specific legitimate uses that Dom0 may have for the memmap. Jan
On Tue, Aug 18, 2020 at 08:40:18PM +0200, Marek Marczykowski-Górecki wrote: > On Tue, Aug 18, 2020 at 07:21:14PM +0200, Roger Pau Monné wrote: > > > Let me draw the picture from the beginning. > > > > Thanks, greatly appreciated. > > > > > EFI memory map contains various memory regions. Some of them are marked > > > as not needed after ExitBootServices() call (done in Xen before > > > launching dom0). This includes EFI_BOOT_SERVICES_DATA and > > > EFI_BOOT_SERVICES_CODE. > > > > > > EFI SystemTable contains pointers to various ConfigurationTables - > > > physical addresses (at least in this case). Xen does interpret some of > > > them, but not ESRT. Xen pass the whole (address of) SystemTable to Linux > > > dom0 (at least in PV case). Xen doesn't do anything about tables it > > > doesn't understand. > > > > > > Now, the code in Linux takes the (ESRT) table address early and checks > > > the memory map for it. We have 3 cases: > > > - it points at area marked as neither EFI_*_SERVICES_DATA, nor with > > > EFI_MEMORY_RUNTIME attribute -> Linux refuse to use it > > > - it points to EFI_RUNTIME_SERVICES_DATA or with EFI_MEMORY_RUNTIME > > > attribute - Linux uses the table; memory map already says the area > > > belongs to EFI and the OS should not use it for something else > > > - it points to EFI_BOOT_SERVICES_DATA - Linux mark the area as reserved > > > to not release it after calling ExitBootServices() > > > > > > The problematic is the third case - at the time when Linux dom0 is run, > > > ExitBootServices() was already called and EFI_BOOT_SERVICES_* memory was > > > already released. It could be already used for something else (for > > > example Xen could overwrite it while loading dom0). > > > > > > Note the problematic case should be the most common - UEFI specification > > > says "The ESRT shall be stored in memory of type EfiBootServicesData" > > > (chapter 22.3 of UEFI Spec v2.6). > > > > > > For this reason, to use ESRT in dom0, Xen should do something about it > > > before ExitBootServices() call. While analyzing all the EFI tables is > > > probably not a viable option, it can do some simple action: > > > - retains all the EFI_BOOT_SERVICES_* areas - there is already code > > > for that, controlled with /mapbs boot switch (to xen.efi, would need > > > another option for multiboot2+efi) > > > - have a list of tables to retain - since Xen already do analyze some > > > of the ConfigurationTables, it can also have a list of those to > > > preserve even if they live in EFI_BOOT_SERVICES_DATA. In this case, > > > while Xen doesn't need to parse the whole table, it need to parse it's > > > header to get the table size - to reserve that memory and not reuse > > > it after ExitBootServices(). > > > > Xen seems to already contain skeleton > > XEN_EFI_query_capsule_capabilities and XEN_EFI_update_capsule > > hypercalls which is what should be used in order to perform the > > updates? > > I think those covers only runtime service calls similarly named. But you > need also ESRT table to collect info about devices that you can even > attempt to update. Right, the ESRT must be available so that dom0 can discover the resources. > TBH, I'm not sure if those runtime services are really needed. I think > Norbert succeeded UEFI update from within Xen PV dom0 with just access > to the ESRT table, but without those services. OK, by reading the UEFI spec I assumed that you needed access to QueryCapsuleCapabilities and UpdateCapsule in order to perform the updates, and those should be proxied using hyopercalls. Maybe this is not mandatory and there's a side-band mechanism of doing this? I think we need more info here. > > So yes, I agree Xen should make sure the region of the table is not > > freed when exiting boot services, and that dom0 can access it. I > > guess we should move the checks done by Linux to Xen, and then only > > provide the ESRT table to dom0 if the checks (now done by Xen) pass. > > Yes, something like this. But note currently in the (PV) dom0 case, Xen > provides dom0 with a pointer to the whole SystemTable, not individual > ConfigurationTables. Making it filter what dom0 gets would require Xen > to re-construct the whole thing with just those elements that are > desired. Not exactly sure if worth the effort given the privilege dom0 > has. We already do this for ACPI in PVH dom0, where Xen rebuilds the RSDT in order to filter out tables that shouldn't be exposed to dom0. If possible using something similar for UEFI would be my preference, but I certainly haven't investigated at all whether this is feasible. > BTW How does it look in PVH dom0 case? Does it also get unmodified host > EFI SystemTable? In that case, it would be more tricky, because (IIUC) > physical addresses (like the one for ESRT table) are not meaningful to > PVH dom0. For PVH dom0 we should make sure the ESRT is identity mapped into the physmap, so that dom0 has access to it. PVH dom0 gets a physical memory map that's basically the native one with the RAM regions adjusted to match the assigned memory. We already identity map a bunch of stuff there, so identity mapping the ESRT would be likely fine. > > It might be helpful to see the whole picture here with the hooks to > > perform the updates also implemented, as those are missing in Xen (and > > Linux?). That would give a clearer view of what you are trying to > > achieve IMO. > > Norbert, can you shed some light on this process? > > While those two runtime services seems relevant, I see also an update > process involving simply dropping some file into ESP (/boot/efi). I'm > not sure if some runtime services were involved. So then the update is done when rebooting? If we expose the ESRT we should also make sure the run-time services related to it are available. Thanks, Roger.
On 19.08.2020 10:19, Roger Pau Monné wrote: > On Tue, Aug 18, 2020 at 08:40:18PM +0200, Marek Marczykowski-Górecki wrote: >> On Tue, Aug 18, 2020 at 07:21:14PM +0200, Roger Pau Monné wrote: >>>> Let me draw the picture from the beginning. >>> Thanks, greatly appreciated. >>> >>>> EFI memory map contains various memory regions. Some of them are marked >>>> as not needed after ExitBootServices() call (done in Xen before >>>> launching dom0). This includes EFI_BOOT_SERVICES_DATA and >>>> EFI_BOOT_SERVICES_CODE. >>>> >>>> EFI SystemTable contains pointers to various ConfigurationTables - >>>> physical addresses (at least in this case). Xen does interpret some of >>>> them, but not ESRT. Xen pass the whole (address of) SystemTable to Linux >>>> dom0 (at least in PV case). Xen doesn't do anything about tables it >>>> doesn't understand. >>>> >>>> Now, the code in Linux takes the (ESRT) table address early and checks >>>> the memory map for it. We have 3 cases: >>>> - it points at area marked as neither EFI_*_SERVICES_DATA, nor with >>>> EFI_MEMORY_RUNTIME attribute -> Linux refuse to use it >>>> - it points to EFI_RUNTIME_SERVICES_DATA or with EFI_MEMORY_RUNTIME >>>> attribute - Linux uses the table; memory map already says the area >>>> belongs to EFI and the OS should not use it for something else >>>> - it points to EFI_BOOT_SERVICES_DATA - Linux mark the area as reserved >>>> to not release it after calling ExitBootServices() >>>> >>>> The problematic is the third case - at the time when Linux dom0 is run, >>>> ExitBootServices() was already called and EFI_BOOT_SERVICES_* memory was >>>> already released. It could be already used for something else (for >>>> example Xen could overwrite it while loading dom0). >>>> >>>> Note the problematic case should be the most common - UEFI specification >>>> says "The ESRT shall be stored in memory of type EfiBootServicesData" >>>> (chapter 22.3 of UEFI Spec v2.6). >>>> >>>> For this reason, to use ESRT in dom0, Xen should do something about it >>>> before ExitBootServices() call. While analyzing all the EFI tables is >>>> probably not a viable option, it can do some simple action: >>>> - retains all the EFI_BOOT_SERVICES_* areas - there is already code >>>> for that, controlled with /mapbs boot switch (to xen.efi, would need >>>> another option for multiboot2+efi) >>>> - have a list of tables to retain - since Xen already do analyze some >>>> of the ConfigurationTables, it can also have a list of those to >>>> preserve even if they live in EFI_BOOT_SERVICES_DATA. In this case, >>>> while Xen doesn't need to parse the whole table, it need to parse it's >>>> header to get the table size - to reserve that memory and not reuse >>>> it after ExitBootServices(). >>> Xen seems to already contain skeleton >>> XEN_EFI_query_capsule_capabilities and XEN_EFI_update_capsule >>> hypercalls which is what should be used in order to perform the >>> updates? >> I think those covers only runtime service calls similarly named. But you >> need also ESRT table to collect info about devices that you can even >> attempt to update. > Right, the ESRT must be available so that dom0 can discover the > resources. > >> TBH, I'm not sure if those runtime services are really needed. I think >> Norbert succeeded UEFI update from within Xen PV dom0 with just access >> to the ESRT table, but without those services. >> Marek is right here. I was able to successfully update and downgrade UFEI when the ESRT table was provided to the Xen PV dom0. I didn't need any extra services to make the UEFI capsule update work. > OK, by reading the UEFI spec I assumed that you needed access to > QueryCapsuleCapabilities and UpdateCapsule in order to perform the > updates, and those should be proxied using hyopercalls. Maybe this is > not mandatory and there's a side-band mechanism of doing this? > > I think we need more info here. > >>> So yes, I agree Xen should make sure the region of the table is not >>> freed when exiting boot services, and that dom0 can access it. I >>> guess we should move the checks done by Linux to Xen, and then only >>> provide the ESRT table to dom0 if the checks (now done by Xen) pass. >> Yes, something like this. But note currently in the (PV) dom0 case, Xen >> provides dom0 with a pointer to the whole SystemTable, not individual >> ConfigurationTables. Making it filter what dom0 gets would require Xen >> to re-construct the whole thing with just those elements that are >> desired. Not exactly sure if worth the effort given the privilege dom0 >> has. > We already do this for ACPI in PVH dom0, where Xen rebuilds the RSDT > in order to filter out tables that shouldn't be exposed to dom0. If > possible using something similar for UEFI would be my preference, but > I certainly haven't investigated at all whether this is feasible. > >> BTW How does it look in PVH dom0 case? Does it also get unmodified host >> EFI SystemTable? In that case, it would be more tricky, because (IIUC) >> physical addresses (like the one for ESRT table) are not meaningful to >> PVH dom0. > For PVH dom0 we should make sure the ESRT is identity mapped into the > physmap, so that dom0 has access to it. PVH dom0 gets a physical > memory map that's basically the native one with the RAM regions > adjusted to match the assigned memory. > > We already identity map a bunch of stuff there, so identity mapping > the ESRT would be likely fine. > >>> It might be helpful to see the whole picture here with the hooks to >>> perform the updates also implemented, as those are missing in Xen (and >>> Linux?). That would give a clearer view of what you are trying to >>> achieve IMO. >> Norbert, can you shed some light on this process? >> >> While those two runtime services seems relevant, I see also an update >> process involving simply dropping some file into ESP (/boot/efi). I'm >> not sure if some runtime services were involved. > So then the update is done when rebooting? If we expose the ESRT we > should also make sure the run-time services related to it are > available. Fwupd uses system firmware GUID to recognize its type. UEFI GUID is provided in the ESRT. Then fwupd checks if the updates/downgrades are available. In the next step the tool downloads and extracts cabinet archives in the EFI capsule file format and the capsule updates firmware after the OS reboot. --- Best Regards, Norbert Kamiński Embedded Systems Engineer GPG key ID: 9E9F90AFE10F466A 3mdeb.com
On Wed, Aug 19, 2020 at 01:33:39PM +0200, Norbert Kaminski wrote: > > On 19.08.2020 10:19, Roger Pau Monné wrote: > > On Tue, Aug 18, 2020 at 08:40:18PM +0200, Marek Marczykowski-Górecki wrote: > > > On Tue, Aug 18, 2020 at 07:21:14PM +0200, Roger Pau Monné wrote: > > > > > Let me draw the picture from the beginning. > > > > Thanks, greatly appreciated. > > > > > > > > > EFI memory map contains various memory regions. Some of them are marked > > > > > as not needed after ExitBootServices() call (done in Xen before > > > > > launching dom0). This includes EFI_BOOT_SERVICES_DATA and > > > > > EFI_BOOT_SERVICES_CODE. > > > > > > > > > > EFI SystemTable contains pointers to various ConfigurationTables - > > > > > physical addresses (at least in this case). Xen does interpret some of > > > > > them, but not ESRT. Xen pass the whole (address of) SystemTable to Linux > > > > > dom0 (at least in PV case). Xen doesn't do anything about tables it > > > > > doesn't understand. > > > > > > > > > > Now, the code in Linux takes the (ESRT) table address early and checks > > > > > the memory map for it. We have 3 cases: > > > > > - it points at area marked as neither EFI_*_SERVICES_DATA, nor with > > > > > EFI_MEMORY_RUNTIME attribute -> Linux refuse to use it > > > > > - it points to EFI_RUNTIME_SERVICES_DATA or with EFI_MEMORY_RUNTIME > > > > > attribute - Linux uses the table; memory map already says the area > > > > > belongs to EFI and the OS should not use it for something else > > > > > - it points to EFI_BOOT_SERVICES_DATA - Linux mark the area as reserved > > > > > to not release it after calling ExitBootServices() > > > > > > > > > > The problematic is the third case - at the time when Linux dom0 is run, > > > > > ExitBootServices() was already called and EFI_BOOT_SERVICES_* memory was > > > > > already released. It could be already used for something else (for > > > > > example Xen could overwrite it while loading dom0). > > > > > > > > > > Note the problematic case should be the most common - UEFI specification > > > > > says "The ESRT shall be stored in memory of type EfiBootServicesData" > > > > > (chapter 22.3 of UEFI Spec v2.6). > > > > > > > > > > For this reason, to use ESRT in dom0, Xen should do something about it > > > > > before ExitBootServices() call. While analyzing all the EFI tables is > > > > > probably not a viable option, it can do some simple action: > > > > > - retains all the EFI_BOOT_SERVICES_* areas - there is already code > > > > > for that, controlled with /mapbs boot switch (to xen.efi, would need > > > > > another option for multiboot2+efi) > > > > > - have a list of tables to retain - since Xen already do analyze some > > > > > of the ConfigurationTables, it can also have a list of those to > > > > > preserve even if they live in EFI_BOOT_SERVICES_DATA. In this case, > > > > > while Xen doesn't need to parse the whole table, it need to parse it's > > > > > header to get the table size - to reserve that memory and not reuse > > > > > it after ExitBootServices(). > > > > Xen seems to already contain skeleton > > > > XEN_EFI_query_capsule_capabilities and XEN_EFI_update_capsule > > > > hypercalls which is what should be used in order to perform the > > > > updates? > > > I think those covers only runtime service calls similarly named. But you > > > need also ESRT table to collect info about devices that you can even > > > attempt to update. > > Right, the ESRT must be available so that dom0 can discover the > > resources. > > > > > TBH, I'm not sure if those runtime services are really needed. I think > > > Norbert succeeded UEFI update from within Xen PV dom0 with just access > > > to the ESRT table, but without those services. > > > > Marek is right here. I was able to successfully update and downgrade > UFEI when the ESRT table was provided to the Xen PV dom0. I didn't > need any extra services to make the UEFI capsule update work. OK, I think that's using the method described in 8.5.5 of delivery of Capsules via file on Mass Storage, which doesn't use the UpdateCapsule() runtime API? Using such method doesn't require QueryCapsuleCapabilities(), as that's used to know whether a certain capsule can be updated via UpdateCapsule(). > > OK, by reading the UEFI spec I assumed that you needed access to > > QueryCapsuleCapabilities and UpdateCapsule in order to perform the > > updates, and those should be proxied using hyopercalls. Maybe this is > > not mandatory and there's a side-band mechanism of doing this? > > > > I think we need more info here. > > > > > > So yes, I agree Xen should make sure the region of the table is not > > > > freed when exiting boot services, and that dom0 can access it. I > > > > guess we should move the checks done by Linux to Xen, and then only > > > > provide the ESRT table to dom0 if the checks (now done by Xen) pass. > > > Yes, something like this. But note currently in the (PV) dom0 case, Xen > > > provides dom0 with a pointer to the whole SystemTable, not individual > > > ConfigurationTables. Making it filter what dom0 gets would require Xen > > > to re-construct the whole thing with just those elements that are > > > desired. Not exactly sure if worth the effort given the privilege dom0 > > > has. > > We already do this for ACPI in PVH dom0, where Xen rebuilds the RSDT > > in order to filter out tables that shouldn't be exposed to dom0. If > > possible using something similar for UEFI would be my preference, but > > I certainly haven't investigated at all whether this is feasible. > > > > > BTW How does it look in PVH dom0 case? Does it also get unmodified host > > > EFI SystemTable? In that case, it would be more tricky, because (IIUC) > > > physical addresses (like the one for ESRT table) are not meaningful to > > > PVH dom0. > > For PVH dom0 we should make sure the ESRT is identity mapped into the > > physmap, so that dom0 has access to it. PVH dom0 gets a physical > > memory map that's basically the native one with the RAM regions > > adjusted to match the assigned memory. > > > > We already identity map a bunch of stuff there, so identity mapping > > the ESRT would be likely fine. > > > > > > It might be helpful to see the whole picture here with the hooks to > > > > perform the updates also implemented, as those are missing in Xen (and > > > > Linux?). That would give a clearer view of what you are trying to > > > > achieve IMO. > > > Norbert, can you shed some light on this process? > > > > > > While those two runtime services seems relevant, I see also an update > > > process involving simply dropping some file into ESP (/boot/efi). I'm > > > not sure if some runtime services were involved. > > So then the update is done when rebooting? If we expose the ESRT we > > should also make sure the run-time services related to it are > > available. > > Fwupd uses system firmware GUID to recognize its type. UEFI GUID is > provided in the ESRT. Then fwupd checks if the updates/downgrades are > available. In the next step the tool downloads and extracts cabinet > archives in the EFI capsule file format and the capsule updates firmware > after the OS reboot. Right, so you only need access to the ESRT table, that's all. Then I think we need to make sure Xen doesn't use this memory for anything else, which will require some changes in Xen (or at least some checks?). We also need to decide what to do if the table turns out to be placed in a wrong region. How are we going to prevent dom0 from using it then? My preference would be to completely hide it from dom0 in that case, such that it believes there's no ESRT at all if possible. Thanks, Roger.
On Thu, Aug 20, 2020 at 11:30:25AM +0200, Roger Pau Monné wrote: > Right, so you only need access to the ESRT table, that's all. Then I > think we need to make sure Xen doesn't use this memory for anything > else, which will require some changes in Xen (or at least some > checks?). > > We also need to decide what to do if the table turns out to be placed > in a wrong region. How are we going to prevent dom0 from using it > then? My preference would be to completely hide it from dom0 in that > case, such that it believes there's no ESRT at all if possible. Yes, that makes sense. As discussed earlier, that probably means re-constructing SystemTable before giving it to dom0. We'd need to do that in PVH case anyway, to adjust addresses, right? Is there something like this in the Xen codebase already, or it needs to be written from scratch?
On Thu, 20 Aug 2020 at 11:30, Roger Pau Monné <roger.pau@citrix.com> wrote: > > On Wed, Aug 19, 2020 at 01:33:39PM +0200, Norbert Kaminski wrote: > > > > On 19.08.2020 10:19, Roger Pau Monné wrote: > > > On Tue, Aug 18, 2020 at 08:40:18PM +0200, Marek Marczykowski-Górecki wrote: > > > > On Tue, Aug 18, 2020 at 07:21:14PM +0200, Roger Pau Monné wrote: > > > > > > Let me draw the picture from the beginning. > > > > > Thanks, greatly appreciated. > > > > > > > > > > > EFI memory map contains various memory regions. Some of them are marked > > > > > > as not needed after ExitBootServices() call (done in Xen before > > > > > > launching dom0). This includes EFI_BOOT_SERVICES_DATA and > > > > > > EFI_BOOT_SERVICES_CODE. > > > > > > > > > > > > EFI SystemTable contains pointers to various ConfigurationTables - > > > > > > physical addresses (at least in this case). Xen does interpret some of > > > > > > them, but not ESRT. Xen pass the whole (address of) SystemTable to Linux > > > > > > dom0 (at least in PV case). Xen doesn't do anything about tables it > > > > > > doesn't understand. > > > > > > > > > > > > Now, the code in Linux takes the (ESRT) table address early and checks > > > > > > the memory map for it. We have 3 cases: > > > > > > - it points at area marked as neither EFI_*_SERVICES_DATA, nor with > > > > > > EFI_MEMORY_RUNTIME attribute -> Linux refuse to use it > > > > > > - it points to EFI_RUNTIME_SERVICES_DATA or with EFI_MEMORY_RUNTIME > > > > > > attribute - Linux uses the table; memory map already says the area > > > > > > belongs to EFI and the OS should not use it for something else > > > > > > - it points to EFI_BOOT_SERVICES_DATA - Linux mark the area as reserved > > > > > > to not release it after calling ExitBootServices() > > > > > > > > > > > > The problematic is the third case - at the time when Linux dom0 is run, > > > > > > ExitBootServices() was already called and EFI_BOOT_SERVICES_* memory was > > > > > > already released. It could be already used for something else (for > > > > > > example Xen could overwrite it while loading dom0). > > > > > > > > > > > > Note the problematic case should be the most common - UEFI specification > > > > > > says "The ESRT shall be stored in memory of type EfiBootServicesData" > > > > > > (chapter 22.3 of UEFI Spec v2.6). > > > > > > > > > > > > For this reason, to use ESRT in dom0, Xen should do something about it > > > > > > before ExitBootServices() call. While analyzing all the EFI tables is > > > > > > probably not a viable option, it can do some simple action: > > > > > > - retains all the EFI_BOOT_SERVICES_* areas - there is already code > > > > > > for that, controlled with /mapbs boot switch (to xen.efi, would need > > > > > > another option for multiboot2+efi) > > > > > > - have a list of tables to retain - since Xen already do analyze some > > > > > > of the ConfigurationTables, it can also have a list of those to > > > > > > preserve even if they live in EFI_BOOT_SERVICES_DATA. In this case, > > > > > > while Xen doesn't need to parse the whole table, it need to parse it's > > > > > > header to get the table size - to reserve that memory and not reuse > > > > > > it after ExitBootServices(). > > > > > Xen seems to already contain skeleton > > > > > XEN_EFI_query_capsule_capabilities and XEN_EFI_update_capsule > > > > > hypercalls which is what should be used in order to perform the > > > > > updates? > > > > I think those covers only runtime service calls similarly named. But you > > > > need also ESRT table to collect info about devices that you can even > > > > attempt to update. > > > Right, the ESRT must be available so that dom0 can discover the > > > resources. > > > > > > > TBH, I'm not sure if those runtime services are really needed. I think > > > > Norbert succeeded UEFI update from within Xen PV dom0 with just access > > > > to the ESRT table, but without those services. > > > > > > Marek is right here. I was able to successfully update and downgrade > > UFEI when the ESRT table was provided to the Xen PV dom0. I didn't > > need any extra services to make the UEFI capsule update work. > > OK, I think that's using the method described in 8.5.5 of delivery of > Capsules via file on Mass Storage, which doesn't use the > UpdateCapsule() runtime API? > No, it doesn't even do that. It uses its own .efi binary to invoke UpdateCapsule() after a reboot, by setting up the BootNext variable to override the boot target for the next boot only. > Using such method doesn't require QueryCapsuleCapabilities(), as > that's used to know whether a certain capsule can be updated via > UpdateCapsule(). > That is a bit of a downside here. But the reason this approach was chosen is because that is how Windows applies capsule updates as well, and given that many x86 vendors only care about the Windows Logo sticker, and not about UEFI compliance, calling UpdateCapsule() from the OS is poorly tested, and therefore broken on many platforms. > > > OK, by reading the UEFI spec I assumed that you needed access to > > > QueryCapsuleCapabilities and UpdateCapsule in order to perform the > > > updates, and those should be proxied using hyopercalls. Maybe this is > > > not mandatory and there's a side-band mechanism of doing this? > > > > > > I think we need more info here. > > > > > > > > So yes, I agree Xen should make sure the region of the table is not > > > > > freed when exiting boot services, and that dom0 can access it. I > > > > > guess we should move the checks done by Linux to Xen, and then only > > > > > provide the ESRT table to dom0 if the checks (now done by Xen) pass. > > > > Yes, something like this. But note currently in the (PV) dom0 case, Xen > > > > provides dom0 with a pointer to the whole SystemTable, not individual > > > > ConfigurationTables. Making it filter what dom0 gets would require Xen > > > > to re-construct the whole thing with just those elements that are > > > > desired. Not exactly sure if worth the effort given the privilege dom0 > > > > has. > > > We already do this for ACPI in PVH dom0, where Xen rebuilds the RSDT > > > in order to filter out tables that shouldn't be exposed to dom0. If > > > possible using something similar for UEFI would be my preference, but > > > I certainly haven't investigated at all whether this is feasible. > > > > > > > BTW How does it look in PVH dom0 case? Does it also get unmodified host > > > > EFI SystemTable? In that case, it would be more tricky, because (IIUC) > > > > physical addresses (like the one for ESRT table) are not meaningful to > > > > PVH dom0. > > > For PVH dom0 we should make sure the ESRT is identity mapped into the > > > physmap, so that dom0 has access to it. PVH dom0 gets a physical > > > memory map that's basically the native one with the RAM regions > > > adjusted to match the assigned memory. > > > > > > We already identity map a bunch of stuff there, so identity mapping > > > the ESRT would be likely fine. > > > > > > > > It might be helpful to see the whole picture here with the hooks to > > > > > perform the updates also implemented, as those are missing in Xen (and > > > > > Linux?). That would give a clearer view of what you are trying to > > > > > achieve IMO. > > > > Norbert, can you shed some light on this process? > > > > > > > > While those two runtime services seems relevant, I see also an update > > > > process involving simply dropping some file into ESP (/boot/efi). I'm > > > > not sure if some runtime services were involved. > > > So then the update is done when rebooting? If we expose the ESRT we > > > should also make sure the run-time services related to it are > > > available. > > > > Fwupd uses system firmware GUID to recognize its type. UEFI GUID is > > provided in the ESRT. Then fwupd checks if the updates/downgrades are > > available. In the next step the tool downloads and extracts cabinet > > archives in the EFI capsule file format and the capsule updates firmware > > after the OS reboot. > > Right, so you only need access to the ESRT table, that's all. Then I > think we need to make sure Xen doesn't use this memory for anything > else, which will require some changes in Xen (or at least some > checks?). > Indeed. As long as the fwupdate tooling can access the ESRT, install its .efi binary in /boot/efi and can manipulate the BootNext EFI variable to to invoke it, it should work fine under Xen as well. > We also need to decide what to do if the table turns out to be placed > in a wrong region. How are we going to prevent dom0 from using it > then? My preference would be to completely hide it from dom0 in that > case, such that it believes there's no ESRT at all if possible. > The ESRT must be placed in memory that the OS may take ownership of, and that is why we have this check. I think it is perfectly fine to put the ESRT in memory that the OS will never touch in the first place, which is much more straight-forward.
On Thu, Aug 20, 2020 at 11:34:54AM +0200, Marek Marczykowski-Górecki wrote: > On Thu, Aug 20, 2020 at 11:30:25AM +0200, Roger Pau Monné wrote: > > Right, so you only need access to the ESRT table, that's all. Then I > > think we need to make sure Xen doesn't use this memory for anything > > else, which will require some changes in Xen (or at least some > > checks?). > > > > We also need to decide what to do if the table turns out to be placed > > in a wrong region. How are we going to prevent dom0 from using it > > then? My preference would be to completely hide it from dom0 in that > > case, such that it believes there's no ESRT at all if possible. > > Yes, that makes sense. As discussed earlier, that probably means > re-constructing SystemTable before giving it to dom0. We'd need to do > that in PVH case anyway, to adjust addresses, right? Not really, on PVH dom0 we should be able to identity map the required EFI regions in the dom0 p2m, so the only difference between a classic PV dom0 is that we need to assure that those regions are correctly identity mapped in the p2m, but that shouldn't require any change to the SystemTable unless we need to craft custom tables (see below). > Is there something > like this in the Xen codebase already, or it needs to be written from > scratch? AFAICT it needs to be written for EFI. For the purposes here I think you could copy the SystemTable and modify the NumberOfTableEntries and ConfigurationTable fields in the copy in order to delete the ESRT if found to be placed in a non suitable region? At that point we can remove the checks from Linux since Xen will assert that whatever gets passed to dom0 is in a suitable region. It would be nice to have a way to signal that the placement of the ESRT has been checked, but I'm not sure how to do this, do you have any ideas? Roger.
diff --git a/drivers/firmware/efi/esrt.c b/drivers/firmware/efi/esrt.c index d5915272141f..5c49f2aaa4b1 100644 --- a/drivers/firmware/efi/esrt.c +++ b/drivers/firmware/efi/esrt.c @@ -245,36 +245,38 @@ void __init efi_esrt_init(void) int rc; phys_addr_t end; - if (!efi_enabled(EFI_MEMMAP)) + if (!efi_enabled(EFI_MEMMAP) && !efi_enabled(EFI_PARAVIRT)) return; pr_debug("esrt-init: loading.\n"); if (!esrt_table_exists()) return; - rc = efi_mem_desc_lookup(efi.esrt, &md); - if (rc < 0 || - (!(md.attribute & EFI_MEMORY_RUNTIME) && - md.type != EFI_BOOT_SERVICES_DATA && - md.type != EFI_RUNTIME_SERVICES_DATA)) { - pr_warn("ESRT header is not in the memory map.\n"); - return; - } + if (efi_enabled(EFI_MEMMAP)) { + rc = efi_mem_desc_lookup(efi.esrt, &md); + if (rc < 0 || + (!(md.attribute & EFI_MEMORY_RUNTIME) && + md.type != EFI_BOOT_SERVICES_DATA && + md.type != EFI_RUNTIME_SERVICES_DATA)) { + pr_warn("ESRT header is not in the memory map.\n"); + return; + } - max = efi_mem_desc_end(&md); - if (max < efi.esrt) { - pr_err("EFI memory descriptor is invalid. (esrt: %p max: %p)\n", - (void *)efi.esrt, (void *)max); - return; - } + max = efi_mem_desc_end(&md); + if (max < efi.esrt) { + pr_err("EFI memory descriptor is invalid. (esrt: %p max: %p)\n", + (void *)efi.esrt, (void *)max); + return; + } - size = sizeof(*esrt); - max -= efi.esrt; + size = sizeof(*esrt); + max -= efi.esrt; - if (max < size) { - pr_err("ESRT header doesn't fit on single memory map entry. (size: %zu max: %zu)\n", - size, max); - return; + if (max < size) { + pr_err("ESRT header doesn't fit on single memory map entry. (size: %zu max: %zu)\n", + size, max); + return; + } } va = early_memremap(efi.esrt, size); @@ -331,7 +333,8 @@ void __init efi_esrt_init(void) end = esrt_data + size; pr_info("Reserving ESRT space from %pa to %pa.\n", &esrt_data, &end); - if (md.type == EFI_BOOT_SERVICES_DATA) + + if (efi_enabled(EFI_MEMMAP) && md.type == EFI_BOOT_SERVICES_DATA) efi_mem_reserve(esrt_data, esrt_data_size); pr_debug("esrt-init: loaded.\n");
In case of Xen PV dom0, Xen passes along info about system tables (see arch/x86/xen/efi.c), but not the memory map from EFI. This makes sense as it is Xen responsible for managing physical memory address space. In this case, it doesn't make sense to condition using ESRT table on availability of EFI memory map, as it isn't Linux kernel responsible for it. Skip this part on Xen PV (let Xen do the right thing if it deems necessary) and use ESRT table normally. This is a requirement for using fwupd in PV dom0 to update UEFI using capsules. Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> --- drivers/firmware/efi/esrt.c | 47 ++++++++++++++++++++----------------- 1 file changed, 25 insertions(+), 22 deletions(-)