diff mbox series

[PATCH-4.16,v2] xen/efi: Fix Grub2 boot on arm64

Message ID 20211104141206.25153-1-luca.fancellu@arm.com (mailing list archive)
State Superseded
Headers show
Series [PATCH-4.16,v2] xen/efi: Fix Grub2 boot on arm64 | expand

Commit Message

Luca Fancellu Nov. 4, 2021, 2:12 p.m. UTC
The code introduced by commit a1743fc3a9fe9b68c265c45264dddf214fd9b882
("arm/efi: Use dom0less configuration when using EFI boot") is
introducing a problem to boot Xen using Grub2 on ARM machine using EDK2.

Despite UEFI specification, EDK2+Grub2 is returning a NULL DeviceHandle
inside the interface given by the LOADED_IMAGE_PROTOCOL service, this
handle is used later by efi_bs->HandleProtocol(...) inside
get_parent_handle(...) when requesting the SIMPLE_FILE_SYSTEM_PROTOCOL
interface, causing Xen to stop the boot because of an EFI_INVALID_PARAMETER
error.

Before the commit above, the function was never called because the
logic was skipping the call when there were multiboot modules in the
DT because the filesystem was never used and the bootloader had
put in place all the right modules in memory and the addresses
in the DT.

To fix the problem we allow the get_parent_handle(...) function to
return a NULL handle on error and we check the usage of the function
to handle the new use case. The function in fact should not prevent
the boot even if the filesystem can't be used, because the DT and
the modules could be put in place by the bootloader before running
Xen and if xen,uefi-binary property is not used, there is no need
for the filesystem.

Another problem is found when the UEFI stub tries to check if Dom0
image or DomUs are present.
The logic doesn't work when the UEFI stub is not responsible to load
any modules, so the efi_check_dt_boot(...) return value is modified
to return the number of multiboot module found and not only the number
of module loaded by the stub.
Taking the occasion to update the comment in handle_module_node(...)
to explain why we return success even if xen,uefi-binary is not found.

Fixes: a1743fc3a9 ("arm/efi: Use dom0less configuration when using EFI boot")
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>

---
Justification for integration in 4.16:
Upside: allow booting xen from grub on arm64 when the stub doesn't load
        any module.
Downside: It's affecting the EFI boot path.
Risk: It's not affecting x86 arch that works the same way as before.
      If something is wrong it creates a problem on early boot and not at
      runtime, so risk is low.

Tested in this configurations:
 - Bootloader loads modules and specify them as multiboot modules in DT:
   * combination of Dom0, DomUs, Dom0 and DomUs
 - DT specifies multiboot modules in DT using xen,uefi-binary property:
   * combination of Dom0, DomUs, Dom0 and DomUs
 - Bootloader loads a Dom0 module and appends it as multiboot module in DT,
   other multiboot modules are listed for DomUs using xen,uefi-binary
 - No multiboot modules in DT and no kernel entry in cfg file:
   * proper error thrown

Changes in v2:
 - Changed comment on DeviceHandle NULL (Jan)
 - Removed fatal condition on handle NULL (Jan)
 - Add more info about the EDK2+Grub2 issue to the commit msg (Jan)
 - Removed modules_found from function signature and pass everything
   on return (Stefano)
 - Improved comment in handle_module_node

---
 xen/arch/arm/efi/efi-boot.h | 34 +++++++++++++++++++++++-----------
 xen/common/efi/boot.c       | 28 +++++++++++++++++++++++++---
 2 files changed, 48 insertions(+), 14 deletions(-)

Comments

Bertrand Marquis Nov. 4, 2021, 2:33 p.m. UTC | #1
Hi Luca,

> On 4 Nov 2021, at 14:12, Luca Fancellu <Luca.Fancellu@arm.com> wrote:
> 
> The code introduced by commit a1743fc3a9fe9b68c265c45264dddf214fd9b882
> ("arm/efi: Use dom0less configuration when using EFI boot") is
> introducing a problem to boot Xen using Grub2 on ARM machine using EDK2.
> 
> Despite UEFI specification, EDK2+Grub2 is returning a NULL DeviceHandle
> inside the interface given by the LOADED_IMAGE_PROTOCOL service, this
> handle is used later by efi_bs->HandleProtocol(...) inside
> get_parent_handle(...) when requesting the SIMPLE_FILE_SYSTEM_PROTOCOL
> interface, causing Xen to stop the boot because of an EFI_INVALID_PARAMETER
> error.
> 
> Before the commit above, the function was never called because the
> logic was skipping the call when there were multiboot modules in the
> DT because the filesystem was never used and the bootloader had
> put in place all the right modules in memory and the addresses
> in the DT.
> 
> To fix the problem we allow the get_parent_handle(...) function to
> return a NULL handle on error and we check the usage of the function
> to handle the new use case. The function in fact should not prevent
> the boot even if the filesystem can't be used, because the DT and
> the modules could be put in place by the bootloader before running
> Xen and if xen,uefi-binary property is not used, there is no need
> for the filesystem.
> 
> Another problem is found when the UEFI stub tries to check if Dom0
> image or DomUs are present.
> The logic doesn't work when the UEFI stub is not responsible to load
> any modules, so the efi_check_dt_boot(...) return value is modified
> to return the number of multiboot module found and not only the number
> of module loaded by the stub.
> Taking the occasion to update the comment in handle_module_node(...)
> to explain why we return success even if xen,uefi-binary is not found.
> 
> Fixes: a1743fc3a9 ("arm/efi: Use dom0less configuration when using EFI boot")
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>

Thanks for the detailed explanation and testing all those configurations.

Cheers
Bertrand


> 
> ---
> Justification for integration in 4.16:
> Upside: allow booting xen from grub on arm64 when the stub doesn't load
>        any module.
> Downside: It's affecting the EFI boot path.
> Risk: It's not affecting x86 arch that works the same way as before.
>      If something is wrong it creates a problem on early boot and not at
>      runtime, so risk is low.
> 
> Tested in this configurations:
> - Bootloader loads modules and specify them as multiboot modules in DT:
>   * combination of Dom0, DomUs, Dom0 and DomUs
> - DT specifies multiboot modules in DT using xen,uefi-binary property:
>   * combination of Dom0, DomUs, Dom0 and DomUs
> - Bootloader loads a Dom0 module and appends it as multiboot module in DT,
>   other multiboot modules are listed for DomUs using xen,uefi-binary
> - No multiboot modules in DT and no kernel entry in cfg file:
>   * proper error thrown
> 
> Changes in v2:
> - Changed comment on DeviceHandle NULL (Jan)
> - Removed fatal condition on handle NULL (Jan)
> - Add more info about the EDK2+Grub2 issue to the commit msg (Jan)
> - Removed modules_found from function signature and pass everything
>   on return (Stefano)
> - Improved comment in handle_module_node
> 
> ---
> xen/arch/arm/efi/efi-boot.h | 34 +++++++++++++++++++++++-----------
> xen/common/efi/boot.c       | 28 +++++++++++++++++++++++++---
> 2 files changed, 48 insertions(+), 14 deletions(-)
> 
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index 8b88dd26a5..c3ae9751ab 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -702,6 +702,7 @@ static int __init allocate_module_file(EFI_FILE_HANDLE dir_handle,
>  * This function checks for the presence of the xen,uefi-binary property in the
>  * module, if found it loads the binary as module and sets the right address
>  * for the reg property into the module DT node.
> + * Returns 1 if module is multiboot,module, 0 if not, < 0 on error
>  */
> static int __init handle_module_node(EFI_FILE_HANDLE dir_handle,
>                                      int module_node_offset,
> @@ -730,8 +731,8 @@ static int __init handle_module_node(EFI_FILE_HANDLE dir_handle,
>                                  &uefi_name_len);
> 
>     if ( !uefi_name_prop )
> -        /* Property not found */
> -        return 0;
> +        /* Property not found, but signal this is a multiboot,module */
> +        return 1;
> 
>     file_idx = get_module_file_index(uefi_name_prop, uefi_name_len);
>     if ( file_idx < 0 )
> @@ -795,19 +796,20 @@ static int __init handle_module_node(EFI_FILE_HANDLE dir_handle,
>         }
>     }
> 
> -    return 0;
> +    return 1;
> }
> 
> /*
>  * This function checks for boot modules under the domU guest domain node
>  * in the DT.
> - * Returns 0 on success, negative number on error.
> + * Returns number of multiboot,module found or negative number on error.
>  */
> static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>                                               int domain_node)
> {
>     int module_node, addr_cells, size_cells, len;
>     const struct fdt_property *prop;
> +    unsigned int mb_modules_found = 0;
> 
>     /* Get #address-cells and #size-cells from domain node */
>     prop = fdt_get_property(fdt, domain_node, "#address-cells", &len);
> @@ -837,20 +839,22 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>                                      size_cells, true);
>         if ( ret < 0 )
>             return ret;
> +
> +        mb_modules_found += ret;
>     }
> 
> -    return 0;
> +    return mb_modules_found;
> }
> 
> /*
>  * This function checks for xen domain nodes under the /chosen node for possible
>  * dom0 and domU guests to be loaded.
> - * Returns the number of modules loaded or a negative number for error.
> + * Returns the number of multiboot modules found or a negative number for error.
>  */
> static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> {
>     int chosen, node, addr_len, size_len;
> -    unsigned int i = 0;
> +    unsigned int i = 0, modules_found = 0;
> 
>     /* Check for the chosen node in the current DTB */
>     chosen = setup_chosen_node(fdt, &addr_len, &size_len);
> @@ -865,15 +869,23 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>           node > 0;
>           node = fdt_next_subnode(fdt, node) )
>     {
> +        int ret;
> +
>         if ( !fdt_node_check_compatible(fdt, node, "xen,domain") )
>         {
>             /* Found a node with compatible xen,domain; handle this node. */
> -            if ( handle_dom0less_domain_node(dir_handle, node) < 0 )
> +            ret = handle_dom0less_domain_node(dir_handle, node);
> +            if ( ret < 0 )
>                 return ERROR_DT_MODULE_DOMU;
>         }
> -        else if ( handle_module_node(dir_handle, node, addr_len, size_len,
> -                                     false) < 0 )
> +        else
> +        {
> +            ret = handle_module_node(dir_handle, node, addr_len, size_len,
> +                                     false);
> +            if ( ret < 0 )
>                  return ERROR_DT_MODULE_DOM0;
> +        }
> +        modules_found += ret;
>     }
> 
>     /* Free boot modules file names if any */
> @@ -883,7 +895,7 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>         efi_bs->FreePool(modules[i].name);
>     }
> 
> -    return modules_idx;
> +    return modules_found;
> }
> 
> static void __init efi_arch_cpu(void)
> diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
> index 392ff3ac9b..112b7e7571 100644
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>     CHAR16 *pathend, *ptr;
>     EFI_STATUS ret;
> 
> +    /*
> +     * Grub2 running on top of EDK2 has been observed to supply a NULL
> +     * DeviceHandle. We can't use that to gain access to the filesystem.
> +     * However the system can still boot if it doesn’t require access to the
> +     * filesystem.
> +     */
> +    if ( !loaded_image->DeviceHandle )
> +        return NULL;
> +
>     do {
>         EFI_FILE_IO_INTERFACE *fio;
> 
> @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
>     EFI_STATUS ret;
>     const CHAR16 *what = NULL;
> 
> +    if ( !dir_handle )
> +        blexit(L"Error: No access to the filesystem");
>     if ( !name )
>         PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
>     ret = dir_handle->Open(dir_handle, &FileHandle, name,
> @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>             EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
>                                                        &file_name);
> 
> -            handle->Close(handle);
> -            *argv = file_name;
> +            if ( !handle )
> +            {
> +                PrintErr(L"Error retrieving image name: no filesystem access."
> +                         L" Setting default to xen.efi");
> +                PrintErr(newline);
> +                *argv = L"xen.efi";
> +            }
> +            else
> +            {
> +                handle->Close(handle);
> +                *argv = file_name;
> +            }
>         }
> 
>         name.s = get_value(&cfg, section.s, "options");
> @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>     /* Get the number of boot modules specified on the DT or an error (<0) */
>     dt_modules_found = efi_check_dt_boot(dir_handle);
> 
> -    dir_handle->Close(dir_handle);
> +    if ( dir_handle )
> +        dir_handle->Close(dir_handle);
> 
>     if ( dt_modules_found < 0 )
>         /* efi_check_dt_boot throws some error */
> -- 
> 2.17.1
>
Ian Jackson Nov. 4, 2021, 2:44 p.m. UTC | #2
Bertrand Marquis writes ("Re: [PATCH-4.16 v2] xen/efi: Fix Grub2 boot on arm64"):
> > On 4 Nov 2021, at 14:12, Luca Fancellu <Luca.Fancellu@arm.com> wrote:
> > Fixes: a1743fc3a9 ("arm/efi: Use dom0less configuration when using EFI boot")
> > Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>
> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
> 
> Thanks for the detailed explanation and testing all those configurations.

Indeed, thank you for the careful thought and documentation of your
conclusions.

Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich Nov. 4, 2021, 4:36 p.m. UTC | #3
On 04.11.2021 15:12, Luca Fancellu wrote:
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>      CHAR16 *pathend, *ptr;
>      EFI_STATUS ret;
>  
> +    /*
> +     * Grub2 running on top of EDK2 has been observed to supply a NULL
> +     * DeviceHandle. We can't use that to gain access to the filesystem.
> +     * However the system can still boot if it doesn’t require access to the
> +     * filesystem.
> +     */
> +    if ( !loaded_image->DeviceHandle )
> +        return NULL;
> +
>      do {
>          EFI_FILE_IO_INTERFACE *fio;
>  
> @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
>      EFI_STATUS ret;
>      const CHAR16 *what = NULL;
>  
> +    if ( !dir_handle )
> +        blexit(L"Error: No access to the filesystem");
>      if ( !name )
>          PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
>      ret = dir_handle->Open(dir_handle, &FileHandle, name,
> @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>              EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
>                                                         &file_name);
>  
> -            handle->Close(handle);
> -            *argv = file_name;
> +            if ( !handle )
> +            {
> +                PrintErr(L"Error retrieving image name: no filesystem access."
> +                         L" Setting default to xen.efi");
> +                PrintErr(newline);
> +                *argv = L"xen.efi";
> +            }
> +            else
> +            {
> +                handle->Close(handle);
> +                *argv = file_name;
> +            }
>          }
>  
>          name.s = get_value(&cfg, section.s, "options");
> @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>      /* Get the number of boot modules specified on the DT or an error (<0) */
>      dt_modules_found = efi_check_dt_boot(dir_handle);
>  
> -    dir_handle->Close(dir_handle);
> +    if ( dir_handle )
> +        dir_handle->Close(dir_handle);
>  
>      if ( dt_modules_found < 0 )
>          /* efi_check_dt_boot throws some error */
> 

I'm sorry, but I think we need to take a step back here and revisit
the earlier change. If that hadn't moved obtaining dir_handle out by
one level of scope, nothing bad would have happened to the case that
you're now trying to fix, I understand? So perhaps that part wants
undoing, with efi_check_dt_boot() instead getting passed loaded_image.
That way, down the call tree the needed handle can be obtained via
another call to get_parent_handle(), and quite likely in the scenario
you're trying to fix here execution wouldn't even make it there. This
then wouldn't be much different to the image name retrieval calling
get_parent_handle() a 2nd time, rather than trying to re-use
dir_handle.

Net effect being that I think get_parent_handle() would then again
only be called when the returned handle is actually needed, and hence
when failure of HandleProtocol() (for DeviceHandle being NULL just
like for any other reason) is indeed an error that needs reporting.

Jan
Stefano Stabellini Nov. 4, 2021, 8:51 p.m. UTC | #4
On Thu, 4 Nov 2021, Luca Fancellu wrote:
> The code introduced by commit a1743fc3a9fe9b68c265c45264dddf214fd9b882
> ("arm/efi: Use dom0less configuration when using EFI boot") is
> introducing a problem to boot Xen using Grub2 on ARM machine using EDK2.
> 
> Despite UEFI specification, EDK2+Grub2 is returning a NULL DeviceHandle
> inside the interface given by the LOADED_IMAGE_PROTOCOL service, this
> handle is used later by efi_bs->HandleProtocol(...) inside
> get_parent_handle(...) when requesting the SIMPLE_FILE_SYSTEM_PROTOCOL
> interface, causing Xen to stop the boot because of an EFI_INVALID_PARAMETER
> error.
> 
> Before the commit above, the function was never called because the
> logic was skipping the call when there were multiboot modules in the
> DT because the filesystem was never used and the bootloader had
> put in place all the right modules in memory and the addresses
> in the DT.
> 
> To fix the problem we allow the get_parent_handle(...) function to
> return a NULL handle on error and we check the usage of the function
> to handle the new use case. The function in fact should not prevent
> the boot even if the filesystem can't be used, because the DT and
> the modules could be put in place by the bootloader before running
> Xen and if xen,uefi-binary property is not used, there is no need
> for the filesystem.
> 
> Another problem is found when the UEFI stub tries to check if Dom0
> image or DomUs are present.
> The logic doesn't work when the UEFI stub is not responsible to load
> any modules, so the efi_check_dt_boot(...) return value is modified
> to return the number of multiboot module found and not only the number
> of module loaded by the stub.
> Taking the occasion to update the comment in handle_module_node(...)
> to explain why we return success even if xen,uefi-binary is not found.
> 
> Fixes: a1743fc3a9 ("arm/efi: Use dom0less configuration when using EFI boot")
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>

For the ARM part (xen/arch/arm/efi/efi-boot.h):

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


To be honest the changes to xen/common/efi/boot.c look OK too, but I'll
reply to Jan's email about his suggestion


> ---
> Justification for integration in 4.16:
> Upside: allow booting xen from grub on arm64 when the stub doesn't load
>         any module.
> Downside: It's affecting the EFI boot path.
> Risk: It's not affecting x86 arch that works the same way as before.
>       If something is wrong it creates a problem on early boot and not at
>       runtime, so risk is low.
> 
> Tested in this configurations:
>  - Bootloader loads modules and specify them as multiboot modules in DT:
>    * combination of Dom0, DomUs, Dom0 and DomUs
>  - DT specifies multiboot modules in DT using xen,uefi-binary property:
>    * combination of Dom0, DomUs, Dom0 and DomUs
>  - Bootloader loads a Dom0 module and appends it as multiboot module in DT,
>    other multiboot modules are listed for DomUs using xen,uefi-binary
>  - No multiboot modules in DT and no kernel entry in cfg file:
>    * proper error thrown
> 
> Changes in v2:
>  - Changed comment on DeviceHandle NULL (Jan)
>  - Removed fatal condition on handle NULL (Jan)
>  - Add more info about the EDK2+Grub2 issue to the commit msg (Jan)
>  - Removed modules_found from function signature and pass everything
>    on return (Stefano)
>  - Improved comment in handle_module_node
> 
> ---
>  xen/arch/arm/efi/efi-boot.h | 34 +++++++++++++++++++++++-----------
>  xen/common/efi/boot.c       | 28 +++++++++++++++++++++++++---
>  2 files changed, 48 insertions(+), 14 deletions(-)
> 
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index 8b88dd26a5..c3ae9751ab 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -702,6 +702,7 @@ static int __init allocate_module_file(EFI_FILE_HANDLE dir_handle,
>   * This function checks for the presence of the xen,uefi-binary property in the
>   * module, if found it loads the binary as module and sets the right address
>   * for the reg property into the module DT node.
> + * Returns 1 if module is multiboot,module, 0 if not, < 0 on error
>   */
>  static int __init handle_module_node(EFI_FILE_HANDLE dir_handle,
>                                       int module_node_offset,
> @@ -730,8 +731,8 @@ static int __init handle_module_node(EFI_FILE_HANDLE dir_handle,
>                                   &uefi_name_len);
>  
>      if ( !uefi_name_prop )
> -        /* Property not found */
> -        return 0;
> +        /* Property not found, but signal this is a multiboot,module */
> +        return 1;
>  
>      file_idx = get_module_file_index(uefi_name_prop, uefi_name_len);
>      if ( file_idx < 0 )
> @@ -795,19 +796,20 @@ static int __init handle_module_node(EFI_FILE_HANDLE dir_handle,
>          }
>      }
>  
> -    return 0;
> +    return 1;
>  }
>  
>  /*
>   * This function checks for boot modules under the domU guest domain node
>   * in the DT.
> - * Returns 0 on success, negative number on error.
> + * Returns number of multiboot,module found or negative number on error.
>   */
>  static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>                                                int domain_node)
>  {
>      int module_node, addr_cells, size_cells, len;
>      const struct fdt_property *prop;
> +    unsigned int mb_modules_found = 0;
>  
>      /* Get #address-cells and #size-cells from domain node */
>      prop = fdt_get_property(fdt, domain_node, "#address-cells", &len);
> @@ -837,20 +839,22 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>                                       size_cells, true);
>          if ( ret < 0 )
>              return ret;
> +
> +        mb_modules_found += ret;
>      }
>  
> -    return 0;
> +    return mb_modules_found;
>  }
>  
>  /*
>   * This function checks for xen domain nodes under the /chosen node for possible
>   * dom0 and domU guests to be loaded.
> - * Returns the number of modules loaded or a negative number for error.
> + * Returns the number of multiboot modules found or a negative number for error.
>   */
>  static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>  {
>      int chosen, node, addr_len, size_len;
> -    unsigned int i = 0;
> +    unsigned int i = 0, modules_found = 0;
>  
>      /* Check for the chosen node in the current DTB */
>      chosen = setup_chosen_node(fdt, &addr_len, &size_len);
> @@ -865,15 +869,23 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>            node > 0;
>            node = fdt_next_subnode(fdt, node) )
>      {
> +        int ret;
> +
>          if ( !fdt_node_check_compatible(fdt, node, "xen,domain") )
>          {
>              /* Found a node with compatible xen,domain; handle this node. */
> -            if ( handle_dom0less_domain_node(dir_handle, node) < 0 )
> +            ret = handle_dom0less_domain_node(dir_handle, node);
> +            if ( ret < 0 )
>                  return ERROR_DT_MODULE_DOMU;
>          }
> -        else if ( handle_module_node(dir_handle, node, addr_len, size_len,
> -                                     false) < 0 )
> +        else
> +        {
> +            ret = handle_module_node(dir_handle, node, addr_len, size_len,
> +                                     false);
> +            if ( ret < 0 )
>                   return ERROR_DT_MODULE_DOM0;
> +        }
> +        modules_found += ret;
>      }
>  
>      /* Free boot modules file names if any */
> @@ -883,7 +895,7 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>          efi_bs->FreePool(modules[i].name);
>      }
>  
> -    return modules_idx;
> +    return modules_found;
>  }
>  
>  static void __init efi_arch_cpu(void)
> diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
> index 392ff3ac9b..112b7e7571 100644
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>      CHAR16 *pathend, *ptr;
>      EFI_STATUS ret;
>  
> +    /*
> +     * Grub2 running on top of EDK2 has been observed to supply a NULL
> +     * DeviceHandle. We can't use that to gain access to the filesystem.
> +     * However the system can still boot if it doesn’t require access to the
> +     * filesystem.
> +     */
> +    if ( !loaded_image->DeviceHandle )
> +        return NULL;
> +
>      do {
>          EFI_FILE_IO_INTERFACE *fio;
>  
> @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
>      EFI_STATUS ret;
>      const CHAR16 *what = NULL;
>  
> +    if ( !dir_handle )
> +        blexit(L"Error: No access to the filesystem");
>      if ( !name )
>          PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
>      ret = dir_handle->Open(dir_handle, &FileHandle, name,
> @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>              EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
>                                                         &file_name);
>  
> -            handle->Close(handle);
> -            *argv = file_name;
> +            if ( !handle )
> +            {
> +                PrintErr(L"Error retrieving image name: no filesystem access."
> +                         L" Setting default to xen.efi");
> +                PrintErr(newline);
> +                *argv = L"xen.efi";
> +            }
> +            else
> +            {
> +                handle->Close(handle);
> +                *argv = file_name;
> +            }
>          }
>  
>          name.s = get_value(&cfg, section.s, "options");
> @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>      /* Get the number of boot modules specified on the DT or an error (<0) */
>      dt_modules_found = efi_check_dt_boot(dir_handle);
>  
> -    dir_handle->Close(dir_handle);
> +    if ( dir_handle )
> +        dir_handle->Close(dir_handle);
>  
>      if ( dt_modules_found < 0 )
>          /* efi_check_dt_boot throws some error */
> -- 
> 2.17.1
>
Stefano Stabellini Nov. 4, 2021, 8:56 p.m. UTC | #5
On Thu, 4 Nov 2021, Jan Beulich wrote:
> On 04.11.2021 15:12, Luca Fancellu wrote:
> > --- a/xen/common/efi/boot.c
> > +++ b/xen/common/efi/boot.c
> > @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> >      CHAR16 *pathend, *ptr;
> >      EFI_STATUS ret;
> >  
> > +    /*
> > +     * Grub2 running on top of EDK2 has been observed to supply a NULL
> > +     * DeviceHandle. We can't use that to gain access to the filesystem.
> > +     * However the system can still boot if it doesn’t require access to the
> > +     * filesystem.
> > +     */
> > +    if ( !loaded_image->DeviceHandle )
> > +        return NULL;
> > +
> >      do {
> >          EFI_FILE_IO_INTERFACE *fio;
> >  
> > @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
> >      EFI_STATUS ret;
> >      const CHAR16 *what = NULL;
> >  
> > +    if ( !dir_handle )
> > +        blexit(L"Error: No access to the filesystem");
> >      if ( !name )
> >          PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
> >      ret = dir_handle->Open(dir_handle, &FileHandle, name,
> > @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >              EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
> >                                                         &file_name);
> >  
> > -            handle->Close(handle);
> > -            *argv = file_name;
> > +            if ( !handle )
> > +            {
> > +                PrintErr(L"Error retrieving image name: no filesystem access."
> > +                         L" Setting default to xen.efi");
> > +                PrintErr(newline);
> > +                *argv = L"xen.efi";
> > +            }
> > +            else
> > +            {
> > +                handle->Close(handle);
> > +                *argv = file_name;
> > +            }
> >          }
> >  
> >          name.s = get_value(&cfg, section.s, "options");
> > @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >      /* Get the number of boot modules specified on the DT or an error (<0) */
> >      dt_modules_found = efi_check_dt_boot(dir_handle);
> >  
> > -    dir_handle->Close(dir_handle);
> > +    if ( dir_handle )
> > +        dir_handle->Close(dir_handle);
> >  
> >      if ( dt_modules_found < 0 )
> >          /* efi_check_dt_boot throws some error */
> > 
> 
> I'm sorry, but I think we need to take a step back here and revisit
> the earlier change. If that hadn't moved obtaining dir_handle out by
> one level of scope, nothing bad would have happened to the case that
> you're now trying to fix, I understand? So perhaps that part wants
> undoing, with efi_check_dt_boot() instead getting passed loaded_image.
> That way, down the call tree the needed handle can be obtained via
> another call to get_parent_handle(), and quite likely in the scenario
> you're trying to fix here execution wouldn't even make it there. This
> then wouldn't be much different to the image name retrieval calling
> get_parent_handle() a 2nd time, rather than trying to re-use
> dir_handle.
> 
> Net effect being that I think get_parent_handle() would then again
> only be called when the returned handle is actually needed, and hence
> when failure of HandleProtocol() (for DeviceHandle being NULL just
> like for any other reason) is indeed an error that needs reporting.

In my opinion the current version is good enough. Regardless, I looked
at your suggestion into details. As it took me some time to understand
it, I thought I would share the code changes that I think correspond to
what you wrote. Does everything check out?

If so, I think it looks fine, maybe a bit better than the current
version. I'll leave that to you and Luca.


diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index c3ae9751ab..9dcd8547cd 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -8,6 +8,8 @@
 #include <asm/setup.h>
 #include <asm/smp.h>
 
+extern EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
+                                                CHAR16 **leaf);
 typedef struct {
     char *name;
     unsigned int name_len;
@@ -54,7 +56,7 @@ static int handle_module_node(EFI_FILE_HANDLE dir_handle,
                               bool is_domu_module);
 static int handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
                                        int domain_node);
-static int efi_check_dt_boot(EFI_FILE_HANDLE dir_handle);
+static int efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image);
 
 #define DEVICE_TREE_GUID \
 {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
@@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
  * dom0 and domU guests to be loaded.
  * Returns the number of multiboot modules found or a negative number for error.
  */
-static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
+static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
 {
     int chosen, node, addr_len, size_len;
     unsigned int i = 0, modules_found = 0;
+    EFI_FILE_HANDLE dir_handle;
+    CHAR16 *file_name;
+
+    dir_handle = get_parent_handle(loaded_image, &file_name);
 
     /* Check for the chosen node in the current DTB */
     chosen = setup_chosen_node(fdt, &addr_len, &size_len);
@@ -895,6 +901,8 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
         efi_bs->FreePool(modules[i].name);
     }
 
+    dir_handle->Close(dir_handle);
+
     return modules_found;
 }
 
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 112b7e7571..2407671a7d 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -167,7 +167,7 @@ static void __init PrintErr(const CHAR16 *s)
 }
 
 #ifndef CONFIG_HAS_DEVICE_TREE
-static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
+static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
 {
     return 0;
 }
@@ -439,8 +439,8 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
     return argc;
 }
 
-static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
-                                                CHAR16 **leaf)
+EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
+                                         CHAR16 **leaf)
 {
     static EFI_GUID __initdata fs_protocol = SIMPLE_FILE_SYSTEM_PROTOCOL;
     static CHAR16 __initdata buffer[512];
@@ -1236,9 +1236,6 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 
     efi_arch_relocate_image(0);
 
-    /* Get the file system interface. */
-    dir_handle = get_parent_handle(loaded_image, &file_name);
-
     if ( use_cfg_file )
     {
         UINTN depth, cols, rows, size;
@@ -1251,6 +1248,9 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 
         gop = efi_get_gop();
 
+        /* Get the file system interface. */
+        dir_handle = get_parent_handle(loaded_image, &file_name);
+
         /* Read and parse the config file. */
         if ( read_section(loaded_image, L"config", &cfg, NULL) )
             PrintStr(L"Using builtin config file\r\n");
@@ -1344,18 +1344,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
             EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
                                                        &file_name);
 
-            if ( !handle )
-            {
-                PrintErr(L"Error retrieving image name: no filesystem access."
-                         L" Setting default to xen.efi");
-                PrintErr(newline);
-                *argv = L"xen.efi";
-            }
-            else
-            {
-                handle->Close(handle);
-                *argv = file_name;
-            }
+            handle->Close(handle);
+            *argv = file_name;
         }
 
         name.s = get_value(&cfg, section.s, "options");
@@ -1383,15 +1373,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
         efi_bs->FreePages(cfg.addr, PFN_UP(cfg.size));
         cfg.addr = 0;
 
+        dir_handle->Close(dir_handle);
+
         if ( gop && !base_video )
             gop_mode = efi_find_gop_mode(gop, cols, rows, depth);
     }
 
     /* Get the number of boot modules specified on the DT or an error (<0) */
-    dt_modules_found = efi_check_dt_boot(dir_handle);
-
-    if ( dir_handle )
-        dir_handle->Close(dir_handle);
+    dt_modules_found = efi_check_dt_boot(loaded_image);
 
     if ( dt_modules_found < 0 )
         /* efi_check_dt_boot throws some error */
Luca Fancellu Nov. 4, 2021, 9:07 p.m. UTC | #6
> On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Thu, 4 Nov 2021, Jan Beulich wrote:
>> On 04.11.2021 15:12, Luca Fancellu wrote:
>>> --- a/xen/common/efi/boot.c
>>> +++ b/xen/common/efi/boot.c
>>> @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>>>     CHAR16 *pathend, *ptr;
>>>     EFI_STATUS ret;
>>> 
>>> +    /*
>>> +     * Grub2 running on top of EDK2 has been observed to supply a NULL
>>> +     * DeviceHandle. We can't use that to gain access to the filesystem.
>>> +     * However the system can still boot if it doesn’t require access to the
>>> +     * filesystem.
>>> +     */
>>> +    if ( !loaded_image->DeviceHandle )
>>> +        return NULL;
>>> +
>>>     do {
>>>         EFI_FILE_IO_INTERFACE *fio;
>>> 
>>> @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
>>>     EFI_STATUS ret;
>>>     const CHAR16 *what = NULL;
>>> 
>>> +    if ( !dir_handle )
>>> +        blexit(L"Error: No access to the filesystem");
>>>     if ( !name )
>>>         PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
>>>     ret = dir_handle->Open(dir_handle, &FileHandle, name,
>>> @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>             EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
>>>                                                        &file_name);
>>> 
>>> -            handle->Close(handle);
>>> -            *argv = file_name;
>>> +            if ( !handle )
>>> +            {
>>> +                PrintErr(L"Error retrieving image name: no filesystem access."
>>> +                         L" Setting default to xen.efi");
>>> +                PrintErr(newline);
>>> +                *argv = L"xen.efi";
>>> +            }
>>> +            else
>>> +            {
>>> +                handle->Close(handle);
>>> +                *argv = file_name;
>>> +            }
>>>         }
>>> 
>>>         name.s = get_value(&cfg, section.s, "options");
>>> @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>     /* Get the number of boot modules specified on the DT or an error (<0) */
>>>     dt_modules_found = efi_check_dt_boot(dir_handle);
>>> 
>>> -    dir_handle->Close(dir_handle);
>>> +    if ( dir_handle )
>>> +        dir_handle->Close(dir_handle);
>>> 
>>>     if ( dt_modules_found < 0 )
>>>         /* efi_check_dt_boot throws some error */
>>> 
>> 
>> I'm sorry, but I think we need to take a step back here and revisit
>> the earlier change. If that hadn't moved obtaining dir_handle out by
>> one level of scope, nothing bad would have happened to the case that
>> you're now trying to fix, I understand? So perhaps that part wants
>> undoing, with efi_check_dt_boot() instead getting passed loaded_image.
>> That way, down the call tree the needed handle can be obtained via
>> another call to get_parent_handle(), and quite likely in the scenario
>> you're trying to fix here execution wouldn't even make it there. This
>> then wouldn't be much different to the image name retrieval calling
>> get_parent_handle() a 2nd time, rather than trying to re-use
>> dir_handle.
>> 
>> Net effect being that I think get_parent_handle() would then again
>> only be called when the returned handle is actually needed, and hence
>> when failure of HandleProtocol() (for DeviceHandle being NULL just
>> like for any other reason) is indeed an error that needs reporting.
> 
> In my opinion the current version is good enough. Regardless, I looked
> at your suggestion into details. As it took me some time to understand
> it, I thought I would share the code changes that I think correspond to
> what you wrote. Does everything check out?
> 
> If so, I think it looks fine, maybe a bit better than the current
> version. I'll leave that to you and Luca.
> 
> 
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index c3ae9751ab..9dcd8547cd 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -8,6 +8,8 @@
> #include <asm/setup.h>
> #include <asm/smp.h>
> 
> +extern EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> +                                                CHAR16 **leaf);
> typedef struct {
>     char *name;
>     unsigned int name_len;
> @@ -54,7 +56,7 @@ static int handle_module_node(EFI_FILE_HANDLE dir_handle,
>                               bool is_domu_module);
> static int handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>                                        int domain_node);
> -static int efi_check_dt_boot(EFI_FILE_HANDLE dir_handle);
> +static int efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image);
> 
> #define DEVICE_TREE_GUID \
> {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>  * dom0 and domU guests to be loaded.
>  * Returns the number of multiboot modules found or a negative number for error.
>  */
> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
> {
>     int chosen, node, addr_len, size_len;
>     unsigned int i = 0, modules_found = 0;
> +    EFI_FILE_HANDLE dir_handle;
> +    CHAR16 *file_name;
> +
> +    dir_handle = get_parent_handle(loaded_image, &file_name);

We can’t use get_parent_handle here because we will end up with the same problem,
we would need to use the filesystem if and only if we need to use it, so the way I see
is to pass loaded_image down to the stack until allocate_module_file(…), in this
function we can use get_parent_handle(…) because the user wants us to do that.
The downside is that we must close the handle there, so for each loaded file we will
request and close the handle. Is this something we don’t bother too much?

> 
>     /* Check for the chosen node in the current DTB */
>     chosen = setup_chosen_node(fdt, &addr_len, &size_len);
> @@ -895,6 +901,8 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>         efi_bs->FreePool(modules[i].name);
>     }
> 
> +    dir_handle->Close(dir_handle);
> +
>     return modules_found;
> }
> 
> diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
> index 112b7e7571..2407671a7d 100644
> --- a/xen/common/efi/boot.c
> +++ b/xen/common/efi/boot.c
> @@ -167,7 +167,7 @@ static void __init PrintErr(const CHAR16 *s)
> }
> 
> #ifndef CONFIG_HAS_DEVICE_TREE
> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
> {
>     return 0;
> }
> @@ -439,8 +439,8 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
>     return argc;
> }
> 
> -static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> -                                                CHAR16 **leaf)
> +EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> +                                         CHAR16 **leaf)
> {
>     static EFI_GUID __initdata fs_protocol = SIMPLE_FILE_SYSTEM_PROTOCOL;
>     static CHAR16 __initdata buffer[512];
> @@ -1236,9 +1236,6 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> 
>     efi_arch_relocate_image(0);
> 
> -    /* Get the file system interface. */
> -    dir_handle = get_parent_handle(loaded_image, &file_name);
> -
>     if ( use_cfg_file )
>     {
>         UINTN depth, cols, rows, size;
> @@ -1251,6 +1248,9 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> 
>         gop = efi_get_gop();
> 
> +        /* Get the file system interface. */
> +        dir_handle = get_parent_handle(loaded_image, &file_name);
> +
>         /* Read and parse the config file. */
>         if ( read_section(loaded_image, L"config", &cfg, NULL) )
>             PrintStr(L"Using builtin config file\r\n");
> @@ -1344,18 +1344,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>             EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
>                                                        &file_name);
> 
> -            if ( !handle )
> -            {
> -                PrintErr(L"Error retrieving image name: no filesystem access."
> -                         L" Setting default to xen.efi");
> -                PrintErr(newline);
> -                *argv = L"xen.efi";
> -            }
> -            else
> -            {
> -                handle->Close(handle);
> -                *argv = file_name;
> -            }
> +            handle->Close(handle);
> +            *argv = file_name;
>         }
> 
>         name.s = get_value(&cfg, section.s, "options");
> @@ -1383,15 +1373,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>         efi_bs->FreePages(cfg.addr, PFN_UP(cfg.size));
>         cfg.addr = 0;
> 
> +        dir_handle->Close(dir_handle);
> +
>         if ( gop && !base_video )
>             gop_mode = efi_find_gop_mode(gop, cols, rows, depth);
>     }
> 
>     /* Get the number of boot modules specified on the DT or an error (<0) */
> -    dt_modules_found = efi_check_dt_boot(dir_handle);
> -
> -    if ( dir_handle )
> -        dir_handle->Close(dir_handle);
> +    dt_modules_found = efi_check_dt_boot(loaded_image);
> 
>     if ( dt_modules_found < 0 )
>         /* efi_check_dt_boot throws some error */
Stefano Stabellini Nov. 4, 2021, 9:35 p.m. UTC | #7
On Thu, 4 Nov 2021, Luca Fancellu wrote:
> > On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
> > 
> > On Thu, 4 Nov 2021, Jan Beulich wrote:
> >> On 04.11.2021 15:12, Luca Fancellu wrote:
> >>> --- a/xen/common/efi/boot.c
> >>> +++ b/xen/common/efi/boot.c
> >>> @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> >>>     CHAR16 *pathend, *ptr;
> >>>     EFI_STATUS ret;
> >>> 
> >>> +    /*
> >>> +     * Grub2 running on top of EDK2 has been observed to supply a NULL
> >>> +     * DeviceHandle. We can't use that to gain access to the filesystem.
> >>> +     * However the system can still boot if it doesn’t require access to the
> >>> +     * filesystem.
> >>> +     */
> >>> +    if ( !loaded_image->DeviceHandle )
> >>> +        return NULL;
> >>> +
> >>>     do {
> >>>         EFI_FILE_IO_INTERFACE *fio;
> >>> 
> >>> @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
> >>>     EFI_STATUS ret;
> >>>     const CHAR16 *what = NULL;
> >>> 
> >>> +    if ( !dir_handle )
> >>> +        blexit(L"Error: No access to the filesystem");
> >>>     if ( !name )
> >>>         PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
> >>>     ret = dir_handle->Open(dir_handle, &FileHandle, name,
> >>> @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >>>             EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
> >>>                                                        &file_name);
> >>> 
> >>> -            handle->Close(handle);
> >>> -            *argv = file_name;
> >>> +            if ( !handle )
> >>> +            {
> >>> +                PrintErr(L"Error retrieving image name: no filesystem access."
> >>> +                         L" Setting default to xen.efi");
> >>> +                PrintErr(newline);
> >>> +                *argv = L"xen.efi";
> >>> +            }
> >>> +            else
> >>> +            {
> >>> +                handle->Close(handle);
> >>> +                *argv = file_name;
> >>> +            }
> >>>         }
> >>> 
> >>>         name.s = get_value(&cfg, section.s, "options");
> >>> @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >>>     /* Get the number of boot modules specified on the DT or an error (<0) */
> >>>     dt_modules_found = efi_check_dt_boot(dir_handle);
> >>> 
> >>> -    dir_handle->Close(dir_handle);
> >>> +    if ( dir_handle )
> >>> +        dir_handle->Close(dir_handle);
> >>> 
> >>>     if ( dt_modules_found < 0 )
> >>>         /* efi_check_dt_boot throws some error */
> >>> 
> >> 
> >> I'm sorry, but I think we need to take a step back here and revisit
> >> the earlier change. If that hadn't moved obtaining dir_handle out by
> >> one level of scope, nothing bad would have happened to the case that
> >> you're now trying to fix, I understand? So perhaps that part wants
> >> undoing, with efi_check_dt_boot() instead getting passed loaded_image.
> >> That way, down the call tree the needed handle can be obtained via
> >> another call to get_parent_handle(), and quite likely in the scenario
> >> you're trying to fix here execution wouldn't even make it there. This
> >> then wouldn't be much different to the image name retrieval calling
> >> get_parent_handle() a 2nd time, rather than trying to re-use
> >> dir_handle.
> >> 
> >> Net effect being that I think get_parent_handle() would then again
> >> only be called when the returned handle is actually needed, and hence
> >> when failure of HandleProtocol() (for DeviceHandle being NULL just
> >> like for any other reason) is indeed an error that needs reporting.
> > 
> > In my opinion the current version is good enough. Regardless, I looked
> > at your suggestion into details. As it took me some time to understand
> > it, I thought I would share the code changes that I think correspond to
> > what you wrote. Does everything check out?
> > 
> > If so, I think it looks fine, maybe a bit better than the current
> > version. I'll leave that to you and Luca.
> > 
> > 
> > diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> > index c3ae9751ab..9dcd8547cd 100644
> > --- a/xen/arch/arm/efi/efi-boot.h
> > +++ b/xen/arch/arm/efi/efi-boot.h
> > @@ -8,6 +8,8 @@
> > #include <asm/setup.h>
> > #include <asm/smp.h>
> > 
> > +extern EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> > +                                                CHAR16 **leaf);
> > typedef struct {
> >     char *name;
> >     unsigned int name_len;
> > @@ -54,7 +56,7 @@ static int handle_module_node(EFI_FILE_HANDLE dir_handle,
> >                               bool is_domu_module);
> > static int handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
> >                                        int domain_node);
> > -static int efi_check_dt_boot(EFI_FILE_HANDLE dir_handle);
> > +static int efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image);
> > 
> > #define DEVICE_TREE_GUID \
> > {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
> > @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
> >  * dom0 and domU guests to be loaded.
> >  * Returns the number of multiboot modules found or a negative number for error.
> >  */
> > -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> > +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
> > {
> >     int chosen, node, addr_len, size_len;
> >     unsigned int i = 0, modules_found = 0;
> > +    EFI_FILE_HANDLE dir_handle;
> > +    CHAR16 *file_name;
> > +
> > +    dir_handle = get_parent_handle(loaded_image, &file_name);
> 
> We can’t use get_parent_handle here because we will end up with the same problem,
> we would need to use the filesystem if and only if we need to use it, 

Understood, but it would work the same way as this version of the patch:
if we end up calling read_file with dir_handle == NULL, then read_file
would do:

  blexit(L"Error: No access to the filesystem");

If we don't end up calling read_file, then everything works even if
dir_handle == NULL. Right?


> so the way I see
> is to pass loaded_image down to the stack until allocate_module_file(…), in this
> function we can use get_parent_handle(…) because the user wants us to do that.
> The downside is that we must close the handle there, so for each loaded file we will
> request and close the handle. Is this something we don’t bother too much?

Yeah, that doesn't seem ideal.


> > 
> >     /* Check for the chosen node in the current DTB */
> >     chosen = setup_chosen_node(fdt, &addr_len, &size_len);
> > @@ -895,6 +901,8 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> >         efi_bs->FreePool(modules[i].name);
> >     }
> > 
> > +    dir_handle->Close(dir_handle);
> > +
> >     return modules_found;
> > }
> > 
> > diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
> > index 112b7e7571..2407671a7d 100644
> > --- a/xen/common/efi/boot.c
> > +++ b/xen/common/efi/boot.c
> > @@ -167,7 +167,7 @@ static void __init PrintErr(const CHAR16 *s)
> > }
> > 
> > #ifndef CONFIG_HAS_DEVICE_TREE
> > -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> > +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
> > {
> >     return 0;
> > }
> > @@ -439,8 +439,8 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
> >     return argc;
> > }
> > 
> > -static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> > -                                                CHAR16 **leaf)
> > +EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> > +                                         CHAR16 **leaf)
> > {
> >     static EFI_GUID __initdata fs_protocol = SIMPLE_FILE_SYSTEM_PROTOCOL;
> >     static CHAR16 __initdata buffer[512];
> > @@ -1236,9 +1236,6 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> > 
> >     efi_arch_relocate_image(0);
> > 
> > -    /* Get the file system interface. */
> > -    dir_handle = get_parent_handle(loaded_image, &file_name);
> > -
> >     if ( use_cfg_file )
> >     {
> >         UINTN depth, cols, rows, size;
> > @@ -1251,6 +1248,9 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> > 
> >         gop = efi_get_gop();
> > 
> > +        /* Get the file system interface. */
> > +        dir_handle = get_parent_handle(loaded_image, &file_name);
> > +
> >         /* Read and parse the config file. */
> >         if ( read_section(loaded_image, L"config", &cfg, NULL) )
> >             PrintStr(L"Using builtin config file\r\n");
> > @@ -1344,18 +1344,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >             EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
> >                                                        &file_name);
> > 
> > -            if ( !handle )
> > -            {
> > -                PrintErr(L"Error retrieving image name: no filesystem access."
> > -                         L" Setting default to xen.efi");
> > -                PrintErr(newline);
> > -                *argv = L"xen.efi";
> > -            }
> > -            else
> > -            {
> > -                handle->Close(handle);
> > -                *argv = file_name;
> > -            }
> > +            handle->Close(handle);
> > +            *argv = file_name;
> >         }
> > 
> >         name.s = get_value(&cfg, section.s, "options");
> > @@ -1383,15 +1373,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >         efi_bs->FreePages(cfg.addr, PFN_UP(cfg.size));
> >         cfg.addr = 0;
> > 
> > +        dir_handle->Close(dir_handle);
> > +
> >         if ( gop && !base_video )
> >             gop_mode = efi_find_gop_mode(gop, cols, rows, depth);
> >     }
> > 
> >     /* Get the number of boot modules specified on the DT or an error (<0) */
> > -    dt_modules_found = efi_check_dt_boot(dir_handle);
> > -
> > -    if ( dir_handle )
> > -        dir_handle->Close(dir_handle);
> > +    dt_modules_found = efi_check_dt_boot(loaded_image);
> > 
> >     if ( dt_modules_found < 0 )
> >         /* efi_check_dt_boot throws some error */
>
Luca Fancellu Nov. 4, 2021, 9:43 p.m. UTC | #8
> On 4 Nov 2021, at 21:35, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Thu, 4 Nov 2021, Luca Fancellu wrote:
>>> On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>> 
>>> On Thu, 4 Nov 2021, Jan Beulich wrote:
>>>> On 04.11.2021 15:12, Luca Fancellu wrote:
>>>>> --- a/xen/common/efi/boot.c
>>>>> +++ b/xen/common/efi/boot.c
>>>>> @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>>>>>    CHAR16 *pathend, *ptr;
>>>>>    EFI_STATUS ret;
>>>>> 
>>>>> +    /*
>>>>> +     * Grub2 running on top of EDK2 has been observed to supply a NULL
>>>>> +     * DeviceHandle. We can't use that to gain access to the filesystem.
>>>>> +     * However the system can still boot if it doesn’t require access to the
>>>>> +     * filesystem.
>>>>> +     */
>>>>> +    if ( !loaded_image->DeviceHandle )
>>>>> +        return NULL;
>>>>> +
>>>>>    do {
>>>>>        EFI_FILE_IO_INTERFACE *fio;
>>>>> 
>>>>> @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
>>>>>    EFI_STATUS ret;
>>>>>    const CHAR16 *what = NULL;
>>>>> 
>>>>> +    if ( !dir_handle )
>>>>> +        blexit(L"Error: No access to the filesystem");
>>>>>    if ( !name )
>>>>>        PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
>>>>>    ret = dir_handle->Open(dir_handle, &FileHandle, name,
>>>>> @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>>>            EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
>>>>>                                                       &file_name);
>>>>> 
>>>>> -            handle->Close(handle);
>>>>> -            *argv = file_name;
>>>>> +            if ( !handle )
>>>>> +            {
>>>>> +                PrintErr(L"Error retrieving image name: no filesystem access."
>>>>> +                         L" Setting default to xen.efi");
>>>>> +                PrintErr(newline);
>>>>> +                *argv = L"xen.efi";
>>>>> +            }
>>>>> +            else
>>>>> +            {
>>>>> +                handle->Close(handle);
>>>>> +                *argv = file_name;
>>>>> +            }
>>>>>        }
>>>>> 
>>>>>        name.s = get_value(&cfg, section.s, "options");
>>>>> @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>>>    /* Get the number of boot modules specified on the DT or an error (<0) */
>>>>>    dt_modules_found = efi_check_dt_boot(dir_handle);
>>>>> 
>>>>> -    dir_handle->Close(dir_handle);
>>>>> +    if ( dir_handle )
>>>>> +        dir_handle->Close(dir_handle);
>>>>> 
>>>>>    if ( dt_modules_found < 0 )
>>>>>        /* efi_check_dt_boot throws some error */
>>>>> 
>>>> 
>>>> I'm sorry, but I think we need to take a step back here and revisit
>>>> the earlier change. If that hadn't moved obtaining dir_handle out by
>>>> one level of scope, nothing bad would have happened to the case that
>>>> you're now trying to fix, I understand? So perhaps that part wants
>>>> undoing, with efi_check_dt_boot() instead getting passed loaded_image.
>>>> That way, down the call tree the needed handle can be obtained via
>>>> another call to get_parent_handle(), and quite likely in the scenario
>>>> you're trying to fix here execution wouldn't even make it there. This
>>>> then wouldn't be much different to the image name retrieval calling
>>>> get_parent_handle() a 2nd time, rather than trying to re-use
>>>> dir_handle.
>>>> 
>>>> Net effect being that I think get_parent_handle() would then again
>>>> only be called when the returned handle is actually needed, and hence
>>>> when failure of HandleProtocol() (for DeviceHandle being NULL just
>>>> like for any other reason) is indeed an error that needs reporting.
>>> 
>>> In my opinion the current version is good enough. Regardless, I looked
>>> at your suggestion into details. As it took me some time to understand
>>> it, I thought I would share the code changes that I think correspond to
>>> what you wrote. Does everything check out?
>>> 
>>> If so, I think it looks fine, maybe a bit better than the current
>>> version. I'll leave that to you and Luca.
>>> 
>>> 
>>> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
>>> index c3ae9751ab..9dcd8547cd 100644
>>> --- a/xen/arch/arm/efi/efi-boot.h
>>> +++ b/xen/arch/arm/efi/efi-boot.h
>>> @@ -8,6 +8,8 @@
>>> #include <asm/setup.h>
>>> #include <asm/smp.h>
>>> 
>>> +extern EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>>> +                                                CHAR16 **leaf);
>>> typedef struct {
>>>    char *name;
>>>    unsigned int name_len;
>>> @@ -54,7 +56,7 @@ static int handle_module_node(EFI_FILE_HANDLE dir_handle,
>>>                              bool is_domu_module);
>>> static int handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>>>                                       int domain_node);
>>> -static int efi_check_dt_boot(EFI_FILE_HANDLE dir_handle);
>>> +static int efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image);
>>> 
>>> #define DEVICE_TREE_GUID \
>>> {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
>>> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>>> * dom0 and domU guests to be loaded.
>>> * Returns the number of multiboot modules found or a negative number for error.
>>> */
>>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>>> {
>>>    int chosen, node, addr_len, size_len;
>>>    unsigned int i = 0, modules_found = 0;
>>> +    EFI_FILE_HANDLE dir_handle;
>>> +    CHAR16 *file_name;
>>> +
>>> +    dir_handle = get_parent_handle(loaded_image, &file_name);
>> 
>> We can’t use get_parent_handle here because we will end up with the same problem,
>> we would need to use the filesystem if and only if we need to use it, 
> 
> Understood, but it would work the same way as this version of the patch:
> if we end up calling read_file with dir_handle == NULL, then read_file
> would do:
> 
>  blexit(L"Error: No access to the filesystem");
> 
> If we don't end up calling read_file, then everything works even if
> dir_handle == NULL. Right?

Oh yes sorry my bad Stefano! Having this version of the patch, it will work.

My understanding was instead that the Jan suggestion is to revert the place
of call of get_parent_handle (like in your code diff), but also to remove the
changes to get_parent_handle and read_file.
I guess Jan will specify his preference, but if he meant the last one, then
the only way I see...

> 
> 
>> so the way I see
>> is to pass loaded_image down to the stack until allocate_module_file(…), in this
>> function we can use get_parent_handle(…) because the user wants us to do that.
>> The downside is that we must close the handle there, so for each loaded file we will
>> request and close the handle. Is this something we don’t bother too much?
> 
> Yeah, that doesn't seem ideal.

… is this one.

> 
> 
>>> 
>>>    /* Check for the chosen node in the current DTB */
>>>    chosen = setup_chosen_node(fdt, &addr_len, &size_len);
>>> @@ -895,6 +901,8 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>>>        efi_bs->FreePool(modules[i].name);
>>>    }
>>> 
>>> +    dir_handle->Close(dir_handle);
>>> +
>>>    return modules_found;
>>> }
>>> 
>>> diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
>>> index 112b7e7571..2407671a7d 100644
>>> --- a/xen/common/efi/boot.c
>>> +++ b/xen/common/efi/boot.c
>>> @@ -167,7 +167,7 @@ static void __init PrintErr(const CHAR16 *s)
>>> }
>>> 
>>> #ifndef CONFIG_HAS_DEVICE_TREE
>>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>>> {
>>>    return 0;
>>> }
>>> @@ -439,8 +439,8 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
>>>    return argc;
>>> }
>>> 
>>> -static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>>> -                                                CHAR16 **leaf)
>>> +EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>>> +                                         CHAR16 **leaf)
>>> {
>>>    static EFI_GUID __initdata fs_protocol = SIMPLE_FILE_SYSTEM_PROTOCOL;
>>>    static CHAR16 __initdata buffer[512];
>>> @@ -1236,9 +1236,6 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>> 
>>>    efi_arch_relocate_image(0);
>>> 
>>> -    /* Get the file system interface. */
>>> -    dir_handle = get_parent_handle(loaded_image, &file_name);
>>> -
>>>    if ( use_cfg_file )
>>>    {
>>>        UINTN depth, cols, rows, size;
>>> @@ -1251,6 +1248,9 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>> 
>>>        gop = efi_get_gop();
>>> 
>>> +        /* Get the file system interface. */
>>> +        dir_handle = get_parent_handle(loaded_image, &file_name);
>>> +
>>>        /* Read and parse the config file. */
>>>        if ( read_section(loaded_image, L"config", &cfg, NULL) )
>>>            PrintStr(L"Using builtin config file\r\n");
>>> @@ -1344,18 +1344,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>            EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
>>>                                                       &file_name);
>>> 
>>> -            if ( !handle )
>>> -            {
>>> -                PrintErr(L"Error retrieving image name: no filesystem access."
>>> -                         L" Setting default to xen.efi");
>>> -                PrintErr(newline);
>>> -                *argv = L"xen.efi";
>>> -            }
>>> -            else
>>> -            {
>>> -                handle->Close(handle);
>>> -                *argv = file_name;
>>> -            }
>>> +            handle->Close(handle);
>>> +            *argv = file_name;
>>>        }
>>> 
>>>        name.s = get_value(&cfg, section.s, "options");
>>> @@ -1383,15 +1373,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>        efi_bs->FreePages(cfg.addr, PFN_UP(cfg.size));
>>>        cfg.addr = 0;
>>> 
>>> +        dir_handle->Close(dir_handle);
>>> +
>>>        if ( gop && !base_video )
>>>            gop_mode = efi_find_gop_mode(gop, cols, rows, depth);
>>>    }
>>> 
>>>    /* Get the number of boot modules specified on the DT or an error (<0) */
>>> -    dt_modules_found = efi_check_dt_boot(dir_handle);
>>> -
>>> -    if ( dir_handle )
>>> -        dir_handle->Close(dir_handle);
>>> +    dt_modules_found = efi_check_dt_boot(loaded_image);
>>> 
>>>    if ( dt_modules_found < 0 )
>>>        /* efi_check_dt_boot throws some error */
Stefano Stabellini Nov. 4, 2021, 9:50 p.m. UTC | #9
On Thu, 4 Nov 2021, Luca Fancellu wrote:
> > On 4 Nov 2021, at 21:35, Stefano Stabellini <sstabellini@kernel.org> wrote:
> > 
> > On Thu, 4 Nov 2021, Luca Fancellu wrote:
> >>> On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
> >>> 
> >>> On Thu, 4 Nov 2021, Jan Beulich wrote:
> >>>> On 04.11.2021 15:12, Luca Fancellu wrote:
> >>>>> --- a/xen/common/efi/boot.c
> >>>>> +++ b/xen/common/efi/boot.c
> >>>>> @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> >>>>>    CHAR16 *pathend, *ptr;
> >>>>>    EFI_STATUS ret;
> >>>>> 
> >>>>> +    /*
> >>>>> +     * Grub2 running on top of EDK2 has been observed to supply a NULL
> >>>>> +     * DeviceHandle. We can't use that to gain access to the filesystem.
> >>>>> +     * However the system can still boot if it doesn’t require access to the
> >>>>> +     * filesystem.
> >>>>> +     */
> >>>>> +    if ( !loaded_image->DeviceHandle )
> >>>>> +        return NULL;
> >>>>> +
> >>>>>    do {
> >>>>>        EFI_FILE_IO_INTERFACE *fio;
> >>>>> 
> >>>>> @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
> >>>>>    EFI_STATUS ret;
> >>>>>    const CHAR16 *what = NULL;
> >>>>> 
> >>>>> +    if ( !dir_handle )
> >>>>> +        blexit(L"Error: No access to the filesystem");
> >>>>>    if ( !name )
> >>>>>        PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
> >>>>>    ret = dir_handle->Open(dir_handle, &FileHandle, name,
> >>>>> @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >>>>>            EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
> >>>>>                                                       &file_name);
> >>>>> 
> >>>>> -            handle->Close(handle);
> >>>>> -            *argv = file_name;
> >>>>> +            if ( !handle )
> >>>>> +            {
> >>>>> +                PrintErr(L"Error retrieving image name: no filesystem access."
> >>>>> +                         L" Setting default to xen.efi");
> >>>>> +                PrintErr(newline);
> >>>>> +                *argv = L"xen.efi";
> >>>>> +            }
> >>>>> +            else
> >>>>> +            {
> >>>>> +                handle->Close(handle);
> >>>>> +                *argv = file_name;
> >>>>> +            }
> >>>>>        }
> >>>>> 
> >>>>>        name.s = get_value(&cfg, section.s, "options");
> >>>>> @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >>>>>    /* Get the number of boot modules specified on the DT or an error (<0) */
> >>>>>    dt_modules_found = efi_check_dt_boot(dir_handle);
> >>>>> 
> >>>>> -    dir_handle->Close(dir_handle);
> >>>>> +    if ( dir_handle )
> >>>>> +        dir_handle->Close(dir_handle);
> >>>>> 
> >>>>>    if ( dt_modules_found < 0 )
> >>>>>        /* efi_check_dt_boot throws some error */
> >>>>> 
> >>>> 
> >>>> I'm sorry, but I think we need to take a step back here and revisit
> >>>> the earlier change. If that hadn't moved obtaining dir_handle out by
> >>>> one level of scope, nothing bad would have happened to the case that
> >>>> you're now trying to fix, I understand? So perhaps that part wants
> >>>> undoing, with efi_check_dt_boot() instead getting passed loaded_image.
> >>>> That way, down the call tree the needed handle can be obtained via
> >>>> another call to get_parent_handle(), and quite likely in the scenario
> >>>> you're trying to fix here execution wouldn't even make it there. This
> >>>> then wouldn't be much different to the image name retrieval calling
> >>>> get_parent_handle() a 2nd time, rather than trying to re-use
> >>>> dir_handle.
> >>>> 
> >>>> Net effect being that I think get_parent_handle() would then again
> >>>> only be called when the returned handle is actually needed, and hence
> >>>> when failure of HandleProtocol() (for DeviceHandle being NULL just
> >>>> like for any other reason) is indeed an error that needs reporting.
> >>> 
> >>> In my opinion the current version is good enough. Regardless, I looked
> >>> at your suggestion into details. As it took me some time to understand
> >>> it, I thought I would share the code changes that I think correspond to
> >>> what you wrote. Does everything check out?
> >>> 
> >>> If so, I think it looks fine, maybe a bit better than the current
> >>> version. I'll leave that to you and Luca.
> >>> 
> >>> 
> >>> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> >>> index c3ae9751ab..9dcd8547cd 100644
> >>> --- a/xen/arch/arm/efi/efi-boot.h
> >>> +++ b/xen/arch/arm/efi/efi-boot.h
> >>> @@ -8,6 +8,8 @@
> >>> #include <asm/setup.h>
> >>> #include <asm/smp.h>
> >>> 
> >>> +extern EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> >>> +                                                CHAR16 **leaf);
> >>> typedef struct {
> >>>    char *name;
> >>>    unsigned int name_len;
> >>> @@ -54,7 +56,7 @@ static int handle_module_node(EFI_FILE_HANDLE dir_handle,
> >>>                              bool is_domu_module);
> >>> static int handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
> >>>                                       int domain_node);
> >>> -static int efi_check_dt_boot(EFI_FILE_HANDLE dir_handle);
> >>> +static int efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image);
> >>> 
> >>> #define DEVICE_TREE_GUID \
> >>> {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
> >>> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
> >>> * dom0 and domU guests to be loaded.
> >>> * Returns the number of multiboot modules found or a negative number for error.
> >>> */
> >>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> >>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
> >>> {
> >>>    int chosen, node, addr_len, size_len;
> >>>    unsigned int i = 0, modules_found = 0;
> >>> +    EFI_FILE_HANDLE dir_handle;
> >>> +    CHAR16 *file_name;
> >>> +
> >>> +    dir_handle = get_parent_handle(loaded_image, &file_name);
> >> 
> >> We can’t use get_parent_handle here because we will end up with the same problem,
> >> we would need to use the filesystem if and only if we need to use it, 
> > 
> > Understood, but it would work the same way as this version of the patch:
> > if we end up calling read_file with dir_handle == NULL, then read_file
> > would do:
> > 
> >  blexit(L"Error: No access to the filesystem");
> > 
> > If we don't end up calling read_file, then everything works even if
> > dir_handle == NULL. Right?
> 
> Oh yes sorry my bad Stefano! Having this version of the patch, it will work.
> 
> My understanding was instead that the Jan suggestion is to revert the place
> of call of get_parent_handle (like in your code diff), but also to remove the
> changes to get_parent_handle and read_file.
> I guess Jan will specify his preference, but if he meant the last one, then
> the only way I see...

I think we should keep the changes to get_parent_handle and read_file,
otherwise it will make it awkward, and those changes are good in their
own right anyway.
 
 
> >> so the way I see
> >> is to pass loaded_image down to the stack until allocate_module_file(…), in this
> >> function we can use get_parent_handle(…) because the user wants us to do that.
> >> The downside is that we must close the handle there, so for each loaded file we will
> >> request and close the handle. Is this something we don’t bother too much?
> > 
> > Yeah, that doesn't seem ideal.
> 
> … is this one.
> 
> > 
> > 
> >>> 
> >>>    /* Check for the chosen node in the current DTB */
> >>>    chosen = setup_chosen_node(fdt, &addr_len, &size_len);
> >>> @@ -895,6 +901,8 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> >>>        efi_bs->FreePool(modules[i].name);
> >>>    }
> >>> 
> >>> +    dir_handle->Close(dir_handle);
> >>> +
> >>>    return modules_found;
> >>> }
> >>> 
> >>> diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
> >>> index 112b7e7571..2407671a7d 100644
> >>> --- a/xen/common/efi/boot.c
> >>> +++ b/xen/common/efi/boot.c
> >>> @@ -167,7 +167,7 @@ static void __init PrintErr(const CHAR16 *s)
> >>> }
> >>> 
> >>> #ifndef CONFIG_HAS_DEVICE_TREE
> >>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> >>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
> >>> {
> >>>    return 0;
> >>> }
> >>> @@ -439,8 +439,8 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
> >>>    return argc;
> >>> }
> >>> 
> >>> -static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> >>> -                                                CHAR16 **leaf)
> >>> +EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> >>> +                                         CHAR16 **leaf)
> >>> {
> >>>    static EFI_GUID __initdata fs_protocol = SIMPLE_FILE_SYSTEM_PROTOCOL;
> >>>    static CHAR16 __initdata buffer[512];
> >>> @@ -1236,9 +1236,6 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >>> 
> >>>    efi_arch_relocate_image(0);
> >>> 
> >>> -    /* Get the file system interface. */
> >>> -    dir_handle = get_parent_handle(loaded_image, &file_name);
> >>> -
> >>>    if ( use_cfg_file )
> >>>    {
> >>>        UINTN depth, cols, rows, size;
> >>> @@ -1251,6 +1248,9 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >>> 
> >>>        gop = efi_get_gop();
> >>> 
> >>> +        /* Get the file system interface. */
> >>> +        dir_handle = get_parent_handle(loaded_image, &file_name);
> >>> +
> >>>        /* Read and parse the config file. */
> >>>        if ( read_section(loaded_image, L"config", &cfg, NULL) )
> >>>            PrintStr(L"Using builtin config file\r\n");
> >>> @@ -1344,18 +1344,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >>>            EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
> >>>                                                       &file_name);
> >>> 
> >>> -            if ( !handle )
> >>> -            {
> >>> -                PrintErr(L"Error retrieving image name: no filesystem access."
> >>> -                         L" Setting default to xen.efi");
> >>> -                PrintErr(newline);
> >>> -                *argv = L"xen.efi";
> >>> -            }
> >>> -            else
> >>> -            {
> >>> -                handle->Close(handle);
> >>> -                *argv = file_name;
> >>> -            }
> >>> +            handle->Close(handle);
> >>> +            *argv = file_name;
> >>>        }
> >>> 
> >>>        name.s = get_value(&cfg, section.s, "options");
> >>> @@ -1383,15 +1373,14 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
> >>>        efi_bs->FreePages(cfg.addr, PFN_UP(cfg.size));
> >>>        cfg.addr = 0;
> >>> 
> >>> +        dir_handle->Close(dir_handle);
> >>> +
> >>>        if ( gop && !base_video )
> >>>            gop_mode = efi_find_gop_mode(gop, cols, rows, depth);
> >>>    }
> >>> 
> >>>    /* Get the number of boot modules specified on the DT or an error (<0) */
> >>> -    dt_modules_found = efi_check_dt_boot(dir_handle);
> >>> -
> >>> -    if ( dir_handle )
> >>> -        dir_handle->Close(dir_handle);
> >>> +    dt_modules_found = efi_check_dt_boot(loaded_image);
> >>> 
> >>>    if ( dt_modules_found < 0 )
> >>>        /* efi_check_dt_boot throws some error */
>
Jan Beulich Nov. 5, 2021, 7:27 a.m. UTC | #10
On 04.11.2021 21:56, Stefano Stabellini wrote:
> On Thu, 4 Nov 2021, Jan Beulich wrote:
>> On 04.11.2021 15:12, Luca Fancellu wrote:
>>> --- a/xen/common/efi/boot.c
>>> +++ b/xen/common/efi/boot.c
>>> @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>>>      CHAR16 *pathend, *ptr;
>>>      EFI_STATUS ret;
>>>  
>>> +    /*
>>> +     * Grub2 running on top of EDK2 has been observed to supply a NULL
>>> +     * DeviceHandle. We can't use that to gain access to the filesystem.
>>> +     * However the system can still boot if it doesn’t require access to the
>>> +     * filesystem.
>>> +     */
>>> +    if ( !loaded_image->DeviceHandle )
>>> +        return NULL;
>>> +
>>>      do {
>>>          EFI_FILE_IO_INTERFACE *fio;
>>>  
>>> @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
>>>      EFI_STATUS ret;
>>>      const CHAR16 *what = NULL;
>>>  
>>> +    if ( !dir_handle )
>>> +        blexit(L"Error: No access to the filesystem");
>>>      if ( !name )
>>>          PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
>>>      ret = dir_handle->Open(dir_handle, &FileHandle, name,
>>> @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>              EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
>>>                                                         &file_name);
>>>  
>>> -            handle->Close(handle);
>>> -            *argv = file_name;
>>> +            if ( !handle )
>>> +            {
>>> +                PrintErr(L"Error retrieving image name: no filesystem access."
>>> +                         L" Setting default to xen.efi");
>>> +                PrintErr(newline);
>>> +                *argv = L"xen.efi";
>>> +            }
>>> +            else
>>> +            {
>>> +                handle->Close(handle);
>>> +                *argv = file_name;
>>> +            }
>>>          }
>>>  
>>>          name.s = get_value(&cfg, section.s, "options");
>>> @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>      /* Get the number of boot modules specified on the DT or an error (<0) */
>>>      dt_modules_found = efi_check_dt_boot(dir_handle);
>>>  
>>> -    dir_handle->Close(dir_handle);
>>> +    if ( dir_handle )
>>> +        dir_handle->Close(dir_handle);
>>>  
>>>      if ( dt_modules_found < 0 )
>>>          /* efi_check_dt_boot throws some error */
>>>
>>
>> I'm sorry, but I think we need to take a step back here and revisit
>> the earlier change. If that hadn't moved obtaining dir_handle out by
>> one level of scope, nothing bad would have happened to the case that
>> you're now trying to fix, I understand? So perhaps that part wants
>> undoing, with efi_check_dt_boot() instead getting passed loaded_image.
>> That way, down the call tree the needed handle can be obtained via
>> another call to get_parent_handle(), and quite likely in the scenario
>> you're trying to fix here execution wouldn't even make it there. This
>> then wouldn't be much different to the image name retrieval calling
>> get_parent_handle() a 2nd time, rather than trying to re-use
>> dir_handle.
>>
>> Net effect being that I think get_parent_handle() would then again
>> only be called when the returned handle is actually needed, and hence
>> when failure of HandleProtocol() (for DeviceHandle being NULL just
>> like for any other reason) is indeed an error that needs reporting.
> 
> In my opinion the current version is good enough. Regardless, I looked
> at your suggestion into details. As it took me some time to understand
> it, I thought I would share the code changes that I think correspond to
> what you wrote. Does everything check out?

Well, first of all I understand that's an incremental change on top of
Luca's, not a replacement. And then there are a couple of things to be
done slightly differently (imo), to match the present model:

> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -8,6 +8,8 @@
>  #include <asm/setup.h>
>  #include <asm/smp.h>
>  
> +extern EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
> +                                                CHAR16 **leaf);

This should remain static, but will need forward-declaring (for the
time being, I have a post-4.16 patch eliminating a fair part of
those forward decls).

> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>   * dom0 and domU guests to be loaded.
>   * Returns the number of multiboot modules found or a negative number for error.
>   */
> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>  {
>      int chosen, node, addr_len, size_len;
>      unsigned int i = 0, modules_found = 0;
> +    EFI_FILE_HANDLE dir_handle;
> +    CHAR16 *file_name;
> +
> +    dir_handle = get_parent_handle(loaded_image, &file_name);
>  
>      /* Check for the chosen node in the current DTB */
>      chosen = setup_chosen_node(fdt, &addr_len, &size_len);
> @@ -895,6 +901,8 @@ static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>          efi_bs->FreePool(modules[i].name);
>      }
>  
> +    dir_handle->Close(dir_handle);
> +
>      return modules_found;
>  }

Imo obtaining of the handle wants pushing further down the call tree.
Placing it here will, afaict, still trip the problem Luca is trying
to resolve. Plus of course the handle wants closing also on error
paths (if any in the function this really wants to be put into).

> @@ -1236,9 +1236,6 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>  
>      efi_arch_relocate_image(0);
>  
> -    /* Get the file system interface. */
> -    dir_handle = get_parent_handle(loaded_image, &file_name);
> -
>      if ( use_cfg_file )
>      {
>          UINTN depth, cols, rows, size;
> @@ -1251,6 +1248,9 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>  
>          gop = efi_get_gop();
>  
> +        /* Get the file system interface. */
> +        dir_handle = get_parent_handle(loaded_image, &file_name);

Along with this the declaration of dir_handle also wants to move back
into the more narrow scope.

Jan
Jan Beulich Nov. 5, 2021, 7:32 a.m. UTC | #11
On 04.11.2021 22:43, Luca Fancellu wrote:
> 
> 
>> On 4 Nov 2021, at 21:35, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>
>> On Thu, 4 Nov 2021, Luca Fancellu wrote:
>>>> On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>
>>>> On Thu, 4 Nov 2021, Jan Beulich wrote:
>>>>> On 04.11.2021 15:12, Luca Fancellu wrote:
>>>>>> --- a/xen/common/efi/boot.c
>>>>>> +++ b/xen/common/efi/boot.c
>>>>>> @@ -449,6 +449,15 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>>>>>>    CHAR16 *pathend, *ptr;
>>>>>>    EFI_STATUS ret;
>>>>>>
>>>>>> +    /*
>>>>>> +     * Grub2 running on top of EDK2 has been observed to supply a NULL
>>>>>> +     * DeviceHandle. We can't use that to gain access to the filesystem.
>>>>>> +     * However the system can still boot if it doesn’t require access to the
>>>>>> +     * filesystem.
>>>>>> +     */
>>>>>> +    if ( !loaded_image->DeviceHandle )
>>>>>> +        return NULL;
>>>>>> +
>>>>>>    do {
>>>>>>        EFI_FILE_IO_INTERFACE *fio;
>>>>>>
>>>>>> @@ -581,6 +590,8 @@ static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
>>>>>>    EFI_STATUS ret;
>>>>>>    const CHAR16 *what = NULL;
>>>>>>
>>>>>> +    if ( !dir_handle )
>>>>>> +        blexit(L"Error: No access to the filesystem");
>>>>>>    if ( !name )
>>>>>>        PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
>>>>>>    ret = dir_handle->Open(dir_handle, &FileHandle, name,
>>>>>> @@ -1333,8 +1344,18 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>>>>            EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
>>>>>>                                                       &file_name);
>>>>>>
>>>>>> -            handle->Close(handle);
>>>>>> -            *argv = file_name;
>>>>>> +            if ( !handle )
>>>>>> +            {
>>>>>> +                PrintErr(L"Error retrieving image name: no filesystem access."
>>>>>> +                         L" Setting default to xen.efi");
>>>>>> +                PrintErr(newline);
>>>>>> +                *argv = L"xen.efi";
>>>>>> +            }
>>>>>> +            else
>>>>>> +            {
>>>>>> +                handle->Close(handle);
>>>>>> +                *argv = file_name;
>>>>>> +            }
>>>>>>        }
>>>>>>
>>>>>>        name.s = get_value(&cfg, section.s, "options");
>>>>>> @@ -1369,7 +1390,8 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
>>>>>>    /* Get the number of boot modules specified on the DT or an error (<0) */
>>>>>>    dt_modules_found = efi_check_dt_boot(dir_handle);
>>>>>>
>>>>>> -    dir_handle->Close(dir_handle);
>>>>>> +    if ( dir_handle )
>>>>>> +        dir_handle->Close(dir_handle);
>>>>>>
>>>>>>    if ( dt_modules_found < 0 )
>>>>>>        /* efi_check_dt_boot throws some error */
>>>>>>
>>>>>
>>>>> I'm sorry, but I think we need to take a step back here and revisit
>>>>> the earlier change. If that hadn't moved obtaining dir_handle out by
>>>>> one level of scope, nothing bad would have happened to the case that
>>>>> you're now trying to fix, I understand? So perhaps that part wants
>>>>> undoing, with efi_check_dt_boot() instead getting passed loaded_image.
>>>>> That way, down the call tree the needed handle can be obtained via
>>>>> another call to get_parent_handle(), and quite likely in the scenario
>>>>> you're trying to fix here execution wouldn't even make it there. This
>>>>> then wouldn't be much different to the image name retrieval calling
>>>>> get_parent_handle() a 2nd time, rather than trying to re-use
>>>>> dir_handle.
>>>>>
>>>>> Net effect being that I think get_parent_handle() would then again
>>>>> only be called when the returned handle is actually needed, and hence
>>>>> when failure of HandleProtocol() (for DeviceHandle being NULL just
>>>>> like for any other reason) is indeed an error that needs reporting.
>>>>
>>>> In my opinion the current version is good enough. Regardless, I looked
>>>> at your suggestion into details. As it took me some time to understand
>>>> it, I thought I would share the code changes that I think correspond to
>>>> what you wrote. Does everything check out?
>>>>
>>>> If so, I think it looks fine, maybe a bit better than the current
>>>> version. I'll leave that to you and Luca.
>>>>
>>>>
>>>> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
>>>> index c3ae9751ab..9dcd8547cd 100644
>>>> --- a/xen/arch/arm/efi/efi-boot.h
>>>> +++ b/xen/arch/arm/efi/efi-boot.h
>>>> @@ -8,6 +8,8 @@
>>>> #include <asm/setup.h>
>>>> #include <asm/smp.h>
>>>>
>>>> +extern EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
>>>> +                                                CHAR16 **leaf);
>>>> typedef struct {
>>>>    char *name;
>>>>    unsigned int name_len;
>>>> @@ -54,7 +56,7 @@ static int handle_module_node(EFI_FILE_HANDLE dir_handle,
>>>>                              bool is_domu_module);
>>>> static int handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>>>>                                       int domain_node);
>>>> -static int efi_check_dt_boot(EFI_FILE_HANDLE dir_handle);
>>>> +static int efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image);
>>>>
>>>> #define DEVICE_TREE_GUID \
>>>> {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
>>>> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>>>> * dom0 and domU guests to be loaded.
>>>> * Returns the number of multiboot modules found or a negative number for error.
>>>> */
>>>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>>>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>>>> {
>>>>    int chosen, node, addr_len, size_len;
>>>>    unsigned int i = 0, modules_found = 0;
>>>> +    EFI_FILE_HANDLE dir_handle;
>>>> +    CHAR16 *file_name;
>>>> +
>>>> +    dir_handle = get_parent_handle(loaded_image, &file_name);
>>>
>>> We can’t use get_parent_handle here because we will end up with the same problem,
>>> we would need to use the filesystem if and only if we need to use it, 
>>
>> Understood, but it would work the same way as this version of the patch:
>> if we end up calling read_file with dir_handle == NULL, then read_file
>> would do:
>>
>>  blexit(L"Error: No access to the filesystem");
>>
>> If we don't end up calling read_file, then everything works even if
>> dir_handle == NULL. Right?
> 
> Oh yes sorry my bad Stefano! Having this version of the patch, it will work.
> 
> My understanding was instead that the Jan suggestion is to revert the place
> of call of get_parent_handle (like in your code diff), but also to remove the
> changes to get_parent_handle and read_file.

This is indeed a correct understanding of yours. (Hence why an incremental
change on top of yours wasn't the most expressive way to outline Stefano's
thoughts.)

Jan
Jan Beulich Nov. 5, 2021, 7:35 a.m. UTC | #12
On 04.11.2021 22:50, Stefano Stabellini wrote:
> On Thu, 4 Nov 2021, Luca Fancellu wrote:
>>> On 4 Nov 2021, at 21:35, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>> On Thu, 4 Nov 2021, Luca Fancellu wrote:
>>>>> On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>>>>> * dom0 and domU guests to be loaded.
>>>>> * Returns the number of multiboot modules found or a negative number for error.
>>>>> */
>>>>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>>>>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>>>>> {
>>>>>    int chosen, node, addr_len, size_len;
>>>>>    unsigned int i = 0, modules_found = 0;
>>>>> +    EFI_FILE_HANDLE dir_handle;
>>>>> +    CHAR16 *file_name;
>>>>> +
>>>>> +    dir_handle = get_parent_handle(loaded_image, &file_name);
>>>>
>>>> We can’t use get_parent_handle here because we will end up with the same problem,
>>>> we would need to use the filesystem if and only if we need to use it, 
>>>
>>> Understood, but it would work the same way as this version of the patch:
>>> if we end up calling read_file with dir_handle == NULL, then read_file
>>> would do:
>>>
>>>  blexit(L"Error: No access to the filesystem");
>>>
>>> If we don't end up calling read_file, then everything works even if
>>> dir_handle == NULL. Right?
>>
>> Oh yes sorry my bad Stefano! Having this version of the patch, it will work.
>>
>> My understanding was instead that the Jan suggestion is to revert the place
>> of call of get_parent_handle (like in your code diff), but also to remove the
>> changes to get_parent_handle and read_file.
>> I guess Jan will specify his preference, but if he meant the last one, then
>> the only way I see...
> 
> I think we should keep the changes to get_parent_handle and read_file,
> otherwise it will make it awkward, and those changes are good in their
> own right anyway.

As a maintainer of this code I'm afraid I have to say that I disagree.
These changes were actually part of the reason why I went and looked
how things could (and imo ought to be) done differently.

Jan
Stefano Stabellini Nov. 5, 2021, 3:33 p.m. UTC | #13
On Fri, 5 Nov 2021, Jan Beulich wrote:
> On 04.11.2021 22:50, Stefano Stabellini wrote:
> > On Thu, 4 Nov 2021, Luca Fancellu wrote:
> >>> On 4 Nov 2021, at 21:35, Stefano Stabellini <sstabellini@kernel.org> wrote:
> >>> On Thu, 4 Nov 2021, Luca Fancellu wrote:
> >>>>> On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
> >>>>> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
> >>>>> * dom0 and domU guests to be loaded.
> >>>>> * Returns the number of multiboot modules found or a negative number for error.
> >>>>> */
> >>>>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> >>>>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
> >>>>> {
> >>>>>    int chosen, node, addr_len, size_len;
> >>>>>    unsigned int i = 0, modules_found = 0;
> >>>>> +    EFI_FILE_HANDLE dir_handle;
> >>>>> +    CHAR16 *file_name;
> >>>>> +
> >>>>> +    dir_handle = get_parent_handle(loaded_image, &file_name);
> >>>>
> >>>> We can’t use get_parent_handle here because we will end up with the same problem,
> >>>> we would need to use the filesystem if and only if we need to use it, 
> >>>
> >>> Understood, but it would work the same way as this version of the patch:
> >>> if we end up calling read_file with dir_handle == NULL, then read_file
> >>> would do:
> >>>
> >>>  blexit(L"Error: No access to the filesystem");
> >>>
> >>> If we don't end up calling read_file, then everything works even if
> >>> dir_handle == NULL. Right?
> >>
> >> Oh yes sorry my bad Stefano! Having this version of the patch, it will work.
> >>
> >> My understanding was instead that the Jan suggestion is to revert the place
> >> of call of get_parent_handle (like in your code diff), but also to remove the
> >> changes to get_parent_handle and read_file.
> >> I guess Jan will specify his preference, but if he meant the last one, then
> >> the only way I see...
> > 
> > I think we should keep the changes to get_parent_handle and read_file,
> > otherwise it will make it awkward, and those changes are good in their
> > own right anyway.
> 
> As a maintainer of this code I'm afraid I have to say that I disagree.
> These changes were actually part of the reason why I went and looked
> how things could (and imo ought to be) done differently.

You know this code and EFI booting better than me -- aren't you
concerned about Xen calling get_parent_handle / dir_handle->Close so
many times (by allocate_module_file)?

My main concern is performance and resource utilization. With v3 of the
patch get_parent_handle will get called for every module to be loaded.
With dom0less, it could easily get called 10 times or more. Taking a
look at get_parent_handle, the Xen side doesn't seem small and I have
no idea how the EDK2 side looks. I am just worried that it would
actually have an impact on boot times (also depending on the bootloader
implementation).
Jan Beulich Nov. 8, 2021, 7:25 a.m. UTC | #14
On 05.11.2021 16:33, Stefano Stabellini wrote:
> On Fri, 5 Nov 2021, Jan Beulich wrote:
>> On 04.11.2021 22:50, Stefano Stabellini wrote:
>>> On Thu, 4 Nov 2021, Luca Fancellu wrote:
>>>>> On 4 Nov 2021, at 21:35, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>> On Thu, 4 Nov 2021, Luca Fancellu wrote:
>>>>>>> On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>>>>>>> * dom0 and domU guests to be loaded.
>>>>>>> * Returns the number of multiboot modules found or a negative number for error.
>>>>>>> */
>>>>>>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>>>>>>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>>>>>>> {
>>>>>>>    int chosen, node, addr_len, size_len;
>>>>>>>    unsigned int i = 0, modules_found = 0;
>>>>>>> +    EFI_FILE_HANDLE dir_handle;
>>>>>>> +    CHAR16 *file_name;
>>>>>>> +
>>>>>>> +    dir_handle = get_parent_handle(loaded_image, &file_name);
>>>>>>
>>>>>> We can’t use get_parent_handle here because we will end up with the same problem,
>>>>>> we would need to use the filesystem if and only if we need to use it, 
>>>>>
>>>>> Understood, but it would work the same way as this version of the patch:
>>>>> if we end up calling read_file with dir_handle == NULL, then read_file
>>>>> would do:
>>>>>
>>>>>  blexit(L"Error: No access to the filesystem");
>>>>>
>>>>> If we don't end up calling read_file, then everything works even if
>>>>> dir_handle == NULL. Right?
>>>>
>>>> Oh yes sorry my bad Stefano! Having this version of the patch, it will work.
>>>>
>>>> My understanding was instead that the Jan suggestion is to revert the place
>>>> of call of get_parent_handle (like in your code diff), but also to remove the
>>>> changes to get_parent_handle and read_file.
>>>> I guess Jan will specify his preference, but if he meant the last one, then
>>>> the only way I see...
>>>
>>> I think we should keep the changes to get_parent_handle and read_file,
>>> otherwise it will make it awkward, and those changes are good in their
>>> own right anyway.
>>
>> As a maintainer of this code I'm afraid I have to say that I disagree.
>> These changes were actually part of the reason why I went and looked
>> how things could (and imo ought to be) done differently.
> 
> You know this code and EFI booting better than me -- aren't you
> concerned about Xen calling get_parent_handle / dir_handle->Close so
> many times (by allocate_module_file)?

I'm not overly concerned there; my primary concern is for it to get called
without need in the first place.

> My main concern is performance and resource utilization. With v3 of the
> patch get_parent_handle will get called for every module to be loaded.
> With dom0less, it could easily get called 10 times or more. Taking a
> look at get_parent_handle, the Xen side doesn't seem small and I have
> no idea how the EDK2 side looks. I am just worried that it would
> actually have an impact on boot times (also depending on the bootloader
> implementation).

The biggest part of the function deals with determining the "residual" of
the file name. That part looks to be of no interest at all to
allocate_module_file() (whether that's actually correct I can't tell). I
don't see why this couldn't be made conditional (e.g. by passing in NULL
for "leaf").

Jan
Stefano Stabellini Nov. 9, 2021, 2:11 a.m. UTC | #15
On Mon, 8 Nov 2021, Jan Beulich wrote:
> On 05.11.2021 16:33, Stefano Stabellini wrote:
> > On Fri, 5 Nov 2021, Jan Beulich wrote:
> >> On 04.11.2021 22:50, Stefano Stabellini wrote:
> >>> On Thu, 4 Nov 2021, Luca Fancellu wrote:
> >>>>> On 4 Nov 2021, at 21:35, Stefano Stabellini <sstabellini@kernel.org> wrote:
> >>>>> On Thu, 4 Nov 2021, Luca Fancellu wrote:
> >>>>>>> On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
> >>>>>>> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
> >>>>>>> * dom0 and domU guests to be loaded.
> >>>>>>> * Returns the number of multiboot modules found or a negative number for error.
> >>>>>>> */
> >>>>>>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
> >>>>>>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
> >>>>>>> {
> >>>>>>>    int chosen, node, addr_len, size_len;
> >>>>>>>    unsigned int i = 0, modules_found = 0;
> >>>>>>> +    EFI_FILE_HANDLE dir_handle;
> >>>>>>> +    CHAR16 *file_name;
> >>>>>>> +
> >>>>>>> +    dir_handle = get_parent_handle(loaded_image, &file_name);
> >>>>>>
> >>>>>> We can’t use get_parent_handle here because we will end up with the same problem,
> >>>>>> we would need to use the filesystem if and only if we need to use it, 
> >>>>>
> >>>>> Understood, but it would work the same way as this version of the patch:
> >>>>> if we end up calling read_file with dir_handle == NULL, then read_file
> >>>>> would do:
> >>>>>
> >>>>>  blexit(L"Error: No access to the filesystem");
> >>>>>
> >>>>> If we don't end up calling read_file, then everything works even if
> >>>>> dir_handle == NULL. Right?
> >>>>
> >>>> Oh yes sorry my bad Stefano! Having this version of the patch, it will work.
> >>>>
> >>>> My understanding was instead that the Jan suggestion is to revert the place
> >>>> of call of get_parent_handle (like in your code diff), but also to remove the
> >>>> changes to get_parent_handle and read_file.
> >>>> I guess Jan will specify his preference, but if he meant the last one, then
> >>>> the only way I see...
> >>>
> >>> I think we should keep the changes to get_parent_handle and read_file,
> >>> otherwise it will make it awkward, and those changes are good in their
> >>> own right anyway.
> >>
> >> As a maintainer of this code I'm afraid I have to say that I disagree.
> >> These changes were actually part of the reason why I went and looked
> >> how things could (and imo ought to be) done differently.
> > 
> > You know this code and EFI booting better than me -- aren't you
> > concerned about Xen calling get_parent_handle / dir_handle->Close so
> > many times (by allocate_module_file)?
> 
> I'm not overly concerned there; my primary concern is for it to get called
> without need in the first place.

Exactly my thinking! Except that now it gets called 10x times with
dom0less boot :-(


> > My main concern is performance and resource utilization. With v3 of the
> > patch get_parent_handle will get called for every module to be loaded.
> > With dom0less, it could easily get called 10 times or more. Taking a
> > look at get_parent_handle, the Xen side doesn't seem small and I have
> > no idea how the EDK2 side looks. I am just worried that it would
> > actually have an impact on boot times (also depending on the bootloader
> > implementation).
> 
> The biggest part of the function deals with determining the "residual" of
> the file name. That part looks to be of no interest at all to
> allocate_module_file() (whether that's actually correct I can't tell). I
> don't see why this couldn't be made conditional (e.g. by passing in NULL
> for "leaf").

I understand the idea of passing NULL instead of "leaf", but I tried
having a look and I can't tell what we would be able to skip in
get_parent_handle.

Should we have a global variable to keep the dir_handle open during
dom0less module loading?
Luca Fancellu Nov. 9, 2021, 9:23 a.m. UTC | #16
> On 9 Nov 2021, at 02:11, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Mon, 8 Nov 2021, Jan Beulich wrote:
>> On 05.11.2021 16:33, Stefano Stabellini wrote:
>>> On Fri, 5 Nov 2021, Jan Beulich wrote:
>>>> On 04.11.2021 22:50, Stefano Stabellini wrote:
>>>>> On Thu, 4 Nov 2021, Luca Fancellu wrote:
>>>>>>> On 4 Nov 2021, at 21:35, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>> On Thu, 4 Nov 2021, Luca Fancellu wrote:
>>>>>>>>> On 4 Nov 2021, at 20:56, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>>>>>>>> @@ -851,10 +853,14 @@ static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
>>>>>>>>> * dom0 and domU guests to be loaded.
>>>>>>>>> * Returns the number of multiboot modules found or a negative number for error.
>>>>>>>>> */
>>>>>>>>> -static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
>>>>>>>>> +static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>>>>>>>>> {
>>>>>>>>>   int chosen, node, addr_len, size_len;
>>>>>>>>>   unsigned int i = 0, modules_found = 0;
>>>>>>>>> +    EFI_FILE_HANDLE dir_handle;
>>>>>>>>> +    CHAR16 *file_name;
>>>>>>>>> +
>>>>>>>>> +    dir_handle = get_parent_handle(loaded_image, &file_name);
>>>>>>>> 
>>>>>>>> We can’t use get_parent_handle here because we will end up with the same problem,
>>>>>>>> we would need to use the filesystem if and only if we need to use it, 
>>>>>>> 
>>>>>>> Understood, but it would work the same way as this version of the patch:
>>>>>>> if we end up calling read_file with dir_handle == NULL, then read_file
>>>>>>> would do:
>>>>>>> 
>>>>>>> blexit(L"Error: No access to the filesystem");
>>>>>>> 
>>>>>>> If we don't end up calling read_file, then everything works even if
>>>>>>> dir_handle == NULL. Right?
>>>>>> 
>>>>>> Oh yes sorry my bad Stefano! Having this version of the patch, it will work.
>>>>>> 
>>>>>> My understanding was instead that the Jan suggestion is to revert the place
>>>>>> of call of get_parent_handle (like in your code diff), but also to remove the
>>>>>> changes to get_parent_handle and read_file.
>>>>>> I guess Jan will specify his preference, but if he meant the last one, then
>>>>>> the only way I see...
>>>>> 
>>>>> I think we should keep the changes to get_parent_handle and read_file,
>>>>> otherwise it will make it awkward, and those changes are good in their
>>>>> own right anyway.
>>>> 
>>>> As a maintainer of this code I'm afraid I have to say that I disagree.
>>>> These changes were actually part of the reason why I went and looked
>>>> how things could (and imo ought to be) done differently.
>>> 
>>> You know this code and EFI booting better than me -- aren't you
>>> concerned about Xen calling get_parent_handle / dir_handle->Close so
>>> many times (by allocate_module_file)?
>> 
>> I'm not overly concerned there; my primary concern is for it to get called
>> without need in the first place.
> 
> Exactly my thinking! Except that now it gets called 10x times with
> dom0less boot :-(
> 
> 
>>> My main concern is performance and resource utilization. With v3 of the
>>> patch get_parent_handle will get called for every module to be loaded.
>>> With dom0less, it could easily get called 10 times or more. Taking a
>>> look at get_parent_handle, the Xen side doesn't seem small and I have
>>> no idea how the EDK2 side looks. I am just worried that it would
>>> actually have an impact on boot times (also depending on the bootloader
>>> implementation).
>> 
>> The biggest part of the function deals with determining the "residual" of
>> the file name. That part looks to be of no interest at all to
>> allocate_module_file() (whether that's actually correct I can't tell). I
>> don't see why this couldn't be made conditional (e.g. by passing in NULL
>> for "leaf").
> 
> I understand the idea of passing NULL instead of "leaf", but I tried
> having a look and I can't tell what we would be able to skip in
> get_parent_handle.
> 

Hi Stefano, Jan,

> Should we have a global variable to keep the dir_handle open during
> dom0less module loading?

I thought about a solution for that, here the changes, please not that I’ve just built them, not tested,
Would they be something acceptable to have loaded_image global?

diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 458cfbbed4..b4d86e9f7c 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -44,17 +44,17 @@ void __flush_dcache_area(const void *vaddr, unsigned long size);
 
 static int get_module_file_index(const char *name, unsigned int name_len);
 static void PrintMessage(const CHAR16 *s);
-static int allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
+static int allocate_module_file(EFI_FILE_HANDLE *dir_handle,
                                 const char *name,
                                 unsigned int name_len);
-static int handle_module_node(EFI_LOADED_IMAGE *loaded_image,
+static int handle_module_node(EFI_FILE_HANDLE *dir_handle,
                               int module_node_offset,
                               int reg_addr_cells,
                               int reg_size_cells,
                               bool is_domu_module);
-static int handle_dom0less_domain_node(EFI_LOADED_IMAGE *loaded_image,
+static int handle_dom0less_domain_node(EFI_FILE_HANDLE *dir_handle,
                                        int domain_node);
-static int efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image);
+static int efi_check_dt_boot(void);
 
 #define DEVICE_TREE_GUID \
 {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
@@ -647,11 +647,10 @@ static void __init PrintMessage(const CHAR16 *s)
  * This function allocates a binary and keeps track of its name, it returns the
  * index of the file in the modules array or a negative number on error.
  */
-static int __init allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
+static int __init allocate_module_file(EFI_FILE_HANDLE *dir_handle,
                                        const char *name,
                                        unsigned int name_len)
 {
-    EFI_FILE_HANDLE dir_handle;
     module_name *file_name;
     CHAR16 *fname;
     union string module_name;
@@ -686,12 +685,11 @@ static int __init allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
     file_name->name_len = name_len;
 
     /* Get the file system interface. */
-    dir_handle = get_parent_handle(loaded_image, &fname);
+    if ( !*dir_handle )
+        *dir_handle = get_parent_handle(&fname);
 
     /* Load the binary in memory */
-    read_file(dir_handle, s2w(&module_name), &module_binary, NULL);
-
-    dir_handle->Close(dir_handle);
+    read_file(*dir_handle, s2w(&module_name), &module_binary, NULL);
 
     /* Save address and size */
     file_name->addr = module_binary.addr;
@@ -711,7 +709,7 @@ static int __init allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
  * for the reg property into the module DT node.
  * Returns 1 if module is multiboot,module, 0 if not, < 0 on error
  */
-static int __init handle_module_node(EFI_LOADED_IMAGE *loaded_image,
+static int __init handle_module_node(EFI_FILE_HANDLE *dir_handle,
                                      int module_node_offset,
                                      int reg_addr_cells,
                                      int reg_size_cells,
@@ -744,7 +742,7 @@ static int __init handle_module_node(EFI_LOADED_IMAGE *loaded_image,
     file_idx = get_module_file_index(uefi_name_prop, uefi_name_len);
     if ( file_idx < 0 )
     {
-        file_idx = allocate_module_file(loaded_image, uefi_name_prop,
+        file_idx = allocate_module_file(dir_handle, uefi_name_prop,
                                         uefi_name_len);
         if ( file_idx < 0 )
             return file_idx;
@@ -811,7 +809,7 @@ static int __init handle_module_node(EFI_LOADED_IMAGE *loaded_image,
  * in the DT.
  * Returns number of multiboot,module found or negative number on error.
  */
-static int __init handle_dom0less_domain_node(EFI_LOADED_IMAGE *loaded_image,
+static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE *dir_handle,
                                               int domain_node)
 {
     int module_node, addr_cells, size_cells, len;
@@ -842,7 +840,7 @@ static int __init handle_dom0less_domain_node(EFI_LOADED_IMAGE *loaded_image,
           module_node > 0;
           module_node = fdt_next_subnode(fdt, module_node) )
     {
-        int ret = handle_module_node(loaded_image, module_node, addr_cells,
+        int ret = handle_module_node(dir_handle, module_node, addr_cells,
                                      size_cells, true);
         if ( ret < 0 )
             return ret;
@@ -858,10 +856,11 @@ static int __init handle_dom0less_domain_node(EFI_LOADED_IMAGE *loaded_image,
  * dom0 and domU guests to be loaded.
  * Returns the number of multiboot modules found or a negative number for error.
  */
-static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
+static int __init efi_check_dt_boot(void)
 {
     int chosen, node, addr_len, size_len;
     unsigned int i = 0, modules_found = 0;
+    EFI_FILE_HANDLE *dir_handle = NULL;
 
     /* Check for the chosen node in the current DTB */
     chosen = setup_chosen_node(fdt, &addr_len, &size_len);
@@ -881,13 +880,13 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
         if ( !fdt_node_check_compatible(fdt, node, "xen,domain") )
         {
             /* Found a node with compatible xen,domain; handle this node. */
-            ret = handle_dom0less_domain_node(loaded_image, node);
+            ret = handle_dom0less_domain_node(dir_handle, node);
             if ( ret < 0 )
                 return ERROR_DT_MODULE_DOMU;
         }
         else
         {
-            ret = handle_module_node(loaded_image, node, addr_len, size_len,
+            ret = handle_module_node(dir_handle, node, addr_len, size_len,
                                      false);
             if ( ret < 0 )
                  return ERROR_DT_MODULE_DOM0;
@@ -895,6 +894,9 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
         modules_found += ret;
     }
 
+    if ( *dir_handle )
+        (*dir_handle)->Close(*dir_handle);
+
     /* Free boot modules file names if any */
     for ( ; i < modules_idx; i++ )
     {
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 8fd5e2d078..1a91920e8a 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -121,8 +121,7 @@ static char *get_value(const struct file *cfg, const char *section,
 static char *split_string(char *s);
 static CHAR16 *s2w(union string *str);
 static char *w2s(const union string *str);
-static EFI_FILE_HANDLE get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
-                                         CHAR16 **leaf);
+static EFI_FILE_HANDLE get_parent_handle(CHAR16 **leaf);
 static bool read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
                       struct file *file, const char *options);
 static bool read_section(const EFI_LOADED_IMAGE *image, const CHAR16 *name,
@@ -145,6 +144,7 @@ static void efi_exit_boot(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 static const EFI_BOOT_SERVICES *__initdata efi_bs;
 static UINT32 __initdata efi_bs_revision;
 static EFI_HANDLE __initdata efi_ih;
+static EFI_LOADED_IMAGE *__initdata loaded_image;
 
 static SIMPLE_TEXT_OUTPUT_INTERFACE *__initdata StdOut;
 static SIMPLE_TEXT_OUTPUT_INTERFACE *__initdata StdErr;
@@ -169,7 +169,7 @@ static void __init PrintErr(const CHAR16 *s)
 }
 
 #ifndef CONFIG_HAS_DEVICE_TREE
-static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
+static int __init efi_check_dt_boot(void)
 {
     return 0;
 }
@@ -441,8 +441,7 @@ static unsigned int __init get_argv(unsigned int argc, CHAR16 **argv,
     return argc;
 }
 
-static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
-                                                CHAR16 **leaf)
+static EFI_FILE_HANDLE __init get_parent_handle(CHAR16 **leaf)
 {
     static EFI_GUID __initdata fs_protocol = SIMPLE_FILE_SYSTEM_PROTOCOL;
     static CHAR16 __initdata buffer[512];
@@ -451,6 +450,8 @@ static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
     CHAR16 *pathend, *ptr;
     EFI_STATUS ret;
 
+    BUG_ON(!loaded_image);
+
     do {
         EFI_FILE_IO_INTERFACE *fio;
 
@@ -1134,7 +1135,6 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
 {
     static EFI_GUID __initdata loaded_image_guid = LOADED_IMAGE_PROTOCOL;
     static EFI_GUID __initdata shim_lock_guid = SHIM_LOCK_PROTOCOL_GUID;
-    EFI_LOADED_IMAGE *loaded_image;
     EFI_STATUS status;
     unsigned int i, argc;
     CHAR16 **argv, *file_name, *cfg_file_name = NULL, *options = NULL;
@@ -1240,7 +1240,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
         gop = efi_get_gop();
 
         /* Get the file system interface. */
-        dir_handle = get_parent_handle(loaded_image, &file_name);
+        dir_handle = get_parent_handle(&file_name);
 
         /* Read and parse the config file. */
         if ( read_section(loaded_image, L"config", &cfg, NULL) )
@@ -1332,8 +1332,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
          */
         if ( argc && !*argv )
         {
-            EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
-                                                       &file_name);
+            EFI_FILE_HANDLE handle = get_parent_handle(&file_name);
 
             handle->Close(handle);
             *argv = file_name;
@@ -1371,7 +1370,7 @@ efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
     }
 
     /* Get the number of boot modules specified on the DT or an error (<0) */
-    dt_modules_found = efi_check_dt_boot(loaded_image);
+    dt_modules_found = efi_check_dt_boot();
 
     if ( dt_modules_found < 0 )
         /* efi_check_dt_boot throws some error */


Cheers,
Luca
Jan Beulich Nov. 9, 2021, 11 a.m. UTC | #17
On 09.11.2021 03:11, Stefano Stabellini wrote:
> On Mon, 8 Nov 2021, Jan Beulich wrote:
>> On 05.11.2021 16:33, Stefano Stabellini wrote:
>>> My main concern is performance and resource utilization. With v3 of the
>>> patch get_parent_handle will get called for every module to be loaded.
>>> With dom0less, it could easily get called 10 times or more. Taking a
>>> look at get_parent_handle, the Xen side doesn't seem small and I have
>>> no idea how the EDK2 side looks. I am just worried that it would
>>> actually have an impact on boot times (also depending on the bootloader
>>> implementation).
>>
>> The biggest part of the function deals with determining the "residual" of
>> the file name. That part looks to be of no interest at all to
>> allocate_module_file() (whether that's actually correct I can't tell). I
>> don't see why this couldn't be made conditional (e.g. by passing in NULL
>> for "leaf").
> 
> I understand the idea of passing NULL instead of "leaf", but I tried
> having a look and I can't tell what we would be able to skip in
> get_parent_handle.

My bad - I did overlook that dir_handle gets updated even past the
initial loop.

> Should we have a global variable to keep the dir_handle open during
> dom0less module loading?

If that's contained within Arm-specific code, I (obviously) don't mind.
Otherwise I remain to be convinced.

Jan
Jan Beulich Nov. 9, 2021, 11:01 a.m. UTC | #18
On 09.11.2021 10:23, Luca Fancellu wrote:
>> On 9 Nov 2021, at 02:11, Stefano Stabellini <sstabellini@kernel.org> wrote:
>> Should we have a global variable to keep the dir_handle open during
>> dom0less module loading?
> 
> I thought about a solution for that, here the changes, please not that I’ve just built them, not tested,
> Would they be something acceptable to have loaded_image global?

Not without a very good reason, I'm afraid.

Jan
Stefano Stabellini Nov. 9, 2021, 9:52 p.m. UTC | #19
On Tue, 9 Nov 2021, Jan Beulich wrote:
> On 09.11.2021 03:11, Stefano Stabellini wrote:
> > On Mon, 8 Nov 2021, Jan Beulich wrote:
> >> On 05.11.2021 16:33, Stefano Stabellini wrote:
> >>> My main concern is performance and resource utilization. With v3 of the
> >>> patch get_parent_handle will get called for every module to be loaded.
> >>> With dom0less, it could easily get called 10 times or more. Taking a
> >>> look at get_parent_handle, the Xen side doesn't seem small and I have
> >>> no idea how the EDK2 side looks. I am just worried that it would
> >>> actually have an impact on boot times (also depending on the bootloader
> >>> implementation).
> >>
> >> The biggest part of the function deals with determining the "residual" of
> >> the file name. That part looks to be of no interest at all to
> >> allocate_module_file() (whether that's actually correct I can't tell). I
> >> don't see why this couldn't be made conditional (e.g. by passing in NULL
> >> for "leaf").
> > 
> > I understand the idea of passing NULL instead of "leaf", but I tried
> > having a look and I can't tell what we would be able to skip in
> > get_parent_handle.
> 
> My bad - I did overlook that dir_handle gets updated even past the
> initial loop.
> 
> > Should we have a global variable to keep the dir_handle open during
> > dom0less module loading?
> 
> If that's contained within Arm-specific code, I (obviously) don't mind.
> Otherwise I remain to be convinced.

I think we can do something decent entirely within
xen/arch/arm/efi/efi-boot.h.

Luca, see below as reference; it is untested and incomplete but should
explain the idea better than words. With the below, we only open/close
the handle once for the all dom0less modules.


diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 458cfbbed4..b5218d5228 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -24,6 +24,7 @@ static struct file __initdata module_binary;
 static module_name __initdata modules[MAX_UEFI_MODULES];
 static unsigned int __initdata modules_available = MAX_UEFI_MODULES;
 static unsigned int __initdata modules_idx;
+static EFI_FILE_HANDLE __initdata dir_handle;
 
 #define ERROR_BINARY_FILE_NOT_FOUND (-1)
 #define ERROR_ALLOC_MODULE_NO_SPACE (-1)
@@ -651,9 +652,7 @@ static int __init allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
                                        const char *name,
                                        unsigned int name_len)
 {
-    EFI_FILE_HANDLE dir_handle;
     module_name *file_name;
-    CHAR16 *fname;
     union string module_name;
     int ret;
 
@@ -685,14 +684,9 @@ static int __init allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
     strlcpy(file_name->name, name, name_len + 1);
     file_name->name_len = name_len;
 
-    /* Get the file system interface. */
-    dir_handle = get_parent_handle(loaded_image, &fname);
-
     /* Load the binary in memory */
     read_file(dir_handle, s2w(&module_name), &module_binary, NULL);
 
-    dir_handle->Close(dir_handle);
-
     /* Save address and size */
     file_name->addr = module_binary.addr;
     file_name->size = module_binary.size;
@@ -862,6 +856,7 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
 {
     int chosen, node, addr_len, size_len;
     unsigned int i = 0, modules_found = 0;
+    CHAR16 *fname;
 
     /* Check for the chosen node in the current DTB */
     chosen = setup_chosen_node(fdt, &addr_len, &size_len);
@@ -871,6 +866,8 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
         return ERROR_DT_CHOSEN_NODE;
     }
 
+    dir_handle = get_parent_handle(loaded_image, &fname);
+
     /* Check for nodes compatible with xen,domain under the chosen node */
     for ( node = fdt_first_subnode(fdt, chosen);
           node > 0;
@@ -902,6 +899,8 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
         efi_bs->FreePool(modules[i].name);
     }
 
+    if ( dir_handle )
+        dir_handle->Close(dir_handle);
     return modules_found;
 }
Julien Grall Nov. 9, 2021, 10:31 p.m. UTC | #20
Hi Stefano,

On 09/11/2021 21:52, Stefano Stabellini wrote:
> On Tue, 9 Nov 2021, Jan Beulich wrote:
>> On 09.11.2021 03:11, Stefano Stabellini wrote:
>>> On Mon, 8 Nov 2021, Jan Beulich wrote:
>>>> On 05.11.2021 16:33, Stefano Stabellini wrote:
>>>>> My main concern is performance and resource utilization. With v3 of the
>>>>> patch get_parent_handle will get called for every module to be loaded.
>>>>> With dom0less, it could easily get called 10 times or more. Taking a
>>>>> look at get_parent_handle, the Xen side doesn't seem small and I have
>>>>> no idea how the EDK2 side looks. I am just worried that it would
>>>>> actually have an impact on boot times (also depending on the bootloader
>>>>> implementation).
>>>>
>>>> The biggest part of the function deals with determining the "residual" of
>>>> the file name. That part looks to be of no interest at all to
>>>> allocate_module_file() (whether that's actually correct I can't tell). I
>>>> don't see why this couldn't be made conditional (e.g. by passing in NULL
>>>> for "leaf").
>>>
>>> I understand the idea of passing NULL instead of "leaf", but I tried
>>> having a look and I can't tell what we would be able to skip in
>>> get_parent_handle.
>>
>> My bad - I did overlook that dir_handle gets updated even past the
>> initial loop.
>>
>>> Should we have a global variable to keep the dir_handle open during
>>> dom0less module loading?
>>
>> If that's contained within Arm-specific code, I (obviously) don't mind.
>> Otherwise I remain to be convinced.
> 
> I think we can do something decent entirely within
> xen/arch/arm/efi/efi-boot.h.
> 
> Luca, see below as reference; it is untested and incomplete but should
> explain the idea better than words. With the below, we only open/close
> the handle once for the all dom0less modules.

Looking at the diff below, you open/close dir_handle within 
efi_check_dt_boot(). This means that only the functions called within 
the function will effectively used.

At which point, I think this should be an argument of the functions 
rather than a global function. This will drastically reduce any misuse 
of the global variable (which BTW the name seems to clash with some of 
the arguments).

Cheers,
Jan Beulich Nov. 10, 2021, 7:40 a.m. UTC | #21
On 09.11.2021 22:52, Stefano Stabellini wrote:
> @@ -862,6 +856,7 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>  {
>      int chosen, node, addr_len, size_len;
>      unsigned int i = 0, modules_found = 0;
> +    CHAR16 *fname;
>  
>      /* Check for the chosen node in the current DTB */
>      chosen = setup_chosen_node(fdt, &addr_len, &size_len);
> @@ -871,6 +866,8 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>          return ERROR_DT_CHOSEN_NODE;
>      }
>  
> +    dir_handle = get_parent_handle(loaded_image, &fname);
> +
>      /* Check for nodes compatible with xen,domain under the chosen node */
>      for ( node = fdt_first_subnode(fdt, chosen);
>            node > 0;
> @@ -902,6 +899,8 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>          efi_bs->FreePool(modules[i].name);
>      }
>  
> +    if ( dir_handle )
> +        dir_handle->Close(dir_handle);

Weren't we there before, resulting in the same problem that Luca's
change attempted to fix? get_parent_handle() simply shouldn't be
called unconditionally aiui. I could see closing of the handle to
happen here, but its opening imo should happen the first time
allocate_module_file() is actually reached.

Jan
Luca Fancellu Nov. 10, 2021, 1:05 p.m. UTC | #22
> On 9 Nov 2021, at 21:52, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Tue, 9 Nov 2021, Jan Beulich wrote:
>> On 09.11.2021 03:11, Stefano Stabellini wrote:
>>> On Mon, 8 Nov 2021, Jan Beulich wrote:
>>>> On 05.11.2021 16:33, Stefano Stabellini wrote:
>>>>> My main concern is performance and resource utilization. With v3 of the
>>>>> patch get_parent_handle will get called for every module to be loaded.
>>>>> With dom0less, it could easily get called 10 times or more. Taking a
>>>>> look at get_parent_handle, the Xen side doesn't seem small and I have
>>>>> no idea how the EDK2 side looks. I am just worried that it would
>>>>> actually have an impact on boot times (also depending on the bootloader
>>>>> implementation).
>>>> 
>>>> The biggest part of the function deals with determining the "residual" of
>>>> the file name. That part looks to be of no interest at all to
>>>> allocate_module_file() (whether that's actually correct I can't tell). I
>>>> don't see why this couldn't be made conditional (e.g. by passing in NULL
>>>> for "leaf").
>>> 
>>> I understand the idea of passing NULL instead of "leaf", but I tried
>>> having a look and I can't tell what we would be able to skip in
>>> get_parent_handle.
>> 
>> My bad - I did overlook that dir_handle gets updated even past the
>> initial loop.
>> 
>>> Should we have a global variable to keep the dir_handle open during
>>> dom0less module loading?
>> 
>> If that's contained within Arm-specific code, I (obviously) don't mind.
>> Otherwise I remain to be convinced.
> 
> I think we can do something decent entirely within
> xen/arch/arm/efi/efi-boot.h.
> 
> Luca, see below as reference; it is untested and incomplete but should
> explain the idea better than words. With the below, we only open/close
> the handle once for the all dom0less modules.
> 
> 
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index 458cfbbed4..b5218d5228 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -24,6 +24,7 @@ static struct file __initdata module_binary;
> static module_name __initdata modules[MAX_UEFI_MODULES];
> static unsigned int __initdata modules_available = MAX_UEFI_MODULES;
> static unsigned int __initdata modules_idx;
> +static EFI_FILE_HANDLE __initdata dir_handle;
> 
> #define ERROR_BINARY_FILE_NOT_FOUND (-1)
> #define ERROR_ALLOC_MODULE_NO_SPACE (-1)
> @@ -651,9 +652,7 @@ static int __init allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
>                                        const char *name,
>                                        unsigned int name_len)
> {
> -    EFI_FILE_HANDLE dir_handle;
>     module_name *file_name;
> -    CHAR16 *fname;
>     union string module_name;
>     int ret;
> 
> @@ -685,14 +684,9 @@ static int __init allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
>     strlcpy(file_name->name, name, name_len + 1);
>     file_name->name_len = name_len;
> 
> -    /* Get the file system interface. */
> -    dir_handle = get_parent_handle(loaded_image, &fname);
> -
>     /* Load the binary in memory */
>     read_file(dir_handle, s2w(&module_name), &module_binary, NULL);
> 
> -    dir_handle->Close(dir_handle);
> -
>     /* Save address and size */
>     file_name->addr = module_binary.addr;
>     file_name->size = module_binary.size;
> @@ -862,6 +856,7 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
> {
>     int chosen, node, addr_len, size_len;
>     unsigned int i = 0, modules_found = 0;
> +    CHAR16 *fname;
> 
>     /* Check for the chosen node in the current DTB */
>     chosen = setup_chosen_node(fdt, &addr_len, &size_len);
> @@ -871,6 +866,8 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>         return ERROR_DT_CHOSEN_NODE;
>     }
> 
> +    dir_handle = get_parent_handle(loaded_image, &fname);
> +
>     /* Check for nodes compatible with xen,domain under the chosen node */
>     for ( node = fdt_first_subnode(fdt, chosen);
>           node > 0;
> @@ -902,6 +899,8 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
>         efi_bs->FreePool(modules[i].name);
>     }
> 
> +    if ( dir_handle )
> +        dir_handle->Close(dir_handle);
>     return modules_found;
> }
> 

Hi Stefano,

I thought having the EFI_FILE_HANDLE global in efi-boot.h was a “no go”, but if it’s not then instead of
calling get_parent_handle in efi_check_dt_boot (that is the main issue with EDK2+Grub2), we can do
something like this:

diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 458cfbbed4..169bbfc215 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -24,6 +24,7 @@ static struct file __initdata module_binary;
 static module_name __initdata modules[MAX_UEFI_MODULES];
 static unsigned int __initdata modules_available = MAX_UEFI_MODULES;
 static unsigned int __initdata modules_idx;
+static EFI_FILE_HANDLE __initdata fs_dir_handle;
 
 #define ERROR_BINARY_FILE_NOT_FOUND (-1)
 #define ERROR_ALLOC_MODULE_NO_SPACE (-1)
@@ -651,7 +652,6 @@ static int __init allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
                                        const char *name,
                                        unsigned int name_len)
 {
-    EFI_FILE_HANDLE dir_handle;
     module_name *file_name;
     CHAR16 *fname;
     union string module_name;
@@ -686,12 +686,11 @@ static int __init allocate_module_file(EFI_LOADED_IMAGE *loaded_image,
     file_name->name_len = name_len;
 
     /* Get the file system interface. */
-    dir_handle = get_parent_handle(loaded_image, &fname);
+    if ( !fs_dir_handle )
+        fs_dir_handle = get_parent_handle(loaded_image, &fname);
 
     /* Load the binary in memory */
-    read_file(dir_handle, s2w(&module_name), &module_binary, NULL);
-
-    dir_handle->Close(dir_handle);
+    read_file(fs_dir_handle, s2w(&module_name), &module_binary, NULL);
 
     /* Save address and size */
     file_name->addr = module_binary.addr;
@@ -895,6 +894,10 @@ static int __init efi_check_dt_boot(EFI_LOADED_IMAGE *loaded_image)
         modules_found += ret;
     }
 
+    /* allocate_module_file could have allocated the handle, if so, close it */
+    if ( fs_dir_handle )
+        fs_dir_handle->Close(fs_dir_handle);
+
     /* Free boot modules file names if any */
     for ( ; i < modules_idx; i++ )
     {


I’ve not tested these changes, but I’ve built them for arm/x86 and they build.

What everyone think about that?

Cheers,
Luca
Julien Grall Nov. 10, 2021, 1:36 p.m. UTC | #23
Hi Luca,

On 10/11/2021 13:05, Luca Fancellu wrote:
> I thought having the EFI_FILE_HANDLE global in efi-boot.h was a “no go”, but if it’s not then instead of
> calling get_parent_handle in efi_check_dt_boot (that is the main issue with EDK2+Grub2), we can do
> something like this:

fs_dir_handle is only used by callees of efi_check_boot_dt_boot(). So 
the global variable is not an option for me because the risk is not 
worth it (it is easy to misuse a global variable).

Instead, I think fs_dir_handle should be an argument of 
allocate_module_file() and propagated up to the first call in 
efi_check_dt_boot().

Cheers,
Luca Fancellu Nov. 10, 2021, 2:02 p.m. UTC | #24
> On 10 Nov 2021, at 13:36, Julien Grall <julien@xen.org> wrote:
> 
> Hi Luca,
> 
> On 10/11/2021 13:05, Luca Fancellu wrote:
>> I thought having the EFI_FILE_HANDLE global in efi-boot.h was a “no go”, but if it’s not then instead of
>> calling get_parent_handle in efi_check_dt_boot (that is the main issue with EDK2+Grub2), we can do
>> something like this:
> 

Hi Julien,

> fs_dir_handle is only used by callees of efi_check_boot_dt_boot(). So the global variable is not an option for me because the risk is not worth it (it is easy to misuse a global variable).
> 
> Instead, I think fs_dir_handle should be an argument of allocate_module_file() and propagated up to the first call in efi_check_dt_boot().
> 

Yes you are right, changing the interface of handle_dom0less_domain_node, handle_module_node, allocate_module_file to host also an argument EFI_FILE_HANDLE *dir_handle
avoids the use of the global, then the handle is requested in allocate_module_file only once and closed in efi_check_dt_boot only if it’s not null.

Cheers,
Luca

> Cheers,
> 
> -- 
> Julien Grall
Julien Grall Nov. 15, 2021, 6:57 p.m. UTC | #25
Hi Luca,

On 10/11/2021 14:02, Luca Fancellu wrote:
> 
> 
>> On 10 Nov 2021, at 13:36, Julien Grall <julien@xen.org> wrote:
>>
>> Hi Luca,
>>
>> On 10/11/2021 13:05, Luca Fancellu wrote:
>>> I thought having the EFI_FILE_HANDLE global in efi-boot.h was a “no go”, but if it’s not then instead of
>>> calling get_parent_handle in efi_check_dt_boot (that is the main issue with EDK2+Grub2), we can do
>>> something like this:
>>
>> fs_dir_handle is only used by callees of efi_check_boot_dt_boot(). So the global variable is not an option for me because the risk is not worth it (it is easy to misuse a global variable).
>>
>> Instead, I think fs_dir_handle should be an argument of allocate_module_file() and propagated up to the first call in efi_check_dt_boot().
>>
> 
> Yes you are right, changing the interface of handle_dom0less_domain_node, handle_module_node, allocate_module_file to host also an argument EFI_FILE_HANDLE *dir_handle
> avoids the use of the global, then the handle is requested in allocate_module_file only once and closed in efi_check_dt_boot only if it’s not null.

That would indeed be better. I'd like this patch to be merged in 4.16. 
Would you be able to send a new version in the next couple of days?

Cheers,
Stefano Stabellini Nov. 15, 2021, 10 p.m. UTC | #26
+Ian

On Mon, 15 Nov 2021, Julien Grall wrote:
> Hi Luca,
> 
> On 10/11/2021 14:02, Luca Fancellu wrote:
> > 
> > 
> > > On 10 Nov 2021, at 13:36, Julien Grall <julien@xen.org> wrote:
> > > 
> > > Hi Luca,
> > > 
> > > On 10/11/2021 13:05, Luca Fancellu wrote:
> > > > I thought having the EFI_FILE_HANDLE global in efi-boot.h was a “no go”,
> > > > but if it’s not then instead of
> > > > calling get_parent_handle in efi_check_dt_boot (that is the main issue
> > > > with EDK2+Grub2), we can do
> > > > something like this:
> > > 
> > > fs_dir_handle is only used by callees of efi_check_boot_dt_boot(). So the
> > > global variable is not an option for me because the risk is not worth it
> > > (it is easy to misuse a global variable).
> > > 
> > > Instead, I think fs_dir_handle should be an argument of
> > > allocate_module_file() and propagated up to the first call in
> > > efi_check_dt_boot().
> > > 
> > 
> > Yes you are right, changing the interface of handle_dom0less_domain_node,
> > handle_module_node, allocate_module_file to host also an argument
> > EFI_FILE_HANDLE *dir_handle
> > avoids the use of the global, then the handle is requested in
> > allocate_module_file only once and closed in efi_check_dt_boot only if it’s
> > not null.
> 
> That would indeed be better. I'd like this patch to be merged in 4.16. Would
> you be able to send a new version in the next couple of days?

I'd love that too; adding Ian so that he is aware.
Luca Fancellu Nov. 16, 2021, 8:36 a.m. UTC | #27
> On 15 Nov 2021, at 22:00, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> +Ian
> 
> On Mon, 15 Nov 2021, Julien Grall wrote:
>> Hi Luca,
>> 
>> On 10/11/2021 14:02, Luca Fancellu wrote:
>>> 
>>> 
>>>> On 10 Nov 2021, at 13:36, Julien Grall <julien@xen.org> wrote:
>>>> 
>>>> Hi Luca,
>>>> 
>>>> On 10/11/2021 13:05, Luca Fancellu wrote:
>>>>> I thought having the EFI_FILE_HANDLE global in efi-boot.h was a “no go”,
>>>>> but if it’s not then instead of
>>>>> calling get_parent_handle in efi_check_dt_boot (that is the main issue
>>>>> with EDK2+Grub2), we can do
>>>>> something like this:
>>>> 
>>>> fs_dir_handle is only used by callees of efi_check_boot_dt_boot(). So the
>>>> global variable is not an option for me because the risk is not worth it
>>>> (it is easy to misuse a global variable).
>>>> 
>>>> Instead, I think fs_dir_handle should be an argument of
>>>> allocate_module_file() and propagated up to the first call in
>>>> efi_check_dt_boot().
>>>> 
>>> 
>>> Yes you are right, changing the interface of handle_dom0less_domain_node,
>>> handle_module_node, allocate_module_file to host also an argument
>>> EFI_FILE_HANDLE *dir_handle
>>> avoids the use of the global, then the handle is requested in
>>> allocate_module_file only once and closed in efi_check_dt_boot only if it’s
>>> not null.
>> 
>> That would indeed be better. I'd like this patch to be merged in 4.16. Would
>> you be able to send a new version in the next couple of days?
> 
> I'd love that too; adding Ian so that he is aware.

Hi, yes I will prepare it and push very soon.

Cheers,

Luca
Ian Jackson Nov. 16, 2021, 3:08 p.m. UTC | #28
Luca Fancellu writes ("Re: [PATCH-4.16 v2] xen/efi: Fix Grub2 boot on arm64"):
> > On 15 Nov 2021, at 22:00, Stefano Stabellini <sstabellini@kernel.org> wrote:
> > On Mon, 15 Nov 2021, Julien Grall wrote:
> >> That would indeed be better. I'd like this patch to be merged in 4.16. Would
> >> you be able to send a new version in the next couple of days?
> > 
> > I'd love that too; adding Ian so that he is aware.
> 
> Hi, yes I will prepare it and push very soon.

Can someone explain to me what is going on here in management-level-speak ?
I have read the thread and, as far as I can tell:

There was an actual regression with Grub2 on ARM64.  This was fixed by
9bc9fff04ba0 "xen/efi: Fix Grub2 boot on arm64" (committed on the 5th
of Novwmber).

But there are some objections to parts of that patch, from Jan.  It is
not clear to me what the status of those objections is.

Was I wrong to think that Jan had given an R-b ?  Had it been
withdrawn ?  I apologise if I committed a patch I shouldn't have.
(I have a vague memory of some conversation about this on irc but
nothing about this seems to have made it into email.)

AIUI from the thread, most of this discussion is about a followup
patch.  I don't understand the nature of the problem the followup
patch fixes, or the risk of the followup patch.

Does the current state of staging have a regression or serious bug ?
Who is affected by this bug and what are the consequences ?

Thanks,
Ian.
Jan Beulich Nov. 16, 2021, 4:11 p.m. UTC | #29
On 16.11.2021 16:08, Ian Jackson wrote:
> Luca Fancellu writes ("Re: [PATCH-4.16 v2] xen/efi: Fix Grub2 boot on arm64"):
>>> On 15 Nov 2021, at 22:00, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>> On Mon, 15 Nov 2021, Julien Grall wrote:
>>>> That would indeed be better. I'd like this patch to be merged in 4.16. Would
>>>> you be able to send a new version in the next couple of days?
>>>
>>> I'd love that too; adding Ian so that he is aware.
>>
>> Hi, yes I will prepare it and push very soon.
> 
> Can someone explain to me what is going on here in management-level-speak ?
> I have read the thread and, as far as I can tell:
> 
> There was an actual regression with Grub2 on ARM64.  This was fixed by
> 9bc9fff04ba0 "xen/efi: Fix Grub2 boot on arm64" (committed on the 5th
> of Novwmber).
> 
> But there are some objections to parts of that patch, from Jan.  It is
> not clear to me what the status of those objections is.
> 
> Was I wrong to think that Jan had given an R-b ?  Had it been
> withdrawn ?  I apologise if I committed a patch I shouldn't have.

FTAOD that patch was fine to commit afaic. My concerns were addressed,
but further concerns had been left pending.

Jan
Julien Grall Nov. 16, 2021, 4:23 p.m. UTC | #30
Hi,

On 16/11/2021 15:08, Ian Jackson wrote:
> Luca Fancellu writes ("Re: [PATCH-4.16 v2] xen/efi: Fix Grub2 boot on arm64"):
>>> On 15 Nov 2021, at 22:00, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>> On Mon, 15 Nov 2021, Julien Grall wrote:
>>>> That would indeed be better. I'd like this patch to be merged in 4.16. Would
>>>> you be able to send a new version in the next couple of days?
>>>
>>> I'd love that too; adding Ian so that he is aware.
>>
>> Hi, yes I will prepare it and push very soon.
> 
> Can someone explain to me what is going on here in management-level-speak ?
> I have read the thread and, as far as I can tell:
> 
> There was an actual regression with Grub2 on ARM64.  This was fixed by
> 9bc9fff04ba0 "xen/efi: Fix Grub2 boot on arm64" (committed on the 5th
> of Novwmber).
> 
> But there are some objections to parts of that patch, from Jan.  It is
> not clear to me what the status of those objections is.
> 
> Was I wrong to think that Jan had given an R-b ?  Had it been
> withdrawn ?  I apologise if I committed a patch I shouldn't have.
> (I have a vague memory of some conversation about this on irc but
> nothing about this seems to have made it into email.)
> 
> AIUI from the thread, most of this discussion is about a followup
> patch.  I don't understand the nature of the problem the followup
> patch fixes, or the risk of the followup patch.
> 
> Does the current state of staging have a regression or serious bug ?
> Who is affected by this bug and what are the consequences ?

I think there was some confusion from my side, I thought the original 
path was not committed and therefore we would respin it and fold the fix 
discussed. Apologies for that.

Given the original patch is merged, my view on the new patch is 
different. This is improving performance during boot. While it would be 
a nice patch to merge, I am less inclined to get it merged in 4.16.

That said, the 4.16 tree will be re-opened for backports soon after 
4.16.0 is released. In the past, we backported performance improvement 
patch and we could consider to do the same in this case as well.

Cheers,
diff mbox series

Patch

diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 8b88dd26a5..c3ae9751ab 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -702,6 +702,7 @@  static int __init allocate_module_file(EFI_FILE_HANDLE dir_handle,
  * This function checks for the presence of the xen,uefi-binary property in the
  * module, if found it loads the binary as module and sets the right address
  * for the reg property into the module DT node.
+ * Returns 1 if module is multiboot,module, 0 if not, < 0 on error
  */
 static int __init handle_module_node(EFI_FILE_HANDLE dir_handle,
                                      int module_node_offset,
@@ -730,8 +731,8 @@  static int __init handle_module_node(EFI_FILE_HANDLE dir_handle,
                                  &uefi_name_len);
 
     if ( !uefi_name_prop )
-        /* Property not found */
-        return 0;
+        /* Property not found, but signal this is a multiboot,module */
+        return 1;
 
     file_idx = get_module_file_index(uefi_name_prop, uefi_name_len);
     if ( file_idx < 0 )
@@ -795,19 +796,20 @@  static int __init handle_module_node(EFI_FILE_HANDLE dir_handle,
         }
     }
 
-    return 0;
+    return 1;
 }
 
 /*
  * This function checks for boot modules under the domU guest domain node
  * in the DT.
- * Returns 0 on success, negative number on error.
+ * Returns number of multiboot,module found or negative number on error.
  */
 static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
                                               int domain_node)
 {
     int module_node, addr_cells, size_cells, len;
     const struct fdt_property *prop;
+    unsigned int mb_modules_found = 0;
 
     /* Get #address-cells and #size-cells from domain node */
     prop = fdt_get_property(fdt, domain_node, "#address-cells", &len);
@@ -837,20 +839,22 @@  static int __init handle_dom0less_domain_node(EFI_FILE_HANDLE dir_handle,
                                      size_cells, true);
         if ( ret < 0 )
             return ret;
+
+        mb_modules_found += ret;
     }
 
-    return 0;
+    return mb_modules_found;
 }
 
 /*
  * This function checks for xen domain nodes under the /chosen node for possible
  * dom0 and domU guests to be loaded.
- * Returns the number of modules loaded or a negative number for error.
+ * Returns the number of multiboot modules found or a negative number for error.
  */
 static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
 {
     int chosen, node, addr_len, size_len;
-    unsigned int i = 0;
+    unsigned int i = 0, modules_found = 0;
 
     /* Check for the chosen node in the current DTB */
     chosen = setup_chosen_node(fdt, &addr_len, &size_len);
@@ -865,15 +869,23 @@  static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
           node > 0;
           node = fdt_next_subnode(fdt, node) )
     {
+        int ret;
+
         if ( !fdt_node_check_compatible(fdt, node, "xen,domain") )
         {
             /* Found a node with compatible xen,domain; handle this node. */
-            if ( handle_dom0less_domain_node(dir_handle, node) < 0 )
+            ret = handle_dom0less_domain_node(dir_handle, node);
+            if ( ret < 0 )
                 return ERROR_DT_MODULE_DOMU;
         }
-        else if ( handle_module_node(dir_handle, node, addr_len, size_len,
-                                     false) < 0 )
+        else
+        {
+            ret = handle_module_node(dir_handle, node, addr_len, size_len,
+                                     false);
+            if ( ret < 0 )
                  return ERROR_DT_MODULE_DOM0;
+        }
+        modules_found += ret;
     }
 
     /* Free boot modules file names if any */
@@ -883,7 +895,7 @@  static int __init efi_check_dt_boot(EFI_FILE_HANDLE dir_handle)
         efi_bs->FreePool(modules[i].name);
     }
 
-    return modules_idx;
+    return modules_found;
 }
 
 static void __init efi_arch_cpu(void)
diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index 392ff3ac9b..112b7e7571 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -449,6 +449,15 @@  static EFI_FILE_HANDLE __init get_parent_handle(EFI_LOADED_IMAGE *loaded_image,
     CHAR16 *pathend, *ptr;
     EFI_STATUS ret;
 
+    /*
+     * Grub2 running on top of EDK2 has been observed to supply a NULL
+     * DeviceHandle. We can't use that to gain access to the filesystem.
+     * However the system can still boot if it doesn’t require access to the
+     * filesystem.
+     */
+    if ( !loaded_image->DeviceHandle )
+        return NULL;
+
     do {
         EFI_FILE_IO_INTERFACE *fio;
 
@@ -581,6 +590,8 @@  static bool __init read_file(EFI_FILE_HANDLE dir_handle, CHAR16 *name,
     EFI_STATUS ret;
     const CHAR16 *what = NULL;
 
+    if ( !dir_handle )
+        blexit(L"Error: No access to the filesystem");
     if ( !name )
         PrintErrMesg(L"No filename", EFI_OUT_OF_RESOURCES);
     ret = dir_handle->Open(dir_handle, &FileHandle, name,
@@ -1333,8 +1344,18 @@  efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
             EFI_FILE_HANDLE handle = get_parent_handle(loaded_image,
                                                        &file_name);
 
-            handle->Close(handle);
-            *argv = file_name;
+            if ( !handle )
+            {
+                PrintErr(L"Error retrieving image name: no filesystem access."
+                         L" Setting default to xen.efi");
+                PrintErr(newline);
+                *argv = L"xen.efi";
+            }
+            else
+            {
+                handle->Close(handle);
+                *argv = file_name;
+            }
         }
 
         name.s = get_value(&cfg, section.s, "options");
@@ -1369,7 +1390,8 @@  efi_start(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
     /* Get the number of boot modules specified on the DT or an error (<0) */
     dt_modules_found = efi_check_dt_boot(dir_handle);
 
-    dir_handle->Close(dir_handle);
+    if ( dir_handle )
+        dir_handle->Close(dir_handle);
 
     if ( dt_modules_found < 0 )
         /* efi_check_dt_boot throws some error */