diff mbox series

[V6,2/2] libxl/arm: Add handling of extended regions for DomU

Message ID 1633974539-7380-3-git-send-email-olekstysh@gmail.com (mailing list archive)
State Superseded
Headers show
Series Add handling of extended regions (safe ranges) on Arm (Was "xen/memory: Introduce a hypercall to provide unallocated space") | expand

Commit Message

Oleksandr Tyshchenko Oct. 11, 2021, 5:48 p.m. UTC
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

The extended region (safe range) is a region of guest physical
address space which is unused and could be safely used to create
grant/foreign mappings instead of wasting real RAM pages from
the domain memory for establishing these mappings.

The extended regions are chosen at the domain creation time and
advertised to it via "reg" property under hypervisor node in
the guest device-tree. As region 0 is reserved for grant table
space (always present), the indexes for extended regions are 1...N.
If extended regions could not be allocated for some reason,
Xen doesn't fail and behaves as usual, so only inserts region 0.

Please note the following limitations:
- The extended region feature is only supported for 64-bit domain
  currently.
- The ACPI case is not covered.

***

The algorithm to choose extended regions for non-direct mapped
DomU is simpler in comparison with the algorithm for direct mapped
Dom0. We usually have a lot of unused space above 4GB, and might
have some unused space below 4GB (depends on guest memory size).
Try to allocate separate 2MB-aligned extended regions from the first
(below 4GB) and second (above 4GB) RAM banks taking into the account
the maximum supported guest physical address space size and the amount
of memory assigned to the guest. The minimum size of extended region
the same as for Dom0 (64MB).

Suggested-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
---
Changes RFC -> V2:
   - update patch description
   - drop uneeded "extended-region" DT property
   - clear reg array in finalise_ext_region() and add a TODO

Changes V2 -> V3:
   - update patch description, comments in code
   - only pick up regions with size >= 64MB
   - move the region calculation to make_hypervisor_node() and drop
     finalise_ext_region()
   - extend the list of arguments for make_hypervisor_node()
   - do not show warning for 32-bit domain
   - change the region alignment from 1GB to 2MB
   - move EXT_REGION_SIZE to public/arch-arm.h

Changes V3 -> V4:
   - add R-b, A-b and T-b

Changes V4 -> V5:
   - update patch description and comments in code
   - reflect changes done in previous patch to pass gpaddr_bits
     via createdomain domctl (struct xen_arch_domainconfig)
   - drop R-b, A-b and T-b
   - drop limit for maximum extended region size (128GB)
   - try to also allocate region below 4GB, optimize code
     for calculating extended regions

Change V5 -> V6:
   - reflect changes done in previous patch to pass gpaddr_bits
     via getdomaininfo domctl (struct xen_domctl_getdomaininfo)
   - reduce the number of local variables, rework calculations
---
 tools/libs/light/libxl_arm.c  | 76 ++++++++++++++++++++++++++++++++++++++++---
 xen/include/public/arch-arm.h |  2 ++
 2 files changed, 73 insertions(+), 5 deletions(-)

Comments

Stefano Stabellini Oct. 11, 2021, 8 p.m. UTC | #1
On Mon, 11 Oct 2021, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The extended region (safe range) is a region of guest physical
> address space which is unused and could be safely used to create
> grant/foreign mappings instead of wasting real RAM pages from
> the domain memory for establishing these mappings.
> 
> The extended regions are chosen at the domain creation time and
> advertised to it via "reg" property under hypervisor node in
> the guest device-tree. As region 0 is reserved for grant table
> space (always present), the indexes for extended regions are 1...N.
> If extended regions could not be allocated for some reason,
> Xen doesn't fail and behaves as usual, so only inserts region 0.
> 
> Please note the following limitations:
> - The extended region feature is only supported for 64-bit domain
>   currently.
> - The ACPI case is not covered.
> 
> ***
> 
> The algorithm to choose extended regions for non-direct mapped
> DomU is simpler in comparison with the algorithm for direct mapped
> Dom0. We usually have a lot of unused space above 4GB, and might
> have some unused space below 4GB (depends on guest memory size).
> Try to allocate separate 2MB-aligned extended regions from the first
> (below 4GB) and second (above 4GB) RAM banks taking into the account
> the maximum supported guest physical address space size and the amount
> of memory assigned to the guest. The minimum size of extended region
> the same as for Dom0 (64MB).
> 
> Suggested-by: Julien Grall <jgrall@amazon.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> ---
> Changes RFC -> V2:
>    - update patch description
>    - drop uneeded "extended-region" DT property
>    - clear reg array in finalise_ext_region() and add a TODO
> 
> Changes V2 -> V3:
>    - update patch description, comments in code
>    - only pick up regions with size >= 64MB
>    - move the region calculation to make_hypervisor_node() and drop
>      finalise_ext_region()
>    - extend the list of arguments for make_hypervisor_node()
>    - do not show warning for 32-bit domain
>    - change the region alignment from 1GB to 2MB
>    - move EXT_REGION_SIZE to public/arch-arm.h
> 
> Changes V3 -> V4:
>    - add R-b, A-b and T-b
> 
> Changes V4 -> V5:
>    - update patch description and comments in code
>    - reflect changes done in previous patch to pass gpaddr_bits
>      via createdomain domctl (struct xen_arch_domainconfig)
>    - drop R-b, A-b and T-b
>    - drop limit for maximum extended region size (128GB)
>    - try to also allocate region below 4GB, optimize code
>      for calculating extended regions
> 
> Change V5 -> V6:
>    - reflect changes done in previous patch to pass gpaddr_bits
>      via getdomaininfo domctl (struct xen_domctl_getdomaininfo)
>    - reduce the number of local variables, rework calculations
> ---
>  tools/libs/light/libxl_arm.c  | 76 ++++++++++++++++++++++++++++++++++++++++---
>  xen/include/public/arch-arm.h |  2 ++
>  2 files changed, 73 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index e3140a6..c0e8415 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -598,9 +598,20 @@ static int make_timer_node(libxl__gc *gc, void *fdt,
>      return 0;
>  }
>  
> +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
> +
>  static int make_hypervisor_node(libxl__gc *gc, void *fdt,
> -                                const libxl_version_info *vers)
> +                                const libxl_version_info *vers,
> +                                const libxl_domain_build_info *b_info,
> +                                const struct xc_dom_image *dom)
>  {
> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, region_base[GUEST_RAM_BANKS],
> +        bank1end, ramsize;
> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
> +                  (GUEST_RAM_BANKS + 1)];
> +    be32 *cells = &regs[0];
> +    unsigned int i, len, nr_regions = 0;
> +    libxl_dominfo info;
>      int res;
>      gic_interrupt intr;
>  
> @@ -615,9 +626,64 @@ static int make_hypervisor_node(libxl__gc *gc, void *fdt,
>                                "xen,xen");
>      if (res) return res;
>  
> -    /* reg 0 is grant table space */
> -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> +    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
> +        LOG(WARN, "The extended regions are only supported for 64-bit guest currently");
> +        goto out;
> +    }
> +
> +    res = libxl_domain_info(CTX, &info, dom->guest_domid);
> +    if (res) return res;
> +
> +    assert(info.gpaddr_bits >= 32 && info.gpaddr_bits <= 48);
> +
> +    /*
> +     * Try to allocate separate 2MB-aligned extended regions from the first
> +     * (below 4GB) and second (above 4GB) RAM banks taking into the account
> +     * the maximum supported guest physical address space size and the amount
> +     * of memory assigned to the guest.
> +     * As the guest memory layout is not populated yet we cannot rely on
> +     * dom->rambank_size[], so calculate the actual size of both banks using
> +     * "max_memkb" value.
> +     */
> +    ramsize = b_info->max_memkb * 1024;
> +    if (ramsize <= GUEST_RAM0_SIZE) {
> +        region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
> +        region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
> +        region_base[1] = GUEST_RAM1_BASE;
> +    } else
> +        region_base[1] = GUEST_RAM1_BASE +
> +            ALIGN_UP_TO_2MB(ramsize - GUEST_RAM0_SIZE);
> +
> +    bank1end = min(1ULL << info.gpaddr_bits, GUEST_RAM1_BASE + GUEST_RAM1_SIZE);
> +    if (bank1end > region_base[1])
> +        region_size[1] = bank1end - region_base[1];
> +
> +out:
> +    /*
> +     * The region 0 for grant table space must be always present. If we managed
> +     * to allocate the extended regions then insert them as regions 1...N.
> +     */
> +    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> +              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> +
> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
> +        if (region_size[i] < GUEST_EXT_REGION_MIN_SIZE)
> +            continue;
> +
> +        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"\n",
> +            nr_regions, region_base[i], region_base[i] + region_size[i]);
> +
> +        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> +                  region_base[i], region_size[i]);
> +        nr_regions++;
> +    }
> +
> +    if (!nr_regions)
> +        LOG(WARN, "The extended regions cannot be allocated, not enough space");
> +
> +    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
> +        (nr_regions + 1);
> +    res = fdt_property(fdt, "reg", regs, len);
>      if (res) return res;
>  
>      /*
> @@ -963,7 +1029,7 @@ next_resize:
>          }
>  
>          FDT( make_timer_node(gc, fdt, ainfo, state->clock_frequency) );
> -        FDT( make_hypervisor_node(gc, fdt, vers) );
> +        FDT( make_hypervisor_node(gc, fdt, vers, info, dom) );
>  
>          if (info->arch_arm.vuart == LIBXL_VUART_TYPE_SBSA_UART)
>              FDT( make_vpl011_uart_node(gc, fdt, ainfo, dom) );
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index d46c61f..19ca2b0 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -451,6 +451,8 @@ typedef uint64_t xen_callback_t;
>  #define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
>  #define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }
>  
> +#define GUEST_EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 64MB */
> +
>  /* Current supported guest VCPUs */
>  #define GUEST_MAX_VCPUS 128
>  
> -- 
> 2.7.4
>
Ian Jackson Oct. 12, 2021, 3:11 p.m. UTC | #2
Stefano Stabellini writes ("Re: [PATCH V6 2/2] libxl/arm: Add handling of extended regions for DomU"):
> On Mon, 11 Oct 2021, Oleksandr Tyshchenko wrote:
> > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> > 
> > The extended region (safe range) is a region of guest physical
> > address space which is unused and could be safely used to create
> > grant/foreign mappings instead of wasting real RAM pages from
> > the domain memory for establishing these mappings.
> > 
> > The extended regions are chosen at the domain creation time and
> > advertised to it via "reg" property under hypervisor node in
> > the guest device-tree. As region 0 is reserved for grant table
> > space (always present), the indexes for extended regions are 1...N.
> > If extended regions could not be allocated for some reason,
> > Xen doesn't fail and behaves as usual, so only inserts region 0.
> > 
> > Please note the following limitations:
> > - The extended region feature is only supported for 64-bit domain
> >   currently.
> > - The ACPI case is not covered.
> > 
> > ***
> > 
> > The algorithm to choose extended regions for non-direct mapped
> > DomU is simpler in comparison with the algorithm for direct mapped
> > Dom0. We usually have a lot of unused space above 4GB, and might
> > have some unused space below 4GB (depends on guest memory size).
> > Try to allocate separate 2MB-aligned extended regions from the first
> > (below 4GB) and second (above 4GB) RAM banks taking into the account
> > the maximum supported guest physical address space size and the amount
> > of memory assigned to the guest. The minimum size of extended region
> > the same as for Dom0 (64MB).
> > 
> > Suggested-by: Julien Grall <jgrall@amazon.com>
> > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>

On the basis of that and the diffstat:

Acked-by: Ian Jackson <iwj@xenproject.org>
Oleksandr Tyshchenko Oct. 12, 2021, 3:22 p.m. UTC | #3
On 11.10.21 20:48, Oleksandr Tyshchenko wrote:

Hi Ian, Stefano

> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> The extended region (safe range) is a region of guest physical
> address space which is unused and could be safely used to create
> grant/foreign mappings instead of wasting real RAM pages from
> the domain memory for establishing these mappings.
>
> The extended regions are chosen at the domain creation time and
> advertised to it via "reg" property under hypervisor node in
> the guest device-tree. As region 0 is reserved for grant table
> space (always present), the indexes for extended regions are 1...N.
> If extended regions could not be allocated for some reason,
> Xen doesn't fail and behaves as usual, so only inserts region 0.
>
> Please note the following limitations:
> - The extended region feature is only supported for 64-bit domain
>    currently.
> - The ACPI case is not covered.
>
> ***
>
> The algorithm to choose extended regions for non-direct mapped
> DomU is simpler in comparison with the algorithm for direct mapped
> Dom0. We usually have a lot of unused space above 4GB, and might
> have some unused space below 4GB (depends on guest memory size).
> Try to allocate separate 2MB-aligned extended regions from the first
> (below 4GB) and second (above 4GB) RAM banks taking into the account
> the maximum supported guest physical address space size and the amount
> of memory assigned to the guest. The minimum size of extended region
> the same as for Dom0 (64MB).
>
> Suggested-by: Julien Grall <jgrall@amazon.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> ---
> Changes RFC -> V2:
>     - update patch description
>     - drop uneeded "extended-region" DT property
>     - clear reg array in finalise_ext_region() and add a TODO
>
> Changes V2 -> V3:
>     - update patch description, comments in code
>     - only pick up regions with size >= 64MB
>     - move the region calculation to make_hypervisor_node() and drop
>       finalise_ext_region()
>     - extend the list of arguments for make_hypervisor_node()
>     - do not show warning for 32-bit domain
>     - change the region alignment from 1GB to 2MB
>     - move EXT_REGION_SIZE to public/arch-arm.h
>
> Changes V3 -> V4:
>     - add R-b, A-b and T-b
>
> Changes V4 -> V5:
>     - update patch description and comments in code
>     - reflect changes done in previous patch to pass gpaddr_bits
>       via createdomain domctl (struct xen_arch_domainconfig)
>     - drop R-b, A-b and T-b
>     - drop limit for maximum extended region size (128GB)
>     - try to also allocate region below 4GB, optimize code
>       for calculating extended regions
>
> Change V5 -> V6:
>     - reflect changes done in previous patch to pass gpaddr_bits
>       via getdomaininfo domctl (struct xen_domctl_getdomaininfo)
>     - reduce the number of local variables, rework calculations
> ---
>   tools/libs/light/libxl_arm.c  | 76 ++++++++++++++++++++++++++++++++++++++++---
>   xen/include/public/arch-arm.h |  2 ++
>   2 files changed, 73 insertions(+), 5 deletions(-)
>
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index e3140a6..c0e8415 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -598,9 +598,20 @@ static int make_timer_node(libxl__gc *gc, void *fdt,
>       return 0;
>   }
>   
> +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
> +
>   static int make_hypervisor_node(libxl__gc *gc, void *fdt,
> -                                const libxl_version_info *vers)
> +                                const libxl_version_info *vers,
> +                                const libxl_domain_build_info *b_info,
> +                                const struct xc_dom_image *dom)
>   {
> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, region_base[GUEST_RAM_BANKS],
> +        bank1end, ramsize;
> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
> +                  (GUEST_RAM_BANKS + 1)];
> +    be32 *cells = &regs[0];
> +    unsigned int i, len, nr_regions = 0;
> +    libxl_dominfo info;
>       int res;
>       gic_interrupt intr;
>   
> @@ -615,9 +626,64 @@ static int make_hypervisor_node(libxl__gc *gc, void *fdt,
>                                 "xen,xen");
>       if (res) return res;
>   
> -    /* reg 0 is grant table space */
> -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> +    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
> +        LOG(WARN, "The extended regions are only supported for 64-bit guest currently");
> +        goto out;
> +    }
> +
> +    res = libxl_domain_info(CTX, &info, dom->guest_domid);
> +    if (res) return res;
> +
> +    assert(info.gpaddr_bits >= 32 && info.gpaddr_bits <= 48);
> +
> +    /*
> +     * Try to allocate separate 2MB-aligned extended regions from the first
> +     * (below 4GB) and second (above 4GB) RAM banks taking into the account
> +     * the maximum supported guest physical address space size and the amount
> +     * of memory assigned to the guest.
> +     * As the guest memory layout is not populated yet we cannot rely on
> +     * dom->rambank_size[], so calculate the actual size of both banks using
> +     * "max_memkb" value.
> +     */
> +    ramsize = b_info->max_memkb * 1024;
> +    if (ramsize <= GUEST_RAM0_SIZE) {
> +        region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
> +        region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
> +        region_base[1] = GUEST_RAM1_BASE;
> +    } else
> +        region_base[1] = GUEST_RAM1_BASE +
> +            ALIGN_UP_TO_2MB(ramsize - GUEST_RAM0_SIZE);
> +
> +    bank1end = min(1ULL << info.gpaddr_bits, GUEST_RAM1_BASE + GUEST_RAM1_SIZE);
> +    if (bank1end > region_base[1])
> +        region_size[1] = bank1end - region_base[1];
> +
> +out:
> +    /*
> +     * The region 0 for grant table space must be always present. If we managed
> +     * to allocate the extended regions then insert them as regions 1...N.
> +     */
> +    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> +              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> +
> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
> +        if (region_size[i] < GUEST_EXT_REGION_MIN_SIZE)
> +            continue;
> +
> +        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"\n",

If this appears to be the last version, may I please ask to remove 
unneeded "\n" in the message above while committing? Thank you.


> +            nr_regions, region_base[i], region_base[i] + region_size[i]);
> +
> +        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> +                  region_base[i], region_size[i]);
> +        nr_regions++;
> +    }
> +
> +    if (!nr_regions)
> +        LOG(WARN, "The extended regions cannot be allocated, not enough space");
> +
> +    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
> +        (nr_regions + 1);
> +    res = fdt_property(fdt, "reg", regs, len);
>       if (res) return res;
>   
>       /*
> @@ -963,7 +1029,7 @@ next_resize:
>           }
>   
>           FDT( make_timer_node(gc, fdt, ainfo, state->clock_frequency) );
> -        FDT( make_hypervisor_node(gc, fdt, vers) );
> +        FDT( make_hypervisor_node(gc, fdt, vers, info, dom) );
>   
>           if (info->arch_arm.vuart == LIBXL_VUART_TYPE_SBSA_UART)
>               FDT( make_vpl011_uart_node(gc, fdt, ainfo, dom) );
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index d46c61f..19ca2b0 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -451,6 +451,8 @@ typedef uint64_t xen_callback_t;
>   #define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
>   #define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }
>   
> +#define GUEST_EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 64MB */
> +
>   /* Current supported guest VCPUs */
>   #define GUEST_MAX_VCPUS 128
>
Ian Jackson Oct. 12, 2021, 3:32 p.m. UTC | #4
Oleksandr writes ("Re: [PATCH V6 2/2] libxl/arm: Add handling of extended regions for DomU"):
> On 11.10.21 20:48, Oleksandr Tyshchenko wrote:
> Hi Ian, Stefano
> > +        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"\n",
> 
> If this appears to be the last version, may I please ask to remove 
> unneeded "\n" in the message above while committing? Thank you.

I think Stefano will be committing this but: personally I don't like
the "edit on commit" workflow.  Committers are already a bottleneck
and it is easy to make uncontrolled mistakes.  I find the most
convenient workflow to be to acquire a git branch from somewhere and
commit that, so if you provide a git branch with the acks folded in
and this kind of minor fix included, that would be ideal.  Ie,
git-request-merge format or something like it.

Thanks,
Ian.
Oleksandr Tyshchenko Oct. 12, 2021, 3:38 p.m. UTC | #5
On 12.10.21 18:32, Ian Jackson wrote:

Hi Ian

> Oleksandr writes ("Re: [PATCH V6 2/2] libxl/arm: Add handling of extended regions for DomU"):
>> On 11.10.21 20:48, Oleksandr Tyshchenko wrote:
>> Hi Ian, Stefano
>>> +        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"\n",
>> If this appears to be the last version, may I please ask to remove
>> unneeded "\n" in the message above while committing? Thank you.
> I think Stefano will be committing this but: personally I don't like
> the "edit on commit" workflow.  Committers are already a bottleneck
> and it is easy to make uncontrolled mistakes.  I find the most
> convenient workflow to be to acquire a git branch from somewhere and
> commit that, so if you provide a git branch with the acks folded in
> and this kind of minor fix included, that would be ideal.  Ie,
> git-request-merge format or something like it.

Yes, I will provide a git branch later on today.


>
> Thanks,
> Ian.
Julien Grall Oct. 12, 2021, 4:05 p.m. UTC | #6
Hi Oleksandr,

On 11/10/2021 18:48, Oleksandr Tyshchenko wrote:
> ---
>   tools/libs/light/libxl_arm.c  | 76 ++++++++++++++++++++++++++++++++++++++++---
>   xen/include/public/arch-arm.h |  2 ++
>   2 files changed, 73 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index e3140a6..c0e8415 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -598,9 +598,20 @@ static int make_timer_node(libxl__gc *gc, void *fdt,
>       return 0;
>   }
>   
> +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
> +
>   static int make_hypervisor_node(libxl__gc *gc, void *fdt,
> -                                const libxl_version_info *vers)
> +                                const libxl_version_info *vers,
> +                                const libxl_domain_build_info *b_info,
> +                                const struct xc_dom_image *dom)
>   {
> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, region_base[GUEST_RAM_BANKS],
> +        bank1end, ramsize;
> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
> +                  (GUEST_RAM_BANKS + 1)];
> +    be32 *cells = &regs[0];
> +    unsigned int i, len, nr_regions = 0;
> +    libxl_dominfo info;
>       int res;
>       gic_interrupt intr;
>   
> @@ -615,9 +626,64 @@ static int make_hypervisor_node(libxl__gc *gc, void *fdt,
>                                 "xen,xen");
>       if (res) return res;
>   
> -    /* reg 0 is grant table space */
> -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> +    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
> +        LOG(WARN, "The extended regions are only supported for 64-bit guest currently");
> +        goto out;
> +    }

I understand why we want to limit to 64-bit domain for dom0. But I am 
not sure this is warrant for 32-bit domain. At worse, the guest will 
ignore the bank because it is not usable. So could we drop the check?

> +
> +    res = libxl_domain_info(CTX, &info, dom->guest_domid);
> +    if (res) return res;
> +
> +    assert(info.gpaddr_bits >= 32 && info.gpaddr_bits <= 48);
What could go wrong below if gpaddr_bits is not within this range?

> +
> +    /*
> +     * Try to allocate separate 2MB-aligned extended regions from the first
> +     * (below 4GB) and second (above 4GB) RAM banks taking into the account
> +     * the maximum supported guest physical address space size and the amount
> +     * of memory assigned to the guest.
> +     * As the guest memory layout is not populated yet we cannot rely on
> +     * dom->rambank_size[], so calculate the actual size of both banks using
> +     * "max_memkb" value.
> +     */

At the moment, libxl doesn't know how libxc will allocate the memory. We 
may decide in the future to have only a small amount of memory below 4GB 
and then the rest above 4GB. With this approach it would be more 
difficult to modify the memory layout. Instead, I think we should create 
a placeholder that is updated once we know the banks in 
libxl__arch_domain_finalise_hw_description.

We also probably want to mention in the memory layout in 
public/arch-arm.h this decision as the suggested way to find extended 
regions will definitely impact our decision to re-order the memory 
layout or shrink some regions in the future (I have in mind the PCI 
Passthrough work).

> +    ramsize = b_info->max_memkb * 1024;
> +    if (ramsize <= GUEST_RAM0_SIZE) {
> +        region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
> +        region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
> +        region_base[1] = GUEST_RAM1_BASE;
> +    } else
> +        region_base[1] = GUEST_RAM1_BASE +
> +            ALIGN_UP_TO_2MB(ramsize - GUEST_RAM0_SIZE);
> +
> +    bank1end = min(1ULL << info.gpaddr_bits, GUEST_RAM1_BASE + GUEST_RAM1_SIZE);
> +    if (bank1end > region_base[1])
> +        region_size[1] = bank1end - region_base[1];

It would be best to not rely on the fact that Bank on is always below 
4GB. If the code is too complex then we should look to add a 
BUILD_BUG_ON() to avoid any surprise.

Cheers,
Oleksandr Tyshchenko Oct. 12, 2021, 5:42 p.m. UTC | #7
On 12.10.21 19:05, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien



>
> On 11/10/2021 18:48, Oleksandr Tyshchenko wrote:
>> ---
>>   tools/libs/light/libxl_arm.c  | 76 
>> ++++++++++++++++++++++++++++++++++++++++---
>>   xen/include/public/arch-arm.h |  2 ++
>>   2 files changed, 73 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>> index e3140a6..c0e8415 100644
>> --- a/tools/libs/light/libxl_arm.c
>> +++ b/tools/libs/light/libxl_arm.c
>> @@ -598,9 +598,20 @@ static int make_timer_node(libxl__gc *gc, void 
>> *fdt,
>>       return 0;
>>   }
>>   +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
>> +
>>   static int make_hypervisor_node(libxl__gc *gc, void *fdt,
>> -                                const libxl_version_info *vers)
>> +                                const libxl_version_info *vers,
>> +                                const libxl_domain_build_info *b_info,
>> +                                const struct xc_dom_image *dom)
>>   {
>> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
>> region_base[GUEST_RAM_BANKS],
>> +        bank1end, ramsize;
>> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
>> +                  (GUEST_RAM_BANKS + 1)];
>> +    be32 *cells = &regs[0];
>> +    unsigned int i, len, nr_regions = 0;
>> +    libxl_dominfo info;
>>       int res;
>>       gic_interrupt intr;
>>   @@ -615,9 +626,64 @@ static int make_hypervisor_node(libxl__gc *gc, 
>> void *fdt,
>>                                 "xen,xen");
>>       if (res) return res;
>>   -    /* reg 0 is grant table space */
>> -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>> GUEST_ROOT_SIZE_CELLS,
>> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>> +    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
>> +        LOG(WARN, "The extended regions are only supported for 
>> 64-bit guest currently");
>> +        goto out;
>> +    }
>
> I understand why we want to limit to 64-bit domain for dom0. But I am 
> not sure this is warrant for 32-bit domain. At worse, the guest will 
> ignore the bank because it is not usable. So could we drop the check?

Yes.


>
>
>> +
>> +    res = libxl_domain_info(CTX, &info, dom->guest_domid);
>> +    if (res) return res;
>> +
>> +    assert(info.gpaddr_bits >= 32 && info.gpaddr_bits <= 48);
> What could go wrong below if gpaddr_bits is not within this range?

if info.gpaddr_bits is less than 64, then nothing bad, otherwise, I 
assume we will get shift count overflow.


>
>
>> +
>> +    /*
>> +     * Try to allocate separate 2MB-aligned extended regions from 
>> the first
>> +     * (below 4GB) and second (above 4GB) RAM banks taking into the 
>> account
>> +     * the maximum supported guest physical address space size and 
>> the amount
>> +     * of memory assigned to the guest.
>> +     * As the guest memory layout is not populated yet we cannot 
>> rely on
>> +     * dom->rambank_size[], so calculate the actual size of both 
>> banks using
>> +     * "max_memkb" value.
>> +     */
>
> At the moment, libxl doesn't know how libxc will allocate the memory. 
> We may decide in the future to have only a small amount of memory 
> below 4GB and then the rest above 4GB. With this approach it would be 
> more difficult to modify the memory layout. Instead, I think we should 
> create a placeholder that is updated once we know the banks in 
> libxl__arch_domain_finalise_hw_description.

If I got your point correctly, this is close to how it was done from the 
beginning. Yes, we can create placeholder(s) here and then update them 
once the memory layout is populated. The problem is that we won't be 
able to remove the placeholder(s) if we fail to allocate region(s) for 
some reasons. So, we should know for sure in advance how many region(s) 
we will be able to allocate later on in order to create the required 
number of placeholders right now...  Please, look at the TODO I wrote in 
finalise_ext_region() [1]. Or I misread your point?


>
>
> We also probably want to mention in the memory layout in 
> public/arch-arm.h this decision as the suggested way to find extended 
> regions will definitely impact our decision to re-order the memory 
> layout or shrink some regions in the future (I have in mind the PCI 
> Passthrough work).

Sorry, I couldn't parse.


>
>
>> +    ramsize = b_info->max_memkb * 1024;
>> +    if (ramsize <= GUEST_RAM0_SIZE) {
>> +        region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
>> +        region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
>> +        region_base[1] = GUEST_RAM1_BASE;
>> +    } else
>> +        region_base[1] = GUEST_RAM1_BASE +
>> +            ALIGN_UP_TO_2MB(ramsize - GUEST_RAM0_SIZE);
>> +
>> +    bank1end = min(1ULL << info.gpaddr_bits, GUEST_RAM1_BASE + 
>> GUEST_RAM1_SIZE);
>> +    if (bank1end > region_base[1])
>> +        region_size[1] = bank1end - region_base[1];
>
> It would be best to not rely on the fact that Bank on is always below 
> 4GB. If the code is too complex then we should look to add a 
> BUILD_BUG_ON() to avoid any surprise.

Yes, I can add:

BUILD_BUG_ON((GUEST_RAM0_BASE + GUEST_RAM0_SIZE) > GB(4));


[1] 
https://lore.kernel.org/xen-devel/1631297924-8658-4-git-send-email-olekstysh@gmail.com/


>
> Cheers,
>
Julien Grall Oct. 12, 2021, 9:22 p.m. UTC | #8
Hi Oleksandr,

On 12/10/2021 18:42, Oleksandr wrote:
> On 12.10.21 19:05, Julien Grall wrote:
>> On 11/10/2021 18:48, Oleksandr Tyshchenko wrote:
>>> ---
>>>   tools/libs/light/libxl_arm.c  | 76 
>>> ++++++++++++++++++++++++++++++++++++++++---
>>>   xen/include/public/arch-arm.h |  2 ++
>>>   2 files changed, 73 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>>> index e3140a6..c0e8415 100644
>>> --- a/tools/libs/light/libxl_arm.c
>>> +++ b/tools/libs/light/libxl_arm.c
>>> @@ -598,9 +598,20 @@ static int make_timer_node(libxl__gc *gc, void 
>>> *fdt,
>>>       return 0;
>>>   }
>>>   +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
>>> +
>>>   static int make_hypervisor_node(libxl__gc *gc, void *fdt,
>>> -                                const libxl_version_info *vers)
>>> +                                const libxl_version_info *vers,
>>> +                                const libxl_domain_build_info *b_info,
>>> +                                const struct xc_dom_image *dom)
>>>   {
>>> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
>>> region_base[GUEST_RAM_BANKS],
>>> +        bank1end, ramsize;
>>> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
>>> +                  (GUEST_RAM_BANKS + 1)];
>>> +    be32 *cells = &regs[0];
>>> +    unsigned int i, len, nr_regions = 0;
>>> +    libxl_dominfo info;
>>>       int res;
>>>       gic_interrupt intr;
>>>   @@ -615,9 +626,64 @@ static int make_hypervisor_node(libxl__gc *gc, 
>>> void *fdt,
>>>                                 "xen,xen");
>>>       if (res) return res;
>>>   -    /* reg 0 is grant table space */
>>> -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>>> GUEST_ROOT_SIZE_CELLS,
>>> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>>> +    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
>>> +        LOG(WARN, "The extended regions are only supported for 
>>> 64-bit guest currently");
>>> +        goto out;
>>> +    }
>>
>> I understand why we want to limit to 64-bit domain for dom0. But I am 
>> not sure this is warrant for 32-bit domain. At worse, the guest will 
>> ignore the bank because it is not usable. So could we drop the check?
> 
> Yes.
> 
> 
>>
>>
>>> +
>>> +    res = libxl_domain_info(CTX, &info, dom->guest_domid);
>>> +    if (res) return res;
>>> +
>>> +    assert(info.gpaddr_bits >= 32 && info.gpaddr_bits <= 48);
>> What could go wrong below if gpaddr_bits is not within this range?
> 
> if info.gpaddr_bits is less than 64, then nothing bad, otherwise, I 
> assume we will get shift count overflow.

So I think the assert() is not suitable here because even if the 
gpaddr_bits is provided by the hypervisor (and therefore should be 
trusted), this is a different component so hardening the code is a good 
practice.

In this case, I would check that info.gpaddr_bits <= 64 and return an 
error. The reason I am suggesting <= 64 and not 48 is because Arm 
already supports 52 bits address space. Yet, I still like to avoid this 
assumption in the code. Something like below should work:

bank1end = GUEST_RAM1_BASE + GUEST_RAM1_SIZE - 1;
bank1end = min(bank1end, ~(0ULL) >> (64 - info.gpaddr_bits);

>>> +
>>> +    /*
>>> +     * Try to allocate separate 2MB-aligned extended regions from 
>>> the first
>>> +     * (below 4GB) and second (above 4GB) RAM banks taking into the 
>>> account
>>> +     * the maximum supported guest physical address space size and 
>>> the amount
>>> +     * of memory assigned to the guest.
>>> +     * As the guest memory layout is not populated yet we cannot 
>>> rely on
>>> +     * dom->rambank_size[], so calculate the actual size of both 
>>> banks using
>>> +     * "max_memkb" value.
>>> +     */
>>
>> At the moment, libxl doesn't know how libxc will allocate the memory. 
>> We may decide in the future to have only a small amount of memory 
>> below 4GB and then the rest above 4GB. With this approach it would be 
>> more difficult to modify the memory layout. Instead, I think we should 
>> create a placeholder that is updated once we know the banks in 
>> libxl__arch_domain_finalise_hw_description.
> 
> If I got your point correctly, this is close to how it was done from the 
> beginning. Yes, we can create placeholder(s) here and then update them 
> once the memory layout is populated. The problem is that we won't be 
> able to remove the placeholder(s) if we fail to allocate region(s) for 
> some reasons. So, we should know for sure in advance how many region(s) 
> we will be able to allocate later on in order to create the required 
> number of placeholders right now...  Please, look at the TODO I wrote in 
> finalise_ext_region() [1]. Or I misread your point?

You read correctly my point. However, I disagree that it is a problem to 
remove the placeholder if we fail to allocate the amount of regions 
expected.

Looking at libfdt, I can see two ways to deal with it:
   1) Use fdt_setprop()
   2) Delete the property using fdt_delprop() and then recreate it with 
fdt_appendprop()

The first solution is ideal and I think can work here to downsize the 
property. At worse, the second solution should work as the FDT blob will 
not increase.

>> We also probably want to mention in the memory layout in 
>> public/arch-arm.h this decision as the suggested way to find extended 
>> regions will definitely impact our decision to re-order the memory 
>> layout or shrink some regions in the future (I have in mind the PCI 
>> Passthrough work).
> 
> Sorry, I couldn't parse.

So this patch is relying on the fact that the regions reserved for the 
RAM are big enough to also accommodate the extended regions.

I am happy with this approach. However, I would like the approach to be 
documented in arch-arm.h because this is the first place one would look 
to understand the memory layout. This will be helpful if/when we need to 
modify the guest memory layout.

> 
> 
>>
>>
>>> +    ramsize = b_info->max_memkb * 1024;
>>> +    if (ramsize <= GUEST_RAM0_SIZE) {
>>> +        region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
>>> +        region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
>>> +        region_base[1] = GUEST_RAM1_BASE;
>>> +    } else
>>> +        region_base[1] = GUEST_RAM1_BASE +
>>> +            ALIGN_UP_TO_2MB(ramsize - GUEST_RAM0_SIZE);
>>> +
>>> +    bank1end = min(1ULL << info.gpaddr_bits, GUEST_RAM1_BASE + 
>>> GUEST_RAM1_SIZE);
>>> +    if (bank1end > region_base[1])
>>> +        region_size[1] = bank1end - region_base[1];
>>
>> It would be best to not rely on the fact that Bank on is always below 
>> 4GB. If the code is too complex then we should look to add a 
>> BUILD_BUG_ON() to avoid any surprise.
> 
> Yes, I can add:
> 
> BUILD_BUG_ON((GUEST_RAM0_BASE + GUEST_RAM0_SIZE) > GB(4));

I am OK with that. But I wonder if we could simply use min(..., ) to 
avoid the BUILD_BUG_ON().

Cheers,
Oleksandr Tyshchenko Oct. 12, 2021, 9:43 p.m. UTC | #9
On 13.10.21 00:22, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien, Ian.


Julien, thank you for the detailed answer, I will analyze it tomorrow.

Ian, I think, there is no reason in providing git branch with the acks 
folded in + my minor fix for the debug message as it was discussed 
before, it sounds like there is more work to do, so it is going to be a 
new version anyway.




>
> On 12/10/2021 18:42, Oleksandr wrote:
>> On 12.10.21 19:05, Julien Grall wrote:
>>> On 11/10/2021 18:48, Oleksandr Tyshchenko wrote:
>>>> ---
>>>>   tools/libs/light/libxl_arm.c  | 76 
>>>> ++++++++++++++++++++++++++++++++++++++++---
>>>>   xen/include/public/arch-arm.h |  2 ++
>>>>   2 files changed, 73 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/tools/libs/light/libxl_arm.c 
>>>> b/tools/libs/light/libxl_arm.c
>>>> index e3140a6..c0e8415 100644
>>>> --- a/tools/libs/light/libxl_arm.c
>>>> +++ b/tools/libs/light/libxl_arm.c
>>>> @@ -598,9 +598,20 @@ static int make_timer_node(libxl__gc *gc, void 
>>>> *fdt,
>>>>       return 0;
>>>>   }
>>>>   +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
>>>> +
>>>>   static int make_hypervisor_node(libxl__gc *gc, void *fdt,
>>>> -                                const libxl_version_info *vers)
>>>> +                                const libxl_version_info *vers,
>>>> +                                const libxl_domain_build_info 
>>>> *b_info,
>>>> +                                const struct xc_dom_image *dom)
>>>>   {
>>>> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
>>>> region_base[GUEST_RAM_BANKS],
>>>> +        bank1end, ramsize;
>>>> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + 
>>>> GUEST_ROOT_SIZE_CELLS) *
>>>> +                  (GUEST_RAM_BANKS + 1)];
>>>> +    be32 *cells = &regs[0];
>>>> +    unsigned int i, len, nr_regions = 0;
>>>> +    libxl_dominfo info;
>>>>       int res;
>>>>       gic_interrupt intr;
>>>>   @@ -615,9 +626,64 @@ static int make_hypervisor_node(libxl__gc 
>>>> *gc, void *fdt,
>>>>                                 "xen,xen");
>>>>       if (res) return res;
>>>>   -    /* reg 0 is grant table space */
>>>> -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>>>> GUEST_ROOT_SIZE_CELLS,
>>>> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>>>> +    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
>>>> +        LOG(WARN, "The extended regions are only supported for 
>>>> 64-bit guest currently");
>>>> +        goto out;
>>>> +    }
>>>
>>> I understand why we want to limit to 64-bit domain for dom0. But I 
>>> am not sure this is warrant for 32-bit domain. At worse, the guest 
>>> will ignore the bank because it is not usable. So could we drop the 
>>> check?
>>
>> Yes.
>>
>>
>>>
>>>
>>>> +
>>>> +    res = libxl_domain_info(CTX, &info, dom->guest_domid);
>>>> +    if (res) return res;
>>>> +
>>>> +    assert(info.gpaddr_bits >= 32 && info.gpaddr_bits <= 48);
>>> What could go wrong below if gpaddr_bits is not within this range?
>>
>> if info.gpaddr_bits is less than 64, then nothing bad, otherwise, I 
>> assume we will get shift count overflow.
>
> So I think the assert() is not suitable here because even if the 
> gpaddr_bits is provided by the hypervisor (and therefore should be 
> trusted), this is a different component so hardening the code is a 
> good practice.
>
> In this case, I would check that info.gpaddr_bits <= 64 and return an 
> error. The reason I am suggesting <= 64 and not 48 is because Arm 
> already supports 52 bits address space. Yet, I still like to avoid 
> this assumption in the code. Something like below should work:
>
> bank1end = GUEST_RAM1_BASE + GUEST_RAM1_SIZE - 1;
> bank1end = min(bank1end, ~(0ULL) >> (64 - info.gpaddr_bits);
>
>>>> +
>>>> +    /*
>>>> +     * Try to allocate separate 2MB-aligned extended regions from 
>>>> the first
>>>> +     * (below 4GB) and second (above 4GB) RAM banks taking into 
>>>> the account
>>>> +     * the maximum supported guest physical address space size and 
>>>> the amount
>>>> +     * of memory assigned to the guest.
>>>> +     * As the guest memory layout is not populated yet we cannot 
>>>> rely on
>>>> +     * dom->rambank_size[], so calculate the actual size of both 
>>>> banks using
>>>> +     * "max_memkb" value.
>>>> +     */
>>>
>>> At the moment, libxl doesn't know how libxc will allocate the 
>>> memory. We may decide in the future to have only a small amount of 
>>> memory below 4GB and then the rest above 4GB. With this approach it 
>>> would be more difficult to modify the memory layout. Instead, I 
>>> think we should create a placeholder that is updated once we know 
>>> the banks in libxl__arch_domain_finalise_hw_description.
>>
>> If I got your point correctly, this is close to how it was done from 
>> the beginning. Yes, we can create placeholder(s) here and then update 
>> them once the memory layout is populated. The problem is that we 
>> won't be able to remove the placeholder(s) if we fail to allocate 
>> region(s) for some reasons. So, we should know for sure in advance 
>> how many region(s) we will be able to allocate later on in order to 
>> create the required number of placeholders right now...  Please, look 
>> at the TODO I wrote in finalise_ext_region() [1]. Or I misread your 
>> point?
>
> You read correctly my point. However, I disagree that it is a problem 
> to remove the placeholder if we fail to allocate the amount of regions 
> expected.
>
> Looking at libfdt, I can see two ways to deal with it:
>   1) Use fdt_setprop()
>   2) Delete the property using fdt_delprop() and then recreate it with 
> fdt_appendprop()
>
> The first solution is ideal and I think can work here to downsize the 
> property. At worse, the second solution should work as the FDT blob 
> will not increase.
>
>>> We also probably want to mention in the memory layout in 
>>> public/arch-arm.h this decision as the suggested way to find 
>>> extended regions will definitely impact our decision to re-order the 
>>> memory layout or shrink some regions in the future (I have in mind 
>>> the PCI Passthrough work).
>>
>> Sorry, I couldn't parse.
>
> So this patch is relying on the fact that the regions reserved for the 
> RAM are big enough to also accommodate the extended regions.
>
> I am happy with this approach. However, I would like the approach to 
> be documented in arch-arm.h because this is the first place one would 
> look to understand the memory layout. This will be helpful if/when we 
> need to modify the guest memory layout.
>
>>
>>
>>>
>>>
>>>> +    ramsize = b_info->max_memkb * 1024;
>>>> +    if (ramsize <= GUEST_RAM0_SIZE) {
>>>> +        region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
>>>> +        region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
>>>> +        region_base[1] = GUEST_RAM1_BASE;
>>>> +    } else
>>>> +        region_base[1] = GUEST_RAM1_BASE +
>>>> +            ALIGN_UP_TO_2MB(ramsize - GUEST_RAM0_SIZE);
>>>> +
>>>> +    bank1end = min(1ULL << info.gpaddr_bits, GUEST_RAM1_BASE + 
>>>> GUEST_RAM1_SIZE);
>>>> +    if (bank1end > region_base[1])
>>>> +        region_size[1] = bank1end - region_base[1];
>>>
>>> It would be best to not rely on the fact that Bank on is always 
>>> below 4GB. If the code is too complex then we should look to add a 
>>> BUILD_BUG_ON() to avoid any surprise.
>>
>> Yes, I can add:
>>
>> BUILD_BUG_ON((GUEST_RAM0_BASE + GUEST_RAM0_SIZE) > GB(4));
>
> I am OK with that. But I wonder if we could simply use min(..., ) to 
> avoid the BUILD_BUG_ON().
>
> Cheers,
>
Oleksandr Tyshchenko Oct. 13, 2021, 1:46 p.m. UTC | #10
Hi Julien


On 13.10.21 00:43, Oleksandr wrote:
>
> On 13.10.21 00:22, Julien Grall wrote:
>> Hi Oleksandr,
>
> Hi Julien, Ian.
>
>
> Julien, thank you for the detailed answer, I will analyze it tomorrow.

Already analyzed, please see comments below.


>
> Ian, I think, there is no reason in providing git branch with the acks 
> folded in + my minor fix for the debug message as it was discussed 
> before, it sounds like there is more work to do, so it is going to be 
> a new version anyway.
>
>
>
>
>>
>> On 12/10/2021 18:42, Oleksandr wrote:
>>> On 12.10.21 19:05, Julien Grall wrote:
>>>> On 11/10/2021 18:48, Oleksandr Tyshchenko wrote:
>>>>> ---
>>>>>   tools/libs/light/libxl_arm.c  | 76 
>>>>> ++++++++++++++++++++++++++++++++++++++++---
>>>>>   xen/include/public/arch-arm.h |  2 ++
>>>>>   2 files changed, 73 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/tools/libs/light/libxl_arm.c 
>>>>> b/tools/libs/light/libxl_arm.c
>>>>> index e3140a6..c0e8415 100644
>>>>> --- a/tools/libs/light/libxl_arm.c
>>>>> +++ b/tools/libs/light/libxl_arm.c
>>>>> @@ -598,9 +598,20 @@ static int make_timer_node(libxl__gc *gc, 
>>>>> void *fdt,
>>>>>       return 0;
>>>>>   }
>>>>>   +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
>>>>> +
>>>>>   static int make_hypervisor_node(libxl__gc *gc, void *fdt,
>>>>> -                                const libxl_version_info *vers)
>>>>> +                                const libxl_version_info *vers,
>>>>> +                                const libxl_domain_build_info 
>>>>> *b_info,
>>>>> +                                const struct xc_dom_image *dom)
>>>>>   {
>>>>> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
>>>>> region_base[GUEST_RAM_BANKS],
>>>>> +        bank1end, ramsize;
>>>>> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + 
>>>>> GUEST_ROOT_SIZE_CELLS) *
>>>>> +                  (GUEST_RAM_BANKS + 1)];
>>>>> +    be32 *cells = &regs[0];
>>>>> +    unsigned int i, len, nr_regions = 0;
>>>>> +    libxl_dominfo info;
>>>>>       int res;
>>>>>       gic_interrupt intr;
>>>>>   @@ -615,9 +626,64 @@ static int make_hypervisor_node(libxl__gc 
>>>>> *gc, void *fdt,
>>>>>                                 "xen,xen");
>>>>>       if (res) return res;
>>>>>   -    /* reg 0 is grant table space */
>>>>> -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>>>>> GUEST_ROOT_SIZE_CELLS,
>>>>> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>>>>> +    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
>>>>> +        LOG(WARN, "The extended regions are only supported for 
>>>>> 64-bit guest currently");
>>>>> +        goto out;
>>>>> +    }
>>>>
>>>> I understand why we want to limit to 64-bit domain for dom0. But I 
>>>> am not sure this is warrant for 32-bit domain. At worse, the guest 
>>>> will ignore the bank because it is not usable. So could we drop the 
>>>> check?
>>>
>>> Yes.
>>>
>>>
>>>>
>>>>
>>>>> +
>>>>> +    res = libxl_domain_info(CTX, &info, dom->guest_domid);
>>>>> +    if (res) return res;
>>>>> +
>>>>> +    assert(info.gpaddr_bits >= 32 && info.gpaddr_bits <= 48);
>>>> What could go wrong below if gpaddr_bits is not within this range?
>>>
>>> if info.gpaddr_bits is less than 64, then nothing bad, otherwise, I 
>>> assume we will get shift count overflow.
>>
>> So I think the assert() is not suitable here because even if the 
>> gpaddr_bits is provided by the hypervisor (and therefore should be 
>> trusted), this is a different component so hardening the code is a 
>> good practice.
>>
>> In this case, I would check that info.gpaddr_bits <= 64 and return an 
>> error. The reason I am suggesting <= 64 and not 48 is because Arm 
>> already supports 52 bits address space. Yet, I still like to avoid 
>> this assumption in the code. Something like below should work:
>>
>> bank1end = GUEST_RAM1_BASE + GUEST_RAM1_SIZE - 1;
>> bank1end = min(bank1end, ~(0ULL) >> (64 - info.gpaddr_bits);

ok, makes sense.


>>
>>
>>>>> +
>>>>> +    /*
>>>>> +     * Try to allocate separate 2MB-aligned extended regions from 
>>>>> the first
>>>>> +     * (below 4GB) and second (above 4GB) RAM banks taking into 
>>>>> the account
>>>>> +     * the maximum supported guest physical address space size 
>>>>> and the amount
>>>>> +     * of memory assigned to the guest.
>>>>> +     * As the guest memory layout is not populated yet we cannot 
>>>>> rely on
>>>>> +     * dom->rambank_size[], so calculate the actual size of both 
>>>>> banks using
>>>>> +     * "max_memkb" value.
>>>>> +     */
>>>>
>>>> At the moment, libxl doesn't know how libxc will allocate the 
>>>> memory. We may decide in the future to have only a small amount of 
>>>> memory below 4GB and then the rest above 4GB. With this approach it 
>>>> would be more difficult to modify the memory layout. Instead, I 
>>>> think we should create a placeholder that is updated once we know 
>>>> the banks in libxl__arch_domain_finalise_hw_description.
>>>
>>> If I got your point correctly, this is close to how it was done from 
>>> the beginning. Yes, we can create placeholder(s) here and then 
>>> update them once the memory layout is populated. The problem is that 
>>> we won't be able to remove the placeholder(s) if we fail to allocate 
>>> region(s) for some reasons. So, we should know for sure in advance 
>>> how many region(s) we will be able to allocate later on in order to 
>>> create the required number of placeholders right now... Please, look 
>>> at the TODO I wrote in finalise_ext_region() [1]. Or I misread your 
>>> point?
>>
>> You read correctly my point. However, I disagree that it is a problem 
>> to remove the placeholder if we fail to allocate the amount of 
>> regions expected.
>>
>> Looking at libfdt, I can see two ways to deal with it:
>>   1) Use fdt_setprop()
>>   2) Delete the property using fdt_delprop() and then recreate it 
>> with fdt_appendprop()
>>
>> The first solution is ideal and I think can work here to downsize the 
>> property. At worse, the second solution should work as the FDT blob 
>> will not increase.

Indeed, fdt_setprop() will work in our case, thank you. When I looked at 
it, I only experimented with two high-level functions: 
fdt_setprop_inplace() and fdt_nop_node() used in libxl_arm.c and no one 
wasn't suitable.


>>
>>
>>>> We also probably want to mention in the memory layout in 
>>>> public/arch-arm.h this decision as the suggested way to find 
>>>> extended regions will definitely impact our decision to re-order 
>>>> the memory layout or shrink some regions in the future (I have in 
>>>> mind the PCI Passthrough work).
>>>
>>> Sorry, I couldn't parse.
>>
>> So this patch is relying on the fact that the regions reserved for 
>> the RAM are big enough to also accommodate the extended regions.
>>
>> I am happy with this approach. However, I would like the approach to 
>> be documented in arch-arm.h because this is the first place one would 
>> look to understand the memory layout. This will be helpful if/when we 
>> need to modify the guest memory layout.

Thank you for the clarification, it is clear now.


>>
>>
>>>
>>>
>>>>
>>>>
>>>>> +    ramsize = b_info->max_memkb * 1024;
>>>>> +    if (ramsize <= GUEST_RAM0_SIZE) {
>>>>> +        region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
>>>>> +        region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
>>>>> +        region_base[1] = GUEST_RAM1_BASE;
>>>>> +    } else
>>>>> +        region_base[1] = GUEST_RAM1_BASE +
>>>>> +            ALIGN_UP_TO_2MB(ramsize - GUEST_RAM0_SIZE);
>>>>> +
>>>>> +    bank1end = min(1ULL << info.gpaddr_bits, GUEST_RAM1_BASE + 
>>>>> GUEST_RAM1_SIZE);
>>>>> +    if (bank1end > region_base[1])
>>>>> +        region_size[1] = bank1end - region_base[1];
>>>>
>>>> It would be best to not rely on the fact that Bank on is always 
>>>> below 4GB. If the code is too complex then we should look to add a 
>>>> BUILD_BUG_ON() to avoid any surprise.
>>>
>>> Yes, I can add:
>>>
>>> BUILD_BUG_ON((GUEST_RAM0_BASE + GUEST_RAM0_SIZE) > GB(4));
>>
>> I am OK with that. But I wonder if we could simply use min(..., ) to 
>> avoid the BUILD_BUG_ON().

I think, we could. And probably we could avoid relying on the fact that 
Bank 0 is always below 4GB.

Below, the changes based on my understanding of the whole request. Would 
you be OK with them?


diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index e3140a6..53ae0f3 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc *gc, void 
*fdt,
                                "xen,xen");
      if (res) return res;

-    /* reg 0 is grant table space */
+    /*
+     * reg 0 is a placeholder for grant table space, reg 1...N are
+     * the placeholders for extended regions.
+     */
      res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
GUEST_ROOT_SIZE_CELLS,
-                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
+                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0, 0);
      if (res) return res;

      /*
@@ -1069,6 +1072,74 @@ static void finalise_one_node(libxl__gc *gc, void 
*fdt, const char *uname,
      }
  }

+#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
+
+static void finalise_ext_region(libxl__gc *gc, struct xc_dom_image *dom)
+{
+    void *fdt = dom->devicetree_blob;
+    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
region_base[GUEST_RAM_BANKS],
+        bankend[GUEST_RAM_BANKS];
+    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
+                  (GUEST_RAM_BANKS + 1)];
+    be32 *cells = &regs[0];
+    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
+    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
+    unsigned int i, len, nr_regions = 0;
+    libxl_dominfo info;
+    int offset, rc;
+
+    offset = fdt_path_offset(fdt, "/hypervisor");
+    assert(offset > 0);
+
+    rc = libxl_domain_info(CTX, &info, dom->guest_domid);
+    assert(!rc);
+
+    assert(info.gpaddr_bits <= 64);
+
+    /*
+     * Try to allocate separate 2MB-aligned extended regions from the first
+     * and second RAM banks taking into the account the maximum supported
+     * guest physical address space size and the amount of memory assigned
+     * to the guest.
+     */
+    for (i = 0; i < GUEST_RAM_BANKS; i++) {
+        region_base[i] = bankbase[i] +
+            ALIGN_UP_TO_2MB((uint64_t)dom->rambank_size[i] << 
XC_PAGE_SHIFT);
+
+        bankend[i] = ~0ULL >> (64 - info.gpaddr_bits);
+        bankend[i] = min(bankend[i], bankbase[i] + banksize[i] - 1);
+        if (bankend[i] > region_base[i])
+            region_size[i] = bankend[i] - region_base[i] + 1;
+    }
+
+    /*
+     * The region 0 for grant table space must be always present. If we 
managed
+     * to allocate the extended regions then insert them as regions 1...N.
+     */
+    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
+              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
+
+    for (i = 0; i < GUEST_RAM_BANKS; i++) {
+        if (region_size[i] < GUEST_EXT_REGION_MIN_SIZE)
+            continue;
+
+        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"",
+            nr_regions, region_base[i], region_base[i] + region_size[i]);
+
+        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
+                  region_base[i], region_size[i]);
+        nr_regions++;
+    }
+
+    if (!nr_regions)
+        LOG(WARN, "The extended regions cannot be allocated, not enough 
space");
+
+    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS + 
GUEST_ROOT_SIZE_CELLS) *
+        (nr_regions + 1);
+    rc = fdt_setprop(fdt, offset, "reg", regs, len);
+    assert(!rc);
+}
+
  int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
                                                 uint32_t domid,
libxl_domain_config *d_config,
@@ -1109,6 +1180,8 @@ int 
libxl__arch_domain_finalise_hw_description(libxl__gc *gc,

      }

+    finalise_ext_region(gc, dom);
+
      for (i = 0; i < GUEST_RAM_BANKS; i++) {
          const uint64_t size = (uint64_t)dom->rambank_size[i] << 
XC_PAGE_SHIFT;

diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index d46c61f..7425a78 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -438,6 +438,11 @@ typedef uint64_t xen_callback_t;

  #define GUEST_RAM_BANKS   2

+/*
+ * The way to find the extended regions (to be exposed to the guest as 
unused
+ * address space) relies on the fact that the regions reserved for the RAM
+ * below are big enough to also accommodate such regions.
+ */
  #define GUEST_RAM0_BASE   xen_mk_ullong(0x40000000) /* 3GB of low RAM 
@ 1GB */
  #define GUEST_RAM0_SIZE   xen_mk_ullong(0xc0000000)

@@ -451,6 +456,8 @@ typedef uint64_t xen_callback_t;
  #define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
  #define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }

+#define GUEST_EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 64MB */
+
  /* Current supported guest VCPUs */
  #define GUEST_MAX_VCPUS 128

(END)
Julien Grall Oct. 13, 2021, 3:15 p.m. UTC | #11
On 13/10/2021 14:46, Oleksandr wrote:
> 
> Hi Julien

Hi Oleksandr,

> On 13.10.21 00:43, Oleksandr wrote:
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index e3140a6..53ae0f3 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc *gc, void 
> *fdt,
>                                 "xen,xen");
>       if (res) return res;
> 
> -    /* reg 0 is grant table space */
> +    /*
> +     * reg 0 is a placeholder for grant table space, reg 1...N are
> +     * the placeholders for extended regions.
> +     */
>       res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
> GUEST_ROOT_SIZE_CELLS,
> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> +                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0, 0);

Here you are relying on GUEST_RAM_BANKS == 2. I think this is pretty 
fragile as this may change in the future.

fdt_property_regs() is not really suitable for here. I would suggest to 
create a new helper fdt_property_placeholder() which takes the address 
cells, size cells and the number of ranges. The function will then 
create N range that are zeroed.

>       if (res) return res;
> 
>       /*
> @@ -1069,6 +1072,74 @@ static void finalise_one_node(libxl__gc *gc, void 
> *fdt, const char *uname,
>       }
>   }
> 
> +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
> +
> +static void finalise_ext_region(libxl__gc *gc, struct xc_dom_image *dom)

The function is doing more than finalizing extend regions, it also 
create the grant table regs. So how about naming it: 
finalize_hypervisor_node()?

> +{
> +    void *fdt = dom->devicetree_blob;
> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
> region_base[GUEST_RAM_BANKS],
> +        bankend[GUEST_RAM_BANKS];
> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
> +                  (GUEST_RAM_BANKS + 1)];
> +    be32 *cells = &regs[0];
> +    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> +    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> +    unsigned int i, len, nr_regions = 0;
> +    libxl_dominfo info;
> +    int offset, rc;
> +
> +    offset = fdt_path_offset(fdt, "/hypervisor");
> +    assert(offset > 0);
> +
> +    rc = libxl_domain_info(CTX, &info, dom->guest_domid);
> +    assert(!rc);
> +
> +    assert(info.gpaddr_bits <= 64);

Neither of the two should be assert(). They should be proper check so we 
don't end up with a disaster (in particularly for the former) if there 
is a problem.

> +
> +    /*
> +     * Try to allocate separate 2MB-aligned extended regions from the 
> first
> +     * and second RAM banks taking into the account the maximum supported
> +     * guest physical address space size and the amount of memory assigned
> +     * to the guest.
> +     */
> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
> +        region_base[i] = bankbase[i] +
> +            ALIGN_UP_TO_2MB((uint64_t)dom->rambank_size[i] << 
> XC_PAGE_SHIFT);
> +
> +        bankend[i] = ~0ULL >> (64 - info.gpaddr_bits);
> +        bankend[i] = min(bankend[i], bankbase[i] + banksize[i] - 1);
> +        if (bankend[i] > region_base[i])
> +            region_size[i] = bankend[i] - region_base[i] + 1;
> +    } > +
> +    /*
> +     * The region 0 for grant table space must be always present. If we 
> managed
> +     * to allocate the extended regions then insert them as regions 1...N.
> +     */
> +    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> +              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> +
> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
> +        if (region_size[i] < GUEST_EXT_REGION_MIN_SIZE)
> +            continue;
> +
> +        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"",
> +            nr_regions, region_base[i], region_base[i] + region_size[i]);
> +
> +        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> +                  region_base[i], region_size[i]);
> +        nr_regions++;
> +    }
> +
> +    if (!nr_regions)
> +        LOG(WARN, "The extended regions cannot be allocated, not enough 
> space");
> +
> +    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS + 
> GUEST_ROOT_SIZE_CELLS) *
> +        (nr_regions + 1);
> +    rc = fdt_setprop(fdt, offset, "reg", regs, len);
> +    assert(!rc);

We should propagate the error.

> +}
> +
>   int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
>                                                  uint32_t domid,
> libxl_domain_config *d_config,
> @@ -1109,6 +1180,8 @@ int 
> libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
> 
>       }
> 
> +    finalise_ext_region(gc, dom);
> +
>       for (i = 0; i < GUEST_RAM_BANKS; i++) {
>           const uint64_t size = (uint64_t)dom->rambank_size[i] << 
> XC_PAGE_SHIFT;
> 
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index d46c61f..7425a78 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -438,6 +438,11 @@ typedef uint64_t xen_callback_t;
> 
>   #define GUEST_RAM_BANKS   2
> 
> +/*
> + * The way to find the extended regions (to be exposed to the guest as 
> unused
> + * address space) relies on the fact that the regions reserved for the RAM
> + * below are big enough to also accommodate such regions.
> + */
>   #define GUEST_RAM0_BASE   xen_mk_ullong(0x40000000) /* 3GB of low RAM 
> @ 1GB */
>   #define GUEST_RAM0_SIZE   xen_mk_ullong(0xc0000000)
> 
> @@ -451,6 +456,8 @@ typedef uint64_t xen_callback_t;
>   #define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
>   #define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }
> 
> +#define GUEST_EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 64MB */

I would prefer if this value is not part of the public header because 
this is not a value that the hypervisor needs to know. So it is better 
to restrict it to the libxl_arm.c

Cheers,
Oleksandr Tyshchenko Oct. 13, 2021, 3:49 p.m. UTC | #12
On 13.10.21 18:15, Julien Grall wrote:
> On 13/10/2021 14:46, Oleksandr wrote:
>>
>> Hi Julien
>
> Hi Oleksandr,

Hi Julien


Thank you for the prompt response.


>
>> On 13.10.21 00:43, Oleksandr wrote:
>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>> index e3140a6..53ae0f3 100644
>> --- a/tools/libs/light/libxl_arm.c
>> +++ b/tools/libs/light/libxl_arm.c
>> @@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc *gc, 
>> void *fdt,
>>                                 "xen,xen");
>>       if (res) return res;
>>
>> -    /* reg 0 is grant table space */
>> +    /*
>> +     * reg 0 is a placeholder for grant table space, reg 1...N are
>> +     * the placeholders for extended regions.
>> +     */
>>       res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>> GUEST_ROOT_SIZE_CELLS,
>> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>> +                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0, 0);
>
> Here you are relying on GUEST_RAM_BANKS == 2. I think this is pretty 
> fragile as this may change in the future.
>
> fdt_property_regs() is not really suitable for here. I would suggest 
> to create a new helper fdt_property_placeholder() which takes the 
> address cells, size cells and the number of ranges. The function will 
> then create N range that are zeroed.

You are right. Probably, we could add an assert here for now to be 
triggered if "GUEST_RAM_BANKS != 2"?
But, if you still think we need to make it with the helper right now, I 
will. However, I don't know how long this will take.


>
>
>>       if (res) return res;
>>
>>       /*
>> @@ -1069,6 +1072,74 @@ static void finalise_one_node(libxl__gc *gc, 
>> void *fdt, const char *uname,
>>       }
>>   }
>>
>> +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
>> +
>> +static void finalise_ext_region(libxl__gc *gc, struct xc_dom_image 
>> *dom)
>
> The function is doing more than finalizing extend regions, it also 
> create the grant table regs. So how about naming it: 
> finalize_hypervisor_node()?

ok, I don't mind.


>
>
>> +{
>> +    void *fdt = dom->devicetree_blob;
>> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
>> region_base[GUEST_RAM_BANKS],
>> +        bankend[GUEST_RAM_BANKS];
>> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
>> +                  (GUEST_RAM_BANKS + 1)];
>> +    be32 *cells = &regs[0];
>> +    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>> +    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>> +    unsigned int i, len, nr_regions = 0;
>> +    libxl_dominfo info;
>> +    int offset, rc;
>> +
>> +    offset = fdt_path_offset(fdt, "/hypervisor");
>> +    assert(offset > 0);
>> +
>> +    rc = libxl_domain_info(CTX, &info, dom->guest_domid);
>> +    assert(!rc);
>> +
>> +    assert(info.gpaddr_bits <= 64);
>
> Neither of the two should be assert(). They should be proper check so 
> we don't end up with a disaster (in particularly for the former) if 
> there is a problem.

I looked at the similar finalise_*(), and it looks like no one bothers 
with returning an error. Of course, this is not an excuse, will add a 
proper check.


>
>
>> +
>> +    /*
>> +     * Try to allocate separate 2MB-aligned extended regions from 
>> the first
>> +     * and second RAM banks taking into the account the maximum 
>> supported
>> +     * guest physical address space size and the amount of memory 
>> assigned
>> +     * to the guest.
>> +     */
>> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
>> +        region_base[i] = bankbase[i] +
>> +            ALIGN_UP_TO_2MB((uint64_t)dom->rambank_size[i] << 
>> XC_PAGE_SHIFT);
>> +
>> +        bankend[i] = ~0ULL >> (64 - info.gpaddr_bits);
>> +        bankend[i] = min(bankend[i], bankbase[i] + banksize[i] - 1);
>> +        if (bankend[i] > region_base[i])
>> +            region_size[i] = bankend[i] - region_base[i] + 1;
>> +    } > +
>> +    /*
>> +     * The region 0 for grant table space must be always present. If 
>> we managed
>> +     * to allocate the extended regions then insert them as regions 
>> 1...N.
>> +     */
>> +    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
>> +              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>> +
>> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
>> +        if (region_size[i] < GUEST_EXT_REGION_MIN_SIZE)
>> +            continue;
>> +
>> +        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"",
>> +            nr_regions, region_base[i], region_base[i] + 
>> region_size[i]);
>> +
>> +        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, 
>> GUEST_ROOT_SIZE_CELLS,
>> +                  region_base[i], region_size[i]);
>> +        nr_regions++;
>> +    }
>> +
>> +    if (!nr_regions)
>> +        LOG(WARN, "The extended regions cannot be allocated, not 
>> enough space");
>> +
>> +    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS + 
>> GUEST_ROOT_SIZE_CELLS) *
>> +        (nr_regions + 1);
>> +    rc = fdt_setprop(fdt, offset, "reg", regs, len);
>> +    assert(!rc);
>
> We should propagate the error.

ok, will propagate, it looks like an upper layer 
libxl__arch_domain_finalise_hw_description() also needs to propagate it.


>
>
>> +}
>> +
>>   int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
>>                                                  uint32_t domid,
>> libxl_domain_config *d_config,
>> @@ -1109,6 +1180,8 @@ int 
>> libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
>>
>>       }
>>
>> +    finalise_ext_region(gc, dom);
>> +
>>       for (i = 0; i < GUEST_RAM_BANKS; i++) {
>>           const uint64_t size = (uint64_t)dom->rambank_size[i] << 
>> XC_PAGE_SHIFT;
>>
>> diff --git a/xen/include/public/arch-arm.h 
>> b/xen/include/public/arch-arm.h
>> index d46c61f..7425a78 100644
>> --- a/xen/include/public/arch-arm.h
>> +++ b/xen/include/public/arch-arm.h
>> @@ -438,6 +438,11 @@ typedef uint64_t xen_callback_t;
>>
>>   #define GUEST_RAM_BANKS   2
>>
>> +/*
>> + * The way to find the extended regions (to be exposed to the guest 
>> as unused
>> + * address space) relies on the fact that the regions reserved for 
>> the RAM
>> + * below are big enough to also accommodate such regions.
>> + */
>>   #define GUEST_RAM0_BASE   xen_mk_ullong(0x40000000) /* 3GB of low 
>> RAM @ 1GB */
>>   #define GUEST_RAM0_SIZE   xen_mk_ullong(0xc0000000)
>>
>> @@ -451,6 +456,8 @@ typedef uint64_t xen_callback_t;
>>   #define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
>>   #define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }
>>
>> +#define GUEST_EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 
>> 64MB */
>
> I would prefer if this value is not part of the public header because 
> this is not a value that the hypervisor needs to know. So it is better 
> to restrict it to the libxl_arm.c

ok, will do.
Julien Grall Oct. 13, 2021, 4:24 p.m. UTC | #13
Hi Oleksandr,

On 13/10/2021 16:49, Oleksandr wrote:
> 
> On 13.10.21 18:15, Julien Grall wrote:
>> On 13/10/2021 14:46, Oleksandr wrote:
>>>
>>> Hi Julien
>>
>> Hi Oleksandr,
> 
> Hi Julien
> 
> 
> Thank you for the prompt response.
> 
> 
>>
>>> On 13.10.21 00:43, Oleksandr wrote:
>>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>>> index e3140a6..53ae0f3 100644
>>> --- a/tools/libs/light/libxl_arm.c
>>> +++ b/tools/libs/light/libxl_arm.c
>>> @@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc *gc, 
>>> void *fdt,
>>>                                 "xen,xen");
>>>       if (res) return res;
>>>
>>> -    /* reg 0 is grant table space */
>>> +    /*
>>> +     * reg 0 is a placeholder for grant table space, reg 1...N are
>>> +     * the placeholders for extended regions.
>>> +     */
>>>       res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>>> GUEST_ROOT_SIZE_CELLS,
>>> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>>> +                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0, 0);
>>
>> Here you are relying on GUEST_RAM_BANKS == 2. I think this is pretty 
>> fragile as this may change in the future.
>>
>> fdt_property_regs() is not really suitable for here. I would suggest 
>> to create a new helper fdt_property_placeholder() which takes the 
>> address cells, size cells and the number of ranges. The function will 
>> then create N range that are zeroed.
> 
> You are right. Probably, we could add an assert here for now to be 
> triggered if "GUEST_RAM_BANKS != 2"?
> But, if you still think we need to make it with the helper right now, I 
> will. However, I don't know how long this will take.

I would prefer if we introduce the helper now. Below a potential version 
(not compiled):

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index e3140a6e0039..df59a0521412 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -269,6 +269,20 @@ static int fdt_property_regs(libxl__gc *gc, void *fdt,
      return fdt_property(fdt, "reg", regs, sizeof(regs));
  }

+static int fdt_property_reg_placeholder(libxl__gc *gc, void *fdt,
+                                        unsigned int addr_cells,
+                                        unsigned int size_cells,
+                                        unsigned int num_regs)
+{
+    uint32_t regs[num_regs * (addr_cells + size_cells)];
+
+    for (i = 0 ; i < num_regs; i++) {
+        set_range(&cells, addr_cells, size_cells, 0, 0);
+    }
+
+    return fdt_property(fdt, "reg", regs, sizeof(regs));
+}
+
  static int make_root_properties(libxl__gc *gc,
                                  const libxl_version_info *vers,

>>> +{
>>> +    void *fdt = dom->devicetree_blob;
>>> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
>>> region_base[GUEST_RAM_BANKS],
>>> +        bankend[GUEST_RAM_BANKS];
>>> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
>>> +                  (GUEST_RAM_BANKS + 1)];
>>> +    be32 *cells = &regs[0];
>>> +    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>> +    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>>> +    unsigned int i, len, nr_regions = 0;
>>> +    libxl_dominfo info;
>>> +    int offset, rc;
>>> +
>>> +    offset = fdt_path_offset(fdt, "/hypervisor");
>>> +    assert(offset > 0);
>>> +
>>> +    rc = libxl_domain_info(CTX, &info, dom->guest_domid);
>>> +    assert(!rc);
>>> +
>>> +    assert(info.gpaddr_bits <= 64);
>>
>> Neither of the two should be assert(). They should be proper check so 
>> we don't end up with a disaster (in particularly for the former) if 
>> there is a problem.
> 
> I looked at the similar finalise_*(), and it looks like no one bothers 
> with returning an error. Of course, this is not an excuse, will add a 
> proper check.

This is a bit unfortunate. I would prefer if this can be avoided for new 
code (the more libxl__arch_domain_finalise_hw_description() can already 
propagate the error).

Cheers,
Oleksandr Tyshchenko Oct. 13, 2021, 4:44 p.m. UTC | #14
On 13.10.21 19:24, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien



>
> On 13/10/2021 16:49, Oleksandr wrote:
>>
>> On 13.10.21 18:15, Julien Grall wrote:
>>> On 13/10/2021 14:46, Oleksandr wrote:
>>>>
>>>> Hi Julien
>>>
>>> Hi Oleksandr,
>>
>> Hi Julien
>>
>>
>> Thank you for the prompt response.
>>
>>
>>>
>>>> On 13.10.21 00:43, Oleksandr wrote:
>>>> diff --git a/tools/libs/light/libxl_arm.c 
>>>> b/tools/libs/light/libxl_arm.c
>>>> index e3140a6..53ae0f3 100644
>>>> --- a/tools/libs/light/libxl_arm.c
>>>> +++ b/tools/libs/light/libxl_arm.c
>>>> @@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc *gc, 
>>>> void *fdt,
>>>>                                 "xen,xen");
>>>>       if (res) return res;
>>>>
>>>> -    /* reg 0 is grant table space */
>>>> +    /*
>>>> +     * reg 0 is a placeholder for grant table space, reg 1...N are
>>>> +     * the placeholders for extended regions.
>>>> +     */
>>>>       res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>>>> GUEST_ROOT_SIZE_CELLS,
>>>> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>>>> +                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0, 0);
>>>
>>> Here you are relying on GUEST_RAM_BANKS == 2. I think this is pretty 
>>> fragile as this may change in the future.
>>>
>>> fdt_property_regs() is not really suitable for here. I would suggest 
>>> to create a new helper fdt_property_placeholder() which takes the 
>>> address cells, size cells and the number of ranges. The function 
>>> will then create N range that are zeroed.
>>
>> You are right. Probably, we could add an assert here for now to be 
>> triggered if "GUEST_RAM_BANKS != 2"?
>> But, if you still think we need to make it with the helper right now, 
>> I will. However, I don't know how long this will take.
>
> I would prefer if we introduce the helper now. Below a potential 
> version (not compiled):


You wouldn't believe)))
I decided to not wait for the answer and re-check. So, I ended up with 
the similar to what you suggested below. Thank you.
Yes, it will work if add missing locals. However, I initially named it 
exactly as was suggested (fdt_property_placeholder) and got a 
compilation error, since fdt_property_placeholder is already present.
So, I was thinking to choose another name or to even open-code it, but I 
see you already proposed new name fdt_property_reg_placeholder.

...

libxl_arm.c:366:12: error: conflicting types for 'fdt_property_placeholder'
   366 | static int fdt_property_placeholder(libxl__gc *gc, void *fdt,
       |            ^~~~~~~~~~~~~~~~~~~~~~~~
In file included from libxl_libfdt_compat.h:64,
                  from libxl_arm.c:3:

...


>
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index e3140a6e0039..df59a0521412 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -269,6 +269,20 @@ static int fdt_property_regs(libxl__gc *gc, void 
> *fdt,
>      return fdt_property(fdt, "reg", regs, sizeof(regs));
>  }
>
> +static int fdt_property_reg_placeholder(libxl__gc *gc, void *fdt,
> +                                        unsigned int addr_cells,
> +                                        unsigned int size_cells,
> +                                        unsigned int num_regs)
> +{
> +    uint32_t regs[num_regs * (addr_cells + size_cells)];
> +
> +    for (i = 0 ; i < num_regs; i++) {
> +        set_range(&cells, addr_cells, size_cells, 0, 0);
> +    }
> +
> +    return fdt_property(fdt, "reg", regs, sizeof(regs));
> +}
> +
>  static int make_root_properties(libxl__gc *gc,
>                                  const libxl_version_info *vers,
>
>>>> +{
>>>> +    void *fdt = dom->devicetree_blob;
>>>> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
>>>> region_base[GUEST_RAM_BANKS],
>>>> +        bankend[GUEST_RAM_BANKS];
>>>> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + 
>>>> GUEST_ROOT_SIZE_CELLS) *
>>>> +                  (GUEST_RAM_BANKS + 1)];
>>>> +    be32 *cells = &regs[0];
>>>> +    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>>> +    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>>>> +    unsigned int i, len, nr_regions = 0;
>>>> +    libxl_dominfo info;
>>>> +    int offset, rc;
>>>> +
>>>> +    offset = fdt_path_offset(fdt, "/hypervisor");
>>>> +    assert(offset > 0);
>>>> +
>>>> +    rc = libxl_domain_info(CTX, &info, dom->guest_domid);
>>>> +    assert(!rc);
>>>> +
>>>> +    assert(info.gpaddr_bits <= 64);
>>>
>>> Neither of the two should be assert(). They should be proper check 
>>> so we don't end up with a disaster (in particularly for the former) 
>>> if there is a problem.
>>
>> I looked at the similar finalise_*(), and it looks like no one 
>> bothers with returning an error. Of course, this is not an excuse, 
>> will add a proper check.
>
> This is a bit unfortunate. I would prefer if this can be avoided for 
> new code (the more libxl__arch_domain_finalise_hw_description() can 
> already propagate the error).
>
> Cheers,
>
Julien Grall Oct. 13, 2021, 5:07 p.m. UTC | #15
Hi,

On 13/10/2021 17:44, Oleksandr wrote:
> On 13.10.21 19:24, Julien Grall wrote:
>> On 13/10/2021 16:49, Oleksandr wrote:
>>>
>>> On 13.10.21 18:15, Julien Grall wrote:
>>>> On 13/10/2021 14:46, Oleksandr wrote:
>>>>>
>>>>> Hi Julien
>>>>
>>>> Hi Oleksandr,
>>>
>>> Hi Julien
>>>
>>>
>>> Thank you for the prompt response.
>>>
>>>
>>>>
>>>>> On 13.10.21 00:43, Oleksandr wrote:
>>>>> diff --git a/tools/libs/light/libxl_arm.c 
>>>>> b/tools/libs/light/libxl_arm.c
>>>>> index e3140a6..53ae0f3 100644
>>>>> --- a/tools/libs/light/libxl_arm.c
>>>>> +++ b/tools/libs/light/libxl_arm.c
>>>>> @@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc *gc, 
>>>>> void *fdt,
>>>>>                                 "xen,xen");
>>>>>       if (res) return res;
>>>>>
>>>>> -    /* reg 0 is grant table space */
>>>>> +    /*
>>>>> +     * reg 0 is a placeholder for grant table space, reg 1...N are
>>>>> +     * the placeholders for extended regions.
>>>>> +     */
>>>>>       res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>>>>> GUEST_ROOT_SIZE_CELLS,
>>>>> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>>>>> +                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0, 0);
>>>>
>>>> Here you are relying on GUEST_RAM_BANKS == 2. I think this is pretty 
>>>> fragile as this may change in the future.
>>>>
>>>> fdt_property_regs() is not really suitable for here. I would suggest 
>>>> to create a new helper fdt_property_placeholder() which takes the 
>>>> address cells, size cells and the number of ranges. The function 
>>>> will then create N range that are zeroed.
>>>
>>> You are right. Probably, we could add an assert here for now to be 
>>> triggered if "GUEST_RAM_BANKS != 2"?
>>> But, if you still think we need to make it with the helper right now, 
>>> I will. However, I don't know how long this will take.
>>
>> I would prefer if we introduce the helper now. Below a potential 
>> version (not compiled):
> 
> 
> You wouldn't believe)))
> I decided to not wait for the answer and re-check. So, I ended up with 
> the similar to what you suggested below. Thank you.
> Yes, it will work if add missing locals. However, I initially named it 
> exactly as was suggested (fdt_property_placeholder) and got a 
> compilation error, since fdt_property_placeholder is already present.
> So, I was thinking to choose another name or to even open-code it, but I 
> see you already proposed new name fdt_property_reg_placeholder.

Ah I didn't realize that libfdt already implemented 
fdt_property_placeholder(). The function sounds perfect for us (and the 
code is nicer :)), but unfortunately this was introdcued only in 2017. 
So it is possible that some distros may not ship the last libfdt.

So for now, I think we need to implement our own. In a follow-up we 
could try to provide a compat layer as we did for other missing fdt_* 
helpers.

Cheers,
Oleksandr Tyshchenko Oct. 13, 2021, 6:26 p.m. UTC | #16
On 13.10.21 20:07, Julien Grall wrote:

Hi Julien

> Hi,
>
> On 13/10/2021 17:44, Oleksandr wrote:
>> On 13.10.21 19:24, Julien Grall wrote:
>>> On 13/10/2021 16:49, Oleksandr wrote:
>>>>
>>>> On 13.10.21 18:15, Julien Grall wrote:
>>>>> On 13/10/2021 14:46, Oleksandr wrote:
>>>>>>
>>>>>> Hi Julien
>>>>>
>>>>> Hi Oleksandr,
>>>>
>>>> Hi Julien
>>>>
>>>>
>>>> Thank you for the prompt response.
>>>>
>>>>
>>>>>
>>>>>> On 13.10.21 00:43, Oleksandr wrote:
>>>>>> diff --git a/tools/libs/light/libxl_arm.c 
>>>>>> b/tools/libs/light/libxl_arm.c
>>>>>> index e3140a6..53ae0f3 100644
>>>>>> --- a/tools/libs/light/libxl_arm.c
>>>>>> +++ b/tools/libs/light/libxl_arm.c
>>>>>> @@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc 
>>>>>> *gc, void *fdt,
>>>>>>                                 "xen,xen");
>>>>>>       if (res) return res;
>>>>>>
>>>>>> -    /* reg 0 is grant table space */
>>>>>> +    /*
>>>>>> +     * reg 0 is a placeholder for grant table space, reg 1...N are
>>>>>> +     * the placeholders for extended regions.
>>>>>> +     */
>>>>>>       res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>>>>>> GUEST_ROOT_SIZE_CELLS,
>>>>>> -                            1,GUEST_GNTTAB_BASE, 
>>>>>> GUEST_GNTTAB_SIZE);
>>>>>> +                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0, 0);
>>>>>
>>>>> Here you are relying on GUEST_RAM_BANKS == 2. I think this is 
>>>>> pretty fragile as this may change in the future.
>>>>>
>>>>> fdt_property_regs() is not really suitable for here. I would 
>>>>> suggest to create a new helper fdt_property_placeholder() which 
>>>>> takes the address cells, size cells and the number of ranges. The 
>>>>> function will then create N range that are zeroed.
>>>>
>>>> You are right. Probably, we could add an assert here for now to be 
>>>> triggered if "GUEST_RAM_BANKS != 2"?
>>>> But, if you still think we need to make it with the helper right 
>>>> now, I will. However, I don't know how long this will take.
>>>
>>> I would prefer if we introduce the helper now. Below a potential 
>>> version (not compiled):
>>
>>
>> You wouldn't believe)))
>> I decided to not wait for the answer and re-check. So, I ended up 
>> with the similar to what you suggested below. Thank you.
>> Yes, it will work if add missing locals. However, I initially named 
>> it exactly as was suggested (fdt_property_placeholder) and got a 
>> compilation error, since fdt_property_placeholder is already present.
>> So, I was thinking to choose another name or to even open-code it, 
>> but I see you already proposed new name fdt_property_reg_placeholder.
>
> Ah I didn't realize that libfdt already implemented 
> fdt_property_placeholder(). The function sounds perfect for us (and 
> the code is nicer :)), but unfortunately this was introdcued only in 
> 2017. So it is possible that some distros may not ship the last libfdt.
>
> So for now, I think we need to implement our own. In a follow-up we 
> could try to provide a compat layer as we did for other missing fdt_* 
> helpers.
>
> Cheers,


Well, I hope that I addressed all your comments. If so, I will prepare V7.


diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index e3140a6..a780155 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -269,6 +269,21 @@ static int fdt_property_regs(libxl__gc *gc, void *fdt,
      return fdt_property(fdt, "reg", regs, sizeof(regs));
  }

+static int fdt_property_reg_placeholder(libxl__gc *gc, void *fdt,
+                                        unsigned int addr_cells,
+                                        unsigned int size_cells,
+                                        unsigned int num_regs)
+{
+    uint32_t regs[num_regs * (addr_cells + size_cells)];
+    be32 *cells = &regs[0];
+    unsigned int i;
+
+    for (i = 0; i < num_regs; i++)
+        set_range(&cells, addr_cells, size_cells, 0, 0);
+
+    return fdt_property(fdt, "reg", regs, sizeof(regs));
+}
+
  static int make_root_properties(libxl__gc *gc,
                                  const libxl_version_info *vers,
                                  void *fdt)
@@ -615,9 +630,13 @@ static int make_hypervisor_node(libxl__gc *gc, void 
*fdt,
                                "xen,xen");
      if (res) return res;

-    /* reg 0 is grant table space */
-    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
GUEST_ROOT_SIZE_CELLS,
-                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
+    /*
+     * reg 0 is a placeholder for grant table space, reg 1...N are
+     * the placeholders for extended regions.
+     */
+    res = fdt_property_reg_placeholder(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
+                                       GUEST_ROOT_SIZE_CELLS,
+                                       GUEST_RAM_BANKS + 1);
      if (res) return res;

      /*
@@ -1069,20 +1088,93 @@ static void finalise_one_node(libxl__gc *gc, 
void *fdt, const char *uname,
      }
  }

+#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
+
+#define EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 64MB */
+
+static int finalize_hypervisor_node(libxl__gc *gc, struct xc_dom_image 
*dom)
+{
+    void *fdt = dom->devicetree_blob;
+    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
region_base[GUEST_RAM_BANKS],
+        bankend[GUEST_RAM_BANKS];
+    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
+                  (GUEST_RAM_BANKS + 1)];
+    be32 *cells = &regs[0];
+    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
+    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
+    unsigned int i, len, nr_regions = 0;
+    libxl_dominfo info;
+    int offset, rc;
+
+    offset = fdt_path_offset(fdt, "/hypervisor");
+    if (offset < 0)
+        return offset;
+
+    rc = libxl_domain_info(CTX, &info, dom->guest_domid);
+    if (rc)
+        return rc;
+
+    if (info.gpaddr_bits > 64)
+        return ERROR_INVAL;
+
+    /*
+     * Try to allocate separate 2MB-aligned extended regions from the first
+     * and second RAM banks taking into the account the maximum supported
+     * guest physical address space size and the amount of memory assigned
+     * to the guest.
+     */
+    for (i = 0; i < GUEST_RAM_BANKS; i++) {
+        region_base[i] = bankbase[i] +
+            ALIGN_UP_TO_2MB((uint64_t)dom->rambank_size[i] << 
XC_PAGE_SHIFT);
+
+        bankend[i] = ~0ULL >> (64 - info.gpaddr_bits);
+        bankend[i] = min(bankend[i], bankbase[i] + banksize[i] - 1);
+        if (bankend[i] > region_base[i])
+            region_size[i] = bankend[i] - region_base[i] + 1;
+    }
+
+    /*
+     * The region 0 for grant table space must be always present. If we 
managed
+     * to allocate the extended regions then insert them as regions 1...N.
+     */
+    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
+              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
+
+    for (i = 0; i < GUEST_RAM_BANKS; i++) {
+        if (region_size[i] < EXT_REGION_MIN_SIZE)
+            continue;
+
+        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"",
+            nr_regions, region_base[i], region_base[i] + region_size[i]);
+
+        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
+                  region_base[i], region_size[i]);
+        nr_regions++;
+    }
+
+    if (!nr_regions)
+        LOG(WARN, "The extended regions cannot be allocated, not enough 
space");
+
+    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS + 
GUEST_ROOT_SIZE_CELLS) *
+        (nr_regions + 1);
+
+    return fdt_setprop(fdt, offset, "reg", regs, len);
+}
+
  int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
                                                 uint32_t domid,
libxl_domain_config *d_config,
                                                 struct xc_dom_image *dom)
  {
      void *fdt = dom->devicetree_blob;
-    int i;
+    int i, res;
      const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;

      const struct xc_dom_seg *ramdisk = dom->modules[0].blob ?
          &dom->modules[0].seg : NULL;

      if (ramdisk) {
-        int chosen, res;
+        int chosen;
          uint64_t val;

          /* Neither the fdt_path_offset() nor either of the
@@ -1109,6 +1201,10 @@ int 
libxl__arch_domain_finalise_hw_description(libxl__gc *gc,

      }

+    res = finalize_hypervisor_node(gc, dom);
+    if (res)
+        return res;
+
      for (i = 0; i < GUEST_RAM_BANKS; i++) {
          const uint64_t size = (uint64_t)dom->rambank_size[i] << 
XC_PAGE_SHIFT;

diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index d46c61f..96ead3d 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -438,6 +438,11 @@ typedef uint64_t xen_callback_t;

  #define GUEST_RAM_BANKS   2

+/*
+ * The way to find the extended regions (to be exposed to the guest as 
unused
+ * address space) relies on the fact that the regions reserved for the RAM
+ * below are big enough to also accommodate such regions.
+ */
  #define GUEST_RAM0_BASE   xen_mk_ullong(0x40000000) /* 3GB of low RAM 
@ 1GB */
  #define GUEST_RAM0_SIZE   xen_mk_ullong(0xc0000000)

(END)
Oleksandr Tyshchenko Oct. 13, 2021, 8:19 p.m. UTC | #17
Hi Stefano,


On 13.10.21 21:26, Oleksandr wrote:
>
> On 13.10.21 20:07, Julien Grall wrote:
>
> Hi Julien
>
>> Hi,
>>
>> On 13/10/2021 17:44, Oleksandr wrote:
>>> On 13.10.21 19:24, Julien Grall wrote:
>>>> On 13/10/2021 16:49, Oleksandr wrote:
>>>>>
>>>>> On 13.10.21 18:15, Julien Grall wrote:
>>>>>> On 13/10/2021 14:46, Oleksandr wrote:
>>>>>>>
>>>>>>> Hi Julien
>>>>>>
>>>>>> Hi Oleksandr,
>>>>>
>>>>> Hi Julien
>>>>>
>>>>>
>>>>> Thank you for the prompt response.
>>>>>
>>>>>
>>>>>>
>>>>>>> On 13.10.21 00:43, Oleksandr wrote:
>>>>>>> diff --git a/tools/libs/light/libxl_arm.c 
>>>>>>> b/tools/libs/light/libxl_arm.c
>>>>>>> index e3140a6..53ae0f3 100644
>>>>>>> --- a/tools/libs/light/libxl_arm.c
>>>>>>> +++ b/tools/libs/light/libxl_arm.c
>>>>>>> @@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc 
>>>>>>> *gc, void *fdt,
>>>>>>>                                 "xen,xen");
>>>>>>>       if (res) return res;
>>>>>>>
>>>>>>> -    /* reg 0 is grant table space */
>>>>>>> +    /*
>>>>>>> +     * reg 0 is a placeholder for grant table space, reg 1...N are
>>>>>>> +     * the placeholders for extended regions.
>>>>>>> +     */
>>>>>>>       res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
>>>>>>> GUEST_ROOT_SIZE_CELLS,
>>>>>>> -                            1,GUEST_GNTTAB_BASE, 
>>>>>>> GUEST_GNTTAB_SIZE);
>>>>>>> +                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0, 
>>>>>>> 0);
>>>>>>
>>>>>> Here you are relying on GUEST_RAM_BANKS == 2. I think this is 
>>>>>> pretty fragile as this may change in the future.
>>>>>>
>>>>>> fdt_property_regs() is not really suitable for here. I would 
>>>>>> suggest to create a new helper fdt_property_placeholder() which 
>>>>>> takes the address cells, size cells and the number of ranges. The 
>>>>>> function will then create N range that are zeroed.
>>>>>
>>>>> You are right. Probably, we could add an assert here for now to be 
>>>>> triggered if "GUEST_RAM_BANKS != 2"?
>>>>> But, if you still think we need to make it with the helper right 
>>>>> now, I will. However, I don't know how long this will take.
>>>>
>>>> I would prefer if we introduce the helper now. Below a potential 
>>>> version (not compiled):
>>>
>>>
>>> You wouldn't believe)))
>>> I decided to not wait for the answer and re-check. So, I ended up 
>>> with the similar to what you suggested below. Thank you.
>>> Yes, it will work if add missing locals. However, I initially named 
>>> it exactly as was suggested (fdt_property_placeholder) and got a 
>>> compilation error, since fdt_property_placeholder is already present.
>>> So, I was thinking to choose another name or to even open-code it, 
>>> but I see you already proposed new name fdt_property_reg_placeholder.
>>
>> Ah I didn't realize that libfdt already implemented 
>> fdt_property_placeholder(). The function sounds perfect for us (and 
>> the code is nicer :)), but unfortunately this was introdcued only in 
>> 2017. So it is possible that some distros may not ship the last libfdt.
>>
>> So for now, I think we need to implement our own. In a follow-up we 
>> could try to provide a compat layer as we did for other missing fdt_* 
>> helpers.
>>
>> Cheers,
>
>
> Well, I hope that I addressed all your comments. If so, I will prepare V7.


May I please ask, are you *preliminary* OK with new changes requested by 
Julien? The main changes are to bring finalize_*() back (hopefully there 
is a way how to downsize a placeholder), avoid relying on the fact that 
Bank 0 is always below 4GB, adding a convenient helper to create a 
placeholder for N ranges plus some code hardening, etc.


>
>
>
> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> index e3140a6..a780155 100644
> --- a/tools/libs/light/libxl_arm.c
> +++ b/tools/libs/light/libxl_arm.c
> @@ -269,6 +269,21 @@ static int fdt_property_regs(libxl__gc *gc, void 
> *fdt,
>      return fdt_property(fdt, "reg", regs, sizeof(regs));
>  }
>
> +static int fdt_property_reg_placeholder(libxl__gc *gc, void *fdt,
> +                                        unsigned int addr_cells,
> +                                        unsigned int size_cells,
> +                                        unsigned int num_regs)
> +{
> +    uint32_t regs[num_regs * (addr_cells + size_cells)];
> +    be32 *cells = &regs[0];
> +    unsigned int i;
> +
> +    for (i = 0; i < num_regs; i++)
> +        set_range(&cells, addr_cells, size_cells, 0, 0);
> +
> +    return fdt_property(fdt, "reg", regs, sizeof(regs));
> +}
> +
>  static int make_root_properties(libxl__gc *gc,
>                                  const libxl_version_info *vers,
>                                  void *fdt)
> @@ -615,9 +630,13 @@ static int make_hypervisor_node(libxl__gc *gc, 
> void *fdt,
>                                "xen,xen");
>      if (res) return res;
>
> -    /* reg 0 is grant table space */
> -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, 
> GUEST_ROOT_SIZE_CELLS,
> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> +    /*
> +     * reg 0 is a placeholder for grant table space, reg 1...N are
> +     * the placeholders for extended regions.
> +     */
> +    res = fdt_property_reg_placeholder(gc, fdt, 
> GUEST_ROOT_ADDRESS_CELLS,
> +                                       GUEST_ROOT_SIZE_CELLS,
> +                                       GUEST_RAM_BANKS + 1);
>      if (res) return res;
>
>      /*
> @@ -1069,20 +1088,93 @@ static void finalise_one_node(libxl__gc *gc, 
> void *fdt, const char *uname,
>      }
>  }
>
> +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
> +
> +#define EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 64MB */
> +
> +static int finalize_hypervisor_node(libxl__gc *gc, struct 
> xc_dom_image *dom)
> +{
> +    void *fdt = dom->devicetree_blob;
> +    uint64_t region_size[GUEST_RAM_BANKS] = {0}, 
> region_base[GUEST_RAM_BANKS],
> +        bankend[GUEST_RAM_BANKS];
> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
> +                  (GUEST_RAM_BANKS + 1)];
> +    be32 *cells = &regs[0];
> +    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> +    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> +    unsigned int i, len, nr_regions = 0;
> +    libxl_dominfo info;
> +    int offset, rc;
> +
> +    offset = fdt_path_offset(fdt, "/hypervisor");
> +    if (offset < 0)
> +        return offset;
> +
> +    rc = libxl_domain_info(CTX, &info, dom->guest_domid);
> +    if (rc)
> +        return rc;
> +
> +    if (info.gpaddr_bits > 64)
> +        return ERROR_INVAL;
> +
> +    /*
> +     * Try to allocate separate 2MB-aligned extended regions from the 
> first
> +     * and second RAM banks taking into the account the maximum 
> supported
> +     * guest physical address space size and the amount of memory 
> assigned
> +     * to the guest.
> +     */
> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
> +        region_base[i] = bankbase[i] +
> +            ALIGN_UP_TO_2MB((uint64_t)dom->rambank_size[i] << 
> XC_PAGE_SHIFT);
> +
> +        bankend[i] = ~0ULL >> (64 - info.gpaddr_bits);
> +        bankend[i] = min(bankend[i], bankbase[i] + banksize[i] - 1);
> +        if (bankend[i] > region_base[i])
> +            region_size[i] = bankend[i] - region_base[i] + 1;
> +    }
> +
> +    /*
> +     * The region 0 for grant table space must be always present. If 
> we managed
> +     * to allocate the extended regions then insert them as regions 
> 1...N.
> +     */
> +    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> +              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> +
> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
> +        if (region_size[i] < EXT_REGION_MIN_SIZE)
> +            continue;
> +
> +        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"",
> +            nr_regions, region_base[i], region_base[i] + 
> region_size[i]);
> +
> +        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, 
> GUEST_ROOT_SIZE_CELLS,
> +                  region_base[i], region_size[i]);
> +        nr_regions++;
> +    }
> +
> +    if (!nr_regions)
> +        LOG(WARN, "The extended regions cannot be allocated, not 
> enough space");
> +
> +    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS + 
> GUEST_ROOT_SIZE_CELLS) *
> +        (nr_regions + 1);
> +
> +    return fdt_setprop(fdt, offset, "reg", regs, len);
> +}
> +
>  int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
>                                                 uint32_t domid,
> libxl_domain_config *d_config,
>                                                 struct xc_dom_image *dom)
>  {
>      void *fdt = dom->devicetree_blob;
> -    int i;
> +    int i, res;
>      const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>
>      const struct xc_dom_seg *ramdisk = dom->modules[0].blob ?
>          &dom->modules[0].seg : NULL;
>
>      if (ramdisk) {
> -        int chosen, res;
> +        int chosen;
>          uint64_t val;
>
>          /* Neither the fdt_path_offset() nor either of the
> @@ -1109,6 +1201,10 @@ int 
> libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
>
>      }
>
> +    res = finalize_hypervisor_node(gc, dom);
> +    if (res)
> +        return res;
> +
>      for (i = 0; i < GUEST_RAM_BANKS; i++) {
>          const uint64_t size = (uint64_t)dom->rambank_size[i] << 
> XC_PAGE_SHIFT;
>
> diff --git a/xen/include/public/arch-arm.h 
> b/xen/include/public/arch-arm.h
> index d46c61f..96ead3d 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -438,6 +438,11 @@ typedef uint64_t xen_callback_t;
>
>  #define GUEST_RAM_BANKS   2
>
> +/*
> + * The way to find the extended regions (to be exposed to the guest 
> as unused
> + * address space) relies on the fact that the regions reserved for 
> the RAM
> + * below are big enough to also accommodate such regions.
> + */
>  #define GUEST_RAM0_BASE   xen_mk_ullong(0x40000000) /* 3GB of low RAM 
> @ 1GB */
>  #define GUEST_RAM0_SIZE   xen_mk_ullong(0xc0000000)
>
> (END)
>
>
Stefano Stabellini Oct. 13, 2021, 8:24 p.m. UTC | #18
On Wed, 13 Oct 2021, Oleksandr wrote:
> On 13.10.21 21:26, Oleksandr wrote:
> > 
> > On 13.10.21 20:07, Julien Grall wrote:
> > 
> > Hi Julien
> > 
> > > Hi,
> > > 
> > > On 13/10/2021 17:44, Oleksandr wrote:
> > > > On 13.10.21 19:24, Julien Grall wrote:
> > > > > On 13/10/2021 16:49, Oleksandr wrote:
> > > > > > 
> > > > > > On 13.10.21 18:15, Julien Grall wrote:
> > > > > > > On 13/10/2021 14:46, Oleksandr wrote:
> > > > > > > > 
> > > > > > > > Hi Julien
> > > > > > > 
> > > > > > > Hi Oleksandr,
> > > > > > 
> > > > > > Hi Julien
> > > > > > 
> > > > > > 
> > > > > > Thank you for the prompt response.
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > > On 13.10.21 00:43, Oleksandr wrote:
> > > > > > > > diff --git a/tools/libs/light/libxl_arm.c
> > > > > > > > b/tools/libs/light/libxl_arm.c
> > > > > > > > index e3140a6..53ae0f3 100644
> > > > > > > > --- a/tools/libs/light/libxl_arm.c
> > > > > > > > +++ b/tools/libs/light/libxl_arm.c
> > > > > > > > @@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc
> > > > > > > > *gc, void *fdt,
> > > > > > > >                                 "xen,xen");
> > > > > > > >       if (res) return res;
> > > > > > > > 
> > > > > > > > -    /* reg 0 is grant table space */
> > > > > > > > +    /*
> > > > > > > > +     * reg 0 is a placeholder for grant table space, reg 1...N
> > > > > > > > are
> > > > > > > > +     * the placeholders for extended regions.
> > > > > > > > +     */
> > > > > > > >       res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
> > > > > > > > GUEST_ROOT_SIZE_CELLS,
> > > > > > > > -                            1,GUEST_GNTTAB_BASE,
> > > > > > > > GUEST_GNTTAB_SIZE);
> > > > > > > > +                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0,
> > > > > > > > 0);
> > > > > > > 
> > > > > > > Here you are relying on GUEST_RAM_BANKS == 2. I think this is
> > > > > > > pretty fragile as this may change in the future.
> > > > > > > 
> > > > > > > fdt_property_regs() is not really suitable for here. I would
> > > > > > > suggest to create a new helper fdt_property_placeholder() which
> > > > > > > takes the address cells, size cells and the number of ranges. The
> > > > > > > function will then create N range that are zeroed.
> > > > > > 
> > > > > > You are right. Probably, we could add an assert here for now to be
> > > > > > triggered if "GUEST_RAM_BANKS != 2"?
> > > > > > But, if you still think we need to make it with the helper right
> > > > > > now, I will. However, I don't know how long this will take.
> > > > > 
> > > > > I would prefer if we introduce the helper now. Below a potential
> > > > > version (not compiled):
> > > > 
> > > > 
> > > > You wouldn't believe)))
> > > > I decided to not wait for the answer and re-check. So, I ended up with
> > > > the similar to what you suggested below. Thank you.
> > > > Yes, it will work if add missing locals. However, I initially named it
> > > > exactly as was suggested (fdt_property_placeholder) and got a
> > > > compilation error, since fdt_property_placeholder is already present.
> > > > So, I was thinking to choose another name or to even open-code it, but I
> > > > see you already proposed new name fdt_property_reg_placeholder.
> > > 
> > > Ah I didn't realize that libfdt already implemented
> > > fdt_property_placeholder(). The function sounds perfect for us (and the
> > > code is nicer :)), but unfortunately this was introdcued only in 2017. So
> > > it is possible that some distros may not ship the last libfdt.
> > > 
> > > So for now, I think we need to implement our own. In a follow-up we could
> > > try to provide a compat layer as we did for other missing fdt_* helpers.
> > > 
> > > Cheers,
> > 
> > 
> > Well, I hope that I addressed all your comments. If so, I will prepare V7.
> 
> 
> May I please ask, are you *preliminary* OK with new changes requested by
> Julien? The main changes are to bring finalize_*() back (hopefully there is a
> way how to downsize a placeholder), avoid relying on the fact that Bank 0 is
> always below 4GB, adding a convenient helper to create a placeholder for N
> ranges plus some code hardening, etc.

Yes, I am OK with it


> > diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
> > index e3140a6..a780155 100644
> > --- a/tools/libs/light/libxl_arm.c
> > +++ b/tools/libs/light/libxl_arm.c
> > @@ -269,6 +269,21 @@ static int fdt_property_regs(libxl__gc *gc, void *fdt,
> >      return fdt_property(fdt, "reg", regs, sizeof(regs));
> >  }
> > 
> > +static int fdt_property_reg_placeholder(libxl__gc *gc, void *fdt,
> > +                                        unsigned int addr_cells,
> > +                                        unsigned int size_cells,
> > +                                        unsigned int num_regs)
> > +{
> > +    uint32_t regs[num_regs * (addr_cells + size_cells)];
> > +    be32 *cells = &regs[0];
> > +    unsigned int i;
> > +
> > +    for (i = 0; i < num_regs; i++)
> > +        set_range(&cells, addr_cells, size_cells, 0, 0);
> > +
> > +    return fdt_property(fdt, "reg", regs, sizeof(regs));
> > +}
> > +
> >  static int make_root_properties(libxl__gc *gc,
> >                                  const libxl_version_info *vers,
> >                                  void *fdt)
> > @@ -615,9 +630,13 @@ static int make_hypervisor_node(libxl__gc *gc, void
> > *fdt,
> >                                "xen,xen");
> >      if (res) return res;
> > 
> > -    /* reg 0 is grant table space */
> > -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
> > GUEST_ROOT_SIZE_CELLS,
> > -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> > +    /*
> > +     * reg 0 is a placeholder for grant table space, reg 1...N are
> > +     * the placeholders for extended regions.
> > +     */
> > +    res = fdt_property_reg_placeholder(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
> > +                                       GUEST_ROOT_SIZE_CELLS,
> > +                                       GUEST_RAM_BANKS + 1);
> >      if (res) return res;
> > 
> >      /*
> > @@ -1069,20 +1088,93 @@ static void finalise_one_node(libxl__gc *gc, void
> > *fdt, const char *uname,
> >      }
> >  }
> > 
> > +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
> > +
> > +#define EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 64MB */
> > +
> > +static int finalize_hypervisor_node(libxl__gc *gc, struct xc_dom_image
> > *dom)
> > +{
> > +    void *fdt = dom->devicetree_blob;
> > +    uint64_t region_size[GUEST_RAM_BANKS] = {0},
> > region_base[GUEST_RAM_BANKS],
> > +        bankend[GUEST_RAM_BANKS];
> > +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
> > +                  (GUEST_RAM_BANKS + 1)];
> > +    be32 *cells = &regs[0];
> > +    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> > +    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
> > +    unsigned int i, len, nr_regions = 0;
> > +    libxl_dominfo info;
> > +    int offset, rc;
> > +
> > +    offset = fdt_path_offset(fdt, "/hypervisor");
> > +    if (offset < 0)
> > +        return offset;
> > +
> > +    rc = libxl_domain_info(CTX, &info, dom->guest_domid);
> > +    if (rc)
> > +        return rc;
> > +
> > +    if (info.gpaddr_bits > 64)
> > +        return ERROR_INVAL;
> > +
> > +    /*
> > +     * Try to allocate separate 2MB-aligned extended regions from the first
> > +     * and second RAM banks taking into the account the maximum supported
> > +     * guest physical address space size and the amount of memory assigned
> > +     * to the guest.
> > +     */
> > +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
> > +        region_base[i] = bankbase[i] +
> > +            ALIGN_UP_TO_2MB((uint64_t)dom->rambank_size[i] <<
> > XC_PAGE_SHIFT);
> > +
> > +        bankend[i] = ~0ULL >> (64 - info.gpaddr_bits);
> > +        bankend[i] = min(bankend[i], bankbase[i] + banksize[i] - 1);
> > +        if (bankend[i] > region_base[i])
> > +            region_size[i] = bankend[i] - region_base[i] + 1;
> > +    }
> > +
> > +    /*
> > +     * The region 0 for grant table space must be always present. If we
> > managed
> > +     * to allocate the extended regions then insert them as regions 1...N.
> > +     */
> > +    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> > +              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
> > +
> > +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
> > +        if (region_size[i] < EXT_REGION_MIN_SIZE)
> > +            continue;
> > +
> > +        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"",
> > +            nr_regions, region_base[i], region_base[i] + region_size[i]);
> > +
> > +        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
> > +                  region_base[i], region_size[i]);
> > +        nr_regions++;
> > +    }
> > +
> > +    if (!nr_regions)
> > +        LOG(WARN, "The extended regions cannot be allocated, not enough
> > space");
> > +
> > +    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS +
> > GUEST_ROOT_SIZE_CELLS) *
> > +        (nr_regions + 1);
> > +
> > +    return fdt_setprop(fdt, offset, "reg", regs, len);
> > +}
> > +
> >  int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
> >                                                 uint32_t domid,
> > libxl_domain_config *d_config,
> >                                                 struct xc_dom_image *dom)
> >  {
> >      void *fdt = dom->devicetree_blob;
> > -    int i;
> > +    int i, res;
> >      const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
> > 
> >      const struct xc_dom_seg *ramdisk = dom->modules[0].blob ?
> >          &dom->modules[0].seg : NULL;
> > 
> >      if (ramdisk) {
> > -        int chosen, res;
> > +        int chosen;
> >          uint64_t val;
> > 
> >          /* Neither the fdt_path_offset() nor either of the
> > @@ -1109,6 +1201,10 @@ int
> > libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
> > 
> >      }
> > 
> > +    res = finalize_hypervisor_node(gc, dom);
> > +    if (res)
> > +        return res;
> > +
> >      for (i = 0; i < GUEST_RAM_BANKS; i++) {
> >          const uint64_t size = (uint64_t)dom->rambank_size[i] <<
> > XC_PAGE_SHIFT;
> > 
> > diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> > index d46c61f..96ead3d 100644
> > --- a/xen/include/public/arch-arm.h
> > +++ b/xen/include/public/arch-arm.h
> > @@ -438,6 +438,11 @@ typedef uint64_t xen_callback_t;
> > 
> >  #define GUEST_RAM_BANKS   2
> > 
> > +/*
> > + * The way to find the extended regions (to be exposed to the guest as
> > unused
> > + * address space) relies on the fact that the regions reserved for the RAM
> > + * below are big enough to also accommodate such regions.
> > + */
> >  #define GUEST_RAM0_BASE   xen_mk_ullong(0x40000000) /* 3GB of low RAM @ 1GB
> > */
> >  #define GUEST_RAM0_SIZE   xen_mk_ullong(0xc0000000)
> > 
> > (END)
> > 
> > 
> -- 
> Regards,
> 
> Oleksandr Tyshchenko
>
Oleksandr Tyshchenko Oct. 14, 2021, 11:44 a.m. UTC | #19
Hi Stefano


On 13.10.21 23:24, Stefano Stabellini wrote:
> On Wed, 13 Oct 2021, Oleksandr wrote:
>> On 13.10.21 21:26, Oleksandr wrote:
>>> On 13.10.21 20:07, Julien Grall wrote:
>>>
>>> Hi Julien
>>>
>>>> Hi,
>>>>
>>>> On 13/10/2021 17:44, Oleksandr wrote:
>>>>> On 13.10.21 19:24, Julien Grall wrote:
>>>>>> On 13/10/2021 16:49, Oleksandr wrote:
>>>>>>> On 13.10.21 18:15, Julien Grall wrote:
>>>>>>>> On 13/10/2021 14:46, Oleksandr wrote:
>>>>>>>>> Hi Julien
>>>>>>>> Hi Oleksandr,
>>>>>>> Hi Julien
>>>>>>>
>>>>>>>
>>>>>>> Thank you for the prompt response.
>>>>>>>
>>>>>>>
>>>>>>>>> On 13.10.21 00:43, Oleksandr wrote:
>>>>>>>>> diff --git a/tools/libs/light/libxl_arm.c
>>>>>>>>> b/tools/libs/light/libxl_arm.c
>>>>>>>>> index e3140a6..53ae0f3 100644
>>>>>>>>> --- a/tools/libs/light/libxl_arm.c
>>>>>>>>> +++ b/tools/libs/light/libxl_arm.c
>>>>>>>>> @@ -615,9 +615,12 @@ static int make_hypervisor_node(libxl__gc
>>>>>>>>> *gc, void *fdt,
>>>>>>>>>                                  "xen,xen");
>>>>>>>>>        if (res) return res;
>>>>>>>>>
>>>>>>>>> -    /* reg 0 is grant table space */
>>>>>>>>> +    /*
>>>>>>>>> +     * reg 0 is a placeholder for grant table space, reg 1...N
>>>>>>>>> are
>>>>>>>>> +     * the placeholders for extended regions.
>>>>>>>>> +     */
>>>>>>>>>        res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
>>>>>>>>> GUEST_ROOT_SIZE_CELLS,
>>>>>>>>> -                            1,GUEST_GNTTAB_BASE,
>>>>>>>>> GUEST_GNTTAB_SIZE);
>>>>>>>>> +                            GUEST_RAM_BANKS + 1, 0, 0, 0, 0, 0,
>>>>>>>>> 0);
>>>>>>>> Here you are relying on GUEST_RAM_BANKS == 2. I think this is
>>>>>>>> pretty fragile as this may change in the future.
>>>>>>>>
>>>>>>>> fdt_property_regs() is not really suitable for here. I would
>>>>>>>> suggest to create a new helper fdt_property_placeholder() which
>>>>>>>> takes the address cells, size cells and the number of ranges. The
>>>>>>>> function will then create N range that are zeroed.
>>>>>>> You are right. Probably, we could add an assert here for now to be
>>>>>>> triggered if "GUEST_RAM_BANKS != 2"?
>>>>>>> But, if you still think we need to make it with the helper right
>>>>>>> now, I will. However, I don't know how long this will take.
>>>>>> I would prefer if we introduce the helper now. Below a potential
>>>>>> version (not compiled):
>>>>>
>>>>> You wouldn't believe)))
>>>>> I decided to not wait for the answer and re-check. So, I ended up with
>>>>> the similar to what you suggested below. Thank you.
>>>>> Yes, it will work if add missing locals. However, I initially named it
>>>>> exactly as was suggested (fdt_property_placeholder) and got a
>>>>> compilation error, since fdt_property_placeholder is already present.
>>>>> So, I was thinking to choose another name or to even open-code it, but I
>>>>> see you already proposed new name fdt_property_reg_placeholder.
>>>> Ah I didn't realize that libfdt already implemented
>>>> fdt_property_placeholder(). The function sounds perfect for us (and the
>>>> code is nicer :)), but unfortunately this was introdcued only in 2017. So
>>>> it is possible that some distros may not ship the last libfdt.
>>>>
>>>> So for now, I think we need to implement our own. In a follow-up we could
>>>> try to provide a compat layer as we did for other missing fdt_* helpers.
>>>>
>>>> Cheers,
>>>
>>> Well, I hope that I addressed all your comments. If so, I will prepare V7.
>>
>> May I please ask, are you *preliminary* OK with new changes requested by
>> Julien? The main changes are to bring finalize_*() back (hopefully there is a
>> way how to downsize a placeholder), avoid relying on the fact that Bank 0 is
>> always below 4GB, adding a convenient helper to create a placeholder for N
>> ranges plus some code hardening, etc.
> Yes, I am OK with it

Thank you for confirming. I have just pushed V7:

https://lore.kernel.org/xen-devel/1634211645-26912-1-git-send-email-olekstysh@gmail.com/


>
>
>>> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
>>> index e3140a6..a780155 100644
>>> --- a/tools/libs/light/libxl_arm.c
>>> +++ b/tools/libs/light/libxl_arm.c
>>> @@ -269,6 +269,21 @@ static int fdt_property_regs(libxl__gc *gc, void *fdt,
>>>       return fdt_property(fdt, "reg", regs, sizeof(regs));
>>>   }
>>>
>>> +static int fdt_property_reg_placeholder(libxl__gc *gc, void *fdt,
>>> +                                        unsigned int addr_cells,
>>> +                                        unsigned int size_cells,
>>> +                                        unsigned int num_regs)
>>> +{
>>> +    uint32_t regs[num_regs * (addr_cells + size_cells)];
>>> +    be32 *cells = &regs[0];
>>> +    unsigned int i;
>>> +
>>> +    for (i = 0; i < num_regs; i++)
>>> +        set_range(&cells, addr_cells, size_cells, 0, 0);
>>> +
>>> +    return fdt_property(fdt, "reg", regs, sizeof(regs));
>>> +}
>>> +
>>>   static int make_root_properties(libxl__gc *gc,
>>>                                   const libxl_version_info *vers,
>>>                                   void *fdt)
>>> @@ -615,9 +630,13 @@ static int make_hypervisor_node(libxl__gc *gc, void
>>> *fdt,
>>>                                 "xen,xen");
>>>       if (res) return res;
>>>
>>> -    /* reg 0 is grant table space */
>>> -    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
>>> GUEST_ROOT_SIZE_CELLS,
>>> -                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>>> +    /*
>>> +     * reg 0 is a placeholder for grant table space, reg 1...N are
>>> +     * the placeholders for extended regions.
>>> +     */
>>> +    res = fdt_property_reg_placeholder(gc, fdt, GUEST_ROOT_ADDRESS_CELLS,
>>> +                                       GUEST_ROOT_SIZE_CELLS,
>>> +                                       GUEST_RAM_BANKS + 1);
>>>       if (res) return res;
>>>
>>>       /*
>>> @@ -1069,20 +1088,93 @@ static void finalise_one_node(libxl__gc *gc, void
>>> *fdt, const char *uname,
>>>       }
>>>   }
>>>
>>> +#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
>>> +
>>> +#define EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 64MB */
>>> +
>>> +static int finalize_hypervisor_node(libxl__gc *gc, struct xc_dom_image
>>> *dom)
>>> +{
>>> +    void *fdt = dom->devicetree_blob;
>>> +    uint64_t region_size[GUEST_RAM_BANKS] = {0},
>>> region_base[GUEST_RAM_BANKS],
>>> +        bankend[GUEST_RAM_BANKS];
>>> +    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
>>> +                  (GUEST_RAM_BANKS + 1)];
>>> +    be32 *cells = &regs[0];
>>> +    const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>> +    const uint64_t banksize[] = GUEST_RAM_BANK_SIZES;
>>> +    unsigned int i, len, nr_regions = 0;
>>> +    libxl_dominfo info;
>>> +    int offset, rc;
>>> +
>>> +    offset = fdt_path_offset(fdt, "/hypervisor");
>>> +    if (offset < 0)
>>> +        return offset;
>>> +
>>> +    rc = libxl_domain_info(CTX, &info, dom->guest_domid);
>>> +    if (rc)
>>> +        return rc;
>>> +
>>> +    if (info.gpaddr_bits > 64)
>>> +        return ERROR_INVAL;
>>> +
>>> +    /*
>>> +     * Try to allocate separate 2MB-aligned extended regions from the first
>>> +     * and second RAM banks taking into the account the maximum supported
>>> +     * guest physical address space size and the amount of memory assigned
>>> +     * to the guest.
>>> +     */
>>> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
>>> +        region_base[i] = bankbase[i] +
>>> +            ALIGN_UP_TO_2MB((uint64_t)dom->rambank_size[i] <<
>>> XC_PAGE_SHIFT);
>>> +
>>> +        bankend[i] = ~0ULL >> (64 - info.gpaddr_bits);
>>> +        bankend[i] = min(bankend[i], bankbase[i] + banksize[i] - 1);
>>> +        if (bankend[i] > region_base[i])
>>> +            region_size[i] = bankend[i] - region_base[i] + 1;
>>> +    }
>>> +
>>> +    /*
>>> +     * The region 0 for grant table space must be always present. If we
>>> managed
>>> +     * to allocate the extended regions then insert them as regions 1...N.
>>> +     */
>>> +    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
>>> +              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
>>> +
>>> +    for (i = 0; i < GUEST_RAM_BANKS; i++) {
>>> +        if (region_size[i] < EXT_REGION_MIN_SIZE)
>>> +            continue;
>>> +
>>> +        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"",
>>> +            nr_regions, region_base[i], region_base[i] + region_size[i]);
>>> +
>>> +        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
>>> +                  region_base[i], region_size[i]);
>>> +        nr_regions++;
>>> +    }
>>> +
>>> +    if (!nr_regions)
>>> +        LOG(WARN, "The extended regions cannot be allocated, not enough
>>> space");
>>> +
>>> +    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS +
>>> GUEST_ROOT_SIZE_CELLS) *
>>> +        (nr_regions + 1);
>>> +
>>> +    return fdt_setprop(fdt, offset, "reg", regs, len);
>>> +}
>>> +
>>>   int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
>>>                                                  uint32_t domid,
>>> libxl_domain_config *d_config,
>>>                                                  struct xc_dom_image *dom)
>>>   {
>>>       void *fdt = dom->devicetree_blob;
>>> -    int i;
>>> +    int i, res;
>>>       const uint64_t bankbase[] = GUEST_RAM_BANK_BASES;
>>>
>>>       const struct xc_dom_seg *ramdisk = dom->modules[0].blob ?
>>>           &dom->modules[0].seg : NULL;
>>>
>>>       if (ramdisk) {
>>> -        int chosen, res;
>>> +        int chosen;
>>>           uint64_t val;
>>>
>>>           /* Neither the fdt_path_offset() nor either of the
>>> @@ -1109,6 +1201,10 @@ int
>>> libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
>>>
>>>       }
>>>
>>> +    res = finalize_hypervisor_node(gc, dom);
>>> +    if (res)
>>> +        return res;
>>> +
>>>       for (i = 0; i < GUEST_RAM_BANKS; i++) {
>>>           const uint64_t size = (uint64_t)dom->rambank_size[i] <<
>>> XC_PAGE_SHIFT;
>>>
>>> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
>>> index d46c61f..96ead3d 100644
>>> --- a/xen/include/public/arch-arm.h
>>> +++ b/xen/include/public/arch-arm.h
>>> @@ -438,6 +438,11 @@ typedef uint64_t xen_callback_t;
>>>
>>>   #define GUEST_RAM_BANKS   2
>>>
>>> +/*
>>> + * The way to find the extended regions (to be exposed to the guest as
>>> unused
>>> + * address space) relies on the fact that the regions reserved for the RAM
>>> + * below are big enough to also accommodate such regions.
>>> + */
>>>   #define GUEST_RAM0_BASE   xen_mk_ullong(0x40000000) /* 3GB of low RAM @ 1GB
>>> */
>>>   #define GUEST_RAM0_SIZE   xen_mk_ullong(0xc0000000)
>>>
>>> (END)
>>>
>>>
>> -- 
>> Regards,
>>
>> Oleksandr Tyshchenko
diff mbox series

Patch

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index e3140a6..c0e8415 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -598,9 +598,20 @@  static int make_timer_node(libxl__gc *gc, void *fdt,
     return 0;
 }
 
+#define ALIGN_UP_TO_2MB(x)   (((x) + MB(2) - 1) & (~(MB(2) - 1)))
+
 static int make_hypervisor_node(libxl__gc *gc, void *fdt,
-                                const libxl_version_info *vers)
+                                const libxl_version_info *vers,
+                                const libxl_domain_build_info *b_info,
+                                const struct xc_dom_image *dom)
 {
+    uint64_t region_size[GUEST_RAM_BANKS] = {0}, region_base[GUEST_RAM_BANKS],
+        bank1end, ramsize;
+    uint32_t regs[(GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
+                  (GUEST_RAM_BANKS + 1)];
+    be32 *cells = &regs[0];
+    unsigned int i, len, nr_regions = 0;
+    libxl_dominfo info;
     int res;
     gic_interrupt intr;
 
@@ -615,9 +626,64 @@  static int make_hypervisor_node(libxl__gc *gc, void *fdt,
                               "xen,xen");
     if (res) return res;
 
-    /* reg 0 is grant table space */
-    res = fdt_property_regs(gc, fdt, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
-                            1,GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
+    if (strcmp(dom->guest_type, "xen-3.0-aarch64")) {
+        LOG(WARN, "The extended regions are only supported for 64-bit guest currently");
+        goto out;
+    }
+
+    res = libxl_domain_info(CTX, &info, dom->guest_domid);
+    if (res) return res;
+
+    assert(info.gpaddr_bits >= 32 && info.gpaddr_bits <= 48);
+
+    /*
+     * Try to allocate separate 2MB-aligned extended regions from the first
+     * (below 4GB) and second (above 4GB) RAM banks taking into the account
+     * the maximum supported guest physical address space size and the amount
+     * of memory assigned to the guest.
+     * As the guest memory layout is not populated yet we cannot rely on
+     * dom->rambank_size[], so calculate the actual size of both banks using
+     * "max_memkb" value.
+     */
+    ramsize = b_info->max_memkb * 1024;
+    if (ramsize <= GUEST_RAM0_SIZE) {
+        region_base[0] = GUEST_RAM0_BASE + ALIGN_UP_TO_2MB(ramsize);
+        region_size[0] = GUEST_RAM0_SIZE - ALIGN_UP_TO_2MB(ramsize);
+        region_base[1] = GUEST_RAM1_BASE;
+    } else
+        region_base[1] = GUEST_RAM1_BASE +
+            ALIGN_UP_TO_2MB(ramsize - GUEST_RAM0_SIZE);
+
+    bank1end = min(1ULL << info.gpaddr_bits, GUEST_RAM1_BASE + GUEST_RAM1_SIZE);
+    if (bank1end > region_base[1])
+        region_size[1] = bank1end - region_base[1];
+
+out:
+    /*
+     * The region 0 for grant table space must be always present. If we managed
+     * to allocate the extended regions then insert them as regions 1...N.
+     */
+    set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
+              GUEST_GNTTAB_BASE, GUEST_GNTTAB_SIZE);
+
+    for (i = 0; i < GUEST_RAM_BANKS; i++) {
+        if (region_size[i] < GUEST_EXT_REGION_MIN_SIZE)
+            continue;
+
+        LOG(DEBUG, "Extended region %u: %#"PRIx64"->%#"PRIx64"\n",
+            nr_regions, region_base[i], region_base[i] + region_size[i]);
+
+        set_range(&cells, GUEST_ROOT_ADDRESS_CELLS, GUEST_ROOT_SIZE_CELLS,
+                  region_base[i], region_size[i]);
+        nr_regions++;
+    }
+
+    if (!nr_regions)
+        LOG(WARN, "The extended regions cannot be allocated, not enough space");
+
+    len = sizeof(regs[0]) * (GUEST_ROOT_ADDRESS_CELLS + GUEST_ROOT_SIZE_CELLS) *
+        (nr_regions + 1);
+    res = fdt_property(fdt, "reg", regs, len);
     if (res) return res;
 
     /*
@@ -963,7 +1029,7 @@  next_resize:
         }
 
         FDT( make_timer_node(gc, fdt, ainfo, state->clock_frequency) );
-        FDT( make_hypervisor_node(gc, fdt, vers) );
+        FDT( make_hypervisor_node(gc, fdt, vers, info, dom) );
 
         if (info->arch_arm.vuart == LIBXL_VUART_TYPE_SBSA_UART)
             FDT( make_vpl011_uart_node(gc, fdt, ainfo, dom) );
diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index d46c61f..19ca2b0 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -451,6 +451,8 @@  typedef uint64_t xen_callback_t;
 #define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
 #define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }
 
+#define GUEST_EXT_REGION_MIN_SIZE   xen_mk_ullong(0x0004000000) /* 64MB */
+
 /* Current supported guest VCPUs */
 #define GUEST_MAX_VCPUS 128