diff mbox series

xen/device-tree: Allow exact match for overlapping regions

Message ID 20241106134132.2185492-1-luca.fancellu@arm.com (mailing list archive)
State New
Headers show
Series xen/device-tree: Allow exact match for overlapping regions | expand

Commit Message

Luca Fancellu Nov. 6, 2024, 1:41 p.m. UTC
There are some cases where the device tree exposes a memory range
in both /memreserve/ and reserved-memory node, in this case the
current code will stop Xen to boot since it will find that the
latter range is clashing with the already recorded /memreserve/
ranges.

Furthermore, u-boot lists boot modules ranges, such as ramdisk,
in the /memreserve/ part and even in this case this will prevent
Xen to boot since it will see that the module memory range that
it is going to add in 'add_boot_module' clashes with a /memreserve/
range.

When Xen populate the data structure that tracks the memory ranges,
it also adds a memory type described in 'enum membank_type', so
in order to fix this behavior, allow the 'check_reserved_regions_overlap'
function to check for exact memory range match given a specific memory
type; allowing reserved-memory node ranges and boot modules to have an
exact match with ranges from /memreserve/.

While there, set a type for the memory recorded during ACPI boot.

Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
I tested this patch adding the same range in a /memreserve/ entry and
/reserved-memory node, and by letting u-boot pass a ramdisk.
I've also tested that a configuration running static shared memory still works
fine.
---
 xen/arch/arm/efi/efi-boot.h       |  3 +-
 xen/arch/arm/static-shmem.c       |  2 +-
 xen/common/device-tree/bootfdt.c  |  9 +++++-
 xen/common/device-tree/bootinfo.c | 48 ++++++++++++++++++++++++-------
 xen/include/xen/bootfdt.h         |  9 +++++-
 5 files changed, 57 insertions(+), 14 deletions(-)

Comments

Grygorii Strashko Nov. 6, 2024, 2:19 p.m. UTC | #1
On 06.11.24 15:41, Luca Fancellu wrote:
> There are some cases where the device tree exposes a memory range
> in both /memreserve/ and reserved-memory node, in this case the
> current code will stop Xen to boot since it will find that the
> latter range is clashing with the already recorded /memreserve/
> ranges.
> 
> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
> in the /memreserve/ part and even in this case this will prevent
> Xen to boot since it will see that the module memory range that
> it is going to add in 'add_boot_module' clashes with a /memreserve/
> range.
> 
> When Xen populate the data structure that tracks the memory ranges,
> it also adds a memory type described in 'enum membank_type', so
> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
> function to check for exact memory range match given a specific memory
> type; allowing reserved-memory node ranges and boot modules to have an
> exact match with ranges from /memreserve/.
> 
> While there, set a type for the memory recorded during ACPI boot.
> 
> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
> I tested this patch adding the same range in a /memreserve/ entry and
> /reserved-memory node, and by letting u-boot pass a ramdisk.
> I've also tested that a configuration running static shared memory still works
> fine.

[...]

Thank you, Dom0 started successfully with Initrd.

Tested-by: Grygorii Strashko <grygorii_strashko@epam.com>

Meminfo from boot log is below:

(XEN) Checking for initrd in /chosen
(XEN) Initrd 0000000084000040-00000000860864fd
(XEN) RAM: 0000000048000000 - 00000000bfffffff
(XEN) RAM: 0000000480000000 - 00000004ffffffff
(XEN) RAM: 0000000600000000 - 00000006ffffffff
(XEN)
(XEN) MODULE[0]: 0000000048080000 - 00000000481ec000 Xen
(XEN) MODULE[1]: 0000000048000000 - 000000004801e080 Device Tree
(XEN) MODULE[2]: 0000000084000040 - 00000000860864fd Ramdisk
(XEN) MODULE[3]: 0000000048300000 - 000000004a300000 Kernel
(XEN) MODULE[4]: 0000000048070000 - 0000000048080000 XSM
(XEN)  RESVD[0]: 0000000084000040 - 00000000860864fc
(XEN)  RESVD[1]: 0000000060000000 - 000000007fffffff
(XEN)  RESVD[2]: 00000000b0000000 - 00000000bfffffff
(XEN)  RESVD[3]: 00000000a0000000 - 00000000afffffff
...

BR,
-grygorii
Michal Orzel Nov. 6, 2024, 3:07 p.m. UTC | #2
On 06/11/2024 14:41, Luca Fancellu wrote:
> 
> 
> There are some cases where the device tree exposes a memory range
> in both /memreserve/ and reserved-memory node, in this case the
> current code will stop Xen to boot since it will find that the
> latter range is clashing with the already recorded /memreserve/
> ranges.
> 
> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
> in the /memreserve/ part and even in this case this will prevent
> Xen to boot since it will see that the module memory range that
> it is going to add in 'add_boot_module' clashes with a /memreserve/
> range.
> 
> When Xen populate the data structure that tracks the memory ranges,
> it also adds a memory type described in 'enum membank_type', so
> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
> function to check for exact memory range match given a specific memory
> type; allowing reserved-memory node ranges and boot modules to have an
> exact match with ranges from /memreserve/.
> 
> While there, set a type for the memory recorded during ACPI boot.
> 
> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
> I tested this patch adding the same range in a /memreserve/ entry and
> /reserved-memory node, and by letting u-boot pass a ramdisk.
> I've also tested that a configuration running static shared memory still works
> fine.
> ---
So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
and the changes below look a bit too much for me, given that for boot modules we can only have
/memreserve/ matching initrd.

Shawn patch fixes the first issue. AFAICT the second issue can be fixed by below simple patch:
diff --git a/xen/common/device-tree/bootfdt.c b/xen/common/device-tree/bootfdt.c
index 927f59c64b0d..d8bd8c44bd35 100644
--- a/xen/common/device-tree/bootfdt.c
+++ b/xen/common/device-tree/bootfdt.c
@@ -586,6 +586,10 @@ size_t __init boot_fdt_info(const void *fdt, paddr_t paddr)

     add_boot_module(BOOTMOD_FDT, paddr, fdt_totalsize(fdt), false);

+    ret = device_tree_for_each_node(fdt, 0, early_scan_node, NULL);
+    if ( ret )
+        panic("Early FDT parsing failed (%d)\n", ret);
+
     nr_rsvd = fdt_num_mem_rsv(fdt);
     if ( nr_rsvd < 0 )
         panic("Parsing FDT memory reserve map failed (%d)\n", nr_rsvd);
@@ -594,10 +598,14 @@ size_t __init boot_fdt_info(const void *fdt, paddr_t paddr)
     {
         struct membank *bank;
         paddr_t s, sz;
+        const struct bootmodule *mod = boot_module_find_by_kind(BOOTMOD_RAMDISK);

         if ( fdt_get_mem_rsv_paddr(device_tree_flattened, i, &s, &sz) < 0 )
             continue;

+        if ( mod && (mod->start == s) && (mod->size == sz) )
+            continue;
+
         if ( reserved_mem->nr_banks < reserved_mem->max_banks )
         {
             bank = &reserved_mem->bank[reserved_mem->nr_banks];
@@ -610,10 +618,6 @@ size_t __init boot_fdt_info(const void *fdt, paddr_t paddr)
             panic("Cannot allocate reserved memory bank\n");
     }

-    ret = device_tree_for_each_node(fdt, 0, early_scan_node, NULL);
-    if ( ret )
-        panic("Early FDT parsing failed (%d)\n", ret);
-
     /*
      * On Arm64 setup_directmap_mappings() expects to be called with the lowest
      * bank in memory first. There is no requirement that the DT will provide

I'll let other DT maintainers voice their opinion.

~Michal
Luca Fancellu Nov. 6, 2024, 3:16 p.m. UTC | #3
Hi Michal,

> So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
> and the changes below look a bit too much for me, given that for boot modules we can only have
> /memreserve/ matching initrd.
> 
> Shawn patch fixes the first issue. AFAICT the second issue can be fixed by below simple patch:
> diff --git a/xen/common/device-tree/bootfdt.c b/xen/common/device-tree/bootfdt.c
> index 927f59c64b0d..d8bd8c44bd35 100644
> --- a/xen/common/device-tree/bootfdt.c
> +++ b/xen/common/device-tree/bootfdt.c
> @@ -586,6 +586,10 @@ size_t __init boot_fdt_info(const void *fdt, paddr_t paddr)
> 
>     add_boot_module(BOOTMOD_FDT, paddr, fdt_totalsize(fdt), false);
> 
> +    ret = device_tree_for_each_node(fdt, 0, early_scan_node, NULL);
> +    if ( ret )
> +        panic("Early FDT parsing failed (%d)\n", ret);
> +
>     nr_rsvd = fdt_num_mem_rsv(fdt);
>     if ( nr_rsvd < 0 )
>         panic("Parsing FDT memory reserve map failed (%d)\n", nr_rsvd);
> @@ -594,10 +598,14 @@ size_t __init boot_fdt_info(const void *fdt, paddr_t paddr)
>     {
>         struct membank *bank;
>         paddr_t s, sz;
> +        const struct bootmodule *mod = boot_module_find_by_kind(BOOTMOD_RAMDISK);
> 
>         if ( fdt_get_mem_rsv_paddr(device_tree_flattened, i, &s, &sz) < 0 )
>             continue;
> 
> +        if ( mod && (mod->start == s) && (mod->size == sz) )
> +            continue;

Ok I see, we skip the /memreserve/ entry if it matches the ramdisk, fair enough, I don’t have
a strong opinion on how we do that, the important thing is just to unblock the users experiencing
this issue.

Cheers,
Luca
Grygorii Strashko Nov. 7, 2024, 9:42 a.m. UTC | #4
On 06.11.24 17:16, Luca Fancellu wrote:
> Hi Michal,
> 
>> So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
>> and the changes below look a bit too much for me, given that for boot modules we can only have
>> /memreserve/ matching initrd.
>>
>> Shawn patch fixes the first issue. AFAICT the second issue can be fixed by below simple patch:
>> diff --git a/xen/common/device-tree/bootfdt.c b/xen/common/device-tree/bootfdt.c
>> index 927f59c64b0d..d8bd8c44bd35 100644
>> --- a/xen/common/device-tree/bootfdt.c
>> +++ b/xen/common/device-tree/bootfdt.c
>> @@ -586,6 +586,10 @@ size_t __init boot_fdt_info(const void *fdt, paddr_t paddr)
>>
>>      add_boot_module(BOOTMOD_FDT, paddr, fdt_totalsize(fdt), false);
>>
>> +    ret = device_tree_for_each_node(fdt, 0, early_scan_node, NULL);
>> +    if ( ret )
>> +        panic("Early FDT parsing failed (%d)\n", ret);
>> +
>>      nr_rsvd = fdt_num_mem_rsv(fdt);
>>      if ( nr_rsvd < 0 )
>>          panic("Parsing FDT memory reserve map failed (%d)\n", nr_rsvd);
>> @@ -594,10 +598,14 @@ size_t __init boot_fdt_info(const void *fdt, paddr_t paddr)
>>      {
>>          struct membank *bank;
>>          paddr_t s, sz;
>> +        const struct bootmodule *mod = boot_module_find_by_kind(BOOTMOD_RAMDISK);
>>
>>          if ( fdt_get_mem_rsv_paddr(device_tree_flattened, i, &s, &sz) < 0 )
>>              continue;
>>
>> +        if ( mod && (mod->start == s) && (mod->size == sz) )
>> +            continue;
> 
> Ok I see, we skip the /memreserve/ entry if it matches the ramdisk, fair enough, I don’t have
> a strong opinion on how we do that, the important thing is just to unblock the users experiencing
> this issue.

Don't know if my opinion would matter here, but Luca's patch looks more generic and logically solid for me.
While handling only "ramdisk" somewhere in the middle  - looks more like a hack.

Any way, it's up to you.

BR,
-grygorii
Grygorii Strashko Nov. 13, 2024, 10:31 a.m. UTC | #5
Hi All,

On 07.11.24 11:42, Grygorii Strashko wrote:
> 
> 
> On 06.11.24 17:16, Luca Fancellu wrote:
>> Hi Michal,
>>
>>> So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
>>> and the changes below look a bit too much for me, given that for boot modules we can only have
>>> /memreserve/ matching initrd.
>>>
>>> Shawn patch fixes the first issue. AFAICT the second issue can be fixed by below simple patch:
>>> diff --git a/xen/common/device-tree/bootfdt.c b/xen/common/device-tree/bootfdt.c
>>> index 927f59c64b0d..d8bd8c44bd35 100644
>>> --- a/xen/common/device-tree/bootfdt.c
>>> +++ b/xen/common/device-tree/bootfdt.c
>>> @@ -586,6 +586,10 @@ size_t __init boot_fdt_info(const void *fdt, paddr_t paddr)
>>>
>>>      add_boot_module(BOOTMOD_FDT, paddr, fdt_totalsize(fdt), false);
>>>
>>> +    ret = device_tree_for_each_node(fdt, 0, early_scan_node, NULL);
>>> +    if ( ret )
>>> +        panic("Early FDT parsing failed (%d)\n", ret);
>>> +
>>>      nr_rsvd = fdt_num_mem_rsv(fdt);
>>>      if ( nr_rsvd < 0 )
>>>          panic("Parsing FDT memory reserve map failed (%d)\n", nr_rsvd);
>>> @@ -594,10 +598,14 @@ size_t __init boot_fdt_info(const void *fdt, paddr_t paddr)
>>>      {
>>>          struct membank *bank;
>>>          paddr_t s, sz;
>>> +        const struct bootmodule *mod = boot_module_find_by_kind(BOOTMOD_RAMDISK);
>>>
>>>          if ( fdt_get_mem_rsv_paddr(device_tree_flattened, i, &s, &sz) < 0 )
>>>              continue;
>>>
>>> +        if ( mod && (mod->start == s) && (mod->size == sz) )
>>> +            continue;
>>
>> Ok I see, we skip the /memreserve/ entry if it matches the ramdisk, fair enough, I don’t have
>> a strong opinion on how we do that, the important thing is just to unblock the users experiencing
>> this issue.
> 
> Don't know if my opinion would matter here, but Luca's patch looks more generic and logically solid for me.
> While handling only "ramdisk" somewhere in the middle  - looks more like a hack.
> 
> Any way, it's up to you.


Sorry, that I'm disturbing you, but is there going to be any conclusion regarding this patch?


Best regards,
-Grygorii
Julien Grall Nov. 13, 2024, 1:50 p.m. UTC | #6
Hi Michal,

On 06/11/2024 15:07, Michal Orzel wrote:
> 
> 
> On 06/11/2024 14:41, Luca Fancellu wrote:
>>
>>
>> There are some cases where the device tree exposes a memory range
>> in both /memreserve/ and reserved-memory node, in this case the
>> current code will stop Xen to boot since it will find that the
>> latter range is clashing with the already recorded /memreserve/
>> ranges.
>>
>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>> in the /memreserve/ part and even in this case this will prevent
>> Xen to boot since it will see that the module memory range that
>> it is going to add in 'add_boot_module' clashes with a /memreserve/
>> range.
>>
>> When Xen populate the data structure that tracks the memory ranges,
>> it also adds a memory type described in 'enum membank_type', so
>> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
>> function to check for exact memory range match given a specific memory
>> type; allowing reserved-memory node ranges and boot modules to have an
>> exact match with ranges from /memreserve/.
>>
>> While there, set a type for the memory recorded during ACPI boot.
>>
>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>> ---
>> I tested this patch adding the same range in a /memreserve/ entry and
>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>> I've also tested that a configuration running static shared memory still works
>> fine.
>> ---
> So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
> and the changes below look a bit too much for me, given that for boot modules we can only have
> /memreserve/ matching initrd.

How so? Is this an observation or part of a specification?

Cheers,
Julien Grall Nov. 13, 2024, 2:01 p.m. UTC | #7
Hi Luca,

On 06/11/2024 13:41, Luca Fancellu wrote:
> There are some cases where the device tree exposes a memory range
> in both /memreserve/ and reserved-memory node, in this case the
> current code will stop Xen to boot since it will find that the
> latter range is clashing with the already recorded /memreserve/
> ranges.
> 
> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
> in the /memreserve/ part and even in this case this will prevent
> Xen to boot since it will see that the module memory range that
> it is going to add in 'add_boot_module' clashes with a /memreserve/
> range.
> 
> When Xen populate the data structure that tracks the memory ranges,
> it also adds a memory type described in 'enum membank_type', so
> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
> function to check for exact memory range match given a specific memory
> type; allowing reserved-memory node ranges and boot modules to have an
> exact match with ranges from /memreserve/.
> 
> While there, set a type for the memory recorded during ACPI boot.

I am a bit confused which you mention only ACPI boot. Isn't the path 
also used when booting using Device-Tree?

> 
> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
> I tested this patch adding the same range in a /memreserve/ entry and
> /reserved-memory node, and by letting u-boot pass a ramdisk.
> I've also tested that a configuration running static shared memory still works
> fine.
> ---
>   xen/arch/arm/efi/efi-boot.h       |  3 +-
>   xen/arch/arm/static-shmem.c       |  2 +-
>   xen/common/device-tree/bootfdt.c  |  9 +++++-
>   xen/common/device-tree/bootinfo.c | 48 ++++++++++++++++++++++++-------
>   xen/include/xen/bootfdt.h         |  9 +++++-
>   5 files changed, 57 insertions(+), 14 deletions(-)
> 
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index 199f5260229d..d35c991c856f 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -167,13 +167,14 @@ static bool __init meminfo_add_bank(struct membanks *mem,
>       if ( mem->nr_banks >= mem->max_banks )
>           return false;
>   #ifdef CONFIG_ACPI
> -    if ( check_reserved_regions_overlap(start, size) )
> +    if ( check_reserved_regions_overlap(start, size, MEMBANK_NONE) )
>           return false;
>   #endif
>   
>       bank = &mem->bank[mem->nr_banks];
>       bank->start = start;
>       bank->size = size;
> +    bank->type = MEMBANK_DEFAULT;
>   
>       mem->nr_banks++;
>   
> diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
> index aa80756c3ca5..149ed4b0a5ba 100644
> --- a/xen/arch/arm/static-shmem.c
> +++ b/xen/arch/arm/static-shmem.c
> @@ -696,7 +696,7 @@ int __init process_shm_node(const void *fdt, int node, uint32_t address_cells,
>           if (i < mem->max_banks)
>           {
>               if ( (paddr != INVALID_PADDR) &&
> -                 check_reserved_regions_overlap(paddr, size) )
> +                 check_reserved_regions_overlap(paddr, size, MEMBANK_NONE) )
>                   return -EINVAL;
>   
>               /* Static shared memory shall be reserved from any other use. */
> diff --git a/xen/common/device-tree/bootfdt.c b/xen/common/device-tree/bootfdt.c
> index 927f59c64b0d..fb3a6ab95a22 100644
> --- a/xen/common/device-tree/bootfdt.c
> +++ b/xen/common/device-tree/bootfdt.c
> @@ -176,8 +176,15 @@ static int __init device_tree_get_meminfo(const void *fdt, int node,
>       for ( i = 0; i < banks && mem->nr_banks < mem->max_banks; i++ )
>       {
>           device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
> +        /*
> +         * Some valid device trees, such as those generated by OpenPOWER
> +         * skiboot firmware, expose all reserved memory regions in the
> +         * FDT memory reservation block AND in the reserved-memory node which
> +         * has already been parsed. Thus, any matching overlaps in the
> +         * reserved_mem banks should be ignored.
> +         */
>           if ( mem == bootinfo_get_reserved_mem() &&
> -             check_reserved_regions_overlap(start, size) )
> +             check_reserved_regions_overlap(start, size, MEMBANK_FDT_RESVMEM) )
>               return -EINVAL;
>           /* Some DT may describe empty bank, ignore them */
>           if ( !size )
> diff --git a/xen/common/device-tree/bootinfo.c b/xen/common/device-tree/bootinfo.c
> index f2e6a1145b7c..05038075e835 100644
> --- a/xen/common/device-tree/bootinfo.c
> +++ b/xen/common/device-tree/bootinfo.c
> @@ -99,7 +99,8 @@ static void __init dt_unreserved_regions(paddr_t s, paddr_t e,
>    */
>   static bool __init meminfo_overlap_check(const struct membanks *mem,
>                                            paddr_t region_start,
> -                                         paddr_t region_size)
> +                                         paddr_t region_size,
> +                                         enum membank_type allow_match_type)

Looking at the callers, you only seem to pass MEMBANK_FDT_RESVMEM or 
MEMBANK_NONE. So I wonder whether it actually make sense to introduce 
MEMBANK_NONE. Wouldn't it be better to have a boolean indicating whether 
we are looking for FDT_RESVMEM?

>   {
>       paddr_t bank_start = INVALID_PADDR, bank_end = 0;
>       paddr_t region_end = region_start + region_size;
> @@ -113,12 +114,16 @@ static bool __init meminfo_overlap_check(const struct membanks *mem,
>           if ( INVALID_PADDR == bank_start || region_end <= bank_start ||
>                region_start >= bank_end )
>               continue;
> -        else
> -        {
> -            printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n",
> -                   region_start, region_end, i, bank_start, bank_end);
> -            return true;
> -        }
> +
> +        if ( (allow_match_type != MEMBANK_NONE) &&
> +             (region_start == bank_start) && (region_end == bank_end) &&

Why is this only an exact match?

> +             (allow_match_type == mem->bank[i].type) )
> +            continue;
> +
> +        printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n",
> +                region_start, region_end, i, bank_start, bank_end);
> +        return true;
> +
>       }
>   
>       return false;
> @@ -169,9 +174,14 @@ void __init fw_unreserved_regions(paddr_t s, paddr_t e,
>    * with the existing reserved memory regions defined in bootinfo.
>    * Return true if the input physical address range is overlapping with any
>    * existing reserved memory regions, otherwise false.
> + * The 'allow_match_type' parameter can be used to allow exact match of a
> + * region with another memory region already recorded, but it needs to be used
> + * only on memory regions that allows a type (reserved_mem, acpi). For all the
> + * other cases, passing 'MEMBANK_NONE' will disable the exact match.
>    */
>   bool __init check_reserved_regions_overlap(paddr_t region_start,
> -                                           paddr_t region_size)
> +                                           paddr_t region_size,
> +                                           enum membank_type allow_match_type)
>   {
>       const struct membanks *mem_banks[] = {
>           bootinfo_get_reserved_mem(),
> @@ -190,8 +200,21 @@ bool __init check_reserved_regions_overlap(paddr_t region_start,
>        * shared memory banks (when static shared memory feature is enabled)
>        */
>       for ( i = 0; i < ARRAY_SIZE(mem_banks); i++ )
> -        if ( meminfo_overlap_check(mem_banks[i], region_start, region_size) )
> +    {
> +        enum membank_type type = allow_match_type;
> +
> +#ifdef CONFIG_STATIC_SHM
> +        /*
> +         * Static shared memory banks don't have a type, so for them disable
> +         * the exact match check.
> +         */

This feels a bit of a hack. Why can't we had a default type like you did 
in meminfo_add_bank()?

> +        if (mem_banks[i] == bootinfo_get_shmem())
> +            type = MEMBANK_NONE;
> +#endif
> +        if ( meminfo_overlap_check(mem_banks[i], region_start, region_size,
> +                                   type) )
>               return true;
> +    }
>   
>       /* Check if input region is overlapping with bootmodules */
>       if ( bootmodules_overlap_check(&bootinfo.modules,
> @@ -216,7 +239,12 @@ struct bootmodule __init *add_boot_module(bootmodule_kind kind,
>           return NULL;
>       }
>   
> -    if ( check_reserved_regions_overlap(start, size) )
> +    /*
> +     * u-boot adds boot module such as ramdisk to the /memreserve/, since these
> +     * ranges are saved in reserved_mem at this stage, allow an eventual exact
> +     * match with MEMBANK_FDT_RESVMEM banks.
> +     */
> +    if ( check_reserved_regions_overlap(start, size, MEMBANK_FDT_RESVMEM) )
>           return NULL;
>   
>       for ( i = 0 ; i < mods->nr_mods ; i++ )
> diff --git a/xen/include/xen/bootfdt.h b/xen/include/xen/bootfdt.h
> index 16fa05f38f38..8f8a7ad882a4 100644
> --- a/xen/include/xen/bootfdt.h
> +++ b/xen/include/xen/bootfdt.h
> @@ -23,6 +23,12 @@ typedef enum {
>   }  bootmodule_kind;
>   
>   enum membank_type {
> +    /*
> +     * The MEMBANK_NONE type is a special placeholder that should not be applied
> +     * to a memory bank, it is used as a special value in some function in order
> +     * to disable some feature.
> +     */
> +    MEMBANK_NONE = -1,
>       /*
>        * The MEMBANK_DEFAULT type refers to either reserved memory for the
>        * device/firmware (when the bank is in 'reserved_mem') or any RAM (when
> @@ -158,7 +164,8 @@ struct bootinfo {
>   extern struct bootinfo bootinfo;
>   
>   bool check_reserved_regions_overlap(paddr_t region_start,
> -                                    paddr_t region_size);
> +                                    paddr_t region_size,
> +                                    enum membank_type allow_match_type);
>   
>   struct bootmodule *add_boot_module(bootmodule_kind kind,
>                                      paddr_t start, paddr_t size, bool domU);

Cheers,
Michal Orzel Nov. 13, 2024, 2:19 p.m. UTC | #8
On 13/11/2024 14:50, Julien Grall wrote:
> 
> 
> Hi Michal,
> 
> On 06/11/2024 15:07, Michal Orzel wrote:
>>
>>
>> On 06/11/2024 14:41, Luca Fancellu wrote:
>>>
>>>
>>> There are some cases where the device tree exposes a memory range
>>> in both /memreserve/ and reserved-memory node, in this case the
>>> current code will stop Xen to boot since it will find that the
>>> latter range is clashing with the already recorded /memreserve/
>>> ranges.
>>>
>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>> in the /memreserve/ part and even in this case this will prevent
>>> Xen to boot since it will see that the module memory range that
>>> it is going to add in 'add_boot_module' clashes with a /memreserve/
>>> range.
>>>
>>> When Xen populate the data structure that tracks the memory ranges,
>>> it also adds a memory type described in 'enum membank_type', so
>>> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
>>> function to check for exact memory range match given a specific memory
>>> type; allowing reserved-memory node ranges and boot modules to have an
>>> exact match with ranges from /memreserve/.
>>>
>>> While there, set a type for the memory recorded during ACPI boot.
>>>
>>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
>>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>> ---
>>> I tested this patch adding the same range in a /memreserve/ entry and
>>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>>> I've also tested that a configuration running static shared memory still works
>>> fine.
>>> ---
>> So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
>> and the changes below look a bit too much for me, given that for boot modules we can only have
>> /memreserve/ matching initrd.
> 
> How so? Is this an observation or part of a specification?
Not sure what specification you would want to see. It's all part of U-Boot and Linux behavior that is not documented (except for code comments).
My statement is based on the U-Boot and Linux behavior. U-Boot part only present for initrd:
https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249

For things that Xen can be interested in, only region for ramdisk for dom0 can match the /memreserve/ region.
Providing a generic solution (like Luca did) would want providing an example of sth else that can match which I'm not aware of.

~Michal
Luca Fancellu Nov. 13, 2024, 2:33 p.m. UTC | #9
Hi Julien,

> On 13 Nov 2024, at 14:01, Julien Grall <julien@xen.org> wrote:
> 
> Hi Luca,
> 
> On 06/11/2024 13:41, Luca Fancellu wrote:
>> There are some cases where the device tree exposes a memory range
>> in both /memreserve/ and reserved-memory node, in this case the
>> current code will stop Xen to boot since it will find that the
>> latter range is clashing with the already recorded /memreserve/
>> ranges.
>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>> in the /memreserve/ part and even in this case this will prevent
>> Xen to boot since it will see that the module memory range that
>> it is going to add in 'add_boot_module' clashes with a /memreserve/
>> range.
>> When Xen populate the data structure that tracks the memory ranges,
>> it also adds a memory type described in 'enum membank_type', so
>> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
>> function to check for exact memory range match given a specific memory
>> type; allowing reserved-memory node ranges and boot modules to have an
>> exact match with ranges from /memreserve/.
>> While there, set a type for the memory recorded during ACPI boot.
> 
> I am a bit confused which you mention only ACPI boot. Isn't the path also used when booting using Device-Tree?

right, maybe this should be:

“While there, set a type for the memory recorded using meminfo_add_bank() from eft-boot.h."

>> 
>>  static bool __init meminfo_overlap_check(const struct membanks *mem,
>>                                           paddr_t region_start,
>> -                                         paddr_t region_size)
>> +                                         paddr_t region_size,
>> +                                         enum membank_type allow_match_type)
> 
> Looking at the callers, you only seem to pass MEMBANK_FDT_RESVMEM or MEMBANK_NONE. So I wonder whether it actually make sense to introduce MEMBANK_NONE. Wouldn't it be better to have a boolean indicating whether we are looking for FDT_RESVMEM?

I wanted to give a more generic approach, but you are right, we could have a boolean like allow_match_memreserve.


> 
>>  {
>>      paddr_t bank_start = INVALID_PADDR, bank_end = 0;
>>      paddr_t region_end = region_start + region_size;
>> @@ -113,12 +114,16 @@ static bool __init meminfo_overlap_check(const struct membanks *mem,
>>          if ( INVALID_PADDR == bank_start || region_end <= bank_start ||
>>               region_start >= bank_end )
>>              continue;
>> -        else
>> -        {
>> -            printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n",
>> -                   region_start, region_end, i, bank_start, bank_end);
>> -            return true;
>> -        }
>> +
>> +        if ( (allow_match_type != MEMBANK_NONE) &&
>> +             (region_start == bank_start) && (region_end == bank_end) &&
> 
> Why is this only an exact match?

Apparently what we are fixing is only a case where memreserve regions matches exactly modules or reserved_mem nodes.

> 
>> +             (allow_match_type == mem->bank[i].type) )
>> +            continue;
>> +
>> +        printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n",
>> +                region_start, region_end, i, bank_start, bank_end);
>> +        return true;
>> +
>>      }
>>        return false;
>> @@ -169,9 +174,14 @@ void __init fw_unreserved_regions(paddr_t s, paddr_t e,
>>   * with the existing reserved memory regions defined in bootinfo.
>>   * Return true if the input physical address range is overlapping with any
>>   * existing reserved memory regions, otherwise false.
>> + * The 'allow_match_type' parameter can be used to allow exact match of a
>> + * region with another memory region already recorded, but it needs to be used
>> + * only on memory regions that allows a type (reserved_mem, acpi). For all the
>> + * other cases, passing 'MEMBANK_NONE' will disable the exact match.
>>   */
>>  bool __init check_reserved_regions_overlap(paddr_t region_start,
>> -                                           paddr_t region_size)
>> +                                           paddr_t region_size,
>> +                                           enum membank_type allow_match_type)
>>  {
>>      const struct membanks *mem_banks[] = {
>>          bootinfo_get_reserved_mem(),
>> @@ -190,8 +200,21 @@ bool __init check_reserved_regions_overlap(paddr_t region_start,
>>       * shared memory banks (when static shared memory feature is enabled)
>>       */
>>      for ( i = 0; i < ARRAY_SIZE(mem_banks); i++ )
>> -        if ( meminfo_overlap_check(mem_banks[i], region_start, region_size) )
>> +    {
>> +        enum membank_type type = allow_match_type;
>> +
>> +#ifdef CONFIG_STATIC_SHM
>> +        /*
>> +         * Static shared memory banks don't have a type, so for them disable
>> +         * the exact match check.
>> +         */
> 
> This feels a bit of a hack. Why can't we had a default type like you did in meminfo_add_bank()?

This is the structure:

struct membank {
    paddr_t start;
    paddr_t size;
    union {
        enum membank_type type;
#ifdef CONFIG_STATIC_SHM
        struct shmem_membank_extra *shmem_extra;
#endif
    };
};

we did that in order to save space when static shared memory is not enabled, so we can’t have the
type for these banks because we are already writing shmem_extra.

We could remove the union but in that case we would waste space when static shared memory is
enabled.

Cheers,
Luca
Julien Grall Nov. 13, 2024, 2:40 p.m. UTC | #10
Hi,

On 13/11/2024 14:19, Michal Orzel wrote:
> 
> 
> On 13/11/2024 14:50, Julien Grall wrote:
>>
>>
>> Hi Michal,
>>
>> On 06/11/2024 15:07, Michal Orzel wrote:
>>>
>>>
>>> On 06/11/2024 14:41, Luca Fancellu wrote:
>>>>
>>>>
>>>> There are some cases where the device tree exposes a memory range
>>>> in both /memreserve/ and reserved-memory node, in this case the
>>>> current code will stop Xen to boot since it will find that the
>>>> latter range is clashing with the already recorded /memreserve/
>>>> ranges.
>>>>
>>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>>> in the /memreserve/ part and even in this case this will prevent
>>>> Xen to boot since it will see that the module memory range that
>>>> it is going to add in 'add_boot_module' clashes with a /memreserve/
>>>> range.
>>>>
>>>> When Xen populate the data structure that tracks the memory ranges,
>>>> it also adds a memory type described in 'enum membank_type', so
>>>> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
>>>> function to check for exact memory range match given a specific memory
>>>> type; allowing reserved-memory node ranges and boot modules to have an
>>>> exact match with ranges from /memreserve/.
>>>>
>>>> While there, set a type for the memory recorded during ACPI boot.
>>>>
>>>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
>>>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>>> ---
>>>> I tested this patch adding the same range in a /memreserve/ entry and
>>>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>>>> I've also tested that a configuration running static shared memory still works
>>>> fine.
>>>> ---
>>> So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
>>> and the changes below look a bit too much for me, given that for boot modules we can only have
>>> /memreserve/ matching initrd.
>>
>> How so? Is this an observation or part of a specification?
> Not sure what specification you would want to see.

Anything that you bake your observation. My concern with observation is ...

  It's all part of U-Boot and Linux behavior that is not documented 
(except for code comments).
> My statement is based on the U-Boot and Linux behavior. U-Boot part only present for initrd:
> https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249

... a user is not forced to use U-boot. So this is not a good reason to 
rely on it. If Linux starts to rely on it, then it is probably a better 
argument, but first I would need to see the code. Can you paste a link?

> 
> For things that Xen can be interested in, only region for ramdisk for dom0 can match the /memreserve/ region.
> Providing a generic solution (like Luca did) would want providing an example of sth else that can match which I'm not aware of.

I would argue this is the other way around. If we are not certain that 
/memreserve/ will not be used for any other boot module, then we should 
have a generic solution. Otherwise, we will end up with similar weird 
issue in the future.

Cheers,
Julien Grall Nov. 13, 2024, 3:14 p.m. UTC | #11
On 13/11/2024 14:33, Luca Fancellu wrote:
> Hi Julien,

Hi,

>> On 13 Nov 2024, at 14:01, Julien Grall <julien@xen.org> wrote:
>>
>> Hi Luca,
>>
>> On 06/11/2024 13:41, Luca Fancellu wrote:
>>> There are some cases where the device tree exposes a memory range
>>> in both /memreserve/ and reserved-memory node, in this case the
>>> current code will stop Xen to boot since it will find that the
>>> latter range is clashing with the already recorded /memreserve/
>>> ranges.
>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>> in the /memreserve/ part and even in this case this will prevent
>>> Xen to boot since it will see that the module memory range that
>>> it is going to add in 'add_boot_module' clashes with a /memreserve/
>>> range.
>>> When Xen populate the data structure that tracks the memory ranges,
>>> it also adds a memory type described in 'enum membank_type', so
>>> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
>>> function to check for exact memory range match given a specific memory
>>> type; allowing reserved-memory node ranges and boot modules to have an
>>> exact match with ranges from /memreserve/.
>>> While there, set a type for the memory recorded during ACPI boot.
>>
>> I am a bit confused which you mention only ACPI boot. Isn't the path also used when booting using Device-Tree?
> 
> right, maybe this should be:
> 
> “While there, set a type for the memory recorded using meminfo_add_bank() from eft-boot.h."
> 
>>>
>>>   static bool __init meminfo_overlap_check(const struct membanks *mem,
>>>                                            paddr_t region_start,
>>> -                                         paddr_t region_size)
>>> +                                         paddr_t region_size,
>>> +                                         enum membank_type allow_match_type)
>>
>> Looking at the callers, you only seem to pass MEMBANK_FDT_RESVMEM or MEMBANK_NONE. So I wonder whether it actually make sense to introduce MEMBANK_NONE. Wouldn't it be better to have a boolean indicating whether we are looking for FDT_RESVMEM?
> 
> I wanted to give a more generic approach, but you are right, we could have a boolean like allow_match_memreserve.
> 
> 
>>
>>>   {
>>>       paddr_t bank_start = INVALID_PADDR, bank_end = 0;
>>>       paddr_t region_end = region_start + region_size;
>>> @@ -113,12 +114,16 @@ static bool __init meminfo_overlap_check(const struct membanks *mem,
>>>           if ( INVALID_PADDR == bank_start || region_end <= bank_start ||
>>>                region_start >= bank_end )
>>>               continue;
>>> -        else
>>> -        {
>>> -            printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n",
>>> -                   region_start, region_end, i, bank_start, bank_end);
>>> -            return true;
>>> -        }
>>> +
>>> +        if ( (allow_match_type != MEMBANK_NONE) &&
>>> +             (region_start == bank_start) && (region_end == bank_end) &&
>>
>> Why is this only an exact match?
> 
> Apparently what we are fixing is only a case where memreserve regions matches exactly modules or reserved_mem nodes.

TBH, as I replied to Michal, I don't understand why we are just focusing 
on just one issue. It would be good to provide some rationale.

> 
>>
>>> +             (allow_match_type == mem->bank[i].type) )
>>> +            continue;
>>> +
>>> +        printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n",
>>> +                region_start, region_end, i, bank_start, bank_end);
>>> +        return true;
>>> +
>>>       }
>>>         return false;
>>> @@ -169,9 +174,14 @@ void __init fw_unreserved_regions(paddr_t s, paddr_t e,
>>>    * with the existing reserved memory regions defined in bootinfo.
>>>    * Return true if the input physical address range is overlapping with any
>>>    * existing reserved memory regions, otherwise false.
>>> + * The 'allow_match_type' parameter can be used to allow exact match of a
>>> + * region with another memory region already recorded, but it needs to be used
>>> + * only on memory regions that allows a type (reserved_mem, acpi). For all the
>>> + * other cases, passing 'MEMBANK_NONE' will disable the exact match.
>>>    */
>>>   bool __init check_reserved_regions_overlap(paddr_t region_start,
>>> -                                           paddr_t region_size)
>>> +                                           paddr_t region_size,
>>> +                                           enum membank_type allow_match_type)
>>>   {
>>>       const struct membanks *mem_banks[] = {
>>>           bootinfo_get_reserved_mem(),
>>> @@ -190,8 +200,21 @@ bool __init check_reserved_regions_overlap(paddr_t region_start,
>>>        * shared memory banks (when static shared memory feature is enabled)
>>>        */
>>>       for ( i = 0; i < ARRAY_SIZE(mem_banks); i++ )
>>> -        if ( meminfo_overlap_check(mem_banks[i], region_start, region_size) )
>>> +    {
>>> +        enum membank_type type = allow_match_type;
>>> +
>>> +#ifdef CONFIG_STATIC_SHM
>>> +        /*
>>> +         * Static shared memory banks don't have a type, so for them disable
>>> +         * the exact match check.
>>> +         */
>>
>> This feels a bit of a hack. Why can't we had a default type like you did in meminfo_add_bank()?
> 
> This is the structure:
> 
> struct membank {
>      paddr_t start;
>      paddr_t size;
>      union {
>          enum membank_type type;
> #ifdef CONFIG_STATIC_SHM
>          struct shmem_membank_extra *shmem_extra;
> #endif
>      };
> };

Anonymous union are really annoying... So effectively in 
check_reserved_regions_overlap() we are hoping that the caller will not 
set allow_match_type to another value than MEMBANK_NONE for static 
memory. This is extremely fragile.

I can't think of another solution right now, but I am definitely not 
happy with this approach.

> 
> we did that in order to save space when static shared memory is not enabled, so we can’t have the
> type for these banks because we are already writing shmem_extra.
> 
> We could remove the union but in that case we would waste space when static shared memory is
> enabled.

Can you remind me how much memory this is going to save?

Cheers,
Michal Orzel Nov. 13, 2024, 3:40 p.m. UTC | #12
On 13/11/2024 15:40, Julien Grall wrote:
> 
> 
> Hi,
> 
> On 13/11/2024 14:19, Michal Orzel wrote:
>>
>>
>> On 13/11/2024 14:50, Julien Grall wrote:
>>>
>>>
>>> Hi Michal,
>>>
>>> On 06/11/2024 15:07, Michal Orzel wrote:
>>>>
>>>>
>>>> On 06/11/2024 14:41, Luca Fancellu wrote:
>>>>>
>>>>>
>>>>> There are some cases where the device tree exposes a memory range
>>>>> in both /memreserve/ and reserved-memory node, in this case the
>>>>> current code will stop Xen to boot since it will find that the
>>>>> latter range is clashing with the already recorded /memreserve/
>>>>> ranges.
>>>>>
>>>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>>>> in the /memreserve/ part and even in this case this will prevent
>>>>> Xen to boot since it will see that the module memory range that
>>>>> it is going to add in 'add_boot_module' clashes with a /memreserve/
>>>>> range.
>>>>>
>>>>> When Xen populate the data structure that tracks the memory ranges,
>>>>> it also adds a memory type described in 'enum membank_type', so
>>>>> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
>>>>> function to check for exact memory range match given a specific memory
>>>>> type; allowing reserved-memory node ranges and boot modules to have an
>>>>> exact match with ranges from /memreserve/.
>>>>>
>>>>> While there, set a type for the memory recorded during ACPI boot.
>>>>>
>>>>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
>>>>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>>>> ---
>>>>> I tested this patch adding the same range in a /memreserve/ entry and
>>>>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>>>>> I've also tested that a configuration running static shared memory still works
>>>>> fine.
>>>>> ---
>>>> So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
>>>> and the changes below look a bit too much for me, given that for boot modules we can only have
>>>> /memreserve/ matching initrd.
>>>
>>> How so? Is this an observation or part of a specification?
>> Not sure what specification you would want to see.
> 
> Anything that you bake your observation. My concern with observation is ...
> 
>   It's all part of U-Boot and Linux behavior that is not documented
> (except for code comments).
>> My statement is based on the U-Boot and Linux behavior. U-Boot part only present for initrd:
>> https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249
> 
> ... a user is not forced to use U-boot. So this is not a good reason to
I thought that this behavior is solely down to u-boot playing tricks with memreserve.

> rely on it. If Linux starts to rely on it, then it is probably a better
> argument, but first I would need to see the code. Can you paste a link?
Not sure how I would do that given that it is all scattered. But if it means sth, here is kexec code
to create fdt. It is clear they do the same trick as u-boot.
https://github.com/torvalds/linux/blob/master/drivers/of/kexec.c#L355

> 
>>
>> For things that Xen can be interested in, only region for ramdisk for dom0 can match the /memreserve/ region.
>> Providing a generic solution (like Luca did) would want providing an example of sth else that can match which I'm not aware of.
> 
> I would argue this is the other way around. If we are not certain that
> /memreserve/ will not be used for any other boot module, then we should
> have a generic solution. Otherwise, we will end up with similar weird
> issue in the future.
We have 3 possible modules for bootloader->kernel workflow: kernel, dtb and ramdisk. The first 2 are not described in DT so I'm not sure
what are your examples of bootmodules for which you want kernel know about memory reservation other than ramdisk.

~Michal
Julien Grall Nov. 13, 2024, 4:41 p.m. UTC | #13
Hi,

On 13/11/2024 15:40, Michal Orzel wrote:
> 
> 
> On 13/11/2024 15:40, Julien Grall wrote:
>>
>>
>> Hi,
>>
>> On 13/11/2024 14:19, Michal Orzel wrote:
>>>
>>>
>>> On 13/11/2024 14:50, Julien Grall wrote:
>>>>
>>>>
>>>> Hi Michal,
>>>>
>>>> On 06/11/2024 15:07, Michal Orzel wrote:
>>>>>
>>>>>
>>>>> On 06/11/2024 14:41, Luca Fancellu wrote:
>>>>>>
>>>>>>
>>>>>> There are some cases where the device tree exposes a memory range
>>>>>> in both /memreserve/ and reserved-memory node, in this case the
>>>>>> current code will stop Xen to boot since it will find that the
>>>>>> latter range is clashing with the already recorded /memreserve/
>>>>>> ranges.
>>>>>>
>>>>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>>>>> in the /memreserve/ part and even in this case this will prevent
>>>>>> Xen to boot since it will see that the module memory range that
>>>>>> it is going to add in 'add_boot_module' clashes with a /memreserve/
>>>>>> range.
>>>>>>
>>>>>> When Xen populate the data structure that tracks the memory ranges,
>>>>>> it also adds a memory type described in 'enum membank_type', so
>>>>>> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
>>>>>> function to check for exact memory range match given a specific memory
>>>>>> type; allowing reserved-memory node ranges and boot modules to have an
>>>>>> exact match with ranges from /memreserve/.
>>>>>>
>>>>>> While there, set a type for the memory recorded during ACPI boot.
>>>>>>
>>>>>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
>>>>>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>>>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>>>>> ---
>>>>>> I tested this patch adding the same range in a /memreserve/ entry and
>>>>>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>>>>>> I've also tested that a configuration running static shared memory still works
>>>>>> fine.
>>>>>> ---
>>>>> So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
>>>>> and the changes below look a bit too much for me, given that for boot modules we can only have
>>>>> /memreserve/ matching initrd.
>>>>
>>>> How so? Is this an observation or part of a specification?
>>> Not sure what specification you would want to see.
>>
>> Anything that you bake your observation. My concern with observation is ...
>>
>>    It's all part of U-Boot and Linux behavior that is not documented
>> (except for code comments).
>>> My statement is based on the U-Boot and Linux behavior. U-Boot part only present for initrd:
>>> https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249
>>
>> ... a user is not forced to use U-boot. So this is not a good reason to
> I thought that this behavior is solely down to u-boot playing tricks with memreserve.

Sure we noticed that U-boot is doing some we didn't expect. But this 
really doesn't mean there are not other interesting behavior happening.

> 
>> rely on it. If Linux starts to rely on it, then it is probably a better
>> argument, but first I would need to see the code. Can you paste a link?
> Not sure how I would do that given that it is all scattered. 

There are no requirements to be all scattered.

 > But if it means sth, here is kexec code> to create fdt. It is clear 
they do the same trick as u-boot.
> https://github.com/torvalds/linux/blob/master/drivers/of/kexec.c#L355

Yet this doesn't provide any information why this only has to be an 
exact region... It only tells me the current behavior.

> 
>>
>>>
>>> For things that Xen can be interested in, only region for ramdisk for dom0 can match the /memreserve/ region.
>>> Providing a generic solution (like Luca did) would want providing an example of sth else that can match which I'm not aware of.
>>
>> I would argue this is the other way around. If we are not certain that
>> /memreserve/ will not be used for any other boot module, then we should
>> have a generic solution. Otherwise, we will end up with similar weird
>> issue in the future.
> We have 3 possible modules for bootloader->kernel workflow: kernel, dtb and ramdisk. The first 2 are not described in DT so I'm not sure
> what are your examples of bootmodules for which you want kernel know about memory reservation other than ramdisk.

The DTB is not described but the kernel is. We also have XSM modules. 
All of which could in theory be in memreserve if for some reasons the 
bootloader wanted to preserve the modules for future use (think 
Live-Update)...

Anyway, to be honest, I don't understand why you are pushing back at a 
more generic solution... Yes this may be what we just notice today, but 
I haven't seen any evidence that it never happen.

So I would rather go with the generic solution.

Cheers,
Luca Fancellu Nov. 13, 2024, 4:52 p.m. UTC | #14
Hi Julien,


>>>> -        }
>>>> +
>>>> +        if ( (allow_match_type != MEMBANK_NONE) &&
>>>> +             (region_start == bank_start) && (region_end == bank_end) &&
>>> 
>>> Why is this only an exact match?
>> Apparently what we are fixing is only a case where memreserve regions matches exactly modules or reserved_mem nodes.
> 
> TBH, as I replied to Michal, I don't understand why we are just focusing on just one issue. It would be good to provide some rationale.

I’m not sure why some DT out there are providing memory ranges reserved in both resvmem and reserved_memory node, could it be
that it was needed because some firmware was reading only one of the two before starting linux? I’m just thinking out loud.

Is your point that we should allow instead not only exact matching but also partial matching?

>>>> 
>>>> +
>>>> +#ifdef CONFIG_STATIC_SHM
>>>> +        /*
>>>> +         * Static shared memory banks don't have a type, so for them disable
>>>> +         * the exact match check.
>>>> +         */
>>> 
>>> This feels a bit of a hack. Why can't we had a default type like you did in meminfo_add_bank()?
>> This is the structure:
>> struct membank {
>>     paddr_t start;
>>     paddr_t size;
>>     union {
>>         enum membank_type type;
>> #ifdef CONFIG_STATIC_SHM
>>         struct shmem_membank_extra *shmem_extra;
>> #endif
>>     };
>> };
> 
> Anonymous union are really annoying... So effectively in check_reserved_regions_overlap() we are hoping that the caller will not set allow_match_type to another value than MEMBANK_NONE for static memory. This is extremely fragile.
> 
> I can't think of another solution right now, but I am definitely not happy with this approach.

I agree it’s not nice, but the issue is not in check_reserved_regions_overlap(), it is in meminfo_overlap_check() which is static and so confined to that module.

Anyway I’m trying to think about another solution.

> 
>> we did that in order to save space when static shared memory is not enabled, so we can’t have the
>> type for these banks because we are already writing shmem_extra.
>> We could remove the union but in that case we would waste space when static shared memory is
>> enabled.
> 
> Can you remind me how much memory this is going to save?

The space issue comes from:

#define NR_MEM_BANKS 256

[…]

struct meminfo {
    struct membanks_hdr common;
    struct membank bank[NR_MEM_BANKS];
};

struct bootinfo {
    struct meminfo mem;
    /* The reserved regions are only used when booting using Device-Tree */
    struct meminfo reserved_mem;

    […]

#ifdef CONFIG_ACPI
    struct meminfo acpi;
#endif
    […]
};

So in case we remove the union and CONFIG_STATIC_SHM=y and so we allow type and *shmem_extra to coexist, we have 256 * 2 * sizeof(pointer) byte.
Additionaly if CONFIG_ACPI=y, we have 256 * sizeof(pointer) byte more not used space.

Cheers,
Luca
Michal Orzel Nov. 13, 2024, 4:56 p.m. UTC | #15
On 13/11/2024 17:41, Julien Grall wrote:
> 
> 
> Hi,
> 
> On 13/11/2024 15:40, Michal Orzel wrote:
>>
>>
>> On 13/11/2024 15:40, Julien Grall wrote:
>>>
>>>
>>> Hi,
>>>
>>> On 13/11/2024 14:19, Michal Orzel wrote:
>>>>
>>>>
>>>> On 13/11/2024 14:50, Julien Grall wrote:
>>>>>
>>>>>
>>>>> Hi Michal,
>>>>>
>>>>> On 06/11/2024 15:07, Michal Orzel wrote:
>>>>>>
>>>>>>
>>>>>> On 06/11/2024 14:41, Luca Fancellu wrote:
>>>>>>>
>>>>>>>
>>>>>>> There are some cases where the device tree exposes a memory range
>>>>>>> in both /memreserve/ and reserved-memory node, in this case the
>>>>>>> current code will stop Xen to boot since it will find that the
>>>>>>> latter range is clashing with the already recorded /memreserve/
>>>>>>> ranges.
>>>>>>>
>>>>>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>>>>>> in the /memreserve/ part and even in this case this will prevent
>>>>>>> Xen to boot since it will see that the module memory range that
>>>>>>> it is going to add in 'add_boot_module' clashes with a /memreserve/
>>>>>>> range.
>>>>>>>
>>>>>>> When Xen populate the data structure that tracks the memory ranges,
>>>>>>> it also adds a memory type described in 'enum membank_type', so
>>>>>>> in order to fix this behavior, allow the 'check_reserved_regions_overlap'
>>>>>>> function to check for exact memory range match given a specific memory
>>>>>>> type; allowing reserved-memory node ranges and boot modules to have an
>>>>>>> exact match with ranges from /memreserve/.
>>>>>>>
>>>>>>> While there, set a type for the memory recorded during ACPI boot.
>>>>>>>
>>>>>>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to bootinfo.reserved_mem")
>>>>>>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>>>>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>>>>>> ---
>>>>>>> I tested this patch adding the same range in a /memreserve/ entry and
>>>>>>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>>>>>>> I've also tested that a configuration running static shared memory still works
>>>>>>> fine.
>>>>>>> ---
>>>>>> So we have 2 separate issues. I don't particularly like the concept of introducing MEMBANK_NONE
>>>>>> and the changes below look a bit too much for me, given that for boot modules we can only have
>>>>>> /memreserve/ matching initrd.
>>>>>
>>>>> How so? Is this an observation or part of a specification?
>>>> Not sure what specification you would want to see.
>>>
>>> Anything that you bake your observation. My concern with observation is ...
>>>
>>>    It's all part of U-Boot and Linux behavior that is not documented
>>> (except for code comments).
>>>> My statement is based on the U-Boot and Linux behavior. U-Boot part only present for initrd:
>>>> https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249
>>>
>>> ... a user is not forced to use U-boot. So this is not a good reason to
>> I thought that this behavior is solely down to u-boot playing tricks with memreserve.
> 
> Sure we noticed that U-boot is doing some we didn't expect. But this
> really doesn't mean there are not other interesting behavior happening.
> 
>>
>>> rely on it. If Linux starts to rely on it, then it is probably a better
>>> argument, but first I would need to see the code. Can you paste a link?
>> Not sure how I would do that given that it is all scattered.
> 
> There are no requirements to be all scattered.
> 
>  > But if it means sth, here is kexec code> to create fdt. It is clear
> they do the same trick as u-boot.
>> https://github.com/torvalds/linux/blob/master/drivers/of/kexec.c#L355
> 
> Yet this doesn't provide any information why this only has to be an
> exact region... It only tells me the current behavior.
> 
>>
>>>
>>>>
>>>> For things that Xen can be interested in, only region for ramdisk for dom0 can match the /memreserve/ region.
>>>> Providing a generic solution (like Luca did) would want providing an example of sth else that can match which I'm not aware of.
>>>
>>> I would argue this is the other way around. If we are not certain that
>>> /memreserve/ will not be used for any other boot module, then we should
>>> have a generic solution. Otherwise, we will end up with similar weird
>>> issue in the future.
>> We have 3 possible modules for bootloader->kernel workflow: kernel, dtb and ramdisk. The first 2 are not described in DT so I'm not sure
>> what are your examples of bootmodules for which you want kernel know about memory reservation other than ramdisk.
> 
> The DTB is not described but the kernel is. We also have XSM modules.
> All of which could in theory be in memreserve if for some reasons the
> bootloader wanted to preserve the modules for future use (think
> Live-Update)...
> 
> Anyway, to be honest, I don't understand why you are pushing back at a
> more generic solution... Yes this may be what we just notice today, but
> I haven't seen any evidence that it never happen.
> 
> So I would rather go with the generic solution.
My reluctance comes from the fact that it's difficult for me to later justify (if someone asks) a code like that
in the port we maintain because I can't answer the question about the rationale of other modules.
All you wrote is just a theory. Neither you nor anyone seen a code where a bootloader sets up /memreserve/
for sth else than initrd. That's it. I understand that generic solution solves the possible future problems
(the current u-boot behavior dates back to 2014 and nothing new happened since that time) but at least I find
it more difficult to reason about. I can see you might not see it the same way I do, therefore I am fine with
the generic solution.

~Michal
Julien Grall Nov. 13, 2024, 5:24 p.m. UTC | #16
On 13/11/2024 16:56, Michal Orzel wrote:
>>>>> For things that Xen can be interested in, only region for ramdisk for dom0 can match the /memreserve/ region.
>>>>> Providing a generic solution (like Luca did) would want providing an example of sth else that can match which I'm not aware of.
>>>>
>>>> I would argue this is the other way around. If we are not certain that
>>>> /memreserve/ will not be used for any other boot module, then we should
>>>> have a generic solution. Otherwise, we will end up with similar weird
>>>> issue in the future.
>>> We have 3 possible modules for bootloader->kernel workflow: kernel, dtb and ramdisk. The first 2 are not described in DT so I'm not sure
>>> what are your examples of bootmodules for which you want kernel know about memory reservation other than ramdisk.
>>
>> The DTB is not described but the kernel is. We also have XSM modules.
>> All of which could in theory be in memreserve if for some reasons the
>> bootloader wanted to preserve the modules for future use (think
>> Live-Update)...
>>
>> Anyway, to be honest, I don't understand why you are pushing back at a
>> more generic solution... Yes this may be what we just notice today, but
>> I haven't seen any evidence that it never happen.
>>
>> So I would rather go with the generic solution.
> My reluctance comes from the fact that it's difficult for me to later justify (if someone asks) a code like that
> in the port we maintain because I can't answer the question about the rationale of other modules.

You also can't answer why it can't happen. So between the choice of 
trying to be future-proof or handling in an emergency, I would chose the 
former. If you want to handle differently in the AMD port, so be it.

 > All you wrote is just a theory.

Hu, seriously? You are basing your decision on observation on a few 
platforms and then you tell me I theorized?

The main source of truth is the specification. This is ...

> Neither you nor anyone seen a code where a bootloader sets up /memreserve/
> for sth else than initrd. That's it.

the only way we can confirm something doesn't happen is based on the 
specification. If it is not written anywhere then it can happen.

I will Nack any patch/approach that is not attempting to look at the 
specification (if any) unless obviously this will break the guest OSes. 
At which point I will weight the pros and cons.

I don't believe this is the case here. So there is no excuses to try to 
take short cut...

Cheers,
Stefano Stabellini Nov. 13, 2024, 10:41 p.m. UTC | #17
On Wed, 13 Nov 2024, Julien Grall wrote:
> On 13/11/2024 15:40, Michal Orzel wrote:
> > On 13/11/2024 15:40, Julien Grall wrote:
> > > On 13/11/2024 14:19, Michal Orzel wrote:
> > > > On 13/11/2024 14:50, Julien Grall wrote:
> > > > > On 06/11/2024 15:07, Michal Orzel wrote:
> > > > > > On 06/11/2024 14:41, Luca Fancellu wrote:
> > > > > > > There are some cases where the device tree exposes a memory range
> > > > > > > in both /memreserve/ and reserved-memory node, in this case the
> > > > > > > current code will stop Xen to boot since it will find that the
> > > > > > > latter range is clashing with the already recorded /memreserve/
> > > > > > > ranges.
> > > > > > > 
> > > > > > > Furthermore, u-boot lists boot modules ranges, such as ramdisk,
> > > > > > > in the /memreserve/ part and even in this case this will prevent
> > > > > > > Xen to boot since it will see that the module memory range that
> > > > > > > it is going to add in 'add_boot_module' clashes with a
> > > > > > > /memreserve/
> > > > > > > range.
> > > > > > > 
> > > > > > > When Xen populate the data structure that tracks the memory
> > > > > > > ranges,
> > > > > > > it also adds a memory type described in 'enum membank_type', so
> > > > > > > in order to fix this behavior, allow the
> > > > > > > 'check_reserved_regions_overlap'
> > > > > > > function to check for exact memory range match given a specific
> > > > > > > memory
> > > > > > > type; allowing reserved-memory node ranges and boot modules to
> > > > > > > have an
> > > > > > > exact match with ranges from /memreserve/.
> > > > > > > 
> > > > > > > While there, set a type for the memory recorded during ACPI boot.
> > > > > > > 
> > > > > > > Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to
> > > > > > > bootinfo.reserved_mem")
> > > > > > > Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
> > > > > > > Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
> > > > > > > Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> > > > > > > ---
> > > > > > > I tested this patch adding the same range in a /memreserve/ entry
> > > > > > > and
> > > > > > > /reserved-memory node, and by letting u-boot pass a ramdisk.
> > > > > > > I've also tested that a configuration running static shared memory
> > > > > > > still works
> > > > > > > fine.
> > > > > > > ---
> > > > > > So we have 2 separate issues. I don't particularly like the concept
> > > > > > of introducing MEMBANK_NONE
> > > > > > and the changes below look a bit too much for me, given that for
> > > > > > boot modules we can only have
> > > > > > /memreserve/ matching initrd.
> > > > > 
> > > > > How so? Is this an observation or part of a specification?
> > > > Not sure what specification you would want to see.
> > > 
> > > Anything that you bake your observation. My concern with observation is
> > > ...
> > > 
> > >    It's all part of U-Boot and Linux behavior that is not documented
> > > (except for code comments).
> > > > My statement is based on the U-Boot and Linux behavior. U-Boot part only
> > > > present for initrd:
> > > > https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249
> > > 
> > > ... a user is not forced to use U-boot. So this is not a good reason to
> > I thought that this behavior is solely down to u-boot playing tricks with
> > memreserve.
> 
> Sure we noticed that U-boot is doing some we didn't expect. But this really
> doesn't mean there are not other interesting behavior happening.
> 
> > 
> > > rely on it. If Linux starts to rely on it, then it is probably a better
> > > argument, but first I would need to see the code. Can you paste a link?
> > Not sure how I would do that given that it is all scattered. 
> 
> There are no requirements to be all scattered.
> 
> > But if it means sth, here is kexec code> to create fdt. It is clear they do
> the same trick as u-boot.
> > https://github.com/torvalds/linux/blob/master/drivers/of/kexec.c#L355
> 
> Yet this doesn't provide any information why this only has to be an exact
> region... It only tells me the current behavior.
> 
> > 
> > > 
> > > > 
> > > > For things that Xen can be interested in, only region for ramdisk for
> > > > dom0 can match the /memreserve/ region.
> > > > Providing a generic solution (like Luca did) would want providing an
> > > > example of sth else that can match which I'm not aware of.
> > > 
> > > I would argue this is the other way around. If we are not certain that
> > > /memreserve/ will not be used for any other boot module, then we should
> > > have a generic solution. Otherwise, we will end up with similar weird
> > > issue in the future.
> > We have 3 possible modules for bootloader->kernel workflow: kernel, dtb and
> > ramdisk. The first 2 are not described in DT so I'm not sure
> > what are your examples of bootmodules for which you want kernel know about
> > memory reservation other than ramdisk.
> 
> The DTB is not described but the kernel is. We also have XSM modules. All of
> which could in theory be in memreserve if for some reasons the bootloader
> wanted to preserve the modules for future use (think Live-Update)...
> 
> Anyway, to be honest, I don't understand why you are pushing back at a more
> generic solution... Yes this may be what we just notice today, but I haven't
> seen any evidence that it never happen.
> 
> So I would rather go with the generic solution.

I looked into the question: "Is this an observation or part of a
specification?"

Looking at the device tree specification
source/chapter5-flattened-format.rst:"Memory Reservation Block"

It says:

"It is used to protect vital data structures from being overwritten by
the client program." [...] "More specifically, a client program shall
not access memory in a reserved region unless other information provided
by the boot program explicitly indicates that it shall do so."


I think it is better to stay on the safe side and implement in Xen a
more generic behavior to support /memreserve/. It is possible that in a
future board more information could be residing in a /memreserve/
region. For instance, I could imagine EFI runtime services residing in a
/memreserve/ region.

I am a bit confused by ranges that are both in /memreserve/ and
/reserved-memory. Ranges under /memreserve/ should not be accessed at
all (unless otherwise specified), ranges under /reserved-memory are
reserved for specific drivers.

I guess ranges that are both in /memreserve/ and /reserved-memory are
exactly the type of ranges that fall under this statement in the spec:
"unless other information provided by the boot program explicitly
indicates that it shall do so".

The way I see it from the device tree spec, I think Xen should not map
/memreserve/ ranges to Dom0, and it should avoid accessing them itself.
But if a range is both in /memreserve/ and also in /reserved-memory,
then basically /reserved-memory takes precedence, so Xen should map it
to Dom0.

Please have a look at the spec, and let me know if you come to the same
conclusion.
Michal Orzel Nov. 14, 2024, 7:18 a.m. UTC | #18
On 13/11/2024 23:41, Stefano Stabellini wrote:
> 
> 
> On Wed, 13 Nov 2024, Julien Grall wrote:
>> On 13/11/2024 15:40, Michal Orzel wrote:
>>> On 13/11/2024 15:40, Julien Grall wrote:
>>>> On 13/11/2024 14:19, Michal Orzel wrote:
>>>>> On 13/11/2024 14:50, Julien Grall wrote:
>>>>>> On 06/11/2024 15:07, Michal Orzel wrote:
>>>>>>> On 06/11/2024 14:41, Luca Fancellu wrote:
>>>>>>>> There are some cases where the device tree exposes a memory range
>>>>>>>> in both /memreserve/ and reserved-memory node, in this case the
>>>>>>>> current code will stop Xen to boot since it will find that the
>>>>>>>> latter range is clashing with the already recorded /memreserve/
>>>>>>>> ranges.
>>>>>>>>
>>>>>>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>>>>>>> in the /memreserve/ part and even in this case this will prevent
>>>>>>>> Xen to boot since it will see that the module memory range that
>>>>>>>> it is going to add in 'add_boot_module' clashes with a
>>>>>>>> /memreserve/
>>>>>>>> range.
>>>>>>>>
>>>>>>>> When Xen populate the data structure that tracks the memory
>>>>>>>> ranges,
>>>>>>>> it also adds a memory type described in 'enum membank_type', so
>>>>>>>> in order to fix this behavior, allow the
>>>>>>>> 'check_reserved_regions_overlap'
>>>>>>>> function to check for exact memory range match given a specific
>>>>>>>> memory
>>>>>>>> type; allowing reserved-memory node ranges and boot modules to
>>>>>>>> have an
>>>>>>>> exact match with ranges from /memreserve/.
>>>>>>>>
>>>>>>>> While there, set a type for the memory recorded during ACPI boot.
>>>>>>>>
>>>>>>>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to
>>>>>>>> bootinfo.reserved_mem")
>>>>>>>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>>>>>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>>>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>>>>>>> ---
>>>>>>>> I tested this patch adding the same range in a /memreserve/ entry
>>>>>>>> and
>>>>>>>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>>>>>>>> I've also tested that a configuration running static shared memory
>>>>>>>> still works
>>>>>>>> fine.
>>>>>>>> ---
>>>>>>> So we have 2 separate issues. I don't particularly like the concept
>>>>>>> of introducing MEMBANK_NONE
>>>>>>> and the changes below look a bit too much for me, given that for
>>>>>>> boot modules we can only have
>>>>>>> /memreserve/ matching initrd.
>>>>>>
>>>>>> How so? Is this an observation or part of a specification?
>>>>> Not sure what specification you would want to see.
>>>>
>>>> Anything that you bake your observation. My concern with observation is
>>>> ...
>>>>
>>>>    It's all part of U-Boot and Linux behavior that is not documented
>>>> (except for code comments).
>>>>> My statement is based on the U-Boot and Linux behavior. U-Boot part only
>>>>> present for initrd:
>>>>> https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249
>>>>
>>>> ... a user is not forced to use U-boot. So this is not a good reason to
>>> I thought that this behavior is solely down to u-boot playing tricks with
>>> memreserve.
>>
>> Sure we noticed that U-boot is doing some we didn't expect. But this really
>> doesn't mean there are not other interesting behavior happening.
>>
>>>
>>>> rely on it. If Linux starts to rely on it, then it is probably a better
>>>> argument, but first I would need to see the code. Can you paste a link?
>>> Not sure how I would do that given that it is all scattered.
>>
>> There are no requirements to be all scattered.
>>
>>> But if it means sth, here is kexec code> to create fdt. It is clear they do
>> the same trick as u-boot.
>>> https://github.com/torvalds/linux/blob/master/drivers/of/kexec.c#L355
>>
>> Yet this doesn't provide any information why this only has to be an exact
>> region... It only tells me the current behavior.
>>
>>>
>>>>
>>>>>
>>>>> For things that Xen can be interested in, only region for ramdisk for
>>>>> dom0 can match the /memreserve/ region.
>>>>> Providing a generic solution (like Luca did) would want providing an
>>>>> example of sth else that can match which I'm not aware of.
>>>>
>>>> I would argue this is the other way around. If we are not certain that
>>>> /memreserve/ will not be used for any other boot module, then we should
>>>> have a generic solution. Otherwise, we will end up with similar weird
>>>> issue in the future.
>>> We have 3 possible modules for bootloader->kernel workflow: kernel, dtb and
>>> ramdisk. The first 2 are not described in DT so I'm not sure
>>> what are your examples of bootmodules for which you want kernel know about
>>> memory reservation other than ramdisk.
>>
>> The DTB is not described but the kernel is. We also have XSM modules. All of
>> which could in theory be in memreserve if for some reasons the bootloader
>> wanted to preserve the modules for future use (think Live-Update)...
>>
>> Anyway, to be honest, I don't understand why you are pushing back at a more
>> generic solution... Yes this may be what we just notice today, but I haven't
>> seen any evidence that it never happen.
>>
>> So I would rather go with the generic solution.
> 
> I looked into the question: "Is this an observation or part of a
> specification?"
> 
> Looking at the device tree specification
> source/chapter5-flattened-format.rst:"Memory Reservation Block"
> 
> It says:
> 
> "It is used to protect vital data structures from being overwritten by
> the client program." [...] "More specifically, a client program shall
> not access memory in a reserved region unless other information provided
> by the boot program explicitly indicates that it shall do so."
> 
> 
> I think it is better to stay on the safe side and implement in Xen a
> more generic behavior to support /memreserve/. It is possible that in a
> future board more information could be residing in a /memreserve/
> region. For instance, I could imagine EFI runtime services residing in a
> /memreserve/ region.
> 
> I am a bit confused by ranges that are both in /memreserve/ and
> /reserved-memory. Ranges under /memreserve/ should not be accessed at
> all (unless otherwise specified), ranges under /reserved-memory are
> reserved for specific drivers.
> 
> I guess ranges that are both in /memreserve/ and /reserved-memory are
> exactly the type of ranges that fall under this statement in the spec:
> "unless other information provided by the boot program explicitly
> indicates that it shall do so".
> 
> The way I see it from the device tree spec, I think Xen should not map
> /memreserve/ ranges to Dom0, and it should avoid accessing them itself.
> But if a range is both in /memreserve/ and also in /reserved-memory,
> then basically /reserved-memory takes precedence, so Xen should map it
> to Dom0.
> 
> Please have a look at the spec, and let me know if you come to the same
> conclusion.
If we are going to follow the spec here (which I agree to do), then taking into
consideration the following part:

"*shall not* access memory in a reserved region *unless other information* provided
by the boot program explicitly indicates that it shall do so."

the initrd case fits into the exception (otherwise we would not be able to provide initrd to dom0).
Yes, it is described by U-Boot as /memreserve/ but also follows Linux dt binding for "linux,initrd-{start,end}" which can be perceived
as *other information provided by the boot program* and therefore can be used as ramdisk for dom0.

However, I'm not aware of any other official DT binding related to dom0/Linux boot modules other than
initrd, where this *other information* would be present and indicate the possibility of use by client program
i.e. Xen/dom0. This was my only concern, but I decided not to stand in the way.

~Michal
Julien Grall Nov. 14, 2024, 10:10 a.m. UTC | #19
Hi Michal,

On 14/11/2024 07:18, Michal Orzel wrote:
> 
> 
> On 13/11/2024 23:41, Stefano Stabellini wrote:
>>
>>
>> On Wed, 13 Nov 2024, Julien Grall wrote:
>>> On 13/11/2024 15:40, Michal Orzel wrote:
>>>> On 13/11/2024 15:40, Julien Grall wrote:
>>>>> On 13/11/2024 14:19, Michal Orzel wrote:
>>>>>> On 13/11/2024 14:50, Julien Grall wrote:
>>>>>>> On 06/11/2024 15:07, Michal Orzel wrote:
>>>>>>>> On 06/11/2024 14:41, Luca Fancellu wrote:
>>>>>>>>> There are some cases where the device tree exposes a memory range
>>>>>>>>> in both /memreserve/ and reserved-memory node, in this case the
>>>>>>>>> current code will stop Xen to boot since it will find that the
>>>>>>>>> latter range is clashing with the already recorded /memreserve/
>>>>>>>>> ranges.
>>>>>>>>>
>>>>>>>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>>>>>>>> in the /memreserve/ part and even in this case this will prevent
>>>>>>>>> Xen to boot since it will see that the module memory range that
>>>>>>>>> it is going to add in 'add_boot_module' clashes with a
>>>>>>>>> /memreserve/
>>>>>>>>> range.
>>>>>>>>>
>>>>>>>>> When Xen populate the data structure that tracks the memory
>>>>>>>>> ranges,
>>>>>>>>> it also adds a memory type described in 'enum membank_type', so
>>>>>>>>> in order to fix this behavior, allow the
>>>>>>>>> 'check_reserved_regions_overlap'
>>>>>>>>> function to check for exact memory range match given a specific
>>>>>>>>> memory
>>>>>>>>> type; allowing reserved-memory node ranges and boot modules to
>>>>>>>>> have an
>>>>>>>>> exact match with ranges from /memreserve/.
>>>>>>>>>
>>>>>>>>> While there, set a type for the memory recorded during ACPI boot.
>>>>>>>>>
>>>>>>>>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to
>>>>>>>>> bootinfo.reserved_mem")
>>>>>>>>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>>>>>>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>>>>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>>>>>>>> ---
>>>>>>>>> I tested this patch adding the same range in a /memreserve/ entry
>>>>>>>>> and
>>>>>>>>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>>>>>>>>> I've also tested that a configuration running static shared memory
>>>>>>>>> still works
>>>>>>>>> fine.
>>>>>>>>> ---
>>>>>>>> So we have 2 separate issues. I don't particularly like the concept
>>>>>>>> of introducing MEMBANK_NONE
>>>>>>>> and the changes below look a bit too much for me, given that for
>>>>>>>> boot modules we can only have
>>>>>>>> /memreserve/ matching initrd.
>>>>>>>
>>>>>>> How so? Is this an observation or part of a specification?
>>>>>> Not sure what specification you would want to see.
>>>>>
>>>>> Anything that you bake your observation. My concern with observation is
>>>>> ...
>>>>>
>>>>>     It's all part of U-Boot and Linux behavior that is not documented
>>>>> (except for code comments).
>>>>>> My statement is based on the U-Boot and Linux behavior. U-Boot part only
>>>>>> present for initrd:
>>>>>> https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249
>>>>>
>>>>> ... a user is not forced to use U-boot. So this is not a good reason to
>>>> I thought that this behavior is solely down to u-boot playing tricks with
>>>> memreserve.
>>>
>>> Sure we noticed that U-boot is doing some we didn't expect. But this really
>>> doesn't mean there are not other interesting behavior happening.
>>>
>>>>
>>>>> rely on it. If Linux starts to rely on it, then it is probably a better
>>>>> argument, but first I would need to see the code. Can you paste a link?
>>>> Not sure how I would do that given that it is all scattered.
>>>
>>> There are no requirements to be all scattered.
>>>
>>>> But if it means sth, here is kexec code> to create fdt. It is clear they do
>>> the same trick as u-boot.
>>>> https://github.com/torvalds/linux/blob/master/drivers/of/kexec.c#L355
>>>
>>> Yet this doesn't provide any information why this only has to be an exact
>>> region... It only tells me the current behavior.
>>>
>>>>
>>>>>
>>>>>>
>>>>>> For things that Xen can be interested in, only region for ramdisk for
>>>>>> dom0 can match the /memreserve/ region.
>>>>>> Providing a generic solution (like Luca did) would want providing an
>>>>>> example of sth else that can match which I'm not aware of.
>>>>>
>>>>> I would argue this is the other way around. If we are not certain that
>>>>> /memreserve/ will not be used for any other boot module, then we should
>>>>> have a generic solution. Otherwise, we will end up with similar weird
>>>>> issue in the future.
>>>> We have 3 possible modules for bootloader->kernel workflow: kernel, dtb and
>>>> ramdisk. The first 2 are not described in DT so I'm not sure
>>>> what are your examples of bootmodules for which you want kernel know about
>>>> memory reservation other than ramdisk.
>>>
>>> The DTB is not described but the kernel is. We also have XSM modules. All of
>>> which could in theory be in memreserve if for some reasons the bootloader
>>> wanted to preserve the modules for future use (think Live-Update)...
>>>
>>> Anyway, to be honest, I don't understand why you are pushing back at a more
>>> generic solution... Yes this may be what we just notice today, but I haven't
>>> seen any evidence that it never happen.
>>>
>>> So I would rather go with the generic solution.
>>
>> I looked into the question: "Is this an observation or part of a
>> specification?"
>>
>> Looking at the device tree specification
>> source/chapter5-flattened-format.rst:"Memory Reservation Block"
>>
>> It says:
>>
>> "It is used to protect vital data structures from being overwritten by
>> the client program." [...] "More specifically, a client program shall
>> not access memory in a reserved region unless other information provided
>> by the boot program explicitly indicates that it shall do so."
>>
>>
>> I think it is better to stay on the safe side and implement in Xen a
>> more generic behavior to support /memreserve/. It is possible that in a
>> future board more information could be residing in a /memreserve/
>> region. For instance, I could imagine EFI runtime services residing in a
>> /memreserve/ region.
>>
>> I am a bit confused by ranges that are both in /memreserve/ and
>> /reserved-memory. Ranges under /memreserve/ should not be accessed at
>> all (unless otherwise specified), ranges under /reserved-memory are
>> reserved for specific drivers.
>>
>> I guess ranges that are both in /memreserve/ and /reserved-memory are
>> exactly the type of ranges that fall under this statement in the spec:
>> "unless other information provided by the boot program explicitly
>> indicates that it shall do so".
>>
>> The way I see it from the device tree spec, I think Xen should not map
>> /memreserve/ ranges to Dom0, and it should avoid accessing them itself.
>> But if a range is both in /memreserve/ and also in /reserved-memory,
>> then basically /reserved-memory takes precedence, so Xen should map it
>> to Dom0.
>>
>> Please have a look at the spec, and let me know if you come to the same
>> conclusion.
> If we are going to follow the spec here (which I agree to do), then taking into
> consideration the following part:
> 
> "*shall not* access memory in a reserved region *unless other information* provided
> by the boot program explicitly indicates that it shall do so."

This tells me that an OS should not access the region unless there is a 
Device-Tree binding *and* the OS/hypervisor can interpret it. I think we 
agree on that. The part we disagree is...

> 
> the initrd case fits into the exception (otherwise we would not be able to provide initrd to dom0).
> Yes, it is described by U-Boot as /memreserve/ but also follows Linux dt binding for "linux,initrd-{start,end}" which can be perceived
> as *other information provided by the boot program* and therefore can be used as ramdisk for dom0.
> 
> However, I'm not aware of any other official DT binding related to dom0/Linux boot modules other than
> initrd, where this *other information* would be present and indicate the possibility of use by client program
> i.e. Xen/dom0. 

... this one. I don't see why you would not consider the Xen bindings as 
not official. Yes they are not described in Documents/devicetree/ but 
there are bootloader out (e.g. Grub, and to some extend QEMU) that are 
using our bindings. At which point I don't see why they can't decide to 
add the modules in /memreserve/ and why we would forbid it.

Cheers,
Julien Grall Nov. 14, 2024, 10:31 a.m. UTC | #20
Hi Stefano,

On 13/11/2024 22:41, Stefano Stabellini wrote:
> On Wed, 13 Nov 2024, Julien Grall wrote:
>> On 13/11/2024 15:40, Michal Orzel wrote:
>>> On 13/11/2024 15:40, Julien Grall wrote:
>>>> On 13/11/2024 14:19, Michal Orzel wrote:
>>>>> On 13/11/2024 14:50, Julien Grall wrote:
>>>>>> On 06/11/2024 15:07, Michal Orzel wrote:
>>>>>>> On 06/11/2024 14:41, Luca Fancellu wrote:
>>>>>>>> There are some cases where the device tree exposes a memory range
>>>>>>>> in both /memreserve/ and reserved-memory node, in this case the
>>>>>>>> current code will stop Xen to boot since it will find that the
>>>>>>>> latter range is clashing with the already recorded /memreserve/
>>>>>>>> ranges.
>>>>>>>>
>>>>>>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>>>>>>> in the /memreserve/ part and even in this case this will prevent
>>>>>>>> Xen to boot since it will see that the module memory range that
>>>>>>>> it is going to add in 'add_boot_module' clashes with a
>>>>>>>> /memreserve/
>>>>>>>> range.
>>>>>>>>
>>>>>>>> When Xen populate the data structure that tracks the memory
>>>>>>>> ranges,
>>>>>>>> it also adds a memory type described in 'enum membank_type', so
>>>>>>>> in order to fix this behavior, allow the
>>>>>>>> 'check_reserved_regions_overlap'
>>>>>>>> function to check for exact memory range match given a specific
>>>>>>>> memory
>>>>>>>> type; allowing reserved-memory node ranges and boot modules to
>>>>>>>> have an
>>>>>>>> exact match with ranges from /memreserve/.
>>>>>>>>
>>>>>>>> While there, set a type for the memory recorded during ACPI boot.
>>>>>>>>
>>>>>>>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to
>>>>>>>> bootinfo.reserved_mem")
>>>>>>>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>>>>>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>>>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>>>>>>> ---
>>>>>>>> I tested this patch adding the same range in a /memreserve/ entry
>>>>>>>> and
>>>>>>>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>>>>>>>> I've also tested that a configuration running static shared memory
>>>>>>>> still works
>>>>>>>> fine.
>>>>>>>> ---
>>>>>>> So we have 2 separate issues. I don't particularly like the concept
>>>>>>> of introducing MEMBANK_NONE
>>>>>>> and the changes below look a bit too much for me, given that for
>>>>>>> boot modules we can only have
>>>>>>> /memreserve/ matching initrd.
>>>>>>
>>>>>> How so? Is this an observation or part of a specification?
>>>>> Not sure what specification you would want to see.
>>>>
>>>> Anything that you bake your observation. My concern with observation is
>>>> ...
>>>>
>>>>     It's all part of U-Boot and Linux behavior that is not documented
>>>> (except for code comments).
>>>>> My statement is based on the U-Boot and Linux behavior. U-Boot part only
>>>>> present for initrd:
>>>>> https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249
>>>>
>>>> ... a user is not forced to use U-boot. So this is not a good reason to
>>> I thought that this behavior is solely down to u-boot playing tricks with
>>> memreserve.
>>
>> Sure we noticed that U-boot is doing some we didn't expect. But this really
>> doesn't mean there are not other interesting behavior happening.
>>
>>>
>>>> rely on it. If Linux starts to rely on it, then it is probably a better
>>>> argument, but first I would need to see the code. Can you paste a link?
>>> Not sure how I would do that given that it is all scattered.
>>
>> There are no requirements to be all scattered.
>>
>>> But if it means sth, here is kexec code> to create fdt. It is clear they do
>> the same trick as u-boot.
>>> https://github.com/torvalds/linux/blob/master/drivers/of/kexec.c#L355
>>
>> Yet this doesn't provide any information why this only has to be an exact
>> region... It only tells me the current behavior.
>>
>>>
>>>>
>>>>>
>>>>> For things that Xen can be interested in, only region for ramdisk for
>>>>> dom0 can match the /memreserve/ region.
>>>>> Providing a generic solution (like Luca did) would want providing an
>>>>> example of sth else that can match which I'm not aware of.
>>>>
>>>> I would argue this is the other way around. If we are not certain that
>>>> /memreserve/ will not be used for any other boot module, then we should
>>>> have a generic solution. Otherwise, we will end up with similar weird
>>>> issue in the future.
>>> We have 3 possible modules for bootloader->kernel workflow: kernel, dtb and
>>> ramdisk. The first 2 are not described in DT so I'm not sure
>>> what are your examples of bootmodules for which you want kernel know about
>>> memory reservation other than ramdisk.
>>
>> The DTB is not described but the kernel is. We also have XSM modules. All of
>> which could in theory be in memreserve if for some reasons the bootloader
>> wanted to preserve the modules for future use (think Live-Update)...
>>
>> Anyway, to be honest, I don't understand why you are pushing back at a more
>> generic solution... Yes this may be what we just notice today, but I haven't
>> seen any evidence that it never happen.
>>
>> So I would rather go with the generic solution.
> 
> I looked into the question: "Is this an observation or part of a
> specification?"
> 
> Looking at the device tree specification
> source/chapter5-flattened-format.rst:"Memory Reservation Block"
> 
> It says:
> 
> "It is used to protect vital data structures from being overwritten by
> the client program." [...] "More specifically, a client program shall
> not access memory in a reserved region unless other information provided
> by the boot program explicitly indicates that it shall do so."
> 
> 
> I think it is better to stay on the safe side and implement in Xen a
> more generic behavior to support /memreserve/. It is possible that in a
> future board more information could be residing in a /memreserve/
> region. For instance, I could imagine EFI runtime services residing in a
> /memreserve/ region.

I am not 100% sure about this one. The specification implies that if a 
region is reserved, then it would need to be marked as 
EfiReservedMemoryType in the EFI memory map. But for EFI runtime 
services, they should be using EfiRuntimeServicesCode or 
EfiRuntimeServicesData.

> 
> I am a bit confused by ranges that are both in /memreserve/ and
> /reserved-memory. Ranges under /memreserve/ should not be accessed at
> all (unless otherwise specified), ranges under /reserved-memory are
> reserved for specific drivers.

IIUC /memreserve/ is the legacy approach for describing reserved regions.

> I guess ranges that are both in /memreserve/ and /reserved-memory are
> exactly the type of ranges that fall under this statement in the spec:
> "unless other information provided by the boot program explicitly
> indicates that it shall do so".

Yes. The OS would be able to use the range based what /reserved-memory 
says. Note that you can also the describe a region from /memreserve/ 
outside or /reserved-memory (such as the CPU spin table).

> 
> The way I see it from the device tree spec, I think Xen should not map
> /memreserve/ ranges to Dom0, and it should avoid accessing them itself.

See above, Xen should be able to access the regions in /memreserve/. But 
it should map them in the directmap.

> But if a range is both in /memreserve/ and also in /reserved-memory,
> then basically /reserved-memory takes precedence, so Xen should map it
> to Dom0.

Unless Xen needs to use some of them. At which point this will need to 
be excluded from Dom0.

Looking at the code, I think /memreserve/ and /reserved-memory are not 
mapped in Xen and everything in /reserved-memory is mapped to dom0.

Cheers,
Michal Orzel Nov. 14, 2024, 11:48 a.m. UTC | #21
On 14/11/2024 11:31, Julien Grall wrote:
> 
> 
> Hi Stefano,
> 
> On 13/11/2024 22:41, Stefano Stabellini wrote:
>> On Wed, 13 Nov 2024, Julien Grall wrote:
>>> On 13/11/2024 15:40, Michal Orzel wrote:
>>>> On 13/11/2024 15:40, Julien Grall wrote:
>>>>> On 13/11/2024 14:19, Michal Orzel wrote:
>>>>>> On 13/11/2024 14:50, Julien Grall wrote:
>>>>>>> On 06/11/2024 15:07, Michal Orzel wrote:
>>>>>>>> On 06/11/2024 14:41, Luca Fancellu wrote:
>>>>>>>>> There are some cases where the device tree exposes a memory range
>>>>>>>>> in both /memreserve/ and reserved-memory node, in this case the
>>>>>>>>> current code will stop Xen to boot since it will find that the
>>>>>>>>> latter range is clashing with the already recorded /memreserve/
>>>>>>>>> ranges.
>>>>>>>>>
>>>>>>>>> Furthermore, u-boot lists boot modules ranges, such as ramdisk,
>>>>>>>>> in the /memreserve/ part and even in this case this will prevent
>>>>>>>>> Xen to boot since it will see that the module memory range that
>>>>>>>>> it is going to add in 'add_boot_module' clashes with a
>>>>>>>>> /memreserve/
>>>>>>>>> range.
>>>>>>>>>
>>>>>>>>> When Xen populate the data structure that tracks the memory
>>>>>>>>> ranges,
>>>>>>>>> it also adds a memory type described in 'enum membank_type', so
>>>>>>>>> in order to fix this behavior, allow the
>>>>>>>>> 'check_reserved_regions_overlap'
>>>>>>>>> function to check for exact memory range match given a specific
>>>>>>>>> memory
>>>>>>>>> type; allowing reserved-memory node ranges and boot modules to
>>>>>>>>> have an
>>>>>>>>> exact match with ranges from /memreserve/.
>>>>>>>>>
>>>>>>>>> While there, set a type for the memory recorded during ACPI boot.
>>>>>>>>>
>>>>>>>>> Fixes: 53dc37829c31 ("xen/arm: Add DT reserve map regions to
>>>>>>>>> bootinfo.reserved_mem")
>>>>>>>>> Reported-by: Shawn Anastasio <sanastasio@raptorengineering.com>
>>>>>>>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>>>>>>>> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
>>>>>>>>> ---
>>>>>>>>> I tested this patch adding the same range in a /memreserve/ entry
>>>>>>>>> and
>>>>>>>>> /reserved-memory node, and by letting u-boot pass a ramdisk.
>>>>>>>>> I've also tested that a configuration running static shared memory
>>>>>>>>> still works
>>>>>>>>> fine.
>>>>>>>>> ---
>>>>>>>> So we have 2 separate issues. I don't particularly like the concept
>>>>>>>> of introducing MEMBANK_NONE
>>>>>>>> and the changes below look a bit too much for me, given that for
>>>>>>>> boot modules we can only have
>>>>>>>> /memreserve/ matching initrd.
>>>>>>>
>>>>>>> How so? Is this an observation or part of a specification?
>>>>>> Not sure what specification you would want to see.
>>>>>
>>>>> Anything that you bake your observation. My concern with observation is
>>>>> ...
>>>>>
>>>>>     It's all part of U-Boot and Linux behavior that is not documented
>>>>> (except for code comments).
>>>>>> My statement is based on the U-Boot and Linux behavior. U-Boot part only
>>>>>> present for initrd:
>>>>>> https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L249
>>>>>
>>>>> ... a user is not forced to use U-boot. So this is not a good reason to
>>>> I thought that this behavior is solely down to u-boot playing tricks with
>>>> memreserve.
>>>
>>> Sure we noticed that U-boot is doing some we didn't expect. But this really
>>> doesn't mean there are not other interesting behavior happening.
>>>
>>>>
>>>>> rely on it. If Linux starts to rely on it, then it is probably a better
>>>>> argument, but first I would need to see the code. Can you paste a link?
>>>> Not sure how I would do that given that it is all scattered.
>>>
>>> There are no requirements to be all scattered.
>>>
>>>> But if it means sth, here is kexec code> to create fdt. It is clear they do
>>> the same trick as u-boot.
>>>> https://github.com/torvalds/linux/blob/master/drivers/of/kexec.c#L355
>>>
>>> Yet this doesn't provide any information why this only has to be an exact
>>> region... It only tells me the current behavior.
>>>
>>>>
>>>>>
>>>>>>
>>>>>> For things that Xen can be interested in, only region for ramdisk for
>>>>>> dom0 can match the /memreserve/ region.
>>>>>> Providing a generic solution (like Luca did) would want providing an
>>>>>> example of sth else that can match which I'm not aware of.
>>>>>
>>>>> I would argue this is the other way around. If we are not certain that
>>>>> /memreserve/ will not be used for any other boot module, then we should
>>>>> have a generic solution. Otherwise, we will end up with similar weird
>>>>> issue in the future.
>>>> We have 3 possible modules for bootloader->kernel workflow: kernel, dtb and
>>>> ramdisk. The first 2 are not described in DT so I'm not sure
>>>> what are your examples of bootmodules for which you want kernel know about
>>>> memory reservation other than ramdisk.
>>>
>>> The DTB is not described but the kernel is. We also have XSM modules. All of
>>> which could in theory be in memreserve if for some reasons the bootloader
>>> wanted to preserve the modules for future use (think Live-Update)...
>>>
>>> Anyway, to be honest, I don't understand why you are pushing back at a more
>>> generic solution... Yes this may be what we just notice today, but I haven't
>>> seen any evidence that it never happen.
>>>
>>> So I would rather go with the generic solution.
>>
>> I looked into the question: "Is this an observation or part of a
>> specification?"
>>
>> Looking at the device tree specification
>> source/chapter5-flattened-format.rst:"Memory Reservation Block"
>>
>> It says:
>>
>> "It is used to protect vital data structures from being overwritten by
>> the client program." [...] "More specifically, a client program shall
>> not access memory in a reserved region unless other information provided
>> by the boot program explicitly indicates that it shall do so."
>>
>>
>> I think it is better to stay on the safe side and implement in Xen a
>> more generic behavior to support /memreserve/. It is possible that in a
>> future board more information could be residing in a /memreserve/
>> region. For instance, I could imagine EFI runtime services residing in a
>> /memreserve/ region.
> 
> I am not 100% sure about this one. The specification implies that if a
> region is reserved, then it would need to be marked as
> EfiReservedMemoryType in the EFI memory map. But for EFI runtime
> services, they should be using EfiRuntimeServicesCode or
> EfiRuntimeServicesData.
> 
>>
>> I am a bit confused by ranges that are both in /memreserve/ and
>> /reserved-memory. Ranges under /memreserve/ should not be accessed at
>> all (unless otherwise specified), ranges under /reserved-memory are
>> reserved for specific drivers.
> 
> IIUC /memreserve/ is the legacy approach for describing reserved regions.
> 
>> I guess ranges that are both in /memreserve/ and /reserved-memory are
>> exactly the type of ranges that fall under this statement in the spec:
>> "unless other information provided by the boot program explicitly
>> indicates that it shall do so".
> 
> Yes. The OS would be able to use the range based what /reserved-memory
> says. Note that you can also the describe a region from /memreserve/
> outside or /reserved-memory (such as the CPU spin table).
> 
>>
>> The way I see it from the device tree spec, I think Xen should not map
>> /memreserve/ ranges to Dom0, and it should avoid accessing them itself.
> 
> See above, Xen should be able to access the regions in /memreserve/. But
> it should map them in the directmap.
> 
>> But if a range is both in /memreserve/ and also in /reserved-memory,
>> then basically /reserved-memory takes precedence, so Xen should map it
>> to Dom0.
> 
> Unless Xen needs to use some of them. At which point this will need to
> be excluded from Dom0.
> 
> Looking at the code, I think /memreserve/ and /reserved-memory are not
> mapped in Xen and everything in /reserved-memory is mapped to dom0.
Why do we forward /reserved-memory to dom0 fdt but /memreserve/ not? From the discussion
we're having it seems like we should treat them equally. Also, looking at Luca patch,
we seem to special case /memreserve/ and only allow for overlap /memresrve/ with boot modules
and not /reserved-memory/ with boot modules. If we are going to claim that all the boot modules
can be marked as reserved by the bootloader, then I think we should treat them equally providing
the same experience to dom0.

Last thing I wanted to ask (for my clarity) is that if bootloader loads initrd at region A and marks
it as reserved (either in /memreserve/ or /reserved-memory/), and later on Xen copies initrd from region
A to B, shouldn't the reserved memory region be updated to cover new region for initrd?

~Michal
Julien Grall Nov. 14, 2024, 12:04 p.m. UTC | #22
Hi Michal,

On 14/11/2024 11:48, Michal Orzel wrote:
> 
> 
> On 14/11/2024 11:31, Julien Grall wrote:
>> Looking at the code, I think /memreserve/ and /reserved-memory are not
>> mapped in Xen and everything in /reserved-memory is mapped to dom0.
> Why do we forward /reserved-memory to dom0 fdt but /memreserve/ not?

I was wondering the same. The main issue I can think of with 
/memreserve/ is some of the regions will likely be for Xen own usage. So 
we would need to have a way to exclude them from dom0.

 >  From the discussion> we're having it seems like we should treat them 
equally. Also, looking at Luca patch,
> we seem to special case /memreserve/ and only allow for overlap /memresrve/ with boot modules
> and not /reserved-memory/ with boot modules. If we are going to claim that all the boot modules
> can be marked as reserved by the bootloader, then I think we should treat them equally providing
> the same experience to dom0.

In my mind, /memreserved/ and /reserved-memory/ are different. The 
former doesn't say what the region is for, but the latter will indicate it.

So I am not 100% sure how the bootmodule could end up in 
/reserved-memory/ because they are described as part of the multiboot 
modules. Do you have a scenario?

Regardless that, if we decide to allow boot modules in /reserved-memory/ 
then we would need need to rework how the reserved-regions are mapped 
because we don't want the boot modules to be exposed to dom0.

> 
> Last thing I wanted to ask (for my clarity) is that if bootloader loads initrd at region A and marks
> it as reserved (either in /memreserve/ or /reserved-memory/), and later on Xen copies initrd from region
> A to B, shouldn't the reserved memory region be updated to cover new region for initrd?

If we mark the region has a reserved, then we are telling the OS it 
can't use the region. But I am not sure why it would be needed as Xen 
doesn't care how the regions is going to be used by the domain. From a 
domain side, do you see any reason why we would want to mark again the 
region as reserved?

If we didn't copy the initrd, then I would have directly agreed that 
they should be marked as /memreserve/.

Cheers,
Michal Orzel Nov. 14, 2024, 12:22 p.m. UTC | #23
On 14/11/2024 13:04, Julien Grall wrote:
> 
> 
> Hi Michal,
> 
> On 14/11/2024 11:48, Michal Orzel wrote:
>>
>>
>> On 14/11/2024 11:31, Julien Grall wrote:
>>> Looking at the code, I think /memreserve/ and /reserved-memory are not
>>> mapped in Xen and everything in /reserved-memory is mapped to dom0.
>> Why do we forward /reserved-memory to dom0 fdt but /memreserve/ not?
> 
> I was wondering the same. The main issue I can think of with
> /memreserve/ is some of the regions will likely be for Xen own usage. So
Can you give example of regions defined as reserved for Xen usage (other than static-mem)?

> we would need to have a way to exclude them from dom0.
> 
>  >  From the discussion> we're having it seems like we should treat them
> equally. Also, looking at Luca patch,
>> we seem to special case /memreserve/ and only allow for overlap /memresrve/ with boot modules
>> and not /reserved-memory/ with boot modules. If we are going to claim that all the boot modules
>> can be marked as reserved by the bootloader, then I think we should treat them equally providing
>> the same experience to dom0.
> 
> In my mind, /memreserved/ and /reserved-memory/ are different. The
> former doesn't say what the region is for, but the latter will indicate it.
In the context of this patch, I don't agree. We're discussing overlap, and if a region A
from /memreserve/ overlaps fully with a module A, we know what is the purpose of it.
Today it's initrd, but as you say we cannot rule out other modules as well.

> 
> So I am not 100% sure how the bootmodule could end up in
> /reserved-memory/ because they are described as part of the multiboot
> modules. Do you have a scenario?
I don't same as I don't have scenario for /memreserve/ overlapping with sth else than initrd.
All of these comes from my validation of u-boot, grub, barebox code. I have a feeling that due to
U-Boot trick that is not present in any other *known* bootloader, we are trying to over-engineer the problem :)
But as Stefano and you wrote, we should follow the spec and for me we should therefore treat them equally.

> 
> Regardless that, if we decide to allow boot modules in /reserved-memory/
> then we would need need to rework how the reserved-regions are mapped
> because we don't want the boot modules to be exposed to dom0.
+1

> 
>>
>> Last thing I wanted to ask (for my clarity) is that if bootloader loads initrd at region A and marks
>> it as reserved (either in /memreserve/ or /reserved-memory/), and later on Xen copies initrd from region
>> A to B, shouldn't the reserved memory region be updated to cover new region for initrd?
> 
> If we mark the region has a reserved, then we are telling the OS it
> can't use the region. But I am not sure why it would be needed as Xen
Well, in the context of initrd, kernel uses it even though it is reserved. This is because
of the second part of the spec where other bindings come into play.

> doesn't care how the regions is going to be used by the domain. From a
> domain side, do you see any reason why we would want to mark again the
> region as reserved?
TBH I don't same as I still don't know why U-Boot does that trick. It comes from a very
old code and my initial understanding is that it is done purely for U-boot bookkeeping.

> 
> If we didn't copy the initrd, then I would have directly agreed that
> they should be marked as /memreserve/.
> 
> Cheers,
> 
> --
> Julien Grall
> 

~Michal
Julien Grall Nov. 14, 2024, 1:09 p.m. UTC | #24
On 14/11/2024 12:22, Michal Orzel wrote:
> 
> 
> On 14/11/2024 13:04, Julien Grall wrote:
>>
>>
>> Hi Michal,
>>
>> On 14/11/2024 11:48, Michal Orzel wrote:
>>>
>>>
>>> On 14/11/2024 11:31, Julien Grall wrote:
>>>> Looking at the code, I think /memreserve/ and /reserved-memory are not
>>>> mapped in Xen and everything in /reserved-memory is mapped to dom0.
>>> Why do we forward /reserved-memory to dom0 fdt but /memreserve/ not?
>>
>> I was wondering the same. The main issue I can think of with
>> /memreserve/ is some of the regions will likely be for Xen own usage. So
> Can you give example of regions defined as reserved for Xen usage (other than static-mem)?

The spin table to bring-up CPUs.

> 
>> we would need to have a way to exclude them from dom0.
>>
>>   >  From the discussion> we're having it seems like we should treat them
>> equally. Also, looking at Luca patch,
>>> we seem to special case /memreserve/ and only allow for overlap /memresrve/ with boot modules
>>> and not /reserved-memory/ with boot modules. If we are going to claim that all the boot modules
>>> can be marked as reserved by the bootloader, then I think we should treat them equally providing
>>> the same experience to dom0.
>>
>> In my mind, /memreserved/ and /reserved-memory/ are different. The
>> former doesn't say what the region is for, but the latter will indicate it.
> In the context of this patch, I don't agree. We're discussing overlap, and if a region A
> from /memreserve/ overlaps fully with a module A, we know what is the purpose of it.
 > Today it's initrd, but as you say we cannot rule out other modules as 
well.

To give a concrete example, the /reserved-region/ can be used to reserve 
space for the VGA buffer. It would be odd that someone would put the 
boot module in the middle of the VGA buffer... If Xen ends up to use the 
VGA buffer (not the case today), then it would be a problem. Xen would 
need to be reworked to move all boot modules outside of the memory 
because it can access the VGA (or any other reserved regions).

The problem is slightly different for /memreserve/ because we don't 
exactly know what the regions are for until we parse the rest of the DT. 
Yes technically, a user could put the initrd in the wrong place. So the 
problem is the same. But this is a relexation I am more willing to 
accept over /reserved-region/ right now.

>> So I am not 100% sure how the bootmodule could end up in
>> /reserved-memory/ because they are described as part of the multiboot
>> modules. Do you have a scenario?
> I don't same as I don't have scenario for /memreserve/ overlapping with sth else than initrd.
> All of these comes from my validation of u-boot, grub, barebox code. I have a feeling that due to
> U-Boot trick that is not present in any other *known* bootloader, we are trying to over-engineer the problem :)
> But as Stefano and you wrote, we should follow the spec and for me we should therefore treat them equally.

See above why I don't think we should treat them equally today. We might 
be able in the future if we can categorize all the regions in /memreserve/.

[...]

>>> Last thing I wanted to ask (for my clarity) is that if bootloader loads initrd at region A and marks
>>> it as reserved (either in /memreserve/ or /reserved-memory/), and later on Xen copies initrd from region
>>> A to B, shouldn't the reserved memory region be updated to cover new region for initrd?
>>
>> If we mark the region has a reserved, then we are telling the OS it
>> can't use the region. But I am not sure why it would be needed as Xen
> Well, in the context of initrd, kernel uses it even though it is reserved. This is because
> of the second part of the spec where other bindings come into play.
> 
>> doesn't care how the regions is going to be used by the domain. From a
>> domain side, do you see any reason why we would want to mark again the
>> region as reserved?
> TBH I don't same as I still don't know why U-Boot does that trick. It comes from a very
> old code and my initial understanding is that it is done purely for U-boot bookkeeping.

/memreserve/ (and now) /reserved-regions/ are an easy way to find the 
regions that should be excluded from the RAM. Without that, you will 
need to have special case (such as for initrd, and the various xen boot 
moudles). I suspect that Linux have a special case and hence why it 
hasn't been a problem that Xen doesn't reserve the region.

Also, AFAIU, the Image protocol doesn't seem require the region to be 
reserved. It just says:

"""
If an initrd/initramfs is passed to the kernel at boot, it must reside
entirely within a 1 GB aligned physical memory window of up to 32 GB in
size that fully covers the kernel Image as well.

Any memory described to the kernel (even that below the start of the
image) which is not marked as reserved from the kernel (e.g., with a
memreserve region in the device tree) will be considered as available to
the kernel.
"""

Regarding the idea to mark the initird as reserved. My main concern is 
you now double the amount of memory that will be unusuable (AFAIU 
neither Xen nor Linux will be able to use the memory for allocations). I 
feel this is quite a steep price. A potential solution would to be map 
the initrd region rather than copying. Not sure how much code it will 
result, so someone will need to evaluate whether it is worth it.

Cheers,
Grygorii Strashko Nov. 14, 2024, 5:54 p.m. UTC | #25
On 14.11.24 15:09, Julien Grall wrote:
> 
> 
> On 14/11/2024 12:22, Michal Orzel wrote:
>>
>>
>> On 14/11/2024 13:04, Julien Grall wrote:
>>>
>>>
>>> Hi Michal,
>>>
>>> On 14/11/2024 11:48, Michal Orzel wrote:
>>>>
>>>>
>>>> On 14/11/2024 11:31, Julien Grall wrote:
>>>>> Looking at the code, I think /memreserve/ and /reserved-memory are not
>>>>> mapped in Xen and everything in /reserved-memory is mapped to dom0.
>>>> Why do we forward /reserved-memory to dom0 fdt but /memreserve/ not?
>>>
>>> I was wondering the same. The main issue I can think of with
>>> /memreserve/ is some of the regions will likely be for Xen own usage. So
>> Can you give example of regions defined as reserved for Xen usage (other than static-mem)?
> 
> The spin table to bring-up CPUs.
> 
>>
>>> we would need to have a way to exclude them from dom0.
>>>
>>>   >  From the discussion> we're having it seems like we should treat them
>>> equally. Also, looking at Luca patch,
>>>> we seem to special case /memreserve/ and only allow for overlap /memresrve/ with boot modules
>>>> and not /reserved-memory/ with boot modules. If we are going to claim that all the boot modules
>>>> can be marked as reserved by the bootloader, then I think we should treat them equally providing
>>>> the same experience to dom0.
>>>
>>> In my mind, /memreserved/ and /reserved-memory/ are different. The
>>> former doesn't say what the region is for, but the latter will indicate it.
>> In the context of this patch, I don't agree. We're discussing overlap, and if a region A
>> from /memreserve/ overlaps fully with a module A, we know what is the purpose of it.
>  > Today it's initrd, but as you say we cannot rule out other modules as well.
> 
> To give a concrete example, the /reserved-region/ can be used to reserve space for the VGA buffer. It would be odd that someone would put the boot module in the middle of the VGA buffer... If Xen ends up to use the VGA buffer (not the case today), then it would be a problem. Xen would need to be reworked to move all boot modules outside of the memory because it can access the VGA (or any other reserved regions).
> 
> The problem is slightly different for /memreserve/ because we don't exactly know what the regions are for until we parse the rest of the DT. Yes technically, a user could put the initrd in the wrong place. So the problem is the same. But this is a relexation I am more willing to accept over /reserved-region/ right now.
> 
>>> So I am not 100% sure how the bootmodule could end up in
>>> /reserved-memory/ because they are described as part of the multiboot
>>> modules. Do you have a scenario?
>> I don't same as I don't have scenario for /memreserve/ overlapping with sth else than initrd.
>> All of these comes from my validation of u-boot, grub, barebox code. I have a feeling that due to
>> U-Boot trick that is not present in any other *known* bootloader, we are trying to over-engineer the problem :)
>> But as Stefano and you wrote, we should follow the spec and for me we should therefore treat them equally.
> 
> See above why I don't think we should treat them equally today. We might be able in the future if we can categorize all the regions in /memreserve/.
> 
> [...]
> 
>>>> Last thing I wanted to ask (for my clarity) is that if bootloader loads initrd at region A and marks
>>>> it as reserved (either in /memreserve/ or /reserved-memory/), and later on Xen copies initrd from region
>>>> A to B, shouldn't the reserved memory region be updated to cover new region for initrd?
>>>
>>> If we mark the region has a reserved, then we are telling the OS it
>>> can't use the region. But I am not sure why it would be needed as Xen
>> Well, in the context of initrd, kernel uses it even though it is reserved. This is because
>> of the second part of the spec where other bindings come into play.
>>
>>> doesn't care how the regions is going to be used by the domain. From a
>>> domain side, do you see any reason why we would want to mark again the
>>> region as reserved?
>> TBH I don't same as I still don't know why U-Boot does that trick. It comes from a very
>> old code and my initial understanding is that it is done purely for U-boot bookkeeping.
> 
> /memreserve/ (and now) /reserved-regions/ are an easy way to find the regions that should be excluded from the RAM. Without that, you will need to have special case (such as for initrd, and the various xen boot moudles). I suspect that Linux have a special case and hence why it hasn't been a problem that Xen doesn't reserve the region.
> 

My be it will help in this discussion - some investigation results.

At boot time (only ARM64, but other arches looks similar):

- Determines if initrd present from DT : early_init_dt_scan()->setup_machine_fdt()
or by checking bootargs "initrd/initrdmem=".

- [1] In arm64_memblock_init() it adds initrd in reserved memory
https://github.com/torvalds/linux/blob/master/arch/arm64/mm/init.c#L274
therefore Linux doesn't depends on any kind of "DT reserved memory" ranges for initrd.

- The Linux processes "reserved-memory" nodes first.
https://github.com/torvalds/linux/blob/master/drivers/of/fdt.c#L504

- After this FDT memory reserve map is processed:
https://github.com/torvalds/linux/blob/master/drivers/of/fdt.c#L496

The Linux doesn't perform any special processing of FDT memory reserve map -
it just adds it to the reserved memory list (splice allowed), which is reasonable, as
Linux doesn't know what is the purpose of each reserved memory block in FDT memory reserve map.

Later based on DT (or other meaning) some reserved memory can be handled in a special way.

After early_init_fdt_scan_reserved_mem() finishes - the result is the list of reserved memory
ranges which are excluded from generic memory management process.


In case of Xen, initrd is not intended to be used by Xen directly, but should be passed to Dom0.
My current, understanding - it will be copied, so from the Xen point of view the memory occupied by Initrd is
not permanently reserved and can be even freed once Initrd successfully passed in Dom0.
Also, due to the point [1] above no need to pass initrd memory as reserved region to Dom0 Linux based domain.

Honestly, it looks like nothing from FDT memory reserve map make sense to blindly pass to Dom0/hwdom
- initrd (described above)
- FDT - it will be generated
NOTE. From u-boot fdt_shrink_to_minimum() code it can be seen that FDT region also could be in /memreserve/
https://github.com/u-boot/u-boot/blob/master/boot/fdt_support.c#L766
- spin-table : Xen owns it, I assume.
- any kind of secure/firmware memory - domains do not care unless specified explicitly in "reserved-memory" (or by other means).
   Xen will guarantee that such ranges are not touched/allocated, i assume.

So in my opinion.
- native Linux (or other OS) need to know about all reserved memory to avoid clash with generic memory management routines.
- virtual Linux (OS) does not need to know - as Xen should exclude memory, defined as reserved in System device tree,
   when allocating memory for domain.
   if domain need to know about some reserved memory - it is expected be defined in DT explicitly (or by other means).

The Xen performs the primary memory management role in virtualized environment,
so it is the main user of any kind of information about reserved memory.

Do not shot me please I've tried to be helpful...and issue fixed.
BR,
-grygorii
Stefano Stabellini Nov. 14, 2024, 9:41 p.m. UTC | #26
On Thu, 14 Nov 2024, Julien Grall wrote:
> On 14/11/2024 12:22, Michal Orzel wrote:
> > 
> > 
> > On 14/11/2024 13:04, Julien Grall wrote:
> > > 
> > > 
> > > Hi Michal,
> > > 
> > > On 14/11/2024 11:48, Michal Orzel wrote:
> > > > 
> > > > 
> > > > On 14/11/2024 11:31, Julien Grall wrote:
> > > > > Looking at the code, I think /memreserve/ and /reserved-memory are not
> > > > > mapped in Xen and everything in /reserved-memory is mapped to dom0.
> > > > Why do we forward /reserved-memory to dom0 fdt but /memreserve/ not?
> > > 
> > > I was wondering the same. The main issue I can think of with
> > > /memreserve/ is some of the regions will likely be for Xen own usage. So
> > Can you give example of regions defined as reserved for Xen usage (other
> > than static-mem)?
> 
> The spin table to bring-up CPUs.

Yes, maybe my EFI runtime services example was not ideal, but basically
any firmware "stuff" that is expected to survive the end of the
firmware/bootloader boot sequence and should not be accessed directly by
the OS could be under /memreserve/  The spin table is a good example.


> > > we would need to have a way to exclude them from dom0.
> > > 
> > >   >  From the discussion> we're having it seems like we should treat them
> > > equally. Also, looking at Luca patch,
> > > > we seem to special case /memreserve/ and only allow for overlap
> > > > /memresrve/ with boot modules
> > > > and not /reserved-memory/ with boot modules. If we are going to claim
> > > > that all the boot modules
> > > > can be marked as reserved by the bootloader, then I think we should
> > > > treat them equally providing
> > > > the same experience to dom0.
> > > 
> > > In my mind, /memreserved/ and /reserved-memory/ are different. The
> > > former doesn't say what the region is for, but the latter will indicate
> > > it.
> > In the context of this patch, I don't agree. We're discussing overlap, and
> > if a region A
> > from /memreserve/ overlaps fully with a module A, we know what is the
> > purpose of it.
> > Today it's initrd, but as you say we cannot rule out other modules as well.
> 
> To give a concrete example, the /reserved-region/ can be used to reserve space
> for the VGA buffer. It would be odd that someone would put the boot module in
> the middle of the VGA buffer... If Xen ends up to use the VGA buffer (not the
> case today), then it would be a problem. Xen would need to be reworked to move
> all boot modules outside of the memory because it can access the VGA (or any
> other reserved regions).
> 
> The problem is slightly different for /memreserve/ because we don't exactly
> know what the regions are for until we parse the rest of the DT. Yes
> technically, a user could put the initrd in the wrong place. So the problem is
> the same. But this is a relexation I am more willing to accept over
> /reserved-region/ right now.

Looking at the specification, I believe we should handle
/reserved-memory and /memreserve/ differently. Note that I have not
reviewed this patch; my comments are based solely on the expected Xen
behavior at the specification level.

Given that /reserved-memory regions are designated for special driver
use, I think Xen should remap all /reserved-memory regions to Dom0.
Ideally, Xen would determine which driver requires each /reserved-memory
region and assign only the relevant region to the respective domain,
rather than assigning all regions to Dom0. However, for now, we can take
this as an initial simplification.

On the other hand, since /memreserve/ is not intended for general use
and we do not know its content, I believe /memreserve/ should not be
mapped to Dom0 or any domain. It could be included in Xen's directmap,
but Xen itself should not access it. If another device tree node
represents a range or a subset of a range also described in
/memreserve/, that node should take precedence, and the corresponding
range should be assigned to Dom0 (or another domain, such as DomU, or
Xen, as appropriate). Unfortunately, this implies Xen needs to
understand the linux,initrd node.

I think /memreserve/ is valuable for protecting certain areas from the
operating system. For instance, it could be used as a storage area for
guest EFI variables in a VM, where the guest OS should never access the
range. However, I do not believe the initrd range is a good use of
/memreserve/ and similar items, which are meant to be freed anyway and
where Linux already knows the exact range and its intended use. So I
would not change Xen to start reserving the initrd range for Linux. The
likely reason U-Boot uses /memreserve/ is that the U-Boot code
implementing this behavior predates the introduction of
/reserved-memory.
diff mbox series

Patch

diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 199f5260229d..d35c991c856f 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -167,13 +167,14 @@  static bool __init meminfo_add_bank(struct membanks *mem,
     if ( mem->nr_banks >= mem->max_banks )
         return false;
 #ifdef CONFIG_ACPI
-    if ( check_reserved_regions_overlap(start, size) )
+    if ( check_reserved_regions_overlap(start, size, MEMBANK_NONE) )
         return false;
 #endif
 
     bank = &mem->bank[mem->nr_banks];
     bank->start = start;
     bank->size = size;
+    bank->type = MEMBANK_DEFAULT;
 
     mem->nr_banks++;
 
diff --git a/xen/arch/arm/static-shmem.c b/xen/arch/arm/static-shmem.c
index aa80756c3ca5..149ed4b0a5ba 100644
--- a/xen/arch/arm/static-shmem.c
+++ b/xen/arch/arm/static-shmem.c
@@ -696,7 +696,7 @@  int __init process_shm_node(const void *fdt, int node, uint32_t address_cells,
         if (i < mem->max_banks)
         {
             if ( (paddr != INVALID_PADDR) &&
-                 check_reserved_regions_overlap(paddr, size) )
+                 check_reserved_regions_overlap(paddr, size, MEMBANK_NONE) )
                 return -EINVAL;
 
             /* Static shared memory shall be reserved from any other use. */
diff --git a/xen/common/device-tree/bootfdt.c b/xen/common/device-tree/bootfdt.c
index 927f59c64b0d..fb3a6ab95a22 100644
--- a/xen/common/device-tree/bootfdt.c
+++ b/xen/common/device-tree/bootfdt.c
@@ -176,8 +176,15 @@  static int __init device_tree_get_meminfo(const void *fdt, int node,
     for ( i = 0; i < banks && mem->nr_banks < mem->max_banks; i++ )
     {
         device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
+        /*
+         * Some valid device trees, such as those generated by OpenPOWER
+         * skiboot firmware, expose all reserved memory regions in the
+         * FDT memory reservation block AND in the reserved-memory node which
+         * has already been parsed. Thus, any matching overlaps in the
+         * reserved_mem banks should be ignored.
+         */
         if ( mem == bootinfo_get_reserved_mem() &&
-             check_reserved_regions_overlap(start, size) )
+             check_reserved_regions_overlap(start, size, MEMBANK_FDT_RESVMEM) )
             return -EINVAL;
         /* Some DT may describe empty bank, ignore them */
         if ( !size )
diff --git a/xen/common/device-tree/bootinfo.c b/xen/common/device-tree/bootinfo.c
index f2e6a1145b7c..05038075e835 100644
--- a/xen/common/device-tree/bootinfo.c
+++ b/xen/common/device-tree/bootinfo.c
@@ -99,7 +99,8 @@  static void __init dt_unreserved_regions(paddr_t s, paddr_t e,
  */
 static bool __init meminfo_overlap_check(const struct membanks *mem,
                                          paddr_t region_start,
-                                         paddr_t region_size)
+                                         paddr_t region_size,
+                                         enum membank_type allow_match_type)
 {
     paddr_t bank_start = INVALID_PADDR, bank_end = 0;
     paddr_t region_end = region_start + region_size;
@@ -113,12 +114,16 @@  static bool __init meminfo_overlap_check(const struct membanks *mem,
         if ( INVALID_PADDR == bank_start || region_end <= bank_start ||
              region_start >= bank_end )
             continue;
-        else
-        {
-            printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n",
-                   region_start, region_end, i, bank_start, bank_end);
-            return true;
-        }
+
+        if ( (allow_match_type != MEMBANK_NONE) &&
+             (region_start == bank_start) && (region_end == bank_end) &&
+             (allow_match_type == mem->bank[i].type) )
+            continue;
+
+        printk("Region: [%#"PRIpaddr", %#"PRIpaddr") overlapping with bank[%u]: [%#"PRIpaddr", %#"PRIpaddr")\n",
+                region_start, region_end, i, bank_start, bank_end);
+        return true;
+
     }
 
     return false;
@@ -169,9 +174,14 @@  void __init fw_unreserved_regions(paddr_t s, paddr_t e,
  * with the existing reserved memory regions defined in bootinfo.
  * Return true if the input physical address range is overlapping with any
  * existing reserved memory regions, otherwise false.
+ * The 'allow_match_type' parameter can be used to allow exact match of a
+ * region with another memory region already recorded, but it needs to be used
+ * only on memory regions that allows a type (reserved_mem, acpi). For all the
+ * other cases, passing 'MEMBANK_NONE' will disable the exact match.
  */
 bool __init check_reserved_regions_overlap(paddr_t region_start,
-                                           paddr_t region_size)
+                                           paddr_t region_size,
+                                           enum membank_type allow_match_type)
 {
     const struct membanks *mem_banks[] = {
         bootinfo_get_reserved_mem(),
@@ -190,8 +200,21 @@  bool __init check_reserved_regions_overlap(paddr_t region_start,
      * shared memory banks (when static shared memory feature is enabled)
      */
     for ( i = 0; i < ARRAY_SIZE(mem_banks); i++ )
-        if ( meminfo_overlap_check(mem_banks[i], region_start, region_size) )
+    {
+        enum membank_type type = allow_match_type;
+
+#ifdef CONFIG_STATIC_SHM
+        /*
+         * Static shared memory banks don't have a type, so for them disable
+         * the exact match check.
+         */
+        if (mem_banks[i] == bootinfo_get_shmem())
+            type = MEMBANK_NONE;
+#endif
+        if ( meminfo_overlap_check(mem_banks[i], region_start, region_size,
+                                   type) )
             return true;
+    }
 
     /* Check if input region is overlapping with bootmodules */
     if ( bootmodules_overlap_check(&bootinfo.modules,
@@ -216,7 +239,12 @@  struct bootmodule __init *add_boot_module(bootmodule_kind kind,
         return NULL;
     }
 
-    if ( check_reserved_regions_overlap(start, size) )
+    /*
+     * u-boot adds boot module such as ramdisk to the /memreserve/, since these
+     * ranges are saved in reserved_mem at this stage, allow an eventual exact
+     * match with MEMBANK_FDT_RESVMEM banks.
+     */
+    if ( check_reserved_regions_overlap(start, size, MEMBANK_FDT_RESVMEM) )
         return NULL;
 
     for ( i = 0 ; i < mods->nr_mods ; i++ )
diff --git a/xen/include/xen/bootfdt.h b/xen/include/xen/bootfdt.h
index 16fa05f38f38..8f8a7ad882a4 100644
--- a/xen/include/xen/bootfdt.h
+++ b/xen/include/xen/bootfdt.h
@@ -23,6 +23,12 @@  typedef enum {
 }  bootmodule_kind;
 
 enum membank_type {
+    /*
+     * The MEMBANK_NONE type is a special placeholder that should not be applied
+     * to a memory bank, it is used as a special value in some function in order
+     * to disable some feature.
+     */
+    MEMBANK_NONE = -1,
     /*
      * The MEMBANK_DEFAULT type refers to either reserved memory for the
      * device/firmware (when the bank is in 'reserved_mem') or any RAM (when
@@ -158,7 +164,8 @@  struct bootinfo {
 extern struct bootinfo bootinfo;
 
 bool check_reserved_regions_overlap(paddr_t region_start,
-                                    paddr_t region_size);
+                                    paddr_t region_size,
+                                    enum membank_type allow_match_type);
 
 struct bootmodule *add_boot_module(bootmodule_kind kind,
                                    paddr_t start, paddr_t size, bool domU);