[v3,2/2] xen/arm: Enlarge identity map space to 127TB

Message ID 20230914021734.1395472-3-leo.yan@linaro.org (mailing list archive)
State Superseded
Series xen/arm: Enlarge identity map space

Commit Message

Leo Yan Sept. 14, 2023, 2:17 a.m. UTC
On ADLink AVA platform (Ampere Altra SoC with 32 Arm Neoverse N1 cores),
the physical memory regions are:

  DRAM memory regions:
    Node[0] Region[0]: 0x000080000000 - 0x0000ffffffff
    Node[0] Region[1]: 0x080000000000 - 0x08007fffffff
    Node[0] Region[2]: 0x080100000000 - 0x0807ffffffff

UEFI loads the Xen hypervisor and the DTB into high memory, while the
kernel and ramdisk images are loaded into low memory:

  (XEN) MODULE[0]: 00000807f6df0000 - 00000807f6f3e000 Xen
  (XEN) MODULE[1]: 00000807f8054000 - 00000807f8056000 Device Tree
  (XEN) MODULE[2]: 00000000fa834000 - 00000000fc5de1d5 Ramdisk
  (XEN) MODULE[3]: 00000000fc5df000 - 00000000ffb3f810 Kernel

In this case, the Xen binary is loaded above 8TB, which exceeds the
maximum supported identity map space of 2TB in Xen. Consequently, the
system fails to boot.

This patch enlarges identity map space to 127TB, allowing module loading
within the range of [0x0 .. 0x00007eff_ffff_ffff].
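The slot arithmetic behind the failure can be sketched as follows. With
4KB pages each zeroeth-level (L0) entry spans 1 << 39 bytes (512GB,
matching SLOT0_ENTRY_BITS in layout.h), so the reported Xen load address
lands in an L0 slot well beyond the old limit of 4. This is a standalone
illustration; l0_slot() and id_map_allowed() are hypothetical helper
names, not Xen functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Each L0 entry covers 1 << 39 bytes (512GB), as in layout.h. */
#define SLOT0_ENTRY_BITS 39

/* Hypothetical helper: L0 slot that a physical address falls into. */
static unsigned int l0_slot(uint64_t paddr)
{
    /* 512 L0 entries of 512GB each give a 48-bit address space. */
    assert(paddr < (1ULL << 48));
    return (unsigned int)(paddr >> SLOT0_ENTRY_BITS);
}

/*
 * Hypothetical helper mirroring the check that guards the panic():
 * only slots [0 .. nr_l0 - 1] may hold the identity mapping.
 */
static bool id_map_allowed(uint64_t paddr, unsigned int nr_l0)
{
    return l0_slot(paddr) < nr_l0;
}
```

Calling id_map_allowed() with the MODULE[0] start address from the log
above and nr_l0 set to 4 reports that the address is out of range, while
nr_l0 set to 254 accepts it.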

Note that despite expanding the identity map to 127TB, the frame table
still only supports 2TB.  The frame table is a data structure for page
management and does not need to cover the gaps in the memory layout
(see pfn_pdx_hole_setup(), where Xen removes the biggest gap from the
memory regions).  Thus, 2TB of memory support remains sufficient for
most use cases.
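For reference, the 2TB figure can be checked against the layout comment
in layout.h, which reserves a 32GB window for the frametable at 56 bytes
per page. The sketch below only redoes that arithmetic; it assumes 4KB
pages, and frametable_bytes() is a hypothetical helper, not a Xen
function.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4KB pages */

/*
 * Hypothetical helper: bytes of frametable needed to describe
 * ram_bytes of (compressed) RAM at entry_bytes per page.
 */
static uint64_t frametable_bytes(uint64_t ram_bytes, uint64_t entry_bytes)
{
    /* Only whole pages get a frametable entry. */
    assert((ram_bytes & ((1ULL << PAGE_SHIFT) - 1)) == 0);
    return (ram_bytes >> PAGE_SHIFT) * entry_bytes;
}
```

With 2TB of RAM and 56-byte entries this yields 28GB, which fits in the
32GB window between the 32G and 64G relative offsets.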

Fixes: 1c78d76b67 ("xen/arm64: mm: Introduce helpers to prepare/enable/disable")
Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 xen/arch/arm/arm64/mm.c               | 6 ++++--
 xen/arch/arm/include/asm/mmu/layout.h | 8 ++++----
 2 files changed, 8 insertions(+), 6 deletions(-)

Comments

Julien Grall Sept. 14, 2023, 10:23 a.m. UTC | #1
Hi,

On 14/09/2023 03:17, Leo Yan wrote:
> On ADLink AVA platform (Ampere Altra SoC with 32 Arm Neoverse N1 cores),
> the physical memory regions are:
> 
>    DRAM memory regions:
>      Node[0] Region[0]: 0x000080000000 - 0x0000ffffffff
>      Node[0] Region[1]: 0x080000000000 - 0x08007fffffff
>      Node[0] Region[2]: 0x080100000000 - 0x0807ffffffff
> 
> The UEFI loads Xen hypervisor and DTB into the high memory, the kernel
> and ramdisk images are loaded into the low memory space:
> 
>    (XEN) MODULE[0]: 00000807f6df0000 - 00000807f6f3e000 Xen
>    (XEN) MODULE[1]: 00000807f8054000 - 00000807f8056000 Device Tree
>    (XEN) MODULE[2]: 00000000fa834000 - 00000000fc5de1d5 Ramdisk
>    (XEN) MODULE[3]: 00000000fc5df000 - 00000000ffb3f810 Kernel
> 
> In this case, the Xen binary is loaded above 8TB, which exceeds the
> maximum supported identity map space of 2TB in Xen. Consequently, the
> system fails to boot.
> 
> This patch enlarges identity map space to 127TB, allowing module loading
> within the range of [0x0 .. 0x00007eff_ffff_ffff].

On v2 you wrote:

"
When I reviewed the existed code, I found it reserves 125TiB:

   0x0000028000000000 - 0x00007fffffffffff (125TB, L0 slots [5..255])
     Unused

  Seems to me, we can map this area.  Ideally, if we only map for the
  first level's page table, we can just fill the zeroeth page and don't
  need to allocate extra page tables.
"

I agree that we will not allocate page-tables for the whole reserved 
region. However, my concern was more related to the fact that it would 
be more difficult to reclaim space in the virtual address if necessary 
in the future.

So I would prefer not to use the whole 127 TiB unless it is
necessary. For your platform, it seems that it would be enough to
bump the area to 10 TB (8TB plus some margin).
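A quick sketch of what a 10TB identity map area would mean in L0 slots,
assuming the same 512GB-per-slot layout as above. SLOT0() here is a
simplified stand-in for the macro in layout.h, and
id_map_slots_for_tb() is a hypothetical helper; the value 10 is only the
figure suggested in this reply, not a value taken from any later version
of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT0_ENTRY_BITS 39
/* Simplified stand-in for SLOT0(): base address of an L0 slot. */
#define SLOT0(slot) ((uint64_t)(slot) << SLOT0_ENTRY_BITS)

/* Hypothetical helper: L0 slots needed to identity-map the first tb TB. */
static unsigned int id_map_slots_for_tb(unsigned int tb)
{
    /* The whole 48-bit space is 256TB; 1TB is two 512GB slots. */
    assert(tb <= 256);
    return tb * 2;
}
```

Under these assumptions a 10TB area takes 20 L0 slots, which still
covers slot 16 where Xen is loaded on this platform, and the Xen virtual
mapping would start at SLOT0(20), i.e. 0x0a0000000000. This is the
inverse of the ">> 1" used in the new panic() message to turn a slot
count back into TB.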

> 
> Note, despite this expansion of the identity map to 127TB, the frame
> table still only supports 2TB.  The reason is the frame table is data
> structure for the page management, which does not require coverage of
> the memory layout gaps (refer to pfn_pdx_hole_setup() for Xen removing
> the biggest gap from memory regions).

This is not quite correct. PDX can only compress the bottom bits (if
they are all zero) and one region of the address, so some holes may
still be covered by the frametable.

It might be possible that for your platform, the compression is enough 
to fit everything in 2TB.
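To illustrate the "one region in the address" part, here is a
simplified model of removing a single run of always-zero PFN bits. It is
not Xen's code: pfn_to_pdx_model() and its parameters are hypothetical,
the bottom-bit compression is left out, and the hole position used in
the note below is hand-picked from the DRAM regions listed in the commit
message rather than computed the way Xen would.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model (assumption, not Xen's implementation): squeeze out
 * hole_bits bits that are zero in every valid PFN, starting at bit
 * position bottom_bits.
 */
static uint64_t pfn_to_pdx_model(uint64_t pfn, unsigned int bottom_bits,
                                 unsigned int hole_bits)
{
    uint64_t bottom_mask = (1ULL << bottom_bits) - 1;
    uint64_t top_mask = ~((1ULL << (bottom_bits + hole_bits)) - 1);

    /* The bits inside the hole must be zero for the PFN to be valid. */
    assert((pfn & ~(bottom_mask | top_mask)) == 0);

    /* Keep the low bits, shift the high bits down over the hole. */
    return (pfn & bottom_mask) | ((pfn & top_mask) >> hole_bits);
}
```

For the regions in this report, PFN bits 23..30 are zero in every
region, so with bottom_bits = 23 and hole_bits = 8 the highest PFN
(0x807fffff) maps to index 0xffffff in this model. Any second gap in the
layout would not be removed, which is why some holes can remain covered.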

But I would drop this paragraph. The decision to enlarge the identity
mapping is independent of the size of the frametable. You may have a
platform where the first RAM bank is high in memory (such as on AMD
Seattle). Some changes might also be necessary in Xen to support a
frametable covering more than 2TB.

Cheers,

Patch

diff --git a/xen/arch/arm/arm64/mm.c b/xen/arch/arm/arm64/mm.c
index 78b7c7eb00..cb69df0661 100644
--- a/xen/arch/arm/arm64/mm.c
+++ b/xen/arch/arm/arm64/mm.c
@@ -41,7 +41,8 @@  static void __init prepare_boot_identity_mapping(void)
     clear_page(boot_third_id);
 
     if ( id_offsets[0] >= IDENTITY_MAPPING_AREA_NR_L0 )
-        panic("Cannot handle ID mapping above 2TB\n");
+        panic("Cannot handle ID mapping above %uTB\n",
+              IDENTITY_MAPPING_AREA_NR_L0 >> 1);
 
     /* Link first ID table */
     pte = mfn_to_xen_entry(virt_to_mfn(boot_first_id), MT_NORMAL);
@@ -74,7 +75,8 @@  static void __init prepare_runtime_identity_mapping(void)
     DECLARE_OFFSETS(id_offsets, id_addr);
 
     if ( id_offsets[0] >= IDENTITY_MAPPING_AREA_NR_L0 )
-        panic("Cannot handle ID mapping above 2TB\n");
+        panic("Cannot handle ID mapping above %uTB\n",
+              IDENTITY_MAPPING_AREA_NR_L0 >> 1);
 
     /* Link first ID table */
     pte = pte_of_xenaddr((vaddr_t)xen_first_id);
diff --git a/xen/arch/arm/include/asm/mmu/layout.h b/xen/arch/arm/include/asm/mmu/layout.h
index 2cb2382fbf..fa16d07d0d 100644
--- a/xen/arch/arm/include/asm/mmu/layout.h
+++ b/xen/arch/arm/include/asm/mmu/layout.h
@@ -19,11 +19,11 @@ 
  *   2G -   4G   Domheap: on-demand-mapped
  *
  * ARM64 layout:
- * 0x0000000000000000 - 0x000001ffffffffff (2TB, L0 slots [0..3])
+ * 0x0000000000000000 - 0x00007effffffffff (127TB, L0 slots [0..253])
  *
  *  Reserved to identity map Xen
  *
- * 0x0000020000000000 - 0x0000027fffffffff (512GB, L0 slot [4])
+ * 0x00007f0000000000 - 0x00007f7fffffffff (512GB, L0 slot [254])
  *  (Relative offsets)
  *   0  -   2M   Unmapped
  *   2M -  10M   Xen text, data, bss
@@ -35,7 +35,7 @@ 
  *
  *  32G -  64G   Frametable: 56 bytes per page for 2TB of RAM
  *
- * 0x0000028000000000 - 0x00007fffffffffff (125TB, L0 slots [5..255])
+ * 0x00007f8000000000 - 0x00007fffffffffff (512GB, L0 slot [255])
  *  Unused
  *
  * 0x0000800000000000 - 0x000084ffffffffff (5TB, L0 slots [256..265])
@@ -49,7 +49,7 @@ 
 #define XEN_VIRT_START          _AT(vaddr_t, MB(2))
 #else
 
-#define IDENTITY_MAPPING_AREA_NR_L0     4
+#define IDENTITY_MAPPING_AREA_NR_L0     254
 #define XEN_VM_MAPPING                  SLOT0(IDENTITY_MAPPING_AREA_NR_L0)
 
 #define SLOT0_ENTRY_BITS  39