diff mbox series

x86/mm: Fix IS_ALIGNED() check in IS_LnE_ALIGNED()

Message ID 20250319142011.138250-1-andrew.cooper3@citrix.com (mailing list archive)
State New
Headers show
Series x86/mm: Fix IS_ALIGNED() check in IS_LnE_ALIGNED() | expand

Commit Message

Andrew Cooper March 19, 2025, 2:20 p.m. UTC
The current CI failures happen to be a latent bug triggered by a narrow set of
properties of the initrd, which CI encountered by chance.

One step during boot involves constructing directmap mappings for modules.
With some probing at the point of creation, it is observed that there's a 4k
mapping missing towards the end of the initrd.

  (XEN) === Mapped Mod1 [0000000394001000, 00000003be1ff6dc] to Directmap
  (XEN) Probing paddr 394001000, va ffff830394001000
  (XEN) Probing paddr 3be1ff6db, va ffff8303be1ff6db
  (XEN) Probing paddr 3bdffffff, va ffff8303bdffffff
  (XEN) Probing paddr 3be001000, va ffff8303be001000
  (XEN) Probing paddr 3be000000, va ffff8303be000000
  (XEN) Early fatal page fault at e008:ffff82d04032014c (cr2=ffff8303be000000, ec=0000)

The conditions for this bug appear to be map_pages_to_xen() call with a non-2M
aligned start address, some number of full 2M pages, then a tail needing 4k
pages.

Anyway, the condition for spotting superpage boundaries in map_pages_to_xen()
is wrong.  The IS_ALIGNED() macro expects a power of two for the alignment
argument, and subtracts 1 itself.

Fixing this causes the failing case to now boot.

Fixes: 97fb6fcf26e8 ("x86/mm: introduce helpers to detect super page alignment")
Debugged-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

Judging by how IS_ALIGNED() is wrong, I think the pre-condition might be
exactly 4k past a 2M boundary, not just simply a non-2M boundary.
---
 xen/arch/x86/mm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


base-commit: 77be740e8182fa6b31291a8ae983d253187e9b50

Comments

Jan Beulich March 19, 2025, 2:29 p.m. UTC | #1
On 19.03.2025 15:20, Andrew Cooper wrote:
> The current CI failures happen to be a latent bug triggered by a narrow set of
> properties of the initrd, which CI encountered by chance.

Plus properties of the host memory map.

> One step during boot involves constructing directmap mappings for modules.
> With some probing at the point of creation, it is observed that there's a 4k
> mapping missing towards the end of the initrd.
> 
>   (XEN) === Mapped Mod1 [0000000394001000, 00000003be1ff6dc] to Directmap
>   (XEN) Probing paddr 394001000, va ffff830394001000
>   (XEN) Probing paddr 3be1ff6db, va ffff8303be1ff6db
>   (XEN) Probing paddr 3bdffffff, va ffff8303bdffffff
>   (XEN) Probing paddr 3be001000, va ffff8303be001000
>   (XEN) Probing paddr 3be000000, va ffff8303be000000
>   (XEN) Early fatal page fault at e008:ffff82d04032014c (cr2=ffff8303be000000, ec=0000)
> 
> The conditions for this bug appear to be map_pages_to_xen() call with a non-2M
> aligned start address, some number of full 2M pages, then a tail needing 4k
> pages.
> 
> Anyway, the condition for spotting superpage boundaries in map_pages_to_xen()
> is wrong.  The IS_ALIGNED() macro expects a power of two for the alignment
> argument, and subtracts 1 itself.
> 
> Fixing this causes the failing case to now boot.
> 
> Fixes: 97fb6fcf26e8 ("x86/mm: introduce helpers to detect super page alignment")
> Debugged-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> Judging by how IS_ALIGNED() is wrong, I think the pre-condition might be
> exactly 4k past a 2M boundary, not just simply a non-2M boundary.

That's the understanding I have gained, yes.

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -5505,7 +5505,7 @@ int map_pages_to_xen(
>                                                                  \
>      ASSERT(!mfn_eq(m_, INVALID_MFN));                           \
>      IS_ALIGNED(PFN_DOWN(v) | mfn_x(m_),                         \
> -               (1UL << (PAGETABLE_ORDER * ((n) - 1))) - 1);     \
> +               (1UL << (PAGETABLE_ORDER * ((n) - 1))));         \

Can we also get rid of the now unnecessary outermost pair of parentheses
of that operand, please? Imo any reduction in parentheses in constructs
like this helps readability.

Jan
Andrew Cooper March 19, 2025, 2:48 p.m. UTC | #2
On 19/03/2025 2:29 pm, Jan Beulich wrote:
> On 19.03.2025 15:20, Andrew Cooper wrote:
>> The current CI failures happen to be a latent bug triggered by a narrow set of
>> properties of the initrd, which CI encountered by chance.
> Plus properties of the host memory map.
>
>> One step during boot involves constructing directmap mappings for modules.
>> With some probing at the point of creation, it is observed that there's a 4k
>> mapping missing towards the end of the initrd.
>>
>>   (XEN) === Mapped Mod1 [0000000394001000, 00000003be1ff6dc] to Directmap
>>   (XEN) Probing paddr 394001000, va ffff830394001000
>>   (XEN) Probing paddr 3be1ff6db, va ffff8303be1ff6db
>>   (XEN) Probing paddr 3bdffffff, va ffff8303bdffffff
>>   (XEN) Probing paddr 3be001000, va ffff8303be001000
>>   (XEN) Probing paddr 3be000000, va ffff8303be000000
>>   (XEN) Early fatal page fault at e008:ffff82d04032014c (cr2=ffff8303be000000, ec=0000)
>>
>> The conditions for this bug appear to be map_pages_to_xen() call with a non-2M
>> aligned start address, some number of full 2M pages, then a tail needing 4k
>> pages.
>>
>> Anyway, the condition for spotting superpage boundaries in map_pages_to_xen()
>> is wrong.  The IS_ALIGNED() macro expects a power of two for the alignment
>> argument, and subtracts 1 itself.
>>
>> Fixing this causes the failing case to now boot.
>>
>> Fixes: 97fb6fcf26e8 ("x86/mm: introduce helpers to detect super page alignment")
>> Debugged-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
>> Judging by how IS_ALIGNED() is wrong, I think the pre-condition might be
>> exactly 4k past a 2M boundary, not just simply a non-2M boundary.
> That's the understanding I have gained, yes.
>
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -5505,7 +5505,7 @@ int map_pages_to_xen(
>>                                                                  \
>>      ASSERT(!mfn_eq(m_, INVALID_MFN));                           \
>>      IS_ALIGNED(PFN_DOWN(v) | mfn_x(m_),                         \
>> -               (1UL << (PAGETABLE_ORDER * ((n) - 1))) - 1);     \
>> +               (1UL << (PAGETABLE_ORDER * ((n) - 1))));         \
> Can we also get rid of the now unnecessary outermost pair of parentheses
> of that operand, please? Imo any reduction in parentheses in constructs
> like this helps readability.

Ok. I'll adjust all of these, and get the fix in.

Thanks.

~Andrew
diff mbox series

Patch

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 03b8319f7a9d..e0793ad3462b 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5505,7 +5505,7 @@  int map_pages_to_xen(
                                                                 \
     ASSERT(!mfn_eq(m_, INVALID_MFN));                           \
     IS_ALIGNED(PFN_DOWN(v) | mfn_x(m_),                         \
-               (1UL << (PAGETABLE_ORDER * ((n) - 1))) - 1);     \
+               (1UL << (PAGETABLE_ORDER * ((n) - 1))));         \
 })
 #define IS_L2E_ALIGNED(v, m) IS_LnE_ALIGNED(v, m, 2)
 #define IS_L3E_ALIGNED(v, m) IS_LnE_ALIGNED(v, m, 3)