
[v2,2/4] x86/mm: special case super page alignment detection for INVALID_MFN

Message ID: 20241106122927.26461-3-roger.pau@citrix.com
State: Superseded
Series: x86/mm: miscellaneous fixes

Commit Message

Roger Pau Monné Nov. 6, 2024, 12:29 p.m. UTC
INVALID_MFN is ~0, and having all bits set to 1 means it can never fulfill
the super-page address alignment checks for L3 and L2 entries.  Special-case
INVALID_MFN so it's considered aligned for all slots.
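
As a minimal standalone sketch of the failing check (not Xen code; the shift
values and helpers below merely mirror x86's), the alignment test fails for
INVALID_MFN even when the virtual address itself is perfectly super-page
aligned:

    #include <stdbool.h>
    #include <stdio.h>

    #define PAGE_SHIFT         12
    #define L2_PAGETABLE_SHIFT 21                 /* 2MiB super-page */
    #define PFN_DOWN(x)        ((x) >> PAGE_SHIFT)
    #define INVALID_MFN        (~0UL)

    /* Mirrors SLOT_IS_ALIGNED(): the OR of both halves must have the
       low bits clear for the mapping to stay a super-page. */
    static bool slot_aligned(unsigned long virt, unsigned long mfn)
    {
        unsigned long mask = (1UL << (L2_PAGETABLE_SHIFT - PAGE_SHIFT)) - 1;

        return !((PFN_DOWN(virt) | mfn) & mask);
    }

    int main(void)
    {
        unsigned long virt = 0x40000000UL;        /* 2MiB-aligned address */

        printf("mfn 0:       %d\n", slot_aligned(virt, 0UL));         /* 1 */
        printf("INVALID_MFN: %d\n", slot_aligned(virt, INVALID_MFN)); /* 0 */
        return 0;
    }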

This fixes a regression introduced by 0b6b51a69f4d, where the switch from 0
to INVALID_MFN caused all super-pages to be shattered when removing
mappings, as such removals now pass INVALID_MFN instead of 0.

Fixes: 0b6b51a69f4d ('xen/mm: Switch map_pages_to_xen to use MFN typesafe')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/include/asm/page.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jan Beulich Nov. 7, 2024, 11:06 a.m. UTC | #1
On 06.11.2024 13:29, Roger Pau Monne wrote:
> --- a/xen/arch/x86/include/asm/page.h
> +++ b/xen/arch/x86/include/asm/page.h
> @@ -202,7 +202,8 @@ static inline l4_pgentry_t l4e_from_paddr(paddr_t pa, unsigned int flags)
>  
>  /* Check if an address is aligned for a given slot level. */
>  #define SLOT_IS_ALIGNED(v, m, s) \
> -    IS_ALIGNED(PFN_DOWN(v) | mfn_x(m), (1UL << ((s) - PAGE_SHIFT)) - 1)
> +    IS_ALIGNED(PFN_DOWN(v) | (mfn_eq(m, INVALID_MFN) ? 0 : mfn_x(m)), \
> +               (1UL << ((s) - PAGE_SHIFT)) - 1)
>  #define IS_L3E_ALIGNED(v, m) SLOT_IS_ALIGNED(v, m, L3_PAGETABLE_SHIFT)
>  #define IS_L2E_ALIGNED(v, m) SLOT_IS_ALIGNED(v, m, L2_PAGETABLE_SHIFT)

With this adjustment it feels yet more important for these macros to
become local ones in x86/mm.c. This special property may not be what one
wants in the general case. And m is now also evaluated twice (really:
once or twice), which a "random" user of the macro may not like.
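
For illustration, a single-evaluation variant could be a helper local to
x86/mm.c along these lines (a hypothetical sketch, relying on Xen's
existing IS_ALIGNED(), PFN_DOWN() and mfn_x() helpers; not code proposed
in the thread):

    /* Evaluate the mfn argument exactly once, then special-case INVALID_MFN. */
    static bool slot_is_aligned(unsigned long v, mfn_t m, unsigned int s)
    {
        unsigned long mfn = mfn_x(m);

        return IS_ALIGNED(PFN_DOWN(v) | (mfn == mfn_x(INVALID_MFN) ? 0 : mfn),
                          (1UL << (s - PAGE_SHIFT)) - 1);
    }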

I'm further uncertain now that this is the way to go to address the
original issue. Notably for the 1G-mapping case it may be better to go
from the incoming flags having _PAGE_PRESENT clear. After all we can
always create non-present "large" PTEs. E.g.

        if ( (cpu_has_page1gb || !(flags & _PAGE_PRESENT)) &&
             IS_L3E_ALIGNED(virt, flags & _PAGE_PRESENT ? mfn : _mfn(0)) &&
             nr_mfns >= (1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT)) &&
             !(flags & (_PAGE_PAT | MAP_SMALL_PAGES)) )

Thoughts?

Jan
Roger Pau Monné Nov. 7, 2024, 3:52 p.m. UTC | #2
On Thu, Nov 07, 2024 at 12:06:41PM +0100, Jan Beulich wrote:
> On 06.11.2024 13:29, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/include/asm/page.h
> > +++ b/xen/arch/x86/include/asm/page.h
> > @@ -202,7 +202,8 @@ static inline l4_pgentry_t l4e_from_paddr(paddr_t pa, unsigned int flags)
> >  
> >  /* Check if an address is aligned for a given slot level. */
> >  #define SLOT_IS_ALIGNED(v, m, s) \
> > -    IS_ALIGNED(PFN_DOWN(v) | mfn_x(m), (1UL << ((s) - PAGE_SHIFT)) - 1)
> > +    IS_ALIGNED(PFN_DOWN(v) | (mfn_eq(m, INVALID_MFN) ? 0 : mfn_x(m)), \
> > +               (1UL << ((s) - PAGE_SHIFT)) - 1)
> >  #define IS_L3E_ALIGNED(v, m) SLOT_IS_ALIGNED(v, m, L3_PAGETABLE_SHIFT)
> >  #define IS_L2E_ALIGNED(v, m) SLOT_IS_ALIGNED(v, m, L2_PAGETABLE_SHIFT)
> 
> With this adjustment it feels yet more important for these macros to
> become local ones in x86/mm.c. This special property may not be what one
> wants in the general case. And m is now also evaluated twice (really:
> once or twice), which a "random" user of the macro may not like.
> 
> I'm further uncertain now that this is the way to go to address the
> original issue. Notably for the 1G-mapping case it may be better to go
> from the incoming flags having _PAGE_PRESENT clear. After all we can
> always create non-present "large" PTEs. E.g.

Hm, I don't think we want to do that in map_pages_to_xen() as part of
this change.  Doing so would possibly imply the freeing of
intermediate page-tables when Xen previously didn't free them.  If the
CPU didn't support 1GB mappings we would always keep the L2, even if
fully empty.  With your proposed change we would now free such L2.

I'm not saying it's a wrong change, but I just didn't want to bundle this
extra change of behavior with a bugfix for an existing issue.

> 
>         if ( (cpu_has_page1gb || !(flags & _PAGE_PRESENT)) &&
>              IS_L3E_ALIGNED(virt, flags & _PAGE_PRESENT ? mfn : _mfn(0)) &&
>              nr_mfns >= (1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT)) &&
>              !(flags & (_PAGE_PAT | MAP_SMALL_PAGES)) )
> 
> Thoughts?

I was doing it based on the mfn because that's how it worked previously,
when 0 was passed instead of INVALID_MFN, and because I think it's cleaner
to hide the evaluation inside IS_LnE_ALIGNED() rather than open-coding it
at every call site.

If we want to do it based on flags it would be best if those were passed
to IS_LnE_ALIGNED(), but again, that might be best done in a followup
patch rather than as part of this bugfix.  I fear it could have unintended
consequences.
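
For the sake of argument, a flags-based variant of the macros might look
as follows (hypothetical; the extra f parameter is not part of the posted
patch):

    #define SLOT_IS_ALIGNED(v, m, f, s)                                \
        IS_ALIGNED(PFN_DOWN(v) | ((f) & _PAGE_PRESENT ? mfn_x(m) : 0), \
                   (1UL << ((s) - PAGE_SHIFT)) - 1)
    #define IS_L3E_ALIGNED(v, m, f) SLOT_IS_ALIGNED(v, m, f, L3_PAGETABLE_SHIFT)
    #define IS_L2E_ALIGNED(v, m, f) SLOT_IS_ALIGNED(v, m, f, L2_PAGETABLE_SHIFT)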

Thanks, Roger.
Jan Beulich Nov. 8, 2024, 7:44 a.m. UTC | #3
On 07.11.2024 16:52, Roger Pau Monné wrote:
> On Thu, Nov 07, 2024 at 12:06:41PM +0100, Jan Beulich wrote:
>> On 06.11.2024 13:29, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/include/asm/page.h
>>> +++ b/xen/arch/x86/include/asm/page.h
>>> @@ -202,7 +202,8 @@ static inline l4_pgentry_t l4e_from_paddr(paddr_t pa, unsigned int flags)
>>>  
>>>  /* Check if an address is aligned for a given slot level. */
>>>  #define SLOT_IS_ALIGNED(v, m, s) \
>>> -    IS_ALIGNED(PFN_DOWN(v) | mfn_x(m), (1UL << ((s) - PAGE_SHIFT)) - 1)
>>> +    IS_ALIGNED(PFN_DOWN(v) | (mfn_eq(m, INVALID_MFN) ? 0 : mfn_x(m)), \
>>> +               (1UL << ((s) - PAGE_SHIFT)) - 1)
>>>  #define IS_L3E_ALIGNED(v, m) SLOT_IS_ALIGNED(v, m, L3_PAGETABLE_SHIFT)
>>>  #define IS_L2E_ALIGNED(v, m) SLOT_IS_ALIGNED(v, m, L2_PAGETABLE_SHIFT)
>>
>> With this adjustment it feels yet more important for these macros to
>> become local ones in x86/mm.c. This special property may not be what one
>> wants in the general case. And m is now also evaluated twice (really:
>> once or twice), which a "random" user of the macro may not like.
>>
>> I'm further uncertain now that this is the way to go to address the
>> original issue. Notably for the 1G-mapping case it may be better to go
>> from the incoming flags having _PAGE_PRESENT clear. After all we can
>> always create non-present "large" PTEs. E.g.
> 
> Hm, I don't think we want to do that in map_pages_to_xen() as part of
> this change.  Doing so would possibly imply the freeing of
> intermediate page-tables when Xen previously didn't free them.  If the
> CPU didn't support 1GB mappings we would always keep the L2, even if
> fully empty.  With your proposed change we would now free such L2.
> 
> I'm not saying it's a wrong change, but I just didn't want to bundle this
> extra change of behavior with a bugfix for an existing issue.

I can understand your concern here; perhaps indeed best to keep that
adjustment separate.

>>         if ( (cpu_has_page1gb || !(flags & _PAGE_PRESENT)) &&
>>              IS_L3E_ALIGNED(virt, flags & _PAGE_PRESENT ? mfn : _mfn(0)) &&
>>              nr_mfns >= (1UL << (L3_PAGETABLE_SHIFT - PAGE_SHIFT)) &&
>>              !(flags & (_PAGE_PAT | MAP_SMALL_PAGES)) )
>>
>> Thoughts?
> 
> I was doing it based on the mfn because that's how it worked previously,
> when 0 was passed instead of INVALID_MFN, and because I think it's cleaner
> to hide the evaluation inside IS_LnE_ALIGNED() rather than open-coding it
> at every call site.
> 
> If we want to do it based on flags it would be best if those were passed
> to IS_LnE_ALIGNED(), but again, that might be best done in a followup
> patch rather than as part of this bugfix.  I fear it could have unintended
> consequences.

Here, however, I view the flags-based approach as simply a different
(and imo more correct) way of addressing the issue at hand. The
special-casing of MFN 0 has always been somewhat bogus imo; it's just that
in the old days we didn't even have a proper INVALID_MFN.

Jan

Patch

diff --git a/xen/arch/x86/include/asm/page.h b/xen/arch/x86/include/asm/page.h
index 6970916d61d5..2fa4061dc77a 100644
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -202,7 +202,8 @@ static inline l4_pgentry_t l4e_from_paddr(paddr_t pa, unsigned int flags)
 
 /* Check if an address is aligned for a given slot level. */
 #define SLOT_IS_ALIGNED(v, m, s) \
-    IS_ALIGNED(PFN_DOWN(v) | mfn_x(m), (1UL << ((s) - PAGE_SHIFT)) - 1)
+    IS_ALIGNED(PFN_DOWN(v) | (mfn_eq(m, INVALID_MFN) ? 0 : mfn_x(m)), \
+               (1UL << ((s) - PAGE_SHIFT)) - 1)
 #define IS_L3E_ALIGNED(v, m) SLOT_IS_ALIGNED(v, m, L3_PAGETABLE_SHIFT)
 #define IS_L2E_ALIGNED(v, m) SLOT_IS_ALIGNED(v, m, L2_PAGETABLE_SHIFT)