diff mbox

[for-4.9] x86/mm: Fix incorrect unmapping of 2MB and 1GB pages

Message ID 1494409385-13729-1-git-send-email-igor.druzhinin@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Igor Druzhinin May 10, 2017, 9:43 a.m. UTC
The same set of functions is used to set as well as to clean
P2M entries, except that for clean operations INVALID_MFN (~0UL)
is passed as a parameter. Unfortunately, when calculating an
appropriate target order for a particular mapping INVALID_MFN
is not taken into account which leads to 4K page target order
being set each time even for 2MB and 1GB mappings. This eventually
breaks down an EPT structure irreversibly into 4K mappings which
prevents consecutive high order mappings to this area.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>

Bugfix intended for 4.9 release.
---
 xen/arch/x86/mm/p2m-ept.c | 3 ++-
 xen/arch/x86/mm/p2m.c     | 8 ++++++--
 2 files changed, 8 insertions(+), 3 deletions(-)

Comments

Jan Beulich May 10, 2017, 10:26 a.m. UTC | #1
>>> On 10.05.17 at 11:43, <igor.druzhinin@citrix.com> wrote:
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -681,6 +681,7 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>      ept_entry_t *table, *ept_entry = NULL;
>      unsigned long gfn_remainder = gfn;
>      unsigned int i, target = order / EPT_TABLE_ORDER;
> +    unsigned long mfn_mask = mfn_valid(mfn) ? mfn_x(mfn) : 0;

Aiui MMIO pages will come here too, so an mfn_valid() check here
(and below) is too lax.

> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -536,6 +536,7 @@ int p2m_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>      struct domain *d = p2m->domain;
>      unsigned long todo = 1ul << page_order;
>      unsigned int order;
> +    unsigned long mfn_mask;

Please move the declaration ...

> @@ -543,12 +544,15 @@ int p2m_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>      while ( todo )
>      {
>          if ( hap_enabled(d) )
> -            order = (!((gfn | mfn_x(mfn) | todo) &
> +        {
> +            mfn_mask = mfn_valid(mfn) ? mfn_x(mfn) : 0;

... here, perhaps at once making this the initializer. However, ...

> +            order = (!((gfn | mfn_mask | todo) &
>                         ((1ul << PAGE_ORDER_1G) - 1)) &&
>                       hap_has_1gb) ? PAGE_ORDER_1G :
> -                    (!((gfn | mfn_x(mfn) | todo) &
> +                    (!((gfn | mfn_mask | todo) &

... seeing the recurring expression, it may be worth considering to
instead introduce a local variable holding "gfn | mfn_mask | todo".

Jan
George Dunlap May 10, 2017, 10:51 a.m. UTC | #2
On 10/05/17 11:26, Jan Beulich wrote:
>>>> On 10.05.17 at 11:43, <igor.druzhinin@citrix.com> wrote:
>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -681,6 +681,7 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>>      ept_entry_t *table, *ept_entry = NULL;
>>      unsigned long gfn_remainder = gfn;
>>      unsigned int i, target = order / EPT_TABLE_ORDER;
>> +    unsigned long mfn_mask = mfn_valid(mfn) ? mfn_x(mfn) : 0;
> 
> Aiui MMIO pages will come here too, so an mfn_valid() check here
> (and below) is too lax.

The resulting order will never be higher than the order passed in by the
caller.  Assuming that the caller is setting an entire 2MiB (or 1GiB)
region as MMIO, is it not valid to set a 2MiB or 1GiB entry as such?
The code seems to be written in such a way that such entries are expected.

 -George
Igor Druzhinin May 10, 2017, 10:55 a.m. UTC | #3
On 10/05/17 11:51, George Dunlap wrote:
> On 10/05/17 11:26, Jan Beulich wrote:
>>>>> On 10.05.17 at 11:43, <igor.druzhinin@citrix.com> wrote:
>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>> @@ -681,6 +681,7 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>>>      ept_entry_t *table, *ept_entry = NULL;
>>>      unsigned long gfn_remainder = gfn;
>>>      unsigned int i, target = order / EPT_TABLE_ORDER;
>>> +    unsigned long mfn_mask = mfn_valid(mfn) ? mfn_x(mfn) : 0;
>>
>> Aiui MMIO pages will come here too, so an mfn_valid() check here
>> (and below) is too lax.
> 
> The resulting order will never be higher than the order passed in by the
> caller.  Assuming that the caller is setting an entire 2MiB (or 1GiB)
> region as MMIO, is it not valid to set a 2MiB or 1GiB entry as such?
> The code seems to be written in such a way that such entries are expected.
> 
>  -George
> 

Using mfn_valid() is my mistake here. I initially used mfn_eq(mfn,
INVALID_MFN) but then mixed them up eventually.

Igor
Jan Beulich May 10, 2017, 12:07 p.m. UTC | #4
>>> On 10.05.17 at 12:51, <george.dunlap@citrix.com> wrote:
> On 10/05/17 11:26, Jan Beulich wrote:
>>>>> On 10.05.17 at 11:43, <igor.druzhinin@citrix.com> wrote:
>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>> @@ -681,6 +681,7 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>>>      ept_entry_t *table, *ept_entry = NULL;
>>>      unsigned long gfn_remainder = gfn;
>>>      unsigned int i, target = order / EPT_TABLE_ORDER;
>>> +    unsigned long mfn_mask = mfn_valid(mfn) ? mfn_x(mfn) : 0;
>> 
>> Aiui MMIO pages will come here too, so an mfn_valid() check here
>> (and below) is too lax.
> 
> The resulting order will never be higher than the order passed in by the
> caller.  Assuming that the caller is setting an entire 2MiB (or 1GiB)
> region as MMIO, is it not valid to set a 2MiB or 1GiB entry as such?
> The code seems to be written in such a way that such entries are expected.

In the case here you're right (albeit that's pretty implicit). In the
p2m_set_entry() case though the order is being determined from
the (gfn,mfn,todo) tuple, so I don't think the code could have
been left as is (seeing that Igor has already sent v2).

Jan
diff mbox

Patch

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index f37a1f2..8d82097 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -681,6 +681,7 @@  ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     ept_entry_t *table, *ept_entry = NULL;
     unsigned long gfn_remainder = gfn;
     unsigned int i, target = order / EPT_TABLE_ORDER;
+    unsigned long mfn_mask = mfn_valid(mfn) ? mfn_x(mfn) : 0;
     int ret, rc = 0;
     bool_t entry_written = 0;
     bool_t direct_mmio = (p2mt == p2m_mmio_direct);
@@ -701,7 +702,7 @@  ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
      * 2. gfn not exceeding guest physical address width.
      * 3. passing a valid order.
      */
-    if ( ((gfn | mfn_x(mfn)) & ((1UL << order) - 1)) ||
+    if ( ((gfn | mfn_mask) & ((1UL << order) - 1)) ||
          ((u64)gfn >> ((ept->wl + 1) * EPT_TABLE_ORDER)) ||
          (order % EPT_TABLE_ORDER) )
         return -EINVAL;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index ae70a92..fd57d41 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -536,6 +536,7 @@  int p2m_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     struct domain *d = p2m->domain;
     unsigned long todo = 1ul << page_order;
     unsigned int order;
+    unsigned long mfn_mask;
     int set_rc, rc = 0;
 
     ASSERT(gfn_locked_by_me(p2m, gfn));
@@ -543,12 +544,15 @@  int p2m_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     while ( todo )
     {
         if ( hap_enabled(d) )
-            order = (!((gfn | mfn_x(mfn) | todo) &
+        {
+            mfn_mask = mfn_valid(mfn) ? mfn_x(mfn) : 0;
+            order = (!((gfn | mfn_mask | todo) &
                        ((1ul << PAGE_ORDER_1G) - 1)) &&
                      hap_has_1gb) ? PAGE_ORDER_1G :
-                    (!((gfn | mfn_x(mfn) | todo) &
+                    (!((gfn | mfn_mask | todo) &
                        ((1ul << PAGE_ORDER_2M) - 1)) &&
                      hap_has_2mb) ? PAGE_ORDER_2M : PAGE_ORDER_4K;
+        }
         else
             order = 0;