diff mbox series

[for-4.19?,2/2] xen/x86: remove foreign mappings from the p2m on teardown

Message ID 20240430165845.81696-3-roger.pau@citrix.com (mailing list archive)
State New
Headers show
Series xen/x86: support foreign mappings for HVM | expand

Commit Message

Roger Pau Monne April 30, 2024, 4:58 p.m. UTC
Iterate over the p2m based on the maximum recorded gfn and remove any foreign
mappings, in order to drop the underlying page references and thus avoid
keeping extra page references if a domain is destroyed while still having
foreign mappings in its p2m.

The logic is similar to the one used on Arm.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
I still have to test with destroying a guest that does have foreign mappings
in its p2m.
---
 CHANGELOG.md                   |  1 +
 xen/arch/x86/domain.c          |  6 +++
 xen/arch/x86/include/asm/p2m.h | 17 +++++----
 xen/arch/x86/mm/p2m.c          | 68 ++++++++++++++++++++++++++++++++--
 4 files changed, 81 insertions(+), 11 deletions(-)

Comments

Jan Beulich May 6, 2024, 10:41 a.m. UTC | #1
On 30.04.2024 18:58, Roger Pau Monne wrote:
> @@ -2695,6 +2691,70 @@ int p2m_set_altp2m_view_visibility(struct domain *d, unsigned int altp2m_idx,
>      return rc;
>  }
>  
> +/*
> + * Remove foreign mappings from the p2m, as that drops the page reference taken
> + * when mapped.
> + */
> +int relinquish_p2m_mapping(struct domain *d)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);

Are there any guarantees made anywhere that altp2m-s and nested P2Ms can't
hold foreign mappings? p2m_entry_modify() certainly treats them all the same.

> +    unsigned long gfn = gfn_x(p2m->max_gfn);
> +    int rc = 0;
> +
> +    if ( !paging_mode_translate(d) )
> +        return 0;
> +
> +    BUG_ON(!d->is_dying);
> +
> +    p2m_lock(p2m);
> +
> +    /* Iterate over the whole p2m on debug builds to ensure correctness. */
> +    while ( gfn && (IS_ENABLED(CONFIG_DEBUG) || p2m->nr_foreign) )
> +    {
> +        unsigned int order;
> +        p2m_type_t t;
> +        p2m_access_t a;
> +
> +        _get_gfn_type_access(p2m, _gfn(gfn - 1), &t, &a, 0, &order, 0);
> +        ASSERT(IS_ALIGNED(gfn, 1u << order));

This heavily relies on the sole place where max_gfn is updated being indeed
sufficient.

> +        gfn -= 1 << order;

Please be consistent with the kind of 1 you shift left. Perhaps anyway both
better as 1UL.

> +        if ( t == p2m_map_foreign )
> +        {
> +            ASSERT(p2m->nr_foreign);
> +            ASSERT(order == 0);
> +            /*
> +             * Foreign mappings can only be of order 0, hence there's no need
> +             * to align the gfn to the entry order.  Otherwise we would need to
> +             * adjust gfn to point to the start of the page if order > 0.
> +             */

I'm a little irritated by this comment. Ahead of the enclosing if() you
already rely on (and assert) GFN being suitably aligned.

> +            rc = p2m_set_entry(p2m, _gfn(gfn), INVALID_MFN, order, p2m_invalid,
> +                               p2m->default_access);
> +            if ( rc )
> +            {
> +                printk(XENLOG_ERR
> +                       "%pd: failed to unmap foreign page %" PRI_gfn " order %u error %d\n",
> +                       d, gfn, order, rc);
> +                ASSERT_UNREACHABLE();
> +                break;
> +            }

Together with the updating of ->max_gfn further down, for a release build
this means: A single attempt to clean up the domain would fail when such a
set-entry fails. However, another attempt to clean up despite the earlier
error would then not retry for the failed GFN, but continue one below.
That's unexpected: I'd either see such a domain remain as a zombie forever,
or a best effort continuation of all cleanup right away.

> +        }
> +
> +        if ( !(gfn & 0xfff) && hypercall_preempt_check() )

By going from gfn's low bits you may check way more often than necessary
when encountering large pages.

Jan
Roger Pau Monne May 6, 2024, 2:56 p.m. UTC | #2
On Mon, May 06, 2024 at 12:41:49PM +0200, Jan Beulich wrote:
> On 30.04.2024 18:58, Roger Pau Monne wrote:
> > @@ -2695,6 +2691,70 @@ int p2m_set_altp2m_view_visibility(struct domain *d, unsigned int altp2m_idx,
> >      return rc;
> >  }
> >  
> > +/*
> > + * Remove foreign mappings from the p2m, as that drops the page reference taken
> > + * when mapped.
> > + */
> > +int relinquish_p2m_mapping(struct domain *d)
> > +{
> > +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> 
> Are there any guarantees made anywhere that altp2m-s and nested P2Ms can't
> hold foreign mappings? p2m_entry_modify() certainly treats them all the same.

Good point, I will disable those initially, as I don't think there's a
need right now for foreign mappings in altp2m-s and nested p2ms.

> > +    unsigned long gfn = gfn_x(p2m->max_gfn);
> > +    int rc = 0;
> > +
> > +    if ( !paging_mode_translate(d) )
> > +        return 0;
> > +
> > +    BUG_ON(!d->is_dying);
> > +
> > +    p2m_lock(p2m);
> > +
> > +    /* Iterate over the whole p2m on debug builds to ensure correctness. */
> > +    while ( gfn && (IS_ENABLED(CONFIG_DEBUG) || p2m->nr_foreign) )
> > +    {
> > +        unsigned int order;
> > +        p2m_type_t t;
> > +        p2m_access_t a;
> > +
> > +        _get_gfn_type_access(p2m, _gfn(gfn - 1), &t, &a, 0, &order, 0);
> > +        ASSERT(IS_ALIGNED(gfn, 1u << order));
> 
> This heavily relies on the sole place where max_gfn is updated being indeed
> sufficient.
> 
> > +        gfn -= 1 << order;
> 
> Please be consistent with the kind of 1 you shift left. Perhaps anyway both
> better as 1UL.
> 
> > +        if ( t == p2m_map_foreign )
> > +        {
> > +            ASSERT(p2m->nr_foreign);
> > +            ASSERT(order == 0);
> > +            /*
> > +             * Foreign mappings can only be of order 0, hence there's no need
> > +             * to align the gfn to the entry order.  Otherwise we would need to
> > +             * adjust gfn to point to the start of the page if order > 0.
> > +             */
> 
> I'm a little irritated by this comment. Ahead of the enclosing if() you
> already rely on (and assert) GFN being suitably aligned.

Oh, I've added that outer assert later, and then didn't remove this
comment.

> > +            rc = p2m_set_entry(p2m, _gfn(gfn), INVALID_MFN, order, p2m_invalid,
> > +                               p2m->default_access);
> > +            if ( rc )
> > +            {
> > +                printk(XENLOG_ERR
> > +                       "%pd: failed to unmap foreign page %" PRI_gfn " order %u error %d\n",
> > +                       d, gfn, order, rc);
> > +                ASSERT_UNREACHABLE();
> > +                break;
> > +            }
> 
> Together with the updating of ->max_gfn further down, for a release build
> this means: A single attempt to clean up the domain would fail when such a
> set-entry fails. However, another attempt to clean up despite the earlier
> error would then not retry for the failed GFN, but continue one below.
> That's unexpected: I'd either see such a domain remain as a zombie forever,
> or a best effort continuation of all cleanup right away.

I see, thanks for spotting that.  Will change the logic to ensure the
index is not updated past the entry that failed to be removed.

> > +        }
> > +
> > +        if ( !(gfn & 0xfff) && hypercall_preempt_check() )
> 
> By going from gfn's low bits you may check way more often than necessary
> when encountering large pages.

Yeah, it's difficult to strike a good balance short of counting
processed entries rather than using the gfn value.  I'm fine with
checking more often than required, as long as we always ensure that
progress is made on each function call.

Thanks, Roger.

Patch

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8041cfb7d243..09bdb9b97578 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -14,6 +14,7 @@  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
    - HVM PIRQs are disabled by default.
    - Reduce IOMMU setup time for hardware domain.
  - xl/libxl configures vkb=[] for HVM domains with priority over vkb_device.
+ - Allow HVM/PVH domains to map foreign pages.
 
 ### Added
  - On x86:
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 20e83cf38bbd..5aa2d3744e6b 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2364,6 +2364,7 @@  int domain_relinquish_resources(struct domain *d)
         enum {
             PROG_iommu_pagetables = 1,
             PROG_shared,
+            PROG_mappings,
             PROG_paging,
             PROG_vcpu_pagetables,
             PROG_xen,
@@ -2412,6 +2413,11 @@  int domain_relinquish_resources(struct domain *d)
         }
 #endif
 
+    PROGRESS(mappings):
+        ret = relinquish_p2m_mapping(d);
+        if ( ret )
+            return ret;
+
     PROGRESS(paging):
 
         /* Tear down paging-assistance stuff. */
diff --git a/xen/arch/x86/include/asm/p2m.h b/xen/arch/x86/include/asm/p2m.h
index d95341ef4242..8b3e6a473a0c 100644
--- a/xen/arch/x86/include/asm/p2m.h
+++ b/xen/arch/x86/include/asm/p2m.h
@@ -402,13 +402,7 @@  struct p2m_domain {
 
 static inline bool arch_acquire_resource_check(struct domain *d)
 {
-    /*
-     * FIXME: Until foreign pages inserted into the P2M are properly
-     * reference counted, it is unsafe to allow mapping of
-     * resource pages unless the caller is the hardware domain
-     * (see set_foreign_p2m_entry()).
-     */
-    return !paging_mode_translate(d) || is_hardware_domain(d);
+    return true;
 }
 
 /*
@@ -725,6 +719,10 @@  p2m_pod_offline_or_broken_hit(struct page_info *p);
 void
 p2m_pod_offline_or_broken_replace(struct page_info *p);
 
+/* Perform cleanup of p2m mappings ahead of teardown. */
+int
+relinquish_p2m_mapping(struct domain *d);
+
 #else
 
 static inline bool
@@ -753,6 +751,11 @@  static inline void p2m_pod_offline_or_broken_replace(struct page_info *p)
     ASSERT_UNREACHABLE();
 }
 
+static inline int relinquish_p2m_mapping(struct domain *d)
+{
+    return 0;
+}
+
 #endif
 
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 05d8536adcd7..fac41e5ec808 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2335,10 +2335,6 @@  static int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
     int rc;
     struct domain *fdom;
 
-    /*
-     * hvm fixme: until support is added to p2m teardown code to cleanup any
-     * foreign entries, limit this to hardware domain only.
-     */
     if ( !arch_acquire_resource_check(tdom) )
         return -EPERM;
 
@@ -2695,6 +2691,70 @@  int p2m_set_altp2m_view_visibility(struct domain *d, unsigned int altp2m_idx,
     return rc;
 }
 
+/*
+ * Remove foreign mappings from the p2m, as that drops the page reference taken
+ * when mapped.
+ */
+int relinquish_p2m_mapping(struct domain *d)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    unsigned long gfn = gfn_x(p2m->max_gfn);
+    int rc = 0;
+
+    if ( !paging_mode_translate(d) )
+        return 0;
+
+    BUG_ON(!d->is_dying);
+
+    p2m_lock(p2m);
+
+    /* Iterate over the whole p2m on debug builds to ensure correctness. */
+    while ( gfn && (IS_ENABLED(CONFIG_DEBUG) || p2m->nr_foreign) )
+    {
+        unsigned int order;
+        p2m_type_t t;
+        p2m_access_t a;
+
+        _get_gfn_type_access(p2m, _gfn(gfn - 1), &t, &a, 0, &order, 0);
+        ASSERT(IS_ALIGNED(gfn, 1u << order));
+        gfn -= 1 << order;
+
+        if ( t == p2m_map_foreign )
+        {
+            ASSERT(p2m->nr_foreign);
+            ASSERT(order == 0);
+            /*
+             * Foreign mappings can only be of order 0, hence there's no need
+             * to align the gfn to the entry order.  Otherwise we would need to
+             * adjust gfn to point to the start of the page if order > 0.
+             */
+            rc = p2m_set_entry(p2m, _gfn(gfn), INVALID_MFN, order, p2m_invalid,
+                               p2m->default_access);
+            if ( rc )
+            {
+                printk(XENLOG_ERR
+                       "%pd: failed to unmap foreign page %" PRI_gfn " order %u error %d\n",
+                       d, gfn, order, rc);
+                ASSERT_UNREACHABLE();
+                break;
+            }
+        }
+
+        if ( !(gfn & 0xfff) && hypercall_preempt_check() )
+        {
+            rc = -ERESTART;
+            break;
+        }
+    }
+
+    ASSERT(gfn || !p2m->nr_foreign);
+    p2m->max_gfn = _gfn(gfn);
+
+    p2m_unlock(p2m);
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C