| Message ID | 20240430165845.81696-3-roger.pau@citrix.com (mailing list archive) |
|---|---|
| State | New |
| Series | xen/x86: support foreign mappings for HVM |
On 30.04.2024 18:58, Roger Pau Monne wrote:
> @@ -2695,6 +2691,70 @@ int p2m_set_altp2m_view_visibility(struct domain *d, unsigned int altp2m_idx,
>      return rc;
>  }
>  
> +/*
> + * Remove foreign mappings from the p2m, as that drops the page reference taken
> + * when mapped.
> + */
> +int relinquish_p2m_mapping(struct domain *d)
> +{
> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);

Are there any guarantees made anywhere that altp2m-s and nested P2Ms can't
hold foreign mappings? p2m_entry_modify() certainly treats them all the same.

> +    unsigned long gfn = gfn_x(p2m->max_gfn);
> +    int rc = 0;
> +
> +    if ( !paging_mode_translate(d) )
> +        return 0;
> +
> +    BUG_ON(!d->is_dying);
> +
> +    p2m_lock(p2m);
> +
> +    /* Iterate over the whole p2m on debug builds to ensure correctness. */
> +    while ( gfn && (IS_ENABLED(CONFIG_DEBUG) || p2m->nr_foreign) )
> +    {
> +        unsigned int order;
> +        p2m_type_t t;
> +        p2m_access_t a;
> +
> +        _get_gfn_type_access(p2m, _gfn(gfn - 1), &t, &a, 0, &order, 0);
> +        ASSERT(IS_ALIGNED(gfn, 1u << order));

This heavily relies on the sole place where max_gfn is updated being indeed
sufficient.

> +        gfn -= 1 << order;

Please be consistent with the kind of 1 you shift left. Perhaps anyway both
better as 1UL.

> +        if ( t == p2m_map_foreign )
> +        {
> +            ASSERT(p2m->nr_foreign);
> +            ASSERT(order == 0);
> +            /*
> +             * Foreign mappings can only be of order 0, hence there's no need
> +             * to align the gfn to the entry order. Otherwise we would need to
> +             * adjust gfn to point to the start of the page if order > 0.
> +             */

I'm a little irritated by this comment. Ahead of the enclosing if() you
already rely on (and assert) GFN being suitably aligned.

> +            rc = p2m_set_entry(p2m, _gfn(gfn), INVALID_MFN, order, p2m_invalid,
> +                               p2m->default_access);
> +            if ( rc )
> +            {
> +                printk(XENLOG_ERR
> +                       "%pd: failed to unmap foreign page %" PRI_gfn " order %u error %d\n",
> +                       d, gfn, order, rc);
> +                ASSERT_UNREACHABLE();
> +                break;
> +            }

Together with the updating of ->max_gfn further down, for a release build
this means: A single attempt to clean up the domain would fail when such a
set-entry fails. However, another attempt to clean up despite the earlier
error would then not retry for the failed GFN, but continue one below.
That's unexpected: I'd either see such a domain remain as a zombie forever,
or a best effort continuation of all cleanup right away.

> +        }
> +
> +        if ( !(gfn & 0xfff) && hypercall_preempt_check() )

By going from gfn's low bits you may check way more often than necessary
when encountering large pages.

Jan
On Mon, May 06, 2024 at 12:41:49PM +0200, Jan Beulich wrote:
> On 30.04.2024 18:58, Roger Pau Monne wrote:
> > @@ -2695,6 +2691,70 @@ int p2m_set_altp2m_view_visibility(struct domain *d, unsigned int altp2m_idx,
> >      return rc;
> >  }
> >  
> > +/*
> > + * Remove foreign mappings from the p2m, as that drops the page reference taken
> > + * when mapped.
> > + */
> > +int relinquish_p2m_mapping(struct domain *d)
> > +{
> > +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
> 
> Are there any guarantees made anywhere that altp2m-s and nested P2Ms can't
> hold foreign mappings? p2m_entry_modify() certainly treats them all the same.

Good point, I will disable those initially, as I don't think there's a need
right now for foreign mapping in altp2m-s and nested p2ms.

> > +    unsigned long gfn = gfn_x(p2m->max_gfn);
> > +    int rc = 0;
> > +
> > +    if ( !paging_mode_translate(d) )
> > +        return 0;
> > +
> > +    BUG_ON(!d->is_dying);
> > +
> > +    p2m_lock(p2m);
> > +
> > +    /* Iterate over the whole p2m on debug builds to ensure correctness. */
> > +    while ( gfn && (IS_ENABLED(CONFIG_DEBUG) || p2m->nr_foreign) )
> > +    {
> > +        unsigned int order;
> > +        p2m_type_t t;
> > +        p2m_access_t a;
> > +
> > +        _get_gfn_type_access(p2m, _gfn(gfn - 1), &t, &a, 0, &order, 0);
> > +        ASSERT(IS_ALIGNED(gfn, 1u << order));
> 
> This heavily relies on the sole place where max_gfn is updated being indeed
> sufficient.
> 
> > +        gfn -= 1 << order;
> 
> Please be consistent with the kind of 1 you shift left. Perhaps anyway both
> better as 1UL.
> 
> > +        if ( t == p2m_map_foreign )
> > +        {
> > +            ASSERT(p2m->nr_foreign);
> > +            ASSERT(order == 0);
> > +            /*
> > +             * Foreign mappings can only be of order 0, hence there's no need
> > +             * to align the gfn to the entry order. Otherwise we would need to
> > +             * adjust gfn to point to the start of the page if order > 0.
> > +             */
> 
> I'm a little irritated by this comment. Ahead of the enclosing if() you
> already rely on (and assert) GFN being suitably aligned.

Oh, I've added that outer assert later, and then didn't remove this comment.

> > +            rc = p2m_set_entry(p2m, _gfn(gfn), INVALID_MFN, order, p2m_invalid,
> > +                               p2m->default_access);
> > +            if ( rc )
> > +            {
> > +                printk(XENLOG_ERR
> > +                       "%pd: failed to unmap foreign page %" PRI_gfn " order %u error %d\n",
> > +                       d, gfn, order, rc);
> > +                ASSERT_UNREACHABLE();
> > +                break;
> > +            }
> 
> Together with the updating of ->max_gfn further down, for a release build
> this means: A single attempt to clean up the domain would fail when such a
> set-entry fails. However, another attempt to clean up despite the earlier
> error would then not retry for the failed GFN, but continue one below.
> That's unexpected: I'd either see such a domain remain as a zombie forever,
> or a best effort continuation of all cleanup right away.

I see, thanks for spotting that. Will change the logic to ensure the index
is not updated past the failed to remove entry.

> > +        }
> > +
> > +        if ( !(gfn & 0xfff) && hypercall_preempt_check() )
> 
> By going from gfn's low bits you may check way more often than necessary
> when encountering large pages.

Yeah, it's difficult to strike a good balance, short of counting by
processed entries rather than using the gfn value. I'm fine with checking
more often than required, as long as we always ensure that progress is made
on each function call.

Thanks, Roger.
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8041cfb7d243..09bdb9b97578 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -14,6 +14,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
    - HVM PIRQs are disabled by default.
    - Reduce IOMMU setup time for hardware domain.
    - xl/libxl configures vkb=[] for HVM domains with priority over vkb_device.
+   - Allow HVM/PVH domains to map foreign pages.
 
 ### Added
  - On x86:
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 20e83cf38bbd..5aa2d3744e6b 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2364,6 +2364,7 @@ int domain_relinquish_resources(struct domain *d)
     enum {
         PROG_iommu_pagetables = 1,
         PROG_shared,
+        PROG_mappings,
         PROG_paging,
         PROG_vcpu_pagetables,
         PROG_xen,
@@ -2412,6 +2413,11 @@ int domain_relinquish_resources(struct domain *d)
     }
 #endif
 
+    PROGRESS(mappings):
+        ret = relinquish_p2m_mapping(d);
+        if ( ret )
+            return ret;
+
     PROGRESS(paging):
 
         /* Tear down paging-assistance stuff. */
diff --git a/xen/arch/x86/include/asm/p2m.h b/xen/arch/x86/include/asm/p2m.h
index d95341ef4242..8b3e6a473a0c 100644
--- a/xen/arch/x86/include/asm/p2m.h
+++ b/xen/arch/x86/include/asm/p2m.h
@@ -402,13 +402,7 @@ struct p2m_domain {
 
 static inline bool arch_acquire_resource_check(struct domain *d)
 {
-    /*
-     * FIXME: Until foreign pages inserted into the P2M are properly
-     *        reference counted, it is unsafe to allow mapping of
-     *        resource pages unless the caller is the hardware domain
-     *        (see set_foreign_p2m_entry()).
-     */
-    return !paging_mode_translate(d) || is_hardware_domain(d);
+    return true;
 }
 
 /*
@@ -725,6 +719,10 @@ p2m_pod_offline_or_broken_hit(struct page_info *p);
 void
 p2m_pod_offline_or_broken_replace(struct page_info *p);
 
+/* Perform cleanup of p2m mappings ahead of teardown. */
+int
+relinquish_p2m_mapping(struct domain *d);
+
 #else
 
 static inline bool
@@ -753,6 +751,11 @@ static inline void p2m_pod_offline_or_broken_replace(struct page_info *p)
     ASSERT_UNREACHABLE();
 }
 
+static inline int relinquish_p2m_mapping(struct domain *d)
+{
+    return 0;
+}
+
 #endif
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 05d8536adcd7..fac41e5ec808 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2335,10 +2335,6 @@ static int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
     int rc;
     struct domain *fdom;
 
-    /*
-     * hvm fixme: until support is added to p2m teardown code to cleanup any
-     * foreign entries, limit this to hardware domain only.
-     */
     if ( !arch_acquire_resource_check(tdom) )
         return -EPERM;
 
@@ -2695,6 +2691,70 @@ int p2m_set_altp2m_view_visibility(struct domain *d, unsigned int altp2m_idx,
     return rc;
 }
 
+/*
+ * Remove foreign mappings from the p2m, as that drops the page reference taken
+ * when mapped.
+ */
+int relinquish_p2m_mapping(struct domain *d)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    unsigned long gfn = gfn_x(p2m->max_gfn);
+    int rc = 0;
+
+    if ( !paging_mode_translate(d) )
+        return 0;
+
+    BUG_ON(!d->is_dying);
+
+    p2m_lock(p2m);
+
+    /* Iterate over the whole p2m on debug builds to ensure correctness. */
+    while ( gfn && (IS_ENABLED(CONFIG_DEBUG) || p2m->nr_foreign) )
+    {
+        unsigned int order;
+        p2m_type_t t;
+        p2m_access_t a;
+
+        _get_gfn_type_access(p2m, _gfn(gfn - 1), &t, &a, 0, &order, 0);
+        ASSERT(IS_ALIGNED(gfn, 1u << order));
+        gfn -= 1 << order;
+
+        if ( t == p2m_map_foreign )
+        {
+            ASSERT(p2m->nr_foreign);
+            ASSERT(order == 0);
+            /*
+             * Foreign mappings can only be of order 0, hence there's no need
+             * to align the gfn to the entry order. Otherwise we would need to
+             * adjust gfn to point to the start of the page if order > 0.
+             */
+            rc = p2m_set_entry(p2m, _gfn(gfn), INVALID_MFN, order, p2m_invalid,
+                               p2m->default_access);
+            if ( rc )
+            {
+                printk(XENLOG_ERR
+                       "%pd: failed to unmap foreign page %" PRI_gfn " order %u error %d\n",
+                       d, gfn, order, rc);
+                ASSERT_UNREACHABLE();
+                break;
+            }
+        }
+
+        if ( !(gfn & 0xfff) && hypercall_preempt_check() )
+        {
+            rc = -ERESTART;
+            break;
+        }
+    }
+
+    ASSERT(gfn || !p2m->nr_foreign);
+    p2m->max_gfn = _gfn(gfn);
+
+    p2m_unlock(p2m);
+
+    return rc;
+}
+
 /*
  * Local variables:
  * mode: C
Iterate over the p2m based on the maximum recorded gfn and remove any
foreign mappings, in order to drop the underlying page references and thus
not keep extra page references if a domain is destroyed while still having
foreign mappings in its p2m.

The logic is similar to the one used on Arm.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
I still have to test with destroying a guest that does have foreign
mappings in its p2m.
---
 CHANGELOG.md                   |  1 +
 xen/arch/x86/domain.c          |  6 +++
 xen/arch/x86/include/asm/p2m.h | 17 +++++----
 xen/arch/x86/mm/p2m.c          | 68 ++++++++++++++++++++++++++++++++--
 4 files changed, 81 insertions(+), 11 deletions(-)