Message ID: 20220716145658.4175730-2-olekstysh@gmail.com (mailing list archive)
State:      New, archived
Series:     [V7,1/2] xen/arm: Harden the P2M code in p2m_remove_mapping()
Hi Oleksandr,

On 16/07/2022 15:56, Oleksandr Tyshchenko wrote:
> Suggested-by: Julien Grall <jgrall@amazon.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Reviewed-by: Julien Grall <jgrall@amazon.com>

>  xen/arch/x86/include/asm/grant_table.h |  5 ---

These changes will need an ack from the x86 maintainers.

Cheers,
On 16.07.22 18:08, Julien Grall wrote:
> Hi Oleksandr,

Hello Julien

> On 16/07/2022 15:56, Oleksandr Tyshchenko wrote:
>> Suggested-by: Julien Grall <jgrall@amazon.com>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> Reviewed-by: Julien Grall <jgrall@amazon.com>

Thank you!

>>  xen/arch/x86/include/asm/grant_table.h |  5 ---
>
> These changes will need an ack from the x86 maintainers.

Yes. Jan has reviewed that patch many times.

> Cheers,
On 16.07.2022 16:56, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> Rework Arm implementation to store grant table frame GFN
> in struct page_info directly instead of keeping it in
> standalone status/shared arrays. This patch is based on
> the assumption that a grant table page is a xenheap page.
>
> To cover 64-bit/40-bit IPA on Arm64/Arm32 we need the space
> to hold 52-bit/28-bit + extra bit value respectively. In order
> to not grow the size of struct page_info borrow the required
> amount of bits from type_info's count portion which current
> context won't suffer (currently only 1 bit is used on Arm).
> Please note, to minimize code changes and avoid introducing
> extra #ifdef-s to the header, we keep the same amount of
> bits on both subarches, although the count portion on Arm64
> could be wider, so we waste some bits there.
>
> Introduce corresponding PGT_* constructs and the access macros
> page_get(set)_xenheap_gfn. Please note, all accesses to
> the GFN portion of the type_info field should always be protected
> by the P2M lock. In cases where it is not feasible to satisfy
> that requirement (risk of deadlock, lock inversion, etc)
> it is important to make sure that all non-protected updates
> to this field are atomic.
> As several non-protected read accesses still exist within
> the current code (most calls to page_get_xenheap_gfn() are not
> protected by the P2M lock) the subsequent patch will introduce
> hardening code for p2m_remove_mapping() to be called with the P2M
> lock held in order to check for any difference between what is
> already mapped and what is requested to be unmapped.
>
> Update existing gnttab macros to deal with the GFN value according
> to its new location. Also update the use of the count portion of the
> type_info field on Arm in share_xen_page_with_guest().
>
> While at it, extend this simplified M2P-like approach to any
> xenheap pages which are processed in xenmem_add_to_physmap_one()
> except foreign ones. Update the code to set the GFN portion after
> establishing a new mapping for the xenheap page in said function
> and to clean the GFN portion when putting a reference on that page
> in p2m_put_l3_page().
>
> And for everything to work correctly introduce an arch-specific
> initialization pattern PGT_TYPE_INFO_INITIALIZER to be applied
> to the type_info field during initialization at alloc_heap_pages()
> and acquire_staticmem_pages(). The pattern's purpose on Arm
> is to clear the GFN portion before use; on x86 it is just
> a stub.
>
> This patch is intended to fix a potential issue on Arm
> which might happen when remapping a grant-table frame.
> A guest (or the toolstack) will unmap the grant-table frame
> using XENMEM_remove_physmap. This is a generic hypercall,
> so on x86 we are relying on the fact that the M2P entry will
> be cleared on removal. For an architecture without the M2P,
> the GFN would still be present in the grant frame/status
> array. So on the next call to map the page, we would end up
> requesting the P2M to remove whatever mapping was at the given GFN.
> This could well be another mapping.
>
> Please note, this patch also changes how the shared_info page
> (which is a xenheap RAM page) is mapped in xenmem_add_to_physmap_one().
> Now, we only allow the shared_info to be mapped once. Subsequent
> attempts to map it will result in -EBUSY. Doing that we mandate
> the caller to first unmap the page before mapping it again. This is
> to prevent Xen creating an unwanted hole in the P2M. For instance,
> this could happen if the firmware stole a RAM address for mapping
> the shared_info page into but forgot to unmap it afterwards.
>
> Besides that, this patch simplifies arch code on Arm by
> removing the arrays and the corresponding management code; as
> a result the gnttab_init_arch/gnttab_destroy_arch helpers
> and struct grant_table_arch become useless and can be
> dropped globally.
>
> Suggested-by: Julien Grall <jgrall@amazon.com>
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
On 16.07.2022 16:56, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
> Rework Arm implementation to store grant table frame GFN
> in struct page_info directly instead of keeping it in
> standalone status/shared arrays. This patch is based on
> the assumption that a grant table page is a xenheap page.
>
> To cover 64-bit/40-bit IPA on Arm64/Arm32 we need the space
> to hold 52-bit/28-bit + extra bit value respectively. In order
> to not grow the size of struct page_info borrow the required
> amount of bits from type_info's count portion which current
> context won't suffer (currently only 1 bit is used on Arm).

I'm afraid this isn't true: There's no requirement for a guest to pass
all different GFNs to VCPUOP_register_vcpu_info, yet map_vcpu_info()
tries to obtain a reference for every vCPU. With my adding of GFN
(really gaddr) based registration of the runstate area (already
looking towards 4.18) the maximum possible count will grow further.

I guess this went unnoticed because Linux presumably uses different
GFNs for every vCPU, so the issue doesn't surface. But I'm afraid this
is a regression (unless I'm overlooking something, perhaps a mitigating
factor) which wants fixing for 4.17.

Jan
Hi Jan,

On 11/10/2022 12:59, Jan Beulich wrote:
> On 16.07.2022 16:56, Oleksandr Tyshchenko wrote:
>> [...]
>> To cover 64-bit/40-bit IPA on Arm64/Arm32 we need the space
>> to hold 52-bit/28-bit + extra bit value respectively. In order
>> to not grow the size of struct page_info borrow the required
>> amount of bits from type_info's count portion which current
>> context won't suffer (currently only 1 bit is used on Arm).
>
> I'm afraid this isn't true: There's no requirement for a guest to pass
> all different GFNs to VCPUOP_register_vcpu_info, yet map_vcpu_info()
> tries to obtain a reference for every vCPU.

AFAIU, this would be a reference on the **count_info**, not **type_info**
(which BTW will never be incremented on Arm because we have no type
support). The commit message is only referring to the 'type_info's
count'. So...

> With my adding of GFN
> (really gaddr) based registration of the runstate area (already
> looking towards 4.18) the maximum possible count will grow further.

... I am not sure which problem you are referring to.

Cheers,
On 11.10.2022 15:01, Julien Grall wrote:
> Hi Jan,
>
> On 11/10/2022 12:59, Jan Beulich wrote:
>> [...]
>> I'm afraid this isn't true: There's no requirement for a guest to pass
>> all different GFNs to VCPUOP_register_vcpu_info, yet map_vcpu_info()
>> tries to obtain a reference for every vCPU.
>
> AFAIU, this would be a reference on the **count_info**, not **type_info**
> (which BTW will never be incremented on Arm because we have no type
> support).

I should have said "obtain a writable type reference".

> The commit message is only referring to the 'type_info's count'. So...
>
>> With my adding of GFN
>> (really gaddr) based registration of the runstate area (already
>> looking towards 4.18) the maximum possible count will grow further.
>
> ... I am not sure which problem you are referring to.

Wow - a mere stub (but not inline) function to make the build happy.
Then why is the description talking about one bit that's needed on
Arm?

Jan
Hi Jan,

On 11/10/2022 14:28, Jan Beulich wrote:
> On 11.10.2022 15:01, Julien Grall wrote:
>> [...]
>> AFAIU, this would be a reference on the **count_info**, not **type_info**
>> (which BTW will never be incremented on Arm because we have no type
>> support).
>
> I should have said "obtain a writable type reference".

Thanks for the clarification.

>> The commit message is only referring to the 'type_info's count'. So...
>>
>> ... I am not sure which problem you are referring to.
>
> Wow - a mere stub (but not inline) function to make the build happy.
> Then why is the description talking about one bit that's needed on
> Arm?

Because share_xen_page_with_guest() will always set the type info's
count to 1.

TBH I don't exactly know why we set it. I always assumed this was a
requirement for the common code but never checked.

Cheers,
On 11.10.2022 15:33, Julien Grall wrote:
> [...]
>> Wow - a mere stub (but not inline) function to make the build happy.
>> Then why is the description talking about one bit that's needed on
>> Arm?
>
> Because share_xen_page_with_guest() will always set the type info's
> count to 1.
>
> TBH I don't exactly know why we set it. I always assumed this was a
> requirement for the common code but never checked.

I don't think there is any such requirement. In fact there are only
very few uses of type_info in common code. By also setting
PGT_count_mask to zero you could actually have the compiler eliminate
some dead code ...

Jan
On 11.10.2022 15:33, Julien Grall wrote:
> [...]
> Because share_xen_page_with_guest() will always set the type info's
> count to 1.
>
> TBH I don't exactly know why we set it. I always assumed this was a
> requirement for the common code but never checked.

So my first thought was that this type-ref handling all being no-ops
would be an issue with gnttab v2, but besides that not being security
supported on Arm, the code also passes SHARE_rw (for a reason that
escapes me) when sharing the status pages.

It does however mean that Dom0 can map the trace buffers r/w (unless
there's some special code in Arm preventing that), despite them being
shared with SHARE_ro. Not a big problem considering all the power Dom0
has, but still against the intentions.

Jan
Hi Jan,

On 17/10/2022 14:46, Jan Beulich wrote:
> [...]
> So my first thought was that this type-ref handling all being no-ops
> would be an issue with gnttab v2, but besides that not being security
> supported on Arm, the code also passes SHARE_rw (for a reason that
> escapes me) when sharing the status pages.

Probably because grant-table v2 was never tested on Arm.

> It does however mean that Dom0 can map the trace buffers r/w (unless
> there's some special code in Arm preventing that), despite them being
> shared with SHARE_ro. Not a big problem considering all the power Dom0
> has, but still against the intentions.

We don't use the refcounting but still use the flag PGT_writable_page
to indicate whether the mapping is writeable or read-only. The code to
map the trace buffers will look at the flag and decide the attribute
in the P2M.

Cheers,
diff --git a/xen/arch/arm/include/asm/grant_table.h b/xen/arch/arm/include/asm/grant_table.h
index a283dd5cd6..e13dfeefa5 100644
--- a/xen/arch/arm/include/asm/grant_table.h
+++ b/xen/arch/arm/include/asm/grant_table.h
@@ -11,11 +11,6 @@
 #define INITIAL_NR_GRANT_FRAMES 1U
 #define GNTTAB_MAX_VERSION 1

-struct grant_table_arch {
-    gfn_t *shared_gfn;
-    gfn_t *status_gfn;
-};
-
 static inline void gnttab_clear_flags(struct domain *d,
                                       unsigned int mask, uint16_t *addr)
 {
@@ -56,53 +51,27 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn,
 #define gnttab_dom0_frames()                                             \
     min_t(unsigned int, opt_max_grant_frames, PFN_DOWN(_etext - _stext))

-#define gnttab_init_arch(gt)                                             \
-({                                                                       \
-    unsigned int ngf_ = (gt)->max_grant_frames;                          \
-    unsigned int nsf_ = grant_to_status_frames(ngf_);                    \
-                                                                         \
-    (gt)->arch.shared_gfn = xmalloc_array(gfn_t, ngf_);                  \
-    (gt)->arch.status_gfn = xmalloc_array(gfn_t, nsf_);                  \
-    if ( (gt)->arch.shared_gfn && (gt)->arch.status_gfn )                \
-    {                                                                    \
-        while ( ngf_-- )                                                 \
-            (gt)->arch.shared_gfn[ngf_] = INVALID_GFN;                   \
-        while ( nsf_-- )                                                 \
-            (gt)->arch.status_gfn[nsf_] = INVALID_GFN;                   \
-    }                                                                    \
-    else                                                                 \
-        gnttab_destroy_arch(gt);                                         \
-    (gt)->arch.shared_gfn ? 0 : -ENOMEM;                                 \
-})
-
-#define gnttab_destroy_arch(gt)                                          \
-    do {                                                                 \
-        XFREE((gt)->arch.shared_gfn);                                    \
-        XFREE((gt)->arch.status_gfn);                                    \
-    } while ( 0 )
-
 #define gnttab_set_frame_gfn(gt, st, idx, gfn, mfn)                      \
-    ({                                                                   \
-        int rc_ = 0;                                                     \
-        gfn_t ogfn = gnttab_get_frame_gfn(gt, st, idx);                  \
-        if ( gfn_eq(ogfn, INVALID_GFN) || gfn_eq(ogfn, gfn) ||           \
-             (rc_ = guest_physmap_remove_page((gt)->domain, ogfn, mfn,   \
-                                              0)) == 0 )                 \
-            ((st) ? (gt)->arch.status_gfn                                \
-                  : (gt)->arch.shared_gfn)[idx] = (gfn);                 \
-        rc_;                                                             \
-    })
+    (gfn_eq(gfn, INVALID_GFN)                                            \
+     ? guest_physmap_remove_page((gt)->domain,                           \
+                                 gnttab_get_frame_gfn(gt, st, idx),      \
+                                 mfn, 0)                                 \
+     : 0)

 #define gnttab_get_frame_gfn(gt, st, idx) ({                             \
    (st) ? gnttab_status_gfn(NULL, gt, idx)                               \
         : gnttab_shared_gfn(NULL, gt, idx);                              \
 })

+#define gnttab_shared_page(t, i) virt_to_page((t)->shared_raw[i])
+
+#define gnttab_status_page(t, i) virt_to_page((t)->status[i])
+
 #define gnttab_shared_gfn(d, t, i)                                       \
-    (((i) >= nr_grant_frames(t)) ? INVALID_GFN : (t)->arch.shared_gfn[i])
+    page_get_xenheap_gfn(gnttab_shared_page(t, i))

 #define gnttab_status_gfn(d, t, i)                                       \
-    (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
+    page_get_xenheap_gfn(gnttab_status_page(t, i))

 #define gnttab_need_iommu_mapping(d)                                     \
     (is_domain_direct_mapped(d) && is_iommu_enabled(d))
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index c4bc3cd1e5..6c0a3c789f 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -98,9 +98,22 @@ struct page_info
 #define PGT_writable_page PG_mask(1, 1)  /* has writable mappings? */
 #define PGT_type_mask     PG_mask(1, 1)  /* Bits 31 or 63. */

- /* Count of uses of this frame as its current type. */
-#define PGT_count_width   PG_shift(2)
-#define PGT_count_mask    ((1UL<<PGT_count_width)-1)
+ /* 2-bit count of uses of this frame as its current type. */
+#define PGT_count_mask    PG_mask(3, 3)
+
+/*
+ * Stored in bits [28:0] (arm32) or [60:0] (arm64) GFN if page is xenheap page.
+ */
+#define PGT_gfn_width     PG_shift(3)
+#define PGT_gfn_mask      ((1UL<<PGT_gfn_width)-1)
+
+#define PGT_INVALID_XENHEAP_GFN   _gfn(PGT_gfn_mask)
+
+/*
+ * An arch-specific initialization pattern is needed for the type_info field
+ * as it's GFN portion can contain the valid GFN if page is xenheap page.
+ */
+#define PGT_TYPE_INFO_INITIALIZER gfn_x(PGT_INVALID_XENHEAP_GFN)

 /* Cleared when the owning guest 'frees' this page. */
 #define _PGC_allocated    PG_shift(1)
@@ -354,6 +367,34 @@ void clear_and_clean_page(struct page_info *page);

 unsigned int arch_get_dma_bitsize(void);

+/*
+ * All accesses to the GFN portion of type_info field should always be
+ * protected by the P2M lock. In case when it is not feasible to satisfy
+ * that requirement (risk of deadlock, lock inversion, etc) it is important
+ * to make sure that all non-protected updates to this field are atomic.
+ */
+static inline gfn_t page_get_xenheap_gfn(const struct page_info *p)
+{
+    gfn_t gfn_ = _gfn(ACCESS_ONCE(p->u.inuse.type_info) & PGT_gfn_mask);
+
+    ASSERT(is_xen_heap_page(p));
+
+    return gfn_eq(gfn_, PGT_INVALID_XENHEAP_GFN) ? INVALID_GFN : gfn_;
+}
+
+static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t gfn)
+{
+    gfn_t gfn_ = gfn_eq(gfn, INVALID_GFN) ? PGT_INVALID_XENHEAP_GFN : gfn;
+    unsigned long x, nx, y = p->u.inuse.type_info;
+
+    ASSERT(is_xen_heap_page(p));
+
+    do {
+        x = y;
+        nx = (x & ~PGT_gfn_mask) | gfn_x(gfn_);
+    } while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x );
+}
+
 #endif /*  __ARCH_ARM_MM__ */
 /*
  * Local variables:
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 009b8cd9ef..e59a4ce6d0 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1449,9 +1449,21 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d,

     spin_lock(&d->page_alloc_lock);

-    /* The incremented type count pins as writable or read-only. */
-    page->u.inuse.type_info =
-        (flags == SHARE_ro ? PGT_none : PGT_writable_page) | 1;
+    /*
+     * The incremented type count pins as writable or read-only.
+     *
+     * Please note, the update of type_info field here is not atomic as
+     * we use Read-Modify-Write operation on it. But currently it is fine
+     * because the caller of page_set_xenheap_gfn() (which is another place
+     * where type_info is updated) would need to acquire a reference on
+     * the page. This is only possible after the count_info is updated *and*
+     * there is a barrier between the type_info and count_info. So there is
+     * no immediate need to use cmpxchg() here.
+     */
+    page->u.inuse.type_info &= ~(PGT_type_mask | PGT_count_mask);
+    page->u.inuse.type_info |= (flags == SHARE_ro ? PGT_none
+                                                  : PGT_writable_page) |
+                               MASK_INSR(1, PGT_count_mask);

     page_set_owner(page, d);
     smp_wmb(); /* install valid domain ptr before updating refcnt. */
@@ -1554,8 +1566,37 @@ int xenmem_add_to_physmap_one(
         return -ENOSYS;
     }

-    /* Map at new location. */
-    rc = guest_physmap_add_entry(d, gfn, mfn, 0, t);
+    /*
+     * Map at new location. Here we need to map xenheap RAM page differently
+     * because we need to store the valid GFN and make sure that nothing was
+     * mapped before (the stored GFN is invalid). And these actions need to be
+     * performed with the P2M lock held. The guest_physmap_add_entry() is just
+     * a wrapper on top of p2m_set_entry().
+     */
+    if ( !p2m_is_ram(t) || !is_xen_heap_mfn(mfn) )
+        rc = guest_physmap_add_entry(d, gfn, mfn, 0, t);
+    else
+    {
+        struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+        p2m_write_lock(p2m);
+        if ( gfn_eq(page_get_xenheap_gfn(mfn_to_page(mfn)), INVALID_GFN) )
+        {
+            rc = p2m_set_entry(p2m, gfn, 1, mfn, t, p2m->default_access);
+            if ( !rc )
+                page_set_xenheap_gfn(mfn_to_page(mfn), gfn);
+        }
+        else
+            /*
+             * Mandate the caller to first unmap the page before mapping it
+             * again. This is to prevent Xen creating an unwanted hole in
+             * the P2M. For instance, this could happen if the firmware stole
+             * a RAM address for mapping the shared_info page into but forgot
+             * to unmap it afterwards.
+             */
+            rc = -EBUSY;
+        p2m_write_unlock(p2m);
+    }

     /*
      * For XENMAPSPACE_gmfn_foreign if we failed to add the mapping, we need
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 2a0d383df4..eec5e90ab0 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -716,6 +716,8 @@ static int p2m_mem_access_radix_set(struct p2m_domain *p2m, gfn_t gfn,
  */
 static void p2m_put_l3_page(const lpae_t pte)
 {
+    mfn_t mfn = lpae_get_mfn(pte);
+
     ASSERT(p2m_is_valid(pte));

     /*
@@ -727,11 +729,12 @@ static void p2m_put_l3_page(const lpae_t pte)
      */
     if ( p2m_is_foreign(pte.p2m.type) )
     {
-        mfn_t mfn = lpae_get_mfn(pte);
-
         ASSERT(mfn_valid(mfn));
         put_page(mfn_to_page(mfn));
     }
+    /* Detect the xenheap page and mark the stored GFN as invalid. */
+    else if ( p2m_is_ram(pte.p2m.type) && is_xen_heap_mfn(mfn) )
+        page_set_xenheap_gfn(mfn_to_page(mfn), INVALID_GFN);
 }

 /* Free lpae sub-tree behind an entry */
diff --git a/xen/arch/x86/include/asm/grant_table.h b/xen/arch/x86/include/asm/grant_table.h
index a8a21439a4..5c23cec90c 100644
--- a/xen/arch/x86/include/asm/grant_table.h
+++ b/xen/arch/x86/include/asm/grant_table.h
@@ -14,9 +14,6 @@

 #define INITIAL_NR_GRANT_FRAMES 1U

-struct grant_table_arch {
-};
-
 static inline int create_grant_host_mapping(uint64_t addr, mfn_t frame,
                                             unsigned int flags,
                                             unsigned int cache_flags)
@@ -35,8 +32,6 @@ static inline int replace_grant_host_mapping(uint64_t addr, mfn_t frame,
     return replace_grant_pv_mapping(addr, frame, new_addr, flags);
 }

-#define gnttab_init_arch(gt) 0
-#define gnttab_destroy_arch(gt) do {} while ( 0 )
 #define gnttab_set_frame_gfn(gt, st, idx, gfn, mfn)                      \
     (gfn_eq(gfn, INVALID_GFN)                                            \
      ? guest_physmap_remove_page((gt)->domain,                           \
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index aea0ad30a7..ad773a6996 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -99,8 +99,6 @@ struct grant_table {

     /* Domain to which this struct grant_table belongs. */
     struct domain *domain;
-
-    struct grant_table_arch arch;
 };

 unsigned int __read_mostly opt_max_grant_frames = 64;
@@ -2018,14 +2016,9 @@ int grant_table_init(struct domain *d, int max_grant_frames,

     grant_write_lock(gt);

-    ret = gnttab_init_arch(gt);
-    if ( ret )
-        goto unlock;
-
     /* gnttab_grow_table() allocates a min number of frames, so 0 is okay. */
     ret = gnttab_grow_table(d, 0);

- unlock:
     grant_write_unlock(gt);

 out:
@@ -3939,8 +3932,6 @@ grant_table_destroy(
     if ( t == NULL )
         return;

-    gnttab_destroy_arch(t);
-
     for ( i = 0; i < nr_grant_frames(t); i++ )
         free_xenheap_page(t->shared_raw[i]);
     xfree(t->shared_raw);
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index fe0e15429a..6ca986584d 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -155,6 +155,10 @@
 #define PGC_reserved 0
 #endif

+#ifndef PGT_TYPE_INFO_INITIALIZER
+#define PGT_TYPE_INFO_INITIALIZER 0
+#endif
+
 /*
  * Comma-separated list of hexadecimal page numbers containing bad bytes.
  * e.g. 'badpage=0x3f45,0x8a321'.
@@ -1024,7 +1028,7 @@ static struct page_info *alloc_heap_pages(
                        &tlbflush_timestamp);

         /* Initialise fields which have other uses for free pages. */
-        pg[i].u.inuse.type_info = 0;
+        pg[i].u.inuse.type_info = PGT_TYPE_INFO_INITIALIZER;
         page_set_owner(&pg[i], NULL);
     }

@@ -2702,7 +2706,7 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn,
          */
         pg[i].count_info = PGC_reserved | PGC_state_inuse;
         /* Initialise fields which have other uses for free pages. */
-        pg[i].u.inuse.type_info = 0;
+        pg[i].u.inuse.type_info = PGT_TYPE_INFO_INITIALIZER;
         page_set_owner(&pg[i], NULL);
     }