Message ID | 1652294845-13980-1-git-send-email-olekstysh@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [V6,1/2] xen/gnttab: Store frame GFN in struct page_info on Arm | expand |
Hi Oleksandr, Sorry for the late reply. On 11/05/2022 19:47, Oleksandr Tyshchenko wrote: > diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h > +/* > + * All accesses to the GFN portion of type_info field should always be > + * protected by the P2M lock. In case when it is not feasible to satisfy > + * that requirement (risk of deadlock, lock inversion, etc) it is important > + * to make sure that all non-protected updates to this field are atomic. Here you say the non-protected updates should be atomic but... [...] > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c > index 7b1f2f4..c94bdaf 100644 > --- a/xen/arch/arm/mm.c > +++ b/xen/arch/arm/mm.c > @@ -1400,8 +1400,10 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d, > spin_lock(&d->page_alloc_lock); > > /* The incremented type count pins as writable or read-only. */ > - page->u.inuse.type_info = > - (flags == SHARE_ro ? PGT_none : PGT_writable_page) | 1; > + page->u.inuse.type_info &= ~(PGT_type_mask | PGT_count_mask); > + page->u.inuse.type_info |= (flags == SHARE_ro ? PGT_none > + : PGT_writable_page) | > + MASK_INSR(1, PGT_count_mask); ... this is not going to be atomic. So I would suggest to add a comment explaining why this is fine. > > page_set_owner(page, d); > smp_wmb(); /* install valid domain ptr before updating refcnt. */ > @@ -1505,7 +1507,23 @@ int xenmem_add_to_physmap_one( > } > > /* Map at new location. */ > - rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); > + if ( !p2m_is_ram(t) || !is_xen_heap_mfn(mfn) ) > + rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); I would expand the comment above to explain why you need a different path for xenheap mapped as RAM. AFAICT, this is because we need to call page_set_xenheap_gfn(). > + else > + { > + struct p2m_domain *p2m = p2m_get_hostp2m(d); > + > + p2m_write_lock(p2m); > + if ( gfn_eq(page_get_xenheap_gfn(mfn_to_page(mfn)), INVALID_GFN) ) Sorry to only notice it now. This check will also change the behavior for XENMAPSPACE_shared_info. Now, we are only allowed to map the shared info once. I believe this is fine because AFAICT x86 already prevents it. But this is probably something that ought to be explained in the already long commit message. My comments are mainly seeking for clarifications. The code itself looks correct to me. I can handle the comments on commit to save you a round trip once we agree on them. Cheers,
On 23.06.2022 19:50, Julien Grall wrote: > On 11/05/2022 19:47, Oleksandr Tyshchenko wrote: >> @@ -1505,7 +1507,23 @@ int xenmem_add_to_physmap_one( >> } >> >> /* Map at new location. */ >> - rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); > >> + if ( !p2m_is_ram(t) || !is_xen_heap_mfn(mfn) ) >> + rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); > > I would expand the comment above to explain why you need a different > path for xenheap mapped as RAM. AFAICT, this is because we need to call > page_set_xenheap_gfn(). > >> + else >> + { >> + struct p2m_domain *p2m = p2m_get_hostp2m(d); >> + >> + p2m_write_lock(p2m); >> + if ( gfn_eq(page_get_xenheap_gfn(mfn_to_page(mfn)), INVALID_GFN) ) > > Sorry to only notice it now. This check will also change the behavior > for XENMAPSPACE_shared_info. Now, we are only allowed to map the shared > info once. > > I believe this is fine because AFAICT x86 already prevents it. But this > is probably something that ought to be explained in the already long > commit message. If by "prevent" you mean x86 unmaps the page from its earlier GFN, then yes. But this means that Arm would better follow that model instead of returning -EBUSY in this case. Just think of kexec-ing or a boot loader wanting to map shared info or grant table: There wouldn't necessarily be an explicit unmap. Jan
On 23.06.22 20:50, Julien Grall wrote: > Hi Oleksandr, Hello Julien > > Sorry for the late reply. no problem) > > On 11/05/2022 19:47, Oleksandr Tyshchenko wrote: >> diff --git a/xen/arch/arm/include/asm/mm.h >> b/xen/arch/arm/include/asm/mm.h >> +/* >> + * All accesses to the GFN portion of type_info field should always be >> + * protected by the P2M lock. In case when it is not feasible to >> satisfy >> + * that requirement (risk of deadlock, lock inversion, etc) it is >> important >> + * to make sure that all non-protected updates to this field are >> atomic. > > Here you say the non-protected updates should be atomic but... > > [...] > >> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c >> index 7b1f2f4..c94bdaf 100644 >> --- a/xen/arch/arm/mm.c >> +++ b/xen/arch/arm/mm.c >> @@ -1400,8 +1400,10 @@ void share_xen_page_with_guest(struct >> page_info *page, struct domain *d, >> spin_lock(&d->page_alloc_lock); >> /* The incremented type count pins as writable or read-only. */ >> - page->u.inuse.type_info = >> - (flags == SHARE_ro ? PGT_none : PGT_writable_page) | 1; >> + page->u.inuse.type_info &= ~(PGT_type_mask | PGT_count_mask); >> + page->u.inuse.type_info |= (flags == SHARE_ro ? PGT_none >> + : >> PGT_writable_page) | >> + MASK_INSR(1, PGT_count_mask); > > ... this is not going to be atomic. So I would suggest to add a > comment explaining why this is fine. Yes, I should have added your explanation given in V5 why this is fine. So I propose the following text, do you agree with that being added? /* * Please note, the update of type_info field here is not atomic as we use * Read-Modify-Write operation on it. But currently it is fine because * the caller of page_set_xenheap_gfn() (which is another place where * type_info is updated) would need to acquire a reference on the page. * This is only possible after the count_info is updated *and* there a barrier * between the type_info and count_info. So there is no immediate need to use * cmpxchg() here. */ > > >> page_set_owner(page, d); >> smp_wmb(); /* install valid domain ptr before updating refcnt. */ >> @@ -1505,7 +1507,23 @@ int xenmem_add_to_physmap_one( >> } >> /* Map at new location. */ >> - rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); > >> + if ( !p2m_is_ram(t) || !is_xen_heap_mfn(mfn) ) >> + rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); > > I would expand the comment above to explain why you need a different > path for xenheap mapped as RAM. AFAICT, this is because we need to > call page_set_xenheap_gfn(). agree, I propose the following text, do you agree with that? /* * Map at new location. Here we need to map xenheap RAM page differently * because we need to store the valid GFN and make sure that nothing was * mapped before (the stored GFN is invalid). */ > > >> + else >> + { >> + struct p2m_domain *p2m = p2m_get_hostp2m(d); >> + >> + p2m_write_lock(p2m); >> + if ( gfn_eq(page_get_xenheap_gfn(mfn_to_page(mfn)), >> INVALID_GFN) ) > > Sorry to only notice it now. This check will also change the behavior > for XENMAPSPACE_shared_info. Now, we are only allowed to map the > shared info once. > > I believe this is fine because AFAICT x86 already prevents it. But > this is probably something that ought to be explained in the already > long commit message. agree, I propose the following text, do you agree with that? Please note, this patch changes the behavior how the shared_info page (which is xenheap RAM page) is mapped in xenmem_add_to_physmap_one(). Now, we only allow to map the shared_info at once. The subsequent attempts to map it will result in -EBUSY, if there is a legitimate use case we will be able to relax that behavior. > > > My comments are mainly seeking for clarifications. The code itself > looks correct to me. I can handle the comments on commit to save you a > round trip once we agree on them. Thank you, that would be much appreciated. > > > Cheers, >
Hi Jan, On 24/06/2022 07:45, Jan Beulich wrote: > On 23.06.2022 19:50, Julien Grall wrote: >> On 11/05/2022 19:47, Oleksandr Tyshchenko wrote: >>> @@ -1505,7 +1507,23 @@ int xenmem_add_to_physmap_one( >>> } >>> >>> /* Map at new location. */ >>> - rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); >> >>> + if ( !p2m_is_ram(t) || !is_xen_heap_mfn(mfn) ) >>> + rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); >> >> I would expand the comment above to explain why you need a different >> path for xenheap mapped as RAM. AFAICT, this is because we need to call >> page_set_xenheap_gfn(). >> >>> + else >>> + { >>> + struct p2m_domain *p2m = p2m_get_hostp2m(d); >>> + >>> + p2m_write_lock(p2m); >>> + if ( gfn_eq(page_get_xenheap_gfn(mfn_to_page(mfn)), INVALID_GFN) ) >> >> Sorry to only notice it now. This check will also change the behavior >> for XENMAPSPACE_shared_info. Now, we are only allowed to map the shared >> info once. >> >> I believe this is fine because AFAICT x86 already prevents it. But this >> is probably something that ought to be explained in the already long >> commit message. > > If by "prevent" you mean x86 unmaps the page from its earlier GFN, then > yes. But this means that Arm would better follow that model instead of > returning -EBUSY in this case. Just think of kexec-ing or a boot loader > wanting to map shared info or grant table: There wouldn't necessarily > be an explicit unmap. I spent some time to think about this. There is a potential big issue with implicit unmapping from its earlier GFN. Imagine the boot loader decided to map the page in place of a RAM. If the boot loader didn't unmap the page then when the OS map again, we would have a hole in the RAM. The OS may not know that and it may end up to use a page as RAM (and crash). The problem is the same for kexec and AFAIU that's why we need to use soft-reset when kexec-ing. So overall, I think we should prevent implicit unmap. So it would help to enforce that the bootloader (or any other components) clean-up behind themselves (i.e. unmap the page and populate if necessary). Cheers,
Hi Oleksandr, On 24/06/2022 12:47, Oleksandr wrote: > > On 23.06.22 20:50, Julien Grall wrote: >> On 11/05/2022 19:47, Oleksandr Tyshchenko wrote: >>> diff --git a/xen/arch/arm/include/asm/mm.h >>> b/xen/arch/arm/include/asm/mm.h >>> +/* >>> + * All accesses to the GFN portion of type_info field should always be >>> + * protected by the P2M lock. In case when it is not feasible to >>> satisfy >>> + * that requirement (risk of deadlock, lock inversion, etc) it is >>> important >>> + * to make sure that all non-protected updates to this field are >>> atomic. >> >> Here you say the non-protected updates should be atomic but... >> >> [...] >> >>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c >>> index 7b1f2f4..c94bdaf 100644 >>> --- a/xen/arch/arm/mm.c >>> +++ b/xen/arch/arm/mm.c >>> @@ -1400,8 +1400,10 @@ void share_xen_page_with_guest(struct >>> page_info *page, struct domain *d, >>> spin_lock(&d->page_alloc_lock); >>> /* The incremented type count pins as writable or read-only. */ >>> - page->u.inuse.type_info = >>> - (flags == SHARE_ro ? PGT_none : PGT_writable_page) | 1; >>> + page->u.inuse.type_info &= ~(PGT_type_mask | PGT_count_mask); >>> + page->u.inuse.type_info |= (flags == SHARE_ro ? PGT_none >>> + : >>> PGT_writable_page) | >>> + MASK_INSR(1, PGT_count_mask); >> >> ... this is not going to be atomic. So I would suggest to add a >> comment explaining why this is fine. > > > Yes, I should have added your explanation given in V5 why this is fine. > > So I propose the following text, do you agree with that being added? > > /* > * Please note, the update of type_info field here is not atomic as we use > * Read-Modify-Write operation on it. But currently it is fine because > * the caller of page_set_xenheap_gfn() (which is another place where > * type_info is updated) would need to acquire a reference on the page. > * This is only possible after the count_info is updated *and* there a Missing word: there *is* a. > barrier > * between the type_info and count_info. So there is no immediate need > to use > * cmpxchg() here. > */ > > >> >> >>> page_set_owner(page, d); >>> smp_wmb(); /* install valid domain ptr before updating refcnt. */ >>> @@ -1505,7 +1507,23 @@ int xenmem_add_to_physmap_one( >>> } >>> /* Map at new location. */ >>> - rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); >> >>> + if ( !p2m_is_ram(t) || !is_xen_heap_mfn(mfn) ) >>> + rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); >> >> I would expand the comment above to explain why you need a different >> path for xenheap mapped as RAM. AFAICT, this is because we need to >> call page_set_xenheap_gfn(). > > > agree, I propose the following text, do you agree with that? > > /* > * Map at new location. Here we need to map xenheap RAM page differently > * because we need to store the valid GFN and make sure that nothing was > * mapped before (the stored GFN is invalid). > */ So I think the key point here is the p2m_set_xenheap_gfn() needs to happen with the P2M lock held. That said, looking at the code again this is a bit confusing to use guest_physmap_add_entry() in one place and p2m_set_entry() in the other. The only way I can think to avoid the confusion is than open-coding guest_physmap_add_entry() (i.e. p2m_write_lock(); p2m_set_entry(); p2m_write_unlock()) or try to merge the two paths. However, I am aware this is already at version 6 and your code should work. So I would be OK with a comment explaining that guest_physmap_add_entry() is just a wrapper on top of p2m_set_entry(). >>> + else >>> + { >>> + struct p2m_domain *p2m = p2m_get_hostp2m(d); >>> + >>> + p2m_write_lock(p2m); >>> + if ( gfn_eq(page_get_xenheap_gfn(mfn_to_page(mfn)), >>> INVALID_GFN) ) >> >> Sorry to only notice it now. This check will also change the behavior >> for XENMAPSPACE_shared_info. Now, we are only allowed to map the >> shared info once. >> >> I believe this is fine because AFAICT x86 already prevents it. But >> this is probably something that ought to be explained in the already >> long commit message. So I was wrong thinking that x86 would prevent it (see Jan's answer). However, I think this is a mistake to allow it because it can result to weird issues. In fact, you mentioned on IRC that there is already an example on how this hypercall could be misused in U-boot [1]. At the moment, U-boot will steal a RAM page to map the shared info page but not unmap it. The OS will not be aware where the shared info page is mapped. As this is part of the RAM region, the OS may end up to allocate for other purpose and corrupt the page. If Xen were going to unmap it, then we would create a hole and the OS will crash when accessing the page. In some way, it is better than corruption but still a good user experience (the RAM page may only be accessed a long time after boot). So I think it is much better to return -EBUSY here at least we can catch misuse in the firmware code earlier. In the case of U-boot, as discussed on IRC, the code should: 1) Unmap the page 2) Populate the area with memory using XENMEM_populate_physmap An optimization would be to use the extended regions. But this was only recently introduced so U-boot cannot always rely on it. > > > agree, I propose the following text, do you agree with that? > > Please note, this patch changes the behavior how the shared_info page > (which is xenheap RAM page) is mapped in xenmem_add_to_physmap_one(). > Now, we only allow to map the shared_info at once. The subsequent attempts > to map it will result in -EBUSY, if there is a legitimate use case > we will be able to relax that behavior. I would suggest to summarize what I wrote above in the commit message. I think this is a strong reason to return -EBUSY and push other projects (e.g. U-boot) to fix there code. Cheers, [1] https://source.denx.de/u-boot/u-boot/-/blob/master/drivers/xen/hypervisor.c
On 15.07.22 20:49, Julien Grall wrote: > Hi Oleksandr, Hello Julien > > On 24/06/2022 12:47, Oleksandr wrote: >> >> On 23.06.22 20:50, Julien Grall wrote: >>> On 11/05/2022 19:47, Oleksandr Tyshchenko wrote: >>>> diff --git a/xen/arch/arm/include/asm/mm.h >>>> b/xen/arch/arm/include/asm/mm.h >>>> +/* >>>> + * All accesses to the GFN portion of type_info field should >>>> always be >>>> + * protected by the P2M lock. In case when it is not feasible to >>>> satisfy >>>> + * that requirement (risk of deadlock, lock inversion, etc) it is >>>> important >>>> + * to make sure that all non-protected updates to this field are >>>> atomic. >>> >>> Here you say the non-protected updates should be atomic but... >>> >>> [...] >>> >>>> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c >>>> index 7b1f2f4..c94bdaf 100644 >>>> --- a/xen/arch/arm/mm.c >>>> +++ b/xen/arch/arm/mm.c >>>> @@ -1400,8 +1400,10 @@ void share_xen_page_with_guest(struct >>>> page_info *page, struct domain *d, >>>> spin_lock(&d->page_alloc_lock); >>>> /* The incremented type count pins as writable or >>>> read-only. */ >>>> - page->u.inuse.type_info = >>>> - (flags == SHARE_ro ? PGT_none : PGT_writable_page) | 1; >>>> + page->u.inuse.type_info &= ~(PGT_type_mask | PGT_count_mask); >>>> + page->u.inuse.type_info |= (flags == SHARE_ro ? PGT_none >>>> + : >>>> PGT_writable_page) | >>>> + MASK_INSR(1, PGT_count_mask); >>> >>> ... this is not going to be atomic. So I would suggest to add a >>> comment explaining why this is fine. >> >> >> Yes, I should have added your explanation given in V5 why this is fine. >> >> So I propose the following text, do you agree with that being added? >> >> /* >> * Please note, the update of type_info field here is not atomic as >> we use >> * Read-Modify-Write operation on it. But currently it is fine because >> * the caller of page_set_xenheap_gfn() (which is another place where >> * type_info is updated) would need to acquire a reference on the page. >> * This is only possible after the count_info is updated *and* there a > > Missing word: there *is* a. ok > >> barrier >> * between the type_info and count_info. So there is no immediate >> need to use >> * cmpxchg() here. >> */ >> >> >>> >>> >>>> page_set_owner(page, d); >>>> smp_wmb(); /* install valid domain ptr before updating >>>> refcnt. */ >>>> @@ -1505,7 +1507,23 @@ int xenmem_add_to_physmap_one( >>>> } >>>> /* Map at new location. */ >>>> - rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); >>> >>>> + if ( !p2m_is_ram(t) || !is_xen_heap_mfn(mfn) ) >>>> + rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); >>> >>> I would expand the comment above to explain why you need a different >>> path for xenheap mapped as RAM. AFAICT, this is because we need to >>> call page_set_xenheap_gfn(). >> >> >> agree, I propose the following text, do you agree with that? >> >> /* >> * Map at new location. Here we need to map xenheap RAM page >> differently >> * because we need to store the valid GFN and make sure that nothing >> was >> * mapped before (the stored GFN is invalid). >> */ > > So I think the key point here is the p2m_set_xenheap_gfn() needs to > happen with the P2M lock held. > > That said, looking at the code again this is a bit confusing to use > guest_physmap_add_entry() in one place and p2m_set_entry() in the other. > > The only way I can think to avoid the confusion is than open-coding > guest_physmap_add_entry() (i.e. p2m_write_lock(); p2m_set_entry(); > p2m_write_unlock()) or try to merge the two paths. > > However, I am aware this is already at version 6 and your code should > work. So I would be OK with a comment explaining that > guest_physmap_add_entry() is just a wrapper on top of p2m_set_entry(). ok, thanks > >>>> + else >>>> + { >>>> + struct p2m_domain *p2m = p2m_get_hostp2m(d); >>>> + >>>> + p2m_write_lock(p2m); >>>> + if ( gfn_eq(page_get_xenheap_gfn(mfn_to_page(mfn)), >>>> INVALID_GFN) ) >>> >>> Sorry to only notice it now. This check will also change the >>> behavior for XENMAPSPACE_shared_info. Now, we are only allowed to >>> map the shared info once. >>> >>> I believe this is fine because AFAICT x86 already prevents it. But >>> this is probably something that ought to be explained in the already >>> long commit message. > > So I was wrong thinking that x86 would prevent it (see Jan's answer). > However, I think this is a mistake to allow it because it can result > to weird issues. > > In fact, you mentioned on IRC that there is already an example on how > this hypercall could be misused in U-boot [1]. At the moment, U-boot > will steal a RAM page to map the shared info page but not unmap it. > > The OS will not be aware where the shared info page is mapped. As this > is part of the RAM region, the OS may end up to allocate for other > purpose and corrupt the page. > > If Xen were going to unmap it, then we would create a hole and the OS > will crash when accessing the page. In some way, it is better than > corruption but still a good user experience (the RAM page may only be > accessed a long time after boot). > > So I think it is much better to return -EBUSY here at least we can > catch misuse in the firmware code earlier. > > In the case of U-boot, as discussed on IRC, the code should: > 1) Unmap the page > 2) Populate the area with memory using XENMEM_populate_physmap > > An optimization would be to use the extended regions. But this was > only recently introduced so U-boot cannot always rely on it. you are right, I have nothing to add > >> >> >> agree, I propose the following text, do you agree with that? >> >> Please note, this patch changes the behavior how the shared_info page >> (which is xenheap RAM page) is mapped in xenmem_add_to_physmap_one(). >> Now, we only allow to map the shared_info at once. The subsequent >> attempts >> to map it will result in -EBUSY, if there is a legitimate use case >> we will be able to relax that behavior. > > I would suggest to summarize what I wrote above in the commit message. > I think this is a strong reason to return -EBUSY and push other > projects (e.g. U-boot) to fix there code. agree, will do > > > Cheers, > > [1] > https://source.denx.de/u-boot/u-boot/-/blob/master/drivers/xen/hypervisor.c >
diff --git a/xen/arch/arm/include/asm/grant_table.h b/xen/arch/arm/include/asm/grant_table.h index d31a4d6..16f817b 100644 --- a/xen/arch/arm/include/asm/grant_table.h +++ b/xen/arch/arm/include/asm/grant_table.h @@ -11,11 +11,6 @@ #define INITIAL_NR_GRANT_FRAMES 1U #define GNTTAB_MAX_VERSION 1 -struct grant_table_arch { - gfn_t *shared_gfn; - gfn_t *status_gfn; -}; - static inline void gnttab_clear_flags(struct domain *d, unsigned int mask, uint16_t *addr) { @@ -46,53 +41,27 @@ int replace_grant_host_mapping(unsigned long gpaddr, mfn_t mfn, #define gnttab_dom0_frames() \ min_t(unsigned int, opt_max_grant_frames, PFN_DOWN(_etext - _stext)) -#define gnttab_init_arch(gt) \ -({ \ - unsigned int ngf_ = (gt)->max_grant_frames; \ - unsigned int nsf_ = grant_to_status_frames(ngf_); \ - \ - (gt)->arch.shared_gfn = xmalloc_array(gfn_t, ngf_); \ - (gt)->arch.status_gfn = xmalloc_array(gfn_t, nsf_); \ - if ( (gt)->arch.shared_gfn && (gt)->arch.status_gfn ) \ - { \ - while ( ngf_-- ) \ - (gt)->arch.shared_gfn[ngf_] = INVALID_GFN; \ - while ( nsf_-- ) \ - (gt)->arch.status_gfn[nsf_] = INVALID_GFN; \ - } \ - else \ - gnttab_destroy_arch(gt); \ - (gt)->arch.shared_gfn ? 0 : -ENOMEM; \ -}) - -#define gnttab_destroy_arch(gt) \ - do { \ - XFREE((gt)->arch.shared_gfn); \ - XFREE((gt)->arch.status_gfn); \ - } while ( 0 ) - #define gnttab_set_frame_gfn(gt, st, idx, gfn, mfn) \ - ({ \ - int rc_ = 0; \ - gfn_t ogfn = gnttab_get_frame_gfn(gt, st, idx); \ - if ( gfn_eq(ogfn, INVALID_GFN) || gfn_eq(ogfn, gfn) || \ - (rc_ = guest_physmap_remove_page((gt)->domain, ogfn, mfn, \ - 0)) == 0 ) \ - ((st) ? (gt)->arch.status_gfn \ - : (gt)->arch.shared_gfn)[idx] = (gfn); \ - rc_; \ - }) + (gfn_eq(gfn, INVALID_GFN) \ + ? guest_physmap_remove_page((gt)->domain, \ + gnttab_get_frame_gfn(gt, st, idx), \ + mfn, 0) \ + : 0) #define gnttab_get_frame_gfn(gt, st, idx) ({ \ (st) ? gnttab_status_gfn(NULL, gt, idx) \ : gnttab_shared_gfn(NULL, gt, idx); \ }) +#define gnttab_shared_page(t, i) virt_to_page((t)->shared_raw[i]) + +#define gnttab_status_page(t, i) virt_to_page((t)->status[i]) + #define gnttab_shared_gfn(d, t, i) \ - (((i) >= nr_grant_frames(t)) ? INVALID_GFN : (t)->arch.shared_gfn[i]) + page_get_xenheap_gfn(gnttab_shared_page(t, i)) #define gnttab_status_gfn(d, t, i) \ - (((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i]) + page_get_xenheap_gfn(gnttab_status_page(t, i)) #define gnttab_need_iommu_mapping(d) \ (is_domain_direct_mapped(d) && is_iommu_enabled(d)) diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h index 424aaf2..412200a 100644 --- a/xen/arch/arm/include/asm/mm.h +++ b/xen/arch/arm/include/asm/mm.h @@ -98,9 +98,22 @@ struct page_info #define PGT_writable_page PG_mask(1, 1) /* has writable mappings? */ #define PGT_type_mask PG_mask(1, 1) /* Bits 31 or 63. */ - /* Count of uses of this frame as its current type. */ -#define PGT_count_width PG_shift(2) -#define PGT_count_mask ((1UL<<PGT_count_width)-1) + /* 2-bit count of uses of this frame as its current type. */ +#define PGT_count_mask PG_mask(3, 3) + +/* + * Stored in bits [28:0] (arm32) or [60:0] (arm64) GFN if page is xenheap page. + */ +#define PGT_gfn_width PG_shift(3) +#define PGT_gfn_mask ((1UL<<PGT_gfn_width)-1) + +#define PGT_INVALID_XENHEAP_GFN _gfn(PGT_gfn_mask) + +/* + * An arch-specific initialization pattern is needed for the type_info field + * as it's GFN portion can contain the valid GFN if page is xenheap page. + */ +#define PGT_TYPE_INFO_INITIALIZER gfn_x(PGT_INVALID_XENHEAP_GFN) /* Cleared when the owning guest 'frees' this page. */ #define _PGC_allocated PG_shift(1) @@ -358,6 +371,34 @@ void clear_and_clean_page(struct page_info *page); unsigned int arch_get_dma_bitsize(void); +/* + * All accesses to the GFN portion of type_info field should always be + * protected by the P2M lock. In case when it is not feasible to satisfy + * that requirement (risk of deadlock, lock inversion, etc) it is important + * to make sure that all non-protected updates to this field are atomic. + */ +static inline gfn_t page_get_xenheap_gfn(const struct page_info *p) +{ + gfn_t gfn_ = _gfn(ACCESS_ONCE(p->u.inuse.type_info) & PGT_gfn_mask); + + ASSERT(is_xen_heap_page(p)); + + return gfn_eq(gfn_, PGT_INVALID_XENHEAP_GFN) ? INVALID_GFN : gfn_; +} + +static inline void page_set_xenheap_gfn(struct page_info *p, gfn_t gfn) +{ + gfn_t gfn_ = gfn_eq(gfn, INVALID_GFN) ? PGT_INVALID_XENHEAP_GFN : gfn; + unsigned long x, nx, y = p->u.inuse.type_info; + + ASSERT(is_xen_heap_page(p)); + + do { + x = y; + nx = (x & ~PGT_gfn_mask) | gfn_x(gfn_); + } while ( (y = cmpxchg(&p->u.inuse.type_info, x, nx)) != x ); +} + #endif /* __ARCH_ARM_MM__ */ /* * Local variables: diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c index 7b1f2f4..c94bdaf 100644 --- a/xen/arch/arm/mm.c +++ b/xen/arch/arm/mm.c @@ -1400,8 +1400,10 @@ void share_xen_page_with_guest(struct page_info *page, struct domain *d, spin_lock(&d->page_alloc_lock); /* The incremented type count pins as writable or read-only. */ - page->u.inuse.type_info = - (flags == SHARE_ro ? PGT_none : PGT_writable_page) | 1; + page->u.inuse.type_info &= ~(PGT_type_mask | PGT_count_mask); + page->u.inuse.type_info |= (flags == SHARE_ro ? PGT_none + : PGT_writable_page) | + MASK_INSR(1, PGT_count_mask); page_set_owner(page, d); smp_wmb(); /* install valid domain ptr before updating refcnt. */ @@ -1505,7 +1507,23 @@ int xenmem_add_to_physmap_one( } /* Map at new location. */ - rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); + if ( !p2m_is_ram(t) || !is_xen_heap_mfn(mfn) ) + rc = guest_physmap_add_entry(d, gfn, mfn, 0, t); + else + { + struct p2m_domain *p2m = p2m_get_hostp2m(d); + + p2m_write_lock(p2m); + if ( gfn_eq(page_get_xenheap_gfn(mfn_to_page(mfn)), INVALID_GFN) ) + { + rc = p2m_set_entry(p2m, gfn, 1, mfn, t, p2m->default_access); + if ( !rc ) + page_set_xenheap_gfn(mfn_to_page(mfn), gfn); + } + else + rc = -EBUSY; + p2m_write_unlock(p2m); + } /* * For XENMAPSPACE_gmfn_foreign if we failed to add the mapping, we need diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c index d00c2e4..f87b48e 100644 --- a/xen/arch/arm/p2m.c +++ b/xen/arch/arm/p2m.c @@ -716,6 +716,8 @@ static int p2m_mem_access_radix_set(struct p2m_domain *p2m, gfn_t gfn, */ static void p2m_put_l3_page(const lpae_t pte) { + mfn_t mfn = lpae_get_mfn(pte); + ASSERT(p2m_is_valid(pte)); /* @@ -727,11 +729,12 @@ static void p2m_put_l3_page(const lpae_t pte) */ if ( p2m_is_foreign(pte.p2m.type) ) { - mfn_t mfn = lpae_get_mfn(pte); - ASSERT(mfn_valid(mfn)); put_page(mfn_to_page(mfn)); } + /* Detect the xenheap page and mark the stored GFN as invalid. */ + else if ( p2m_is_ram(pte.p2m.type) && is_xen_heap_mfn(mfn) ) + page_set_xenheap_gfn(mfn_to_page(mfn), INVALID_GFN); } /* Free lpae sub-tree behind an entry */ diff --git a/xen/arch/x86/include/asm/grant_table.h b/xen/arch/x86/include/asm/grant_table.h index a8a2143..5c23cec 100644 --- a/xen/arch/x86/include/asm/grant_table.h +++ b/xen/arch/x86/include/asm/grant_table.h @@ -14,9 +14,6 @@ #define INITIAL_NR_GRANT_FRAMES 1U -struct grant_table_arch { -}; - static inline int create_grant_host_mapping(uint64_t addr, mfn_t frame, unsigned int flags, unsigned int cache_flags) @@ -35,8 +32,6 @@ static inline int replace_grant_host_mapping(uint64_t addr, mfn_t frame, return replace_grant_pv_mapping(addr, frame, new_addr, flags); } -#define gnttab_init_arch(gt) 0 -#define gnttab_destroy_arch(gt) do {} while ( 0 ) #define gnttab_set_frame_gfn(gt, st, idx, gfn, mfn) \ (gfn_eq(gfn, INVALID_GFN) \ ? guest_physmap_remove_page((gt)->domain, \ diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c index febbe12..4115da0 100644 --- a/xen/common/grant_table.c +++ b/xen/common/grant_table.c @@ -99,8 +99,6 @@ struct grant_table { /* Domain to which this struct grant_table belongs. */ struct domain *domain; - - struct grant_table_arch arch; }; unsigned int __read_mostly opt_max_grant_frames = 64; @@ -2018,14 +2016,9 @@ int grant_table_init(struct domain *d, int max_grant_frames, grant_write_lock(gt); - ret = gnttab_init_arch(gt); - if ( ret ) - goto unlock; - /* gnttab_grow_table() allocates a min number of frames, so 0 is okay. */ ret = gnttab_grow_table(d, 0); - unlock: grant_write_unlock(gt); out: @@ -3940,8 +3933,6 @@ grant_table_destroy( if ( t == NULL ) return; - gnttab_destroy_arch(t); - for ( i = 0; i < nr_grant_frames(t); i++ ) free_xenheap_page(t->shared_raw[i]); xfree(t->shared_raw); diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c index 3190291..dbacee2 100644 --- a/xen/common/page_alloc.c +++ b/xen/common/page_alloc.c @@ -155,6 +155,10 @@ #define PGC_reserved 0 #endif +#ifndef PGT_TYPE_INFO_INITIALIZER +#define PGT_TYPE_INFO_INITIALIZER 0 +#endif + /* * Comma-separated list of hexadecimal page numbers containing bad bytes. * e.g. 'badpage=0x3f45,0x8a321'. @@ -1024,7 +1028,7 @@ static struct page_info *alloc_heap_pages( &tlbflush_timestamp); /* Initialise fields which have other uses for free pages. */ - pg[i].u.inuse.type_info = 0; + pg[i].u.inuse.type_info = PGT_TYPE_INFO_INITIALIZER; page_set_owner(&pg[i], NULL); } @@ -2702,7 +2706,7 @@ static struct page_info * __init acquire_staticmem_pages(mfn_t smfn, */ pg[i].count_info = PGC_reserved | PGC_state_inuse; /* Initialise fields which have other uses for free pages. */ - pg[i].u.inuse.type_info = 0; + pg[i].u.inuse.type_info = PGT_TYPE_INFO_INITIALIZER; page_set_owner(&pg[i], NULL); }