Message ID | 20201020152405.26892-1-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86/pv: Flush TLB in response to paging structure changes | expand |
On 20/10/2020 16:24, Andrew Cooper wrote: > With MMU_UPDATE, a PV guest can make changes to higher level pagetables. This > is from Xen's point of view (as the update only affects guest mappings), and > the guest is required to flush suitably after making updates. > > However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, > writeable pagetables, etc.) is an implementation detail outside of the > API/ABI. > > Changes in the paging structure require invalidations in the linear pagetable > range for subsequent accesses into the linear pagetables to access non-stale > mappings. Xen must provide suitable flushing to prevent intermixed guest > actions from accidentally accessing/modifying the wrong pagetable. > > For all L2 and higher modifications, flush the full TLB. (This could in > principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no such > mechanism exists in practice.) > > As this combines with sync_guest for XPTI L4 "shadowing", replace the > sync_guest boolean with flush_flags and accumulate flags. The sync_guest case > now always needs to flush, there is no point trying to exclude the current CPU > from the flush mask. Use pt_owner->dirty_cpumask directly. > > This is XSA-286. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: Jan Beulich <JBeulich@suse.com> > CC: Roger Pau Monné <roger.pau@citrix.com> > CC: Wei Liu <wl@xen.org> > > A couple of minor points. > > * PV guests can create global mappings. I can't reason any safe way to relax > FLUSH_TLB_GLOBAL to just FLUSH_TLB. Sorry - forgot one of the points here. We could in principle relax the flush entirely if we know that we're editing from a not-present to present entry, but plumbing this up through mod_l?_entry() isn't trivial, and its also not not obvious how much of an optimisation it would be in practice. ~Andrew > * Performance tests are still ongoing, but so far is fairing better than the > embargoed alternative. > --- > xen/arch/x86/mm.c | 31 +++++++++++++++---------------- > 1 file changed, 15 insertions(+), 16 deletions(-) > > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c > index 918ee2bbe3..a6a7fcb56c 100644 > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -3883,11 +3883,10 @@ long do_mmu_update( > void *va = NULL; > unsigned long gpfn, gmfn; > struct page_info *page; > - unsigned int cmd, i = 0, done = 0, pt_dom; > + unsigned int cmd, i = 0, done = 0, pt_dom, flush_flags = 0; > struct vcpu *curr = current, *v = curr; > struct domain *d = v->domain, *pt_owner = d, *pg_owner; > mfn_t map_mfn = INVALID_MFN, mfn; > - bool sync_guest = false; > uint32_t xsm_needed = 0; > uint32_t xsm_checked = 0; > int rc = put_old_guest_table(curr); > @@ -4037,6 +4036,8 @@ long do_mmu_update( > break; > rc = mod_l2_entry(va, l2e_from_intpte(req.val), mfn, > cmd == MMU_PT_UPDATE_PRESERVE_AD, v); > + if ( !rc ) > + flush_flags |= FLUSH_TLB_GLOBAL; > break; > > case PGT_l3_page_table: > @@ -4044,6 +4045,8 @@ long do_mmu_update( > break; > rc = mod_l3_entry(va, l3e_from_intpte(req.val), mfn, > cmd == MMU_PT_UPDATE_PRESERVE_AD, v); > + if ( !rc ) > + flush_flags |= FLUSH_TLB_GLOBAL; > break; > > case PGT_l4_page_table: > @@ -4051,6 +4054,8 @@ long do_mmu_update( > break; > rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn, > cmd == MMU_PT_UPDATE_PRESERVE_AD, v); > + if ( !rc ) > + flush_flags |= FLUSH_TLB_GLOBAL; > if ( !rc && pt_owner->arch.pv.xpti ) > { > bool local_in_use = false; > @@ -4071,7 +4076,7 @@ long do_mmu_update( > (1 + !!(page->u.inuse.type_info & PGT_pinned) + > mfn_eq(pagetable_get_mfn(curr->arch.guest_table_user), > mfn) + local_in_use) ) > - sync_guest = true; > + flush_flags |= FLUSH_ROOT_PGTBL; > } > break; > > @@ -4173,19 +4178,13 @@ long do_mmu_update( > if ( va ) > unmap_domain_page(va); > > - if ( sync_guest ) > - { > - /* > - * Force other vCPU-s of the affected guest to pick up L4 entry > - * changes (if any). > - */ > - unsigned int cpu = smp_processor_id(); > - cpumask_t *mask = per_cpu(scratch_cpumask, cpu); > - > - cpumask_andnot(mask, pt_owner->dirty_cpumask, cpumask_of(cpu)); > - if ( !cpumask_empty(mask) ) > - flush_mask(mask, FLUSH_TLB_GLOBAL | FLUSH_ROOT_PGTBL); > - } > + /* > + * Flush TLBs if an L2 or higher was changed (invalidates the structure of > + * the linear pagetables), or an L4 in use by other CPUs was made (needs > + * to resync the XPTI copy of the table). > + */ > + if ( flush_flags ) > + flush_mask(pt_owner->dirty_cpumask, flush_flags); > > perfc_add(num_page_updates, i); >
On 20.10.2020 17:24, Andrew Cooper wrote: > With MMU_UPDATE, a PV guest can make changes to higher level pagetables. This > is from Xen's point of view (as the update only affects guest mappings), and > the guest is required to flush suitably after making updates. > > However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, > writeable pagetables, etc.) is an implementation detail outside of the > API/ABI. > > Changes in the paging structure require invalidations in the linear pagetable > range for subsequent accesses into the linear pagetables to access non-stale > mappings. Xen must provide suitable flushing to prevent intermixed guest > actions from accidentally accessing/modifying the wrong pagetable. > > For all L2 and higher modifications, flush the full TLB. (This could in > principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no such > mechanism exists in practice.) > > As this combines with sync_guest for XPTI L4 "shadowing", replace the > sync_guest boolean with flush_flags and accumulate flags. The sync_guest case > now always needs to flush, there is no point trying to exclude the current CPU > from the flush mask. Use pt_owner->dirty_cpumask directly. Why is there no point? There's no need for the FLUSH_ROOT_PGTBL part of the flushing on the local CPU. The draft you had sent earlier looked better in this regard. Jan
On 20/10/2020 16:48, Jan Beulich wrote: > On 20.10.2020 17:24, Andrew Cooper wrote: >> With MMU_UPDATE, a PV guest can make changes to higher level pagetables. This >> is from Xen's point of view (as the update only affects guest mappings), and >> the guest is required to flush suitably after making updates. >> >> However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, >> writeable pagetables, etc.) is an implementation detail outside of the >> API/ABI. >> >> Changes in the paging structure require invalidations in the linear pagetable >> range for subsequent accesses into the linear pagetables to access non-stale >> mappings. Xen must provide suitable flushing to prevent intermixed guest >> actions from accidentally accessing/modifying the wrong pagetable. >> >> For all L2 and higher modifications, flush the full TLB. (This could in >> principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no such >> mechanism exists in practice.) >> >> As this combines with sync_guest for XPTI L4 "shadowing", replace the >> sync_guest boolean with flush_flags and accumulate flags. The sync_guest case >> now always needs to flush, there is no point trying to exclude the current CPU >> from the flush mask. Use pt_owner->dirty_cpumask directly. > Why is there no point? There's no need for the FLUSH_ROOT_PGTBL > part of the flushing on the local CPU. The draft you had sent > earlier looked better in this regard. This was the area which broke. It is to do with subtle difference in the scope of L4 updates. ROOT_PGTBL needs to resync current (if in use), and be broadcasted if other references to the pages are found. The TLB flush needs to be broadcast to the whole domain dirty mask, as we can't (easily) know if the update was part of the current structure. ~Andrew
On 20/10/2020 17:20, Andrew Cooper wrote: > On 20/10/2020 16:48, Jan Beulich wrote: >> On 20.10.2020 17:24, Andrew Cooper wrote: >>> With MMU_UPDATE, a PV guest can make changes to higher level pagetables. This >>> is from Xen's point of view (as the update only affects guest mappings), and >>> the guest is required to flush suitably after making updates. >>> >>> However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, >>> writeable pagetables, etc.) is an implementation detail outside of the >>> API/ABI. >>> >>> Changes in the paging structure require invalidations in the linear pagetable >>> range for subsequent accesses into the linear pagetables to access non-stale >>> mappings. Xen must provide suitable flushing to prevent intermixed guest >>> actions from accidentally accessing/modifying the wrong pagetable. >>> >>> For all L2 and higher modifications, flush the full TLB. (This could in >>> principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no such >>> mechanism exists in practice.) >>> >>> As this combines with sync_guest for XPTI L4 "shadowing", replace the >>> sync_guest boolean with flush_flags and accumulate flags. The sync_guest case >>> now always needs to flush, there is no point trying to exclude the current CPU >>> from the flush mask. Use pt_owner->dirty_cpumask directly. >> Why is there no point? There's no need for the FLUSH_ROOT_PGTBL >> part of the flushing on the local CPU. The draft you had sent >> earlier looked better in this regard. > This was the area which broke. It is to do with subtle difference in > the scope of L4 updates. > > ROOT_PGTBL needs to resync current (if in use), and be broadcasted if > other references to the pages are found. > > The TLB flush needs to be broadcast to the whole domain dirty mask, as > we can't (easily) know if the update was part of the current structure. Actually - we can know whether an L4 update needs flushing locally or not, in exactly the same way as the sync logic currently works. However, unlike the opencoded get_cpu_info()->root_pgt_changed = true, we can't just flush locally for free. This is quite awkward to express. ~Andrew
On 20/10/2020 18:10, Andrew Cooper wrote: > On 20/10/2020 17:20, Andrew Cooper wrote: >> On 20/10/2020 16:48, Jan Beulich wrote: >>> On 20.10.2020 17:24, Andrew Cooper wrote: >>>> With MMU_UPDATE, a PV guest can make changes to higher level pagetables. This >>>> is from Xen's point of view (as the update only affects guest mappings), and >>>> the guest is required to flush suitably after making updates. >>>> >>>> However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, >>>> writeable pagetables, etc.) is an implementation detail outside of the >>>> API/ABI. >>>> >>>> Changes in the paging structure require invalidations in the linear pagetable >>>> range for subsequent accesses into the linear pagetables to access non-stale >>>> mappings. Xen must provide suitable flushing to prevent intermixed guest >>>> actions from accidentally accessing/modifying the wrong pagetable. >>>> >>>> For all L2 and higher modifications, flush the full TLB. (This could in >>>> principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no such >>>> mechanism exists in practice.) >>>> >>>> As this combines with sync_guest for XPTI L4 "shadowing", replace the >>>> sync_guest boolean with flush_flags and accumulate flags. The sync_guest case >>>> now always needs to flush, there is no point trying to exclude the current CPU >>>> from the flush mask. Use pt_owner->dirty_cpumask directly. >>> Why is there no point? There's no need for the FLUSH_ROOT_PGTBL >>> part of the flushing on the local CPU. The draft you had sent >>> earlier looked better in this regard. >> This was the area which broke. It is to do with subtle difference in >> the scope of L4 updates. >> >> ROOT_PGTBL needs to resync current (if in use), and be broadcasted if >> other references to the pages are found. >> >> The TLB flush needs to be broadcast to the whole domain dirty mask, as >> we can't (easily) know if the update was part of the current structure. > Actually - we can know whether an L4 update needs flushing locally or > not, in exactly the same way as the sync logic currently works. > > However, unlike the opencoded get_cpu_info()->root_pgt_changed = true, > we can't just flush locally for free. > > This is quite awkward to express. And not safe. Flushes may accumulate from multiple levels in a batch, and pt_owner may not be equal to current. I stand by the version submitted as the security fix. ~Andrew
On 20.10.2020 17:24, Andrew Cooper wrote: > A couple of minor points. > > * PV guests can create global mappings. I can't reason any safe way to relax > FLUSH_TLB_GLOBAL to just FLUSH_TLB. We only care about guest view here, and from guest view we only care about non-leaf entries. Non-leaf entries can't be global, and luckily (for now at least) the G bit also doesn't get different meaning assigned in non-leaf entries. Jan
On 20.10.2020 20:46, Andrew Cooper wrote: > On 20/10/2020 18:10, Andrew Cooper wrote: >> On 20/10/2020 17:20, Andrew Cooper wrote: >>> On 20/10/2020 16:48, Jan Beulich wrote: >>>> On 20.10.2020 17:24, Andrew Cooper wrote: >>>>> With MMU_UPDATE, a PV guest can make changes to higher level pagetables. This >>>>> is from Xen's point of view (as the update only affects guest mappings), and >>>>> the guest is required to flush suitably after making updates. >>>>> >>>>> However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, >>>>> writeable pagetables, etc.) is an implementation detail outside of the >>>>> API/ABI. >>>>> >>>>> Changes in the paging structure require invalidations in the linear pagetable >>>>> range for subsequent accesses into the linear pagetables to access non-stale >>>>> mappings. Xen must provide suitable flushing to prevent intermixed guest >>>>> actions from accidentally accessing/modifying the wrong pagetable. >>>>> >>>>> For all L2 and higher modifications, flush the full TLB. (This could in >>>>> principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no such >>>>> mechanism exists in practice.) >>>>> >>>>> As this combines with sync_guest for XPTI L4 "shadowing", replace the >>>>> sync_guest boolean with flush_flags and accumulate flags. The sync_guest case >>>>> now always needs to flush, there is no point trying to exclude the current CPU >>>>> from the flush mask. Use pt_owner->dirty_cpumask directly. >>>> Why is there no point? There's no need for the FLUSH_ROOT_PGTBL >>>> part of the flushing on the local CPU. The draft you had sent >>>> earlier looked better in this regard. >>> This was the area which broke. It is to do with subtle difference in >>> the scope of L4 updates. >>> >>> ROOT_PGTBL needs to resync current (if in use), and be broadcasted if >>> other references to the pages are found. >>> >>> The TLB flush needs to be broadcast to the whole domain dirty mask, as >>> we can't (easily) know if the update was part of the current structure. >> Actually - we can know whether an L4 update needs flushing locally or >> not, in exactly the same way as the sync logic currently works. >> >> However, unlike the opencoded get_cpu_info()->root_pgt_changed = true, >> we can't just flush locally for free. >> >> This is quite awkward to express. > > And not safe. Flushes may accumulate from multiple levels in a batch, > and pt_owner may not be equal to current. I'm not questioning the TLB flush - this needs to be the scope you use (but just FLUSH_TLB as per my earlier reply). I'm questioning the extra ROOT_PGTBL sync (meaning changes to levels other than L4 don't matter), which is redundant with the explicit setting right after the call to mod_l4_entry(). But I guess since now you need to issue _some_ flush_mask() for the local CPU anyway, perhaps it's rather the explicit setting of ->root_pgt_changed which wants dropping? (If pt_owner != current->domain, then pt_owner->dirty_cpumask can't have smp_processor_id()'s bit set, and hence there was no reduction in scope in this case anyway. Similarly in this case local_in_use is necessarily false, as page tables can't be shared between domains.) Taking both adjustments together - all L[234] changes require FLUSH_TLB on dirty CPUs of pt_owner including the local CPU - the converted sync_guest continues to require FLUSH_TLB_GLOBAL | FLUSH_ROOT_PGTBL on remote dirty CPUs of pt_owner This difference, I think, still warrants treating the local CPU specially, as the global flush continues to be unnecessary there. Whether the local CPU's ->root_pgt_changed gets set via flush_local() or explicitly is then a pretty benign secondary aspect. Jan
On 21/10/2020 07:55, Jan Beulich wrote: > On 20.10.2020 20:46, Andrew Cooper wrote: >> On 20/10/2020 18:10, Andrew Cooper wrote: >>> On 20/10/2020 17:20, Andrew Cooper wrote: >>>> On 20/10/2020 16:48, Jan Beulich wrote: >>>>> On 20.10.2020 17:24, Andrew Cooper wrote: >>>>>> With MMU_UPDATE, a PV guest can make changes to higher level pagetables. This >>>>>> is from Xen's point of view (as the update only affects guest mappings), and >>>>>> the guest is required to flush suitably after making updates. >>>>>> >>>>>> However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, >>>>>> writeable pagetables, etc.) is an implementation detail outside of the >>>>>> API/ABI. >>>>>> >>>>>> Changes in the paging structure require invalidations in the linear pagetable >>>>>> range for subsequent accesses into the linear pagetables to access non-stale >>>>>> mappings. Xen must provide suitable flushing to prevent intermixed guest >>>>>> actions from accidentally accessing/modifying the wrong pagetable. >>>>>> >>>>>> For all L2 and higher modifications, flush the full TLB. (This could in >>>>>> principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no such >>>>>> mechanism exists in practice.) >>>>>> >>>>>> As this combines with sync_guest for XPTI L4 "shadowing", replace the >>>>>> sync_guest boolean with flush_flags and accumulate flags. The sync_guest case >>>>>> now always needs to flush, there is no point trying to exclude the current CPU >>>>>> from the flush mask. Use pt_owner->dirty_cpumask directly. >>>>> Why is there no point? There's no need for the FLUSH_ROOT_PGTBL >>>>> part of the flushing on the local CPU. The draft you had sent >>>>> earlier looked better in this regard. >>>> This was the area which broke. It is to do with subtle difference in >>>> the scope of L4 updates. >>>> >>>> ROOT_PGTBL needs to resync current (if in use), and be broadcasted if >>>> other references to the pages are found. >>>> >>>> The TLB flush needs to be broadcast to the whole domain dirty mask, as >>>> we can't (easily) know if the update was part of the current structure. >>> Actually - we can know whether an L4 update needs flushing locally or >>> not, in exactly the same way as the sync logic currently works. >>> >>> However, unlike the opencoded get_cpu_info()->root_pgt_changed = true, >>> we can't just flush locally for free. >>> >>> This is quite awkward to express. >> And not safe. Flushes may accumulate from multiple levels in a batch, >> and pt_owner may not be equal to current. > I'm not questioning the TLB flush - this needs to be the scope you > use (but just FLUSH_TLB as per my earlier reply). I'm questioning > the extra ROOT_PGTBL sync (meaning changes to levels other than L4 > don't matter), which is redundant with the explicit setting right > after the call to mod_l4_entry(). But I guess since now you need > to issue _some_ flush_mask() for the local CPU anyway, perhaps > it's rather the explicit setting of ->root_pgt_changed which wants > dropping? No. That was the delta which delayed posting in the first place. Dom0 crashes very early without it. The problem, as I said, is the asymmetry. As dom0 is booting, the "only one use of this L4" logic kicks in, and skips setting ROOT_PGTBL, which then causes the flush_mask() not to flush on the local CPU either. ~Andrew
On 21.10.2020 12:01, Andrew Cooper wrote: > On 21/10/2020 07:55, Jan Beulich wrote: >> On 20.10.2020 20:46, Andrew Cooper wrote: >>> On 20/10/2020 18:10, Andrew Cooper wrote: >>>> On 20/10/2020 17:20, Andrew Cooper wrote: >>>>> On 20/10/2020 16:48, Jan Beulich wrote: >>>>>> On 20.10.2020 17:24, Andrew Cooper wrote: >>>>>>> With MMU_UPDATE, a PV guest can make changes to higher level pagetables. This >>>>>>> is from Xen's point of view (as the update only affects guest mappings), and >>>>>>> the guest is required to flush suitably after making updates. >>>>>>> >>>>>>> However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, >>>>>>> writeable pagetables, etc.) is an implementation detail outside of the >>>>>>> API/ABI. >>>>>>> >>>>>>> Changes in the paging structure require invalidations in the linear pagetable >>>>>>> range for subsequent accesses into the linear pagetables to access non-stale >>>>>>> mappings. Xen must provide suitable flushing to prevent intermixed guest >>>>>>> actions from accidentally accessing/modifying the wrong pagetable. >>>>>>> >>>>>>> For all L2 and higher modifications, flush the full TLB. (This could in >>>>>>> principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no such >>>>>>> mechanism exists in practice.) >>>>>>> >>>>>>> As this combines with sync_guest for XPTI L4 "shadowing", replace the >>>>>>> sync_guest boolean with flush_flags and accumulate flags. The sync_guest case >>>>>>> now always needs to flush, there is no point trying to exclude the current CPU >>>>>>> from the flush mask. Use pt_owner->dirty_cpumask directly. >>>>>> Why is there no point? There's no need for the FLUSH_ROOT_PGTBL >>>>>> part of the flushing on the local CPU. The draft you had sent >>>>>> earlier looked better in this regard. >>>>> This was the area which broke. It is to do with subtle difference in >>>>> the scope of L4 updates. >>>>> >>>>> ROOT_PGTBL needs to resync current (if in use), and be broadcasted if >>>>> other references to the pages are found. >>>>> >>>>> The TLB flush needs to be broadcast to the whole domain dirty mask, as >>>>> we can't (easily) know if the update was part of the current structure. >>>> Actually - we can know whether an L4 update needs flushing locally or >>>> not, in exactly the same way as the sync logic currently works. >>>> >>>> However, unlike the opencoded get_cpu_info()->root_pgt_changed = true, >>>> we can't just flush locally for free. >>>> >>>> This is quite awkward to express. >>> And not safe. Flushes may accumulate from multiple levels in a batch, >>> and pt_owner may not be equal to current. >> I'm not questioning the TLB flush - this needs to be the scope you >> use (but just FLUSH_TLB as per my earlier reply). I'm questioning >> the extra ROOT_PGTBL sync (meaning changes to levels other than L4 >> don't matter), which is redundant with the explicit setting right >> after the call to mod_l4_entry(). But I guess since now you need >> to issue _some_ flush_mask() for the local CPU anyway, perhaps >> it's rather the explicit setting of ->root_pgt_changed which wants >> dropping? > > No. That was the delta which delayed posting in the first place. Dom0 > crashes very early without it. > > The problem, as I said, is the asymmetry. > > As dom0 is booting, the "only one use of this L4" logic kicks in, and > skips setting ROOT_PGTBL, which then causes the flush_mask() not to > flush on the local CPU either. Ah, I think I finally got it. This asymmetry wants expressing then in two different sets of flush flags (not sure whether two different variables are needed), since - as per other replies - the local CPU has different requirements anyway. - all CPUs need FLUSH_TLB - the local CPU may additionally need FLUSH_ROOT_PGTBL - other CPUs may additionally (or instead, if you like) need FLUSH_ROOT_PGTBL | FLUSH_TLB_GLOBAL But then of course the local CPU can as well have its ->root_pgt_changed updated directly - there's no difference whether it gets done this way or by passing FLUSH_ROOT_PGTBL to flush_local(). Jan
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 918ee2bbe3..a6a7fcb56c 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3883,11 +3883,10 @@ long do_mmu_update( void *va = NULL; unsigned long gpfn, gmfn; struct page_info *page; - unsigned int cmd, i = 0, done = 0, pt_dom; + unsigned int cmd, i = 0, done = 0, pt_dom, flush_flags = 0; struct vcpu *curr = current, *v = curr; struct domain *d = v->domain, *pt_owner = d, *pg_owner; mfn_t map_mfn = INVALID_MFN, mfn; - bool sync_guest = false; uint32_t xsm_needed = 0; uint32_t xsm_checked = 0; int rc = put_old_guest_table(curr); @@ -4037,6 +4036,8 @@ long do_mmu_update( break; rc = mod_l2_entry(va, l2e_from_intpte(req.val), mfn, cmd == MMU_PT_UPDATE_PRESERVE_AD, v); + if ( !rc ) + flush_flags |= FLUSH_TLB_GLOBAL; break; case PGT_l3_page_table: @@ -4044,6 +4045,8 @@ long do_mmu_update( break; rc = mod_l3_entry(va, l3e_from_intpte(req.val), mfn, cmd == MMU_PT_UPDATE_PRESERVE_AD, v); + if ( !rc ) + flush_flags |= FLUSH_TLB_GLOBAL; break; case PGT_l4_page_table: @@ -4051,6 +4054,8 @@ long do_mmu_update( break; rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn, cmd == MMU_PT_UPDATE_PRESERVE_AD, v); + if ( !rc ) + flush_flags |= FLUSH_TLB_GLOBAL; if ( !rc && pt_owner->arch.pv.xpti ) { bool local_in_use = false; @@ -4071,7 +4076,7 @@ long do_mmu_update( (1 + !!(page->u.inuse.type_info & PGT_pinned) + mfn_eq(pagetable_get_mfn(curr->arch.guest_table_user), mfn) + local_in_use) ) - sync_guest = true; + flush_flags |= FLUSH_ROOT_PGTBL; } break; @@ -4173,19 +4178,13 @@ long do_mmu_update( if ( va ) unmap_domain_page(va); - if ( sync_guest ) - { - /* - * Force other vCPU-s of the affected guest to pick up L4 entry - * changes (if any). - */ - unsigned int cpu = smp_processor_id(); - cpumask_t *mask = per_cpu(scratch_cpumask, cpu); - - cpumask_andnot(mask, pt_owner->dirty_cpumask, cpumask_of(cpu)); - if ( !cpumask_empty(mask) ) - flush_mask(mask, FLUSH_TLB_GLOBAL | FLUSH_ROOT_PGTBL); - } + /* + * Flush TLBs if an L2 or higher was changed (invalidates the structure of + * the linear pagetables), or an L4 in use by other CPUs was made (needs + * to resync the XPTI copy of the table). + */ + if ( flush_flags ) + flush_mask(pt_owner->dirty_cpumask, flush_flags); perfc_add(num_page_updates, i);
With MMU_UPDATE, a PV guest can make changes to higher level pagetables. This is from Xen's point of view (as the update only affects guest mappings), and the guest is required to flush suitably after making updates. However, Xen's use of linear pagetables (UPDATE_VA_MAPPING, GNTTABOP_map, writeable pagetables, etc.) is an implementation detail outside of the API/ABI. Changes in the paging structure require invalidations in the linear pagetable range for subsequent accesses into the linear pagetables to access non-stale mappings. Xen must provide suitable flushing to prevent intermixed guest actions from accidentally accessing/modifying the wrong pagetable. For all L2 and higher modifications, flush the full TLB. (This could in principle be an order 39 flush starting at LINEAR_PT_VIRT_START, but no such mechanism exists in practice.) As this combines with sync_guest for XPTI L4 "shadowing", replace the sync_guest boolean with flush_flags and accumulate flags. The sync_guest case now always needs to flush, there is no point trying to exclude the current CPU from the flush mask. Use pt_owner->dirty_cpumask directly. This is XSA-286. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> A couple of minor points. * PV guests can create global mappings. I can't reason any safe way to relax FLUSH_TLB_GLOBAL to just FLUSH_TLB. * Performance tests are still ongoing, but so far is fairing better than the embargoed alternative. --- xen/arch/x86/mm.c | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-)