Message ID: 20200406105703.79201-2-roger.pau@citrix.com
State: Superseded
Series: x86/guest: use assisted TLB flush in guest mode
On 06.04.2020 12:57, Roger Pau Monne wrote:
> Introduce a specific flag to request a HVM guest linear TLB flush, which is an ASID/VPID tickle that forces a guest linear to guest physical TLB flush for all HVM guests.
>
> This was previously unconditionally done in each pre_flush call, but that's not required: HVM guests not using shadow don't require linear TLB flushes as Xen doesn't modify the guest page tables in that case (ie: when using HAP). Note that shadow paging code already takes care of issuing the necessary flushes when the shadow page tables are modified.
>
> In order to keep the previous behavior modify all shadow code TLB flushes to also flush the guest linear to physical TLB if the guest is HVM. I haven't looked at each specific shadow code TLB flush in order to figure out whether it actually requires a guest TLB flush or not, so there might be room for improvement in that regard.
>
> Also perform ASID/VPID flushes when modifying the p2m tables as it's a requirement for AMD hardware. Finally keep the flush in switch_cr3_cr4, as it's not clear whether code could rely on switch_cr3_cr4 also performing a guest linear TLB flush. A following patch can remove the ASID/VPID tickle from switch_cr3_cr4 if found to not be necessary.
>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one really minor remark:

> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -613,7 +613,8 @@ void paging_log_dirty_range(struct domain *d,
>
>      p2m_unlock(p2m);
>
> -    flush_tlb_mask(d->dirty_cpumask);
> +    flush_mask(d->dirty_cpumask, (!hap_enabled(d) ? FLUSH_TLB : 0) |
> +               FLUSH_HVM_ASID_CORE);

In cases where one case is assumed to be more likely than the other, putting the more likely one first can be viewed as a mild hint to the compiler, and hence an extra ! may be warranted in an if() or a conditional expression. Here, however, I don't think we can really consider one case more likely than the other, and hence I'd suggest to avoid the !, flipping the other two expressions accordingly. I may take the liberty to adjust this while committing (if I'm to be the one).

Jan
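For reference, the flipped form being suggested would read as follows (a sketch of just this call site; the remainder of paging_log_dirty_range() is unchanged, and Jan spells out the same expression in his next reply):

    flush_mask(d->dirty_cpumask, (hap_enabled(d) ? 0 : FLUSH_TLB) |
               FLUSH_HVM_ASID_CORE);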
On Wed, Apr 08, 2020 at 01:25:14PM +0200, Jan Beulich wrote:
> On 06.04.2020 12:57, Roger Pau Monne wrote:
> > Introduce a specific flag to request a HVM guest linear TLB flush, which is an ASID/VPID tickle that forces a guest linear to guest physical TLB flush for all HVM guests.
> >
> > This was previously unconditionally done in each pre_flush call, but that's not required: HVM guests not using shadow don't require linear TLB flushes as Xen doesn't modify the guest page tables in that case (ie: when using HAP). Note that shadow paging code already takes care of issuing the necessary flushes when the shadow page tables are modified.
> >
> > In order to keep the previous behavior modify all shadow code TLB flushes to also flush the guest linear to physical TLB if the guest is HVM. I haven't looked at each specific shadow code TLB flush in order to figure out whether it actually requires a guest TLB flush or not, so there might be room for improvement in that regard.
> >
> > Also perform ASID/VPID flushes when modifying the p2m tables as it's a requirement for AMD hardware. Finally keep the flush in switch_cr3_cr4, as it's not clear whether code could rely on switch_cr3_cr4 also performing a guest linear TLB flush. A following patch can remove the ASID/VPID tickle from switch_cr3_cr4 if found to not be necessary.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> with one really minor remark:
>
> > --- a/xen/arch/x86/mm/paging.c
> > +++ b/xen/arch/x86/mm/paging.c
> > @@ -613,7 +613,8 @@ void paging_log_dirty_range(struct domain *d,
> >
> >      p2m_unlock(p2m);
> >
> > -    flush_tlb_mask(d->dirty_cpumask);
> > +    flush_mask(d->dirty_cpumask, (!hap_enabled(d) ? FLUSH_TLB : 0) |
> > +               FLUSH_HVM_ASID_CORE);
>
> In cases where one case is assumed to be more likely than the other, putting the more likely one first can be viewed as a mild hint to the compiler, and hence an extra ! may be warranted in an if() or a conditional expression. Here, however, I don't think we can really consider one case more likely than the other, and hence I'd suggest to avoid the !, flipping the other two expressions accordingly. I may take the liberty to adjust this while committing (if I'm to be the one).

That's fine, thanks. Somehow '!hap -> flush' was clearer in my mind.

Roger.
On 08.04.2020 17:10, Roger Pau Monné wrote:
> On Wed, Apr 08, 2020 at 01:25:14PM +0200, Jan Beulich wrote:
>> On 06.04.2020 12:57, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/mm/paging.c
>>> +++ b/xen/arch/x86/mm/paging.c
>>> @@ -613,7 +613,8 @@ void paging_log_dirty_range(struct domain *d,
>>>
>>>      p2m_unlock(p2m);
>>>
>>> -    flush_tlb_mask(d->dirty_cpumask);
>>> +    flush_mask(d->dirty_cpumask, (!hap_enabled(d) ? FLUSH_TLB : 0) |
>>> +               FLUSH_HVM_ASID_CORE);
>>
>> In cases where one case is assumed to be more likely than the other, putting the more likely one first can be viewed as a mild hint to the compiler, and hence an extra ! may be warranted in an if() or a conditional expression. Here, however, I don't think we can really consider one case more likely than the other, and hence I'd suggest to avoid the !, flipping the other two expressions accordingly. I may take the liberty to adjust this while committing (if I'm to be the one).
>
> That's fine, thanks. Somehow '!hap -> flush' was clearer in my mind.

Thinking about it with the other HVM-related changes in v9, shouldn't this then be

    flush_mask(d->dirty_cpumask, (hap_enabled(d) ? 0 : FLUSH_TLB) |
                                 (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0));

Or wait - the only caller lives in hap.c. As a result the FLUSH_TLB part can be dropped altogether. And I question the need of flushing guest TLBs - this is purely a p2m operation. I'll go look at the history of this function, but for now I think the call should be dropped (albeit then maybe better in a separate patch).

Jan
On 06.04.2020 12:57, Roger Pau Monne wrote:
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
>              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
>                                    p2m_ram_rw, p2m_ram_logdirty);
>
> -            flush_tlb_mask(d->dirty_cpumask);
> +            hap_flush_tlb_mask(d->dirty_cpumask);
>
>              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
>          }
> @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
>           * to be read-only, or via hardware-assisted log-dirty.
>           */
>          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> -        flush_tlb_mask(d->dirty_cpumask);
> +        hap_flush_tlb_mask(d->dirty_cpumask);
>      }
>      return 0;
>  }
> @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
>       * be read-only, or via hardware-assisted log-dirty.
>       */
>      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> -    flush_tlb_mask(d->dirty_cpumask);
> +    hap_flush_tlb_mask(d->dirty_cpumask);
>  }
>
>  /************************************************/
> @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
>
>      safe_write_pte(p, new);
>      if ( old_flags & _PAGE_PRESENT )
> -        flush_tlb_mask(d->dirty_cpumask);
> +        hap_flush_tlb_mask(d->dirty_cpumask);
>
>      paging_unlock(d);
>

Following up on my earlier mail about paging_log_dirty_range(), I'm now of the opinion that all of these flushes should go away too. I can only assume that they got put where they are when HAP code was cloned from the shadow one. These are only p2m operations, and hence p2m level TLB flushing is all that's needed here.

> --- a/xen/arch/x86/mm/hap/nested_hap.c
> +++ b/xen/arch/x86/mm/hap/nested_hap.c
> @@ -84,7 +84,7 @@ nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
>      safe_write_pte(p, new);
>
>      if (old_flags & _PAGE_PRESENT)
> -        flush_tlb_mask(p2m->dirty_cpumask);
> +        hap_flush_tlb_mask(p2m->dirty_cpumask);

Same here then, presumably.

As suggested in my earlier reply, the plain removals of flush invocations would probably better be split out into a separate patch.

> --- a/xen/arch/x86/mm/hap/private.h
> +++ b/xen/arch/x86/mm/hap/private.h
> @@ -47,4 +47,9 @@ unsigned long hap_p2m_ga_to_gfn_4_levels(struct vcpu *v,
>      struct p2m_domain *p2m, unsigned long cr3,
>      paddr_t ga, uint32_t *pfec, unsigned int *page_order);
>
> +static inline void hap_flush_tlb_mask(const cpumask_t *mask)
> +{
> +    flush_mask(mask, FLUSH_HVM_ASID_CORE);
> +}

With the above, the introduction of this helper would then become unnecessary.

Jan
On Thu, Apr 09, 2020 at 01:54:40PM +0200, Jan Beulich wrote:
> On 06.04.2020 12:57, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/mm/hap/hap.c
> > +++ b/xen/arch/x86/mm/hap/hap.c
> > @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
> >              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
> >                                    p2m_ram_rw, p2m_ram_logdirty);
> >
> > -            flush_tlb_mask(d->dirty_cpumask);
> > +            hap_flush_tlb_mask(d->dirty_cpumask);
> >
> >              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
> >          }
> > @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
> >           * to be read-only, or via hardware-assisted log-dirty.
> >           */
> >          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> > -        flush_tlb_mask(d->dirty_cpumask);
> > +        hap_flush_tlb_mask(d->dirty_cpumask);
> >      }
> >      return 0;
> >  }
> > @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
> >       * be read-only, or via hardware-assisted log-dirty.
> >       */
> >      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> > -    flush_tlb_mask(d->dirty_cpumask);
> > +    hap_flush_tlb_mask(d->dirty_cpumask);
> >  }
> >
> >  /************************************************/
> > @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
> >
> >      safe_write_pte(p, new);
> >      if ( old_flags & _PAGE_PRESENT )
> > -        flush_tlb_mask(d->dirty_cpumask);
> > +        hap_flush_tlb_mask(d->dirty_cpumask);
> >
> >      paging_unlock(d);
> >
>
> Following up on my earlier mail about paging_log_dirty_range(), I'm now of the opinion that all of these flushes should go away too. I can only assume that they got put where they are when HAP code was cloned from the shadow one. These are only p2m operations, and hence p2m level TLB flushing is all that's needed here.
>
> > --- a/xen/arch/x86/mm/hap/nested_hap.c
> > +++ b/xen/arch/x86/mm/hap/nested_hap.c
> > @@ -84,7 +84,7 @@ nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
> >      safe_write_pte(p, new);
> >
> >      if (old_flags & _PAGE_PRESENT)
> > -        flush_tlb_mask(p2m->dirty_cpumask);
> > +        hap_flush_tlb_mask(p2m->dirty_cpumask);
>
> Same here then, presumably.
>
> As suggested in my earlier reply, the plain removals of flush invocations would probably better be split out into a separate patch.
>
> > --- a/xen/arch/x86/mm/hap/private.h
> > +++ b/xen/arch/x86/mm/hap/private.h
> > @@ -47,4 +47,9 @@ unsigned long hap_p2m_ga_to_gfn_4_levels(struct vcpu *v,
> >      struct p2m_domain *p2m, unsigned long cr3,
> >      paddr_t ga, uint32_t *pfec, unsigned int *page_order);
> >
> > +static inline void hap_flush_tlb_mask(const cpumask_t *mask)
> > +{
> > +    flush_mask(mask, FLUSH_HVM_ASID_CORE);
> > +}
>
> With the above, the introduction of this helper would then become unnecessary.

I had planned to make the required adjustment(s) and commit v9 today. But seeing your comment it appears v10 is warranted.

Wei.

>
> Jan
On 10.04.2020 17:48, Wei Liu wrote:
> On Thu, Apr 09, 2020 at 01:54:40PM +0200, Jan Beulich wrote:
>> On 06.04.2020 12:57, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/mm/hap/private.h
>>> +++ b/xen/arch/x86/mm/hap/private.h
>>> @@ -47,4 +47,9 @@ unsigned long hap_p2m_ga_to_gfn_4_levels(struct vcpu *v,
>>>      struct p2m_domain *p2m, unsigned long cr3,
>>>      paddr_t ga, uint32_t *pfec, unsigned int *page_order);
>>>
>>> +static inline void hap_flush_tlb_mask(const cpumask_t *mask)
>>> +{
>>> +    flush_mask(mask, FLUSH_HVM_ASID_CORE);
>>> +}
>>
>> With the above, the introduction of this helper would then become unnecessary.
>
> I had planned to make the required adjustment(s) and commit v9 today.

Actually I came to make these comments in the course of preparing to commit the series, while making the small adjustments.

Jan
On Thu, Apr 09, 2020 at 01:54:40PM +0200, Jan Beulich wrote:
> On 06.04.2020 12:57, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/mm/hap/hap.c
> > +++ b/xen/arch/x86/mm/hap/hap.c
> > @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
> >              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
> >                                    p2m_ram_rw, p2m_ram_logdirty);
> >
> > -            flush_tlb_mask(d->dirty_cpumask);
> > +            hap_flush_tlb_mask(d->dirty_cpumask);
> >
> >              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
> >          }
> > @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
> >           * to be read-only, or via hardware-assisted log-dirty.
> >           */
> >          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> > -        flush_tlb_mask(d->dirty_cpumask);
> > +        hap_flush_tlb_mask(d->dirty_cpumask);
> >      }
> >      return 0;
> >  }
> > @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
> >       * be read-only, or via hardware-assisted log-dirty.
> >       */
> >      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> > -    flush_tlb_mask(d->dirty_cpumask);
> > +    hap_flush_tlb_mask(d->dirty_cpumask);
> >  }
> >
> >  /************************************************/
> > @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
> >
> >      safe_write_pte(p, new);
> >      if ( old_flags & _PAGE_PRESENT )
> > -        flush_tlb_mask(d->dirty_cpumask);
> > +        hap_flush_tlb_mask(d->dirty_cpumask);
> >
> >      paging_unlock(d);
> >
>
> Following up on my earlier mail about paging_log_dirty_range(), I'm now of the opinion that all of these flushes should go away too. I can only assume that they got put where they are when HAP code was cloned from the shadow one. These are only p2m operations, and hence p2m level TLB flushing is all that's needed here.

IIRC without those ASID flushes NPT won't work correctly, as I think it relies on those flushes when updating the p2m.

We could see about moving those flushes inside the NPT functions that modify the p2m. AFAICT p2m_pt_set_entry, which calls p2m->write_p2m_entry, relies on the latter doing the ASID flushes, as it doesn't perform any itself.

Thanks, Roger.
On Thu, Apr 09, 2020 at 01:16:57PM +0200, Jan Beulich wrote:
> On 08.04.2020 17:10, Roger Pau Monné wrote:
> > On Wed, Apr 08, 2020 at 01:25:14PM +0200, Jan Beulich wrote:
> >> On 06.04.2020 12:57, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/mm/paging.c
> >>> +++ b/xen/arch/x86/mm/paging.c
> >>> @@ -613,7 +613,8 @@ void paging_log_dirty_range(struct domain *d,
> >>>
> >>>      p2m_unlock(p2m);
> >>>
> >>> -    flush_tlb_mask(d->dirty_cpumask);
> >>> +    flush_mask(d->dirty_cpumask, (!hap_enabled(d) ? FLUSH_TLB : 0) |
> >>> +               FLUSH_HVM_ASID_CORE);
> >>
> >> In cases where one case is assumed to be more likely than the other, putting the more likely one first can be viewed as a mild hint to the compiler, and hence an extra ! may be warranted in an if() or a conditional expression. Here, however, I don't think we can really consider one case more likely than the other, and hence I'd suggest to avoid the !, flipping the other two expressions accordingly. I may take the liberty to adjust this while committing (if I'm to be the one).
> >
> > That's fine, thanks. Somehow '!hap -> flush' was clearer in my mind.
>
> Thinking about it with the other HVM-related changes in v9, shouldn't this then be
>
>     flush_mask(d->dirty_cpumask, (hap_enabled(d) ? 0 : FLUSH_TLB) |
>                                  (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0));
>
> Or wait - the only caller lives in hap.c. As a result the FLUSH_TLB part can be dropped altogether. And I question the need of flushing guest TLBs - this is purely a p2m operation. I'll go look at the history of this function, but for now I think the call should be dropped (albeit then maybe better in a separate patch).

The ASID flush needs to stay unless it's moved into p2m_pt_set_entry: p2m_pt_set_entry itself doesn't perform any ASID flush, and won't work correctly without one.

I think it's safe to remove the TLB flush, as the code is only called from HAP, and hence is not used by shadow (which is what would require a plain TLB flush). The placement of this function seems misleading to me, as it looks like it's used by both shadow and HAP. It might be better to move it to hap.c if it's only to be used by HAP code.

Thanks, Roger.
On 14.04.2020 09:52, Roger Pau Monné wrote:
> On Thu, Apr 09, 2020 at 01:54:40PM +0200, Jan Beulich wrote:
>> On 06.04.2020 12:57, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/mm/hap/hap.c
>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>> @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
>>>              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
>>>                                    p2m_ram_rw, p2m_ram_logdirty);
>>>
>>> -            flush_tlb_mask(d->dirty_cpumask);
>>> +            hap_flush_tlb_mask(d->dirty_cpumask);
>>>
>>>              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
>>>          }
>>> @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
>>>           * to be read-only, or via hardware-assisted log-dirty.
>>>           */
>>>          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
>>> -        flush_tlb_mask(d->dirty_cpumask);
>>> +        hap_flush_tlb_mask(d->dirty_cpumask);
>>>      }
>>>      return 0;
>>>  }
>>> @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
>>>       * be read-only, or via hardware-assisted log-dirty.
>>>       */
>>>      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
>>> -    flush_tlb_mask(d->dirty_cpumask);
>>> +    hap_flush_tlb_mask(d->dirty_cpumask);
>>>  }
>>>
>>>  /************************************************/
>>> @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
>>>
>>>      safe_write_pte(p, new);
>>>      if ( old_flags & _PAGE_PRESENT )
>>> -        flush_tlb_mask(d->dirty_cpumask);
>>> +        hap_flush_tlb_mask(d->dirty_cpumask);
>>>
>>>      paging_unlock(d);
>>>
>>
>> Following up on my earlier mail about paging_log_dirty_range(), I'm now of the opinion that all of these flushes should go away too. I can only assume that they got put where they are when HAP code was cloned from the shadow one. These are only p2m operations, and hence p2m level TLB flushing is all that's needed here.
>
> IIRC without those ASID flushes NPT won't work correctly, as I think it relies on those flushes when updating the p2m.

Hmm, yes - at least for this last one (in patch context) I definitely agree: NPT's TLB invalidation is quite different from EPT's (and I was too focused on the latter when writing the earlier reply). Therefore how about keeping hap_flush_tlb_mask() (and its uses), but teaching it to do nothing for EPT, while doing an ASID based flush for NPT?

There may then even be the option to have a wider purpose helper, dealing with most (all?) of the flushes needed from underneath x86/mm/, setting the TLB and HVM_ASID_CORE flags as appropriate. In the EPT case the function would still be a no-op (afaict).

> We could see about moving those flushes inside the NPT functions that modify the p2m. AFAICT p2m_pt_set_entry, which calls p2m->write_p2m_entry, relies on the latter doing the ASID flushes, as it doesn't perform any itself.

I don't think we want to go this far at this point.

Jan
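One possible shape for that suggestion (a sketch only, assuming cpu_has_svm is an acceptable stand-in for "NPT in use" at these call sites; the thread later converges on a more general helper instead):

    static inline void hap_flush_tlb_mask(const cpumask_t *mask)
    {
        /* EPT needs no guest-TLB tickle on p2m changes; NPT (SVM) does. */
        if ( cpu_has_svm )
            flush_mask(mask, FLUSH_HVM_ASID_CORE);
    }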
On 14.04.2020 10:01, Roger Pau Monné wrote:
> On Thu, Apr 09, 2020 at 01:16:57PM +0200, Jan Beulich wrote:
>> On 08.04.2020 17:10, Roger Pau Monné wrote:
>>> On Wed, Apr 08, 2020 at 01:25:14PM +0200, Jan Beulich wrote:
>>>> On 06.04.2020 12:57, Roger Pau Monne wrote:
>>>>> --- a/xen/arch/x86/mm/paging.c
>>>>> +++ b/xen/arch/x86/mm/paging.c
>>>>> @@ -613,7 +613,8 @@ void paging_log_dirty_range(struct domain *d,
>>>>>
>>>>>      p2m_unlock(p2m);
>>>>>
>>>>> -    flush_tlb_mask(d->dirty_cpumask);
>>>>> +    flush_mask(d->dirty_cpumask, (!hap_enabled(d) ? FLUSH_TLB : 0) |
>>>>> +               FLUSH_HVM_ASID_CORE);
>>>>
>>>> In cases where one case is assumed to be more likely than the other, putting the more likely one first can be viewed as a mild hint to the compiler, and hence an extra ! may be warranted in an if() or a conditional expression. Here, however, I don't think we can really consider one case more likely than the other, and hence I'd suggest to avoid the !, flipping the other two expressions accordingly. I may take the liberty to adjust this while committing (if I'm to be the one).
>>>
>>> That's fine, thanks. Somehow '!hap -> flush' was clearer in my mind.
>>
>> Thinking about it with the other HVM-related changes in v9, shouldn't this then be
>>
>>     flush_mask(d->dirty_cpumask, (hap_enabled(d) ? 0 : FLUSH_TLB) |
>>                                  (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0));
>>
>> Or wait - the only caller lives in hap.c. As a result the FLUSH_TLB part can be dropped altogether. And I question the need of flushing guest TLBs - this is purely a p2m operation. I'll go look at the history of this function, but for now I think the call should be dropped (albeit then maybe better in a separate patch).
>
> The ASID flush needs to stay unless it's moved into p2m_pt_set_entry: p2m_pt_set_entry itself doesn't perform any ASID flush, and won't work correctly without one.

Just as said in the other reply sent a few minutes ago - yes for NPT, but no for EPT.

> I think it's safe to remove the TLB flush, as the code is only called from HAP, and hence is not used by shadow (which is what would require a plain TLB flush). The placement of this function seems misleading to me, as it looks like it's used by both shadow and HAP. It might be better to move it to hap.c if it's only to be used by HAP code.

Either placement has its problems, I think. The function is meant to be a paging layer one, but is needed by HAP only right now. I'm pondering whether to wrap it in #ifdef CONFIG_HVM (plus perhaps a respective ASSERT_UNREACHABLE()).

In the end, just like in the other cases, this may be a valid further user of the more generic helper that I did suggest (resulting in no flushing on EPT and an ASID-based one on NPT).

Jan
On Tue, Apr 14, 2020 at 11:09:43AM +0200, Jan Beulich wrote:
> On 14.04.2020 10:01, Roger Pau Monné wrote:
> > On Thu, Apr 09, 2020 at 01:16:57PM +0200, Jan Beulich wrote:
> >> On 08.04.2020 17:10, Roger Pau Monné wrote:
> >>> On Wed, Apr 08, 2020 at 01:25:14PM +0200, Jan Beulich wrote:
> >>>> On 06.04.2020 12:57, Roger Pau Monne wrote:
> >>>>> --- a/xen/arch/x86/mm/paging.c
> >>>>> +++ b/xen/arch/x86/mm/paging.c
> >>>>> @@ -613,7 +613,8 @@ void paging_log_dirty_range(struct domain *d,
> >>>>>
> >>>>>      p2m_unlock(p2m);
> >>>>>
> >>>>> -    flush_tlb_mask(d->dirty_cpumask);
> >>>>> +    flush_mask(d->dirty_cpumask, (!hap_enabled(d) ? FLUSH_TLB : 0) |
> >>>>> +               FLUSH_HVM_ASID_CORE);
> >>>>
> >>>> In cases where one case is assumed to be more likely than the other, putting the more likely one first can be viewed as a mild hint to the compiler, and hence an extra ! may be warranted in an if() or a conditional expression. Here, however, I don't think we can really consider one case more likely than the other, and hence I'd suggest to avoid the !, flipping the other two expressions accordingly. I may take the liberty to adjust this while committing (if I'm to be the one).
> >>>
> >>> That's fine, thanks. Somehow '!hap -> flush' was clearer in my mind.
> >>
> >> Thinking about it with the other HVM-related changes in v9, shouldn't this then be
> >>
> >>     flush_mask(d->dirty_cpumask, (hap_enabled(d) ? 0 : FLUSH_TLB) |
> >>                                  (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0));
> >>
> >> Or wait - the only caller lives in hap.c. As a result the FLUSH_TLB part can be dropped altogether. And I question the need of flushing guest TLBs - this is purely a p2m operation. I'll go look at the history of this function, but for now I think the call should be dropped (albeit then maybe better in a separate patch).
> >
> > The ASID flush needs to stay unless it's moved into p2m_pt_set_entry: p2m_pt_set_entry itself doesn't perform any ASID flush, and won't work correctly without one.
>
> Just as said in the other reply sent a few minutes ago - yes for NPT, but no for EPT.

It's not strictly wrong for EPT as it won't cause EPT domains to malfunction, it's just redundant.

> > I think it's safe to remove the TLB flush, as the code is only called from HAP, and hence is not used by shadow (which is what would require a plain TLB flush). The placement of this function seems misleading to me, as it looks like it's used by both shadow and HAP. It might be better to move it to hap.c if it's only to be used by HAP code.
>
> Either placement has its problems, I think. The function is meant to be a paging layer one, but is needed by HAP only right now. I'm pondering whether to wrap it in #ifdef CONFIG_HVM (plus perhaps a respective ASSERT_UNREACHABLE()).

IMO if a TLB flush is not performed here we should add an ASSERT_UNREACHABLE if called from a shadow mode domain, or else we risk someone trying to use it in shadow later without realizing it's missing a TLB flush.

Thanks, Roger.
On 14.04.2020 11:34, Roger Pau Monné wrote:
> On Tue, Apr 14, 2020 at 11:09:43AM +0200, Jan Beulich wrote:
>> On 14.04.2020 10:01, Roger Pau Monné wrote:
>>> On Thu, Apr 09, 2020 at 01:16:57PM +0200, Jan Beulich wrote:
>>>> On 08.04.2020 17:10, Roger Pau Monné wrote:
>>>>> On Wed, Apr 08, 2020 at 01:25:14PM +0200, Jan Beulich wrote:
>>>>>> On 06.04.2020 12:57, Roger Pau Monne wrote:
>>>>>>> --- a/xen/arch/x86/mm/paging.c
>>>>>>> +++ b/xen/arch/x86/mm/paging.c
>>>>>>> @@ -613,7 +613,8 @@ void paging_log_dirty_range(struct domain *d,
>>>>>>>
>>>>>>>      p2m_unlock(p2m);
>>>>>>>
>>>>>>> -    flush_tlb_mask(d->dirty_cpumask);
>>>>>>> +    flush_mask(d->dirty_cpumask, (!hap_enabled(d) ? FLUSH_TLB : 0) |
>>>>>>> +               FLUSH_HVM_ASID_CORE);
>>>>>>
>>>>>> In cases where one case is assumed to be more likely than the other, putting the more likely one first can be viewed as a mild hint to the compiler, and hence an extra ! may be warranted in an if() or a conditional expression. Here, however, I don't think we can really consider one case more likely than the other, and hence I'd suggest to avoid the !, flipping the other two expressions accordingly. I may take the liberty to adjust this while committing (if I'm to be the one).
>>>>>
>>>>> That's fine, thanks. Somehow '!hap -> flush' was clearer in my mind.
>>>>
>>>> Thinking about it with the other HVM-related changes in v9, shouldn't this then be
>>>>
>>>>     flush_mask(d->dirty_cpumask, (hap_enabled(d) ? 0 : FLUSH_TLB) |
>>>>                                  (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0));
>>>>
>>>> Or wait - the only caller lives in hap.c. As a result the FLUSH_TLB part can be dropped altogether. And I question the need of flushing guest TLBs - this is purely a p2m operation. I'll go look at the history of this function, but for now I think the call should be dropped (albeit then maybe better in a separate patch).
>>>
>>> The ASID flush needs to stay unless it's moved into p2m_pt_set_entry: p2m_pt_set_entry itself doesn't perform any ASID flush, and won't work correctly without one.
>>
>> Just as said in the other reply sent a few minutes ago - yes for NPT, but no for EPT.
>
> It's not strictly wrong for EPT as it won't cause EPT domains to malfunction, it's just redundant.

Right - it's wrong just in the sense of being pointless extra overhead.

>>> I think it's safe to remove the TLB flush, as the code is only called from HAP, and hence is not used by shadow (which is what would require a plain TLB flush). The placement of this function seems misleading to me, as it looks like it's used by both shadow and HAP. It might be better to move it to hap.c if it's only to be used by HAP code.
>>
>> Either placement has its problems, I think. The function is meant to be a paging layer one, but is needed by HAP only right now. I'm pondering whether to wrap it in #ifdef CONFIG_HVM (plus perhaps a respective ASSERT_UNREACHABLE()).
>
> IMO if a TLB flush is not performed here we should add an ASSERT_UNREACHABLE if called from a shadow mode domain, or else we risk someone trying to use it in shadow later without realizing it's missing a TLB flush.

This would be fine with me.

Jan
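For illustration, the guard being agreed on here might look like this at the paging_log_dirty_range() call site (a sketch only; the exact placement and form were still open at this point in the discussion):

    /* Only HAP callers exist today; a shadow caller would also need FLUSH_TLB. */
    if ( !hap_enabled(d) )
        ASSERT_UNREACHABLE();
    flush_mask(d->dirty_cpumask, FLUSH_HVM_ASID_CORE);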
On Tue, Apr 14, 2020 at 11:01:06AM +0200, Jan Beulich wrote:
> On 14.04.2020 09:52, Roger Pau Monné wrote:
> > On Thu, Apr 09, 2020 at 01:54:40PM +0200, Jan Beulich wrote:
> >> On 06.04.2020 12:57, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/mm/hap/hap.c
> >>> +++ b/xen/arch/x86/mm/hap/hap.c
> >>> @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
> >>>              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
> >>>                                    p2m_ram_rw, p2m_ram_logdirty);
> >>>
> >>> -            flush_tlb_mask(d->dirty_cpumask);
> >>> +            hap_flush_tlb_mask(d->dirty_cpumask);
> >>>
> >>>              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
> >>>          }
> >>> @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
> >>>           * to be read-only, or via hardware-assisted log-dirty.
> >>>           */
> >>>          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> >>> -        flush_tlb_mask(d->dirty_cpumask);
> >>> +        hap_flush_tlb_mask(d->dirty_cpumask);
> >>>      }
> >>>      return 0;
> >>>  }
> >>> @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
> >>>       * be read-only, or via hardware-assisted log-dirty.
> >>>       */
> >>>      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> >>> -    flush_tlb_mask(d->dirty_cpumask);
> >>> +    hap_flush_tlb_mask(d->dirty_cpumask);
> >>>  }
> >>>
> >>>  /************************************************/
> >>> @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
> >>>
> >>>      safe_write_pte(p, new);
> >>>      if ( old_flags & _PAGE_PRESENT )
> >>> -        flush_tlb_mask(d->dirty_cpumask);
> >>> +        hap_flush_tlb_mask(d->dirty_cpumask);
> >>>
> >>>      paging_unlock(d);
> >>>
> >>
> >> Following up on my earlier mail about paging_log_dirty_range(), I'm now of the opinion that all of these flushes should go away too. I can only assume that they got put where they are when HAP code was cloned from the shadow one. These are only p2m operations, and hence p2m level TLB flushing is all that's needed here.
> >
> > IIRC without those ASID flushes NPT won't work correctly, as I think it relies on those flushes when updating the p2m.
>
> Hmm, yes - at least for this last one (in patch context) I definitely agree: NPT's TLB invalidation is quite different from EPT's (and I was too focused on the latter when writing the earlier reply). Therefore how about keeping hap_flush_tlb_mask() (and its uses), but teaching it to do nothing for EPT, while doing an ASID based flush for NPT?

I could give that a try. I'm slightly worried that EPT code might rely on some of those ASID/VPID flushes. It seems like it's trying to do VPID flushes when needed, but issues could be masked by the ASID/VPID flushes done by the callers.

> There may then even be the option to have a wider purpose helper, dealing with most (all?) of the flushes needed from underneath x86/mm/, setting the TLB and HVM_ASID_CORE flags as appropriate. In the EPT case the function would still be a no-op (afaict).

That seems nice, we would have to be careful however as reducing the number of ASID/VPID flushes could uncover issues in the existing code. I guess you mean something like:

static inline void guest_flush_tlb_mask(const struct domain *d,
                                        const cpumask_t *mask)
{
    flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
                                                                : 0) |
                     (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
                                                      : 0));
}

I think this should work, but I would rather do it in a separate patch.

I'm also not fully convinced guest_flush_tlb_mask is the best name, but I couldn't think of anything else more descriptive of the purpose of the function.

Thanks, Roger.
On 14.04.2020 12:02, Roger Pau Monné wrote:
> On Tue, Apr 14, 2020 at 11:01:06AM +0200, Jan Beulich wrote:
>> On 14.04.2020 09:52, Roger Pau Monné wrote:
>>> On Thu, Apr 09, 2020 at 01:54:40PM +0200, Jan Beulich wrote:
>>>> On 06.04.2020 12:57, Roger Pau Monne wrote:
>>>>> --- a/xen/arch/x86/mm/hap/hap.c
>>>>> +++ b/xen/arch/x86/mm/hap/hap.c
>>>>> @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
>>>>>              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
>>>>>                                    p2m_ram_rw, p2m_ram_logdirty);
>>>>>
>>>>> -            flush_tlb_mask(d->dirty_cpumask);
>>>>> +            hap_flush_tlb_mask(d->dirty_cpumask);
>>>>>
>>>>>              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
>>>>>          }
>>>>> @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
>>>>>           * to be read-only, or via hardware-assisted log-dirty.
>>>>>           */
>>>>>          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
>>>>> -        flush_tlb_mask(d->dirty_cpumask);
>>>>> +        hap_flush_tlb_mask(d->dirty_cpumask);
>>>>>      }
>>>>>      return 0;
>>>>>  }
>>>>> @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
>>>>>       * be read-only, or via hardware-assisted log-dirty.
>>>>>       */
>>>>>      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
>>>>> -    flush_tlb_mask(d->dirty_cpumask);
>>>>> +    hap_flush_tlb_mask(d->dirty_cpumask);
>>>>>  }
>>>>>
>>>>>  /************************************************/
>>>>> @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
>>>>>
>>>>>      safe_write_pte(p, new);
>>>>>      if ( old_flags & _PAGE_PRESENT )
>>>>> -        flush_tlb_mask(d->dirty_cpumask);
>>>>> +        hap_flush_tlb_mask(d->dirty_cpumask);
>>>>>
>>>>>      paging_unlock(d);
>>>>>
>>>>
>>>> Following up on my earlier mail about paging_log_dirty_range(), I'm now of the opinion that all of these flushes should go away too. I can only assume that they got put where they are when HAP code was cloned from the shadow one. These are only p2m operations, and hence p2m level TLB flushing is all that's needed here.
>>>
>>> IIRC without those ASID flushes NPT won't work correctly, as I think it relies on those flushes when updating the p2m.
>>
>> Hmm, yes - at least for this last one (in patch context) I definitely agree: NPT's TLB invalidation is quite different from EPT's (and I was too focused on the latter when writing the earlier reply). Therefore how about keeping hap_flush_tlb_mask() (and its uses), but teaching it to do nothing for EPT, while doing an ASID based flush for NPT?
>
> I could give that a try. I'm slightly worried that EPT code might rely on some of those ASID/VPID flushes. It seems like it's trying to do VPID flushes when needed, but issues could be masked by the ASID/VPID flushes done by the callers.

I can't seem to find any EPT code doing VPID flushes, and I'd also not expect to. There's VMX code doing so, yes. EPT should be fully agnostic to guest virtual address space.

>> There may then even be the option to have a wider purpose helper, dealing with most (all?) of the flushes needed from underneath x86/mm/, setting the TLB and HVM_ASID_CORE flags as appropriate. In the EPT case the function would still be a no-op (afaict).
>
> That seems nice, we would have to be careful however as reducing the number of ASID/VPID flushes could uncover issues in the existing code. I guess you mean something like:
>
> static inline void guest_flush_tlb_mask(const struct domain *d,
>                                         const cpumask_t *mask)
> {
>     flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
>                                                                 : 0) |
>                      (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
>                                                       : 0));
> }

Almost - is_hvm_domain(d) && cpu_has_svm seems too wide for me. I'd rather use hap_enabled() && cpu_has_svm, which effectively means NPT. Or am I overlooking a need to do ASID flushes also in shadow mode?

Also I'd suggest to calculate the flags up front, to avoid calling flush_mask() in the first place in case (EPT) neither bit is set.

> I think this should work, but I would rather do it in a separate patch.

Yes, just like the originally (wrongly, as you validly say) suggested full removal of them, putting this in a separate patch would indeed seem better.

> I'm also not fully convinced guest_flush_tlb_mask is the best name, but I couldn't think of anything else more descriptive of the purpose of the function.

That's the name I was thinking of, too, despite also not being entirely happy with it.

Jan
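Putting both of Jan's adjustments together, the helper might read as follows (a sketch; whether the second condition should use hap_enabled() or is_hvm_domain() is exactly what the rest of the thread goes on to debate):

    static inline void guest_flush_tlb_mask(const struct domain *d,
                                            const cpumask_t *mask)
    {
        unsigned int flags =
            (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB : 0) |
            (hap_enabled(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE : 0);

        /* Skip the call entirely when nothing needs flushing (the EPT case). */
        if ( flags )
            flush_mask(mask, flags);
    }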
On Tue, Apr 14, 2020 at 12:13:04PM +0200, Jan Beulich wrote:
> On 14.04.2020 12:02, Roger Pau Monné wrote:
> > On Tue, Apr 14, 2020 at 11:01:06AM +0200, Jan Beulich wrote:
> >> On 14.04.2020 09:52, Roger Pau Monné wrote:
> >>> On Thu, Apr 09, 2020 at 01:54:40PM +0200, Jan Beulich wrote:
> >>>> On 06.04.2020 12:57, Roger Pau Monne wrote:
> >>>>> --- a/xen/arch/x86/mm/hap/hap.c
> >>>>> +++ b/xen/arch/x86/mm/hap/hap.c
> >>>>> @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
> >>>>>              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
> >>>>>                                    p2m_ram_rw, p2m_ram_logdirty);
> >>>>>
> >>>>> -            flush_tlb_mask(d->dirty_cpumask);
> >>>>> +            hap_flush_tlb_mask(d->dirty_cpumask);
> >>>>>
> >>>>>              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
> >>>>>          }
> >>>>> @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
> >>>>>           * to be read-only, or via hardware-assisted log-dirty.
> >>>>>           */
> >>>>>          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> >>>>> -        flush_tlb_mask(d->dirty_cpumask);
> >>>>> +        hap_flush_tlb_mask(d->dirty_cpumask);
> >>>>>      }
> >>>>>      return 0;
> >>>>>  }
> >>>>> @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
> >>>>>       * be read-only, or via hardware-assisted log-dirty.
> >>>>>       */
> >>>>>      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> >>>>> -    flush_tlb_mask(d->dirty_cpumask);
> >>>>> +    hap_flush_tlb_mask(d->dirty_cpumask);
> >>>>>  }
> >>>>>
> >>>>>  /************************************************/
> >>>>> @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
> >>>>>
> >>>>>      safe_write_pte(p, new);
> >>>>>      if ( old_flags & _PAGE_PRESENT )
> >>>>> -        flush_tlb_mask(d->dirty_cpumask);
> >>>>> +        hap_flush_tlb_mask(d->dirty_cpumask);
> >>>>>
> >>>>>      paging_unlock(d);
> >>>>>
> >>>>
> >>>> Following up on my earlier mail about paging_log_dirty_range(), I'm now of the opinion that all of these flushes should go away too. I can only assume that they got put where they are when HAP code was cloned from the shadow one. These are only p2m operations, and hence p2m level TLB flushing is all that's needed here.
> >>>
> >>> IIRC without those ASID flushes NPT won't work correctly, as I think it relies on those flushes when updating the p2m.
> >>
> >> Hmm, yes - at least for this last one (in patch context) I definitely agree: NPT's TLB invalidation is quite different from EPT's (and I was too focused on the latter when writing the earlier reply). Therefore how about keeping hap_flush_tlb_mask() (and its uses), but teaching it to do nothing for EPT, while doing an ASID based flush for NPT?
> >
> > I could give that a try. I'm slightly worried that EPT code might rely on some of those ASID/VPID flushes. It seems like it's trying to do VPID flushes when needed, but issues could be masked by the ASID/VPID flushes done by the callers.
>
> I can't seem to find any EPT code doing VPID flushes, and I'd also not expect to. There's VMX code doing so, yes. EPT should be fully agnostic to guest virtual address space.

I got confused.

> >> There may then even be the option to have a wider purpose helper, dealing with most (all?) of the flushes needed from underneath x86/mm/, setting the TLB and HVM_ASID_CORE flags as appropriate. In the EPT case the function would still be a no-op (afaict).
> >
> > That seems nice, we would have to be careful however as reducing the number of ASID/VPID flushes could uncover issues in the existing code. I guess you mean something like:
> >
> > static inline void guest_flush_tlb_mask(const struct domain *d,
> >                                         const cpumask_t *mask)
> > {
> >     flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
> >                                                                 : 0) |
> >                      (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
> >                                                       : 0));
> > }
>
> Almost - is_hvm_domain(d) && cpu_has_svm seems too wide for me. I'd rather use hap_enabled() && cpu_has_svm, which effectively means NPT. Or am I overlooking a need to do ASID flushes also in shadow mode?

I think so, I've used is_hvm_domain in order to cover for HVM domains running in shadow mode on AMD hardware, I think those also need the ASID flushes.

> Also I'd suggest to calculate the flags up front, to avoid calling flush_mask() in the first place in case (EPT) neither bit is set.
>
> > I think this should work, but I would rather do it in a separate patch.
>
> Yes, just like the originally (wrongly, as you validly say) suggested full removal of them, putting this in a separate patch would indeed seem better.

Would you like me to resend with the requested fix to paging_log_dirty_range (ie: drop the FLUSH_TLB and only call flush_mask for HAP guests running on AMD) then?

Thanks, Roger.
On 14.04.2020 13:19, Roger Pau Monné wrote:
> On Tue, Apr 14, 2020 at 12:13:04PM +0200, Jan Beulich wrote:
>> On 14.04.2020 12:02, Roger Pau Monné wrote:
>>> That seems nice, we would have to be careful however as reducing the number of ASID/VPID flushes could uncover issues in the existing code. I guess you mean something like:
>>>
>>> static inline void guest_flush_tlb_mask(const struct domain *d,
>>>                                         const cpumask_t *mask)
>>> {
>>>     flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
>>>                                                                 : 0) |
>>>                      (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
>>>                                                       : 0));
>>> }
>>
>> Almost - is_hvm_domain(d) && cpu_has_svm seems too wide for me. I'd rather use hap_enabled() && cpu_has_svm, which effectively means NPT. Or am I overlooking a need to do ASID flushes also in shadow mode?
>
> I think so, I've used is_hvm_domain in order to cover for HVM domains running in shadow mode on AMD hardware, I think those also need the ASID flushes.

I'm unconvinced: The entire section "TLB Management" in the PM gives the impression that "ordinary" TLB flushing covers all ASIDs anyway. It's not stated anywhere (I could find) explicitly though.

> Would you like me to resend with the requested fix to paging_log_dirty_range (ie: drop the FLUSH_TLB and only call flush_mask for HAP guests running on AMD) then?

Well, ideally I'd see that function also make use of the intended new helper function, if at all possible (and suitable).

Jan
On Tue, Apr 14, 2020 at 03:50:15PM +0200, Jan Beulich wrote:
> On 14.04.2020 13:19, Roger Pau Monné wrote:
> > On Tue, Apr 14, 2020 at 12:13:04PM +0200, Jan Beulich wrote:
> >> On 14.04.2020 12:02, Roger Pau Monné wrote:
> >>> That seems nice, we would have to be careful however as reducing the number of ASID/VPID flushes could uncover issues in the existing code. I guess you mean something like:
> >>>
> >>> static inline void guest_flush_tlb_mask(const struct domain *d,
> >>>                                         const cpumask_t *mask)
> >>> {
> >>>     flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
> >>>                                                                 : 0) |
> >>>                      (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
> >>>                                                       : 0));
> >>> }
> >>
> >> Almost - is_hvm_domain(d) && cpu_has_svm seems too wide for me. I'd rather use hap_enabled() && cpu_has_svm, which effectively means NPT. Or am I overlooking a need to do ASID flushes also in shadow mode?
> >
> > I think so, I've used is_hvm_domain in order to cover for HVM domains running in shadow mode on AMD hardware, I think those also need the ASID flushes.
>
> I'm unconvinced: The entire section "TLB Management" in the PM gives the impression that "ordinary" TLB flushing covers all ASIDs anyway. It's not stated anywhere (I could find) explicitly though.

Hm, I don't think so. XenRT found a myriad of issues with NPT when p2m code wasn't modified to do ASID flushes instead of plain TLB flushes.

Even if it's just to stay on the safe side I would perform ASID flushes for HVM guests with shadow running on AMD.

> >> Also I'd suggest to calculate the flags up front, to avoid calling flush_mask() in the first place in case (EPT) neither bit is set.
> >>
> >>> I think this should work, but I would rather do it in a separate patch.
> >>
> >> Yes, just like the originally (wrongly, as you validly say) suggested full removal of them, putting this in a separate patch would indeed seem better.
> >
> > Would you like me to resend with the requested fix to paging_log_dirty_range (ie: drop the FLUSH_TLB and only call flush_mask for HAP guests running on AMD) then?
>
> Well, ideally I'd see that function also make use of the intended new helper function, if at all possible (and suitable).

Oh, OK. Just to make sure I understand what you are asking for, you would like me to resend introducing the new guest_flush_tlb_mask helper and use it in the flush_tlb_mask callers modified by this patch?

Thanks, Roger.
On 14.04.2020 16:53, Roger Pau Monné wrote:
> On Tue, Apr 14, 2020 at 03:50:15PM +0200, Jan Beulich wrote:
>> On 14.04.2020 13:19, Roger Pau Monné wrote:
>>> On Tue, Apr 14, 2020 at 12:13:04PM +0200, Jan Beulich wrote:
>>>> On 14.04.2020 12:02, Roger Pau Monné wrote:
>>>>> That seems nice, we would have to be careful however as reducing the number of ASID/VPID flushes could uncover issues in the existing code. I guess you mean something like:
>>>>>
>>>>> static inline void guest_flush_tlb_mask(const struct domain *d,
>>>>>                                         const cpumask_t *mask)
>>>>> {
>>>>>     flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
>>>>>                                                                 : 0) |
>>>>>                      (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
>>>>>                                                       : 0));
>>>>> }
>>>>
>>>> Almost - is_hvm_domain(d) && cpu_has_svm seems too wide for me. I'd rather use hap_enabled() && cpu_has_svm, which effectively means NPT. Or am I overlooking a need to do ASID flushes also in shadow mode?
>>>
>>> I think so, I've used is_hvm_domain in order to cover for HVM domains running in shadow mode on AMD hardware, I think those also need the ASID flushes.
>>
>> I'm unconvinced: The entire section "TLB Management" in the PM gives the impression that "ordinary" TLB flushing covers all ASIDs anyway. It's not stated anywhere (I could find) explicitly though.
>
> Hm, I don't think so. XenRT found a myriad of issues with NPT when p2m code wasn't modified to do ASID flushes instead of plain TLB flushes.

Well, that's clear from e.g. p2m_pt_set_entry() not doing any flushing itself.

> Even if it's just to stay on the safe side I would perform ASID flushes for HVM guests with shadow running on AMD.

Tim, any chance you could voice your thoughts here? To me it seems odd to do an all-ASIDs flush followed by an ASID one.

>>>> Also I'd suggest to calculate the flags up front, to avoid calling flush_mask() in the first place in case (EPT) neither bit is set.
>>>>
>>>>> I think this should work, but I would rather do it in a separate patch.
>>>>
>>>> Yes, just like the originally (wrongly, as you validly say) suggested full removal of them, putting this in a separate patch would indeed seem better.
>>>
>>> Would you like me to resend with the requested fix to paging_log_dirty_range (ie: drop the FLUSH_TLB and only call flush_mask for HAP guests running on AMD) then?
>>
>> Well, ideally I'd see that function also make use of the intended new helper function, if at all possible (and suitable).
>
> Oh, OK. Just to make sure I understand what you are asking for, you would like me to resend introducing the new guest_flush_tlb_mask helper and use it in the flush_tlb_mask callers modified by this patch?

Yes (and I now realize it may not make sense to split it off into a separate change).

Jan
On Tue, Apr 14, 2020 at 05:06:23PM +0200, Jan Beulich wrote:
> On 14.04.2020 16:53, Roger Pau Monné wrote:
> > On Tue, Apr 14, 2020 at 03:50:15PM +0200, Jan Beulich wrote:
> >> On 14.04.2020 13:19, Roger Pau Monné wrote:
> >>>>> I think this should work, but I would rather do it in a separate patch.
> >>>>
> >>>> Yes, just like the originally (wrongly, as you validly say) suggested full removal of them, putting this in a separate patch would indeed seem better.
> >>>
> >>> Would you like me to resend with the requested fix to paging_log_dirty_range (ie: drop the FLUSH_TLB and only call flush_mask for HAP guests running on AMD) then?
> >>
> >> Well, ideally I'd see that function also make use of the intended new helper function, if at all possible (and suitable).
> >
> > Oh, OK. Just to make sure I understand what you are asking for, you would like me to resend introducing the new guest_flush_tlb_mask helper and use it in the flush_tlb_mask callers modified by this patch?
>
> Yes (and I now realize it may not make sense to split it off into a separate change).

I could do a pre-patch that introduces guest_flush_tlb_mask as a simple wrapper around flush_tlb_mask and replace the callers that I modify in this patch. Then this patch would only introduce FLUSH_HVM_ASID_CORE and modify guest_flush_tlb_mask to use it when required.

It might make it more complicated to see which callers require the ASID flush, but if you think it's better I can arrange the patches in that way.

Thanks, Roger.
On 15.04.2020 13:49, Roger Pau Monné wrote:
> On Tue, Apr 14, 2020 at 05:06:23PM +0200, Jan Beulich wrote:
>> On 14.04.2020 16:53, Roger Pau Monné wrote:
>>> On Tue, Apr 14, 2020 at 03:50:15PM +0200, Jan Beulich wrote:
>>>> On 14.04.2020 13:19, Roger Pau Monné wrote:
>>>>>>> I think this should work, but I would rather do it in a separate patch.
>>>>>>
>>>>>> Yes, just like the originally (wrongly, as you validly say) suggested full removal of them, putting this in a separate patch would indeed seem better.
>>>>>
>>>>> Would you like me to resend with the requested fix to paging_log_dirty_range (ie: drop the FLUSH_TLB and only call flush_mask for HAP guests running on AMD) then?
>>>>
>>>> Well, ideally I'd see that function also make use of the intended new helper function, if at all possible (and suitable).
>>>
>>> Oh, OK. Just to make sure I understand what you are asking for, you would like me to resend introducing the new guest_flush_tlb_mask helper and use it in the flush_tlb_mask callers modified by this patch?
>>
>> Yes (and I now realize it may not make sense to split it off into a separate change).
>
> I could do a pre-patch that introduces guest_flush_tlb_mask as a simple wrapper around flush_tlb_mask and replace the callers that I modify in this patch. Then this patch would only introduce FLUSH_HVM_ASID_CORE and modify guest_flush_tlb_mask to use it when required.
>
> It might make it more complicated to see which callers require the ASID flush, but if you think it's better I can arrange the patches in that way.

No, I think it's better to leave it as a single patch. The idea with splitting was that you'd (fully) take care of some of the sites needing modification ahead of what is now patch 1. I.e. this would have been a suitable approach only if some use sites could really have the call dropped altogether.

Jan
On Tue, Apr 14, 2020 at 05:06:23PM +0200, Jan Beulich wrote:
> On 14.04.2020 16:53, Roger Pau Monné wrote:
> > On Tue, Apr 14, 2020 at 03:50:15PM +0200, Jan Beulich wrote:
> >> On 14.04.2020 13:19, Roger Pau Monné wrote:
> >>> On Tue, Apr 14, 2020 at 12:13:04PM +0200, Jan Beulich wrote:
> >>>> On 14.04.2020 12:02, Roger Pau Monné wrote:
> >>>>> That seems nice, we would have to be careful however as reducing the number of ASID/VPID flushes could uncover issues in the existing code. I guess you mean something like:
> >>>>>
> >>>>> static inline void guest_flush_tlb_mask(const struct domain *d,
> >>>>>                                         const cpumask_t *mask)
> >>>>> {
> >>>>>     flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
> >>>>>                                                                 : 0) |
> >>>>>                      (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
> >>>>>                                                       : 0));
> >>>>> }
> >>>>
> >>>> Almost - is_hvm_domain(d) && cpu_has_svm seems too wide for me. I'd rather use hap_enabled() && cpu_has_svm, which effectively means NPT. Or am I overlooking a need to do ASID flushes also in shadow mode?
> >>>
> >>> I think so, I've used is_hvm_domain in order to cover for HVM domains running in shadow mode on AMD hardware, I think those also need the ASID flushes.
> >>
> >> I'm unconvinced: The entire section "TLB Management" in the PM gives the impression that "ordinary" TLB flushing covers all ASIDs anyway. It's not stated anywhere (I could find) explicitly though.
> >
> > Hm, I don't think so. XenRT found a myriad of issues with NPT when p2m code wasn't modified to do ASID flushes instead of plain TLB flushes.
>
> Well, that's clear from e.g. p2m_pt_set_entry() not doing any flushing itself.
>
> > Even if it's just to stay on the safe side I would perform ASID flushes for HVM guests with shadow running on AMD.
>
> Tim, any chance you could voice your thoughts here? To me it seems odd to do an all-ASIDs flush followed by an ASID one.

I've been reading a bit more into this, and section 15.16.1 states:

"TLB flush operations must not be assumed to affect all ASIDs."

Since the VMM runs on its own ASID, it's my understanding that doing a TLB flush from the VMM does not flush any of the guests' TLBs, and hence an ASID flush is still required.

Thanks, Roger.
On 15.04.2020 16:49, Roger Pau Monné wrote:
> On Tue, Apr 14, 2020 at 05:06:23PM +0200, Jan Beulich wrote:
>> On 14.04.2020 16:53, Roger Pau Monné wrote:
>>> On Tue, Apr 14, 2020 at 03:50:15PM +0200, Jan Beulich wrote:
>>>> On 14.04.2020 13:19, Roger Pau Monné wrote:
>>>>> On Tue, Apr 14, 2020 at 12:13:04PM +0200, Jan Beulich wrote:
>>>>>> On 14.04.2020 12:02, Roger Pau Monné wrote:
>>>>>>> That seems nice, we would have to be careful however as reducing the number of ASID/VPID flushes could uncover issues in the existing code. I guess you mean something like:
>>>>>>>
>>>>>>> static inline void guest_flush_tlb_mask(const struct domain *d,
>>>>>>>                                         const cpumask_t *mask)
>>>>>>> {
>>>>>>>     flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
>>>>>>>                                                                 : 0) |
>>>>>>>                      (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
>>>>>>>                                                       : 0));
>>>>>>> }
>>>>>>
>>>>>> Almost - is_hvm_domain(d) && cpu_has_svm seems too wide for me. I'd rather use hap_enabled() && cpu_has_svm, which effectively means NPT. Or am I overlooking a need to do ASID flushes also in shadow mode?
>>>>>
>>>>> I think so, I've used is_hvm_domain in order to cover for HVM domains running in shadow mode on AMD hardware, I think those also need the ASID flushes.
>>>>
>>>> I'm unconvinced: The entire section "TLB Management" in the PM gives the impression that "ordinary" TLB flushing covers all ASIDs anyway. It's not stated anywhere (I could find) explicitly though.
>>>
>>> Hm, I don't think so. XenRT found a myriad of issues with NPT when p2m code wasn't modified to do ASID flushes instead of plain TLB flushes.
>>
>> Well, that's clear from e.g. p2m_pt_set_entry() not doing any flushing itself.
>>
>>> Even if it's just to stay on the safe side I would perform ASID flushes for HVM guests with shadow running on AMD.
>>
>> Tim, any chance you could voice your thoughts here? To me it seems odd to do an all-ASIDs flush followed by an ASID one.
>
> I've been reading a bit more into this, and section 15.16.1 states:
>
> "TLB flush operations must not be assumed to affect all ASIDs."

That's the section talking about the tlb_control VMCB field. It is in this context that the sentence needs to be interpreted, imo.

Jan
On Wed, Apr 15, 2020 at 05:42:20PM +0200, Jan Beulich wrote:
> On 15.04.2020 16:49, Roger Pau Monné wrote:
> > On Tue, Apr 14, 2020 at 05:06:23PM +0200, Jan Beulich wrote:
> >> On 14.04.2020 16:53, Roger Pau Monné wrote:
> >>> On Tue, Apr 14, 2020 at 03:50:15PM +0200, Jan Beulich wrote:
> >>>> On 14.04.2020 13:19, Roger Pau Monné wrote:
> >>>>> On Tue, Apr 14, 2020 at 12:13:04PM +0200, Jan Beulich wrote:
> >>>>>> On 14.04.2020 12:02, Roger Pau Monné wrote:
> >>>>>>> That seems nice, we would have to be careful however as reducing the
> >>>>>>> number of ASID/VPID flushes could uncover issues in the existing code.
> >>>>>>> I guess you mean something like:
> >>>>>>>
> >>>>>>> static inline void guest_flush_tlb_mask(const struct domain *d,
> >>>>>>>                                         const cpumask_t *mask)
> >>>>>>> {
> >>>>>>>     flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
> >>>>>>>                                                                 : 0) |
> >>>>>>>                      (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
> >>>>>>>                                                       : 0));
> >>>>>>> }
> >>>>>>
> >>>>>> Almost - is_hvm_domain(d) && cpu_has_svm seems to wide for me. I'd
> >>>>>> rather use hap_enabled() && cpu_has_svm, which effectively means NPT.
> >>>>>> Or am I overlooking a need to do ASID flushes also in shadow mode?
> >>>>>
> >>>>> I think so, I've used is_hvm_domain in order to cover for HVM domains
> >>>>> running in shadow mode on AMD hardware, I think those also need the
> >>>>> ASID flushes.
> >>>>
> >>>> I'm unconvinced: The entire section "TLB Management" in the PM gives
> >>>> the impression that "ordinary" TLB flushing covers all ASIDs anyway.
> >>>> It's not stated anywhere (I could find) explicitly though.
> >>>
> >>> Hm, I don't think so. XenRT found a myriad of issues with NPT when p2m
> >>> code wasn't modified to do ASID flushes instead of plain TLB flushes.
> >>
> >> Well, that's clear from e.g. p2m_pt_set_entry() not doing any
> >> flushing itself.
> >>
> >>> Even if it's just to stay on the safe side I would perform ASID
> >>> flushes for HVM guests with shadow running on AMD.
> >>
> >> Tim, any chance you could voice your thoughts here? To me it seems
> >> odd to do an all-ASIDs flush followed by an ASID one.
> >
> > I've been reading a bit more into this, and section 15.16.1 states:
> >
> > "TLB flush operations must not be assumed to affect all ASIDs."
>
> That's the section talking about the tlb_control VMCB field. It is
> in this context that the sentence needs to be interpreted, imo.

It explicitly mentions move-to-cr3 and move-to-cr4 before that phrase:

"TLB flush operations function identically whether or not SVM is
enabled (e.g., MOV-TO-CR3 flushes non-global mappings, whereas
MOV-TO-CR4 flushes global and non-global mappings). TLB flush
operations must not be assumed to affect all ASIDs."

So my reading is that normal flush operations not involving the
tlb_control VMCB field should not be assumed to flush all ASIDs. Again,
this is of course my interpretation of the text in the PM. I will go
with my suggested approach, which is safer and should cause no
functional issues AFAICT. The only downside is the possibly redundant
flush, but that's safe.

Thanks, Roger.
On 15.04.2020 17:54, Roger Pau Monné wrote:
> On Wed, Apr 15, 2020 at 05:42:20PM +0200, Jan Beulich wrote:
>> On 15.04.2020 16:49, Roger Pau Monné wrote:
>>> On Tue, Apr 14, 2020 at 05:06:23PM +0200, Jan Beulich wrote:
>>>> On 14.04.2020 16:53, Roger Pau Monné wrote:
>>>>> On Tue, Apr 14, 2020 at 03:50:15PM +0200, Jan Beulich wrote:
>>>>>> On 14.04.2020 13:19, Roger Pau Monné wrote:
>>>>>>> On Tue, Apr 14, 2020 at 12:13:04PM +0200, Jan Beulich wrote:
>>>>>>>> On 14.04.2020 12:02, Roger Pau Monné wrote:
>>>>>>>>> That seems nice, we would have to be careful however as reducing the
>>>>>>>>> number of ASID/VPID flushes could uncover issues in the existing code.
>>>>>>>>> I guess you mean something like:
>>>>>>>>>
>>>>>>>>> static inline void guest_flush_tlb_mask(const struct domain *d,
>>>>>>>>>                                         const cpumask_t *mask)
>>>>>>>>> {
>>>>>>>>>     flush_mask(mask, (is_pv_domain(d) || shadow_mode_enabled(d) ? FLUSH_TLB
>>>>>>>>>                                                                 : 0) |
>>>>>>>>>                      (is_hvm_domain(d) && cpu_has_svm ? FLUSH_HVM_ASID_CORE
>>>>>>>>>                                                       : 0));
>>>>>>>>> }
>>>>>>>>
>>>>>>>> Almost - is_hvm_domain(d) && cpu_has_svm seems to wide for me. I'd
>>>>>>>> rather use hap_enabled() && cpu_has_svm, which effectively means NPT.
>>>>>>>> Or am I overlooking a need to do ASID flushes also in shadow mode?
>>>>>>>
>>>>>>> I think so, I've used is_hvm_domain in order to cover for HVM domains
>>>>>>> running in shadow mode on AMD hardware, I think those also need the
>>>>>>> ASID flushes.
>>>>>>
>>>>>> I'm unconvinced: The entire section "TLB Management" in the PM gives
>>>>>> the impression that "ordinary" TLB flushing covers all ASIDs anyway.
>>>>>> It's not stated anywhere (I could find) explicitly though.
>>>>>
>>>>> Hm, I don't think so. XenRT found a myriad of issues with NPT when p2m
>>>>> code wasn't modified to do ASID flushes instead of plain TLB flushes.
>>>>
>>>> Well, that's clear from e.g. p2m_pt_set_entry() not doing any
>>>> flushing itself.
>>>>
>>>>> Even if it's just to stay on the safe side I would perform ASID
>>>>> flushes for HVM guests with shadow running on AMD.
>>>>
>>>> Tim, any chance you could voice your thoughts here? To me it seems
>>>> odd to do an all-ASIDs flush followed by an ASID one.
>>>
>>> I've been reading a bit more into this, and section 15.16.1 states:
>>>
>>> "TLB flush operations must not be assumed to affect all ASIDs."
>>
>> That's the section talking about the tlb_control VMCB field. It is
>> in this context that the sentence needs to be interpreted, imo.
>
> It explicitly mentions move-to-cr3 and move-to-cr4 before that phrase:
>
> "TLB flush operations function identically whether or not SVM is
> enabled (e.g., MOV-TO-CR3 flushes non-global mappings, whereas
> MOV-TO-CR4 flushes global and non-global mappings). TLB flush
> operations must not be assumed to affect all ASIDs."

Hmm, indeed. How did I not spot this already when reading this the
other day?

> So my reading is that normal flush operations not involving
> tlb_control VMCB field should not assume to flush all ASIDs. Again
> this is of course my interpretation of the text in the PM. I will go
> with my suggested approach, which is safer and should cause no
> functional issues AFAICT. The only downside is the maybe redundant
> flush, but that's safe.

Okay. And I'm sorry for having attempted to mislead you.

Jan
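For orientation before the patch itself: the approach the thread converges on shows up in v9 as two small wrappers, so HAP/NPT p2m changes only tickle guest ASIDs/VPIDs while shadow code pairs a host TLB flush with an ASID tickle for HVM guests. Reproduced here from the hunks below:

static inline void hap_flush_tlb_mask(const cpumask_t *mask)
{
    flush_mask(mask, FLUSH_HVM_ASID_CORE);
}

static inline void sh_flush_tlb_mask(const struct domain *d,
                                     const cpumask_t *mask)
{
    flush_mask(mask, FLUSH_TLB | (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0));
}

Note that for a PV domain the shadow variant degenerates to a plain FLUSH_TLB, as there are no guest ASIDs to invalidate.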
diff --git a/xen/arch/x86/flushtlb.c b/xen/arch/x86/flushtlb.c
index 03f92c23dc..c81e53c0ae 100644
--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -59,8 +59,6 @@ static u32 pre_flush(void)
     raise_softirq(NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ);
 
  skip_clocktick:
-    hvm_flush_guest_tlbs();
-
     return t2;
 }
 
@@ -118,6 +116,7 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4)
     local_irq_save(flags);
 
     t = pre_flush();
+    hvm_flush_guest_tlbs();
 
     old_cr4 = read_cr4();
     ASSERT(!(old_cr4 & X86_CR4_PCIDE) || !(old_cr4 & X86_CR4_PGE));
@@ -221,6 +220,9 @@ unsigned int flush_area_local(const void *va, unsigned int flags)
             do_tlb_flush();
     }
 
+    if ( flags & FLUSH_HVM_ASID_CORE )
+        hvm_flush_guest_tlbs();
+
     if ( flags & FLUSH_CACHE )
     {
         const struct cpuinfo_x86 *c = &current_cpu_data;
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index a6d5e39b02..12856245be 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
             p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
                                   p2m_ram_rw, p2m_ram_logdirty);
 
-            flush_tlb_mask(d->dirty_cpumask);
+            hap_flush_tlb_mask(d->dirty_cpumask);
 
             memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
         }
@@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
          * to be read-only, or via hardware-assisted log-dirty.
          */
         p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
-        flush_tlb_mask(d->dirty_cpumask);
+        hap_flush_tlb_mask(d->dirty_cpumask);
     }
     return 0;
 }
@@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
      * be read-only, or via hardware-assisted log-dirty.
      */
     p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
-    flush_tlb_mask(d->dirty_cpumask);
+    hap_flush_tlb_mask(d->dirty_cpumask);
 }
 
 /************************************************/
@@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
     safe_write_pte(p, new);
 
     if ( old_flags & _PAGE_PRESENT )
-        flush_tlb_mask(d->dirty_cpumask);
+        hap_flush_tlb_mask(d->dirty_cpumask);
 
     paging_unlock(d);
 
diff --git a/xen/arch/x86/mm/hap/nested_hap.c b/xen/arch/x86/mm/hap/nested_hap.c
index abe5958a52..02a5ae75c0 100644
--- a/xen/arch/x86/mm/hap/nested_hap.c
+++ b/xen/arch/x86/mm/hap/nested_hap.c
@@ -84,7 +84,7 @@ nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
     safe_write_pte(p, new);
 
     if (old_flags & _PAGE_PRESENT)
-        flush_tlb_mask(p2m->dirty_cpumask);
+        hap_flush_tlb_mask(p2m->dirty_cpumask);
 
     paging_unlock(d);
 
diff --git a/xen/arch/x86/mm/hap/private.h b/xen/arch/x86/mm/hap/private.h
index 973fbe8be5..7ee8480d83 100644
--- a/xen/arch/x86/mm/hap/private.h
+++ b/xen/arch/x86/mm/hap/private.h
@@ -47,4 +47,9 @@ unsigned long hap_p2m_ga_to_gfn_4_levels(struct vcpu *v,
     struct p2m_domain *p2m, unsigned long cr3,
     paddr_t ga, uint32_t *pfec, unsigned int *page_order);
 
+static inline void hap_flush_tlb_mask(const cpumask_t *mask)
+{
+    flush_mask(mask, FLUSH_HVM_ASID_CORE);
+}
+
 #endif /* __HAP_PRIVATE_H__ */
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index eb66077496..c90032dc88 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -896,7 +896,7 @@ static void p2m_pt_change_entry_type_global(struct p2m_domain *p2m,
     unmap_domain_page(tab);
 
     if ( changed )
-        flush_tlb_mask(p2m->domain->dirty_cpumask);
+        flush_mask(p2m->domain->dirty_cpumask, FLUSH_HVM_ASID_CORE);
 }
 
 static int p2m_pt_change_entry_type_range(struct p2m_domain *p2m,
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 469bb76429..d0bccaf7eb 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -613,7 +613,8 @@ void paging_log_dirty_range(struct domain *d,
 
     p2m_unlock(p2m);
 
-    flush_tlb_mask(d->dirty_cpumask);
+    flush_mask(d->dirty_cpumask, (!hap_enabled(d) ? FLUSH_TLB : 0) |
+                                 FLUSH_HVM_ASID_CORE);
 }
 
 /*
diff --git a/xen/arch/x86/mm/shadow/common.c b/xen/arch/x86/mm/shadow/common.c
index 75dd414a6e..467e0d3fe1 100644
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -363,7 +363,7 @@ static int oos_remove_write_access(struct vcpu *v, mfn_t gmfn,
     }
 
     if ( ftlb )
-        flush_tlb_mask(d->dirty_cpumask);
+        sh_flush_tlb_mask(d, d->dirty_cpumask);
 
     return 0;
 }
@@ -939,7 +939,7 @@ static void _shadow_prealloc(struct domain *d, unsigned int pages)
         /* See if that freed up enough space */
         if ( d->arch.paging.shadow.free_pages >= pages )
         {
-            flush_tlb_mask(d->dirty_cpumask);
+            sh_flush_tlb_mask(d, d->dirty_cpumask);
             return;
         }
     }
@@ -993,7 +993,7 @@ static void shadow_blow_tables(struct domain *d)
                                pagetable_get_mfn(v->arch.shadow_table[i]), 0);
 
     /* Make sure everyone sees the unshadowings */
-    flush_tlb_mask(d->dirty_cpumask);
+    sh_flush_tlb_mask(d, d->dirty_cpumask);
 }
 
 void shadow_blow_tables_per_domain(struct domain *d)
@@ -1102,7 +1102,7 @@ mfn_t shadow_alloc(struct domain *d,
         if ( unlikely(!cpumask_empty(&mask)) )
         {
             perfc_incr(shadow_alloc_tlbflush);
-            flush_tlb_mask(&mask);
+            sh_flush_tlb_mask(d, &mask);
         }
         /* Now safe to clear the page for reuse */
         clear_domain_page(page_to_mfn(sp));
@@ -2293,7 +2293,7 @@ void sh_remove_shadows(struct domain *d, mfn_t gmfn, int fast, int all)
 
     /* Need to flush TLBs now, so that linear maps are safe next time we
      * take a fault. */
-    flush_tlb_mask(d->dirty_cpumask);
+    sh_flush_tlb_mask(d, d->dirty_cpumask);
 
     paging_unlock(d);
 }
@@ -3008,7 +3008,7 @@ static void sh_unshadow_for_p2m_change(struct domain *d, unsigned long gfn,
         {
             sh_remove_all_shadows_and_parents(d, mfn);
             if ( sh_remove_all_mappings(d, mfn, _gfn(gfn)) )
-                flush_tlb_mask(d->dirty_cpumask);
+                sh_flush_tlb_mask(d, d->dirty_cpumask);
         }
     }
 
@@ -3048,7 +3048,7 @@ static void sh_unshadow_for_p2m_change(struct domain *d, unsigned long gfn,
             }
             omfn = mfn_add(omfn, 1);
         }
-        flush_tlb_mask(&flushmask);
+        sh_flush_tlb_mask(d, &flushmask);
 
         if ( npte )
             unmap_domain_page(npte);
@@ -3335,7 +3335,7 @@ int shadow_track_dirty_vram(struct domain *d,
         }
     }
     if ( flush_tlb )
-        flush_tlb_mask(d->dirty_cpumask);
+        sh_flush_tlb_mask(d, d->dirty_cpumask);
     goto out;
 
 out_sl1ma:
@@ -3405,7 +3405,7 @@ bool shadow_flush_tlb(bool (*flush_vcpu)(void *ctxt, struct vcpu *v),
     }
 
     /* Flush TLBs on all CPUs with dirty vcpu state. */
-    flush_tlb_mask(mask);
+    sh_flush_tlb_mask(d, mask);
 
     /* Done. */
     for_each_vcpu ( d, v )
diff --git a/xen/arch/x86/mm/shadow/hvm.c b/xen/arch/x86/mm/shadow/hvm.c
index 1e6024c71f..149f346a48 100644
--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -591,7 +591,7 @@ static void validate_guest_pt_write(struct vcpu *v, mfn_t gmfn,
 
     if ( rc & SHADOW_SET_FLUSH )
         /* Need to flush TLBs to pick up shadow PT changes */
-        flush_tlb_mask(d->dirty_cpumask);
+        sh_flush_tlb_mask(d, d->dirty_cpumask);
 
     if ( rc & SHADOW_SET_ERROR )
     {
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index f6b1628742..17af28cdbd 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3067,7 +3067,7 @@ static int sh_page_fault(struct vcpu *v,
         perfc_incr(shadow_rm_write_flush_tlb);
         smp_wmb();
         atomic_inc(&d->arch.paging.shadow.gtable_dirty_version);
-        flush_tlb_mask(d->dirty_cpumask);
+        sh_flush_tlb_mask(d, d->dirty_cpumask);
     }
 
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
@@ -3576,7 +3576,8 @@ static bool sh_invlpg(struct vcpu *v, unsigned long linear)
     if ( mfn_to_page(sl1mfn)->u.sh.type
          == SH_type_fl1_shadow )
     {
-        flush_tlb_local();
+        flush_local(FLUSH_TLB |
+                    (is_hvm_domain(v->domain) ? FLUSH_HVM_ASID_CORE : 0));
         return false;
     }
 
@@ -3811,7 +3812,7 @@ sh_update_linear_entries(struct vcpu *v)
          * table entry. But, without this change, it would fetch the wrong
          * value due to a stale TLB. */
-        flush_tlb_local();
+        flush_local(FLUSH_TLB | (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0));
     }
 }
 
@@ -4012,7 +4013,7 @@ sh_update_cr3(struct vcpu *v, int do_locking, bool noflush)
      * (old) shadow linear maps in the writeable mapping heuristics. */
 #if GUEST_PAGING_LEVELS == 2
     if ( sh_remove_write_access(d, gmfn, 2, 0) != 0 )
-        flush_tlb_mask(d->dirty_cpumask);
+        sh_flush_tlb_mask(d, d->dirty_cpumask);
     sh_set_toplevel_shadow(v, 0, gmfn, SH_type_l2_shadow);
 #elif GUEST_PAGING_LEVELS == 3
     /* PAE guests have four shadow_table entries, based on the
@@ -4036,7 +4037,7 @@ sh_update_cr3(struct vcpu *v, int do_locking, bool noflush)
             }
         }
         if ( flush )
-            flush_tlb_mask(d->dirty_cpumask);
+            sh_flush_tlb_mask(d, d->dirty_cpumask);
         /* Now install the new shadows. */
         for ( i = 0; i < 4; i++ )
         {
@@ -4057,7 +4058,7 @@ sh_update_cr3(struct vcpu *v, int do_locking, bool noflush)
     }
 #elif GUEST_PAGING_LEVELS == 4
     if ( sh_remove_write_access(d, gmfn, 4, 0) != 0 )
-        flush_tlb_mask(d->dirty_cpumask);
+        sh_flush_tlb_mask(d, d->dirty_cpumask);
     sh_set_toplevel_shadow(v, 0, gmfn, SH_type_l4_shadow);
     if ( !shadow_mode_external(d) && !is_pv_32bit_domain(d) )
     {
@@ -4503,7 +4504,7 @@ static void sh_pagetable_dying(paddr_t gpa)
         }
     }
     if ( flush )
-        flush_tlb_mask(d->dirty_cpumask);
+        sh_flush_tlb_mask(d, d->dirty_cpumask);
 
     /* Remember that we've seen the guest use this interface, so we
      * can rely on it using it in future, instead of guessing at
@@ -4540,7 +4541,7 @@ static void sh_pagetable_dying(paddr_t gpa)
         mfn_to_page(gmfn)->pagetable_dying = true;
         shadow_unhook_mappings(d, smfn, 1/* user pages only */);
         /* Now flush the TLB: we removed toplevel mappings. */
-        flush_tlb_mask(d->dirty_cpumask);
+        sh_flush_tlb_mask(d, d->dirty_cpumask);
     }
 
     /* Remember that we've seen the guest use this interface, so we
diff --git a/xen/arch/x86/mm/shadow/private.h b/xen/arch/x86/mm/shadow/private.h
index e8b028a365..2404ca4ff8 100644
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -818,6 +818,12 @@ static inline int sh_check_page_has_no_refs(struct page_info *page)
 bool shadow_flush_tlb(bool (*flush_vcpu)(void *ctxt, struct vcpu *v),
                       void *ctxt);
 
+static inline void sh_flush_tlb_mask(const struct domain *d,
+                                     const cpumask_t *mask)
+{
+    flush_mask(mask, FLUSH_TLB | (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0));
+}
+
 #endif /* _XEN_SHADOW_PRIVATE_H */
 
 /*
diff --git a/xen/include/asm-x86/flushtlb.h b/xen/include/asm-x86/flushtlb.h
index 2cfe4e6e97..579dc56803 100644
--- a/xen/include/asm-x86/flushtlb.h
+++ b/xen/include/asm-x86/flushtlb.h
@@ -105,6 +105,12 @@ void switch_cr3_cr4(unsigned long cr3, unsigned long cr4);
 #define FLUSH_VCPU_STATE 0x1000
  /* Flush the per-cpu root page table */
 #define FLUSH_ROOT_PGTBL 0x2000
+#if CONFIG_HVM
+ /* Flush all HVM guests linear TLB (using ASID/VPID) */
+#define FLUSH_HVM_ASID_CORE 0x4000
+#else
+#define FLUSH_HVM_ASID_CORE 0
+#endif
 
 /* Flush local TLBs/caches. */
 unsigned int flush_area_local(const void *va, unsigned int flags);
Introduce a specific flag to request a HVM guest linear TLB flush,
which is an ASID/VPID tickle that forces a guest linear to guest
physical TLB flush for all HVM guests.

This was previously unconditionally done in each pre_flush call, but
that's not required: HVM guests not using shadow don't require linear
TLB flushes as Xen doesn't modify the guest page tables in that case
(ie: when using HAP). Note that shadow paging code already takes care
of issuing the necessary flushes when the shadow page tables are
modified.

In order to keep the previous behavior modify all shadow code TLB
flushes to also flush the guest linear to physical TLB if the guest is
HVM. I haven't looked at each specific shadow code TLB flush in order
to figure out whether it actually requires a guest TLB flush or not,
so there might be room for improvement in that regard.

Also perform ASID/VPID flushes when modifying the p2m tables as it's a
requirement for AMD hardware. Finally keep the flush in
switch_cr3_cr4, as it's not clear whether code could rely on
switch_cr3_cr4 also performing a guest linear TLB flush. A following
patch can remove the ASID/VPID tickle from switch_cr3_cr4 if found to
not be necessary.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v8:
 - Don't flush host TLB on HAP changes.
 - Introduce a helper for shadow changes that only flushes ASIDs/VPIDs
   when the guest is HVM.
 - Introduce a helper for HAP that only flushes ASIDs/VPIDs.

Changes since v7:
 - Do not perform an ASID flush in filtered_flush_tlb_mask: the
   requested flush is related to the page need_tlbflush field and not
   to p2m changes (applies to both callers).

Changes since v6:
 - Add ASID/VPID flushes when modifying the p2m.
 - Keep the ASID/VPID flush in switch_cr3_cr4.

Changes since v5:
 - Rename FLUSH_GUESTS_TLB to FLUSH_HVM_ASID_CORE.
 - Clarify commit message.
 - Define FLUSH_HVM_ASID_CORE to 0 when !CONFIG_HVM.
---
 xen/arch/x86/flushtlb.c          |  6 ++++--
 xen/arch/x86/mm/hap/hap.c        |  8 ++++----
 xen/arch/x86/mm/hap/nested_hap.c |  2 +-
 xen/arch/x86/mm/hap/private.h    |  5 +++++
 xen/arch/x86/mm/p2m-pt.c         |  2 +-
 xen/arch/x86/mm/paging.c         |  3 ++-
 xen/arch/x86/mm/shadow/common.c  | 18 +++++++++---------
 xen/arch/x86/mm/shadow/hvm.c     |  2 +-
 xen/arch/x86/mm/shadow/multi.c   | 17 +++++++++--------
 xen/arch/x86/mm/shadow/private.h |  6 ++++++
 xen/include/asm-x86/flushtlb.h   |  6 ++++++
 11 files changed, 48 insertions(+), 27 deletions(-)
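As a closing illustration, here is an unrolled restatement of the guest_flush_tlb_mask helper sketched earlier in the thread (illustrative only: the real helper is introduced by a later patch in the series, and the exact predicates were the subject of the discussion above):

static inline void guest_flush_tlb_mask(const struct domain *d,
                                        const cpumask_t *mask)
{
    unsigned int flags = 0;

    /* Xen itself rewrote tables the guest runs on (PV page tables, or
     * HVM shadow page tables): the host TLB must be flushed. */
    if ( is_pv_domain(d) || shadow_mode_enabled(d) )
        flags |= FLUSH_TLB;

    /* Guest linear to physical translations are ASID-tagged on AMD
     * hardware: tickle them so the next VM entry starts clean. */
    if ( is_hvm_domain(d) && cpu_has_svm )
        flags |= FLUSH_HVM_ASID_CORE;

    flush_mask(mask, flags);
}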