Message ID | 20200731104644.20906-1-paul@xen.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/hvm: set 'ipat' in EPT for special pages | expand |
On 31.07.2020 12:46, Paul Durrant wrote: > From: Paul Durrant <pdurrant@amazon.com> > > All non-MMIO ranges (i.e those not mapping real device MMIO regions) that > map valid MFNs are normally marked MTRR_TYPE_WRBACK and 'ipat' is set. Hence > when PV drivers running in a guest populate the BAR space of the Xen Platform > PCI Device with pages such as the Shared Info page or Grant Table pages, > accesses to these pages will be cachable. > > However, should IOMMU mappings be enabled be enabled for the guest then these > accesses become uncachable. This has a substantial negative effect on I/O > throughput of PV devices. Arguably PV drivers should bot be using BAR space to > host the Shared Info and Grant Table pages but it is currently commonplace for > them to do this and so this problem needs mitigation. Hence this patch makes > sure the 'ipat' bit is set for any special page regardless of where in GFN > space it is mapped. > > NOTE: Clearly this mitigation only applies to Intel EPT. It is not obvious > that there is any similar mitigation possible for AMD NPT. Downstreams > such as Citrix XenServer have been carrying a patch similar to this for > several releases though. > > Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> However, ... > --- a/xen/arch/x86/hvm/mtrr.c > +++ b/xen/arch/x86/hvm/mtrr.c > @@ -830,7 +830,8 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn, > return MTRR_TYPE_UNCACHABLE; > } > > - if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) ) > + if ( (!is_iommu_enabled(d) && !cache_flush_permitted(d)) || > + is_special_page(mfn_to_page(mfn)) ) > { > *ipat = 1; > return MTRR_TYPE_WRBACK; ... shouldn't we leverages this (right away?) to do away with the APIC access page special case a few lines up from here? It is my understanding that vmx_alloc_vlapic_mapping() uses set_mmio_p2m_entry() just in order to get ipat set on the resulting EPT entry. Yet with the allocation using MEMF_no_refcount, this will now be the result even if no artificial MMIO mapping was created. Jan
> -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Sent: 31 July 2020 12:15 > To: Paul Durrant <paul@xen.org> > Cc: xen-devel@lists.xenproject.org; Paul Durrant <pdurrant@amazon.com>; Andrew Cooper > <andrew.cooper3@citrix.com>; Wei Liu <wl@xen.org>; Roger Pau Monné <roger.pau@citrix.com> > Subject: Re: [PATCH] x86/hvm: set 'ipat' in EPT for special pages > > On 31.07.2020 12:46, Paul Durrant wrote: > > From: Paul Durrant <pdurrant@amazon.com> > > > > All non-MMIO ranges (i.e those not mapping real device MMIO regions) that > > map valid MFNs are normally marked MTRR_TYPE_WRBACK and 'ipat' is set. Hence > > when PV drivers running in a guest populate the BAR space of the Xen Platform > > PCI Device with pages such as the Shared Info page or Grant Table pages, > > accesses to these pages will be cachable. > > > > However, should IOMMU mappings be enabled be enabled for the guest then these > > accesses become uncachable. This has a substantial negative effect on I/O > > throughput of PV devices. Arguably PV drivers should bot be using BAR space to > > host the Shared Info and Grant Table pages but it is currently commonplace for > > them to do this and so this problem needs mitigation. Hence this patch makes > > sure the 'ipat' bit is set for any special page regardless of where in GFN > > space it is mapped. > > > > NOTE: Clearly this mitigation only applies to Intel EPT. It is not obvious > > that there is any similar mitigation possible for AMD NPT. Downstreams > > such as Citrix XenServer have been carrying a patch similar to this for > > several releases though. > > > > Signed-off-by: Paul Durrant <pdurrant@amazon.com> > > Reviewed-by: Jan Beulich <jbeulich@suse.com> > > However, ... > > > --- a/xen/arch/x86/hvm/mtrr.c > > +++ b/xen/arch/x86/hvm/mtrr.c > > @@ -830,7 +830,8 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn, > > return MTRR_TYPE_UNCACHABLE; > > } > > > > - if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) ) > > + if ( (!is_iommu_enabled(d) && !cache_flush_permitted(d)) || > > + is_special_page(mfn_to_page(mfn)) ) > > { > > *ipat = 1; > > return MTRR_TYPE_WRBACK; > > ... shouldn't we leverages this (right away?) to do away with the > APIC access page special case a few lines up from here? It is my > understanding that vmx_alloc_vlapic_mapping() uses > set_mmio_p2m_entry() just in order to get ipat set on the resulting > EPT entry. Yet with the allocation using MEMF_no_refcount, this will > now be the result even if no artificial MMIO mapping was created. That's a good point. Best handled by a separate patch I think so I'll re-send this with your R-b plus a follow up patch as a v2. Paul > > Jan
> -----Original Message----- > From: Andrew Cooper <andrew.cooper3@citrix.com> > Sent: 31 July 2020 12:21 > To: Paul Durrant <paul@xen.org>; xen-devel@lists.xenproject.org > Cc: Paul Durrant <pdurrant@amazon.com>; Jan Beulich <jbeulich@suse.com>; Wei Liu <wl@xen.org>; Roger > Pau Monné <roger.pau@citrix.com> > Subject: Re: [PATCH] x86/hvm: set 'ipat' in EPT for special pages > > On 31/07/2020 11:46, Paul Durrant wrote: > > From: Paul Durrant <pdurrant@amazon.com> > > > > All non-MMIO ranges (i.e those not mapping real device MMIO regions) that > > map valid MFNs are normally marked MTRR_TYPE_WRBACK and 'ipat' is set. Hence > > when PV drivers running in a guest populate the BAR space of the Xen Platform > > PCI Device with pages such as the Shared Info page or Grant Table pages, > > accesses to these pages will be cachable. > > > > However, should IOMMU mappings be enabled be enabled for the guest then these > > accesses become uncachable. This has a substantial negative effect on I/O > > throughput of PV devices. Arguably PV drivers should bot be using BAR space to > > host the Shared Info and Grant Table pages but it is currently commonplace for > > them to do this and so this problem needs mitigation. Hence this patch makes > > sure the 'ipat' bit is set for any special page regardless of where in GFN > > space it is mapped. > > > > NOTE: Clearly this mitigation only applies to Intel EPT. It is not obvious > > that there is any similar mitigation possible for AMD NPT. Downstreams > > such as Citrix XenServer have been carrying a patch similar to this for > > several releases though. > > https://github.com/xenserver/xen.pg/blob/XS-8.2.x/master/xen-override-caching-cp-26562.patch > > (Yay for internal ticket references escaping into the wild.) > :-) > > However, it is very important to be aware that this is just papering > over the problem, and it will cease to function as soon as we get MKTME > support. When we hit that point, iPAT cannot be used, as it will cause > data corruption in guests. > > The only correct way to fix this is to not (mis)use BAR space for RAM > mappings. > Oh yes, t his is only a mitigation. I believe Roger is working on a mechanism for guests to query for non-populated RAM space, which would be suitable for use by PV drivers. Paul > ~Andrew
On 31/07/2020 11:46, Paul Durrant wrote: > From: Paul Durrant <pdurrant@amazon.com> > > All non-MMIO ranges (i.e those not mapping real device MMIO regions) that > map valid MFNs are normally marked MTRR_TYPE_WRBACK and 'ipat' is set. Hence > when PV drivers running in a guest populate the BAR space of the Xen Platform > PCI Device with pages such as the Shared Info page or Grant Table pages, > accesses to these pages will be cachable. > > However, should IOMMU mappings be enabled be enabled for the guest then these > accesses become uncachable. This has a substantial negative effect on I/O > throughput of PV devices. Arguably PV drivers should bot be using BAR space to > host the Shared Info and Grant Table pages but it is currently commonplace for > them to do this and so this problem needs mitigation. Hence this patch makes > sure the 'ipat' bit is set for any special page regardless of where in GFN > space it is mapped. > > NOTE: Clearly this mitigation only applies to Intel EPT. It is not obvious > that there is any similar mitigation possible for AMD NPT. Downstreams > such as Citrix XenServer have been carrying a patch similar to this for > several releases though. https://github.com/xenserver/xen.pg/blob/XS-8.2.x/master/xen-override-caching-cp-26562.patch (Yay for internal ticket references escaping into the wild.) However, it is very important to be aware that this is just papering over the problem, and it will cease to function as soon as we get MKTME support. When we hit that point, iPAT cannot be used, as it will cause data corruption in guests. The only correct way to fix this is to not (mis)use BAR space for RAM mappings. ~Andrew
diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c index 511c3be1c8..3ad813ed15 100644 --- a/xen/arch/x86/hvm/mtrr.c +++ b/xen/arch/x86/hvm/mtrr.c @@ -830,7 +830,8 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn, return MTRR_TYPE_UNCACHABLE; } - if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) ) + if ( (!is_iommu_enabled(d) && !cache_flush_permitted(d)) || + is_special_page(mfn_to_page(mfn)) ) { *ipat = 1; return MTRR_TYPE_WRBACK;