Message ID | d07ee286-52f2-c7ec-2d0d-1c343dbc78be@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86: reduce cache flushing overhead | expand |
On 19/04/2023 11:46 am, Jan Beulich wrote: > We don't need to invalidate caches here; all we're after is that earlier > writes have made it to main memory (and aiui even that just in case). > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > This, aiui, being an analogue to uses of iommu_sync_cache() (just not > range restricted), I wonder whether it shouldn't be conditional upon > iommu_non_coherent. Then again I'm vaguely under the impression that > we had been here before, possibly even as far as questioning the need > for this call altogether. I'd far rather we fix it property than continue to massage around the sides of known-broken logic. Coherency, or not, of the memory accesses of an IOMMU is a per-IOMMU property, not a system-wide property. What the iommu_non_coherent global boolean currently does for us is enforce cache maintenance on all IOMMUs, even the coherent ones, when any single IOMMU in the system is non-coherent. Inside the IOMMU driver, cache maintenance should depend on iommu->ecap alone, disregarding anything else. This will improve the performance on any system with mixed coherent and non-coherent IOMMUs, without any other behavioural impact. The complication comes in in p2m-ept when we're sharing EPT and IOMMU pagetables, because the EPT can be accessed by more than one IOMMU if the guest has devices behind different IOMMUs. But we know the devices assigned to the domain. They're almost always static for the lifetime of the domain, and generally single device only, so there may be value in considering a per-domain "I've got a non-coherent IOMMU" flag, because we absolutely don't want to be walking the list of attached IOMMUs on each EPT update. But... SandyBridge era systems are the only ones I'm aware of where the IOMMUs advertise themselves as non-coherent. And on that generation we quirk the superpage capability off anyway, meaning they are ineligible for HAP-PT sharing anyway. And if we are doing HAP-PT sharing, the cache maintenance for regular EPT updates will be crippling for CPU performance, especially as the software bit updates are not relevant to the IOMMU anyway. A better option would be to simply disallow HAP-PT sharing when there's a non-coherent IOMMU in the system, or (slightly more fine grained) to disallow adding a device behind a non-coherent IOMMU to a domain using HAP-PT sharing (as there's a disable now anyway). Either will reduce the complexity of Xen's code without any loss of functionality in real systems AFAICT. ~Andrew
On 19.04.2023 22:46, Andrew Cooper wrote: > On 19/04/2023 11:46 am, Jan Beulich wrote: >> We don't need to invalidate caches here; all we're after is that earlier >> writes have made it to main memory (and aiui even that just in case). >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> --- >> This, aiui, being an analogue to uses of iommu_sync_cache() (just not >> range restricted), I wonder whether it shouldn't be conditional upon >> iommu_non_coherent. Then again I'm vaguely under the impression that >> we had been here before, possibly even as far as questioning the need >> for this call altogether. > > I'd far rather we fix it property than continue to massage around the > sides of known-broken logic. I agree in principle, but you're not asking that I actually amend this single-line change with all the work you outline, are you? To be quite honest, what you ask for really is something the VT-d maintainer(s) (i.e. Kevin as it presently stands) ought to be doing (or really have done long ago) ... Jan > Coherency, or not, of the memory accesses of an IOMMU is a per-IOMMU > property, not a system-wide property. What the iommu_non_coherent > global boolean currently does for us is enforce cache maintenance on all > IOMMUs, even the coherent ones, when any single IOMMU in the system is > non-coherent. > > Inside the IOMMU driver, cache maintenance should depend on iommu->ecap > alone, disregarding anything else. This will improve the performance on > any system with mixed coherent and non-coherent IOMMUs, without any > other behavioural impact. > > The complication comes in in p2m-ept when we're sharing EPT and IOMMU > pagetables, because the EPT can be accessed by more than one IOMMU if > the guest has devices behind different IOMMUs. > > But we know the devices assigned to the domain. They're almost always > static for the lifetime of the domain, and generally single device only, > so there may be value in considering a per-domain "I've got a > non-coherent IOMMU" flag, because we absolutely don't want to be walking > the list of attached IOMMUs on each EPT update. > > > But... > > SandyBridge era systems are the only ones I'm aware of where the IOMMUs > advertise themselves as non-coherent. And on that generation we quirk > the superpage capability off anyway, meaning they are ineligible for > HAP-PT sharing anyway. > > And if we are doing HAP-PT sharing, the cache maintenance for regular > EPT updates will be crippling for CPU performance, especially as the > software bit updates are not relevant to the IOMMU anyway. > > A better option would be to simply disallow HAP-PT sharing when there's > a non-coherent IOMMU in the system, or (slightly more fine grained) to > disallow adding a device behind a non-coherent IOMMU to a domain using > HAP-PT sharing (as there's a disable now anyway). > > Either will reduce the complexity of Xen's code without any loss of > functionality in real systems AFAICT. > > ~Andrew
--- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -693,7 +693,7 @@ static int __must_check iommu_flush_all( bool_t flush_dev_iotlb; int rc = 0; - flush_local(FLUSH_CACHE); + flush_local(FLUSH_WRITEBACK); for_each_drhd_unit ( drhd ) {
We don't need to invalidate caches here; all we're after is that earlier writes have made it to main memory (and aiui even that just in case). Signed-off-by: Jan Beulich <jbeulich@suse.com> --- This, aiui, being an analogue to uses of iommu_sync_cache() (just not range restricted), I wonder whether it shouldn't be conditional upon iommu_non_coherent. Then again I'm vaguely under the impression that we had been here before, possibly even as far as questioning the need for this call altogether.