Message ID | 20200901142608.24481-5-eperezma@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | memory: Delete assertion in memory_region_unregister_iommu_notifier | expand |
On Tue, Sep 01, 2020 at 04:26:07PM +0200, Eugenio Pérez wrote: > This improves performance in case of netperf with vhost-net: > * TCP_STREAM: From 1923.6Mbit/s to 2175.13Mbit/s (13%) > * TCP_RR: From 8464.73 trans/s to 8932.703333 trans/s (5.5%) > * UDP_RR: From 8562.08 trans/s to 9005.62/s (5.1%) > * UDP_STREAM: No change observed (insignificant 0.1% improvement) Just to confirm: are these numbers about applying this patch before/after, or applying the whole series before/after? Asked since we actually optimized two parts: Firstly we avoid sending two invalidations for vhost. That's done by the previous patch, afaict. Secondly, this patch avoids the page walk for vhost since not needed. Am I right? > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com> > --- > hw/i386/intel_iommu.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index cdddb089e7..fe82391b73 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -1964,6 +1964,12 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domain_id) > vtd_iommu_unlock(s); > > QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) { > + if (vtd_as->iommu.iommu_notify_flags & IOMMU_NOTIFIER_DEVIOTLB) { > + /* If IOMMU memory region is DEVICE IOTLB type, it does not make > + * sense to send regular IOMMU notifications. */ > + continue; > + } > + We want to avoid vtd_sync_shadow_page_table() for vhost, however IMHO a better expression would be: if (!(vtd_as->iommu.iommu_notify_flags & (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP))) { continue; } The thing is we can't avoid the page sync if e.g. we're registered with MAP|UNMAP|DEVIOTLB. The important thing here, imho, is MAP|UNMAP because these two messages are used for shadow page synchronizations, so we can skip that if neither of the message is registered. Besides, we can add this at the entry of vtd_sync_shadow_page_table() so that all callers of vtd_sync_shadow_page_table() can benefit. > if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), > vtd_as->devfn, &ce) && > domain_id == vtd_get_domain_id(s, &ce)) { > -- > 2.18.1 >
On Tue, Sep 1, 2020 at 11:06 PM Peter Xu <peterx@redhat.com> wrote: > > On Tue, Sep 01, 2020 at 04:26:07PM +0200, Eugenio Pérez wrote: > > This improves performance in case of netperf with vhost-net: > > * TCP_STREAM: From 1923.6Mbit/s to 2175.13Mbit/s (13%) > > * TCP_RR: From 8464.73 trans/s to 8932.703333 trans/s (5.5%) > > * UDP_RR: From 8562.08 trans/s to 9005.62/s (5.1%) > > * UDP_STREAM: No change observed (insignificant 0.1% improvement) > > Just to confirm: are these numbers about applying this patch before/after, or > applying the whole series before/after? > > Asked since we actually optimized two parts: > > Firstly we avoid sending two invalidations for vhost. That's done by the > previous patch, afaict. > > Secondly, this patch avoids the page walk for vhost since not needed. > > Am I right? > Right! The numbers are comparing v5.1.0 tag to this commit. Avoiding sending invalidations for vhost did improve performance but in a very small number, I thought it would improve more. It also depends a lot on using rhel8 (better numbers, less but appreciable improvements) or rhel7 guest (worse performance compared to rhel8 but patches provide more improvements). > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com> > > --- > > hw/i386/intel_iommu.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > > index cdddb089e7..fe82391b73 100644 > > --- a/hw/i386/intel_iommu.c > > +++ b/hw/i386/intel_iommu.c > > @@ -1964,6 +1964,12 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domain_id) > > vtd_iommu_unlock(s); > > > > QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) { > > + if (vtd_as->iommu.iommu_notify_flags & IOMMU_NOTIFIER_DEVIOTLB) { > > + /* If IOMMU memory region is DEVICE IOTLB type, it does not make > > + * sense to send regular IOMMU notifications. */ > > + continue; > > + } > > + > > We want to avoid vtd_sync_shadow_page_table() for vhost, however IMHO a better > expression would be: > > if (!(vtd_as->iommu.iommu_notify_flags & > (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP))) { > continue; > } > > The thing is we can't avoid the page sync if e.g. we're registered with > MAP|UNMAP|DEVIOTLB. The important thing here, imho, is MAP|UNMAP because these > two messages are used for shadow page synchronizations, so we can skip that if > neither of the message is registered. > > Besides, we can add this at the entry of vtd_sync_shadow_page_table() so that > all callers of vtd_sync_shadow_page_table() can benefit. > Agree on both. Testing the new changes. Thanks! > > if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), > > vtd_as->devfn, &ce) && > > domain_id == vtd_get_domain_id(s, &ce)) { > > -- > > 2.18.1 > > > > -- > Peter Xu > >
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index cdddb089e7..fe82391b73 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -1964,6 +1964,12 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domain_id) vtd_iommu_unlock(s); QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) { + if (vtd_as->iommu.iommu_notify_flags & IOMMU_NOTIFIER_DEVIOTLB) { + /* If IOMMU memory region is DEVICE IOTLB type, it does not make + * sense to send regular IOMMU notifications. */ + continue; + } + if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), vtd_as->devfn, &ce) && domain_id == vtd_get_domain_id(s, &ce)) {
This improves performance in case of netperf with vhost-net: * TCP_STREAM: From 1923.6Mbit/s to 2175.13Mbit/s (13%) * TCP_RR: From 8464.73 trans/s to 8932.703333 trans/s (5.5%) * UDP_RR: From 8562.08 trans/s to 9005.62/s (5.1%) * UDP_STREAM: No change observed (insignificant 0.1% improvement) Signed-off-by: Eugenio Pérez <eperezma@redhat.com> --- hw/i386/intel_iommu.c | 6 ++++++ 1 file changed, 6 insertions(+)