Message ID | 20221227072015.3134-4-longpeng2@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | two optimizations to speed up the start time |
On 27/12/22 08:20, Longpeng(Mike) wrote:
> From: Longpeng <longpeng2@huawei.com>
>
> This allows the vhost-vdpa device to batch the setup of all its MRs of
> host notifiers.
>
> This significantly reduces the device starting time, e.g. the time spend
> on setup the host notifier MRs reduce from 423ms to 32ms for a VM with
> 64 vCPUs and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64vq per device).
>
> Signed-off-by: Longpeng <longpeng2@huawei.com>
> ---
>  hw/virtio/vhost-vdpa.c | 25 +++++++++++++++++++------
>  1 file changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index fd0c33b0e1..870265188a 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -512,9 +512,18 @@ static void vhost_vdpa_host_notifiers_uninit(struct vhost_dev *dev, int n)
>  {
>      int i;
>
> +    /*
> +     * Pack all the changes to the memory regions in a single
> +     * transaction to avoid a few updating of the address space
> +     * topology.
> +     */
> +    memory_region_transaction_begin();
> +
>      for (i = dev->vq_index; i < dev->vq_index + n; i++) {
>          vhost_vdpa_host_notifier_uninit(dev, i);
>      }
> +
> +    memory_region_transaction_commit();
>  }

Instead of optimizing one frontend, I wonder if we shouldn't expose
a 'bool memory_region_transaction_in_progress()' helper in memory.c,
and in virtio_queue_set_host_notifier_mr() backend, assert we are
within a transaction. That way we'd optimize all virtio frontends.
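For illustration, the helper-plus-assertion idea suggested above could look roughly like the following. This is a standalone toy model, not QEMU code: every `toy_*` name is made up for the sketch, and the depth counter only stands in for whatever state memory.c actually keeps.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the nesting depth of open transactions. */
static unsigned toy_txn_depth;

static void toy_transaction_begin(void)
{
    toy_txn_depth++;
}

static void toy_transaction_commit(void)
{
    assert(toy_txn_depth > 0);
    toy_txn_depth--;
}

/* Hypothetical helper in the spirit of the proposal. */
static bool toy_transaction_in_progress(void)
{
    return toy_txn_depth > 0;
}

/* Toy backend: aborts if a caller forgot to open a transaction. */
static void toy_set_host_notifier_mr(int queue)
{
    (void)queue;
    assert(toy_transaction_in_progress());
}

/* Frontend-style caller: one transaction around all queues.
 * Returns the number of queues it configured. */
static int toy_setup_all(int nvqs)
{
    int done = 0;

    toy_transaction_begin();
    for (int i = 0; i < nvqs; i++) {
        toy_set_host_notifier_mr(i);
        done++;
    }
    toy_transaction_commit();
    return done;
}
```

With an assertion like this in the backend, a frontend that skips the begin/commit pair fails at run time, which is how the check would push every caller toward batching.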
On Tue, Dec 27, 2022 at 05:51:47PM +0100, Philippe Mathieu-Daudé wrote:
> On 27/12/22 08:20, Longpeng(Mike) wrote:
> > From: Longpeng <longpeng2@huawei.com>
> >
> > This allows the vhost-vdpa device to batch the setup of all its MRs of
> > host notifiers.
> >
> > This significantly reduces the device starting time, e.g. the time spend
> > on setup the host notifier MRs reduce from 423ms to 32ms for a VM with
> > 64 vCPUs and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64vq per device).
> >
> > Signed-off-by: Longpeng <longpeng2@huawei.com>
> > ---
> >  hw/virtio/vhost-vdpa.c | 25 +++++++++++++++++++------
> >  1 file changed, 19 insertions(+), 6 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index fd0c33b0e1..870265188a 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -512,9 +512,18 @@ static void vhost_vdpa_host_notifiers_uninit(struct vhost_dev *dev, int n)
> >  {
> >      int i;
> > +    /*
> > +     * Pack all the changes to the memory regions in a single
> > +     * transaction to avoid a few updating of the address space
> > +     * topology.
> > +     */
> > +    memory_region_transaction_begin();
> > +
> >      for (i = dev->vq_index; i < dev->vq_index + n; i++) {
> >          vhost_vdpa_host_notifier_uninit(dev, i);
> >      }
> > +
> > +    memory_region_transaction_commit();
> >  }
>
> Instead of optimizing one frontend, I wonder if we shouldn't expose
> a 'bool memory_region_transaction_in_progress()' helper in memory.c,
> and in virtio_queue_set_host_notifier_mr() backend, assert we are
> within a transaction. That way we'd optimize all virtio frontends.

If we are doing something like this, I'd rather pass around
a "transaction" structure so this can be checked statically.
Looks like something that can be done on top though.
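One way to read the "transaction structure" idea is sketched below as a standalone toy model. `ToyTxn` and the `toy_*` functions are invented for this example and are not part of QEMU. The point is only that a function taking a transaction argument cannot be called by code that never opened one, so the compiler catches the omission rather than a run-time assertion.

```c
#include <assert.h>

/* Hypothetical token type; not a QEMU structure. */
typedef struct ToyTxn {
    unsigned pending;   /* changes queued in this transaction */
} ToyTxn;

/* Counts pretend topology rebuilds for the demo below. */
static unsigned toy_commits;

static ToyTxn toy_txn_begin(void)
{
    ToyTxn txn = { 0 };
    return txn;
}

/* Applies all queued changes in one step, if there are any. */
static void toy_txn_commit(ToyTxn *txn)
{
    if (txn->pending > 0) {
        toy_commits++;
        txn->pending = 0;
    }
}

/* The signature itself demands a transaction from the caller. */
static void toy_set_notifier(ToyTxn *txn, int queue)
{
    (void)queue;
    txn->pending++;
}

/* Returns how many rebuilds the whole setup needed. */
static unsigned toy_setup_with_token(int nvqs)
{
    ToyTxn txn = toy_txn_begin();

    toy_commits = 0;
    for (int i = 0; i < nvqs; i++) {
        toy_set_notifier(&txn, i);
    }
    toy_txn_commit(&txn);
    return toy_commits;
}
```

In plain C a caller could still pass NULL or a stale token, so the static guarantee is limited to "a transaction argument must be supplied"; a real design would need its own rules on top of that.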
On 27/12/22 18:56, Michael S. Tsirkin wrote:
> On Tue, Dec 27, 2022 at 05:51:47PM +0100, Philippe Mathieu-Daudé wrote:
>> On 27/12/22 08:20, Longpeng(Mike) wrote:
>>> From: Longpeng <longpeng2@huawei.com>
>>>
>>> This allows the vhost-vdpa device to batch the setup of all its MRs of
>>> host notifiers.
>>>
>>> This significantly reduces the device starting time, e.g. the time spend
>>> on setup the host notifier MRs reduce from 423ms to 32ms for a VM with
>>> 64 vCPUs and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64vq per device).
>>>
>>> Signed-off-by: Longpeng <longpeng2@huawei.com>
>>> ---
>>>  hw/virtio/vhost-vdpa.c | 25 +++++++++++++++++++------
>>>  1 file changed, 19 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
>>> index fd0c33b0e1..870265188a 100644
>>> --- a/hw/virtio/vhost-vdpa.c
>>> +++ b/hw/virtio/vhost-vdpa.c
>>> @@ -512,9 +512,18 @@ static void vhost_vdpa_host_notifiers_uninit(struct vhost_dev *dev, int n)
>>>  {
>>>      int i;
>>> +    /*
>>> +     * Pack all the changes to the memory regions in a single
>>> +     * transaction to avoid a few updating of the address space
>>> +     * topology.
>>> +     */
>>> +    memory_region_transaction_begin();
>>> +
>>>      for (i = dev->vq_index; i < dev->vq_index + n; i++) {
>>>          vhost_vdpa_host_notifier_uninit(dev, i);
>>>      }
>>> +
>>> +    memory_region_transaction_commit();
>>>  }
>>
>> Instead of optimizing one frontend, I wonder if we shouldn't expose
>> a 'bool memory_region_transaction_in_progress()' helper in memory.c,
>> and in virtio_queue_set_host_notifier_mr() backend, assert we are
>> within a transaction. That way we'd optimize all virtio frontends.
>
>
> If we are doing something like this, I'd rather pass around
> a "transaction" structure so this can be checked statically.

Ah, clever.

> Looks like something that can be done on top though.

Sure.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index fd0c33b0e1..870265188a 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -512,9 +512,18 @@ static void vhost_vdpa_host_notifiers_uninit(struct vhost_dev *dev, int n)
 {
     int i;
 
+    /*
+     * Pack all the changes to the memory regions in a single
+     * transaction to avoid a few updating of the address space
+     * topology.
+     */
+    memory_region_transaction_begin();
+
     for (i = dev->vq_index; i < dev->vq_index + n; i++) {
         vhost_vdpa_host_notifier_uninit(dev, i);
     }
+
+    memory_region_transaction_commit();
 }
 
 static void vhost_vdpa_host_notifiers_init(struct vhost_dev *dev)
@@ -527,17 +536,21 @@ static void vhost_vdpa_host_notifiers_init(struct vhost_dev *dev)
         return;
     }
 
+    /*
+     * Pack all the changes to the memory regions in a single
+     * transaction to avoid a few updating of the address space
+     * topology.
+     */
+    memory_region_transaction_begin();
+
     for (i = dev->vq_index; i < dev->vq_index + dev->nvqs; i++) {
         if (vhost_vdpa_host_notifier_init(dev, i)) {
-            goto err;
+            vhost_vdpa_host_notifiers_uninit(dev, i - dev->vq_index);
+            break;
         }
     }
 
-    return;
-
-err:
-    vhost_vdpa_host_notifiers_uninit(dev, i - dev->vq_index);
-    return;
+    memory_region_transaction_commit();
 }
 
 static void vhost_vdpa_svq_cleanup(struct vhost_dev *dev)
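The control flow of the patch, including the nested transaction that results when the error path calls the uninit function from inside the init transaction, can be modelled with a small standalone sketch. The `sim_*` names are hypothetical, and the model assumes that each individual notifier change would trigger its own topology rebuild when no outer transaction is open; it does not measure anything about the real device.

```c
#include <assert.h>

static unsigned sim_depth;     /* nesting depth of open transactions */
static unsigned sim_rebuilds;  /* pretend address-space rebuilds */
static int sim_dirty;          /* a change is waiting for the outermost commit */

static void sim_begin(void)
{
    sim_depth++;
}

static void sim_commit(void)
{
    assert(sim_depth > 0);
    /* Only the outermost commit rebuilds, and only if something changed. */
    if (--sim_depth == 0 && sim_dirty) {
        sim_rebuilds++;
        sim_dirty = 0;
    }
}

/* Model assumption: every single change is its own small transaction. */
static void sim_change_one(void)
{
    sim_begin();
    sim_dirty = 1;
    sim_commit();
}

/* Mirrors the patched uninit: one transaction around n removals. */
static void sim_uninit(int n)
{
    sim_begin();
    for (int i = 0; i < n; i++) {
        sim_change_one();
    }
    sim_commit();
}

/* Mirrors the patched init; fail_at < 0 means no queue fails.
 * Returns the number of rebuilds performed. */
static unsigned sim_init(int nvqs, int fail_at)
{
    sim_rebuilds = 0;
    sim_begin();
    for (int i = 0; i < nvqs; i++) {
        if (i == fail_at) {
            sim_uninit(i);   /* nested transaction, no rebuild yet */
            break;
        }
        sim_change_one();
    }
    sim_commit();
    return sim_rebuilds;
}

/* The pre-patch shape: no outer transaction around the loop. */
static unsigned sim_init_unbatched(int nvqs)
{
    sim_rebuilds = 0;
    for (int i = 0; i < nvqs; i++) {
        sim_change_one();
    }
    return sim_rebuilds;
}
```

In this model, 64 queues cost 64 rebuilds without the outer transaction and one with it, and a failure partway through still ends in a single rebuild because the rollback is nested inside the same transaction.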