[v3,3/3] vdpa: commit all host notifier MRs in a single MR transaction

Message ID 20221227072015.3134-4-longpeng2@huawei.com (mailing list archive)
State New, archived
Series two optimizations to speed up the start time

Commit Message

Longpeng(Mike) Dec. 27, 2022, 7:20 a.m. UTC
From: Longpeng <longpeng2@huawei.com>

This allows the vhost-vdpa device to batch the setup of all its host
notifier MRs in a single memory region transaction.

This significantly reduces the device start time, e.g. the time spent
setting up the host notifier MRs drops from 423ms to 32ms for a VM with
64 vCPUs and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64vq per device).

Signed-off-by: Longpeng <longpeng2@huawei.com>
---
 hw/virtio/vhost-vdpa.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)
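
Editorial illustration (not part of the patch): the saving comes from
paying for the address space rebuild once per outermost transaction
instead of once per notifier MR. The toy model below sketches that
effect with a nesting counter. Every name in it (the toy_ prefix, the
rebuild counter) is hypothetical and only stands in for the real memory
API; it is not QEMU's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not QEMU's memory API. */
static unsigned toy_txn_depth;      /* nesting level of open transactions */
static bool toy_update_pending;     /* a region changed since last rebuild */
static unsigned toy_rebuild_count;  /* how often the topology was rebuilt */

static void toy_transaction_begin(void)
{
    toy_txn_depth++;
}

static void toy_transaction_commit(void)
{
    assert(toy_txn_depth > 0);
    /* Only the outermost commit pays for the rebuild. */
    if (--toy_txn_depth == 0 && toy_update_pending) {
        toy_update_pending = false;
        toy_rebuild_count++;        /* stands in for the expensive work */
    }
}

/* One region change, wrapped in its own (possibly nested) transaction. */
static void toy_region_change(void)
{
    toy_transaction_begin();
    toy_update_pending = true;
    toy_transaction_commit();
}

/* Apply n changes one by one; returns the number of rebuilds caused. */
static unsigned toy_apply_unbatched(unsigned n)
{
    unsigned before = toy_rebuild_count;

    for (unsigned i = 0; i < n; i++) {
        toy_region_change();
    }
    return toy_rebuild_count - before;
}

/* Apply n changes inside one outer transaction; returns the rebuilds. */
static unsigned toy_apply_batched(unsigned n)
{
    unsigned before = toy_rebuild_count;

    toy_transaction_begin();
    for (unsigned i = 0; i < n; i++) {
        toy_region_change();
    }
    toy_transaction_commit();
    return toy_rebuild_count - before;
}
```

In this model, 64 unbatched changes trigger 64 rebuilds while the
batched variant triggers one, which is the shape of the improvement the
commit message reports.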

Comments

Philippe Mathieu-Daudé Dec. 27, 2022, 4:51 p.m. UTC | #1
On 27/12/22 08:20, Longpeng(Mike) wrote:
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index fd0c33b0e1..870265188a 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -512,9 +512,18 @@ static void vhost_vdpa_host_notifiers_uninit(struct vhost_dev *dev, int n)
>   {
>       int i;
>   
> +    /*
> +     * Pack all the changes to the memory regions in a single
> +     * transaction to avoid a few updating of the address space
> +     * topology.
> +     */
> +    memory_region_transaction_begin();
> +
>       for (i = dev->vq_index; i < dev->vq_index + n; i++) {
>           vhost_vdpa_host_notifier_uninit(dev, i);
>       }
> +
> +    memory_region_transaction_commit();
>   }

Instead of optimizing one frontend, I wonder if we shouldn't expose
a 'bool memory_region_transaction_in_progress()' helper in memory.c,
and in virtio_queue_set_host_notifier_mr() backend, assert we are
within a transaction. That way we'd optimize all virtio frontends.
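
Editorial illustration (not code from this thread): the idea above could
look roughly like the sketch below. The demo_ functions are hypothetical
stand-ins; the real helper would live in memory.c and the real setter is
virtio_queue_set_host_notifier_mr(), whose signature is not reproduced
here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for state that QEMU keeps in memory.c. */
static unsigned demo_txn_depth;
static unsigned demo_notifier_mrs_set;

static void demo_transaction_begin(void)
{
    demo_txn_depth++;
}

static void demo_transaction_commit(void)
{
    assert(demo_txn_depth > 0);
    demo_txn_depth--;
}

/* Models the proposed 'bool memory_region_transaction_in_progress()'. */
static bool demo_transaction_in_progress(void)
{
    return demo_txn_depth > 0;
}

/* Stand-in for the virtio core setter: refuses to run unbatched. */
static void demo_set_host_notifier_mr(int queue_index)
{
    assert(demo_transaction_in_progress());
    (void)queue_index;              /* a real setter would map the MR */
    demo_notifier_mrs_set++;
}

/* A frontend that batches correctly; returns the total MRs set so far. */
static unsigned demo_setup_all(int nvqs)
{
    demo_transaction_begin();
    for (int i = 0; i < nvqs; i++) {
        demo_set_host_notifier_mr(i);
    }
    demo_transaction_commit();
    return demo_notifier_mrs_set;
}
```

With such an assertion, a frontend that forgot to open a transaction
would abort at run time on the first call rather than silently paying
for one topology update per queue.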

Michael S. Tsirkin Dec. 27, 2022, 5:56 p.m. UTC | #2
On Tue, Dec 27, 2022 at 05:51:47PM +0100, Philippe Mathieu-Daudé wrote:
> 
> Instead of optimizing one frontend, I wonder if we shouldn't expose
> a 'bool memory_region_transaction_in_progress()' helper in memory.c,
> and in virtio_queue_set_host_notifier_mr() backend, assert we are
> within a transaction. That way we'd optimize all virtio frontends.


If we are doing something like this, I'd rather pass around
a "transaction" structure so this can be checked statically.
Looks like something that can be done on top though.
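
Editorial illustration (not code from this thread): one way to read the
"transaction structure" idea is a token that every batched operation
takes as a parameter, so a caller without an open transaction has
nothing to pass and fails to compile. The TokTransaction type and the
tok_ functions are hypothetical; no such API is claimed to exist in
QEMU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token representing one open transaction. */
typedef struct TokTransaction {
    unsigned changes;               /* region changes queued so far */
} TokTransaction;

static unsigned tok_rebuilds;       /* how often the topology was rebuilt */

static TokTransaction tok_transaction_begin(void)
{
    return (TokTransaction){ .changes = 0 };
}

/*
 * The setter requires the token: the compiler rejects any call site
 * that does not supply one.
 */
static void tok_set_host_notifier_mr(TokTransaction *txn, int queue_index)
{
    assert(txn != NULL);            /* NULL still needs a run-time check */
    (void)queue_index;              /* a real setter would map the MR */
    txn->changes++;
}

static void tok_transaction_commit(TokTransaction *txn)
{
    assert(txn != NULL);
    if (txn->changes > 0) {
        tok_rebuilds++;             /* one rebuild for the whole batch */
    }
    txn->changes = 0;
}

/* Sets up nvqs notifiers; returns the rebuilds caused by this call. */
static unsigned tok_setup_all(int nvqs)
{
    unsigned before = tok_rebuilds;
    TokTransaction txn = tok_transaction_begin();

    for (int i = 0; i < nvqs; i++) {
        tok_set_host_notifier_mr(&txn, i);
    }
    tok_transaction_commit(&txn);
    return tok_rebuilds - before;
}
```

Compared with a run-time assertion, the missing-transaction mistake
surfaces at build time, at the cost of threading the extra parameter
through every caller.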

Philippe Mathieu-Daudé Dec. 28, 2022, 1:14 p.m. UTC | #3
On 27/12/22 18:56, Michael S. Tsirkin wrote:
> On Tue, Dec 27, 2022 at 05:51:47PM +0100, Philippe Mathieu-Daudé wrote:
>>
>> Instead of optimizing one frontend, I wonder if we shouldn't expose
>> a 'bool memory_region_transaction_in_progress()' helper in memory.c,
>> and in virtio_queue_set_host_notifier_mr() backend, assert we are
>> within a transaction. That way we'd optimize all virtio frontends.
> 
> 
> If we are doing something like this, I'd rather pass around
> a "transaction" structure so this can be checked statically.

Ah, clever.

> Looks like something that can be done on top though.

Sure.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

Patch

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index fd0c33b0e1..870265188a 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -512,9 +512,18 @@  static void vhost_vdpa_host_notifiers_uninit(struct vhost_dev *dev, int n)
 {
     int i;
 
+    /*
+     * Pack all the changes to the memory regions in a single
+     * transaction so that the address space topology is updated
+     * only once.
+     */
+    memory_region_transaction_begin();
+
     for (i = dev->vq_index; i < dev->vq_index + n; i++) {
         vhost_vdpa_host_notifier_uninit(dev, i);
     }
+
+    memory_region_transaction_commit();
 }
 
 static void vhost_vdpa_host_notifiers_init(struct vhost_dev *dev)
@@ -527,17 +536,21 @@  static void vhost_vdpa_host_notifiers_init(struct vhost_dev *dev)
         return;
     }
 
+    /*
+     * Pack all the changes to the memory regions in a single
+     * transaction so that the address space topology is updated
+     * only once.
+     */
+    memory_region_transaction_begin();
+
     for (i = dev->vq_index; i < dev->vq_index + dev->nvqs; i++) {
         if (vhost_vdpa_host_notifier_init(dev, i)) {
-            goto err;
+            vhost_vdpa_host_notifiers_uninit(dev, i - dev->vq_index);
+            break;
         }
     }
 
-    return;
-
-err:
-    vhost_vdpa_host_notifiers_uninit(dev, i - dev->vq_index);
-    return;
+    memory_region_transaction_commit();
 }
 
 static void vhost_vdpa_svq_cleanup(struct vhost_dev *dev)
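
Editorial illustration (not part of the patch): the reworked init path
rolls back from inside the open transaction, so the uninit helper's own
begin/commit pair nests and the error case still ends in a single
outermost commit. The simulation below sketches that control flow. All
sim_ names, the fail_at parameter and the fixed queue limit are
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_MAX_VQS 64

/* Hypothetical model state, not QEMU's. */
static unsigned sim_depth;
static bool sim_pending;
static unsigned sim_rebuilds;
static bool sim_mapped[SIM_MAX_VQS];

static void sim_begin(void)
{
    sim_depth++;
}

static void sim_commit(void)
{
    assert(sim_depth > 0);
    if (--sim_depth == 0 && sim_pending) {
        sim_pending = false;
        sim_rebuilds++;             /* stands in for the topology update */
    }
}

/* Maps queue i; returns -1 without mapping when i == fail_at. */
static int sim_notifier_init(int i, int fail_at)
{
    if (i == fail_at) {
        return -1;
    }
    sim_begin();
    sim_mapped[i] = true;
    sim_pending = true;
    sim_commit();
    return 0;
}

/* Unmaps queues [0, n); its transaction nests inside the caller's. */
static void sim_notifiers_uninit(int n)
{
    sim_begin();
    for (int i = 0; i < n; i++) {
        if (sim_mapped[i]) {
            sim_mapped[i] = false;
            sim_pending = true;
        }
    }
    sim_commit();
}

/* Mirrors the patched loop; returns how many queues are left mapped. */
static int sim_notifiers_init(int nvqs, int fail_at)
{
    int mapped = 0;

    assert(nvqs >= 0 && nvqs <= SIM_MAX_VQS);
    sim_begin();
    for (int i = 0; i < nvqs; i++) {
        if (sim_notifier_init(i, fail_at)) {
            sim_notifiers_uninit(i);    /* roll back inside the batch */
            break;
        }
    }
    sim_commit();                       /* the only outermost commit */

    for (int i = 0; i < nvqs; i++) {
        mapped += sim_mapped[i];
    }
    return mapped;
}
```

In the model, a failure at queue 5 of 8 leaves nothing mapped and costs
one rebuild, the same as the success path.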