diff mbox series

vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it

Message ID 20230703142218.362549-1-eperezma@redhat.com (mailing list archive)
State New, archived
Headers show
Series vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it | expand

Commit Message

Eugenio Perez Martin July 3, 2023, 2:22 p.m. UTC
With the current code it is accepted as long as userland send it.

Although userland should not set a feature flag that has not been
offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
complain for it.

Since there is no specific reason for any parent to reject that backend
feature bit when it has been proposed, let's control it at vdpa frontend
level. Future patches may move this control to the parent driver.

Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>

---
Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
commit. Please let me know if I should send a v3 of [1] instead.

[1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
---
 drivers/vhost/vdpa.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Michael S. Tsirkin July 3, 2023, 2:52 p.m. UTC | #1
On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> With the current code it is accepted as long as userland send it.
> 
> Although userland should not set a feature flag that has not been
> offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> complain for it.
> 
> Since there is no specific reason for any parent to reject that backend
> feature bit when it has been proposed, let's control it at vdpa frontend
> level. Future patches may move this control to the parent driver.
> 
> Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>

Please do send v3. And again, I don't want to send "after driver ok" hack
upstream at all, I merged it in next just to give it some testing.
We want RING_ACCESS_AFTER_KICK or some such.


> ---
> Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> commit. Please let me know if I should send a v3 of [1] instead.
> 
> [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> ---
>  drivers/vhost/vdpa.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index e1abf29fed5b..a7e554352351 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>  {
>  	struct vhost_vdpa *v = filep->private_data;
>  	struct vhost_dev *d = &v->vdev;
> +	const struct vdpa_config_ops *ops = v->vdpa->config;
>  	void __user *argp = (void __user *)arg;
>  	u64 __user *featurep = argp;
> -	u64 features;
> +	u64 features, parent_features = 0;
>  	long r = 0;
>  
>  	if (cmd == VHOST_SET_BACKEND_FEATURES) {
>  		if (copy_from_user(&features, featurep, sizeof(features)))
>  			return -EFAULT;
> +		if (ops->get_backend_features)
> +			parent_features = ops->get_backend_features(v->vdpa);
>  		if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
>  				 BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
>  				 BIT_ULL(VHOST_BACKEND_F_RESUME) |
> -				 BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> +				 parent_features))
>  			return -EOPNOTSUPP;
>  		if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
>  		     !vhost_vdpa_can_suspend(v))
> -- 
> 2.39.3
Eugenio Perez Martin July 4, 2023, 10:25 a.m. UTC | #2
On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > With the current code it is accepted as long as userland send it.
> >
> > Although userland should not set a feature flag that has not been
> > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > complain for it.
> >
> > Since there is no specific reason for any parent to reject that backend
> > feature bit when it has been proposed, let's control it at vdpa frontend
> > level. Future patches may move this control to the parent driver.
> >
> > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>
> Please do send v3. And again, I don't want to send "after driver ok" hack
> upstream at all, I merged it in next just to give it some testing.
> We want RING_ACCESS_AFTER_KICK or some such.
>

Current devices do not support that semantic. My plan was to convert
it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
was not explicit enough.

The only solution I can see to that is to trap & emulate in the vdpa
(parent?) driver, as talked in virtio-comment. But that complicates
the architecture:
* Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
* Store vq enable state separately, at
vdpa->config->set_vq_ready(true), but not transmit that enable to hw
* Store the doorbell state separately, but do not configure it to the
device directly.

But how to recover if the device cannot configure them at kick time,
for example?

Maybe we can just fail if the parent driver does not support enabling
the vq after DRIVER_OK? That way no new feature flag is needed.

Thanks!

>
> > ---
> > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > commit. Please let me know if I should send a v3 of [1] instead.
> >
> > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > ---
> >  drivers/vhost/vdpa.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > index e1abf29fed5b..a7e554352351 100644
> > --- a/drivers/vhost/vdpa.c
> > +++ b/drivers/vhost/vdpa.c
> > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> >  {
> >       struct vhost_vdpa *v = filep->private_data;
> >       struct vhost_dev *d = &v->vdev;
> > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> >       void __user *argp = (void __user *)arg;
> >       u64 __user *featurep = argp;
> > -     u64 features;
> > +     u64 features, parent_features = 0;
> >       long r = 0;
> >
> >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> >               if (copy_from_user(&features, featurep, sizeof(features)))
> >                       return -EFAULT;
> > +             if (ops->get_backend_features)
> > +                     parent_features = ops->get_backend_features(v->vdpa);
> >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > +                              parent_features))
> >                       return -EOPNOTSUPP;
> >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> >                    !vhost_vdpa_can_suspend(v))
> > --
> > 2.39.3
>
Michael S. Tsirkin July 4, 2023, 10:38 a.m. UTC | #3
On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > With the current code it is accepted as long as userland send it.
> > >
> > > Although userland should not set a feature flag that has not been
> > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > complain for it.
> > >
> > > Since there is no specific reason for any parent to reject that backend
> > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > level. Future patches may move this control to the parent driver.
> > >
> > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >
> > Please do send v3. And again, I don't want to send "after driver ok" hack
> > upstream at all, I merged it in next just to give it some testing.
> > We want RING_ACCESS_AFTER_KICK or some such.
> >
> 
> Current devices do not support that semantic.

Which devices specifically access the ring after DRIVER_OK but before
a kick?

> My plan was to convert
> it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> was not explicit enough.
> 
> The only solution I can see to that is to trap & emulate in the vdpa
> (parent?) driver, as talked in virtio-comment. But that complicates
> the architecture:
> * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> * Store vq enable state separately, at
> vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> * Store the doorbell state separately, but do not configure it to the
> device directly.
> 
> But how to recover if the device cannot configure them at kick time,
> for example?
> 
> Maybe we can just fail if the parent driver does not support enabling
> the vq after DRIVER_OK? That way no new feature flag is needed.
> 
> Thanks!
> 
> >
> > > ---
> > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > commit. Please let me know if I should send a v3 of [1] instead.
> > >
> > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > ---
> > >  drivers/vhost/vdpa.c | 7 +++++--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > index e1abf29fed5b..a7e554352351 100644
> > > --- a/drivers/vhost/vdpa.c
> > > +++ b/drivers/vhost/vdpa.c
> > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > >  {
> > >       struct vhost_vdpa *v = filep->private_data;
> > >       struct vhost_dev *d = &v->vdev;
> > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > >       void __user *argp = (void __user *)arg;
> > >       u64 __user *featurep = argp;
> > > -     u64 features;
> > > +     u64 features, parent_features = 0;
> > >       long r = 0;
> > >
> > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > >                       return -EFAULT;
> > > +             if (ops->get_backend_features)
> > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > +                              parent_features))
> > >                       return -EOPNOTSUPP;
> > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > >                    !vhost_vdpa_can_suspend(v))
> > > --
> > > 2.39.3
> >
Eugenio Perez Martin July 4, 2023, 11:36 a.m. UTC | #4
On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > With the current code it is accepted as long as userland send it.
> > > >
> > > > Although userland should not set a feature flag that has not been
> > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > complain for it.
> > > >
> > > > Since there is no specific reason for any parent to reject that backend
> > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > level. Future patches may move this control to the parent driver.
> > > >
> > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > >
> > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > upstream at all, I merged it in next just to give it some testing.
> > > We want RING_ACCESS_AFTER_KICK or some such.
> > >
> >
> > Current devices do not support that semantic.
>
> Which devices specifically access the ring after DRIVER_OK but before
> a kick?
>

Previous versions of the QEMU LM series did a spurious kick to start
traffic at the LM destination [1]. When it was proposed, that spurious
kick was removed from the series because to check for descriptors
after driver_ok, even without a kick, was considered work of the
parent driver.

I'm ok to go back to this spurious kick, but I'm not sure if the hw
will read the ring before the kick actually. I can ask.

Thanks!

[1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html

> > My plan was to convert
> > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > was not explicit enough.
> >
> > The only solution I can see to that is to trap & emulate in the vdpa
> > (parent?) driver, as talked in virtio-comment. But that complicates
> > the architecture:
> > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > * Store vq enable state separately, at
> > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > * Store the doorbell state separately, but do not configure it to the
> > device directly.
> >
> > But how to recover if the device cannot configure them at kick time,
> > for example?
> >
> > Maybe we can just fail if the parent driver does not support enabling
> > the vq after DRIVER_OK? That way no new feature flag is needed.
> >
> > Thanks!
> >
> > >
> > > > ---
> > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > >
> > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > ---
> > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > index e1abf29fed5b..a7e554352351 100644
> > > > --- a/drivers/vhost/vdpa.c
> > > > +++ b/drivers/vhost/vdpa.c
> > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > >  {
> > > >       struct vhost_vdpa *v = filep->private_data;
> > > >       struct vhost_dev *d = &v->vdev;
> > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > >       void __user *argp = (void __user *)arg;
> > > >       u64 __user *featurep = argp;
> > > > -     u64 features;
> > > > +     u64 features, parent_features = 0;
> > > >       long r = 0;
> > > >
> > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > >                       return -EFAULT;
> > > > +             if (ops->get_backend_features)
> > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > +                              parent_features))
> > > >                       return -EOPNOTSUPP;
> > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > >                    !vhost_vdpa_can_suspend(v))
> > > > --
> > > > 2.39.3
> > >
>
Michael S. Tsirkin July 4, 2023, 3:45 p.m. UTC | #5
On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > With the current code it is accepted as long as userland send it.
> > > > >
> > > > > Although userland should not set a feature flag that has not been
> > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > complain for it.
> > > > >
> > > > > Since there is no specific reason for any parent to reject that backend
> > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > level. Future patches may move this control to the parent driver.
> > > > >
> > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > >
> > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > upstream at all, I merged it in next just to give it some testing.
> > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > >
> > >
> > > Current devices do not support that semantic.
> >
> > Which devices specifically access the ring after DRIVER_OK but before
> > a kick?
> >
> 
> Previous versions of the QEMU LM series did a spurious kick to start
> traffic at the LM destination [1]. When it was proposed, that spurious
> kick was removed from the series because to check for descriptors
> after driver_ok, even without a kick, was considered work of the
> parent driver.
> 
> I'm ok to go back to this spurious kick, but I'm not sure if the hw
> will read the ring before the kick actually. I can ask.
> 
> Thanks!
> 
> [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html

Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?



> > > My plan was to convert
> > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > was not explicit enough.
> > >
> > > The only solution I can see to that is to trap & emulate in the vdpa
> > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > the architecture:
> > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > * Store vq enable state separately, at
> > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > * Store the doorbell state separately, but do not configure it to the
> > > device directly.
> > >
> > > But how to recover if the device cannot configure them at kick time,
> > > for example?
> > >
> > > Maybe we can just fail if the parent driver does not support enabling
> > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > >
> > > Thanks!
> > >
> > > >
> > > > > ---
> > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > >
> > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > ---
> > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > --- a/drivers/vhost/vdpa.c
> > > > > +++ b/drivers/vhost/vdpa.c
> > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > >  {
> > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > >       struct vhost_dev *d = &v->vdev;
> > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > >       void __user *argp = (void __user *)arg;
> > > > >       u64 __user *featurep = argp;
> > > > > -     u64 features;
> > > > > +     u64 features, parent_features = 0;
> > > > >       long r = 0;
> > > > >
> > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > >                       return -EFAULT;
> > > > > +             if (ops->get_backend_features)
> > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > +                              parent_features))
> > > > >                       return -EOPNOTSUPP;
> > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > --
> > > > > 2.39.3
> > > >
> >
Jason Wang July 5, 2023, 7:49 a.m. UTC | #6
On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > With the current code it is accepted as long as userland send it.
> > > > > >
> > > > > > Although userland should not set a feature flag that has not been
> > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > complain for it.
> > > > > >
> > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > level. Future patches may move this control to the parent driver.
> > > > > >
> > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > >
> > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > >
> > > >
> > > > Current devices do not support that semantic.
> > >
> > > Which devices specifically access the ring after DRIVER_OK but before
> > > a kick?
> > >
> >
> > Previous versions of the QEMU LM series did a spurious kick to start
> > traffic at the LM destination [1]. When it was proposed, that spurious
> > kick was removed from the series because to check for descriptors
> > after driver_ok, even without a kick, was considered work of the
> > parent driver.
> >
> > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > will read the ring before the kick actually. I can ask.
> >
> > Thanks!
> >
> > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
>
> Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?

My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.

But this reminds me one thing, as the thread is going too long, I
wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
supported?

Thanks

>
>
>
> > > > My plan was to convert
> > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > was not explicit enough.
> > > >
> > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > the architecture:
> > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > * Store vq enable state separately, at
> > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > * Store the doorbell state separately, but do not configure it to the
> > > > device directly.
> > > >
> > > > But how to recover if the device cannot configure them at kick time,
> > > > for example?
> > > >
> > > > Maybe we can just fail if the parent driver does not support enabling
> > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > >
> > > > Thanks!
> > > >
> > > > >
> > > > > > ---
> > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > >
> > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > ---
> > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > >  {
> > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > >       void __user *argp = (void __user *)arg;
> > > > > >       u64 __user *featurep = argp;
> > > > > > -     u64 features;
> > > > > > +     u64 features, parent_features = 0;
> > > > > >       long r = 0;
> > > > > >
> > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > >                       return -EFAULT;
> > > > > > +             if (ops->get_backend_features)
> > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > +                              parent_features))
> > > > > >                       return -EOPNOTSUPP;
> > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > --
> > > > > > 2.39.3
> > > > >
> > >
>
Jason Wang July 5, 2023, 7:55 a.m. UTC | #7
On Tue, Jul 4, 2023 at 6:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > With the current code it is accepted as long as userland send it.
> > > >
> > > > Although userland should not set a feature flag that has not been
> > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > complain for it.
> > > >
> > > > Since there is no specific reason for any parent to reject that backend
> > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > level. Future patches may move this control to the parent driver.
> > > >
> > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > >
> > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > upstream at all, I merged it in next just to give it some testing.
> > > We want RING_ACCESS_AFTER_KICK or some such.
> > >
> >
> > Current devices do not support that semantic.
>
> Which devices specifically access the ring after DRIVER_OK but before
> a kick?

Vhost-net is one example at last. It polls a socket as well, so it
starts to access the ring immediately after DRIVER_OK.

Thanks

>
> > My plan was to convert
> > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > was not explicit enough.
> >
> > The only solution I can see to that is to trap & emulate in the vdpa
> > (parent?) driver, as talked in virtio-comment. But that complicates
> > the architecture:
> > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > * Store vq enable state separately, at
> > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > * Store the doorbell state separately, but do not configure it to the
> > device directly.
> >
> > But how to recover if the device cannot configure them at kick time,
> > for example?
> >
> > Maybe we can just fail if the parent driver does not support enabling
> > the vq after DRIVER_OK? That way no new feature flag is needed.
> >
> > Thanks!
> >
> > >
> > > > ---
> > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > >
> > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > ---
> > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > index e1abf29fed5b..a7e554352351 100644
> > > > --- a/drivers/vhost/vdpa.c
> > > > +++ b/drivers/vhost/vdpa.c
> > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > >  {
> > > >       struct vhost_vdpa *v = filep->private_data;
> > > >       struct vhost_dev *d = &v->vdev;
> > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > >       void __user *argp = (void __user *)arg;
> > > >       u64 __user *featurep = argp;
> > > > -     u64 features;
> > > > +     u64 features, parent_features = 0;
> > > >       long r = 0;
> > > >
> > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > >                       return -EFAULT;
> > > > +             if (ops->get_backend_features)
> > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > +                              parent_features))
> > > >                       return -EOPNOTSUPP;
> > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > >                    !vhost_vdpa_can_suspend(v))
> > > > --
> > > > 2.39.3
> > >
>
Michael S. Tsirkin July 5, 2023, 8:41 a.m. UTC | #8
On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote:
> On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > >
> > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > complain for it.
> > > > > > >
> > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > >
> > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > >
> > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > >
> > > > >
> > > > > Current devices do not support that semantic.
> > > >
> > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > a kick?
> > > >
> > >
> > > Previous versions of the QEMU LM series did a spurious kick to start
> > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > kick was removed from the series because to check for descriptors
> > > after driver_ok, even without a kick, was considered work of the
> > > parent driver.
> > >
> > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > will read the ring before the kick actually. I can ask.
> > >
> > > Thanks!
> > >
> > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> >
> > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> 
> My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> 
> But this reminds me one thing, as the thread is going too long, I
> wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> supported?
> 
> Thanks

I don't see what does one have to do with another ...

I think with RING_RESET we had another solution, enable rings
mapping them to a zero page, then reset and re-enable later.

> >
> >
> >
> > > > > My plan was to convert
> > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > was not explicit enough.
> > > > >
> > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > the architecture:
> > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > * Store vq enable state separately, at
> > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > device directly.
> > > > >
> > > > > But how to recover if the device cannot configure them at kick time,
> > > > > for example?
> > > > >
> > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > >
> > > > > > > ---
> > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > >
> > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > ---
> > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > >  {
> > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > >       u64 __user *featurep = argp;
> > > > > > > -     u64 features;
> > > > > > > +     u64 features, parent_features = 0;
> > > > > > >       long r = 0;
> > > > > > >
> > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > >                       return -EFAULT;
> > > > > > > +             if (ops->get_backend_features)
> > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > +                              parent_features))
> > > > > > >                       return -EOPNOTSUPP;
> > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > --
> > > > > > > 2.39.3
> > > > > >
> > > >
> >
Michael S. Tsirkin July 5, 2023, 8:42 a.m. UTC | #9
On Wed, Jul 05, 2023 at 03:55:23PM +0800, Jason Wang wrote:
> On Tue, Jul 4, 2023 at 6:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > With the current code it is accepted as long as userland send it.
> > > > >
> > > > > Although userland should not set a feature flag that has not been
> > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > complain for it.
> > > > >
> > > > > Since there is no specific reason for any parent to reject that backend
> > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > level. Future patches may move this control to the parent driver.
> > > > >
> > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > >
> > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > upstream at all, I merged it in next just to give it some testing.
> > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > >
> > >
> > > Current devices do not support that semantic.
> >
> > Which devices specifically access the ring after DRIVER_OK but before
> > a kick?
> 
> Vhost-net is one example at last. It polls a socket as well, so it
> starts to access the ring immediately after DRIVER_OK.
> 
> Thanks


For sure but that is not vdpa.

> >
> > > My plan was to convert
> > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > was not explicit enough.
> > >
> > > The only solution I can see to that is to trap & emulate in the vdpa
> > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > the architecture:
> > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > * Store vq enable state separately, at
> > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > * Store the doorbell state separately, but do not configure it to the
> > > device directly.
> > >
> > > But how to recover if the device cannot configure them at kick time,
> > > for example?
> > >
> > > Maybe we can just fail if the parent driver does not support enabling
> > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > >
> > > Thanks!
> > >
> > > >
> > > > > ---
> > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > >
> > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > ---
> > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > --- a/drivers/vhost/vdpa.c
> > > > > +++ b/drivers/vhost/vdpa.c
> > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > >  {
> > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > >       struct vhost_dev *d = &v->vdev;
> > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > >       void __user *argp = (void __user *)arg;
> > > > >       u64 __user *featurep = argp;
> > > > > -     u64 features;
> > > > > +     u64 features, parent_features = 0;
> > > > >       long r = 0;
> > > > >
> > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > >                       return -EFAULT;
> > > > > +             if (ops->get_backend_features)
> > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > +                              parent_features))
> > > > >                       return -EOPNOTSUPP;
> > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > --
> > > > > 2.39.3
> > > >
> >
Eugenio Perez Martin July 5, 2023, 6:27 p.m. UTC | #10
On Wed, Jul 5, 2023 at 9:50 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > >
> > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > complain for it.
> > > > > > >
> > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > >
> > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > >
> > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > >
> > > > >
> > > > > Current devices do not support that semantic.
> > > >
> > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > a kick?
> > > >
> > >
> > > Previous versions of the QEMU LM series did a spurious kick to start
> > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > kick was removed from the series because to check for descriptors
> > > after driver_ok, even without a kick, was considered work of the
> > > parent driver.
> > >
> > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > will read the ring before the kick actually. I can ask.
> > >
> > > Thanks!
> > >
> > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> >
> > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
>
> My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
>
> But this reminds me one thing, as the thread is going too long, I
> wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> supported?
>

The problem with that is that the device needs to support all
RING_RESET, like to be able to change vq address etc after DRIVER_OK.
Not all HW support it.

We just need the subset of having the dataplane freezed until all CVQ
commands have been consumed. I'm sure current vDPA code already
supports it in some devices, like MLX and PSD.

Thanks!

> Thanks
>
> >
> >
> >
> > > > > My plan was to convert
> > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > was not explicit enough.
> > > > >
> > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > the architecture:
> > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > * Store vq enable state separately, at
> > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > device directly.
> > > > >
> > > > > But how to recover if the device cannot configure them at kick time,
> > > > > for example?
> > > > >
> > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > >
> > > > > > > ---
> > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > >
> > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > ---
> > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > >  {
> > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > >       u64 __user *featurep = argp;
> > > > > > > -     u64 features;
> > > > > > > +     u64 features, parent_features = 0;
> > > > > > >       long r = 0;
> > > > > > >
> > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > >                       return -EFAULT;
> > > > > > > +             if (ops->get_backend_features)
> > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > +                              parent_features))
> > > > > > >                       return -EOPNOTSUPP;
> > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > --
> > > > > > > 2.39.3
> > > > > >
> > > >
> >
>
Nelson, Shannon July 6, 2023, 12:07 a.m. UTC | #11
On 7/5/23 11:27 AM, Eugenio Perez Martin wrote:
 >
> On Wed, Jul 5, 2023 at 9:50 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>> On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>
>>> On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
>>>> On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>
>>>>> On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
>>>>>> On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>
>>>>>>> On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
>>>>>>>> With the current code it is accepted as long as userland send it.
>>>>>>>>
>>>>>>>> Although userland should not set a feature flag that has not been
>>>>>>>> offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
>>>>>>>> complain for it.
>>>>>>>>
>>>>>>>> Since there is no specific reason for any parent to reject that backend
>>>>>>>> feature bit when it has been proposed, let's control it at vdpa frontend
>>>>>>>> level. Future patches may move this control to the parent driver.
>>>>>>>>
>>>>>>>> Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
>>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>>
>>>>>>> Please do send v3. And again, I don't want to send "after driver ok" hack
>>>>>>> upstream at all, I merged it in next just to give it some testing.
>>>>>>> We want RING_ACCESS_AFTER_KICK or some such.
>>>>>>>
>>>>>>
>>>>>> Current devices do not support that semantic.
>>>>>
>>>>> Which devices specifically access the ring after DRIVER_OK but before
>>>>> a kick?

The PDS vdpa device can deal with a call to .set_vq_ready after 
DRIVER_OK is set.  And I'm told that our VQ activity should start 
without a kick.

Our vdpa device FW doesn't currently have support for 
VIRTIO_F_RING_RESET, but I believe it could be added without too much 
trouble.

sln


>>>>>
>>>>
>>>> Previous versions of the QEMU LM series did a spurious kick to start
>>>> traffic at the LM destination [1]. When it was proposed, that spurious
>>>> kick was removed from the series because to check for descriptors
>>>> after driver_ok, even without a kick, was considered work of the
>>>> parent driver.
>>>>
>>>> I'm ok to go back to this spurious kick, but I'm not sure if the hw
>>>> will read the ring before the kick actually. I can ask.
>>>>
>>>> Thanks!
>>>>
>>>> [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
>>>
>>> Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
>>
>> My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
>> sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
>>
>> But this reminds me one thing, as the thread is going too long, I
>> wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
>> supported?
>>
> 
> The problem with that is that the device needs to support all
> RING_RESET, like to be able to change vq address etc after DRIVER_OK.
> Not all HW support it.
> 
> We just need the subset of having the dataplane freezed until all CVQ
> commands have been consumed. I'm sure current vDPA code already
> supports it in some devices, like MLX and PSD.
> 
> Thanks!
> 
>> Thanks
>>
>>>
>>>
>>>
>>>>>> My plan was to convert
>>>>>> it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
>>>>>> was not explicit enough.
>>>>>>
>>>>>> The only solution I can see to that is to trap & emulate in the vdpa
>>>>>> (parent?) driver, as talked in virtio-comment. But that complicates
>>>>>> the architecture:
>>>>>> * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
>>>>>> * Store vq enable state separately, at
>>>>>> vdpa->config->set_vq_ready(true), but not transmit that enable to hw
>>>>>> * Store the doorbell state separately, but do not configure it to the
>>>>>> device directly.
>>>>>>
>>>>>> But how to recover if the device cannot configure them at kick time,
>>>>>> for example?
>>>>>>
>>>>>> Maybe we can just fail if the parent driver does not support enabling
>>>>>> the vq after DRIVER_OK? That way no new feature flag is needed.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>>
>>>>>>>> ---
>>>>>>>> Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
>>>>>>>> commit. Please let me know if I should send a v3 of [1] instead.
>>>>>>>>
>>>>>>>> [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
>>>>>>>> ---
>>>>>>>>   drivers/vhost/vdpa.c | 7 +++++--
>>>>>>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>>>>>>>> index e1abf29fed5b..a7e554352351 100644
>>>>>>>> --- a/drivers/vhost/vdpa.c
>>>>>>>> +++ b/drivers/vhost/vdpa.c
>>>>>>>> @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>>>>>>>>   {
>>>>>>>>        struct vhost_vdpa *v = filep->private_data;
>>>>>>>>        struct vhost_dev *d = &v->vdev;
>>>>>>>> +     const struct vdpa_config_ops *ops = v->vdpa->config;
>>>>>>>>        void __user *argp = (void __user *)arg;
>>>>>>>>        u64 __user *featurep = argp;
>>>>>>>> -     u64 features;
>>>>>>>> +     u64 features, parent_features = 0;
>>>>>>>>        long r = 0;
>>>>>>>>
>>>>>>>>        if (cmd == VHOST_SET_BACKEND_FEATURES) {
>>>>>>>>                if (copy_from_user(&features, featurep, sizeof(features)))
>>>>>>>>                        return -EFAULT;
>>>>>>>> +             if (ops->get_backend_features)
>>>>>>>> +                     parent_features = ops->get_backend_features(v->vdpa);
>>>>>>>>                if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
>>>>>>>>                                 BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
>>>>>>>>                                 BIT_ULL(VHOST_BACKEND_F_RESUME) |
>>>>>>>> -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
>>>>>>>> +                              parent_features))
>>>>>>>>                        return -EOPNOTSUPP;
>>>>>>>>                if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
>>>>>>>>                     !vhost_vdpa_can_suspend(v))
>>>>>>>> --
>>>>>>>> 2.39.3
>>>>>>>
>>>>>
>>>
>>
>
Jason Wang July 6, 2023, 1:55 a.m. UTC | #12
On Wed, Jul 5, 2023 at 4:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote:
> > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > >
> > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > complain for it.
> > > > > > > >
> > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > >
> > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > >
> > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > >
> > > > > >
> > > > > > Current devices do not support that semantic.
> > > > >
> > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > a kick?
> > > > >
> > > >
> > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > kick was removed from the series because to check for descriptors
> > > > after driver_ok, even without a kick, was considered work of the
> > > > parent driver.
> > > >
> > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > will read the ring before the kick actually. I can ask.
> > > >
> > > > Thanks!
> > > >
> > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > >
> > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> >
> > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> >
> > But this reminds me one thing, as the thread is going too long, I
> > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > supported?
> >
> > Thanks
>
> I don't see what does one have to do with another ...
>
> I think with RING_RESET we had another solution, enable rings
> mapping them to a zero page, then reset and re-enable later.

As discussed before, this seems to have some problems:

1) The behaviour is not clarified in the document
2) zero is a valid IOVA

Thanks

>
> > >
> > >
> > >
> > > > > > My plan was to convert
> > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > was not explicit enough.
> > > > > >
> > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > the architecture:
> > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > * Store vq enable state separately, at
> > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > device directly.
> > > > > >
> > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > for example?
> > > > > >
> > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > >
> > > > > > > > ---
> > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > >
> > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > ---
> > > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > >  {
> > > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > > >       u64 __user *featurep = argp;
> > > > > > > > -     u64 features;
> > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > >       long r = 0;
> > > > > > > >
> > > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > >                       return -EFAULT;
> > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > +                              parent_features))
> > > > > > > >                       return -EOPNOTSUPP;
> > > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > > --
> > > > > > > > 2.39.3
> > > > > > >
> > > > >
> > >
>
Jason Wang July 6, 2023, 1:56 a.m. UTC | #13
On Wed, Jul 5, 2023 at 4:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jul 05, 2023 at 03:55:23PM +0800, Jason Wang wrote:
> > On Tue, Jul 4, 2023 at 6:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > With the current code it is accepted as long as userland send it.
> > > > > >
> > > > > > Although userland should not set a feature flag that has not been
> > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > complain for it.
> > > > > >
> > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > level. Future patches may move this control to the parent driver.
> > > > > >
> > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > >
> > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > >
> > > >
> > > > Current devices do not support that semantic.
> > >
> > > Which devices specifically access the ring after DRIVER_OK but before
> > > a kick?
> >
> > Vhost-net is one example at last. It polls a socket as well, so it
> > starts to access the ring immediately after DRIVER_OK.
> >
> > Thanks
>
>
> For sure but that is not vdpa.

Somehow via vp_vdpa that I'm usually testing vDPA patches.

The problem is that it's very hard to audit all vDPA devices now if it
is not defined in the spec before they are invented.

Thanks

>
> > >
> > > > My plan was to convert
> > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > was not explicit enough.
> > > >
> > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > the architecture:
> > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > * Store vq enable state separately, at
> > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > * Store the doorbell state separately, but do not configure it to the
> > > > device directly.
> > > >
> > > > But how to recover if the device cannot configure them at kick time,
> > > > for example?
> > > >
> > > > Maybe we can just fail if the parent driver does not support enabling
> > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > >
> > > > Thanks!
> > > >
> > > > >
> > > > > > ---
> > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > >
> > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > ---
> > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > >  {
> > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > >       void __user *argp = (void __user *)arg;
> > > > > >       u64 __user *featurep = argp;
> > > > > > -     u64 features;
> > > > > > +     u64 features, parent_features = 0;
> > > > > >       long r = 0;
> > > > > >
> > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > >                       return -EFAULT;
> > > > > > +             if (ops->get_backend_features)
> > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > +                              parent_features))
> > > > > >                       return -EOPNOTSUPP;
> > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > --
> > > > > > 2.39.3
> > > > >
> > >
>
Michael S. Tsirkin July 6, 2023, 6:05 a.m. UTC | #14
On Thu, Jul 06, 2023 at 09:56:29AM +0800, Jason Wang wrote:
> On Wed, Jul 5, 2023 at 4:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jul 05, 2023 at 03:55:23PM +0800, Jason Wang wrote:
> > > On Tue, Jul 4, 2023 at 6:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > >
> > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > complain for it.
> > > > > > >
> > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > >
> > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > >
> > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > >
> > > > >
> > > > > Current devices do not support that semantic.
> > > >
> > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > a kick?
> > >
> > > Vhost-net is one example at last. It polls a socket as well, so it
> > > starts to access the ring immediately after DRIVER_OK.
> > >
> > > Thanks
> >
> >
> > For sure but that is not vdpa.
> 
> Somehow via vp_vdpa that I'm usually testing vDPA patches.
> 
> The problem is that it's very hard to audit all vDPA devices now if it
> is not defined in the spec before they are invented.
> 
> Thanks

vp_vdpa is exactly the part that bothers me. If on the host it is backed
by e.g. vhost-user then it does not work. And you don't know what is
backing it.

OTOH it supports RING_RESET ...

So, proposal: include both this solution and for drivers
vp_vdpa the RING_RESET trick.


Hmm?



> >
> > > >
> > > > > My plan was to convert
> > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > was not explicit enough.
> > > > >
> > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > the architecture:
> > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > * Store vq enable state separately, at
> > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > device directly.
> > > > >
> > > > > But how to recover if the device cannot configure them at kick time,
> > > > > for example?
> > > > >
> > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > >
> > > > > > > ---
> > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > >
> > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > ---
> > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > >  {
> > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > >       u64 __user *featurep = argp;
> > > > > > > -     u64 features;
> > > > > > > +     u64 features, parent_features = 0;
> > > > > > >       long r = 0;
> > > > > > >
> > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > >                       return -EFAULT;
> > > > > > > +             if (ops->get_backend_features)
> > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > +                              parent_features))
> > > > > > >                       return -EOPNOTSUPP;
> > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > --
> > > > > > > 2.39.3
> > > > > >
> > > >
> >
Michael S. Tsirkin July 6, 2023, 6:07 a.m. UTC | #15
On Wed, Jul 05, 2023 at 05:07:11PM -0700, Shannon Nelson wrote:
> On 7/5/23 11:27 AM, Eugenio Perez Martin wrote:
> >
> > On Wed, Jul 5, 2023 at 9:50 AM Jason Wang <jasowang@redhat.com> wrote:
> > > 
> > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > 
> > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > 
> > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > 
> > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > > 
> > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > complain for it.
> > > > > > > > > 
> > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > > 
> > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > 
> > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > > 
> > > > > > > 
> > > > > > > Current devices do not support that semantic.
> > > > > > 
> > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > a kick?
> 
> The PDS vdpa device can deal with a call to .set_vq_ready after DRIVER_OK is
> set.  And I'm told that our VQ activity should start without a kick.
> 
> Our vdpa device FW doesn't currently have support for VIRTIO_F_RING_RESET,
> but I believe it could be added without too much trouble.
> 
> sln
> 

OK it seems clear at least in the current version pds needs
VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.
However can we also code up the RING_RESET path as the default?
Then down the road vendors can choose what to do.





> > > > > > 
> > > > > 
> > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > kick was removed from the series because to check for descriptors
> > > > > after driver_ok, even without a kick, was considered work of the
> > > > > parent driver.
> > > > > 
> > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > will read the ring before the kick actually. I can ask.
> > > > > 
> > > > > Thanks!
> > > > > 
> > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > > 
> > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > > 
> > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > > 
> > > But this reminds me one thing, as the thread is going too long, I
> > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > supported?
> > > 
> > 
> > The problem with that is that the device needs to support all
> > RING_RESET, like to be able to change vq address etc after DRIVER_OK.
> > Not all HW support it.
> > 
> > We just need the subset of having the dataplane freezed until all CVQ
> > commands have been consumed. I'm sure current vDPA code already
> > supports it in some devices, like MLX and PSD.
> > 
> > Thanks!
> > 
> > > Thanks
> > > 
> > > > 
> > > > 
> > > > 
> > > > > > > My plan was to convert
> > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > was not explicit enough.
> > > > > > > 
> > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > the architecture:
> > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > * Store vq enable state separately, at
> > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > device directly.
> > > > > > > 
> > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > for example?
> > > > > > > 
> > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > > 
> > > > > > > Thanks!
> > > > > > > 
> > > > > > > > 
> > > > > > > > > ---
> > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > > 
> > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > ---
> > > > > > > > >   drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > >   1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > >   {
> > > > > > > > >        struct vhost_vdpa *v = filep->private_data;
> > > > > > > > >        struct vhost_dev *d = &v->vdev;
> > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > >        void __user *argp = (void __user *)arg;
> > > > > > > > >        u64 __user *featurep = argp;
> > > > > > > > > -     u64 features;
> > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > >        long r = 0;
> > > > > > > > > 
> > > > > > > > >        if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > >                if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > >                        return -EFAULT;
> > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > >                if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > >                                 BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > >                                 BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > +                              parent_features))
> > > > > > > > >                        return -EOPNOTSUPP;
> > > > > > > > >                if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > >                     !vhost_vdpa_can_suspend(v))
> > > > > > > > > --
> > > > > > > > > 2.39.3
> > > > > > > > 
> > > > > > 
> > > > 
> > > 
> >
Eugenio Perez Martin July 6, 2023, 6:35 a.m. UTC | #16
On Thu, Jul 6, 2023 at 8:07 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jul 05, 2023 at 05:07:11PM -0700, Shannon Nelson wrote:
> > On 7/5/23 11:27 AM, Eugenio Perez Martin wrote:
> > >
> > > On Wed, Jul 5, 2023 at 9:50 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > > >
> > > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > > complain for it.
> > > > > > > > > >
> > > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > > >
> > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > >
> > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Current devices do not support that semantic.
> > > > > > >
> > > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > > a kick?
> >
> > The PDS vdpa device can deal with a call to .set_vq_ready after DRIVER_OK is
> > set.  And I'm told that our VQ activity should start without a kick.
> >
> > Our vdpa device FW doesn't currently have support for VIRTIO_F_RING_RESET,
> > but I believe it could be added without too much trouble.
> >
> > sln
> >
>
> OK it seems clear at least in the current version pds needs
> VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.
> However can we also code up the RING_RESET path as the default?
> Then down the road vendors can choose what to do.
>

Yes, the RING_RESET path can be coded & tested for vp_vdpa, for
example. I'm ok with making it the default, and let
_F_ENABLE_AFTER_DRIVER_OK as a fallback.

>
>
>
>
> > > > > > >
> > > > > >
> > > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > > kick was removed from the series because to check for descriptors
> > > > > > after driver_ok, even without a kick, was considered work of the
> > > > > > parent driver.
> > > > > >
> > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > > will read the ring before the kick actually. I can ask.
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > > >
> > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > > >
> > > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > > >
> > > > But this reminds me one thing, as the thread is going too long, I
> > > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > > supported?
> > > >
> > >
> > > The problem with that is that the device needs to support all
> > > RING_RESET, like to be able to change vq address etc after DRIVER_OK.
> > > Not all HW support it.
> > >
> > > We just need the subset of having the dataplane freezed until all CVQ
> > > commands have been consumed. I'm sure current vDPA code already
> > > supports it in some devices, like MLX and PSD.
> > >
> > > Thanks!
> > >
> > > > Thanks
> > > >
> > > > >
> > > > >
> > > > >
> > > > > > > > My plan was to convert
> > > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > > was not explicit enough.
> > > > > > > >
> > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > > the architecture:
> > > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > > * Store vq enable state separately, at
> > > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > > device directly.
> > > > > > > >
> > > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > > for example?
> > > > > > > >
> > > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > ---
> > > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > > >
> > > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > > ---
> > > > > > > > > >   drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > > >   1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > >   {
> > > > > > > > > >        struct vhost_vdpa *v = filep->private_data;
> > > > > > > > > >        struct vhost_dev *d = &v->vdev;
> > > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > > >        void __user *argp = (void __user *)arg;
> > > > > > > > > >        u64 __user *featurep = argp;
> > > > > > > > > > -     u64 features;
> > > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > > >        long r = 0;
> > > > > > > > > >
> > > > > > > > > >        if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > > >                if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > > >                        return -EFAULT;
> > > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > > >                if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > > >                                 BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > > >                                 BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > > +                              parent_features))
> > > > > > > > > >                        return -EOPNOTSUPP;
> > > > > > > > > >                if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > > >                     !vhost_vdpa_can_suspend(v))
> > > > > > > > > > --
> > > > > > > > > > 2.39.3
> > > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
>
Michael S. Tsirkin July 6, 2023, 6:39 a.m. UTC | #17
On Thu, Jul 06, 2023 at 08:35:30AM +0200, Eugenio Perez Martin wrote:
> On Thu, Jul 6, 2023 at 8:07 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jul 05, 2023 at 05:07:11PM -0700, Shannon Nelson wrote:
> > > On 7/5/23 11:27 AM, Eugenio Perez Martin wrote:
> > > >
> > > > On Wed, Jul 5, 2023 at 9:50 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > > > >
> > > > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > > > complain for it.
> > > > > > > > > > >
> > > > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > > > >
> > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > >
> > > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Current devices do not support that semantic.
> > > > > > > >
> > > > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > > > a kick?
> > >
> > > The PDS vdpa device can deal with a call to .set_vq_ready after DRIVER_OK is
> > > set.  And I'm told that our VQ activity should start without a kick.
> > >
> > > Our vdpa device FW doesn't currently have support for VIRTIO_F_RING_RESET,
> > > but I believe it could be added without too much trouble.
> > >
> > > sln
> > >
> >
> > OK it seems clear at least in the current version pds needs
> > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.
> > However can we also code up the RING_RESET path as the default?
> > Then down the road vendors can choose what to do.
> >
> 
> Yes, the RING_RESET path can be coded & tested for vp_vdpa, for
> example. I'm ok with making it the default, and let
> _F_ENABLE_AFTER_DRIVER_OK as a fallback.

Sounds good.

> >
> >
> >
> >
> > > > > > > >
> > > > > > >
> > > > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > > > kick was removed from the series because to check for descriptors
> > > > > > > after driver_ok, even without a kick, was considered work of the
> > > > > > > parent driver.
> > > > > > >
> > > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > > > will read the ring before the kick actually. I can ask.
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > > > >
> > > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > > > >
> > > > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > > > >
> > > > > But this reminds me one thing, as the thread is going too long, I
> > > > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > > > supported?
> > > > >
> > > >
> > > > The problem with that is that the device needs to support all
> > > > RING_RESET, like to be able to change vq address etc after DRIVER_OK.
> > > > Not all HW support it.
> > > >
> > > > We just need the subset of having the dataplane freezed until all CVQ
> > > > commands have been consumed. I'm sure current vDPA code already
> > > > supports it in some devices, like MLX and PSD.
> > > >
> > > > Thanks!
> > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > > > My plan was to convert
> > > > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > > > was not explicit enough.
> > > > > > > > >
> > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > > > the architecture:
> > > > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > > > * Store vq enable state separately, at
> > > > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > > > device directly.
> > > > > > > > >
> > > > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > > > for example?
> > > > > > > > >
> > > > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > > > ---
> > > > > > > > > > >   drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > > > >   1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > > >   {
> > > > > > > > > > >        struct vhost_vdpa *v = filep->private_data;
> > > > > > > > > > >        struct vhost_dev *d = &v->vdev;
> > > > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > > > >        void __user *argp = (void __user *)arg;
> > > > > > > > > > >        u64 __user *featurep = argp;
> > > > > > > > > > > -     u64 features;
> > > > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > > > >        long r = 0;
> > > > > > > > > > >
> > > > > > > > > > >        if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > > > >                if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > > > >                        return -EFAULT;
> > > > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > > > >                if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > > > >                                 BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > > > >                                 BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > > > +                              parent_features))
> > > > > > > > > > >                        return -EOPNOTSUPP;
> > > > > > > > > > >                if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > > > >                     !vhost_vdpa_can_suspend(v))
> > > > > > > > > > > --
> > > > > > > > > > > 2.39.3
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> >
Eugenio Perez Martin July 6, 2023, 7:05 a.m. UTC | #18
On Thu, Jul 6, 2023 at 3:55 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Jul 5, 2023 at 4:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote:
> > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > >
> > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > complain for it.
> > > > > > > > >
> > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > >
> > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > >
> > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > >
> > > > > > >
> > > > > > > Current devices do not support that semantic.
> > > > > >
> > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > a kick?
> > > > > >
> > > > >
> > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > kick was removed from the series because to check for descriptors
> > > > > after driver_ok, even without a kick, was considered work of the
> > > > > parent driver.
> > > > >
> > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > will read the ring before the kick actually. I can ask.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > >
> > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > >
> > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > >
> > > But this reminds me one thing, as the thread is going too long, I
> > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > supported?
> > >
> > > Thanks
> >
> > I don't see what does one have to do with another ...
> >
> > I think with RING_RESET we had another solution, enable rings
> > mapping them to a zero page, then reset and re-enable later.
>
> As discussed before, this seems to have some problems:
>
> 1) The behaviour is not clarified in the document
> 2) zero is a valid IOVA
>

I think we're not on the same page here.

As I understood, rings mapped to a zero page means essentially an
avail ring whose avail_idx is always 0, offered to the device instead
of the guest's ring. Once all CVQ commands are processed, we use
RING_RESET to switch to the right ring, being guest's or SVQ vring.



> Thanks
>
> >
> > > >
> > > >
> > > >
> > > > > > > My plan was to convert
> > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > was not explicit enough.
> > > > > > >
> > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > the architecture:
> > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > * Store vq enable state separately, at
> > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > device directly.
> > > > > > >
> > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > for example?
> > > > > > >
> > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > >
> > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > ---
> > > > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > >  {
> > > > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > > > >       u64 __user *featurep = argp;
> > > > > > > > > -     u64 features;
> > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > >       long r = 0;
> > > > > > > > >
> > > > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > >                       return -EFAULT;
> > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > +                              parent_features))
> > > > > > > > >                       return -EOPNOTSUPP;
> > > > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > > > --
> > > > > > > > > 2.39.3
> > > > > > > >
> > > > > >
> > > >
> >
>
Jason Wang July 6, 2023, 7:55 a.m. UTC | #19
On Thu, Jul 6, 2023 at 3:06 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
>
> On Thu, Jul 6, 2023 at 3:55 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Wed, Jul 5, 2023 at 4:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote:
> > > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > > >
> > > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > > complain for it.
> > > > > > > > > >
> > > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > > >
> > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > >
> > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Current devices do not support that semantic.
> > > > > > >
> > > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > > a kick?
> > > > > > >
> > > > > >
> > > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > > kick was removed from the series because to check for descriptors
> > > > > > after driver_ok, even without a kick, was considered work of the
> > > > > > parent driver.
> > > > > >
> > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > > will read the ring before the kick actually. I can ask.
> > > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > > >
> > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > > >
> > > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > > >
> > > > But this reminds me one thing, as the thread is going too long, I
> > > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > > supported?
> > > >
> > > > Thanks
> > >
> > > I don't see what does one have to do with another ...
> > >
> > > I think with RING_RESET we had another solution, enable rings
> > > mapping them to a zero page, then reset and re-enable later.
> >
> > As discussed before, this seems to have some problems:
> >
> > 1) The behaviour is not clarified in the document
> > 2) zero is a valid IOVA
> >
>
> I think we're not on the same page here.
>
> As I understood, rings mapped to a zero page means essentially an
> avail ring whose avail_idx is always 0, offered to the device instead
> of the guest's ring. Once all CVQ commands are processed, we use
> RING_RESET to switch to the right ring, being guest's or SVQ vring.

I get this. This seems more complicated in the destination: shadow vq + ASID?

Thanks

>
>
>
> > Thanks
> >
> > >
> > > > >
> > > > >
> > > > >
> > > > > > > > My plan was to convert
> > > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > > was not explicit enough.
> > > > > > > >
> > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > > the architecture:
> > > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > > * Store vq enable state separately, at
> > > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > > device directly.
> > > > > > > >
> > > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > > for example?
> > > > > > > >
> > > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > ---
> > > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > > >
> > > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > > ---
> > > > > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > >  {
> > > > > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > > > > >       u64 __user *featurep = argp;
> > > > > > > > > > -     u64 features;
> > > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > > >       long r = 0;
> > > > > > > > > >
> > > > > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > > >                       return -EFAULT;
> > > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > > +                              parent_features))
> > > > > > > > > >                       return -EOPNOTSUPP;
> > > > > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > > > > --
> > > > > > > > > > 2.39.3
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> >
>
Jason Wang July 6, 2023, 8:03 a.m. UTC | #20
On Thu, Jul 6, 2023 at 3:55 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Jul 6, 2023 at 3:06 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> >
> > On Thu, Jul 6, 2023 at 3:55 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Wed, Jul 5, 2023 at 4:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote:
> > > > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > > > >
> > > > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > > > complain for it.
> > > > > > > > > > >
> > > > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > > > >
> > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > >
> > > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Current devices do not support that semantic.
> > > > > > > >
> > > > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > > > a kick?
> > > > > > > >
> > > > > > >
> > > > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > > > kick was removed from the series because to check for descriptors
> > > > > > > after driver_ok, even without a kick, was considered work of the
> > > > > > > parent driver.
> > > > > > >
> > > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > > > will read the ring before the kick actually. I can ask.
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > > > >
> > > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > > > >
> > > > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > > > >
> > > > > But this reminds me one thing, as the thread is going too long, I
> > > > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > > > supported?
> > > > >
> > > > > Thanks
> > > >
> > > > I don't see what does one have to do with another ...
> > > >
> > > > I think with RING_RESET we had another solution, enable rings
> > > > mapping them to a zero page, then reset and re-enable later.
> > >
> > > As discussed before, this seems to have some problems:
> > >
> > > 1) The behaviour is not clarified in the document
> > > 2) zero is a valid IOVA
> > >
> >
> > I think we're not on the same page here.
> >
> > As I understood, rings mapped to a zero page means essentially an
> > avail ring whose avail_idx is always 0, offered to the device instead
> > of the guest's ring. Once all CVQ commands are processed, we use
> > RING_RESET to switch to the right ring, being guest's or SVQ vring.
>
> I get this. This seems more complicated in the destination: shadow vq + ASID?

So it's something like:

1) set all vq ASID to shadow virtqueue
2) do not add any bufs to data qp (stick 0 as avail index)
3) start to restore states via cvq
4) ring_rest for dataqp
5) set_vq_state for dataqp
6) re-initialize dataqp address etc
7) set data QP ASID to guest
8) set queue_enable

?

Thanks

>
> Thanks
>
> >
> >
> >
> > > Thanks
> > >
> > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > > > My plan was to convert
> > > > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > > > was not explicit enough.
> > > > > > > > >
> > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > > > the architecture:
> > > > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > > > * Store vq enable state separately, at
> > > > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > > > device directly.
> > > > > > > > >
> > > > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > > > for example?
> > > > > > > > >
> > > > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > > > ---
> > > > > > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > > >  {
> > > > > > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > > > > > >       u64 __user *featurep = argp;
> > > > > > > > > > > -     u64 features;
> > > > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > > > >       long r = 0;
> > > > > > > > > > >
> > > > > > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > > > >                       return -EFAULT;
> > > > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > > > +                              parent_features))
> > > > > > > > > > >                       return -EOPNOTSUPP;
> > > > > > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > > > > > --
> > > > > > > > > > > 2.39.3
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > >
> >
Eugenio Perez Martin July 6, 2023, 9:38 a.m. UTC | #21
On Thu, Jul 6, 2023 at 10:03 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Jul 6, 2023 at 3:55 PM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Thu, Jul 6, 2023 at 3:06 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > >
> > > On Thu, Jul 6, 2023 at 3:55 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Wed, Jul 5, 2023 at 4:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote:
> > > > > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > > > > >
> > > > > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > > > > complain for it.
> > > > > > > > > > > >
> > > > > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > > > > >
> > > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > >
> > > > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Current devices do not support that semantic.
> > > > > > > > >
> > > > > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > > > > a kick?
> > > > > > > > >
> > > > > > > >
> > > > > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > > > > kick was removed from the series because to check for descriptors
> > > > > > > > after driver_ok, even without a kick, was considered work of the
> > > > > > > > parent driver.
> > > > > > > >
> > > > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > > > > will read the ring before the kick actually. I can ask.
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > > > > >
> > > > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > > > > >
> > > > > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > > > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > > > > >
> > > > > > But this reminds me one thing, as the thread is going too long, I
> > > > > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > > > > supported?
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > I don't see what does one have to do with another ...
> > > > >
> > > > > I think with RING_RESET we had another solution, enable rings
> > > > > mapping them to a zero page, then reset and re-enable later.
> > > >
> > > > As discussed before, this seems to have some problems:
> > > >
> > > > 1) The behaviour is not clarified in the document
> > > > 2) zero is a valid IOVA
> > > >
> > >
> > > I think we're not on the same page here.
> > >
> > > As I understood, rings mapped to a zero page means essentially an
> > > avail ring whose avail_idx is always 0, offered to the device instead
> > > of the guest's ring. Once all CVQ commands are processed, we use
> > > RING_RESET to switch to the right ring, being guest's or SVQ vring.
> >
> > I get this. This seems more complicated in the destination: shadow vq + ASID?
>
> So it's something like:
>
> 1) set all vq ASID to shadow virtqueue
> 2) do not add any bufs to data qp (stick 0 as avail index)
> 3) start to restore states via cvq
> 4) ring_rest for dataqp
> 5) set_vq_state for dataqp
> 6) re-initialize dataqp address etc
> 7) set data QP ASID to guest
> 8) set queue_enable
>
> ?
>

I think the change of ASID is not needed, as the guest cannot access
the device in that timeframe anyway. Moreover, it may require HW
support. So steps 1 and 7 are not needed.

Apart from that, the process is right.


> Thanks
>
> >
> > Thanks
> >
> > >
> > >
> > >
> > > > Thanks
> > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > > > My plan was to convert
> > > > > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > > > > was not explicit enough.
> > > > > > > > > >
> > > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > > > > the architecture:
> > > > > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > > > > * Store vq enable state separately, at
> > > > > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > > > > device directly.
> > > > > > > > > >
> > > > > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > > > > for example?
> > > > > > > > > >
> > > > > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > > > > >
> > > > > > > > > > Thanks!
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > ---
> > > > > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > > > > >
> > > > > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > > > > ---
> > > > > > > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > > > >  {
> > > > > > > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > > > > > > >       u64 __user *featurep = argp;
> > > > > > > > > > > > -     u64 features;
> > > > > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > > > > >       long r = 0;
> > > > > > > > > > > >
> > > > > > > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > > > > >                       return -EFAULT;
> > > > > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > > > > +                              parent_features))
> > > > > > > > > > > >                       return -EOPNOTSUPP;
> > > > > > > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.39.3
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
>
Jason Wang July 7, 2023, 7:57 a.m. UTC | #22
On Thu, Jul 6, 2023 at 5:39 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
>
> On Thu, Jul 6, 2023 at 10:03 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Thu, Jul 6, 2023 at 3:55 PM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Thu, Jul 6, 2023 at 3:06 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > > >
> > > > On Thu, Jul 6, 2023 at 3:55 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > >
> > > > > On Wed, Jul 5, 2023 at 4:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote:
> > > > > > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > > > > > complain for it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > > >
> > > > > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Current devices do not support that semantic.
> > > > > > > > > >
> > > > > > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > > > > > a kick?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > > > > > kick was removed from the series because to check for descriptors
> > > > > > > > > after driver_ok, even without a kick, was considered work of the
> > > > > > > > > parent driver.
> > > > > > > > >
> > > > > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > > > > > will read the ring before the kick actually. I can ask.
> > > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > > > > > >
> > > > > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > > > > > >
> > > > > > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > > > > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > > > > > >
> > > > > > > But this reminds me one thing, as the thread is going too long, I
> > > > > > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > > > > > supported?
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > I don't see what does one have to do with another ...
> > > > > >
> > > > > > I think with RING_RESET we had another solution, enable rings
> > > > > > mapping them to a zero page, then reset and re-enable later.
> > > > >
> > > > > As discussed before, this seems to have some problems:
> > > > >
> > > > > 1) The behaviour is not clarified in the document
> > > > > 2) zero is a valid IOVA
> > > > >
> > > >
> > > > I think we're not on the same page here.
> > > >
> > > > As I understood, rings mapped to a zero page means essentially an
> > > > avail ring whose avail_idx is always 0, offered to the device instead
> > > > of the guest's ring. Once all CVQ commands are processed, we use
> > > > RING_RESET to switch to the right ring, being guest's or SVQ vring.
> > >
> > > I get this. This seems more complicated in the destination: shadow vq + ASID?
> >
> > So it's something like:
> >
> > 1) set all vq ASID to shadow virtqueue
> > 2) do not add any bufs to data qp (stick 0 as avail index)
> > 3) start to restore states via cvq
> > 4) ring_rest for dataqp
> > 5) set_vq_state for dataqp
> > 6) re-initialize dataqp address etc
> > 7) set data QP ASID to guest
> > 8) set queue_enable
> >
> > ?
> >
>
> I think the change of ASID is not needed, as the guest cannot access
> the device in that timeframe anyway.

Yes but after the restore, we still want to shadow cvq, so ASID is still needed?

Thanks

> Moreover, it may require HW
> support. So steps 1 and 7 are not needed.
>
> Apart from that, the process is right.
>
>
> > Thanks
> >
> > >
> > > Thanks
> > >
> > > >
> > > >
> > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > > > My plan was to convert
> > > > > > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > > > > > was not explicit enough.
> > > > > > > > > > >
> > > > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > > > > > the architecture:
> > > > > > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > > > > > * Store vq enable state separately, at
> > > > > > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > > > > > device directly.
> > > > > > > > > > >
> > > > > > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > > > > > for example?
> > > > > > > > > > >
> > > > > > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > > > > > >
> > > > > > > > > > > Thanks!
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > ---
> > > > > > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > > > > > ---
> > > > > > > > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > > > > >  {
> > > > > > > > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > > > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > > > > > > > >       u64 __user *featurep = argp;
> > > > > > > > > > > > > -     u64 features;
> > > > > > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > > > > > >       long r = 0;
> > > > > > > > > > > > >
> > > > > > > > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > > > > > >                       return -EFAULT;
> > > > > > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > > > > > +                              parent_features))
> > > > > > > > > > > > >                       return -EOPNOTSUPP;
> > > > > > > > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > > > > > > > --
> > > > > > > > > > > > > 2.39.3
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> >
>
Eugenio Perez Martin July 7, 2023, 10:25 a.m. UTC | #23
On Fri, Jul 7, 2023 at 9:57 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Jul 6, 2023 at 5:39 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> >
> > On Thu, Jul 6, 2023 at 10:03 AM Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > On Thu, Jul 6, 2023 at 3:55 PM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > > On Thu, Jul 6, 2023 at 3:06 PM Eugenio Perez Martin <eperezma@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jul 6, 2023 at 3:55 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jul 5, 2023 at 4:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote:
> > > > > > > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > > > > > > complain for it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > > > >
> > > > > > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Current devices do not support that semantic.
> > > > > > > > > > >
> > > > > > > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > > > > > > a kick?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > > > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > > > > > > kick was removed from the series because to check for descriptors
> > > > > > > > > > after driver_ok, even without a kick, was considered work of the
> > > > > > > > > > parent driver.
> > > > > > > > > >
> > > > > > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > > > > > > will read the ring before the kick actually. I can ask.
> > > > > > > > > >
> > > > > > > > > > Thanks!
> > > > > > > > > >
> > > > > > > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > > > > > > >
> > > > > > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > > > > > > >
> > > > > > > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > > > > > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > > > > > > >
> > > > > > > > But this reminds me one thing, as the thread is going too long, I
> > > > > > > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > > > > > > supported?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > > I don't see what does one have to do with another ...
> > > > > > >
> > > > > > > I think with RING_RESET we had another solution, enable rings
> > > > > > > mapping them to a zero page, then reset and re-enable later.
> > > > > >
> > > > > > As discussed before, this seems to have some problems:
> > > > > >
> > > > > > 1) The behaviour is not clarified in the document
> > > > > > 2) zero is a valid IOVA
> > > > > >
> > > > >
> > > > > I think we're not on the same page here.
> > > > >
> > > > > As I understood, rings mapped to a zero page means essentially an
> > > > > avail ring whose avail_idx is always 0, offered to the device instead
> > > > > of the guest's ring. Once all CVQ commands are processed, we use
> > > > > RING_RESET to switch to the right ring, being guest's or SVQ vring.
> > > >
> > > > I get this. This seems more complicated in the destination: shadow vq + ASID?
> > >
> > > So it's something like:
> > >
> > > 1) set all vq ASID to shadow virtqueue
> > > 2) do not add any bufs to data qp (stick 0 as avail index)
> > > 3) start to restore states via cvq
> > > 4) ring_rest for dataqp
> > > 5) set_vq_state for dataqp
> > > 6) re-initialize dataqp address etc
> > > 7) set data QP ASID to guest
> > > 8) set queue_enable
> > >
> > > ?
> > >
> >
> > I think the change of ASID is not needed, as the guest cannot access
> > the device in that timeframe anyway.
>
> Yes but after the restore, we still want to shadow cvq, so ASID is still needed?
>

Device or parent driver support for ASID is needed to shadow only CVQ.
Bue the device may not support the switch of ASID after DRIVER_OK.

Since dataplane can go in passthrough ASID all the time, we don't need
to switch it after DRIVER_OK.

> Thanks
>
> > Moreover, it may require HW
> > support. So steps 1 and 7 are not needed.
> >
> > Apart from that, the process is right.
> >
> >
> > > Thanks
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > >
> > > > >
> > > > > > Thanks
> > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > > > My plan was to convert
> > > > > > > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > > > > > > was not explicit enough.
> > > > > > > > > > > >
> > > > > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > > > > > > the architecture:
> > > > > > > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > > > > > > * Store vq enable state separately, at
> > > > > > > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > > > > > > device directly.
> > > > > > > > > > > >
> > > > > > > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > > > > > > for example?
> > > > > > > > > > > >
> > > > > > > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks!
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > > > > > > ---
> > > > > > > > > > > > > >  drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > > > > > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > > > > > >  {
> > > > > > > > > > > > > >       struct vhost_vdpa *v = filep->private_data;
> > > > > > > > > > > > > >       struct vhost_dev *d = &v->vdev;
> > > > > > > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > > > > > > >       void __user *argp = (void __user *)arg;
> > > > > > > > > > > > > >       u64 __user *featurep = argp;
> > > > > > > > > > > > > > -     u64 features;
> > > > > > > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > > > > > > >       long r = 0;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >       if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > > > > > > >               if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > > > > > > >                       return -EFAULT;
> > > > > > > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > > > > > > >               if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > > > > > > >                                BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > > > > > > +                              parent_features))
> > > > > > > > > > > > > >                       return -EOPNOTSUPP;
> > > > > > > > > > > > > >               if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > > > > > > >                    !vhost_vdpa_can_suspend(v))
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > 2.39.3
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >
>
Si-Wei Liu July 19, 2023, 10:20 p.m. UTC | #24
On 7/5/2023 11:07 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 05, 2023 at 05:07:11PM -0700, Shannon Nelson wrote:
>> On 7/5/23 11:27 AM, Eugenio Perez Martin wrote:
>>> On Wed, Jul 5, 2023 at 9:50 AM Jason Wang <jasowang@redhat.com> wrote:
>>>> On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>> On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
>>>>>> On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>> On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
>>>>>>>> On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>>> On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
>>>>>>>>>> With the current code it is accepted as long as userland send it.
>>>>>>>>>>
>>>>>>>>>> Although userland should not set a feature flag that has not been
>>>>>>>>>> offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
>>>>>>>>>> complain for it.
>>>>>>>>>>
>>>>>>>>>> Since there is no specific reason for any parent to reject that backend
>>>>>>>>>> feature bit when it has been proposed, let's control it at vdpa frontend
>>>>>>>>>> level. Future patches may move this control to the parent driver.
>>>>>>>>>>
>>>>>>>>>> Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
>>>>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>>>> Please do send v3. And again, I don't want to send "after driver ok" hack
>>>>>>>>> upstream at all, I merged it in next just to give it some testing.
>>>>>>>>> We want RING_ACCESS_AFTER_KICK or some such.
>>>>>>>>>
>>>>>>>> Current devices do not support that semantic.
>>>>>>> Which devices specifically access the ring after DRIVER_OK but before
>>>>>>> a kick?
>> The PDS vdpa device can deal with a call to .set_vq_ready after DRIVER_OK is
>> set.  And I'm told that our VQ activity should start without a kick.
>>
>> Our vdpa device FW doesn't currently have support for VIRTIO_F_RING_RESET,
>> but I believe it could be added without too much trouble.
>>
>> sln
>>
> OK it seems clear at least in the current version pds needs
> VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.
> However can we also code up the RING_RESET path as the default?
What's the rationale of making RING_RESET path the default? Noted this 
is on a performance critical path (for live migration downtime), did we 
ever get consensus from every or most hardware vendors that RING_RESET 
has lower cost in terms of latency overall than ENABLE_AFTER_DRIVER_OK? 
I think (RING)RESET in general falls on the slow path for hardware, 
while I assume either RING_RESET or ENABLE_AFTER_DRIVER_OK doesn't 
matters much on software backed vdpa e.g. vp_vdpa. Maybe should make 
ENABLE_AFTER_DRIVER_OK as the default?

-Siwei

> Then down the road vendors can choose what to do.
>
>
>
>
>
>>>>>> Previous versions of the QEMU LM series did a spurious kick to start
>>>>>> traffic at the LM destination [1]. When it was proposed, that spurious
>>>>>> kick was removed from the series because to check for descriptors
>>>>>> after driver_ok, even without a kick, was considered work of the
>>>>>> parent driver.
>>>>>>
>>>>>> I'm ok to go back to this spurious kick, but I'm not sure if the hw
>>>>>> will read the ring before the kick actually. I can ask.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
>>>>> Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
>>>> My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
>>>> sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
>>>>
>>>> But this reminds me one thing, as the thread is going too long, I
>>>> wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
>>>> supported?
>>>>
>>> The problem with that is that the device needs to support all
>>> RING_RESET, like to be able to change vq address etc after DRIVER_OK.
>>> Not all HW support it.
>>>
>>> We just need the subset of having the dataplane freezed until all CVQ
>>> commands have been consumed. I'm sure current vDPA code already
>>> supports it in some devices, like MLX and PSD.
>>>
>>> Thanks!
>>>
>>>> Thanks
>>>>
>>>>>
>>>>>
>>>>>>>> My plan was to convert
>>>>>>>> it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
>>>>>>>> was not explicit enough.
>>>>>>>>
>>>>>>>> The only solution I can see to that is to trap & emulate in the vdpa
>>>>>>>> (parent?) driver, as talked in virtio-comment. But that complicates
>>>>>>>> the architecture:
>>>>>>>> * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
>>>>>>>> * Store vq enable state separately, at
>>>>>>>> vdpa->config->set_vq_ready(true), but not transmit that enable to hw
>>>>>>>> * Store the doorbell state separately, but do not configure it to the
>>>>>>>> device directly.
>>>>>>>>
>>>>>>>> But how to recover if the device cannot configure them at kick time,
>>>>>>>> for example?
>>>>>>>>
>>>>>>>> Maybe we can just fail if the parent driver does not support enabling
>>>>>>>> the vq after DRIVER_OK? That way no new feature flag is needed.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>>>> ---
>>>>>>>>>> Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
>>>>>>>>>> commit. Please let me know if I should send a v3 of [1] instead.
>>>>>>>>>>
>>>>>>>>>> [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
>>>>>>>>>> ---
>>>>>>>>>>    drivers/vhost/vdpa.c | 7 +++++--
>>>>>>>>>>    1 file changed, 5 insertions(+), 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>>>>>>>>>> index e1abf29fed5b..a7e554352351 100644
>>>>>>>>>> --- a/drivers/vhost/vdpa.c
>>>>>>>>>> +++ b/drivers/vhost/vdpa.c
>>>>>>>>>> @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>>>>>>>>>>    {
>>>>>>>>>>         struct vhost_vdpa *v = filep->private_data;
>>>>>>>>>>         struct vhost_dev *d = &v->vdev;
>>>>>>>>>> +     const struct vdpa_config_ops *ops = v->vdpa->config;
>>>>>>>>>>         void __user *argp = (void __user *)arg;
>>>>>>>>>>         u64 __user *featurep = argp;
>>>>>>>>>> -     u64 features;
>>>>>>>>>> +     u64 features, parent_features = 0;
>>>>>>>>>>         long r = 0;
>>>>>>>>>>
>>>>>>>>>>         if (cmd == VHOST_SET_BACKEND_FEATURES) {
>>>>>>>>>>                 if (copy_from_user(&features, featurep, sizeof(features)))
>>>>>>>>>>                         return -EFAULT;
>>>>>>>>>> +             if (ops->get_backend_features)
>>>>>>>>>> +                     parent_features = ops->get_backend_features(v->vdpa);
>>>>>>>>>>                 if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
>>>>>>>>>>                                  BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
>>>>>>>>>>                                  BIT_ULL(VHOST_BACKEND_F_RESUME) |
>>>>>>>>>> -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
>>>>>>>>>> +                              parent_features))
>>>>>>>>>>                         return -EOPNOTSUPP;
>>>>>>>>>>                 if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
>>>>>>>>>>                      !vhost_vdpa_can_suspend(v))
>>>>>>>>>> --
>>>>>>>>>> 2.39.3
> _______________________________________________
> Virtualization mailing list
> Virtualization@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Michael S. Tsirkin July 19, 2023, 10:26 p.m. UTC | #25
On Wed, Jul 19, 2023 at 03:20:03PM -0700, Si-Wei Liu wrote:
> 
> 
> On 7/5/2023 11:07 PM, Michael S. Tsirkin wrote:
> > On Wed, Jul 05, 2023 at 05:07:11PM -0700, Shannon Nelson wrote:
> > > On 7/5/23 11:27 AM, Eugenio Perez Martin wrote:
> > > > On Wed, Jul 5, 2023 at 9:50 AM Jason Wang <jasowang@redhat.com> wrote:
> > > > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
> > > > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
> > > > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
> > > > > > > > > > > With the current code it is accepted as long as userland send it.
> > > > > > > > > > > 
> > > > > > > > > > > Although userland should not set a feature flag that has not been
> > > > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
> > > > > > > > > > > complain for it.
> > > > > > > > > > > 
> > > > > > > > > > > Since there is no specific reason for any parent to reject that backend
> > > > > > > > > > > feature bit when it has been proposed, let's control it at vdpa frontend
> > > > > > > > > > > level. Future patches may move this control to the parent driver.
> > > > > > > > > > > 
> > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
> > > > > > > > > > > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack
> > > > > > > > > > upstream at all, I merged it in next just to give it some testing.
> > > > > > > > > > We want RING_ACCESS_AFTER_KICK or some such.
> > > > > > > > > > 
> > > > > > > > > Current devices do not support that semantic.
> > > > > > > > Which devices specifically access the ring after DRIVER_OK but before
> > > > > > > > a kick?
> > > The PDS vdpa device can deal with a call to .set_vq_ready after DRIVER_OK is
> > > set.  And I'm told that our VQ activity should start without a kick.
> > > 
> > > Our vdpa device FW doesn't currently have support for VIRTIO_F_RING_RESET,
> > > but I believe it could be added without too much trouble.
> > > 
> > > sln
> > > 
> > OK it seems clear at least in the current version pds needs
> > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.
> > However can we also code up the RING_RESET path as the default?
> What's the rationale of making RING_RESET path the default? Noted this is on
> a performance critical path (for live migration downtime), did we ever get
> consensus from every or most hardware vendors that RING_RESET has lower cost
> in terms of latency overall than ENABLE_AFTER_DRIVER_OK? I think (RING)RESET
> in general falls on the slow path for hardware, while I assume either
> RING_RESET or ENABLE_AFTER_DRIVER_OK doesn't matters much on software backed
> vdpa e.g. vp_vdpa. Maybe should make ENABLE_AFTER_DRIVER_OK as the default?
> 
> -Siwei

Coming from the spec RING_RESET has clearer semantics.
As long as we support it it is not critical which one
is the default though.


> > Then down the road vendors can choose what to do.
> > 
> > 
> > 
> > 
> > 
> > > > > > > Previous versions of the QEMU LM series did a spurious kick to start
> > > > > > > traffic at the LM destination [1]. When it was proposed, that spurious
> > > > > > > kick was removed from the series because to check for descriptors
> > > > > > > after driver_ok, even without a kick, was considered work of the
> > > > > > > parent driver.
> > > > > > > 
> > > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw
> > > > > > > will read the ring before the kick actually. I can ask.
> > > > > > > 
> > > > > > > Thanks!
> > > > > > > 
> > > > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
> > > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
> > > > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
> > > > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
> > > > > 
> > > > > But this reminds me one thing, as the thread is going too long, I
> > > > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
> > > > > supported?
> > > > > 
> > > > The problem with that is that the device needs to support all
> > > > RING_RESET, like to be able to change vq address etc after DRIVER_OK.
> > > > Not all HW support it.
> > > > 
> > > > We just need the subset of having the dataplane freezed until all CVQ
> > > > commands have been consumed. I'm sure current vDPA code already
> > > > supports it in some devices, like MLX and PSD.
> > > > 
> > > > Thanks!
> > > > 
> > > > > Thanks
> > > > > 
> > > > > > 
> > > > > > 
> > > > > > > > > My plan was to convert
> > > > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
> > > > > > > > > was not explicit enough.
> > > > > > > > > 
> > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa
> > > > > > > > > (parent?) driver, as talked in virtio-comment. But that complicates
> > > > > > > > > the architecture:
> > > > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
> > > > > > > > > * Store vq enable state separately, at
> > > > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw
> > > > > > > > > * Store the doorbell state separately, but do not configure it to the
> > > > > > > > > device directly.
> > > > > > > > > 
> > > > > > > > > But how to recover if the device cannot configure them at kick time,
> > > > > > > > > for example?
> > > > > > > > > 
> > > > > > > > > Maybe we can just fail if the parent driver does not support enabling
> > > > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed.
> > > > > > > > > 
> > > > > > > > > Thanks!
> > > > > > > > > 
> > > > > > > > > > > ---
> > > > > > > > > > > Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
> > > > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead.
> > > > > > > > > > > 
> > > > > > > > > > > [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
> > > > > > > > > > > ---
> > > > > > > > > > >    drivers/vhost/vdpa.c | 7 +++++--
> > > > > > > > > > >    1 file changed, 5 insertions(+), 2 deletions(-)
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > > > > > > > > > index e1abf29fed5b..a7e554352351 100644
> > > > > > > > > > > --- a/drivers/vhost/vdpa.c
> > > > > > > > > > > +++ b/drivers/vhost/vdpa.c
> > > > > > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > > > > > > > > > >    {
> > > > > > > > > > >         struct vhost_vdpa *v = filep->private_data;
> > > > > > > > > > >         struct vhost_dev *d = &v->vdev;
> > > > > > > > > > > +     const struct vdpa_config_ops *ops = v->vdpa->config;
> > > > > > > > > > >         void __user *argp = (void __user *)arg;
> > > > > > > > > > >         u64 __user *featurep = argp;
> > > > > > > > > > > -     u64 features;
> > > > > > > > > > > +     u64 features, parent_features = 0;
> > > > > > > > > > >         long r = 0;
> > > > > > > > > > > 
> > > > > > > > > > >         if (cmd == VHOST_SET_BACKEND_FEATURES) {
> > > > > > > > > > >                 if (copy_from_user(&features, featurep, sizeof(features)))
> > > > > > > > > > >                         return -EFAULT;
> > > > > > > > > > > +             if (ops->get_backend_features)
> > > > > > > > > > > +                     parent_features = ops->get_backend_features(v->vdpa);
> > > > > > > > > > >                 if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
> > > > > > > > > > >                                  BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
> > > > > > > > > > >                                  BIT_ULL(VHOST_BACKEND_F_RESUME) |
> > > > > > > > > > > -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
> > > > > > > > > > > +                              parent_features))
> > > > > > > > > > >                         return -EOPNOTSUPP;
> > > > > > > > > > >                 if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
> > > > > > > > > > >                      !vhost_vdpa_can_suspend(v))
> > > > > > > > > > > --
> > > > > > > > > > > 2.39.3
> > _______________________________________________
> > Virtualization mailing list
> > Virtualization@lists.linux-foundation.org
> > https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Si-Wei Liu July 20, 2023, 12:03 a.m. UTC | #26
On 7/19/2023 3:26 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 19, 2023 at 03:20:03PM -0700, Si-Wei Liu wrote:
>>
>> On 7/5/2023 11:07 PM, Michael S. Tsirkin wrote:
>>> On Wed, Jul 05, 2023 at 05:07:11PM -0700, Shannon Nelson wrote:
>>>> On 7/5/23 11:27 AM, Eugenio Perez Martin wrote:
>>>>> On Wed, Jul 5, 2023 at 9:50 AM Jason Wang <jasowang@redhat.com> wrote:
>>>>>> On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>> On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote:
>>>>>>>> On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>>> On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote:
>>>>>>>>>> On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>>>>> On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote:
>>>>>>>>>>>> With the current code it is accepted as long as userland send it.
>>>>>>>>>>>>
>>>>>>>>>>>> Although userland should not set a feature flag that has not been
>>>>>>>>>>>> offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not
>>>>>>>>>>>> complain for it.
>>>>>>>>>>>>
>>>>>>>>>>>> Since there is no specific reason for any parent to reject that backend
>>>>>>>>>>>> feature bit when it has been proposed, let's control it at vdpa frontend
>>>>>>>>>>>> level. Future patches may move this control to the parent driver.
>>>>>>>>>>>>
>>>>>>>>>>>> Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature")
>>>>>>>>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>>>>>>>> Please do send v3. And again, I don't want to send "after driver ok" hack
>>>>>>>>>>> upstream at all, I merged it in next just to give it some testing.
>>>>>>>>>>> We want RING_ACCESS_AFTER_KICK or some such.
>>>>>>>>>>>
>>>>>>>>>> Current devices do not support that semantic.
>>>>>>>>> Which devices specifically access the ring after DRIVER_OK but before
>>>>>>>>> a kick?
>>>> The PDS vdpa device can deal with a call to .set_vq_ready after DRIVER_OK is
>>>> set.  And I'm told that our VQ activity should start without a kick.
>>>>
>>>> Our vdpa device FW doesn't currently have support for VIRTIO_F_RING_RESET,
>>>> but I believe it could be added without too much trouble.
>>>>
>>>> sln
>>>>
>>> OK it seems clear at least in the current version pds needs
>>> VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.
>>> However can we also code up the RING_RESET path as the default?
>> What's the rationale of making RING_RESET path the default? Noted this is on
>> a performance critical path (for live migration downtime), did we ever get
>> consensus from every or most hardware vendors that RING_RESET has lower cost
>> in terms of latency overall than ENABLE_AFTER_DRIVER_OK? I think (RING)RESET
>> in general falls on the slow path for hardware, while I assume either
>> RING_RESET or ENABLE_AFTER_DRIVER_OK doesn't matters much on software backed
>> vdpa e.g. vp_vdpa. Maybe should make ENABLE_AFTER_DRIVER_OK as the default?
>>
>> -Siwei
> Coming from the spec RING_RESET has clearer semantics.
Spec doesn't have clearer semantics on vdpa specifics - such as how does 
RING_RESET interoperate with ASID?
> As long as we support it it is not critical which one
> is the default though.
The point is vdpa vendor drivers may implement RING_RESET for a 
different purpose than live migration. In case they support both I don't 
see a reason why it has to fallback to a slower path given there's a 
faster path. May we should leave this to vendor driver to decide, but I 
am not sure.

-Siwei

>
>
>>> Then down the road vendors can choose what to do.
>>>
>>>
>>>
>>>
>>>
>>>>>>>> Previous versions of the QEMU LM series did a spurious kick to start
>>>>>>>> traffic at the LM destination [1]. When it was proposed, that spurious
>>>>>>>> kick was removed from the series because to check for descriptors
>>>>>>>> after driver_ok, even without a kick, was considered work of the
>>>>>>>> parent driver.
>>>>>>>>
>>>>>>>> I'm ok to go back to this spurious kick, but I'm not sure if the hw
>>>>>>>> will read the ring before the kick actually. I can ask.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html
>>>>>>> Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no?
>>>>>> My understanding is [1] assuming ACCESS_AFTER_KICK. This seems
>>>>>> sub-optimal than assuming ENABLE_AFTER_DRIVER_OK.
>>>>>>
>>>>>> But this reminds me one thing, as the thread is going too long, I
>>>>>> wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is
>>>>>> supported?
>>>>>>
>>>>> The problem with that is that the device needs to support all
>>>>> RING_RESET, like to be able to change vq address etc after DRIVER_OK.
>>>>> Not all HW support it.
>>>>>
>>>>> We just need the subset of having the dataplane freezed until all CVQ
>>>>> commands have been consumed. I'm sure current vDPA code already
>>>>> supports it in some devices, like MLX and PSD.
>>>>>
>>>>> Thanks!
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>>
>>>>>>>>>> My plan was to convert
>>>>>>>>>> it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I
>>>>>>>>>> was not explicit enough.
>>>>>>>>>>
>>>>>>>>>> The only solution I can see to that is to trap & emulate in the vdpa
>>>>>>>>>> (parent?) driver, as talked in virtio-comment. But that complicates
>>>>>>>>>> the architecture:
>>>>>>>>>> * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK
>>>>>>>>>> * Store vq enable state separately, at
>>>>>>>>>> vdpa->config->set_vq_ready(true), but not transmit that enable to hw
>>>>>>>>>> * Store the doorbell state separately, but do not configure it to the
>>>>>>>>>> device directly.
>>>>>>>>>>
>>>>>>>>>> But how to recover if the device cannot configure them at kick time,
>>>>>>>>>> for example?
>>>>>>>>>>
>>>>>>>>>> Maybe we can just fail if the parent driver does not support enabling
>>>>>>>>>> the vq after DRIVER_OK? That way no new feature flag is needed.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>>>> ---
>>>>>>>>>>>> Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst
>>>>>>>>>>>> commit. Please let me know if I should send a v3 of [1] instead.
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-mst@kernel.org/T/
>>>>>>>>>>>> ---
>>>>>>>>>>>>     drivers/vhost/vdpa.c | 7 +++++--
>>>>>>>>>>>>     1 file changed, 5 insertions(+), 2 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
>>>>>>>>>>>> index e1abf29fed5b..a7e554352351 100644
>>>>>>>>>>>> --- a/drivers/vhost/vdpa.c
>>>>>>>>>>>> +++ b/drivers/vhost/vdpa.c
>>>>>>>>>>>> @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>>>>>>>>>>>>     {
>>>>>>>>>>>>          struct vhost_vdpa *v = filep->private_data;
>>>>>>>>>>>>          struct vhost_dev *d = &v->vdev;
>>>>>>>>>>>> +     const struct vdpa_config_ops *ops = v->vdpa->config;
>>>>>>>>>>>>          void __user *argp = (void __user *)arg;
>>>>>>>>>>>>          u64 __user *featurep = argp;
>>>>>>>>>>>> -     u64 features;
>>>>>>>>>>>> +     u64 features, parent_features = 0;
>>>>>>>>>>>>          long r = 0;
>>>>>>>>>>>>
>>>>>>>>>>>>          if (cmd == VHOST_SET_BACKEND_FEATURES) {
>>>>>>>>>>>>                  if (copy_from_user(&features, featurep, sizeof(features)))
>>>>>>>>>>>>                          return -EFAULT;
>>>>>>>>>>>> +             if (ops->get_backend_features)
>>>>>>>>>>>> +                     parent_features = ops->get_backend_features(v->vdpa);
>>>>>>>>>>>>                  if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
>>>>>>>>>>>>                                   BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
>>>>>>>>>>>>                                   BIT_ULL(VHOST_BACKEND_F_RESUME) |
>>>>>>>>>>>> -                              BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
>>>>>>>>>>>> +                              parent_features))
>>>>>>>>>>>>                          return -EOPNOTSUPP;
>>>>>>>>>>>>                  if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
>>>>>>>>>>>>                       !vhost_vdpa_can_suspend(v))
>>>>>>>>>>>> --
>>>>>>>>>>>> 2.39.3
>>> _______________________________________________
>>> Virtualization mailing list
>>> Virtualization@lists.linux-foundation.org
>>> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
diff mbox series

Patch

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index e1abf29fed5b..a7e554352351 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -681,18 +681,21 @@  static long vhost_vdpa_unlocked_ioctl(struct file *filep,
 {
 	struct vhost_vdpa *v = filep->private_data;
 	struct vhost_dev *d = &v->vdev;
+	const struct vdpa_config_ops *ops = v->vdpa->config;
 	void __user *argp = (void __user *)arg;
 	u64 __user *featurep = argp;
-	u64 features;
+	u64 features, parent_features = 0;
 	long r = 0;
 
 	if (cmd == VHOST_SET_BACKEND_FEATURES) {
 		if (copy_from_user(&features, featurep, sizeof(features)))
 			return -EFAULT;
+		if (ops->get_backend_features)
+			parent_features = ops->get_backend_features(v->vdpa);
 		if (features & ~(VHOST_VDPA_BACKEND_FEATURES |
 				 BIT_ULL(VHOST_BACKEND_F_SUSPEND) |
 				 BIT_ULL(VHOST_BACKEND_F_RESUME) |
-				 BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK)))
+				 parent_features))
 			return -EOPNOTSUPP;
 		if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) &&
 		     !vhost_vdpa_can_suspend(v))