diff mbox series

[1/4] vdpa: Add stop operation

Message ID 20220520172325.980884-2-eperezma@redhat.com (mailing list archive)
State New, archived
Headers show
Series Implement vdpasim stop operation | expand

Commit Message

Eugenio Perez Martin May 20, 2022, 5:23 p.m. UTC
This operation is optional: It it's not implemented, backend feature bit
will not be exposed.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/linux/vdpa.h | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Si-Wei Liu May 21, 2022, 10:13 a.m. UTC | #1
On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
> This operation is optional: It it's not implemented, backend feature bit
> will not be exposed.
>
> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> ---
>   include/linux/vdpa.h | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> index 15af802d41c4..ddfebc4e1e01 100644
> --- a/include/linux/vdpa.h
> +++ b/include/linux/vdpa.h
> @@ -215,6 +215,11 @@ struct vdpa_map_file {
>    * @reset:			Reset device
>    *				@vdev: vdpa device
>    *				Returns integer: success (0) or error (< 0)
> + * @stop:			Stop or resume the device (optional, but it must
> + *				be implemented if require device stop)
> + *				@vdev: vdpa device
> + *				@stop: stop (true), not stop (false)
> + *				Returns integer: success (0) or error (< 0)
Is this uAPI meant to address all use cases described in the full blown 
_F_STOP virtio spec proposal, such as:

--------------%<--------------

...... the device MUST finish any in flight
operations after the driver writes STOP.  Depending on the device, it 
can do it
in many ways as long as the driver can recover its normal operation if it
resumes the device without the need of resetting it:

- Drain and wait for the completion of all pending requests until a
   convenient avail descriptor. Ignore any other posterior descriptor.
- Return a device-specific failure for these descriptors, so the driver
   can choose to retry or to cancel them.
- Mark them as done even if they are not, if the kind of device can
   assume to lose them.
--------------%<--------------

E.g. do I assume correctly all in flight requests are flushed after 
return from this uAPI call? Or some of pending requests may be subject 
to loss or failure? How does the caller/user specify these various 
options (if there are) for device stop?

BTW, it would be nice to add the corresponding support to vdpa_sim_blk 
as well to demo the stop handling. To just show it on vdpa-sim-net IMHO 
is perhaps not so convincing.

-Siwei

>    * @get_config_size:		Get the size of the configuration space includes
>    *				fields that are conditional on feature bits.
>    *				@vdev: vdpa device
> @@ -316,6 +321,7 @@ struct vdpa_config_ops {
>   	u8 (*get_status)(struct vdpa_device *vdev);
>   	void (*set_status)(struct vdpa_device *vdev, u8 status);
>   	int (*reset)(struct vdpa_device *vdev);
> +	int (*stop)(struct vdpa_device *vdev, bool stop);
>   	size_t (*get_config_size)(struct vdpa_device *vdev);
>   	void (*get_config)(struct vdpa_device *vdev, unsigned int offset,
>   			   void *buf, unsigned int len);
Eugenio Perez Martin May 23, 2022, 7:20 p.m. UTC | #2
On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
> > This operation is optional: It it's not implemented, backend feature bit
> > will not be exposed.
> >
> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> > ---
> >   include/linux/vdpa.h | 6 ++++++
> >   1 file changed, 6 insertions(+)
> >
> > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > index 15af802d41c4..ddfebc4e1e01 100644
> > --- a/include/linux/vdpa.h
> > +++ b/include/linux/vdpa.h
> > @@ -215,6 +215,11 @@ struct vdpa_map_file {
> >    * @reset:                  Reset device
> >    *                          @vdev: vdpa device
> >    *                          Returns integer: success (0) or error (< 0)
> > + * @stop:                    Stop or resume the device (optional, but it must
> > + *                           be implemented if require device stop)
> > + *                           @vdev: vdpa device
> > + *                           @stop: stop (true), not stop (false)
> > + *                           Returns integer: success (0) or error (< 0)
> Is this uAPI meant to address all use cases described in the full blown
> _F_STOP virtio spec proposal, such as:
>
> --------------%<--------------
>
> ...... the device MUST finish any in flight
> operations after the driver writes STOP.  Depending on the device, it
> can do it
> in many ways as long as the driver can recover its normal operation if it
> resumes the device without the need of resetting it:
>
> - Drain and wait for the completion of all pending requests until a
>    convenient avail descriptor. Ignore any other posterior descriptor.
> - Return a device-specific failure for these descriptors, so the driver
>    can choose to retry or to cancel them.
> - Mark them as done even if they are not, if the kind of device can
>    assume to lose them.
> --------------%<--------------
>

Right, this is totally underspecified in this series.

I'll expand on it in the next version, but that text proposed to
virtio-comment was complicated and misleading. I find better to get
the previous version description. Would the next description work?

```
After the return of ioctl, the device MUST finish any pending operations like
in flight requests. It must also preserve all the necessary state (the
virtqueue vring base plus the possible device specific states) that is required
for restoring in the future.

In the future, we will provide features similar to VHOST_USER_GET_INFLIGHT_FD
so the device can save pending operations.
```

Thanks for pointing it out!





> E.g. do I assume correctly all in flight requests are flushed after
> return from this uAPI call? Or some of pending requests may be subject
> to loss or failure? How does the caller/user specify these various
> options (if there are) for device stop?
>
> BTW, it would be nice to add the corresponding support to vdpa_sim_blk
> as well to demo the stop handling. To just show it on vdpa-sim-net IMHO
> is perhaps not so convincing.
>
> -Siwei
>
> >    * @get_config_size:                Get the size of the configuration space includes
> >    *                          fields that are conditional on feature bits.
> >    *                          @vdev: vdpa device
> > @@ -316,6 +321,7 @@ struct vdpa_config_ops {
> >       u8 (*get_status)(struct vdpa_device *vdev);
> >       void (*set_status)(struct vdpa_device *vdev, u8 status);
> >       int (*reset)(struct vdpa_device *vdev);
> > +     int (*stop)(struct vdpa_device *vdev, bool stop);
> >       size_t (*get_config_size)(struct vdpa_device *vdev);
> >       void (*get_config)(struct vdpa_device *vdev, unsigned int offset,
> >                          void *buf, unsigned int len);
>
Si-Wei Liu May 23, 2022, 11:54 p.m. UTC | #3
On 5/23/2022 12:20 PM, Eugenio Perez Martin wrote:
> On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
>>> This operation is optional: It it's not implemented, backend feature bit
>>> will not be exposed.
>>>
>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>> ---
>>>    include/linux/vdpa.h | 6 ++++++
>>>    1 file changed, 6 insertions(+)
>>>
>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>>> index 15af802d41c4..ddfebc4e1e01 100644
>>> --- a/include/linux/vdpa.h
>>> +++ b/include/linux/vdpa.h
>>> @@ -215,6 +215,11 @@ struct vdpa_map_file {
>>>     * @reset:                  Reset device
>>>     *                          @vdev: vdpa device
>>>     *                          Returns integer: success (0) or error (< 0)
>>> + * @stop:                    Stop or resume the device (optional, but it must
>>> + *                           be implemented if require device stop)
>>> + *                           @vdev: vdpa device
>>> + *                           @stop: stop (true), not stop (false)
>>> + *                           Returns integer: success (0) or error (< 0)
>> Is this uAPI meant to address all use cases described in the full blown
>> _F_STOP virtio spec proposal, such as:
>>
>> --------------%<--------------
>>
>> ...... the device MUST finish any in flight
>> operations after the driver writes STOP.  Depending on the device, it
>> can do it
>> in many ways as long as the driver can recover its normal operation if it
>> resumes the device without the need of resetting it:
>>
>> - Drain and wait for the completion of all pending requests until a
>>     convenient avail descriptor. Ignore any other posterior descriptor.
>> - Return a device-specific failure for these descriptors, so the driver
>>     can choose to retry or to cancel them.
>> - Mark them as done even if they are not, if the kind of device can
>>     assume to lose them.
>> --------------%<--------------
>>
> Right, this is totally underspecified in this series.
>
> I'll expand on it in the next version, but that text proposed to
> virtio-comment was complicated and misleading. I find better to get
> the previous version description. Would the next description work?
>
> ```
> After the return of ioctl, the device MUST finish any pending operations like
> in flight requests. It must also preserve all the necessary state (the
> virtqueue vring base plus the possible device specific states)
Hmmm, "possible device specific states" is a bit vague. Does it require 
the device to save any device internal state that is not defined in the 
virtio spec - such as any failed in-flight requests to resubmit upon 
resume? Or you would lean on SVQ to intercept it in depth and save it 
with some other means? I think network device also has internal state 
such as flow steering state that needs bookkeeping as well.

A follow-up question is what is the use of the `stop` argument of false, 
does it require the device to support resume? I seem to recall this is 
something to abandon in favor of device reset plus setting queue 
base/addr after. Or it's just a optional feature that may be device 
specific (if one can do so in simple way).

-Siwei

>   that is required
> for restoring in the future.
>
> In the future, we will provide features similar to VHOST_USER_GET_INFLIGHT_FD
> so the device can save pending operations.
> ```
>
> Thanks for pointing it out!
>
>
>
>
>
>> E.g. do I assume correctly all in flight requests are flushed after
>> return from this uAPI call? Or some of pending requests may be subject
>> to loss or failure? How does the caller/user specify these various
>> options (if there are) for device stop?
>>
>> BTW, it would be nice to add the corresponding support to vdpa_sim_blk
>> as well to demo the stop handling. To just show it on vdpa-sim-net IMHO
>> is perhaps not so convincing.
>>
>> -Siwei
>>
>>>     * @get_config_size:                Get the size of the configuration space includes
>>>     *                          fields that are conditional on feature bits.
>>>     *                          @vdev: vdpa device
>>> @@ -316,6 +321,7 @@ struct vdpa_config_ops {
>>>        u8 (*get_status)(struct vdpa_device *vdev);
>>>        void (*set_status)(struct vdpa_device *vdev, u8 status);
>>>        int (*reset)(struct vdpa_device *vdev);
>>> +     int (*stop)(struct vdpa_device *vdev, bool stop);
>>>        size_t (*get_config_size)(struct vdpa_device *vdev);
>>>        void (*get_config)(struct vdpa_device *vdev, unsigned int offset,
>>>                           void *buf, unsigned int len);
Si-Wei Liu May 24, 2022, 12:01 a.m. UTC | #4
On 5/23/2022 4:54 PM, Si-Wei Liu wrote:
>
>
> On 5/23/2022 12:20 PM, Eugenio Perez Martin wrote:
>> On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <si-wei.liu@oracle.com> 
>> wrote:
>>>
>>>
>>> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
>>>> This operation is optional: It it's not implemented, backend 
>>>> feature bit
>>>> will not be exposed.
>>>>
>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>> ---
>>>>    include/linux/vdpa.h | 6 ++++++
>>>>    1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>>>> index 15af802d41c4..ddfebc4e1e01 100644
>>>> --- a/include/linux/vdpa.h
>>>> +++ b/include/linux/vdpa.h
>>>> @@ -215,6 +215,11 @@ struct vdpa_map_file {
>>>>     * @reset:                  Reset device
>>>>     *                          @vdev: vdpa device
>>>>     *                          Returns integer: success (0) or 
>>>> error (< 0)
>>>> + * @stop:                    Stop or resume the device (optional, 
>>>> but it must
>>>> + *                           be implemented if require device stop)
>>>> + *                           @vdev: vdpa device
>>>> + *                           @stop: stop (true), not stop (false)
>>>> + *                           Returns integer: success (0) or error 
>>>> (< 0)
>>> Is this uAPI meant to address all use cases described in the full blown
>>> _F_STOP virtio spec proposal, such as:
>>>
>>> --------------%<--------------
>>>
>>> ...... the device MUST finish any in flight
>>> operations after the driver writes STOP.  Depending on the device, it
>>> can do it
>>> in many ways as long as the driver can recover its normal operation 
>>> if it
>>> resumes the device without the need of resetting it:
>>>
>>> - Drain and wait for the completion of all pending requests until a
>>>     convenient avail descriptor. Ignore any other posterior descriptor.
>>> - Return a device-specific failure for these descriptors, so the driver
>>>     can choose to retry or to cancel them.
>>> - Mark them as done even if they are not, if the kind of device can
>>>     assume to lose them.
>>> --------------%<--------------
>>>
>> Right, this is totally underspecified in this series.
>>
>> I'll expand on it in the next version, but that text proposed to
>> virtio-comment was complicated and misleading. I find better to get
>> the previous version description. Would the next description work?
>>
>> ```
>> After the return of ioctl, the device MUST finish any pending 
>> operations like
>> in flight requests. It must also preserve all the necessary state (the
>> virtqueue vring base plus the possible device specific states)
> Hmmm, "possible device specific states" is a bit vague. Does it 
> require the device to save any device internal state that is not 
> defined in the virtio spec - such as any failed in-flight requests to 
> resubmit upon resume? Or you would lean on SVQ to intercept it in 
> depth and save it with some other means? I think network device also 
> has internal state such as flow steering state that needs bookkeeping 
> as well.
Noted that I understand you may introduce additional feature call 
similar to VHOST_USER_GET_INFLIGHT_FD for (failed) in-flight request, 
but since that's is a get interface, I assume the actual state 
preserving should still take place in this STOP call.

-Siwei

>
> A follow-up question is what is the use of the `stop` argument of 
> false, does it require the device to support resume? I seem to recall 
> this is something to abandon in favor of device reset plus setting 
> queue base/addr after. Or it's just a optional feature that may be 
> device specific (if one can do so in simple way).
>
> -Siwei
>
>>   that is required
>> for restoring in the future.
>>
>> In the future, we will provide features similar to 
>> VHOST_USER_GET_INFLIGHT_FD
>> so the device can save pending operations.
>> ```
>>
>> Thanks for pointing it out!
>>
>>
>>
>>
>>
>>> E.g. do I assume correctly all in flight requests are flushed after
>>> return from this uAPI call? Or some of pending requests may be subject
>>> to loss or failure? How does the caller/user specify these various
>>> options (if there are) for device stop?
>>>
>>> BTW, it would be nice to add the corresponding support to vdpa_sim_blk
>>> as well to demo the stop handling. To just show it on vdpa-sim-net IMHO
>>> is perhaps not so convincing.
>>>
>>> -Siwei
>>>
>>>>     * @get_config_size: Get the size of the configuration space 
>>>> includes
>>>>     *                          fields that are conditional on 
>>>> feature bits.
>>>>     *                          @vdev: vdpa device
>>>> @@ -316,6 +321,7 @@ struct vdpa_config_ops {
>>>>        u8 (*get_status)(struct vdpa_device *vdev);
>>>>        void (*set_status)(struct vdpa_device *vdev, u8 status);
>>>>        int (*reset)(struct vdpa_device *vdev);
>>>> +     int (*stop)(struct vdpa_device *vdev, bool stop);
>>>>        size_t (*get_config_size)(struct vdpa_device *vdev);
>>>>        void (*get_config)(struct vdpa_device *vdev, unsigned int 
>>>> offset,
>>>>                           void *buf, unsigned int len);
>
Jason Wang May 24, 2022, 2:45 a.m. UTC | #5
在 2022/5/24 08:01, Si-Wei Liu 写道:
>
>
> On 5/23/2022 4:54 PM, Si-Wei Liu wrote:
>>
>>
>> On 5/23/2022 12:20 PM, Eugenio Perez Martin wrote:
>>> On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <si-wei.liu@oracle.com> 
>>> wrote:
>>>>
>>>>
>>>> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
>>>>> This operation is optional: It it's not implemented, backend 
>>>>> feature bit
>>>>> will not be exposed.
>>>>>
>>>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>>>>> ---
>>>>>    include/linux/vdpa.h | 6 ++++++
>>>>>    1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>>>>> index 15af802d41c4..ddfebc4e1e01 100644
>>>>> --- a/include/linux/vdpa.h
>>>>> +++ b/include/linux/vdpa.h
>>>>> @@ -215,6 +215,11 @@ struct vdpa_map_file {
>>>>>     * @reset:                  Reset device
>>>>>     *                          @vdev: vdpa device
>>>>>     *                          Returns integer: success (0) or 
>>>>> error (< 0)
>>>>> + * @stop:                    Stop or resume the device (optional, 
>>>>> but it must
>>>>> + *                           be implemented if require device stop)
>>>>> + *                           @vdev: vdpa device
>>>>> + *                           @stop: stop (true), not stop (false)
>>>>> + *                           Returns integer: success (0) or 
>>>>> error (< 0)
>>>> Is this uAPI meant to address all use cases described in the full 
>>>> blown
>>>> _F_STOP virtio spec proposal, such as:
>>>>
>>>> --------------%<--------------
>>>>
>>>> ...... the device MUST finish any in flight
>>>> operations after the driver writes STOP.  Depending on the device, it
>>>> can do it
>>>> in many ways as long as the driver can recover its normal operation 
>>>> if it
>>>> resumes the device without the need of resetting it:
>>>>
>>>> - Drain and wait for the completion of all pending requests until a
>>>>     convenient avail descriptor. Ignore any other posterior 
>>>> descriptor.
>>>> - Return a device-specific failure for these descriptors, so the 
>>>> driver
>>>>     can choose to retry or to cancel them.
>>>> - Mark them as done even if they are not, if the kind of device can
>>>>     assume to lose them.
>>>> --------------%<--------------
>>>>
>>> Right, this is totally underspecified in this series.
>>>
>>> I'll expand on it in the next version, but that text proposed to
>>> virtio-comment was complicated and misleading. I find better to get
>>> the previous version description. Would the next description work?
>>>
>>> ```
>>> After the return of ioctl, the device MUST finish any pending 
>>> operations like
>>> in flight requests. It must also preserve all the necessary state (the
>>> virtqueue vring base plus the possible device specific states)
>> Hmmm, "possible device specific states" is a bit vague. Does it 
>> require the device to save any device internal state that is not 
>> defined in the virtio spec - such as any failed in-flight requests to 
>> resubmit upon resume? Or you would lean on SVQ to intercept it in 
>> depth and save it with some other means? I think network device also 
>> has internal state such as flow steering state that needs bookkeeping 
>> as well.
> Noted that I understand you may introduce additional feature call 
> similar to VHOST_USER_GET_INFLIGHT_FD for (failed) in-flight request, 
> but since that's is a get interface, I assume the actual state 
> preserving should still take place in this STOP call.
>

Yes, I think so.


> -Siwei
>
>>
>> A follow-up question is what is the use of the `stop` argument of 
>> false, does it require the device to support resume? 


Yes, this is required by the hypervisor e.g for Qemu it supports vm 
stop/resume.


>> I seem to recall this is something to abandon in favor of device 
>> reset plus setting queue base/addr after. Or it's just a optional 
>> feature that may be device specific (if one can do so in simple way).


Rest is more like a workarond consider we don't have a stop API. 
Consider we don't add stop at the beginning, it can only be an optional 
feature.

Thanks


>>
>> -Siwei
>>
>>>   that is required
>>> for restoring in the future.
>>>
>>> In the future, we will provide features similar to 
>>> VHOST_USER_GET_INFLIGHT_FD
>>> so the device can save pending operations.
>>> ```
>>>
>>> Thanks for pointing it out!
>>>
>>>
>>>
>>>
>>>
>>>> E.g. do I assume correctly all in flight requests are flushed after
>>>> return from this uAPI call? Or some of pending requests may be subject
>>>> to loss or failure? How does the caller/user specify these various
>>>> options (if there are) for device stop?
>>>>
>>>> BTW, it would be nice to add the corresponding support to vdpa_sim_blk
>>>> as well to demo the stop handling. To just show it on vdpa-sim-net 
>>>> IMHO
>>>> is perhaps not so convincing.
>>>>
>>>> -Siwei
>>>>
>>>>>     * @get_config_size: Get the size of the configuration space 
>>>>> includes
>>>>>     *                          fields that are conditional on 
>>>>> feature bits.
>>>>>     *                          @vdev: vdpa device
>>>>> @@ -316,6 +321,7 @@ struct vdpa_config_ops {
>>>>>        u8 (*get_status)(struct vdpa_device *vdev);
>>>>>        void (*set_status)(struct vdpa_device *vdev, u8 status);
>>>>>        int (*reset)(struct vdpa_device *vdev);
>>>>> +     int (*stop)(struct vdpa_device *vdev, bool stop);
>>>>>        size_t (*get_config_size)(struct vdpa_device *vdev);
>>>>>        void (*get_config)(struct vdpa_device *vdev, unsigned int 
>>>>> offset,
>>>>>                           void *buf, unsigned int len);
>>
>
Stefano Garzarella May 24, 2022, 7:09 a.m. UTC | #6
On Mon, May 23, 2022 at 09:20:14PM +0200, Eugenio Perez Martin wrote:
>On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>>
>>
>>
>> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
>> > This operation is optional: It it's not implemented, backend feature bit
>> > will not be exposed.
>> >
>> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>> > ---
>> >   include/linux/vdpa.h | 6 ++++++
>> >   1 file changed, 6 insertions(+)
>> >
>> > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>> > index 15af802d41c4..ddfebc4e1e01 100644
>> > --- a/include/linux/vdpa.h
>> > +++ b/include/linux/vdpa.h
>> > @@ -215,6 +215,11 @@ struct vdpa_map_file {
>> >    * @reset:                  Reset device
>> >    *                          @vdev: vdpa device
>> >    *                          Returns integer: success (0) or error (< 0)
>> > + * @stop:                    Stop or resume the device (optional, but it must
>> > + *                           be implemented if require device stop)
>> > + *                           @vdev: vdpa device
>> > + *                           @stop: stop (true), not stop (false)
>> > + *                           Returns integer: success (0) or error (< 0)
>> Is this uAPI meant to address all use cases described in the full blown
>> _F_STOP virtio spec proposal, such as:
>>
>> --------------%<--------------
>>
>> ...... the device MUST finish any in flight
>> operations after the driver writes STOP.  Depending on the device, it
>> can do it
>> in many ways as long as the driver can recover its normal operation 
>> if it
>> resumes the device without the need of resetting it:
>>
>> - Drain and wait for the completion of all pending requests until a
>>    convenient avail descriptor. Ignore any other posterior descriptor.
>> - Return a device-specific failure for these descriptors, so the driver
>>    can choose to retry or to cancel them.
>> - Mark them as done even if they are not, if the kind of device can
>>    assume to lose them.
>> --------------%<--------------
>>
>
>Right, this is totally underspecified in this series.
>
>I'll expand on it in the next version, but that text proposed to
>virtio-comment was complicated and misleading. I find better to get
>the previous version description. Would the next description work?
>
>```
>After the return of ioctl, the device MUST finish any pending operations like
>in flight requests. It must also preserve all the necessary state (the
>virtqueue vring base plus the possible device specific states) that is required
>for restoring in the future.

For block devices wait for all in-flight requests could take several 
time.

Could this be a problem if the caller gets stuck on this ioctl?

If it could be a problem, maybe we should use an eventfd to signal that 
the device is successfully stopped.

Thanks,
Stefano
Eugenio Perez Martin May 24, 2022, 7:38 a.m. UTC | #7
On Tue, May 24, 2022 at 2:01 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>
>
> On 5/23/2022 4:54 PM, Si-Wei Liu wrote:
> >
> >
> > On 5/23/2022 12:20 PM, Eugenio Perez Martin wrote:
> >> On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <si-wei.liu@oracle.com>
> >> wrote:
> >>>
> >>>
> >>> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
> >>>> This operation is optional: It it's not implemented, backend
> >>>> feature bit
> >>>> will not be exposed.
> >>>>
> >>>> Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >>>> ---
> >>>>    include/linux/vdpa.h | 6 ++++++
> >>>>    1 file changed, 6 insertions(+)
> >>>>
> >>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> >>>> index 15af802d41c4..ddfebc4e1e01 100644
> >>>> --- a/include/linux/vdpa.h
> >>>> +++ b/include/linux/vdpa.h
> >>>> @@ -215,6 +215,11 @@ struct vdpa_map_file {
> >>>>     * @reset:                  Reset device
> >>>>     *                          @vdev: vdpa device
> >>>>     *                          Returns integer: success (0) or
> >>>> error (< 0)
> >>>> + * @stop:                    Stop or resume the device (optional,
> >>>> but it must
> >>>> + *                           be implemented if require device stop)
> >>>> + *                           @vdev: vdpa device
> >>>> + *                           @stop: stop (true), not stop (false)
> >>>> + *                           Returns integer: success (0) or error
> >>>> (< 0)
> >>> Is this uAPI meant to address all use cases described in the full blown
> >>> _F_STOP virtio spec proposal, such as:
> >>>
> >>> --------------%<--------------
> >>>
> >>> ...... the device MUST finish any in flight
> >>> operations after the driver writes STOP.  Depending on the device, it
> >>> can do it
> >>> in many ways as long as the driver can recover its normal operation
> >>> if it
> >>> resumes the device without the need of resetting it:
> >>>
> >>> - Drain and wait for the completion of all pending requests until a
> >>>     convenient avail descriptor. Ignore any other posterior descriptor.
> >>> - Return a device-specific failure for these descriptors, so the driver
> >>>     can choose to retry or to cancel them.
> >>> - Mark them as done even if they are not, if the kind of device can
> >>>     assume to lose them.
> >>> --------------%<--------------
> >>>
> >> Right, this is totally underspecified in this series.
> >>
> >> I'll expand on it in the next version, but that text proposed to
> >> virtio-comment was complicated and misleading. I find better to get
> >> the previous version description. Would the next description work?
> >>
> >> ```
> >> After the return of ioctl, the device MUST finish any pending
> >> operations like
> >> in flight requests. It must also preserve all the necessary state (the
> >> virtqueue vring base plus the possible device specific states)
> > Hmmm, "possible device specific states" is a bit vague. Does it
> > require the device to save any device internal state that is not
> > defined in the virtio spec - such as any failed in-flight requests to
> > resubmit upon resume?

I'd let that be device-specific. For example, the net simulator
doesn't need to store them, since it cannot stop while processing
buffers. Other net devices can also decide to simply drop or re-submit
tx frames.

I can check for the block simulator if that's possible too. For
hardware vdpa block devices, this should be combined with the future
"get inflight buffers" call for sure.

> > Or you would lean on SVQ to intercept it in
> > depth and save it with some other means? I think network device also
> > has internal state such as flow steering state that needs bookkeeping
> > as well.

Yes, for state set by the control vq a permanent SVQ is used only for
the cvq. For other things like config space vdpa already presents an
emulated one to the guest, so we're safe in that regard.

> Noted that I understand you may introduce additional feature call
> similar to VHOST_USER_GET_INFLIGHT_FD for (failed) in-flight request,
> but since that's is a get interface, I assume the actual state
> preserving should still take place in this STOP call.
>

Right. I'll add all of this to the proposal.

Thanks!

> -Siwei
>
> >
> > A follow-up question is what is the use of the `stop` argument of
> > false, does it require the device to support resume? I seem to recall
> > this is something to abandon in favor of device reset plus setting
> > queue base/addr after. Or it's just a optional feature that may be
> > device specific (if one can do so in simple way).
> >
> > -Siwei
> >
> >>   that is required
> >> for restoring in the future.
> >>
> >> In the future, we will provide features similar to
> >> VHOST_USER_GET_INFLIGHT_FD
> >> so the device can save pending operations.
> >> ```
> >>
> >> Thanks for pointing it out!
> >>
> >>
> >>
> >>
> >>
> >>> E.g. do I assume correctly all in flight requests are flushed after
> >>> return from this uAPI call? Or some of pending requests may be subject
> >>> to loss or failure? How does the caller/user specify these various
> >>> options (if there are) for device stop?
> >>>
> >>> BTW, it would be nice to add the corresponding support to vdpa_sim_blk
> >>> as well to demo the stop handling. To just show it on vdpa-sim-net IMHO
> >>> is perhaps not so convincing.
> >>>
> >>> -Siwei
> >>>
> >>>>     * @get_config_size: Get the size of the configuration space
> >>>> includes
> >>>>     *                          fields that are conditional on
> >>>> feature bits.
> >>>>     *                          @vdev: vdpa device
> >>>> @@ -316,6 +321,7 @@ struct vdpa_config_ops {
> >>>>        u8 (*get_status)(struct vdpa_device *vdev);
> >>>>        void (*set_status)(struct vdpa_device *vdev, u8 status);
> >>>>        int (*reset)(struct vdpa_device *vdev);
> >>>> +     int (*stop)(struct vdpa_device *vdev, bool stop);
> >>>>        size_t (*get_config_size)(struct vdpa_device *vdev);
> >>>>        void (*get_config)(struct vdpa_device *vdev, unsigned int
> >>>> offset,
> >>>>                           void *buf, unsigned int len);
> >
>
Eugenio Perez Martin May 24, 2022, 7:42 a.m. UTC | #8
On Tue, May 24, 2022 at 9:09 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> On Mon, May 23, 2022 at 09:20:14PM +0200, Eugenio Perez Martin wrote:
> >On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
> >>
> >>
> >>
> >> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
> >> > This operation is optional: It it's not implemented, backend feature bit
> >> > will not be exposed.
> >> >
> >> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
> >> > ---
> >> >   include/linux/vdpa.h | 6 ++++++
> >> >   1 file changed, 6 insertions(+)
> >> >
> >> > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> >> > index 15af802d41c4..ddfebc4e1e01 100644
> >> > --- a/include/linux/vdpa.h
> >> > +++ b/include/linux/vdpa.h
> >> > @@ -215,6 +215,11 @@ struct vdpa_map_file {
> >> >    * @reset:                  Reset device
> >> >    *                          @vdev: vdpa device
> >> >    *                          Returns integer: success (0) or error (< 0)
> >> > + * @stop:                    Stop or resume the device (optional, but it must
> >> > + *                           be implemented if require device stop)
> >> > + *                           @vdev: vdpa device
> >> > + *                           @stop: stop (true), not stop (false)
> >> > + *                           Returns integer: success (0) or error (< 0)
> >> Is this uAPI meant to address all use cases described in the full blown
> >> _F_STOP virtio spec proposal, such as:
> >>
> >> --------------%<--------------
> >>
> >> ...... the device MUST finish any in flight
> >> operations after the driver writes STOP.  Depending on the device, it
> >> can do it
> >> in many ways as long as the driver can recover its normal operation
> >> if it
> >> resumes the device without the need of resetting it:
> >>
> >> - Drain and wait for the completion of all pending requests until a
> >>    convenient avail descriptor. Ignore any other posterior descriptor.
> >> - Return a device-specific failure for these descriptors, so the driver
> >>    can choose to retry or to cancel them.
> >> - Mark them as done even if they are not, if the kind of device can
> >>    assume to lose them.
> >> --------------%<--------------
> >>
> >
> >Right, this is totally underspecified in this series.
> >
> >I'll expand on it in the next version, but that text proposed to
> >virtio-comment was complicated and misleading. I find better to get
> >the previous version description. Would the next description work?
> >
> >```
> >After the return of ioctl, the device MUST finish any pending operations like
> >in flight requests. It must also preserve all the necessary state (the
> >virtqueue vring base plus the possible device specific states) that is required
> >for restoring in the future.
>
> For block devices wait for all in-flight requests could take several
> time.
>
> Could this be a problem if the caller gets stuck on this ioctl?
>
> If it could be a problem, maybe we should use an eventfd to signal that
> the device is successfully stopped.
>

For that particular problem I'd very much prefer to add directly an
ioctl to get the inflight descriptors. We know for sure we will need
them, and it will be cleaner in the long run.

As I understand the vdpa block simulator, there is no need to return
the inflight descriptors since all of the requests are processed in a
synchronous way. So, for this iteration, we could offer the stop
feature to qemu.

Other non-simulated devices would need it. Could it be delayed to
future development?

Thanks!

> Thanks,
> Stefano
>
Stefano Garzarella May 24, 2022, 7:51 a.m. UTC | #9
On Tue, May 24, 2022 at 09:42:06AM +0200, Eugenio Perez Martin wrote:
>On Tue, May 24, 2022 at 9:09 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>>
>> On Mon, May 23, 2022 at 09:20:14PM +0200, Eugenio Perez Martin wrote:
>> >On Sat, May 21, 2022 at 12:13 PM Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>> >>
>> >>
>> >>
>> >> On 5/20/2022 10:23 AM, Eugenio Pérez wrote:
>> >> > This operation is optional: It it's not implemented, backend feature bit
>> >> > will not be exposed.
>> >> >
>> >> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
>> >> > ---
>> >> >   include/linux/vdpa.h | 6 ++++++
>> >> >   1 file changed, 6 insertions(+)
>> >> >
>> >> > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
>> >> > index 15af802d41c4..ddfebc4e1e01 100644
>> >> > --- a/include/linux/vdpa.h
>> >> > +++ b/include/linux/vdpa.h
>> >> > @@ -215,6 +215,11 @@ struct vdpa_map_file {
>> >> >    * @reset:                  Reset device
>> >> >    *                          @vdev: vdpa device
>> >> >    *                          Returns integer: success (0) or error (< 0)
>> >> > + * @stop:                    Stop or resume the device (optional, but it must
>> >> > + *                           be implemented if require device stop)
>> >> > + *                           @vdev: vdpa device
>> >> > + *                           @stop: stop (true), not stop (false)
>> >> > + *                           Returns integer: success (0) or error (< 0)
>> >> Is this uAPI meant to address all use cases described in the full blown
>> >> _F_STOP virtio spec proposal, such as:
>> >>
>> >> --------------%<--------------
>> >>
>> >> ...... the device MUST finish any in flight
>> >> operations after the driver writes STOP.  Depending on the device, it
>> >> can do it
>> >> in many ways as long as the driver can recover its normal operation
>> >> if it
>> >> resumes the device without the need of resetting it:
>> >>
>> >> - Drain and wait for the completion of all pending requests until a
>> >>    convenient avail descriptor. Ignore any other posterior descriptor.
>> >> - Return a device-specific failure for these descriptors, so the driver
>> >>    can choose to retry or to cancel them.
>> >> - Mark them as done even if they are not, if the kind of device can
>> >>    assume to lose them.
>> >> --------------%<--------------
>> >>
>> >
>> >Right, this is totally underspecified in this series.
>> >
>> >I'll expand on it in the next version, but that text proposed to
>> >virtio-comment was complicated and misleading. I find better to get
>> >the previous version description. Would the next description work?
>> >
>> >```
>> >After the return of ioctl, the device MUST finish any pending operations like
>> >in flight requests. It must also preserve all the necessary state (the
>> >virtqueue vring base plus the possible device specific states) that is required
>> >for restoring in the future.
>>
>> For block devices wait for all in-flight requests could take several
>> time.
>>
>> Could this be a problem if the caller gets stuck on this ioctl?
>>
>> If it could be a problem, maybe we should use an eventfd to signal that
>> the device is successfully stopped.
>>
>
>For that particular problem I'd very much prefer to add directly an
>ioctl to get the inflight descriptors. We know for sure we will need
>them, and it will be cleaner in the long run.

Makes sense!

>
>As I understand the vdpa block simulator, there is no need to return
>the inflight descriptors since all of the requests are processed in a
>synchronous way. So, for this iteration, we could offer the stop
>feature to qemu.

Right, the simulator handles everything synchronously.

>
>Other non-simulated devices would need it. Could it be delayed to
>future development?

Yep, sure, it sounds like you already have a plan, so no problem :-)

Thanks,
Stefano
diff mbox series

Patch

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 15af802d41c4..ddfebc4e1e01 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -215,6 +215,11 @@  struct vdpa_map_file {
  * @reset:			Reset device
  *				@vdev: vdpa device
  *				Returns integer: success (0) or error (< 0)
+ * @stop:			Stop or resume the device (optional, but it must
+ *				be implemented if require device stop)
+ *				@vdev: vdpa device
+ *				@stop: stop (true), not stop (false)
+ *				Returns integer: success (0) or error (< 0)
  * @get_config_size:		Get the size of the configuration space includes
  *				fields that are conditional on feature bits.
  *				@vdev: vdpa device
@@ -316,6 +321,7 @@  struct vdpa_config_ops {
 	u8 (*get_status)(struct vdpa_device *vdev);
 	void (*set_status)(struct vdpa_device *vdev, u8 status);
 	int (*reset)(struct vdpa_device *vdev);
+	int (*stop)(struct vdpa_device *vdev, bool stop);
 	size_t (*get_config_size)(struct vdpa_device *vdev);
 	void (*get_config)(struct vdpa_device *vdev, unsigned int offset,
 			   void *buf, unsigned int len);