
[V3,5/6] vDPA: answer num of queue pairs = 1 to userspace when VIRTIO_NET_F_MQ == 0

Message ID 20220701132826.8132-6-lingshan.zhu@intel.com (mailing list archive)
State Not Applicable
Series ifcvf/vDPA: support query device config space through netlink

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Zhu, Lingshan July 1, 2022, 1:28 p.m. UTC
If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair,
so when userspace queries the number of queue pairs, the kernel should
report one queue pair rather than zero.

vdpa_dev_net_config_fill() fills in the attributes of the vDPA device
itself, so the feature bits it passes to vdpa_dev_net_mq_config_fill()
should be features_device rather than features_driver.

Before this change, when MQ = 0, iproute2 output:
$vdpa dev config show vdpa0
vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
max_vq_pairs 0 mtu 1500

After applying this commit, when MQ = 0, iproute2 output:
$vdpa dev config show vdpa0
vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
max_vq_pairs 1 mtu 1500

Fixes: a64917bc2e9b ("vdpa: Provide interface to read driver features")
Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
---
 drivers/vdpa/vdpa.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
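
The behaviour this commit message proposes can be sketched outside the kernel as follows. This is only a minimal illustration, not the kernel code: `report_max_vq_pairs()` and `struct net_cfg` are hypothetical names, and the config field is assumed to be stored as two little-endian bytes. Bit 22 is the position the virtio-net specification assigns to VIRTIO_NET_F_MQ.

```c
#include <stdint.h>

/* Bit position of VIRTIO_NET_F_MQ in the virtio-net feature bits. */
#define NET_F_MQ_BIT 22

/* Hypothetical stand-in for the one config-space field used here;
 * the value is assumed to be stored as two little-endian bytes. */
struct net_cfg {
	uint8_t max_virtqueue_pairs_le[2];
};

/* Proposed semantics: without MQ, report one queue pair instead of
 * omitting the value; with MQ, report the config-space field. */
static uint16_t report_max_vq_pairs(uint64_t device_features,
				    const struct net_cfg *cfg)
{
	if ((device_features & (1ULL << NET_F_MQ_BIT)) == 0)
		return 1;

	return (uint16_t)(cfg->max_virtqueue_pairs_le[0] |
			  (cfg->max_virtqueue_pairs_le[1] << 8));
}
```

Note that the sketch takes the device feature bits as input, which mirrors the switch from features_driver to features_device described above.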

Comments

Parav Pandit July 1, 2022, 10:07 p.m. UTC | #1
> From: Zhu Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 1, 2022 9:28 AM
> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair, so
> when userspace querying queue pair numbers, it should return mq=1 than
> zero.
> 
> Function vdpa_dev_net_config_fill() fills the attributions of the vDPA
> devices, so that it should call vdpa_dev_net_mq_config_fill() so the
> parameter in vdpa_dev_net_mq_config_fill() should be feature_device than
> feature_driver for the vDPA devices themselves
> 
> Before this change, when MQ = 0, iproute2 output:
> $vdpa dev config show vdpa0
> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 0
> mtu 1500
>
The fix belongs to user space.
When the _MQ feature bit is not negotiated, the vdpa kernel code does not add the VDPA_ATTR_DEV_NET_CFG_MAX_VQP attribute.
When the kernel does not return that attribute, iproute2 should not show max_vq_pairs.

We have many config space fields that depend on feature bits, and some of them do not have any defaults.
To keep the presence of config space fields consistent across all of them, we don't want to report a default as the patch below does.

Please fix iproute2 to not print max_vq_pairs when it is not returned by the kernel.
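
A rough illustration of the userspace behaviour requested here, assuming the tool records which optional attributes were present in the kernel reply. This is not iproute2 code: `struct cfg_reply` and `format_cfg()` are hypothetical names used only for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical parsed reply: one presence flag per optional attribute. */
struct cfg_reply {
	bool has_max_vqp;
	uint16_t max_vqp;
	bool has_mtu;
	uint16_t mtu;
};

/* Format only the fields the kernel actually returned; a missing
 * attribute is left out instead of being printed as 0. */
static void format_cfg(char *buf, size_t len, const struct cfg_reply *r)
{
	size_t off = 0;

	if (len == 0)
		return;
	buf[0] = '\0';

	if (r->has_max_vqp) {
		int n = snprintf(buf, len, "max_vq_pairs %u ",
				 (unsigned)r->max_vqp);
		if (n < 0 || (size_t)n >= len)
			return;	/* output truncated, stop here */
		off = (size_t)n;
	}
	if (r->has_mtu)
		snprintf(buf + off, len - off, "mtu %u", (unsigned)r->mtu);
}
```

With `has_max_vqp` unset the output contains no max_vq_pairs field at all, which is the distinction being argued for.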
 
> After applying this commit, when MQ = 0, iproute2 output:
> $vdpa dev config show vdpa0
> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 1
> mtu 1500
> 
> Fixes: a64917bc2e9b (vdpa: Provide interface to read driver features)
> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> ---
>  drivers/vdpa/vdpa.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
> index d76b22b2f7ae..846dd37f3549 100644
> --- a/drivers/vdpa/vdpa.c
> +++ b/drivers/vdpa/vdpa.c
> @@ -806,9 +806,10 @@ static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
>  	u16 val_u16;
>
>  	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
> -		return 0;
> +		val_u16 = 1;
> +	else
> +		val_u16 = __virtio16_to_cpu(true, config->max_virtqueue_pairs);
>
> -	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
>  	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, val_u16);
>  }
>
> @@ -842,7 +843,7 @@ static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
>  			      VDPA_ATTR_PAD))
>  		return -EMSGSIZE;
>
> -	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver, &config);
> +	return vdpa_dev_net_mq_config_fill(vdev, msg, features_device, &config);
>  }
>
>  static int
> --
> 2.31.1
Zhu, Lingshan July 8, 2022, 6:21 a.m. UTC | #2
On 7/2/2022 6:07 AM, Parav Pandit wrote:
>
>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 1, 2022 9:28 AM
>> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair, so
>> when userspace querying queue pair numbers, it should return mq=1 than
>> zero.
>>
>> Function vdpa_dev_net_config_fill() fills the attributions of the vDPA
>> devices, so that it should call vdpa_dev_net_mq_config_fill() so the
>> parameter in vdpa_dev_net_mq_config_fill() should be feature_device than
>> feature_driver for the vDPA devices themselves
>>
>> Before this change, when MQ = 0, iproute2 output:
>> $vdpa dev config show vdpa0
>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 0
>> mtu 1500
>>
> The fix belongs to user space.
> When a feature bit _MQ is not negotiated, vdpa kernel space will not add attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> When such attribute is not returned by kernel, max_vq_pairs should not be shown by the iproute2.
I think the userspace tool does not need to care whether MQ is offered or
negotiated; it just needs to read the number of queues.
Without MQ it is not the case that there are no queues: a virtio-net device
still has one queue pair, which means two queues.

Otherwise, how can you tell the user that there are only two queues? End users
may not know this is the default, and they may misread it
as an error or a defect.
>
> We have many config space fields that depend on the feature bits and some of them do not have any defaults.
> To keep consistency of existence of config space fields among all, we don't want to show default like below.
>
> Please fix the iproute2 to not print max_vq_pairs when it is not returned by the kernel.
>   
Parav Pandit July 8, 2022, 4:23 p.m. UTC | #3
> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Friday, July 8, 2022 2:21 AM
> 
> 
> On 7/2/2022 6:07 AM, Parav Pandit wrote:
> >
> >> From: Zhu Lingshan <lingshan.zhu@intel.com>
> >> Sent: Friday, July 1, 2022 9:28 AM
> >> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue
> >> pair, so when userspace querying queue pair numbers, it should return
> >> mq=1 than zero.
> >>
> >> Function vdpa_dev_net_config_fill() fills the attributions of the
> >> vDPA devices, so that it should call vdpa_dev_net_mq_config_fill() so
> >> the parameter in vdpa_dev_net_mq_config_fill() should be
> >> feature_device than feature_driver for the vDPA devices themselves
> >>
> >> Before this change, when MQ = 0, iproute2 output:
> >> $vdpa dev config show vdpa0
> >> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs
> >> 0 mtu 1500
> >>
> > The fix belongs to user space.
> > When a feature bit _MQ is not negotiated, vdpa kernel space will not add
> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> > When such attribute is not returned by kernel, max_vq_pairs should not
> be shown by the iproute2.
> I think userspace tool does not need to care whether MQ is offered or
> negotiated, it just needs to read the number of queues there, so if no MQ, it
> is not "not any queues", there are still 1 queue pair to be a virtio-net device,
> means two queues.
> 
> If not, how can you tell the user there are only 2 queues? The end users may
> don't know this is default. They may misunderstand this as an error or
> defects.
> >
When max_vq_pairs is not shown, it means that the device didn’t expose the MAX_VQ_PAIRS attribute to its guest users
(because _MQ was not negotiated).
It is not an error or a defect.
It shows precisely what is exposed.

User space will care when it wants to turn the _MQ feature bit and the MAX_QP value off or on.

Showing a max_vq_pairs of 1 even when _MQ is not negotiated incorrectly says that max_vq_pairs is exposed to the guest, when it is not offered.

So, please fix iproute2 to not print max_vq_pairs when it is not returned by the kernel.

> > We have many config space fields that depend on the feature bits and
> some of them do not have any defaults.
> > To keep consistency of existence of config space fields among all, we don't
> want to show default like below.
> >
> > Please fix the iproute2 to not print max_vq_pairs when it is not returned
> by the kernel.
> >
Zhu, Lingshan July 11, 2022, 2:29 a.m. UTC | #4
On 7/9/2022 12:23 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Friday, July 8, 2022 2:21 AM
>>
>>
>> On 7/2/2022 6:07 AM, Parav Pandit wrote:
>>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Friday, July 1, 2022 9:28 AM
>>>> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue
>>>> pair, so when userspace querying queue pair numbers, it should return
>>>> mq=1 than zero.
>>>>
>>>> Function vdpa_dev_net_config_fill() fills the attributions of the
>>>> vDPA devices, so that it should call vdpa_dev_net_mq_config_fill() so
>>>> the parameter in vdpa_dev_net_mq_config_fill() should be
>>>> feature_device than feature_driver for the vDPA devices themselves
>>>>
>>>> Before this change, when MQ = 0, iproute2 output:
>>>> $vdpa dev config show vdpa0
>>>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs
>>>> 0 mtu 1500
>>>>
>>> The fix belongs to user space.
>>> When a feature bit _MQ is not negotiated, vdpa kernel space will not add
>> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
>>> When such attribute is not returned by kernel, max_vq_pairs should not
>> be shown by the iproute2.
>> I think userspace tool does not need to care whether MQ is offered or
>> negotiated, it just needs to read the number of queues there, so if no MQ, it
>> is not "not any queues", there are still 1 queue pair to be a virtio-net device,
>> means two queues.
>>
>> If not, how can you tell the user there are only 2 queues? The end users may
>> don't know this is default. They may misunderstand this as an error or
>> defects.
> When max_vq_pairs is not shown, it means that device didn’t expose MAX_VQ_PAIRS attribute to its guest users.
> (Because _MQ was not negotiated).
> It is not error or defect.
> It precisely shows what is exposed.
>
> User space will care when it wants to turn off/on _MQ feature bits and MAX_QP values.
>
> Showing max_vq_pairs of 1 even when _MQ is not negotiated, incorrectly says that max_vq_pairs is exposed to the guest, but it is not offered.
>
> So, please fix the iproute2 to not print max_vq_pairs when it is not returned by the kernel.
iproute2 can report whether the MQ feature is in the device / driver
feature bits.
I think iproute2 only queries the maximum number of queues here.

max_vq_pairs shows how many queue pairs there are. This attribute's existence does not depend on MQ:
without MQ there is still one queue pair, so just show one.

>
>>> We have many config space fields that depend on the feature bits and
>> some of them do not have any defaults.
>>> To keep consistency of existence of config space fields among all, we don't
>> want to show default like below.
>>> Please fix the iproute2 to not print max_vq_pairs when it is not returned
>> by the kernel.
Parav Pandit July 12, 2022, 4:48 p.m. UTC | #5
> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Sunday, July 10, 2022 10:30 PM

> > Showing max_vq_pairs of 1 even when _MQ is not negotiated, incorrectly
> says that max_vq_pairs is exposed to the guest, but it is not offered.
> >
> > So, please fix the iproute2 to not print max_vq_pairs when it is not
> returned by the kernel.
> iproute2 can report whether there is MQ feature in the device / driver
> feature bits.
> I think iproute2 only queries the number of max queues here.
> 
> max_vq_pairs shows how many queue pairs there, this attribute's existence
> does not depend on MQ, if no MQ, there are still one queue pair, so just
> show one.
This netlink attribute's existence depends on the existence of the _MQ feature bit.
We can break that and report the value, but if we do, there are many other config space fields that don’t have a good default the way max_vq_pairs does.
That leaves ambiguity about what to do with them, both in user space and in kernel space.
Instead of dealing with them differently in the kernel, at present we attach each netlink attribute to its respective feature bit wherever applicable.
That way the code in kernel and user space handles them uniformly.
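
The uniform rule described in this reply, one feature bit gating each config field, can be sketched as a small table. This is an illustration of the design being argued for, not the kernel's code: `enum cfg_field` and `reported_fields()` are hypothetical, and the sketch does not claim that the kernel gates exactly these fields. The bit positions (MTU 3, MAC 5, MQ 22) are the ones the virtio-net specification assigns.

```c
#include <stdint.h>

/* Feature bit positions from the virtio-net specification. */
#define NET_F_MTU_BIT 3
#define NET_F_MAC_BIT 5
#define NET_F_MQ_BIT  22

/* Hypothetical list of optional config fields. */
enum cfg_field { FIELD_MTU, FIELD_MAC, FIELD_MAX_VQP, FIELD_COUNT };

/* Each field is tied to the feature bit that makes it valid. */
static const unsigned field_feature_bit[FIELD_COUNT] = {
	[FIELD_MTU]     = NET_F_MTU_BIT,
	[FIELD_MAC]     = NET_F_MAC_BIT,
	[FIELD_MAX_VQP] = NET_F_MQ_BIT,
};

/* Uniform rule: a field is reported only if its feature bit is set.
 * Returns a bitmask with bit f set for every reported field f. */
static unsigned reported_fields(uint64_t features)
{
	unsigned mask = 0;

	for (int f = 0; f < FIELD_COUNT; f++)
		if (features & (1ULL << field_feature_bit[f]))
			mask |= 1u << f;

	return mask;
}
```

Under this rule no field needs a special-case default, which is the consistency argument made above; the opposing view in the thread is that max_vq_pairs has an obvious value of 1 and could be the exception.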
Zhu, Lingshan July 13, 2022, 3:03 a.m. UTC | #6
On 7/13/2022 12:48 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Sunday, July 10, 2022 10:30 PM
>>> Showing max_vq_pairs of 1 even when _MQ is not negotiated, incorrectly
>> says that max_vq_pairs is exposed to the guest, but it is not offered.
>>> So, please fix the iproute2 to not print max_vq_pairs when it is not
>> returned by the kernel.
>> iproute2 can report whether there is MQ feature in the device / driver
>> feature bits.
>> I think iproute2 only queries the number of max queues here.
>>
>> max_vq_pairs shows how many queue pairs there, this attribute's existence
>> does not depend on MQ, if no MQ, there are still one queue pair, so just
>> show one.
> This netlink attribute's existence is depending on the _MQ feature bit existence.
why? If no MQ, then no queues?
> We can break that and report the value, but if we break that there are many other config space bits who doesn’t have good default like max_vq_pairs.
max_vq_pairs may not have a default value, but we know that without
MQ, a virtio-net device still has one queue pair to be functional.
> There is ambiguity for user space what to do with it and so in the kernel space..
> Instead of dealing with them differently in kernel, at present we attach each netlink attribute to a respective feature bit wherever applicable.
> And code in kernel and user space is uniform to handle them.
I get your point, but by "max_vq_pairs" the user space tool is
asking how many queue pairs there are; it is not asking whether the device
has MQ.
Even without _MQ, we still need to tell users that there is one queue
pair; otherwise it is not a functional virtio-net device,
and we should detect that error earlier, during device initialization.

I think it is still uniform: if there is _MQ, we return
cfg.max_queue_pair; if there is no _MQ, we return 1, still through netlink.

Thanks
Parav Pandit July 13, 2022, 3:06 a.m. UTC | #7
> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 12, 2022 11:03 PM
> 
> 
> On 7/13/2022 12:48 AM, Parav Pandit wrote:
> >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >> Sent: Sunday, July 10, 2022 10:30 PM
> >>> Showing max_vq_pairs of 1 even when _MQ is not negotiated,
> >>> incorrectly
> >> says that max_vq_pairs is exposed to the guest, but it is not offered.
> >>> So, please fix the iproute2 to not print max_vq_pairs when it is not
> >> returned by the kernel.
> >> iproute2 can report whether there is MQ feature in the device /
> >> driver feature bits.
> >> I think iproute2 only queries the number of max queues here.
> >>
> >> max_vq_pairs shows how many queue pairs there, this attribute's
> >> existence does not depend on MQ, if no MQ, there are still one queue
> >> pair, so just show one.
> > This netlink attribute's existence is depending on the _MQ feature bit
> existence.
> why? If no MQ, then no queues?
> > We can break that and report the value, but if we break that there are
> many other config space bits who doesn’t have good default like
> max_vq_pairs.
> max_vq_paris may not have a default value, but we know if there is no MQ,
> a virtio-net still have one queue pair to be functional.
> > There is ambiguity for user space what to do with it and so in the kernel
> space..
> > Instead of dealing with them differently in kernel, at present we attach
> each netlink attribute to a respective feature bit wherever applicable.
> > And code in kernel and user space is uniform to handle them.
> I get your point, but you see, by "max_vq_pairs", the user space tool is
> asking how many queue pairs there, it is not asking whether the device have
> MQ.
> Even no _MQ, we still need to tell the users that there are one queue pair, or
> it is not a functional virtio-net, we should detect this error earlier in the
> device initialization.
It is not an error. :)

When the user space that invokes the netlink commands detects that _MQ is not supported, it takes max_queue_pair = 1 by itself.

> 
> I think it is still uniform, it there is _MQ, we return cfg.max_queue_pair, if no
> _MQ, return 1, still by netlink.
Better to do that in user space because we cannot do the same for other config fields.

> 
> Thanks
Zhu, Lingshan July 13, 2022, 3:45 a.m. UTC | #8
On 7/13/2022 11:06 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 12, 2022 11:03 PM
>>
>>
>> On 7/13/2022 12:48 AM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Sunday, July 10, 2022 10:30 PM
>>>>> Showing max_vq_pairs of 1 even when _MQ is not negotiated,
>>>>> incorrectly
>>>> says that max_vq_pairs is exposed to the guest, but it is not offered.
>>>>> So, please fix the iproute2 to not print max_vq_pairs when it is not
>>>> returned by the kernel.
>>>> iproute2 can report whether there is MQ feature in the device /
>>>> driver feature bits.
>>>> I think iproute2 only queries the number of max queues here.
>>>>
>>>> max_vq_pairs shows how many queue pairs there, this attribute's
>>>> existence does not depend on MQ, if no MQ, there are still one queue
>>>> pair, so just show one.
>>> This netlink attribute's existence is depending on the _MQ feature bit
>> existence.
>> why? If no MQ, then no queues?
>>> We can break that and report the value, but if we break that there are
>> many other config space bits who doesn’t have good default like
>> max_vq_pairs.
>> max_vq_paris may not have a default value, but we know if there is no MQ,
>> a virtio-net still have one queue pair to be functional.
>>> There is ambiguity for user space what to do with it and so in the kernel
>> space..
>>> Instead of dealing with them differently in kernel, at present we attach
>> each netlink attribute to a respective feature bit wherever applicable.
>>> And code in kernel and user space is uniform to handle them.
>> I get your point, but you see, by "max_vq_pairs", the user space tool is
>> asking how many queue pairs there, it is not asking whether the device have
>> MQ.
>> Even no _MQ, we still need to tell the users that there are one queue pair, or
>> it is not a functional virtio-net, we should detect this error earlier in the
>> device initialization.
> It is not an error. :)
I meant that if there are no queues, the device is non-functional, which is an error.
>
> When the user space which invokes netlink commands, detects that _MQ is not supported, hence it takes max_queue_pair = 1 by itself.
I think the kernel module has all the necessary information, and it is the
only component with precise information about a device, so it
should answer precisely rather than let user space guess. The kernel module
should be reliable rather than stay silent and leave the question
to the user space tool.
>
>> I think it is still uniform, it there is _MQ, we return cfg.max_queue_pair, if no
>> _MQ, return 1, still by netlink.
> Better to do that in user space because we cannot do same for other config fields.
same as above
>
>> Thanks
Michael S. Tsirkin July 13, 2022, 5:26 a.m. UTC | #9
On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
> 
> 
> > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > Sent: Friday, July 1, 2022 9:28 AM
> > If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair, so
> > when userspace querying queue pair numbers, it should return mq=1 than
> > zero.
> > 
> > Function vdpa_dev_net_config_fill() fills the attributions of the vDPA
> > devices, so that it should call vdpa_dev_net_mq_config_fill() so the
> > parameter in vdpa_dev_net_mq_config_fill() should be feature_device than
> > feature_driver for the vDPA devices themselves
> > 
> > Before this change, when MQ = 0, iproute2 output:
> > $vdpa dev config show vdpa0
> > vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 0
> > mtu 1500
> >
> The fix belongs to user space.
> When a feature bit _MQ is not negotiated, vdpa kernel space will not add attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> When such attribute is not returned by kernel, max_vq_pairs should not be shown by the iproute2.
> 
> We have many config space fields that depend on the feature bits and some of them do not have any defaults.
> To keep consistency of existence of config space fields among all, we don't want to show default like below.
> 
> Please fix the iproute2 to not print max_vq_pairs when it is not returned by the kernel.

Parav, I read the discussion and don't get your argument. From the driver's POV,
_MQ with 1 VQ pair and !_MQ are exactly functionally equivalent.

It's true that iproute probably needs to be fixed too, to handle old
kernels. But iproute is not the only userspace; why not make its life
easier by fixing the kernel?
Zhu, Lingshan July 13, 2022, 7:47 a.m. UTC | #10
On 7/13/2022 1:26 PM, Michael S. Tsirkin wrote:
> On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
>>
>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>> Sent: Friday, July 1, 2022 9:28 AM
>>> If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue pair, so
>>> when userspace querying queue pair numbers, it should return mq=1 than
>>> zero.
>>>
>>> Function vdpa_dev_net_config_fill() fills the attributions of the vDPA
>>> devices, so that it should call vdpa_dev_net_mq_config_fill() so the
>>> parameter in vdpa_dev_net_mq_config_fill() should be feature_device than
>>> feature_driver for the vDPA devices themselves
>>>
>>> Before this change, when MQ = 0, iproute2 output:
>>> $vdpa dev config show vdpa0
>>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false max_vq_pairs 0
>>> mtu 1500
>>>
>> The fix belongs to user space.
>> When a feature bit _MQ is not negotiated, vdpa kernel space will not add attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
>> When such attribute is not returned by kernel, max_vq_pairs should not be shown by the iproute2.
>>
>> We have many config space fields that depend on the feature bits and some of them do not have any defaults.
>> To keep consistency of existence of config space fields among all, we don't want to show default like below.
>>
>> Please fix the iproute2 to not print max_vq_pairs when it is not returned by the kernel.
> Parav I read the discussion and don't get your argument. From driver's POV
> _MQ with 1 VQ pair and !_MQ are exactly functionally equivalent.
>
> It's true that iproute probably needs to be fixed too, to handle old
> kernels. But iproute is not the only userspace, why not make it's life
> easier by fixing the kernel?
I will fix iproute2 once this series settles down

Thanks,
Zhu Lingshan
Parav Pandit July 26, 2022, 3:54 p.m. UTC | #11
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Wednesday, July 13, 2022 1:27 AM
> 
> On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > > Sent: Friday, July 1, 2022 9:28 AM
> > > If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue
> > > pair, so when userspace querying queue pair numbers, it should
> > > return mq=1 than zero.
> > >
> > > Function vdpa_dev_net_config_fill() fills the attributions of the
> > > vDPA devices, so that it should call vdpa_dev_net_mq_config_fill()
> > > so the parameter in vdpa_dev_net_mq_config_fill() should be
> > > feature_device than feature_driver for the vDPA devices themselves
> > >
> > > Before this change, when MQ = 0, iproute2 output:
> > > $vdpa dev config show vdpa0
> > > vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
> > > max_vq_pairs 0 mtu 1500
> > >
> > The fix belongs to user space.
> > When a feature bit _MQ is not negotiated, vdpa kernel space will not add
> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> > When such attribute is not returned by kernel, max_vq_pairs should not be
> shown by the iproute2.
> >
> > We have many config space fields that depend on the feature bits and
> some of them do not have any defaults.
> > To keep consistency of existence of config space fields among all, we don't
> want to show default like below.
> >
> > Please fix the iproute2 to not print max_vq_pairs when it is not returned by
> the kernel.
> 
> Parav I read the discussion and don't get your argument. From driver's POV
> _MQ with 1 VQ pair and !_MQ are exactly functionally equivalent.
But we are talking from user POV here.

> 
> It's true that iproute probably needs to be fixed too, to handle old kernels.
> But iproute is not the only userspace, why not make it's life easier by fixing
> the kernel?
Because it cannot be fixed for other config space fields that are controlled by feature bits and do not have any defaults.
So it is better to treat them all the same way from the user POV.
Parav Pandit July 26, 2022, 3:56 p.m. UTC | #12
> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 12, 2022 11:46 PM
> > When the user space which invokes netlink commands, detects that _MQ
> is not supported, hence it takes max_queue_pair = 1 by itself.
> I think the kernel module have all necessary information and it is the only
> one which have precise information of a device, so it should answer precisely
> than let the user space guess. The kernel module should be reliable than stay
> silent, leave the question to the user space tool.
The kernel is reliable. It doesn’t expose a config space field if the field doesn’t exist, regardless of whether the field has a default.
User space should not guess either. User space can see whether _MQ is present. If _MQ is present, it gets reliable data from the kernel.
If _MQ is not present, it means this device has one VQ pair.
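
The userspace rule stated in this reply can be sketched as below. It assumes the caller already has the feature bits and knows whether the max-VQ-pairs attribute was present in the reply; `effective_vq_pairs()` and its parameters are hypothetical names, not part of any existing tool.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position of VIRTIO_NET_F_MQ in the virtio-net feature bits. */
#define NET_F_MQ_BIT 22

/* If _MQ is present and the kernel returned the attribute, trust the
 * kernel's value. Otherwise the device has one VQ pair. */
static uint16_t effective_vq_pairs(uint64_t features,
				   bool has_max_vqp_attr,
				   uint16_t max_vqp_attr)
{
	bool mq = (features & (1ULL << NET_F_MQ_BIT)) != 0;

	if (mq && has_max_vqp_attr)
		return max_vqp_attr;

	return 1;
}
```

The case where _MQ is set but the attribute is missing also falls back to 1 here; that is a choice made for the sketch, since the thread does not say how a tool should treat it.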
Michael S. Tsirkin July 26, 2022, 7:48 p.m. UTC | #13
On Tue, Jul 26, 2022 at 03:54:06PM +0000, Parav Pandit wrote:
> 
> > From: Michael S. Tsirkin <mst@redhat.com>
> > Sent: Wednesday, July 13, 2022 1:27 AM
> > 
> > On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
> > >
> > >
> > > > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > Sent: Friday, July 1, 2022 9:28 AM
> > > > If VIRTIO_NET_F_MQ == 0, the virtio device should have one queue
> > > > pair, so when userspace querying queue pair numbers, it should
> > > > return mq=1 than zero.
> > > >
> > > > Function vdpa_dev_net_config_fill() fills the attributions of the
> > > > vDPA devices, so that it should call vdpa_dev_net_mq_config_fill()
> > > > so the parameter in vdpa_dev_net_mq_config_fill() should be
> > > > feature_device than feature_driver for the vDPA devices themselves
> > > >
> > > > Before this change, when MQ = 0, iproute2 output:
> > > > $vdpa dev config show vdpa0
> > > > vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
> > > > max_vq_pairs 0 mtu 1500
> > > >
> > > The fix belongs to user space.
> > > When a feature bit _MQ is not negotiated, vdpa kernel space will not add
> > attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> > > When such attribute is not returned by kernel, max_vq_pairs should not be
> > shown by the iproute2.
> > >
> > > We have many config space fields that depend on the feature bits and
> > some of them do not have any defaults.
> > > To keep consistency of existence of config space fields among all, we don't
> > want to show default like below.
> > >
> > > Please fix the iproute2 to not print max_vq_pairs when it is not returned by
> > the kernel.
> > 
> > Parav I read the discussion and don't get your argument. From driver's POV
> > _MQ with 1 VQ pair and !_MQ are exactly functionally equivalent.
> But we are talking from user POV here.

From spec POV there's just driver and device, user would be part of
driver here.

> > 
> > It's true that iproute probably needs to be fixed too, to handle old kernels.
> > But iproute is not the only userspace, why not make it's life easier by fixing
> > the kernel?
> Because it cannot be fixed for other config space fields which are control by feature bits those do not have any defaults.
> So better to treat all in same way from user POV.

Consistency is good for sure. What are these other fields though?
Can you give examples so I understand please?
Michael S. Tsirkin July 26, 2022, 7:52 p.m. UTC | #14
On Tue, Jul 26, 2022 at 03:56:32PM +0000, Parav Pandit wrote:
> 
> > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > Sent: Tuesday, July 12, 2022 11:46 PM
> > > When the user space which invokes netlink commands, detects that _MQ
> > is not supported, hence it takes max_queue_pair = 1 by itself.
> > I think the kernel module have all necessary information and it is the only
> > one which have precise information of a device, so it should answer precisely
> > than let the user space guess. The kernel module should be reliable than stay
> > silent, leave the question to the user space tool.
> Kernel is reliable. It doesn’t expose a config space field if the field doesn’t exist regardless of field should have default or no default.
> User space should not guess either. User space gets to see if _MQ present/not present. If _MQ present than get reliable data from kernel.
> If _MQ not present, it means this device has one VQ pair.

Yes that's fine. And if we just didn't return anything without MQ that
would be fine.  But IIUC netlink reports the # of pairs regardless, it
just puts 0 there.
Parav Pandit July 26, 2022, 8:49 p.m. UTC | #15
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, July 26, 2022 3:52 PM
> 
> On Tue, Jul 26, 2022 at 03:56:32PM +0000, Parav Pandit wrote:
> >
> > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > Sent: Tuesday, July 12, 2022 11:46 PM
> > > > When the user space which invokes netlink commands, detects that
> > > > _MQ
> > > is not supported, hence it takes max_queue_pair = 1 by itself.
> > > I think the kernel module have all necessary information and it is
> > > the only one which have precise information of a device, so it
> > > should answer precisely than let the user space guess. The kernel
> > > module should be reliable than stay silent, leave the question to the user
> space tool.
> > Kernel is reliable. It doesn’t expose a config space field if the field doesn’t exist
> regardless of field should have default or no default.
> > User space should not guess either. User space gets to see if _MQ present/not
> present. If _MQ present than get reliable data from kernel.
> > If _MQ not present, it means this device has one VQ pair.
> 
> Yes that's fine. And if we just didn't return anything without MQ that would be
> fine.  But IIUC netlink reports the # of pairs regardless, it just puts 0 there.
I read it differently at [1] which checks for the MQ feature bit.

[1] https://elixir.bootlin.com/linux/latest/source/drivers/vdpa/vdpa.c#L825

> 
> --
> MST
Parav Pandit July 26, 2022, 8:53 p.m. UTC | #16
> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, July 26, 2022 3:49 PM
> 
> On Tue, Jul 26, 2022 at 03:54:06PM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Wednesday, July 13, 2022 1:27 AM
> > >
> > > On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
> > > >
> > > >
> > > > > From: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > Sent: Friday, July 1, 2022 9:28 AM If VIRTIO_NET_F_MQ == 0, the
> > > > > virtio device should have one queue pair, so when userspace
> > > > > querying queue pair numbers, it should return mq=1 than zero.
> > > > >
> > > > > Function vdpa_dev_net_config_fill() fills the attributions of
> > > > > the vDPA devices, so that it should call
> > > > > vdpa_dev_net_mq_config_fill() so the parameter in
> > > > > vdpa_dev_net_mq_config_fill() should be feature_device than
> > > > > feature_driver for the vDPA devices themselves
> > > > >
> > > > > Before this change, when MQ = 0, iproute2 output:
> > > > > $vdpa dev config show vdpa0
> > > > > vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
> > > > > max_vq_pairs 0 mtu 1500
> > > > >
> > > > The fix belongs to user space.
> > > > When a feature bit _MQ is not negotiated, vdpa kernel space will
> > > > not add
> > > attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
> > > > When such attribute is not returned by kernel, max_vq_pairs should
> > > > not be
> > > shown by the iproute2.
> > > >
> > > > We have many config space fields that depend on the feature bits
> > > > and
> > > some of them do not have any defaults.
> > > > To keep consistency of existence of config space fields among all,
> > > > we don't
> > > want to show default like below.
> > > >
> > > > Please fix the iproute2 to not print max_vq_pairs when it is not
> > > > returned by
> > > the kernel.
> > >
> > > Parav I read the discussion and don't get your argument. From
> > > driver's POV _MQ with 1 VQ pair and !_MQ are exactly functionally
> equivalent.
> > But we are talking from user POV here.
> 
> From spec POV there's just driver and device, user would be part of driver here.
User space application still need to inspect the _MQ bit to
> 
> > >
> > > It's true that iproute probably needs to be fixed too, to handle old kernels.
> > > But iproute is not the only userspace, why not make it's life easier
> > > by fixing the kernel?
> > Because it cannot be fixed for other config space fields which are control by
> feature bits those do not have any defaults.
> > So better to treat all in same way from user POV.
> 
> Consistency is good for sure. What are these other fields though?

> Can you give examples so I understand please?

speed exists only if VIRTIO_NET_F_SPEED_DUPLEX is set.
rss_max_key_size exists only if VIRTIO_NET_F_RSS is set.
Zhu, Lingshan July 27, 2022, 1:56 a.m. UTC | #17
On 7/27/2022 4:53 AM, Parav Pandit wrote:
>> From: Michael S. Tsirkin <mst@redhat.com>
>> Sent: Tuesday, July 26, 2022 3:49 PM
>>
>> On Tue, Jul 26, 2022 at 03:54:06PM +0000, Parav Pandit wrote:
>>>> From: Michael S. Tsirkin <mst@redhat.com>
>>>> Sent: Wednesday, July 13, 2022 1:27 AM
>>>>
>>>> On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
>>>>>
>>>>>> From: Zhu Lingshan <lingshan.zhu@intel.com>
>>>>>> Sent: Friday, July 1, 2022 9:28 AM If VIRTIO_NET_F_MQ == 0, the
>>>>>> virtio device should have one queue pair, so when userspace
>>>>>> querying queue pair numbers, it should return mq=1 than zero.
>>>>>>
>>>>>> Function vdpa_dev_net_config_fill() fills the attributions of
>>>>>> the vDPA devices, so that it should call
>>>>>> vdpa_dev_net_mq_config_fill() so the parameter in
>>>>>> vdpa_dev_net_mq_config_fill() should be feature_device than
>>>>>> feature_driver for the vDPA devices themselves
>>>>>>
>>>>>> Before this change, when MQ = 0, iproute2 output:
>>>>>> $vdpa dev config show vdpa0
>>>>>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
>>>>>> max_vq_pairs 0 mtu 1500
>>>>>>
>>>>> The fix belongs to user space.
>>>>> When a feature bit _MQ is not negotiated, vdpa kernel space will
>>>>> not add
>>>> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
>>>>> When such attribute is not returned by kernel, max_vq_pairs should
>>>>> not be
>>>> shown by the iproute2.
>>>>> We have many config space fields that depend on the feature bits
>>>>> and
>>>> some of them do not have any defaults.
>>>>> To keep consistency of existence of config space fields among all,
>>>>> we don't
>>>> want to show default like below.
>>>>> Please fix the iproute2 to not print max_vq_pairs when it is not
>>>>> returned by
>>>> the kernel.
>>>>
>>>> Parav I read the discussion and don't get your argument. From
>>>> driver's POV _MQ with 1 VQ pair and !_MQ are exactly functionally
>> equivalent.
>>> But we are talking from user POV here.
>>  From spec POV there's just driver and device, user would be part of driver here.
> User space application still need to inspect the _MQ bit to

>>>> It's true that iproute probably needs to be fixed too, to handle old kernels.
>>>> But iproute is not the only userspace, why not make it's life easier
>>>> by fixing the kernel?
>>> Because it cannot be fixed for other config space fields which are control by
>> feature bits those do not have any defaults.
>>> So better to treat all in same way from user POV.
>> Consistency is good for sure. What are these other fields though?
>> Can you give examples so I understand please?
> speed only exists if VIRTIO_NET_F_SPEED_DUPLEX.
> rss_max_key_size exists only if VIRTIO_NET_F_RSS exists.
Those cases are different from the MQ case.

There are no default values for speed and rss_max_key_size, and
talking about speed without VIRTIO_NET_F_SPEED_DUPLEX, or about
rss_max_key_size without VIRTIO_NET_F_RSS, is meaningless.
But without MQ, we know the device has to have one queue pair, and only
one, to be a functional virtio-net device. That is meaningful.
Zhu, Lingshan July 27, 2022, 2:11 a.m. UTC | #18
On 7/27/2022 4:53 AM, Parav Pandit wrote:
>> From: Michael S. Tsirkin<mst@redhat.com>
>> Sent: Tuesday, July 26, 2022 3:49 PM
>>
>> On Tue, Jul 26, 2022 at 03:54:06PM +0000, Parav Pandit wrote:
>>>> From: Michael S. Tsirkin<mst@redhat.com>
>>>> Sent: Wednesday, July 13, 2022 1:27 AM
>>>>
>>>> On Fri, Jul 01, 2022 at 10:07:59PM +0000, Parav Pandit wrote:
>>>>>> From: Zhu Lingshan<lingshan.zhu@intel.com>
>>>>>> Sent: Friday, July 1, 2022 9:28 AM If VIRTIO_NET_F_MQ == 0, the
>>>>>> virtio device should have one queue pair, so when userspace
>>>>>> querying queue pair numbers, it should return mq=1 than zero.
>>>>>>
>>>>>> Function vdpa_dev_net_config_fill() fills the attributions of
>>>>>> the vDPA devices, so that it should call
>>>>>> vdpa_dev_net_mq_config_fill() so the parameter in
>>>>>> vdpa_dev_net_mq_config_fill() should be feature_device than
>>>>>> feature_driver for the vDPA devices themselves
>>>>>>
>>>>>> Before this change, when MQ = 0, iproute2 output:
>>>>>> $vdpa dev config show vdpa0
>>>>>> vdpa0: mac 00:e8:ca:11:be:05 link up link_announce false
>>>>>> max_vq_pairs 0 mtu 1500
>>>>>>
>>>>> The fix belongs to user space.
>>>>> When a feature bit _MQ is not negotiated, vdpa kernel space will
>>>>> not add
>>>> attribute VDPA_ATTR_DEV_NET_CFG_MAX_VQP.
>>>>> When such attribute is not returned by kernel, max_vq_pairs should
>>>>> not be
>>>> shown by the iproute2.
>>>>> We have many config space fields that depend on the feature bits
>>>>> and
>>>> some of them do not have any defaults.
>>>>> To keep consistency of existence of config space fields among all,
>>>>> we don't
>>>> want to show default like below.
>>>>> Please fix the iproute2 to not print max_vq_pairs when it is not
>>>>> returned by
>>>> the kernel.
>>>>
>>>> Parav I read the discussion and don't get your argument. From
>>>> driver's POV _MQ with 1 VQ pair and !_MQ are exactly functionally
>> equivalent.
>>> But we are talking from user POV here.
>>  From spec POV there's just driver and device, user would be part of driver here.
> User space application still need to inspect the _MQ bit to

>>>> It's true that iproute probably needs to be fixed too, to handle old kernels.
>>>> But iproute is not the only userspace, why not make it's life easier
>>>> by fixing the kernel?
>>> Because it cannot be fixed for other config space fields which are control by
>> feature bits those do not have any defaults.
>>> So better to treat all in same way from user POV.
>> Consistency is good for sure. What are these other fields though?
>> Can you give examples so I understand please?
> speed only exists if VIRTIO_NET_F_SPEED_DUPLEX.
> rss_max_key_size exists only if VIRTIO_NET_F_RSS exists.
Those cases are different from the MQ case.

There are no default values for speed and rss_max_key_size, and talking
about speed without VIRTIO_NET_F_SPEED_DUPLEX, or about rss_max_key_size
without VIRTIO_NET_F_RSS, is meaningless.
But without MQ, we know the device has to have one queue pair to be a
functional virtio-net device, and this is meaningful.
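
The distinction drawn above (fields that have a well-defined value without their feature bit versus fields that do not) can be sketched as a small table. This is an illustrative user-space model, not code from the patch: struct cfg_field_rule, the rules[] table and field_is_meaningful() are hypothetical names, and only the feature bit numbers are taken from the virtio-net definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Feature bit positions as defined for virtio-net. */
#define VIRTIO_NET_F_MQ           22
#define VIRTIO_NET_F_RSS          60
#define VIRTIO_NET_F_SPEED_DUPLEX 63

/* Hypothetical description of one feature-gated config space field. */
struct cfg_field_rule {
	const char *name;         /* config field name */
	unsigned int feature_bit; /* feature bit that gates the field */
	bool has_default;         /* does a value exist without the bit? */
};

static const struct cfg_field_rule rules[] = {
	{ "max_virtqueue_pairs", VIRTIO_NET_F_MQ,           true  },
	{ "speed",               VIRTIO_NET_F_SPEED_DUPLEX, false },
	{ "rss_max_key_size",    VIRTIO_NET_F_RSS,          false },
};

/*
 * A field is meaningful if its feature bit is set, or if it has a
 * well-defined value even without the bit (one queue pair for !_MQ).
 * Unknown field names are reported as not meaningful.
 */
bool field_is_meaningful(const char *name, uint64_t features)
{
	for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
		if (strcmp(rules[i].name, name) != 0)
			continue;
		if (features & (1ULL << rules[i].feature_bit))
			return true;
		return rules[i].has_default;
	}
	return false;
}
```

With no feature bits set, only max_virtqueue_pairs is still meaningful in this model, which is the asymmetry the argument above relies on.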
Zhu, Lingshan July 27, 2022, 2:14 a.m. UTC | #19
On 7/26/2022 11:56 PM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 12, 2022 11:46 PM
>>> When the user space which invokes netlink commands, detects that _MQ
>> is not supported, hence it takes max_queue_pair = 1 by itself.
>> I think the kernel module have all necessary information and it is the only
>> one which have precise information of a device, so it should answer precisely
>> than let the user space guess. The kernel module should be reliable than stay
>> silent, leave the question to the user space tool.
> Kernel is reliable. It doesn’t expose a config space field if the field doesn’t exist regardless of field should have default or no default.
so when you know it is one queue pair, you should answer one, not try to 
guess.
> User space should not guess either. User space gets to see if _MQ present/not present. If _MQ present than get reliable data from kernel.
> If _MQ not present, it means this device has one VQ pair.
It is still a guess, right? And every user space tool that implements
this feature needs to guess.
Parav Pandit July 27, 2022, 2:17 a.m. UTC | #20
> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 26, 2022 10:15 PM
> 
> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >> Sent: Tuesday, July 12, 2022 11:46 PM
> >>> When the user space which invokes netlink commands, detects that
> _MQ
> >> is not supported, hence it takes max_queue_pair = 1 by itself.
> >> I think the kernel module have all necessary information and it is
> >> the only one which have precise information of a device, so it should
> >> answer precisely than let the user space guess. The kernel module
> >> should be reliable than stay silent, leave the question to the user space
> tool.
> > Kernel is reliable. It doesn’t expose a config space field if the field doesn’t
> exist regardless of field should have default or no default.
> so when you know it is one queue pair, you should answer one, not try to
> guess.
> > User space should not guess either. User space gets to see if _MQ
> present/not present. If _MQ present than get reliable data from kernel.
> > If _MQ not present, it means this device has one VQ pair.
> it is still a guess, right? And all user space tools implemented this feature
> need to guess
No, it is not a guess.
It is explicitly checking the _MQ feature and deriving the value.
The code you proposed will be present in user space.
It will be uniform for _MQ and the 10 other features that are present now or will be added in the future.

For feature X, the kernel reports a default, and for feature Y, the kernel skips reporting it because there is no default. <- This is what we are trying to avoid here.
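
A minimal sketch of the user-space derivation described in this message, assuming the tool has already parsed the netlink reply into a small struct. struct parsed_net_cfg and derive_max_vq_pairs() are hypothetical names used only for illustration; they are not part of iproute2 or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position of VIRTIO_NET_F_MQ in the virtio-net feature set. */
#define VIRTIO_NET_F_MQ 22

/* Hypothetical container for what a tool parsed out of the reply. */
struct parsed_net_cfg {
	uint64_t features; /* feature bits reported for the device */
	bool has_max_vqp;  /* was VDPA_ATTR_DEV_NET_CFG_MAX_VQP present? */
	uint16_t max_vqp;  /* valid only if has_max_vqp is true */
};

/*
 * Derive the number of queue pairs by checking _MQ explicitly.
 * Returns 0 when _MQ is set but the attribute is missing, so the
 * caller can tell "unknown" apart from a real value.
 */
uint16_t derive_max_vq_pairs(const struct parsed_net_cfg *cfg)
{
	bool mq = (cfg->features & (1ULL << VIRTIO_NET_F_MQ)) != 0;

	if (!mq)
		return 1; /* without _MQ there is exactly one queue pair */
	if (cfg->has_max_vqp)
		return cfg->max_vqp;
	return 0; /* _MQ set but no value reported */
}
```

In this sketch the value carried by the attribute is ignored when _MQ is clear, so a kernel that reports 0 in that case would still yield 1.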
Zhu, Lingshan July 27, 2022, 2:53 a.m. UTC | #21
On 7/27/2022 10:17 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 26, 2022 10:15 PM
>>
>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>> When the user space which invokes netlink commands, detects that
>> _MQ
>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>> I think the kernel module have all necessary information and it is
>>>> the only one which have precise information of a device, so it should
>>>> answer precisely than let the user space guess. The kernel module
>>>> should be reliable than stay silent, leave the question to the user space
>> tool.
>>> Kernel is reliable. It doesn’t expose a config space field if the field doesn’t
>> exist regardless of field should have default or no default.
>> so when you know it is one queue pair, you should answer one, not try to
>> guess.
>>> User space should not guess either. User space gets to see if _MQ
>> present/not present. If _MQ present than get reliable data from kernel.
>>> If _MQ not present, it means this device has one VQ pair.
>> it is still a guess, right? And all user space tools implemented this feature
>> need to guess
> No. it is not a guess.
> It is explicitly checking the _MQ feature and deriving the value.
> The code you proposed will be present in the user space.
> It will be uniform for _MQ and 10 other features that are present now and in the future.
MQ and other features like RSS are different. If there is no _RSS_XX,
there are no attributes like max_rss_key_size, and there is no
default value.
But for MQ, we know it has to be 1 without _MQ.
> For feature X, kernel reports default and for feature Y, kernel skip reporting it, because there is no default. <- This is what we are trying to avoid here.
Kernel reports one queue pair because there is actually one.
>
Parav Pandit July 27, 2022, 3:47 a.m. UTC | #22
> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, July 26, 2022 10:53 PM
> 
> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >> Sent: Tuesday, July 26, 2022 10:15 PM
> >>
> >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> >>>>> When the user space which invokes netlink commands, detects that
> >> _MQ
> >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> >>>> I think the kernel module have all necessary information and it is
> >>>> the only one which have precise information of a device, so it
> >>>> should answer precisely than let the user space guess. The kernel
> >>>> module should be reliable than stay silent, leave the question to
> >>>> the user space
> >> tool.
> >>> Kernel is reliable. It doesn’t expose a config space field if the
> >>> field doesn’t
> >> exist regardless of field should have default or no default.
> >> so when you know it is one queue pair, you should answer one, not try
> >> to guess.
> >>> User space should not guess either. User space gets to see if _MQ
> >> present/not present. If _MQ present than get reliable data from kernel.
> >>> If _MQ not present, it means this device has one VQ pair.
> >> it is still a guess, right? And all user space tools implemented this
> >> feature need to guess
> > No. it is not a guess.
> > It is explicitly checking the _MQ feature and deriving the value.
> > The code you proposed will be present in the user space.
> > It will be uniform for _MQ and 10 other features that are present now and
> in the future.
> MQ and other features like RSS are different. If there is no _RSS_XX, there
> are no attributes like max_rss_key_size, and there is not a default value.
> But for MQ, we know it has to be 1 wihtout _MQ.
"we" = user space.
To keep the consistency among all the config space fields.

> > For feature X, kernel reports default and for feature Y, kernel skip
> reporting it, because there is no default. <- This is what we are trying to
> avoid here.
> Kernel reports one queue pair because there is actually one.
> >
Zhu, Lingshan July 27, 2022, 4:24 a.m. UTC | #23
On 7/27/2022 11:47 AM, Parav Pandit wrote:
>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>> Sent: Tuesday, July 26, 2022 10:53 PM
>>
>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>
>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>> When the user space which invokes netlink commands, detects that
>>>> _MQ
>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>> I think the kernel module have all necessary information and it is
>>>>>> the only one which have precise information of a device, so it
>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>> module should be reliable than stay silent, leave the question to
>>>>>> the user space
>>>> tool.
>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>> field doesn’t
>>>> exist regardless of field should have default or no default.
>>>> so when you know it is one queue pair, you should answer one, not try
>>>> to guess.
>>>>> User space should not guess either. User space gets to see if _MQ
>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>> If _MQ not present, it means this device has one VQ pair.
>>>> it is still a guess, right? And all user space tools implemented this
>>>> feature need to guess
>>> No. it is not a guess.
>>> It is explicitly checking the _MQ feature and deriving the value.
>>> The code you proposed will be present in the user space.
>>> It will be uniform for _MQ and 10 other features that are present now and
>> in the future.
>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>> are no attributes like max_rss_key_size, and there is not a default value.
>> But for MQ, we know it has to be 1 wihtout _MQ.
> "we" = user space.
> To keep the consistency among all the config space fields.
The user space tool asks for the number of vq pairs, not whether the
device has _MQ.
_MQ and _RSS are not the same kind of concept, as we have discussed above.
You have pointed out the logic: if there is _MQ, the kernel answers
max_vq_pairs; if there is no _MQ, num_vq_pairs = 1.

So, as MST pointed out, implementing this in kernel space can make our
lives easier, once and for all.
>
>>> For feature X, kernel reports default and for feature Y, kernel skip
>> reporting it, because there is no default. <- This is what we are trying to
>> avoid here.
>> Kernel reports one queue pair because there is actually one.
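
For comparison, the kernel-side alternative argued for in this message (always report a value, and report 1 when _MQ is absent) can be modelled as below. This is a stand-alone sketch, not the patch itself: reported_max_vq_pairs() is a hypothetical helper, and config_max_vqp is assumed to have already been converted from the little-endian config space to host byte order.

```c
#include <stdint.h>

/* Bit position of VIRTIO_NET_F_MQ in the virtio-net feature set. */
#define VIRTIO_NET_F_MQ 22

/*
 * Hypothetical helper: the value a fill routine would place in the
 * max_vq_pairs attribute, given the device feature bits.
 */
uint16_t reported_max_vq_pairs(uint64_t device_features,
			       uint16_t config_max_vqp)
{
	if ((device_features & (1ULL << VIRTIO_NET_F_MQ)) == 0)
		return 1; /* no _MQ: exactly one queue pair */
	return config_max_vqp;
}
```

The trade-off debated in the thread is where this branch lives: once in the kernel, or in every user-space consumer.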
Michael S. Tsirkin July 27, 2022, 6:01 a.m. UTC | #24
On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> 
> > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > Sent: Tuesday, July 26, 2022 10:53 PM
> > 
> > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > >>
> > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > >>>>> When the user space which invokes netlink commands, detects that
> > >> _MQ
> > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > >>>> I think the kernel module have all necessary information and it is
> > >>>> the only one which have precise information of a device, so it
> > >>>> should answer precisely than let the user space guess. The kernel
> > >>>> module should be reliable than stay silent, leave the question to
> > >>>> the user space
> > >> tool.
> > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > >>> field doesn’t
> > >> exist regardless of field should have default or no default.
> > >> so when you know it is one queue pair, you should answer one, not try
> > >> to guess.
> > >>> User space should not guess either. User space gets to see if _MQ
> > >> present/not present. If _MQ present than get reliable data from kernel.
> > >>> If _MQ not present, it means this device has one VQ pair.
> > >> it is still a guess, right? And all user space tools implemented this
> > >> feature need to guess
> > > No. it is not a guess.
> > > It is explicitly checking the _MQ feature and deriving the value.
> > > The code you proposed will be present in the user space.
> > > It will be uniform for _MQ and 10 other features that are present now and
> > in the future.
> > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > are no attributes like max_rss_key_size, and there is not a default value.
> > But for MQ, we know it has to be 1 wihtout _MQ.
> "we" = user space.
> To keep the consistency among all the config space fields.

Actually I looked at the code some more and I'm puzzled:


	struct virtio_net_config config = {};
	u64 features;
	u16 val_u16;

	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));

	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
		    config.mac))
		return -EMSGSIZE;


Mac returned even without VIRTIO_NET_F_MAC


	val_u16 = le16_to_cpu(config.status);
	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
		return -EMSGSIZE;


status returned even without VIRTIO_NET_F_STATUS

	val_u16 = le16_to_cpu(config.mtu);
	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
		return -EMSGSIZE;


MTU returned even without VIRTIO_NET_F_MTU


What's going on here?
Zhu, Lingshan July 27, 2022, 6:25 a.m. UTC | #25
On 7/27/2022 2:01 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>
>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>
>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>> _MQ
>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>> the only one which have precise information of a device, so it
>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>> the user space
>>>>> tool.
>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>> field doesn’t
>>>>> exist regardless of field should have default or no default.
>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>> to guess.
>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>> it is still a guess, right? And all user space tools implemented this
>>>>> feature need to guess
>>>> No. it is not a guess.
>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>> The code you proposed will be present in the user space.
>>>> It will be uniform for _MQ and 10 other features that are present now and
>>> in the future.
>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>> are no attributes like max_rss_key_size, and there is not a default value.
>>> But for MQ, we know it has to be 1 wihtout _MQ.
>> "we" = user space.
>> To keep the consistency among all the config space fields.
> Actually I looked and the code some more and I'm puzzled:
I can submit a fix for these issues in the next version of my patch.
>
>
> 	struct virtio_net_config config = {};
> 	u64 features;
> 	u16 val_u16;
>
> 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>
> 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> 		    config.mac))
> 		return -EMSGSIZE;
>
>
> Mac returned even without VIRTIO_NET_F_MAC
If there is no VIRTIO_NET_F_MAC, we should not nla_put
VDPA_ATTR_DEV_NET_CFG_MACADDR; the spec says the driver should generate
a random MAC.
>
>
> 	val_u16 = le16_to_cpu(config.status);
> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> 		return -EMSGSIZE;
>
>
> status returned even without VIRTIO_NET_F_STATUS
If there is no VIRTIO_NET_F_STATUS, we should not nla_put
VDPA_ATTR_DEV_NET_STATUS; the spec says the driver should assume the
link is active.
>
> 	val_u16 = le16_to_cpu(config.mtu);
> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> 		return -EMSGSIZE;
>
>
> MTU returned even without VIRTIO_NET_F_MTU
Same as above: the spec says config.mtu depends on VIRTIO_NET_F_MTU, so
without this feature bit we should not return the MTU to userspace.

Do these fixes look good to you?
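
The gating proposed in the replies above can be sketched as a pure function from feature bits to the set of fields that would be reported. This is a user-space model under stated assumptions, not kernel code: enum reported_field and fields_to_report() are hypothetical stand-ins for the individual nla_put calls, and only the feature bit numbers come from the virtio-net definitions.

```c
#include <stdint.h>

/* Feature bit positions as defined for virtio-net. */
#define VIRTIO_NET_F_MTU     3
#define VIRTIO_NET_F_MAC     5
#define VIRTIO_NET_F_STATUS 16

/* Hypothetical flags standing in for the attributes a fill routine emits. */
enum reported_field {
	REPORT_MAC    = 1u << 0,
	REPORT_STATUS = 1u << 1,
	REPORT_MTU    = 1u << 2,
};

static int has_feature(uint64_t features, unsigned int bit)
{
	return (int)((features >> bit) & 1);
}

/* Report a field only when the feature bit that defines it is set. */
unsigned int fields_to_report(uint64_t features)
{
	unsigned int fields = 0;

	if (has_feature(features, VIRTIO_NET_F_MAC))
		fields |= REPORT_MAC;
	if (has_feature(features, VIRTIO_NET_F_STATUS))
		fields |= REPORT_STATUS;
	if (has_feature(features, VIRTIO_NET_F_MTU))
		fields |= REPORT_MTU;
	return fields;
}
```
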

And I think we may need your adjudication on two issues:
(1) Shall we answer max_vq_pairs = 1 when _MQ does not exist? I know you
agreed to this in a previous thread, but it would be nice to clarify.
(2) I think we should not re-use the netlink attr to report the feature
bits of both the management device and the vDPA device.
This can lead to a new race condition: there are no locks (especially
distributed locks for kernel space and user space) in the nla_put
functions. Re-using the attr would, in a way, break the netlink
lockless design.

Thanks,
Zhu Lingshan
>
>
> What's going on here?
>
>
Jason Wang July 27, 2022, 6:54 a.m. UTC | #26
On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> >
> > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > Sent: Tuesday, July 26, 2022 10:53 PM
> > >
> > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > >>
> > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > >>>>> When the user space which invokes netlink commands, detects that
> > > >> _MQ
> > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > >>>> I think the kernel module have all necessary information and it is
> > > >>>> the only one which have precise information of a device, so it
> > > >>>> should answer precisely than let the user space guess. The kernel
> > > >>>> module should be reliable than stay silent, leave the question to
> > > >>>> the user space
> > > >> tool.
> > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > >>> field doesn’t
> > > >> exist regardless of field should have default or no default.
> > > >> so when you know it is one queue pair, you should answer one, not try
> > > >> to guess.
> > > >>> User space should not guess either. User space gets to see if _MQ
> > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > >>> If _MQ not present, it means this device has one VQ pair.
> > > >> it is still a guess, right? And all user space tools implemented this
> > > >> feature need to guess
> > > > No. it is not a guess.
> > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > The code you proposed will be present in the user space.
> > > > It will be uniform for _MQ and 10 other features that are present now and
> > > in the future.
> > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > are no attributes like max_rss_key_size, and there is not a default value.
> > > But for MQ, we know it has to be 1 wihtout _MQ.
> > "we" = user space.
> > To keep the consistency among all the config space fields.
>
> Actually I looked and the code some more and I'm puzzled:
>
>
>         struct virtio_net_config config = {};
>         u64 features;
>         u16 val_u16;
>
>         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>
>         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
>                     config.mac))
>                 return -EMSGSIZE;
>
>
> Mac returned even without VIRTIO_NET_F_MAC
>
>
>         val_u16 = le16_to_cpu(config.status);
>         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>                 return -EMSGSIZE;
>
>
> status returned even without VIRTIO_NET_F_STATUS
>
>         val_u16 = le16_to_cpu(config.mtu);
>         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>                 return -EMSGSIZE;
>
>
> MTU returned even without VIRTIO_NET_F_MTU
>
>
> What's going on here?

Probably too late to fix, but this should be fine as long as all
parents support STATUS/MTU/MAC.

I wonder if we can add a check in the core and fail the device
registration in this case.
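
One way to picture the check suggested above is a guard that rejects a parent whose feature set lacks any of the three bits. This is only a sketch of the idea: check_parent_net_features() is a hypothetical name, where such a check would be called from is not specified in the thread, and the choice of -EINVAL as the error code is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Feature bit positions as defined for virtio-net. */
#define VIRTIO_NET_F_MTU     3
#define VIRTIO_NET_F_MAC     5
#define VIRTIO_NET_F_STATUS 16

#define REQUIRED_NET_FEATURES \
	((1ULL << VIRTIO_NET_F_MTU) | \
	 (1ULL << VIRTIO_NET_F_MAC) | \
	 (1ULL << VIRTIO_NET_F_STATUS))

/*
 * Hypothetical registration-time guard: return 0 if the device offers
 * MTU, MAC and STATUS, and a negative errno otherwise.
 */
int check_parent_net_features(uint64_t device_features)
{
	if ((device_features & REQUIRED_NET_FEATURES) != REQUIRED_NET_FEATURES)
		return -EINVAL;
	return 0;
}
```
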

Thanks

>
>
> --
> MST
>
Jason Wang July 27, 2022, 6:56 a.m. UTC | #27
On Wed, Jul 27, 2022 at 2:26 PM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>
>
>
> On 7/27/2022 2:01 PM, Michael S. Tsirkin wrote:
> > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> >>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>> Sent: Tuesday, July 26, 2022 10:53 PM
> >>>
> >>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> >>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> >>>>>
> >>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> >>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> >>>>>>>> When the user space which invokes netlink commands, detects that
> >>>>> _MQ
> >>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> >>>>>>> I think the kernel module have all necessary information and it is
> >>>>>>> the only one which have precise information of a device, so it
> >>>>>>> should answer precisely than let the user space guess. The kernel
> >>>>>>> module should be reliable than stay silent, leave the question to
> >>>>>>> the user space
> >>>>> tool.
> >>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> >>>>>> field doesn’t
> >>>>> exist regardless of field should have default or no default.
> >>>>> so when you know it is one queue pair, you should answer one, not try
> >>>>> to guess.
> >>>>>> User space should not guess either. User space gets to see if _MQ
> >>>>> present/not present. If _MQ present than get reliable data from kernel.
> >>>>>> If _MQ not present, it means this device has one VQ pair.
> >>>>> it is still a guess, right? And all user space tools implemented this
> >>>>> feature need to guess
> >>>> No. it is not a guess.
> >>>> It is explicitly checking the _MQ feature and deriving the value.
> >>>> The code you proposed will be present in the user space.
> >>>> It will be uniform for _MQ and 10 other features that are present now and
> >>> in the future.
> >>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> >>> are no attributes like max_rss_key_size, and there is not a default value.
> >>> But for MQ, we know it has to be 1 wihtout _MQ.
> >> "we" = user space.
> >> To keep the consistency among all the config space fields.
> > Actually I looked and the code some more and I'm puzzled:
> I can submit a fix in my next version patch for these issue.
> >
> >
> >       struct virtio_net_config config = {};
> >       u64 features;
> >       u16 val_u16;
> >
> >       vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> >
> >       if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> >                   config.mac))
> >               return -EMSGSIZE;
> >
> >
> > Mac returned even without VIRTIO_NET_F_MAC
> if no VIRTIO_NET_F_MAC, we should not nla_put
> VDPA_ATTR_DEV_NET_CFG_MAC_ADDR, the spec says the driver should generate
> a random mac.

It's probably too late to do this. Most of the parents have this
feature support, so probably not a real issue.

> >
> >
> >       val_u16 = le16_to_cpu(config.status);
> >       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> >               return -EMSGSIZE;
> >
> >
> > status returned even without VIRTIO_NET_F_STATUS
> if no VIRTIO_NET_F_STATUS, we should not nla_put
> VDPA_ATTR_DEV_NET_STATUS, the spec says the driver should assume the
> link is active.

Somewhat similar to F_MAC. But we can report if F_MAC is not negotiated.


> >
> >       val_u16 = le16_to_cpu(config.mtu);
> >       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> >               return -EMSGSIZE;
> >
> >
> > MTU returned even without VIRTIO_NET_F_MTU
> same as above, the spec says config.mtu depends on VIRTIO_NET_F_MTU, so
> without this feature bit, we should not return MTU to the userspace.

Not a big issue, we just need to make sure the parent can report a
correct MTU here.

Thanks

>
> Does these fix look good to you?
>
> And I think we may need your adjudication for the two issues:
> (1) Shall we answer max_vq_paris = 1 when _MQ not exist, I know you have
> agreed on this in a previous thread, its nice to clarify
> (2) I think we should not re-use the netlink attr to report feature bits
> of both the management device and the vDPA device,
> this can lead to a new race condition, there are no locks(especially
> distributed locks for kernel_space and user_space) in the nla_put
> functions. Re-using the attr is some kind of breaking the netlink
> lockless design.
>
> Thanks,
> Zhu Lingshan
> >
> >
> > What's going on here?
> >
> >
>
Si-Wei Liu July 27, 2022, 7:50 a.m. UTC | #28
On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>
>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>
>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>> _MQ
>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>> the only one which have precise information of a device, so it
>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>> the user space
>>>>> tool.
>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>> field doesn’t
>>>>> exist regardless of field should have default or no default.
>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>> to guess.
>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>> it is still a guess, right? And all user space tools implemented this
>>>>> feature need to guess
>>>> No. it is not a guess.
>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>> The code you proposed will be present in the user space.
>>>> It will be uniform for _MQ and 10 other features that are present now and
>>> in the future.
>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>> are no attributes like max_rss_key_size, and there is not a default value.
>>> But for MQ, we know it has to be 1 wihtout _MQ.
>> "we" = user space.
>> To keep the consistency among all the config space fields.
> Actually I looked and the code some more and I'm puzzled:
>
>
> 	struct virtio_net_config config = {};
> 	u64 features;
> 	u16 val_u16;
>
> 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>
> 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> 		    config.mac))
> 		return -EMSGSIZE;
>
>
> Mac returned even without VIRTIO_NET_F_MAC
>
>
> 	val_u16 = le16_to_cpu(config.status);
> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> 		return -EMSGSIZE;
>
>
> status returned even without VIRTIO_NET_F_STATUS
>
> 	val_u16 = le16_to_cpu(config.mtu);
> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> 		return -EMSGSIZE;
>
>
> MTU returned even without VIRTIO_NET_F_MTU
>
>
> What's going on here?
>
>
I guess this is a spec thing (historical debt); I vaguely recall these
fields are always present in config space regardless of the existence of
the corresponding feature bit.

-Siwei
Michael S. Tsirkin July 27, 2022, 9:01 a.m. UTC | #29
On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
> 
> 
> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
> > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > 
> > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > > 
> > > > > > On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > > Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > > > > > When the user space which invokes netlink commands, detects that
> > > > > > _MQ
> > > > > > > > is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > > > > I think the kernel module have all necessary information and it is
> > > > > > > > the only one which have precise information of a device, so it
> > > > > > > > should answer precisely than let the user space guess. The kernel
> > > > > > > > module should be reliable than stay silent, leave the question to
> > > > > > > > the user space
> > > > > > tool.
> > > > > > > Kernel is reliable. It doesn’t expose a config space field if the
> > > > > > > field doesn’t
> > > > > > exist regardless of field should have default or no default.
> > > > > > so when you know it is one queue pair, you should answer one, not try
> > > > > > to guess.
> > > > > > > User space should not guess either. User space gets to see if _MQ
> > > > > > present/not present. If _MQ present than get reliable data from kernel.
> > > > > > > If _MQ not present, it means this device has one VQ pair.
> > > > > > it is still a guess, right? And all user space tools implemented this
> > > > > > feature need to guess
> > > > > No. it is not a guess.
> > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > The code you proposed will be present in the user space.
> > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > in the future.
> > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > But for MQ, we know it has to be 1 wihtout _MQ.
> > > "we" = user space.
> > > To keep the consistency among all the config space fields.
> > Actually I looked and the code some more and I'm puzzled:
> > 
> > 
> > 	struct virtio_net_config config = {};
> > 	u64 features;
> > 	u16 val_u16;
> > 
> > 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > 
> > 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > 		    config.mac))
> > 		return -EMSGSIZE;
> > 
> > 
> > Mac returned even without VIRTIO_NET_F_MAC
> > 
> > 
> > 	val_u16 = le16_to_cpu(config.status);
> > 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > 		return -EMSGSIZE;
> > 
> > 
> > status returned even without VIRTIO_NET_F_STATUS
> > 
> > 	val_u16 = le16_to_cpu(config.mtu);
> > 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > 		return -EMSGSIZE;
> > 
> > 
> > MTU returned even without VIRTIO_NET_F_MTU
> > 
> > 
> > What's going on here?
> > 
> > 
> I guess this is spec thing (historical debt), I vaguely recall these fields
> are always present in config space regardless the existence of corresponding
> feature bit.
> 
> -Siwei

Nope:

2.5.1  Driver Requirements: Device Configuration Space

...

For optional configuration space fields, the driver MUST check that the corresponding feature is offered
before accessing that part of the configuration space.
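
As a rough illustration of what gating each field on its feature bit would
mean for the fill path quoted above, here is a minimal user-space sketch.
It is not the kernel code: struct net_config_sketch, struct
net_report_sketch and fill_report() are hypothetical stand-ins, the config
values are treated as host-endian, and "present" flags take the place of
netlink attributes. Only the feature bit numbers come from the virtio-net
specification.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Feature bit numbers assigned by the virtio-net specification. */
#define VIRTIO_NET_F_MTU     3
#define VIRTIO_NET_F_MAC     5
#define VIRTIO_NET_F_STATUS  16

/* Hypothetical, simplified stand-in for struct virtio_net_config. */
struct net_config_sketch {
	uint8_t  mac[6];
	uint16_t status;
	uint16_t mtu;
};

/* Hypothetical report: a "present" flag per field replaces a netlink attribute. */
struct net_report_sketch {
	bool     has_mac, has_status, has_mtu;
	uint8_t  mac[6];
	uint16_t status, mtu;
};

static bool has_feature(uint64_t features, unsigned int bit)
{
	return (features & (1ULL << bit)) != 0;
}

/* Copy a field into the report only when its feature bit is offered. */
static void fill_report(uint64_t device_features,
			const struct net_config_sketch *cfg,
			struct net_report_sketch *out)
{
	memset(out, 0, sizeof(*out));

	if (has_feature(device_features, VIRTIO_NET_F_MAC)) {
		out->has_mac = true;
		memcpy(out->mac, cfg->mac, sizeof(out->mac));
	}
	if (has_feature(device_features, VIRTIO_NET_F_STATUS)) {
		out->has_status = true;
		out->status = cfg->status;
	}
	if (has_feature(device_features, VIRTIO_NET_F_MTU)) {
		out->has_mtu = true;
		out->mtu = cfg->mtu;
	}
}
```

With this shape, a device that offers only VIRTIO_NET_F_MAC yields a report
with the MAC set and neither status nor MTU marked present.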
Michael S. Tsirkin July 27, 2022, 9:02 a.m. UTC | #30
On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > >
> > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > >
> > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > > >>
> > > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > > >>>>> When the user space which invokes netlink commands, detects that
> > > > >> _MQ
> > > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > >>>> I think the kernel module have all necessary information and it is
> > > > >>>> the only one which have precise information of a device, so it
> > > > >>>> should answer precisely than let the user space guess. The kernel
> > > > >>>> module should be reliable than stay silent, leave the question to
> > > > >>>> the user space
> > > > >> tool.
> > > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > > >>> field doesn’t
> > > > >> exist regardless of field should have default or no default.
> > > > >> so when you know it is one queue pair, you should answer one, not try
> > > > >> to guess.
> > > > >>> User space should not guess either. User space gets to see if _MQ
> > > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > > >>> If _MQ not present, it means this device has one VQ pair.
> > > > >> it is still a guess, right? And all user space tools implemented this
> > > > >> feature need to guess
> > > > > No. it is not a guess.
> > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > The code you proposed will be present in the user space.
> > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > in the future.
> > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > But for MQ, we know it has to be 1 wihtout _MQ.
> > > "we" = user space.
> > > To keep the consistency among all the config space fields.
> >
> > Actually I looked and the code some more and I'm puzzled:
> >
> >
> >         struct virtio_net_config config = {};
> >         u64 features;
> >         u16 val_u16;
> >
> >         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> >
> >         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> >                     config.mac))
> >                 return -EMSGSIZE;
> >
> >
> > Mac returned even without VIRTIO_NET_F_MAC
> >
> >
> >         val_u16 = le16_to_cpu(config.status);
> >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> >                 return -EMSGSIZE;
> >
> >
> > status returned even without VIRTIO_NET_F_STATUS
> >
> >         val_u16 = le16_to_cpu(config.mtu);
> >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> >                 return -EMSGSIZE;
> >
> >
> > MTU returned even without VIRTIO_NET_F_MTU
> >
> >
> > What's going on here?
> 
> Probably too late to fix, but this should be fine as long as all
> parents support STATUS/MTU/MAC.

Why is this too late to fix?

> I wonder if we can add a check in the core and fail the device
> registration in this case.
> 
> Thanks
> 
> >
> >
> > --
> > MST
> >
Michael S. Tsirkin July 27, 2022, 9:05 a.m. UTC | #31
On Wed, Jul 27, 2022 at 02:56:20PM +0800, Jason Wang wrote:
> On Wed, Jul 27, 2022 at 2:26 PM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >
> >
> >
> > On 7/27/2022 2:01 PM, Michael S. Tsirkin wrote:
> > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > >>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>> Sent: Tuesday, July 26, 2022 10:53 PM
> > >>>
> > >>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > >>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> > >>>>>
> > >>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > >>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > >>>>>>>> When the user space which invokes netlink commands, detects that
> > >>>>> _MQ
> > >>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > >>>>>>> I think the kernel module have all necessary information and it is
> > >>>>>>> the only one which have precise information of a device, so it
> > >>>>>>> should answer precisely than let the user space guess. The kernel
> > >>>>>>> module should be reliable than stay silent, leave the question to
> > >>>>>>> the user space
> > >>>>> tool.
> > >>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> > >>>>>> field doesn’t
> > >>>>> exist regardless of field should have default or no default.
> > >>>>> so when you know it is one queue pair, you should answer one, not try
> > >>>>> to guess.
> > >>>>>> User space should not guess either. User space gets to see if _MQ
> > >>>>> present/not present. If _MQ present than get reliable data from kernel.
> > >>>>>> If _MQ not present, it means this device has one VQ pair.
> > >>>>> it is still a guess, right? And all user space tools implemented this
> > >>>>> feature need to guess
> > >>>> No. it is not a guess.
> > >>>> It is explicitly checking the _MQ feature and deriving the value.
> > >>>> The code you proposed will be present in the user space.
> > >>>> It will be uniform for _MQ and 10 other features that are present now and
> > >>> in the future.
> > >>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> > >>> are no attributes like max_rss_key_size, and there is not a default value.
> > >>> But for MQ, we know it has to be 1 wihtout _MQ.
> > >> "we" = user space.
> > >> To keep the consistency among all the config space fields.
> > > Actually I looked and the code some more and I'm puzzled:
> > I can submit a fix in my next version patch for these issue.
> > >
> > >
> > >       struct virtio_net_config config = {};
> > >       u64 features;
> > >       u16 val_u16;
> > >
> > >       vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > >
> > >       if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > >                   config.mac))
> > >               return -EMSGSIZE;
> > >
> > >
> > > Mac returned even without VIRTIO_NET_F_MAC
> > if no VIRTIO_NET_F_MAC, we should not nla_put
> > VDPA_ATTR_DEV_NET_CFG_MAC_ADDR, the spec says the driver should generate
> > a random mac.
> 
> It's probably too late to do this.

Not sure why.

> Most of the parents have this
> feature support, so probably not a real issue.

I guess not reporting MTU is not worse than failing initialization.

> > >
> > >
> > >       val_u16 = le16_to_cpu(config.status);
> > >       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > >               return -EMSGSIZE;
> > >
> > >
> > > status returned even without VIRTIO_NET_F_STATUS
> > if no VIRTIO_NET_F_STATUS, we should not nla_put
> > VDPA_ATTR_DEV_NET_STATUS, the spec says the driver should assume the
> > link is active.
> 
> Somehow similar to F_MAC. But we can report if F_MAC is not negotiated.
> 
> 
> > >
> > >       val_u16 = le16_to_cpu(config.mtu);
> > >       if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > >               return -EMSGSIZE;
> > >
> > >
> > > MTU returned even without VIRTIO_NET_F_MTU
> > same as above, the spec says config.mtu depends on VIRTIO_NET_F_MTU, so
> > without this feature bit, we should not return MTU to the userspace.
> 
> Not a big issue, we just need to make sure the parent can report a
> correct MTU here.
> 
> Thanks
> 
> >
> > Does these fix look good to you?
> >
> > And I think we may need your adjudication for the two issues:
> > (1) Shall we answer max_vq_paris = 1 when _MQ not exist, I know you have
> > agreed on this in a previous thread, its nice to clarify
> > (2) I think we should not re-use the netlink attr to report feature bits
> > of both the management device and the vDPA device,
> > this can lead to a new race condition, there are no locks(especially
> > distributed locks for kernel_space and user_space) in the nla_put
> > functions. Re-using the attr is some kind of breaking the netlink
> > lockless design.
> >
> > Thanks,
> > Zhu Lingshan
> > >
> > >
> > > What's going on here?
> > >
> > >
> >
Jason Wang July 27, 2022, 9:50 a.m. UTC | #32
On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > >
> > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > >
> > > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > >>
> > > > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > >>>>> When the user space which invokes netlink commands, detects that
> > > > > >> _MQ
> > > > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > >>>> I think the kernel module have all necessary information and it is
> > > > > >>>> the only one which have precise information of a device, so it
> > > > > >>>> should answer precisely than let the user space guess. The kernel
> > > > > >>>> module should be reliable than stay silent, leave the question to
> > > > > >>>> the user space
> > > > > >> tool.
> > > > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > > > >>> field doesn’t
> > > > > >> exist regardless of field should have default or no default.
> > > > > >> so when you know it is one queue pair, you should answer one, not try
> > > > > >> to guess.
> > > > > >>> User space should not guess either. User space gets to see if _MQ
> > > > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > > > >>> If _MQ not present, it means this device has one VQ pair.
> > > > > >> it is still a guess, right? And all user space tools implemented this
> > > > > >> feature need to guess
> > > > > > No. it is not a guess.
> > > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > > The code you proposed will be present in the user space.
> > > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > > in the future.
> > > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > > But for MQ, we know it has to be 1 wihtout _MQ.
> > > > "we" = user space.
> > > > To keep the consistency among all the config space fields.
> > >
> > > Actually I looked and the code some more and I'm puzzled:
> > >
> > >
> > >         struct virtio_net_config config = {};
> > >         u64 features;
> > >         u16 val_u16;
> > >
> > >         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > >
> > >         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > >                     config.mac))
> > >                 return -EMSGSIZE;
> > >
> > >
> > > Mac returned even without VIRTIO_NET_F_MAC
> > >
> > >
> > >         val_u16 = le16_to_cpu(config.status);
> > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > >                 return -EMSGSIZE;
> > >
> > >
> > > status returned even without VIRTIO_NET_F_STATUS
> > >
> > >         val_u16 = le16_to_cpu(config.mtu);
> > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > >                 return -EMSGSIZE;
> > >
> > >
> > > MTU returned even without VIRTIO_NET_F_MTU
> > >
> > >
> > > What's going on here?
> >
> > Probably too late to fix, but this should be fine as long as all
> > parents support STATUS/MTU/MAC.
>
> Why is this too late to fix.

If we make this conditional on the features, it may break userspace
that always expects VDPA_ATTR_DEV_NET_CFG_MTU?

Thanks

>
> > I wonder if we can add a check in the core and fail the device
> > registration in this case.
> >
> > Thanks
> >
> > >
> > >
> > > --
> > > MST
> > >
>
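
To make the compatibility concern concrete, here is a small sketch of two
user-space consumers. Everything in it is hypothetical: struct
mtu_attr_sketch models an already-decoded reply in which the MTU attribute
may or may not be present, and neither function corresponds to real
iproute2 code. The first consumer is the kind that would break if the
attribute became conditional; the second treats it as optional.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded reply: one optional attribute, as a flag plus value. */
struct mtu_attr_sketch {
	bool     present;
	uint16_t value;
};

/* Consumer that assumes the attribute always exists; fails once it is omitted. */
static int read_mtu_strict(const struct mtu_attr_sketch *attr, uint16_t *mtu)
{
	if (!attr->present)
		return -1;
	*mtu = attr->value;
	return 0;
}

/* Consumer that treats the attribute as optional; *mtu is untouched if absent. */
static bool read_mtu_optional(const struct mtu_attr_sketch *attr, uint16_t *mtu)
{
	if (attr->present)
		*mtu = attr->value;
	return attr->present;
}
```

Whether any deployed tool actually behaves like the strict variant is
exactly the open question here; the sketch only shows the difference in
behaviour.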
Si-Wei Liu July 27, 2022, 10:09 a.m. UTC | #33
On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>
>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>
>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>
>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>>>> _MQ
>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>>>> the user space
>>>>>>> tool.
>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>>>> field doesn’t
>>>>>>> exist regardless of field should have default or no default.
>>>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>>>> to guess.
>>>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>> it is still a guess, right? And all user space tools implemented this
>>>>>>> feature need to guess
>>>>>> No. it is not a guess.
>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>> The code you proposed will be present in the user space.
>>>>>> It will be uniform for _MQ and 10 other features that are present now and
>>>>> in the future.
>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>>>> are no attributes like max_rss_key_size, and there is not a default value.
>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>> "we" = user space.
>>>> To keep the consistency among all the config space fields.
>>> Actually I looked and the code some more and I'm puzzled:
>>>
>>>
>>> 	struct virtio_net_config config = {};
>>> 	u64 features;
>>> 	u16 val_u16;
>>>
>>> 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>
>>> 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
>>> 		    config.mac))
>>> 		return -EMSGSIZE;
>>>
>>>
>>> Mac returned even without VIRTIO_NET_F_MAC
>>>
>>>
>>> 	val_u16 = le16_to_cpu(config.status);
>>> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>> 		return -EMSGSIZE;
>>>
>>>
>>> status returned even without VIRTIO_NET_F_STATUS
>>>
>>> 	val_u16 = le16_to_cpu(config.mtu);
>>> 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>> 		return -EMSGSIZE;
>>>
>>>
>>> MTU returned even without VIRTIO_NET_F_MTU
>>>
>>>
>>> What's going on here?
>>>
>>>
>> I guess this is spec thing (historical debt), I vaguely recall these fields
>> are always present in config space regardless the existence of corresponding
>> feature bit.
>>
>> -Siwei
> Nope:
>
> 2.5.1  Driver Requirements: Device Configuration Space
>
> ...
>
> For optional configuration space fields, the driver MUST check that the corresponding feature is offered
> before accessing that part of the configuration space.
Well, this is the driver side of the requirement. As this interface is
for a host admin tool to query or configure a vdpa device, we don't have
to wait until feature negotiation is done by the guest driver to extract
vdpa attributes/parameters, say if we want to replicate another vdpa
device with the same config on the migration destination. I think what
may need to be fixed is to move away from using
vdpa_get_config_unlocked(), which depends on feature negotiation, and/or
to expose config space register values through another set of attributes.

-Siwei
Zhu, Lingshan July 27, 2022, 11:54 a.m. UTC | #34
On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>
>
> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>
>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>
>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>
>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>> When the user space which invokes netlink commands, detects 
>>>>>>>>>>> that
>>>>>>>> _MQ
>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>> I think the kernel module have all necessary information and 
>>>>>>>>>> it is
>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>> should answer precisely than let the user space guess. The 
>>>>>>>>>> kernel
>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>> question to
>>>>>>>>>> the user space
>>>>>>>> tool.
>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>>>>> field doesn’t
>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>> so when you know it is one queue pair, you should answer one, 
>>>>>>>> not try
>>>>>>>> to guess.
>>>>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>>>>> present/not present. If _MQ present than get reliable data from 
>>>>>>>> kernel.
>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>> implemented this
>>>>>>>> feature need to guess
>>>>>>> No. it is not a guess.
>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>> The code you proposed will be present in the user space.
>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>> present now and
>>>>>> in the future.
>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>> _RSS_XX, there
>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>> default value.
>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>>> "we" = user space.
>>>>> To keep the consistency among all the config space fields.
>>>> Actually I looked and the code some more and I'm puzzled:
>>>>
>>>>
>>>>     struct virtio_net_config config = {};
>>>>     u64 features;
>>>>     u16 val_u16;
>>>>
>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>
>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
>>>>             config.mac))
>>>>         return -EMSGSIZE;
>>>>
>>>>
>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>
>>>>
>>>>     val_u16 = le16_to_cpu(config.status);
>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>         return -EMSGSIZE;
>>>>
>>>>
>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>
>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>         return -EMSGSIZE;
>>>>
>>>>
>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>
>>>>
>>>> What's going on here?
>>>>
>>>>
>>> I guess this is spec thing (historical debt), I vaguely recall these 
>>> fields
>>> are always present in config space regardless the existence of 
>>> corresponding
>>> feature bit.
>>>
>>> -Siwei
>> Nope:
>>
>> 2.5.1  Driver Requirements: Device Configuration Space
>>
>> ...
>>
>> For optional configuration space fields, the driver MUST check that 
>> the corresponding feature is offered
>> before accessing that part of the configuration space.
> Well, this is driver side of requirement. As this interface is for 
> host admin tool to query or configure vdpa device, we don't have to 
> wait until feature negotiation is done on guest driver to extract vdpa 
> attributes/parameters, say if we want to replicate another vdpa device 
> with the same config on migration destination. I think what may need 
> to be fix is to move off from using .vdpa_get_config_unlocked() which 
> depends on feature negotiation. And/or expose config space register 
> values through another set of attributes.
Yes, we don't have to wait for FEATURES_OK. In another patch in this 
series, I have added a new netlink attr to report the device features, 
and removed the blocker. So the LM orchestration SW can query the device 
features of the devices at the destination cluster, and pick a proper 
one, even mask out some features to meet the LM requirements.

Thanks,
Zhu Lingshan
> -Siwei
>
>
>
>
Michael S. Tsirkin July 27, 2022, 3:45 p.m. UTC | #35
On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > > On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > > >
> > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > > >
> > > > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > > >>
> > > > > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > > >>>>> When the user space which invokes netlink commands, detects that
> > > > > > >> _MQ
> > > > > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > > >>>> I think the kernel module have all necessary information and it is
> > > > > > >>>> the only one which have precise information of a device, so it
> > > > > > >>>> should answer precisely than let the user space guess. The kernel
> > > > > > >>>> module should be reliable than stay silent, leave the question to
> > > > > > >>>> the user space
> > > > > > >> tool.
> > > > > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > > > > >>> field doesn’t
> > > > > > >> exist regardless of field should have default or no default.
> > > > > > >> so when you know it is one queue pair, you should answer one, not try
> > > > > > >> to guess.
> > > > > > >>> User space should not guess either. User space gets to see if _MQ
> > > > > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > > > > >>> If _MQ not present, it means this device has one VQ pair.
> > > > > > >> it is still a guess, right? And all user space tools implemented this
> > > > > > >> feature need to guess
> > > > > > > No. it is not a guess.
> > > > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > > > The code you proposed will be present in the user space.
> > > > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > > > in the future.
> > > > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > > > But for MQ, we know it has to be 1 wihtout _MQ.
> > > > > "we" = user space.
> > > > > To keep the consistency among all the config space fields.
> > > >
> > > > Actually I looked and the code some more and I'm puzzled:
> > > >
> > > >
> > > >         struct virtio_net_config config = {};
> > > >         u64 features;
> > > >         u16 val_u16;
> > > >
> > > >         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > >
> > > >         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > > >                     config.mac))
> > > >                 return -EMSGSIZE;
> > > >
> > > >
> > > > Mac returned even without VIRTIO_NET_F_MAC
> > > >
> > > >
> > > >         val_u16 = le16_to_cpu(config.status);
> > > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > >                 return -EMSGSIZE;
> > > >
> > > >
> > > > status returned even without VIRTIO_NET_F_STATUS
> > > >
> > > >         val_u16 = le16_to_cpu(config.mtu);
> > > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > >                 return -EMSGSIZE;
> > > >
> > > >
> > > > MTU returned even without VIRTIO_NET_F_MTU
> > > >
> > > >
> > > > What's going on here?
> > >
> > > Probably too late to fix, but this should be fine as long as all
> > > parents support STATUS/MTU/MAC.
> >
> > Why is this too late to fix.
> 
> If we make this conditional on the features. This may break the
> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> 
> Thanks

Well, only on devices without MTU. I'm saying said userspace
was reading trash on such devices anyway.
We don't generally maintain bug-for-bug compatibility on a whim,
only if userspace is actually known to break if we fix a bug.


> >
> > > I wonder if we can add a check in the core and fail the device
> > > registration in this case.
> > >
> > > Thanks
> > >
> > > >
> > > >
> > > > --
> > > > MST
> > > >
> >
Michael S. Tsirkin July 27, 2022, 3:48 p.m. UTC | #36
On Wed, Jul 27, 2022 at 03:09:43AM -0700, Si-Wei Liu wrote:
> 
> 
> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
> > On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
> > > 
> > > On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > > > 
> > > > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > > Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > > > > 
> > > > > > > > On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > > > > Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > > > > > > > When the user space which invokes netlink commands, detects that
> > > > > > > > _MQ
> > > > > > > > > > is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > > > > > > I think the kernel module have all necessary information and it is
> > > > > > > > > > the only one which have precise information of a device, so it
> > > > > > > > > > should answer precisely than let the user space guess. The kernel
> > > > > > > > > > module should be reliable than stay silent, leave the question to
> > > > > > > > > > the user space
> > > > > > > > tool.
> > > > > > > > > Kernel is reliable. It doesn’t expose a config space field if the
> > > > > > > > > field doesn’t
> > > > > > > > exist regardless of field should have default or no default.
> > > > > > > > so when you know it is one queue pair, you should answer one, not try
> > > > > > > > to guess.
> > > > > > > > > User space should not guess either. User space gets to see if _MQ
> > > > > > > > present/not present. If _MQ present than get reliable data from kernel.
> > > > > > > > > If _MQ not present, it means this device has one VQ pair.
> > > > > > > > it is still a guess, right? And all user space tools implemented this
> > > > > > > > feature need to guess
> > > > > > > No. it is not a guess.
> > > > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > > > The code you proposed will be present in the user space.
> > > > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > > > in the future.
> > > > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > > > But for MQ, we know it has to be 1 wihtout _MQ.
> > > > > "we" = user space.
> > > > > To keep the consistency among all the config space fields.
> > > > Actually I looked and the code some more and I'm puzzled:
> > > > 
> > > > 
> > > > 	struct virtio_net_config config = {};
> > > > 	u64 features;
> > > > 	u16 val_u16;
> > > > 
> > > > 	vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > > 
> > > > 	if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > > > 		    config.mac))
> > > > 		return -EMSGSIZE;
> > > > 
> > > > 
> > > > Mac returned even without VIRTIO_NET_F_MAC
> > > > 
> > > > 
> > > > 	val_u16 = le16_to_cpu(config.status);
> > > > 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > > 		return -EMSGSIZE;
> > > > 
> > > > 
> > > > status returned even without VIRTIO_NET_F_STATUS
> > > > 
> > > > 	val_u16 = le16_to_cpu(config.mtu);
> > > > 	if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > > 		return -EMSGSIZE;
> > > > 
> > > > 
> > > > MTU returned even without VIRTIO_NET_F_MTU
> > > > 
> > > > 
> > > > What's going on here?
> > > > 
> > > > 
> > > I guess this is spec thing (historical debt), I vaguely recall these fields
> > > are always present in config space regardless the existence of corresponding
> > > feature bit.
> > > 
> > > -Siwei
> > Nope:
> > 
> > 2.5.1  Driver Requirements: Device Configuration Space
> > 
> > ...
> > 
> > For optional configuration space fields, the driver MUST check that the corresponding feature is offered
> > before accessing that part of the configuration space.
> Well, this is driver side of requirement.


Well, driver and device are the only two entities in the spec.

> As this interface is for host
> admin tool to query or configure vdpa device, we don't have to wait until
> feature negotiation is done on guest driver to extract vdpa
> attributes/parameters, say if we want to replicate another vdpa device with
> the same config on migration destination. I think what may need to be fix is
> to move off from using .vdpa_get_config_unlocked() which depends on feature
> negotiation. And/or expose config space register values through another set
> of attributes.
> 
> -Siwei
> 
> 

Sounds like something that might use the proposed admin queue maybe.
Hope that makes progress ...
Jason Wang July 28, 2022, 1:21 a.m. UTC | #37
On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> > On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > > > On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > > > >
> > > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > Sent: Tuesday, July 26, 2022 10:53 PM
> > > > > > >
> > > > > > > On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > > > > > >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > >> Sent: Tuesday, July 26, 2022 10:15 PM
> > > > > > > >>
> > > > > > > >> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > > > > > >>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > >>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > > > > > >>>>> When the user space which invokes netlink commands, detects that
> > > > > > > >> _MQ
> > > > > > > >>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > > > > > >>>> I think the kernel module have all necessary information and it is
> > > > > > > >>>> the only one which have precise information of a device, so it
> > > > > > > >>>> should answer precisely than let the user space guess. The kernel
> > > > > > > >>>> module should be reliable than stay silent, leave the question to
> > > > > > > >>>> the user space
> > > > > > > >> tool.
> > > > > > > >>> Kernel is reliable. It doesn’t expose a config space field if the
> > > > > > > >>> field doesn’t
> > > > > > > >> exist regardless of field should have default or no default.
> > > > > > > >> so when you know it is one queue pair, you should answer one, not try
> > > > > > > >> to guess.
> > > > > > > >>> User space should not guess either. User space gets to see if _MQ
> > > > > > > >> present/not present. If _MQ present than get reliable data from kernel.
> > > > > > > >>> If _MQ not present, it means this device has one VQ pair.
> > > > > > > >> it is still a guess, right? And all user space tools implemented this
> > > > > > > >> feature need to guess
> > > > > > > > No. it is not a guess.
> > > > > > > > It is explicitly checking the _MQ feature and deriving the value.
> > > > > > > > The code you proposed will be present in the user space.
> > > > > > > > It will be uniform for _MQ and 10 other features that are present now and
> > > > > > > in the future.
> > > > > > > MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > > > > > are no attributes like max_rss_key_size, and there is not a default value.
> > > > > > > But for MQ, we know it has to be 1 wihtout _MQ.
> > > > > > "we" = user space.
> > > > > > To keep the consistency among all the config space fields.
> > > > >
> > > > > Actually I looked and the code some more and I'm puzzled:
> > > > >
> > > > >
> > > > >         struct virtio_net_config config = {};
> > > > >         u64 features;
> > > > >         u16 val_u16;
> > > > >
> > > > >         vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > > >
> > > > >         if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > > > >                     config.mac))
> > > > >                 return -EMSGSIZE;
> > > > >
> > > > >
> > > > > Mac returned even without VIRTIO_NET_F_MAC
> > > > >
> > > > >
> > > > >         val_u16 = le16_to_cpu(config.status);
> > > > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > > >                 return -EMSGSIZE;
> > > > >
> > > > >
> > > > > status returned even without VIRTIO_NET_F_STATUS
> > > > >
> > > > >         val_u16 = le16_to_cpu(config.mtu);
> > > > >         if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > > >                 return -EMSGSIZE;
> > > > >
> > > > >
> > > > > MTU returned even without VIRTIO_NET_F_MTU
> > > > >
> > > > >
> > > > > What's going on here?
> > > >
> > > > Probably too late to fix, but this should be fine as long as all
> > > > parents support STATUS/MTU/MAC.
> > >
> > > Why is this too late to fix.
> >
> > If we make this conditional on the features. This may break the
> > userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> >
> > Thanks
>
> Well only on devices without MTU. I'm saying said userspace
> was reading trash on such devices anyway.

It depends on the parent, actually. For example, mlx5 queries the lower
mtu unconditionally:

        err = query_mtu(mdev, &mtu);
        if (err)
                goto err_alloc;

        ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);

Supporting the MTU feature seems to be a must for real hardware.
Otherwise the driver may not work correctly.

> We don't generally maintain bug for bug compatiblity on a whim,
> only if userspace is actually known to break if we fix a bug.

So I think it should be fine to make this conditional; then we should
handle other fields like MQ consistently.
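
A minimal sketch of what "conditional" could look like, written as
standalone userspace C rather than kernel code. The sk_* structs and
functions are hypothetical stand-ins for the netlink fill path and are
not taken from drivers/vdpa/vdpa.c; only the feature bit numbers come
from the virtio-net spec.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bit numbers from the virtio-net spec (same values as linux/virtio_net.h),
 * redefined under sketch names so this builds without kernel headers. */
#define SK_VIRTIO_NET_F_MTU     3
#define SK_VIRTIO_NET_F_MAC     5
#define SK_VIRTIO_NET_F_STATUS 16

/* Hypothetical stand-in for struct virtio_net_config, already in CPU order. */
struct sk_net_config {
	uint8_t mac[6];
	uint16_t status;
	uint16_t mtu;
};

/* Hypothetical stand-in for the netlink message: one flag per attribute. */
struct sk_report {
	bool has_mac, has_status, has_mtu;
	uint8_t mac[6];
	uint16_t status, mtu;
};

static bool sk_has_feature(uint64_t features, unsigned int bit)
{
	return (features & (1ULL << bit)) != 0;
}

/* Emit a config field only when the device offers the matching feature. */
static void sk_fill_report(const struct sk_net_config *cfg,
			   uint64_t device_features, struct sk_report *out)
{
	memset(out, 0, sizeof(*out));

	if (sk_has_feature(device_features, SK_VIRTIO_NET_F_MAC)) {
		memcpy(out->mac, cfg->mac, sizeof(out->mac));
		out->has_mac = true;
	}
	if (sk_has_feature(device_features, SK_VIRTIO_NET_F_STATUS)) {
		out->status = cfg->status;
		out->has_status = true;
	}
	if (sk_has_feature(device_features, SK_VIRTIO_NET_F_MTU)) {
		out->mtu = cfg->mtu;
		out->has_mtu = true;
	}
}
```

With this shape, a device that offers only _F_MAC would produce a report
with the MAC attribute and nothing for status or MTU, so userspace never
sees a value the device does not back.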

Thanks

>
>
> > >
> > > > I wonder if we can add a check in the core and fail the device
> > > > registration in this case.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > >
> > > > > --
> > > > > MST
> > > > >
> > >
>
Si-Wei Liu July 28, 2022, 1:41 a.m. UTC | #38
On 7/27/2022 4:54 AM, Zhu, Lingshan wrote:
>
>
> On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>>
>>
>> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>>
>>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>
>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>
>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>> When the user space which invokes netlink commands, detects 
>>>>>>>>>>>> that
>>>>>>>>> _MQ
>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>>> I think the kernel module have all necessary information and 
>>>>>>>>>>> it is
>>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>>> should answer precisely than let the user space guess. The 
>>>>>>>>>>> kernel
>>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>>> question to
>>>>>>>>>>> the user space
>>>>>>>>> tool.
>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if 
>>>>>>>>>> the
>>>>>>>>>> field doesn’t
>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>> so when you know it is one queue pair, you should answer one, 
>>>>>>>>> not try
>>>>>>>>> to guess.
>>>>>>>>>> User space should not guess either. User space gets to see if 
>>>>>>>>>> _MQ
>>>>>>>>> present/not present. If _MQ present than get reliable data 
>>>>>>>>> from kernel.
>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>>> implemented this
>>>>>>>>> feature need to guess
>>>>>>>> No. it is not a guess.
>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>>> present now and
>>>>>>> in the future.
>>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>>> _RSS_XX, there
>>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>>> default value.
>>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>>>> "we" = user space.
>>>>>> To keep the consistency among all the config space fields.
>>>>> Actually I looked and the code some more and I'm puzzled:
>>>>>
>>>>>
>>>>>     struct virtio_net_config config = {};
>>>>>     u64 features;
>>>>>     u16 val_u16;
>>>>>
>>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>
>>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>>>> sizeof(config.mac),
>>>>>             config.mac))
>>>>>         return -EMSGSIZE;
>>>>>
>>>>>
>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>
>>>>>
>>>>>     val_u16 = le16_to_cpu(config.status);
>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>         return -EMSGSIZE;
>>>>>
>>>>>
>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>
>>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>         return -EMSGSIZE;
>>>>>
>>>>>
>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>
>>>>>
>>>>> What's going on here?
>>>>>
>>>>>
>>>> I guess this is spec thing (historical debt), I vaguely recall 
>>>> these fields
>>>> are always present in config space regardless the existence of 
>>>> corresponding
>>>> feature bit.
>>>>
>>>> -Siwei
>>> Nope:
>>>
>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>
>>> ...
>>>
>>> For optional configuration space fields, the driver MUST check that 
>>> the corresponding feature is offered
>>> before accessing that part of the configuration space.
>> Well, this is driver side of requirement. As this interface is for 
>> host admin tool to query or configure vdpa device, we don't have to 
>> wait until feature negotiation is done on guest driver to extract 
>> vdpa attributes/parameters, say if we want to replicate another vdpa 
>> device with the same config on migration destination. I think what 
>> may need to be fix is to move off from using 
>> .vdpa_get_config_unlocked() which depends on feature negotiation. 
>> And/or expose config space register values through another set of 
>> attributes.
> Yes, we don't have to wait for FEATURES_OK. In another patch in this 
> series, I have added a new netlink attr to report the device features, 
> and removed the blocker. So the LM orchestration SW can query the 
> device features of the devices at the destination cluster, and pick a 
> proper one, even mask out some features to meet the LM requirements.
To that end, you'd need to move off from using 
vdpa_get_config_unlocked() which depends on feature negotiation. Since 
this would slightly change the original semantics of each field that 
"vdpa dev config" shows, it probably needs another netlink command and 
new uAPI.

-Siwei


>
> Thanks,
> Zhu Lingshan
>> -Siwei
>>
>>
>>
>>
>
Zhu, Lingshan July 28, 2022, 2:44 a.m. UTC | #39
On 7/28/2022 9:41 AM, Si-Wei Liu wrote:
>
>
> On 7/27/2022 4:54 AM, Zhu, Lingshan wrote:
>>
>>
>> On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>>>
>>>
>>> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>>>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>>>
>>>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>
>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>
>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>> When the user space which invokes netlink commands, 
>>>>>>>>>>>>> detects that
>>>>>>>>>> _MQ
>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>>>> I think the kernel module have all necessary information 
>>>>>>>>>>>> and it is
>>>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>>>> should answer precisely than let the user space guess. The 
>>>>>>>>>>>> kernel
>>>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>>>> question to
>>>>>>>>>>>> the user space
>>>>>>>>>> tool.
>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field 
>>>>>>>>>>> if the
>>>>>>>>>>> field doesn’t
>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>> so when you know it is one queue pair, you should answer one, 
>>>>>>>>>> not try
>>>>>>>>>> to guess.
>>>>>>>>>>> User space should not guess either. User space gets to see 
>>>>>>>>>>> if _MQ
>>>>>>>>>> present/not present. If _MQ present than get reliable data 
>>>>>>>>>> from kernel.
>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>>>> implemented this
>>>>>>>>>> feature need to guess
>>>>>>>>> No. it is not a guess.
>>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>>>> present now and
>>>>>>>> in the future.
>>>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>>>> _RSS_XX, there
>>>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>>>> default value.
>>>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>>>>> "we" = user space.
>>>>>>> To keep the consistency among all the config space fields.
>>>>>> Actually I looked and the code some more and I'm puzzled:
>>>>>>
>>>>>>
>>>>>>     struct virtio_net_config config = {};
>>>>>>     u64 features;
>>>>>>     u16 val_u16;
>>>>>>
>>>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>
>>>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>>>>> sizeof(config.mac),
>>>>>>             config.mac))
>>>>>>         return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>
>>>>>>
>>>>>>     val_u16 = le16_to_cpu(config.status);
>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>         return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>
>>>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>         return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>
>>>>>>
>>>>>> What's going on here?
>>>>>>
>>>>>>
>>>>> I guess this is spec thing (historical debt), I vaguely recall 
>>>>> these fields
>>>>> are always present in config space regardless the existence of 
>>>>> corresponding
>>>>> feature bit.
>>>>>
>>>>> -Siwei
>>>> Nope:
>>>>
>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>
>>>> ...
>>>>
>>>> For optional configuration space fields, the driver MUST check that 
>>>> the corresponding feature is offered
>>>> before accessing that part of the configuration space.
>>> Well, this is driver side of requirement. As this interface is for 
>>> host admin tool to query or configure vdpa device, we don't have to 
>>> wait until feature negotiation is done on guest driver to extract 
>>> vdpa attributes/parameters, say if we want to replicate another vdpa 
>>> device with the same config on migration destination. I think what 
>>> may need to be fix is to move off from using 
>>> .vdpa_get_config_unlocked() which depends on feature negotiation. 
>>> And/or expose config space register values through another set of 
>>> attributes.
>> Yes, we don't have to wait for FEATURES_OK. In another patch in this 
>> series, I have added a new netlink attr to report the device 
>> features, and removed the blocker. So the LM orchestration SW can 
>> query the device features of the devices at the destination cluster, 
>> and pick a proper one, even mask out some features to meet the LM 
>> requirements.
> For that end, you'd need to move off from using 
> vdpa_get_config_unlocked() which depends on feature negotiation. Since 
> this would slightly change the original semantics of each field that 
> "vdpa dev config" shows, it probably need another netlink command and 
> new uAPI.
Why not show both device_features and driver_features in "vdpa dev 
config show"?
>
> -Siwei
>
>
>>
>> Thanks,
>> Zhu Lingshan
>>> -Siwei
>>>
>>>
>>>
>>>
>>
>
Zhu, Lingshan July 28, 2022, 3:46 a.m. UTC | #40
On 7/28/2022 9:21 AM, Jason Wang wrote:
> On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
>>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
>>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>
>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>
>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>>>>>>> _MQ
>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>>>>>>> the user space
>>>>>>>>>> tool.
>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>>>>>>> field doesn’t
>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>>>>>>> to guess.
>>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>> it is still a guess, right? And all user space tools implemented this
>>>>>>>>>> feature need to guess
>>>>>>>>> No. it is not a guess.
>>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
>>>>>>>> in the future.
>>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
>>>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>>>>> "we" = user space.
>>>>>>> To keep the consistency among all the config space fields.
>>>>>> Actually I looked and the code some more and I'm puzzled:
>>>>>>
>>>>>>
>>>>>>          struct virtio_net_config config = {};
>>>>>>          u64 features;
>>>>>>          u16 val_u16;
>>>>>>
>>>>>>          vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>
>>>>>>          if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
>>>>>>                      config.mac))
>>>>>>                  return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>
>>>>>>
>>>>>>          val_u16 = le16_to_cpu(config.status);
>>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>                  return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>
>>>>>>          val_u16 = le16_to_cpu(config.mtu);
>>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>                  return -EMSGSIZE;
>>>>>>
>>>>>>
>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>
>>>>>>
>>>>>> What's going on here?
>>>>> Probably too late to fix, but this should be fine as long as all
>>>>> parents support STATUS/MTU/MAC.
>>>> Why is this too late to fix.
>>> If we make this conditional on the features. This may break the
>>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
>>>
>>> Thanks
>> Well only on devices without MTU. I'm saying said userspace
>> was reading trash on such devices anyway.
> It depends on the parent actually. For example, mlx5 query the lower
> mtu unconditionally:
>
>          err = query_mtu(mdev, &mtu);
>          if (err)
>                  goto err_alloc;
>
>          ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
>
> Supporting MTU features seems to be a must for real hardware.
> Otherwise the driver may not work correctly.
>
>> We don't generally maintain bug for bug compatiblity on a whim,
>> only if userspace is actually known to break if we fix a bug.
>   So I think it should be fine to make this conditional then we should
> have a consistent handling of other fields like MQ.
For some fields that have a default value, like MQ = 1, we can return the 
default value.
For other fields without a default value, like MAC, we return nothing.

Does this sound good? So for MTU, without _F_MTU, I think we can 
return 1500 by default.
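
A rough sketch of the rule proposed above, as standalone userspace C.
The dflt_* helpers are hypothetical and only illustrate the idea; the
1500 fallback for MTU is the suggestion made here, not something the
thread has agreed on. The feature bit numbers are the ones from the
virtio-net spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers from the virtio-net spec (same values as linux/virtio_net.h),
 * redefined under sketch names so this builds without kernel headers. */
#define DFLT_VIRTIO_NET_F_MTU  3
#define DFLT_VIRTIO_NET_F_MAC  5
#define DFLT_VIRTIO_NET_F_MQ  22

/* max_virtqueue_pairs: config value with _F_MQ, otherwise one queue pair. */
static uint16_t dflt_max_vq_pairs(uint64_t device_features, uint16_t cfg_max_vqp)
{
	if (device_features & (1ULL << DFLT_VIRTIO_NET_F_MQ))
		return cfg_max_vqp;
	return 1;
}

/* MTU: config value with _F_MTU, otherwise the proposed default of 1500. */
static uint16_t dflt_mtu(uint64_t device_features, uint16_t cfg_mtu)
{
	if (device_features & (1ULL << DFLT_VIRTIO_NET_F_MTU))
		return cfg_mtu;
	return 1500;
}

/* MAC has no default, so it is reported only when _F_MAC is offered. */
static bool dflt_mac_reported(uint64_t device_features)
{
	return (device_features & (1ULL << DFLT_VIRTIO_NET_F_MAC)) != 0;
}
```

For a device with no feature bits set, this would report max_vq_pairs 1
and mtu 1500 and leave the MAC attribute out.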

Thanks,
Zhu Lingshan
>
> Thanks
>
>>
>>>>> I wonder if we can add a check in the core and fail the device
>>>>> registration in this case.
>>>>>
>>>>> Thanks
>>>>>
>>>>>>
>>>>>> --
>>>>>> MST
>>>>>>
Jason Wang July 28, 2022, 5:53 a.m. UTC | #41
On Thu, Jul 28, 2022 at 11:47 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>
>
>
> On 7/28/2022 9:21 AM, Jason Wang wrote:
> > On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> >>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> >>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> >>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
> >>>>>>>>
> >>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> >>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> >>>>>>>>>>
> >>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> >>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> >>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
> >>>>>>>>>> _MQ
> >>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> >>>>>>>>>>>> I think the kernel module have all necessary information and it is
> >>>>>>>>>>>> the only one which have precise information of a device, so it
> >>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
> >>>>>>>>>>>> module should be reliable than stay silent, leave the question to
> >>>>>>>>>>>> the user space
> >>>>>>>>>> tool.
> >>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> >>>>>>>>>>> field doesn’t
> >>>>>>>>>> exist regardless of field should have default or no default.
> >>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
> >>>>>>>>>> to guess.
> >>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
> >>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
> >>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
> >>>>>>>>>> it is still a guess, right? And all user space tools implemented this
> >>>>>>>>>> feature need to guess
> >>>>>>>>> No. it is not a guess.
> >>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
> >>>>>>>>> The code you proposed will be present in the user space.
> >>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
> >>>>>>>> in the future.
> >>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> >>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
> >>>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
> >>>>>>> "we" = user space.
> >>>>>>> To keep the consistency among all the config space fields.
> >>>>>> Actually I looked and the code some more and I'm puzzled:
> >>>>>>
> >>>>>>
> >>>>>>          struct virtio_net_config config = {};
> >>>>>>          u64 features;
> >>>>>>          u16 val_u16;
> >>>>>>
> >>>>>>          vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> >>>>>>
> >>>>>>          if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> >>>>>>                      config.mac))
> >>>>>>                  return -EMSGSIZE;
> >>>>>>
> >>>>>>
> >>>>>> Mac returned even without VIRTIO_NET_F_MAC
> >>>>>>
> >>>>>>
> >>>>>>          val_u16 = le16_to_cpu(config.status);
> >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> >>>>>>                  return -EMSGSIZE;
> >>>>>>
> >>>>>>
> >>>>>> status returned even without VIRTIO_NET_F_STATUS
> >>>>>>
> >>>>>>          val_u16 = le16_to_cpu(config.mtu);
> >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> >>>>>>                  return -EMSGSIZE;
> >>>>>>
> >>>>>>
> >>>>>> MTU returned even without VIRTIO_NET_F_MTU
> >>>>>>
> >>>>>>
> >>>>>> What's going on here?
> >>>>> Probably too late to fix, but this should be fine as long as all
> >>>>> parents support STATUS/MTU/MAC.
> >>>> Why is this too late to fix.
> >>> If we make this conditional on the features. This may break the
> >>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> >>>
> >>> Thanks
> >> Well only on devices without MTU. I'm saying said userspace
> >> was reading trash on such devices anyway.
> > It depends on the parent actually. For example, mlx5 query the lower
> > mtu unconditionally:
> >
> >          err = query_mtu(mdev, &mtu);
> >          if (err)
> >                  goto err_alloc;
> >
> >          ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
> >
> > Supporting MTU features seems to be a must for real hardware.
> > Otherwise the driver may not work correctly.
> >
> >> We don't generally maintain bug for bug compatiblity on a whim,
> >> only if userspace is actually known to break if we fix a bug.
> >   So I think it should be fine to make this conditional then we should
> > have a consistent handling of other fields like MQ.
> For some fields that have a default value, like MQ =1, we can return the
> default value.
> For other fields without a default value, like MAC, we return nothing.
>
> Does this sounds good? So, for MTU, if without _F_MTU, I think we can
> return 1500 by default.

Or we can just read MTU from the device.

But it looks to me like Michael wants it conditional.

Thanks

>
> Thanks,
> Zhu Lingshan
> >
> > Thanks
> >
> >>
> >>>>> I wonder if we can add a check in the core and fail the device
> >>>>> registration in this case.
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>>>
> >>>>>> --
> >>>>>> MST
> >>>>>>
>
Zhu, Lingshan July 28, 2022, 6:02 a.m. UTC | #42
On 7/28/2022 1:53 PM, Jason Wang wrote:
> On Thu, Jul 28, 2022 at 11:47 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
>>
>>
>> On 7/28/2022 9:21 AM, Jason Wang wrote:
>>> On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
>>>>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
>>>>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
>>>>>>>>>>>> _MQ
>>>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
>>>>>>>>>>>>>> I think the kernel module have all necessary information and it is
>>>>>>>>>>>>>> the only one which have precise information of a device, so it
>>>>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
>>>>>>>>>>>>>> module should be reliable than stay silent, leave the question to
>>>>>>>>>>>>>> the user space
>>>>>>>>>>>> tool.
>>>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
>>>>>>>>>>>>> field doesn’t
>>>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
>>>>>>>>>>>> to guess.
>>>>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
>>>>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
>>>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>>>> it is still a guess, right? And all user space tools implemented this
>>>>>>>>>>>> feature need to guess
>>>>>>>>>>> No. it is not a guess.
>>>>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
>>>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
>>>>>>>>>> in the future.
>>>>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
>>>>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
>>>>>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>>>>>>> "we" = user space.
>>>>>>>>> To keep the consistency among all the config space fields.
>>>>>>>> Actually I looked and the code some more and I'm puzzled:
>>>>>>>>
>>>>>>>>
>>>>>>>>           struct virtio_net_config config = {};
>>>>>>>>           u64 features;
>>>>>>>>           u16 val_u16;
>>>>>>>>
>>>>>>>>           vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>>>
>>>>>>>>           if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
>>>>>>>>                       config.mac))
>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>>>
>>>>>>>>
>>>>>>>>           val_u16 = le16_to_cpu(config.status);
>>>>>>>>           if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>>>
>>>>>>>>           val_u16 = le16_to_cpu(config.mtu);
>>>>>>>>           if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>>>                   return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>>>
>>>>>>>>
>>>>>>>> What's going on here?
>>>>>>> Probably too late to fix, but this should be fine as long as all
>>>>>>> parents support STATUS/MTU/MAC.
>>>>>> Why is this too late to fix.
>>>>> If we make this conditional on the features. This may break the
>>>>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
>>>>>
>>>>> Thanks
>>>> Well only on devices without MTU. I'm saying said userspace
>>>> was reading trash on such devices anyway.
>>> It depends on the parent actually. For example, mlx5 query the lower
>>> mtu unconditionally:
>>>
>>>           err = query_mtu(mdev, &mtu);
>>>           if (err)
>>>                   goto err_alloc;
>>>
>>>           ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
>>>
>>> Supporting MTU features seems to be a must for real hardware.
>>> Otherwise the driver may not work correctly.
>>>
>>>> We don't generally maintain bug for bug compatiblity on a whim,
>>>> only if userspace is actually known to break if we fix a bug.
>>>    So I think it should be fine to make this conditional then we should
>>> have a consistent handling of other fields like MQ.
>> For some fields that have a default value, like MQ =1, we can return the
>> default value.
>> For other fields without a default value, like MAC, we return nothing.
>>
>> Does this sounds good? So, for MTU, if without _F_MTU, I think we can
>> return 1500 by default.
> Or we can just read MTU from the device.
>
> But It looks to me Michael wants it conditional.
If _F_MTU is offered, we can read it from the device config space. If
_F_MTU is not offered, I think it is conditional; however, there can be
a minimum default value, 1500 for Ethernet.

Thanks,
Zhu Lingshan
>
> Thanks
>
>> Thanks,
>> Zhu Lingshan
>>> Thanks
>>>
>>>>>>> I wonder if we can add a check in the core and fail the device
>>>>>>> registration in this case.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>>> --
>>>>>>>> MST
>>>>>>>>
Michael S. Tsirkin July 28, 2022, 6:41 a.m. UTC | #43
On Thu, Jul 28, 2022 at 01:53:51PM +0800, Jason Wang wrote:
> On Thu, Jul 28, 2022 at 11:47 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> >
> >
> >
> > On 7/28/2022 9:21 AM, Jason Wang wrote:
> > > On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> > >>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > >>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > >>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
> > >>>>>>>>
> > >>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > >>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> > >>>>>>>>>>
> > >>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > >>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > >>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > >>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
> > >>>>>>>>>> _MQ
> > >>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > >>>>>>>>>>>> I think the kernel module have all necessary information and it is
> > >>>>>>>>>>>> the only one which have precise information of a device, so it
> > >>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
> > >>>>>>>>>>>> module should be reliable than stay silent, leave the question to
> > >>>>>>>>>>>> the user space
> > >>>>>>>>>> tool.
> > >>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> > >>>>>>>>>>> field doesn’t
> > >>>>>>>>>> exist regardless of field should have default or no default.
> > >>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
> > >>>>>>>>>> to guess.
> > >>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
> > >>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
> > >>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
> > >>>>>>>>>> it is still a guess, right? And all user space tools implemented this
> > >>>>>>>>>> feature need to guess
> > >>>>>>>>> No. it is not a guess.
> > >>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
> > >>>>>>>>> The code you proposed will be present in the user space.
> > >>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
> > >>>>>>>> in the future.
> > >>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> > >>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
> > >>>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
> > >>>>>>> "we" = user space.
> > >>>>>>> To keep the consistency among all the config space fields.
> > >>>>>> Actually I looked and the code some more and I'm puzzled:
> > >>>>>>
> > >>>>>>
> > >>>>>>          struct virtio_net_config config = {};
> > >>>>>>          u64 features;
> > >>>>>>          u16 val_u16;
> > >>>>>>
> > >>>>>>          vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > >>>>>>
> > >>>>>>          if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > >>>>>>                      config.mac))
> > >>>>>>                  return -EMSGSIZE;
> > >>>>>>
> > >>>>>>
> > >>>>>> Mac returned even without VIRTIO_NET_F_MAC
> > >>>>>>
> > >>>>>>
> > >>>>>>          val_u16 = le16_to_cpu(config.status);
> > >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > >>>>>>                  return -EMSGSIZE;
> > >>>>>>
> > >>>>>>
> > >>>>>> status returned even without VIRTIO_NET_F_STATUS
> > >>>>>>
> > >>>>>>          val_u16 = le16_to_cpu(config.mtu);
> > >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > >>>>>>                  return -EMSGSIZE;
> > >>>>>>
> > >>>>>>
> > >>>>>> MTU returned even without VIRTIO_NET_F_MTU
> > >>>>>>
> > >>>>>>
> > >>>>>> What's going on here?
> > >>>>> Probably too late to fix, but this should be fine as long as all
> > >>>>> parents support STATUS/MTU/MAC.
> > >>>> Why is this too late to fix.
> > >>> If we make this conditional on the features. This may break the
> > >>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> > >>>
> > >>> Thanks
> > >> Well only on devices without MTU. I'm saying said userspace
> > >> was reading trash on such devices anyway.
> > > It depends on the parent actually. For example, mlx5 query the lower
> > > mtu unconditionally:
> > >
> > >          err = query_mtu(mdev, &mtu);
> > >          if (err)
> > >                  goto err_alloc;
> > >
> > >          ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
> > >
> > > Supporting MTU features seems to be a must for real hardware.
> > > Otherwise the driver may not work correctly.
> > >
> > >> We don't generally maintain bug for bug compatiblity on a whim,
> > >> only if userspace is actually known to break if we fix a bug.
> > >   So I think it should be fine to make this conditional then we should
> > > have a consistent handling of other fields like MQ.
> > For some fields that have a default value, like MQ =1, we can return the
> > default value.
> > For other fields without a default value, like MAC, we return nothing.
> >
> > Does this sounds good? So, for MTU, if without _F_MTU, I think we can
> > return 1500 by default.
> 
> Or we can just read MTU from the device.
> 
> But It looks to me Michael wants it conditional.
> 
> Thanks

I'm fine either way but let's keep it consistent. And I think
Parav wants it conditional.
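
To make "conditional" concrete, here is a minimal sketch, assuming a simple table that gates each reported field on its feature bit. The enum, the table and fields_to_report() are hypothetical names for illustration and are not from vdpa.c; only the feature bit numbers are from the spec.

```c
#include <stddef.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio-net spec. */
#define VIRTIO_NET_F_MTU    3
#define VIRTIO_NET_F_MAC    5
#define VIRTIO_NET_F_STATUS 16
#define VIRTIO_NET_F_MQ     22

/* Hypothetical flags naming the config fields that may be reported. */
enum cfg_field {
	FIELD_MAC    = 1 << 0,
	FIELD_STATUS = 1 << 1,
	FIELD_MQ     = 1 << 2,
	FIELD_MTU    = 1 << 3,
};

/* One row per optional field: the feature bit that gates it. */
struct field_gate {
	unsigned int feature_bit;
	unsigned int field;
};

static const struct field_gate gates[] = {
	{ VIRTIO_NET_F_MAC,    FIELD_MAC    },
	{ VIRTIO_NET_F_STATUS, FIELD_STATUS },
	{ VIRTIO_NET_F_MQ,     FIELD_MQ     },
	{ VIRTIO_NET_F_MTU,    FIELD_MTU    },
};

/* Return the set of fields to report for the given device features. */
static unsigned int fields_to_report(uint64_t device_features)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < sizeof(gates) / sizeof(gates[0]); i++)
		if (device_features & (1ULL << gates[i].feature_bit))
			mask |= gates[i].field;

	return mask;
}
```

With one table, MAC, STATUS, MQ and MTU all follow the same rule, which is the consistency being asked for.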

> >
> > Thanks,
> > Zhu Lingshan
> > >
> > > Thanks
> > >
> > >>
> > >>>>> I wonder if we can add a check in the core and fail the device
> > >>>>> registration in this case.
> > >>>>>
> > >>>>> Thanks
> > >>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> MST
> > >>>>>>
> >
Si-Wei Liu July 28, 2022, 9:54 p.m. UTC | #44
On 7/27/2022 7:44 PM, Zhu, Lingshan wrote:
>
>
> On 7/28/2022 9:41 AM, Si-Wei Liu wrote:
>>
>>
>> On 7/27/2022 4:54 AM, Zhu, Lingshan wrote:
>>>
>>>
>>> On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>>>>
>>>>
>>>> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>>>>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>>>>
>>>>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>>
>>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>>
>>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>>> When the user space which invokes netlink commands, 
>>>>>>>>>>>>>> detects that
>>>>>>>>>>> _MQ
>>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by 
>>>>>>>>>>>>> itself.
>>>>>>>>>>>>> I think the kernel module have all necessary information 
>>>>>>>>>>>>> and it is
>>>>>>>>>>>>> the only one which have precise information of a device, 
>>>>>>>>>>>>> so it
>>>>>>>>>>>>> should answer precisely than let the user space guess. The 
>>>>>>>>>>>>> kernel
>>>>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>>>>> question to
>>>>>>>>>>>>> the user space
>>>>>>>>>>> tool.
>>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field 
>>>>>>>>>>>> if the
>>>>>>>>>>>> field doesn’t
>>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>>> so when you know it is one queue pair, you should answer 
>>>>>>>>>>> one, not try
>>>>>>>>>>> to guess.
>>>>>>>>>>>> User space should not guess either. User space gets to see 
>>>>>>>>>>>> if _MQ
>>>>>>>>>>> present/not present. If _MQ present than get reliable data 
>>>>>>>>>>> from kernel.
>>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>>>>> implemented this
>>>>>>>>>>> feature need to guess
>>>>>>>>>> No. it is not a guess.
>>>>>>>>>> It is explicitly checking the _MQ feature and deriving the 
>>>>>>>>>> value.
>>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>>>>> present now and
>>>>>>>>> in the future.
>>>>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>>>>> _RSS_XX, there
>>>>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>>>>> default value.
>>>>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>>>>>> "we" = user space.
>>>>>>>> To keep the consistency among all the config space fields.
>>>>>>> Actually I looked and the code some more and I'm puzzled:
>>>>>>>
>>>>>>>
>>>>>>>     struct virtio_net_config config = {};
>>>>>>>     u64 features;
>>>>>>>     u16 val_u16;
>>>>>>>
>>>>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>>
>>>>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>>>>>> sizeof(config.mac),
>>>>>>>             config.mac))
>>>>>>>         return -EMSGSIZE;
>>>>>>>
>>>>>>>
>>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>>
>>>>>>>
>>>>>>>     val_u16 = le16_to_cpu(config.status);
>>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>>         return -EMSGSIZE;
>>>>>>>
>>>>>>>
>>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>>
>>>>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>>         return -EMSGSIZE;
>>>>>>>
>>>>>>>
>>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>>
>>>>>>>
>>>>>>> What's going on here?
>>>>>>>
>>>>>>>
>>>>>> I guess this is spec thing (historical debt), I vaguely recall 
>>>>>> these fields
>>>>>> are always present in config space regardless the existence of 
>>>>>> corresponding
>>>>>> feature bit.
>>>>>>
>>>>>> -Siwei
>>>>> Nope:
>>>>>
>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>
>>>>> ...
>>>>>
>>>>> For optional configuration space fields, the driver MUST check 
>>>>> that the corresponding feature is offered
>>>>> before accessing that part of the configuration space.
>>>> Well, this is driver side of requirement. As this interface is for 
>>>> host admin tool to query or configure vdpa device, we don't have to 
>>>> wait until feature negotiation is done on guest driver to extract 
>>>> vdpa attributes/parameters, say if we want to replicate another 
>>>> vdpa device with the same config on migration destination. I think 
>>>> what may need to be fix is to move off from using 
>>>> .vdpa_get_config_unlocked() which depends on feature negotiation. 
>>>> And/or expose config space register values through another set of 
>>>> attributes.
>>> Yes, we don't have to wait for FEATURES_OK. In another patch in this 
>>> series, I have added a new netlink attr to report the device 
>>> features, and removed the blocker. So the LM orchestration SW can 
>>> query the device features of the devices at the destination cluster, 
>>> and pick a proper one, even mask out some features to meet the LM 
>>> requirements.
>> For that end, you'd need to move off from using 
>> vdpa_get_config_unlocked() which depends on feature negotiation. 
>> Since this would slightly change the original semantics of each field 
>> that "vdpa dev config" shows, it probably need another netlink 
>> command and new uAPI.
> why not show both device_features and driver_features in "vdpa dev 
> config show"?
>
As I requested in the other email, I'd like to see the proposed 'vdpa
dev config ...' example output for the various phases of feature
negotiation, and the specific use case (motivation) for this proposed
output. I am having difficulty matching what you want to do with the
patch posted.

-Siwei

>>
>> -Siwei
>>
>>
>>>
>>> Thanks,
>>> Zhu Lingshan
>>>> -Siwei
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
Zhu, Lingshan July 29, 2022, 2:07 a.m. UTC | #45
On 7/29/2022 5:54 AM, Si-Wei Liu wrote:
>
>
> On 7/27/2022 7:44 PM, Zhu, Lingshan wrote:
>>
>>
>> On 7/28/2022 9:41 AM, Si-Wei Liu wrote:
>>>
>>>
>>> On 7/27/2022 4:54 AM, Zhu, Lingshan wrote:
>>>>
>>>>
>>>> On 7/27/2022 6:09 PM, Si-Wei Liu wrote:
>>>>>
>>>>>
>>>>> On 7/27/2022 2:01 AM, Michael S. Tsirkin wrote:
>>>>>> On Wed, Jul 27, 2022 at 12:50:33AM -0700, Si-Wei Liu wrote:
>>>>>>>
>>>>>>> On 7/26/2022 11:01 PM, Michael S. Tsirkin wrote:
>>>>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
>>>>>>>>>>
>>>>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
>>>>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
>>>>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
>>>>>>>>>>>>>>> When the user space which invokes netlink commands, 
>>>>>>>>>>>>>>> detects that
>>>>>>>>>>>> _MQ
>>>>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by 
>>>>>>>>>>>>>> itself.
>>>>>>>>>>>>>> I think the kernel module have all necessary information 
>>>>>>>>>>>>>> and it is
>>>>>>>>>>>>>> the only one which have precise information of a device, 
>>>>>>>>>>>>>> so it
>>>>>>>>>>>>>> should answer precisely than let the user space guess. 
>>>>>>>>>>>>>> The kernel
>>>>>>>>>>>>>> module should be reliable than stay silent, leave the 
>>>>>>>>>>>>>> question to
>>>>>>>>>>>>>> the user space
>>>>>>>>>>>> tool.
>>>>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field 
>>>>>>>>>>>>> if the
>>>>>>>>>>>>> field doesn’t
>>>>>>>>>>>> exist regardless of field should have default or no default.
>>>>>>>>>>>> so when you know it is one queue pair, you should answer 
>>>>>>>>>>>> one, not try
>>>>>>>>>>>> to guess.
>>>>>>>>>>>>> User space should not guess either. User space gets to see 
>>>>>>>>>>>>> if _MQ
>>>>>>>>>>>> present/not present. If _MQ present than get reliable data 
>>>>>>>>>>>> from kernel.
>>>>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
>>>>>>>>>>>> it is still a guess, right? And all user space tools 
>>>>>>>>>>>> implemented this
>>>>>>>>>>>> feature need to guess
>>>>>>>>>>> No. it is not a guess.
>>>>>>>>>>> It is explicitly checking the _MQ feature and deriving the 
>>>>>>>>>>> value.
>>>>>>>>>>> The code you proposed will be present in the user space.
>>>>>>>>>>> It will be uniform for _MQ and 10 other features that are 
>>>>>>>>>>> present now and
>>>>>>>>>> in the future.
>>>>>>>>>> MQ and other features like RSS are different. If there is no 
>>>>>>>>>> _RSS_XX, there
>>>>>>>>>> are no attributes like max_rss_key_size, and there is not a 
>>>>>>>>>> default value.
>>>>>>>>>> But for MQ, we know it has to be 1 wihtout _MQ.
>>>>>>>>> "we" = user space.
>>>>>>>>> To keep the consistency among all the config space fields.
>>>>>>>> Actually I looked and the code some more and I'm puzzled:
>>>>>>>>
>>>>>>>>
>>>>>>>>     struct virtio_net_config config = {};
>>>>>>>>     u64 features;
>>>>>>>>     u16 val_u16;
>>>>>>>>
>>>>>>>>     vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
>>>>>>>>
>>>>>>>>     if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, 
>>>>>>>> sizeof(config.mac),
>>>>>>>>             config.mac))
>>>>>>>>         return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> Mac returned even without VIRTIO_NET_F_MAC
>>>>>>>>
>>>>>>>>
>>>>>>>>     val_u16 = le16_to_cpu(config.status);
>>>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
>>>>>>>>         return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> status returned even without VIRTIO_NET_F_STATUS
>>>>>>>>
>>>>>>>>     val_u16 = le16_to_cpu(config.mtu);
>>>>>>>>     if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
>>>>>>>>         return -EMSGSIZE;
>>>>>>>>
>>>>>>>>
>>>>>>>> MTU returned even without VIRTIO_NET_F_MTU
>>>>>>>>
>>>>>>>>
>>>>>>>> What's going on here?
>>>>>>>>
>>>>>>>>
>>>>>>> I guess this is spec thing (historical debt), I vaguely recall 
>>>>>>> these fields
>>>>>>> are always present in config space regardless the existence of 
>>>>>>> corresponding
>>>>>>> feature bit.
>>>>>>>
>>>>>>> -Siwei
>>>>>> Nope:
>>>>>>
>>>>>> 2.5.1  Driver Requirements: Device Configuration Space
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> For optional configuration space fields, the driver MUST check 
>>>>>> that the corresponding feature is offered
>>>>>> before accessing that part of the configuration space.
>>>>> Well, this is driver side of requirement. As this interface is for 
>>>>> host admin tool to query or configure vdpa device, we don't have 
>>>>> to wait until feature negotiation is done on guest driver to 
>>>>> extract vdpa attributes/parameters, say if we want to replicate 
>>>>> another vdpa device with the same config on migration destination. 
>>>>> I think what may need to be fix is to move off from using 
>>>>> .vdpa_get_config_unlocked() which depends on feature negotiation. 
>>>>> And/or expose config space register values through another set of 
>>>>> attributes.
>>>> Yes, we don't have to wait for FEATURES_OK. In another patch in 
>>>> this series, I have added a new netlink attr to report the device 
>>>> features, and removed the blocker. So the LM orchestration SW can 
>>>> query the device features of the devices at the destination 
>>>> cluster, and pick a proper one, even mask out some features to meet 
>>>> the LM requirements.
>>> For that end, you'd need to move off from using 
>>> vdpa_get_config_unlocked() which depends on feature negotiation. 
>>> Since this would slightly change the original semantics of each 
>>> field that "vdpa dev config" shows, it probably need another netlink 
>>> command and new uAPI.
>> why not show both device_features and driver_features in "vdpa dev 
>> config show"?
>>
> As I requested in the other email, I'd like to see the proposed 'vdpa 
> dev config ...' example output for various phases in feature 
> negotiation, and the specific use case (motivation) for this proposed 
> output. I am having difficulty to match what you want to do with the 
> patch posted.
The feature bits of a device don't depend on the phase, and
driver_features only has meaningful values once FEATURES_OK is set.
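
A small sketch of that distinction, assuming a simplified device state. struct dev_state and the two query helpers are hypothetical and not the vDPA API; the FEATURES_OK status bit value (8) is the one from the virtio spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device status bit from the virtio spec. */
#define VIRTIO_CONFIG_S_FEATURES_OK 8

/* Hypothetical, simplified view of a device for this sketch. */
struct dev_state {
	uint64_t device_features; /* what the device offers, fixed */
	uint64_t driver_features; /* what the driver accepted */
	uint8_t  status;          /* virtio device status byte */
};

/* Device features can be reported in any phase. */
static uint64_t query_device_features(const struct dev_state *d)
{
	return d->device_features;
}

/*
 * Driver features are only meaningful once FEATURES_OK is set.
 * Returns false, leaving *out untouched, before that point.
 */
static bool query_driver_features(const struct dev_state *d, uint64_t *out)
{
	if (!(d->status & VIRTIO_CONFIG_S_FEATURES_OK))
		return false;

	*out = d->driver_features;
	return true;
}
```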

Thanks
>
> -Siwei
>
>>>
>>> -Siwei
>>>
>>>
>>>>
>>>> Thanks,
>>>> Zhu Lingshan
>>>>> -Siwei
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
Jason Wang Aug. 1, 2022, 4:50 a.m. UTC | #46
On Thu, Jul 28, 2022 at 2:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Jul 28, 2022 at 01:53:51PM +0800, Jason Wang wrote:
> > On Thu, Jul 28, 2022 at 11:47 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:
> > >
> > >
> > >
> > > On 7/28/2022 9:21 AM, Jason Wang wrote:
> > > > On Wed, Jul 27, 2022 at 11:45 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >> On Wed, Jul 27, 2022 at 05:50:59PM +0800, Jason Wang wrote:
> > > >>> On Wed, Jul 27, 2022 at 5:03 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >>>> On Wed, Jul 27, 2022 at 02:54:13PM +0800, Jason Wang wrote:
> > > >>>>> On Wed, Jul 27, 2022 at 2:01 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >>>>>> On Wed, Jul 27, 2022 at 03:47:35AM +0000, Parav Pandit wrote:
> > > >>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >>>>>>>> Sent: Tuesday, July 26, 2022 10:53 PM
> > > >>>>>>>>
> > > >>>>>>>> On 7/27/2022 10:17 AM, Parav Pandit wrote:
> > > >>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >>>>>>>>>> Sent: Tuesday, July 26, 2022 10:15 PM
> > > >>>>>>>>>>
> > > >>>>>>>>>> On 7/26/2022 11:56 PM, Parav Pandit wrote:
> > > >>>>>>>>>>>> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > >>>>>>>>>>>> Sent: Tuesday, July 12, 2022 11:46 PM
> > > >>>>>>>>>>>>> When the user space which invokes netlink commands, detects that
> > > >>>>>>>>>> _MQ
> > > >>>>>>>>>>>> is not supported, hence it takes max_queue_pair = 1 by itself.
> > > >>>>>>>>>>>> I think the kernel module have all necessary information and it is
> > > >>>>>>>>>>>> the only one which have precise information of a device, so it
> > > >>>>>>>>>>>> should answer precisely than let the user space guess. The kernel
> > > >>>>>>>>>>>> module should be reliable than stay silent, leave the question to
> > > >>>>>>>>>>>> the user space
> > > >>>>>>>>>> tool.
> > > >>>>>>>>>>> Kernel is reliable. It doesn’t expose a config space field if the
> > > >>>>>>>>>>> field doesn’t
> > > >>>>>>>>>> exist regardless of field should have default or no default.
> > > >>>>>>>>>> so when you know it is one queue pair, you should answer one, not try
> > > >>>>>>>>>> to guess.
> > > >>>>>>>>>>> User space should not guess either. User space gets to see if _MQ
> > > >>>>>>>>>> present/not present. If _MQ present than get reliable data from kernel.
> > > >>>>>>>>>>> If _MQ not present, it means this device has one VQ pair.
> > > >>>>>>>>>> it is still a guess, right? And all user space tools implemented this
> > > >>>>>>>>>> feature need to guess
> > > >>>>>>>>> No. it is not a guess.
> > > >>>>>>>>> It is explicitly checking the _MQ feature and deriving the value.
> > > >>>>>>>>> The code you proposed will be present in the user space.
> > > >>>>>>>>> It will be uniform for _MQ and 10 other features that are present now and
> > > >>>>>>>> in the future.
> > > >>>>>>>> MQ and other features like RSS are different. If there is no _RSS_XX, there
> > > >>>>>>>> are no attributes like max_rss_key_size, and there is not a default value.
> > > >>>>>>>> But for MQ, we know it has to be 1 without _MQ.
> > > >>>>>>> "we" = user space.
> > > >>>>>>> To keep the consistency among all the config space fields.
> > > >>>>>> Actually I looked at the code some more and I'm puzzled:
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>          struct virtio_net_config config = {};
> > > >>>>>>          u64 features;
> > > >>>>>>          u16 val_u16;
> > > >>>>>>
> > > >>>>>>          vdpa_get_config_unlocked(vdev, 0, &config, sizeof(config));
> > > >>>>>>
> > > >>>>>>          if (nla_put(msg, VDPA_ATTR_DEV_NET_CFG_MACADDR, sizeof(config.mac),
> > > >>>>>>                      config.mac))
> > > >>>>>>                  return -EMSGSIZE;
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Mac returned even without VIRTIO_NET_F_MAC
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>          val_u16 = le16_to_cpu(config.status);
> > > >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_STATUS, val_u16))
> > > >>>>>>                  return -EMSGSIZE;
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> status returned even without VIRTIO_NET_F_STATUS
> > > >>>>>>
> > > >>>>>>          val_u16 = le16_to_cpu(config.mtu);
> > > >>>>>>          if (nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MTU, val_u16))
> > > >>>>>>                  return -EMSGSIZE;
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> MTU returned even without VIRTIO_NET_F_MTU
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> What's going on here?
> > > >>>>> Probably too late to fix, but this should be fine as long as all
> > > >>>>> parents support STATUS/MTU/MAC.
> > > >>>> Why is this too late to fix.
> > > >>> If we make this conditional on the features, this may break
> > > >>> userspace that always expects VDPA_ATTR_DEV_NET_CFG_MTU?
> > > >>>
> > > >>> Thanks
> > > >> Well only on devices without MTU. I'm saying said userspace
> > > >> was reading trash on such devices anyway.
> > > > It depends on the parent actually. For example, mlx5 queries the lower
> > > > mtu unconditionally:
> > > >
> > > >          err = query_mtu(mdev, &mtu);
> > > >          if (err)
> > > >                  goto err_alloc;
> > > >
> > > >          ndev->config.mtu = cpu_to_mlx5vdpa16(mvdev, mtu);
> > > >
> > > > Supporting MTU features seems to be a must for real hardware.
> > > > Otherwise the driver may not work correctly.
> > > >
> > > >> We don't generally maintain bug for bug compatibility on a whim,
> > > >> only if userspace is actually known to break if we fix a bug.
> > > >   So I think it should be fine to make this conditional; then we should
> > > > have consistent handling of other fields like MQ.
> > > For some fields that have a default value, like MQ =1, we can return the
> > > default value.
> > > For other fields without a default value, like MAC, we return nothing.
> > >
> > > Does this sound good? So, for MTU, if without _F_MTU, I think we can
> > > return 1500 by default.
> >
> > Or we can just read MTU from the device.
> >
> > But it looks to me like Michael wants it conditional.
> >
> > Thanks
>
> I'm fine either way but let's keep it consistent. And I think
> Parav wants it conditional.

Parav, what's your opinion here?

Michael spotted some inconsistencies, so I think we should either

1) make all conditional, so we should change both MTU and MAC

or

2) make them unconditional, so we should only change MQ
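
The two options can be sketched roughly as follows. This is a userspace
model, not the vdpa.c code: struct reported, enum report_policy and
fill_net_config() are hypothetical names for illustration, and only the
feature bit numbers are taken from the virtio specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_NET_F_MTU 3
#define VIRTIO_NET_F_MAC 5
#define VIRTIO_NET_F_MQ  22

/* Hypothetical record of which attributes would be sent to userspace. */
struct reported {
	bool     mac;
	bool     mtu;
	bool     max_vqp;
	uint16_t max_vqp_val;
};

enum report_policy {
	REPORT_CONDITIONAL,   /* option 1: only fields backed by a feature bit */
	REPORT_UNCONDITIONAL, /* option 2: always report, default MQ to 1 */
};

static bool has_feature(uint64_t features, unsigned int bit)
{
	return (features >> bit) & 1;
}

static struct reported fill_net_config(uint64_t features,
				       uint16_t cfg_max_vqp,
				       enum report_policy policy)
{
	struct reported r = { 0 };
	bool mq = has_feature(features, VIRTIO_NET_F_MQ);

	if (policy == REPORT_CONDITIONAL) {
		/* Option 1: MAC and MTU change, MQ keeps being skipped. */
		r.mac = has_feature(features, VIRTIO_NET_F_MAC);
		r.mtu = has_feature(features, VIRTIO_NET_F_MTU);
		r.max_vqp = mq;
		r.max_vqp_val = mq ? cfg_max_vqp : 0;
	} else {
		/* Option 2: MAC and MTU stay as they are, MQ reports 1. */
		r.mac = true;
		r.mtu = true;
		r.max_vqp = true;
		r.max_vqp_val = mq ? cfg_max_vqp : 1;
	}
	return r;
}
```

For a device with no feature bits set, option 1 reports nothing, while
option 2 reports MAC, MTU and max_vq_pairs = 1.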

Thanks

>
> > >
> > > Thanks,
> > > Zhu Lingshan
> > > >
> > > > Thanks
> > > >
> > > >>
> > > >>>>> I wonder if we can add a check in the core and fail the device
> > > >>>>> registration in this case.
> > > >>>>>
> > > >>>>> Thanks
> > > >>>>>
> > > >>>>>>
> > > >>>>>> --
> > > >>>>>> MST
> > > >>>>>>
> > >
>
Patch

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index d76b22b2f7ae..846dd37f3549 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -806,9 +806,10 @@  static int vdpa_dev_net_mq_config_fill(struct vdpa_device *vdev,
 	u16 val_u16;
 
 	if ((features & BIT_ULL(VIRTIO_NET_F_MQ)) == 0)
-		return 0;
+		val_u16 = 1;
+	else
+		val_u16 = __virtio16_to_cpu(true, config->max_virtqueue_pairs);
 
-	val_u16 = le16_to_cpu(config->max_virtqueue_pairs);
 	return nla_put_u16(msg, VDPA_ATTR_DEV_NET_CFG_MAX_VQP, val_u16);
 }
 
@@ -842,7 +843,7 @@  static int vdpa_dev_net_config_fill(struct vdpa_device *vdev, struct sk_buff *ms
 			      VDPA_ATTR_PAD))
 		return -EMSGSIZE;
 
-	return vdpa_dev_net_mq_config_fill(vdev, msg, features_driver, &config);
+	return vdpa_dev_net_mq_config_fill(vdev, msg, features_device, &config);
 }
 
 static int