diff mbox series

vhost/net: Set num_buffers for virtio 1.0

Message ID 20240915-v1-v1-1-f10d2cb5e759@daynix.com (mailing list archive)
State Not Applicable
Series vhost/net: Set num_buffers for virtio 1.0

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Akihiko Odaki Sept. 15, 2024, 1:35 a.m. UTC
The specification says the device MUST set num_buffers to 1 if
VIRTIO_NET_F_MRG_RXBUF has not been negotiated.

Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
 drivers/vhost/net.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
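
For readers unfamiliar with the field in question, here is a minimal sketch of the virtio 1.0 network header and of the rule quoted in the commit message. The struct and helper are standalone illustrations (the `sketch_` names are hypothetical and not taken from the kernel); on the wire the multi-byte fields are little-endian.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Standalone illustration of the virtio 1.0 network header layout.
 * Field names follow the specification; the sketch_ prefix marks
 * everything here as hypothetical, not the kernel's definitions.
 */
struct sketch_virtio_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
	uint16_t num_buffers;	/* always part of the header in virtio 1.0 */
};

static_assert(sizeof(struct sketch_virtio_net_hdr) == 12,
	      "virtio 1.0 net header is 12 bytes");
static_assert(offsetof(struct sketch_virtio_net_hdr, num_buffers) == 10,
	      "num_buffers occupies the last two bytes");

/*
 * Value a conforming device places in num_buffers: the number of
 * buffers used when VIRTIO_NET_F_MRG_RXBUF was negotiated, else 1.
 */
static uint16_t sketch_num_buffers(bool mrg_rxbuf, uint16_t headcount)
{
	return mrg_rxbuf ? headcount : 1;
}
```

Because a virtio 1.0 header is 12 bytes whether or not mergeable buffers are negotiated, a device that skips the write leaves the last two bytes as the driver supplied them.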


---
base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50
change-id: 20240908-v1-90fc83ff8b09

Best regards,

Comments

Michael S. Tsirkin Nov. 6, 2024, 8:54 a.m. UTC | #1
On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
> The specification says the device MUST set num_buffers to 1 if
> VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
> 
> Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>

True, this is out of spec. But, qemu is also out of spec :(

Given how many years this was out there, I wonder whether
we should just fix the spec, instead of changing now.

Jason, what's your take?


> ---
>  drivers/vhost/net.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index f16279351db5..d4d97fa9cc8f 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -1107,6 +1107,7 @@ static void handle_rx(struct vhost_net *net)
>  	size_t vhost_hlen, sock_hlen;
>  	size_t vhost_len, sock_len;
>  	bool busyloop_intr = false;
> +	bool set_num_buffers;
>  	struct socket *sock;
>  	struct iov_iter fixup;
>  	__virtio16 num_buffers;
> @@ -1129,6 +1130,8 @@ static void handle_rx(struct vhost_net *net)
>  	vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
>  		vq->log : NULL;
>  	mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
> +	set_num_buffers = mergeable ||
> +			  vhost_has_feature(vq, VIRTIO_F_VERSION_1);
>  
>  	do {
>  		sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
> @@ -1205,7 +1208,7 @@ static void handle_rx(struct vhost_net *net)
>  		/* TODO: Should check and handle checksum. */
>  
>  		num_buffers = cpu_to_vhost16(vq, headcount);
> -		if (likely(mergeable) &&
> +		if (likely(set_num_buffers) &&
>  		    copy_to_iter(&num_buffers, sizeof num_buffers,
>  				 &fixup) != sizeof num_buffers) {
>  			vq_err(vq, "Failed num_buffers write");
> 
> ---
> base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50
> change-id: 20240908-v1-90fc83ff8b09
> 
> Best regards,
> -- 
> Akihiko Odaki <akihiko.odaki@daynix.com>
Jason Wang Nov. 11, 2024, 1:27 a.m. UTC | #2
On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
> > The specification says the device MUST set num_buffers to 1 if
> > VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
> >
> > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
> > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>
> True, this is out of spec. But, qemu is also out of spec :(
>
> Given how many years this was out there, I wonder whether
> we should just fix the spec, instead of changing now.
>
> Jason, what's your take?

Fixing the spec (if you mean relaxing the requirement) seems to be less risky.

Thanks

Michael S. Tsirkin Dec. 26, 2024, 11:54 a.m. UTC | #3
On Mon, Nov 11, 2024 at 09:27:45AM +0800, Jason Wang wrote:
> On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
> > > The specification says the device MUST set num_buffers to 1 if
> > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
> > >
> > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
> > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> >
> > True, this is out of spec. But, qemu is also out of spec :(
> >
> > Given how many years this was out there, I wonder whether
> > we should just fix the spec, instead of changing now.
> >
> > Jason, what's your take?
> 
> Fixing the spec (if you mean release the requirement) seems to be less risky.
> 
> Thanks

I looked at the latest spec patch.
Issue is, if we relax the requirement in the spec,
it just might break some drivers.

Something I did not realize at the time.

Also, vhost just leaves it uninitialized so there really is no chance
some driver using vhost looks at it and assumes 0.

There is another thing out of spec with vhost at the moment:
it is actually leaving this field in the buffer
uninitialized. Which is out of spec, length supplied by device
must be initialized by device.


We generally just ask everyone to follow spec.  So now I'm inclined to fix
it, and make a corresponding qemu change.


Now, about how to fix it - besides a risk to non-VM workloads, I dislike
doing an extra copy to user into buffer. So maybe we should add an ioctl
to teach tun to set num bufs to 1.
This way userspace has control.

Hmm?


Akihiko Odaki Dec. 27, 2024, 4:34 a.m. UTC | #4
On 2024/12/27 10:29, Jason Wang wrote:
> 
> 
> On Thu, Dec 26, 2024 at 7:54 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> 
>     On Mon, Nov 11, 2024 at 09:27:45AM +0800, Jason Wang wrote:
>      > On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>      > >
>      > > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
>      > > > The specification says the device MUST set num_buffers to 1 if
>      > > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
>      > > >
>      > > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
>      > > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>      > >
>      > > True, this is out of spec. But, qemu is also out of spec :(
>      > >
>      > > Given how many years this was out there, I wonder whether
>      > > we should just fix the spec, instead of changing now.
>      > >
>      > > Jason, what's your take?
>      >
>      > Fixing the spec (if you mean release the requirement) seems to be
>     less risky.
>      >
>      > Thanks
> 
>     I looked at the latest spec patch.
>     Issue is, if we relax the requirement in the spec,
>     it just might break some drivers.
> 
>     Something I did not realize at the time.
> 
>     Also, vhost just leaves it uninitialized so there really is no chance
>     some driver using vhost looks at it and assumes 0.
>
> So it also has no chance to assume any specific value.

Theoretically, there could be a driver written according to the 
specification and tested with other device implementations that set 
num_buffers to one.

Practically, I will be surprised if there is such a driver in reality.

But I also see few reasons to relax the device requirement now; if we 
used to say it should be set to one and there is no better alternative 
value, why not stick to one?

I sent v2 for the virtio-spec change that retains the device requirement 
so please tell me what you think about it:
https://lore.kernel.org/virtio-comment/20241227-reserved-v2-1-de9f9b0a808d@daynix.com/T/#u

> 
> 
>     There is another thing out of spec with vhost at the moment:
>     it is actually leaving this field in the buffer
>     uninitialized. Which is out of spec, length supplied by device
>     must be initialized by device.
> 
> 
> What do you mean by "length" here?
> 
> 
> 
>     We generally just ask everyone to follow spec.
> 
> 
> Spec can't cover all the behaviour, so there would be some leftovers.
> 
>        So now I'm inclined to fix
>     it, and make a corresponding qemu change.
> 
> 
>     Now, about how to fix it - besides a risk to non-VM workloads, I dislike
>     doing an extra copy to user into buffer. So maybe we should add an ioctl
>     to teach tun to set num bufs to 1.
>     This way userspace has control.
> 
> 
> I'm not sure I follow here. TUN has no knowledge of the mergeable 
> buffers if I understand it correctly.

I would rather have QEMU and other vhost_net users fixed automatically 
than require them to opt in to the fix.

The extra copy overhead can be almost eliminated if we initialize the 
field in TUN/TAP; they already write the other parts of the header, so we 
can simply add two bytes there. But I wonder if it's worthwhile.

Regards,
Akihiko Odaki
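
The suggestion above to initialize the field where the rest of the header is already written can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: `sketch_put_hdr_once` is a hypothetical helper, `memcpy` stands in for the copy into the guest buffer, and the header is assumed to be the 12-byte little-endian virtio 1.0 layout. It is not TUN/TAP code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HDR_LEN		12	/* virtio 1.0 net header size */
#define SKETCH_NUM_BUFFERS_OFF	10	/* offset of num_buffers */

/*
 * Build the complete header, num_buffers included, in a local array
 * and hand it over with a single copy instead of writing the first
 * ten bytes and patching the last two in a separate step.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
static size_t sketch_put_hdr_once(uint8_t *dst, size_t dst_len,
				  const uint8_t first10[10])
{
	uint8_t hdr[SKETCH_HDR_LEN];

	if (dst_len < sizeof(hdr))
		return 0;

	memcpy(hdr, first10, SKETCH_NUM_BUFFERS_OFF);
	hdr[SKETCH_NUM_BUFFERS_OFF] = 1;	/* num_buffers = 1, low byte */
	hdr[SKETCH_NUM_BUFFERS_OFF + 1] = 0;	/* high byte */

	memcpy(dst, hdr, sizeof(hdr));		/* the only copy to dst */
	return sizeof(hdr);
}
```

The trade-off raised in the thread still applies: this saves the second copy, but it moves knowledge of the field into the layer that produces the header.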
Michael S. Tsirkin Dec. 27, 2024, 1:44 p.m. UTC | #5
On Fri, Dec 27, 2024 at 01:34:10PM +0900, Akihiko Odaki wrote:
> On 2024/12/27 10:29, Jason Wang wrote:
> > 
> > 
> > On Thu, Dec 26, 2024 at 7:54 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > 
> >     On Mon, Nov 11, 2024 at 09:27:45AM +0800, Jason Wang wrote:
> >      > On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >      > >
> >      > > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote:
> >      > > > The specification says the device MUST set num_buffers to 1 if
> >      > > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated.
> >      > > >
> >      > > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0")
> >      > > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> >      > >
> >      > > True, this is out of spec. But, qemu is also out of spec :(
> >      > >
> >      > > Given how many years this was out there, I wonder whether
> >      > > we should just fix the spec, instead of changing now.
> >      > >
> >      > > Jason, what's your take?
> >      >
> >      > Fixing the spec (if you mean release the requirement) seems to be
> >     less risky.
> >      >
> >      > Thanks
> > 
> >     I looked at the latest spec patch.
> >     Issue is, if we relax the requirement in the spec,
> >     it just might break some drivers.
> > 
> >     Something I did not realize at the time.
> > 
> >     Also, vhost just leaves it uninitialized so there really is no chance
> >     some driver using vhost looks at it and assumes 0.
> >
> > So it also has no chance to assume any specific value.
> 
> Theoretically, there could be a driver written according to the
> specification and tested with other device implementations that set
> num_buffers to one.
> 
> Practically, I will be surprised if there is such a driver in reality.
> 
> But I also see few reasons to relax the device requirement now; if we used
> to say it should be set to one and there is no better alternative value, why
> not stick to one?
> 
> I sent v2 for the virtio-spec change that retains the device requirement so
> please tell me what you think about it:
> https://lore.kernel.org/virtio-comment/20241227-reserved-v2-1-de9f9b0a808d@daynix.com/T/#u
> 
> > 
> > 
> >     There is another thing out of spec with vhost at the moment:
> >     it is actually leaving this field in the buffer
> >     uninitialized. Which is out of spec, length supplied by device
> >     must be initialized by device.
> > 
> > 
> > What do you mean by "length" here?
> > 
> > 
> > 
> >     We generally just ask everyone to follow spec.
> > 
> > 
> > Spec can't cover all the behaviour, so there would be some leftovers.
> > 
> >        So now I'm inclined to fix
> >     it, and make a corresponding qemu change.
> > 
> > 
> >     Now, about how to fix it - besides a risk to non-VM workloads, I dislike
> >     doing an extra copy to user into buffer. So maybe we should add an ioctl
> >     to teach tun to set num bufs to 1.
> >     This way userspace has control.
> > 
> > 
> > I'm not sure I will get here. TUN has no knowledge of the mergeable
> > buffers if I understand it correctly.
> 
> I would rather have QEMU and other vhost_net users fixed automatically than
> require them to opt in to the fix.

qemu can be automatic. kernel I am not sure.

> The extra copy overhead can be almost eliminated if we initialize the field
> in TUN/TAP; they already write the other parts of the header, so we can simply
> add two bytes there. But I wonder if it's worthwhile.

Try?

> Regards,
> Akihiko Odaki

Patch

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index f16279351db5..d4d97fa9cc8f 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1107,6 +1107,7 @@  static void handle_rx(struct vhost_net *net)
 	size_t vhost_hlen, sock_hlen;
 	size_t vhost_len, sock_len;
 	bool busyloop_intr = false;
+	bool set_num_buffers;
 	struct socket *sock;
 	struct iov_iter fixup;
 	__virtio16 num_buffers;
@@ -1129,6 +1130,8 @@  static void handle_rx(struct vhost_net *net)
 	vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ?
 		vq->log : NULL;
 	mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF);
+	set_num_buffers = mergeable ||
+			  vhost_has_feature(vq, VIRTIO_F_VERSION_1);
 
 	do {
 		sock_len = vhost_net_rx_peek_head_len(net, sock->sk,
@@ -1205,7 +1208,7 @@  static void handle_rx(struct vhost_net *net)
 		/* TODO: Should check and handle checksum. */
 
 		num_buffers = cpu_to_vhost16(vq, headcount);
-		if (likely(mergeable) &&
+		if (likely(set_num_buffers) &&
 		    copy_to_iter(&num_buffers, sizeof num_buffers,
 				 &fixup) != sizeof num_buffers) {
 			vq_err(vq, "Failed num_buffers write");
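
The behavioural change in the hunks above can be summarized as a small predicate. This is a userspace sketch for illustration only: `sketch_writes_num_buffers` and its `patched` parameter are hypothetical, and the two feature flags are passed as plain booleans rather than queried with `vhost_has_feature()`.

```c
#include <stdbool.h>

/*
 * Does handle_rx() write num_buffers into the header?
 *   before the patch: only when VIRTIO_NET_F_MRG_RXBUF is negotiated
 *   after the patch:  also when VIRTIO_F_VERSION_1 is negotiated
 * A legacy device without mergeable buffers has a 10-byte header with
 * no num_buffers field, so nothing is written in that case either way.
 */
static bool sketch_writes_num_buffers(bool mergeable, bool version_1,
				      bool patched)
{
	if (!patched)
		return mergeable;
	return mergeable || version_1;
}
```

The only combination whose result changes is virtio 1.0 without mergeable buffers, which is the case the commit message targets.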