Message ID | 20240915-v1-v1-1-f10d2cb5e759@daynix.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | vhost/net: Set num_buffers for virtio 1.0 | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote: > The specification says the device MUST set num_buffers to 1 if > VIRTIO_NET_F_MRG_RXBUF has not been negotiated. > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0") > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> True, this is out of spec. But, qemu is also out of spec :( Given how many years this was out there, I wonder whether we should just fix the spec, instead of changing now. Jason, what's your take? > --- > drivers/vhost/net.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c > index f16279351db5..d4d97fa9cc8f 100644 > --- a/drivers/vhost/net.c > +++ b/drivers/vhost/net.c > @@ -1107,6 +1107,7 @@ static void handle_rx(struct vhost_net *net) > size_t vhost_hlen, sock_hlen; > size_t vhost_len, sock_len; > bool busyloop_intr = false; > + bool set_num_buffers; > struct socket *sock; > struct iov_iter fixup; > __virtio16 num_buffers; > @@ -1129,6 +1130,8 @@ static void handle_rx(struct vhost_net *net) > vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ? > vq->log : NULL; > mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF); > + set_num_buffers = mergeable || > + vhost_has_feature(vq, VIRTIO_F_VERSION_1); > > do { > sock_len = vhost_net_rx_peek_head_len(net, sock->sk, > @@ -1205,7 +1208,7 @@ static void handle_rx(struct vhost_net *net) > /* TODO: Should check and handle checksum. */ > > num_buffers = cpu_to_vhost16(vq, headcount); > - if (likely(mergeable) && > + if (likely(set_num_buffers) && > copy_to_iter(&num_buffers, sizeof num_buffers, > &fixup) != sizeof num_buffers) { > vq_err(vq, "Failed num_buffers write"); > > --- > base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50 > change-id: 20240908-v1-90fc83ff8b09 > > Best regards, > -- > Akihiko Odaki <akihiko.odaki@daynix.com>
On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote: > > The specification says the device MUST set num_buffers to 1 if > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated. > > > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0") > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> > > True, this is out of spec. But, qemu is also out of spec :( > > Given how many years this was out there, I wonder whether > we should just fix the spec, instead of changing now. > > Jason, what's your take? Fixing the spec (if you mean release the requirement) seems to be less risky. Thanks > > > > --- > > drivers/vhost/net.c | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c > > index f16279351db5..d4d97fa9cc8f 100644 > > --- a/drivers/vhost/net.c > > +++ b/drivers/vhost/net.c > > @@ -1107,6 +1107,7 @@ static void handle_rx(struct vhost_net *net) > > size_t vhost_hlen, sock_hlen; > > size_t vhost_len, sock_len; > > bool busyloop_intr = false; > > + bool set_num_buffers; > > struct socket *sock; > > struct iov_iter fixup; > > __virtio16 num_buffers; > > @@ -1129,6 +1130,8 @@ static void handle_rx(struct vhost_net *net) > > vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ? > > vq->log : NULL; > > mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF); > > + set_num_buffers = mergeable || > > + vhost_has_feature(vq, VIRTIO_F_VERSION_1); > > > > do { > > sock_len = vhost_net_rx_peek_head_len(net, sock->sk, > > @@ -1205,7 +1208,7 @@ static void handle_rx(struct vhost_net *net) > > /* TODO: Should check and handle checksum. */ > > > > num_buffers = cpu_to_vhost16(vq, headcount); > > - if (likely(mergeable) && > > + if (likely(set_num_buffers) && > > copy_to_iter(&num_buffers, sizeof num_buffers, > > &fixup) != sizeof num_buffers) { > > vq_err(vq, "Failed num_buffers write"); > > > > --- > > base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50 > > change-id: 20240908-v1-90fc83ff8b09 > > > > Best regards, > > -- > > Akihiko Odaki <akihiko.odaki@daynix.com> >
On Mon, Nov 11, 2024 at 09:27:45AM +0800, Jason Wang wrote: > On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com> wrote: > > > > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote: > > > The specification says the device MUST set num_buffers to 1 if > > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated. > > > > > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0") > > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> > > > > True, this is out of spec. But, qemu is also out of spec :( > > > > Given how many years this was out there, I wonder whether > > we should just fix the spec, instead of changing now. > > > > Jason, what's your take? > > Fixing the spec (if you mean release the requirement) seems to be less risky. > > Thanks I looked at the latest spec patch. Issue is, if we relax the requirement in the spec, it just might break some drivers. Something I did not realize at the time. Also, vhost just leaves it uninitialized so there really is no chance some driver using vhost looks at it and assumes 0. There is another thing out of spec with vhost at the moment: it is actually leaving this field in the buffer uninitialized. Which is out of spec, length supplied by device must be initialized by device. We generally just ask everyone to follow spec. So now I'm inclined to fix it, and make a corresponding qemu change. Now, about how to fix it - besides a risk to non-VM workloads, I dislike doing an extra copy to user into buffer. So maybe we should add an ioctl to teach tun to set num bufs to 1. This way userspace has control. Hmm? > > > > > > > --- > > > drivers/vhost/net.c | 5 ++++- > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c > > > index f16279351db5..d4d97fa9cc8f 100644 > > > --- a/drivers/vhost/net.c > > > +++ b/drivers/vhost/net.c > > > @@ -1107,6 +1107,7 @@ static void handle_rx(struct vhost_net *net) > > > size_t vhost_hlen, sock_hlen; > > > size_t vhost_len, sock_len; > > > bool busyloop_intr = false; > > > + bool set_num_buffers; > > > struct socket *sock; > > > struct iov_iter fixup; > > > __virtio16 num_buffers; > > > @@ -1129,6 +1130,8 @@ static void handle_rx(struct vhost_net *net) > > > vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ? > > > vq->log : NULL; > > > mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF); > > > + set_num_buffers = mergeable || > > > + vhost_has_feature(vq, VIRTIO_F_VERSION_1); > > > > > > do { > > > sock_len = vhost_net_rx_peek_head_len(net, sock->sk, > > > @@ -1205,7 +1208,7 @@ static void handle_rx(struct vhost_net *net) > > > /* TODO: Should check and handle checksum. */ > > > > > > num_buffers = cpu_to_vhost16(vq, headcount); > > > - if (likely(mergeable) && > > > + if (likely(set_num_buffers) && > > > copy_to_iter(&num_buffers, sizeof num_buffers, > > > &fixup) != sizeof num_buffers) { > > > vq_err(vq, "Failed num_buffers write"); > > > > > > --- > > > base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50 > > > change-id: 20240908-v1-90fc83ff8b09 > > > > > > Best regards, > > > -- > > > Akihiko Odaki <akihiko.odaki@daynix.com> > >
On 2024/12/27 10:29, Jason Wang wrote: > > > On Thu, Dec 26, 2024 at 7:54 PM Michael S. Tsirkin <mst@redhat.com > <mailto:mst@redhat.com>> wrote: > > On Mon, Nov 11, 2024 at 09:27:45AM +0800, Jason Wang wrote: > > On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com > <mailto:mst@redhat.com>> wrote: > > > > > > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote: > > > > The specification says the device MUST set num_buffers to 1 if > > > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated. > > > > > > > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0") > > > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com > <mailto:akihiko.odaki@daynix.com>> > > > > > > True, this is out of spec. But, qemu is also out of spec :( > > > > > > Given how many years this was out there, I wonder whether > > > we should just fix the spec, instead of changing now. > > > > > > Jason, what's your take? > > > > Fixing the spec (if you mean release the requirement) seems to be > less risky. > > > > Thanks > > I looked at the latest spec patch. > Issue is, if we relax the requirement in the spec, > it just might break some drivers. > > Something I did not realize at the time. > > Also, vhost just leaves it uninitialized so there really is no chance > some driver using vhost looks at it and assumes 0. > > > So it also has no chance to assume it for anything specific value. Theoretically, there could be a driver written according to the specification and tested with other device implementations that set num_buffers to one. Practically, I will be surprised if there is such a driver in reality. But I also see few reasons to relax the device requirement now; if we used to say it should be set to one and there is no better alternative value, why don't stick to one? I sent v2 for the virtio-spec change that retains the device requirement so please tell me what you think about it: https://lore.kernel.org/virtio-comment/20241227-reserved-v2-1-de9f9b0a808d@daynix.com/T/#u > > > There is another thing out of spec with vhost at the moment: > it is actually leaving this field in the buffer > uninitialized. Which is out of spec, length supplied by device > must be initialized by device. > > > What do you mean by "length" here? > > > > We generally just ask everyone to follow spec. > > > Spec can't cover all the behaviour, so there would be some leftovers. > > So now I'm inclined to fix > it, and make a corresponding qemu change. > > > Now, about how to fix it - besides a risk to non-VM workloads, I dislike > doing an extra copy to user into buffer. So maybe we should add an ioctl > to teach tun to set num bufs to 1. > This way userspace has control. > > > I'm not sure I will get here. TUN has no knowledge of the mergeable > buffers if I understand it correctly. I rather want QEMU and other vhost_net users automatically fixed instead of opting-in the fix. The extra copy overhead can be almost eliminated if we initialize the field in TUN/TAP; they already writes other part of the header so we can simply add two bytes there. But I wonder if it's worthwhile. Regards, Akihiko Odaki
On Fri, Dec 27, 2024 at 01:34:10PM +0900, Akihiko Odaki wrote: > On 2024/12/27 10:29, Jason Wang wrote: > > > > > > On Thu, Dec 26, 2024 at 7:54 PM Michael S. Tsirkin <mst@redhat.com > > <mailto:mst@redhat.com>> wrote: > > > > On Mon, Nov 11, 2024 at 09:27:45AM +0800, Jason Wang wrote: > > > On Wed, Nov 6, 2024 at 4:54 PM Michael S. Tsirkin <mst@redhat.com > > <mailto:mst@redhat.com>> wrote: > > > > > > > > On Sun, Sep 15, 2024 at 10:35:53AM +0900, Akihiko Odaki wrote: > > > > > The specification says the device MUST set num_buffers to 1 if > > > > > VIRTIO_NET_F_MRG_RXBUF has not been negotiated. > > > > > > > > > > Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0") > > > > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com > > <mailto:akihiko.odaki@daynix.com>> > > > > > > > > True, this is out of spec. But, qemu is also out of spec :( > > > > > > > > Given how many years this was out there, I wonder whether > > > > we should just fix the spec, instead of changing now. > > > > > > > > Jason, what's your take? > > > > > > Fixing the spec (if you mean release the requirement) seems to be > > less risky. > > > > > > Thanks > > > > I looked at the latest spec patch. > > Issue is, if we relax the requirement in the spec, > > it just might break some drivers. > > > > Something I did not realize at the time. > > > > Also, vhost just leaves it uninitialized so there really is no chance > > some driver using vhost looks at it and assumes 0. > > > > > So it also has no chance to assume it for anything specific value. > > Theoretically, there could be a driver written according to the > specification and tested with other device implementations that set > num_buffers to one. > > Practically, I will be surprised if there is such a driver in reality. > > But I also see few reasons to relax the device requirement now; if we used > to say it should be set to one and there is no better alternative value, why > don't stick to one? > > I sent v2 for the virtio-spec change that retains the device requirement so > please tell me what you think about it: > https://lore.kernel.org/virtio-comment/20241227-reserved-v2-1-de9f9b0a808d@daynix.com/T/#u > > > > > > > There is another thing out of spec with vhost at the moment: > > it is actually leaving this field in the buffer > > uninitialized. Which is out of spec, length supplied by device > > must be initialized by device. > > > > > > What do you mean by "length" here? > > > > > > > > We generally just ask everyone to follow spec. > > > > > > Spec can't cover all the behaviour, so there would be some leftovers. > > > > So now I'm inclined to fix > > it, and make a corresponding qemu change. > > > > > > Now, about how to fix it - besides a risk to non-VM workloads, I dislike > > doing an extra copy to user into buffer. So maybe we should add an ioctl > > to teach tun to set num bufs to 1. > > This way userspace has control. > > > > > > I'm not sure I will get here. TUN has no knowledge of the mergeable > > buffers if I understand it correctly. > > I rather want QEMU and other vhost_net users automatically fixed instead of > opting-in the fix. qemu can be automatic. kernel I am not sure. > The extra copy overhead can be almost eliminated if we initialize the field > in TUN/TAP; they already writes other part of the header so we can simply > add two bytes there. But I wonder if it's worthwhile. Try? > Regards, > Akihiko Odaki
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index f16279351db5..d4d97fa9cc8f 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -1107,6 +1107,7 @@ static void handle_rx(struct vhost_net *net) size_t vhost_hlen, sock_hlen; size_t vhost_len, sock_len; bool busyloop_intr = false; + bool set_num_buffers; struct socket *sock; struct iov_iter fixup; __virtio16 num_buffers; @@ -1129,6 +1130,8 @@ static void handle_rx(struct vhost_net *net) vq_log = unlikely(vhost_has_feature(vq, VHOST_F_LOG_ALL)) ? vq->log : NULL; mergeable = vhost_has_feature(vq, VIRTIO_NET_F_MRG_RXBUF); + set_num_buffers = mergeable || + vhost_has_feature(vq, VIRTIO_F_VERSION_1); do { sock_len = vhost_net_rx_peek_head_len(net, sock->sk, @@ -1205,7 +1208,7 @@ static void handle_rx(struct vhost_net *net) /* TODO: Should check and handle checksum. */ num_buffers = cpu_to_vhost16(vq, headcount); - if (likely(mergeable) && + if (likely(set_num_buffers) && copy_to_iter(&num_buffers, sizeof num_buffers, &fixup) != sizeof num_buffers) { vq_err(vq, "Failed num_buffers write");
The specification says the device MUST set num_buffers to 1 if VIRTIO_NET_F_MRG_RXBUF has not been negotiated. Fixes: 41e3e42108bc ("vhost/net: enable virtio 1.0") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> --- drivers/vhost/net.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) --- base-commit: 46a0057a5853cbdb58211c19e89ba7777dc6fd50 change-id: 20240908-v1-90fc83ff8b09 Best regards,