Message ID | 1608776746-4040-1-git-send-email-wangyunjian@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | fixes for vhost_net | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | warning | 5 maintainers not CCed: kvm@vger.kernel.org davem@davemloft.net rusty@rustcorp.com.au arnd@arndb.de paulmck@kernel.org |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 40 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
On Wed, Dec 23, 2020 at 9:25 PM wangyunjian <wangyunjian@huawei.com> wrote: > > From: Yunjian Wang <wangyunjian@huawei.com> > > Currently the driver doesn't drop a packet which can't be sent by tun > (e.g bad packet). In this case, the driver will always process the > same packet lead to the tx queue stuck. > > To fix this issue: > 1. in the case of persistent failure (e.g bad packet), the driver > can skip this descriptor by ignoring the error. > 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the > driver schedules the worker to try again. > > Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server") > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Willem de Bruijn <willemb@google.com> Thanks.
On 2020/12/24 上午10:25, wangyunjian wrote: > From: Yunjian Wang <wangyunjian@huawei.com> > > Currently the driver doesn't drop a packet which can't be sent by tun > (e.g bad packet). In this case, the driver will always process the > same packet lead to the tx queue stuck. > > To fix this issue: > 1. in the case of persistent failure (e.g bad packet), the driver > can skip this descriptor by ignoring the error. > 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the > driver schedules the worker to try again. I might be wrong but looking at alloc_skb_with_frags(), it returns -ENOBUFS actually: *errcode = -ENOBUFS; skb = alloc_skb(header_len, gfp_mask); if (!skb) return NULL; Thanks > > Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server") > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> > --- > drivers/vhost/net.c | 16 ++++++++-------- > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c > index c8784dfafdd7..e76245daa7f6 100644 > --- a/drivers/vhost/net.c > +++ b/drivers/vhost/net.c > @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock) > msg.msg_flags &= ~MSG_MORE; > } > > - /* TODO: Check specific error and bomb out unless ENOBUFS? */ > err = sock->ops->sendmsg(sock, &msg, len); > - if (unlikely(err < 0)) { > + if (unlikely(err == -EAGAIN || err == -ENOMEM)) { > vhost_discard_vq_desc(vq, 1); > vhost_net_enable_vq(net, vq); > break; > } > - if (err != len) > + if (err >= 0 && err != len) > pr_debug("Truncated TX packet: len %d != %zd\n", > err, len); > done: > @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock) > msg.msg_flags &= ~MSG_MORE; > } > > - /* TODO: Check specific error and bomb out unless ENOBUFS? */ > err = sock->ops->sendmsg(sock, &msg, len); > if (unlikely(err < 0)) { > if (zcopy_used) { > @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock) > nvq->upend_idx = ((unsigned)nvq->upend_idx - 1) > % UIO_MAXIOV; > } > - vhost_discard_vq_desc(vq, 1); > - vhost_net_enable_vq(net, vq); > - break; > + if (err == -EAGAIN || err == -ENOMEM) { > + vhost_discard_vq_desc(vq, 1); > + vhost_net_enable_vq(net, vq); > + break; > + } > } > - if (err != len) > + if (err >= 0 && err != len) > pr_debug("Truncated TX packet: " > " len %d != %zd\n", err, len); > if (!zcopy_used)
> -----Original Message----- > From: Jason Wang [mailto:jasowang@redhat.com] > Sent: Thursday, December 24, 2020 11:10 AM > To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org; > mst@redhat.com; willemdebruijn.kernel@gmail.com > Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry) > <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>; > xudingke <xudingke@huawei.com>; huangbin (J) > <brian.huangbin@huawei.com> > Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when sendmsg > fails > > > On 2020/12/24 上午10:25, wangyunjian wrote: > > From: Yunjian Wang <wangyunjian@huawei.com> > > > > Currently the driver doesn't drop a packet which can't be sent by tun > > (e.g bad packet). In this case, the driver will always process the > > same packet lead to the tx queue stuck. > > > > To fix this issue: > > 1. in the case of persistent failure (e.g bad packet), the driver can > > skip this descriptor by ignoring the error. > > 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the > > driver schedules the worker to try again. > > > I might be wrong but looking at alloc_skb_with_frags(), it returns -ENOBUFS > actually: > > *errcode = -ENOBUFS; > skb = alloc_skb(header_len, gfp_mask); > if (!skb) > return NULL; Yes, but the sock_alloc_send_pskb() returns - EAGAIN when no sndbuf space. So there is need to check return value which is -EAGAIN or -ENOMEM or - EAGAIN? struct sk_buff *sock_alloc_send_pskb() { ... for (;;) { ... sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk); set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); err = -EAGAIN; if (!timeo) goto failure; ... } skb = alloc_skb_with_frags(header_len, data_len, max_page_order, errcode, sk->sk_allocation); if (skb) skb_set_owner_w(skb, sk); return skb; ... *errcode = err; return NULL; } > > Thanks > > > > > > Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server") > > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> > > --- > > drivers/vhost/net.c | 16 ++++++++-------- > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index > > c8784dfafdd7..e76245daa7f6 100644 > > --- a/drivers/vhost/net.c > > +++ b/drivers/vhost/net.c > > @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, > struct socket *sock) > > msg.msg_flags &= ~MSG_MORE; > > } > > > > - /* TODO: Check specific error and bomb out unless ENOBUFS? */ > > err = sock->ops->sendmsg(sock, &msg, len); > > - if (unlikely(err < 0)) { > > + if (unlikely(err == -EAGAIN || err == -ENOMEM)) { > > vhost_discard_vq_desc(vq, 1); > > vhost_net_enable_vq(net, vq); > > break; > > } > > - if (err != len) > > + if (err >= 0 && err != len) > > pr_debug("Truncated TX packet: len %d != %zd\n", > > err, len); > > done: > > @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net > *net, struct socket *sock) > > msg.msg_flags &= ~MSG_MORE; > > } > > > > - /* TODO: Check specific error and bomb out unless ENOBUFS? */ > > err = sock->ops->sendmsg(sock, &msg, len); > > if (unlikely(err < 0)) { > > if (zcopy_used) { > > @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct vhost_net > *net, struct socket *sock) > > nvq->upend_idx = ((unsigned)nvq->upend_idx - 1) > > % UIO_MAXIOV; > > } > > - vhost_discard_vq_desc(vq, 1); > > - vhost_net_enable_vq(net, vq); > > - break; > > + if (err == -EAGAIN || err == -ENOMEM) { > > + vhost_discard_vq_desc(vq, 1); > > + vhost_net_enable_vq(net, vq); > > + break; > > + } > > } > > - if (err != len) > > + if (err >= 0 && err != len) > > pr_debug("Truncated TX packet: " > > " len %d != %zd\n", err, len); > > if (!zcopy_used)
On 2020/12/24 下午12:37, wangyunjian wrote: >> -----Original Message----- >> From: Jason Wang [mailto:jasowang@redhat.com] >> Sent: Thursday, December 24, 2020 11:10 AM >> To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org; >> mst@redhat.com; willemdebruijn.kernel@gmail.com >> Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry) >> <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>; >> xudingke <xudingke@huawei.com>; huangbin (J) >> <brian.huangbin@huawei.com> >> Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when sendmsg >> fails >> >> >> On 2020/12/24 上午10:25, wangyunjian wrote: >>> From: Yunjian Wang <wangyunjian@huawei.com> >>> >>> Currently the driver doesn't drop a packet which can't be sent by tun >>> (e.g bad packet). In this case, the driver will always process the >>> same packet lead to the tx queue stuck. >>> >>> To fix this issue: >>> 1. in the case of persistent failure (e.g bad packet), the driver can >>> skip this descriptor by ignoring the error. >>> 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the >>> driver schedules the worker to try again. >> >> I might be wrong but looking at alloc_skb_with_frags(), it returns -ENOBUFS >> actually: >> >> *errcode = -ENOBUFS; >> skb = alloc_skb(header_len, gfp_mask); >> if (!skb) >> return NULL; > Yes, but the sock_alloc_send_pskb() returns - EAGAIN when no sndbuf space. > So there is need to check return value which is -EAGAIN or -ENOMEM or - EAGAIN? > > struct sk_buff *sock_alloc_send_pskb() > { > ... > for (;;) { > ... > sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk); > set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); > err = -EAGAIN; > if (!timeo) > goto failure; > ... > } > skb = alloc_skb_with_frags(header_len, data_len, max_page_order, > errcode, sk->sk_allocation); > if (skb) > skb_set_owner_w(skb, sk); > return skb; > ... > *errcode = err; > return NULL; > } -EAGAIN and -ENOBFS are fine. But I don't see how -ENOMEM can be returned. Thanks >> Thanks >> >> >>> Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server") >>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> >>> --- >>> drivers/vhost/net.c | 16 ++++++++-------- >>> 1 file changed, 8 insertions(+), 8 deletions(-) >>> >>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index >>> c8784dfafdd7..e76245daa7f6 100644 >>> --- a/drivers/vhost/net.c >>> +++ b/drivers/vhost/net.c >>> @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, >> struct socket *sock) >>> msg.msg_flags &= ~MSG_MORE; >>> } >>> >>> - /* TODO: Check specific error and bomb out unless ENOBUFS? */ >>> err = sock->ops->sendmsg(sock, &msg, len); >>> - if (unlikely(err < 0)) { >>> + if (unlikely(err == -EAGAIN || err == -ENOMEM)) { >>> vhost_discard_vq_desc(vq, 1); >>> vhost_net_enable_vq(net, vq); >>> break; >>> } >>> - if (err != len) >>> + if (err >= 0 && err != len) >>> pr_debug("Truncated TX packet: len %d != %zd\n", >>> err, len); >>> done: >>> @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net >> *net, struct socket *sock) >>> msg.msg_flags &= ~MSG_MORE; >>> } >>> >>> - /* TODO: Check specific error and bomb out unless ENOBUFS? */ >>> err = sock->ops->sendmsg(sock, &msg, len); >>> if (unlikely(err < 0)) { >>> if (zcopy_used) { >>> @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct vhost_net >> *net, struct socket *sock) >>> nvq->upend_idx = ((unsigned)nvq->upend_idx - 1) >>> % UIO_MAXIOV; >>> } >>> - vhost_discard_vq_desc(vq, 1); >>> - vhost_net_enable_vq(net, vq); >>> - break; >>> + if (err == -EAGAIN || err == -ENOMEM) { >>> + vhost_discard_vq_desc(vq, 1); >>> + vhost_net_enable_vq(net, vq); >>> + break; >>> + } >>> } >>> - if (err != len) >>> + if (err >= 0 && err != len) >>> pr_debug("Truncated TX packet: " >>> " len %d != %zd\n", err, len); >>> if (!zcopy_used)
> -----Original Message----- > From: Jason Wang [mailto:jasowang@redhat.com] > Sent: Thursday, December 24, 2020 1:56 PM > To: wangyunjian <wangyunjian@huawei.com> > Cc: netdev@vger.kernel.org; mst@redhat.com; > willemdebruijn.kernel@gmail.com; virtualization@lists.linux-foundation.org; > Lilijun (Jerry) <jerry.lilijun@huawei.com>; chenchanghu > <chenchanghu@huawei.com>; xudingke <xudingke@huawei.com>; huangbin (J) > <brian.huangbin@huawei.com> > Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when sendmsg > fails > > > On 2020/12/24 下午12:37, wangyunjian wrote: > >> -----Original Message----- > >> From: Jason Wang [mailto:jasowang@redhat.com] > >> Sent: Thursday, December 24, 2020 11:10 AM > >> To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org; > >> mst@redhat.com; willemdebruijn.kernel@gmail.com > >> Cc: virtualization@lists.linux-foundation.org; Lilijun (Jerry) > >> <jerry.lilijun@huawei.com>; chenchanghu <chenchanghu@huawei.com>; > >> xudingke <xudingke@huawei.com>; huangbin (J) > >> <brian.huangbin@huawei.com> > >> Subject: Re: [PATCH net v4 2/2] vhost_net: fix tx queue stuck when > >> sendmsg fails > >> > >> > >> On 2020/12/24 上午10:25, wangyunjian wrote: > >>> From: Yunjian Wang <wangyunjian@huawei.com> > >>> > >>> Currently the driver doesn't drop a packet which can't be sent by > >>> tun (e.g bad packet). In this case, the driver will always process > >>> the same packet lead to the tx queue stuck. > >>> > >>> To fix this issue: > >>> 1. in the case of persistent failure (e.g bad packet), the driver > >>> can skip this descriptor by ignoring the error. > >>> 2. in the case of transient failure (e.g -EAGAIN and -ENOMEM), the > >>> driver schedules the worker to try again. > >> > >> I might be wrong but looking at alloc_skb_with_frags(), it returns > >> -ENOBUFS > >> actually: > >> > >> *errcode = -ENOBUFS; > >> skb = alloc_skb(header_len, gfp_mask); > >> if (!skb) > >> return NULL; > > Yes, but the sock_alloc_send_pskb() returns - EAGAIN when no sndbuf space. > > So there is need to check return value which is -EAGAIN or -ENOMEM or - > EAGAIN? > > > > struct sk_buff *sock_alloc_send_pskb() { ... > > for (;;) { > > ... > > sk_set_bit(SOCKWQ_ASYNC_NOSPACE, sk); > > set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); > > err = -EAGAIN; > > if (!timeo) > > goto failure; > > ... > > } > > skb = alloc_skb_with_frags(header_len, data_len, max_page_order, > > errcode, sk->sk_allocation); > > if (skb) > > skb_set_owner_w(skb, sk); > > return skb; > > ... > > *errcode = err; > > return NULL; > > } > > > -EAGAIN and -ENOBFS are fine. But I don't see how -ENOMEM can be returned. The tun_build_skb() and tun_napi_alloc_frags() return -ENOMEM when memory allocation fails. Thanks > > Thanks > > > >> Thanks > >> > >> > >>> Fixes: 3a4d5c94e959 ("vhost_net: a kernel-level virtio server") > >>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> > >>> --- > >>> drivers/vhost/net.c | 16 ++++++++-------- > >>> 1 file changed, 8 insertions(+), 8 deletions(-) > >>> > >>> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index > >>> c8784dfafdd7..e76245daa7f6 100644 > >>> --- a/drivers/vhost/net.c > >>> +++ b/drivers/vhost/net.c > >>> @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net > >>> *net, > >> struct socket *sock) > >>> msg.msg_flags &= ~MSG_MORE; > >>> } > >>> > >>> - /* TODO: Check specific error and bomb out unless ENOBUFS? > */ > >>> err = sock->ops->sendmsg(sock, &msg, len); > >>> - if (unlikely(err < 0)) { > >>> + if (unlikely(err == -EAGAIN || err == -ENOMEM)) { > >>> vhost_discard_vq_desc(vq, 1); > >>> vhost_net_enable_vq(net, vq); > >>> break; > >>> } > >>> - if (err != len) > >>> + if (err >= 0 && err != len) > >>> pr_debug("Truncated TX packet: len %d != %zd\n", > >>> err, len); > >>> done: > >>> @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net > >> *net, struct socket *sock) > >>> msg.msg_flags &= ~MSG_MORE; > >>> } > >>> > >>> - /* TODO: Check specific error and bomb out unless ENOBUFS? > */ > >>> err = sock->ops->sendmsg(sock, &msg, len); > >>> if (unlikely(err < 0)) { > >>> if (zcopy_used) { > >>> @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct > >>> vhost_net > >> *net, struct socket *sock) > >>> nvq->upend_idx = ((unsigned)nvq->upend_idx - 1) > >>> % UIO_MAXIOV; > >>> } > >>> - vhost_discard_vq_desc(vq, 1); > >>> - vhost_net_enable_vq(net, vq); > >>> - break; > >>> + if (err == -EAGAIN || err == -ENOMEM) { > >>> + vhost_discard_vq_desc(vq, 1); > >>> + vhost_net_enable_vq(net, vq); > >>> + break; > >>> + } > >>> } > >>> - if (err != len) > >>> + if (err >= 0 && err != len) > >>> pr_debug("Truncated TX packet: " > >>> " len %d != %zd\n", err, len); > >>> if (!zcopy_used)
On 2020/12/24 下午5:09, wangyunjian wrote: >> -EAGAIN and -ENOBFS are fine. But I don't see how -ENOMEM can be returned. > The tun_build_skb() and tun_napi_alloc_frags() return -ENOMEM when memory > allocation fails. > > Thanks > You're right. So I think we need check them all. In the future, we need think of a better way. I feel such check is kind of fragile since people may modify core sock codes without having a look at what vhost does. Thanks
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index c8784dfafdd7..e76245daa7f6 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -827,14 +827,13 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock) msg.msg_flags &= ~MSG_MORE; } - /* TODO: Check specific error and bomb out unless ENOBUFS? */ err = sock->ops->sendmsg(sock, &msg, len); - if (unlikely(err < 0)) { + if (unlikely(err == -EAGAIN || err == -ENOMEM)) { vhost_discard_vq_desc(vq, 1); vhost_net_enable_vq(net, vq); break; } - if (err != len) + if (err >= 0 && err != len) pr_debug("Truncated TX packet: len %d != %zd\n", err, len); done: @@ -922,7 +921,6 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock) msg.msg_flags &= ~MSG_MORE; } - /* TODO: Check specific error and bomb out unless ENOBUFS? */ err = sock->ops->sendmsg(sock, &msg, len); if (unlikely(err < 0)) { if (zcopy_used) { @@ -931,11 +929,13 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock) nvq->upend_idx = ((unsigned)nvq->upend_idx - 1) % UIO_MAXIOV; } - vhost_discard_vq_desc(vq, 1); - vhost_net_enable_vq(net, vq); - break; + if (err == -EAGAIN || err == -ENOMEM) { + vhost_discard_vq_desc(vq, 1); + vhost_net_enable_vq(net, vq); + break; + } } - if (err != len) + if (err >= 0 && err != len) pr_debug("Truncated TX packet: " " len %d != %zd\n", err, len); if (!zcopy_used)