[net,V2] virtio-net: correctly enable callback during start_xmit

Message ID 20221215032719.72294-1-jasowang@redhat.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Jason Wang Dec. 15, 2022, 3:27 a.m. UTC
Commit a7766ef18b33 ("virtio_net: disable cb aggressively") enables
the virtqueue callback via the following statement:

	do {
		......
	} while (use_napi && kick &&
		 unlikely(!virtqueue_enable_cb_delayed(sq->vq)));

When NAPI is used and kick is false, the callback won't be enabled
here. When the virtqueue is about to be full, TX is disabled, but the
TX interrupt is still not enabled, which causes a TX hang. This can be
observed when using pktgen with burst enabled.

Fix this by trying to enable the TX interrupt after we disable TX when
we're not using NAPI or kick is false.

Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
The patch is needed for -stable.
Changes since V1:
- enable tx interrupt after we disable tx
---
 drivers/net/virtio_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
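
The hang described in the commit message can be illustrated with a small
userspace model. This is only a sketch under simplifying assumptions: the
toy_* names, the ring size and the threshold are hypothetical stand-ins, not
kernel APIs, and device completion is reduced to "wake the queue if the
callback is enabled".

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_RING_SIZE 8
#define TOY_STOP_THRESHOLD 2 /* stands in for 2 + MAX_SKB_FRAGS */

/* Toy send queue: all names here are hypothetical, not kernel APIs. */
struct toy_sq {
	int num_free;    /* free descriptors left in the ring */
	bool cb_enabled; /* would a completion raise a TX interrupt? */
	bool stopped;    /* models netif_stop_subqueue() */
};

/* One transmit call. `patched` selects the condition from this patch. */
static void toy_start_xmit(struct toy_sq *sq, bool use_napi, bool kick,
			   bool patched)
{
	/* Cleanup loop: NAPI mode polls with callbacks off and only
	 * re-enables them when use_napi && kick. */
	if (use_napi)
		sq->cb_enabled = kick;

	assert(sq->num_free > 0);
	sq->num_free--; /* queue one packet */

	if (sq->num_free < TOY_STOP_THRESHOLD) {
		sq->stopped = true;
		if (patched ? (!use_napi || !kick) : !use_napi)
			sq->cb_enabled = true;
	}
}

/* Fill the ring until TX stops, let the device complete everything,
 * and report whether the queue is woken again. */
static bool toy_tx_recovers(bool use_napi, bool kick, bool patched)
{
	struct toy_sq sq = { TOY_RING_SIZE, false, false };

	while (!sq.stopped)
		toy_start_xmit(&sq, use_napi, kick, patched);

	/* Device completion: only an enabled callback wakes the queue. */
	if (sq.cb_enabled) {
		sq.num_free = TOY_RING_SIZE;
		sq.stopped = false;
	}
	return !sq.stopped;
}
```

In this model, toy_tx_recovers(true, false, false) stays stopped (NAPI with
kick false and the old condition), while the same call with patched set to
true is woken again.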

Comments

Michael S. Tsirkin Dec. 15, 2022, 9:02 a.m. UTC | #1
On Thu, Dec 15, 2022 at 11:27:19AM +0800, Jason Wang wrote:
> Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
> virtqueue callback via the following statement:
> 
>         do {
>            ......
> 	} while (use_napi && kick &&
>                unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> 
> When NAPI is used and kick is false, the callback won't be enabled
> here. And when the virtqueue is about to be full, the tx will be
> disabled, but we still don't enable tx interrupt which will cause a TX
> hang. This could be observed when using pktgen with burst enabled.
> 
> Fixing this by trying to enable tx interrupt after we disable TX when
> we're not using napi or kick is false.
> 
> Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> The patch is needed for -stable.
> Changes since V1:
> - enable tx interrupt after we disable tx
> ---
>  drivers/net/virtio_net.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 86e52454b5b5..dcf3a536d78a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1873,7 +1873,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>  	 */
>  	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
>  		netif_stop_subqueue(dev, qnum);
> -		if (!use_napi &&
> +		if ((!use_napi || !kick) &&
>  		    unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
>  			/* More just got used, free them then recheck. */
>  			free_old_xmit_skbs(sq, false);

This will work but the following lines are:

			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
				netif_start_subqueue(dev, qnum);
				virtqueue_disable_cb(sq->vq);
			}


and I thought we are supposed to keep callbacks enabled with napi?
One of the ideas of napi is to free on napi callback, not here
immediately.

I think it is easier to just do a separate branch here. Along the
lines of:

		if (use_napi) {
			if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
				virtqueue_napi_schedule(napi, vq);
		} else {
			... old code ...
		}

also reduces chances of regressions on !napi (which is not well tested)
and keeps callbacks off while we free skbs.

No?
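
A rough userspace sketch of the separate-branch idea suggested here, to make
the difference between the two paths concrete. Everything prefixed toy_ is a
hypothetical stand-in; in particular toy_enable_cb_delayed() only
approximates virtqueue_enable_cb_delayed() by returning false whenever any
completion is already pending, and the "old code" branch is a simplified
reading of the existing free-and-recheck logic.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_STOP_THRESHOLD 2 /* stands in for 2 + MAX_SKB_FRAGS */

/* Hypothetical stand-in for the send queue state; not a kernel type. */
struct toy_txq {
	int num_free;        /* free descriptors */
	int pending_used;    /* buffers the device already completed */
	bool cb_enabled;     /* TX callback currently enabled */
	bool stopped;        /* models netif_stop_subqueue() */
	bool napi_scheduled; /* models virtqueue_napi_schedule() */
};

/* Enables the callback; returns false if completions are already
 * pending (a simplification of the real race check). */
static bool toy_enable_cb_delayed(struct toy_txq *q)
{
	q->cb_enabled = true;
	return q->pending_used == 0;
}

/* Ring-almost-full path with a separate NAPI branch. */
static void toy_handle_ring_full(struct toy_txq *q, bool use_napi)
{
	assert(q->num_free < TOY_STOP_THRESHOLD);
	q->stopped = true;

	if (use_napi) {
		/* Leave freeing to the NAPI poll handler. */
		if (!toy_enable_cb_delayed(q))
			q->napi_scheduled = true;
	} else {
		/* "old code": free inline, then recheck. */
		if (!toy_enable_cb_delayed(q)) {
			q->num_free += q->pending_used;
			q->pending_used = 0;
			if (q->num_free >= TOY_STOP_THRESHOLD) {
				q->stopped = false;
				q->cb_enabled = false;
			}
		}
	}
}
```

With completions pending, the NAPI branch keeps the callback enabled and
defers freeing to the poll handler, while the non-NAPI branch frees inline,
restarts the queue and turns the callback off again.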


> -- 
> 2.25.1
Jason Wang Dec. 15, 2022, 9:15 a.m. UTC | #2
On Thu, Dec 15, 2022 at 5:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Dec 15, 2022 at 11:27:19AM +0800, Jason Wang wrote:
> > Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
> > virtqueue callback via the following statement:
> >
> >         do {
> >            ......
> >       } while (use_napi && kick &&
> >                unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> >
> > When NAPI is used and kick is false, the callback won't be enabled
> > here. And when the virtqueue is about to be full, the tx will be
> > disabled, but we still don't enable tx interrupt which will cause a TX
> > hang. This could be observed when using pktgen with burst enabled.
> >
> > Fixing this by trying to enable tx interrupt after we disable TX when
> > we're not using napi or kick is false.
> >
> > Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> > The patch is needed for -stable.
> > Changes since V1:
> > - enable tx interrupt after we disable tx
> > ---
> >  drivers/net/virtio_net.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 86e52454b5b5..dcf3a536d78a 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -1873,7 +1873,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> >        */
> >       if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> >               netif_stop_subqueue(dev, qnum);
> > -             if (!use_napi &&
> > +             if ((!use_napi || !kick) &&
> >                   unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> >                       /* More just got used, free them then recheck. */
> >                       free_old_xmit_skbs(sq, false);
>
> This will work but the following lines are:
>
>                        if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
>                                 netif_start_subqueue(dev, qnum);
>                                 virtqueue_disable_cb(sq->vq);
>                         }
>
>
> and I thought we are supposed to keep callbacks enabled with napi?

This seems to be the opposite logic of commit a7766ef18b33 that
disables callbacks for NAPI.

It said:

    There are currently two cases where we poll TX vq not in response to a
    callback: start xmit and rx napi.  We currently do this with callbacks
    enabled which can cause extra interrupts from the card.  Used not to be
    a big issue as we run with interrupts disabled but that is no longer the
    case, and in some cases the rate of spurious interrupts is so high
    linux detects this and actually kills the interrupt.

My understanding is that it tries to disable callbacks on TX.

> One of the ideas of napi is to free on napi callback, not here
> immediately.
>
> I think it is easier to just do a separate branch here. Along the
> lines of:
>
>                 if (use_napi) {
>                         if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
>                                 virtqueue_napi_schedule(napi, vq);

This seems to be a new logic and it causes some delay in processing TX
(unnecessary NAPI).

>                 } else {
>                         ... old code ...
>                 }
>
> also reduces chances of regressions on !napi (which is not well tested)
> and keeps callbacks off while we free skbs.

I think my patch doesn't change the logic of !napi? (It checks !use_napi || !kick).

Thanks

>
> No?
>
>
> > --
> > 2.25.1
>
Michael S. Tsirkin Dec. 15, 2022, 9:34 a.m. UTC | #3
On Thu, Dec 15, 2022 at 05:15:43PM +0800, Jason Wang wrote:
> On Thu, Dec 15, 2022 at 5:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Dec 15, 2022 at 11:27:19AM +0800, Jason Wang wrote:
> > > Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
> > > virtqueue callback via the following statement:
> > >
> > >         do {
> > >            ......
> > >       } while (use_napi && kick &&
> > >                unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > >
> > > When NAPI is used and kick is false, the callback won't be enabled
> > > here. And when the virtqueue is about to be full, the tx will be
> > > disabled, but we still don't enable tx interrupt which will cause a TX
> > > hang. This could be observed when using pktgen with burst enabled.
> > >
> > > Fixing this by trying to enable tx interrupt after we disable TX when
> > > we're not using napi or kick is false.
> > >
> > > Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > > The patch is needed for -stable.
> > > Changes since V1:
> > > - enable tx interrupt after we disable tx
> > > ---
> > >  drivers/net/virtio_net.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 86e52454b5b5..dcf3a536d78a 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -1873,7 +1873,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > >        */
> > >       if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > >               netif_stop_subqueue(dev, qnum);
> > > -             if (!use_napi &&
> > > +             if ((!use_napi || !kick) &&
> > >                   unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> > >                       /* More just got used, free them then recheck. */
> > >                       free_old_xmit_skbs(sq, false);
> >
> > This will work but the following lines are:
> >
> >                        if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> >                                 netif_start_subqueue(dev, qnum);
> >                                 virtqueue_disable_cb(sq->vq);
> >                         }
> >
> >
> > and I thought we are supposed to keep callbacks enabled with napi?
> 
> This seems to be the opposite logic of commit a7766ef18b33 that
> disables callbacks for NAPI.
> 
> It said:
> 
>     There are currently two cases where we poll TX vq not in response to a
>     callback: start xmit and rx napi.  We currently do this with callbacks
>     enabled which can cause extra interrupts from the card.  Used not to be
>     a big issue as we run with interrupts disabled but that is no longer the
>     case, and in some cases the rate of spurious interrupts is so high
>     linux detects this and actually kills the interrupt.
> 
> My understanding is that it tries to disable callbacks on TX.

I think we want to disable callbacks while polling, yes. here we are not
polling, and I think we want a callback because otherwise nothing will
orphan skbs and a socket can be blocked, not transmitting anything - a
deadlock.

> > One of the ideas of napi is to free on napi callback, not here
> > immediately.
> >
> > I think it is easier to just do a separate branch here. Along the
> > lines of:
> >
> >                 if (use_napi) {
> >                         if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> >                                 virtqueue_napi_schedule(napi, vq);
> 
> This seems to be a new logic and it causes some delay in processing TX
> (unnecessary NAPI).

That's good, we overloaded the queue so we are already going
too fast, deferring tx so queue has chance to drain
will allow better batching in the qdisc.

> >                 } else {
> >                         ... old code ...
> >                 }
> >
> > also reduces chances of regressions on !napi (which is not well tested)
> > and keeps callbacks off while we free skbs.
> 
> I think my patch doesn't change the logic of !napi? (It checks !use_napi || !kick).
> 
> Thanks

I agree it doesn't seem to as written.

> >
> > No?
> >
> >
> > > --
> > > 2.25.1
> >
Jason Wang Dec. 16, 2022, 3:43 a.m. UTC | #4
On Thu, Dec 15, 2022 at 5:35 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Dec 15, 2022 at 05:15:43PM +0800, Jason Wang wrote:
> > On Thu, Dec 15, 2022 at 5:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Thu, Dec 15, 2022 at 11:27:19AM +0800, Jason Wang wrote:
> > > > Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
> > > > virtqueue callback via the following statement:
> > > >
> > > >         do {
> > > >            ......
> > > >       } while (use_napi && kick &&
> > > >                unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > >
> > > > When NAPI is used and kick is false, the callback won't be enabled
> > > > here. And when the virtqueue is about to be full, the tx will be
> > > > disabled, but we still don't enable tx interrupt which will cause a TX
> > > > hang. This could be observed when using pktgen with burst enabled.
> > > >
> > > > Fixing this by trying to enable tx interrupt after we disable TX when
> > > > we're not using napi or kick is false.
> > > >
> > > > Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > ---
> > > > The patch is needed for -stable.
> > > > Changes since V1:
> > > > - enable tx interrupt after we disable tx
> > > > ---
> > > >  drivers/net/virtio_net.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > index 86e52454b5b5..dcf3a536d78a 100644
> > > > --- a/drivers/net/virtio_net.c
> > > > +++ b/drivers/net/virtio_net.c
> > > > @@ -1873,7 +1873,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > >        */
> > > >       if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > > >               netif_stop_subqueue(dev, qnum);
> > > > -             if (!use_napi &&
> > > > +             if ((!use_napi || !kick) &&
> > > >                   unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> > > >                       /* More just got used, free them then recheck. */
> > > >                       free_old_xmit_skbs(sq, false);
> > >
> > > This will work but the following lines are:
> > >
> > >                        if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> > >                                 netif_start_subqueue(dev, qnum);
> > >                                 virtqueue_disable_cb(sq->vq);
> > >                         }
> > >
> > >
> > > and I thought we are supposed to keep callbacks enabled with napi?
> >
> > This seems to be the opposite logic of commit a7766ef18b33 that
> > disables callbacks for NAPI.
> >
> > It said:
> >
> >     There are currently two cases where we poll TX vq not in response to a
> >     callback: start xmit and rx napi.  We currently do this with callbacks
> >     enabled which can cause extra interrupts from the card.  Used not to be
> >     a big issue as we run with interrupts disabled but that is no longer the
> >     case, and in some cases the rate of spurious interrupts is so high
> >     linux detects this and actually kills the interrupt.
> >
> > My understanding is that it tries to disable callbacks on TX.
>
> I think we want to disable callbacks while polling, yes. here we are not
> polling, and I think we want a callback because otherwise nothing will
> orphan skbs and a socket can be blocked, not transmitting anything - a
> deadlock.

I'm not sure how I got here, did you mean a partial revert of
a7766ef18b33 (the part that disables TX callbacks on start_xmit)?

Btw, I plan to remove non-NAPI mode completely, since it has been disabled
by default for years and we haven't seen any complaints; then we could have
modern features like BQL and better TCP performance. In that sense we
may simply keep the tx callback enabled, as most modern NICs do.

>
> > > One of the ideas of napi is to free on napi callback, not here
> > > immediately.
> > >
> > > I think it is easier to just do a separate branch here. Along the
> > > lines of:
> > >
> > >                 if (use_napi) {
> > >                         if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> > >                                 virtqueue_napi_schedule(napi, vq);
> >
> > This seems to be a new logic and it causes some delay in processing TX
> > (unnecessary NAPI).
>
> That's good, we overloaded the queue so we are already going
> too fast, deferring tx so queue has chance to drain
> will allow better batching in the qdisc.

I meant, compared to

1) schedule NAPI and poll TX

the current code does

2) poll TX immediately

and 2) seems faster?

Thanks

>
> > >                 } else {
> > >                         ... old code ...
> > >                 }
> > >
> > > also reduces chances of regressions on !napi (which is not well tested)
> > > and keeps callbacks off while we free skbs.
> >
> > I think my patch doesn't change the logic of !napi? (It checks !use_napi || !kick).
> >
> > Thanks
>
> I agree it doesn't seem to as written.
>
> > >
> > > No?
> > >
> > >
> > > > --
> > > > 2.25.1
> > >
>
Jason Wang Dec. 23, 2022, 6:29 a.m. UTC | #5
On Fri, Dec 16, 2022 at 11:43 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Thu, Dec 15, 2022 at 5:35 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Thu, Dec 15, 2022 at 05:15:43PM +0800, Jason Wang wrote:
> > > On Thu, Dec 15, 2022 at 5:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Thu, Dec 15, 2022 at 11:27:19AM +0800, Jason Wang wrote:
> > > > > Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
> > > > > virtqueue callback via the following statement:
> > > > >
> > > > >         do {
> > > > >            ......
> > > > >       } while (use_napi && kick &&
> > > > >                unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > > >
> > > > > When NAPI is used and kick is false, the callback won't be enabled
> > > > > here. And when the virtqueue is about to be full, the tx will be
> > > > > disabled, but we still don't enable tx interrupt which will cause a TX
> > > > > hang. This could be observed when using pktgen with burst enabled.
> > > > >
> > > > > Fixing this by trying to enable tx interrupt after we disable TX when
> > > > > we're not using napi or kick is false.
> > > > >
> > > > > Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > ---
> > > > > The patch is needed for -stable.
> > > > > Changes since V1:
> > > > > - enable tx interrupt after we disable tx
> > > > > ---
> > > > >  drivers/net/virtio_net.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > index 86e52454b5b5..dcf3a536d78a 100644
> > > > > --- a/drivers/net/virtio_net.c
> > > > > +++ b/drivers/net/virtio_net.c
> > > > > @@ -1873,7 +1873,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > > >        */
> > > > >       if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > > > >               netif_stop_subqueue(dev, qnum);
> > > > > -             if (!use_napi &&
> > > > > +             if ((!use_napi || !kick) &&
> > > > >                   unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> > > > >                       /* More just got used, free them then recheck. */
> > > > >                       free_old_xmit_skbs(sq, false);
> > > >
> > > > This will work but the following lines are:
> > > >
> > > >                        if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> > > >                                 netif_start_subqueue(dev, qnum);
> > > >                                 virtqueue_disable_cb(sq->vq);
> > > >                         }
> > > >
> > > >
> > > > and I thought we are supposed to keep callbacks enabled with napi?
> > >
> > > This seems to be the opposite logic of commit a7766ef18b33 that
> > > disables callbacks for NAPI.
> > >
> > > It said:
> > >
> > >     There are currently two cases where we poll TX vq not in response to a
> > >     callback: start xmit and rx napi.  We currently do this with callbacks
> > >     enabled which can cause extra interrupts from the card.  Used not to be
> > >     a big issue as we run with interrupts disabled but that is no longer the
> > >     case, and in some cases the rate of spurious interrupts is so high
> > >     linux detects this and actually kills the interrupt.
> > >
> > > My understanding is that it tries to disable callbacks on TX.
> >
> > I think we want to disable callbacks while polling, yes. here we are not
> > polling, and I think we want a callback because otherwise nothing will
> > orphan skbs and a socket can be blocked, not transmitting anything - a
> > deadlock.
>
> I'm not sure how I got here, did you mean a partial revert of
> a7766ef18b33 (the part that disables TX callbacks on start_xmit)?

Michael, any idea on this?

Thanks

>
> Btw, I plan to remove non-NAPI mode completely, since it has been disabled
> by default for years and we haven't seen any complaints; then we could have
> modern features like BQL and better TCP performance. In that sense we
> may simply keep the tx callback enabled, as most modern NICs do.
>
> >
> > > > One of the ideas of napi is to free on napi callback, not here
> > > > immediately.
> > > >
> > > > I think it is easier to just do a separate branch here. Along the
> > > > lines of:
> > > >
> > > >                 if (use_napi) {
> > > >                         if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
> > > >                                 virtqueue_napi_schedule(napi, vq);
> > >
> > > This seems to be a new logic and it causes some delay in processing TX
> > > (unnecessary NAPI).
> >
> > That's good, we overloaded the queue so we are already going
> > too fast, deferring tx so queue has chance to drain
> > will allow better batching in the qdisc.
>
> I meant, compare to
>
> 1) schedule NAPI and poll TX
>
> The current code did
>
> 2) poll TX immediately
>
> 2) seems faster?
>
> Thanks
>
> >
> > > >                 } else {
> > > >                         ... old code ...
> > > >                 }
> > > >
> > > > also reduces chances of regressions on !napi (which is not well tested)
> > > > and keeps callbacks off while we free skbs.
> > >
> > > I think my patch doesn't change the logic of !napi? (It checks !use_napi || !kick).
> > >
> > > Thanks
> >
> > I agree it doesn't seem to as written.
> >
> > > >
> > > > No?
> > > >
> > > >
> > > > > --
> > > > > 2.25.1
> > > >
> >
Jason Wang Jan. 4, 2023, 4:23 a.m. UTC | #6
On 2022/12/23 14:29, Jason Wang wrote:
> On Fri, Dec 16, 2022 at 11:43 AM Jason Wang <jasowang@redhat.com> wrote:
>> On Thu, Dec 15, 2022 at 5:35 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>> On Thu, Dec 15, 2022 at 05:15:43PM +0800, Jason Wang wrote:
>>>> On Thu, Dec 15, 2022 at 5:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>> On Thu, Dec 15, 2022 at 11:27:19AM +0800, Jason Wang wrote:
>>>>>> Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
>>>>>> virtqueue callback via the following statement:
>>>>>>
>>>>>>          do {
>>>>>>             ......
>>>>>>        } while (use_napi && kick &&
>>>>>>                 unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
>>>>>>
>>>>>> When NAPI is used and kick is false, the callback won't be enabled
>>>>>> here. And when the virtqueue is about to be full, the tx will be
>>>>>> disabled, but we still don't enable tx interrupt which will cause a TX
>>>>>> hang. This could be observed when using pktgen with burst enabled.
>>>>>>
>>>>>> Fixing this by trying to enable tx interrupt after we disable TX when
>>>>>> we're not using napi or kick is false.
>>>>>>
>>>>>> Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
>>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>> ---
>>>>>> The patch is needed for -stable.
>>>>>> Changes since V1:
>>>>>> - enable tx interrupt after we disable tx
>>>>>> ---
>>>>>>   drivers/net/virtio_net.c | 2 +-
>>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>>> index 86e52454b5b5..dcf3a536d78a 100644
>>>>>> --- a/drivers/net/virtio_net.c
>>>>>> +++ b/drivers/net/virtio_net.c
>>>>>> @@ -1873,7 +1873,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>>>>>>         */
>>>>>>        if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
>>>>>>                netif_stop_subqueue(dev, qnum);
>>>>>> -             if (!use_napi &&
>>>>>> +             if ((!use_napi || !kick) &&
>>>>>>                    unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
>>>>>>                        /* More just got used, free them then recheck. */
>>>>>>                        free_old_xmit_skbs(sq, false);
>>>>> This will work but the following lines are:
>>>>>
>>>>>                         if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
>>>>>                                  netif_start_subqueue(dev, qnum);
>>>>>                                  virtqueue_disable_cb(sq->vq);
>>>>>                          }
>>>>>
>>>>>
>>>>> and I thought we are supposed to keep callbacks enabled with napi?
>>>> This seems to be the opposite logic of commit a7766ef18b33 that
>>>> disables callbacks for NAPI.
>>>>
>>>> It said:
>>>>
>>>>      There are currently two cases where we poll TX vq not in response to a
>>>>      callback: start xmit and rx napi.  We currently do this with callbacks
>>>>      enabled which can cause extra interrupts from the card.  Used not to be
>>>>      a big issue as we run with interrupts disabled but that is no longer the
>>>>      case, and in some cases the rate of spurious interrupts is so high
>>>>      linux detects this and actually kills the interrupt.
>>>>
>>>> My understanding is that it tries to disable callbacks on TX.
>>> I think we want to disable callbacks while polling, yes. here we are not
>>> polling, and I think we want a callback because otherwise nothing will
>>> orphan skbs and a socket can be blocked, not transmitting anything - a
>>> deadlock.
>> I'm not sure how I got here, did you mean a partial revert of
>> a7766ef18b33 (the part that disables TX callbacks on start_xmit)?
> Michael, any idea on this?
>
> Thanks


Michael, any comment?

Thanks
Michael S. Tsirkin Jan. 4, 2023, 6:46 a.m. UTC | #7
On Wed, Jan 04, 2023 at 12:23:07PM +0800, Jason Wang wrote:
> 
> On 2022/12/23 14:29, Jason Wang wrote:
> > On Fri, Dec 16, 2022 at 11:43 AM Jason Wang <jasowang@redhat.com> wrote:
> > > On Thu, Dec 15, 2022 at 5:35 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > On Thu, Dec 15, 2022 at 05:15:43PM +0800, Jason Wang wrote:
> > > > > On Thu, Dec 15, 2022 at 5:02 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > On Thu, Dec 15, 2022 at 11:27:19AM +0800, Jason Wang wrote:
> > > > > > > Commit a7766ef18b33("virtio_net: disable cb aggressively") enables
> > > > > > > virtqueue callback via the following statement:
> > > > > > > 
> > > > > > >          do {
> > > > > > >             ......
> > > > > > >        } while (use_napi && kick &&
> > > > > > >                 unlikely(!virtqueue_enable_cb_delayed(sq->vq)));
> > > > > > > 
> > > > > > > When NAPI is used and kick is false, the callback won't be enabled
> > > > > > > here. And when the virtqueue is about to be full, the tx will be
> > > > > > > disabled, but we still don't enable tx interrupt which will cause a TX
> > > > > > > hang. This could be observed when using pktgen with burst enabled.
> > > > > > > 
> > > > > > > Fixing this by trying to enable tx interrupt after we disable TX when
> > > > > > > we're not using napi or kick is false.
> > > > > > > 
> > > > > > > Fixes: a7766ef18b33 ("virtio_net: disable cb aggressively")
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > ---
> > > > > > > The patch is needed for -stable.
> > > > > > > Changes since V1:
> > > > > > > - enable tx interrupt after we disable tx
> > > > > > > ---
> > > > > > >   drivers/net/virtio_net.c | 2 +-
> > > > > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > > > > > index 86e52454b5b5..dcf3a536d78a 100644
> > > > > > > --- a/drivers/net/virtio_net.c
> > > > > > > +++ b/drivers/net/virtio_net.c
> > > > > > > @@ -1873,7 +1873,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> > > > > > >         */
> > > > > > >        if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> > > > > > >                netif_stop_subqueue(dev, qnum);
> > > > > > > -             if (!use_napi &&
> > > > > > > +             if ((!use_napi || !kick) &&
> > > > > > >                    unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> > > > > > >                        /* More just got used, free them then recheck. */
> > > > > > >                        free_old_xmit_skbs(sq, false);
> > > > > > This will work but the following lines are:
> > > > > > 
> > > > > >                         if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> > > > > >                                  netif_start_subqueue(dev, qnum);
> > > > > >                                  virtqueue_disable_cb(sq->vq);
> > > > > >                          }
> > > > > > 
> > > > > > 
> > > > > > and I thought we are supposed to keep callbacks enabled with napi?
> > > > > This seems to be the opposite logic of commit a7766ef18b33 that
> > > > > disables callbacks for NAPI.
> > > > > 
> > > > > It said:
> > > > > 
> > > > >      There are currently two cases where we poll TX vq not in response to a
> > > > >      callback: start xmit and rx napi.  We currently do this with callbacks
> > > > >      enabled which can cause extra interrupts from the card.  Used not to be
> > > > >      a big issue as we run with interrupts disabled but that is no longer the
> > > > >      case, and in some cases the rate of spurious interrupts is so high
> > > > >      linux detects this and actually kills the interrupt.
> > > > > 
> > > > > My understanding is that it tries to disable callbacks on TX.
> > > > I think we want to disable callbacks while polling, yes. here we are not
> > > > polling, and I think we want a callback because otherwise nothing will
> > > > orphan skbs and a socket can be blocked, not transmitting anything - a
> > > > deadlock.
> > > I'm not sure how I got here, did you mean a partial revert of
> > > a7766ef18b33 (the part that disables TX callbacks on start_xmit)?
> > Michael, any idea on this?
> > 
> > Thanks
> 
> 
> Michael, any comment?
> 
> Thanks

Sorry I don't understand the question. What does "how I got here" mean?
To repeat my suggestion:

	I think it is easier to just do a separate branch here. Along the
	lines of:

			if (use_napi) {
				if (unlikely(!virtqueue_enable_cb_delayed(sq->vq)))
					virtqueue_napi_schedule(napi, vq);
			} else {
				... old code ...
			}

We can also backport this minimal safe fix; any refactoring can be done on
top.
Patch

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 86e52454b5b5..dcf3a536d78a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1873,7 +1873,7 @@  static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 */
 	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
 		netif_stop_subqueue(dev, qnum);
-		if (!use_napi &&
+		if ((!use_napi || !kick) &&
 		    unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
 			/* More just got used, free them then recheck. */
 			free_old_xmit_skbs(sq, false);
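
To see which cases the one-line change affects, the left-hand operand of
the old and new conditions can be compared over all four combinations of
use_napi and kick. This is a standalone sketch: old_cond(), new_cond() and
count_changed_cases() are hypothetical helper names, and the right-hand
virtqueue_enable_cb_delayed() call is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Left operand before the patch: !use_napi */
static bool old_cond(bool use_napi, bool kick)
{
	(void)kick; /* kick was not consulted before the patch */
	return !use_napi;
}

/* Left operand after the patch: (!use_napi || !kick) */
static bool new_cond(bool use_napi, bool kick)
{
	return !use_napi || !kick;
}

/* Print the truth table and count combinations where the result differs. */
static int count_changed_cases(void)
{
	int changed = 0;

	for (int n = 0; n <= 1; n++) {
		for (int k = 0; k <= 1; k++) {
			bool o = old_cond(n, k);
			bool w = new_cond(n, k);

			printf("use_napi=%d kick=%d old=%d new=%d\n",
			       n, k, o, w);
			if (o != w) {
				/* Only NAPI with kick == false changes. */
				assert(n && !k);
				changed++;
			}
		}
	}
	return changed;
}
```

Only the use_napi && !kick combination differs, so the non-NAPI behaviour
of this branch is unchanged by the patch as written.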