Message ID | 20221013050044.11862-1-akihiko.odaki@daynix.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Series | e1000e: Fix TX dispatch condition |
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Akihiko Odaki
> Sent: Thursday, October 13, 2022 10:31 AM
> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Yuri Benditovich
> <yuri.benditovich@daynix.com>; Eric Dumazet <edumazet@google.com>;
> Jakub Kicinski <kuba@kernel.org>; Yan Vugenfirer <yan@daynix.com>;
> intel-wired-lan@lists.osuosl.org; Paolo Abeni <pabeni@redhat.com>;
> David S. Miller <davem@davemloft.net>
> Subject: [Intel-wired-lan] [PATCH] e1000e: Fix TX dispatch condition
>
> e1000_xmit_frame is expected to stop the queue and dispatch frames to
> hardware if there is not sufficient space for the next frame in the buffer, but
> sometimes it failed to do so because the estimated maximum size of frame
> was wrong. As the consequence, the later invocation of e1000_xmit_frame
> failed with NETDEV_TX_BUSY, and the frame in the buffer remained forever,
> resulting in a watchdog failure.
>
> This change fixes the estimated size by making it match with the condition for
> NETDEV_TX_BUSY. Apparently, the old estimation failed to account for the
> following lines which determines the space requirement for not causing
> NETDEV_TX_BUSY:
> > 	/* reserve a descriptor for the offload context */
> > 	if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
> > 		count++;
> > 	count++;
> >
> > 	count += DIV_ROUND_UP(len, adapter->tx_fifo_limit);
>
> This issue was found with http-stress02 test included in Linux Test Project
> 20220930.
>
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>

Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Dear Akihiko,

Thank you very much for the patch.

On 13.10.22 at 07:00, Akihiko Odaki wrote:
> e1000_xmit_frame is expected to stop the queue and dispatch frames to
> hardware if there is not sufficient space for the next frame in the
> buffer, but sometimes it failed to do so because the estimated maximum
> size of frame was wrong. As the consequence, the later invocation of
> e1000_xmit_frame failed with NETDEV_TX_BUSY, and the frame in the buffer
> remained forever, resulting in a watchdog failure.
>
> This change fixes the estimated size by making it match with the
> condition for NETDEV_TX_BUSY. Apparently, the old estimation failed to
> account for the following lines which determines the space requirement
> for not causing NETDEV_TX_BUSY:
>> 	/* reserve a descriptor for the offload context */
>> 	if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
>> 		count++;
>> 	count++;
>>
>> 	count += DIV_ROUND_UP(len, adapter->tx_fifo_limit);

I’d just use Markdown syntax, and indent by four spaces without > for citation.

> This issue was found with http-stress02 test included in Linux Test
> Project 20220930.

So it was reproduced in QEMU? For convenience, it’d be great if you added the QEMU command.

Also, do you know if this is a regression? If so, it’d be great if you added the Fixes: tag.

Kind regards,

Paul

> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 321f2a95ae3a..da113f5011e9 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -5936,9 +5936,9 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
>  		e1000_tx_queue(tx_ring, tx_flags, count);
>  		/* Make sure there is space in the ring for the next send. */
>  		e1000_maybe_stop_tx(tx_ring,
> -				    (MAX_SKB_FRAGS *
> +				    ((MAX_SKB_FRAGS + 1) *
>  				     DIV_ROUND_UP(PAGE_SIZE,
> -						  adapter->tx_fifo_limit) + 2));
> +						  adapter->tx_fifo_limit) + 4));
>
>  		if (!netdev_xmit_more() ||
>  		    netif_xmit_stopped(netdev_get_tx_queue(netdev, 0))) {
On 10/13/2022 08:00, Akihiko Odaki wrote:
> e1000_xmit_frame is expected to stop the queue and dispatch frames to
> hardware if there is not sufficient space for the next frame in the
> buffer, but sometimes it failed to do so because the estimated maximum
> size of frame was wrong. As the consequence, the later invocation of
> e1000_xmit_frame failed with NETDEV_TX_BUSY, and the frame in the buffer
> remained forever, resulting in a watchdog failure.
>
> This change fixes the estimated size by making it match with the
> condition for NETDEV_TX_BUSY. Apparently, the old estimation failed to
> account for the following lines which determines the space requirement
> for not causing NETDEV_TX_BUSY:
>> 	/* reserve a descriptor for the offload context */
>> 	if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
>> 		count++;
>> 	count++;
>>
>> 	count += DIV_ROUND_UP(len, adapter->tx_fifo_limit);
>
> This issue was found with http-stress02 test included in Linux Test
> Project 20220930.
>
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Tested-by: Naama Meir <naamax.meir@linux.intel.com>
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 321f2a95ae3a..da113f5011e9 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5936,9 +5936,9 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
 		e1000_tx_queue(tx_ring, tx_flags, count);
 		/* Make sure there is space in the ring for the next send. */
 		e1000_maybe_stop_tx(tx_ring,
-				    (MAX_SKB_FRAGS *
+				    ((MAX_SKB_FRAGS + 1) *
 				     DIV_ROUND_UP(PAGE_SIZE,
-						  adapter->tx_fifo_limit) + 2));
+						  adapter->tx_fifo_limit) + 4));

 		if (!netdev_xmit_more() ||
 		    netif_xmit_stopped(netdev_get_tx_queue(netdev, 0))) {
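
For readers who want to check the arithmetic behind the two constants, the following is a rough user-space sketch, not driver code. It compares the old and new reserve expressions from the diff with a worst-case per-frame descriptor count built from the lines quoted in the commit message. The function names, the per-fragment accounting (one page per fragment, linear part at most one page) and the two-descriptor gap added on top of the count are assumptions made for illustration; they are not shown in this thread.

```c
#include <assert.h>

/* Same rounding helper as the kernel macro of this name. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Reserve passed to e1000_maybe_stop_tx() before the patch (from the diff). */
static unsigned int old_reserve(unsigned int max_frags, unsigned int page_size,
				unsigned int fifo_limit)
{
	return max_frags * DIV_ROUND_UP(page_size, fifo_limit) + 2;
}

/* Reserve passed to e1000_maybe_stop_tx() after the patch (from the diff). */
static unsigned int new_reserve(unsigned int max_frags, unsigned int page_size,
				unsigned int fifo_limit)
{
	return (max_frags + 1) * DIV_ROUND_UP(page_size, fifo_limit) + 4;
}

/*
 * Hypothetical worst-case need of the next frame. The first three steps
 * follow the lines quoted in the commit message; the fragment term and
 * the final "+ 2" gap are assumptions of this sketch.
 */
static unsigned int worst_case_need(unsigned int max_frags,
				    unsigned int page_size,
				    unsigned int fifo_limit)
{
	unsigned int count = 0;

	count++;	/* descriptor reserved for the offload context */
	count++;	/* the unconditional count++ from the quoted lines */
	count += DIV_ROUND_UP(page_size, fifo_limit);	/* linear part, assumed <= one page */
	count += max_frags * DIV_ROUND_UP(page_size, fifo_limit); /* assumed: one page per fragment */

	return count + 2;	/* assumed gap required by the busy check */
}

/* Self-check with illustrative values (17 fragments, 4096-byte page and limit). */
static void check_example(void)
{
	assert(old_reserve(17, 4096, 4096) == 19);
	assert(worst_case_need(17, 4096, 4096) == 22);
	assert(new_reserve(17, 4096, 4096) == 22);
}
```

With these illustrative values the old reserve (19) is smaller than the assumed worst-case need (22), so the queue could stay awake even though the next frame would not fit, which is the NETDEV_TX_BUSY situation described above. The new reserve (22) matches the worst case under the same assumptions.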