Message ID | 20230630120306.8534-1-florian.kauer@linutronix.de (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v3] igc: Prevent garbled TX queue with XDP ZEROCOPY | expand |
On Fri, Jun 30, 2023 at 02:03:06PM +0200, Florian Kauer wrote: > In normal operation, each populated queue item has > next_to_watch pointing to the last TX desc of the packet, > while each cleaned item has it set to 0. In particular, > next_to_use that points to the next (necessarily clean) > item to use has next_to_watch set to 0. > > When the TX queue is used both by an application using > AF_XDP with ZEROCOPY as well as a second non-XDP application > generating high traffic, the queue pointers can get in > an invalid state where next_to_use points to an item > where next_to_watch is NOT set to 0. > > However, the implementation assumes at several places > that this is never the case, so if it does hold, > bad things happen. In particular, within the loop inside > of igc_clean_tx_irq(), next_to_clean can overtake next_to_use. > Finally, this prevents any further transmission via > this queue and it never gets unblocked or signaled. > Secondly, if the queue is in this garbled state, > the inner loop of igc_clean_tx_ring() will never terminate, > completely hogging a CPU core. > > The reason is that igc_xdp_xmit_zc() reads next_to_use > before acquiring the lock, and writing it back > (potentially unmodified) later. If it got modified > before locking, the outdated next_to_use is written > pointing to an item that was already used elsewhere > (and thus next_to_watch got written). > > Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy") > Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> > Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> > Tested-by: Kurt Kanzenbach <kurt@linutronix.de> > Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com>
On 6/30/2023 15:03, Florian Kauer wrote: > In normal operation, each populated queue item has > next_to_watch pointing to the last TX desc of the packet, > while each cleaned item has it set to 0. In particular, > next_to_use that points to the next (necessarily clean) > item to use has next_to_watch set to 0. > > When the TX queue is used both by an application using > AF_XDP with ZEROCOPY as well as a second non-XDP application > generating high traffic, the queue pointers can get in > an invalid state where next_to_use points to an item > where next_to_watch is NOT set to 0. > > However, the implementation assumes at several places > that this is never the case, so if it does hold, > bad things happen. In particular, within the loop inside > of igc_clean_tx_irq(), next_to_clean can overtake next_to_use. > Finally, this prevents any further transmission via > this queue and it never gets unblocked or signaled. > Secondly, if the queue is in this garbled state, > the inner loop of igc_clean_tx_ring() will never terminate, > completely hogging a CPU core. > > The reason is that igc_xdp_xmit_zc() reads next_to_use > before acquiring the lock, and writing it back > (potentially unmodified) later. If it got modified > before locking, the outdated next_to_use is written > pointing to an item that was already used elsewhere > (and thus next_to_watch got written). > > Fixes: 9acf59a752d4 ("igc: Enable TX via AF_XDP zero-copy") > Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> > Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> > Tested-by: Kurt Kanzenbach <kurt@linutronix.de> > Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> > --- > > v2 -> v3: > Resolve merge conflict > > v1 -> v2: > I added some more context for further clarification, > but it is also just how I interpret the code. > Also the typo is fixed and it is reverse christmas again
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 019ce91c45aa..722ffcc319f0 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -2833,9 +2833,8 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) struct netdev_queue *nq = txring_txq(ring); union igc_adv_tx_desc *tx_desc = NULL; int cpu = smp_processor_id(); - u16 ntu = ring->next_to_use; struct xdp_desc xdp_desc; - u16 budget; + u16 budget, ntu; if (!netif_carrier_ok(ring->netdev)) return; @@ -2845,6 +2844,7 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) /* Avoid transmit queue timeout since we share it with the slow path */ txq_trans_cond_update(nq); + ntu = ring->next_to_use; budget = igc_desc_unused(ring); while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) {