Message ID | 20241012-en7581-bql-v2-1-4deb4efdb60b@kernel.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [net-next,v2] net: airoha: Implement BQL support | expand |
On Sat, 12 Oct 2024 11:01:11 +0200 Lorenzo Bianconi wrote: > Introduce BQL support in the airoha_eth driver reporting to the kernel > info about tx hw DMA queues in order to avoid bufferbloat and keep the > latency small. TBH I haven't looked at the code again, but when I looked at v1 I was surprised you don't have a reset in airoha_qdma_cleanup_tx_queue(). Are you sure it's okay? It's a common bug not to reset the BQL state when queue is purged while stopping the interface.
On Oct 15, Jakub Kicinski wrote: > On Sat, 12 Oct 2024 11:01:11 +0200 Lorenzo Bianconi wrote: > > Introduce BQL support in the airoha_eth driver reporting to the kernel > > info about tx hw DMA queues in order to avoid bufferbloat and keep the > > latency small. > > TBH I haven't looked at the code again, but when I looked at v1 I was > surprised you don't have a reset in airoha_qdma_cleanup_tx_queue(). > Are you sure it's okay? It's a common bug not to reset the BQL state > when queue is purged while stopping the interface. So far airoha_qdma_cleanup_tx_queue() is called just in airoha_hw_cleanup() that in turn runs just when the module is removed (airoha_remove()). Do we need it? Regards, Lorenzo
On Tue, 15 Oct 2024 16:39:08 +0200 Lorenzo Bianconi wrote: > On Oct 15, Jakub Kicinski wrote: > > On Sat, 12 Oct 2024 11:01:11 +0200 Lorenzo Bianconi wrote: > > > Introduce BQL support in the airoha_eth driver reporting to the kernel > > > info about tx hw DMA queues in order to avoid bufferbloat and keep the > > > latency small. > > > > TBH I haven't looked at the code again, but when I looked at v1 I was > > surprised you don't have a reset in airoha_qdma_cleanup_tx_queue(). > > Are you sure it's okay? It's a common bug not to reset the BQL state > > when queue is purged while stopping the interface. > > So far airoha_qdma_cleanup_tx_queue() is called just in airoha_hw_cleanup() > that in turn runs just when the module is removed (airoha_remove()). > Do we need it? Oh, thought its called on stop. In that case we're probably good from BQL perspective. But does it mean potentially very stale packets can sit on the Tx ring when the device is stopped, until it's started again?
On Oct 15, Jakub Kicinski wrote: > On Tue, 15 Oct 2024 16:39:08 +0200 Lorenzo Bianconi wrote: > > On Oct 15, Jakub Kicinski wrote: > > > On Sat, 12 Oct 2024 11:01:11 +0200 Lorenzo Bianconi wrote: > > > > Introduce BQL support in the airoha_eth driver reporting to the kernel > > > > info about tx hw DMA queues in order to avoid bufferbloat and keep the > > > > latency small. > > > > > > TBH I haven't looked at the code again, but when I looked at v1 I was > > > surprised you don't have a reset in airoha_qdma_cleanup_tx_queue(). > > > Are you sure it's okay? It's a common bug not to reset the BQL state > > > when queue is purged while stopping the interface. > > > > So far airoha_qdma_cleanup_tx_queue() is called just in airoha_hw_cleanup() > > that in turn runs just when the module is removed (airoha_remove()). > > Do we need it? > > Oh, thought its called on stop. In that case we're probably good > from BQL perspective. > > But does it mean potentially very stale packets can sit on the Tx > ring when the device is stopped, until it's started again? Do you mean the packets that the stack is transmitting when the .ndo_stop() is run? In airoha_dev_stop() we call netif_tx_disable() to disable the transmission on new packets and inflight packets will be consumed by the completion napi, is it not enough? I guess we can even add netdev_tx_reset_subqueue() for all netdev queues in airoha_dev_stop(), I do not have a strong opinion about it. What do you prefer? Regards, Lorenzo
On Tue, 15 Oct 2024 17:54:59 +0200 Lorenzo Bianconi wrote: > > Oh, thought its called on stop. In that case we're probably good > > from BQL perspective. > > > > But does it mean potentially very stale packets can sit on the Tx > > ring when the device is stopped, until it's started again? > > Do you mean the packets that the stack is transmitting when the .ndo_stop() > is run? Whatever is in the queue at the time ndo_stop() gets called. Could be the full descriptor ring I presume? > In airoha_dev_stop() we call netif_tx_disable() to disable the transmission > on new packets and inflight packets will be consumed by the completion napi, > is it not enough? They will only get consumed if the DMA gets to them right? Stop seems to stop the DMA. > I guess we can even add netdev_tx_reset_subqueue() for all netdev > queues in airoha_dev_stop(), I do not have a strong opinion about it. What > do you prefer? So to be clear I think this patch is correct as of the current driver code. I'm just wondering if we should call airoha_qdma_cleanup_tx_queue() on stop as well, and then that should come with the reset. I think having a packet stuck in a queue may lead to all sort of oddness so my recommendation would be to flush the queues.
> On Tue, 15 Oct 2024 17:54:59 +0200 Lorenzo Bianconi wrote: > > > Oh, thought its called on stop. In that case we're probably good > > > from BQL perspective. > > > > > > But does it mean potentially very stale packets can sit on the Tx > > > ring when the device is stopped, until it's started again? > > > > Do you mean the packets that the stack is transmitting when the .ndo_stop() > > is run? > > Whatever is in the queue at the time ndo_stop() gets called. > Could be the full descriptor ring I presume? > > > In airoha_dev_stop() we call netif_tx_disable() to disable the transmission > > on new packets and inflight packets will be consumed by the completion napi, > > is it not enough? > > They will only get consumed if the DMA gets to them right? > Stop seems to stop the DMA. > > > I guess we can even add netdev_tx_reset_subqueue() for all netdev > > queues in airoha_dev_stop(), I do not have a strong opinion about it. What > > do you prefer? > > So to be clear I think this patch is correct as of the current driver > code. I'm just wondering if we should call airoha_qdma_cleanup_tx_queue() > on stop as well, and then that should come with the reset. > I think having a packet stuck in a queue may lead to all sort of oddness > so my recommendation would be to flush the queues. ack, I will post a fix for this. Do you prefer to resend even this patch? Up to you. Regards, Lorenzo
On Tue, 15 Oct 2024 18:35:22 +0200 Lorenzo Bianconi wrote: > > So to be clear I think this patch is correct as of the current driver > > code. I'm just wondering if we should call airoha_qdma_cleanup_tx_queue() > > on stop as well, and then that should come with the reset. > > I think having a packet stuck in a queue may lead to all sort of oddness > > so my recommendation would be to flush the queues. > > ack, I will post a fix for this. Do you prefer to resend even this patch? Up to you. I'll take it. I don't think the stop rework will be a fix.
Hello: This patch was applied to netdev/net-next.git (main) by Jakub Kicinski <kuba@kernel.org>: On Sat, 12 Oct 2024 11:01:11 +0200 you wrote: > Introduce BQL support in the airoha_eth driver reporting to the kernel > info about tx hw DMA queues in order to avoid bufferbloat and keep the > latency small. > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> > --- > Changes in v2: > - simplify BQL support and rebase on top of net-next main > - Link to v1: https://lore.kernel.org/r/20240930-en7581-bql-v1-1-064cdd570068@kernel.org > > [...] Here is the summary with links: - [net-next,v2] net: airoha: Implement BQL support https://git.kernel.org/netdev/net-next/c/1d304174106c You are awesome, thank you!
diff --git a/drivers/net/ethernet/mediatek/airoha_eth.c b/drivers/net/ethernet/mediatek/airoha_eth.c index e037f725f6d3505a8b91815ae26322f5d1b8590c..836a957aad77bec972c536567a4ee7d304ac7b52 100644 --- a/drivers/net/ethernet/mediatek/airoha_eth.c +++ b/drivers/net/ethernet/mediatek/airoha_eth.c @@ -1709,9 +1709,11 @@ static int airoha_qdma_tx_napi_poll(struct napi_struct *napi, int budget) WRITE_ONCE(desc->msg1, 0); if (skb) { + u16 queue = skb_get_queue_mapping(skb); struct netdev_queue *txq; - txq = netdev_get_tx_queue(skb->dev, qid); + txq = netdev_get_tx_queue(skb->dev, queue); + netdev_tx_completed_queue(txq, 1, skb->len); if (netif_tx_queue_stopped(txq) && q->ndesc - q->queued >= q->free_thr) netif_tx_wake_queue(txq); @@ -2487,7 +2489,9 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb, q->queued += i; skb_tx_timestamp(skb); - if (!netdev_xmit_more()) + netdev_tx_sent_queue(txq, skb->len); + + if (netif_xmit_stopped(txq) || !netdev_xmit_more()) airoha_qdma_rmw(qdma, REG_TX_CPU_IDX(qid), TX_RING_CPU_IDX_MASK, FIELD_PREP(TX_RING_CPU_IDX_MASK, q->head));
Introduce BQL support in the airoha_eth driver reporting to the kernel info about tx hw DMA queues in order to avoid bufferbloat and keep the latency small. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> --- Changes in v2: - simplify BQL support and rebase on top of net-next main - Link to v1: https://lore.kernel.org/r/20240930-en7581-bql-v1-1-064cdd570068@kernel.org --- drivers/net/ethernet/mediatek/airoha_eth.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) --- base-commit: c531f2269a53db5cf64b24baf785ccbcda52970f change-id: 20240930-en7581-bql-552566f2078e Best regards,