Message ID | 20221026142624.19314-1-zajec5@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | net: broadcom: bcm4908_enet: report queued and transmitted bytes |
On 10/26/2022 7:26 AM, Rafał Miłecki wrote:
> From: Rafał Miłecki <rafal@milecki.pl>
>
> This allows BQL to operate avoiding buffer bloat and reducing latency.
>
> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
> ---
>  drivers/net/ethernet/broadcom/bcm4908_enet.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/net/ethernet/broadcom/bcm4908_enet.c b/drivers/net/ethernet/broadcom/bcm4908_enet.c
> index 93ccf549e2ed..e672a9ef4444 100644
> --- a/drivers/net/ethernet/broadcom/bcm4908_enet.c
> +++ b/drivers/net/ethernet/broadcom/bcm4908_enet.c
> @@ -495,6 +495,7 @@ static int bcm4908_enet_stop(struct net_device *netdev)
>  	netif_carrier_off(netdev);
>  	napi_disable(&rx_ring->napi);
>  	napi_disable(&tx_ring->napi);
> +	netdev_reset_queue(netdev);
>
>  	bcm4908_enet_dma_rx_ring_disable(enet, &enet->rx_ring);
>  	bcm4908_enet_dma_tx_ring_disable(enet, &enet->tx_ring);
> @@ -564,6 +565,8 @@ static netdev_tx_t bcm4908_enet_start_xmit(struct sk_buff *skb, struct net_devic
>  	enet->netdev->stats.tx_bytes += skb->len;
>  	enet->netdev->stats.tx_packets++;
>
> +	netdev_sent_queue(enet->netdev, skb->len);

There is an opportunity for fixing a use-after-free here: after you call bcm4908_enet_dma_tx_ring_enable(), the hardware can start transmission right away and also call the TX completion handler, so you could be dereferencing a freed skb at this point. Also, to ensure that DMA is actually functional, it is recommended to increase TX stats in the TX completion handler, since that indicates that you have a functional completion process.

So long story short, if you record the skb length *before* calling bcm4908_enet_dma_tx_ring_enable() and use that for reporting sent bytes, you should be good.
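A minimal userspace sketch of the ordering suggested above, not driver code: the length is copied out of the skb before the buffer is handed over, and only the cached copy is used afterwards. Every `model_*` name, the stub `struct sk_buff`, and the "hardware completes immediately" behaviour are hypothetical stand-ins chosen for illustration; the real driver would call `netdev_sent_queue()` and its own ring helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct sk_buff; only the length matters here. */
struct sk_buff {
	unsigned int len;
};

/* Models the byte counter that netdev_sent_queue() would feed. */
unsigned int model_bql_queued;

void model_sent_queue(unsigned int bytes)
{
	model_bql_queued += bytes;
}

/*
 * Models the worst case: the hardware transmits and the TX completion
 * path frees the skb as soon as the buffer is handed over.
 */
void model_hand_to_hw(struct sk_buff **skb)
{
	free(*skb);
	*skb = NULL;
}

/* Transmit path: cache the length first, never touch skb after the hand-over. */
unsigned int model_start_xmit(struct sk_buff *skb)
{
	unsigned int len = skb->len;	/* read while the skb is still ours */

	model_sent_queue(len);		/* account using the cached value */
	model_hand_to_hw(&skb);		/* skb may be gone from here on */

	return len;
}

/* Small driver for the sketch: returns the bytes accounted for one packet. */
unsigned int model_demo_xmit(unsigned int payload_len)
{
	struct sk_buff *skb = malloc(sizeof(*skb));

	assert(skb != NULL);
	skb->len = payload_len;
	model_bql_queued = 0;
	model_start_xmit(skb);

	return model_bql_queued;
}
```

Calling `model_demo_xmit(1500)` accounts 1500 bytes without ever reading the freed buffer; the same idea would apply to the `stats.tx_bytes` update if it stays in the transmit path.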
On 26.10.2022 16:58, Florian Fainelli wrote:
> On 10/26/2022 7:26 AM, Rafał Miłecki wrote:
>> From: Rafał Miłecki <rafal@milecki.pl>
>>
>> This allows BQL to operate avoiding buffer bloat and reducing latency.
>>
>> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>> ---
>>  drivers/net/ethernet/broadcom/bcm4908_enet.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/broadcom/bcm4908_enet.c b/drivers/net/ethernet/broadcom/bcm4908_enet.c
>> index 93ccf549e2ed..e672a9ef4444 100644
>> --- a/drivers/net/ethernet/broadcom/bcm4908_enet.c
>> +++ b/drivers/net/ethernet/broadcom/bcm4908_enet.c
>> @@ -495,6 +495,7 @@ static int bcm4908_enet_stop(struct net_device *netdev)
>>  	netif_carrier_off(netdev);
>>  	napi_disable(&rx_ring->napi);
>>  	napi_disable(&tx_ring->napi);
>> +	netdev_reset_queue(netdev);
>>  	bcm4908_enet_dma_rx_ring_disable(enet, &enet->rx_ring);
>>  	bcm4908_enet_dma_tx_ring_disable(enet, &enet->tx_ring);
>> @@ -564,6 +565,8 @@ static netdev_tx_t bcm4908_enet_start_xmit(struct sk_buff *skb, struct net_devic
>>  	enet->netdev->stats.tx_bytes += skb->len;
>>  	enet->netdev->stats.tx_packets++;
>> +	netdev_sent_queue(enet->netdev, skb->len);
>
> There is an opportunity for fixing a use-after-free here: after you call bcm4908_enet_dma_tx_ring_enable(), the hardware can start transmission right away and also call the TX completion handler, so you could be dereferencing a freed skb at this point. Also, to ensure that DMA is actually functional, it is recommended to increase TX stats in the TX completion handler, since that indicates that you have a functional completion process.

I see the problem, thanks!

Actually, the hardware may start transmission even earlier: right after the coherent buf_desc struct is filled.

> So long story short, if you record the skb length *before* calling bcm4908_enet_dma_tx_ring_enable() and use that for reporting sent bytes, you should be good.

I may still end up calling netdev_completed_queue() for data for which I didn't call netdev_sent_queue() yet. Is that safe?

Maybe I should just call netdev_sent_queue() before updating the buf_desc?
On 10/26/22 08:12, Rafał Miłecki wrote:
> On 26.10.2022 16:58, Florian Fainelli wrote:
>> On 10/26/2022 7:26 AM, Rafał Miłecki wrote:
>>> From: Rafał Miłecki <rafal@milecki.pl>
>>>
>>> This allows BQL to operate avoiding buffer bloat and reducing latency.
>>>
>>> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>>> ---
>>>  drivers/net/ethernet/broadcom/bcm4908_enet.c | 7 +++++++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/broadcom/bcm4908_enet.c
>>> b/drivers/net/ethernet/broadcom/bcm4908_enet.c
>>> index 93ccf549e2ed..e672a9ef4444 100644
>>> --- a/drivers/net/ethernet/broadcom/bcm4908_enet.c
>>> +++ b/drivers/net/ethernet/broadcom/bcm4908_enet.c
>>> @@ -495,6 +495,7 @@ static int bcm4908_enet_stop(struct net_device
>>> *netdev)
>>>  	netif_carrier_off(netdev);
>>>  	napi_disable(&rx_ring->napi);
>>>  	napi_disable(&tx_ring->napi);
>>> +	netdev_reset_queue(netdev);
>>>  	bcm4908_enet_dma_rx_ring_disable(enet, &enet->rx_ring);
>>>  	bcm4908_enet_dma_tx_ring_disable(enet, &enet->tx_ring);
>>> @@ -564,6 +565,8 @@ static netdev_tx_t bcm4908_enet_start_xmit(struct
>>> sk_buff *skb, struct net_devic
>>>  	enet->netdev->stats.tx_bytes += skb->len;
>>>  	enet->netdev->stats.tx_packets++;
>>> +	netdev_sent_queue(enet->netdev, skb->len);
>>
>> There is an opportunity for fixing a use-after-free here: after you
>> call bcm4908_enet_dma_tx_ring_enable(), the hardware can start
>> transmission right away and also call the TX completion handler, so
>> you could be dereferencing a freed skb at this point. Also,
>> to ensure that DMA is actually functional, it is recommended to
>> increase TX stats in the TX completion handler, since that indicates
>> that you have a functional completion process.
>
> I see the problem, thanks!
>
> Actually hw may start transmission even earlier - right after filling
> buf_desc coherent struct.

Not familiar with that hardware, but in principle yes, I suppose once you write a proper address and length the DMA can notice and start transmitting. Also, even though you are using non-coherent memory, there appears to be a missing dma_wmb() between the store to buf_desc->ctl and buf_desc->addr. There is no explicit dependency between those two stores and subsequent loads or stores, so the processor write buffer could re-order those in theory. Unlikely to happen because this is used on a Cortex-A53 IIRC, but better safe than sorry.

>> So long story short, if you record the skb length *before* calling
>> bcm4908_enet_dma_tx_ring_enable() and use that for reporting sent
>> bytes, you should be good.
>
> I may still end up calling netdev_completed_queue() for data for which
> I didn't call netdev_sent_queue() yet. Is that safe?
>
> Maybe I should just call netdev_sent_queue() before updating the buf_desc?

You would want it to be as close as possible to when you hand the buffer to the hardware, but I see no locking between bcm4908_enet_start_xmit() and bcm4908_enet_irq_handler(), so you already have a race, don't you?
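To make the store-ordering concern concrete, here is a hedged userspace model, not the driver's code. It assumes the control word carries the bit that hands the descriptor to the hardware, so the address must be visible first. The descriptor layout, `MODEL_CTL_OWN`, and the length-in-upper-16-bits encoding are all hypothetical, and a C11 release fence is used only as a stand-in for the kernel's `dma_wmb()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ownership bit: once set, the device may fetch the buffer. */
#define MODEL_CTL_OWN 0x8000u

/* Hypothetical descriptor layout; the real one lives in the driver. */
struct model_bd {
	uint32_t ctl;
	uint32_t addr;
};

/*
 * Publish a descriptor: write the address, order it before the control
 * word, then write the control word that transfers ownership.
 */
void model_fill_bd(volatile struct model_bd *bd, uint32_t dma_addr, uint16_t len)
{
	bd->addr = dma_addr;

	/* Userspace stand-in for dma_wmb(): keep the addr store ahead of ctl. */
	atomic_thread_fence(memory_order_release);

	bd->ctl = ((uint32_t)len << 16) | MODEL_CTL_OWN;
}

/* Helper for the sketch: fill a fresh descriptor and return a copy of it. */
struct model_bd model_demo_fill(uint32_t dma_addr, uint16_t len)
{
	struct model_bd bd = { 0, 0 };

	model_fill_bd(&bd, dma_addr, len);

	return bd;
}
```

The barrier only orders the two descriptor stores against each other; it does nothing about the separate question of whether the transmit path and the completion path can run concurrently.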
On 26.10.2022 16:26, Rafał Miłecki wrote:
> From: Rafał Miłecki <rafal@milecki.pl>
>
> This allows BQL to operate avoiding buffer bloat and reducing latency.
>
> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>

Please drop it, I'll work on V2.
diff --git a/drivers/net/ethernet/broadcom/bcm4908_enet.c b/drivers/net/ethernet/broadcom/bcm4908_enet.c
index 93ccf549e2ed..e672a9ef4444 100644
--- a/drivers/net/ethernet/broadcom/bcm4908_enet.c
+++ b/drivers/net/ethernet/broadcom/bcm4908_enet.c
@@ -495,6 +495,7 @@ static int bcm4908_enet_stop(struct net_device *netdev)
 	netif_carrier_off(netdev);
 	napi_disable(&rx_ring->napi);
 	napi_disable(&tx_ring->napi);
+	netdev_reset_queue(netdev);

 	bcm4908_enet_dma_rx_ring_disable(enet, &enet->rx_ring);
 	bcm4908_enet_dma_tx_ring_disable(enet, &enet->tx_ring);
@@ -564,6 +565,8 @@ static netdev_tx_t bcm4908_enet_start_xmit(struct sk_buff *skb, struct net_devic
 	enet->netdev->stats.tx_bytes += skb->len;
 	enet->netdev->stats.tx_packets++;

+	netdev_sent_queue(enet->netdev, skb->len);
+
 	return NETDEV_TX_OK;
 }

@@ -635,6 +638,7 @@ static int bcm4908_enet_poll_tx(struct napi_struct *napi, int weight)
 	struct bcm4908_enet_dma_ring_bd *buf_desc;
 	struct bcm4908_enet_dma_ring_slot *slot;
 	struct device *dev = enet->dev;
+	unsigned int bytes = 0;
 	int handled = 0;

 	while (handled < weight && tx_ring->read_idx != tx_ring->write_idx) {
@@ -645,6 +649,7 @@ static int bcm4908_enet_poll_tx(struct napi_struct *napi, int weight)
 		dma_unmap_single(dev, slot->dma_addr, slot->len, DMA_TO_DEVICE);
 		dev_kfree_skb(slot->skb);
+		bytes += slot->len;

 		if (++tx_ring->read_idx == tx_ring->length)
 			tx_ring->read_idx = 0;
@@ -656,6 +661,8 @@ static int bcm4908_enet_poll_tx(struct napi_struct *napi, int weight)
 		bcm4908_enet_dma_ring_intrs_on(enet, tx_ring);
 	}

+	netdev_completed_queue(enet->netdev, handled, bytes);
+
 	if (netif_queue_stopped(enet->netdev))
 		netif_wake_queue(enet->netdev);
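The byte accounting this patch wires up (sent in the transmit path, completed in the TX poll, reset on stop) can be modelled in a few lines of userspace C. This is only a sketch of the invariant discussed in the thread, namely that completions must not run ahead of what was reported as sent. The `model_*` names and the error return are hypothetical; as far as I know the kernel's dynamic queue limits code treats a completion larger than the in-flight count as a bug rather than returning an error.

```c
#include <assert.h>

/* Hypothetical model of the per-queue BQL counters. */
struct model_bql {
	unsigned int queued;	/* total bytes reported as sent */
	unsigned int completed;	/* total bytes reported as completed */
};

/* Counterpart of netdev_sent_queue() in this model. */
void model_sent(struct model_bql *q, unsigned int bytes)
{
	q->queued += bytes;
}

/*
 * Counterpart of netdev_completed_queue() in this model.
 * Returns 0 on success, -1 if more bytes complete than are in flight.
 */
int model_completed(struct model_bql *q, unsigned int bytes)
{
	if (bytes > q->queued - q->completed)
		return -1;

	q->completed += bytes;
	return 0;
}

/* Counterpart of netdev_reset_queue(): forget everything on stop. */
void model_reset(struct model_bql *q)
{
	q->queued = 0;
	q->completed = 0;
}

unsigned int model_inflight(const struct model_bql *q)
{
	return q->queued - q->completed;
}

/* Two packets sent, the first one completed: the second is still in flight. */
unsigned int model_demo_inflight(void)
{
	struct model_bql q = { 0, 0 };

	model_sent(&q, 1500);
	model_sent(&q, 64);
	model_completed(&q, 1500);

	return model_inflight(&q);
}

/* A completion that arrives before the matching "sent" report is rejected. */
int model_demo_early_completion(void)
{
	struct model_bql q = { 0, 0 };

	return model_completed(&q, 100);
}
```

In this model the early-completion case is exactly the situation asked about in the thread: if the completion path can observe a descriptor before the transmit path has reported its bytes, the counters go out of step.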