Message ID | 20221121095631.216209-2-hch@lst.de (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [1/2] net: fec: use dma_alloc_noncoherent for m532x | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
Hi Christoph, On Mon, Nov 21, 2022 at 10:56 AM Christoph Hellwig <hch@lst.de> wrote: > The m532x coldfire platforms can't properly implement dma_alloc_coherent > and currently just return noncoherent memory from it. The fec driver > than works around this with a flush of all caches in the receive path. > Make this hack a little less bad by using the explicit > dma_alloc_noncoherent API and documenting the hacky cache flushes. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Thanks for your patch! > --- a/drivers/net/ethernet/freescale/fec_main.c > +++ b/drivers/net/ethernet/freescale/fec_main.c > @@ -1580,6 +1580,10 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) > struct page *page; > > #ifdef CONFIG_M532x > + /* > + * Hacky flush of all caches instead of using the DMA API for the TSO > + * headers. > + */ > flush_cache_all(); > #endif > rxq = fep->rx_queue[queue_id]; > @@ -3123,10 +3127,17 @@ static void fec_enet_free_queue(struct net_device *ndev) > for (i = 0; i < fep->num_tx_queues; i++) > if (fep->tx_queue[i] && fep->tx_queue[i]->tso_hdrs) { > txq = fep->tx_queue[i]; > +#ifdef CONFIG_M532x Shouldn't this be the !CONFIG_M532x path? > dma_free_coherent(&fep->pdev->dev, > txq->bd.ring_size * TSO_HEADER_SIZE, > txq->tso_hdrs, > txq->tso_hdrs_dma); > +#else > + dma_free_noncoherent(&fep->pdev->dev, > + txq->bd.ring_size * TSO_HEADER_SIZE, > + txq->tso_hdrs, txq->tso_hdrs_dma, > + DMA_BIDIRECTIONAL); > +#endif > } > > for (i = 0; i < fep->num_rx_queues; i++) > @@ -3157,10 +3168,18 @@ static int fec_enet_alloc_queue(struct net_device *ndev) > txq->tx_wake_threshold = > (txq->bd.ring_size - txq->tx_stop_threshold) / 2; > > +#ifdef CONFIG_M532x Likewise > txq->tso_hdrs = dma_alloc_coherent(&fep->pdev->dev, > txq->bd.ring_size * TSO_HEADER_SIZE, > &txq->tso_hdrs_dma, > GFP_KERNEL); > +#else > + /* m68knommu manually flushes all caches in fec_enet_rx_queue */ > + txq->tso_hdrs = dma_alloc_noncoherent(&fep->pdev->dev, > + txq->bd.ring_size * TSO_HEADER_SIZE, > + &txq->tso_hdrs_dma, DMA_BIDIRECTIONAL, > + GFP_KERNEL); > +#endif > if (!txq->tso_hdrs) { > ret = -ENOMEM; > goto alloc_failed; Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Mon, Nov 21, 2022 at 11:24:29AM +0100, Geert Uytterhoeven wrote: > > +#ifdef CONFIG_M532x > > Shouldn't this be the !CONFIG_M532x path? Yes.
Hi Christoph, On 21/11/22 19:56, Christoph Hellwig wrote: > The m532x coldfire platforms can't properly implement dma_alloc_coherent > and currently just return noncoherent memory from it. The fec driver > than works around this with a flush of all caches in the receive path. > Make this hack a little less bad by using the explicit > dma_alloc_noncoherent API and documenting the hacky cache flushes. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > drivers/net/ethernet/freescale/fec_main.c | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > > diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c > index 28ef4d3c18789..5230698310b5e 100644 > --- a/drivers/net/ethernet/freescale/fec_main.c > +++ b/drivers/net/ethernet/freescale/fec_main.c > @@ -1580,6 +1580,10 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) > struct page *page; > > #ifdef CONFIG_M532x > + /* > + * Hacky flush of all caches instead of using the DMA API for the TSO > + * headers. > + */ > flush_cache_all(); > #endif > rxq = fep->rx_queue[queue_id]; > @@ -3123,10 +3127,17 @@ static void fec_enet_free_queue(struct net_device *ndev) > for (i = 0; i < fep->num_tx_queues; i++) > if (fep->tx_queue[i] && fep->tx_queue[i]->tso_hdrs) { > txq = fep->tx_queue[i]; > +#ifdef CONFIG_M532x > dma_free_coherent(&fep->pdev->dev, > txq->bd.ring_size * TSO_HEADER_SIZE, > txq->tso_hdrs, > txq->tso_hdrs_dma); > +#else > + dma_free_noncoherent(&fep->pdev->dev, > + txq->bd.ring_size * TSO_HEADER_SIZE, > + txq->tso_hdrs, txq->tso_hdrs_dma, > + DMA_BIDIRECTIONAL); > +#endif > } > > for (i = 0; i < fep->num_rx_queues; i++) > @@ -3157,10 +3168,18 @@ static int fec_enet_alloc_queue(struct net_device *ndev) > txq->tx_wake_threshold = > (txq->bd.ring_size - txq->tx_stop_threshold) / 2; > > +#ifdef CONFIG_M532x > txq->tso_hdrs = dma_alloc_coherent(&fep->pdev->dev, > txq->bd.ring_size * TSO_HEADER_SIZE, > &txq->tso_hdrs_dma, > GFP_KERNEL); Even with this corrected this will now end up failing on all other ColdFire types with the FEC hardware module (all the non-M532x types) once the arch_dma_alloc() returns NULL. Did you mean "ifndef CONFIG_COLDFIRE" here? > +#else > + /* m68knommu manually flushes all caches in fec_enet_rx_queue */ > + txq->tso_hdrs = dma_alloc_noncoherent(&fep->pdev->dev, > + txq->bd.ring_size * TSO_HEADER_SIZE, > + &txq->tso_hdrs_dma, DMA_BIDIRECTIONAL, > + GFP_KERNEL); > +#endif > if (!txq->tso_hdrs) { > ret = -ENOMEM; > goto alloc_failed; And what about the dmam_alloc_coherent() call in fec_enet_init()? Does that need changing too? Regards Greg
On Thu, Dec 01, 2022 at 03:41:34PM +1000, Greg Ungerer wrote: >> #ifdef CONFIG_M532x >> + /* >> + * Hacky flush of all caches instead of using the DMA API for the TSO >> + * headers. >> + */ >> flush_cache_all(); > > Even with this corrected this will now end up failing on all other ColdFire types > with the FEC hardware module (all the non-M532x types) once the arch_dma_alloc() > returns NULL. > > Did you mean "ifndef CONFIG_COLDFIRE" here? How did these work before given that the cache flush is conditional on CONFIG_M532x? >> +#else >> + /* m68knommu manually flushes all caches in fec_enet_rx_queue */ >> + txq->tso_hdrs = dma_alloc_noncoherent(&fep->pdev->dev, >> + txq->bd.ring_size * TSO_HEADER_SIZE, >> + &txq->tso_hdrs_dma, DMA_BIDIRECTIONAL, >> + GFP_KERNEL); >> +#endif >> if (!txq->tso_hdrs) { >> ret = -ENOMEM; >> goto alloc_failed; > > And what about the dmam_alloc_coherent() call in fec_enet_init()? > Does that need changing too? If that's actually use by the FEC implementations on coldire: yes. But maybe I need even more help on how the cache flushing is suppoѕed to actually work here.
On 2/12/22 17:02, Christoph Hellwig wrote: > On Thu, Dec 01, 2022 at 03:41:34PM +1000, Greg Ungerer wrote: >>> #ifdef CONFIG_M532x >>> + /* >>> + * Hacky flush of all caches instead of using the DMA API for the TSO >>> + * headers. >>> + */ >>> flush_cache_all(); >> >> Even with this corrected this will now end up failing on all other ColdFire types >> with the FEC hardware module (all the non-M532x types) once the arch_dma_alloc() >> returns NULL. >> >> Did you mean "ifndef CONFIG_COLDFIRE" here? > > How did these work before given that the cache flush is conditional > on CONFIG_M532x? The case for the 5272 is that it only has instruction cache, no data cache. That was the first supported part, and the one I have used the most over the years (though not so much recently). So it should be ok. I am not convinced about the other version 2 cores either. They all have both instruction and data cache - they can be flexibily configured to be all instruction, or a mix of instruction and data - but the default is both. I don't have a 532x platform, so I have never done any work on that. The flush_cache_all() for that seems dubious to me. >>> +#else >>> + /* m68knommu manually flushes all caches in fec_enet_rx_queue */ >>> + txq->tso_hdrs = dma_alloc_noncoherent(&fep->pdev->dev, >>> + txq->bd.ring_size * TSO_HEADER_SIZE, >>> + &txq->tso_hdrs_dma, DMA_BIDIRECTIONAL, >>> + GFP_KERNEL); >>> +#endif >>> if (!txq->tso_hdrs) { >>> ret = -ENOMEM; >>> goto alloc_failed; >> >> And what about the dmam_alloc_coherent() call in fec_enet_init()? >> Does that need changing too? > > If that's actually use by the FEC implementations on coldire: yes. It is. Testing these changes failed at boot without changing this one too. > But maybe I need even more help on how the cache flushing is suppoѕed > to actually work here. Regards Greg
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 28ef4d3c18789..5230698310b5e 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1580,6 +1580,10 @@ fec_enet_rx_queue(struct net_device *ndev, int budget, u16 queue_id) struct page *page; #ifdef CONFIG_M532x + /* + * Hacky flush of all caches instead of using the DMA API for the TSO + * headers. + */ flush_cache_all(); #endif rxq = fep->rx_queue[queue_id]; @@ -3123,10 +3127,17 @@ static void fec_enet_free_queue(struct net_device *ndev) for (i = 0; i < fep->num_tx_queues; i++) if (fep->tx_queue[i] && fep->tx_queue[i]->tso_hdrs) { txq = fep->tx_queue[i]; +#ifdef CONFIG_M532x dma_free_coherent(&fep->pdev->dev, txq->bd.ring_size * TSO_HEADER_SIZE, txq->tso_hdrs, txq->tso_hdrs_dma); +#else + dma_free_noncoherent(&fep->pdev->dev, + txq->bd.ring_size * TSO_HEADER_SIZE, + txq->tso_hdrs, txq->tso_hdrs_dma, + DMA_BIDIRECTIONAL); +#endif } for (i = 0; i < fep->num_rx_queues; i++) @@ -3157,10 +3168,18 @@ static int fec_enet_alloc_queue(struct net_device *ndev) txq->tx_wake_threshold = (txq->bd.ring_size - txq->tx_stop_threshold) / 2; +#ifdef CONFIG_M532x txq->tso_hdrs = dma_alloc_coherent(&fep->pdev->dev, txq->bd.ring_size * TSO_HEADER_SIZE, &txq->tso_hdrs_dma, GFP_KERNEL); +#else + /* m68knommu manually flushes all caches in fec_enet_rx_queue */ + txq->tso_hdrs = dma_alloc_noncoherent(&fep->pdev->dev, + txq->bd.ring_size * TSO_HEADER_SIZE, + &txq->tso_hdrs_dma, DMA_BIDIRECTIONAL, + GFP_KERNEL); +#endif if (!txq->tso_hdrs) { ret = -ENOMEM; goto alloc_failed;
The m532x coldfire platforms can't properly implement dma_alloc_coherent and currently just return noncoherent memory from it. The fec driver than works around this with a flush of all caches in the receive path. Make this hack a little less bad by using the explicit dma_alloc_noncoherent API and documenting the hacky cache flushes. Signed-off-by: Christoph Hellwig <hch@lst.de> --- drivers/net/ethernet/freescale/fec_main.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)