Message ID | 20220519192443.28681-1-mans@mansr.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: fs_enet: sync rx dma buffer before reading | expand |
Le 19/05/2022 à 21:24, Mans Rullgard a écrit : > The dma_sync_single_for_cpu() call must precede reading the received > data. Fix this. See original commit 070e1f01827c. It explicitely says that the cache must be invalidate _AFTER_ the copy. The cache is initialy invalidated by dma_map_single(), so before the copy the cache is already clean. After the copy, data is in the cache. In order to allow re-use of the skb, it must be put back in the same condition as before, in extenso the cache must be invalidated in order to be in the same situation as after dma_map_single(). So I think your change is wrong. > > Fixes: 070e1f01827c ("net: fs_enet: don't unmap DMA when packet len is below copybreak") > Signed-off-by: Mans Rullgard <mans@mansr.com> > --- > drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c > index b3dae17e067e..432ce10cbfd0 100644 > --- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c > +++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c > @@ -240,14 +240,14 @@ static int fs_enet_napi(struct napi_struct *napi, int budget) > /* +2 to make IP header L1 cache aligned */ > skbn = netdev_alloc_skb(dev, pkt_len + 2); > if (skbn != NULL) { > + dma_sync_single_for_cpu(fep->dev, > + CBDR_BUFADDR(bdp), > + L1_CACHE_ALIGN(pkt_len), > + DMA_FROM_DEVICE); > skb_reserve(skbn, 2); /* align IP header */ > skb_copy_from_linear_data(skb, > skbn->data, pkt_len); > swap(skb, skbn); > - dma_sync_single_for_cpu(fep->dev, > - CBDR_BUFADDR(bdp), > - L1_CACHE_ALIGN(pkt_len), > - DMA_FROM_DEVICE); > } > } else { > skbn = netdev_alloc_skb(dev, ENET_RX_FRSIZE); > -- > 2.35.1 >
Christophe Leroy <christophe.leroy@csgroup.eu> writes: > Le 19/05/2022 à 21:24, Mans Rullgard a écrit : >> The dma_sync_single_for_cpu() call must precede reading the received >> data. Fix this. > > See original commit 070e1f01827c. It explicitely says that the cache > must be invalidate _AFTER_ the copy. > > The cache is initialy invalidated by dma_map_single(), so before the > copy the cache is already clean. > > After the copy, data is in the cache. In order to allow re-use of the > skb, it must be put back in the same condition as before, in extenso the > cache must be invalidated in order to be in the same situation as after > dma_map_single(). > > So I think your change is wrong. OK, looking at it more closely, the change is at least unnecessary since there will be a cache invalidation between each use of the buffer either way. Please disregard the patch. Sorry for the noise. >> >> Fixes: 070e1f01827c ("net: fs_enet: don't unmap DMA when packet len is below copybreak") >> Signed-off-by: Mans Rullgard <mans@mansr.com> >> --- >> drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c | 8 ++++---- >> 1 file changed, 4 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c >> index b3dae17e067e..432ce10cbfd0 100644 >> --- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c >> +++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c >> @@ -240,14 +240,14 @@ static int fs_enet_napi(struct napi_struct *napi, int budget) >> /* +2 to make IP header L1 cache aligned */ >> skbn = netdev_alloc_skb(dev, pkt_len + 2); >> if (skbn != NULL) { >> + dma_sync_single_for_cpu(fep->dev, >> + CBDR_BUFADDR(bdp), >> + L1_CACHE_ALIGN(pkt_len), >> + DMA_FROM_DEVICE); >> skb_reserve(skbn, 2); /* align IP header */ >> skb_copy_from_linear_data(skb, >> skbn->data, pkt_len); >> swap(skb, skbn); >> - dma_sync_single_for_cpu(fep->dev, >> - CBDR_BUFADDR(bdp), >> - L1_CACHE_ALIGN(pkt_len), >> - DMA_FROM_DEVICE); >> } >> } else { >> skbn = netdev_alloc_skb(dev, ENET_RX_FRSIZE); >> -- >> 2.35.1 >>
Le 20/05/2022 à 14:35, Måns Rullgård a écrit : > Christophe Leroy <christophe.leroy@csgroup.eu> writes: > >> Le 19/05/2022 à 21:24, Mans Rullgard a écrit : >>> The dma_sync_single_for_cpu() call must precede reading the received >>> data. Fix this. >> >> See original commit 070e1f01827c. It explicitely says that the cache >> must be invalidate _AFTER_ the copy. >> >> The cache is initialy invalidated by dma_map_single(), so before the >> copy the cache is already clean. >> >> After the copy, data is in the cache. In order to allow re-use of the >> skb, it must be put back in the same condition as before, in extenso the >> cache must be invalidated in order to be in the same situation as after >> dma_map_single(). >> >> So I think your change is wrong. > > OK, looking at it more closely, the change is at least unnecessary since > there will be a cache invalidation between each use of the buffer either > way. Please disregard the patch. Sorry for the noise. > I also looked deeper. Indeed it was implemented in kernel 4.9 or 4.8. At that time dma_unmap_single() was a no-op, it was not doing any sync/invalidation at all, invalidation was done only at mapping, so when we were reusing the skb it was necessary to clean the cache _AFTER_ the copy as if it was a new mapping. Today a sync is done at both map and unmap, so it doesn't really matter whether we do the invalidation before or after the copy when we re-use the skb. Christophe
On Fri, 20 May 2022 12:54:56 +0000 Christophe Leroy wrote: > Le 20/05/2022 à 14:35, Måns Rullgård a écrit : > > Christophe Leroy <christophe.leroy@csgroup.eu> writes: > >> See original commit 070e1f01827c. It explicitely says that the cache > >> must be invalidate _AFTER_ the copy. > >> > >> The cache is initialy invalidated by dma_map_single(), so before the > >> copy the cache is already clean. > >> > >> After the copy, data is in the cache. In order to allow re-use of the > >> skb, it must be put back in the same condition as before, in extenso the > >> cache must be invalidated in order to be in the same situation as after > >> dma_map_single(). > >> > >> So I think your change is wrong. > > > > OK, looking at it more closely, the change is at least unnecessary since > > there will be a cache invalidation between each use of the buffer either > > way. Please disregard the patch. Sorry for the noise. > > > > I also looked deeper. > > Indeed it was implemented in kernel 4.9 or 4.8. At that time > dma_unmap_single() was a no-op, it was not doing any sync/invalidation > at all, invalidation was done only at mapping, so when we were reusing > the skb it was necessary to clean the cache _AFTER_ the copy as if it > was a new mapping. > > Today a sync is done at both map and unmap, so it doesn't really matter > whether we do the invalidation before or after the copy when we re-use > the skb. Hm, I think the patch is necessary, sorry if you're also saying that and I'm misinterpreting. Without the dma_sync_single_for_cpu() if swiotlb is used the data will not be copied back into the original buffer if there is no sync.
Le 20/05/2022 à 19:43, Jakub Kicinski a écrit : > On Fri, 20 May 2022 12:54:56 +0000 Christophe Leroy wrote: >> Le 20/05/2022 à 14:35, Måns Rullgård a écrit : >>> Christophe Leroy <christophe.leroy@csgroup.eu> writes: >>>> See original commit 070e1f01827c. It explicitely says that the cache >>>> must be invalidate _AFTER_ the copy. >>>> >>>> The cache is initialy invalidated by dma_map_single(), so before the >>>> copy the cache is already clean. >>>> >>>> After the copy, data is in the cache. In order to allow re-use of the >>>> skb, it must be put back in the same condition as before, in extenso the >>>> cache must be invalidated in order to be in the same situation as after >>>> dma_map_single(). >>>> >>>> So I think your change is wrong. >>> >>> OK, looking at it more closely, the change is at least unnecessary since >>> there will be a cache invalidation between each use of the buffer either >>> way. Please disregard the patch. Sorry for the noise. >>> >> >> I also looked deeper. >> >> Indeed it was implemented in kernel 4.9 or 4.8. At that time >> dma_unmap_single() was a no-op, it was not doing any sync/invalidation >> at all, invalidation was done only at mapping, so when we were reusing >> the skb it was necessary to clean the cache _AFTER_ the copy as if it >> was a new mapping. >> >> Today a sync is done at both map and unmap, so it doesn't really matter >> whether we do the invalidation before or after the copy when we re-use >> the skb. > > Hm, I think the patch is necessary, sorry if you're also saying that > and I'm misinterpreting. Well, I say the contrary. On the mainline the patch may be applied as is, it won't harm. However, it is gets applied to kernel 4.9 (based on the fixes: tag), it will break the driver for at least powerpc 8xx. In 4.9, dma_direct_map_page() invalidates the cache, but dma_direct_unmap_page() is a no-op. It means that when we re-use a skb as we do in fs_enet when the received packet is small, the cache must be invalidated _AFTER_ reading the received data. The driver works like this: allocate an SKB with the largest possible packet size dma_direct_map_page() ==> cache invalidation loop forever wait for some received data in DMA if (received packet is small) allocate a new SKB with the size of the received packet copy received data into the new SKB hand new SKB to network layer invalidate the cache else dma_direct_unmap_page() ==> no-op hand SKB to network layer allocate a new SKB with the largest possible packet size dma_direct_map_page() ==> cache invalidation endif endloop If you don't invalidate the cache _AFTER_ the copy, you have stale data in the cache when you later hand a non-small received packet to the network stack. Invalidating _BEFORE_ the copy is useless as it has already been invalidated at mapping time. In mainline, the DMA handling has been make generic, and cache invalidation is performed at both mapping at unmapping (Which is by the way sub-optimal) so by change it would still work after the patch. > > Without the dma_sync_single_for_cpu() if swiotlb is used the data > will not be copied back into the original buffer if there is no sync. I don't know how SWIOTLB works or even what it is, does any of the microcontrollers embedding freescale ethernet uses that at all ? Christophe
On Sat, 21 May 2022 06:44:41 +0000 Christophe Leroy wrote: > > Hm, I think the patch is necessary, sorry if you're also saying that > > and I'm misinterpreting. > > Well, I say the contrary. > > On the mainline the patch may be applied as is, it won't harm. > > However, it is gets applied to kernel 4.9 (based on the fixes: tag), it > will break the driver for at least powerpc 8xx. I see, we should make a note of that in the commit message so it doesn't get sucked into stable. > > Without the dma_sync_single_for_cpu() if swiotlb is used the data > > will not be copied back into the original buffer if there is no sync. > > I don't know how SWIOTLB works or even what it is, does any of the > microcontrollers embedding freescale ethernet uses that at all ? AFAIU SWIOTLB basically forces the use of bounce buffers even if the device can reach the entire DRAM. I think some people also use it for added security? IDK. I mostly use it to check if I'm using the DMA API "right" :)
On Sat, 21 May 2022 10:44:30 -0700 Jakub Kicinski wrote: > > Well, I say the contrary. > > > > On the mainline the patch may be applied as is, it won't harm. > > > > However, it is gets applied to kernel 4.9 (based on the fixes: tag), it > > will break the driver for at least powerpc 8xx. > > I see, we should make a note of that in the commit message so it doesn't > get sucked into stable. > > > I don't know how SWIOTLB works or even what it is, does any of the > > microcontrollers embedding freescale ethernet uses that at all ? > > AFAIU SWIOTLB basically forces the use of bounce buffers even if the > device can reach the entire DRAM. I think some people also use it for > added security? IDK. I mostly use it to check if I'm using the DMA API > "right" :) If what I said makes sense please repost the patch, the current version has been dropped from patchwork already.
diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c index b3dae17e067e..432ce10cbfd0 100644 --- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c +++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c @@ -240,14 +240,14 @@ static int fs_enet_napi(struct napi_struct *napi, int budget) /* +2 to make IP header L1 cache aligned */ skbn = netdev_alloc_skb(dev, pkt_len + 2); if (skbn != NULL) { + dma_sync_single_for_cpu(fep->dev, + CBDR_BUFADDR(bdp), + L1_CACHE_ALIGN(pkt_len), + DMA_FROM_DEVICE); skb_reserve(skbn, 2); /* align IP header */ skb_copy_from_linear_data(skb, skbn->data, pkt_len); swap(skb, skbn); - dma_sync_single_for_cpu(fep->dev, - CBDR_BUFADDR(bdp), - L1_CACHE_ALIGN(pkt_len), - DMA_FROM_DEVICE); } } else { skbn = netdev_alloc_skb(dev, ENET_RX_FRSIZE);
The dma_sync_single_for_cpu() call must precede reading the received data. Fix this. Fixes: 070e1f01827c ("net: fs_enet: don't unmap DMA when packet len is below copybreak") Signed-off-by: Mans Rullgard <mans@mansr.com> --- drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)