Message ID | ZGJ8ZcPZbckX7VNB@p100 (mailing list archive) |
---|---|
State | Accepted, archived |
Series | parisc: improve cache flushing in arch_sync_dma_for_cpu() |
On 2023-05-15 2:39 p.m., Helge Deller wrote:
> +	case DMA_BIDIRECTIONAL:
> +		flush_kernel_dcache_range(addr, size);
> +		purge_kernel_dcache_range_asm(addr, addr + size);

I don't think flush and purge are both needed.

Dave
On 5/16/23 00:28, John David Anglin wrote:
> On 2023-05-15 2:39 p.m., Helge Deller wrote:
>> +	case DMA_BIDIRECTIONAL:
>> +		flush_kernel_dcache_range(addr, size);
>> +		purge_kernel_dcache_range_asm(addr, addr + size);
> I don't think flush and purge are both needed.

I'm not sure...

Just to fully understand it, is this short summary correct?

- flush_kernel_dcache_range: flush the cache back to memory, but keep the data in the cache.
  The next read fetches the data which is still in the cache, so it doesn't check whether
  the data in memory has been modified in the meantime (e.g. via DMA).
- purge_kernel_dcache_range_asm: ignore the currently cached data and drop any cached data
  in that range. Even if the cache holds dirty data which hasn't been written back yet,
  drop it and don't write it back. The next read will fetch the data from memory, thus
  returning what DMA could have stored there.

Helge
On 2023-05-16 3:09 a.m., Helge Deller wrote:
> On 5/16/23 00:28, John David Anglin wrote:
>> On 2023-05-15 2:39 p.m., Helge Deller wrote:
>>> +	case DMA_BIDIRECTIONAL:
>>> +		flush_kernel_dcache_range(addr, size);
>>> +		purge_kernel_dcache_range_asm(addr, addr + size);
>> I don't think flush and purge are both needed.
>
> I'm not sure...
>
> Just to fully understand it, is this short summary correct?
> - flush_kernel_dcache_range: flush the cache back to memory, but keep the data in the cache.

No. If present, fdc writes the addressed cache line back to memory if and only if
the line is dirty. It then invalidates the line. It does not keep the data in the
cache.

The next read loads from memory.

> The next read fetches the data which is still in the cache, so it doesn't check whether
> the data in memory has been modified in the meantime (e.g. via DMA).
> - purge_kernel_dcache_range_asm: ignore the currently cached data and drop any cached data
> in that range. Even if the cache holds dirty data which hasn't been written back yet,
> drop it and don't write it back.

If present, pdc invalidates the line. At privilege level zero, an implementation
may optionally write back a dirty line to memory. At non-zero privilege levels,
fdc and pdc are effectively the same.

I don't know how to determine whether pdc does the write back or not. It would be
specified in the processor ERS.

> The next read will fetch the data from memory, thus returning what DMA could have
> stored there.

Dave
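[Editor's note: as an illustration of the two operations described above, here is a conceptual C sketch. It is not the kernel's implementation and not parisc assembly: cacheline_writeback_invalidate() and cacheline_invalidate() are hypothetical stand-ins for the fdc and pdc instructions, and the 32-byte line size is assumed only for the example. The loop signatures deliberately mirror the helpers used in the patch: the flush variant takes (start, size) while the purge variant takes (start, end).]

```c
/* Assumed cache line size, for this sketch only. */
#define SKETCH_CACHE_LINE	32UL

/* Hypothetical per-line primitive standing in for fdc: write the line
 * containing addr back to memory if it is dirty, then invalidate it. */
static inline void cacheline_writeback_invalidate(unsigned long addr)
{
	(void)addr;	/* real hardware would execute an fdc instruction here */
}

/* Hypothetical per-line primitive standing in for pdc: invalidate the line
 * containing addr; a dirty line may or may not be written back first. */
static inline void cacheline_invalidate(unsigned long addr)
{
	(void)addr;	/* real hardware would execute a pdc instruction here */
}

/* Mirrors flush_kernel_dcache_range(start, size): afterwards memory holds
 * the latest CPU data and the next CPU read refetches from memory. */
static void sketch_flush_range(unsigned long start, unsigned long size)
{
	unsigned long end = start + size;
	unsigned long addr;

	for (addr = start & ~(SKETCH_CACHE_LINE - 1); addr < end;
	     addr += SKETCH_CACHE_LINE)
		cacheline_writeback_invalidate(addr);
}

/* Mirrors purge_kernel_dcache_range_asm(start, end): afterwards the next
 * CPU read refetches from memory; dirty CPU data may be discarded. */
static void sketch_purge_range(unsigned long start, unsigned long end)
{
	unsigned long addr;

	for (addr = start & ~(SKETCH_CACHE_LINE - 1); addr < end;
	     addr += SKETCH_CACHE_LINE)
		cacheline_invalidate(addr);
}
```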
* John David Anglin <dave.anglin@bell.net>:
> On 2023-05-16 3:09 a.m., Helge Deller wrote:
> > On 5/16/23 00:28, John David Anglin wrote:
> > > On 2023-05-15 2:39 p.m., Helge Deller wrote:
> > > > +	case DMA_BIDIRECTIONAL:
> > > > +		flush_kernel_dcache_range(addr, size);
> > > > +		purge_kernel_dcache_range_asm(addr, addr + size);
> > > I don't think flush and purge are both needed.
> >
> > I'm not sure...
> >
> > Just to fully understand it, is this short summary correct?
> > - flush_kernel_dcache_range: flush the cache back to memory, but keep the data in the cache.
> No. If present, fdc writes the addressed cache line back to memory if and only if
> the line is dirty. It then invalidates the line. It does not keep the data in the
> cache.
>
> The next read loads from memory.
> > The next read fetches the data which is still in the cache, so it doesn't check whether
> > the data in memory has been modified in the meantime (e.g. via DMA).
> > - purge_kernel_dcache_range_asm: ignore the currently cached data and drop any cached data
> > in that range. Even if the cache holds dirty data which hasn't been written back yet,
> > drop it and don't write it back.
> If present, pdc invalidates the line. At privilege level zero, an implementation
> may optionally write back a dirty line to memory. At non-zero privilege levels,
> fdc and pdc are effectively the same.
>
> I don't know how to determine whether pdc does the write back or not. It would be
> specified in the processor ERS.

Thanks for the explanation!
With that, I've attached an updated (untested) patch below.

Helge

[PATCH v2] parisc: improve cache flushing in arch_sync_dma_for_cpu()

Add comment in arch_sync_dma_for_device() and handle the direction flag in
arch_sync_dma_for_cpu().

When receiving data from the device (DMA_FROM_DEVICE) unconditionally
purge the data cache in arch_sync_dma_for_cpu().

Signed-off-by: Helge Deller <deller@gmx.de>

diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index ba87f791323b..71ed5391f29d 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -446,11 +446,27 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
+	/*
+	 * fdc: The data cache line is written back to memory, if and only if
+	 * it is dirty, and then invalidated from the data cache.
+	 */
 	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
 }
 
 void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
-	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
+	unsigned long addr = (unsigned long) phys_to_virt(paddr);
+
+	switch (dir) {
+	case DMA_TO_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		flush_kernel_dcache_range(addr, size);
+		return;
+	case DMA_FROM_DEVICE:
+		purge_kernel_dcache_range_asm(addr, addr + size);
+		return;
+	default:
+		BUG();
+	}
 }
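[Editor's note: for context on when these hooks run, here is a minimal, hypothetical driver-side sketch. The device, buffer and error handling are invented for illustration; dma_map_single()/dma_unmap_single() are the generic streaming-DMA API, which on a non-coherent machine without an IOMMU ends up calling arch_sync_dma_for_device() and arch_sync_dma_for_cpu().]

```c
#include <linux/dma-mapping.h>
#include <linux/errno.h>

/* Hypothetical receive path: the device DMAs data into buf (DMA_FROM_DEVICE). */
static int example_rx(struct device *dev, void *buf, size_t len)
{
	dma_addr_t handle;

	/* Hand the buffer to the device; on a non-coherent machine this
	 * reaches arch_sync_dma_for_device(..., DMA_FROM_DEVICE). */
	handle = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, handle))
		return -ENOMEM;

	/* ... program the device with "handle" and wait for completion ... */

	/* Hand the buffer back to the CPU; this reaches
	 * arch_sync_dma_for_cpu(..., DMA_FROM_DEVICE), which with the patch
	 * above purges the range so the CPU re-reads the DMA'd data from
	 * memory instead of stale cache lines. */
	dma_unmap_single(dev, handle, len, DMA_FROM_DEVICE);

	/* The CPU can now safely read the received data in buf. */
	return 0;
}
```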
On 2023-05-16 2:00 p.m., Helge Deller wrote:
> * John David Anglin <dave.anglin@bell.net>:
>> On 2023-05-16 3:09 a.m., Helge Deller wrote:
>>> On 5/16/23 00:28, John David Anglin wrote:
>>>> On 2023-05-15 2:39 p.m., Helge Deller wrote:
>>>>> +	case DMA_BIDIRECTIONAL:
>>>>> +		flush_kernel_dcache_range(addr, size);
>>>>> +		purge_kernel_dcache_range_asm(addr, addr + size);
>>>> I don't think flush and purge are both needed.
>>> I'm not sure...
>>>
>>> Just to fully understand it, is this short summary correct?
>>> - flush_kernel_dcache_range: flush the cache back to memory, but keep the data in the cache.
>> No. If present, fdc writes the addressed cache line back to memory if and only if
>> the line is dirty. It then invalidates the line. It does not keep the data in the
>> cache.
>>
>> The next read loads from memory.
>>> The next read fetches the data which is still in the cache, so it doesn't check whether
>>> the data in memory has been modified in the meantime (e.g. via DMA).
>>> - purge_kernel_dcache_range_asm: ignore the currently cached data and drop any cached data
>>> in that range. Even if the cache holds dirty data which hasn't been written back yet,
>>> drop it and don't write it back.
>> If present, pdc invalidates the line. At privilege level zero, an implementation
>> may optionally write back a dirty line to memory. At non-zero privilege levels,
>> fdc and pdc are effectively the same.
>>
>> I don't know how to determine whether pdc does the write back or not. It would be
>> specified in the processor ERS.
> Thanks for the explanation!
> With that, I've attached an updated (untested) patch below.

Seems to work okay on c8000. Don't know if it helps performance.

Dave

> [PATCH v2] parisc: improve cache flushing in arch_sync_dma_for_cpu()
>
> Add comment in arch_sync_dma_for_device() and handle the direction flag in
> arch_sync_dma_for_cpu().
>
> When receiving data from the device (DMA_FROM_DEVICE) unconditionally
> purge the data cache in arch_sync_dma_for_cpu().
>
> Signed-off-by: Helge Deller <deller@gmx.de>
> diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
> index ba87f791323b..71ed5391f29d 100644
> --- a/arch/parisc/kernel/pci-dma.c
> +++ b/arch/parisc/kernel/pci-dma.c
> @@ -446,11 +446,27 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
>  void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>  		enum dma_data_direction dir)
>  {
> +	/*
> +	 * fdc: The data cache line is written back to memory, if and only if
> +	 * it is dirty, and then invalidated from the data cache.
> +	 */
>  	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
>  }
>
>  void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
>  		enum dma_data_direction dir)
>  {
> -	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
> +	unsigned long addr = (unsigned long) phys_to_virt(paddr);
> +
> +	switch (dir) {
> +	case DMA_TO_DEVICE:
> +	case DMA_BIDIRECTIONAL:
> +		flush_kernel_dcache_range(addr, size);
> +		return;
> +	case DMA_FROM_DEVICE:
> +		purge_kernel_dcache_range_asm(addr, addr + size);
> +		return;
> +	default:
> +		BUG();
> +	}
>  }
On 5/19/23 18:07, John David Anglin wrote:
> On 2023-05-16 2:00 p.m., Helge Deller wrote:
>> * John David Anglin <dave.anglin@bell.net>:
>>> On 2023-05-16 3:09 a.m., Helge Deller wrote:
>>>> On 5/16/23 00:28, John David Anglin wrote:
>>>>> On 2023-05-15 2:39 p.m., Helge Deller wrote:
>>>>>> +	case DMA_BIDIRECTIONAL:
>>>>>> +		flush_kernel_dcache_range(addr, size);
>>>>>> +		purge_kernel_dcache_range_asm(addr, addr + size);
>>>>> I don't think flush and purge are both needed.
>>>> I'm not sure...
>>>>
>>>> Just to fully understand it, is this short summary correct?
>>>> - flush_kernel_dcache_range: flush the cache back to memory, but keep the data in the cache.
>>> No. If present, fdc writes the addressed cache line back to memory if and only if
>>> the line is dirty. It then invalidates the line. It does not keep the data in the
>>> cache.
>>>
>>> The next read loads from memory.
>>>> The next read fetches the data which is still in the cache, so it doesn't check whether
>>>> the data in memory has been modified in the meantime (e.g. via DMA).
>>>> - purge_kernel_dcache_range_asm: ignore the currently cached data and drop any cached data
>>>> in that range. Even if the cache holds dirty data which hasn't been written back yet,
>>>> drop it and don't write it back.
>>> If present, pdc invalidates the line. At privilege level zero, an implementation
>>> may optionally write back a dirty line to memory. At non-zero privilege levels,
>>> fdc and pdc are effectively the same.
>>>
>>> I don't know how to determine whether pdc does the write back or not. It would be
>>> specified in the processor ERS.
>> Thanks for the explanation!
>> With that, I've attached an updated (untested) patch below.
> Seems to work okay on c8000. Don't know if it helps performance.

Probably no performance change for c8000. I noticed that pci-dma.c is only
compiled on 32-bit kernels (for PCX-T machines) :-)

Helge
diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index ba87f791323b..f8337c1820fc 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -446,11 +446,36 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
-	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
+	unsigned long addr = (unsigned long) phys_to_virt(paddr);
+
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		flush_kernel_dcache_range(addr, size);
+		break;
+	case DMA_FROM_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		flush_kernel_dcache_range(addr, size);
+		purge_kernel_dcache_range_asm(addr, addr + size);
+		break;
+	default:
+		BUG();
+	}
 }
 
 void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
-	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
+	unsigned long addr = (unsigned long) phys_to_virt(paddr);
+
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		flush_kernel_dcache_range(addr, size);
+		return;
+	case DMA_FROM_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		purge_kernel_dcache_range_asm(addr, addr + size);
+		break;
+	default:
+		BUG();
+	}
 }
Handle the direction flag in arch_sync_dma_for_device() and
arch_sync_dma_for_cpu().

When receiving data from the device (DMA_FROM_DEVICE and DMA_BIDIRECTIONAL)
purge the data cache in arch_sync_dma_for_cpu().

Run-tested on C8000 workstation.

Signed-off-by: Helge Deller <deller@gmx.de>
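[Editor's note: to illustrate the DMA_BIDIRECTIONAL case that the final patch distinguishes, here is a hypothetical driver-side sketch. The descriptor-ring scenario is invented and the buffer is assumed to have been mapped earlier with dma_map_single(..., DMA_BIDIRECTIONAL); dma_sync_single_for_device() and dma_sync_single_for_cpu() are the standard API calls that, on a non-coherent machine without an IOMMU, reach the hooks patched above.]

```c
#include <linux/dma-mapping.h>

/*
 * Hypothetical round trip over a buffer mapped DMA_BIDIRECTIONAL:
 * the CPU fills in descriptors, the device writes status back in place.
 */
static void example_bidir_roundtrip(struct device *dev, dma_addr_t handle,
				    size_t len)
{
	/* The CPU has written descriptors: push them out to memory so the
	 * device sees them. Reaches arch_sync_dma_for_device(...,
	 * DMA_BIDIRECTIONAL), i.e. flush + purge in the final patch. */
	dma_sync_single_for_device(dev, handle, len, DMA_BIDIRECTIONAL);

	/* ... device processes the descriptors and writes status back ... */

	/* Before the CPU reads the status the device wrote, drop any stale
	 * cache lines. Reaches arch_sync_dma_for_cpu(..., DMA_BIDIRECTIONAL),
	 * i.e. purge in the final patch. */
	dma_sync_single_for_cpu(dev, handle, len, DMA_BIDIRECTIONAL);
}
```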