diff mbox series

parisc: improve cache flushing in arch_sync_dma_for_cpu()

Message ID ZGJ8ZcPZbckX7VNB@p100 (mailing list archive)
State Accepted, archived
Series parisc: improve cache flushing in arch_sync_dma_for_cpu()

Commit Message

Helge Deller May 15, 2023, 6:39 p.m. UTC
Handle the direction flag in arch_sync_dma_for_device() and
arch_sync_dma_for_cpu().

When receiving data from the device (DMA_FROM_DEVICE and DMA_BIDIRECTIONAL),
purge the data cache in arch_sync_dma_for_cpu().

Run-tested on C8000 workstation.

Signed-off-by: Helge Deller <deller@gmx.de>

Comments

John David Anglin May 15, 2023, 10:28 p.m. UTC | #1
On 2023-05-15 2:39 p.m., Helge Deller wrote:
> +	case DMA_BIDIRECTIONAL:
> +		flush_kernel_dcache_range(addr, size);
> +		purge_kernel_dcache_range_asm(addr, addr + size);
I don't think flush and purge are both needed.

Dave
Helge Deller May 16, 2023, 7:09 a.m. UTC | #2
On 5/16/23 00:28, John David Anglin wrote:
> On 2023-05-15 2:39 p.m., Helge Deller wrote:
>> +    case DMA_BIDIRECTIONAL:
>> +        flush_kernel_dcache_range(addr, size);
>> +        purge_kernel_dcache_range_asm(addr, addr + size);
> I don't think flush and purge are both needed.

I'm not sure...

Just to fully understand it: is this short summary correct?
- flush_kernel_dcache_range: flush cache back to memory, but keep data in cache.
	The next read fetches the data which is still in the cache, so it
	doesn't check whether data in memory has been modified in the meantime (e.g. via DMA).
- purge_kernel_dcache_range_asm: ignore currently cached data & drop any cached data in that range.
	Even if the cache has dirty lines which haven't been written back yet, drop them without writing back.
	The next read will fetch data from memory, thus returning what DMA may have stored there.

Helge
John David Anglin May 16, 2023, 2:20 p.m. UTC | #3
On 2023-05-16 3:09 a.m., Helge Deller wrote:
> On 5/16/23 00:28, John David Anglin wrote:
>> On 2023-05-15 2:39 p.m., Helge Deller wrote:
>>> +    case DMA_BIDIRECTIONAL:
>>> +        flush_kernel_dcache_range(addr, size);
>>> +        purge_kernel_dcache_range_asm(addr, addr + size);
>> I don't think flush and purge are both needed.
>
> I'm not sure...
>
> Just to fully understand it: is this short summary correct?
> - flush_kernel_dcache_range: flush cache back to memory, but keep data in cache.
No.  If present, fdc writes the addressed cache line to memory if and only if the line is dirty.  It
then invalidates the line.  It does not keep data in cache.

Next read loads from memory.
>     The next read fetches the data which is still in the cache, so it
>     doesn't check whether data in memory has been modified in the meantime (e.g. via DMA).
> - purge_kernel_dcache_range_asm: ignore currently cached data & drop any cached data in that range.
>     Even if the cache has dirty lines which haven't been written back yet, drop them without writing back.
If present, pdc invalidates the line.  At privilege level zero, an implementation may optionally write
back a dirty line to memory.  At non-zero privilege levels, fdc and pdc are effectively the same.

I don't know how to determine whether pdc does write back or not.  It would be specified in the processor
ERS.

>     The next read will fetch data from memory, thus returning what DMA may have stored there.

Dave
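Dave's description of fdc and pdc above can be restated as a small user-space model. This is purely illustrative: the struct, field names, and model_* functions are made up for the sketch and are not the PA-RISC primitives or any kernel code.

```c
#include <stdbool.h>
#include <string.h>

/* Toy model of a single data cache line; purely illustrative. */
struct cache_line {
	bool present;	/* line currently held in the cache */
	bool dirty;	/* modified since it was last written back */
	char data[32];	/* cached copy of the memory contents */
};

/* fdc model: write the line back if and only if it is dirty,
 * then invalidate it.  The data does not stay in the cache. */
static void model_fdc(struct cache_line *line, char *memory)
{
	if (line->present && line->dirty)
		memcpy(memory, line->data, sizeof(line->data));
	line->present = false;
	line->dirty = false;
}

/* pdc model at non-zero privilege: invalidate without writing back.
 * (At privilege level zero an implementation may optionally write
 * back a dirty line, per the discussion above.) */
static void model_pdc(struct cache_line *line)
{
	line->present = false;
	line->dirty = false;
}
```

In this model, a read after model_pdc() sees whatever is in memory, e.g. data a device stored via DMA, while model_fdc() first pushes any dirty CPU writes out before the line is dropped.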
Helge Deller May 16, 2023, 6 p.m. UTC | #4
* John David Anglin <dave.anglin@bell.net>:
> On 2023-05-16 3:09 a.m., Helge Deller wrote:
> > On 5/16/23 00:28, John David Anglin wrote:
> > > On 2023-05-15 2:39 p.m., Helge Deller wrote:
> > > > +    case DMA_BIDIRECTIONAL:
> > > > +        flush_kernel_dcache_range(addr, size);
> > > > +        purge_kernel_dcache_range_asm(addr, addr + size);
> > > I don't think flush and purge are both needed.
> >
> > I'm not sure...
> >
> > Just to fully understand it: is this short summary correct?
> > - flush_kernel_dcache_range: flush cache back to memory, but keep data in cache.
> No.  If present, fdc writes the addressed cache line to memory if and only if the line is dirty.  It
> then invalidates the line.  It does not keep data in cache.
>
> Next read loads from memory.
> >     The next read fetches the data which is still in the cache, so it
> >     doesn't check whether data in memory has been modified in the meantime (e.g. via DMA).
> > - purge_kernel_dcache_range_asm: ignore currently cached data & drop any cached data in that range.
> >     Even if the cache has dirty lines which haven't been written back yet, drop them without writing back.
> If present, pdc invalidates the line.  At privilege level zero, an implementation may optionally write
> back a dirty line to memory.  At non-zero privilege levels, fdc and pdc are effectively the same.
>
> I don't know how to determine whether pdc does write back or not.  It would be specified in the processor
> ERS.

Thanks for the explanation!
With that, I've attached an updated (untested) patch below.

Helge

[PATCH v2] parisc: improve cache flushing in arch_sync_dma_for_cpu()

Add comment in arch_sync_dma_for_device() and handle the direction flag in
arch_sync_dma_for_cpu().

When receiving data from the device (DMA_FROM_DEVICE), unconditionally
purge the data cache in arch_sync_dma_for_cpu().

Signed-off-by: Helge Deller <deller@gmx.de>
diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index ba87f791323b..71ed5391f29d 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -446,11 +446,27 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
+	/*
+	 * fdc: The data cache line is written back to memory, if and only if
+	 * it is dirty, and then invalidated from the data cache.
+	 */
 	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
 }

 void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
-	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
+	unsigned long addr = (unsigned long) phys_to_virt(paddr);
+
+	switch (dir) {
+	case DMA_TO_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		flush_kernel_dcache_range(addr, size);
+		return;
+	case DMA_FROM_DEVICE:
+		purge_kernel_dcache_range_asm(addr, addr + size);
+		return;
+	default:
+		BUG();
+	}
 }
John David Anglin May 19, 2023, 4:07 p.m. UTC | #5
On 2023-05-16 2:00 p.m., Helge Deller wrote:
> * John David Anglin <dave.anglin@bell.net>:
>> On 2023-05-16 3:09 a.m., Helge Deller wrote:
>>> On 5/16/23 00:28, John David Anglin wrote:
>>>> On 2023-05-15 2:39 p.m., Helge Deller wrote:
>>>>> +    case DMA_BIDIRECTIONAL:
>>>>> +        flush_kernel_dcache_range(addr, size);
>>>>> +        purge_kernel_dcache_range_asm(addr, addr + size);
>>>> I don't think flush and purge are both needed.
>>> I'm not sure...
>>>
>>> Just to fully understand it: is this short summary correct?
>>> - flush_kernel_dcache_range: flush cache back to memory, but keep data in cache.
>> No.  If present, fdc writes the addressed cache line to memory if and only if the line is dirty.  It
>> then invalidates the line.  It does not keep data in cache.
>>
>> Next read loads from memory.
>>>      The next read fetches the data which is still in the cache, so it
>>>      doesn't check whether data in memory has been modified in the meantime (e.g. via DMA).
>>> - purge_kernel_dcache_range_asm: ignore currently cached data & drop any cached data in that range.
>>>      Even if the cache has dirty lines which haven't been written back yet, drop them without writing back.
>> If present, pdc invalidates the line.  At privilege level zero, an implementation may optionally write
>> back a dirty line to memory.  At non-zero privilege levels, fdc and pdc are effectively the same.
>>
>> I don't know how to determine whether pdc does write back or not.  It would be specified in the processor
>> ERS.
> Thanks for the explanation!
> With that, I've attached an updated (untested) patch below.
Seems to work okay on c8000.  Don't know if it helps performance.

Dave
> [PATCH v2] parisc: improve cache flushing in arch_sync_dma_for_cpu()
>
> Add comment in arch_sync_dma_for_device() and handle the direction flag in
> arch_sync_dma_for_cpu().
>
> When receiving data from the device (DMA_FROM_DEVICE) unconditionally
> purge the data cache in arch_sync_dma_for_cpu().
>
> Signed-off-by: Helge Deller <deller@gmx.de>
> diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
> index ba87f791323b..71ed5391f29d 100644
> --- a/arch/parisc/kernel/pci-dma.c
> +++ b/arch/parisc/kernel/pci-dma.c
> @@ -446,11 +446,27 @@ void arch_dma_free(struct device *dev, size_t size, void *vaddr,
>   void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>   		enum dma_data_direction dir)
>   {
> +	/*
> +	 * fdc: The data cache line is written back to memory, if and only if
> +	 * it is dirty, and then invalidated from the data cache.
> +	 */
>   	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
>   }
>
>   void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
>   		enum dma_data_direction dir)
>   {
> -	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
> +	unsigned long addr = (unsigned long) phys_to_virt(paddr);
> +
> +	switch (dir) {
> +	case DMA_TO_DEVICE:
> +	case DMA_BIDIRECTIONAL:
> +		flush_kernel_dcache_range(addr, size);
> +		return;
> +	case DMA_FROM_DEVICE:
> +		purge_kernel_dcache_range_asm(addr, addr + size);
> +		return;
> +	default:
> +		BUG();
> +	}
>   }
Helge Deller May 20, 2023, 5:10 a.m. UTC | #6
On 5/19/23 18:07, John David Anglin wrote:
> On 2023-05-16 2:00 p.m., Helge Deller wrote:
>> * John David Anglin <dave.anglin@bell.net>:
>>> On 2023-05-16 3:09 a.m., Helge Deller wrote:
>>>> On 5/16/23 00:28, John David Anglin wrote:
>>>>> On 2023-05-15 2:39 p.m., Helge Deller wrote:
>>>>>> +    case DMA_BIDIRECTIONAL:
>>>>>> +        flush_kernel_dcache_range(addr, size);
>>>>>> +        purge_kernel_dcache_range_asm(addr, addr + size);
>>>>> I don't think flush and purge are both needed.
>>>> I'm not sure...
>>>>
>>>> Just to fully understand it: is this short summary correct?
>>>> - flush_kernel_dcache_range: flush cache back to memory, but keep data in cache.
>>> No.  If present, fdc writes the addressed cache line to memory if and only if the line is dirty.  It
>>> then invalidates the line.  It does not keep data in cache.
>>>
>>> Next read loads from memory.
>>>>      The next read fetches the data which is still in the cache, so it
>>>>      doesn't check whether data in memory has been modified in the meantime (e.g. via DMA).
>>>> - purge_kernel_dcache_range_asm: ignore currently cached data & drop any cached data in that range.
>>>>      Even if the cache has dirty lines which haven't been written back yet, drop them without writing back.
>>> If present, pdc invalidates the line.  At privilege level zero, an implementation may optionally write
>>> back a dirty line to memory.  At non-zero privilege levels, fdc and pdc are effectively the same.
>>>
>>> I don't know how to determine whether pdc does write back or not.  It would be specified in the processor
>>> ERS.
>> Thanks for the explanation!
>> With that, I've attached an updated (untested) patch below.

> Seems to work okay on c8000.  Don't know if it helps performance.

Probably no performance change for c8000.
I noticed that pci-dma.c is only compiled on 32-bit kernels (for PCX-T machines) :-)

Helge

Patch

diff --git a/arch/parisc/kernel/pci-dma.c b/arch/parisc/kernel/pci-dma.c
index ba87f791323b..f8337c1820fc 100644
--- a/arch/parisc/kernel/pci-dma.c
+++ b/arch/parisc/kernel/pci-dma.c
@@ -446,11 +446,36 @@  void arch_dma_free(struct device *dev, size_t size, void *vaddr,
 void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
-	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
+	unsigned long addr = (unsigned long) phys_to_virt(paddr);
+
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		flush_kernel_dcache_range(addr, size);
+		break;
+	case DMA_FROM_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		flush_kernel_dcache_range(addr, size);
+		purge_kernel_dcache_range_asm(addr, addr + size);
+		break;
+	default:
+		BUG();
+	}
 }

 void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 		enum dma_data_direction dir)
 {
-	flush_kernel_dcache_range((unsigned long)phys_to_virt(paddr), size);
+	unsigned long addr = (unsigned long) phys_to_virt(paddr);
+
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		flush_kernel_dcache_range(addr, size);
+		return;
+	case DMA_FROM_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		purge_kernel_dcache_range_asm(addr, addr + size);
+		break;
+	default:
+		BUG();
+	}
 }
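The direction handling in the accepted patch can be restated as a pure mapping from DMA direction to cache operation. The sketch below is a user-space model for illustration only: the enum values and OP_* names are invented stand-ins, and the real code calls flush_kernel_dcache_range() and purge_kernel_dcache_range_asm() rather than returning a tag.

```c
/* Hypothetical stand-ins for the kernel's dma_data_direction values. */
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE, DMA_NONE };

/* Which cache maintenance the patch performs for each case (made-up names). */
enum cache_op { OP_FLUSH, OP_PURGE, OP_FLUSH_THEN_PURGE, OP_BUG };

/* Models arch_sync_dma_for_device(): CPU is done, device will access. */
static enum cache_op sync_for_device(enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_TO_DEVICE:
		return OP_FLUSH;		/* write back dirty CPU data */
	case DMA_FROM_DEVICE:
	case DMA_BIDIRECTIONAL:
		return OP_FLUSH_THEN_PURGE;	/* write back, then drop the lines */
	default:
		return OP_BUG;			/* patch calls BUG() here */
	}
}

/* Models arch_sync_dma_for_cpu(): device is done, CPU will access. */
static enum cache_op sync_for_cpu(enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_TO_DEVICE:
		return OP_FLUSH;
	case DMA_FROM_DEVICE:
	case DMA_BIDIRECTIONAL:
		return OP_PURGE;		/* drop stale lines; next read hits memory */
	default:
		return OP_BUG;
	}
}
```

The asymmetry is the point of the patch: on the way back to the CPU, data the device may have written must not be shadowed by stale cache lines, so a purge suffices and a preceding flush is unnecessary.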