Message ID | 20150501110644.GF27755@e104818-lin.cambridge.arm.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
On Friday 01 May 2015 12:06:44 Catalin Marinas wrote:
>
> > Note that there are lots of ways in which you could have noncoherent DMA:
> > the default on ARM32 is that it requires uncached access or explicit
> > cache flushes, but it's also possible to have an SMP system where a device
> > is only coherent with some of the CPUs and requires explicit synchronization
> > (not flushes) otherwise. In a multi-level cache hierarchy, there could be
> > all sorts of combinations of flushes and syncs you would need to do.
> >
> > With DT, we handle this using SoC-specific overrides for platforms that
> > are noncoherent in funny ways, see
> > http://lxr.free-electrons.com/source/arch/arm/mach-mvebu/coherency.c?v=3.18#L263
> > for instance.
>
> It looks like mach-mvebu no longer needs this, according to commit
> 1bd4d8a6de5c (ARM: mvebu: use arm_coherent_dma_ops and re-enable hardware
> I/O coherency).

Yes, Thomas Petazzoni found a way to configure that chip to essentially
provide PCI semantics where an MMIO read from a device ensures that
all previous DMA has completed, which made the sync unnecessary.
I believe Marvell recommends against using that mode for performance
reasons, and they still use their own manual syncs in their vendor kernel.

> Even if some hardware needs this, it's usually because it has some
> broken assumptions about barriers which most likely are architecture
> non-compliant. We can work around it on a case by case basis (SoC
> quirks). One option would be to disable coherency altogether for that
> device, even if the performance is affected (e.g. no partial coherency).
> Another possibility may be to add a bus driver for that broken
> interconnect which installs its own dma ops for each device attached.

Whether the Armada XP example is broken or not is really a matter of
perspective. I would count it broken on the basis that it does not match
what the Linux DMA and MMIO APIs expect, but you could well build an OS
around their semantics.

> > If we just disallow DMA to devices that are marked with _CCA=0
> > in ACPI, we can avoid this case, or discuss it by the time someone has hardware
> > that wants it, and then make a more informed decision about it.
>
> I don't think we should disallow DMA to devices with _CCA == 0 (only to
> those that don't have a _CCA property at all) as long as _CCA == 0 has
> clear semantics like only architected cache maintenance required (and
> that's what the ARMv8 ARM requires from compliant system caches).

Even if we exclude all cases in which the behavior may be unexpected,
there is still the other point I raised initially:

what would that be good for?

Can you think of a case where a server system has a reason to use
a device in noncoherent mode? I think it's more likely to be a case
where a device got misconfigured accidentally by the firmware, and
we're better off warning about that in the kernel than trying to prepare
for unknown hardware that might use an obscure feature of the spec.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, May 08, 2015 at 04:08:53PM +0200, Arnd Bergmann wrote:
> On Friday 01 May 2015 12:06:44 Catalin Marinas wrote:
> > > If we just disallow DMA to devices that are marked with _CCA=0
> > > in ACPI, we can avoid this case, or discuss it by the time someone has hardware
> > > that wants it, and then make a more informed decision about it.
> >
> > I don't think we should disallow DMA to devices with _CCA == 0 (only to
> > those that don't have a _CCA property at all) as long as _CCA == 0 has
> > clear semantics like only architected cache maintenance required (and
> > that's what the ARMv8 ARM requires from compliant system caches).
>
> Even if we exclude all cases in which the behavior may be unexpected,
> there is still the other point I raised initially:
>
> what would that be good for?
>
> Can you think of a case where a server system has a reason to use
> a device in noncoherent mode? I think it's more likely to be a case
> where a device got misconfigured accidentally by the firmware, and
> we're better off warning about that in the kernel than trying to prepare
> for an unknown hardware that might use an obscure feature of the spec.

Maybe some of the people involved in arm64 servers can give a better
answer, I'm not familiar with their hardware (plans).

I would expect most DMA-capable devices to be cache coherent. However,
for (system) performance reasons, some of them could be configured as
non-coherent. An example, though unlikely on servers, is a display
device continuously accessing a framebuffer. You may not want to
overload the coherent interconnect.
On 11/05/15 18:10, Catalin Marinas wrote:
> On Fri, May 08, 2015 at 04:08:53PM +0200, Arnd Bergmann wrote:
>> On Friday 01 May 2015 12:06:44 Catalin Marinas wrote:
>>>> If we just disallow DMA to devices that are marked with _CCA=0
>>>> in ACPI, we can avoid this case, or discuss it by the time someone has hardware
>>>> that wants it, and then make a more informed decision about it.
>>>
>>> I don't think we should disallow DMA to devices with _CCA == 0 (only to
>>> those that don't have a _CCA property at all) as long as _CCA == 0 has
>>> clear semantics like only architected cache maintenance required (and
>>> that's what the ARMv8 ARM requires from compliant system caches).
>>
>> Even if we exclude all cases in which the behavior may be unexpected,
>> there is still the other point I raised initially:
>>
>> what would that be good for?
>>
>> Can you think of a case where a server system has a reason to use
>> a device in noncoherent mode? I think it's more likely to be a case
>> where a device got misconfigured accidentally by the firmware, and
>> we're better off warning about that in the kernel than trying to prepare
>> for an unknown hardware that might use an obscure feature of the spec.
>
> Maybe some of the people involved in arm64 servers can give a better
> answer, I'm not familiar with their hardware (plans).
>
> I would expect most DMA-capable devices to be cache coherent. However,
> for (system) performance reasons, some of them could be configured as
> non-coherent. An example, though unlikely on servers, is a display
> device continuously accessing a framebuffer. You may not want to
> overload the coherent interconnect.

FWIW, I've also had much the same argument put to me for IOMMUs, i.e.
they want to make the page table walk interface non-coherent because
they'd rather pay the cost of flushing the page tables once to save a
few extra cycles of latency for cache snooping on every TLB miss.

Robin.
diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
index 9437e3dc5833..3fd6ef019c8f 100644
--- a/arch/arm64/include/asm/dma-mapping.h
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -31,10 +31,14 @@ extern struct dma_map_ops *dma_ops;
 
 static inline struct dma_map_ops *__generic_dma_ops(struct device *dev)
 {
-	if (unlikely(!dev) || !dev->archdata.dma_ops)
+	if (!dev)
 		return dma_ops;
-	else
+	else if (dev->archdata.dma_ops)
 		return dev->archdata.dma_ops;
+	else if (!acpi_disabled)
+		return dummy_dma_ops;
+	else
+		return dma_ops;
 }
 
 static inline struct dma_map_ops *get_dma_ops(struct device *dev)
@@ -48,6 +52,8 @@ static inline struct dma_map_ops *get_dma_ops(struct device *dev)
 static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
 				      struct iommu_ops *iommu, bool coherent)
 {
+	if (!acpi_disabled && !dev->archdata.dma_ops)
+		dev->archdata.dma_ops = dma_ops;
 	dev->archdata.dma_coherent = coherent;
 }
 #define arch_setup_dma_ops arch_setup_dma_ops
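For readers who want to trace the fallback order in the diff above, here is a minimal user-space sketch of the selection logic. It is not kernel code: `struct device`, `struct dma_map_ops`, `default_ops`, `dummy_ops`, `pick_dma_ops()` and `setup_dma_ops()` are hypothetical stand-ins for the kernel's types, `dma_ops`, `dummy_dma_ops`, `__generic_dma_ops()` and `arch_setup_dma_ops()`, and the sketch assumes `dummy_dma_ops` is a pointer to an ops table whose callbacks reject DMA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only what the diff touches. */
struct dma_map_ops { const char *name; };
struct device { struct dma_map_ops *archdata_dma_ops; };

static struct dma_map_ops default_ops = { "default" }; /* models dma_ops */
static struct dma_map_ops dummy_ops = { "dummy" };     /* models dummy_dma_ops */
static bool acpi_disabled;                             /* false: booted with ACPI */

/* Models the patched __generic_dma_ops(): same order of checks as the diff. */
static struct dma_map_ops *pick_dma_ops(struct device *dev)
{
	if (!dev)
		return &default_ops;
	if (dev->archdata_dma_ops)
		return dev->archdata_dma_ops;
	if (!acpi_disabled)
		return &dummy_ops; /* ACPI boot, device never set up for DMA */
	return &default_ops;       /* DT boot keeps the old fallback */
}

/* Models the lines added to arch_setup_dma_ops(). */
static void setup_dma_ops(struct device *dev)
{
	assert(dev != NULL); /* the kernel function dereferences dev unconditionally */
	if (!acpi_disabled && !dev->archdata_dma_ops)
		dev->archdata_dma_ops = &default_ops;
}
```

Under these assumptions, a device on an ACPI system gets the dummy ops until something calls the setup function for it, and the real ops afterwards; with ACPI disabled, the behaviour matches the code before the patch.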