Message ID | 59f4ebbf06e75a6176a366495211afd16d0048a3.1442507940.git.robin.murphy@arm.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, 2015-09-17 at 17:42 +0100, Robin Murphy wrote:
> In checking whether DMA addresses differ from physical addresses, using
> dma_to_phys() is actually the wrong thing to do, since it may hide any
> DMA offset, which is precisely one of the things we are checking for.
> Simply casting between the two address types, whilst ugly, is in fact
> the appropriate course of action. Further care (and ugliness) is also
> necessary in the comparison to avoid truncation if phys_addr_t and
> dma_addr_t differ in size.
>
> We can also reject any device with a fixed DMA offset up-front at page
> table creation, leaving the allocation-time check for the more subtle
> cases like bounce buffering due to an incorrect DMA mask.
>
> Furthermore, we can then fix the hackish KConfig dependency so that
> architectures without a dma_to_phys() implementation may still
> COMPILE_TEST (or even use!) the code. The true dependency is on the
> DMA API, so use the appropriate symbol for that.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
[...]
>
>  static bool selftest_running = false;
>
> -static dma_addr_t __arm_lpae_dma_addr(struct device *dev, void *pages)
> +static dma_addr_t __arm_lpae_dma_addr(void *pages)
>  {
> -	return phys_to_dma(dev, virt_to_phys(pages));
> +	return (dma_addr_t)virt_to_phys(pages);
>  }
>
>  static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
> @@ -223,10 +223,10 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
>  			goto out_free;
>  		/*
>  		 * We depend on the IOMMU being able to work with any physical
> -		 * address directly, so if the DMA layer suggests it can't by
> -		 * giving us back some translation, that bodes very badly...
> +		 * address directly, so if the DMA layer suggests otherwise by
> +		 * translating or truncating them, that bodes very badly...
>  		 */
> -		if (dma != __arm_lpae_dma_addr(dev, pages))
> +		if (dma != virt_to_phys(pages))

Could I ask why not use __arm_lpae_dma_addr(pages) here?
dma is dma_addr_t.

>  			goto out_unmap;
>  	}
>
> @@ -243,10 +243,8 @@ out_free:
>  static void __arm_lpae_free_pages(void *pages, size_t size,
>  				  struct io_pgtable_cfg *cfg)
>  {
> -	struct device *dev = cfg->iommu_dev;
> -
>  	if (!selftest_running)
> -		dma_unmap_single(dev, __arm_lpae_dma_addr(dev, pages),
> +		dma_unmap_single(cfg->iommu_dev, __arm_lpae_dma_addr(pages),
>  				 size, DMA_TO_DEVICE);
>  	free_pages_exact(pages, size);
>  }
> @@ -254,12 +252,11 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
>  static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
>  			       struct io_pgtable_cfg *cfg)
>  {
> -	struct device *dev = cfg->iommu_dev;
> -
>  	*ptep = pte;
>
>  	if (!selftest_running)
> -		dma_sync_single_for_device(dev, __arm_lpae_dma_addr(dev, ptep),
> +		dma_sync_single_for_device(cfg->iommu_dev,
> +					   __arm_lpae_dma_addr(ptep),
>  					   sizeof(pte), DMA_TO_DEVICE);
>  }
>
> @@ -629,6 +626,11 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>
> +	if (cfg->iommu_dev->dma_pfn_offset) {
> +		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
> +		return NULL;
> +	}
> +
>  	data = kmalloc(sizeof(*data), GFP_KERNEL);
>  	if (!data)
>  		return NULL;

On 18/09/15 09:55, Yong Wu wrote:
> On Thu, 2015-09-17 at 17:42 +0100, Robin Murphy wrote:
[...]
>> the appropriate course of action. Further care (and ugliness) is also
>> necessary in the comparison to avoid truncation if phys_addr_t and
>> dma_addr_t differ in size.
[...]
>>  		/*
>>  		 * We depend on the IOMMU being able to work with any physical
>> -		 * address directly, so if the DMA layer suggests it can't by
>> -		 * giving us back some translation, that bodes very badly...
>> +		 * address directly, so if the DMA layer suggests otherwise by
>> +		 * translating or truncating them, that bodes very badly...
>>  		 */
>> -		if (dma != __arm_lpae_dma_addr(dev, pages))
>> +		if (dma != virt_to_phys(pages))
>
> Could I ask why not use __arm_lpae_dma_addr(pages) here?
> dma is dma_addr_t.

Specifically, the problem case for that is when phys_addr_t is 64-bit
but dma_addr_t is 32-bit. The cast in __arm_lpae_dma_addr is necessary
to avoid a truncation warning when we make the DMA API calls, but we
actually need the opposite in the comparison here - comparing the
different types directly allows integer promotion to kick in
appropriately so we don't lose the top half of the larger address.
Otherwise, you'd never spot the difference between, say, your original
page at 0x88c0000000 and a bounce-buffered copy that happened to end up
mapped to 0xc0000000.

Robin.

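The truncation case described above can be reproduced outside the kernel. The following is a minimal userspace sketch, not the driver code: demo_phys_addr_t and demo_dma_addr_t are hypothetical stand-ins for a configuration where phys_addr_t is 64-bit and dma_addr_t is 32-bit, and both helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins: 64-bit physical addresses, 32-bit DMA addresses. */
typedef uint64_t demo_phys_addr_t;
typedef uint32_t demo_dma_addr_t;

/*
 * Casting the physical address down first discards its top 32 bits,
 * so a bounce buffer whose low half happens to match goes unnoticed.
 */
static bool mismatch_with_cast(demo_dma_addr_t dma, demo_phys_addr_t phys)
{
	return dma != (demo_dma_addr_t)phys;
}

/*
 * Comparing the two types directly converts the 32-bit operand up to
 * 64 bits, so the full physical address takes part in the comparison.
 */
static bool mismatch_direct(demo_dma_addr_t dma, demo_phys_addr_t phys)
{
	return dma != phys;
}
```

With the addresses from the message (page at 0x88c0000000, mapping at 0xc0000000), mismatch_with_cast() reports no difference while mismatch_direct() does, which is why the patch compares against virt_to_phys(pages) rather than the casting helper.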
On Fri, Sep 18, 2015 at 12:04:26PM +0100, Robin Murphy wrote:
> Specifically, the problem case for that is when phys_addr_t is 64-bit but
> dma_addr_t is 32-bit. The cast in __arm_lpae_dma_addr is necessary to avoid
> a truncation warning when we make the DMA API calls, but we actually need
> the opposite in the comparison here - comparing the different types directly
> allows integer promotion to kick in appropriately so we don't lose the top
> half of the larger address. Otherwise, you'd never spot the difference
> between, say, your original page at 0x88c0000000 and a bounce-buffered copy
> that happened to end up mapped to 0xc0000000.

Hmm. Thinking about this, I think we ought to add to arch/arm/mm/Kconfig:

 config ARCH_PHYS_ADDR_T_64BIT
 	def_bool ARM_LPAE

 config ARCH_DMA_ADDR_T_64BIT
 	bool
+	select ARCH_PHYS_ADDR_T_64BIT

I seem to remember that you're quite right that dma_addr_t <= phys_addr_t
but dma_addr_t must never be bigger than phys_addr_t.

On Fri, 2015-09-18 at 12:04 +0100, Robin Murphy wrote:
> On 18/09/15 09:55, Yong Wu wrote:
> > On Thu, 2015-09-17 at 17:42 +0100, Robin Murphy wrote:
> [...]
> >> the appropriate course of action. Further care (and ugliness) is also
> >> necessary in the comparison to avoid truncation if phys_addr_t and
> >> dma_addr_t differ in size.
> [...]
> >>  		/*
> >>  		 * We depend on the IOMMU being able to work with any physical
> >> -		 * address directly, so if the DMA layer suggests it can't by
> >> -		 * giving us back some translation, that bodes very badly...
> >> +		 * address directly, so if the DMA layer suggests otherwise by
> >> +		 * translating or truncating them, that bodes very badly...
> >>  		 */
> >> -		if (dma != __arm_lpae_dma_addr(dev, pages))
> >> +		if (dma != virt_to_phys(pages))
> >
> > Could I ask why not use __arm_lpae_dma_addr(pages) here?
> > dma is dma_addr_t.
>
> Specifically, the problem case for that is when phys_addr_t is 64-bit
> but dma_addr_t is 32-bit. The cast in __arm_lpae_dma_addr is necessary
> to avoid a truncation warning when we make the DMA API calls, but we
> actually need the opposite in the comparison here - comparing the
> different types directly allows integer promotion to kick in
> appropriately so we don't lose the top half of the larger address.
> Otherwise, you'd never spot the difference between, say, your original
> page at 0x88c0000000 and a bounce-buffered copy that happened to end up
> mapped to 0xc0000000.

Thanks. About here:

> @@ -629,6 +626,11 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
>  	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
>  		return NULL;
>
> +	if (cfg->iommu_dev->dma_pfn_offset) {

Do we need to change this to:

	if (!selftest_running && cfg->iommu_dev->dma_pfn_offset) {

cfg->iommu_dev will be NULL during the self-test.

> +		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
> +		return NULL;
> +	}
> +
>  	data = kmalloc(sizeof(*data), GFP_KERNEL);
>  	if (!data)
>  		return NULL;
>
> Robin.
>

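The guard suggested in this reply relies on the short-circuit behaviour of &&. A minimal userspace sketch follows, assuming (as the reply states) that the device pointer is NULL while the self-test runs; struct demo_device, struct demo_cfg, demo_selftest_running and demo_has_dma_offset() are hypothetical stand-ins, not the kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for struct device and io_pgtable_cfg. */
struct demo_device {
	unsigned long dma_pfn_offset;
};

struct demo_cfg {
	struct demo_device *iommu_dev;
};

/* Stand-in for the driver's selftest_running flag. */
static bool demo_selftest_running;

/*
 * Returns true when page table creation should be refused because the
 * device has a fixed DMA offset. The flag is tested first, so the
 * (possibly NULL) iommu_dev pointer is never dereferenced during the
 * self-test.
 */
static bool demo_has_dma_offset(const struct demo_cfg *cfg)
{
	return !demo_selftest_running && cfg->iommu_dev->dma_pfn_offset != 0;
}
```

Swapping the two operands of && would reintroduce the NULL dereference, so the order of the checks matters here.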
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 4664c2a..3dc1bcb 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -23,8 +23,7 @@ config IOMMU_IO_PGTABLE
 config IOMMU_IO_PGTABLE_LPAE
 	bool "ARMv7/v8 Long Descriptor Format"
 	select IOMMU_IO_PGTABLE
-	# SWIOTLB guarantees a dma_to_phys() implementation
-	depends on ARM || ARM64 || (COMPILE_TEST && SWIOTLB)
+	depends on HAS_DMA && (ARM || ARM64 || COMPILE_TEST)
 	help
 	  Enable support for the ARM long descriptor pagetable format.
 	  This allocator supports 4K/2M/1G, 16K/32M and 64K/512M page
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 73c0748..96a4baa 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -202,9 +202,9 @@ typedef u64 arm_lpae_iopte;
 
 static bool selftest_running = false;
 
-static dma_addr_t __arm_lpae_dma_addr(struct device *dev, void *pages)
+static dma_addr_t __arm_lpae_dma_addr(void *pages)
 {
-	return phys_to_dma(dev, virt_to_phys(pages));
+	return (dma_addr_t)virt_to_phys(pages);
 }
 
 static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
@@ -223,10 +223,10 @@ static void *__arm_lpae_alloc_pages(size_t size, gfp_t gfp,
 			goto out_free;
 		/*
 		 * We depend on the IOMMU being able to work with any physical
-		 * address directly, so if the DMA layer suggests it can't by
-		 * giving us back some translation, that bodes very badly...
+		 * address directly, so if the DMA layer suggests otherwise by
+		 * translating or truncating them, that bodes very badly...
 		 */
-		if (dma != __arm_lpae_dma_addr(dev, pages))
+		if (dma != virt_to_phys(pages))
 			goto out_unmap;
 	}
 
@@ -243,10 +243,8 @@ out_free:
 static void __arm_lpae_free_pages(void *pages, size_t size,
 				  struct io_pgtable_cfg *cfg)
 {
-	struct device *dev = cfg->iommu_dev;
-
 	if (!selftest_running)
-		dma_unmap_single(dev, __arm_lpae_dma_addr(dev, pages),
+		dma_unmap_single(cfg->iommu_dev, __arm_lpae_dma_addr(pages),
 				 size, DMA_TO_DEVICE);
 	free_pages_exact(pages, size);
 }
@@ -254,12 +252,11 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
 static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
 			       struct io_pgtable_cfg *cfg)
 {
-	struct device *dev = cfg->iommu_dev;
-
 	*ptep = pte;
 
 	if (!selftest_running)
-		dma_sync_single_for_device(dev, __arm_lpae_dma_addr(dev, ptep),
+		dma_sync_single_for_device(cfg->iommu_dev,
+					   __arm_lpae_dma_addr(ptep),
 					   sizeof(pte), DMA_TO_DEVICE);
 }
 
@@ -629,6 +626,11 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 	if (cfg->oas > ARM_LPAE_MAX_ADDR_BITS)
 		return NULL;
 
+	if (cfg->iommu_dev->dma_pfn_offset) {
+		dev_err(cfg->iommu_dev, "Cannot accommodate DMA offset for IOMMU page tables\n");
+		return NULL;
+	}
+
 	data = kmalloc(sizeof(*data), GFP_KERNEL);
 	if (!data)
 		return NULL;

In checking whether DMA addresses differ from physical addresses, using
dma_to_phys() is actually the wrong thing to do, since it may hide any
DMA offset, which is precisely one of the things we are checking for.
Simply casting between the two address types, whilst ugly, is in fact
the appropriate course of action. Further care (and ugliness) is also
necessary in the comparison to avoid truncation if phys_addr_t and
dma_addr_t differ in size.

We can also reject any device with a fixed DMA offset up-front at page
table creation, leaving the allocation-time check for the more subtle
cases like bounce buffering due to an incorrect DMA mask.

Furthermore, we can then fix the hackish KConfig dependency so that
architectures without a dma_to_phys() implementation may still
COMPILE_TEST (or even use!) the code. The true dependency is on the
DMA API, so use the appropriate symbol for that.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/Kconfig          |  3 +--
 drivers/iommu/io-pgtable-arm.c | 24 +++++++++++++-----------
 2 files changed, 14 insertions(+), 13 deletions(-)
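
To illustrate why routing the check through the translation helpers can hide a fixed DMA offset, here is a minimal userspace sketch. It assumes a simplified model in which the DMA address is the physical address minus (pfn_offset << page shift); demo_phys_to_dma(), DEMO_PAGE_SHIFT and both check functions are hypothetical names, not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4K pages for this illustration only. */
#define DEMO_PAGE_SHIFT 12

/* Simplified model of a translation helper that applies a fixed offset. */
static uint64_t demo_phys_to_dma(uint64_t phys, uint64_t pfn_offset)
{
	return phys - (pfn_offset << DEMO_PAGE_SHIFT);
}

/*
 * Old-style check: the expected value goes through the same translation
 * as the mapping did, so the offset cancels out and is never reported.
 */
static bool mismatch_via_translation(uint64_t dma, uint64_t phys,
				     uint64_t pfn_offset)
{
	return dma != demo_phys_to_dma(phys, pfn_offset);
}

/* New-style check: compare against the raw physical address. */
static bool mismatch_via_raw_phys(uint64_t dma, uint64_t phys)
{
	return dma != phys;
}
```

In this model, a page at physical address 0x80001000 behind a pfn offset of 0x80000 maps to DMA address 0x1000. The translated comparison sees no mismatch, while the raw comparison does, which is the behaviour the commit message argues for.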