Message ID | 1490958171-2303-1-git-send-email-a.hajda@samsung.com (mailing list archive)
State      | New, archived
On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote:
> In this version of the patch I have replaced the temporary pages and
> iommu_dma_mmap with remap_pfn_range, or rather its simplified version
> vm_iomap_memory.
> Unfortunately I have not found a nice helper for sgtable creation, so I
> left the sg allocation inside __iommu_mmap_attrs.
>
> As fixing the DMA_ATTR_FORCE_CONTIGUOUS related crashes has higher
> priority, I have focused only on it in this patch.

As I mentioned elsewhere, I don't believe that fudging around in this way
is a proper fix.

DMA coherent memory was never, ever, intended to be passed back into the
streaming APIs - it was designed so that the two would be mutually
exclusive.

The problem is that we now have DMA coherent allocations being passed
through the dma-buf API using this dma_get_sgtable() thing, which is quite
broken. I regard dma_get_sgtable() as very broken, and had I realised at
the time that this was what it was trying to do, I would have NAK'd it.

Rather than bodging around this brokenness, trying to get dma_get_sgtable()
to work, I believe we need to address the root cause - which is proper
support for passing DMA coherent allocations through dma-buf, which does
not involve scatterlists and calling dma_map_sg() on them.

That's going to need to be addressed in any case, because of the
dma_declare_coherent_memory() issue, where we may not have struct pages
backing a coherent allocation. Such a case can come up on ARM64 via
DT's "shared-dma-pool" reserved memory stuff.

Right now, I have a "fix" for ARM queued which causes dma_get_sgtable()
to fail when used with reserved memory, but we have one user who needs
this to work. So dma-buf needs to be fixed for this one way or another,
and I don't think that bending the current broken stuff to suit it by
trying these vmalloc_to_page() hacks is acceptable.

dma_get_sgtable() is fundamentally broken IMHO.
Hi Russell,

On 31.03.2017 13:16, Russell King - ARM Linux wrote:
> On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote:
>> In this version of the patch I have replaced the temporary pages and
>> iommu_dma_mmap with remap_pfn_range, or rather its simplified version
>> vm_iomap_memory.
>> Unfortunately I have not found a nice helper for sgtable creation, so I
>> left the sg allocation inside __iommu_mmap_attrs.
>>
>> As fixing the DMA_ATTR_FORCE_CONTIGUOUS related crashes has higher
>> priority, I have focused only on it in this patch.
> As I mentioned elsewhere, I don't believe that fudging around in this way
> is a proper fix.
>
> DMA coherent memory was never, ever, intended to be passed back into the
> streaming APIs - it was designed so that the two would be mutually
> exclusive.
>
> The problem is that we now have DMA coherent allocations being passed
> through the dma-buf API using this dma_get_sgtable() thing, which is quite
> broken.

This patch just fixes the current implementation of the API by re-using
existing techniques, nothing more. It prevents the code from using a
pointer (area->pages) not because of a lack of backing struct pages, but
because this pointer is simply not initialized and apparently not
necessary in the case of contiguous allocations.

I think the problem of fixing the DMA API should be addressed in a
separate set of patches, as it is a different issue and requires much
more experience in MM and the DMA API than I have.

Regards
Andrzej

> I regard dma_get_sgtable() as very broken, and had I realised at
> the time that this was what it was trying to do, I would have NAK'd it.
>
> Rather than bodging around this brokenness, trying to get dma_get_sgtable()
> to work, I believe we need to address the root cause - which is proper
> support for passing DMA coherent allocations through dma-buf, which does
> not involve scatterlists and calling dma_map_sg() on them.
>
> That's going to need to be addressed in any case, because of the
> dma_declare_coherent_memory() issue, where we may not have struct pages
> backing a coherent allocation. Such a case can come up on ARM64 via
> DT's "shared-dma-pool" reserved memory stuff.
>
> Right now, I have a "fix" for ARM queued which causes dma_get_sgtable()
> to fail when used with reserved memory, but we have one user who needs
> this to work. So dma-buf needs to be fixed for this one way or another,
> and I don't think that bending the current broken stuff to suit it by
> trying these vmalloc_to_page() hacks is acceptable.
>
> dma_get_sgtable() is fundamentally broken IMHO.
Hi,

Gentle ping.

Regards
Andrzej

On 31.03.2017 13:02, Andrzej Hajda wrote:
> In case of DMA_ATTR_FORCE_CONTIGUOUS allocations vm_area->pages
> is invalid. __iommu_mmap_attrs and __iommu_get_sgtable should not
> use it and should take advantage of the contiguous nature of the
> allocation.
>
> Fixes: 44176bb ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
> Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
> ---
> Hi Robin, Geert,
>
> In this version of the patch I have replaced the temporary pages and
> iommu_dma_mmap with remap_pfn_range, or rather its simplified version
> vm_iomap_memory.
> Unfortunately I have not found a nice helper for sgtable creation, so I
> left the sg allocation inside __iommu_mmap_attrs.
>
> As fixing the DMA_ATTR_FORCE_CONTIGUOUS related crashes has higher
> priority, I have focused only on it in this patch.
>
> Regards
> Andrzej
> ---
>  arch/arm64/mm/dma-mapping.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index f7b5401..41c7c36 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -703,6 +703,10 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
>  	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
>  		return ret;
>
> +	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS)
> +		return vm_iomap_memory(vma,
> +				page_to_phys(vmalloc_to_page(cpu_addr)), size);
> +
>  	area = find_vm_area(cpu_addr);
>  	if (WARN_ON(!area || !area->pages))
>  		return -ENXIO;
> @@ -715,8 +719,19 @@ static int __iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
>  			       size_t size, unsigned long attrs)
>  {
>  	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	struct vm_struct *area = find_vm_area(cpu_addr);
> +	struct vm_struct *area;
> +
> +	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> +		int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> +
> +		if (!ret)
> +			sg_set_page(sgt->sgl, vmalloc_to_page(cpu_addr),
> +				    PAGE_ALIGN(size), 0);
> +		return ret;
> +	}
> +
> +	area = find_vm_area(cpu_addr);
>  	if (WARN_ON(!area || !area->pages))
>  		return -ENXIO;
On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote:
> In case of DMA_ATTR_FORCE_CONTIGUOUS allocations vm_area->pages
> is invalid. __iommu_mmap_attrs and __iommu_get_sgtable should not
> use it and should take advantage of the contiguous nature of the
> allocation.

As I replied to your original patch, I think
dma_common_contiguous_remap() should be fixed not to leave a dangling
area->pages pointer.

>  arch/arm64/mm/dma-mapping.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index f7b5401..41c7c36 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -703,6 +703,10 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
>  	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
>  		return ret;
>
> +	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS)
> +		return vm_iomap_memory(vma,
> +				page_to_phys(vmalloc_to_page(cpu_addr)), size);

I replied to Geert's patch on whether we actually need to call
dma_common_contiguous_remap() in __iommu_alloc_attrs() if the device is
coherent. We don't do this for the swiotlb allocation. If we are going
to change this, cpu_addr may or may not be in the vmalloc range and
vmalloc_to_page() would fail (I guess we could add an is_vmalloc_addr()
check).

Alternatively, just open-code dma_common_contiguous_remap() in
__iommu_alloc_attrs() to keep a valid area->pages around (I think Geert
already suggested this). AFAICT, arch/arm does this already in its own
__iommu_alloc_buffer().

Yet another option would be for iommu_dma_alloc() to understand
DMA_ATTR_FORCE_CONTIGUOUS and remove the arm64-specific handling.
> @@ -715,8 +719,19 @@ static int __iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
>  			       size_t size, unsigned long attrs)
>  {
>  	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -	struct vm_struct *area = find_vm_area(cpu_addr);
> +	struct vm_struct *area;
> +
> +	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
> +		int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
> +
> +		if (!ret)
> +			sg_set_page(sgt->sgl, vmalloc_to_page(cpu_addr),
> +				    PAGE_ALIGN(size), 0);
> +		return ret;
> +	}
> +
> +	area = find_vm_area(cpu_addr);

As Russell already stated, this function needs a "fix" regardless of
DMA_ATTR_FORCE_CONTIGUOUS, as we may not have page structures
(*_from_coherent allocations). But I agree with you that dma_get_sgtable()
issues should be addressed separately (and I'm happy to disable such an
API on arm64 until it is sorted).
Hi Catalin,

On Fri, Apr 21, 2017 at 7:32 PM, Catalin Marinas
<catalin.marinas@arm.com> wrote:
> On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote:
>> In case of DMA_ATTR_FORCE_CONTIGUOUS allocations vm_area->pages
>> is invalid. __iommu_mmap_attrs and __iommu_get_sgtable should not
>> use it and should take advantage of the contiguous nature of the
>> allocation.
>
> As I replied to your original patch, I think
> dma_common_contiguous_remap() should be fixed not to leave a dangling
> area->pages pointer.
>
>> arch/arm64/mm/dma-mapping.c | 17 ++++++++++++++++-
>> 1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
>> index f7b5401..41c7c36 100644
>> --- a/arch/arm64/mm/dma-mapping.c
>> +++ b/arch/arm64/mm/dma-mapping.c
>> @@ -703,6 +703,10 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
>>  	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
>>  		return ret;
>>
>> +	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS)
>> +		return vm_iomap_memory(vma,
>> +				page_to_phys(vmalloc_to_page(cpu_addr)), size);
>
> I replied to Geert's patch on whether we actually need to call
> dma_common_contiguous_remap() in __iommu_alloc_attrs() if the device is
> coherent. We don't do this for the swiotlb allocation. If we are going
> to change this, cpu_addr may or may not be in the vmalloc range and
> vmalloc_to_page() would fail (I guess we could add an is_vmalloc_addr()
> check).
>
> Alternatively, just open-code dma_common_contiguous_remap() in
> __iommu_alloc_attrs() to keep a valid area->pages around (I think Geert
> already suggested this). AFAICT, arch/arm does this already in its own
> __iommu_alloc_buffer().
>
> Yet another option would be for iommu_dma_alloc() to understand
> DMA_ATTR_FORCE_CONTIGUOUS and remove the arm64-specific handling.

That was actually the approach I took in my v1.
V2 changed that to provide standalone iommu_dma_{alloc,free}_contiguous()
functions.
V3 changed that to call everything directly from the arm64 code.
...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker.
But when I'm talking to journalists I just say "programmer" or something
like that.
                                -- Linus Torvalds
Hi Geert,

On Fri, Apr 21, 2017 at 07:39:43PM +0200, Geert Uytterhoeven wrote:
> On Fri, Apr 21, 2017 at 7:32 PM, Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > On Fri, Mar 31, 2017 at 01:02:51PM +0200, Andrzej Hajda wrote:
> >> In case of DMA_ATTR_FORCE_CONTIGUOUS allocations vm_area->pages
> >> is invalid. __iommu_mmap_attrs and __iommu_get_sgtable should not
> >> use it and should take advantage of the contiguous nature of the
> >> allocation.
> >
> > As I replied to your original patch, I think
> > dma_common_contiguous_remap() should be fixed not to leave a dangling
> > area->pages pointer.
> >
> >> arch/arm64/mm/dma-mapping.c | 17 ++++++++++++++++-
> >> 1 file changed, 16 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> >> index f7b5401..41c7c36 100644
> >> --- a/arch/arm64/mm/dma-mapping.c
> >> +++ b/arch/arm64/mm/dma-mapping.c
> >> @@ -703,6 +703,10 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
> >>  	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
> >>  		return ret;
> >>
> >> +	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS)
> >> +		return vm_iomap_memory(vma,
> >> +				page_to_phys(vmalloc_to_page(cpu_addr)), size);
> >
> > I replied to Geert's patch on whether we actually need to call
> > dma_common_contiguous_remap() in __iommu_alloc_attrs() if the device is
> > coherent. We don't do this for the swiotlb allocation. If we are going
> > to change this, cpu_addr may or may not be in the vmalloc range and
> > vmalloc_to_page() would fail (I guess we could add an is_vmalloc_addr()
> > check).
> >
> > Alternatively, just open-code dma_common_contiguous_remap() in
> > __iommu_alloc_attrs() to keep a valid area->pages around (I think Geert
> > already suggested this). AFAICT, arch/arm does this already in its own
> > __iommu_alloc_buffer().
> >
> > Yet another option would be for iommu_dma_alloc() to understand
> > DMA_ATTR_FORCE_CONTIGUOUS and remove the arm64-specific handling.
>
> That was actually the approach I took in my v1.
> V2 changed that to provide standalone iommu_dma_{alloc,free}_contiguous()
> functions.
> V3 changed that to call everything directly from the arm64 code.
> ...

I have now read through the past discussions (which I had ignored
previously). I take Robin's point about not extending iommu_dma_alloc()
(though it looked simpler), and open-coding dma_common_contiguous_remap()
wouldn't make sense either as a way to pass this buffer to
iommu_dma_mmap(), since it wasn't returned by iommu_dma_alloc().

So I think we should just fall back to the swiotlb ops for mmap and
get_sgtable. But since we don't have a dma_addr_t handle (we only have
the one for the other side of the IOMMU), we'd need to factor the common
code out of __swiotlb_mmap into a __swiotlb_mmap_page (and similarly for
__swiotlb_get_sgtable). The __iommu_* functions would call these with the
correct page (from vmalloc_to_page) if DMA_ATTR_FORCE_CONTIGUOUS is set
and the buffer is not a candidate for the *_from_coherent path.

While fixing/removing dma_get_sgtable() is a nice goal, we first need to
address DMA_ATTR_FORCE_CONTIGUOUS on arm64, since the patch breaks
existing use-cases (and I'd like to avoid reverting it).
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index f7b5401..41c7c36 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -703,6 +703,10 @@ static int __iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
 	if (dma_mmap_from_coherent(dev, vma, cpu_addr, size, &ret))
 		return ret;
 
+	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS)
+		return vm_iomap_memory(vma,
+				page_to_phys(vmalloc_to_page(cpu_addr)), size);
+
 	area = find_vm_area(cpu_addr);
 	if (WARN_ON(!area || !area->pages))
 		return -ENXIO;
@@ -715,8 +719,19 @@ static int __iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
 			       size_t size, unsigned long attrs)
 {
 	unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-	struct vm_struct *area = find_vm_area(cpu_addr);
+	struct vm_struct *area;
+
+	if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
+		int ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+
+		if (!ret)
+			sg_set_page(sgt->sgl, vmalloc_to_page(cpu_addr),
+				    PAGE_ALIGN(size), 0);
+		return ret;
+	}
+
+	area = find_vm_area(cpu_addr);
 	if (WARN_ON(!area || !area->pages))
 		return -ENXIO;
In case of DMA_ATTR_FORCE_CONTIGUOUS allocations vm_area->pages
is invalid. __iommu_mmap_attrs and __iommu_get_sgtable should not
use it and should take advantage of the contiguous nature of the
allocation.

Fixes: 44176bb ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
Signed-off-by: Andrzej Hajda <a.hajda@samsung.com>
---
Hi Robin, Geert,

In this version of the patch I have replaced the temporary pages and
iommu_dma_mmap with remap_pfn_range, or rather its simplified version
vm_iomap_memory.
Unfortunately I have not found a nice helper for sgtable creation, so I
left the sg allocation inside __iommu_mmap_attrs.

As fixing the DMA_ATTR_FORCE_CONTIGUOUS related crashes has higher
priority, I have focused only on it in this patch.

Regards
Andrzej
---
 arch/arm64/mm/dma-mapping.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)