Message ID | 1485893763-20671-2-git-send-email-nwatters@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
Hi Nate,

On 31/01/17 20:16, Nate Watterson wrote:
> Some drivers set the dma_mask of client devices based solely on values
> read from capability registers which may not account for platform
> specific bus address width limitations. Fortunately, the ACPI IORT table
> provides a way to report the effective number of address bits a device
> can use to access memory. This information, when present, is used to
> supplement the checks already being done in dma_supported() to avoid
> setting overly generous dma_masks.

This is equally a problem for DT, and I think in general we'd prefer not
to be dragging ACPI/DT specifics in at this level when there's a clean
way to address it more generally. There is some recent ongoing
discussion and work in this area (latest part at [1]) - I have a local
branch somewhere implementing the stricter "don't special case default
masks" version (after I came around to Arnd's viewpoint), which I must
refresh myself on because there was some anomaly in the core DT code
which that brought to light.

> Signed-off-by: Nate Watterson <nwatters@codeaurora.org>
> ---
>  arch/arm64/mm/dma-mapping.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index e040827..467fd23 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -19,6 +19,7 @@
>
>  #include <linux/gfp.h>
>  #include <linux/acpi.h>
> +#include <linux/acpi_iort.h>
>  #include <linux/bootmem.h>
>  #include <linux/cache.h>
>  #include <linux/export.h>
> @@ -347,6 +348,12 @@ static int __swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
>
>  static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
>  {
> +	int dma_limit;
> +
> +	dma_limit = iort_get_memory_address_limit(hwdev);
> +	if (dma_limit >= 0 && DMA_BIT_MASK(dma_limit) < mask)
> +		return 0;
> +
>  	if (swiotlb)
>  		return swiotlb_dma_supported(hwdev, mask);
>  	return 1;
> @@ -784,6 +791,17 @@ static void __iommu_unmap_sg_attrs(struct device *dev,
>  	iommu_dma_unmap_sg(dev, sgl, nelems, dir, attrs);
>  }
>
> +static int __iommu_dma_supported(struct device *hwdev, u64 mask)
> +{
> +	int dma_limit;
> +
> +	dma_limit = iort_get_memory_address_limit(hwdev);
> +	if (dma_limit >= 0 && DMA_BIT_MASK(dma_limit) < mask)
> +		return 0;
> +
> +	return iommu_dma_supported(hwdev, mask);

Either way, this reminds me that iommu_dma_supported() is another thing
I got completely wrong - time to write yet another patch...

Robin.

[1]:http://www.mail-archive.com/linux-renesas-soc@vger.kernel.org/msg10637.html

> +}
> +
>  static struct dma_map_ops iommu_dma_ops = {
>  	.alloc = __iommu_alloc_attrs,
>  	.free = __iommu_free_attrs,
> @@ -799,7 +817,7 @@ static void __iommu_unmap_sg_attrs(struct device *dev,
>  	.sync_sg_for_device = __iommu_sync_sg_for_device,
>  	.map_resource = iommu_dma_map_resource,
>  	.unmap_resource = iommu_dma_unmap_resource,
> -	.dma_supported = iommu_dma_supported,
> +	.dma_supported = __iommu_dma_supported,
>  	.mapping_error = iommu_dma_mapping_error,
>  };
>
>
On Wed, Feb 01, 2017 at 01:44:02PM +0000, Robin Murphy wrote:
> Hi Nate,
>
> On 31/01/17 20:16, Nate Watterson wrote:
> > Some drivers set the dma_mask of client devices based solely on values
> > read from capability registers which may not account for platform
> > specific bus address width limitations. Fortunately, the ACPI IORT table
> > provides a way to report the effective number of address bits a device
> > can use to access memory. This information, when present, is used to
> > supplement the checks already being done in dma_supported() to avoid
> > setting overly generous dma_masks.
>
> This is equally a problem for DT, and I think in general we'd prefer not
> to be dragging ACPI/DT specifics in at this level when there's a clean
> way to address it more generally. There is some recent ongoing
> discussion and work in this area (latest part at [1]) - I have a local
> branch somewhere implementing the stricter "don't special case default
> masks" version (after I came around to Arnd's viewpoint), which I must
> refresh myself on because there was some anomaly in the core DT code
> which that brought to light.

Agreed. I can prototype the ACPI version by using the _DMA object in the
ACPI specs instead of IORT specific bindings (what to do for named
components has to be seen given that _DMA object and IORT bindings can
provide different information - though _DMA object usage at least on x86
seems non-existent, whether we should use it or not on ARM is still a
question mark). Anyway, the IORT parsing code in patch 1 is simple, we
have to decide how to handle the information retrieved. I will have a
look at [1] let me know if you need help prototyping and testing it with
ACPI.

Lorenzo
On 01/02/17 14:36, Lorenzo Pieralisi wrote:
> On Wed, Feb 01, 2017 at 01:44:02PM +0000, Robin Murphy wrote:
>> Hi Nate,
>>
>> On 31/01/17 20:16, Nate Watterson wrote:
>>> Some drivers set the dma_mask of client devices based solely on values
>>> read from capability registers which may not account for platform
>>> specific bus address width limitations. Fortunately, the ACPI IORT table
>>> provides a way to report the effective number of address bits a device
>>> can use to access memory. This information, when present, is used to
>>> supplement the checks already being done in dma_supported() to avoid
>>> setting overly generous dma_masks.
>>
>> This is equally a problem for DT, and I think in general we'd prefer not
>> to be dragging ACPI/DT specifics in at this level when there's a clean
>> way to address it more generally. There is some recent ongoing
>> discussion and work in this area (latest part at [1]) - I have a local
>> branch somewhere implementing the stricter "don't special case default
>> masks" version (after I came around to Arnd's viewpoint), which I must
>> refresh myself on because there was some anomaly in the core DT code
>> which that brought to light.
>
> Agreed. I can prototype the ACPI version by using the _DMA object in the
> ACPI specs instead of IORT specific bindings (what to do for named
> components has to be seen given that _DMA object and IORT bindings can
> provide different information - though _DMA object usage at least on x86
> seems non-existent, whether we should use it or not on ARM is still a
> question mark). Anyway, the IORT parsing code in patch 1 is simple, we
> have to decide how to handle the information retrieved. I will have a
> look at [1] let me know if you need help prototyping and testing it with
> ACPI.

Essentially, all that needs to be done is to ensure that the initial
masks set by acpi_dma_configure() truly reflect the maximum hardware
capability; everything else will then just fall out of that. The
aforementioned thing on the DT side is that of_dma_configure() currently
has a bug which prevents masks larger than 32 bits actually being
assigned from "dma-ranges" - I need to split out a proper patch from the
"git commit -am 'hacks'" that I have on this local branch :)

Robin.
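As a rough illustration of the of_dma_configure() point above, here is a userspace sketch of deriving a mask from a "dma-ranges"-style window while keeping all arithmetic in 64 bits, so that a window wider than 32 bits is not silently clamped. The helper names `bits_for_range()` and `mask_for_range()` are hypothetical; this is not the kernel's implementation, and the thread does not say where exactly the kernel bug lies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: smallest number of address bits that can reach
 * the last byte of the window [base, base + size).
 */
static unsigned int bits_for_range(uint64_t base, uint64_t size)
{
	uint64_t last;
	unsigned int bits = 0;

	assert(size > 0);
	last = base + size - 1;
	assert(last >= base);	/* the window must not wrap around */

	while (last) {
		bits++;
		last >>= 1;
	}
	return bits;
}

/* Hypothetical helper: mask with the low bits_for_range() bits set. */
static uint64_t mask_for_range(uint64_t base, uint64_t size)
{
	unsigned int bits = bits_for_range(base, size);

	/* Shifting a 64-bit value by 64 is undefined, so special-case it. */
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}
```

Under these assumptions, a window starting at 0 with a size of 1 << 40 yields a 40-bit mask, which is the Juno-style case raised later in the thread.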
On Wed, Feb 1, 2017 at 4:27 PM, Robin Murphy <robin.murphy@arm.com> wrote:
>
> Essentially, all that needs to be done is to ensure that the initial
> masks set by acpi_dma_configure() truly reflect the maximum hardware
> capability; everything else will then just fall out of that. The
> aforementioned thing on the DT side is that of_dma_configure() currently
> has a bug which prevents masks larger than 32 bits actually being
> assigned from "dma-ranges" - I need to split out a proper patch from the
> "git commit -am 'hacks'" that I have on this local branch :)

Do you mean you want to change the initial DMA mask to the maximum allowed
mask? I don't think we can do that, as that would break all devices that support
only 32-bit DMA but happen to sit on a bus that has 64-bit DMA support.

Arnd
On 01/02/17 15:34, Arnd Bergmann wrote:
> On Wed, Feb 1, 2017 at 4:27 PM, Robin Murphy <robin.murphy@arm.com> wrote:
>
>>
>> Essentially, all that needs to be done is to ensure that the initial
>> masks set by acpi_dma_configure() truly reflect the maximum hardware
>> capability; everything else will then just fall out of that. The
>> aforementioned thing on the DT side is that of_dma_configure() currently
>> has a bug which prevents masks larger than 32 bits actually being
>> assigned from "dma-ranges" - I need to split out a proper patch from the
>> "git commit -am 'hacks'" that I have on this local branch :)
>
> Do you mean you want to change the initial DMA mask to the maximum allowed
> mask? I don't think we can do that, as that would break all devices that support
> only 32-bit DMA but happen to sit on a bus that has 64-bit DMA support.

That doesn't break anything provided that the drivers of said 32-bit
devices are calling dma_set_mask_and_coherent(DMA_BIT_MASK(32)) as they
should be. e.g. on Juno, we (now) have a top-level "dma-ranges"
describing the 40-bit interconnect, so (given the aforementioned fix)
of_dma_configure() sets initial masks to 40-bit, then the drivers of the
32-bit-only IP blocks (USB, PL330, HDLCD, etc.) reduce their masks to
suit and everything works fine.

Basically, as long as drivers correctly call dma_set_mask*() with the
upper bound of what that device is inherently capable of driving, and
the DT has "dma-ranges" present to describe any configuration where
fewer bits than that are actually wired up (e.g. the Renesas PCIe and
APM SMMU cases), everything's fine.

If a 32-bit device on a correctly-described 64-bit bus were to break
(presumably by inheriting a too-big mask), that's simply uncovering a
driver bug, which would already have been broken until 9a6d7298b083
introduced the erroneous 32-bit clamp.

Robin.

>
> Arnd
>
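The Juno sequence described above can be modelled in userspace roughly as follows. This is only a sketch: `model_dev`, `model_firmware_configure()` and `model_driver_set_mask()` are hypothetical stand-ins for the device structure, of_dma_configure()/acpi_dma_configure() and dma_set_mask*(), and the rule that a driver request never widens the firmware-provided mask is an assumption about the stricter behaviour under discussion, not something the thread specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's DMA_BIT_MASK(n): low n bits set. */
static uint64_t dma_bit_mask(unsigned int bits)
{
	assert(bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/* Hypothetical, simplified device: only the field this sketch needs. */
struct model_dev {
	uint64_t dma_mask;	/* mask currently in effect */
};

/* Firmware step: start from what the interconnect can actually carry. */
static void model_firmware_configure(struct model_dev *dev,
				     unsigned int bus_bits)
{
	dev->dma_mask = dma_bit_mask(bus_bits);
}

/*
 * Driver step: request the device's inherent capability. Assumption of
 * this sketch: the request may narrow the mask but never widen it past
 * what the firmware described.
 */
static void model_driver_set_mask(struct model_dev *dev,
				  unsigned int dev_bits)
{
	uint64_t want = dma_bit_mask(dev_bits);

	if (want < dev->dma_mask)
		dev->dma_mask = want;
}
```

With a 40-bit interconnect, a 32-bit-only block ends up with a 32-bit mask after its driver runs, while a hypothetical 64-bit-capable device stays at 40 bits; in this model neither case needs ACPI- or DT-specific logic in dma_supported().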
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index e040827..467fd23 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -19,6 +19,7 @@
 
 #include <linux/gfp.h>
 #include <linux/acpi.h>
+#include <linux/acpi_iort.h>
 #include <linux/bootmem.h>
 #include <linux/cache.h>
 #include <linux/export.h>
@@ -347,6 +348,12 @@ static int __swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
 
 static int __swiotlb_dma_supported(struct device *hwdev, u64 mask)
 {
+	int dma_limit;
+
+	dma_limit = iort_get_memory_address_limit(hwdev);
+	if (dma_limit >= 0 && DMA_BIT_MASK(dma_limit) < mask)
+		return 0;
+
 	if (swiotlb)
 		return swiotlb_dma_supported(hwdev, mask);
 	return 1;
@@ -784,6 +791,17 @@ static void __iommu_unmap_sg_attrs(struct device *dev,
 	iommu_dma_unmap_sg(dev, sgl, nelems, dir, attrs);
 }
 
+static int __iommu_dma_supported(struct device *hwdev, u64 mask)
+{
+	int dma_limit;
+
+	dma_limit = iort_get_memory_address_limit(hwdev);
+	if (dma_limit >= 0 && DMA_BIT_MASK(dma_limit) < mask)
+		return 0;
+
+	return iommu_dma_supported(hwdev, mask);
+}
+
 static struct dma_map_ops iommu_dma_ops = {
 	.alloc = __iommu_alloc_attrs,
 	.free = __iommu_free_attrs,
@@ -799,7 +817,7 @@ static void __iommu_unmap_sg_attrs(struct device *dev,
 	.sync_sg_for_device = __iommu_sync_sg_for_device,
 	.map_resource = iommu_dma_map_resource,
 	.unmap_resource = iommu_dma_unmap_resource,
-	.dma_supported = iommu_dma_supported,
+	.dma_supported = __iommu_dma_supported,
 	.mapping_error = iommu_dma_mapping_error,
 };
Some drivers set the dma_mask of client devices based solely on values
read from capability registers which may not account for platform
specific bus address width limitations. Fortunately, the ACPI IORT table
provides a way to report the effective number of address bits a device
can use to access memory. This information, when present, is used to
supplement the checks already being done in dma_supported() to avoid
setting overly generous dma_masks.

Signed-off-by: Nate Watterson <nwatters@codeaurora.org>
---
 arch/arm64/mm/dma-mapping.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
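To make the check that the patch adds to both dma_supported() paths concrete, here is a minimal userspace sketch. `dma_bit_mask()` mirrors the kernel's DMA_BIT_MASK() macro; `mask_fits_platform()` is a hypothetical name, and treating a negative limit as "firmware reported nothing" is inferred from the `dma_limit >= 0` test in the diff rather than from the IORT helper itself, which is not shown on this page.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's DMA_BIT_MASK(n): low n bits set. */
static uint64_t dma_bit_mask(unsigned int bits)
{
	assert(bits <= 64);
	return bits == 64 ? ~0ULL : (1ULL << bits) - 1;
}

/*
 * Hypothetical model of the added check. address_limit plays the role
 * of the value returned by iort_get_memory_address_limit(); a negative
 * value is assumed to mean that no limit was reported.
 */
static int mask_fits_platform(int address_limit, uint64_t mask)
{
	if (address_limit >= 0 &&
	    dma_bit_mask((unsigned int)address_limit) < mask)
		return 0;	/* mask claims more bits than the bus carries */

	return 1;		/* fall through to the existing checks */
}
```

Under these assumptions, a 64-bit mask is rejected on a platform limited to 40 address bits, while a 32-bit or 40-bit mask passes, and any mask passes when no limit is reported.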