Message ID | 1490419893-5073-1-git-send-email-oza.oza@broadcom.com
---|---
State | New, archived
Delegated to | Bjorn Helgaas
On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep <oza.oza@broadcom.com> wrote:
> It is possible that a PCI device supports 64-bit DMA addressing,
> and thus its driver sets the device's dma_mask to DMA_BIT_MASK(64);
> however, the PCI host bridge may have limitations on the inbound
> transaction addressing. As an example, consider an NVMe SSD device
> connected to the iproc-PCIe controller.
>
> Currently, the IOMMU DMA ops only consider the PCI device dma_mask
> when allocating an IOVA. This is particularly problematic on
> ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
> PA for inbound transactions only after the PCI host has forwarded
> these transactions onto the SOC IO bus. This means that on such
> ARM/ARM64 SOCs the IOVA of inbound transactions has to honor the
> addressing restrictions of the PCI host.
>
> The current PCIe framework and OF framework integration assumes
> dma-ranges in the form used by memory-mapped devices:
> dma-ranges: (child-bus-address, parent-bus-address, length).
>
> But iproc-based SOCs, and even R-Car based SOCs, have PCI-world dma-ranges:
> dma-ranges = <0x43000000 0x00 0x00 0x00 0x00 0x80 0x00>;

If you implement a common function, then I expect to see other users
converted to use it. There are also PCI hosts in arch/powerpc that parse
dma-ranges.

Rob
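For reference, the PCI-style dma-ranges entry quoted above is a (pci-address, cpu-address, size) triple: 3 PCI address cells followed by the parent (CPU) address and the size. Below is a minimal, self-contained C sketch of how those 7 cells decode, assuming 2 parent address cells and 2 size cells; the cell counts are an assumption about the iproc device tree, not something stated in the thread.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative decode of the 7-cell PCI dma-ranges entry quoted above.
 * Assumes 3 PCI address cells, 2 parent (CPU) address cells and 2 size
 * cells -- an assumption, not quoted from the iproc device tree.
 */
int main(void)
{
	const uint32_t cells[7] = {
		0x43000000, 0x0, 0x0,	/* PCI address: phys.hi, phys.mid, phys.lo */
		0x0, 0x0,		/* parent (CPU) address */
		0x80, 0x0,		/* size */
	};

	uint32_t flags    = cells[0];	/* phys.hi: space code and prefetchable bit */
	uint64_t pci_addr = ((uint64_t)cells[1] << 32) | cells[2];
	uint64_t cpu_addr = ((uint64_t)cells[3] << 32) | cells[4];
	uint64_t size     = ((uint64_t)cells[5] << 32) | cells[6];

	printf("flags=0x%08x pci=0x%llx cpu=0x%llx size=0x%llx\n",
	       flags, (unsigned long long)pci_addr,
	       (unsigned long long)cpu_addr, (unsigned long long)size);
	/* prints flags=0x43000000 pci=0x0 cpu=0x0 size=0x8000000000 (512 GiB) */
	return 0;
}
```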
On Mon, Mar 27, 2017 at 8:16 PM, Rob Herring <robh@kernel.org> wrote:
> On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep <oza.oza@broadcom.com> wrote:
>> It is possible that a PCI device supports 64-bit DMA addressing,
>> and thus its driver sets the device's dma_mask to DMA_BIT_MASK(64);
>> however, the PCI host bridge may have limitations on the inbound
>> transaction addressing. As an example, consider an NVMe SSD device
>> connected to the iproc-PCIe controller.
>>
>> Currently, the IOMMU DMA ops only consider the PCI device dma_mask
>> when allocating an IOVA. This is particularly problematic on
>> ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
>> PA for inbound transactions only after the PCI host has forwarded
>> these transactions onto the SOC IO bus. This means that on such
>> ARM/ARM64 SOCs the IOVA of inbound transactions has to honor the
>> addressing restrictions of the PCI host.
>>
>> The current PCIe framework and OF framework integration assumes
>> dma-ranges in the form used by memory-mapped devices:
>> dma-ranges: (child-bus-address, parent-bus-address, length).
>>
>> But iproc-based SOCs, and even R-Car based SOCs, have PCI-world dma-ranges:
>> dma-ranges = <0x43000000 0x00 0x00 0x00 0x00 0x80 0x00>;
>
> If you implement a common function, then I expect to see other users
> converted to use it. There are also PCI hosts in arch/powerpc that parse
> dma-ranges.

The common function should be similar to what
of_pci_get_host_bridge_resources is doing right now; it parses the
ranges property. The new function would look like the following:

of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources)

where resources would return the dma-ranges.

But right now, if you look at the patch, of_dma_configure calls the new
function, which actually returns the largest possible size. So this new
function has to be generic in a way that other PCI hosts can use it;
certainly iproc (Broadcom SOC) and R-Car based SOCs can use it.

Although having powerpc use it is a separate exercise, since I do not
have access to other PCI hosts such as powerpc, we can work that out
with them on this forum if required.

So overall, of_pci_get_dma_ranges has to serve the following two purposes:

1) it has to return the largest possible size to of_dma_configure, to
generate the largest possible dma_mask.

2) it also has to return the parsed resources (dma-ranges) to the users.

So, to address the above needs:

of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources, u64 *size)

dev -> device node.
resources -> dma-ranges as an allocated list.
size -> highest possible size, to generate the dma_mask for of_dma_configure.

Let me know how this sounds.

Regards,
Oza.
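Point (1) above, handing the largest window size back so that of_dma_configure can derive a mask, would look roughly like the sketch below. This is only an illustration of the idea, not the actual of_dma_configure code; the helper name and the rounding are assumptions.

```c
#include <stdint.h>

/*
 * Sketch only: given the largest inbound window (dma_addr, size) parsed
 * from dma-ranges, derive the widest DMA mask the host bridge can accept.
 * The power-of-two rounding mirrors what an of_dma_configure-like caller
 * could do; it is not quoted from the kernel source.
 */
static uint64_t dma_mask_from_window(uint64_t dma_addr, uint64_t size)
{
	uint64_t limit = dma_addr + size - 1;	/* highest addressable bus address */
	uint64_t mask = 0;

	while (mask < limit)			/* round up to an all-ones mask */
		mask = (mask << 1) | 1;
	return mask;	/* e.g. 0x7fffffffff for a 512 GiB window starting at 0 */
}
```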
On Tue, Mar 28, 2017 at 12:27 AM, Oza Oza <oza.oza@broadcom.com> wrote:
> On Mon, Mar 27, 2017 at 8:16 PM, Rob Herring <robh@kernel.org> wrote:
>> On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep <oza.oza@broadcom.com> wrote:
>>> It is possible that a PCI device supports 64-bit DMA addressing,
>>> and thus its driver sets the device's dma_mask to DMA_BIT_MASK(64);
>>> however, the PCI host bridge may have limitations on the inbound
>>> transaction addressing. As an example, consider an NVMe SSD device
>>> connected to the iproc-PCIe controller.
>>>
>>> Currently, the IOMMU DMA ops only consider the PCI device dma_mask
>>> when allocating an IOVA. This is particularly problematic on
>>> ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
>>> PA for inbound transactions only after the PCI host has forwarded
>>> these transactions onto the SOC IO bus. This means that on such
>>> ARM/ARM64 SOCs the IOVA of inbound transactions has to honor the
>>> addressing restrictions of the PCI host.
>>>
>>> The current PCIe framework and OF framework integration assumes
>>> dma-ranges in the form used by memory-mapped devices:
>>> dma-ranges: (child-bus-address, parent-bus-address, length).
>>>
>>> But iproc-based SOCs, and even R-Car based SOCs, have PCI-world dma-ranges:
>>> dma-ranges = <0x43000000 0x00 0x00 0x00 0x00 0x80 0x00>;
>>
>> If you implement a common function, then I expect to see other users
>> converted to use it. There are also PCI hosts in arch/powerpc that parse
>> dma-ranges.
>
> The common function should be similar to what
> of_pci_get_host_bridge_resources is doing right now; it parses the
> ranges property. The new function would look like the following:
>
> of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources)
>
> where resources would return the dma-ranges.
>
> But right now, if you look at the patch, of_dma_configure calls the new
> function, which actually returns the largest possible size. So this new
> function has to be generic in a way that other PCI hosts can use it;
> certainly iproc (Broadcom SOC) and R-Car based SOCs can use it.
>
> Although having powerpc use it is a separate exercise, since I do not
> have access to other PCI hosts such as powerpc, we can work that out
> with them on this forum if required.

You don't need h/w. You can analyze what parts are common, write
patches to convert to a common implementation, and build-test. The PPC
and R-Car folks can test on h/w.

Rob
On Tue, Mar 28, 2017 at 7:43 PM, Rob Herring <robh@kernel.org> wrote:
> On Tue, Mar 28, 2017 at 12:27 AM, Oza Oza <oza.oza@broadcom.com> wrote:
>> On Mon, Mar 27, 2017 at 8:16 PM, Rob Herring <robh@kernel.org> wrote:
>>> On Sat, Mar 25, 2017 at 12:31 AM, Oza Pawandeep <oza.oza@broadcom.com> wrote:
>>>> It is possible that a PCI device supports 64-bit DMA addressing,
>>>> and thus its driver sets the device's dma_mask to DMA_BIT_MASK(64);
>>>> however, the PCI host bridge may have limitations on the inbound
>>>> transaction addressing. As an example, consider an NVMe SSD device
>>>> connected to the iproc-PCIe controller.
>>>>
>>>> Currently, the IOMMU DMA ops only consider the PCI device dma_mask
>>>> when allocating an IOVA. This is particularly problematic on
>>>> ARM/ARM64 SOCs where the IOMMU (i.e. SMMU) translates IOVA to
>>>> PA for inbound transactions only after the PCI host has forwarded
>>>> these transactions onto the SOC IO bus. This means that on such
>>>> ARM/ARM64 SOCs the IOVA of inbound transactions has to honor the
>>>> addressing restrictions of the PCI host.
>>>>
>>>> The current PCIe framework and OF framework integration assumes
>>>> dma-ranges in the form used by memory-mapped devices:
>>>> dma-ranges: (child-bus-address, parent-bus-address, length).
>>>>
>>>> But iproc-based SOCs, and even R-Car based SOCs, have PCI-world dma-ranges:
>>>> dma-ranges = <0x43000000 0x00 0x00 0x00 0x00 0x80 0x00>;
>>>
>>> If you implement a common function, then I expect to see other users
>>> converted to use it. There are also PCI hosts in arch/powerpc that parse
>>> dma-ranges.
>>
>> The common function should be similar to what
>> of_pci_get_host_bridge_resources is doing right now; it parses the
>> ranges property. The new function would look like the following:
>>
>> of_pci_get_dma_ranges(struct device_node *dev, struct list_head *resources)
>>
>> where resources would return the dma-ranges.
>>
>> But right now, if you look at the patch, of_dma_configure calls the new
>> function, which actually returns the largest possible size. So this new
>> function has to be generic in a way that other PCI hosts can use it;
>> certainly iproc (Broadcom SOC) and R-Car based SOCs can use it.
>>
>> Although having powerpc use it is a separate exercise, since I do not
>> have access to other PCI hosts such as powerpc, we can work that out
>> with them on this forum if required.
>
> You don't need h/w. You can analyze what parts are common, write
> patches to convert to a common implementation, and build-test. The PPC
> and R-Car folks can test on h/w.
>
> Rob

Hi Rob,

I have addressed your comment and made the function generic. A gentle
request to have a look at the following approach and patches:

[RFC PATCH 2/2] of/pci: call pci specific dma-ranges instead of memory-mapped.
[RFC PATCH 1/2] of/pci: implement inbound dma-ranges for PCI

I have tested this on our platform, with and without IOMMU, and it
seems to work. Let me know your view on this.

Regards,
Oza.
```diff
diff --git a/drivers/of/device.c b/drivers/of/device.c
index b1e6beb..d362a98 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -9,6 +9,7 @@
 #include <linux/module.h>
 #include <linux/mod_devicetable.h>
 #include <linux/slab.h>
+#include <linux/of_pci.h>
 
 #include <asm/errno.h>
 #include "of_private.h"
@@ -104,7 +105,11 @@ void of_dma_configure(struct device *dev, struct device_node *np)
 	if (!dev->dma_mask)
 		dev->dma_mask = &dev->coherent_dma_mask;
 
-	ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
+	if (dev_is_pci(dev))
+		ret = of_pci_get_dma_ranges(np, &dma_addr, &paddr, &size);
+	else
+		ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
+
 	if (ret < 0) {
 		dma_addr = offset = 0;
 		size = dev->coherent_dma_mask + 1;
diff --git a/drivers/of/of_pci.c b/drivers/of/of_pci.c
index 0ee42c3..c7f8626 100644
--- a/drivers/of/of_pci.c
+++ b/drivers/of/of_pci.c
@@ -283,6 +283,52 @@ int of_pci_get_host_bridge_resources(struct device_node *dev,
 	return err;
 }
 EXPORT_SYMBOL_GPL(of_pci_get_host_bridge_resources);
+
+int of_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
+{
+	struct device_node *node = of_node_get(np);
+	int rlen, ret = 0;
+	const int na = 3, ns = 2;
+	struct of_pci_range_parser parser;
+	struct of_pci_range range;
+
+	if (!node)
+		return -EINVAL;
+
+	parser.node = node;
+	parser.pna = of_n_addr_cells(node);
+	parser.np = parser.pna + na + ns;
+
+	parser.range = of_get_property(node, "dma-ranges", &rlen);
+
+	if (!parser.range) {
+		pr_debug("pcie device has no dma-ranges defined for node(%s)\n", np->full_name);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	parser.end = parser.range + rlen / sizeof(__be32);
+	*size = 0;
+
+	for_each_of_pci_range(&parser, &range) {
+		if (*size < range.size) {
+			*dma_addr = range.pci_addr;
+			*size = range.size;
+			*paddr = range.cpu_addr;
+		}
+	}
+
+	pr_debug("dma_addr(%llx) cpu_addr(%llx) size(%llx)\n",
+		 *dma_addr, *paddr, *size);
+	*dma_addr = range.pci_addr;
+	*size = range.size;
+
+out:
+	of_node_put(node);
+	return ret;
+
+}
+EXPORT_SYMBOL_GPL(of_pci_get_dma_ranges);
 #endif /* CONFIG_OF_ADDRESS */
 
 #ifdef CONFIG_PCI_MSI
diff --git a/include/linux/of_pci.h b/include/linux/of_pci.h
index 0e0974e..907ace0 100644
--- a/include/linux/of_pci.h
+++ b/include/linux/of_pci.h
@@ -76,6 +76,7 @@ static inline void of_pci_check_probe_only(void) { }
 int of_pci_get_host_bridge_resources(struct device_node *dev,
 			unsigned char busno, unsigned char bus_max,
 			struct list_head *resources, resource_size_t *io_base);
+int of_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size);
 #else
 static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
 			unsigned char busno, unsigned char bus_max,
@@ -83,6 +84,11 @@ static inline int of_pci_get_host_bridge_resources(struct device_node *dev,
 {
 	return -EINVAL;
 }
+
+static inline int of_pci_get_dma_ranges(struct device_node *np, u64 *dma_addr, u64 *paddr, u64 *size)
+{
+	return -EINVAL;
+}
 #endif
 
 #if defined(CONFIG_OF) && defined(CONFIG_PCI_MSI)
```
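For readers following the RFC: the heart of of_pci_get_dma_ranges() is the loop that remembers the largest inbound window, since that window is what of_dma_configure uses to build the widest dma_mask. Below is a minimal standalone sketch of just that selection step; struct pci_dma_window and pick_largest_window are hypothetical names standing in for the kernel's struct of_pci_range and the loop above, and the sketch deliberately keeps only the selection (it does not reproduce the two assignments after the pr_debug(), which re-set *dma_addr and *size from the last range iterated).

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Standalone sketch of the selection logic in of_pci_get_dma_ranges():
 * walk every dma-ranges entry and remember the largest inbound window.
 * "struct pci_dma_window" is a hypothetical stand-in for the kernel's
 * struct of_pci_range; it is not a real kernel type.
 */
struct pci_dma_window {
	uint64_t pci_addr;
	uint64_t cpu_addr;
	uint64_t size;
};

static int pick_largest_window(const struct pci_dma_window *ranges, size_t n,
			       uint64_t *dma_addr, uint64_t *paddr, uint64_t *size)
{
	size_t i;

	*size = 0;
	for (i = 0; i < n; i++) {
		if (ranges[i].size > *size) {	/* keep the biggest window seen so far */
			*dma_addr = ranges[i].pci_addr;
			*paddr = ranges[i].cpu_addr;
			*size = ranges[i].size;
		}
	}
	return *size ? 0 : -1;			/* -1: no usable dma-ranges entry */
}
```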