Message ID | 20230518210658.66156-4-stewart.hildebrand@amd.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | SMMU handling for PCIe Passthrough on ARM | expand |
On 18.05.2023 23:06, Stewart Hildebrand wrote: > --- a/xen/include/xen/iommu.h > +++ b/xen/include/xen/iommu.h > @@ -26,6 +26,7 @@ > #include <xen/spinlock.h> > #include <public/domctl.h> > #include <public/hvm/ioreq.h> > +#include <asm/acpi.h> > #include <asm/device.h> I view this as problematic: It'll require all architectures with an IOMMU implementation to have an asm/acpi.h. I think this wants to go inside an "#ifdef CONFIG_ACPI" and then ... > @@ -228,12 +230,25 @@ int iommu_release_dt_devices(struct domain *d); > * (IOMMU is not enabled/present or device is not connected to it). > */ > int iommu_add_dt_device(struct dt_device_node *np); > +int iommu_add_dt_pci_sideband_ids(struct pci_dev *pdev); > > int iommu_do_dt_domctl(struct xen_domctl *, struct domain *, > XEN_GUEST_HANDLE_PARAM(xen_domctl_t)); > > +#else /* !HAS_DEVICE_TREE */ > +static inline int iommu_add_dt_pci_sideband_ids(struct pci_dev *pdev) > +{ > + return 0; > +} > #endif /* HAS_DEVICE_TREE */ > > +static inline int iommu_add_pci_sideband_ids(struct pci_dev *pdev) > +{ > + if ( acpi_disabled ) ... the same #ifdef would be added around this if(). All of this of course only if this is deemed enough to allow co-existance of DT and ACPI (which I'm not convinced it is, but I don't know enough about DT and e.g. possible mixed configurations). Jan > + return iommu_add_dt_pci_sideband_ids(pdev); > + return 0; > +} > + > struct page_info; > > /*
On 5/19/23 04:45, Jan Beulich wrote: > On 18.05.2023 23:06, Stewart Hildebrand wrote: >> --- a/xen/include/xen/iommu.h >> +++ b/xen/include/xen/iommu.h >> @@ -26,6 +26,7 @@ >> #include <xen/spinlock.h> >> #include <public/domctl.h> >> #include <public/hvm/ioreq.h> >> +#include <asm/acpi.h> >> #include <asm/device.h> > > I view this as problematic: It'll require all architectures with an > IOMMU implementation to have an asm/acpi.h. I think this wants to go > inside an "#ifdef CONFIG_ACPI" and then ... Will do >> @@ -228,12 +230,25 @@ int iommu_release_dt_devices(struct domain *d); >> * (IOMMU is not enabled/present or device is not connected to it). >> */ >> int iommu_add_dt_device(struct dt_device_node *np); >> +int iommu_add_dt_pci_sideband_ids(struct pci_dev *pdev); >> >> int iommu_do_dt_domctl(struct xen_domctl *, struct domain *, >> XEN_GUEST_HANDLE_PARAM(xen_domctl_t)); >> >> +#else /* !HAS_DEVICE_TREE */ >> +static inline int iommu_add_dt_pci_sideband_ids(struct pci_dev *pdev) >> +{ >> + return 0; >> +} >> #endif /* HAS_DEVICE_TREE */ >> >> +static inline int iommu_add_pci_sideband_ids(struct pci_dev *pdev) >> +{ >> + if ( acpi_disabled ) > > ... the same #ifdef would be added around this if(). Okay. I will take care to avoid an unreachable return 0; by introducing a local variable: static inline int iommu_add_pci_sideband_ids(struct pci_dev *pdev) { int ret = 0; #ifdef CONFIG_ACPI if ( acpi_disabled ) #endif ret = iommu_add_dt_pci_sideband_ids(pdev); return ret; } > All of this of course only if this is deemed enough to allow co-existance > of DT and ACPI (which I'm not convinced it is, but I don't know enough > about DT and e.g. possible mixed configurations). > > Jan On ARM, we dynamically check for the existence of a valid device tree and set acpi_disabled accordingly. I did some basic testing on ARM with both CONFIG_ACPI=y and # CONFIG_ACPI is not set. My understanding is that it will work, and it should allow ACPI on ARM to be implemented in future. >> + return iommu_add_dt_pci_sideband_ids(pdev); >> + return 0; >> +} >> + >> struct page_info; >> >> /* >
Hi Stewart, On 18/05/2023 23:06, Stewart Hildebrand wrote: > > > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> > > The main purpose of this patch is to add a way to register PCI device > (which is behind the IOMMU) using the generic PCI-IOMMU DT bindings [1] > before assigning that device to a domain. > > This behaves in almost the same way as existing iommu_add_dt_device API, > the difference is in devices to handle and DT bindings to use. > > The function of_map_id to translate an ID through a downstream mapping > (which is also suitable for mapping Requester ID) was borrowed from Linux > (v5.10-rc6) and updated according to the Xen code base. > > XXX: I don't port pci_for_each_dma_alias from Linux which is a part > of PCI-IOMMU bindings infrastucture as I don't have a good understanding > for how it is expected to work in Xen environment. > Also it is not completely clear whether we need to distinguish between > different PCI types here (DEV_TYPE_PCI, DEV_TYPE_PCI_HOST_BRIDGE, etc). > For example, how we should behave here if the host bridge doesn't have > a stream ID (so not described in iommu-map property) just simple > fail or bypasses translation? > > [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/pci-iommu.txt > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > --- > v2->v3: > * new patch title (was: iommu/arm: Introduce iommu_add_dt_pci_device API) > * renamed function > from: iommu_add_dt_pci_device > to: iommu_add_dt_pci_sideband_ids > * removed stale ops->add_device check > * iommu.h: add empty stub iommu_add_dt_pci_sideband_ids for !HAS_DEVICE_TREE > * iommu.h: add iommu_add_pci_sideband_ids helper > * iommu.h: don't wrap prototype in #ifdef CONFIG_HAS_PCI > * s/iommu_fwspec_free(pci_to_dev(pdev))/iommu_fwspec_free(dev)/ > > v1->v2: > * remove extra devfn parameter since pdev fully describes the device > * remove ops->add_device() call from iommu_add_dt_pci_device(). Instead, rely on > the existing iommu call in iommu_add_device(). > * move the ops->add_device and ops->dt_xlate checks earlier > > downstream->v1: > * rebase > * add const qualifier to struct dt_device_node *np arg in dt_map_id() > * add const qualifier to struct dt_device_node *np declaration in iommu_add_pci_device() > * use stdint.h types instead of u8/u32/etc... > * rename functions: > s/dt_iommu_xlate/iommu_dt_xlate/ > s/dt_map_id/iommu_dt_pci_map_id/ > s/iommu_add_pci_device/iommu_add_dt_pci_device/ > * add device_is_protected check in iommu_add_dt_pci_device > * wrap prototypes in CONFIG_HAS_PCI > > (cherry picked from commit 734e3bf6ee77e7947667ab8fa96c25b349c2e1da from > the downstream branch poc/pci-passthrough from > https://gitlab.com/xen-project/people/bmarquis/xen-arm-poc.git) > --- > xen/drivers/passthrough/device_tree.c | 140 ++++++++++++++++++++++++++ > xen/include/xen/device_tree.h | 25 +++++ > xen/include/xen/iommu.h | 17 +++- > 3 files changed, 181 insertions(+), 1 deletion(-) > > diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c > index 1b50f4670944..d568166e19ec 100644 > --- a/xen/drivers/passthrough/device_tree.c > +++ b/xen/drivers/passthrough/device_tree.c > @@ -151,6 +151,146 @@ static int iommu_dt_xlate(struct device *dev, > return ops->dt_xlate(dev, iommu_spec); > } > > +#ifdef CONFIG_HAS_PCI > +int iommu_dt_pci_map_id(const struct dt_device_node *np, uint32_t id, > + const char *map_name, const char *map_mask_name, > + struct dt_device_node **target, uint32_t *id_out) > +{ > + uint32_t map_mask, masked_id, map_len; > + const __be32 *map = NULL; > + > + if ( !np || !map_name || (!target && !id_out) ) > + return -EINVAL; > + > + map = dt_get_property(np, map_name, &map_len); > + if ( !map ) > + { > + if ( target ) > + return -ENODEV; empty line here > + /* Otherwise, no map implies no translation */ > + *id_out = id; > + return 0; > + } > + > + if ( !map_len || map_len % (4 * sizeof(*map)) ) could you enclose the second expression in parantheses? > + { > + printk(XENLOG_ERR "%pOF: Error: Bad %s length: %d\n", np, %pOF is a Linux special printk format to print full name of the node. We do not have this in Xen (see printk-formats.txt). If you want to achieve the same, use np->full_name. This applies to all the uses below. Also, use %u for map_len as it is unsigned. > + map_name, map_len); incorrect alignment > + return -EINVAL; > + } > + > + /* The default is to select all bits. */ > + map_mask = 0xffffffff; > + > + /* > + * Can be overridden by "{iommu,msi}-map-mask" property. > + * If of_property_read_u32() fails, the default is used. s/of_property_read_u32/dt_property_read_u32 > + */ > + if ( map_mask_name ) > + dt_property_read_u32(np, map_mask_name, &map_mask); > + > + masked_id = map_mask & id; > + for ( ; (int)map_len > 0; map_len -= 4 * sizeof(*map), map += 4 ) > + { > + struct dt_device_node *phandle_node; > + uint32_t id_base = be32_to_cpup(map + 0); > + uint32_t phandle = be32_to_cpup(map + 1); > + uint32_t out_base = be32_to_cpup(map + 2); > + uint32_t id_len = be32_to_cpup(map + 3); > + > + if ( id_base & ~map_mask ) > + { > + printk(XENLOG_ERR "%pOF: Invalid %s translation - %s-mask (0x%x) ignores id-base (0x%x)\n", we tend to use PRIx32 to print uint32_t values. > + np, map_name, map_name, map_mask, id_base); > + return -EFAULT; > + } > + > + if ( masked_id < id_base || masked_id >= id_base + id_len ) could you enclose the expressions in parantheses? > + continue; > + > + phandle_node = dt_find_node_by_phandle(phandle); > + if ( !phandle_node ) > + return -ENODEV; > + > + if ( target ) > + { > + if ( !*target ) > + *target = phandle_node; > + > + if ( *target != phandle_node ) > + continue; > + } > + > + if ( id_out ) > + *id_out = masked_id - id_base + out_base; > + > + printk(XENLOG_DEBUG "%pOF: %s, using mask %08x, id-base: %08x, out-base: %08x, length: %08x, id: %08x -> %08x\n", %08"PRIx32" > + np, map_name, map_mask, id_base, out_base, id_len, id, > + masked_id - id_base + out_base); > + return 0; > + } > + > + printk(XENLOG_ERR "%pOF: no %s translation for id 0x%x on %pOF\n", > + np, map_name, id, target && *target ? *target : NULL); > + > + /* > + * NOTE: Linux bypasses translation without returning an error here, > + * but should we behave in the same way on Xen? Restrict for now. > + */ > + return -EFAULT; > +} > + > +int iommu_add_dt_pci_sideband_ids(struct pci_dev *pdev) > +{ > + const struct iommu_ops *ops = iommu_get_ops(); > + struct dt_phandle_args iommu_spec = { .args_count = 1 }; > + struct device *dev = pci_to_dev(pdev); > + const struct dt_device_node *np; > + int rc = NO_IOMMU; > + > + if ( !iommu_enabled ) > + return NO_IOMMU; > + > + if ( !ops ) > + return -EINVAL; > + > + if ( device_is_protected(dev) ) > + return 0; > + > + if ( dev_iommu_fwspec_get(dev) ) > + return -EEXIST; > + > + np = pci_find_host_bridge_node(pdev); > + if ( !np ) > + return -ENODEV; > + > + /* > + * The driver which supports generic PCI-IOMMU DT bindings must have > + * these callback implemented. > + */ > + if ( !ops->dt_xlate ) > + return -EINVAL; See my comment in previous patch. This could be moved to iommu_dt_xlate(). > + > + /* > + * According to the Documentation/devicetree/bindings/pci/pci-iommu.txt > + * from Linux. > + */ > + rc = iommu_dt_pci_map_id(np, PCI_BDF(pdev->bus, pdev->devfn), "iommu-map", > + "iommu-map-mask", &iommu_spec.np, iommu_spec.args); > + if ( rc ) > + return rc == -ENODEV ? NO_IOMMU : rc; > + > + rc = iommu_dt_xlate(dev, &iommu_spec); > + if ( rc < 0 ) > + { > + iommu_fwspec_free(dev); > + return -EINVAL; > + } > + > + return rc; > +} > +#endif /* CONFIG_HAS_PCI */ > + > int iommu_add_dt_device(struct dt_device_node *np) > { > const struct iommu_ops *ops = iommu_get_ops(); > diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h > index c1e4751a581f..dc40fdfb9231 100644 > --- a/xen/include/xen/device_tree.h > +++ b/xen/include/xen/device_tree.h > @@ -852,6 +852,31 @@ int dt_count_phandle_with_args(const struct dt_device_node *np, > */ > int dt_get_pci_domain_nr(struct dt_device_node *node); > > +#ifdef CONFIG_HAS_PCI > +/** > + * iommu_dt_pci_map_id - Translate an ID through a downstream mapping. > + * @np: root complex device node. > + * @id: device ID to map. > + * @map_name: property name of the map to use. > + * @map_mask_name: optional property name of the mask to use. > + * @target: optional pointer to a target device node. > + * @id_out: optional pointer to receive the translated ID. > + * > + * Given a device ID, look up the appropriate implementation-defined > + * platform ID and/or the target device which receives transactions on that > + * ID, as per the "iommu-map" and "msi-map" bindings. Either of @target or > + * @id_out may be NULL if only the other is required. If @target points to > + * a non-NULL device node pointer, only entries targeting that node will be > + * matched; if it points to a NULL value, it will receive the device node of > + * the first matching target phandle, with a reference held. > + * > + * Return: 0 on success or a standard error code on failure. > + */ > +int iommu_dt_pci_map_id(const struct dt_device_node *np, uint32_t id, > + const char *map_name, const char *map_mask_name, > + struct dt_device_node **target, uint32_t *id_out); Why is the iommu function prototype in device_tree.h and not in iommu.h where all rest of the iommu dt related prototypes are placed? Furthermore, do you need to expose globally this function? Looking at this series it could be static as it is only used by iommu_add_dt_pci_sideband_ids(). Will there be any use of it later on? ~Michal
On 5/24/23 03:49, Michal Orzel wrote: > Hi Stewart, > > On 18/05/2023 23:06, Stewart Hildebrand wrote: >> >> >> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >> >> The main purpose of this patch is to add a way to register PCI device >> (which is behind the IOMMU) using the generic PCI-IOMMU DT bindings [1] >> before assigning that device to a domain. >> >> This behaves in almost the same way as existing iommu_add_dt_device API, >> the difference is in devices to handle and DT bindings to use. >> >> The function of_map_id to translate an ID through a downstream mapping >> (which is also suitable for mapping Requester ID) was borrowed from Linux >> (v5.10-rc6) and updated according to the Xen code base. >> >> XXX: I don't port pci_for_each_dma_alias from Linux which is a part >> of PCI-IOMMU bindings infrastucture as I don't have a good understanding >> for how it is expected to work in Xen environment. >> Also it is not completely clear whether we need to distinguish between >> different PCI types here (DEV_TYPE_PCI, DEV_TYPE_PCI_HOST_BRIDGE, etc). >> For example, how we should behave here if the host bridge doesn't have >> a stream ID (so not described in iommu-map property) just simple >> fail or bypasses translation? >> >> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/pci-iommu.txt >> >> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> >> --- >> v2->v3: >> * new patch title (was: iommu/arm: Introduce iommu_add_dt_pci_device API) >> * renamed function >> from: iommu_add_dt_pci_device >> to: iommu_add_dt_pci_sideband_ids >> * removed stale ops->add_device check >> * iommu.h: add empty stub iommu_add_dt_pci_sideband_ids for !HAS_DEVICE_TREE >> * iommu.h: add iommu_add_pci_sideband_ids helper >> * iommu.h: don't wrap prototype in #ifdef CONFIG_HAS_PCI >> * s/iommu_fwspec_free(pci_to_dev(pdev))/iommu_fwspec_free(dev)/ >> >> v1->v2: >> * remove extra devfn parameter since pdev fully describes the device >> * remove ops->add_device() call from iommu_add_dt_pci_device(). Instead, rely on >> the existing iommu call in iommu_add_device(). >> * move the ops->add_device and ops->dt_xlate checks earlier >> >> downstream->v1: >> * rebase >> * add const qualifier to struct dt_device_node *np arg in dt_map_id() >> * add const qualifier to struct dt_device_node *np declaration in iommu_add_pci_device() >> * use stdint.h types instead of u8/u32/etc... >> * rename functions: >> s/dt_iommu_xlate/iommu_dt_xlate/ >> s/dt_map_id/iommu_dt_pci_map_id/ >> s/iommu_add_pci_device/iommu_add_dt_pci_device/ >> * add device_is_protected check in iommu_add_dt_pci_device >> * wrap prototypes in CONFIG_HAS_PCI >> >> (cherry picked from commit 734e3bf6ee77e7947667ab8fa96c25b349c2e1da from >> the downstream branch poc/pci-passthrough from >> https://gitlab.com/xen-project/people/bmarquis/xen-arm-poc.git) >> --- >> xen/drivers/passthrough/device_tree.c | 140 ++++++++++++++++++++++++++ >> xen/include/xen/device_tree.h | 25 +++++ >> xen/include/xen/iommu.h | 17 +++- >> 3 files changed, 181 insertions(+), 1 deletion(-) >> >> diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c >> index 1b50f4670944..d568166e19ec 100644 >> --- a/xen/drivers/passthrough/device_tree.c >> +++ b/xen/drivers/passthrough/device_tree.c >> @@ -151,6 +151,146 @@ static int iommu_dt_xlate(struct device *dev, >> return ops->dt_xlate(dev, iommu_spec); >> } >> >> +#ifdef CONFIG_HAS_PCI >> +int iommu_dt_pci_map_id(const struct dt_device_node *np, uint32_t id, >> + const char *map_name, const char *map_mask_name, >> + struct dt_device_node **target, uint32_t *id_out) >> +{ >> + uint32_t map_mask, masked_id, map_len; >> + const __be32 *map = NULL; >> + >> + if ( !np || !map_name || (!target && !id_out) ) >> + return -EINVAL; >> + >> + map = dt_get_property(np, map_name, &map_len); >> + if ( !map ) >> + { >> + if ( target ) >> + return -ENODEV; > empty line here OK >> + /* Otherwise, no map implies no translation */ >> + *id_out = id; >> + return 0; >> + } >> + >> + if ( !map_len || map_len % (4 * sizeof(*map)) ) > could you enclose the second expression in parantheses? Yes >> + { >> + printk(XENLOG_ERR "%pOF: Error: Bad %s length: %d\n", np, > %pOF is a Linux special printk format to print full name of the node. > We do not have this in Xen (see printk-formats.txt). If you want to achieve the same, use np->full_name. > This applies to all the uses below. OK > Also, use %u for map_len as it is unsigned. OK >> + map_name, map_len); > incorrect alignment OK >> + return -EINVAL; >> + } >> + >> + /* The default is to select all bits. */ >> + map_mask = 0xffffffff; >> + >> + /* >> + * Can be overridden by "{iommu,msi}-map-mask" property. >> + * If of_property_read_u32() fails, the default is used. > s/of_property_read_u32/dt_property_read_u32 OK >> + */ >> + if ( map_mask_name ) >> + dt_property_read_u32(np, map_mask_name, &map_mask); >> + >> + masked_id = map_mask & id; >> + for ( ; (int)map_len > 0; map_len -= 4 * sizeof(*map), map += 4 ) >> + { >> + struct dt_device_node *phandle_node; >> + uint32_t id_base = be32_to_cpup(map + 0); >> + uint32_t phandle = be32_to_cpup(map + 1); >> + uint32_t out_base = be32_to_cpup(map + 2); >> + uint32_t id_len = be32_to_cpup(map + 3); >> + >> + if ( id_base & ~map_mask ) >> + { >> + printk(XENLOG_ERR "%pOF: Invalid %s translation - %s-mask (0x%x) ignores id-base (0x%x)\n", > we tend to use PRIx32 to print uint32_t values. OK >> + np, map_name, map_name, map_mask, id_base); >> + return -EFAULT; >> + } >> + >> + if ( masked_id < id_base || masked_id >= id_base + id_len ) > could you enclose the expressions in parantheses? Yes, I will follow MISRA C:2012 Rule 12.1 >> + continue; >> + >> + phandle_node = dt_find_node_by_phandle(phandle); >> + if ( !phandle_node ) >> + return -ENODEV; >> + >> + if ( target ) >> + { >> + if ( !*target ) >> + *target = phandle_node; >> + >> + if ( *target != phandle_node ) >> + continue; >> + } >> + >> + if ( id_out ) >> + *id_out = masked_id - id_base + out_base; >> + >> + printk(XENLOG_DEBUG "%pOF: %s, using mask %08x, id-base: %08x, out-base: %08x, length: %08x, id: %08x -> %08x\n", > %08"PRIx32" OK >> + np, map_name, map_mask, id_base, out_base, id_len, id, >> + masked_id - id_base + out_base); >> + return 0; >> + } >> + >> + printk(XENLOG_ERR "%pOF: no %s translation for id 0x%x on %pOF\n", >> + np, map_name, id, target && *target ? *target : NULL); >> + >> + /* >> + * NOTE: Linux bypasses translation without returning an error here, >> + * but should we behave in the same way on Xen? Restrict for now. >> + */ >> + return -EFAULT; >> +} >> + >> +int iommu_add_dt_pci_sideband_ids(struct pci_dev *pdev) >> +{ >> + const struct iommu_ops *ops = iommu_get_ops(); >> + struct dt_phandle_args iommu_spec = { .args_count = 1 }; >> + struct device *dev = pci_to_dev(pdev); >> + const struct dt_device_node *np; >> + int rc = NO_IOMMU; >> + >> + if ( !iommu_enabled ) >> + return NO_IOMMU; >> + >> + if ( !ops ) >> + return -EINVAL; >> + >> + if ( device_is_protected(dev) ) >> + return 0; >> + >> + if ( dev_iommu_fwspec_get(dev) ) >> + return -EEXIST; >> + >> + np = pci_find_host_bridge_node(pdev); >> + if ( !np ) >> + return -ENODEV; >> + >> + /* >> + * The driver which supports generic PCI-IOMMU DT bindings must have >> + * these callback implemented. >> + */ >> + if ( !ops->dt_xlate ) >> + return -EINVAL; > See my comment in previous patch. This could be moved to iommu_dt_xlate(). OK >> + >> + /* >> + * According to the Documentation/devicetree/bindings/pci/pci-iommu.txt >> + * from Linux. >> + */ >> + rc = iommu_dt_pci_map_id(np, PCI_BDF(pdev->bus, pdev->devfn), "iommu-map", >> + "iommu-map-mask", &iommu_spec.np, iommu_spec.args); >> + if ( rc ) >> + return rc == -ENODEV ? NO_IOMMU : rc; >> + >> + rc = iommu_dt_xlate(dev, &iommu_spec); >> + if ( rc < 0 ) >> + { >> + iommu_fwspec_free(dev); >> + return -EINVAL; >> + } >> + >> + return rc; >> +} >> +#endif /* CONFIG_HAS_PCI */ >> + >> int iommu_add_dt_device(struct dt_device_node *np) >> { >> const struct iommu_ops *ops = iommu_get_ops(); >> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h >> index c1e4751a581f..dc40fdfb9231 100644 >> --- a/xen/include/xen/device_tree.h >> +++ b/xen/include/xen/device_tree.h >> @@ -852,6 +852,31 @@ int dt_count_phandle_with_args(const struct dt_device_node *np, >> */ >> int dt_get_pci_domain_nr(struct dt_device_node *node); >> >> +#ifdef CONFIG_HAS_PCI >> +/** >> + * iommu_dt_pci_map_id - Translate an ID through a downstream mapping. >> + * @np: root complex device node. >> + * @id: device ID to map. >> + * @map_name: property name of the map to use. >> + * @map_mask_name: optional property name of the mask to use. >> + * @target: optional pointer to a target device node. >> + * @id_out: optional pointer to receive the translated ID. >> + * >> + * Given a device ID, look up the appropriate implementation-defined >> + * platform ID and/or the target device which receives transactions on that >> + * ID, as per the "iommu-map" and "msi-map" bindings. Either of @target or >> + * @id_out may be NULL if only the other is required. If @target points to >> + * a non-NULL device node pointer, only entries targeting that node will be >> + * matched; if it points to a NULL value, it will receive the device node of >> + * the first matching target phandle, with a reference held. >> + * >> + * Return: 0 on success or a standard error code on failure. >> + */ >> +int iommu_dt_pci_map_id(const struct dt_device_node *np, uint32_t id, >> + const char *map_name, const char *map_mask_name, >> + struct dt_device_node **target, uint32_t *id_out); > Why is the iommu function prototype in device_tree.h and not in iommu.h where all rest of the iommu > dt related prototypes are placed? The function was borrowed from Linux's of_map_id, and it can be used to parse both "iommu-map" and "msi-map" properties. So it is not necessarily iommu specific. I'm considering renaming it back to dt_map_id in v4. > Furthermore, do you need to expose globally this function? > Looking at this series it could be static as it is only used by iommu_add_dt_pci_sideband_ids(). > Will there be any use of it later on? Not in this series, but in the future MSI series it will be used with the "msi-map" binding. See [1] [1] https://gitlab.com/xen-project/people/bmarquis/xen-arm-poc/-/commit/a8af686c327c2f2bdde321027abfda0b9ba4469c#15fe36652149e3439a60b50d9672982ca3de755e_290_301
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c index 1b50f4670944..d568166e19ec 100644 --- a/xen/drivers/passthrough/device_tree.c +++ b/xen/drivers/passthrough/device_tree.c @@ -151,6 +151,146 @@ static int iommu_dt_xlate(struct device *dev, return ops->dt_xlate(dev, iommu_spec); } +#ifdef CONFIG_HAS_PCI +int iommu_dt_pci_map_id(const struct dt_device_node *np, uint32_t id, + const char *map_name, const char *map_mask_name, + struct dt_device_node **target, uint32_t *id_out) +{ + uint32_t map_mask, masked_id, map_len; + const __be32 *map = NULL; + + if ( !np || !map_name || (!target && !id_out) ) + return -EINVAL; + + map = dt_get_property(np, map_name, &map_len); + if ( !map ) + { + if ( target ) + return -ENODEV; + /* Otherwise, no map implies no translation */ + *id_out = id; + return 0; + } + + if ( !map_len || map_len % (4 * sizeof(*map)) ) + { + printk(XENLOG_ERR "%pOF: Error: Bad %s length: %d\n", np, + map_name, map_len); + return -EINVAL; + } + + /* The default is to select all bits. */ + map_mask = 0xffffffff; + + /* + * Can be overridden by "{iommu,msi}-map-mask" property. + * If of_property_read_u32() fails, the default is used. + */ + if ( map_mask_name ) + dt_property_read_u32(np, map_mask_name, &map_mask); + + masked_id = map_mask & id; + for ( ; (int)map_len > 0; map_len -= 4 * sizeof(*map), map += 4 ) + { + struct dt_device_node *phandle_node; + uint32_t id_base = be32_to_cpup(map + 0); + uint32_t phandle = be32_to_cpup(map + 1); + uint32_t out_base = be32_to_cpup(map + 2); + uint32_t id_len = be32_to_cpup(map + 3); + + if ( id_base & ~map_mask ) + { + printk(XENLOG_ERR "%pOF: Invalid %s translation - %s-mask (0x%x) ignores id-base (0x%x)\n", + np, map_name, map_name, map_mask, id_base); + return -EFAULT; + } + + if ( masked_id < id_base || masked_id >= id_base + id_len ) + continue; + + phandle_node = dt_find_node_by_phandle(phandle); + if ( !phandle_node ) + return -ENODEV; + + if ( target ) + { + if ( !*target ) + *target = phandle_node; + + if ( *target != phandle_node ) + continue; + } + + if ( id_out ) + *id_out = masked_id - id_base + out_base; + + printk(XENLOG_DEBUG "%pOF: %s, using mask %08x, id-base: %08x, out-base: %08x, length: %08x, id: %08x -> %08x\n", + np, map_name, map_mask, id_base, out_base, id_len, id, + masked_id - id_base + out_base); + return 0; + } + + printk(XENLOG_ERR "%pOF: no %s translation for id 0x%x on %pOF\n", + np, map_name, id, target && *target ? *target : NULL); + + /* + * NOTE: Linux bypasses translation without returning an error here, + * but should we behave in the same way on Xen? Restrict for now. + */ + return -EFAULT; +} + +int iommu_add_dt_pci_sideband_ids(struct pci_dev *pdev) +{ + const struct iommu_ops *ops = iommu_get_ops(); + struct dt_phandle_args iommu_spec = { .args_count = 1 }; + struct device *dev = pci_to_dev(pdev); + const struct dt_device_node *np; + int rc = NO_IOMMU; + + if ( !iommu_enabled ) + return NO_IOMMU; + + if ( !ops ) + return -EINVAL; + + if ( device_is_protected(dev) ) + return 0; + + if ( dev_iommu_fwspec_get(dev) ) + return -EEXIST; + + np = pci_find_host_bridge_node(pdev); + if ( !np ) + return -ENODEV; + + /* + * The driver which supports generic PCI-IOMMU DT bindings must have + * these callback implemented. + */ + if ( !ops->dt_xlate ) + return -EINVAL; + + /* + * According to the Documentation/devicetree/bindings/pci/pci-iommu.txt + * from Linux. + */ + rc = iommu_dt_pci_map_id(np, PCI_BDF(pdev->bus, pdev->devfn), "iommu-map", + "iommu-map-mask", &iommu_spec.np, iommu_spec.args); + if ( rc ) + return rc == -ENODEV ? NO_IOMMU : rc; + + rc = iommu_dt_xlate(dev, &iommu_spec); + if ( rc < 0 ) + { + iommu_fwspec_free(dev); + return -EINVAL; + } + + return rc; +} +#endif /* CONFIG_HAS_PCI */ + int iommu_add_dt_device(struct dt_device_node *np) { const struct iommu_ops *ops = iommu_get_ops(); diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h index c1e4751a581f..dc40fdfb9231 100644 --- a/xen/include/xen/device_tree.h +++ b/xen/include/xen/device_tree.h @@ -852,6 +852,31 @@ int dt_count_phandle_with_args(const struct dt_device_node *np, */ int dt_get_pci_domain_nr(struct dt_device_node *node); +#ifdef CONFIG_HAS_PCI +/** + * iommu_dt_pci_map_id - Translate an ID through a downstream mapping. + * @np: root complex device node. + * @id: device ID to map. + * @map_name: property name of the map to use. + * @map_mask_name: optional property name of the mask to use. + * @target: optional pointer to a target device node. + * @id_out: optional pointer to receive the translated ID. + * + * Given a device ID, look up the appropriate implementation-defined + * platform ID and/or the target device which receives transactions on that + * ID, as per the "iommu-map" and "msi-map" bindings. Either of @target or + * @id_out may be NULL if only the other is required. If @target points to + * a non-NULL device node pointer, only entries targeting that node will be + * matched; if it points to a NULL value, it will receive the device node of + * the first matching target phandle, with a reference held. + * + * Return: 0 on success or a standard error code on failure. + */ +int iommu_dt_pci_map_id(const struct dt_device_node *np, uint32_t id, + const char *map_name, const char *map_mask_name, + struct dt_device_node **target, uint32_t *id_out); +#endif /* CONFIG_HAS_PCI */ + struct dt_device_node *dt_find_node_by_phandle(dt_phandle handle); #ifdef CONFIG_DEVICE_TREE_DEBUG diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h index 405db59971c5..e83de1fced67 100644 --- a/xen/include/xen/iommu.h +++ b/xen/include/xen/iommu.h @@ -26,6 +26,7 @@ #include <xen/spinlock.h> #include <public/domctl.h> #include <public/hvm/ioreq.h> +#include <asm/acpi.h> #include <asm/device.h> TYPE_SAFE(uint64_t, dfn); @@ -219,7 +220,8 @@ int iommu_dt_domain_init(struct domain *d); int iommu_release_dt_devices(struct domain *d); /* - * Helper to add master device to the IOMMU using generic IOMMU DT bindings. + * Helpers to add master device to the IOMMU using generic (PCI-)IOMMU + * DT bindings. * * Return values: * 0 : device is protected by an IOMMU @@ -228,12 +230,25 @@ int iommu_release_dt_devices(struct domain *d); * (IOMMU is not enabled/present or device is not connected to it). */ int iommu_add_dt_device(struct dt_device_node *np); +int iommu_add_dt_pci_sideband_ids(struct pci_dev *pdev); int iommu_do_dt_domctl(struct xen_domctl *, struct domain *, XEN_GUEST_HANDLE_PARAM(xen_domctl_t)); +#else /* !HAS_DEVICE_TREE */ +static inline int iommu_add_dt_pci_sideband_ids(struct pci_dev *pdev) +{ + return 0; +} #endif /* HAS_DEVICE_TREE */ +static inline int iommu_add_pci_sideband_ids(struct pci_dev *pdev) +{ + if ( acpi_disabled ) + return iommu_add_dt_pci_sideband_ids(pdev); + return 0; +} + struct page_info; /*