Message ID | 1429091364-31939-2-git-send-email-Minghuan.Lian@freescale.com (mailing list archive)
---|---
State | New, archived
Hi Minghuan,

Not clear what this patch intends to do. Can you please explain the point about the SMMU isolating a limited number of device IDs?

Regards,
Varun

> -----Original Message-----
> From: linux-arm-kernel [mailto:linux-arm-kernel-bounces@lists.infradead.org] On Behalf Of Minghuan Lian
> Sent: Wednesday, April 15, 2015 3:19 PM
> To: linux-pci@vger.kernel.org
> Cc: Arnd Bergmann; Lian Minghuan-B31939; Hu Mingkai-B21284; Zang Roy-R61911; Yoder Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org
> Subject: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
>
> SMMU of some platforms can only isolate limited device ID.
> This may require that all PCI devices share the same ITS device with the
> fixed device ID. The patch adds function arch_msi_share_devid_update
> used for these platforms to update the fixed device ID and maximum MSI
> interrupts number.
>
> Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index d0374a6..be78d0a 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1169,6 +1169,15 @@ static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
>  	return 0;
>  }
>
> +void __weak
> +arch_msi_share_devid_update(struct pci_dev *pdev, u32 *dev_id, u32 *nvesc)
> +{
> +	/*
> +	 * use PCI_DEVID NOT share device ID as default
> +	 * so nothing need to do
> +	 */
> +}
> +
>  static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
>  			   int nvec, msi_alloc_info_t *info)
>  {
> @@ -1185,6 +1194,8 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
>  	dev_alias.count = nvec;
>
>  	pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias);
> +	arch_msi_share_devid_update(pdev, &dev_alias.dev_id, &dev_alias.count);
> +
>  	its = domain->parent->host_data;
>
>  	its_dev = its_find_device(its, dev_alias.dev_id);
> --
> 1.9.1
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Hi Varun,

The Freescale LS2085A SMMU uses a hit/miss mechanism on the concatenation {TBU number, stream_id}. This concatenation is assigned to a context bank that determines the translation type and format. The Isolation Context Identifier (ICID) is the main field of the stream_id, and it is what is used to hit the ITS device. We can regard the ICID as both the ITS device ID and the PCI device ID. However, there are only 64 ICIDs (0-63). With the default PCI_DEVID(bus, devfn) ((((u16)(bus)) << 8) | (devfn)), any PCI device with bus >= 1 gets an ID larger than 63, and the SMMU will miss the translation. In addition, because there are only 64 ICIDs, all the PCIe devices will use the same ICID and share the same ITS device.

Thanks,
Minghuan

> -----Original Message-----
> From: Sethi Varun-B16395
> Sent: Wednesday, April 15, 2015 7:08 PM
> To: Lian Minghuan-B31939; linux-pci@vger.kernel.org
> Cc: Arnd Bergmann; Lian Minghuan-B31939; Hu Mingkai-B21284; Zang Roy-R61911; Yoder Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org
> Subject: RE: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
>
> Hi Minghuan,
> Not clear what this patch intends to do. Can you please explain the point
> about SMMU isolating limited device ID.
>
> Regards
> Varun
>
> > -----Original Message-----
> > From: linux-arm-kernel [mailto:linux-arm-kernel-bounces@lists.infradead.org] On Behalf Of Minghuan Lian
> > Sent: Wednesday, April 15, 2015 3:19 PM
> > To: linux-pci@vger.kernel.org
> > Cc: Arnd Bergmann; Lian Minghuan-B31939; Hu Mingkai-B21284; Zang Roy-R61911; Yoder Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org
> > Subject: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
> >
> > SMMU of some platforms can only isolate limited device ID.
> > This may require that all PCI devices share the same ITS device with
> > the fixed device ID. The patch adds function
> > arch_msi_share_devid_update used for these platforms to update the
> > fixed device ID and maximum MSI interrupts number.
> >
> > Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
> > ---
> >  drivers/irqchip/irq-gic-v3-its.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index d0374a6..be78d0a 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -1169,6 +1169,15 @@ static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
> >  	return 0;
> >  }
> >
> > +void __weak
> > +arch_msi_share_devid_update(struct pci_dev *pdev, u32 *dev_id, u32 *nvesc)
> > +{
> > +	/*
> > +	 * use PCI_DEVID NOT share device ID as default
> > +	 * so nothing need to do
> > +	 */
> > +}
> > +
> >  static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
> >  			   int nvec, msi_alloc_info_t *info)
> >  {
> > @@ -1185,6 +1194,8 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
> >  	dev_alias.count = nvec;
> >
> >  	pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias);
> > +	arch_msi_share_devid_update(pdev, &dev_alias.dev_id, &dev_alias.count);
> > +
> >  	its = domain->parent->host_data;
> >
> >  	its_dev = its_find_device(its, dev_alias.dev_id);
> > --
> > 1.9.1
> >
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Hi Minghuan, Yes, deviceid=stream id (i.e. ICID + other bits). I am not sure if TBU ID would also be forwarded as a part of stream id to GIC. My understanding is that TBU ID is forwarded (as a part of the stream ID) to the TCU in case of a TBU translation miss. In case of the LS2085 PCIe controller you would have to setup the PCIe device ID to stream ID translation table. We may have to restrict the number of entries based on the available number of contexts. Regards Varun > -----Original Message----- > From: Lian Minghuan-B31939 > Sent: Wednesday, April 15, 2015 5:08 PM > To: Sethi Varun-B16395; linux-pci@vger.kernel.org > Cc: Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Yoder Stuart- > B08248; Bjorn Helgaas; Wood Scott-B07421; linux-arm- > kernel@lists.infradead.org > Subject: RE: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > Hi Varun, > > Freescale LS2085A SMMU uses in hit/miss mechanism for the concatenation > {tbu number,stream_id}. This concatenation is then assigned to a context > bank that determines the translation type and form. The Isolation Context > Identifier ICID is the main field of stream_id which will be used to hit ITS > device. We may look ICID as ITS device ID and PCI device ID. But there are > only 64 ICIDs 0 - 63. If using default PCI_DEVID(bus, devfn) ((((u16)(bus)) << > 8) | (devfn)), PCI device(bus >=1) ) ID will larger than 63. SMMU will miss this > translation. > In addition, because the ICID number is only 64, all the PCIe device will use > the same ICID and share the same ITS device. 
> > Thanks, > Minghuan > > > -----Original Message----- > > From: Sethi Varun-B16395 > > Sent: Wednesday, April 15, 2015 7:08 PM > > To: Lian Minghuan-B31939; linux-pci@vger.kernel.org > > Cc: Arnd Bergmann; Lian Minghuan-B31939; Hu Mingkai-B21284; Zang Roy- > > R61911; Yoder Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; > > linux-arm- kernel@lists.infradead.org > > Subject: RE: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > Hi Minghuan, > > Not clear what this patch intends to do. Can you please explain the > > point about SMMU isolating limited device ID. > > > > Regards > > Varun > > > > > -----Original Message----- > > > From: linux-arm-kernel [mailto:linux-arm-kernel- > > > bounces@lists.infradead.org] On Behalf Of Minghuan Lian > > > Sent: Wednesday, April 15, 2015 3:19 PM > > > To: linux-pci@vger.kernel.org > > > Cc: Arnd Bergmann; Lian Minghuan-B31939; Hu Mingkai-B21284; Zang > > > Roy- R61911; Yoder Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; > > > linux-arm- kernel@lists.infradead.org > > > Subject: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > > > SMMU of some platforms can only isolate limited device ID. > > > This may require that all PCI devices share the same ITS device with > > > the fixed device ID. The patch adds function > > > arch_msi_share_devid_update used for these platforms to update the > > > fixed device ID and maximum MSI interrupts number. 
> > > > > > Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com> > > > --- > > > drivers/irqchip/irq-gic-v3-its.c | 11 +++++++++++ > > > 1 file changed, 11 insertions(+) > > > > > > diff --git a/drivers/irqchip/irq-gic-v3-its.c > > > b/drivers/irqchip/irq-gic-v3-its.c > > > index d0374a6..be78d0a 100644 > > > --- a/drivers/irqchip/irq-gic-v3-its.c > > > +++ b/drivers/irqchip/irq-gic-v3-its.c > > > @@ -1169,6 +1169,15 @@ static int its_get_pci_alias(struct pci_dev > > > *pdev, > > > u16 alias, void *data) > > > return 0; > > > } > > > > > > +void __weak > > > +arch_msi_share_devid_update(struct pci_dev *pdev, u32 *dev_id, u32 > > > +*nvesc) { > > > + /* > > > + * use PCI_DEVID NOT share device ID as default > > > + * so nothing need to do > > > + */ > > > +} > > > + > > > static int its_msi_prepare(struct irq_domain *domain, struct device > *dev, > > > int nvec, msi_alloc_info_t *info) { @@ -1185,6 > > > +1194,8 @@ static int its_msi_prepare(struct irq_domain *domain, > > > +struct > > > device *dev, > > > dev_alias.count = nvec; > > > > > > pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias); > > > + arch_msi_share_devid_update(pdev, &dev_alias.dev_id, > > > +&dev_alias.count); > > > + > > > its = domain->parent->host_data; > > > > > > its_dev = its_find_device(its, dev_alias.dev_id); > > > -- > > > 1.9.1 > > > > > > > > > _______________________________________________ > > > linux-arm-kernel mailing list > > > linux-arm-kernel@lists.infradead.org > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Wed, 15 Apr 2015 17:49:23 +0800
Minghuan Lian <Minghuan.Lian@freescale.com> wrote:

> SMMU of some platforms can only isolate limited device ID.
> This may require that all PCI devices share the same ITS
> device with the fixed device ID. The patch adds function
> arch_msi_share_devid_update used for these platforms to update
> the fixed device ID and maximum MSI interrupts number.
>
> Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index d0374a6..be78d0a 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1169,6 +1169,15 @@ static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
>  	return 0;
>  }
>
> +void __weak
> +arch_msi_share_devid_update(struct pci_dev *pdev, u32 *dev_id, u32 *nvesc)
> +{
> +	/*
> +	 * use PCI_DEVID NOT share device ID as default
> +	 * so nothing need to do
> +	 */
> +}
> +

NAK. On top of being ugly as sin, this breaks any form of multiplatform
support. No way anything like this is going in. Guys, you really should
know better.

>  static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
>  			   int nvec, msi_alloc_info_t *info)
>  {
> @@ -1185,6 +1194,8 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
>  	dev_alias.count = nvec;
>
>  	pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias);
> +	arch_msi_share_devid_update(pdev, &dev_alias.dev_id, &dev_alias.count);
> +

See the function above? That's where the aliasing should be taken care of.

>  	its = domain->parent->host_data;
>
>  	its_dev = its_find_device(its, dev_alias.dev_id);

Thanks,

M.
Hi Marc,

Please see my comments inline.

> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> Sent: Thursday, April 16, 2015 12:37 AM
> To: Lian Minghuan-B31939
> Cc: linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Yoder Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org; Jason Cooper; Thomas Gleixner
> Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
>
> On Wed, 15 Apr 2015 17:49:23 +0800
> Minghuan Lian <Minghuan.Lian@freescale.com> wrote:
>
> > SMMU of some platforms can only isolate limited device ID.
> > This may require that all PCI devices share the same ITS device with
> > the fixed device ID. The patch adds function
> > arch_msi_share_devid_update used for these platforms to update the
> > fixed device ID and maximum MSI interrupts number.
> >
> > Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
> > ---
> >  drivers/irqchip/irq-gic-v3-its.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index d0374a6..be78d0a 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -1169,6 +1169,15 @@ static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
> >  	return 0;
> >  }
> >
> > +void __weak
> > +arch_msi_share_devid_update(struct pci_dev *pdev, u32 *dev_id, u32 *nvesc)
> > +{
> > +	/*
> > +	 * use PCI_DEVID NOT share device ID as default
> > +	 * so nothing need to do
> > +	 */
> > +}
> > +
>
> NAK. On top of being ugly as sin, this breaks any form of multiplatform
> support. No way anything like this is going in. Guys, you really should
> know better.
>

[Minghuan] The current ITS MSI code creates an individual ITS device for each PCIe device and uses PCI_DEVID as the ITS dev_id. However, our platform only supports ITS dev_id 0-63. A normal PCIe DEVID such as 0000:01:00.0 is 256, which is larger than 63. Besides, because of the limited number of dev_ids, all the PCIe devices will share the same ITS device. Our platform provides a hardware module (a LUT) to map a PCI DEVID to an ITS dev_id. So, when creating the ITS device, we need to update dev_id and nvesc. I may change pci_for_each_dma_alias to add a new flag to use alias_bus and alias_devfn. But I also need to update nvesc, which should cover all the PCI devices' MSI/MSI-X vectors plus the PCIe PME and AER interrupts. The main difference is that we need one ITS device to service multiple PCIe devices. Could you give me some suggestions on how to implement this requirement?

> >  static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
> >  			   int nvec, msi_alloc_info_t *info)
> >  {
> > @@ -1185,6 +1194,8 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
> >  	dev_alias.count = nvec;
> >
> >  	pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias);
> > +	arch_msi_share_devid_update(pdev, &dev_alias.dev_id, &dev_alias.count);
> > +
>
> See the function above? That's where the aliasing should be taken care of.
>

[Minghuan] The alias uses dma_alias_devfn, but it does not contain alias_bus. I need to translate the PCI_DEVID to a fixed ID.

> >  	its = domain->parent->host_data;
> >
> >  	its_dev = its_find_device(its, dev_alias.dev_id);
>
> Thanks,
>
> M.
> --
> Jazz is not dead. It just smells funny.
On 16/04/15 03:57, Minghuan.Lian@freescale.com wrote:
> Hi Marc,
>
> Please see my comments inline.
>
>> -----Original Message-----
>> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
>> Sent: Thursday, April 16, 2015 12:37 AM
>> To: Lian Minghuan-B31939
>> Cc: linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Yoder Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org; Jason Cooper; Thomas Gleixner
>> Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
>>
>> On Wed, 15 Apr 2015 17:49:23 +0800 Minghuan Lian
>> <Minghuan.Lian@freescale.com> wrote:
>>
>>> SMMU of some platforms can only isolate limited device ID. This
>>> may require that all PCI devices share the same ITS device with
>>> the fixed device ID. The patch adds function
>>> arch_msi_share_devid_update used for these platforms to update
>>> the fixed device ID and maximum MSI interrupts number.
>>>
>>> Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
>>> ---
>>> drivers/irqchip/irq-gic-v3-its.c | 11 +++++++++++
>>> 1 file changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>>> index d0374a6..be78d0a 100644
>>> --- a/drivers/irqchip/irq-gic-v3-its.c
>>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>>> @@ -1169,6 +1169,15 @@ static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
>>> 	return 0;
>>> }
>>>
>>> +void __weak
>>> +arch_msi_share_devid_update(struct pci_dev *pdev, u32 *dev_id, u32 *nvesc)
>>> +{
>>> +	/*
>>> +	 * use PCI_DEVID NOT share device ID as default
>>> +	 * so nothing need to do
>>> +	 */
>>> +}
>>> +
>>
>> NAK. On top of being ugly as sin, this breaks any form of
>> multiplatform support. No way anything like this is going in. Guys,
>> you really should know better.
>>
> [Minghuan] The current ITS MSI code creates an individual ITS device
> for each PCIe device and uses PCI_DEVID as the ITS dev_id. However,
> our platform only supports ITS dev_id 0-63. A normal PCIe DEVID such
> as 0000:01:00.0 is 256, which is larger than 63. Besides, because of
> the limited number of dev_ids, all the PCIe devices will share the
> same ITS device. Our platform provides a hardware module (a LUT) to
> map a PCI DEVID to an ITS dev_id. So, when creating the ITS device,
> we need to update dev_id and nvesc. I may change
> pci_for_each_dma_alias to add a new flag to use alias_bus and
> alias_devfn.

Yes, that's where you should take care of this hack.

> But I also need to update nvesc, which should cover all the PCI
> devices' MSI/MSI-X vectors plus the PCIe PME and AER interrupts. The
> main difference is that we need one ITS device to service multiple
> PCIe devices. Could you give me some suggestions on how to implement
> this requirement?

If you take the time to read the code in its_msi_prepare(), you'll
quickly notice that we already handle aliasing of multiple PCI devices
to a single DeviceID. Once you have the aliasing taken care of in
pci_for_each_dma_alias, the ITS code will automatically adjust the
number of vectors (using its_get_pci_alias as a callback from
pci_for_each_dma_alias).

>>> static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
>>> 			   int nvec, msi_alloc_info_t *info)
>>> {
>>> @@ -1185,6 +1194,8 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
>>> 	dev_alias.count = nvec;
>>>
>>> 	pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias);
>>> +	arch_msi_share_devid_update(pdev, &dev_alias.dev_id, &dev_alias.count);
>>> +
>>
>> See the function above? That's where the aliasing should be taken
>> care of.
>>
> [Minghuan] The alias uses dma_alias_devfn, but it does not contain
> alias_bus. I need to translate the PCI_DEVID to a fixed ID.

Then add another flag to deal with that. Your hardware is "creative"
(some might even argue it is broken), so deal with the creativity as a
quirk, which has no business in the ITS driver (or any other driver).

Thanks,

M.
Hi Varun,

Thanks for adding me to Cc.

On Wed, Apr 15, 2015 at 02:18:13PM +0100, Varun Sethi wrote:
> Yes, deviceid=stream id (i.e. ICID + other bits). I am not sure if TBU ID
> would also be forwarded as a part of stream id to GIC. My understanding is
> that TBU ID is forwarded (as a part of the stream ID) to the TCU in case
> of a TBU translation miss. In case of the LS2085 PCIe controller you would
> have to setup the PCIe device ID to stream ID translation table. We may
> have to restrict the number of entries based on the available number of
> contexts.

Unfortunately, I'm having a really hard time parsing this thread (some
parts of it simply don't make sense; others use non-architectural terms
and overall I don't get a feeling for the problem).

Please could you explain your system design step by step so that I can
understand (a) what you've built and (b) why the current design of Linux
is causing you problems?

Sorry if I'm just being thick, but it's important that we get this right.

Cheers,

Will
Hi Marc,

Thanks for your comments. I still have a couple of questions:

1. If I add a new flag and all PCI DEVIDs are replaced with a fixed ID, can the SMMU still work correctly? For example, if we insert an Ethernet card with two ports, and one port is assigned to kvm1 while the other is assigned to kvm2, should we use one fixed ID or two?

2. With a PCIe device that supports SR-IOV, the driver will dynamically create new PCIe devices. By that time the ITS device may already have been created for the PF, so we must create the ITS device in advance with a large enough nvesc. However, its_get_pci_alias (dev_alias->count += its_pci_msi_vec_count(dev_alias->pdev)) only counts the existing PCIe devices, which cannot meet this requirement. How can dev_alias->count be changed?

Thanks,
Minghuan

> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> Sent: Thursday, April 16, 2015 6:04 PM
> To: Lian Minghuan-B31939
> Cc: linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Yoder Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org; Jason Cooper; Thomas Gleixner
> Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
>
> On 16/04/15 03:57, Minghuan.Lian@freescale.com wrote:
> > Hi Marc,
> >
> > Please see my comments inline.
> >
> >> -----Original Message-----
> >> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> >> Sent: Thursday, April 16, 2015 12:37 AM
> >> To: Lian Minghuan-B31939
> >> Cc: linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Yoder Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org; Jason Cooper; Thomas Gleixner
> >> Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
> >>
> >> On Wed, 15 Apr 2015 17:49:23 +0800 Minghuan Lian
> >> <Minghuan.Lian@freescale.com> wrote:
> >>
> >>> SMMU of some platforms can only isolate limited device ID. This may
> >>> require that all PCI devices share the same ITS device with the
> >>> fixed device ID. The patch adds function arch_msi_share_devid_update
> >>> used for these platforms to update the fixed device ID and maximum
> >>> MSI interrupts number.
> >>>
> >>> Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
> >>> ---
> >>> drivers/irqchip/irq-gic-v3-its.c | 11 +++++++++++
> >>> 1 file changed, 11 insertions(+)
> >>>
> >>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> >>> index d0374a6..be78d0a 100644
> >>> --- a/drivers/irqchip/irq-gic-v3-its.c
> >>> +++ b/drivers/irqchip/irq-gic-v3-its.c
> >>> @@ -1169,6 +1169,15 @@ static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
> >>> 	return 0;
> >>> }
> >>>
> >>> +void __weak
> >>> +arch_msi_share_devid_update(struct pci_dev *pdev, u32 *dev_id, u32 *nvesc)
> >>> +{
> >>> +	/*
> >>> +	 * use PCI_DEVID NOT share device ID as default
> >>> +	 * so nothing need to do
> >>> +	 */
> >>> +}
> >>> +
> >>
> >> NAK. On top of being ugly as sin, this breaks any form of
> >> multiplatform support. No way anything like this is going in. Guys,
> >> you really should know better.
> >>
> > [Minghuan] The current ITS MSI code creates an individual ITS device
> > for each PCIe device and uses PCI_DEVID as the ITS dev_id. However,
> > our platform only supports ITS dev_id 0-63. A normal PCIe DEVID such
> > as 0000:01:00.0 is 256, which is larger than 63. Besides, because of
> > the limited number of dev_ids, all the PCIe devices will share the
> > same ITS device. Our platform provides a hardware module (a LUT) to
> > map a PCI DEVID to an ITS dev_id. So, when creating the ITS device,
> > we need to update dev_id and nvesc. I may change
> > pci_for_each_dma_alias to add a new flag to use alias_bus and
> > alias_devfn.
>
> Yes, that's where you should take care of this hack.
>
> > But I also need to update nvesc, which should cover all the PCI
> > devices' MSI/MSI-X vectors plus the PCIe PME and AER interrupts. The
> > main difference is that we need one ITS device to service multiple
> > PCIe devices. Could you give me some suggestions on how to implement
> > this requirement?
>
> If you take the time to read the code in its_msi_prepare(), you'll
> quickly notice that we already handle aliasing of multiple PCI devices
> to a single DeviceID. Once you have the aliasing taken care of in
> pci_for_each_dma_alias, the ITS code will automatically adjust the
> number of vectors (using its_get_pci_alias as a callback from
> pci_for_each_dma_alias).
>
> >>> static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
> >>> 			   int nvec, msi_alloc_info_t *info)
> >>> {
> >>> @@ -1185,6 +1194,8 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
> >>> 	dev_alias.count = nvec;
> >>>
> >>> 	pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias);
> >>> +	arch_msi_share_devid_update(pdev, &dev_alias.dev_id, &dev_alias.count);
> >>> +
> >>
> >> See the function above? That's where the aliasing should be taken
> >> care of.
> >>
> > [Minghuan] The alias uses dma_alias_devfn, but it does not contain
> > alias_bus. I need to translate the PCI_DEVID to a fixed ID.
>
> Then add another flag to deal with that. Your hardware is "creative"
> (some might even argue it is broken), so deal with the creativity as a
> quirk, which has no business in the ITS driver (or any other driver).
>
> Thanks,
>
> M.
> --
> Jazz is not dead. It just smells funny...
Minghuan,

On 16/04/15 11:57, Minghuan.Lian@freescale.com wrote:

Please include the relevant people (Will Deacon for questions about the
SMMU), and avoid top posting; it makes it very hard to follow a
discussion.

> I still have a couple of questions:
> 1. If I add a new flag and all PCI DEVIDs are replaced with a fixed
> ID, can the SMMU still work correctly? For example, if we insert an
> Ethernet card with two ports, and one port is assigned to kvm1 while
> the other is assigned to kvm2, should we use one fixed ID or two?

That completely depends on what the SMMU sees. If it sees the same ID,
you won't be able to perform any form of isolation. But I have no idea
how your hardware works, so maybe explaining in detail how this is
constructed will help.

> 2. With a PCIe device that supports SR-IOV, the driver will
> dynamically create new PCIe devices. By that time the ITS device may
> already have been created for the PF, so we must create the ITS
> device in advance with a large enough nvesc. However,
> its_get_pci_alias (dev_alias->count +=
> its_pci_msi_vec_count(dev_alias->pdev)) only counts the existing PCIe
> devices, which cannot meet this requirement. How can dev_alias->count
> be changed?

You cannot change this dynamically. The ITT for this DeviceID will have
been allocated, and the ITS is getting live traffic. No way you're
going to resize it. You will probably have to add some extra quirks to
force the allocation of a much larger ITT by lying about the nvec
somehow. Or forget about SRIOV and hotplug.

M.
Hi Minghuan, > -----Original Message----- > From: linux-arm-kernel [mailto:linux-arm-kernel- > bounces@lists.infradead.org] On Behalf Of Marc Zyngier > Sent: Thursday, April 16, 2015 5:21 PM > To: Lian Minghuan-B31939 > Cc: Jason Cooper; Arnd Bergmann; linux-pci@vger.kernel.org; Will Deacon; > Yoder Stuart-B08248; Wood Scott-B07421; Bjorn Helgaas; Hu Mingkai-B21284; > Zang Roy-R61911; Thomas Gleixner; linux-arm-kernel@lists.infradead.org > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > Minghuan, > > On 16/04/15 11:57, Minghuan.Lian@freescale.com wrote: > > Please include the relevant people (Will Deacon for questions about the > SMMU), and avoid top posting, it makes it very hard to follow a discussion. > > > I still have a couple of questions, > > 1. If I add a new flag and all PCI DEVID will be replaced with a fixed > > ID, does SMMU can work correctly? For example, If we insert an > > Ethernet card with two ports, a port is assigned to kvm1, another port > > is assigned to kvm2. In this case, we should use one or two fixed ID? > > That completely depends on what the SMMU sees. If it sees the same ID, > you won't be able to perform any form of isolation. But I have no idea how > your hardware works, so maybe explaining in details how this is constructed > will help. > We would need separate stream IDs and iommu groups for endpoints, to assign them to different VMs. We would have to set up the stream ID to device ID translation table during device probe, via add_device callback. While setting the device id for the interrupt translation table, you would have to get streamID corresponding to a given PCIe device ID. Again, I don't think that we should consider the TBUID, while setting the ITT device id. -Varun
Hi Marc and Varun, Thank you very much for your valuable comments. I will study further and try to find a better way to support ls2085 msi. Thanks, Minghuan > -----Original Message----- > From: Sethi Varun-B16395 > Sent: Friday, April 17, 2015 2:38 AM > To: Marc Zyngier; Lian Minghuan-B31939 > Cc: Jason Cooper; Arnd Bergmann; linux-pci@vger.kernel.org; Will Deacon;T > Yoder Stuart-B08248; Wood Scott-B07421; Bjorn Helgaas; Hu Mingkai-B21284; > Zang Roy-R61911; Thomas Gleixner; linux-arm-kernel@lists.infradead.org > Subject: RE: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > Hi Minghuan, > > > -----Original Message----- > > From: linux-arm-kernel [mailto:linux-arm-kernel- > > bounces@lists.infradead.org] On Behalf Of Marc Zyngier > > Sent: Thursday, April 16, 2015 5:21 PM > > To: Lian Minghuan-B31939 > > Cc: Jason Cooper; Arnd Bergmann; linux-pci@vger.kernel.org; Will > > Deacon; Yoder Stuart-B08248; Wood Scott-B07421; Bjorn Helgaas; Hu > > Mingkai-B21284; Zang Roy-R61911; Thomas Gleixner; > > linux-arm-kernel@lists.infradead.org > > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > Minghuan, > > > > On 16/04/15 11:57, Minghuan.Lian@freescale.com wrote: > > > > Please include the relevant people (Will Deacon for questions about > > the SMMU), and avoid top posting, it makes it very hard to follow a > discussion. > > > > > I still have a couple of questions, > > > 1. If I add a new flag and all PCI DEVID will be replaced with a > > > fixed ID, does SMMU can work correctly? For example, If we insert > > > an Ethernet card with two ports, a port is assigned to kvm1, another > > > port is assigned to kvm2. In this case, we should use one or two fixed ID? > > > > That completely depends on what the SMMU sees. If it sees the same ID, > > you won't be able to perform any form of isolation. But I have no idea > > how your hardware works, so maybe explaining in details how this is > > constructed will help. 
> > > > We would need separate stream IDs and iommu groups for endpoints, to > assign them to different VMs. We would have to set up the stream ID to > device ID translation table during device probe, via add_device callback. > While setting the device id for the interrupt translation table, you would have > to get streamID corresponding to a given PCIe device ID. Again, I don't think > that we should consider the TBUID, while setting the ITT device id. > > -Varun
> -----Original Message----- > From: Will Deacon [mailto:will.deacon@arm.com] > Sent: Thursday, April 16, 2015 5:40 AM > To: Sethi Varun-B16395 > Cc: Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Yoder > Stuart-B08248; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > Hi Varun, > > Thanks for adding me to Cc. > > On Wed, Apr 15, 2015 at 02:18:13PM +0100, Varun Sethi wrote: > > Yes, deviceid=stream id (i.e. ICID + other bits). I am not sure if TBU ID > > would also be forwarded as a part of stream id to GIC. My understanding is > > that TBU ID is forwarded (as a part of the stream ID) to the TCU in case > > of a TBU translation miss. In case of the LS2085 PCIe controller you would > > have to setup the PCIe device ID to stream ID translation table. We may > > have to restrict the number of entries based on the available number of > > contexts. > > Unfortunately, I'm having a really hard time parsing this thread (some parts > of it simply don't make sense; others use non-architectural terms and > overall I don't get a feeling for the problem). > > Please could you explain your system design step by step so that I can > understand (a) what you've built and (b) why the current design of Linux is > causing you problems? > > Sorry if I'm just being thick, but it's important that we get this right. I'll try to summarize some key points about the system... System is using a single SMMU-500 (1 TCU, 6 TBUs) and GICv3-ITS. There are PCI, fsl-mc, and platform devices that do DMA. Devices on the PCI and fsl-mc bus generate message interrupts. 
The flow a message interrupt would take is this:

   --------------
    PCI device
   --------------
        |
        | pcidevid + MSI msg
        |
        V
   --------------
   PCI controller
    pcidevid ->
     streamID
     mapping
   --------------
        |
        | streamID + MSI msg
        |
        V
   --------------
       SMMU
   --------------
        |
        | streamID + MSI msg
        |
        V
   --------------
  CCN-504 (Dickens)
   --------------
        |
        | streamID + MSI msg
        |
        V
   --------------
     GICv3 ITS      streamID == ITS deviceID
   --------------

So, the way things work (at least initially) is that each PCI device
maps to a single streamID, and thus each device has a separate ITT in
the ITS.  So, things should be cool.

However, there is an improvement we envision as possible due to the
limited number of SMMU contexts (i.e. 64).  If there are 64 SMMU
context registers it means that there is a max of 64 software contexts
where things can be isolated.  But, say I have an SRIOV card with 64
VFs, and I want to assign 8 of the VFs to a KVM VM.  Those 8 PCI
devices could share the same streamID/ITS-device-ID since they all
share the same isolation context.

What would be nice is, at the time the 8 VFs are being added to the
IOMMU domain, for the pcidevid -> streamID mapping table to be updated
dynamically.  It simply lets us make more efficient use of the limited
streamIDs we have.

I think it is this improvement that Minghuan had in mind in this patch.

Thanks,
Stuart
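[As a side note for readers of the archive: the "pcidevid -> streamID
mapping" step in the PCI controller above can be modelled as a small
lookup table in which several requester IDs may alias to one streamID.
This is only an illustrative sketch; the names, table size, and layout
are invented, not the real LS2085 programming model.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a programmable "PCI ReqID -> streamID" table
 * like the one described in the thread.  Several ReqIDs may map to
 * the same streamID (shared isolation context).
 */
#define LUT_ENTRIES 32

struct lut_entry {
    uint16_t reqid;     /* PCI requester ID: bus << 8 | devfn */
    uint16_t streamid;  /* SMMU streamID == ITS deviceID on this SoC */
    int      valid;
};

static struct lut_entry lut[LUT_ENTRIES];

/* Program one translation; returns 0 on success, -1 if the table is full. */
static int lut_map(uint16_t reqid, uint16_t streamid)
{
    for (int i = 0; i < LUT_ENTRIES; i++) {
        if (!lut[i].valid) {
            lut[i].reqid = reqid;
            lut[i].streamid = streamid;
            lut[i].valid = 1;
            return 0;
        }
    }
    return -1;
}

/* What the controller would emit upstream for a given master. */
static int lut_lookup(uint16_t reqid)
{
    for (int i = 0; i < LUT_ENTRIES; i++)
        if (lut[i].valid && lut[i].reqid == reqid)
            return lut[i].streamid;
    return -1; /* no translation programmed */
}
```

Two VFs programmed to the same streamID land in the same SMMU context
and the same ITS deviceID, which is exactly the sharing discussed below.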
Hi Stuart/Will, > -----Original Message----- > From: Yoder Stuart-B08248 > Sent: Friday, April 17, 2015 7:49 PM > To: Will Deacon; Sethi Varun-B16395 > Cc: Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd Bergmann; Hu > Mingkai-B21284; Zang Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; linux- > arm-kernel@lists.infradead.org > Subject: RE: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > > -----Original Message----- > > From: Will Deacon [mailto:will.deacon@arm.com] > > Sent: Thursday, April 16, 2015 5:40 AM > > To: Sethi Varun-B16395 > > Cc: Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd Bergmann; Hu > > Mingkai-B21284; Zang Roy-R61911; Yoder Stuart-B08248; Bjorn Helgaas; > > Wood Scott-B07421; linux-arm-kernel@lists.infradead.org > > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > Hi Varun, > > > > Thanks for adding me to Cc. > > > > On Wed, Apr 15, 2015 at 02:18:13PM +0100, Varun Sethi wrote: > > > Yes, deviceid=stream id (i.e. ICID + other bits). I am not sure if > > > TBU ID would also be forwarded as a part of stream id to GIC. My > > > understanding is that TBU ID is forwarded (as a part of the stream > > > ID) to the TCU in case of a TBU translation miss. In case of the > > > LS2085 PCIe controller you would have to setup the PCIe device ID to > > > stream ID translation table. We may have to restrict the number of > > > entries based on the available number of contexts. > > > > Unfortunately, I'm having a really hard time parsing this thread (some > > parts of it simply don't make sense; others use non-architectural > > terms and overall I don't get a feeling for the problem). > > > > Please could you explain your system design step by step so that I can > > understand (a) what you've built and (b) why the current design of > > Linux is causing you problems? > > > > Sorry if I'm just being thick, but it's important that we get this right. > > I'll try to summarize some key points about the system... 
> > System is using a single SMMU-500 (1 TCU, 6 TBUs) and GICv3-ITS. There are > PCI, fsl-mc, and platform devices that do DMA. Devices on the PCI and fsl-mc > bus generate message interrupts. > > The flow a message interrupt would take is this: > > -------------- > PCI device > -------------- > | > | pcidevid + MSI msg > | > V > -------------- > PCI controller > pcidevid -> > streamID > mapping > -------------- > | > | streamID + MSI msg > | > V > -------------- > SMMU > -------------- > | > | streamID + MSI msg > | > V > -------------- > CCN-504 (Dickens) > -------------- > | > | streamID + MSI msg > | > V > -------------- > GICv3 ITS streamID == ITS deviceID > -------------- > > So, the way things work (at least initially) is that each PCI device maps to a > single streamID, and thus each device has a separate ITT in the ITS. So, things > should be cool. > > However, there is an improvement we envision as possible due to the > limited number of SMMU contexts (i.e. 64). If there are > 64 SMMU context registers it means that there is a max of > 64 software contexts where things can be isolated. But, say I have an SRIOV > card with 64 VFs, and I want to assign 8 of the VFs to a KVM VM. Those 8 PCI > devices could share the same streamID/ITS-device-ID since they all share the > same isolation context. > > What would be nice is at the time the 8 VFS are being added to the IOMMU > domain is for the pcidevid -> streamID mapping table to be updated > dynamically. It simply lets us make more efficient use of the limited > streamIDs we have. > i.e. we set up a common stream id for PCIe devices while attaching them to a domain. This would require traversal of PCIe bus topology in order to determine the dma alias during attach_device. In case of a transparent host bridge we would have to consider the device id corresponding to the bridge. I think it would be better if we can statically restrict number of permissible stream Ids per PCIe controller. 
We can set up the stream ID to device ID mapping during device probe
itself (add_device or of_xlate).  For the ITS setup, we would have to
translate the device ID to the associated stream ID.  What would happen
in the case of a transparent host bridge: would all the devices behind
the bridge share the same stream ID?

-Varun
Hi Stuart,

First off, thanks for taking the time to explain this in more detail.
Comments inline.

On Fri, Apr 17, 2015 at 03:19:08PM +0100, Stuart Yoder wrote:
> > On Wed, Apr 15, 2015 at 02:18:13PM +0100, Varun Sethi wrote:
> > > Yes, deviceid=stream id (i.e. ICID + other bits). I am not sure if TBU ID
> > > would also be forwarded as a part of stream id to GIC. My understanding is
> > > that TBU ID is forwarded (as a part of the stream ID) to the TCU in case
> > > of a TBU translation miss. In case of the LS2085 PCIe controller you would
> > > have to setup the PCIe device ID to stream ID translation table. We may
> > > have to restrict the number of entries based on the available number of
> > > contexts.
> >
> > Unfortunately, I'm having a really hard time parsing this thread (some parts
> > of it simply don't make sense; others use non-architectural terms and
> > overall I don't get a feeling for the problem).
> >
> > Please could you explain your system design step by step so that I can
> > understand (a) what you've built and (b) why the current design of Linux is
> > causing you problems?
> >
> > Sorry if I'm just being thick, but it's important that we get this right.
>
> I'll try to summarize some key points about the system...
>
> System is using a single SMMU-500 (1 TCU, 6 TBUs) and GICv3-ITS. There are
> PCI, fsl-mc, and platform devices that do DMA. Devices on the PCI and
> fsl-mc bus generate message interrupts.

Ah cool, so you have multiple buses sharing a single SMMU? That's going
to necessitate some ID remapping in the device-tree.
Perhaps you could comment on Mark Rutland's proposal if it does/doesn't
work for you:

http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/333199.html

> The flow a message interrupt would take is this:
>
>    --------------
>     PCI device
>    --------------
>         |
>         | pcidevid + MSI msg
>         |
>         V
>    --------------
>    PCI controller
>     pcidevid ->
>      streamID
>      mapping
>    --------------
>         |
>         | streamID + MSI msg
>         |
>         V
>    --------------
>        SMMU
>    --------------
>         |
>         | streamID + MSI msg
>         |
>         V
>    --------------
>   CCN-504 (Dickens)
>    --------------
>         |
>         | streamID + MSI msg
>         |
>         V

The streamID here is the same as the one coming out of the SMMU, right?
(just trying to work out why you have the CCN-504 in the picture).

>    --------------
>      GICv3 ITS      streamID == ITS deviceID
>    --------------
>
> So, the way things work (at least initially) is that each PCI device maps
> to a single streamID, and thus each device has a separate ITT in
> the ITS. So, things should be cool.
>
> However, there is an improvement we envision as possible due to
> the limited number of SMMU contexts (i.e. 64). If there are
> 64 SMMU context registers it means that there is a max of
> 64 software contexts where things can be isolated. But, say I have
> an SRIOV card with 64 VFs, and I want to assign 8 of the VFs
> to a KVM VM. Those 8 PCI devices could share the same
> streamID/ITS-device-ID since they all share the same isolation
> context.
>
> What would be nice is at the time the 8 VFS are being added
> to the IOMMU domain is for the pcidevid -> streamID mapping
> table to be updated dynamically. It simply lets us make
> more efficient use of the limited streamIDs we have.
>
> I think it is this improvement that Minghuan had in mind
> in this patch.

Ok, but in this case it should be possible to use a single context bank
for all of the VF streamIDs by configuring the appropriate SMR, no?
Wouldn't that sort of thing be preferable to dynamic StreamID assignment?
It would certainly make life easier for the MSIs.

Will
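[Archive note: Will's suggestion of one SMR covering all the VF
streamIDs relies on SMMU stream matching. An SMR carries an ID and a
MASK, and ID bits whose MASK bit is set are ignored during the
comparison, so a naturally aligned block of streamIDs can hit a single
SMR and hence a single context bank. A minimal sketch of the match rule
follows; the function name is invented.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of ARM SMMU Stream Match Register (SMR) semantics:
 * only the streamID bits NOT set in the mask take part in the match,
 * so one SMR can cover an aligned group of streamIDs.
 */
static int smr_matches(uint16_t smr_id, uint16_t smr_mask, uint16_t streamid)
{
    return ((smr_id ^ streamid) & ~smr_mask) == 0;
}
```

For example, an SMR with id 0x40 and mask 0x7 matches streamIDs
0x40-0x47, i.e. eight VFs sharing one context bank.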
Hi Will, > -----Original Message----- > From: Will Deacon [mailto:will.deacon@arm.com] > Sent: Wednesday, April 22, 2015 10:37 PM > To: Yoder Stuart-B08248 > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; linux-pci@vger.kernel.org; > Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Bjorn Helgaas; Wood > Scott-B07421; linux-arm-kernel@lists.infradead.org > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > Hi Stuart, > > First of, thanks for taking the time to explain this in more detail. > Comments inline. > > On Fri, Apr 17, 2015 at 03:19:08PM +0100, Stuart Yoder wrote: > > > On Wed, Apr 15, 2015 at 02:18:13PM +0100, Varun Sethi wrote: > > > > Yes, deviceid=stream id (i.e. ICID + other bits). I am not sure if > > > > TBU ID would also be forwarded as a part of stream id to GIC. My > > > > understanding is that TBU ID is forwarded (as a part of the stream > > > > ID) to the TCU in case of a TBU translation miss. In case of the > > > > LS2085 PCIe controller you would have to setup the PCIe device ID > > > > to stream ID translation table. We may have to restrict the number > > > > of entries based on the available number of contexts. > > > > > > Unfortunately, I'm having a really hard time parsing this thread > > > (some parts of it simply don't make sense; others use > > > non-architectural terms and overall I don't get a feeling for the problem). > > > > > > Please could you explain your system design step by step so that I > > > can understand (a) what you've built and (b) why the current design > > > of Linux is causing you problems? > > > > > > Sorry if I'm just being thick, but it's important that we get this right. > > > > I'll try to summarize some key points about the system... > > > > System is using a single SMMU-500 (1 TCU, 6 TBUs) and GICv3-ITS. > > There are PCI, fsl-mc, and platform devices that do DMA. Devices on > > the PCI and fsl-mc bus generate message interrupts. 
> > Ah cool, so you have multiple buses sharing a single SMMU? That's going to > necessitate some ID remapping in the device-tree. Perhaps you could > comment on Mark Rutland's proposal if it does/doesn't work for you: > > http://lists.infradead.org/pipermail/linux-arm-kernel/2015- > March/333199.html > > > > The flow a message interrupt would take is this: > > > > -------------- > > PCI device > > -------------- > > | > > | pcidevid + MSI msg > > | > > V > > -------------- > > PCI controller > > pcidevid -> > > streamID > > mapping > > -------------- > > | > > | streamID + MSI msg > > | > > V > > -------------- > > SMMU > > -------------- > > | > > | streamID + MSI msg > > | > > V > > -------------- > > CCN-504 (Dickens) > > -------------- > > | > > | streamID + MSI msg > > | > > V > > The streamID here as the same as the one coming out of the SMMU, right? > (just trying to out why you have the CCN-504 in the picture). > > > -------------- > > GICv3 ITS streamID == ITS deviceID > > -------------- > > > > So, the way things work (at least initially) is that each PCI device > > maps to a single streamID, and thus each device has a separate ITT in > > the ITS. So, things should be cool. > > > > However, there is an improvement we envision as possible due to the > > limited number of SMMU contexts (i.e. 64). If there are > > 64 SMMU context registers it means that there is a max of > > 64 software contexts where things can be isolated. But, say I have an > > SRIOV card with 64 VFs, and I want to assign 8 of the VFs to a KVM VM. > > Those 8 PCI devices could share the same streamID/ITS-device-ID since > > they all share the same isolation context. > > > > What would be nice is at the time the 8 VFS are being added to the > > IOMMU domain is for the pcidevid -> streamID mapping table to be > > updated dynamically. It simply lets us make more efficient use of the > > limited streamIDs we have. 
> >
> > I think it is this improvement that Minghuan had in mind in this
> > patch.
>
> Ok, but in this case it should be possible to use a single context bank for all of
> the VF streamIDs by configuring the appropriate SMR, no? Wouldn't that sort
> of thing be preferable to dynamic StreamID assignment? It would certainly
> make life easier for the MSIs.
>
Yes, that would happen when all the VFs are attached to the same domain.
But it would be an issue if we have to attach devices to different
domains.

-Varun
> -----Original Message----- > From: Will Deacon [mailto:will.deacon@arm.com] > Sent: Wednesday, April 22, 2015 12:07 PM > To: Yoder Stuart-B08248 > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; > Zang Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > Hi Stuart, > > First of, thanks for taking the time to explain this in more detail. > Comments inline. > > On Fri, Apr 17, 2015 at 03:19:08PM +0100, Stuart Yoder wrote: > > > On Wed, Apr 15, 2015 at 02:18:13PM +0100, Varun Sethi wrote: > > > > Yes, deviceid=stream id (i.e. ICID + other bits). I am not sure if TBU ID > > > > would also be forwarded as a part of stream id to GIC. My understanding is > > > > that TBU ID is forwarded (as a part of the stream ID) to the TCU in case > > > > of a TBU translation miss. In case of the LS2085 PCIe controller you would > > > > have to setup the PCIe device ID to stream ID translation table. We may > > > > have to restrict the number of entries based on the available number of > > > > contexts. > > > > > > Unfortunately, I'm having a really hard time parsing this thread (some parts > > > of it simply don't make sense; others use non-architectural terms and > > > overall I don't get a feeling for the problem). > > > > > > Please could you explain your system design step by step so that I can > > > understand (a) what you've built and (b) why the current design of Linux is > > > causing you problems? > > > > > > Sorry if I'm just being thick, but it's important that we get this right. > > > > I'll try to summarize some key points about the system... > > > > System is using a single SMMU-500 (1 TCU, 6 TBUs) and GICv3-ITS. There are > > PCI, fsl-mc, and platform devices that do DMA. Devices on the PCI and > > fsl-mc bus generate message interrupts. > > Ah cool, so you have multiple buses sharing a single SMMU? 
That's going to > necessitate some ID remapping in the device-tree. Perhaps you could comment > on Mark Rutland's proposal if it does/doesn't work for you: > > http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/333199.html Thanks for the pointer, I had not seen that before. Will read and comment on it. > > The flow a message interrupt would take is this: > > > > -------------- > > PCI device > > -------------- > > | > > | pcidevid + MSI msg > > | > > V > > -------------- > > PCI controller > > pcidevid -> > > streamID > > mapping > > -------------- > > | > > | streamID + MSI msg > > | > > V > > -------------- > > SMMU > > -------------- > > | > > | streamID + MSI msg > > | > > V > > -------------- > > CCN-504 (Dickens) > > -------------- > > | > > | streamID + MSI msg > > | > > V > > The streamID here as the same as the one coming out of the SMMU, right? Yes. > (just trying to out why you have the CCN-504 in the picture). It really isn't relevant I guess, just the picture I had in my head. > > -------------- > > GICv3 ITS streamID == ITS deviceID > > -------------- > > > > So, the way things work (at least initially) is that each PCI device maps > > to a single streamID, and thus each device has a separate ITT in > > the ITS. So, things should be cool. > > > > However, there is an improvement we envision as possible due to > > the limited number of SMMU contexts (i.e. 64). If there are > > 64 SMMU context registers it means that there is a max of > > 64 software contexts where things can be isolated. But, say I have > > an SRIOV card with 64 VFs, and I want to assign 8 of the VFs > > to a KVM VM. Those 8 PCI devices could share the same > > streamID/ITS-device-ID since they all share the same isolation > > context. > > > > What would be nice is at the time the 8 VFS are being added > > to the IOMMU domain is for the pcidevid -> streamID mapping > > table to be updated dynamically. 
It simply lets us make
> > more efficient use of the limited streamIDs we have.
> >
> > I think it is this improvement that Minghuan had in mind
> > in this patch.
>
> Ok, but in this case it should be possible to use a single context bank for
> all of the VF streamIDs by configuring the appropriate SMR, no?

Yes, but there are limited SMRs.  In our case there are only
128 SMRs in SMMU-500 and we have potentially way more masters than
that.

> Wouldn't
> that sort of thing be preferable to dynamic StreamID assignment? It would
> certainly make life easier for the MSIs.

It would be preferable, but given only 128 total stream IDs and
64 context registers it's potentially an issue.  On our LS2085 SoC it is
PCI and the fsl-mc bus (see description here:
https://lkml.org/lkml/2015/3/5/795) that potentially have way
more masters than streamIDs.  So, for those buses we would essentially
view a streamID as a "context ID": each SMR is associated with
1 context bank register.

For PCI we have a programmable "PCI req ID"-to-"stream ID" mapping
table in the PCI controller that is dynamically programmable.

Looking at it like that means that we could have any number of masters
but only 64 "contexts", and since the masters are all programmable it
seems feasible to envision doing some bus/vendor specific set up when a
device is added to an IOMMU domain.  One thing that would need to be
conveyed to the SMMU driver if doing dynamic streamID setup is what
streamIDs are available to be used.

Thanks,
Stuart
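[Archive note: the dynamic setup Stuart envisions (allocate one streamID
per IOMMU domain at attach time, then reprogram the controller's
ReqID -> streamID table) could look roughly like the sketch below. All
names and structures here are hypothetical; this is not the actual SMMU
driver API.]

```c
#include <assert.h>
#include <stdint.h>

#define NUM_STREAMIDS 64   /* one per SMMU context bank on this SoC */
#define MAX_MAPPINGS  128

struct domain { int streamid; };   /* -1 = no streamID allocated yet */

static uint64_t pool;              /* bit set = streamID in use */
static struct { uint16_t reqid; int streamid; } map[MAX_MAPPINGS];
static int nmap;

static int alloc_streamid(void)
{
    for (int i = 0; i < NUM_STREAMIDS; i++)
        if (!(pool & (1ULL << i))) {
            pool |= 1ULL << i;
            return i;
        }
    return -1; /* all context banks exhausted */
}

/*
 * Attach one ReqID to a domain: every device in the domain shares the
 * domain's single streamID, so e.g. 8 VFs consume one context, not eight.
 */
static int attach_device(struct domain *d, uint16_t reqid)
{
    if (d->streamid < 0)
        d->streamid = alloc_streamid();
    if (d->streamid < 0 || nmap >= MAX_MAPPINGS)
        return -1;
    map[nmap].reqid = reqid;
    map[nmap].streamid = d->streamid;
    nmap++;
    return d->streamid;
}
```

The design point is that the streamID is a property of the isolation
context (the domain), not of the individual master.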
On Wed, Apr 22, 2015 at 08:41:02PM +0100, Stuart Yoder wrote: > > > However, there is an improvement we envision as possible due to > > > the limited number of SMMU contexts (i.e. 64). If there are > > > 64 SMMU context registers it means that there is a max of > > > 64 software contexts where things can be isolated. But, say I have > > > an SRIOV card with 64 VFs, and I want to assign 8 of the VFs > > > to a KVM VM. Those 8 PCI devices could share the same > > > streamID/ITS-device-ID since they all share the same isolation > > > context. > > > > > > What would be nice is at the time the 8 VFS are being added > > > to the IOMMU domain is for the pcidevid -> streamID mapping > > > table to be updated dynamically. It simply lets us make > > > more efficient use of the limited streamIDs we have. > > > > > > I think it is this improvement that Minghuan had in mind > > > in this patch. > > > > Ok, but in this case it should be possible to use a single context bank for > > all of the VF streamIDs by configuring the appropriate SMR, no? > > Yes, but there are limited SMRs. In our case there are only > 128 SMRs in SMMU-500 and we have potentially way more masters than > that. Right, but you still only have 64 context banks at the end of the day, so do you really anticipate having more than 128 masters concurrently using the SMMU? If so, then we have devices sharing context banks so we could consider reusing SMRs across masters, but historically that's not been something that we've managed to solve. > > Wouldn't > > that sort of thing be preferable to dynamic StreamID assignment? It would > > certainly make life easier for the MSIs. > > It would be preferable, but given only 128 total stream IDS and > 64 context registers it's potentially an issue. On our LS2085 SoC it is > PCI and the fsl-mc bus (see description here: > https://lkml.org/lkml/2015/3/5/795) that potentially have way > more masters than streamIDS. 
So, for those busses we would essentially > view a streamID as a "context ID"-- each SMR is associated with > 1 context bank register. > > For PCI we have a programmable "PCI req ID"-to-"stream ID" > mapping table in the PCI controller that is dynamically > programmable. > > Looking at it like that means that we could have > any number of masters but only 64 "contexts" > and since the masters all all programmable it's > seems feasbile to envision doing some bus/vendor > specific set up when a device is added to an > IOMMU domain. One thing that would need to be conveyed > to the SMMU driver if doing dynamic streamID setup > is what streamIDs are available to be used. Ok, but this is going to make life difficult for the MSI people, I suspect. Marc...? Will
On 24/04/15 17:18, Will Deacon wrote: > On Wed, Apr 22, 2015 at 08:41:02PM +0100, Stuart Yoder wrote: >>>> However, there is an improvement we envision as possible due to >>>> the limited number of SMMU contexts (i.e. 64). If there are >>>> 64 SMMU context registers it means that there is a max of >>>> 64 software contexts where things can be isolated. But, say I have >>>> an SRIOV card with 64 VFs, and I want to assign 8 of the VFs >>>> to a KVM VM. Those 8 PCI devices could share the same >>>> streamID/ITS-device-ID since they all share the same isolation >>>> context. >>>> >>>> What would be nice is at the time the 8 VFS are being added >>>> to the IOMMU domain is for the pcidevid -> streamID mapping >>>> table to be updated dynamically. It simply lets us make >>>> more efficient use of the limited streamIDs we have. >>>> >>>> I think it is this improvement that Minghuan had in mind >>>> in this patch. >>> >>> Ok, but in this case it should be possible to use a single context bank for >>> all of the VF streamIDs by configuring the appropriate SMR, no? >> >> Yes, but there are limited SMRs. In our case there are only >> 128 SMRs in SMMU-500 and we have potentially way more masters than >> that. > > Right, but you still only have 64 context banks at the end of the day, so do > you really anticipate having more than 128 masters concurrently using the > SMMU? If so, then we have devices sharing context banks so we could consider > reusing SMRs across masters, but historically that's not been something that > we've managed to solve. > >>> Wouldn't >>> that sort of thing be preferable to dynamic StreamID assignment? It would >>> certainly make life easier for the MSIs. >> >> It would be preferable, but given only 128 total stream IDS and >> 64 context registers it's potentially an issue. On our LS2085 SoC it is >> PCI and the fsl-mc bus (see description here: >> https://lkml.org/lkml/2015/3/5/795) that potentially have way >> more masters than streamIDS. 
So, for those busses we would essentially
>> view a streamID as a "context ID"-- each SMR is associated with
>> 1 context bank register.
>>
>> For PCI we have a programmable "PCI req ID"-to-"stream ID"
>> mapping table in the PCI controller that is dynamically
>> programmable.
>>
>> Looking at it like that means that we could have
>> any number of masters but only 64 "contexts"
>> and since the masters all all programmable it's
>> seems feasbile to envision doing some bus/vendor
>> specific set up when a device is added to an
>> IOMMU domain. One thing that would need to be conveyed
>> to the SMMU driver if doing dynamic streamID setup
>> is what streamIDs are available to be used.
>
> Ok, but this is going to make life difficult for the MSI people, I suspect.
>
> Marc...?

We're really facing two conflicting requirements: in order to minimize
SMR usage, we want to alias multiple ReqIDs to a single StreamID, but in
order to efficiently deal with MSIs, we want to see discrete DeviceIDs
(the actual ReqIDs).  I don't easily see how we reconcile the two.

We can deal with the aliasing, provided that we extend the level of
quirkiness that pci_for_each_dma_alias can deal with.  But that doesn't
solve any form of hotplug/SR-IOV behaviour.

Somehow, we're going to end up with grossly oversized ITTs, just to
accommodate the fact that we have no idea how many MSIs we're going to
end up needing.  I'm not thrilled with that prospect.

Thanks,

        M.
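[Archive note: to put rough numbers on Marc's oversizing concern, an ITS
allocates one interrupt translation table (ITT) per deviceID, sized for
the number of event IDs (MSIs) that deviceID may signal, rounded up to a
power of two. A back-of-the-envelope sketch; the entry size used here is
illustrative, as the real size is implementation defined.]

```c
#include <assert.h>
#include <stddef.h>

#define ITT_ENTRY_SIZE 8   /* bytes per entry; implementation-defined */

static size_t roundup_pow_of_two(size_t n)
{
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Memory one deviceID's ITT needs for nvecs possible MSIs. */
static size_t itt_bytes(size_t nvecs)
{
    return roundup_pow_of_two(nvecs) * ITT_ENTRY_SIZE;
}
```

With a deviceID shared by 64 VFs that may each want 32 MSI-X vectors,
the shared ITT must be sized for 64 * 32 = 2048 events up front, whereas
per-VF deviceIDs could each be sized on demand as devices appear.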
> -----Original Message-----
> From: Will Deacon [mailto:will.deacon@arm.com]
> Sent: Friday, April 24, 2015 11:18 AM
> To: Yoder Stuart-B08248
> Cc: Sethi Varun-B16395; Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284;
> Zang Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org; marc.zyngier@arm.com
> Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
>
> On Wed, Apr 22, 2015 at 08:41:02PM +0100, Stuart Yoder wrote:
> > > > However, there is an improvement we envision as possible due to
> > > > the limited number of SMMU contexts (i.e. 64). If there are
> > > > 64 SMMU context registers it means that there is a max of
> > > > 64 software contexts where things can be isolated. But, say I have
> > > > an SRIOV card with 64 VFs, and I want to assign 8 of the VFs
> > > > to a KVM VM. Those 8 PCI devices could share the same
> > > > streamID/ITS-device-ID since they all share the same isolation
> > > > context.
> > > >
> > > > What would be nice is at the time the 8 VFS are being added
> > > > to the IOMMU domain is for the pcidevid -> streamID mapping
> > > > table to be updated dynamically. It simply lets us make
> > > > more efficient use of the limited streamIDs we have.
> > > >
> > > > I think it is this improvement that Minghuan had in mind
> > > > in this patch.
> > >
> > > Ok, but in this case it should be possible to use a single context bank for
> > > all of the VF streamIDs by configuring the appropriate SMR, no?
> >
> > Yes, but there are limited SMRs. In our case there are only
> > 128 SMRs in SMMU-500 and we have potentially way more masters than
> > that.
>
> Right, but you still only have 64 context banks at the end of the day, so do
> you really anticipate having more than 128 masters concurrently using the
> SMMU?

Yes.  We anticipate quite possibly having more than 128 masters sharing
the 64 context banks.
> If so, then we have devices sharing context banks so we could consider > reusing SMRs across masters, but historically that's not been something that > we've managed to solve. But isn't that what iommu groups are all about?... multiple masters that share an IOMMU context because of the isolation characteristics of the hardware like PCI devices behind a PCI bridge. All the devices might be in one iommu group. What is not solved? Stuart
> -----Original Message----- > From: Marc Zyngier [mailto:marc.zyngier@arm.com] > Sent: Friday, April 24, 2015 11:44 AM > To: Will Deacon; Yoder Stuart-B08248 > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; > Zang Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > On 24/04/15 17:18, Will Deacon wrote: > > On Wed, Apr 22, 2015 at 08:41:02PM +0100, Stuart Yoder wrote: > >>>> However, there is an improvement we envision as possible due to > >>>> the limited number of SMMU contexts (i.e. 64). If there are > >>>> 64 SMMU context registers it means that there is a max of > >>>> 64 software contexts where things can be isolated. But, say I have > >>>> an SRIOV card with 64 VFs, and I want to assign 8 of the VFs > >>>> to a KVM VM. Those 8 PCI devices could share the same > >>>> streamID/ITS-device-ID since they all share the same isolation > >>>> context. > >>>> > >>>> What would be nice is at the time the 8 VFS are being added > >>>> to the IOMMU domain is for the pcidevid -> streamID mapping > >>>> table to be updated dynamically. It simply lets us make > >>>> more efficient use of the limited streamIDs we have. > >>>> > >>>> I think it is this improvement that Minghuan had in mind > >>>> in this patch. > >>> > >>> Ok, but in this case it should be possible to use a single context bank for > >>> all of the VF streamIDs by configuring the appropriate SMR, no? > >> > >> Yes, but there are limited SMRs. In our case there are only > >> 128 SMRs in SMMU-500 and we have potentially way more masters than > >> that. > > > > Right, but you still only have 64 context banks at the end of the day, so do > > you really anticipate having more than 128 masters concurrently using the > > SMMU? 
If so, then we have devices sharing context banks so we could consider > > reusing SMRs across masters, but historically that's not been something that > > we've managed to solve. > > > >>> Wouldn't > >>> that sort of thing be preferable to dynamic StreamID assignment? It would > >>> certainly make life easier for the MSIs. > >> > >> It would be preferable, but given only 128 total stream IDS and > >> 64 context registers it's potentially an issue. On our LS2085 SoC it is > >> PCI and the fsl-mc bus (see description here: > >> https://lkml.org/lkml/2015/3/5/795) that potentially have way > >> more masters than streamIDS. So, for those busses we would essentially > >> view a streamID as a "context ID"-- each SMR is associated with > >> 1 context bank register. > >> > >> For PCI we have a programmable "PCI req ID"-to-"stream ID" > >> mapping table in the PCI controller that is dynamically > >> programmable. > >> > >> Looking at it like that means that we could have > >> any number of masters but only 64 "contexts" > >> and since the masters all all programmable it's > >> seems feasbile to envision doing some bus/vendor > >> specific set up when a device is added to an > >> IOMMU domain. One thing that would need to be conveyed > >> to the SMMU driver if doing dynamic streamID setup > >> is what streamIDs are available to be used. > > > > Ok, but this is going to make life difficult for the MSI people, I suspect. > > > > Marc...? > > We're really facing two conflicting requirements: in order to minimize > SMR usage, we want to alias multiple ReqIDs to a single StreamID Another reason could be the isolation characteristics of the hardware...see comment below about PCI bridges. > but in > order to efficiently deal with MSIs, we want to see discrete DeviceIDs > (the actual ReqIDs). I don't easily see how we reconcile the two. > > We can deal with the aliasing, provided that we extend the level of > quirkiness that pci_for_each_dma_alias can deal with. 
But that doesn't
> solve any form of hotplug/SR-IOV behaviour.
>
> Somehow, we're going to end-up with grossly oversized ITTs, just to
> accommodate for the fact that we have no idea how many MSIs we're going
> to end-up needing. I'm not thrilled with that prospect.

How can we avoid that in the face of hotplug?  And what are we really
worried about regarding over-sized ITTs...bytes of memory saved?

A fundamental thing built into the IOMMU subsystem in Linux is
representing iommu groups that can represent things like multiple PCI
devices that for hardware reasons cannot be isolated (and the example
I've seen given relates to devices behind PCI bridges).

So, I think the thing we are facing here is that while the IOMMU
subsystem has accounted for representing the isolation characteristics
of a system with iommu groups, there is no corresponding "msi group"
concept.  In the SMMU/GIC-500-ITS world the iommu isolation ID (the
stream ID) and the GIC-ITS device ID are in fact the same ID.

Is there some way we could sanely correlate IOMMU group creation (which
establishes isolation granularity) with the creation of an ITT for the
GIC-ITS?

(I don't have a good feel for how device IDs work on x86; I know there
is an interrupt remapping table in the x86 IOMMUs that is distinct from
the memory isolation page tables.)

Thanks,
Stuart
+Alex W > -----Original Message----- > From: Yoder Stuart-B08248 > Sent: Friday, April 24, 2015 1:19 PM > To: 'Marc Zyngier'; Will Deacon > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; > Zang Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org > Subject: RE: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > > -----Original Message----- > > From: Marc Zyngier [mailto:marc.zyngier@arm.com] > > Sent: Friday, April 24, 2015 11:44 AM > > To: Will Deacon; Yoder Stuart-B08248 > > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; > > Zang Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org > > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > On 24/04/15 17:18, Will Deacon wrote: > > > On Wed, Apr 22, 2015 at 08:41:02PM +0100, Stuart Yoder wrote: > > >>>> However, there is an improvement we envision as possible due to > > >>>> the limited number of SMMU contexts (i.e. 64). If there are > > >>>> 64 SMMU context registers it means that there is a max of > > >>>> 64 software contexts where things can be isolated. But, say I have > > >>>> an SRIOV card with 64 VFs, and I want to assign 8 of the VFs > > >>>> to a KVM VM. Those 8 PCI devices could share the same > > >>>> streamID/ITS-device-ID since they all share the same isolation > > >>>> context. > > >>>> > > >>>> What would be nice is at the time the 8 VFS are being added > > >>>> to the IOMMU domain is for the pcidevid -> streamID mapping > > >>>> table to be updated dynamically. It simply lets us make > > >>>> more efficient use of the limited streamIDs we have. > > >>>> > > >>>> I think it is this improvement that Minghuan had in mind > > >>>> in this patch. 
> > >>> > > >>> Ok, but in this case it should be possible to use a single context bank for > > >>> all of the VF streamIDs by configuring the appropriate SMR, no? > > >> > > >> Yes, but there are limited SMRs. In our case there are only > > >> 128 SMRs in SMMU-500 and we have potentially way more masters than > > >> that. > > > > > > Right, but you still only have 64 context banks at the end of the day, so do > > > you really anticipate having more than 128 masters concurrently using the > > > SMMU? If so, then we have devices sharing context banks so we could consider > > > reusing SMRs across masters, but historically that's not been something that > > > we've managed to solve. > > > > > >>> Wouldn't > > >>> that sort of thing be preferable to dynamic StreamID assignment? It would > > >>> certainly make life easier for the MSIs. > > >> > > >> It would be preferable, but given only 128 total stream IDS and > > >> 64 context registers it's potentially an issue. On our LS2085 SoC it is > > >> PCI and the fsl-mc bus (see description here: > > >> https://lkml.org/lkml/2015/3/5/795) that potentially have way > > >> more masters than streamIDS. So, for those busses we would essentially > > >> view a streamID as a "context ID"-- each SMR is associated with > > >> 1 context bank register. > > >> > > >> For PCI we have a programmable "PCI req ID"-to-"stream ID" > > >> mapping table in the PCI controller that is dynamically > > >> programmable. > > >> > > >> Looking at it like that means that we could have > > >> any number of masters but only 64 "contexts" > > >> and since the masters all all programmable it's > > >> seems feasbile to envision doing some bus/vendor > > >> specific set up when a device is added to an > > >> IOMMU domain. One thing that would need to be conveyed > > >> to the SMMU driver if doing dynamic streamID setup > > >> is what streamIDs are available to be used. > > > > > > Ok, but this is going to make life difficult for the MSI people, I suspect. 
> > > > > > Marc...? > > > > We're really facing two conflicting requirements: in order to minimize > > SMR usage, we want to alias multiple ReqIDs to a single StreamID > > Another reason could be the isolation characteristics of the > hardware...see comment below about PCI bridges. > > > but in > > order to efficiently deal with MSIs, we want to see discrete DeviceIDs > > (the actual ReqIDs). I don't easily see how we reconcile the two. > > > > We can deal with the aliasing, provided that we extend the level of > > quirkiness that pci_for_each_dma_alias can deal with. But that doesn't > > solve any form of hotplug/SR-IOV behaviour. > > > > Somehow, we're going to end-up with grossly oversized ITTs, just to > > accommodate for the fact that we have no idea how many MSIs we're going > > to end-up needing. I'm not thrilled with that prospect. > > How can we avoid that in the face of hotplug? > > And what are we really worried about regarding over-sized ITTs...bytes > of memory saved? > > A fundamental thing built into the IOMMU subsystem in Linux is > representing iommu groups that can represent things like > multiple PCI devices that for hardware reasons cannot > be isolated (and the example I've seen given relates to > devices behind PCI bridges). > > So, I think the thing we are facing here is that while the > IOMMU subsystem has accounted for reprsenting the isolation > characteristics of a system with iommu groups, there is > no corresponding "msi group" concept. > > In the SMMU/GIC-500-ITS world the iommu isolation > ID (the stream ID) and the GIC-ITS device ID are in > fact the same ID. > > Is there some way we could sanely correlate IOMMU group creation > (which establishes isolation granularity) with the creation > of an ITT for the GIC-ITS? 
> > (I don't have a good feel for how device IDs work on x86, > I know there is an interrupt remapping table in the > x86 IOMMUs that is distinct from the memory isolation > page tables) For reference see Alex Williamson's description of IOMMU isolation scenarios in Documentation/vfio.txt: ... This isolation is not always at the granularity of a single device though. Even when an IOMMU is capable of this, properties of devices, interconnects, and IOMMU topologies can each reduce this isolation. For instance, an individual device may be part of a larger multi-function enclosure. While the IOMMU may be able to distinguish between devices within the enclosure, the enclosure may not require transactions between devices to reach the IOMMU. Examples of this could be anything from a multi-function PCI device with backdoors between functions to a non-PCI-ACS (Access Control Services) capable bridge allowing redirection without reaching the IOMMU. Topology can also play a factor in terms of hiding devices. A PCIe-to-PCI bridge masks the devices behind it, making transactions appear as if from the bridge itself. Obviously IOMMU design plays a major factor as well. Therefore, while for the most part an IOMMU may have device level granularity, any system is susceptible to reduced granularity. The IOMMU API therefore supports a notion of IOMMU groups. A group is a set of devices which is isolatable from all other devices in the system. Groups are therefore the unit of ownership used by VFIO. ... The isolation granularity issues he raises are going to impact how we deal with MSIs with GIC ITS because again, isolation ID and GIC ITS device ID are the same. Stuart
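[Editor's note: the vfio.txt excerpt above says a legacy PCIe-to-PCI bridge masks the requesters behind it, so DMA from any device below it carries the bridge's ReqID. Since on this hardware streamID == ITS DeviceID, that one ReqID is also the finest MSI granularity available. A toy model, with invented types and IDs:]

```c
#include <assert.h>
#include <stdint.h>

/* Toy topology model: devices behind a non-ACS bridge all alias to
 * the bridge's requester ID, which is what the IOMMU -- and hence the
 * ITS, where streamID == DeviceID -- actually observes. */

struct toy_pci_dev {
	uint16_t reqid;                    /* bus/devfn packed into one number */
	const struct toy_pci_dev *bridge;  /* legacy bridge above us, or 0 */
};

/* The requester ID that DMA (and MSI writes) from this device carry. */
static uint16_t dma_alias_reqid(const struct toy_pci_dev *dev)
{
	/* Walk up: the topmost legacy bridge takes ownership of the ID. */
	while (dev->bridge)
		dev = dev->bridge;
	return dev->reqid;
}
```

Two functions behind the same bridge end up indistinguishable, which is exactly why the iommu-group boundary and the per-DeviceID ITT boundary coincide here.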
On Fri, 24 Apr 2015 19:18:44 +0100 Stuart Yoder <stuart.yoder@freescale.com> wrote: Hi Stuart, > > > > -----Original Message----- > > From: Marc Zyngier [mailto:marc.zyngier@arm.com] > > Sent: Friday, April 24, 2015 11:44 AM > > To: Will Deacon; Yoder Stuart-B08248 > > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; > > Zang Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org > > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > On 24/04/15 17:18, Will Deacon wrote: > > > On Wed, Apr 22, 2015 at 08:41:02PM +0100, Stuart Yoder wrote: > > >>>> However, there is an improvement we envision as possible due to > > >>>> the limited number of SMMU contexts (i.e. 64). If there are > > >>>> 64 SMMU context registers it means that there is a max of > > >>>> 64 software contexts where things can be isolated. But, say I have > > >>>> an SRIOV card with 64 VFs, and I want to assign 8 of the VFs > > >>>> to a KVM VM. Those 8 PCI devices could share the same > > >>>> streamID/ITS-device-ID since they all share the same isolation > > >>>> context. > > >>>> > > >>>> What would be nice is at the time the 8 VFS are being added > > >>>> to the IOMMU domain is for the pcidevid -> streamID mapping > > >>>> table to be updated dynamically. It simply lets us make > > >>>> more efficient use of the limited streamIDs we have. > > >>>> > > >>>> I think it is this improvement that Minghuan had in mind > > >>>> in this patch. > > >>> > > >>> Ok, but in this case it should be possible to use a single context bank for > > >>> all of the VF streamIDs by configuring the appropriate SMR, no? > > >> > > >> Yes, but there are limited SMRs. In our case there are only > > >> 128 SMRs in SMMU-500 and we have potentially way more masters than > > >> that. 
> > > > > > Right, but you still only have 64 context banks at the end of the day, so do > > > you really anticipate having more than 128 masters concurrently using the > > > SMMU? If so, then we have devices sharing context banks so we could consider > > > reusing SMRs across masters, but historically that's not been something that > > > we've managed to solve. > > > > > >>> Wouldn't > > >>> that sort of thing be preferable to dynamic StreamID assignment? It would > > >>> certainly make life easier for the MSIs. > > >> > > >> It would be preferable, but given only 128 total stream IDS and > > >> 64 context registers it's potentially an issue. On our LS2085 SoC it is > > >> PCI and the fsl-mc bus (see description here: > > >> https://lkml.org/lkml/2015/3/5/795) that potentially have way > > >> more masters than streamIDS. So, for those busses we would essentially > > >> view a streamID as a "context ID"-- each SMR is associated with > > >> 1 context bank register. > > >> > > >> For PCI we have a programmable "PCI req ID"-to-"stream ID" > > >> mapping table in the PCI controller that is dynamically > > >> programmable. > > >> > > >> Looking at it like that means that we could have > > >> any number of masters but only 64 "contexts" > > >> and since the masters all all programmable it's > > >> seems feasbile to envision doing some bus/vendor > > >> specific set up when a device is added to an > > >> IOMMU domain. One thing that would need to be conveyed > > >> to the SMMU driver if doing dynamic streamID setup > > >> is what streamIDs are available to be used. > > > > > > Ok, but this is going to make life difficult for the MSI people, I suspect. > > > > > > Marc...? > > > > We're really facing two conflicting requirements: in order to minimize > > SMR usage, we want to alias multiple ReqIDs to a single StreamID > > Another reason could be the isolation characteristics of the > hardware...see comment below about PCI bridges. 
> > > but in > > order to efficiently deal with MSIs, we want to see discrete DeviceIDs > > (the actual ReqIDs). I don't easily see how we reconcile the two. > > > > We can deal with the aliasing, provided that we extend the level of > > quirkiness that pci_for_each_dma_alias can deal with. But that > > doesn't solve any form of hotplug/SR-IOV behaviour. > > > > Somehow, we're going to end-up with grossly oversized ITTs, just to > > accommodate for the fact that we have no idea how many MSIs we're > > going to end-up needing. I'm not thrilled with that prospect. > > How can we avoid that in the face of hotplug? Fortunately, hotplug is not always synonymous with aliasing. The ITS is built around the hypothesis that aliasing doesn't happen, and that you know upfront how many LPIs the device will be allowed to generate. > And what are we really worried about regarding over-sized ITTs...bytes > of memory saved? That's one thing, yes. But more fundamentally, how do you size your MSI capacity for a single alias? Do you evenly split your LPI space among all possible aliases? Assuming 64 aliases and 16 bits of interrupt ID space, you end up with 10 bits per alias. Is that always enough? Or do you need something more fine-grained? > A fundamental thing built into the IOMMU subsystem in Linux is > representing iommu groups that can represent things like > multiple PCI devices that for hardware reasons cannot > be isolated (and the example I've seen given relates to > devices behind PCI bridges). > > So, I think the thing we are facing here is that while the > IOMMU subsystem has accounted for representing the isolation > characteristics of a system with iommu groups, there is > no corresponding "msi group" concept. > > In the SMMU/GIC-500-ITS world the iommu isolation > ID (the stream ID) and the GIC-ITS device ID are in > fact the same ID. The DeviceID is the "MSI group" you mention. This is what provides isolation at the ITS level.
> Is there some way we could sanely correlate IOMMU group creation > (which establishes isolation granularity) with the creation > of an ITT for the GIC-ITS? The problem you have is that your ITT already exists before you start "hotplugging" new devices. Take the following (made up) example: System boots, device X is discovered, claims 64 MSIs. An ITT for device X is allocated, and sized for 64 LPIs. SR-IOV kicks in, creates a new X' function that is aliased to X, claiming another 64 MSIs. Fail. What do we do here? The ITT is live (X is generating interrupts), and there is no provision to resize it (I've come up with a horrible scheme, but that could fail as well). The only sane option would be to guess how many MSIs a given alias could possibly use. How wrong is this guess going to be? The problem we have is that IOMMU groups are dynamic, while ITT allocation is completely static for a given DeviceID. The architecture doesn't give you any mechanism to resize it, and I have the ugly feeling that static allocation of the ID space to aliases is too rigid... M.
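[Editor's note: the sizing arithmetic Marc gives above is worth spelling out. With a 16-bit EventID space per DeviceID, a static, even split among 64 possible aliases leaves 2^16 / 64 = 2^10 = 1024 events, i.e. 10 bits, per alias, fixed at ITT-allocation time whether or not an alias ever claims that many MSIs:]

```c
#include <assert.h>
#include <stdint.h>

/* Events available to each alias when the per-DeviceID EventID space
 * is split evenly among a fixed number of possible aliases. */
static uint32_t events_per_alias(uint32_t id_bits, uint32_t naliases)
{
	return (1u << id_bits) / naliases;
}
```

Marc's example: events_per_alias(16, 64) gives 1024, i.e. 10 bits of EventID per alias.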
Hi Marc, > -----Original Message----- > From: Marc Zyngier [mailto:marc.zyngier@arm.com] > Sent: Saturday, April 25, 2015 4:10 PM > To: Yoder Stuart-B08248 > Cc: Will Deacon; Sethi Varun-B16395; Lian Minghuan-B31939; linux- > pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; > Bjorn Helgaas; Wood Scott-B07421; linux-arm-kernel@lists.infradead.org > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > On Fri, 24 Apr 2015 19:18:44 +0100 > Stuart Yoder <stuart.yoder@freescale.com> wrote: > > Hi Stuart, > > > > > > > > -----Original Message----- > > > From: Marc Zyngier [mailto:marc.zyngier@arm.com] > > > Sent: Friday, April 24, 2015 11:44 AM > > > To: Will Deacon; Yoder Stuart-B08248 > > > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; > > > linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang > > > Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; > > > linux-arm-kernel@lists.infradead.org > > > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > > > On 24/04/15 17:18, Will Deacon wrote: > > > > On Wed, Apr 22, 2015 at 08:41:02PM +0100, Stuart Yoder wrote: > > > >>>> However, there is an improvement we envision as possible due to > > > >>>> the limited number of SMMU contexts (i.e. 64). If there are > > > >>>> 64 SMMU context registers it means that there is a max of > > > >>>> 64 software contexts where things can be isolated. But, say I > > > >>>> have an SRIOV card with 64 VFs, and I want to assign 8 of the > > > >>>> VFs to a KVM VM. Those 8 PCI devices could share the same > > > >>>> streamID/ITS-device-ID since they all share the same isolation > > > >>>> context. > > > >>>> > > > >>>> What would be nice is at the time the 8 VFS are being added to > > > >>>> the IOMMU domain is for the pcidevid -> streamID mapping table > > > >>>> to be updated dynamically. It simply lets us make more > > > >>>> efficient use of the limited streamIDs we have. 
> > > >>>> > > > >>>> I think it is this improvement that Minghuan had in mind in > > > >>>> this patch. > > > >>> > > > >>> Ok, but in this case it should be possible to use a single > > > >>> context bank for all of the VF streamIDs by configuring the > appropriate SMR, no? > > > >> > > > >> Yes, but there are limited SMRs. In our case there are only > > > >> 128 SMRs in SMMU-500 and we have potentially way more masters > > > >> than that. > > > > > > > > Right, but you still only have 64 context banks at the end of the > > > > day, so do you really anticipate having more than 128 masters > > > > concurrently using the SMMU? If so, then we have devices sharing > > > > context banks so we could consider reusing SMRs across masters, > > > > but historically that's not been something that we've managed to solve. > > > > > > > >>> Wouldn't > > > >>> that sort of thing be preferable to dynamic StreamID assignment? > > > >>> It would certainly make life easier for the MSIs. > > > >> > > > >> It would be preferable, but given only 128 total stream IDS and > > > >> 64 context registers it's potentially an issue. On our LS2085 > > > >> SoC it is PCI and the fsl-mc bus (see description here: > > > >> https://lkml.org/lkml/2015/3/5/795) that potentially have way > > > >> more masters than streamIDS. So, for those busses we would > > > >> essentially view a streamID as a "context ID"-- each SMR is > > > >> associated with > > > >> 1 context bank register. > > > >> > > > >> For PCI we have a programmable "PCI req ID"-to-"stream ID" > > > >> mapping table in the PCI controller that is dynamically > > > >> programmable. > > > >> > > > >> Looking at it like that means that we could have any number of > > > >> masters but only 64 "contexts" > > > >> and since the masters all all programmable it's seems feasbile to > > > >> envision doing some bus/vendor specific set up when a device is > > > >> added to an > > > >> IOMMU domain. 
One thing that would need to be conveyed > > > >> to the SMMU driver if doing dynamic streamID setup is what > > > >> streamIDs are available to be used. > > > > > > > > Ok, but this is going to make life difficult for the MSI people, I suspect. > > > > > > > > Marc...? > > > > > > We're really facing two conflicting requirements: in order to > > > minimize SMR usage, we want to alias multiple ReqIDs to a single > > > StreamID > > > > Another reason could be the isolation characteristics of the > > hardware...see comment below about PCI bridges. > > > > > but in > > > order to efficiently deal with MSIs, we want to see discrete > > > DeviceIDs (the actual ReqIDs). I don't easily see how we reconcile the > two. > > > > > > We can deal with the aliasing, provided that we extend the level of > > > quirkiness that pci_for_each_dma_alias can deal with. But that > > > doesn't solve any form of hotplug/SR-IOV behaviour. > > > [varun] Can you please elaborate on "extending the quirkiness of pci_for_each_dma_alias". How do you see the case for transparent host bridge being handled? We would see a device ID corresponding to the host bridge for masters behind that bridge. > > > Somehow, we're going to end-up with grossly oversized ITTs, just to > > > accommodate for the fact that we have no idea how many MSIs we're > > > going to end-up needing. I'm not thrilled with that prospect. > > > > How can we avoid that in the face of hotplug? > > Fortunately, hotplug is not always synonymous with aliasing. The ITS is built > around the hypothesis that aliasing doesn't happen, and that you know > upfront how many LPIs the device will be allowed to generate. > > > And what are we really worried about regarding over-sized ITTs...bytes > > of memory saved? > > That's one thing, yes. But more fundamentally, how do you size your MSI > capacity for a single alias? Do you evenly split your LPI space among all > possible aliases?
Assuming 64 aliases and 16 bits of interrupt ID space, you > end up with 10 bit per alias. Is that always enough? Or do you need > something more fine-grained? > > > A fundamental thing built into the IOMMU subsystem in Linux is > > representing iommu groups that can represent things like multiple PCI > > devices that for hardware reasons cannot be isolated (and the example > > I've seen given relates to devices behind PCI bridges). > > > > So, I think the thing we are facing here is that while the IOMMU > > subsystem has accounted for reprsenting the isolation characteristics > > of a system with iommu groups, there is no corresponding "msi group" > > concept. > > > > In the SMMU/GIC-500-ITS world the iommu isolation ID (the stream ID) > > and the GIC-ITS device ID are in fact the same ID. > > The DeviceID is the "MSI group" you mention. This is what provides isolation > at the ITS level. > [varun] True, in case of a transparent host bridge device Id won't provide the necessary isolation. > > Is there some way we could sanely correlate IOMMU group creation > > (which establishes isolation granularity) with the creation of an ITT > > for the GIC-ITS? > > The problem you have is that your ITT already exists before you start > "hotpluging" new devices. Take the following (made up) example: > > System boots, device X is discovered, claims 64 MSIs. An ITT for device X is > allocated, and sized for 64 LPIs. SR-IOV kick is, creates a new X' > function that is aliased to X, claiming another 64 MSIs. Fail. > > What do we do here? The ITT is live (X is generating interrupts), and there is > no provision to resize it (I've come up with a horrible scheme, but that could > fail as well). The only sane option would be to guess how many MSIs a given > alias could possibly use. How wrong is this guess going to be? > > The problem we have is that IOMMU groups are dynamic, while ITT allocation > is completely static for a given DeviceID. 
The architecture doesn't give you > any mechanism to resize it, and I have the ugly feeling that static allocation of > the ID space to aliases is too rigid... [varun] One way would be to restrict the number of stream IDs (device IDs) per PCIe controller. In our scheme we have a device ID -> stream ID translation table, we can restrict the number of entries in the table. This would restrict the number of virtual functions. -Varun
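[Editor's note: Varun's suggestion above is to bound the problem at the PCIe controller's programmable "device ID -> stream ID" table: once its fixed number of entries is used up, no further (virtual) functions can be admitted. A sketch of that mechanism; the table depth and IDs are invented:]

```c
#include <assert.h>
#include <stdint.h>

#define XLATE_ENTRIES 4  /* hypothetical per-controller table depth */

/* Model of the controller's devid -> streamID translation table. */
struct xlate_table {
	uint32_t devid[XLATE_ENTRIES];
	uint32_t streamid[XLATE_ENTRIES];
	unsigned int used;
};

/* Install a mapping; returns the stream ID, or -1 once the table is
 * full -- which is what caps the number of virtual functions. */
static int xlate_map(struct xlate_table *t, uint32_t devid, uint32_t sid)
{
	if (t->used >= XLATE_ENTRIES)
		return -1;           /* no room: VF cannot be admitted */
	t->devid[t->used] = devid;
	t->streamid[t->used] = sid;
	t->used++;
	return (int)sid;
}
```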
Hi Will/Stuart, > -----Original Message----- > From: Yoder Stuart-B08248 > Sent: Thursday, April 23, 2015 1:11 AM > To: Will Deacon > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; linux-pci@vger.kernel.org; > Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Bjorn Helgaas; Wood > Scott-B07421; linux-arm-kernel@lists.infradead.org > Subject: RE: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > > -----Original Message----- > > From: Will Deacon [mailto:will.deacon@arm.com] > > Sent: Wednesday, April 22, 2015 12:07 PM > > To: Yoder Stuart-B08248 > > Cc: Sethi Varun-B16395; Lian Minghuan-B31939; > > linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang > > Roy-R61911; Bjorn Helgaas; Wood Scott-B07421; > > linux-arm-kernel@lists.infradead.org > > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID > > > > Hi Stuart, > > > > First of, thanks for taking the time to explain this in more detail. > > Comments inline. > > > > On Fri, Apr 17, 2015 at 03:19:08PM +0100, Stuart Yoder wrote: > > > > On Wed, Apr 15, 2015 at 02:18:13PM +0100, Varun Sethi wrote: > > > > > Yes, deviceid=stream id (i.e. ICID + other bits). I am not sure > > > > > if TBU ID would also be forwarded as a part of stream id to GIC. > > > > > My understanding is that TBU ID is forwarded (as a part of the > > > > > stream ID) to the TCU in case of a TBU translation miss. In case > > > > > of the LS2085 PCIe controller you would have to setup the PCIe > > > > > device ID to stream ID translation table. We may have to > > > > > restrict the number of entries based on the available number of > contexts. > > > > > > > > Unfortunately, I'm having a really hard time parsing this thread > > > > (some parts of it simply don't make sense; others use > > > > non-architectural terms and overall I don't get a feeling for the > problem). 
> > > > > > > > Please could you explain your system design step by step so that I > > > > can understand (a) what you've built and (b) why the current > > > > design of Linux is causing you problems? > > > > > > > > Sorry if I'm just being thick, but it's important that we get this right. > > > > > > I'll try to summarize some key points about the system... > > > > > > System is using a single SMMU-500 (1 TCU, 6 TBUs) and GICv3-ITS. > > > There are PCI, fsl-mc, and platform devices that do DMA. Devices on > > > the PCI and fsl-mc bus generate message interrupts. > > > > Ah cool, so you have multiple buses sharing a single SMMU? That's > > going to necessitate some ID remapping in the device-tree. Perhaps you > > could comment on Mark Rutland's proposal if it does/doesn't work for you: > > > > > > http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/33319 > > 9.html > > Thanks for the pointer, I had not seen that before. Will read and comment > on it. > > > > The flow a message interrupt would take is this: > > > > > > -------------- > > > PCI device > > > -------------- > > > | > > > | pcidevid + MSI msg > > > | > > > V > > > -------------- > > > PCI controller > > > pcidevid -> > > > streamID > > > mapping > > > -------------- > > > | > > > | streamID + MSI msg > > > | > > > V > > > -------------- > > > SMMU > > > -------------- > > > | > > > | streamID + MSI msg > > > | > > > V > > > -------------- > > > CCN-504 (Dickens) > > > -------------- > > > | > > > | streamID + MSI msg > > > | > > > V > > > > The streamID here as the same as the one coming out of the SMMU, right? > > Yes. > > > (just trying to out why you have the CCN-504 in the picture). > > It really isn't relevant I guess, just the picture I had in my head. 
> > > > -------------- > > > GICv3 ITS streamID == ITS deviceID > > > -------------- > > > > > > So, the way things work (at least initially) is that each PCI device > > > maps to a single streamID, and thus each device has a separate ITT > > > in the ITS. So, things should be cool. > > > > > > However, there is an improvement we envision as possible due to the > > > limited number of SMMU contexts (i.e. 64). If there are > > > 64 SMMU context registers it means that there is a max of > > > 64 software contexts where things can be isolated. But, say I have > > > an SRIOV card with 64 VFs, and I want to assign 8 of the VFs to a > > > KVM VM. Those 8 PCI devices could share the same > > > streamID/ITS-device-ID since they all share the same isolation > > > context. > > > > > > What would be nice is at the time the 8 VFS are being added to the > > > IOMMU domain is for the pcidevid -> streamID mapping table to be > > > updated dynamically. It simply lets us make more efficient use of > > > the limited streamIDs we have. > > > > > > I think it is this improvement that Minghuan had in mind in this > > > patch. > > > > Ok, but in this case it should be possible to use a single context > > bank for all of the VF streamIDs by configuring the appropriate SMR, no? > > Yes, but there are limited SMRs. In our case there are only > 128 SMRs in SMMU-500 and we have potentially way more masters than > that. > > > Wouldn't > > that sort of thing be preferable to dynamic StreamID assignment? It > > would certainly make life easier for the MSIs. > > It would be preferable, but given only 128 total stream IDS and > 64 context registers it's potentially an issue. On our LS2085 SoC it is PCI and > the fsl-mc bus (see description here: > https://lkml.org/lkml/2015/3/5/795) that potentially have way more masters > than streamIDS. So, for those busses we would essentially view a streamID > as a "context ID"-- each SMR is associated with > 1 context bank register. 
> [varun] One thing to note here is that we would also be hooking up the DMA API with the SMMU driver. In that case we would typically require one context per device. We would have to restrict the number of available stream IDs. -Varun
Hi Varun,

On 26/04/15 19:20, Varun Sethi wrote:
> Hi Marc,
>
>>>> We can deal with the aliasing, provided that we extend the level of
>>>> quirkiness that pci_for_each_dma_alias can deal with. But that
>>>> doesn't solve any form of hotplug/SR-IOV behaviour.
>>>>
> [varun] Can you please elaborate on "extending the quirkiness of
> pci_for_each_dma_alias"? How do you see the case for a transparent
> host bridge being handled? We would see a device ID corresponding to
> the host bridge for masters behind that bridge.

The PCI code already has code to deal with aliases, and can deal with
them in a number of cases.

At the moment, this aliasing code can only deal with aliases that
belong to the same PCI bus (or aliasing with the bus itself). Given the
way the problem has been described, I understand that you can have
devices sitting on different buses that will end up with the same
DeviceID. This is where expanding the "quirkiness" of
pci_for_each_dma_alias comes into play: you need to teach it about this
kind of topology.

>>>> Somehow, we're going to end up with grossly oversized ITTs, just to
>>>> accommodate the fact that we have no idea how many MSIs we're going
>>>> to end up needing. I'm not thrilled with that prospect.
>>>
>>> How can we avoid that in the face of hotplug?
>>
>> Fortunately, hotplug is not always synonymous with aliasing. The ITS
>> is built around the hypothesis that aliasing doesn't happen, and that
>> you know upfront how many LPIs the device will be allowed to
>> generate.
>>
>>> And what are we really worried about regarding over-sized
>>> ITTs... bytes of memory saved?
>>
>> That's one thing, yes. But more fundamentally, how do you size your
>> MSI capacity for a single alias? Do you evenly split your LPI space
>> among all possible aliases? Assuming 64 aliases and 16 bits of
>> interrupt ID space, you end up with 10 bits per alias. Is that always
>> enough? Or do you need something more fine-grained?
>>
>>> A fundamental thing built into the IOMMU subsystem in Linux is
>>> representing iommu groups that can represent things like multiple
>>> PCI devices that for hardware reasons cannot be isolated (the
>>> example I've seen given relates to devices behind PCI bridges).
>>>
>>> So, I think the thing we are facing here is that while the IOMMU
>>> subsystem has accounted for representing the isolation
>>> characteristics of a system with iommu groups, there is no
>>> corresponding "msi group" concept.
>>>
>>> In the SMMU/GIC-500-ITS world the iommu isolation ID (the stream ID)
>>> and the GIC-ITS device ID are in fact the same ID.
>>
>> The DeviceID is the "MSI group" you mention. This is what provides
>> isolation at the ITS level.
>>
> [varun] True, in the case of a transparent host bridge the device ID
> won't provide the necessary isolation.

Well, it depends how you look at it. How necessary is this isolation,
since we've already established that you couldn't distinguish between
these devices at the IOMMU level?

>>> Is there some way we could sanely correlate IOMMU group creation
>>> (which establishes isolation granularity) with the creation of an
>>> ITT for the GIC-ITS?
>>
>> The problem you have is that your ITT already exists before you start
>> "hotplugging" new devices. Take the following (made-up) example:
>>
>> The system boots, device X is discovered and claims 64 MSIs. An ITT
>> for device X is allocated and sized for 64 LPIs. SR-IOV kicks in and
>> creates a new function X' that is aliased to X, claiming another 64
>> MSIs. Fail.
>>
>> What do we do here? The ITT is live (X is generating interrupts), and
>> there is no provision to resize it (I've come up with a horrible
>> scheme, but that could fail as well). The only sane option would be
>> to guess how many MSIs a given alias could possibly use. How wrong is
>> this guess going to be?
>>
>> The problem we have is that IOMMU groups are dynamic, while ITT
>> allocation is completely static for a given DeviceID. The
>> architecture doesn't give you any mechanism to resize it, and I have
>> the ugly feeling that static allocation of the ID space to aliases is
>> too rigid...
>
> [varun] One way would be to restrict the number of stream IDs (device
> IDs) per PCIe controller. In our scheme we have a device ID -> stream
> ID translation table; we can restrict the number of entries in the
> table, which would restrict the number of virtual functions.

Do you mean reserving a number of StreamIDs per PCIe controller, and
letting virtual functions use these spare StreamIDs? This would indeed
be more restrictive. But more importantly, who is going to be in charge
of this mapping/allocation?

Thanks,

	M.
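Varun's scheme (a fixed device-ID to stream-ID translation table per PCIe controller) can be sketched as a small static pool. Everything below is hypothetical: the structure, the `POOL_SIZE` cap, and the ID values are invented for illustration, since the thread does not describe the real table layout. The sketch also makes Marc's closing question concrete: some layer has to own the pool and its `next` cursor.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical per-controller translation pool: each PCIe controller
 * owns a fixed window of StreamIDs, and each requester (including an
 * SR-IOV virtual function) consumes one entry. Capping the number of
 * entries is what caps the number of virtual functions.
 */
#define POOL_SIZE 8

struct streamid_pool {
	uint32_t base;              /* first StreamID owned by this controller */
	uint32_t next;              /* entries handed out so far */
	uint32_t devid[POOL_SIZE];  /* PCI requester ID per allocated entry */
};

/* Map a PCI requester ID to a StreamID; -1 once the pool is exhausted. */
static int streamid_alloc(struct streamid_pool *pool, uint32_t pci_devid)
{
	uint32_t i;

	for (i = 0; i < pool->next; i++)
		if (pool->devid[i] == pci_devid)
			return (int)(pool->base + i);   /* already mapped */

	if (pool->next == POOL_SIZE)
		return -1;                          /* out of isolation IDs */

	pool->devid[pool->next] = pci_devid;
	return (int)(pool->base + pool->next++);
}
```

Giving each controller a distinct `base` window keeps StreamIDs globally unique across controllers; which layer populates such a pool (bus code, the IOMMU driver's of_xlate path, or firmware) is exactly the open question.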
Hi Marc,

> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> Sent: Monday, April 27, 2015 1:28 PM
> To: Sethi Varun-B16395; Yoder Stuart-B08248
> Cc: Will Deacon; Lian Minghuan-B31939; linux-pci@vger.kernel.org; Arnd
> Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Bjorn Helgaas; Wood
> Scott-B07421; linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
>
> Hi Varun,
>
> On 26/04/15 19:20, Varun Sethi wrote:
> > Hi Marc,
> >
> >>>> We can deal with the aliasing, provided that we extend the level
> >>>> of quirkiness that pci_for_each_dma_alias can deal with. But that
> >>>> doesn't solve any form of hotplug/SR-IOV behaviour.
> >>>>
> > [varun] Can you please elaborate on "extending the quirkiness of
> > pci_for_each_dma_alias"? How do you see the case for a transparent
> > host bridge being handled? We would see a device ID corresponding to
> > the host bridge for masters behind that bridge.
>
> The PCI code already has code to deal with aliases, and can deal with
> them in a number of cases.
>
> At the moment, this aliasing code can only deal with aliases that
> belong to the same PCI bus (or aliasing with the bus itself). Given
> the way the problem has been described, I understand that you can have
> devices sitting on different buses that will end up with the same
> DeviceID. This is where expanding the "quirkiness" of
> pci_for_each_dma_alias comes into play. You need to teach it about
> this kind of topology.
>
[varun] Agreed. In our case the PCIe controller maintains a stream ID
to device ID translation table, so we can actually avoid this problem
by setting up unique stream IDs across PCIe controllers. We would need
a layer to allow translation from device ID to stream ID.

> >>>> Somehow, we're going to end up with grossly oversized ITTs, just
> >>>> to accommodate the fact that we have no idea how many MSIs we're
> >>>> going to end up needing. I'm not thrilled with that prospect.
> >>>
> >>> How can we avoid that in the face of hotplug?
> >>
> >> Fortunately, hotplug is not always synonymous with aliasing. The
> >> ITS is built around the hypothesis that aliasing doesn't happen,
> >> and that you know upfront how many LPIs the device will be allowed
> >> to generate.
> >>
> >>> And what are we really worried about regarding over-sized
> >>> ITTs... bytes of memory saved?
> >>
> >> That's one thing, yes. But more fundamentally, how do you size your
> >> MSI capacity for a single alias? Do you evenly split your LPI space
> >> among all possible aliases? Assuming 64 aliases and 16 bits of
> >> interrupt ID space, you end up with 10 bits per alias. Is that
> >> always enough? Or do you need something more fine-grained?
> >>
> >>> A fundamental thing built into the IOMMU subsystem in Linux is
> >>> representing iommu groups that can represent things like multiple
> >>> PCI devices that for hardware reasons cannot be isolated (the
> >>> example I've seen given relates to devices behind PCI bridges).
> >>>
> >>> So, I think the thing we are facing here is that while the IOMMU
> >>> subsystem has accounted for representing the isolation
> >>> characteristics of a system with iommu groups, there is no
> >>> corresponding "msi group" concept.
> >>>
> >>> In the SMMU/GIC-500-ITS world the iommu isolation ID (the stream
> >>> ID) and the GIC-ITS device ID are in fact the same ID.
> >>
> >> The DeviceID is the "MSI group" you mention. This is what provides
> >> isolation at the ITS level.
> >>
> > [varun] True, in the case of a transparent host bridge the device ID
> > won't provide the necessary isolation.
>
> Well, it depends how you look at it. How necessary is this isolation,
> since we've already established that you couldn't distinguish between
> these devices at the IOMMU level?
>
[varun] Yes, the devices would fall in the same IOMMU group. So, the
devices would end up sharing the interrupt?

> >>> Is there some way we could sanely correlate IOMMU group creation
> >>> (which establishes isolation granularity) with the creation of an
> >>> ITT for the GIC-ITS?
> >>
> >> The problem you have is that your ITT already exists before you
> >> start "hotplugging" new devices. Take the following (made-up)
> >> example:
> >>
> >> The system boots, device X is discovered and claims 64 MSIs. An ITT
> >> for device X is allocated and sized for 64 LPIs. SR-IOV kicks in
> >> and creates a new function X' that is aliased to X, claiming
> >> another 64 MSIs. Fail.
> >>
> >> What do we do here? The ITT is live (X is generating interrupts),
> >> and there is no provision to resize it (I've come up with a
> >> horrible scheme, but that could fail as well). The only sane option
> >> would be to guess how many MSIs a given alias could possibly use.
> >> How wrong is this guess going to be?
> >>
> >> The problem we have is that IOMMU groups are dynamic, while ITT
> >> allocation is completely static for a given DeviceID. The
> >> architecture doesn't give you any mechanism to resize it, and I
> >> have the ugly feeling that static allocation of the ID space to
> >> aliases is too rigid...
> >
> > [varun] One way would be to restrict the number of stream IDs
> > (device IDs) per PCIe controller. In our scheme we have a device ID
> > -> stream ID translation table; we can restrict the number of
> > entries in the table, which would restrict the number of virtual
> > functions.
>
> Do you mean reserving a number of StreamIDs per PCIe controller, and
> letting virtual functions use these spare StreamIDs? This would indeed
> be more restrictive. But more importantly, who is going to be in
> charge of this mapping/allocation?

[varun] My understanding is that, as per the new IOMMU API (of_xlate),
this would be done in the bus driver code, while setting up the IOMMU
groups.

-Varun
On Mon, Apr 27, 2015 at 02:08:10PM +0100, Varun Sethi wrote:
> > >>> In the SMMU/GIC-500-ITS world the iommu isolation ID (the stream
> > >>> ID) and the GIC-ITS device ID are in fact the same ID.
> > >>
> > >> The DeviceID is the "MSI group" you mention. This is what
> > >> provides isolation at the ITS level.
> > >>
> > > [varun] True, in the case of a transparent host bridge the device
> > > ID won't provide the necessary isolation.
> >
> > Well, it depends how you look at it. How necessary is this
> > isolation, since we've already established that you couldn't
> > distinguish between these devices at the IOMMU level?
> >
> [varun] Yes, the devices would fall in the same IOMMU group. So, the
> devices would end up sharing the interrupt?

Well, I think that's the crux of the issue here. If IOMMU groups are
also needed to relay constraints to the IRQ subsystem, then perhaps we
need a more general notion of device grouping and ID transformations
between the different levels of the group hierarchy.

Will
On 27/04/15 14:08, Varun Sethi wrote:
> Hi Marc,
>
>> -----Original Message-----
>> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
>> Sent: Monday, April 27, 2015 1:28 PM
>> To: Sethi Varun-B16395; Yoder Stuart-B08248
>> Cc: Will Deacon; Lian Minghuan-B31939; linux-pci@vger.kernel.org;
>> Arnd Bergmann; Hu Mingkai-B21284; Zang Roy-R61911; Bjorn Helgaas;
>> Wood Scott-B07421; linux-arm-kernel@lists.infradead.org
>> Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
>>
>> Hi Varun,
>>
>> On 26/04/15 19:20, Varun Sethi wrote:
>>> Hi Marc,
>>>
>>>>>> We can deal with the aliasing, provided that we extend the level
>>>>>> of quirkiness that pci_for_each_dma_alias can deal with. But that
>>>>>> doesn't solve any form of hotplug/SR-IOV behaviour.
>>>>>>
>>> [varun] Can you please elaborate on "extending the quirkiness of
>>> pci_for_each_dma_alias"? How do you see the case for a transparent
>>> host bridge being handled? We would see a device ID corresponding
>>> to the host bridge for masters behind that bridge.
>>
>> The PCI code already has code to deal with aliases, and can deal with
>> them in a number of cases.
>>
>> At the moment, this aliasing code can only deal with aliases that
>> belong to the same PCI bus (or aliasing with the bus itself). Given
>> the way the problem has been described, I understand that you can
>> have devices sitting on different buses that will end up with the
>> same DeviceID. This is where expanding the "quirkiness" of
>> pci_for_each_dma_alias comes into play. You need to teach it about
>> this kind of topology.
>>
> [varun] Agreed. In our case the PCIe controller maintains a stream ID
> to device ID translation table, so we can actually avoid this problem
> by setting up unique stream IDs across PCIe controllers. We would
> need a layer to allow translation from device ID to stream ID.
>
>>>>>> Somehow, we're going to end up with grossly oversized ITTs, just
>>>>>> to accommodate the fact that we have no idea how many MSIs we're
>>>>>> going to end up needing. I'm not thrilled with that prospect.
>>>>>
>>>>> How can we avoid that in the face of hotplug?
>>>>
>>>> Fortunately, hotplug is not always synonymous with aliasing. The
>>>> ITS is built around the hypothesis that aliasing doesn't happen,
>>>> and that you know upfront how many LPIs the device will be allowed
>>>> to generate.
>>>>
>>>>> And what are we really worried about regarding over-sized
>>>>> ITTs... bytes of memory saved?
>>>>
>>>> That's one thing, yes. But more fundamentally, how do you size
>>>> your MSI capacity for a single alias? Do you evenly split your LPI
>>>> space among all possible aliases? Assuming 64 aliases and 16 bits
>>>> of interrupt ID space, you end up with 10 bits per alias. Is that
>>>> always enough? Or do you need something more fine-grained?
>>>>
>>>>> A fundamental thing built into the IOMMU subsystem in Linux is
>>>>> representing iommu groups that can represent things like multiple
>>>>> PCI devices that for hardware reasons cannot be isolated (the
>>>>> example I've seen given relates to devices behind PCI bridges).
>>>>>
>>>>> So, I think the thing we are facing here is that while the IOMMU
>>>>> subsystem has accounted for representing the isolation
>>>>> characteristics of a system with iommu groups, there is no
>>>>> corresponding "msi group" concept.
>>>>>
>>>>> In the SMMU/GIC-500-ITS world the iommu isolation ID (the stream
>>>>> ID) and the GIC-ITS device ID are in fact the same ID.
>>>>
>>>> The DeviceID is the "MSI group" you mention. This is what provides
>>>> isolation at the ITS level.
>>>>
>>> [varun] True, in the case of a transparent host bridge the device
>>> ID won't provide the necessary isolation.
>>
>> Well, it depends how you look at it. How necessary is this isolation,
>> since we've already established that you couldn't distinguish between
>> these devices at the IOMMU level?
>>
> [varun] Yes, the devices would fall in the same IOMMU group. So, the
> devices would end up sharing the interrupt?

No, they would end up sharing an Interrupt Translation Table (ITT),
basically the equivalent of your IOMMU page tables, but for interrupts.
All the devices in this group would have their own set of interrupts,
but would also be able to spoof interrupts for each other. Probably not
a big deal isolation-wise, but problematic from an allocation point of
view (given the way the ITS driver is currently architected).

	M.
> -----Original Message-----
> From: Will Deacon [mailto:will.deacon@arm.com]
> Sent: Monday, April 27, 2015 12:04 PM
> To: Sethi Varun-B16395
> Cc: Marc Zyngier; Yoder Stuart-B08248; Lian Minghuan-B31939;
> linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang
> Roy-R61911; Bjorn Helgaas; Wood Scott-B07421;
> linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
>
> On Mon, Apr 27, 2015 at 02:08:10PM +0100, Varun Sethi wrote:
> > > >>> In the SMMU/GIC-500-ITS world the iommu isolation ID (the
> > > >>> stream ID) and the GIC-ITS device ID are in fact the same ID.
> > > >>
> > > >> The DeviceID is the "MSI group" you mention. This is what
> > > >> provides isolation at the ITS level.
> > > >>
> > > > [varun] True, in the case of a transparent host bridge the
> > > > device ID won't provide the necessary isolation.
> > >
> > > Well, it depends how you look at it. How necessary is this
> > > isolation, since we've already established that you couldn't
> > > distinguish between these devices at the IOMMU level?
> > >
> > [varun] Yes, the devices would fall in the same IOMMU group. So,
> > the devices would end up sharing the interrupt?
>
> Well, I think that's the crux of the issue here. If IOMMU groups are
> also needed to relay constraints to the IRQ subsystem, then perhaps we
> need a more general notion of device grouping and ID transformations
> between the different levels of the group hierarchy.

I agree. I have been thinking about it over the last few days... is it
a matter of renaming what we currently call an "IOMMU group"? Or do we
really need to separate general 'device grouping' and 'iommu groups'
in the Linux kernel?

Stuart
On Fri, May 01, 2015 at 04:23:31PM +0100, Stuart Yoder wrote:
> > -----Original Message-----
> > From: Will Deacon [mailto:will.deacon@arm.com]
> > Sent: Monday, April 27, 2015 12:04 PM
> > To: Sethi Varun-B16395
> > Cc: Marc Zyngier; Yoder Stuart-B08248; Lian Minghuan-B31939;
> > linux-pci@vger.kernel.org; Arnd Bergmann; Hu Mingkai-B21284; Zang
> > Roy-R61911; Bjorn Helgaas; Wood Scott-B07421;
> > linux-arm-kernel@lists.infradead.org
> > Subject: Re: [PATCH 1/2] irqchip/gicv3-its: Support share device ID
> >
> > On Mon, Apr 27, 2015 at 02:08:10PM +0100, Varun Sethi wrote:
> > > > >>> In the SMMU/GIC-500-ITS world the iommu isolation ID (the
> > > > >>> stream ID) and the GIC-ITS device ID are in fact the same
> > > > >>> ID.
> > > > >>
> > > > >> The DeviceID is the "MSI group" you mention. This is what
> > > > >> provides isolation at the ITS level.
> > > > >>
> > > > > [varun] True, in the case of a transparent host bridge the
> > > > > device ID won't provide the necessary isolation.
> > > >
> > > > Well, it depends how you look at it. How necessary is this
> > > > isolation, since we've already established that you couldn't
> > > > distinguish between these devices at the IOMMU level?
> > > >
> > > [varun] Yes, the devices would fall in the same IOMMU group. So,
> > > the devices would end up sharing the interrupt?
> >
> > Well, I think that's the crux of the issue here. If IOMMU groups are
> > also needed to relay constraints to the IRQ subsystem, then perhaps
> > we need a more general notion of device grouping and ID
> > transformations between the different levels of the group hierarchy.
>
> I agree. I have been thinking about it over the last few days... is
> it a matter of renaming what we currently call an "IOMMU group"? Or
> do we really need to separate general 'device grouping' and 'iommu
> groups' in the Linux kernel?

It depends. Right now, the IOMMU drivers are responsible for creating
the IOMMU groups, and I don't think that's general enough for what we
need in practice.

If we move the group creation to be done as part of the bus, then that
would be the first step in having an abstraction that could be re-used
by the interrupt code, imo. It would also move us a step closer to
having a generic device group description for platform-bus devices
(i.e. as part of the devicetree).

I suspect the only way we'll really find out is by prototyping stuff
and sending patches.

Will
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index d0374a6..be78d0a 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1169,6 +1169,15 @@ static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
 	return 0;
 }
 
+void __weak
+arch_msi_share_devid_update(struct pci_dev *pdev, u32 *dev_id, u32 *nvec)
+{
+	/*
+	 * The default is PCI_DEVID with no shared device ID,
+	 * so there is nothing to do here.
+	 */
+}
+
 static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
 			   int nvec, msi_alloc_info_t *info)
 {
@@ -1185,6 +1194,8 @@ static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
 	dev_alias.count = nvec;
 
 	pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias);
+	arch_msi_share_devid_update(pdev, &dev_alias.dev_id, &dev_alias.count);
+
 	its = domain->parent->host_data;
 
 	its_dev = its_find_device(its, dev_alias.dev_id);
The SMMU on some platforms can only isolate a limited number of device
IDs. This may require that all PCI devices share the same ITS device,
with a fixed device ID. This patch adds the function
arch_msi_share_devid_update, which such platforms use to override the
device ID with the fixed shared ID and to cap the maximum number of
MSI interrupts.

Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
---
 drivers/irqchip/irq-gic-v3-its.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
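To make the intent of the hook concrete, here is a minimal sketch of what a platform-side override might look like. Everything in it is illustrative rather than taken from any real platform: the fixed DeviceID 0x100 and the 256-vector cap are invented values, and the kernel types are replaced by stand-ins so the fragment is self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel type so the sketch is self-contained. */
struct pci_dev { uint8_t bus; uint8_t devfn; };

/* Hypothetical platform limits; the values are illustrative only. */
#define SHARED_DEV_ID  0x100u  /* the one DeviceID presented to the ITS */
#define SHARED_MSI_MAX 256u    /* ITT capacity reserved for that ID */

/*
 * What a platform override of the weak hook could do: fold every PCI
 * requester onto one fixed DeviceID, and clamp the vector count so the
 * shared ITT cannot be overflowed.
 */
static void arch_msi_share_devid_update(struct pci_dev *pdev,
                                        uint32_t *dev_id, uint32_t *nvec)
{
	(void)pdev;                 /* all devices share the same ID */
	*dev_id = SHARED_DEV_ID;
	if (*nvec > SHARED_MSI_MAX)
		*nvec = SHARED_MSI_MAX;
}
```

The default weak implementation leaves both values untouched, so platforms without this SMMU limitation keep the normal PCI_DEVID-based behaviour.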