Message ID | 1364316746-8702-8-git-send-email-thomas.petazzoni@free-electrons.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Tuesday 26 March 2013, Thomas Petazzoni wrote: > + mpic: main-intc@d0020000 { > + #interrupt-cells = <1>; > + interrupt-controller; > + }; > + > + msi: msi-intc@d0020000 { > + #interrupt-cells = <1>; > + interrupt-controller; > + marvell,doorbell = <0xd0020a04>; > + }; I think the @d0020000 part needs to be removed for the nodes that have no reg property. I think I did not follow the entire discussion. What has led to having two subnodes in the end, rather than a single node like this? interrupt-controller@d0020000 { compatible = "marvell,mpic"; #interrupt-cells = <1>; #msi-cells = <1>; #address-cells = <1>; #size-cells = <1>; interrupt-controller; msi-controller; reg = <0xd0020a00 0x1d0>, <0xd0021070 0x58>; marvell,doorbell = <0xd0020a04>; }; Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Dear Arnd Bergmann, On Tue, 26 Mar 2013 17:07:41 +0000, Arnd Bergmann wrote: > I think the @d0020000 part needs to be removed for the nodes that > have no reg property. Sure, will fix. > I think I did not follow the entire discussion. What has led to having > two subnodes in the end, rather than a single node like this? > > > interrupt-controller@d0020000 { > compatible = "marvell,mpic"; > #interrupt-cells = <1>; > #msi-cells = <1>; > #address-cells = <1>; > #size-cells = <1>; > interrupt-controller; > msi-controller; > reg = <0xd0020a00 0x1d0>, > <0xd0021070 0x58>; > marvell,doorbell = <0xd0020a04>; > }; I've tried to explain that in the commit log of PATCH 6, which says: However, we need the driver to expose two different IRQ domains: one for the main interrupt controller itself, and one for the MSI interrupt controller. In order to achieve this, we will create two subnodes in the interrupt-controller@d0020000 node: one subnode for the main interrupt controller, and one subnode for the MSI interrupt controller. The two irq domains can't be registered on the same DT node, otherwise when irq_find_host() gets used by of_irq_map_one() to resolve IRQs of devices, they may find the MSI interrupt controller instead of the main interrupt controller. Note that both the parent and the child node need to have the 'interrupt-controller' empty property: * The interrupt-controller property is needed in the main interrupt controller node (interrupt-controller@d0020000) because the of_irq_init() function skips nodes that are matching the given compatible string, but that don't have the interrupt-controller property. * The interrupt-controller property is needed in the child interrupt controller node (main-intc@d0020000) otherwise the resolution done by of_irq_map_one() doesn't work. So really, the thing is that irq_domain_add_linear() registers an IRQ domain on a specific DT node, and then irq_find_host() finds back an IRQ domain from a given DT node. So if you have two IRQ domains registered on the same DT node, then you don't know which one will be used. So if I do the two irq_domain_add_linear() (one for MPIC, one for MSI) on one single DT node, when the timer driver will request its interrupt, it turns out that the MSI IRQ domain is used and not the MPIC IRQ domain, even though the timer has <&mpic> as its interrupt parent. Best regards, Thomas
On Tue, Mar 26, 2013 at 05:52:22PM +0100, Thomas Petazzoni wrote: > This commit introduces the support for the MSI interrupts in the > armada-370-xp interrupt controller driver. It registers an IRQ domain > associated with the 'msi' node in the Device Tree, which the PCI > controller will refer to in a followup commit. I've got some MSI stuff working on Kirkwood using the doorbells, so I can confirm this general approach works in HW. What do you think it would take to extend this to other Marvell SOCs? BTW, in my testing on Kirkwood I found that register ..40010 in the PEX had to be set to the internal register base to make this work. Did you find something similar? > The MSI interrupts use the 16 high doorbells, and are therefore > notified using IRQ1 of the main interrupt controller. I was initially a bit surprised this was the high doorbell bits, but I checked and it is right, the 16 bit MSI maps to the high two bytes in the 32 bit MemWr TLP. FWIW, MSI-X is not restricted to 16 bits, so if you can detect from the PCI layer if it is setting up MSI or MSI-X you could allocate low bits first to MSI-X and high bits first to MSI, increasing the number of available MSI/MSI-X vectors. > + - marvell,doorbell: Gives the physical address at which PCIe > + devices should write to signal an MSI interrupt. Why is this necessary? Can't the doorbell register physical address be computed by the driver? AFAIK there is no possibility for address translation on SOC inbound TLPs. Thinking about it a bit, maybe less magic code is needed here, be explicit about the available interrupts in the DT: pcie-controller { msi-interrupts = <0xd0020a04 (1<<16) &msi 16 0xd0020a04 (1<<17) &msi 17 [..] msi-x-interrupts = <0xd0020a04 (1<<1) &msi 1 0xd0020a04 (1<<2) &msi 2 [..] There is a better chance of that supporting other Marvell SOCs.. Not sure, just throwing it out there. Regarding the sub node, I'm not sure it should be necessary. You don't need to differentiate the MSI vs non-MSI case in the doorbell register at the interrupt controller level. Eg this .. > +#ifdef CONFIG_PCI_MSI > +static struct irq_chip armada_370_xp_msi_irq_chip = { > + .name = "armada_370_xp_msi_irq", > + .irq_enable = unmask_msi_irq, > + .irq_disable = mask_msi_irq, > + .irq_mask = mask_msi_irq, > + .irq_unmask = unmask_msi_irq, > +}; .. isn't necessary for doorbell. With MSI cases on other archs there is no local MSI interrupt controller, the MSI MemWr TLP directly creates an interrupt at the CPU. This requires using the *_msi_irq functions to control the interrupt at the source, since there is no control at the destination. However, Marvell's doorbell can be controlled at the destination, so it is better to handle it that way, especially since it creates symmetry with the IPI usage. I used: priv->gc = irq_alloc_generic_chip("kirkwood_doorbell_irq", 1, irq_base, priv->base, handle_fasteoi_irq); [..] ct->regs.mask = 4; ct->regs.eoi = 0; ct->chip.irq_eoi = irq_gc_eoi_inv; ct->chip.irq_mask = irq_gc_mask_clr_bit; ct->chip.irq_unmask = irq_gc_mask_set_bit; Which should work correctly for the IPI and MSI cases. The handle_fasteoi_irq is important. > #ifdef CONFIG_SMP > void armada_mpic_send_doorbell(const struct cpumask *mask, unsigned int irq) > { > @@ -220,12 +280,39 @@ armada_370_xp_handle_irq(struct pt_regs *regs) > if (irqnr > 1022) > break; > > - if (irqnr > 0) { > + if (irqnr > 1) { > irqnr = irq_find_mapping(armada_370_xp_mpic_domain, > irqnr); > handle_IRQ(irqnr, regs); > continue; > } > + > +#ifdef CONFIG_PCI_MSI > + /* MSI handling */ > + if (irqnr == 1) { > + u32 msimask, msinr; > + > + msimask = readl_relaxed(per_cpu_int_base + > + ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS) > + & PCI_MSI_DOORBELL_MASK; > + > + writel(~PCI_MSI_DOORBELL_MASK, per_cpu_int_base + > + ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS); Be careful here, the ordering of the EOI in relation to the handler is important. If you use my stuff from above then the MSI and IPI cases are identical and the core code handles ack'ing the doorbell via irq_eoi at the right moment, you can just uniformly iterate over all 32 bits and there is no need to care if it is MSI or IPI. Also, I'm not super familiar with the irq stuff, but is irq_find_mapping the best way? Most of the drivers I looked at used irq_alloc_descs to get a contiguous range of irq numbers and then just used a simple offset in the handle_irq... Jason -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 26 March 2013, Thomas Petazzoni wrote: > > I've tried to explain that in the commit log of PATCH 6, which says: > > However, we need the driver to expose two different IRQ domains: one > for the main interrupt controller itself, and one for the MSI > interrupt controller. In order to achieve this, we will create two > subnodes in the interrupt-controller@d0020000 node: one subnode for > the main interrupt controller, and one subnode for the MSI interrupt > controller. The two irq domains can't be registered on the same DT > node, otherwise when irq_find_host() gets used by of_irq_map_one() > to resolve IRQs of devices, they may find the MSI interrupt > controller instead of the main interrupt controller. Right, I should have read the commit log better ... > Note that both the parent and the child node need to have the > 'interrupt-controller' empty property: > > * The interrupt-controller property is needed in the main > interrupt controller node (interrupt-controller@d0020000) because > the of_irq_init() function skips nodes that are matching the given > compatible string, but that don't have the interrupt-controller > property. > > * The interrupt-controller property is needed in the child > interrupt controller node (main-intc@d0020000) otherwise the > resolution done by of_irq_map_one() doesn't work. If you add compatible properties to the children, that would work I suppose. > So really, the thing is that irq_domain_add_linear() registers an IRQ > domain on a specific DT node, and then irq_find_host() finds back an > IRQ domain from a given DT node. So if you have two IRQ domains > registered on the same DT node, then you don't know which one will be > used. > > So if I do the two irq_domain_add_linear() (one for MPIC, one for MSI) > on one single DT node, when the timer driver will request its > interrupt, it turns out that the MSI IRQ domain is used and not the > MPIC IRQ domain, even though the timer has <&mpic> as its interrupt > parent. I still wonder if the real solution shouldn't instead be to make the irq domain code MSI aware. For instance, you don't really need a cell to describe an interrupt because the interrupt number is not a hardware property. So an MSI using device doesn't really needs an "msis" or "interrupts" property, just an "msi-parent", and we can add code to handle as a separate domain even if you have a single device node that can do both level and message interrupts. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Dear Arnd Bergmann, On Tue, 26 Mar 2013 18:38:22 +0000, Arnd Bergmann wrote: > > Note that both the parent and the child node need to have the > > 'interrupt-controller' empty property: > > > > * The interrupt-controller property is needed in the main > > interrupt controller node (interrupt-controller@d0020000) > > because the of_irq_init() function skips nodes that are matching > > the given compatible string, but that don't have the > > interrupt-controller property. > > > > * The interrupt-controller property is needed in the child > > interrupt controller node (main-intc@d0020000) otherwise the > > resolution done by of_irq_map_one() doesn't work. > > If you add compatible properties to the children, that would work > I suppose. To which children? Only to the main-intc children? If so, armada_370_xp_mpic_of_init() would be called with a 'device_node *' that points to the main-intc, correct? Then it would have to go back up in the Device Tree to find the msi node? It's doable of course, but sounds strange, no? To me, the topology of the hardware is really that a single device provides two features: the main interrupt controller and the MSI interrupt controller. But I will adapt to whatever DT binding you propose. > > So really, the thing is that irq_domain_add_linear() registers an > > IRQ domain on a specific DT node, and then irq_find_host() finds > > back an IRQ domain from a given DT node. So if you have two IRQ > > domains registered on the same DT node, then you don't know which > > one will be used. > > > > So if I do the two irq_domain_add_linear() (one for MPIC, one for > > MSI) on one single DT node, when the timer driver will request its > > interrupt, it turns out that the MSI IRQ domain is used and not the > > MPIC IRQ domain, even though the timer has <&mpic> as its interrupt > > parent. > > I still wonder if the real solution shouldn't instead be to make the > irq domain code MSI aware. For instance, you don't really need a > cell to describe an interrupt because the interrupt number is > not a hardware property. So an MSI using device doesn't really > needs an "msis" or "interrupts" property, just an "msi-parent", > and we can add code to handle as a separate domain even if you > have a single device node that can do both level and message > interrupts. So the msi-parent property should point to the single mpic node? But then the IRQ domain for MSI cannot be registered on this single MPIC node. Then, what would be the first argument of: armada_370_xp_msi_domain = irq_domain_add_linear(msi_node, PCI_MSI_DOORBELL_NR, &armada_370_xp_msi_irq_ops, NULL); and how would the PCIe driver get a pointer to this IRQ domain? (In the currently proposed code, it just does a irq_find_host(), which sounded very simple and straightforward). To clarify: I really don't mind reworking the code, but unfortunately "make the IRQ domain code MSI aware" is too vague for me to understand the direction you're thinking of. Thanks for your review and comments! Thomas
On Tuesday 26 March 2013, Thomas Petazzoni wrote: > Dear Arnd Bergmann, > > On Tue, 26 Mar 2013 18:38:22 +0000, Arnd Bergmann wrote: > > > > Note that both the parent and the child node need to have the > > > 'interrupt-controller' empty property: > > > > > > * The interrupt-controller property is needed in the main > > > interrupt controller node (interrupt-controller@d0020000) > > > because the of_irq_init() function skips nodes that are matching > > > the given compatible string, but that don't have the > > > interrupt-controller property. > > > > > > * The interrupt-controller property is needed in the child > > > interrupt controller node (main-intc@d0020000) otherwise the > > > resolution done by of_irq_map_one() doesn't work. > > > > If you add compatible properties to the children, that would work > > I suppose. > > To which children? Only to the main-intc children? If so, > armada_370_xp_mpic_of_init() would be called with a 'device_node *' > that points to the main-intc, correct? Then it would have to go back up > in the Device Tree to find the msi node? It's doable of course, but > sounds strange, no? I was thinking of registering two init functions as well. > To me, the topology of the hardware is really that a single device > provides two features: the main interrupt controller and the MSI > interrupt controller. But I will adapt to whatever DT binding you > propose. My preference would be to have no sub-nodes but rather make the code deal with an interrupt controller that has an interrupt domain for regular IRQs but can also handle MSI. > > I still wonder if the real solution shouldn't instead be to make the > > irq domain code MSI aware. For instance, you don't really need a > > cell to describe an interrupt because the interrupt number is > > not a hardware property. So an MSI using device doesn't really > > needs an "msis" or "interrupts" property, just an "msi-parent", > > and we can add code to handle as a separate domain even if you > > have a single device node that can do both level and message > > interrupts. > > So the msi-parent property should point to the single mpic node? But > then the IRQ domain for MSI cannot be registered on this single MPIC > node. Then, what would be the first argument of: > > armada_370_xp_msi_domain = > irq_domain_add_linear(msi_node, PCI_MSI_DOORBELL_NR, > &armada_370_xp_msi_irq_ops, NULL); > > and how would the PCIe driver get a pointer to this IRQ domain? (In the > currently proposed code, it just does a irq_find_host(), which sounded > very simple and straightforward). I guess one way would be to have a single domain that is responsible for the controller and handles both MSI and legacy interrupts. That could probably be done without changes to the interrupt domain code. Another option would be to add an irq_domain_add_msi() function that creates a domain which is also tied to the same device node but does not interact with it when going through the legacy interrupts. You could then add a matching msi_find_host() or irq_find_msi_host() function to return the domain. > To clarify: I really don't mind reworking the code, but unfortunately > "make the IRQ domain code MSI aware" is too vague for me to understand > the direction you're thinking of. I don't have specific code in mind yet, mainly playing around with the possibilities for now. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 26, 2013 at 09:46:13PM +0100, Thomas Petazzoni wrote: > To me, the topology of the hardware is really that a single device > provides two features: the main interrupt controller and the MSI > interrupt controller. But I will adapt to whatever DT binding you > propose. No.. the HW is a single device that provides an interrupt on register write capability, so it ideally should be a single DT node.. The need to distinguish MSI vs IPI vs other usage is completely a side effect of how Linux's IRQ and PCI layers are hooked together today. > > I still wonder if the real solution shouldn't instead be to make the > > irq domain code MSI aware. For instance, you don't really need a > > cell to describe an interrupt because the interrupt number is Some kind of generic way for an irq chip driver to say 'here, I can provide some MSI interrupts' and then for the PCI layer to say 'irq layer, find me a driver that can provision a MSI with XXX properties' ? This need to stack an irq chip under a MSI is not something I think the kernel has had to support before, so new common code is probably needed... The interrupt chip should not need to know what the ultimate consumer of the interrupt capability will be, it just needs to tell the consumer 'write D to physical address A and irq I will trigger'. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 26 March 2013, Thomas Petazzoni wrote: > + msimask = readl_relaxed(per_cpu_int_base + > + ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS) > + & PCI_MSI_DOORBELL_MASK; > + > + writel(~PCI_MSI_DOORBELL_MASK, per_cpu_int_base + > + ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS); > + Upon reading this code again, I stumbled over the barriers. You use a readl_relaxed() without barrier but a writel() with barrier. Is that intentional? Are you sure that you don't need a full readl() to guarantee that all inbound DMA that was sent by the device before the MSI message has arrived by the time the interrupt handler function is called? It depends on the implementation of the MSI controller whether that guarantee is already made by the fact that you are handling the interrupt. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Dear Jason Gunthorpe, On Tue, 26 Mar 2013 12:02:45 -0600, Jason Gunthorpe wrote: > > This commit introduces the support for the MSI interrupts in the > > armada-370-xp interrupt controller driver. It registers an IRQ > > domain associated with the 'msi' node in the Device Tree, which the > > PCI controller will refer to in a followup commit. > > I've got some MSI stuff working on Kirkwood using the doorbells, so I > can confirm this general approach works in HW. > > What do you think it would take to extend this to other Marvell SOCs? A bit of time. It is on my list to use the PCIe driver on Kirkwood and other Marvell SoCs as well. > BTW, in my testing on Kirkwood I found that register ..40010 in the > PEX had to be set to the internal register base to make this work. Did > you find something similar? The original code I used was prepared by Lior Amsalem, and it seems like it was not needed, maybe because the bootloader had already configured the PCIe interfaces. I'll have a look to see if those registers already contain the right values, but certainly it would be safer if the PCIe driver was setting their value. Not a big deal to fix, I believe. > > The MSI interrupts use the 16 high doorbells, and are therefore > > notified using IRQ1 of the main interrupt controller. > > I was initially a bit surprised this was the high doorbell bits, but I > checked and it is right, the 16 bit MSI maps to the high two bytes in > the 32 bit MemWr TLP. Right. The low 8 bits are used for the IPIs, and the high 16 bits are used for MSIs. I think it has been chosen to be like this because then we have two different interrupt numbers for the IPIs and the MSIs, which make the thing easy to handle in the IRQ controller driver. > FWIW, MSI-X is not restricted to 16 bits, so if you can detect from > the PCI layer if it is setting up MSI or MSI-X you could allocate low > bits first to MSI-X and high bits first to MSI, increasing the number > of available MSI/MSI-X vectors. This could be an improvement. There are also other, non-per-CPU, doorbell interrupts that could potentially be used. Can we consider this a possible improvement, and not something that is fundamentally necessary? For now, I'm trying to get the current feature set merged, and not necessarily to extend it to cover all possible features of the hardware. > > + - marvell,doorbell: Gives the physical address at which PCIe > > + devices should write to signal an MSI interrupt. > > Why is this necessary? Can't the doorbell register physical address be > computed by the driver? AFAIK there is no possibility for address > translation on SOC inbound TLPs. It is the responsibility of the PCIe driver to prepare the 'struct msi_msg', which contains the physical address at which the PCIe device should write to trigger an MSI. But this physical address is part of the interrupt controller registers, so there is no way for the PCIe driver to magically know about it. Please see the code where we're using this: https://github.com/MISL-EBU-System-SW/mainline-public/blob/marvell-pcie-msi-v1/drivers/pci/host/pci-mvebu.c#L844. If you think this physical address can be "computed" by the driver, I'm all open to suggestions. > Thinking about it a bit, maybe less magic code is needed here, be > explicit about the available interrupts in the DT: > > pcie-controller { > msi-interrupts = <0xd0020a04 (1<<16) &msi 16 > 0xd0020a04 (1<<17) &msi 17 > [..] > msi-x-interrupts = <0xd0020a04 (1<<1) &msi 1 > 0xd0020a04 (1<<2) &msi 2 > [..] > > There is a better chance of that supporting other Marvell SOCs.. Not > sure, just throwing it out there. Isn't that very verbose, to list each and every MSI interrupt, bit per bit? I'm fine with doing that (except maybe implement both MSI and MSI-X support, I'd like to stick with the current feature set for now), but it sounds like a lot more code in the DT and a lot more code in the driver to parse this... just to get the exact same feature. Arnd, what is your feeling about this suggestion? > Regarding the sub node, I'm not sure it should be necessary. You > don't need to differentiate the MSI vs non-MSI case in the doorbell > register at the interrupt controller level. Eg this .. > > > +#ifdef CONFIG_PCI_MSI > > +static struct irq_chip armada_370_xp_msi_irq_chip = { > > + .name = "armada_370_xp_msi_irq", > > + .irq_enable = unmask_msi_irq, > > + .irq_disable = mask_msi_irq, > > + .irq_mask = mask_msi_irq, > > + .irq_unmask = unmask_msi_irq, > > +}; > > .. isn't necessary for doorbell. With MSI cases on other archs there > is no local MSI interrupt controller, the MSI MemWr TLP directly > creates an interrupt at the CPU. This requires using the *_msi_irq > functions to control the interrupt at the source, since there is no > control at the destination. > > However, Marvell's doorbell can be controlled at the destination, so > it is better to handle it that way, especially since it creates > symmetry with the IPI usage. Hum, ok. But the MSI and IPI are handled quite differently: MSIs have to call handle_IRQ(), while IPIs have to call handle_IPI(), so you'd still have to distinguish between IPIs and MSIs. In the current driver, IPIs don't have an associated irq_chip structure. > I used: > priv->gc = irq_alloc_generic_chip("kirkwood_doorbell_irq", 1, > irq_base, priv->base, > handle_fasteoi_irq); > [..] > ct->regs.mask = 4; > ct->regs.eoi = 0; > ct->chip.irq_eoi = irq_gc_eoi_inv; > ct->chip.irq_mask = irq_gc_mask_clr_bit; > ct->chip.irq_unmask = irq_gc_mask_set_bit; > > Which should work correctly for the IPI and MSI cases. The > handle_fasteoi_irq is important. I don't understand how this can end up calling handle_IPI() when IPIs are raised. There are no IPIs on Kirkwood, so are you sure the Kirkwood logic can apply to the SMP case of Armada XP, which requires IPIs? > > #ifdef CONFIG_SMP > > void armada_mpic_send_doorbell(const struct cpumask *mask, > > unsigned int irq) { > > @@ -220,12 +280,39 @@ armada_370_xp_handle_irq(struct pt_regs *regs) > > if (irqnr > 1022) > > break; > > > > - if (irqnr > 0) { > > + if (irqnr > 1) { > > irqnr = > > irq_find_mapping(armada_370_xp_mpic_domain, irqnr); > > handle_IRQ(irqnr, regs); > > continue; > > } > > + > > +#ifdef CONFIG_PCI_MSI > > + /* MSI handling */ > > + if (irqnr == 1) { > > + u32 msimask, msinr; > > + > > + msimask = readl_relaxed(per_cpu_int_base + > > + > > ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS) > > + & PCI_MSI_DOORBELL_MASK; > > + > > + writel(~PCI_MSI_DOORBELL_MASK, > > per_cpu_int_base + > > + ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS); > > Be careful here, the ordering of the EOI in relation to the handler is > important. If you use my stuff from above then the MSI and IPI cases > are identical and the core code handles ack'ing the doorbell via > irq_eoi at the right moment, you can just uniformly iterate over all > 32 bits and there is no need to care if it is MSI or IPI. Again, I don't see how it's possible to not care whether it's MSI or IPI. IPIs have to call handle_IPI() which is a ARM-specific call, and I don't understand how handle_fasteoi_irq() would end up calling handle_IPI(). > Also, I'm not super familiar with the irq stuff, but is > irq_find_mapping the best way? Most of the drivers I looked at used > irq_alloc_descs to get a contiguous range of irq numbers and then just > used a simple offset in the handle_irq... I'll let Arnd answer this one, but I'm pretty sure that using IRQ domains is the way to go. The fact that a number of drivers don't yet use IRQ domains is maybe just because they haven't been converted yet. Thanks, Thomas
On Tuesday 26 March 2013, Thomas Petazzoni wrote: > > FWIW, MSI-X is not restricted to 16 bits, so if you can detect from > > the PCI layer if it is setting up MSI or MSI-X you could allocate low > > bits first to MSI-X and high bits first to MSI, increasing the number > > of available MSI/MSI-X vectors. > > This could be an improvement. There are also other, non-per-CPU, > doorbell interrupts that could potentially be used. Can we consider > this a possible improvement, and not something that is fundamentally > necessary? For now, I'm trying to get the current feature set merged, > and not necessarily to extend it to cover all possible features of the > hardware. If we are extending the DT binding for the current feature, we should at least think about how it would look like for future extensions, to make sure it won't be fundamentally incompatible. > > > + - marvell,doorbell: Gives the physical address at which PCIe > > > + devices should write to signal an MSI interrupt. > > > > Why is this necessary? Can't the doorbell register physical address be > > computed by the driver? AFAIK there is no possibility for address > > translation on SOC inbound TLPs. > > It is the responsibility of the PCIe driver to prepare the 'struct > msi_msg', which contains the physical address at which the PCIe device > should write to trigger an MSI. But this physical address is part of > the interrupt controller registers, so there is no way for the PCIe > driver to magically know about it. If we introduce an irq_find_msi_host() interface, we can also introduce an interface to return the doorbell register, or more. I suppose we could actually have a generic version of your mvebu_pcie_setup_msi_irq() function that looks up the domain from the device node and calls a new irq_domain_ops function, which allocates a free MSI hwirq number, creates a mapping for it, and fills out a struct msi_msg with the doorbell register and data. > > Thinking about it a bit, maybe less magic code is needed here, be > > explicit about the available interrupts in the DT: > > > > pcie-controller { > > msi-interrupts = <0xd0020a04 (1<<16) &msi 16 > > 0xd0020a04 (1<<17) &msi 17 > > [..] > > msi-x-interrupts = <0xd0020a04 (1<<1) &msi 1 > > 0xd0020a04 (1<<2) &msi 2 > > [..] > > > > There is a better chance of that supporting other Marvell SOCs.. Not > > sure, just throwing it out there. > > Isn't that very verbose, to list each and every MSI interrupt, bit per > bit? I'm fine with doing that (except maybe implement both MSI and > MSI-X support, I'd like to stick with the current feature set for now), > but it sounds like a lot more code in the DT and a lot more code in the > driver to parse this... just to get the exact same feature. > > Arnd, what is your feeling about this suggestion? I think we should only need an msi-parent property and let the details be handled by the irq driver. > > Also, I'm not super familiar with the irq stuff, but is > > irq_find_mapping the best way? Most of the drivers I looked at used > > irq_alloc_descs to get a contiguous range of irq numbers and then just > > used a simple offset in the handle_irq... > > I'll let Arnd answer this one, but I'm pretty sure that using IRQ > domains is the way to go. The fact that a number of drivers don't yet > use IRQ domains is maybe just because they haven't been converted yet. Yes, irq_find_mapping is what we should be using here. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Dear Arnd Bergmann, On Tue, 26 Mar 2013 21:10:15 +0000, Arnd Bergmann wrote: > > To which children? Only to the main-intc children? If so, > > armada_370_xp_mpic_of_init() would be called with a 'device_node *' > > that points to the main-intc, correct? Then it would have to go back up > > in the Device Tree to find the msi node? It's doable of course, but > > sounds strange, no? > > I was thinking of registering two init functions as well. But then which of the two init functions would do the of_iomap() (or whatever ioremap()ing function you use) ? Remember, the very reason what we have *one* driver for both interrupt controllers is because the registers are completely mixed. So to me it's really the case of *one* device providing *two* features, like a device that would be both an Ethernet interface and a SPI controller. You have a single ->probe() function that gets called when the device is detected, and this ->probe() function registers both an Ethernet interface and a SPI controller. To me, the case we have here is really identical: we have one single set of registers that provide multiple features. > > To me, the topology of the hardware is really that a single device > > provides two features: the main interrupt controller and the MSI > > interrupt controller. But I will adapt to whatever DT binding you > > propose. > > My preference would be to have no sub-nodes but rather make the > code deal with an interrupt controller that has an interrupt domain > for regular IRQs but can also handle MSI. Hum, ok. > > > I still wonder if the real solution shouldn't instead be to make the > > > irq domain code MSI aware. For instance, you don't really need a > > > cell to describe an interrupt because the interrupt number is > > > not a hardware property. So an MSI using device doesn't really > > > needs an "msis" or "interrupts" property, just an "msi-parent", > > > and we can add code to handle as a separate domain even if you > > > have a single device node that can do both level and message > > > interrupts. > > > > So the msi-parent property should point to the single mpic node? But > > then the IRQ domain for MSI cannot be registered on this single MPIC > > node. Then, what would be the first argument of: > > > > armada_370_xp_msi_domain = > > irq_domain_add_linear(msi_node, PCI_MSI_DOORBELL_NR, > > &armada_370_xp_msi_irq_ops, NULL); > > > > and how would the PCIe driver get a pointer to this IRQ domain? (In the > > currently proposed code, it just does a irq_find_host(), which sounded > > very simple and straightforward). > > I guess one way would be to have a single domain that is responsible > for the controller and handles both MSI and legacy interrupts. That > could probably be done without changes to the interrupt domain code. > > Another option would be to add an irq_domain_add_msi() function that > creates a domain which is also tied to the same device node but does > not interact with it when going through the legacy interrupts. > You could then add a matching msi_find_host() or irq_find_msi_host() > function to return the domain. This option seems doable. Not sure yet how to do it exactly, but at least I understand the idea. > > To clarify: I really don't mind reworking the code, but unfortunately > > "make the IRQ domain code MSI aware" is too vague for me to understand > > the direction you're thinking of. > > I don't have specific code in mind yet, mainly playing around with the > possibilities for now. Sure, I understand. But I also don't have specific ideas: the current code works fine for me, and I don't find it really ugly. So if you don't point me in the direction that you think would make the code less ugly, then I'm lost. Best regards, Thomas
Dear Jason Gunthorpe, On Tue, 26 Mar 2013 15:14:03 -0600, Jason Gunthorpe wrote: > On Tue, Mar 26, 2013 at 09:46:13PM +0100, Thomas Petazzoni wrote: > > > To me, the topology of the hardware is really that a single device > > provides two features: the main interrupt controller and the MSI > > interrupt controller. But I will adapt to whatever DT binding you > > propose. > > No.. the HW is a single device that provides an interrupt on register > write capability, so it ideally should be a single DT node.. No, here you're only talking about the interrupt on register write capability (used for doorbells), but the interrupt controller is also used for all other devices. Not only IPIs and MSIs are handled by this piece of code. > The need to distinguish MSI vs IPI vs other usage is completely a side > effect of how Linux's IRQ and PCI layers are hooked together today. Yes. And since I live in today's world, I try to adapt somewhat to it :-) > > > I still wonder if the real solution shouldn't instead be to make the > > > irq domain code MSI aware. For instance, you don't really need a > > > cell to describe an interrupt because the interrupt number is > > Some kind of generic way for an irq chip driver to say 'here, I can > provide some MSI interrupts' and then for the PCI layer to say 'irq > layer, find me a driver that can provision a MSI with XXX properties' ? > > This need to stack an irq chip under a MSI is not something I think > the kernel has had to support before, so new common code is probably > needed... > > The interrupt chip should not need to know what the ultimate consumer > of the interrupt capability will be, it just needs to tell the > consumer 'write D to physical address A and irq I will trigger'. I honestly don't see much difference between what you're saying here and what the proposed code is doing. The IRQ driver says "hay, I provide two IRQ domains, one is named mpic, one is named msi". And then all the regular devices have interrupt-parent = <&mpic> to say "I use legacy interrupts from that controller", and the PCIe controller has msi-parent = <&msi> to say "I use MSI interrupt from that controller". Best regards, Thomas
Dear Arnd Bergmann, On Tue, 26 Mar 2013 21:15:40 +0000, Arnd Bergmann wrote: > On Tuesday 26 March 2013, Thomas Petazzoni wrote: > > + msimask = readl_relaxed(per_cpu_int_base + > > + ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS) > > + & PCI_MSI_DOORBELL_MASK; > > + > > + writel(~PCI_MSI_DOORBELL_MASK, per_cpu_int_base + > > + ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS); > > + > > Upon reading this code again, I stumbled over the barriers. You use > a readl_relaxed() without barrier but a writel() with barrier. Is > that intentional? Are you sure that you don't need a full readl() > to guarantee that all inbound DMA that was sent by the device before > the MSI message has arrived by the time the interrupt handler function > is called? It depends on the implementation of the MSI controller whether > that guarantee is already made by the fact that you are handling the > interrupt. This question I will have to raise to the Marvell HW engineers, the datasheet does not go into these considerations. For now, I'm rather trying to validate the 'architecture' of the code: DT binding, interaction between the IRQ controller driver and the PCIe driver, etc. Thanks, Thomas
Dear Arnd Bergmann, On Tue, 26 Mar 2013 21:31:45 +0000, Arnd Bergmann wrote: > On Tuesday 26 March 2013, Thomas Petazzoni wrote: > > > FWIW, MSI-X is not restricted to 16 bits, so if you can detect from > > > the PCI layer if it is setting up MSI or MSI-X you could allocate low > > > bits first to MSI-X and high bits first to MSI, increasing the number > > > of available MSI/MSI-X vectors. > > > > This could be an improvement. There are also other, non-per-CPU, > > doorbell interrupts that could potentially be used. Can we consider > > this a possible improvement, and not something that is fundamentally > > necessary? For now, I'm trying to get the current feature set merged, > > and not necessarily to extend it to cover all possible features of the > > hardware. > > If we are extending the DT binding for the current feature, we should > at least think about how it would look like for future extensions, to > make sure it won't be fundamentally incompatible. Sure. > > > > + - marvell,doorbell: Gives the physical address at which PCIe > > > > + devices should write to signal an MSI interrupt. > > > > > > Why is this necessary? Can't the doorbell register physical address be > > > computed by the driver? AFAIK there is no possibility for address > > > translation on SOC inbound TLPs. > > > > It is the responsibility of the PCIe driver to prepare the 'struct > > msi_msg', which contains the physical address at which the PCIe device > > should write to trigger an MSI. But this physical address is part of > > the interrupt controller registers, so there is no way for the PCIe > > driver to magically know about it. > > If we introduce an irq_find_msi_host() interface, we can also introduce > an interface to return the doorbell register, or more. I suppose > we could actually have a generic version of your mvebu_pcie_setup_msi_irq() > function that looks up the domain from the device node and calls > a new irq_domain_ops function, which allocates a free MSI hwirq number, > creates a mapping for it, and fills out a struct msi_msg with the > doorbell register and data. Ok, sounds like a plan. I must admit I'm not very familiar with the IRQ domain code, but I guess I should take this as an opportunity to become a little bit more familiar :) > > > Thinking about it a bit, maybe less magic code is needed here, be > > > explicit about the available interrupts in the DT: > > > > > > pcie-controller { > > > msi-interrupts = <0xd0020a04 (1<<16) &msi 16 > > > 0xd0020a04 (1<<17) &msi 17 > > > [..] > > > msi-x-interrupts = <0xd0020a04 (1<<1) &msi 1 > > > 0xd0020a04 (1<<2) &msi 2 > > > [..] > > > > > > There is a better chance of that supporting other Marvell SOCs.. Not > > > sure, just throwing it out there. > > > > Isn't that very verbose, to list each and every MSI interrupt, bit per > > bit? I'm fine with doing that (except maybe implement both MSI and > > MSI-X support, I'd like to stick with the current feature set for now), > > but it sounds like a lot more code in the DT and a lot more code in the > > driver to parse this... just to get the exact same feature. > > > > Arnd, what is your feeling about this suggestion? > > I think we should only need an msi-parent property and let the details > be handled by the irq driver. Ok. Having the irq driver allocate the MSI interrupt number as you suggested above seems a good idea. > > > Also, I'm not super familiar with the irq stuff, but is > > > irq_find_mapping the best way? Most of the drivers I looked at used > > > irq_alloc_descs to get a contiguous range of irq numbers and then just > > > used a simple offset in the handle_irq... > > > > I'll let Arnd answer this one, but I'm pretty sure that using IRQ > > domains is the way to go. The fact that a number of drivers don't yet > > use IRQ domains is maybe just because they haven't been converted yet. > > Yes, irq_find_mapping is what we should be using here. Ok, thanks. Thomas
On Tuesday 26 March 2013, Thomas Petazzoni wrote: > On Tue, 26 Mar 2013 21:10:15 +0000, Arnd Bergmann wrote: > > > > To which children? Only to the main-intc children? If so, > > > armada_370_xp_mpic_of_init() would be called with a 'device_node *' > > > that points to the main-intc, correct? Then it would have to go back up > > > in the Device Tree to find the msi node? It's doable of course, but > > > sounds strange, no? > > > > I was thinking of registering two init functions as well. > > But then which of the two init functions would do the of_iomap() (or > whatever ioremap()ing function you use) ? I agree this is where it get a little fishy, hence my preference to use a single node. > Remember, the very reason what we have one driver for both interrupt > controllers is because the registers are completely mixed. So to me > it's really the case of one device providing two features, like a > device that would be both an Ethernet interface and a SPI controller. > You have a single ->probe() function that gets called when the device > is detected, and this ->probe() function registers both an Ethernet > interface and a SPI controller.b > > To me, the case we have here is really identical: we have one single set > of registers that provide multiple features. The problem here is that the irq subsystem makes certain assumptions about one device node being the controller and mapping to a single irq domain, so you'd have to cheat one way or another. Using the sub-nodes the way you do in your patch is bending the rules for the device tree binding, which I think is more problematic than bending the rules for the code. Regarding the call to of_iomap(), you could work around it by having a static variable in the driver that remembers the mmio address and gets set by whichever device node init function gets called first. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 26, 2013 at 10:16:54PM +0100, Thomas Petazzoni wrote: > > FWIW, MSI-X is not restricted to 16 bits, so if you can detect from > > the PCI layer if it is setting up MSI or MSI-X you could allocate low > > bits first to MSI-X and high bits first to MSI, increasing the number > > of available MSI/MSI-X vectors. > > This could be an improvement. There are also other, non-per-CPU, > doorbell interrupts that could potentially be used. Can we consider > this a possible improvement, and not something that is fundamentally > necessary? For now, I'm trying to get the current feature set merged, > and not necessarily to extend it to cover all possible features of the > hardware. A major point of MSI is to be able to direct interrupts on a CPU by CPU basis. Looks like XP has per-cpu and all-cpu doorbell bits? Combined with this: > It is the responsibility of the PCIe driver to prepare the 'struct > msi_msg', which contains the physical address at which the PCIe device > should write to trigger an MSI. But this physical address is part of Makes me think the split of responsibility created by moving the MSI ops into the PCI host structure is not correct. The PCIe host driver just seems to get in the way, it has no knowledge it is adding to the process. irqchip knows: - what the physical address of the doorbell is - how to construct an address that is per-cpu or all-cpu - which bits in the doorbell registers are allocated and which are free pci has none of that info. Looking at this some more, there is tonnes of stuff in linux that when a PCI MSI is allocated a special IRQ number is created for it that has special properties - eg set_affinity on that number actually goes into the MSI table and changes it. The cleanest would be to keep the doorbell driver purely related to the doorbell and when a request for a PCI MSI comes in allocate a new irq_chip (like arch/x86/kernel/apic/io_apic.c does) that has all the special PCI stuff and chain it to the proper bit in the doorbell. Optimizing to remove function calls from the interrupt stack could happen later. > > However, Marvell's doorbell can be controlled at the destination, so > > it is better to handle it that way, especially since it creates > > symmetry with the IPI usage. > > Hum, ok. But the MSI and IPI are handled quite differently: MSIs have > to call handle_IRQ(), while IPIs have to call handle_IPI(), so you'd > still have to distinguish between IPIs and MSIs. In the current driver, > IPIs don't have an associated irq_chip structure. Once you have a proper generic interrupt driver you can go ahead and use request_irq to grab N bits of the doorbell register and assign them to a handler that only calls handle_IPI(ipinr,get_irq_regs()). It is not necessary to keep IPI and the irqchip driver convoluted together. > Again, I don't see how it's possible to not care whether it's MSI or > IPI. IPIs have to call handle_IPI() which is a ARM-specific call, and I > don't understand how handle_fasteoi_irq() would end up calling > handle_IPI(). - The main IRQ vector is entered, it decodes the main cause register and calls the handler for the doorbell - The doorbell handler is setup as a chained handler and it uses handle_fasteoi_irq to enter into armada_370_xp_handle_irq - armada_370_xp_handle_irq then runs through all bits and calls their handlers - the handler for IPI bits is associated with the IPI handler that simply calls handle_IPI(...) handle_fasteoi_irq acks's and clears the handled bits in the doorbell register at the proper time. > I'll let Arnd answer this one, but I'm pretty sure that using IRQ > domains is the way to go. The fact that a number of drivers don't yet > use IRQ domains is maybe just because they haven't been converted yet. Maybe, but they have irq domain code as well.. I'm curious about the answer too :) Jason -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tuesday 26 March 2013, Jason Gunthorpe wrote: > > I'll let Arnd answer this one, but I'm pretty sure that using IRQ > > domains is the way to go. The fact that a number of drivers don't yet > > use IRQ domains is maybe just because they haven't been converted yet. > > Maybe, but they have irq domain code as well.. I'm curious about the > answer too :) To expand on my previous answer: irq_alloc_descs can require a lot of memory if you have a lot of MSIs. If the driver only has 16 MSIs, there is probably no reason to not assign all of them at once, but if you have 2048 MSIs, creating the mapping only when you need it is much more space efficient. Arnd -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Dear Jason Gunthorpe, On Tue, 26 Mar 2013 15:55:46 -0600, Jason Gunthorpe wrote: > > > FWIW, MSI-X is not restricted to 16 bits, so if you can detect from > > > the PCI layer if it is setting up MSI or MSI-X you could allocate low > > > bits first to MSI-X and high bits first to MSI, increasing the number > > > of available MSI/MSI-X vectors. > > > > This could be an improvement. There are also other, non-per-CPU, > > doorbell interrupts that could potentially be used. Can we consider > > this a possible improvement, and not something that is fundamentally > > necessary? For now, I'm trying to get the current feature set merged, > > and not necessarily to extend it to cover all possible features of the > > hardware. > > A major point of MSI is to be able to direct interrupts on a CPU by > CPU basis. Looks like XP has per-cpu and all-cpu doorbell bits? Yes. Two per-CPU interrupts, IRQ 0 for the low 16 bits of the first doorbell register, IRQ 1 for the high 16 bits of the first doorbell register. And then, three global interrupts, each with a 32 bits register associated to trigger the interrupt. At least, that's my understanding of the datasheet. > Combined with this: > > > It is the responsibility of the PCIe driver to prepare the 'struct > > msi_msg', which contains the physical address at which the PCIe device > > should write to trigger an MSI. But this physical address is part of > > Makes me think the split of responsibility created by moving the MSI > ops into the PCI host structure is not correct. > > The PCIe host driver just seems to get in the way, it has no knowledge > it is adding to the process. > > irqchip knows: > - what the physical address of the doorbell is > - how to construct an address that is per-cpu or all-cpu > - which bits in the doorbell registers are allocated and which are > free > > pci has none of that info. Wait, wait. Did you look at the Tegra PCIe driver? In the Tegra case, the MSI stuff is handled completely by the PCIe unit, and does not need the special interaction with the IRQ controller driver that I need. The fact that PCIe and MSI handling are completely separate matters on Marvell may just be a specific situation. On other platforms, the physical register that gets written to by PCIe devices to trigger a MSI may well be located within the PCIe interface registers, and therefore the MSI thing should be handled by the PCIe driver. > Looking at this some more, there is tonnes of stuff in linux that when > a PCI MSI is allocated a special IRQ number is created for it that has > special properties - eg set_affinity on that number actually goes into > the MSI table and changes it. > > The cleanest would be to keep the doorbell driver purely related to > the doorbell and when a request for a PCI MSI comes in allocate a new > irq_chip (like arch/x86/kernel/apic/io_apic.c does) that has all the > special PCI stuff and chain it to the proper bit in the doorbell. > > Optimizing to remove function calls from the interrupt stack could > happen later. > > > > However, Marvell's doorbell can be controlled at the destination, so > > > it is better to handle it that way, especially since it creates > > > symmetry with the IPI usage. > > > > Hum, ok. But the MSI and IPI are handled quite differently: MSIs have > > to call handle_IRQ(), while IPIs have to call handle_IPI(), so you'd > > still have to distinguish between IPIs and MSIs. In the current driver, > > IPIs don't have an associated irq_chip structure. > > Once you have a proper generic interrupt driver you can go ahead and > use request_irq to grab N bits of the doorbell register and assign > them to a handler that only calls handle_IPI(ipinr,get_irq_regs()). > > It is not necessary to keep IPI and the irqchip driver convoluted > together. > > > Again, I don't see how it's possible to not care whether it's MSI or > > IPI. IPIs have to call handle_IPI() which is a ARM-specific call, and I > > don't understand how handle_fasteoi_irq() would end up calling > > handle_IPI(). > > - The main IRQ vector is entered, it decodes the main cause register > and calls the handler for the doorbell > - The doorbell handler is setup as a chained handler and it uses > handle_fasteoi_irq to enter into armada_370_xp_handle_irq > - armada_370_xp_handle_irq then runs through all bits and calls their > handlers > - the handler for IPI bits is associated with the IPI handler that > simply calls handle_IPI(...) > > handle_fasteoi_irq acks's and clears the handled bits in the doorbell > register at the proper time. Could you propose some code that implements this? The existing code works fine, and is modeled after what the GIC IRQ driver is doing. I don't say what you're proposing is not interesting, but just that it looks like every time I come with some code, you're suggesting me to 'reinvent' the whole world, and you've never come up with some code to help in a direction or another. Thanks, Thomas
On Tue, Mar 26, 2013 at 11:06:56PM +0100, Thomas Petazzoni wrote: > > The PCIe host driver just seems to get in the way, it has no knowledge > > it is adding to the process. > > > > irqchip knows: > > - what the physical address of the doorbell is > > - how to construct an address that is per-cpu or all-cpu > > - which bits in the doorbell registers are allocated and which are > > free > > > > pci has none of that info. > > Wait, wait. Did you look at the Tegra PCIe driver? In the Tegra case, > the MSI stuff is handled completely by the PCIe unit, and does not need > the special interaction with the IRQ controller driver that I need. Yes, but only briefly. It doesn't matter if the mechanisms are in the PCI-E register block or someplace else. If irqchip is providing the interface then the PCI-E driver would have to also instantiate an irqchip block. This is no different from what would be required to handle INTx on Marvell, the PCI-E driver would have to create a chained irqchip from the PEX interrupt and decode the cause register into INTA/B/C/D. > > > Again, I don't see how it's possible to not care whether it's MSI or > > > IPI. IPIs have to call handle_IPI() which is a ARM-specific call, and I > > > don't understand how handle_fasteoi_irq() would end up calling > > > handle_IPI(). > > > > - The main IRQ vector is entered, it decodes the main cause register > > and calls the handler for the doorbell > > - The doorbell handler is setup as a chained handler and it uses > > handle_fasteoi_irq to enter into armada_370_xp_handle_irq > > - armada_370_xp_handle_irq then runs through all bits and calls their > > handlers > > - the handler for IPI bits is associated with the IPI handler that > > simply calls handle_IPI(...) > > > > handle_fasteoi_irq acks's and clears the handled bits in the doorbell > > register at the proper time. > > Could you propose some code that implements this? The existing code > works fine, and is modeled after what the GIC IRQ driver is doing. I > don't say what you're proposing is not interesting, but just that it > looks like every time I come with some code, you're suggesting me to > 'reinvent' the whole world, and you've never come up with some code to > help in a direction or another. I'll send you my doorbell driver for Kirkwood - it doesn't attempt to generically solve the MSI problem, but the basic template shows how to wire up the doorbell concept to irqchip's generic code, and it works in the MSI role with lots of rather ugly hacking.. The XP driver is taking some small shortcuts because it only uses the doorbell for IPI.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/devicetree/bindings/arm/armada-370-xp-mpic.txt b/Documentation/devicetree/bindings/arm/armada-370-xp-mpic.txt index 61df564..2233d20 100644 --- a/Documentation/devicetree/bindings/arm/armada-370-xp-mpic.txt +++ b/Documentation/devicetree/bindings/arm/armada-370-xp-mpic.txt @@ -4,8 +4,6 @@ Marvell Armada 370 and Armada XP Interrupt Controller Required properties: - compatible: Should be "marvell,mpic" - interrupt-controller: Identifies the node as an interrupt controller. -- #interrupt-cells: The number of cells to define the interrupts. Should be 1. - The cell is the IRQ number - reg: Should contain PMIC registers location and length. First pair for the main interrupt registers, second pair for the per-CPU @@ -14,16 +12,45 @@ Required properties: automatically map to the interrupt controller registers of the current CPU) +The node should have two sub-nodes, each describing one aspect of the +interrupt controller: + * A first node named main-intc that describes the main interrupt + controller itself, receiving interrupts from all devices. + + This sub-node should have the following properties: + - interrupt-controller: identifies the code as an interrupt + controller. + - #interrupt-cells: The number of cells to define the + interrupts. Should be 1. The cell is the IRQ number + + * A second node named msi-intc that is used for Message Signaled + Interrupts of the PCIe bus. + + This sub-node should have the following properties: + - interrupt-controller: identifies the code as an interrupt + controller. + - #interrupt-cells: The number of cells to define the + interrupts. Should be 1. The cell is the IRQ number + - marvell,doorbell: Gives the physical address at which PCIe + devices should write to signal an MSI interrupt. Example: - mpic: interrupt-controller@d0020000 { + interrupt-controller@d0020000 { compatible = "marvell,mpic"; - #interrupt-cells = <1>; - #address-cells = <1>; - #size-cells = <1>; interrupt-controller; reg = <0xd0020a00 0x1d0>, <0xd0021070 0x58>; + + mpic: main-intc@d0020000 { + #interrupt-cells = <1>; + interrupt-controller; + }; + + msi: msi-intc@d0020000 { + #interrupt-cells = <1>; + interrupt-controller; + marvell,doorbell = <0xd0020a04>; + }; }; diff --git a/arch/arm/boot/dts/armada-370-xp.dtsi b/arch/arm/boot/dts/armada-370-xp.dtsi index de054ed..de6569cd 100644 --- a/arch/arm/boot/dts/armada-370-xp.dtsi +++ b/arch/arm/boot/dts/armada-370-xp.dtsi @@ -36,6 +36,12 @@ #interrupt-cells = <1>; interrupt-controller; }; + + msi: msi-intc@d0020000 { + #interrupt-cells = <1>; + interrupt-controller; + marvell,doorbell = <0xd0020a04>; + }; }; coherency-fabric@d0020200 { diff --git a/arch/arm/mach-mvebu/Kconfig b/arch/arm/mach-mvebu/Kconfig index d353249..3254b77 100644 --- a/arch/arm/mach-mvebu/Kconfig +++ b/arch/arm/mach-mvebu/Kconfig @@ -16,6 +16,7 @@ config ARCH_MVEBU select MVEBU_MBUS select MIGHT_HAVE_PCI select PCI_QUIRKS if PCI + select ARCH_SUPPORTS_MSI if PCI if ARCH_MVEBU diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c index 4b2a6d7..3180797 100644 --- a/drivers/irqchip/irq-armada-370-xp.c +++ b/drivers/irqchip/irq-armada-370-xp.c @@ -22,6 +22,7 @@ #include <linux/of_address.h> #include <linux/of_irq.h> #include <linux/irqdomain.h> +#include <linux/msi.h> #include <asm/mach/arch.h> #include <asm/exception.h> #include <asm/smp_plat.h> @@ -51,6 +52,10 @@ #define IPI_DOORBELL_START (0) #define IPI_DOORBELL_END (8) #define IPI_DOORBELL_MASK 0xFF +#define PCI_MSI_DOORBELL_START (16) +#define PCI_MSI_DOORBELL_NR (16) +#define PCI_MSI_DOORBELL_END (32) +#define PCI_MSI_DOORBELL_MASK 0xFFFF0000 static DEFINE_RAW_SPINLOCK(irq_controller_lock); @@ -58,6 +63,10 @@ static void __iomem *per_cpu_int_base; static void __iomem *main_int_base; static struct irq_domain *armada_370_xp_mpic_domain; +#ifdef CONFIG_PCI_MSI +static struct irq_domain *armada_370_xp_msi_domain; +#endif + /* * In SMP mode: * For shared global interrupts, mask/unmask global enable bit @@ -167,6 +176,57 @@ static int armada_370_xp_mpic_irq_map(struct irq_domain *h, return 0; } +#ifdef CONFIG_PCI_MSI +static struct irq_chip armada_370_xp_msi_irq_chip = { + .name = "armada_370_xp_msi_irq", + .irq_enable = unmask_msi_irq, + .irq_disable = mask_msi_irq, + .irq_mask = mask_msi_irq, + .irq_unmask = unmask_msi_irq, +}; + +static int armada_370_xp_msi_map(struct irq_domain *domain, unsigned int virq, + irq_hw_number_t hw) +{ + irq_set_chip_and_handler(virq, &armada_370_xp_msi_irq_chip, + handle_simple_irq); + set_irq_flags(virq, IRQF_VALID); + + return 0; +} + +static const struct irq_domain_ops armada_370_xp_msi_irq_ops = { + .map = armada_370_xp_msi_map, +}; + +void armada_370_xp_msi_init(struct device_node *node) +{ + struct device_node *msi_node; + u32 reg; + + msi_node = of_get_child_by_name(node, "msi-intc"); + if (!msi_node) + return; + + armada_370_xp_msi_domain = + irq_domain_add_linear(msi_node, PCI_MSI_DOORBELL_NR, + &armada_370_xp_msi_irq_ops, NULL); + if (!armada_370_xp_msi_domain) + panic("Unable to add Armada 370/XP MSI irq domain\n"); + + reg = readl(per_cpu_int_base + ARMADA_370_XP_IN_DRBEL_MSK_OFFS) + | PCI_MSI_DOORBELL_MASK; + + writel(reg, per_cpu_int_base + + ARMADA_370_XP_IN_DRBEL_MSK_OFFS); + + /* Unmask IPI interrupt */ + writel(1, per_cpu_int_base + ARMADA_370_XP_INT_CLEAR_MASK_OFFS); +} +#else +static void armada_370_xp_msi_init(struct device_node *node) { } +#endif /* CONFIG_PCI_MSI */ + #ifdef CONFIG_SMP void armada_mpic_send_doorbell(const struct cpumask *mask, unsigned int irq) { @@ -220,12 +280,39 @@ armada_370_xp_handle_irq(struct pt_regs *regs) if (irqnr > 1022) break; - if (irqnr > 0) { + if (irqnr > 1) { irqnr = irq_find_mapping(armada_370_xp_mpic_domain, irqnr); handle_IRQ(irqnr, regs); continue; } + +#ifdef CONFIG_PCI_MSI + /* MSI handling */ + if (irqnr == 1) { + u32 msimask, msinr; + + msimask = readl_relaxed(per_cpu_int_base + + ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS) + & PCI_MSI_DOORBELL_MASK; + + writel(~PCI_MSI_DOORBELL_MASK, per_cpu_int_base + + ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS); + + for (msinr = PCI_MSI_DOORBELL_START; + msinr < PCI_MSI_DOORBELL_END; msinr++) { + int irq; + + if (!(msimask & BIT(msinr))) + continue; + + irq = irq_find_mapping(armada_370_xp_msi_domain, + msinr - 16); + handle_IRQ(irq, regs); + } + } +#endif + #ifdef CONFIG_SMP /* IPI Handling */ if (irqnr == 0) { @@ -291,6 +378,8 @@ static int __init armada_370_xp_mpic_of_init(struct device_node *node, #endif + armada_370_xp_msi_init(node); + set_handle_irq(armada_370_xp_handle_irq); return 0;
This commit introduces the support for the MSI interrupts in the armada-370-xp interrupt controller driver. It registers an IRQ domain associated with the 'msi' node in the Device Tree, which the PCI controller will refer to in a followup commit. The MSI interrupts use the 16 high doorbells, and are therefore notified using IRQ1 of the main interrupt controller. The Device Tree binding documentation for the armada-370-xp driver is also updated in the process. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> --- .../devicetree/bindings/arm/armada-370-xp-mpic.txt | 39 +++++++-- arch/arm/boot/dts/armada-370-xp.dtsi | 6 ++ arch/arm/mach-mvebu/Kconfig | 1 + drivers/irqchip/irq-armada-370-xp.c | 91 +++++++++++++++++++- 4 files changed, 130 insertions(+), 7 deletions(-)