diff mbox

[V10,2/2] irqchip: gicv2m: Add supports for ARM GICv2m MSI(-X)

Message ID 1415052977-26036-3-git-send-email-suravee.suthikulpanit@amd.com (mailing list archive)
State New, archived
Headers show

Commit Message

Suravee Suthikulpanit Nov. 3, 2014, 10:16 p.m. UTC
From: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>

ARM GICv2m specification extends GICv2 to support MSI(-X) with
a new set of register frame. This patch introduces support for
the non-secure GICv2m register frame. Currently, GICV2m is available
in certain version of GIC-400.

The patch introduces a new property in ARM gic binding, the v2m subnode.
It is optional.

Cc: Marc Zyngier <Marc.Zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mark Rutland <Mark.Rutland@arm.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Will Deacon <Will.Deacon@arm.com>
Cc: Catalin Marinas <Catalin.Marinas@arm.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
---
 Documentation/devicetree/bindings/arm/gic.txt |  53 ++++
 arch/arm64/Kconfig                            |   1 +
 drivers/irqchip/Kconfig                       |   5 +
 drivers/irqchip/Makefile                      |   1 +
 drivers/irqchip/irq-gic-v2m.c                 | 340 ++++++++++++++++++++++++++
 drivers/irqchip/irq-gic-v2m.h                 |   6 +
 drivers/irqchip/irq-gic.c                     |  23 +-
 7 files changed, 425 insertions(+), 4 deletions(-)

Comments

Thomas Gleixner Nov. 3, 2014, 10:51 p.m. UTC | #1
On Mon, 3 Nov 2014, suravee.suthikulpanit@amd.com wrote:
> +static void gicv2m_teardown_msi_irq(struct msi_chip *chip, unsigned int irq)
> +{
> +	int pos;
> +	struct v2m_data *v2m = container_of(chip, struct v2m_data, msi_chip);
> +
> +	spin_lock(&v2m->msi_cnt_lock);

Why do you need an extra lock here? Is that stuff not serialized from
the msi_chip layer already?

If not, why don't we have the serialization there instead of forcing
every callback to implement its own?

> +	pos = irq - v2m->spi_start;

So this assumes that @irq is the hwirq number, right? How does the
calling function know about that? It should only have knowledge about
the virq number if I'm not missing something.

And if I'm missing something, then that msi_chip stuff is seriously
broken.

> +	if (pos >= 0 && pos < v2m->nr_spis)

So you simply avoid the clear bitmap instead of yelling loudly about
being called with completely wrong data?

I would not be surprised if that is related to my question above.

> +		bitmap_clear(v2m->bm, pos, 1);

> +static int gicv2m_setup_msi_irq(struct msi_chip *chip, struct pci_dev *pdev,
> +				struct msi_desc *desc)
> +{
> +	int hwirq, virq, offset;
> +	struct v2m_data *v2m = container_of(chip, struct v2m_data, msi_chip);
> +
> +	if (!desc)
> +		return -EINVAL;

Why on earth does every callback of msi_chip have to check for this?

> +	spin_lock(&v2m->msi_cnt_lock);
> +	offset = bitmap_find_free_region(v2m->bm, v2m->nr_spis, 0);
> +	spin_unlock(&v2m->msi_cnt_lock);
> +	if (offset < 0)
> +		return offset;
> +
> +	hwirq = v2m->spi_start + offset;
> +	virq = __irq_domain_alloc_irqs(v2m->domain, hwirq,
> +				       1, NUMA_NO_NODE, v2m, true);
> +	if (virq < 0) {
> +		gicv2m_teardown_msi_irq(chip, hwirq);
> +		return virq;
> +	}
> +
> +	irq_domain_set_hwirq_and_chip(v2m->domain, virq, hwirq,
> +				&v2m_chip, v2m);
> +
> +	irq_set_msi_desc(hwirq, desc);
> +	irq_set_irq_type(hwirq, IRQ_TYPE_EDGE_RISING);

Sure both calls work perfectly fine as long as virq == hwirq, right?

> +	return 0;

Q: How does this populate virq to the caller? 
A: Not at all
Q: How does the caller know which virq this function assigned?
A: Not at all
Q: How does the device driver know which virq to request?
A: Not at all
Q: Was this patch ever properly tested?
A: Not at all.

I do not care at all how YOU waste your time. But I care very much
about the fact that YOU are wasting MY precious time by exposing me to
your patch trainwrecks. 

Thanks,

	tglx
Suravee Suthikulpanit Nov. 4, 2014, 3:22 a.m. UTC | #2
On 11/3/2014 4:51 PM, Thomas Gleixner wrote:
> On Mon, 3 Nov 2014, suravee.suthikulpanit@amd.com wrote:
>> +static void gicv2m_teardown_msi_irq(struct msi_chip *chip, unsigned int irq)
>> +{
>> +	int pos;
>> +	struct v2m_data *v2m = container_of(chip, struct v2m_data, msi_chip);
>> +
>> +	spin_lock(&v2m->msi_cnt_lock);
>
> Why do you need an extra lock here? Is that stuff not serialized from
> the msi_chip layer already?
>
> If not, why don't we have the serialization there instead of forcing
> every callback to implement its own?

 From the following call paths:
   |--> pci_enable_msi_range
    |--> msi_capability_init
     |--> arch_setup_msi_irqs
      |--> arch_setup_msi_irq
and
   |--> pci_enable_msix
    |--> msix_capability_init
     |--> arch_setup_msi_irqs
      |--> arch_setup_msi_irq

It serialize when a PCI device driver tries to allocate multiple 
interrupts. However, AFAICT, it would not serialize the allocation when 
multiple drivers trying to setup MSI irqs at the same time. I needed 
that to protect the bitmap structure. I also noticed the same in other 
drivers as well.

I can look into this more to see where would be a good point.

>> +	pos = irq - v2m->spi_start;
>
> So this assumes that @irq is the hwirq number, right? How does the
> calling function know about that? It should only have knowledge about
> the virq number if I'm not missing something.
>
> And if I'm missing something, then that msi_chip stuff is seriously
> broken.

It works this way because of the direct mapping (as you noticed). But I 
am planning to change that. See below.

>
>> +	if (pos >= 0 && pos < v2m->nr_spis)
>
> So you simply avoid the clear bitmap instead of yelling loudly about
> being called with completely wrong data?

I'll provide appropriate warnings.

> I would not be surprised if that is related to my question above.

Not quite sure which of the above questions.

>> +	spin_lock(&v2m->msi_cnt_lock);
>> +	offset = bitmap_find_free_region(v2m->bm, v2m->nr_spis, 0);
>> +	spin_unlock(&v2m->msi_cnt_lock);
>> +	if (offset < 0)
>> +		return offset;
>> +
>> +	hwirq = v2m->spi_start + offset;
>> +	virq = __irq_domain_alloc_irqs(v2m->domain, hwirq,
>> +				       1, NUMA_NO_NODE, v2m, true);
>> +	if (virq < 0) {
>> +		gicv2m_teardown_msi_irq(chip, hwirq);
>> +		return virq;
>> +	}
>> +
>> +	irq_domain_set_hwirq_and_chip(v2m->domain, virq, hwirq,
>> +				&v2m_chip, v2m);
>> +
>> +	irq_set_msi_desc(hwirq, desc);
>> +	irq_set_irq_type(hwirq, IRQ_TYPE_EDGE_RISING);
>
> Sure both calls work perfectly fine as long as virq == hwirq, right?

I was running into an issue when calling the 
irq_domain_alloc_irq_parent(), it requires of_phandle_args pointer to be 
passed in. However, this does not work for GICv2m since it does not have 
interrupt information in the device tree. So, I decided at first to use 
direct (virq == hwirq) mapping, which simplifies the code a bit, but 
might not be ideal solution, as you pointed out.

An alternative would be to create a temporary struct of_phandle_args, 
and populate it with the interrupt information for the requested MSI. 
Then pass it to:
   --> irq_domain_alloc_irq_parent
    |--> gic_irq_domain_alloc
      |--> gic_irq_domain_xlate
      |--> gic_irq_domain_map

However, this would still not be ideal if we want to support ACPI. 
Another alternative would be coming up with a dedicate structure to be 
used here. I noticed on X86, it uses struct irq_alloc_info. May be 
that's what we also need here.

> [...]
> I do not care at all how YOU waste your time. But I care very much
> about the fact that YOU are wasting MY precious time by exposing me to
> your patch trainwrecks.

I don't intend to waste yours or anybody's precious time. Sorry if it 
takes a couple iterations to work out the issues. Also, I will try to 
put more comment in my code to make it more clear. Let me know what 
works best for you to work out the issues.

Thanks,

Suravee

>
> Thanks,
>
> 	tglx
>
Thomas Gleixner Nov. 4, 2014, 10:06 a.m. UTC | #3
On Mon, 3 Nov 2014, Suravee Suthikulanit wrote:
> On 11/3/2014 4:51 PM, Thomas Gleixner wrote:
> > On Mon, 3 Nov 2014, suravee.suthikulpanit@amd.com wrote:
> > > +	irq_domain_set_hwirq_and_chip(v2m->domain, virq, hwirq,
> > > +				&v2m_chip, v2m);
> > > +
> > > +	irq_set_msi_desc(hwirq, desc);
> > > +	irq_set_irq_type(hwirq, IRQ_TYPE_EDGE_RISING);
> > 
> > Sure both calls work perfectly fine as long as virq == hwirq, right?
> 
> I was running into an issue when calling the irq_domain_alloc_irq_parent(), it
> requires of_phandle_args pointer to be passed in. However, this does not work
> for GICv2m since it does not have interrupt information in the device tree.
> So, I decided at first to use direct (virq == hwirq) mapping, which simplifies
> the code a bit, but might not be ideal solution, as you pointed out.

It's not only far from ideal. It's not a solution at all. Simply
because there is no guarantee for virq == hwirq.
 
> An alternative would be to create a temporary struct of_phandle_args, and
> populate it with the interrupt information for the requested MSI. Then pass it
> to:
>   --> irq_domain_alloc_irq_parent
>    |--> gic_irq_domain_alloc
>      |--> gic_irq_domain_xlate
>      |--> gic_irq_domain_map
> 
> However, this would still not be ideal if we want to support ACPI. Another

Neither device tree nor ACPI has anything to do with MSI interrupts at
runtime.

All they do is to tell that there is a MSI controller and where the
registers are and in the worst case fixups for a borked MSI_TYPER
register.

So either the TYPER reg or DT/ACPI gives you a fixed hwirq range which
is reserved for MSI. And that's all you need, right?

The MSI interrupt itself has no DT/ACPI information to use at
allocation time simply because you CANNOT decribe a MSI device
interrupt in DT/ACPI by any means.

And you do not need any DT/ACPI information at that point. All you
need is to pick one hwirq out of the existing fixed range and
associate it to a newly allocated virq. That's the only information
the underlying gic domain has to know about, because it needs to
translate from the hwirq to the virq in the low level entry handler
gic_handle_irq().

> alternative would be coming up with a dedicate structure to be used here. I
> noticed on X86, it uses struct irq_alloc_info. May be that's what we also need
> here.

It's a x86 concept to transport X86 specific information in order to
avoid duplicated code all over the place. And x86 MSI support is a
completely different beast than the thing you are dealing with. x86
has no concept of a fixed hwirq range for MSI.

So no, just picking random stuff from random MSI implementations does
not help at all.

> > [...]
> > I do not care at all how YOU waste your time. But I care very much
> > about the fact that YOU are wasting MY precious time by exposing me to
> > your patch trainwrecks.
> 
> I don't intend to waste yours or anybody's precious time. Sorry if it takes a
> couple iterations to work out the issues. Also, I will try to put more comment
> in my code to make it more clear. Let me know what works best for you to work
> out the issues.

By not sending obviously broken and half thought out patches in the
first place.

Thanks,

	tglx
Jiang Liu Nov. 4, 2014, 1:01 p.m. UTC | #4
Hi Suravee,
	You may build a two level hierarchy irqdomains. Use the
utilities in this thread
http://www.spinics.net/lists/arm-kernel/msg374722.html to build an MSI
irqdomain to manage MSI controllers
in PCI devices. And build another irqdomain to manage SPI allocation
in GICv2.
	That is: MSI irqdomain (program MSI registers)  -->
GIV irqdomain (manage SPIs in GICv2 controller)

Regards!
Gerry

On 2014/11/4 6:16, suravee.suthikulpanit@amd.com wrote:
> From: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
> 
> ARM GICv2m specification extends GICv2 to support MSI(-X) with
> a new set of register frame. This patch introduces support for
> the non-secure GICv2m register frame. Currently, GICV2m is available
> in certain version of GIC-400.
> 
> The patch introduces a new property in ARM gic binding, the v2m subnode.
> It is optional.
> 
> Cc: Marc Zyngier <Marc.Zyngier@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Mark Rutland <Mark.Rutland@arm.com>
> Cc: Jason Cooper <jason@lakedaemon.net>
> Cc: Will Deacon <Will.Deacon@arm.com>
> Cc: Catalin Marinas <Catalin.Marinas@arm.com>
> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
> ---
>  Documentation/devicetree/bindings/arm/gic.txt |  53 ++++
>  arch/arm64/Kconfig                            |   1 +
>  drivers/irqchip/Kconfig                       |   5 +
>  drivers/irqchip/Makefile                      |   1 +
>  drivers/irqchip/irq-gic-v2m.c                 | 340 ++++++++++++++++++++++++++
>  drivers/irqchip/irq-gic-v2m.h                 |   6 +
>  drivers/irqchip/irq-gic.c                     |  23 +-
>  7 files changed, 425 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt
> index c7d2fa1..ebf976a 100644
> --- a/Documentation/devicetree/bindings/arm/gic.txt
> +++ b/Documentation/devicetree/bindings/arm/gic.txt
> @@ -96,3 +96,56 @@ Example:
>  		      <0x2c006000 0x2000>;
>  		interrupts = <1 9 0xf04>;
>  	};
> +
> +
> +* GICv2m extension for MSI/MSI-x support (Optional)
> +
> +Certain revisions of GIC-400 supports MSI/MSI-x via V2M register frame(s).
> +This is enabled by specifying v2m sub-node(s).
> +
> +Required properties:
> +
> +- compatible        : The value here should contain "arm,gic-v2m-frame".
> +
> +- msi-controller    : Identifies the node as an MSI controller.
> +
> +- reg               : GICv2m MSI interface register base and size
> +
> +Optional properties:
> +
> +- arm,msi-base-spi  : When the MSI_TYPER register contains an incorrect
> +                      value, this property should contain the SPI base of
> +                      the MSI frame, overriding the HW value.
> +
> +- arm,msi-num-spis  : When the MSI_TYPER register contains an incorrect
> +                      value, this property should contain the number of
> +                      SPIs assigned to the frame, overriding the HW value.
> +
> +Example:
> +
> +	interrupt-controller@e1101000 {
> +		compatible = "arm,gic-400";
> +		#interrupt-cells = <3>;
> +		#address-cells = <2>;
> +		#size-cells = <2>;
> +		interrupt-controller;
> +		interrupts = <1 8 0xf04>;
> +		ranges = <0 0 0 0xe1100000 0 0x100000>;
> +		reg = <0x0 0xe1110000 0 0x01000>,
> +		      <0x0 0xe112f000 0 0x02000>,
> +		      <0x0 0xe1140000 0 0x10000>,
> +		      <0x0 0xe1160000 0 0x10000>;
> +		v2m0: v2m@0x8000 {
> +			compatible = "arm,gic-v2m-frame";
> +			msi-controller;
> +			reg = <0x0 0x80000 0 0x1000>;
> +		};
> +
> +		....
> +
> +		v2mN: v2m@0x9000 {
> +			compatible = "arm,gic-v2m-frame";
> +			msi-controller;
> +			reg = <0x0 0x90000 0 0x1000>;
> +		};
> +	};
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index cde2f72..cbcde2d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -12,6 +12,7 @@ config ARM64
>  	select ARM_ARCH_TIMER
>  	select ARM_GIC
>  	select AUDIT_ARCH_COMPAT_GENERIC
> +	select ARM_GIC_V2M
>  	select ARM_GIC_V3
>  	select BUILDTIME_EXTABLE_SORT
>  	select CLONE_BACKWARDS
> diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
> index 2a48e0a..39ce065 100644
> --- a/drivers/irqchip/Kconfig
> +++ b/drivers/irqchip/Kconfig
> @@ -8,6 +8,11 @@ config ARM_GIC
>  	select IRQ_DOMAIN_HIERARCHY
>  	select MULTI_IRQ_HANDLER
>  
> +config ARM_GIC_V2M
> +	bool
> +	depends on ARM_GIC
> +	depends on PCI && PCI_MSI
> +
>  config GIC_NON_BANKED
>  	bool
>  
> diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
> index 73052ba..3bda951 100644
> --- a/drivers/irqchip/Makefile
> +++ b/drivers/irqchip/Makefile
> @@ -17,6 +17,7 @@ obj-$(CONFIG_ARCH_SUNXI)		+= irq-sun4i.o
>  obj-$(CONFIG_ARCH_SUNXI)		+= irq-sunxi-nmi.o
>  obj-$(CONFIG_ARCH_SPEAR3XX)		+= spear-shirq.o
>  obj-$(CONFIG_ARM_GIC)			+= irq-gic.o irq-gic-common.o
> +obj-$(CONFIG_ARM_GIC_V2M)		+= irq-gic-v2m.o
>  obj-$(CONFIG_ARM_GIC_V3)		+= irq-gic-v3.o irq-gic-common.o
>  obj-$(CONFIG_ARM_NVIC)			+= irq-nvic.o
>  obj-$(CONFIG_ARM_VIC)			+= irq-vic.o
> diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
> new file mode 100644
> index 0000000..fd8d51a
> --- /dev/null
> +++ b/drivers/irqchip/irq-gic-v2m.c
> @@ -0,0 +1,340 @@
> +/*
> + * ARM GIC v2m MSI(-X) support
> + * Support for Message Signaled Interrupts for systems that
> + * implement ARM Generic Interrupt Controller: GICv2m.
> + *
> + * Copyright (C) 2014 Advanced Micro Devices, Inc.
> + * Authors: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> + *          Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
> + *          Brandon Anderson <brandon.anderson@amd.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + */
> +
> +#define pr_fmt(fmt) "GICv2m: " fmt
> +
> +#include <linux/bitmap.h>
> +#include <linux/irq.h>
> +#include <linux/irqdomain.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of_address.h>
> +#include <linux/of_pci.h>
> +#include <linux/pci.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +
> +#include <asm/hardirq.h>
> +#include <asm/irq.h>
> +
> +#include "irqchip.h"
> +#include "irq-gic-v2m.h"
> +
> +/*
> +* MSI_TYPER:
> +*     [31:26] Reserved
> +*     [25:16] lowest SPI assigned to MSI
> +*     [15:10] Reserved
> +*     [9:0]   Numer of SPIs assigned to MSI
> +*/
> +#define V2M_MSI_TYPER			0x008
> +#define V2M_MSI_TYPER_BASE_SHIFT	16
> +#define V2M_MSI_TYPER_BASE_MASK		0x3FF
> +#define V2M_MSI_TYPER_NUM_MASK		0x3FF
> +#define V2M_MSI_SETSPI_NS		0x040
> +#define V2M_MIN_SPI			32
> +#define V2M_MAX_SPI			1019
> +
> +#define V2M_MSI_TYPER_BASE_SPI(x)	\
> +		(((x) >> V2M_MSI_TYPER_BASE_SHIFT) & V2M_MSI_TYPER_BASE_MASK)
> +
> +#define V2M_MSI_TYPER_NUM_SPI(x)	((x) & V2M_MSI_TYPER_NUM_MASK)
> +
> +struct v2m_data {
> +	spinlock_t msi_cnt_lock;
> +	struct msi_chip msi_chip;
> +	struct resource res;      /* GICv2m resource */
> +	void __iomem *base;       /* GICv2m virt address */
> +	unsigned int spi_start;   /* The SPI number that MSIs start */
> +	unsigned int nr_spis;     /* The number of SPIs for MSIs */
> +	unsigned long *bm;        /* MSI vector bitmap */
> +	struct irq_domain *domain;
> +};
> +
> +static struct irq_chip v2m_chip;
> +
> +static void gicv2m_teardown_msi_irq(struct msi_chip *chip, unsigned int irq)
> +{
> +	int pos;
> +	struct v2m_data *v2m = container_of(chip, struct v2m_data, msi_chip);
> +
> +	spin_lock(&v2m->msi_cnt_lock);
> +
> +	pos = irq - v2m->spi_start;
> +	if (pos >= 0 && pos < v2m->nr_spis)
> +		bitmap_clear(v2m->bm, pos, 1);
> +
> +	spin_unlock(&v2m->msi_cnt_lock);
> +}
> +
> +static int gicv2m_setup_msi_irq(struct msi_chip *chip, struct pci_dev *pdev,
> +				struct msi_desc *desc)
> +{
> +	int hwirq, virq, offset;
> +	struct v2m_data *v2m = container_of(chip, struct v2m_data, msi_chip);
> +
> +	if (!desc)
> +		return -EINVAL;
> +
> +	spin_lock(&v2m->msi_cnt_lock);
> +	offset = bitmap_find_free_region(v2m->bm, v2m->nr_spis, 0);
> +	spin_unlock(&v2m->msi_cnt_lock);
> +	if (offset < 0)
> +		return offset;
> +
> +	hwirq = v2m->spi_start + offset;
> +	virq = __irq_domain_alloc_irqs(v2m->domain, hwirq,
> +				       1, NUMA_NO_NODE, v2m, true);
> +	if (virq < 0) {
> +		gicv2m_teardown_msi_irq(chip, hwirq);
> +		return virq;
> +	}
> +
> +	irq_domain_set_hwirq_and_chip(v2m->domain, virq, hwirq,
> +				&v2m_chip, v2m);
> +
> +	irq_set_msi_desc(hwirq, desc);
> +	irq_set_irq_type(hwirq, IRQ_TYPE_EDGE_RISING);
> +
> +	return 0;
> +}
> +
> +static int gicv2m_domain_activate(struct irq_domain *domain,
> +				      struct irq_data *data)
> +{
> +	struct msi_msg msg;
> +	struct v2m_data *v2m;
> +	phys_addr_t addr;
> +
> +	v2m = container_of(data->chip_data, struct v2m_data, msi_chip);
> +	addr = v2m->res.start + V2M_MSI_SETSPI_NS;
> +
> +	msg.address_hi = (u32)(addr >> 32);
> +	msg.address_lo = (u32)(addr);
> +	msg.data = data->irq;
> +	write_msi_msg(data->irq, &msg);
> +
> +	return 0;
> +}
> +
> +static int gicv2m_domain_deactivate(struct irq_domain *domain,
> +				    struct irq_data *data)
> +{
> +	struct msi_msg msg;
> +
> +	memset(&msg, 0, sizeof(msg));
> +	write_msi_msg(data->irq, &msg);
> +
> +	return 0;
> +}
> +
> +static int gicv2m_domain_alloc(struct irq_domain *d, unsigned int virq,
> +			       unsigned int nr_irqs, void *arg)
> +{
> +	int i, ret, irq;
> +
> +	for (i = 0; i < nr_irqs; i++) {
> +		irq = virq + i;
> +		set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);
> +		irq_set_chip_and_handler_name(irq, &v2m_chip,
> +			handle_fasteoi_irq, v2m_chip.name);
> +	}
> +
> +	ret = irq_domain_alloc_irqs_parent(d, virq, nr_irqs, NULL);
> +	if (ret < 0)
> +		pr_err("Failed to allocate parent IRQ domain\n");
> +
> +	return ret;
> +}
> +
> +static void gicv2m_domain_free(struct irq_domain *d, unsigned int virq,
> +			       unsigned int nr_irqs)
> +{
> +	int i, irq;
> +
> +	for (i = 0; i < nr_irqs; i++) {
> +		irq = virq + i;
> +		irq_set_handler(irq, NULL);
> +		irq_domain_set_hwirq_and_chip(d, irq, 0, NULL, NULL);
> +	}
> +
> +	irq_domain_free_irqs_parent(d, virq, nr_irqs);
> +}
> +
> +static bool is_msi_spi_valid(u32 base, u32 num)
> +{
> +	if (base < V2M_MIN_SPI) {
> +		pr_err("Invalid MSI base SPI (base:%u)\n", base);
> +		return false;
> +	}
> +
> +	if ((num == 0) || (base + num > V2M_MAX_SPI)) {
> +		pr_err("Number of SPIs (%u) exceed maximum (%u)\n",
> +		       num, V2M_MAX_SPI - V2M_MIN_SPI + 1);
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
> +static void gicv2m_mask_irq(struct irq_data *d)
> +{
> +	irq_chip_mask_parent(d);
> +	if (d->msi_desc)
> +		mask_msi_irq(d);
> +}
> +
> +static void gicv2m_unmask_irq(struct irq_data *d)
> +{
> +	irq_chip_unmask_parent(d);
> +	if (d->msi_desc)
> +		unmask_msi_irq(d);
> +}
> +
> +static struct irq_chip v2m_chip = {
> +	.name             = "GICv2m",
> +	.irq_mask         = gicv2m_mask_irq,
> +	.irq_unmask       = gicv2m_unmask_irq,
> +	.irq_eoi          = irq_chip_eoi_parent,
> +	.irq_set_type     = irq_chip_set_type_parent,
> +
> +#ifdef CONFIG_SMP
> +	.irq_set_affinity = irq_chip_set_affinity_parent,
> +#endif
> +};
> +
> +static const struct irq_domain_ops gicv2m_domain_ops = {
> +	.alloc      = gicv2m_domain_alloc,
> +	.free       = gicv2m_domain_free,
> +	.activate   = gicv2m_domain_activate,
> +	.deactivate = gicv2m_domain_deactivate,
> +};
> +
> +static int __init gicv2m_init_one(struct device_node *node,
> +				  struct v2m_data **v,
> +				  struct irq_domain *parent)
> +{
> +	int ret;
> +	struct v2m_data *v2m;
> +
> +	*v = kzalloc(sizeof(struct v2m_data), GFP_KERNEL);
> +	if (!*v) {
> +		pr_err("Failed to allocate struct v2m_data.\n");
> +		return -ENOMEM;
> +	}
> +
> +	v2m = *v;
> +	v2m->msi_chip.owner = THIS_MODULE;
> +	v2m->msi_chip.of_node = node;
> +	v2m->msi_chip.setup_irq = gicv2m_setup_msi_irq;
> +	v2m->msi_chip.teardown_irq = gicv2m_teardown_msi_irq;
> +	ret = of_address_to_resource(node, 0, &v2m->res);
> +	if (ret) {
> +		pr_err("Failed to allocate v2m resource.\n");
> +		goto err_free_v2m;
> +	}
> +
> +	v2m->base = ioremap(v2m->res.start, resource_size(&v2m->res));
> +	if (!v2m->base) {
> +		pr_err("Failed to map GICv2m resource\n");
> +		ret = -EINVAL;
> +		goto err_free_v2m;
> +	}
> +
> +	ret = of_pci_msi_chip_add(&v2m->msi_chip);
> +	if (ret) {
> +		pr_info("Failed to add msi_chip.\n");
> +		goto err_iounmap;
> +	}
> +
> +	if (!of_property_read_u32(node, "arm,msi-base-spi", &v2m->spi_start) &&
> +	    !of_property_read_u32(node, "arm,msi-num-spis", &v2m->nr_spis)) {
> +		pr_info("Overriding V2M MSI_TYPER (base:%u, num:%u)\n",
> +			v2m->spi_start, v2m->nr_spis);
> +	} else {
> +		u32 typer = readl_relaxed(v2m->base + V2M_MSI_TYPER);
> +
> +		v2m->spi_start = V2M_MSI_TYPER_BASE_SPI(typer);
> +		v2m->nr_spis = V2M_MSI_TYPER_NUM_SPI(typer);
> +	}
> +
> +	if (!is_msi_spi_valid(v2m->spi_start, v2m->nr_spis)) {
> +		ret = -EINVAL;
> +		goto err_chip_rm;
> +	}
> +
> +	v2m->bm = kzalloc(sizeof(long) * BITS_TO_LONGS(v2m->nr_spis),
> +			  GFP_KERNEL);
> +	if (!v2m->bm) {
> +		ret = -ENOMEM;
> +		goto err_chip_rm;
> +	}
> +
> +	v2m->domain = irq_domain_add_simple(node, v2m->nr_spis, v2m->spi_start,
> +					    &gicv2m_domain_ops, v2m);
> +	if (!v2m->domain) {
> +		pr_err("Failed to create GICv2m domain\n");
> +		ret = -EINVAL;
> +		goto err_free_bm;
> +	}
> +
> +	v2m->domain->parent = parent;
> +
> +	spin_lock_init(&v2m->msi_cnt_lock);
> +
> +	pr_info("Node %s: range[%#lx:%#lx], SPI[%d:%d]\n", node->name,
> +		(unsigned long)v2m->res.start, (unsigned long)v2m->res.end,
> +		v2m->spi_start, (v2m->spi_start + v2m->nr_spis));
> +
> +	return 0;
> +
> +err_free_bm:
> +	kfree(v2m->bm);
> +err_chip_rm:
> +	of_pci_msi_chip_remove(&v2m->msi_chip);
> +err_iounmap:
> +	iounmap(v2m->base);
> +err_free_v2m:
> +	kfree(v2m);
> +	return ret;
> +}
> +
> +int __init gicv2m_of_init(struct device_node *node,
> +			  struct irq_domain *parent)
> +{
> +	int ret = 0;
> +	struct v2m_data *v2m;
> +	struct device_node *child = NULL;
> +
> +	for (;;) {
> +		child = of_get_next_child(node, child);
> +		if (!child)
> +			break;
> +
> +		if (!of_device_is_compatible(child, "arm,gic-v2m-frame"))
> +			continue;
> +
> +		if (!of_find_property(child, "msi-controller", NULL))
> +			continue;
> +
> +		ret = gicv2m_init_one(child, &v2m, parent);
> +		if (ret) {
> +			of_node_put(node);
> +			break;
> +		}
> +	}
> +	return ret;
> +}
> diff --git a/drivers/irqchip/irq-gic-v2m.h b/drivers/irqchip/irq-gic-v2m.h
> new file mode 100644
> index 0000000..66676a9
> --- /dev/null
> +++ b/drivers/irqchip/irq-gic-v2m.h
> @@ -0,0 +1,6 @@
> +#ifndef _IRQ_GIC_V2M_H_
> +#define _IRQ_GIC_V2M_H_
> +
> +int gicv2m_of_init(struct device_node *node, struct irq_domain *parent) __init;
> +
> +#endif /* _IRQ_GIC_V2M_H_ */
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index a99c211..166bc8a 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -46,6 +46,7 @@
>  #include <asm/smp_plat.h>
>  
>  #include "irq-gic-common.h"
> +#include "irq-gic-v2m.h"
>  #include "irqchip.h"
>  
>  union gic_base {
> @@ -843,10 +844,20 @@ static int gic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
>  	unsigned int type = IRQ_TYPE_NONE;
>  	struct of_phandle_args *irq_data = arg;
>  
> -	ret = gic_irq_domain_xlate(domain, irq_data->np, irq_data->args,
> -				   irq_data->args_count, &hwirq, &type);
> -	if (ret)
> -		return ret;
> +	if (irq_data) {
> +		ret = gic_irq_domain_xlate(domain, irq_data->np, irq_data->args,
> +					   irq_data->args_count, &hwirq, &type);
> +		if (ret)
> +			return ret;
> +	} else {
> +		/*
> +		 * When calling from the children domain (e.g. GICv2m),
> +		 * the child domain does not always have reference to
> +		 * the of_phandle_arg.  In this case, GIC domain assumes
> +		 * direct mapping between virq and hwirq.
> +		 */
> +		hwirq = virq;
> +	}
>  
>  	for (i = 0; i < nr_irqs; i++)
>  		gic_irq_domain_map(domain, virq+i, hwirq+i);
> @@ -1055,6 +1066,10 @@ gic_of_init(struct device_node *node, struct device_node *parent)
>  		irq = irq_of_parse_and_map(node, 0);
>  		gic_cascade_irq(gic_cnt, irq);
>  	}
> +
> +	if (IS_ENABLED(CONFIG_ARM_GIC_V2M))
> +		gicv2m_of_init(node, gic_data[gic_cnt].domain);
> +
>  	gic_cnt++;
>  	return 0;
>  }
>
Suravee Suthikulpanit Nov. 4, 2014, 2:20 p.m. UTC | #5
On 11/4/14 04:06, Thomas Gleixner wrote:
> On Mon, 3 Nov 2014, Suravee Suthikulanit wrote:
>> On 11/3/2014 4:51 PM, Thomas Gleixner wrote:
>>> On Mon, 3 Nov 2014, suravee.suthikulpanit@amd.com wrote:
>>>> +	irq_domain_set_hwirq_and_chip(v2m->domain, virq, hwirq,
>>>> +				&v2m_chip, v2m);
>>>> +
>>>> +	irq_set_msi_desc(hwirq, desc);
>>>> +	irq_set_irq_type(hwirq, IRQ_TYPE_EDGE_RISING);
>>>
>>> Sure both calls work perfectly fine as long as virq == hwirq, right?
>>
>> I was running into an issue when calling the irq_domain_alloc_irq_parent(), it
>> requires of_phandle_args pointer to be passed in. However, this does not work
>> for GICv2m since it does not have interrupt information in the device tree.
>> So, I decided at first to use direct (virq == hwirq) mapping, which simplifies
>> the code a bit, but might not be ideal solution, as you pointed out.
>
> It's not only far from ideal. It's not a solution at all. Simply
> because there is no guarantee for virq == hwirq.
>
>> An alternative would be to create a temporary struct of_phandle_args, and
>> populate it with the interrupt information for the requested MSI. Then pass it
>> to:
>>    --> irq_domain_alloc_irq_parent
>>     |--> gic_irq_domain_alloc
>>       |--> gic_irq_domain_xlate
>>       |--> gic_irq_domain_map
>>
>> However, this would still not be ideal if we want to support ACPI. Another
>
> Neither device tree nor ACPI has anything to do with MSI interrupts at
> runtime.
>
> All they do is to tell that there is a MSI controller and where the
> registers are and in the worst case fixups for a borked MSI_TYPER
> register.
>
> So either the TYPER reg or DT/ACPI gives you a fixed hwirq range which
> is reserved for MSI. And that's all you need, right?
>
Right, I get that part. Figuring out the fixed hwirq range for MSI is 
not the point I am trying to make here.

> [...]
> All you need is to pick one hwirq out of the existing fixed range and
> associate it to a newly allocated virq. That's the only information
> the underlying gic domain has to know about, because it needs to
> translate from the hwirq to the virq in the low level entry handler
> gic_handle_irq().

And that's what I am trying to do here except that GIC is expecting that 
information to be passed to it via irq_domain_alloc_irqs(..., args) 
where args is struct of_phandle_args (e.g. in the kernel/irqdomain.c: 
irq_create_of_mapping). This works fine when specifying interrupt from 
DT, but that is not always the case.

Currently, I can just create a fake of_phandle_args just to pass the 
hwirq information to GIC.

     --> gicv2m_setup_msi_irq()
      |    struct of_phandle_args phan;
      |    phan.np = NULL;
      |    phan.args_count = 3;
      |    phan.args[0] = 0;
      |    phan.args[1] = hwirq - 32;
      |    phan.args[2] = IRQ_TYPE_EDGE_RISING;
      |--> irq_domain_alloc_irqs(d, 1, NUMA_NO_NODE, &phan);
       |--> gicv2m_domain_alloc(d, virq, nr_irqs, arg)
	|--> irq_domain_alloc_irqs_parent(d, virq, nr_irqs, arg);

I am trying to figure out what would be a common data structure for this 
purpose that would work for both Dt and non-DT case (e.g. GICv2m MSI). 
Unless you think this is ok.

Thanks,
Suravee
Thomas Gleixner Nov. 4, 2014, 2:28 p.m. UTC | #6
On Tue, 4 Nov 2014, Suravee Suthikulpanit wrote:
> And that's what I am trying to do here except that GIC is expecting that
> information to be passed to it via irq_domain_alloc_irqs(..., args) where args
> is struct of_phandle_args (e.g. in the kernel/irqdomain.c:
> irq_create_of_mapping). This works fine when specifying interrupt from DT, but
> that is not always the case.
> 
> Currently, I can just create a fake of_phandle_args just to pass the hwirq
> information to GIC.
> 
>     --> gicv2m_setup_msi_irq()
>      |    struct of_phandle_args phan;
>      |    phan.np = NULL;
>      |    phan.args_count = 3;
>      |    phan.args[0] = 0;
>      |    phan.args[1] = hwirq - 32;
>      |    phan.args[2] = IRQ_TYPE_EDGE_RISING;
>      |--> irq_domain_alloc_irqs(d, 1, NUMA_NO_NODE, &phan);
>       |--> gicv2m_domain_alloc(d, virq, nr_irqs, arg)
> 	|--> irq_domain_alloc_irqs_parent(d, virq, nr_irqs, arg);
> 
> I am trying to figure out what would be a common data structure for this
> purpose that would work for both Dt and non-DT case (e.g. GICv2m MSI). Unless
> you think this is ok.

You need to sort that out with Marc. It needs to be done in a way
which is usable for the other potential use cases of stacked domains
on top of GIC.

Thanks,

	tglx
Suravee Suthikulpanit Nov. 4, 2014, 5 p.m. UTC | #7
On 11/4/14 07:01, Jiang Liu wrote:
> Hi Suravee,
> 	You may build a two level hierarchy irqdomains. Use the
> utilities in this thread
> http://www.spinics.net/lists/arm-kernel/msg374722.html to build an MSI
> irqdomain to manage MSI controllers
> in PCI devices. And build another irqdomain to manage SPI allocation
> in GICv2.
> 	That is: MSI irqdomain (program MSI registers)  -->
> GIV irqdomain (manage SPIs in GICv2 controller)

That's great. I'll look at this patch in and make use of it to create to 
MSI domain.

Thanks,

Suravee

> Regards!
> Gerry
Marc Zyngier Nov. 4, 2014, 5:46 p.m. UTC | #8
On 04/11/14 14:20, Suravee Suthikulpanit wrote:
> 
> 
> On 11/4/14 04:06, Thomas Gleixner wrote:
>> On Mon, 3 Nov 2014, Suravee Suthikulanit wrote:
>>> On 11/3/2014 4:51 PM, Thomas Gleixner wrote:
>>>> On Mon, 3 Nov 2014, suravee.suthikulpanit@amd.com wrote:
>>>>> +	irq_domain_set_hwirq_and_chip(v2m->domain, virq, hwirq,
>>>>> +				&v2m_chip, v2m);
>>>>> +
>>>>> +	irq_set_msi_desc(hwirq, desc);
>>>>> +	irq_set_irq_type(hwirq, IRQ_TYPE_EDGE_RISING);
>>>>
>>>> Sure both calls work perfectly fine as long as virq == hwirq, right?
>>>
>>> I was running into an issue when calling the irq_domain_alloc_irq_parent(), it
>>> requires of_phandle_args pointer to be passed in. However, this does not work
>>> for GICv2m since it does not have interrupt information in the device tree.
>>> So, I decided at first to use direct (virq == hwirq) mapping, which simplifies
>>> the code a bit, but might not be ideal solution, as you pointed out.
>>
>> It's not only far from ideal. It's not a solution at all. Simply
>> because there is no guarantee for virq == hwirq.
>>
>>> An alternative would be to create a temporary struct of_phandle_args, and
>>> populate it with the interrupt information for the requested MSI. Then pass it
>>> to:
>>>    --> irq_domain_alloc_irq_parent
>>>     |--> gic_irq_domain_alloc
>>>       |--> gic_irq_domain_xlate
>>>       |--> gic_irq_domain_map
>>>
>>> However, this would still not be ideal if we want to support ACPI. Another
>>
>> Neither device tree nor ACPI has anything to do with MSI interrupts at
>> runtime.
>>
>> All they do is to tell that there is a MSI controller and where the
>> registers are and in the worst case fixups for a borked MSI_TYPER
>> register.
>>
>> So either the TYPER reg or DT/ACPI gives you a fixed hwirq range which
>> is reserved for MSI. And that's all you need, right?
>>
> Right, I get that part. Figuring out the fixed hwirq range for MSI is 
> not the point I am trying to make here.
> 
>> [...]
>> All you need is to pick one hwirq out of the existing fixed range and
>> associate it to a newly allocated virq. That's the only information
>> the underlying gic domain has to know about, because it needs to
>> translate from the hwirq to the virq in the low level entry handler
>> gic_handle_irq().
> 
> And that's what I am trying to do here except that GIC is expecting that 
> information to be passed to it via irq_domain_alloc_irqs(..., args) 
> where args is struct of_phandle_args (e.g. in the kernel/irqdomain.c: 
> irq_create_of_mapping). This works fine when specifying interrupt from 
> DT, but that is not always the case.
> 
> Currently, I can just create a fake of_phandle_args just to pass the 
> hwirq information to GIC.
> 
>      --> gicv2m_setup_msi_irq()
>       |    struct of_phandle_args phan;
>       |    phan.np = NULL;
>       |    phan.args_count = 3;
>       |    phan.args[0] = 0;
>       |    phan.args[1] = hwirq - 32;
>       |    phan.args[2] = IRQ_TYPE_EDGE_RISING;
>       |--> irq_domain_alloc_irqs(d, 1, NUMA_NO_NODE, &phan);
>        |--> gicv2m_domain_alloc(d, virq, nr_irqs, arg)
> 	|--> irq_domain_alloc_irqs_parent(d, virq, nr_irqs, arg);
> 
> I am trying to figure out what would be a common data structure for this 
> purpose that would work for both Dt and non-DT case (e.g. GICv2m MSI). 
> Unless you think this is ok.

I think of_phandle_args is the only thing we can use, whether this is DT
or this other non-DT thing. This is how we represent a GIC HW interrupt
outside of the GIC itself.

Of course, this creates a dependency between the different domains (a
child domain *must* know how the parent domain represents its
interrupt). Things like the v2m widget are completely fused with the GIC
anyway, so it makes sense.

Things like generic secondary irqchips don't fit that model, but they
can continue to use the existing framework.

Thanks,

	M.
Suravee Suthikulpanit Nov. 6, 2014, 12:05 a.m. UTC | #9
On 11/4/2014 7:01 AM, Jiang Liu wrote:
> Hi Suravee,
> 	You may build a two level hierarchy irqdomains. Use the
> utilities in this thread
> http://www.spinics.net/lists/arm-kernel/msg374722.html  to build an MSI
> irqdomain to manage MSI controllers
> in PCI devices. And build another irqdomain to manage SPI allocation
> in GICv2.
> 	That is: MSI irqdomain (program MSI registers)  -->
> GIV irqdomain (manage SPIs in GICv2 controller)
>
> Regards!
> Gerry

Gerry,

I try out your patch from the link above, and I have a couple 
questions/issues.

1. In the drivers/pci/msi.c: msi_irq_domain_alloc_irqs(), it seems that 
the hwirq comes from msi_get_hwirq(dev, msidesc). In GICv2m, hwirq for 
MSI is fixed over a specific range. This might require arch-specific
callback.

2. In msi_domain_activate, why "if (!irq_data->chip_data)"?

3. In, msi_domain_alloc():

- There should be a way to specify other types of irq handler besides 
the "handle_edge_irq". In case of GIC, it needs handle_fasteoi_irq.

- When calling irq_domain_set_hwirq_and_chip(), you are passing "(void 
*)(long)i" for the "void *chip_data" parameter. What is this used for, 
and where?  Shouldn't this be pointing to arch-specific data structure?

- The code is calling irq_domain_alloc_irqs_parent before the loop, 
which calls irq_domain_set_hwirq_and_chip() and __irq_set_handler. 
Shouldn't the order be switched?

- Overall, it seems that msi_domain_alloc() could be quite different 
across architectures. Would it be possible to declare this function as 
weak, and allow arch to override (similar to arch_setup_msi_irq)?

Thanks,

Suravee
Suravee Suthikulpanit Nov. 6, 2014, 12:23 a.m. UTC | #10
On 11/5/2014 6:05 PM, Suravee Suthikulanit wrote:
> - Overall, it seems that msi_domain_alloc() could be quite different
> across architectures. Would it be possible to declare this function as
> weak, and allow arch to override (similar to arch_setup_msi_irq)?

Actually, declaring "msi_domain_ops" as non-static, and allow other code 
to override the .alloc and .free?

Thanks,

Suravee
Thomas Gleixner Nov. 6, 2014, 12:49 a.m. UTC | #11
On Wed, 5 Nov 2014, Suravee Suthikulanit wrote:
> On 11/5/2014 6:05 PM, Suravee Suthikulanit wrote:
> > - Overall, it seems that msi_domain_alloc() could be quite different
> > across architectures. Would it be possible to declare this function as
> > weak, and allow arch to override (similar to arch_setup_msi_irq)?
> 
> Actually, declaring "msi_domain_ops" as non-static, and allow other code to
> override the .alloc and .free?

Why do you want to do that?

Thanks,

	tglx
Thomas Gleixner Nov. 6, 2014, 10:42 a.m. UTC | #12
On Thu, 6 Nov 2014, Thomas Gleixner wrote:
> On Wed, 5 Nov 2014, Suravee Suthikulanit wrote:
> > On 11/5/2014 6:05 PM, Suravee Suthikulanit wrote:
> > > - Overall, it seems that msi_domain_alloc() could be quite different
> > > across architectures. Would it be possible to declare this function as
> > > weak, and allow arch to override (similar to arch_setup_msi_irq)?
> > 
> > Actually, declaring "msi_domain_ops" as non-static, and allow other code to
> > override the .alloc and .free?
> 
> Why do you want to do that?

I know why. Because you want to spare a level of hierarchy. But thats
wrong simply because MSI itself is an interrupt chip at the device
level.

[ MSI ] ---> [ GIC-MSI ] ---> [ GIC ]

So the MSI level only cares about the allocation of the virq
space. GIC-MSI allocates out of the bitmap which handles the hard
wired range of MSI capable GIC interrupts and GIC handles the
underlying functionality.

And this makes a lot of sense, if you think about interrupt
remapping. If ARM ever grows that you simply insert it into the chain:

[ MSI ] ---> [ Remap] ---> [ GIC-MSI ] ---> [ GIC ]

If you look at Jiangs x86 implementation it does exactly that.

[ MSI ] ---> [ Vector ]

[ MSI ] ---> [ Remap ] ---> [ Vector ]

And because ARM has this intermediate layer of GIC-MSI you need to
represent it in the hierarchy whether you like it or not. If you'd try
to bolt the GIC-MSI magic into the MSI layer itself, then interrupt
remapping would never work.

Thanks,

	tglx
Marc Zyngier Nov. 6, 2014, 4:34 p.m. UTC | #13
Hi Thomas,

On 06/11/14 10:42, Thomas Gleixner wrote:
> On Thu, 6 Nov 2014, Thomas Gleixner wrote:
>> On Wed, 5 Nov 2014, Suravee Suthikulanit wrote:
>>> On 11/5/2014 6:05 PM, Suravee Suthikulanit wrote:
>>>> - Overall, it seems that msi_domain_alloc() could be quite different
>>>> across architectures. Would it be possible to declare this function as
>>>> weak, and allow arch to override (similar to arch_setup_msi_irq)?
>>>
>>> Actually, declaring "msi_domain_ops" as non-static, and allow other code to
>>> override the .alloc and .free?
>>
>> Why do you want to do that?
> 
> I know why. Because you want to spare a level of hierarchy. But thats
> wrong simply because MSI itself is an interrupt chip at the device
> level.
> 
> [ MSI ] ---> [ GIC-MSI ] ---> [ GIC ]
> 
> So the MSI level only cares about the allocation of the virq
> space. GIC-MSI allocates out of the bitmap which handles the hard
> wired range of MSI capable GIC interrupts and GIC handles the
> underlying functionality.
> 
> And this makes a lot of sense, if you think about interrupt
> remapping. If ARM ever grows that you simply insert it into the chain:
> 
> [ MSI ] ---> [ Remap] ---> [ GIC-MSI ] ---> [ GIC ]

I think ARM has reached that stage with the ITS block in GICv3:
- Each device gets programmed with a set of  "event IDs" ranging from 0
to N-1, with N being the number of MSI vectors used by the device
- the ITS uses both the device ID (basically the PCI requester ID) and
the event ID to parse a set of software-managed tables (think page
tables for interrupts).

The x86 remapping thing looks quite similar to that, by reading a couple
of pages from the VT-D document.

So the way I understand the layout (and please correct me if I'm wrong,
which is certainly the case) is that the MSI domain is entirely generic,
allocates the virq, uses Remap as a parent, and uses
irq_chip_compose_msi_msg to call into the parent and generate whatever
goes into the MSI message.

I'm still struggling a bit to see how the remapping layer can access the
requester ID. x86 uses the irq_alloc_info to store that (the result of
an msi_get_hwirq call), but we don't have an equivalent structure on
arm/arm64.

I'll try to hack something with my current ITS driver and come back with
the results.

Thanks,

	M.
Jiang Liu Nov. 7, 2014, 1 a.m. UTC | #14
On 2014/11/7 0:34, Marc Zyngier wrote:
> Hi Thomas,
> 
> On 06/11/14 10:42, Thomas Gleixner wrote:
>> On Thu, 6 Nov 2014, Thomas Gleixner wrote:
>>> On Wed, 5 Nov 2014, Suravee Suthikulanit wrote:
>>>> On 11/5/2014 6:05 PM, Suravee Suthikulanit wrote:
>>>>> - Overall, it seems that msi_domain_alloc() could be quite different
>>>>> across architectures. Would it be possible to declare this function as
>>>>> weak, and allow arch to override (similar to arch_setup_msi_irq)?
>>>>
>>>> Actually, declaring "msi_domain_ops" as non-static, and allow other code to
>>>> override the .alloc and .free?
>>>
>>> Why do you want to do that?
>>
>> I know why. Because you want to spare a level of hierarchy. But thats
>> wrong simply because MSI itself is an interrupt chip at the device
>> level.
>>
>> [ MSI ] ---> [ GIC-MSI ] ---> [ GIC ]
>>
>> So the MSI level only cares about the allocation of the virq
>> space. GIC-MSI allocates out of the bitmap which handles the hard
>> wired range of MSI capable GIC interrupts and GIC handles the
>> underlying functionality.
>>
>> And this makes a lot of sense, if you think about interrupt
>> remapping. If ARM ever grows that you simply insert it into the chain:
>>
>> [ MSI ] ---> [ Remap] ---> [ GIC-MSI ] ---> [ GIC ]
> 
> I think ARM has reached that stage with the ITS block in GICv3:
> - Each device gets programmed with a set of  "event IDs" ranging from 0
> to N-1, with N being the number of MSI vectors used by the device
> - the ITS uses both the device ID (basically the PCI requester ID) and
> the event ID to parse a set of software-managed tables (think page
> tables for interrupts).
> 
> The x86 remapping thing looks quite similar to that, by reading a couple
> of pages from the VT-D document.
> 
> So the way I understand the layout (and please correct me if I'm wrong,
> which is certainly the case) is that the MSI domain is entirely generic,
> allocates the virq, uses Remap as a parent, and uses
> irq_chip_compose_msi_msg to call into the parent and generate whatever
> goes into the MSI message.
Hi Marc,
	It works exactly in this way:)

> 
> I'm still struggling a bit to see how the remapping layer can access the
> requester ID. x86 uses the irq_alloc_info to store that (the result of
> an msi_get_hwirq call), but we don't have an equivalent structure on
> arm/arm64.
irq_alloc_info is newly introduced for hierarchy irqdomain on x86.
Regards!
Gerry

> 
> I'll try to hack something with my current ITS driver and come back with
> the results.
> 
> Thanks,
> 
> 	M.
>
diff mbox

Patch

diff --git a/Documentation/devicetree/bindings/arm/gic.txt b/Documentation/devicetree/bindings/arm/gic.txt
index c7d2fa1..ebf976a 100644
--- a/Documentation/devicetree/bindings/arm/gic.txt
+++ b/Documentation/devicetree/bindings/arm/gic.txt
@@ -96,3 +96,56 @@  Example:
 		      <0x2c006000 0x2000>;
 		interrupts = <1 9 0xf04>;
 	};
+
+
+* GICv2m extension for MSI/MSI-x support (Optional)
+
+Certain revisions of GIC-400 supports MSI/MSI-x via V2M register frame(s).
+This is enabled by specifying v2m sub-node(s).
+
+Required properties:
+
+- compatible        : The value here should contain "arm,gic-v2m-frame".
+
+- msi-controller    : Identifies the node as an MSI controller.
+
+- reg               : GICv2m MSI interface register base and size
+
+Optional properties:
+
+- arm,msi-base-spi  : When the MSI_TYPER register contains an incorrect
+                      value, this property should contain the SPI base of
+                      the MSI frame, overriding the HW value.
+
+- arm,msi-num-spis  : When the MSI_TYPER register contains an incorrect
+                      value, this property should contain the number of
+                      SPIs assigned to the frame, overriding the HW value.
+
+Example:
+
+	interrupt-controller@e1101000 {
+		compatible = "arm,gic-400";
+		#interrupt-cells = <3>;
+		#address-cells = <2>;
+		#size-cells = <2>;
+		interrupt-controller;
+		interrupts = <1 8 0xf04>;
+		ranges = <0 0 0 0xe1100000 0 0x100000>;
+		reg = <0x0 0xe1110000 0 0x01000>,
+		      <0x0 0xe112f000 0 0x02000>,
+		      <0x0 0xe1140000 0 0x10000>,
+		      <0x0 0xe1160000 0 0x10000>;
+		v2m0: v2m@0x8000 {
+			compatible = "arm,gic-v2m-frame";
+			msi-controller;
+			reg = <0x0 0x80000 0 0x1000>;
+		};
+
+		....
+
+		v2mN: v2m@0x9000 {
+			compatible = "arm,gic-v2m-frame";
+			msi-controller;
+			reg = <0x0 0x90000 0 0x1000>;
+		};
+	};
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index cde2f72..cbcde2d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -12,6 +12,7 @@  config ARM64
 	select ARM_ARCH_TIMER
 	select ARM_GIC
 	select AUDIT_ARCH_COMPAT_GENERIC
+	select ARM_GIC_V2M
 	select ARM_GIC_V3
 	select BUILDTIME_EXTABLE_SORT
 	select CLONE_BACKWARDS
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index 2a48e0a..39ce065 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -8,6 +8,11 @@  config ARM_GIC
 	select IRQ_DOMAIN_HIERARCHY
 	select MULTI_IRQ_HANDLER
 
+config ARM_GIC_V2M
+	bool
+	depends on ARM_GIC
+	depends on PCI && PCI_MSI
+
 config GIC_NON_BANKED
 	bool
 
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index 73052ba..3bda951 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -17,6 +17,7 @@  obj-$(CONFIG_ARCH_SUNXI)		+= irq-sun4i.o
 obj-$(CONFIG_ARCH_SUNXI)		+= irq-sunxi-nmi.o
 obj-$(CONFIG_ARCH_SPEAR3XX)		+= spear-shirq.o
 obj-$(CONFIG_ARM_GIC)			+= irq-gic.o irq-gic-common.o
+obj-$(CONFIG_ARM_GIC_V2M)		+= irq-gic-v2m.o
 obj-$(CONFIG_ARM_GIC_V3)		+= irq-gic-v3.o irq-gic-common.o
 obj-$(CONFIG_ARM_NVIC)			+= irq-nvic.o
 obj-$(CONFIG_ARM_VIC)			+= irq-vic.o
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
new file mode 100644
index 0000000..fd8d51a
--- /dev/null
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -0,0 +1,340 @@ 
+/*
+ * ARM GIC v2m MSI(-X) support
+ * Support for Message Signaled Interrupts for systems that
+ * implement ARM Generic Interrupt Controller: GICv2m.
+ *
+ * Copyright (C) 2014 Advanced Micro Devices, Inc.
+ * Authors: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
+ *          Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
+ *          Brandon Anderson <brandon.anderson@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#define pr_fmt(fmt) "GICv2m: " fmt
+
+#include <linux/bitmap.h>
+#include <linux/irq.h>
+#include <linux/irqdomain.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_pci.h>
+#include <linux/pci.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include <asm/hardirq.h>
+#include <asm/irq.h>
+
+#include "irqchip.h"
+#include "irq-gic-v2m.h"
+
+/*
+* MSI_TYPER:
+*     [31:26] Reserved
+*     [25:16] lowest SPI assigned to MSI
+*     [15:10] Reserved
+*     [9:0]   Numer of SPIs assigned to MSI
+*/
+#define V2M_MSI_TYPER			0x008
+#define V2M_MSI_TYPER_BASE_SHIFT	16
+#define V2M_MSI_TYPER_BASE_MASK		0x3FF
+#define V2M_MSI_TYPER_NUM_MASK		0x3FF
+#define V2M_MSI_SETSPI_NS		0x040
+#define V2M_MIN_SPI			32
+#define V2M_MAX_SPI			1019
+
+#define V2M_MSI_TYPER_BASE_SPI(x)	\
+		(((x) >> V2M_MSI_TYPER_BASE_SHIFT) & V2M_MSI_TYPER_BASE_MASK)
+
+#define V2M_MSI_TYPER_NUM_SPI(x)	((x) & V2M_MSI_TYPER_NUM_MASK)
+
+struct v2m_data {
+	spinlock_t msi_cnt_lock;
+	struct msi_chip msi_chip;
+	struct resource res;      /* GICv2m resource */
+	void __iomem *base;       /* GICv2m virt address */
+	unsigned int spi_start;   /* The SPI number that MSIs start */
+	unsigned int nr_spis;     /* The number of SPIs for MSIs */
+	unsigned long *bm;        /* MSI vector bitmap */
+	struct irq_domain *domain;
+};
+
+static struct irq_chip v2m_chip;
+
+static void gicv2m_teardown_msi_irq(struct msi_chip *chip, unsigned int irq)
+{
+	int pos;
+	struct v2m_data *v2m = container_of(chip, struct v2m_data, msi_chip);
+
+	spin_lock(&v2m->msi_cnt_lock);
+
+	pos = irq - v2m->spi_start;
+	if (pos >= 0 && pos < v2m->nr_spis)
+		bitmap_clear(v2m->bm, pos, 1);
+
+	spin_unlock(&v2m->msi_cnt_lock);
+}
+
+static int gicv2m_setup_msi_irq(struct msi_chip *chip, struct pci_dev *pdev,
+				struct msi_desc *desc)
+{
+	int hwirq, virq, offset;
+	struct v2m_data *v2m = container_of(chip, struct v2m_data, msi_chip);
+
+	if (!desc)
+		return -EINVAL;
+
+	spin_lock(&v2m->msi_cnt_lock);
+	offset = bitmap_find_free_region(v2m->bm, v2m->nr_spis, 0);
+	spin_unlock(&v2m->msi_cnt_lock);
+	if (offset < 0)
+		return offset;
+
+	hwirq = v2m->spi_start + offset;
+	virq = __irq_domain_alloc_irqs(v2m->domain, hwirq,
+				       1, NUMA_NO_NODE, v2m, true);
+	if (virq < 0) {
+		gicv2m_teardown_msi_irq(chip, hwirq);
+		return virq;
+	}
+
+	irq_domain_set_hwirq_and_chip(v2m->domain, virq, hwirq,
+				&v2m_chip, v2m);
+
+	irq_set_msi_desc(hwirq, desc);
+	irq_set_irq_type(hwirq, IRQ_TYPE_EDGE_RISING);
+
+	return 0;
+}
+
+static int gicv2m_domain_activate(struct irq_domain *domain,
+				      struct irq_data *data)
+{
+	struct msi_msg msg;
+	struct v2m_data *v2m;
+	phys_addr_t addr;
+
+	v2m = container_of(data->chip_data, struct v2m_data, msi_chip);
+	addr = v2m->res.start + V2M_MSI_SETSPI_NS;
+
+	msg.address_hi = (u32)(addr >> 32);
+	msg.address_lo = (u32)(addr);
+	msg.data = data->irq;
+	write_msi_msg(data->irq, &msg);
+
+	return 0;
+}
+
+static int gicv2m_domain_deactivate(struct irq_domain *domain,
+				    struct irq_data *data)
+{
+	struct msi_msg msg;
+
+	memset(&msg, 0, sizeof(msg));
+	write_msi_msg(data->irq, &msg);
+
+	return 0;
+}
+
+static int gicv2m_domain_alloc(struct irq_domain *d, unsigned int virq,
+			       unsigned int nr_irqs, void *arg)
+{
+	int i, ret, irq;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq = virq + i;
+		set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);
+		irq_set_chip_and_handler_name(irq, &v2m_chip,
+			handle_fasteoi_irq, v2m_chip.name);
+	}
+
+	ret = irq_domain_alloc_irqs_parent(d, virq, nr_irqs, NULL);
+	if (ret < 0)
+		pr_err("Failed to allocate parent IRQ domain\n");
+
+	return ret;
+}
+
+static void gicv2m_domain_free(struct irq_domain *d, unsigned int virq,
+			       unsigned int nr_irqs)
+{
+	int i, irq;
+
+	for (i = 0; i < nr_irqs; i++) {
+		irq = virq + i;
+		irq_set_handler(irq, NULL);
+		irq_domain_set_hwirq_and_chip(d, irq, 0, NULL, NULL);
+	}
+
+	irq_domain_free_irqs_parent(d, virq, nr_irqs);
+}
+
+static bool is_msi_spi_valid(u32 base, u32 num)
+{
+	if (base < V2M_MIN_SPI) {
+		pr_err("Invalid MSI base SPI (base:%u)\n", base);
+		return false;
+	}
+
+	if ((num == 0) || (base + num > V2M_MAX_SPI)) {
+		pr_err("Number of SPIs (%u) exceed maximum (%u)\n",
+		       num, V2M_MAX_SPI - V2M_MIN_SPI + 1);
+		return false;
+	}
+
+	return true;
+}
+
+static void gicv2m_mask_irq(struct irq_data *d)
+{
+	irq_chip_mask_parent(d);
+	if (d->msi_desc)
+		mask_msi_irq(d);
+}
+
+static void gicv2m_unmask_irq(struct irq_data *d)
+{
+	irq_chip_unmask_parent(d);
+	if (d->msi_desc)
+		unmask_msi_irq(d);
+}
+
+static struct irq_chip v2m_chip = {
+	.name             = "GICv2m",
+	.irq_mask         = gicv2m_mask_irq,
+	.irq_unmask       = gicv2m_unmask_irq,
+	.irq_eoi          = irq_chip_eoi_parent,
+	.irq_set_type     = irq_chip_set_type_parent,
+
+#ifdef CONFIG_SMP
+	.irq_set_affinity = irq_chip_set_affinity_parent,
+#endif
+};
+
+static const struct irq_domain_ops gicv2m_domain_ops = {
+	.alloc      = gicv2m_domain_alloc,
+	.free       = gicv2m_domain_free,
+	.activate   = gicv2m_domain_activate,
+	.deactivate = gicv2m_domain_deactivate,
+};
+
+static int __init gicv2m_init_one(struct device_node *node,
+				  struct v2m_data **v,
+				  struct irq_domain *parent)
+{
+	int ret;
+	struct v2m_data *v2m;
+
+	*v = kzalloc(sizeof(struct v2m_data), GFP_KERNEL);
+	if (!*v) {
+		pr_err("Failed to allocate struct v2m_data.\n");
+		return -ENOMEM;
+	}
+
+	v2m = *v;
+	v2m->msi_chip.owner = THIS_MODULE;
+	v2m->msi_chip.of_node = node;
+	v2m->msi_chip.setup_irq = gicv2m_setup_msi_irq;
+	v2m->msi_chip.teardown_irq = gicv2m_teardown_msi_irq;
+	ret = of_address_to_resource(node, 0, &v2m->res);
+	if (ret) {
+		pr_err("Failed to allocate v2m resource.\n");
+		goto err_free_v2m;
+	}
+
+	v2m->base = ioremap(v2m->res.start, resource_size(&v2m->res));
+	if (!v2m->base) {
+		pr_err("Failed to map GICv2m resource\n");
+		ret = -EINVAL;
+		goto err_free_v2m;
+	}
+
+	ret = of_pci_msi_chip_add(&v2m->msi_chip);
+	if (ret) {
+		pr_info("Failed to add msi_chip.\n");
+		goto err_iounmap;
+	}
+
+	if (!of_property_read_u32(node, "arm,msi-base-spi", &v2m->spi_start) &&
+	    !of_property_read_u32(node, "arm,msi-num-spis", &v2m->nr_spis)) {
+		pr_info("Overriding V2M MSI_TYPER (base:%u, num:%u)\n",
+			v2m->spi_start, v2m->nr_spis);
+	} else {
+		u32 typer = readl_relaxed(v2m->base + V2M_MSI_TYPER);
+
+		v2m->spi_start = V2M_MSI_TYPER_BASE_SPI(typer);
+		v2m->nr_spis = V2M_MSI_TYPER_NUM_SPI(typer);
+	}
+
+	if (!is_msi_spi_valid(v2m->spi_start, v2m->nr_spis)) {
+		ret = -EINVAL;
+		goto err_chip_rm;
+	}
+
+	v2m->bm = kzalloc(sizeof(long) * BITS_TO_LONGS(v2m->nr_spis),
+			  GFP_KERNEL);
+	if (!v2m->bm) {
+		ret = -ENOMEM;
+		goto err_chip_rm;
+	}
+
+	v2m->domain = irq_domain_add_simple(node, v2m->nr_spis, v2m->spi_start,
+					    &gicv2m_domain_ops, v2m);
+	if (!v2m->domain) {
+		pr_err("Failed to create GICv2m domain\n");
+		ret = -EINVAL;
+		goto err_free_bm;
+	}
+
+	v2m->domain->parent = parent;
+
+	spin_lock_init(&v2m->msi_cnt_lock);
+
+	pr_info("Node %s: range[%#lx:%#lx], SPI[%d:%d]\n", node->name,
+		(unsigned long)v2m->res.start, (unsigned long)v2m->res.end,
+		v2m->spi_start, (v2m->spi_start + v2m->nr_spis));
+
+	return 0;
+
+err_free_bm:
+	kfree(v2m->bm);
+err_chip_rm:
+	of_pci_msi_chip_remove(&v2m->msi_chip);
+err_iounmap:
+	iounmap(v2m->base);
+err_free_v2m:
+	kfree(v2m);
+	return ret;
+}
+
+int __init gicv2m_of_init(struct device_node *node,
+			  struct irq_domain *parent)
+{
+	int ret = 0;
+	struct v2m_data *v2m;
+	struct device_node *child = NULL;
+
+	for (;;) {
+		child = of_get_next_child(node, child);
+		if (!child)
+			break;
+
+		if (!of_device_is_compatible(child, "arm,gic-v2m-frame"))
+			continue;
+
+		if (!of_find_property(child, "msi-controller", NULL))
+			continue;
+
+		ret = gicv2m_init_one(child, &v2m, parent);
+		if (ret) {
+			of_node_put(node);
+			break;
+		}
+	}
+	return ret;
+}
diff --git a/drivers/irqchip/irq-gic-v2m.h b/drivers/irqchip/irq-gic-v2m.h
new file mode 100644
index 0000000..66676a9
--- /dev/null
+++ b/drivers/irqchip/irq-gic-v2m.h
@@ -0,0 +1,6 @@ 
+#ifndef _IRQ_GIC_V2M_H_
+#define _IRQ_GIC_V2M_H_
+
+int gicv2m_of_init(struct device_node *node, struct irq_domain *parent) __init;
+
+#endif /* _IRQ_GIC_V2M_H_ */
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index a99c211..166bc8a 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -46,6 +46,7 @@ 
 #include <asm/smp_plat.h>
 
 #include "irq-gic-common.h"
+#include "irq-gic-v2m.h"
 #include "irqchip.h"
 
 union gic_base {
@@ -843,10 +844,20 @@  static int gic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
 	unsigned int type = IRQ_TYPE_NONE;
 	struct of_phandle_args *irq_data = arg;
 
-	ret = gic_irq_domain_xlate(domain, irq_data->np, irq_data->args,
-				   irq_data->args_count, &hwirq, &type);
-	if (ret)
-		return ret;
+	if (irq_data) {
+		ret = gic_irq_domain_xlate(domain, irq_data->np, irq_data->args,
+					   irq_data->args_count, &hwirq, &type);
+		if (ret)
+			return ret;
+	} else {
+		/*
+		 * When calling from the children domain (e.g. GICv2m),
+		 * the child domain does not always have reference to
+		 * the of_phandle_arg.  In this case, GIC domain assumes
+		 * direct mapping between virq and hwirq.
+		 */
+		hwirq = virq;
+	}
 
 	for (i = 0; i < nr_irqs; i++)
 		gic_irq_domain_map(domain, virq+i, hwirq+i);
@@ -1055,6 +1066,10 @@  gic_of_init(struct device_node *node, struct device_node *parent)
 		irq = irq_of_parse_and_map(node, 0);
 		gic_cascade_irq(gic_cnt, irq);
 	}
+
+	if (IS_ENABLED(CONFIG_ARM_GIC_V2M))
+		gicv2m_of_init(node, gic_data[gic_cnt].domain);
+
 	gic_cnt++;
 	return 0;
 }