Message ID | 20240814124537.29847-1-kabel@kernel.org (mailing list archive)
---|---
State | New, archived
Series | irqchip/armada-370-xp: Implement SoC Error interrupts
On Wed, Aug 14 2024 at 14:45, Marek Behún wrote:

Cc+ device tree people.

> + The code binds this new interrupt domain to the same device-tree node as
> + the main interrupt domain. The main interrupt controller has its
> + interrupts described by one argument in device-tree
> + (#interrupt-cells = <1>), i.e.:
> +
> +   interrupts-extended = <&mpic 8>;
> +
> + Because of backwards compatibility we cannot change this number of
> + arguments, and so the SoC Error interrupts must also be described by
> + this one number.
> +
> + Thus, to describe a SoC Error interrupt, one has to add an offset
> + to the SoC Error interrupt number. Offset 0x400 was chosen because the
> + main controller supports at most 1024 interrupts (in theory; in practice
> + it seems to be 116 interrupts on all supported platforms). An example of
> + describing a SoC Error interrupt is
> +
> +   interrupts-extended = <&mpic 0x404>;

This looks like a horrible hack and I don't understand why this can't be
a separate interrupt controller, which it is in the hardware. That
controller utilizes interrupt 4 from the MPIC.

But then my DT foo is limited, so I let the DT folks comment on that.

> +static int mpic_soc_err_irq_set_affinity(struct irq_data *d, const struct cpumask *mask, bool force)
> +{
> +	unsigned int cpu;
> +
> +	/*
> +	 * TODO: The mpic->per_cpu region accesses CPU Local IC registers for CPU n when accessed
> +	 * from CPU n. Thus if we want to access these registers from another CPU, we need to request
> +	 * a function to be executed on CPU n. This is what we do here by calling smp_call_on_cpu().
> +	 *
> +	 * Instead, we could access CPU Local IC registers by having CPU Local region of each CPU
> +	 * mapped in the MPIC private data structure. We could do this either by extending the
> +	 * register resource in the device-tree, or by computing the physical base address of those
> +	 * regions relative to the main MPIC base address.

That requires locking for those registers, obviously.

> +	 */
> +
> +	cpus_read_lock();

This code was clearly never tested with any debug enabled. set_affinity()
is invoked with interrupts disabled and irq_desc::lock held.
cpus_read_lock() can sleep...

The mandatory debug options would have told you loud and clearly.

> +	/* First, disable the ERR IRQ on all cores */
> +	for_each_online_cpu(cpu)
> +		smp_call_on_cpu(cpu, mpic_soc_err_irq_mask_on_cpu, d, true);

Again. smp_call_on_cpu() invokes wait_for_completion(), which obviously
can sleep.

Also, why do you want to do that on _ALL_ CPUs if there is only one you
pick from the effective affinity mask?

> +	/* Then enable on one online core from the affinity mask */
> +	cpu = cpumask_any_and(mask, cpu_online_mask);
> +	smp_call_on_cpu(cpu, mpic_soc_err_irq_unmask_on_cpu, d, true);

Ditto.

So you really want to map the registers so they are accessible
cross-CPU, including locking. Alternatively, pin the error interrupts to
CPU0, which cannot be unplugged, and be done with it.

> +static int mpic_soc_err_irq_map(struct irq_domain *domain, unsigned int virq, irq_hw_number_t hwirq)
> +{
> +	struct mpic *mpic = domain->host_data;
> +
> +	irq_set_chip_data(virq, mpic);
> +
> +	mpic_soc_err_irq_mask(irq_get_irq_data(virq));

What for? It should be masked if it's not mapped, no?

> +	irq_set_status_flags(virq, IRQ_LEVEL);
> +	irq_set_chip_and_handler(virq, &mpic_soc_err_irq_chip, handle_level_irq);
> +	irq_set_probe(virq);
> +
> +	return 0;
> +}
> +
> +static int mpic_soc_err_xlate(struct irq_domain *domain, struct device_node *np,
> +			      const u32 *spec, unsigned int count,
> +			      unsigned long *hwirq, unsigned int *type)
> +{
> +	int err = irq_domain_xlate_onecell(domain, np, spec, count, hwirq, type);
> +
> +	if (err)
> +		return err;
> +
> +	*hwirq -= MPIC_SOC_ERR_IRQS_OFFSET;
> +	return 0;
> +}
> +
> +static int __init mpic_soc_err_init(struct mpic *mpic, struct device_node *np)
> +{
> +	unsigned int nr_irqs;
> +
> +	if (of_machine_is_compatible("marvell,armada-370-xp"))
> +		nr_irqs = 32;
> +	else
> +		nr_irqs = 64;
> +
> +	mpic->soc_err_domain = irq_domain_add_hierarchy(mpic->domain, 0, nr_irqs, np,
> +							&mpic_soc_err_irq_ops, mpic);

Why is this a hierarchical domain? That does not make any sense at all.

The MPIC domain provides only the demultiplexing interrupt and is not
involved in a domain hierarchy.

Hierarchical domains are required when the top level domain depends on
resources from the parent domain which need to be allocated when mapping
an interrupt, e.g. on x86:

    vector - remap - PCI/MSI

So the top level PCI/MSI domain requires resources from the remap domain
and the remap domain requires a vector from the vector domain. These
resources are per interrupt.

During runtime there are also irq_*() callbacks which utilize callbacks
from the parent domains, i.e. mask() invokes mask(parent)...

But that has nothing to do with demultiplexed interrupts because the
underlying demultiplex interrupt is already there and the same for all
interrupts in the demultiplexed domain. There is no callback dependency
and no resource dependency at all.

Thanks,

        tglx
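To illustrate the register-mapping approach Thomas suggests, here is a
minimal, untested sketch. It assumes a hypothetical percpu_base[] array in
struct mpic holding an ioremap() of each CPU's Local IC region, plus a
driver-private raw spinlock (mpic_soc_err_lock); neither exists in the
patch as posted:

	static DEFINE_RAW_SPINLOCK(mpic_soc_err_lock);

	/*
	 * Non-sleeping read-modify-write of a CPU's SoC Error mask register.
	 * Safe from set_affinity(), which runs with interrupts disabled and
	 * irq_desc::lock held.
	 */
	static void mpic_soc_err_rmw(void __iomem *base, u32 reg, u32 bit, bool set)
	{
		unsigned long flags;
		u32 val;

		raw_spin_lock_irqsave(&mpic_soc_err_lock, flags);
		val = readl(base + reg);
		writel(set ? (val | bit) : (val & ~bit), base + reg);
		raw_spin_unlock_irqrestore(&mpic_soc_err_lock, flags);
	}

	static int mpic_soc_err_irq_set_affinity(struct irq_data *d,
						 const struct cpumask *mask, bool force)
	{
		struct mpic *mpic = irq_data_get_irq_chip_data(d);
		irq_hw_number_t hwirq = irqd_to_hwirq(d);
		u32 reg = hwirq >= 32 ? MPIC_INT_SOC_ERR_1_MASK : MPIC_INT_SOC_ERR_0_MASK;
		u32 bit = BIT(hwirq % 32);
		unsigned int old = cpumask_first(irq_data_get_effective_affinity_mask(d));
		unsigned int cpu = cpumask_any_and(mask, cpu_online_mask);

		if (cpu >= nr_cpu_ids)
			return -EINVAL;

		/* Mask on the previous target only, then unmask on the new one */
		if (old < nr_cpu_ids)
			mpic_soc_err_rmw(mpic->percpu_base[old], reg, bit, false);
		mpic_soc_err_rmw(mpic->percpu_base[cpu], reg, bit, true);

		irq_data_update_effective_affinity(d, cpumask_of(cpu));
		return IRQ_SET_MASK_OK;
	}

The alternative Thomas mentions — pinning the error interrupts to CPU0,
which cannot be unplugged — would avoid the cross-CPU access entirely, at
the cost of fixed affinity.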
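For contrast, the parent-callback chaining that characterizes a genuine
hierarchy looks like the sketch below. irq_chip_mask_parent() is the stock
kernel helper hierarchical irqchips use; the example_* names are
hypothetical:

	/*
	 * In a real hierarchy each level's callbacks chain into the parent
	 * domain's chip via irq_data::parent_data — the runtime dependency
	 * described above.
	 */
	static void example_hier_mask(struct irq_data *d)
	{
		example_mask_this_level(d);	/* mask at this level first ... */
		irq_chip_mask_parent(d);	/* ... then invoke mask(parent) */
	}

The SoC Error chip here chains nothing: its mask/unmask callbacks only
touch the MPIC's own SoC Error mask registers, which is exactly why a
plain domain suffices.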
On Wed, Aug 14, 2024 at 05:51:33PM +0200, Thomas Gleixner wrote:
> On Wed, Aug 14 2024 at 14:45, Marek Behún wrote:
>
> Cc+ device tree people.
>
> > + The code binds this new interrupt domain to the same device-tree node as
> > + the main interrupt domain. The main interrupt controller has its
> > + interrupts described by one argument in device-tree
> > + (#interrupt-cells = <1>), i.e.:
> > +
> > +   interrupts-extended = <&mpic 8>;
> > +
> > + Because of backwards compatibility we cannot change this number of
> > + arguments, and so the SoC Error interrupts must also be described by
> > + this one number.
> > +
> > + Thus, to describe a SoC Error interrupt, one has to add an offset
> > + to the SoC Error interrupt number. Offset 0x400 was chosen because the
> > + main controller supports at most 1024 interrupts (in theory; in practice
> > + it seems to be 116 interrupts on all supported platforms). An example of
> > + describing a SoC Error interrupt is
> > +
> > +   interrupts-extended = <&mpic 0x404>;
>
> This looks like a horrible hack and I don't understand why this can't be
> a separate interrupt controller, which it is in the hardware. That
> controller utilizes interrupt 4 from the MPIC.
>
> But then my DT foo is limited, so I let the DT folks comment on that.

I'm guessing, not being author and all that, that since the registers
for this SOC_ERR business are intermingled with those of the existing
interrupt controller, it is not possible to create a standalone node
for the new controller with its own reg entry without having to redo
the binding etc. for the existing controller, ending up with some sort
of syscon.

Alternatively, I suppose a child node could work, but it wouldn't be
much less of a hack than deciding that numbers > 0x400 are the SOC_ERR
ones.

I see arguments for doing it either way and Marek must have an opinion
on doing it without a new node, given the comment suggesting that a
previous revision did have a dedicated node and that approach was
dropped.

Cheers,
Conor.
On Wed, Aug 14, 2024 at 05:43:19PM +0100, Conor Dooley wrote:
> On Wed, Aug 14, 2024 at 05:51:33PM +0200, Thomas Gleixner wrote:
> > On Wed, Aug 14 2024 at 14:45, Marek Behún wrote:
> >
> > Cc+ device tree people.
> >
> > > + The code binds this new interrupt domain to the same device-tree node as
> > > + the main interrupt domain. The main interrupt controller has its
> > > + interrupts described by one argument in device-tree
> > > + (#interrupt-cells = <1>), i.e.:
> > > +
> > > +   interrupts-extended = <&mpic 8>;
> > > +
> > > + Because of backwards compatibility we cannot change this number of
> > > + arguments, and so the SoC Error interrupts must also be described by
> > > + this one number.
> > > +
> > > + Thus, to describe a SoC Error interrupt, one has to add an offset
> > > + to the SoC Error interrupt number. Offset 0x400 was chosen because the
> > > + main controller supports at most 1024 interrupts (in theory; in practice
> > > + it seems to be 116 interrupts on all supported platforms). An example of
> > > + describing a SoC Error interrupt is
> > > +
> > > +   interrupts-extended = <&mpic 0x404>;
> >
> > This looks like a horrible hack and I don't understand why this can't be
> > a separate interrupt controller, which it is in the hardware. That
> > controller utilizes interrupt 4 from the MPIC.
> >
> > But then my DT foo is limited, so I let the DT folks comment on that.
>
> I'm guessing, not being author and all that, that since the registers
> for this SOC_ERR business are intermingled with those of the existing
> interrupt controller, it is not possible to create a standalone node
> for the new controller with its own reg entry without having to redo
> the binding etc. for the existing controller, ending up with some sort
> of syscon.
>
> Alternatively, I suppose a child node could work, but it wouldn't be
> much less of a hack than deciding that numbers > 0x400 are the SOC_ERR
> ones.
>
> I see arguments for doing it either way and Marek must have an opinion
> on doing it without a new node, given the comment suggesting that a
> previous revision did have a dedicated node and that approach was
> dropped.

This is exactly the reason. Pali's original code required creating
another interrupt-controller node (soc_err) within the mpic node:

	mpic: interrupt-controller@20a00 {
		compatible = "marvell,mpic";
		reg = <0x20a00 0x2d0>, <0x21070 0x58>;
		#interrupt-cells = <1>;
		interrupt-controller;
		msi-controller;
		interrupts = <GIC_PPI 15 IRQ_TYPE_LEVEL_HIGH>;

		soc_err: interrupt-controller@20 {
			interrupt-controller;
			#interrupt-cells = <1>;
		};
	};

I decided against it, because to do this correctly in device-tree
bindings, it gets unnecessarily complicated:
- it requires #address-cells and #size-cells within the mpic node
- the `interrupt-controller` property of the mpic node can in some
  interpretations clash with the `interrupt-controller@20` name of the
  child node
- the registers of soc_err and mpic overlap
- the child node should have a compatible string if we want to be proper

I also did consider syscon, but IMO it just adds unnecessary burden to
backwards compatibility of device-trees. The solution with the 0x400
offset solves the backwards compatibility issue.

Marek
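To make the offset scheme concrete, here is the decode arithmetic in
isolation (illustrative only; in the patch below the real logic lives in
mpic_domain_select() and mpic_soc_err_xlate(), and the helper names here
are hypothetical):

	#define MPIC_SOC_ERR_IRQS_OFFSET	0x400

	/* A one-cell specifier is routed to a domain by value range ... */
	static bool spec_is_soc_err(u32 spec0, unsigned int soc_err_nr_irqs)
	{
		return spec0 >= MPIC_SOC_ERR_IRQS_OFFSET &&
		       spec0 < MPIC_SOC_ERR_IRQS_OFFSET + soc_err_nr_irqs;
	}

	/*
	 * ... and the SoC Error domain strips the offset again:
	 * <&mpic 0x404> selects the SoC Error domain and decodes to hwirq 4
	 * there, while <&mpic 8> stays in the main domain as hwirq 8.
	 */
	static unsigned long soc_err_spec_to_hwirq(u32 spec0)
	{
		return spec0 - MPIC_SOC_ERR_IRQS_OFFSET;
	}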
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index d078bdc48c38..ac011e9df7f6 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -82,6 +82,7 @@ config IRQ_MSI_LIB
 
 config ARMADA_370_XP_IRQ
 	bool
 	select GENERIC_IRQ_CHIP
+	select IRQ_DOMAIN_HIERARCHY
 	select PCI_MSI if PCI
 	select GENERIC_IRQ_EFFECTIVE_AFF_MASK if SMP
diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c
index d7c5ef248474..8075b25a66a5 100644
--- a/drivers/irqchip/irq-armada-370-xp.c
+++ b/drivers/irqchip/irq-armada-370-xp.c
@@ -27,6 +27,7 @@
 #include <linux/of_pci.h>
 #include <linux/irqdomain.h>
 #include <linux/slab.h>
+#include <linux/smp.h>
 #include <linux/syscore_ops.h>
 #include <linux/msi.h>
 #include <linux/types.h>
@@ -116,6 +117,8 @@
 #define MPIC_INT_CONTROL			0x00
 #define MPIC_INT_CONTROL_NUMINT_MASK		GENMASK(12, 2)
 #define MPIC_SW_TRIG_INT			0x04
+#define MPIC_INT_SOC_ERR_0_CAUSE		0x20
+#define MPIC_INT_SOC_ERR_1_CAUSE		0x24
 #define MPIC_INT_SET_ENABLE			0x30
 #define MPIC_INT_CLEAR_ENABLE			0x34
 #define MPIC_INT_SOURCE_CTL(hwirq)		(0x100 + (hwirq) * 4)
@@ -130,7 +133,9 @@
 #define MPIC_CPU_INTACK_IID_MASK		GENMASK(9, 0)
 #define MPIC_INT_SET_MASK			0x48
 #define MPIC_INT_CLEAR_MASK			0x4C
+#define MPIC_INT_SOC_ERR_0_MASK			0x50
 #define MPIC_INT_FABRIC_MASK			0x54
+#define MPIC_INT_SOC_ERR_1_MASK			0x58
 #define MPIC_INT_CAUSE_PERF(cpu)		BIT(cpu)
 
 #define MPIC_PER_CPU_IRQS_NR			29
@@ -149,12 +154,16 @@
 #define PCI_MSI_FULL_DOORBELL_SRC0_MASK	GENMASK(15, 0)
 #define PCI_MSI_FULL_DOORBELL_SRC1_MASK	GENMASK(31, 16)
 
+/* Device-tree binding offset for SoC Error interrupts */
+#define MPIC_SOC_ERR_IRQS_OFFSET	0x400
+
 /**
  * struct mpic - MPIC private data structure
  * @base:		MPIC registers base address
  * @per_cpu:		per-CPU registers base address
  * @parent_irq:		parent IRQ if MPIC is not top-level interrupt controller
  * @domain:		MPIC main interrupt domain
+ * @soc_err_domain:	SoC Error interrupt domain
  * @ipi_domain:		IPI domain
  * @msi_domain:		MSI domain
  * @msi_inner_domain:	MSI inner domain
@@ -172,6 +181,7 @@ struct mpic {
 	void __iomem *per_cpu;
 	int parent_irq;
 	struct irq_domain *domain;
+	struct irq_domain *soc_err_domain;
 #ifdef CONFIG_SMP
 	struct irq_domain *ipi_domain;
 #endif
@@ -542,19 +552,34 @@ static void mpic_smp_cpu_init(struct mpic *mpic)
 	writel(0, mpic->per_cpu + MPIC_INT_CLEAR_MASK);
 }
 
-static void mpic_reenable_percpu(struct mpic *mpic)
+static void mpic_reenable_percpu_domain(struct irq_domain *domain, irq_hw_number_t max,
+					void (*unmask_cb)(struct irq_data *))
 {
-	/* Re-enable per-CPU interrupts that were enabled before suspend */
-	for (irq_hw_number_t i = 0; i < MPIC_PER_CPU_IRQS_NR; i++) {
-		unsigned int virq = irq_linear_revmap(mpic->domain, i);
+	for (irq_hw_number_t i = 0; i < max; i++) {
+		unsigned int virq = irq_linear_revmap(domain, i);
 		struct irq_data *d;
 
 		if (!virq || !irq_percpu_is_enabled(virq))
 			continue;
 
 		d = irq_get_irq_data(virq);
-		mpic_irq_unmask(d);
+		unmask_cb(d);
 	}
+}
+
+static void mpic_soc_err_irq_unmask(struct irq_data *d);
+
+static void mpic_reenable_percpu(struct mpic *mpic)
+{
+	/* Re-enable per-CPU interrupts that were enabled before suspend */
+	mpic_reenable_percpu_domain(mpic->domain, MPIC_PER_CPU_IRQS_NR, mpic_irq_unmask);
+
+	/* Re-enable per-CPU SoC Error interrupts that were enabled before suspend */
+	mpic_reenable_percpu_domain(mpic->soc_err_domain, mpic->soc_err_domain->hwirq_max,
+				    mpic_soc_err_irq_unmask);
+
+	/* Unmask summary SoC Error Interrupt */
+	writel(4, mpic->per_cpu + MPIC_INT_CLEAR_MASK);
 
 	if (mpic_is_ipi_available(mpic))
 		mpic_ipi_resume(mpic);
@@ -599,12 +624,31 @@ static struct irq_chip mpic_irq_chip = {
 	.flags		= IRQCHIP_SKIP_SET_WAKE | IRQCHIP_MASK_ON_SUSPEND,
 };
 
+static int mpic_domain_select(struct irq_domain *our_domain, irq_hw_number_t base,
+			      struct irq_domain *domain, struct irq_fwspec *fwspec,
+			      enum irq_domain_bus_token bus_token)
+{
+	u32 param = fwspec->param[0];
+
+	return domain == our_domain && domain->fwnode == fwspec->fwnode &&
+	       bus_token == DOMAIN_BUS_WIRED && fwspec->param_count == 1 &&
+	       param >= base && param < base + domain->hwirq_max;
+}
+
+static int mpic_irq_select(struct irq_domain *domain, struct irq_fwspec *fwspec,
+			   enum irq_domain_bus_token bus_token)
+{
+	struct mpic *mpic = domain->host_data;
+
+	return mpic_domain_select(mpic->domain, 0, domain, fwspec, bus_token);
+}
+
 static int mpic_irq_map(struct irq_domain *domain, unsigned int virq, irq_hw_number_t hwirq)
 {
 	struct mpic *mpic = domain->host_data;
 
-	/* IRQs 0 and 1 cannot be mapped, they are handled internally */
-	if (hwirq <= 1)
+	/* IRQs 0, 1 and 4 cannot be mapped, they are handled internally */
+	if (hwirq <= 1 || hwirq == 4)
 		return -EINVAL;
 
 	irq_set_chip_data(virq, mpic);
@@ -628,10 +672,173 @@ static int mpic_irq_map(struct irq_domain *domain, unsigned int virq, irq_hw_num
 }
 
 static const struct irq_domain_ops mpic_irq_ops = {
+	.select	= mpic_irq_select,
 	.map	= mpic_irq_map,
 	.xlate	= irq_domain_xlate_onecell,
 };
 
+static void mpic_soc_err_irq_mask_set(struct irq_data *d, bool mask)
+{
+	struct mpic *mpic = irq_data_get_irq_chip_data(d);
+	irq_hw_number_t hwirq = irqd_to_hwirq(d);
+	u32 bit = BIT(hwirq % 32);
+	u32 reg;
+
+	if (hwirq >= 32)
+		reg = MPIC_INT_SOC_ERR_1_MASK;
+	else
+		reg = MPIC_INT_SOC_ERR_0_MASK;
+
+	atomic_io_modify(mpic->per_cpu + reg, bit, mask ? 0 : bit);
+}
+
+static void mpic_soc_err_irq_mask(struct irq_data *d)
+{
+	mpic_soc_err_irq_mask_set(d, true);
+}
+
+static void mpic_soc_err_irq_unmask(struct irq_data *d)
+{
+	mpic_soc_err_irq_mask_set(d, false);
+}
+
+static int mpic_soc_err_irq_mask_on_cpu(void *d)
+{
+	mpic_soc_err_irq_mask(d);
+
+	return 0;
+}
+
+static int mpic_soc_err_irq_unmask_on_cpu(void *d)
+{
+	mpic_soc_err_irq_unmask(d);
+
+	return 0;
+}
+
+static int mpic_soc_err_irq_set_affinity(struct irq_data *d, const struct cpumask *mask, bool force)
+{
+	unsigned int cpu;
+
+	/*
+	 * TODO: The mpic->per_cpu region accesses CPU Local IC registers for CPU n when accessed
+	 * from CPU n. Thus if we want to access these registers from another CPU, we need to request
+	 * a function to be executed on CPU n. This is what we do here by calling smp_call_on_cpu().
+	 *
+	 * Instead, we could access CPU Local IC registers by having CPU Local region of each CPU
+	 * mapped in the MPIC private data structure. We could do this either by extending the
+	 * register resource in the device-tree, or by computing the physical base address of those
+	 * regions relative to the main MPIC base address.
+	 */
+
+	cpus_read_lock();
+
+	/* First, disable the ERR IRQ on all cores */
+	for_each_online_cpu(cpu)
+		smp_call_on_cpu(cpu, mpic_soc_err_irq_mask_on_cpu, d, true);
+
+	/* Then enable on one online core from the affinity mask */
+	cpu = cpumask_any_and(mask, cpu_online_mask);
+	smp_call_on_cpu(cpu, mpic_soc_err_irq_unmask_on_cpu, d, true);
+
+	cpus_read_unlock();
+
+	irq_data_update_effective_affinity(d, cpumask_of(cpu));
+
+	return IRQ_SET_MASK_OK;
+}
+
+static struct irq_chip mpic_soc_err_irq_chip = {
+	.name		= "MPIC SoC Errors",
+	.irq_mask	= mpic_soc_err_irq_mask,
+	.irq_unmask	= mpic_soc_err_irq_unmask,
+	.irq_set_affinity = mpic_soc_err_irq_set_affinity,
+};
+
+static int mpic_soc_err_irq_select(struct irq_domain *domain, struct irq_fwspec *fwspec,
+				   enum irq_domain_bus_token bus_token)
+{
+	struct mpic *mpic = domain->host_data;
+
+	return mpic_domain_select(mpic->soc_err_domain, MPIC_SOC_ERR_IRQS_OFFSET, domain, fwspec,
+				  bus_token);
+}
+
+static int mpic_soc_err_irq_map(struct irq_domain *domain, unsigned int virq, irq_hw_number_t hwirq)
+{
+	struct mpic *mpic = domain->host_data;
+
+	irq_set_chip_data(virq, mpic);
+
+	mpic_soc_err_irq_mask(irq_get_irq_data(virq));
+	irq_set_status_flags(virq, IRQ_LEVEL);
+	irq_set_chip_and_handler(virq, &mpic_soc_err_irq_chip, handle_level_irq);
+	irq_set_probe(virq);
+
+	return 0;
+}
+
+static int mpic_soc_err_xlate(struct irq_domain *domain, struct device_node *np,
+			      const u32 *spec, unsigned int count,
+			      unsigned long *hwirq, unsigned int *type)
+{
+	int err = irq_domain_xlate_onecell(domain, np, spec, count, hwirq, type);
+
+	if (err)
+		return err;
+
+	*hwirq -= MPIC_SOC_ERR_IRQS_OFFSET;
+
+	return 0;
+}
+
+static const struct irq_domain_ops mpic_soc_err_irq_ops = {
+	.select	= mpic_soc_err_irq_select,
+	.map	= mpic_soc_err_irq_map,
+	.xlate	= mpic_soc_err_xlate,
+};
+
+static int __init mpic_soc_err_init(struct mpic *mpic, struct device_node *np)
+{
+	unsigned int nr_irqs;
+
+	if (of_machine_is_compatible("marvell,armada-370-xp"))
+		nr_irqs = 32;
+	else
+		nr_irqs = 64;
+
+	mpic->soc_err_domain = irq_domain_add_hierarchy(mpic->domain, 0, nr_irqs, np,
+							&mpic_soc_err_irq_ops, mpic);
+	if (WARN_ON(!mpic->soc_err_domain))
+		return -ENOMEM;
+
+	/* Unmask summary SoC Error Interrupt */
+	writel(4, mpic->per_cpu + MPIC_INT_CLEAR_MASK);
+
+	return 0;
+}
+
+static void mpic_handle_soc_err_reg(struct mpic *mpic, u32 mask_reg, u32 cause_reg,
+				    unsigned int offset)
+{
+	unsigned long cause;
+	unsigned int i;
+
+	cause = readl(mpic->base + cause_reg);
+	cause &= readl(mpic->per_cpu + mask_reg);
+
+	for_each_set_bit(i, &cause, BITS_PER_LONG)
+		generic_handle_domain_irq(mpic->soc_err_domain, i + offset);
+}
+
+static void mpic_handle_soc_err_irq(struct mpic *mpic)
+{
+	mpic_handle_soc_err_reg(mpic, MPIC_INT_SOC_ERR_0_MASK, MPIC_INT_SOC_ERR_0_CAUSE, 0);
+
+	if (mpic->soc_err_domain->hwirq_max > 32)
+		mpic_handle_soc_err_reg(mpic, MPIC_INT_SOC_ERR_1_MASK, MPIC_INT_SOC_ERR_1_CAUSE, 32);
+}
+
 #ifdef CONFIG_PCI_MSI
 static void mpic_handle_msi_irq(struct mpic *mpic)
 {
@@ -692,6 +899,11 @@ static void mpic_handle_cascade_irq(struct irq_desc *desc)
 			continue;
 		}
 
+		if (unlikely(i == 4)) {
+			mpic_handle_soc_err_irq(mpic);
+			continue;
+		}
+
 		generic_handle_domain_irq(mpic->domain, i);
 	}
 
@@ -711,9 +923,13 @@ static void __exception_irq_entry mpic_handle_irq(struct pt_regs *regs)
 		if (i > 1022)
 			break;
 
-		if (i > 1)
+		if (i > 1 && i != 4)
 			generic_handle_domain_irq(mpic->domain, i);
 
+		/* SoC Error handling */
+		if (unlikely(i == 4))
+			mpic_handle_soc_err_irq(mpic);
+
 		/* MSI handling */
 		if (i == 1)
 			mpic_handle_msi_irq(mpic);
@@ -766,6 +982,13 @@ static void mpic_resume(void)
 		}
 	}
 
+	/*
+	 * Re-enable per-CPU SoC Error interrupts on the current CPU, mpic_reenable_percpu() will
+	 * take care of secondary CPUs when they come up.
+	 */
+	mpic_reenable_percpu_domain(mpic->soc_err_domain, mpic->soc_err_domain->hwirq_max,
+				    mpic_soc_err_irq_unmask);
+
 	/* Reconfigure doorbells for IPIs and MSIs */
 	writel(mpic->doorbell_mask, mpic->per_cpu + MPIC_IN_DRBEL_MASK);
 
@@ -782,6 +1005,9 @@ static void mpic_resume(void)
 	if (src1)
 		writel(1, mpic->per_cpu + MPIC_INT_CLEAR_MASK);
 
+	/* Unmask summary SoC Error Interrupt */
+	writel(4, mpic->per_cpu + MPIC_INT_CLEAR_MASK);
+
 	if (mpic_is_ipi_available(mpic))
 		mpic_ipi_resume(mpic);
 }
@@ -873,6 +1099,12 @@ static int __init mpic_of_init(struct device_node *node, struct device_node *par
 	mpic_perf_init(mpic);
 	mpic_smp_cpu_init(mpic);
 
+	err = mpic_soc_err_init(mpic, node);
+	if (err) {
+		pr_err("%pOF: Unable to initialize SOC Error IRQs domain\n", node);
+		return err;
+	}
+
 	err = mpic_msi_init(mpic, node, phys_base);
 	if (err) {
 		pr_err("%pOF: Unable to initialize MSI domain\n", node);
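Given Thomas's objection to irq_domain_add_hierarchy() above, a plain
linear domain would express the same demultiplexing relationship without
pretending there is a hierarchy. A hedged, untested sketch of what
mpic_soc_err_init() could look like instead (only the allocation call
changes; this is not the posted patch):

	static int __init mpic_soc_err_init(struct mpic *mpic, struct device_node *np)
	{
		unsigned int nr_irqs;

		if (of_machine_is_compatible("marvell,armada-370-xp"))
			nr_irqs = 32;
		else
			nr_irqs = 64;

		/*
		 * Plain domain: summary interrupt 4 is demultiplexed in
		 * mpic_handle_soc_err_irq(); no per-interrupt resources are
		 * allocated from the parent MPIC domain.
		 */
		mpic->soc_err_domain = irq_domain_add_linear(np, nr_irqs,
							     &mpic_soc_err_irq_ops, mpic);
		if (WARN_ON(!mpic->soc_err_domain))
			return -ENOMEM;

		/* Unmask summary SoC Error Interrupt */
		writel(4, mpic->per_cpu + MPIC_INT_CLEAR_MASK);

		return 0;
	}

With a linear domain, the select IRQ_DOMAIN_HIERARCHY added to Kconfig in
this patch would also become unnecessary.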