diff mbox series

irqchip/gic-v3-its: Correctly honor the RID remapping

Message ID 20240717195937.2240400-1-maz@kernel.org (mailing list archive)
State New, archived
Headers show
Series irqchip/gic-v3-its: Correctly honor the RID remapping | expand

Commit Message

Marc Zyngier July 17, 2024, 7:59 p.m. UTC
Since 6adb35ff43a16 ("irqchip/gic-v3-its: Provide MSI parent for
PCI/MSI[-X]"), the primary domain a PCI device allocates its interrupts
from is the one that is directly attached to the device itself.

By virtue of being a PCI device, it has no OF node.

This domain is (through more layer than it is worth describing)
passed to its_pci_msi_prepare(), which tries to compute the
full RID that is presented to the ITS by the device. This is ultimately
done by calling pci_msi_domain_get_msi_rid(), passing both the
domain and the PCI device as arguments.

The baked-in assumption is that either the domain that is passed
to pci_msi_domain_get_msi_rid() describes an interrupt controller
with either an OF node or an entry in an ACPI IORT table.
In our case, it is *neither*. This domain is does not represent
anything firmware-based, but just an allocation unit for the device.

As a result, we fail to provide the full RID (which requires inspecting
the msi-map/msi-mask properties in the DT), and stick to the BDF,
which isn't very useful.

Tragedy follows with a litany of devices that randomly die as
they fail to see any MSI (because the RID is wrong) or fail to
get an allocation (because they try to steal LPIs from their
neighbour's pool).

This will happen on any system where a single ITS is shared by
multiple root ports and end-points with overlapping BDF numbers,
and has the topology described in the device-tree.  Simpler DT
topologies will luckily work, and so will ACPI-based systems.

Solve it by pointing pci_msi_domain_get_msi_rid() at the *parent*
domain, which is the ITS, resulting in a correct mapping and a
restored happiness in my personal zoo.

Fixes: 6adb35ff43a16 ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")
Reported-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 drivers/irqchip/irq-gic-v3-its-msi-parent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Johan Hovold July 18, 2024, 7:20 a.m. UTC | #1
On Wed, Jul 17, 2024 at 08:59:37PM +0100, Marc Zyngier wrote:
> Since 6adb35ff43a16 ("irqchip/gic-v3-its: Provide MSI parent for
> PCI/MSI[-X]"), the primary domain a PCI device allocates its interrupts
> from is the one that is directly attached to the device itself.
> 
> By virtue of being a PCI device, it has no OF node.
> 
> This domain is (through more layer than it is worth describing)
> passed to its_pci_msi_prepare(), which tries to compute the
> full RID that is presented to the ITS by the device. This is ultimately
> done by calling pci_msi_domain_get_msi_rid(), passing both the
> domain and the PCI device as arguments.
> 
> The baked-in assumption is that either the domain that is passed
> to pci_msi_domain_get_msi_rid() describes an interrupt controller
> with either an OF node or an entry in an ACPI IORT table.
> In our case, it is *neither*. This domain is does not represent

nit: stray "is"

> anything firmware-based, but just an allocation unit for the device.
> 
> As a result, we fail to provide the full RID (which requires inspecting
> the msi-map/msi-mask properties in the DT), and stick to the BDF,
> which isn't very useful.
> 
> Tragedy follows with a litany of devices that randomly die as
> they fail to see any MSI (because the RID is wrong) or fail to
> get an allocation (because they try to steal LPIs from their
> neighbour's pool).
> 
> This will happen on any system where a single ITS is shared by
> multiple root ports and end-points with overlapping BDF numbers,
> and has the topology described in the device-tree.  Simpler DT
> topologies will luckily work, and so will ACPI-based systems.
> 
> Solve it by pointing pci_msi_domain_get_msi_rid() at the *parent*
> domain, which is the ITS, resulting in a correct mapping and a
> restored happiness in my personal zoo.
> 
> Fixes: 6adb35ff43a16 ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")
> Reported-by: Johan Hovold <johan@kernel.org>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>

This fixes the regression for the Qualcomm machines I ran into this
with:

Tested-by: Johan Hovold <johan+linaro@kernel.org>

Thanks!

Johan
diff mbox series

Patch

diff --git a/drivers/irqchip/irq-gic-v3-its-msi-parent.c b/drivers/irqchip/irq-gic-v3-its-msi-parent.c
index 780e1bc9df549..2f3fc597331ba 100644
--- a/drivers/irqchip/irq-gic-v3-its-msi-parent.c
+++ b/drivers/irqchip/irq-gic-v3-its-msi-parent.c
@@ -65,7 +65,7 @@  static int its_pci_msi_prepare(struct irq_domain *domain, struct device *dev,
 	}
 
 	/* ITS specific DeviceID, as the core ITS ignores dev. */
-	info->scratchpad[0].ul = pci_msi_domain_get_msi_rid(domain, pdev);
+	info->scratchpad[0].ul = pci_msi_domain_get_msi_rid(domain->parent, pdev);
 
 	/*
 	 * @domain->msi_domain_info->hwsize contains the size of the