Message ID | 20170210114517.14812-1-sassmann@kpanic.de (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
[+Bjorn, linux-pci] On Friday, February 10, 2017 12:45:17 PM Stefan Assmann wrote: > When checking for boot interrupts and PRT index is zero avoid doing > any interrupt rerouting since the code currently does not handle > determining the IRQ via _CRS method. > > Fixes: > https://bugzilla.kernel.org/show_bug.cgi?id=43074 > > Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Bjorn, what do you think? > --- > drivers/acpi/pci_irq.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c > index c576a6f..6ecbde0 100644 > --- a/drivers/acpi/pci_irq.c > +++ b/drivers/acpi/pci_irq.c > @@ -280,7 +280,7 @@ static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) > static int acpi_reroute_boot_interrupt(struct pci_dev *dev, > struct acpi_prt_entry *entry) > { > - if (noioapicquirk || noioapicreroute) { > + if (noioapicquirk || noioapicreroute || entry->index == 0) { > return 0; > } else { > switch (bridge_has_boot_interrupt_variant(dev->bus)) { > -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Stefan, On Fri, Feb 10, 2017 at 12:45:17PM +0100, Stefan Assmann wrote: > When checking for boot interrupts and PRT index is zero avoid doing > any interrupt rerouting since the code currently does not handle > determining the IRQ via _CRS method. I don't understand this at all. Can you educate me a little? This bug occurs on the M2N-LR system. I think that board contains an Intel 6702PXH PCIe-to-PCI-X bridge to support its PCI-X slots. A PCI-X 3ware 9550SX is plugged in below the 6702PXH, and the problem is with the wired INTx interrupts from the 9550SX. Can we tell where those PCI-X INTx wires are connected? There's an IOAPIC in the 6702PXH; if they're connected there, the Intel 6700PXH 64-bit PCI Hub datasheet [1], sec 2.15.2, says it should generate upstream PCIe INTx messages ("boot interrupts") if the PCI-X INTx interrupt is masked in the APIC. If the PCI-X INTx wires are connected elsewhere, e.g., to some other IOAPIC, the 6702PXH should never see its IOAPIC inputs wiggle and it should never generate a boot interrupt. The fact that the 6702PXH IOAPIC doesn't appear in the lspci output and that "pci=noioapicquirk" is a workaround makes me think that we aren't using the 6702PXH IOAPIC and therefore we don't see boot interrupts on this system. So my guess is that we should not use the IRQ reroute feature on this system, and I think that's what your patch accomplishes. But I don't understand the way you figure it out. Looking for "entry->index == 0" doesn't seem like an obvious connection. [1] http://www.intel.com/content/dam/doc/datasheet/6700pxh-64-bit-pci-hub-datasheet.pdf > Fixes: > https://bugzilla.kernel.org/show_bug.cgi?id=43074 > > Signed-off-by: Stefan Assmann <sassmann@kpanic.de> > --- > drivers/acpi/pci_irq.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c > index c576a6f..6ecbde0 100644 > --- a/drivers/acpi/pci_irq.c > +++ b/drivers/acpi/pci_irq.c > @@ -280,7 +280,7 @@ static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) > static int acpi_reroute_boot_interrupt(struct pci_dev *dev, > struct acpi_prt_entry *entry) > { > - if (noioapicquirk || noioapicreroute) { > + if (noioapicquirk || noioapicreroute || entry->index == 0) { > return 0; > } else { > switch (bridge_has_boot_interrupt_variant(dev->bus)) { > -- > 2.9.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08.03.2017 23:34, Bjorn Helgaas wrote: > Hi Stefan, Hi Bjorn, > On Fri, Feb 10, 2017 at 12:45:17PM +0100, Stefan Assmann wrote: >> When checking for boot interrupts and PRT index is zero avoid doing >> any interrupt rerouting since the code currently does not handle >> determining the IRQ via _CRS method. > > I don't understand this at all. Can you educate me a little? Sure, I'll try to explain my chain of thoughts. > This bug occurs on the M2N-LR system. I think that board contains an > Intel 6702PXH PCIe-to-PCI-X bridge to support its PCI-X slots. A > PCI-X 3ware 9550SX is plugged in below the 6702PXH, and the problem is > with the wired INTx interrupts from the 9550SX. > > Can we tell where those PCI-X INTx wires are connected? There's an > IOAPIC in the 6702PXH; if they're connected there, the Intel 6700PXH > 64-bit PCI Hub datasheet [1], sec 2.15.2, says it should generate > upstream PCIe INTx messages ("boot interrupts") if the PCI-X INTx > interrupt is masked in the APIC. Correct, if you take a look at bridge_has_boot_interrupt_variant() we actually walk up all the bridges to the root bridge and check for the 6702PXH via irq_reroute_variant, which gets set via PCI quirks in quirk_reroute_to_boot_interrupts_intel(). > If the PCI-X INTx wires are connected elsewhere, e.g., to some other > IOAPIC, the 6702PXH should never see its IOAPIC inputs wiggle and it > should never generate a boot interrupt. > > The fact that the 6702PXH IOAPIC doesn't appear in the lspci output > and that "pci=noioapicquirk" is a workaround makes me think that we > aren't using the 6702PXH IOAPIC and therefore we don't see boot > interrupts on this system. Oh, I did not notice that dmesg only reports 1 IO-APIC. Though walking the bridges reports the 6702PXH IO-APIC being involved, otherwise the quirk wouldn't hit. > So my guess is that we should not use the IRQ reroute feature on this > system, and I think that's what your patch accomplishes. But I don't > understand the way you figure it out. Looking for "entry->index == 0" > doesn't seem like an obvious connection. The ACPI spec [1], sec 6.2.12, says "There are two ways that _PRT can be used. [...] In the second model, the PCI interrupts are hardwired to specific interrupt inputs on the interrupt controller and are not configurable. In this case, the Source field in _PRT does not reference a device, but instead contains the value zero, and the Source Index field contains the global system interrupt to which the PCI interrupt is hardwired." This has been the case on all systems we encountered boot interrupts on back then and thus what the quirk is based upon. Reminder of how the _PRT entry looked like on the M2N-LR. 3w-9xxx 0000:03:00.0: PRT pin=1 link=ffff8802140621b8 index=0 Compared to a _PRT entry from the only system I could dig up which has a 6702PXH. megaraid 0000:03:01.0: PRT pin=1 link= (null) index=24 megaraid 0000:03:01.0: PCI IRQ 24 -> rerouted to legacy IRQ 16 I remembered having read that there are buggy BIOSes out there which do not set Source to a zero value but still put the IRQ in Index, though I have no prove for that, I decided not to rely on it. If entry->index is zero the quirk is destined to fail anyway, imho. We could extend the condition to (entry->link && entry->index == 0) if you think that is better. Appreciate your thoughtful analysis! Stefan [1] http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf > > [1] http://www.intel.com/content/dam/doc/datasheet/6700pxh-64-bit-pci-hub-datasheet.pdf > >> Fixes: >> https://bugzilla.kernel.org/show_bug.cgi?id=43074 >> >> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> >> --- >> drivers/acpi/pci_irq.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c >> index c576a6f..6ecbde0 100644 >> --- a/drivers/acpi/pci_irq.c >> +++ b/drivers/acpi/pci_irq.c >> @@ -280,7 +280,7 @@ static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) >> static int acpi_reroute_boot_interrupt(struct pci_dev *dev, >> struct acpi_prt_entry *entry) >> { >> - if (noioapicquirk || noioapicreroute) { >> + if (noioapicquirk || noioapicreroute || entry->index == 0) { >> return 0; >> } else { >> switch (bridge_has_boot_interrupt_variant(dev->bus)) { >> -- >> 2.9.3 >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Mar 10, 2017 at 02:47:06PM +0100, Stefan Assmann wrote: > On 08.03.2017 23:34, Bjorn Helgaas wrote: > > Hi Stefan, > > Hi Bjorn, > > > On Fri, Feb 10, 2017 at 12:45:17PM +0100, Stefan Assmann wrote: > >> When checking for boot interrupts and PRT index is zero avoid doing > >> any interrupt rerouting since the code currently does not handle > >> determining the IRQ via _CRS method. > > > > I don't understand this at all. Can you educate me a little? > > Sure, I'll try to explain my chain of thoughts. > > > This bug occurs on the M2N-LR system. I think that board contains an > > Intel 6702PXH PCIe-to-PCI-X bridge to support its PCI-X slots. A > > PCI-X 3ware 9550SX is plugged in below the 6702PXH, and the problem is > > with the wired INTx interrupts from the 9550SX. > > > > Can we tell where those PCI-X INTx wires are connected? There's an > > IOAPIC in the 6702PXH; if they're connected there, the Intel 6700PXH > > 64-bit PCI Hub datasheet [1], sec 2.15.2, says it should generate > > upstream PCIe INTx messages ("boot interrupts") if the PCI-X INTx > > interrupt is masked in the APIC. > > Correct, if you take a look at bridge_has_boot_interrupt_variant() we > actually walk up all the bridges to the root bridge and check for the > 6702PXH via irq_reroute_variant, which gets set via PCI quirks in > quirk_reroute_to_boot_interrupts_intel(). > > > If the PCI-X INTx wires are connected elsewhere, e.g., to some other > > IOAPIC, the 6702PXH should never see its IOAPIC inputs wiggle and it > > should never generate a boot interrupt. > > > > The fact that the 6702PXH IOAPIC doesn't appear in the lspci output > > and that "pci=noioapicquirk" is a workaround makes me think that we > > aren't using the 6702PXH IOAPIC and therefore we don't see boot > > interrupts on this system. > > Oh, I did not notice that dmesg only reports 1 IO-APIC. Though walking > the bridges reports the 6702PXH IO-APIC being involved, otherwise the > quirk wouldn't hit. The 6702PXH is definitely involved as a PCIe-to-PCI-X bridge, but that doesn't mean the 6702PXH IOAPIC is involved. The PCI-X bus has INTx wires that *could* be connected to the 6702PXH IOAPIC, but on this system I think they're likely connected to an IOAPIC in the MCP55. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Feb 10, 2017 at 12:45:17PM +0100, Stefan Assmann wrote: > When checking for boot interrupts and PRT index is zero avoid doing > any interrupt rerouting since the code currently does not handle > determining the IRQ via _CRS method. > > Fixes: > https://bugzilla.kernel.org/show_bug.cgi?id=43074 > > Signed-off-by: Stefan Assmann <sassmann@kpanic.de> > --- > drivers/acpi/pci_irq.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c > index c576a6f..6ecbde0 100644 > --- a/drivers/acpi/pci_irq.c > +++ b/drivers/acpi/pci_irq.c > @@ -280,7 +280,7 @@ static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) > static int acpi_reroute_boot_interrupt(struct pci_dev *dev, > struct acpi_prt_entry *entry) > { > - if (noioapicquirk || noioapicreroute) { > + if (noioapicquirk || noioapicreroute || entry->index == 0) { > return 0; > } else { > switch (bridge_has_boot_interrupt_variant(dev->bus)) { I don't understand acpi_reroute_boot_interrupt() in general. I think I understand the chipset functionality, i.e., the Intel 6700PXH contains two IOAPICs, each with 16 interrupt inputs, and if an interrupt is asserted while that input is masked in the IOAPIC, the 6700PXH generates an INTx ASSERT message on the PCIe bus. What I don't understand is what the "correct" way of handling this is. I assume the intent was *not* that the OS should have device-specific code to deal with this. Is it a case of the BIOS doing something wrong? Is ACPI just not expressive enough to describe this hardware behavior? Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 13.03.2017 03:20, Bjorn Helgaas wrote: > On Fri, Feb 10, 2017 at 12:45:17PM +0100, Stefan Assmann wrote: >> When checking for boot interrupts and PRT index is zero avoid doing >> any interrupt rerouting since the code currently does not handle >> determining the IRQ via _CRS method. >> >> Fixes: >> https://bugzilla.kernel.org/show_bug.cgi?id=43074 >> >> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> >> --- >> drivers/acpi/pci_irq.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c >> index c576a6f..6ecbde0 100644 >> --- a/drivers/acpi/pci_irq.c >> +++ b/drivers/acpi/pci_irq.c >> @@ -280,7 +280,7 @@ static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) >> static int acpi_reroute_boot_interrupt(struct pci_dev *dev, >> struct acpi_prt_entry *entry) >> { >> - if (noioapicquirk || noioapicreroute) { >> + if (noioapicquirk || noioapicreroute || entry->index == 0) { >> return 0; >> } else { >> switch (bridge_has_boot_interrupt_variant(dev->bus)) { > > I don't understand acpi_reroute_boot_interrupt() in general. > > I think I understand the chipset functionality, i.e., the Intel > 6700PXH contains two IOAPICs, each with 16 interrupt inputs, and if an > interrupt is asserted while that input is masked in the IOAPIC, the > 6700PXH generates an INTx ASSERT message on the PCIe bus. > > What I don't understand is what the "correct" way of handling this is. > I assume the intent was *not* that the OS should have device-specific > code to deal with this. Boot interrupts are a mechanism to forward IRQs to the primary PIC/APIC, this is done via the INTx messages. The idea is to provide support for legacy OSes that do not support APICs. It probably was meant to be a firmware only thing, but the problem at hand is this: The condition for generating boot interrupts is, the input being masked while an interrupt is asserted (on a non-primary IO-APIC). As you already stated. If this happens _and_ the interrupt line, the INTx messages get forwarded to, already has an IRQ handler registered then there's no handler to take care of those forwarded IRQs and eventually the kernel may falsely shutdown that interrupt line. Because of too many lost interrupts. Sidenote, if there's no IRQ handler registered on that interrupt line, the input is already masked and the boot interrupt silently disappears. > Is it a case of the BIOS doing something wrong? Is ACPI just not > expressive enough to describe this hardware behavior? To be honest I'm not sure. The simplest of solutions would have been to implement an on/off switch for boot interrupts in the chipset, but unfortunately we did not find such a thing. So the solution we worked out was to move the interrupt handler to the legacy interrupt line. This has been a working solution for most systems affected so far. All systems I've seen containing 6700PXH are about 10 years old by now and I haven't notice the problem resurface. Stefan -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Mar 13, 2017 at 09:19:02AM +0100, Stefan Assmann wrote: > On 13.03.2017 03:20, Bjorn Helgaas wrote: > > On Fri, Feb 10, 2017 at 12:45:17PM +0100, Stefan Assmann wrote: > >> When checking for boot interrupts and PRT index is zero avoid doing > >> any interrupt rerouting since the code currently does not handle > >> determining the IRQ via _CRS method. > >> > >> Fixes: > >> https://bugzilla.kernel.org/show_bug.cgi?id=43074 > >> > >> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> > >> --- > >> drivers/acpi/pci_irq.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c > >> index c576a6f..6ecbde0 100644 > >> --- a/drivers/acpi/pci_irq.c > >> +++ b/drivers/acpi/pci_irq.c > >> @@ -280,7 +280,7 @@ static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) > >> static int acpi_reroute_boot_interrupt(struct pci_dev *dev, > >> struct acpi_prt_entry *entry) > >> { > >> - if (noioapicquirk || noioapicreroute) { > >> + if (noioapicquirk || noioapicreroute || entry->index == 0) { > >> return 0; > >> } else { > >> switch (bridge_has_boot_interrupt_variant(dev->bus)) { > > > > I don't understand acpi_reroute_boot_interrupt() in general. > > > > I think I understand the chipset functionality, i.e., the Intel > > 6700PXH contains two IOAPICs, each with 16 interrupt inputs, and if an > > interrupt is asserted while that input is masked in the IOAPIC, the > > 6700PXH generates an INTx ASSERT message on the PCIe bus. > > > > What I don't understand is what the "correct" way of handling this is. > > I assume the intent was *not* that the OS should have device-specific > > code to deal with this. > > Boot interrupts are a mechanism to forward IRQs to the primary PIC/APIC, > this is done via the INTx messages. The idea is to provide support for > legacy OSes that do not support APICs. It probably was meant to be a > firmware only thing, but the problem at hand is this: > The condition for generating boot interrupts is, the input being masked > while an interrupt is asserted (on a non-primary IO-APIC). As you > already stated. > If this happens _and_ the interrupt line, the INTx messages get forwarded > to, already has an IRQ handler registered then there's no handler to > take care of those forwarded IRQs and eventually the kernel may > falsely shutdown that interrupt line. Because of too many lost > interrupts. > Sidenote, if there's no IRQ handler registered on that interrupt line, > the input is already masked and the boot interrupt silently disappears. > > > Is it a case of the BIOS doing something wrong? Is ACPI just not > > expressive enough to describe this hardware behavior? > > To be honest I'm not sure. The simplest of solutions would have been to > implement an on/off switch for boot interrupts in the chipset, but > unfortunately we did not find such a thing. So the solution we worked > out was to move the interrupt handler to the legacy interrupt line. Rafael, do you have any hints about ways to disable this boot interrupt thing? All the devices that use this irq->reroute_variant seem to be Intel devices. It just seems wrong to me to unconditionally override the _PRT entry when the hardware behavior is conditional on the APIC input being masked. If we can't disable the boot interrupt thing, could we use a DMI switch to special-case this M2N-LR system? The fact that entry->index is zero in this case seems like a coincidence that we shouldn't rely on. But I don't feel like I fully understand _PRT entries, so maybe it's not. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 14, 2017 at 10:23 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > On Mon, Mar 13, 2017 at 09:19:02AM +0100, Stefan Assmann wrote: >> On 13.03.2017 03:20, Bjorn Helgaas wrote: >> > On Fri, Feb 10, 2017 at 12:45:17PM +0100, Stefan Assmann wrote: >> >> When checking for boot interrupts and PRT index is zero avoid doing >> >> any interrupt rerouting since the code currently does not handle >> >> determining the IRQ via _CRS method. >> >> >> >> Fixes: >> >> https://bugzilla.kernel.org/show_bug.cgi?id=43074 >> >> >> >> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> >> >> --- >> >> drivers/acpi/pci_irq.c | 2 +- >> >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> >> >> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c >> >> index c576a6f..6ecbde0 100644 >> >> --- a/drivers/acpi/pci_irq.c >> >> +++ b/drivers/acpi/pci_irq.c >> >> @@ -280,7 +280,7 @@ static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) >> >> static int acpi_reroute_boot_interrupt(struct pci_dev *dev, >> >> struct acpi_prt_entry *entry) >> >> { >> >> - if (noioapicquirk || noioapicreroute) { >> >> + if (noioapicquirk || noioapicreroute || entry->index == 0) { >> >> return 0; >> >> } else { >> >> switch (bridge_has_boot_interrupt_variant(dev->bus)) { >> > >> > I don't understand acpi_reroute_boot_interrupt() in general. >> > >> > I think I understand the chipset functionality, i.e., the Intel >> > 6700PXH contains two IOAPICs, each with 16 interrupt inputs, and if an >> > interrupt is asserted while that input is masked in the IOAPIC, the >> > 6700PXH generates an INTx ASSERT message on the PCIe bus. >> > >> > What I don't understand is what the "correct" way of handling this is. >> > I assume the intent was *not* that the OS should have device-specific >> > code to deal with this. >> >> Boot interrupts are a mechanism to forward IRQs to the primary PIC/APIC, >> this is done via the INTx messages. The idea is to provide support for >> legacy OSes that do not support APICs. It probably was meant to be a >> firmware only thing, but the problem at hand is this: >> The condition for generating boot interrupts is, the input being masked >> while an interrupt is asserted (on a non-primary IO-APIC). As you >> already stated. >> If this happens _and_ the interrupt line, the INTx messages get forwarded >> to, already has an IRQ handler registered then there's no handler to >> take care of those forwarded IRQs and eventually the kernel may >> falsely shutdown that interrupt line. Because of too many lost >> interrupts. >> Sidenote, if there's no IRQ handler registered on that interrupt line, >> the input is already masked and the boot interrupt silently disappears. >> >> > Is it a case of the BIOS doing something wrong? Is ACPI just not >> > expressive enough to describe this hardware behavior? >> >> To be honest I'm not sure. The simplest of solutions would have been to >> implement an on/off switch for boot interrupts in the chipset, but >> unfortunately we did not find such a thing. So the solution we worked >> out was to move the interrupt handler to the legacy interrupt line. > > Rafael, do you have any hints about ways to disable this boot > interrupt thing? All the devices that use this irq->reroute_variant > seem to be Intel devices. No, I don't. > It just seems wrong to me to unconditionally override the _PRT entry > when the hardware behavior is conditional on the APIC input being > masked. > > If we can't disable the boot interrupt thing, could we use a DMI > switch to special-case this M2N-LR system? The fact that entry->index > is zero in this case seems like a coincidence that we shouldn't rely > on. But I don't feel like I fully understand _PRT entries, so maybe > it's not. I'm not intimately familiar with those things either and I was not involved in the development of the code in question. If this is a one-off thing, a DMI-based quirk sounds reasonable at least until we have more information. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 14.03.2017 23:15, Rafael J. Wysocki wrote: > On Tue, Mar 14, 2017 at 10:23 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: >> On Mon, Mar 13, 2017 at 09:19:02AM +0100, Stefan Assmann wrote: >>> On 13.03.2017 03:20, Bjorn Helgaas wrote: >>>> On Fri, Feb 10, 2017 at 12:45:17PM +0100, Stefan Assmann wrote: >>>>> When checking for boot interrupts and PRT index is zero avoid doing >>>>> any interrupt rerouting since the code currently does not handle >>>>> determining the IRQ via _CRS method. >>>>> >>>>> Fixes: >>>>> https://bugzilla.kernel.org/show_bug.cgi?id=43074 >>>>> >>>>> Signed-off-by: Stefan Assmann <sassmann@kpanic.de> >>>>> --- >>>>> drivers/acpi/pci_irq.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c >>>>> index c576a6f..6ecbde0 100644 >>>>> --- a/drivers/acpi/pci_irq.c >>>>> +++ b/drivers/acpi/pci_irq.c >>>>> @@ -280,7 +280,7 @@ static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) >>>>> static int acpi_reroute_boot_interrupt(struct pci_dev *dev, >>>>> struct acpi_prt_entry *entry) >>>>> { >>>>> - if (noioapicquirk || noioapicreroute) { >>>>> + if (noioapicquirk || noioapicreroute || entry->index == 0) { >>>>> return 0; >>>>> } else { >>>>> switch (bridge_has_boot_interrupt_variant(dev->bus)) { >>>> >>>> I don't understand acpi_reroute_boot_interrupt() in general. >>>> >>>> I think I understand the chipset functionality, i.e., the Intel >>>> 6700PXH contains two IOAPICs, each with 16 interrupt inputs, and if an >>>> interrupt is asserted while that input is masked in the IOAPIC, the >>>> 6700PXH generates an INTx ASSERT message on the PCIe bus. >>>> >>>> What I don't understand is what the "correct" way of handling this is. >>>> I assume the intent was *not* that the OS should have device-specific >>>> code to deal with this. >>> >>> Boot interrupts are a mechanism to forward IRQs to the primary PIC/APIC, >>> this is done via the INTx messages. The idea is to provide support for >>> legacy OSes that do not support APICs. It probably was meant to be a >>> firmware only thing, but the problem at hand is this: >>> The condition for generating boot interrupts is, the input being masked >>> while an interrupt is asserted (on a non-primary IO-APIC). As you >>> already stated. >>> If this happens _and_ the interrupt line, the INTx messages get forwarded >>> to, already has an IRQ handler registered then there's no handler to >>> take care of those forwarded IRQs and eventually the kernel may >>> falsely shutdown that interrupt line. Because of too many lost >>> interrupts. >>> Sidenote, if there's no IRQ handler registered on that interrupt line, >>> the input is already masked and the boot interrupt silently disappears. >>> >>>> Is it a case of the BIOS doing something wrong? Is ACPI just not >>>> expressive enough to describe this hardware behavior? >>> >>> To be honest I'm not sure. The simplest of solutions would have been to >>> implement an on/off switch for boot interrupts in the chipset, but >>> unfortunately we did not find such a thing. So the solution we worked >>> out was to move the interrupt handler to the legacy interrupt line. >> >> Rafael, do you have any hints about ways to disable this boot >> interrupt thing? All the devices that use this irq->reroute_variant >> seem to be Intel devices. > > No, I don't. > >> It just seems wrong to me to unconditionally override the _PRT entry >> when the hardware behavior is conditional on the APIC input being >> masked. >> >> If we can't disable the boot interrupt thing, could we use a DMI >> switch to special-case this M2N-LR system? The fact that entry->index >> is zero in this case seems like a coincidence that we shouldn't rely >> on. But I don't feel like I fully understand _PRT entries, so maybe >> it's not. > > I'm not intimately familiar with those things either and I was not > involved in the development of the code in question. > > If this is a one-off thing, a DMI-based quirk sounds reasonable at > least until we have more information. It's the first system I've heard of with such an issue, since those quirks were introduced. I'm fine with the approach to exclude this board via a DMI quirk. I'll have the reporter test the patch before submitting. Thanks for the feedback! Stefan -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c index c576a6f..6ecbde0 100644 --- a/drivers/acpi/pci_irq.c +++ b/drivers/acpi/pci_irq.c @@ -280,7 +280,7 @@ static int bridge_has_boot_interrupt_variant(struct pci_bus *bus) static int acpi_reroute_boot_interrupt(struct pci_dev *dev, struct acpi_prt_entry *entry) { - if (noioapicquirk || noioapicreroute) { + if (noioapicquirk || noioapicreroute || entry->index == 0) { return 0; } else { switch (bridge_has_boot_interrupt_variant(dev->bus)) {
When checking for boot interrupts and PRT index is zero avoid doing any interrupt rerouting since the code currently does not handle determining the IRQ via _CRS method. Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=43074 Signed-off-by: Stefan Assmann <sassmann@kpanic.de> --- drivers/acpi/pci_irq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)