Message ID | 1453705178-27389-1-git-send-email-chen.fan.fnst@cn.fujitsu.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
[+cc Thomas] On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote: > In our environment, when enable Secure boot, we found an abnormal > phenomenon as following call trace shows. after investigation, we > found the firmware assigned an irq number 255 which means unknown > or no connection in PCI local spec for i801_smbus, meanwhile the > ACPI didn't configure the pci irq routing. and the 255 irq number > was assigned for megasa msix without IRQF_SHARED. then in this case > during i801_smbus probe, the i801_smbus driver would request irq with > bad irq number 255. but the 255 irq number was assigned for memgasa > with MSIX enable. which will cause request_irq fails as call trace > shows, here we use ~0U as invalid IRQ to identify the 0xff IRQ specified > by BIOS. > > See the call trace: Maybe you missed my suggestion that the timestamps aren't useful; here's my suggestion again in more detail: Changelogs are written once, but read dozens or hundreds of time, so stripping out irrelevant details shows consideration for the readers. > [ 32.459195] ipmi device interface > [ 32.612907] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 > [ 32.800459] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 4.0.1-k-rh > [ 32.818319] ixgbe: Copyright (c) 1999-2014 Intel Corporation. > [ 32.844009] lpc_ich 0001:80:1f.0: I/O space for ACPI uninitialized I think the lines above are completely irrelevant. > [ 32.850093] i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143) > [ 32.851134] i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C > [ 32.851136] i801_smbus 0000:00:1f.3: PCI INT C: no GSI > [ 32.851164] genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa These are useful, but the timestamps ("[ 32.850093]") are not. > [ 32.851168] CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1 > [ 32.851170] Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5 These are probably useful; it's nice to know what kernel and hardware is involved. > [ 32.851178] Workqueue: events work_for_cpu_fn > [ 32.851208] ffff88086c330b00 00000000e233a9df ffff88086d57bca0 ffffffff81603f36 > [ 32.851227] ffff88086d57bcf8 ffffffff8110d23a ffff88686fe02000 0000000000000246 > [ 32.851246] ffff88086a9a8c00 00000000e233a9df ffffffffa00ad220 0000000000000080 I doubt these are useful. > [ 32.851247] Call Trace: > [ 32.851261] [<ffffffff81603f36>] dump_stack+0x19/0x1b > [ 32.851271] [<ffffffff8110d23a>] __setup_irq+0x54a/0x570 > [ 32.851282] [<ffffffffa00ad220>] ? i801_check_pre.isra.5+0xe0/0xe0 [i2c_i801] > [ 32.851289] [<ffffffff8110d3bc>] request_threaded_irq+0xcc/0x170 > [ 32.851298] [<ffffffffa00ae87f>] i801_probe+0x32f/0x508 [i2c_i801] > [ 32.851308] [<ffffffff81308385>] local_pci_probe+0x45/0xa0 The above might be useful, but the addresses ("[<ffffffff81603f36>]") are not, and you should go through them manually and strip out the lines that are junk from the stack. For example, I don't think request_threaded_irq() really calls i801_check_pre(). > [ 32.851315] [<ffffffff8108bfd4>] work_for_cpu_fn+0x14/0x20 > [ 32.851323] [<ffffffff8108f0ab>] process_one_work+0x17b/0x470 > [ 32.851330] [<ffffffff81090003>] worker_thread+0x293/0x400 > [ 32.851338] [<ffffffff8108fd70>] ? rescuer_thread+0x400/0x400 > [ 32.851346] [<ffffffff8109726f>] kthread+0xcf/0xe0 > [ 32.851353] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140 > [ 32.851362] [<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0 > [ 32.851369] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140 The lines above are completely useless. > [ 32.851373] i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16 > [ 32.851435] i801_smbus: probe of 0000:00:1f.3 failed with error -16 > [ 33.180145] ixgbe 0000:5a:00.0: Multiq[ 33.240538] ixgbe 0000:5a:00.0: (PCI Express:03:e0 > [ 33.280826] ixgbe 0000:5a:00.0: MAC: 3, PHY: 0, PBA No: 000000-000 These ixgbe entries are irrelevant. > Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com> > --- > arch/x86/include/asm/irq_vectors.h | 2 ++ > drivers/acpi/pci_irq.c | 11 ++++++++++- > include/linux/interrupt.h | 9 +++++++++ > 3 files changed, 21 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h > index 6ca9fd6..b616d69 100644 > --- a/arch/x86/include/asm/irq_vectors.h > +++ b/arch/x86/include/asm/irq_vectors.h > @@ -146,4 +146,6 @@ > #define NR_IRQS NR_IRQS_LEGACY > #endif > > +#define IRQ_INVALID (~0U) If this is a good idea (I cc'd Thomas, the IRQ maintainer, for his thoughts), I'd like to see this in a more generic place so it isn't x86-specific. > #endif /* _ASM_X86_IRQ_VECTORS_H */ > diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c > index d30184c..819eb23 100644 > --- a/drivers/acpi/pci_irq.c > +++ b/drivers/acpi/pci_irq.c > @@ -33,6 +33,7 @@ > #include <linux/pci.h> > #include <linux/acpi.h> > #include <linux/slab.h> > +#include <linux/interrupt.h> > > #define PREFIX "ACPI: " > > @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev) > * driver reported one, then use it. Exit in any case. > */ > if (gsi < 0) { > - if (acpi_isa_register_gsi(dev)) > +#ifdef CONFIG_X86 > + /* > + * The Interrupt Line value of 0xff is defined to mean "unknown" > + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page > + * 223), using ~0U as invalid IRQ. > + */ > + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq; It's much simpler and clearer to write: if (dev->irq == 0xff) dev->irq = IRQ_INVALID; > +#endif > + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev)) > dev_warn(&dev->dev, "PCI INT %c: no GSI\n", > pin_name(pin)); > > diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h > index cb30edb..2f0d46b 100644 > --- a/include/linux/interrupt.h > +++ b/include/linux/interrupt.h > @@ -198,6 +198,15 @@ extern void enable_percpu_irq(unsigned int irq, unsigned int type); > extern bool irq_percpu_is_enabled(unsigned int irq); > extern void irq_wake_thread(unsigned int irq, void *dev_id); > > +static inline bool irq_is_valid(unsigned int irq) > +{ > +#ifdef CONFIG_X86 > + if (irq == IRQ_INVALID) > + return false; > +#endif > + return true; > +} I don't like the x86 ifdef. I'd prefer: static inline bool irq_valid(unsigned int irq) { if (irq < NR_IRQS) return true; return false; } This could be used in many of the places that currently use NR_IRQS. My suggestion: - patch 1: Add IRQ_INVALID and irq_valid() as generic things - patch 2: Use irq_valid() in all the places where it can obviously replace NR_IRQS - patch 3: Add the acpi_pci_irq_enable() check. This is now a trivial patch, basically just this: + #ifdef CONFIG_X86 + if (dev->irq == 0xff) + dev->irq = IRQ_INVALID; + #endif + if (!irq_valid(dev->irq) ... Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/26/2016 04:58 AM, Bjorn Helgaas wrote: > [+cc Thomas] > > On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote: >> In our environment, when enable Secure boot, we found an abnormal >> phenomenon as following call trace shows. after investigation, we >> found the firmware assigned an irq number 255 which means unknown >> or no connection in PCI local spec for i801_smbus, meanwhile the >> ACPI didn't configure the pci irq routing. and the 255 irq number >> was assigned for megasa msix without IRQF_SHARED. then in this case >> during i801_smbus probe, the i801_smbus driver would request irq with >> bad irq number 255. but the 255 irq number was assigned for memgasa >> with MSIX enable. which will cause request_irq fails as call trace >> shows, here we use ~0U as invalid IRQ to identify the 0xff IRQ specified >> by BIOS. >> >> See the call trace: > Maybe you missed my suggestion that the timestamps aren't useful; > here's my suggestion again in more detail: > > Changelogs are written once, but read dozens or hundreds of time, so > stripping out irrelevant details shows consideration for the readers. Got it, thanks for your correction, I will remake it as you suggest. > >> [ 32.459195] ipmi device interface >> [ 32.612907] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 >> [ 32.800459] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 4.0.1-k-rh >> [ 32.818319] ixgbe: Copyright (c) 1999-2014 Intel Corporation. >> [ 32.844009] lpc_ich 0001:80:1f.0: I/O space for ACPI uninitialized > I think the lines above are completely irrelevant. > >> [ 32.850093] i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143) >> [ 32.851134] i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C >> [ 32.851136] i801_smbus 0000:00:1f.3: PCI INT C: no GSI >> [ 32.851164] genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa > These are useful, but the timestamps ("[ 32.850093]") are not. > >> [ 32.851168] CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1 >> [ 32.851170] Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5 > These are probably useful; it's nice to know what kernel and hardware > is involved. > >> [ 32.851178] Workqueue: events work_for_cpu_fn >> [ 32.851208] ffff88086c330b00 00000000e233a9df ffff88086d57bca0 ffffffff81603f36 >> [ 32.851227] ffff88086d57bcf8 ffffffff8110d23a ffff88686fe02000 0000000000000246 >> [ 32.851246] ffff88086a9a8c00 00000000e233a9df ffffffffa00ad220 0000000000000080 > I doubt these are useful. > >> [ 32.851247] Call Trace: >> [ 32.851261] [<ffffffff81603f36>] dump_stack+0x19/0x1b >> [ 32.851271] [<ffffffff8110d23a>] __setup_irq+0x54a/0x570 >> [ 32.851282] [<ffffffffa00ad220>] ? i801_check_pre.isra.5+0xe0/0xe0 [i2c_i801] >> [ 32.851289] [<ffffffff8110d3bc>] request_threaded_irq+0xcc/0x170 >> [ 32.851298] [<ffffffffa00ae87f>] i801_probe+0x32f/0x508 [i2c_i801] >> [ 32.851308] [<ffffffff81308385>] local_pci_probe+0x45/0xa0 > The above might be useful, but the addresses ("[<ffffffff81603f36>]") > are not, and you should go through them manually and strip out the > lines that are junk from the stack. For example, I don't think > request_threaded_irq() really calls i801_check_pre(). > >> [ 32.851315] [<ffffffff8108bfd4>] work_for_cpu_fn+0x14/0x20 >> [ 32.851323] [<ffffffff8108f0ab>] process_one_work+0x17b/0x470 >> [ 32.851330] [<ffffffff81090003>] worker_thread+0x293/0x400 >> [ 32.851338] [<ffffffff8108fd70>] ? rescuer_thread+0x400/0x400 >> [ 32.851346] [<ffffffff8109726f>] kthread+0xcf/0xe0 >> [ 32.851353] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140 >> [ 32.851362] [<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0 >> [ 32.851369] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140 > The lines above are completely useless. > >> [ 32.851373] i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16 >> [ 32.851435] i801_smbus: probe of 0000:00:1f.3 failed with error -16 >> [ 33.180145] ixgbe 0000:5a:00.0: Multiq[ 33.240538] ixgbe 0000:5a:00.0: (PCI Express:03:e0 >> [ 33.280826] ixgbe 0000:5a:00.0: MAC: 3, PHY: 0, PBA No: 000000-000 > These ixgbe entries are irrelevant. > >> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com> >> --- >> arch/x86/include/asm/irq_vectors.h | 2 ++ >> drivers/acpi/pci_irq.c | 11 ++++++++++- >> include/linux/interrupt.h | 9 +++++++++ >> 3 files changed, 21 insertions(+), 1 deletion(-) >> >> diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h >> index 6ca9fd6..b616d69 100644 >> --- a/arch/x86/include/asm/irq_vectors.h >> +++ b/arch/x86/include/asm/irq_vectors.h >> @@ -146,4 +146,6 @@ >> #define NR_IRQS NR_IRQS_LEGACY >> #endif >> >> +#define IRQ_INVALID (~0U) > If this is a good idea (I cc'd Thomas, the IRQ maintainer, for his > thoughts), I'd like to see this in a more generic place so it isn't > x86-specific. > >> #endif /* _ASM_X86_IRQ_VECTORS_H */ >> diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c >> index d30184c..819eb23 100644 >> --- a/drivers/acpi/pci_irq.c >> +++ b/drivers/acpi/pci_irq.c >> @@ -33,6 +33,7 @@ >> #include <linux/pci.h> >> #include <linux/acpi.h> >> #include <linux/slab.h> >> +#include <linux/interrupt.h> >> >> #define PREFIX "ACPI: " >> >> @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev) >> * driver reported one, then use it. Exit in any case. >> */ >> if (gsi < 0) { >> - if (acpi_isa_register_gsi(dev)) >> +#ifdef CONFIG_X86 >> + /* >> + * The Interrupt Line value of 0xff is defined to mean "unknown" >> + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page >> + * 223), using ~0U as invalid IRQ. >> + */ >> + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq; > It's much simpler and clearer to write: > > if (dev->irq == 0xff) > dev->irq = IRQ_INVALID; > >> +#endif >> + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev)) >> dev_warn(&dev->dev, "PCI INT %c: no GSI\n", >> pin_name(pin)); >> >> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h >> index cb30edb..2f0d46b 100644 >> --- a/include/linux/interrupt.h >> +++ b/include/linux/interrupt.h >> @@ -198,6 +198,15 @@ extern void enable_percpu_irq(unsigned int irq, unsigned int type); >> extern bool irq_percpu_is_enabled(unsigned int irq); >> extern void irq_wake_thread(unsigned int irq, void *dev_id); >> >> +static inline bool irq_is_valid(unsigned int irq) >> +{ >> +#ifdef CONFIG_X86 >> + if (irq == IRQ_INVALID) >> + return false; >> +#endif >> + return true; >> +} > I don't like the x86 ifdef. I'd prefer: > > static inline bool irq_valid(unsigned int irq) > { > if (irq < NR_IRQS) > return true; > return false; > } > > This could be used in many of the places that currently use NR_IRQS. > > My suggestion: > > - patch 1: Add IRQ_INVALID and irq_valid() as generic things > - patch 2: Use irq_valid() in all the places where it can obviously > replace NR_IRQS > - patch 3: Add the acpi_pci_irq_enable() check. This is now a > trivial patch, basically just this: > > + #ifdef CONFIG_X86 > + if (dev->irq == 0xff) > + dev->irq = IRQ_INVALID; > + #endif > + if (!irq_valid(dev->irq) ... this will be more useful. Thanks, Chen > > Bjorn > > > . > -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 25, 2016 at 02:58:04PM -0600, Bjorn Helgaas wrote: > [+cc Thomas] > [ ... ] > > I don't like the x86 ifdef. I'd prefer: > > static inline bool irq_valid(unsigned int irq) > { > if (irq < NR_IRQS) > return true; > return false; > } > Or: static inline bool irq_valid(unsigned int irq) { return irq < NR_IRQS; } Guenter -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 25 Jan 2016, Bjorn Helgaas wrote: > On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote: > > i801_smbus 0000:00:1f.3: PCI INT C: no GSI > > i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16 > > i801_smbus: probe of 0000:00:1f.3 failed with error -16 The current code does not not fail when the interrupt request fails. It reports it and clears the IRQ feature flag. > > @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev) > > * driver reported one, then use it. Exit in any case. > > */ > > if (gsi < 0) { > > - if (acpi_isa_register_gsi(dev)) > > +#ifdef CONFIG_X86 > > + /* > > + * The Interrupt Line value of 0xff is defined to mean "unknown" > > + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page > > + * 223), using ~0U as invalid IRQ. > > + */ And why would this be x86 specific? PCI3.0 is architecture independent, right? > > + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq; > > It's much simpler and clearer to write: > > if (dev->irq == 0xff) > dev->irq = IRQ_INVALID; I do not understand that IRQ_INVALID business at all. > > +#endif > > + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev)) > > dev_warn(&dev->dev, "PCI INT %c: no GSI\n", > > pin_name(pin)); > > The existing code already drops into this place because acpi_isa_register_gsi() fails. > > i801_smbus 0000:00:1f.3: PCI INT C: no GSI What extra value does that !irq_is_valid() provide? And how does setting dev->irq to ~0U prevent that request_irq() is called in the i801 device driver? Not at all, AFAICT. It will just fail with a different error. So the whole 'fix' relies on the fact that irq ~0U does not exist (at least not today) and therefor the false sharing with some other driver using irq 255 will not happen. Relying on undocumented behaviour is not a fix, that's voodoo programming. The proper solution here is to flag that this device does not have an interrupt connected and act accordingly in the device driver, i.e. do not call request_irq() in the first place. > > +static inline bool irq_is_valid(unsigned int irq) > > +{ > > +#ifdef CONFIG_X86 > > + if (irq == IRQ_INVALID) > > + return false; > > +#endif > > + return true; > > +} > > I don't like the x86 ifdef. I'd prefer: > > static inline bool irq_valid(unsigned int irq) > { > if (irq < NR_IRQS) > return true; > return false; > } > > This could be used in many of the places that currently use NR_IRQS. No. NR_IRQS cannot be used at all if sparse irqs are enabled. Nothing in any generic code is supposed to rely on NR_IRQS. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/26/2016 04:26 PM, Thomas Gleixner wrote: > On Mon, 25 Jan 2016, Bjorn Helgaas wrote: >> On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote: >>> i801_smbus 0000:00:1f.3: PCI INT C: no GSI >>> i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16 >>> i801_smbus: probe of 0000:00:1f.3 failed with error -16 > The current code does not not fail when the interrupt request fails. It > reports it and clears the IRQ feature flag. > >>> @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev) >>> * driver reported one, then use it. Exit in any case. >>> */ >>> if (gsi < 0) { >>> - if (acpi_isa_register_gsi(dev)) >>> +#ifdef CONFIG_X86 >>> + /* >>> + * The Interrupt Line value of 0xff is defined to mean "unknown" >>> + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page >>> + * 223), using ~0U as invalid IRQ. >>> + */ > And why would this be x86 specific? PCI3.0 is architecture independent, right? quoting the spec document: "For x86 based PCs, the values in this register correspond to IRQ numbers (0-15) of the standard dual 8259 configuration. The value 255 is defined as meaning "unknown" or "no connection" to the interrupt controller. Values between 15 and 254 are reserved." > >>> + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq; >> It's much simpler and clearer to write: >> >> if (dev->irq == 0xff) >> dev->irq = IRQ_INVALID; > I do not understand that IRQ_INVALID business at all. > >>> +#endif >>> + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev)) >>> dev_warn(&dev->dev, "PCI INT %c: no GSI\n", >>> pin_name(pin)); >>> > The existing code already drops into this place because > acpi_isa_register_gsi() fails. > >>> i801_smbus 0000:00:1f.3: PCI INT C: no GSI > What extra value does that !irq_is_valid() provide? > > And how does setting dev->irq to ~0U prevent that request_irq() is called in > the i801 device driver? Not at all, AFAICT. It will just fail with a different > error. > > So the whole 'fix' relies on the fact that irq ~0U does not exist (at least > not today) and therefor the false sharing with some other driver using irq 255 > will not happen. > > Relying on undocumented behaviour is not a fix, that's voodoo programming. > > The proper solution here is to flag that this device does not have an > interrupt connected and act accordingly in the device driver, i.e. do not call > request_irq() in the first place. yes, this is what I thought in previous email, I has asked that whether we can use a broken_irq flag in pci_dev to mark the device irq if invalid. and then if the device broken_irq set, we could directly skip call the request_irq. maybe we can set the broken_irq in pci_read_irq if the irq is 0xff. Thanks, Chen > >>> +static inline bool irq_is_valid(unsigned int irq) >>> +{ >>> +#ifdef CONFIG_X86 >>> + if (irq == IRQ_INVALID) >>> + return false; >>> +#endif >>> + return true; >>> +} >> I don't like the x86 ifdef. I'd prefer: >> >> static inline bool irq_valid(unsigned int irq) >> { >> if (irq < NR_IRQS) >> return true; >> return false; >> } >> >> This could be used in many of the places that currently use NR_IRQS. > No. NR_IRQS cannot be used at all if sparse irqs are enabled. Nothing in any > generic code is supposed to rely on NR_IRQS. > > Thanks, > > tglx > > > > . > -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 26 Jan 2016, Chen Fan wrote: > On 01/26/2016 04:26 PM, Thomas Gleixner wrote: > > > > if (gsi < 0) { > > > > - if (acpi_isa_register_gsi(dev)) > > > > +#ifdef CONFIG_X86 > > > > + /* > > > > + * The Interrupt Line value of 0xff is defined to mean > > > > "unknown" > > > > + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on > > > > page > > > > + * 223), using ~0U as invalid IRQ. > > > > + */ > > And why would this be x86 specific? PCI3.0 is architecture independent, > > right? > quoting the spec document: > "For x86 based PCs, the values in this register correspond to IRQ numbers > (0-15) of the standard dual > 8259 configuration. The value 255 is defined as meaning "unknown" or "no > connection" to the interrupt > controller. Values between 15 and 254 are reserved." So if that is x86 specific then it needs to be documented proper. The fact that the comment is inside the #ifdef x86 does not tell. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jan 26, 2016 at 09:26:29AM +0100, Thomas Gleixner wrote: > On Mon, 25 Jan 2016, Bjorn Helgaas wrote: > > On Mon, Jan 25, 2016 at 02:59:38PM +0800, Chen Fan wrote: > > > > i801_smbus 0000:00:1f.3: PCI INT C: no GSI > > > i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16 > > > i801_smbus: probe of 0000:00:1f.3 failed with error -16 > > The current code does not not fail when the interrupt request fails. It > reports it and clears the IRQ feature flag. > > > > @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev) > > > * driver reported one, then use it. Exit in any case. > > > */ > > > if (gsi < 0) { > > > - if (acpi_isa_register_gsi(dev)) > > > +#ifdef CONFIG_X86 > > > + /* > > > + * The Interrupt Line value of 0xff is defined to mean "unknown" > > > + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page > > > + * 223), using ~0U as invalid IRQ. > > > + */ > > And why would this be x86 specific? PCI3.0 is architecture independent, right? Yes, PCI is mostly arch-independent, but Interrupt Line is arch-specific, and the corner case of it being 255 is only mentioned in an x86-specific footnote. We can improve the comment. > > > + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq; > > > > It's much simpler and clearer to write: > > > > if (dev->irq == 0xff) > > dev->irq = IRQ_INVALID; > > I do not understand that IRQ_INVALID business at all. > > > > +#endif > > > + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev)) > > > dev_warn(&dev->dev, "PCI INT %c: no GSI\n", > > > pin_name(pin)); > > > > > The existing code already drops into this place because > acpi_isa_register_gsi() fails. > > > > i801_smbus 0000:00:1f.3: PCI INT C: no GSI > > What extra value does that !irq_is_valid() provide? > > And how does setting dev->irq to ~0U prevent that request_irq() is called in > the i801 device driver? Not at all, AFAICT. It will just fail with a different > error. > > So the whole 'fix' relies on the fact that irq ~0U does not exist (at least > not today) and therefor the false sharing with some other driver using irq 255 > will not happen. > > Relying on undocumented behaviour is not a fix, that's voodoo programming. > > The proper solution here is to flag that this device does not have an > interrupt connected and act accordingly in the device driver, i.e. do not call > request_irq() in the first place. This is the crux of the problem. As far as I know, PCI doesn't have a flag to indicate that dev->irq is a wire that's not connected, so there's no generic way for a driver to know whether it should call request_irq(). We could add one, of course, but that only helps in the drivers we update. It'd be nice if we could figure out a way to fix this without having to touch all the drivers. I think any driver that uses line-based interrupts can potentially fail if the platform uses Interrupt Line == 255 to indicate that the line is not connected. If another driver happens to be using IRQ 255, request_irq() may fail as it does here. Otherwise, I suspect request_irq() will return success, but the driver won't get any interrupts. > > I don't like the x86 ifdef. I'd prefer: > > > > static inline bool irq_valid(unsigned int irq) > > { > > if (irq < NR_IRQS) > > return true; > > return false; > > } > > > > This could be used in many of the places that currently use NR_IRQS. > > No. NR_IRQS cannot be used at all if sparse irqs are enabled. Ah, thanks, this is a critical piece I missed. There *are* a few places where we do define or use NR_IRQS when CONFIG_SPARSE_IRQ=y. Do these need updates? include/asm-generic/irq.h defines NR_IRQS if not defined yet arch/arm/include/asm/irq.h defines NR_IRQS to NR_IRQS_LEGACY arch/arm/kernel/irq.c uses NR_IRQS arch/sh/include/asm/irq.h defines NR_IRQS to 8 kernel/irq/internals.h uses NR_IRQS to define IRQ_BITMAP_BITS > Nothing in any > generic code is supposed to rely on NR_IRQS. I guess that means these uses are suspect: $ git grep -w NR_IRQS | grep -v "^arch" | grep -v "^kernel" | grep -v "^drivers/irqchip" | grep -v "^include" drivers/input/keyboard/lpc32xx-keys.c: if (irq < 0 || irq >= NR_IRQS) { drivers/mtd/nand/lpc32xx_mlc.c: if ((host->irq < 0) || (host->irq >= NR_IRQS)) { drivers/net/hamradio/scc.c:static struct irqflags { unsigned char used : 1; } Ivec[NR_IRQS]; drivers/pcmcia/pcmcia_resource.c: if (irq > NR_IRQS) drivers/tty/serial/apbuart.c: if (ser->irq < 0 || ser->irq >= NR_IRQS) drivers/tty/serial/ar933x_uart.c: if (ser->irq < 0 || ser->irq >= NR_IRQS) drivers/tty/serial/lantiq.c: if (ser->irq < 0 || ser->irq >= NR_IRQS) drivers/tty/serial/m32r_sio.c:static struct irq_info irq_lists[NR_IRQS]; Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 26 Jan 2016, Bjorn Helgaas wrote: > On Tue, Jan 26, 2016 at 09:26:29AM +0100, Thomas Gleixner wrote: > > And why would this be x86 specific? PCI3.0 is architecture independent, right? > > Yes, PCI is mostly arch-independent, but Interrupt Line is > arch-specific, and the corner case of it being 255 is only mentioned > in an x86-specific footnote. We can improve the comment. Yes please :) > > The proper solution here is to flag that this device does not have an > > interrupt connected and act accordingly in the device driver, i.e. do not call > > request_irq() in the first place. > > This is the crux of the problem. As far as I know, PCI doesn't have > a flag to indicate that dev->irq is a wire that's not connected, so > there's no generic way for a driver to know whether it should call > request_irq(). Ok. > We could add one, of course, but that only helps in the drivers we > update. It'd be nice if we could figure out a way to fix this > without having to touch all the drivers. Hmm. > I think any driver that uses line-based interrupts can potentially > fail if the platform uses Interrupt Line == 255 to indicate that the > line is not connected. If another driver happens to be using IRQ 255, > request_irq() may fail as it does here. Otherwise, I suspect > request_irq() will return success, but the driver won't get any > interrupts. Right. So we could certainly do something like this INVALID_IRQ thingy, but that looks a bit weird. What would request_irq() return? If it returns success, then drivers might make the wrong decision. If it returns an error code, then the i801 one works, but we might have to fix others anyway. I think it's better to have a software flag in pci_dev to indicate that there is no irq line and fix up the (probably few) affected drivers so they avoid calling request_irq() and take the right action. > > No. NR_IRQS cannot be used at all if sparse irqs are enabled. > > Ah, thanks, this is a critical piece I missed. There *are* a few > places where we do define or use NR_IRQS when CONFIG_SPARSE_IRQ=y. Do > these need updates? > > include/asm-generic/irq.h defines NR_IRQS if not defined yet > arch/arm/include/asm/irq.h defines NR_IRQS to NR_IRQS_LEGACY > arch/arm/kernel/irq.c uses NR_IRQS > arch/sh/include/asm/irq.h defines NR_IRQS to 8 These defines are used for preallocating interrupt descriptors in early boot. That was invented to help migrating to sparse irq. > kernel/irq/internals.h uses NR_IRQS to define IRQ_BITMAP_BITS Right, that's probably pointless, but harmless. > > Nothing in any generic code is supposed to rely on NR_IRQS. > > I guess that means these uses are suspect: > > $ git grep -w NR_IRQS | grep -v "^arch" | grep -v "^kernel" | grep -v "^drivers/irqchip" | grep -v "^include" > drivers/input/keyboard/lpc32xx-keys.c: if (irq < 0 || irq >= NR_IRQS) { > drivers/mtd/nand/lpc32xx_mlc.c: if ((host->irq < 0) || (host->irq >= NR_IRQS)) { > drivers/net/hamradio/scc.c:static struct irqflags { unsigned char used : 1; } Ivec[NR_IRQS]; > drivers/pcmcia/pcmcia_resource.c: if (irq > NR_IRQS) > drivers/tty/serial/apbuart.c: if (ser->irq < 0 || ser->irq >= NR_IRQS) > drivers/tty/serial/ar933x_uart.c: if (ser->irq < 0 || ser->irq >= NR_IRQS) > drivers/tty/serial/lantiq.c: if (ser->irq < 0 || ser->irq >= NR_IRQS) > drivers/tty/serial/m32r_sio.c:static struct irq_info irq_lists[NR_IRQS]; Indeed. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jan 26, 2016 at 04:48:25PM +0100, Thomas Gleixner wrote: > On Tue, 26 Jan 2016, Bjorn Helgaas wrote: > > On Tue, Jan 26, 2016 at 09:26:29AM +0100, Thomas Gleixner wrote: > > > The proper solution here is to flag that this device does not have an > > > interrupt connected and act accordingly in the device driver, i.e. do not call > > > request_irq() in the first place. > > > > This is the crux of the problem. As far as I know, PCI doesn't have > > a flag to indicate that dev->irq is a wire that's not connected, so > > there's no generic way for a driver to know whether it should call > > request_irq(). > > Ok. > > > We could add one, of course, but that only helps in the drivers we > > update. It'd be nice if we could figure out a way to fix this > > without having to touch all the drivers. > > Hmm. > > > I think any driver that uses line-based interrupts can potentially > > fail if the platform uses Interrupt Line == 255 to indicate that the > > line is not connected. If another driver happens to be using IRQ 255, > > request_irq() may fail as it does here. Otherwise, I suspect > > request_irq() will return success, but the driver won't get any > > interrupts. > > Right. So we could certainly do something like this INVALID_IRQ thingy, but > that looks a bit weird. What would request_irq() return? > > If it returns success, then drivers might make the wrong decision. If it > returns an error code, then the i801 one works, but we might have to fix > others anyway. I was thinking request_irq() could return -EINVAL if the caller passed INVALID_IRQ. That should tell drivers that this interrupt won't work. We'd be making request_irq() return -EINVAL in some cases where it currently returns success. But even though it returns success today, I don't think the driver is getting interrupts, because the wire isn't connected. > I think it's better to have a software flag in pci_dev to indicate that there > is no irq line and fix up the (probably few) affected drivers so they avoid > calling request_irq() and take the right action. We could add an "irq_valid" flag in struct pci_dev and make a new rule that drivers should check dev->irq_valid before using dev->irq. But realistically, i801 is the only place that will check irq_valid because that's the only driver where we know about a problem, so that seems like sort of a point solution. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/27/2016 08:25 AM, Bjorn Helgaas wrote: > On Tue, Jan 26, 2016 at 04:48:25PM +0100, Thomas Gleixner wrote: >> On Tue, 26 Jan 2016, Bjorn Helgaas wrote: >> >> Right. So we could certainly do something like this INVALID_IRQ thingy, but >> that looks a bit weird. What would request_irq() return? >> >> If it returns success, then drivers might make the wrong decision. If it >> returns an error code, then the i801 one works, but we might have to fix >> others anyway. > > I was thinking request_irq() could return -EINVAL if the caller passed > INVALID_IRQ. That should tell drivers that this interrupt won't work. > > We'd be making request_irq() return -EINVAL in some cases where it > currently returns success. But even though it returns success today, > I don't think the driver is getting interrupts, because the wire isn't > connected. > >> I think it's better to have a software flag in pci_dev to indicate that there >> is no irq line and fix up the (probably few) affected drivers so they avoid >> calling request_irq() and take the right action. > > We could add an "irq_valid" flag in struct pci_dev and make a new > rule that drivers should check dev->irq_valid before using dev->irq. > But realistically, i801 is the only place that will check irq_valid > because that's the only driver where we know about a problem, so that > seems like sort of a point solution. > > Bjorn > How about using IRQ_BITMAP_BITS as that "irq_valid" flag? because it is the ceiling of struct irq_desc irq_desc[], and request_irq() will return -EINVAL in case of the ceiling. #ifdef CONFIG_SPARSE_IRQ # define IRQ_BITMAP_BITS (NR_IRQS + 8196) #else # define IRQ_BITMAP_BITS NR_IRQS #endif > . >
On Wed, 27 Jan 2016, Cao jin wrote: > How about using IRQ_BITMAP_BITS as that "irq_valid" flag? because it is the > ceiling of struct irq_desc irq_desc[], and request_irq() will return -EINVAL > in case of the ceiling. > > #ifdef CONFIG_SPARSE_IRQ > # define IRQ_BITMAP_BITS (NR_IRQS + 8196) > #else > # define IRQ_BITMAP_BITS NR_IRQS > #endif No. This is a core internal implementation detail. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-acpi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h index 6ca9fd6..b616d69 100644 --- a/arch/x86/include/asm/irq_vectors.h +++ b/arch/x86/include/asm/irq_vectors.h @@ -146,4 +146,6 @@ #define NR_IRQS NR_IRQS_LEGACY #endif +#define IRQ_INVALID (~0U) + #endif /* _ASM_X86_IRQ_VECTORS_H */ diff --git a/drivers/acpi/pci_irq.c b/drivers/acpi/pci_irq.c index d30184c..819eb23 100644 --- a/drivers/acpi/pci_irq.c +++ b/drivers/acpi/pci_irq.c @@ -33,6 +33,7 @@ #include <linux/pci.h> #include <linux/acpi.h> #include <linux/slab.h> +#include <linux/interrupt.h> #define PREFIX "ACPI: " @@ -436,7 +437,15 @@ int acpi_pci_irq_enable(struct pci_dev *dev) * driver reported one, then use it. Exit in any case. */ if (gsi < 0) { - if (acpi_isa_register_gsi(dev)) +#ifdef CONFIG_X86 + /* + * The Interrupt Line value of 0xff is defined to mean "unknown" + * or "no connection" (PCI 3.0, Section 6.2.4, footnote on page + * 223), using ~0U as invalid IRQ. + */ + dev->irq = (dev->irq == 0xff) ? IRQ_INVALID : dev->irq; +#endif + if (!irq_is_valid(dev->irq) || acpi_isa_register_gsi(dev)) dev_warn(&dev->dev, "PCI INT %c: no GSI\n", pin_name(pin)); diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h index cb30edb..2f0d46b 100644 --- a/include/linux/interrupt.h +++ b/include/linux/interrupt.h @@ -198,6 +198,15 @@ extern void enable_percpu_irq(unsigned int irq, unsigned int type); extern bool irq_percpu_is_enabled(unsigned int irq); extern void irq_wake_thread(unsigned int irq, void *dev_id); +static inline bool irq_is_valid(unsigned int irq) +{ +#ifdef CONFIG_X86 + if (irq == IRQ_INVALID) + return false; +#endif + return true; +} + /* The following three functions are for the core kernel use only. */ extern void suspend_device_irqs(void); extern void resume_device_irqs(void);
In our environment, when enable Secure boot, we found an abnormal phenomenon as following call trace shows. after investigation, we found the firmware assigned an irq number 255 which means unknown or no connection in PCI local spec for i801_smbus, meanwhile the ACPI didn't configure the pci irq routing. and the 255 irq number was assigned for megasa msix without IRQF_SHARED. then in this case during i801_smbus probe, the i801_smbus driver would request irq with bad irq number 255. but the 255 irq number was assigned for memgasa with MSIX enable. which will cause request_irq fails as call trace shows, here we use ~0U as invalid IRQ to identify the 0xff IRQ specified by BIOS. See the call trace: [ 32.459195] ipmi device interface [ 32.612907] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 [ 32.800459] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 4.0.1-k-rh [ 32.818319] ixgbe: Copyright (c) 1999-2014 Intel Corporation. [ 32.844009] lpc_ich 0001:80:1f.0: I/O space for ACPI uninitialized [ 32.850093] i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143) [ 32.851134] i801_smbus 0000:00:1f.3: can't derive routing for PCI INT C [ 32.851136] i801_smbus 0000:00:1f.3: PCI INT C: no GSI [ 32.851164] genirq: Flags mismatch irq 255. 00000080 (i801_smbus) vs. 00000000 (megasa [ 32.851168] CPU: 0 PID: 2487 Comm: kworker/0:1 Not tainted 3.10.0-229.el7.x86_64 #1 [ 32.851170] Hardware name: FUJITSU PRIMEQUEST 2800E2/D3736, BIOS PRIMEQUEST 2000 Serie5 [ 32.851178] Workqueue: events work_for_cpu_fn [ 32.851208] ffff88086c330b00 00000000e233a9df ffff88086d57bca0 ffffffff81603f36 [ 32.851227] ffff88086d57bcf8 ffffffff8110d23a ffff88686fe02000 0000000000000246 [ 32.851246] ffff88086a9a8c00 00000000e233a9df ffffffffa00ad220 0000000000000080 [ 32.851247] Call Trace: [ 32.851261] [<ffffffff81603f36>] dump_stack+0x19/0x1b [ 32.851271] [<ffffffff8110d23a>] __setup_irq+0x54a/0x570 [ 32.851282] [<ffffffffa00ad220>] ? i801_check_pre.isra.5+0xe0/0xe0 [i2c_i801] [ 32.851289] [<ffffffff8110d3bc>] request_threaded_irq+0xcc/0x170 [ 32.851298] [<ffffffffa00ae87f>] i801_probe+0x32f/0x508 [i2c_i801] [ 32.851308] [<ffffffff81308385>] local_pci_probe+0x45/0xa0 [ 32.851315] [<ffffffff8108bfd4>] work_for_cpu_fn+0x14/0x20 [ 32.851323] [<ffffffff8108f0ab>] process_one_work+0x17b/0x470 [ 32.851330] [<ffffffff81090003>] worker_thread+0x293/0x400 [ 32.851338] [<ffffffff8108fd70>] ? rescuer_thread+0x400/0x400 [ 32.851346] [<ffffffff8109726f>] kthread+0xcf/0xe0 [ 32.851353] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140 [ 32.851362] [<ffffffff81613cfc>] ret_from_fork+0x7c/0xb0 [ 32.851369] [<ffffffff810971a0>] ? kthread_create_on_node+0x140/0x140 [ 32.851373] i801_smbus 0000:00:1f.3: Failed to allocate irq 255: -16 [ 32.851435] i801_smbus: probe of 0000:00:1f.3 failed with error -16 [ 33.180145] ixgbe 0000:5a:00.0: Multiq[ 33.240538] ixgbe 0000:5a:00.0: (PCI Express:03:e0 [ 33.280826] ixgbe 0000:5a:00.0: MAC: 3, PHY: 0, PBA No: 000000-000 Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com> --- arch/x86/include/asm/irq_vectors.h | 2 ++ drivers/acpi/pci_irq.c | 11 ++++++++++- include/linux/interrupt.h | 9 +++++++++ 3 files changed, 21 insertions(+), 1 deletion(-)