Message ID | 20230707095338.44244-1-roger.pau@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/ioapic: sanitize IO-APIC pins before enabling the local APIC | expand |
On 07.07.2023 11:53, Roger Pau Monne wrote: > The current logic to init the local APIC and the IO-APIC does init the > former first before doing any kind of sanitation on the IO-APIC pin > configuration. It's already noted on enable_IO_APIC() that Xen > shouldn't trust the IO-APIC being empty at bootup. > > At XenServer we have a system where the IO-APIC 0 is handed to Xen > with pin 0 unmasked, set to Fixed delivery mode, edge triggered and > with a vector of 0 (all fields of the RTE are zeroed). Once the local > APIC is enabled periodic injections from such pin cause a storm of > errors: > > APIC error on CPU0: 00(40), Received illegal vector > APIC error on CPU0: 40(40), Received illegal vector > APIC error on CPU0: 40(40), Received illegal vector > APIC error on CPU0: 40(40), Received illegal vector > APIC error on CPU0: 40(40), Received illegal vector > APIC error on CPU0: 40(40), Received illegal vector > > That prevents Xen from booting. And I expect no RTE is in ExtInt mode, so one might conclude that firmware meant to set up RTE 0 that way. Mainly as a remark, albeit of course there's then the question whether to change the RTE rather than masking it. What do ACPI tables say? > Fix this by moving the masking of IO-APIC pins ahead of the enabling > of the local APIC. Note that before doing such masking Xen attempts > to detect the pin where the legacy i8259 is connected, and that logic > relies on the pin being unmasked, hence the logic is also moved ahead > of enabling the local APIC. A comma after "masking" might help readers here. > --- a/xen/arch/x86/apic.c > +++ b/xen/arch/x86/apic.c > @@ -1476,6 +1476,10 @@ int __init APIC_init_uniprocessor (void) > return -1; > } > > + if ( smp_found_config && !skip_ioapic_setup && nr_ioapics ) > + /* Sanitize the IO-APIC pins before enabling the local APIC. */ > + sanitize_IO_APIC(); I'm a little puzzled by the smp_found_config part of the check here, but not in smp_prepare_cpus(). What's the reason for (a) the check and (b) the difference? Isn't checking nr_ioapics sufficient in this regard? (b) probably could be addressed by moving the code to the beginning of verify_local_APIC(), immediately ahead of which you insert both call sites. (At which point the function may want naming verify_IO_APIC() for consistency, but that's surely minor.) As to also checking skip_ioapic_setup - wouldn't the unmasked pin be similarly troublesome in that case? Aiui it would not hurt only if the LAPIC also isn't (going to be) set up. Then again the flag, among other things, signals a zero base address found in the ACPI or MP tables, so I guess this is a (largely) separate issue anyway. Jan
On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote: > On 07.07.2023 11:53, Roger Pau Monne wrote: > > The current logic to init the local APIC and the IO-APIC does init the > > former first before doing any kind of sanitation on the IO-APIC pin > > configuration. It's already noted on enable_IO_APIC() that Xen > > shouldn't trust the IO-APIC being empty at bootup. > > > > At XenServer we have a system where the IO-APIC 0 is handed to Xen > > with pin 0 unmasked, set to Fixed delivery mode, edge triggered and > > with a vector of 0 (all fields of the RTE are zeroed). Once the local > > APIC is enabled periodic injections from such pin cause a storm of > > errors: > > > > APIC error on CPU0: 00(40), Received illegal vector > > APIC error on CPU0: 40(40), Received illegal vector > > APIC error on CPU0: 40(40), Received illegal vector > > APIC error on CPU0: 40(40), Received illegal vector > > APIC error on CPU0: 40(40), Received illegal vector > > APIC error on CPU0: 40(40), Received illegal vector > > > > That prevents Xen from booting. > > And I expect no RTE is in ExtInt mode, so one might conclude that > firmware meant to set up RTE 0 that way. Mainly as a remark, albeit > of course there's then the question whether to change the RTE > rather than masking it. What do ACPI tables say? There's one relevant override: [668h 1640 1] Subtable Type : 02 [Interrupt Source Override] [669h 1641 1] Length : 0A [66Ah 1642 1] Bus : 00 [66Bh 1643 1] Source : 00 [66Ch 1644 4] Interrupt : 00000002 [670h 1648 2] Flags (decoded below) : 0000 Polarity : 0 Trigger Mode : 0 So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is connected to. > > Fix this by moving the masking of IO-APIC pins ahead of the enabling > > of the local APIC. Note that before doing such masking Xen attempts > > to detect the pin where the legacy i8259 is connected, and that logic > > relies on the pin being unmasked, hence the logic is also moved ahead > > of enabling the local APIC. > > A comma after "masking" might help readers here. > > > --- a/xen/arch/x86/apic.c > > +++ b/xen/arch/x86/apic.c > > @@ -1476,6 +1476,10 @@ int __init APIC_init_uniprocessor (void) > > return -1; > > } > > > > + if ( smp_found_config && !skip_ioapic_setup && nr_ioapics ) > > + /* Sanitize the IO-APIC pins before enabling the local APIC. */ > > + sanitize_IO_APIC(); > > I'm a little puzzled by the smp_found_config part of the check here, > but not in smp_prepare_cpus(). What's the reason for (a) the check > and (b) the difference? This just mimics what gates the call to setup_IO_APIC() in that same function. It makes no sense to call sanitize_IO_APIC() if setup_IO_APIC() is not called, and I wasn't planning on changing the logic that gates the call setup_IO_APIC() in this patch. I did note the difference with smp_prepare_cpus(), and I think we should look at unifying those paths, but didn't want to do it as part of this fix. > Isn't checking nr_ioapics sufficient in this > regard? (b) probably could be addressed by moving the code to the > beginning of verify_local_APIC(), immediately ahead of which you > insert both call sites. (At which point the function may want naming > verify_IO_APIC() for consistency, but that's surely minor.) I wanted the call to sanitize_IO_APIC() to be done at the same level where the call to setup_IO_APIC() is done, because it makes the logic clearer. > As to also checking skip_ioapic_setup - wouldn't the unmasked pin be > similarly troublesome in that case? skip_ioapic_setup is set when the IO-APIC address in the MADT is invalid, so in that case attempting to access IO-APIC registers in that case won't lead to anything good. > Aiui it would not hurt only if > the LAPIC also isn't (going to be) set up. Then again the flag, > among other things, signals a zero base address found in the ACPI or > MP tables, so I guess this is a (largely) separate issue anyway. Oh, yes, indeed. See my reply above. Thanks, Roger.
On 10.07.2023 16:43, Roger Pau Monné wrote: > On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote: >> On 07.07.2023 11:53, Roger Pau Monne wrote: >>> The current logic to init the local APIC and the IO-APIC does init the >>> former first before doing any kind of sanitation on the IO-APIC pin >>> configuration. It's already noted on enable_IO_APIC() that Xen >>> shouldn't trust the IO-APIC being empty at bootup. >>> >>> At XenServer we have a system where the IO-APIC 0 is handed to Xen >>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and >>> with a vector of 0 (all fields of the RTE are zeroed). Once the local >>> APIC is enabled periodic injections from such pin cause a storm of >>> errors: >>> >>> APIC error on CPU0: 00(40), Received illegal vector >>> APIC error on CPU0: 40(40), Received illegal vector >>> APIC error on CPU0: 40(40), Received illegal vector >>> APIC error on CPU0: 40(40), Received illegal vector >>> APIC error on CPU0: 40(40), Received illegal vector >>> APIC error on CPU0: 40(40), Received illegal vector >>> >>> That prevents Xen from booting. >> >> And I expect no RTE is in ExtInt mode, so one might conclude that >> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit >> of course there's then the question whether to change the RTE >> rather than masking it. What do ACPI tables say? > > There's one relevant override: > > [668h 1640 1] Subtable Type : 02 [Interrupt Source Override] > [669h 1641 1] Length : 0A > [66Ah 1642 1] Bus : 00 > [66Bh 1643 1] Source : 00 > [66Ch 1644 4] Interrupt : 00000002 > [670h 1648 2] Flags (decoded below) : 0000 > Polarity : 0 > Trigger Mode : 0 > > So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is > connected to. Then wouldn't we be better off converting that RTE to ExtInt? That would allow PIC IRQs (not likely to exist, but still) to work (without undue other side effects afaics), whereas masking this RTE would not. >>> --- a/xen/arch/x86/apic.c >>> +++ b/xen/arch/x86/apic.c >>> @@ -1476,6 +1476,10 @@ int __init APIC_init_uniprocessor (void) >>> return -1; >>> } >>> >>> + if ( smp_found_config && !skip_ioapic_setup && nr_ioapics ) >>> + /* Sanitize the IO-APIC pins before enabling the local APIC. */ >>> + sanitize_IO_APIC(); >> >> I'm a little puzzled by the smp_found_config part of the check here, >> but not in smp_prepare_cpus(). What's the reason for (a) the check >> and (b) the difference? > > This just mimics what gates the call to setup_IO_APIC() in that same > function. It makes no sense to call sanitize_IO_APIC() if > setup_IO_APIC() is not called, and I wasn't planning on changing the > logic that gates the call setup_IO_APIC() in this patch. > > I did note the difference with smp_prepare_cpus(), and I think we > should look at unifying those paths, but didn't want to do it as part > of this fix. Well, consistency is one valid goal. But masking RTEs may need to be done more aggressively than setting up the IO-APICs. In particular if we're not to use them, we still want to mask all RTEs. Otherwise we're likely to observe each IRQ to arrive via two separate routes (once through the 8259 and once from an unmasked IO-APIC pin). >> Isn't checking nr_ioapics sufficient in this >> regard? (b) probably could be addressed by moving the code to the >> beginning of verify_local_APIC(), immediately ahead of which you >> insert both call sites. (At which point the function may want naming >> verify_IO_APIC() for consistency, but that's surely minor.) > > I wanted the call to sanitize_IO_APIC() to be done at the same level > where the call to setup_IO_APIC() is done, because it makes the logic > clearer. Hmm, I see. >> As to also checking skip_ioapic_setup - wouldn't the unmasked pin be >> similarly troublesome in that case? > > skip_ioapic_setup is set when the IO-APIC address in the MADT is > invalid, so in that case attempting to access IO-APIC registers in > that case won't lead to anything good. Of course, as I did say as well. Still if for some IO-APICs we know their addresses, we would still be able to deal with those (if we were to stick to this mask-all-RTEs-early model). Jan
On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote: > On 10.07.2023 16:43, Roger Pau Monné wrote: > > On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote: > >> On 07.07.2023 11:53, Roger Pau Monne wrote: > >>> The current logic to init the local APIC and the IO-APIC does init the > >>> former first before doing any kind of sanitation on the IO-APIC pin > >>> configuration. It's already noted on enable_IO_APIC() that Xen > >>> shouldn't trust the IO-APIC being empty at bootup. > >>> > >>> At XenServer we have a system where the IO-APIC 0 is handed to Xen > >>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and > >>> with a vector of 0 (all fields of the RTE are zeroed). Once the local > >>> APIC is enabled periodic injections from such pin cause a storm of > >>> errors: > >>> > >>> APIC error on CPU0: 00(40), Received illegal vector > >>> APIC error on CPU0: 40(40), Received illegal vector > >>> APIC error on CPU0: 40(40), Received illegal vector > >>> APIC error on CPU0: 40(40), Received illegal vector > >>> APIC error on CPU0: 40(40), Received illegal vector > >>> APIC error on CPU0: 40(40), Received illegal vector > >>> > >>> That prevents Xen from booting. > >> > >> And I expect no RTE is in ExtInt mode, so one might conclude that > >> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit > >> of course there's then the question whether to change the RTE > >> rather than masking it. What do ACPI tables say? > > > > There's one relevant override: > > > > [668h 1640 1] Subtable Type : 02 [Interrupt Source Override] > > [669h 1641 1] Length : 0A > > [66Ah 1642 1] Bus : 00 > > [66Bh 1643 1] Source : 00 > > [66Ch 1644 4] Interrupt : 00000002 > > [670h 1648 2] Flags (decoded below) : 0000 > > Polarity : 0 > > Trigger Mode : 0 > > > > So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is > > connected to. > > Then wouldn't we be better off converting that RTE to ExtInt? That > would allow PIC IRQs (not likely to exist, but still) to work > (without undue other side effects afaics), whereas masking this RTE > would not. I'm kind of worry of trying to automate this logic. Should we always convert pin 0 to ExtInt if it's found unmasked, with and invalid vector and there's a source override entry for the IRQ? It seems weird to infer this just from the fact that pin 0 is all zeroed out. > >>> --- a/xen/arch/x86/apic.c > >>> +++ b/xen/arch/x86/apic.c > >>> @@ -1476,6 +1476,10 @@ int __init APIC_init_uniprocessor (void) > >>> return -1; > >>> } > >>> > >>> + if ( smp_found_config && !skip_ioapic_setup && nr_ioapics ) > >>> + /* Sanitize the IO-APIC pins before enabling the local APIC. */ > >>> + sanitize_IO_APIC(); > >> > >> I'm a little puzzled by the smp_found_config part of the check here, > >> but not in smp_prepare_cpus(). What's the reason for (a) the check > >> and (b) the difference? > > > > This just mimics what gates the call to setup_IO_APIC() in that same > > function. It makes no sense to call sanitize_IO_APIC() if > > setup_IO_APIC() is not called, and I wasn't planning on changing the > > logic that gates the call setup_IO_APIC() in this patch. > > > > I did note the difference with smp_prepare_cpus(), and I think we > > should look at unifying those paths, but didn't want to do it as part > > of this fix. > > Well, consistency is one valid goal. But masking RTEs may need to be > done more aggressively than setting up the IO-APICs. In particular > if we're not to use them, we still want to mask all RTEs. Otherwise > we're likely to observe each IRQ to arrive via two separate routes > (once through the 8259 and once from an unmasked IO-APIC pin). So avoid the smp_found_config check here and use the same condition as in smp_prepare_cpus(), I will adjust the patch to that effect. I do wonder why we don't simply mandate IO-APIC usage and get rid of the code to handle the 8259. Is it only to cope with systems that have a broken IO-APIC configuration? Because I don't think there are x86 64bit systems without an IO-APIC? > >> Isn't checking nr_ioapics sufficient in this > >> regard? (b) probably could be addressed by moving the code to the > >> beginning of verify_local_APIC(), immediately ahead of which you > >> insert both call sites. (At which point the function may want naming > >> verify_IO_APIC() for consistency, but that's surely minor.) > > > > I wanted the call to sanitize_IO_APIC() to be done at the same level > > where the call to setup_IO_APIC() is done, because it makes the logic > > clearer. > > Hmm, I see. > > >> As to also checking skip_ioapic_setup - wouldn't the unmasked pin be > >> similarly troublesome in that case? > > > > skip_ioapic_setup is set when the IO-APIC address in the MADT is > > invalid, so in that case attempting to access IO-APIC registers in > > that case won't lead to anything good. > > Of course, as I did say as well. Still if for some IO-APICs we know > their addresses, we would still be able to deal with those (if we > were to stick to this mask-all-RTEs-early model). The issue is that ioapic_init_mappings() will refuse to process further IO-APIC entries once an entry with address 0 is found. We could change that, but I would likely do it as an improvement rather than tie it to this change. Thanks, Roger.
On 11.07.2023 15:02, Roger Pau Monné wrote: > On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote: >> On 10.07.2023 16:43, Roger Pau Monné wrote: >>> On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote: >>>> On 07.07.2023 11:53, Roger Pau Monne wrote: >>>>> The current logic to init the local APIC and the IO-APIC does init the >>>>> former first before doing any kind of sanitation on the IO-APIC pin >>>>> configuration. It's already noted on enable_IO_APIC() that Xen >>>>> shouldn't trust the IO-APIC being empty at bootup. >>>>> >>>>> At XenServer we have a system where the IO-APIC 0 is handed to Xen >>>>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and >>>>> with a vector of 0 (all fields of the RTE are zeroed). Once the local >>>>> APIC is enabled periodic injections from such pin cause a storm of >>>>> errors: >>>>> >>>>> APIC error on CPU0: 00(40), Received illegal vector >>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>> >>>>> That prevents Xen from booting. >>>> >>>> And I expect no RTE is in ExtInt mode, so one might conclude that >>>> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit >>>> of course there's then the question whether to change the RTE >>>> rather than masking it. What do ACPI tables say? >>> >>> There's one relevant override: >>> >>> [668h 1640 1] Subtable Type : 02 [Interrupt Source Override] >>> [669h 1641 1] Length : 0A >>> [66Ah 1642 1] Bus : 00 >>> [66Bh 1643 1] Source : 00 >>> [66Ch 1644 4] Interrupt : 00000002 >>> [670h 1648 2] Flags (decoded below) : 0000 >>> Polarity : 0 >>> Trigger Mode : 0 >>> >>> So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is >>> connected to. >> >> Then wouldn't we be better off converting that RTE to ExtInt? That >> would allow PIC IRQs (not likely to exist, but still) to work >> (without undue other side effects afaics), whereas masking this RTE >> would not. > > I'm kind of worry of trying to automate this logic. Should we always > convert pin 0 to ExtInt if it's found unmasked, with and invalid > vector and there's a source override entry for the IRQ? > > It seems weird to infer this just from the fact that pin 0 is all > zeroed out. As you say in the earlier paragraph, it wouldn't be just that. It's unmasked + bad vector + present source override (indicating that nothing except perhaps a PIC is connected to the pin). >>>>> --- a/xen/arch/x86/apic.c >>>>> +++ b/xen/arch/x86/apic.c >>>>> @@ -1476,6 +1476,10 @@ int __init APIC_init_uniprocessor (void) >>>>> return -1; >>>>> } >>>>> >>>>> + if ( smp_found_config && !skip_ioapic_setup && nr_ioapics ) >>>>> + /* Sanitize the IO-APIC pins before enabling the local APIC. */ >>>>> + sanitize_IO_APIC(); >>>> >>>> I'm a little puzzled by the smp_found_config part of the check here, >>>> but not in smp_prepare_cpus(). What's the reason for (a) the check >>>> and (b) the difference? >>> >>> This just mimics what gates the call to setup_IO_APIC() in that same >>> function. It makes no sense to call sanitize_IO_APIC() if >>> setup_IO_APIC() is not called, and I wasn't planning on changing the >>> logic that gates the call setup_IO_APIC() in this patch. >>> >>> I did note the difference with smp_prepare_cpus(), and I think we >>> should look at unifying those paths, but didn't want to do it as part >>> of this fix. >> >> Well, consistency is one valid goal. But masking RTEs may need to be >> done more aggressively than setting up the IO-APICs. In particular >> if we're not to use them, we still want to mask all RTEs. Otherwise >> we're likely to observe each IRQ to arrive via two separate routes >> (once through the 8259 and once from an unmasked IO-APIC pin). > > So avoid the smp_found_config check here and use the same condition as in > smp_prepare_cpus(), I will adjust the patch to that effect. > > I do wonder why we don't simply mandate IO-APIC usage and get rid of > the code to handle the 8259. Is it only to cope with systems that > have a broken IO-APIC configuration? Because I don't think there are > x86 64bit systems without an IO-APIC? There shouldn't be, yes. And Andrew has been suggesting the same on a number of occasions, iirc. Yet trying to remove that code is perhaps more risky than simply keeping it, which may be (besides time constraints) why nobody has made an attempt so far. >>>> Isn't checking nr_ioapics sufficient in this >>>> regard? (b) probably could be addressed by moving the code to the >>>> beginning of verify_local_APIC(), immediately ahead of which you >>>> insert both call sites. (At which point the function may want naming >>>> verify_IO_APIC() for consistency, but that's surely minor.) >>> >>> I wanted the call to sanitize_IO_APIC() to be done at the same level >>> where the call to setup_IO_APIC() is done, because it makes the logic >>> clearer. >> >> Hmm, I see. >> >>>> As to also checking skip_ioapic_setup - wouldn't the unmasked pin be >>>> similarly troublesome in that case? >>> >>> skip_ioapic_setup is set when the IO-APIC address in the MADT is >>> invalid, so in that case attempting to access IO-APIC registers in >>> that case won't lead to anything good. >> >> Of course, as I did say as well. Still if for some IO-APICs we know >> their addresses, we would still be able to deal with those (if we >> were to stick to this mask-all-RTEs-early model). > > The issue is that ioapic_init_mappings() will refuse to process > further IO-APIC entries once an entry with address 0 is found. We > could change that, but I would likely do it as an improvement rather > than tie it to this change. Sure, I didn't mean to suggest that such needs to be done right here (it could though, but chances are that this would quickly grow too much). What we want to do right here is make the change maximally useful without needing to tweak overly many other places in the code. Jan
On Wed, Jul 12, 2023 at 12:07:43PM +0200, Jan Beulich wrote: > On 11.07.2023 15:02, Roger Pau Monné wrote: > > On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote: > >> On 10.07.2023 16:43, Roger Pau Monné wrote: > >>> On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote: > >>>> On 07.07.2023 11:53, Roger Pau Monne wrote: > >>>>> The current logic to init the local APIC and the IO-APIC does init the > >>>>> former first before doing any kind of sanitation on the IO-APIC pin > >>>>> configuration. It's already noted on enable_IO_APIC() that Xen > >>>>> shouldn't trust the IO-APIC being empty at bootup. > >>>>> > >>>>> At XenServer we have a system where the IO-APIC 0 is handed to Xen > >>>>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and > >>>>> with a vector of 0 (all fields of the RTE are zeroed). Once the local > >>>>> APIC is enabled periodic injections from such pin cause a storm of > >>>>> errors: > >>>>> > >>>>> APIC error on CPU0: 00(40), Received illegal vector > >>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>> > >>>>> That prevents Xen from booting. > >>>> > >>>> And I expect no RTE is in ExtInt mode, so one might conclude that > >>>> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit > >>>> of course there's then the question whether to change the RTE > >>>> rather than masking it. What do ACPI tables say? > >>> > >>> There's one relevant override: > >>> > >>> [668h 1640 1] Subtable Type : 02 [Interrupt Source Override] > >>> [669h 1641 1] Length : 0A > >>> [66Ah 1642 1] Bus : 00 > >>> [66Bh 1643 1] Source : 00 > >>> [66Ch 1644 4] Interrupt : 00000002 > >>> [670h 1648 2] Flags (decoded below) : 0000 > >>> Polarity : 0 > >>> Trigger Mode : 0 > >>> > >>> So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is > >>> connected to. > >> > >> Then wouldn't we be better off converting that RTE to ExtInt? That > >> would allow PIC IRQs (not likely to exist, but still) to work > >> (without undue other side effects afaics), whereas masking this RTE > >> would not. > > > > I'm kind of worry of trying to automate this logic. Should we always > > convert pin 0 to ExtInt if it's found unmasked, with and invalid > > vector and there's a source override entry for the IRQ? > > > > It seems weird to infer this just from the fact that pin 0 is all > > zeroed out. > > As you say in the earlier paragraph, it wouldn't be just that. It's > unmasked + bad vector + present source override (indicating that > nothing except perhaps a PIC is connected to the pin). I can do this as a separate patch, but not really here IMO. The purpose of this patch is strictly to perform the IO-APIC sanitation ahead of enabling the local APIC. I will add a followup patch to the series, albeit I'm unconvinced we want to infer IO-APIC pin configuration based on firmware miss configurations. Thanks, Roger.
On 12.07.2023 15:53, Roger Pau Monné wrote: > On Wed, Jul 12, 2023 at 12:07:43PM +0200, Jan Beulich wrote: >> On 11.07.2023 15:02, Roger Pau Monné wrote: >>> On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote: >>>> On 10.07.2023 16:43, Roger Pau Monné wrote: >>>>> On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote: >>>>>> On 07.07.2023 11:53, Roger Pau Monne wrote: >>>>>>> The current logic to init the local APIC and the IO-APIC does init the >>>>>>> former first before doing any kind of sanitation on the IO-APIC pin >>>>>>> configuration. It's already noted on enable_IO_APIC() that Xen >>>>>>> shouldn't trust the IO-APIC being empty at bootup. >>>>>>> >>>>>>> At XenServer we have a system where the IO-APIC 0 is handed to Xen >>>>>>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and >>>>>>> with a vector of 0 (all fields of the RTE are zeroed). Once the local >>>>>>> APIC is enabled periodic injections from such pin cause a storm of >>>>>>> errors: >>>>>>> >>>>>>> APIC error on CPU0: 00(40), Received illegal vector >>>>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>>>> APIC error on CPU0: 40(40), Received illegal vector >>>>>>> >>>>>>> That prevents Xen from booting. >>>>>> >>>>>> And I expect no RTE is in ExtInt mode, so one might conclude that >>>>>> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit >>>>>> of course there's then the question whether to change the RTE >>>>>> rather than masking it. What do ACPI tables say? >>>>> >>>>> There's one relevant override: >>>>> >>>>> [668h 1640 1] Subtable Type : 02 [Interrupt Source Override] >>>>> [669h 1641 1] Length : 0A >>>>> [66Ah 1642 1] Bus : 00 >>>>> [66Bh 1643 1] Source : 00 >>>>> [66Ch 1644 4] Interrupt : 00000002 >>>>> [670h 1648 2] Flags (decoded below) : 0000 >>>>> Polarity : 0 >>>>> Trigger Mode : 0 >>>>> >>>>> So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is >>>>> connected to. >>>> >>>> Then wouldn't we be better off converting that RTE to ExtInt? That >>>> would allow PIC IRQs (not likely to exist, but still) to work >>>> (without undue other side effects afaics), whereas masking this RTE >>>> would not. >>> >>> I'm kind of worry of trying to automate this logic. Should we always >>> convert pin 0 to ExtInt if it's found unmasked, with and invalid >>> vector and there's a source override entry for the IRQ? >>> >>> It seems weird to infer this just from the fact that pin 0 is all >>> zeroed out. >> >> As you say in the earlier paragraph, it wouldn't be just that. It's >> unmasked + bad vector + present source override (indicating that >> nothing except perhaps a PIC is connected to the pin). > > I can do this as a separate patch, but not really here IMO. The > purpose of this patch is strictly to perform the IO-APIC sanitation > ahead of enabling the local APIC. I will add a followup patch to the > series, albeit I'm unconvinced we want to infer IO-APIC pin > configuration based on firmware miss configurations. Hmm. The question really is which of the changes we want to backport. That one should be first. While it's largely guesswork, I'd be more inclined to take the change that has less of an effect overall. That said I can see that Linux have their enable_IO_APIC() calls also ahead of setup_IO_APIC() (but after connect_bsp_APIC() and setup_local_APIC()). IOW with your change we'd do the masking yet earlier than them. This may of course even be advantageous, but there may also be good reasons for the sequence they're using. Jan
On Wed, Jul 12, 2023 at 04:50:34PM +0200, Jan Beulich wrote: > On 12.07.2023 15:53, Roger Pau Monné wrote: > > On Wed, Jul 12, 2023 at 12:07:43PM +0200, Jan Beulich wrote: > >> On 11.07.2023 15:02, Roger Pau Monné wrote: > >>> On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote: > >>>> On 10.07.2023 16:43, Roger Pau Monné wrote: > >>>>> On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote: > >>>>>> On 07.07.2023 11:53, Roger Pau Monne wrote: > >>>>>>> The current logic to init the local APIC and the IO-APIC does init the > >>>>>>> former first before doing any kind of sanitation on the IO-APIC pin > >>>>>>> configuration. It's already noted on enable_IO_APIC() that Xen > >>>>>>> shouldn't trust the IO-APIC being empty at bootup. > >>>>>>> > >>>>>>> At XenServer we have a system where the IO-APIC 0 is handed to Xen > >>>>>>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and > >>>>>>> with a vector of 0 (all fields of the RTE are zeroed). Once the local > >>>>>>> APIC is enabled periodic injections from such pin cause a storm of > >>>>>>> errors: > >>>>>>> > >>>>>>> APIC error on CPU0: 00(40), Received illegal vector > >>>>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>>>> APIC error on CPU0: 40(40), Received illegal vector > >>>>>>> > >>>>>>> That prevents Xen from booting. > >>>>>> > >>>>>> And I expect no RTE is in ExtInt mode, so one might conclude that > >>>>>> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit > >>>>>> of course there's then the question whether to change the RTE > >>>>>> rather than masking it. What do ACPI tables say? > >>>>> > >>>>> There's one relevant override: > >>>>> > >>>>> [668h 1640 1] Subtable Type : 02 [Interrupt Source Override] > >>>>> [669h 1641 1] Length : 0A > >>>>> [66Ah 1642 1] Bus : 00 > >>>>> [66Bh 1643 1] Source : 00 > >>>>> [66Ch 1644 4] Interrupt : 00000002 > >>>>> [670h 1648 2] Flags (decoded below) : 0000 > >>>>> Polarity : 0 > >>>>> Trigger Mode : 0 > >>>>> > >>>>> So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is > >>>>> connected to. > >>>> > >>>> Then wouldn't we be better off converting that RTE to ExtInt? That > >>>> would allow PIC IRQs (not likely to exist, but still) to work > >>>> (without undue other side effects afaics), whereas masking this RTE > >>>> would not. > >>> > >>> I'm kind of worry of trying to automate this logic. Should we always > >>> convert pin 0 to ExtInt if it's found unmasked, with and invalid > >>> vector and there's a source override entry for the IRQ? > >>> > >>> It seems weird to infer this just from the fact that pin 0 is all > >>> zeroed out. > >> > >> As you say in the earlier paragraph, it wouldn't be just that. It's > >> unmasked + bad vector + present source override (indicating that > >> nothing except perhaps a PIC is connected to the pin). > > > > I can do this as a separate patch, but not really here IMO. The > > purpose of this patch is strictly to perform the IO-APIC sanitation > > ahead of enabling the local APIC. I will add a followup patch to the > > series, albeit I'm unconvinced we want to infer IO-APIC pin > > configuration based on firmware miss configurations. > > Hmm. The question really is which of the changes we want to backport. > That one should be first. While it's largely guesswork, I'd be more > inclined to take the change that has less of an effect overall. My views would be the other way around I think. Current code already has a comment to notice that IO-APIC pins might be wrongly unmasked, and there's also logic for masking them when the IO-APICs are initialized. The fact that such logic is placed after the local APIC has been initialized is IMO a bug, as having unmasked unconfigured IO-APIC pins when the local APIC is enabled should be avoided. > That said I can see that Linux have their enable_IO_APIC() calls > also ahead of setup_IO_APIC() (but after connect_bsp_APIC() and > setup_local_APIC()). IOW with your change we'd do the masking yet > earlier than them. This may of course even be advantageous, but > there may also be good reasons for the sequence they're using. Linux calls enable_IO_APIC() before setting up the local APIC LVT Error vector (that's done in end_local_APIC_setup()). That logic seems to be introduced by commit: 1c69524c2e5b x86: clear IO_APIC before enabing apic error vector. Might it be less controversial to do like Linux does and call enable_IO_APIC() before the local APIC ESR is setup? Thanks, Roger.
On 12.07.2023 17:41, Roger Pau Monné wrote: > On Wed, Jul 12, 2023 at 04:50:34PM +0200, Jan Beulich wrote: >> Hmm. The question really is which of the changes we want to backport. >> That one should be first. While it's largely guesswork, I'd be more >> inclined to take the change that has less of an effect overall. > > My views would be the other way around I think. Current code already > has a comment to notice that IO-APIC pins might be wrongly unmasked, > and there's also logic for masking them when the IO-APICs are > initialized. The fact that such logic is placed after the local APIC > has been initialized is IMO a bug, as having unmasked unconfigured > IO-APIC pins when the local APIC is enabled should be avoided. Hmm, if you really meant this (and not setting up / unmasking of LVTERR), then your change would still be insufficient. We may enable the APIC for the BSP in init_bsp_APIC() (which is called quite a bit earlier), and the APIC may also have been enabled by firmware already, so I don't view this argument as fully convincing. (That said, init_bsp_APIC() calls clear_local_APIC(), so while the LAPIC may be enabled, errors would be reported only in ESR, not by delivering interrupts.) Which gets me back to another aspect of your scenario that I haven't fully understood: In the description you say that booting fails. Since we handle the errors, the implication is that the pin remains constantly active (thus triggering errors over and over again). Yet how would this not already cause problems ahead of smp_prepare_cpus() if the LAPIC was already enabled? Wouldn't we need to do part of clear_local_APIC() from init_bsp_APIC() before bailing from there when smp_found_config is set? ("Part of" because as per the comment at the top of init_bsp_APIC() we apparently would need to leave LVT0 [and then perhaps also LVT1] unmasked.) >> That said I can see that Linux have their enable_IO_APIC() calls >> also ahead of setup_IO_APIC() (but after connect_bsp_APIC() and >> setup_local_APIC()). IOW with your change we'd do the masking yet >> earlier than them. This may of course even be advantageous, but >> there may also be good reasons for the sequence they're using. > > Linux calls enable_IO_APIC() before setting up the local > APIC LVT Error vector (that's done in end_local_APIC_setup()). That > logic seems to be introduced by commit: > > 1c69524c2e5b x86: clear IO_APIC before enabing apic error vector. > > Might it be less controversial to do like Linux does and call > enable_IO_APIC() before the local APIC ESR is setup? You already do so, just that you do it yet earlier. I'm not convinced it needs doing from the middle of setup_local_APIC() (or, like nowaday's Linux has it, with ESR / LVTERR setup split to a separate function, and enable_IO_APIC() called between those two LAPIC related calls). You also disliked putting the call at the beginning of setup_local_APIC(), so putting it in the middle of it might be yet worse when taking that perspective. (Another downside of calling it from anywhere in setup_local_APIC() is that this would be another __init function called from a non-__init one. We have examples of such, and keying the call to "bsp" would leave it safe, but avoiding such calls when we easily can is probably a worthwhile secondary goal.) Question is whether it can sensibly be moved at least a little later: verify_local_APIC() isn't of much use anyway, first and foremost because we ignore its return value. And connect_bsp_APIC() largely is concerned about leaving PIC mode. So maybe put the call immediately ahead of the setup_local_APIC(true) ones? A further question is whether, considering that Linux continues to use that name, we wouldn't be better off not renaming the function. One other thing I finally figured was confusing me in the description of the patch; re-quoting that paragraph here: "Fix this by moving the masking of IO-APIC pins ahead of the enabling of the local APIC. Note that before doing such masking Xen attempts to detect the pin where the legacy i8259 is connected, and that logic relies on the pin being unmasked, hence the logic is also moved ahead of enabling the local APIC." The last sentence saying "also" is kind of odd with the first one already stating this very movement. To me it's therefore unclear what exactly the second sentence is intended to be telling me. I guess you want to express that together with the making the detection logic is also moved (i.e. the entire function is called earlier), but I'm afraid this isn't the only way to read that second sentence. Jan
On Thu, Jul 13, 2023 at 09:49:14AM +0200, Jan Beulich wrote: > On 12.07.2023 17:41, Roger Pau Monné wrote: > > On Wed, Jul 12, 2023 at 04:50:34PM +0200, Jan Beulich wrote: > >> Hmm. The question really is which of the changes we want to backport. > >> That one should be first. While it's largely guesswork, I'd be more > >> inclined to take the change that has less of an effect overall. > > > > My views would be the other way around I think. Current code already > > has a comment to notice that IO-APIC pins might be wrongly unmasked, > > and there's also logic for masking them when the IO-APICs are > > initialized. The fact that such logic is placed after the local APIC > > has been initialized is IMO a bug, as having unmasked unconfigured > > IO-APIC pins when the local APIC is enabled should be avoided. > > Hmm, if you really meant this (and not setting up / unmasking of > LVTERR), then your change would still be insufficient. We may enable > the APIC for the BSP in init_bsp_APIC() (which is called quite a bit > earlier), and the APIC may also have been enabled by firmware already, > so I don't view this argument as fully convincing. (That said, > init_bsp_APIC() calls clear_local_APIC(), so while the LAPIC may be > enabled, errors would be reported only in ESR, not by delivering > interrupts.) Indeed. I should mention the error vector explicitly, as it's not the local APIC being enabled, but the error vector being setup which causes the loop here. > Which gets me back to another aspect of your scenario that I haven't > fully understood: In the description you say that booting fails. > Since we handle the errors, the implication is that the pin remains > constantly active (thus triggering errors over and over again). Yet > how would this not already cause problems ahead of smp_prepare_cpus() Before smp_prepare_cpus() (and explicitly before setup_local_APIC()) the error vector is not setup (or that would be my expectation), so errors are just reported on the ESR, but there's no vector injected. > if the LAPIC was already enabled? Wouldn't we need to do part of > clear_local_APIC() from init_bsp_APIC() before bailing from there > when smp_found_config is set? ("Part of" because as per the comment > at the top of init_bsp_APIC() we apparently would need to leave LVT0 > [and then perhaps also LVT1] unmasked.) Right, if the APIC is already enabled by the firmware we do currently rely on the error vector being masked. OI think this is a reasonable expectation, as handing being done with an unknown error vector unmasked in the local APIC would be bad. Even with the CPU running with interrupts disabled we would get error vectors set in IRR. > > >> That said I can see that Linux have their enable_IO_APIC() calls > >> also ahead of setup_IO_APIC() (but after connect_bsp_APIC() and > >> setup_local_APIC()). IOW with your change we'd do the masking yet > >> earlier than them. This may of course even be advantageous, but > >> there may also be good reasons for the sequence they're using. > > > > Linux calls enable_IO_APIC() before setting up the local > > APIC LVT Error vector (that's done in end_local_APIC_setup()). That > > logic seems to be introduced by commit: > > > > 1c69524c2e5b x86: clear IO_APIC before enabing apic error vector. > > > > Might it be less controversial to do like Linux does and call > > enable_IO_APIC() before the local APIC ESR is setup? > > You already do so, just that you do it yet earlier. I'm not > convinced it needs doing from the middle of setup_local_APIC() (or, > like nowaday's Linux has it, with ESR / LVTERR setup split to a > separate function, and enable_IO_APIC() called between those two > LAPIC related calls). You also disliked putting the call at the > beginning of setup_local_APIC(), so putting it in the middle of it > might be yet worse when taking that perspective. (Another downside > of calling it from anywhere in setup_local_APIC() is that this > would be another __init function called from a non-__init one. We > have examples of such, and keying the call to "bsp" would leave it > safe, but avoiding such calls when we easily can is probably a > worthwhile secondary goal.) Yes, it's not my preference to call it from setup_local_APIC(), neither splitting the setup of LVTERR/ESR into a separate function. > > Question is whether it can sensibly be moved at least a little > later: verify_local_APIC() isn't of much use anyway, first and > foremost because we ignore its return value. And connect_bsp_APIC() > largely is concerned about leaving PIC mode. So maybe put the call > immediately ahead of the setup_local_APIC(true) ones? Strictly speaking for the issue I have at hand it needs to be done ahead of setting up LVTERR/ESR. > A further question is whether, considering that Linux continues to > use that name, we wouldn't be better off not renaming the function. Yes, I will leave the name alone, albeit it's IMO a bit misleading, it's not actually enabling anything, but rather masking the pins and figuring out the pin where the i8259 might be connected. > One other thing I finally figured was confusing me in the > description of the patch; re-quoting that paragraph here: > > "Fix this by moving the masking of IO-APIC pins ahead of the enabling > of the local APIC. Note that before doing such masking Xen attempts > to detect the pin where the legacy i8259 is connected, and that logic > relies on the pin being unmasked, hence the logic is also moved ahead > of enabling the local APIC." > > The last sentence saying "also" is kind of odd with the first one > already stating this very movement. To me it's therefore unclear what > exactly the second sentence is intended to be telling me. I guess you > want to express that together with the making the detection logic is > also moved (i.e. the entire function is called earlier), but I'm > afraid this isn't the only way to read that second sentence. Yes, you are reading it as I was intending. It's merely a logic that the i8259 detection is also done slightly earlier, as it relies on the RTE masking not being tampered with. What about: "Fix this by moving the masking of IO-APIC pins ahead of the enabling of the local APIC. Note that before doing such masking Xen attempts to detect the pin where the legacy i8259 is connected, and that logic relies on the pin being unmasked, hence the logic is also moved ahead of enabling the local APIC." "Move the masking of the IO-APIC pins ahead of the setup of the local APIC. This has the side effect of also moving the detection of the pin where the i8259 is connected, as it must be done before masking any pins." Is the aboce any better? Would you rather prefer me to drop any mention of the i8259 detection being moved? (as it's just a side effect of moving the masking). So to recap, I think we are in agreement that calling enable_IO_APIC() just ahead of the call to setup_local_APIC() is the preferred solution? Thanks, Roger.
On 13.07.2023 13:29, Roger Pau Monné wrote: > On Thu, Jul 13, 2023 at 09:49:14AM +0200, Jan Beulich wrote: >> One other thing I finally figured was confusing me in the >> description of the patch; re-quoting that paragraph here: >> >> "Fix this by moving the masking of IO-APIC pins ahead of the enabling >> of the local APIC. Note that before doing such masking Xen attempts >> to detect the pin where the legacy i8259 is connected, and that logic >> relies on the pin being unmasked, hence the logic is also moved ahead >> of enabling the local APIC." >> >> The last sentence saying "also" is kind of odd with the first one >> already stating this very movement. To me it's therefore unclear what >> exactly the second sentence is intended to be telling me. I guess you >> want to express that together with the making the detection logic is >> also moved (i.e. the entire function is called earlier), but I'm >> afraid this isn't the only way to read that second sentence. > > Yes, you are reading it as I was intending. It's merely a logic that > the i8259 detection is also done slightly earlier, as it relies on the > RTE masking not being tampered with. > > What about: > > "Fix this by moving the masking of IO-APIC pins ahead of the enabling > of the local APIC. Note that before doing such masking Xen attempts > to detect the pin where the legacy i8259 is connected, and that logic > relies on the pin being unmasked, hence the logic is also moved ahead > of enabling the local APIC." > > "Move the masking of the IO-APIC pins ahead of the setup of the local > APIC. This has the side effect of also moving the detection of the > pin where the i8259 is connected, as it must be done before masking > any pins." > > Is the aboce any better? It is, thanks. > Would you rather prefer me to drop any > mention of the i8259 detection being moved? (as it's just a side > effect of moving the masking). No, better mention this aspect. > So to recap, I think we are in agreement that calling enable_IO_APIC() > just ahead of the call to setup_local_APIC() is the preferred > solution? Well, yes and no. My preferred course of action for the issue at hand would be to convert RTE 0 to ExtInt (under the mentioned set of conditions). I agree though that we also want to move the masking of RTEs, and for that I further agree with the placement mentioned above. That said, you're the contributor, so it's still up to you what you submit in what order (you could still decide to leave out the RTE conversion change). The two of us not reaching agreement merely makes a little more difficult to decide which of the changes to backport then. Jan
On Thu, Jul 13, 2023 at 02:18:29PM +0200, Jan Beulich wrote: > On 13.07.2023 13:29, Roger Pau Monné wrote: > > So to recap, I think we are in agreement that calling enable_IO_APIC() > > just ahead of the call to setup_local_APIC() is the preferred > > solution? > > Well, yes and no. My preferred course of action for the issue at hand > would be to convert RTE 0 to ExtInt (under the mentioned set of > conditions). I agree though that we also want to move the masking of > RTEs, and for that I further agree with the placement mentioned above. So I hacked up a change to set pin 0 to ExtINT mode (and avoid doing the masking early), and I got: (XEN) spurious 8259A interrupt: IRQ7. This was a single interrupt, but still I think the masking is the critical part to get backported. Thanks, Roger.
On 14.07.2023 18:05, Roger Pau Monné wrote: > On Thu, Jul 13, 2023 at 02:18:29PM +0200, Jan Beulich wrote: >> On 13.07.2023 13:29, Roger Pau Monné wrote: >>> So to recap, I think we are in agreement that calling enable_IO_APIC() >>> just ahead of the call to setup_local_APIC() is the preferred >>> solution? >> >> Well, yes and no. My preferred course of action for the issue at hand >> would be to convert RTE 0 to ExtInt (under the mentioned set of >> conditions). I agree though that we also want to move the masking of >> RTEs, and for that I further agree with the placement mentioned above. > > So I hacked up a change to set pin 0 to ExtINT mode (and avoid doing > the masking early), and I got: > > (XEN) spurious 8259A interrupt: IRQ7. > > This was a single interrupt, but still I think the masking is the > critical part to get backported. One way to look at it, yes. My perspective is different though: If there truly is a (spurious or not) IRQ at the 8259, then it needs dealing with sooner or later (in the process of booting). Aiui if we masked pin 0 initially and then later unmasked it (after switching to ExtInt, if not set so by firmware), we'd still see a spurious IRQ7. Jan
On Mon, Jul 17, 2023 at 08:40:05AM +0200, Jan Beulich wrote: > On 14.07.2023 18:05, Roger Pau Monné wrote: > > On Thu, Jul 13, 2023 at 02:18:29PM +0200, Jan Beulich wrote: > >> On 13.07.2023 13:29, Roger Pau Monné wrote: > >>> So to recap, I think we are in agreement that calling enable_IO_APIC() > >>> just ahead of the call to setup_local_APIC() is the preferred > >>> solution? > >> > >> Well, yes and no. My preferred course of action for the issue at hand > >> would be to convert RTE 0 to ExtInt (under the mentioned set of > >> conditions). I agree though that we also want to move the masking of > >> RTEs, and for that I further agree with the placement mentioned above. > > > > So I hacked up a change to set pin 0 to ExtINT mode (and avoid doing > > the masking early), and I got: > > > > (XEN) spurious 8259A interrupt: IRQ7. > > > > This was a single interrupt, but still I think the masking is the > > critical part to get backported. > > One way to look at it, yes. My perspective is different though: If > there truly is a (spurious or not) IRQ at the 8259, then it needs > dealing with sooner or later (in the process of booting). Aiui if > we masked pin 0 initially and then later unmasked it (after > switching to ExtInt, if not set so by firmware), we'd still see a > spurious IRQ7. Hm, I see. I'm not very familiar with 8259, I was expecting that maybe we would mask the IRQ before getting such spurious injection, but I'm not even sure that's possible or whether it would work. Thanks, Roger.
diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c index f71474d47dd1..9197b9532480 100644 --- a/xen/arch/x86/apic.c +++ b/xen/arch/x86/apic.c @@ -1476,6 +1476,10 @@ int __init APIC_init_uniprocessor (void) return -1; } + if ( smp_found_config && !skip_ioapic_setup && nr_ioapics ) + /* Sanitize the IO-APIC pins before enabling the local APIC. */ + sanitize_IO_APIC(); + verify_local_APIC(); connect_bsp_APIC(); diff --git a/xen/arch/x86/include/asm/irq.h b/xen/arch/x86/include/asm/irq.h index 424b0e1af8f4..dfa681846255 100644 --- a/xen/arch/x86/include/asm/irq.h +++ b/xen/arch/x86/include/asm/irq.h @@ -118,6 +118,7 @@ bool bogus_8259A_irq(unsigned int irq); int i8259A_suspend(void); int i8259A_resume(void); +void sanitize_IO_APIC(void); void setup_IO_APIC(void); void disable_IO_APIC(void); void setup_ioapic_dest(void); diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c index 9b8a972cf570..120c231e0302 100644 --- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -1273,7 +1273,7 @@ static void cf_check _print_IO_APIC_keyhandler(unsigned char key) __print_IO_APIC(0); } -static void __init enable_IO_APIC(void) +void __init sanitize_IO_APIC(void) { int i8259_apic, i8259_pin; int i, apic; @@ -2067,8 +2067,6 @@ static void __init ioapic_pm_state_alloc(void) void __init setup_IO_APIC(void) { - enable_IO_APIC(); - if (acpi_ioapic) io_apic_irqs = ~0; /* all IRQs go through IOAPIC */ else diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c index cf9bb220f90d..f9e27a23d383 100644 --- a/xen/arch/x86/smpboot.c +++ b/xen/arch/x86/smpboot.c @@ -1221,6 +1221,10 @@ void __init smp_prepare_cpus(void) goto init_uniprocessor; } + if ( !skip_ioapic_setup && nr_ioapics ) + /* Sanitize the IO-APIC pins before enabling the local APIC. */ + sanitize_IO_APIC(); + verify_local_APIC(); connect_bsp_APIC();
The current logic to init the local APIC and the IO-APIC does init the former first before doing any kind of sanitation on the IO-APIC pin configuration. It's already noted on enable_IO_APIC() that Xen shouldn't trust the IO-APIC being empty at bootup. At XenServer we have a system where the IO-APIC 0 is handed to Xen with pin 0 unmasked, set to Fixed delivery mode, edge triggered and with a vector of 0 (all fields of the RTE are zeroed). Once the local APIC is enabled periodic injections from such pin cause a storm of errors: APIC error on CPU0: 00(40), Received illegal vector APIC error on CPU0: 40(40), Received illegal vector APIC error on CPU0: 40(40), Received illegal vector APIC error on CPU0: 40(40), Received illegal vector APIC error on CPU0: 40(40), Received illegal vector APIC error on CPU0: 40(40), Received illegal vector That prevents Xen from booting. Fix this by moving the masking of IO-APIC pins ahead of the enabling of the local APIC. Note that before doing such masking Xen attempts to detect the pin where the legacy i8259 is connected, and that logic relies on the pin being unmasked, hence the logic is also moved ahead of enabling the local APIC. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- I've placed the sanitize_IO_APIC() declaration in irq.h with the rest of related IO-APIC setup functions declarations instead of placing it in io_apic.h. --- xen/arch/x86/apic.c | 4 ++++ xen/arch/x86/include/asm/irq.h | 1 + xen/arch/x86/io_apic.c | 4 +--- xen/arch/x86/smpboot.c | 4 ++++ 4 files changed, 10 insertions(+), 3 deletions(-)