Message ID | 20220421132114.35118-2-roger.pau@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/ioapic: fix edge triggered interrupt migration | expand |
On 21.04.2022 15:21, Roger Pau Monne wrote: > Allow disabling (masking) IO-APIC pins set to edge trigger mode. This > is required in order to safely migrate such interrupts between CPUs, > as the write to update the IO-APIC RTE (or the IRTE) is not done > atomically, For IRTE on VT-d we use cmpxchg16b if available (i.e. virtually always). > so there's a window where there's a mismatch between the > destination CPU and the vector: > > (XEN) CPU1: No irq handler for vector b5 (IRQ -11, LAPIC) > (XEN) IRQ10 a=0002[0002,0008] v=bd[b5] t=IO-APIC-edge s=00000030 I think this needs some further explanation, as we generally move edge IRQs only when an un-acked interrupt is in flight (and hence no further one can arrive). Jan > The main risk with masking edge triggered interrupts is losing them, > but getting them injected while in the process of updating the RTE is > equally harmful as the interrupts gets lost anyway. > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > xen/arch/x86/io_apic.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c > index c086f40f63..2e5964640b 100644 > --- a/xen/arch/x86/io_apic.c > +++ b/xen/arch/x86/io_apic.c > @@ -1782,7 +1782,7 @@ static hw_irq_controller ioapic_edge_type = { > .startup = startup_edge_ioapic_irq, > .shutdown = irq_shutdown_none, > .enable = unmask_IO_APIC_irq, > - .disable = irq_disable_none, > + .disable = mask_IO_APIC_irq, > .ack = ack_edge_ioapic_irq, > .set_affinity = set_ioapic_affinity_irq, > };
On Fri, Jun 03, 2022 at 03:19:34PM +0200, Jan Beulich wrote: > On 21.04.2022 15:21, Roger Pau Monne wrote: > > Allow disabling (masking) IO-APIC pins set to edge trigger mode. This > > is required in order to safely migrate such interrupts between CPUs, > > as the write to update the IO-APIC RTE (or the IRTE) is not done > > atomically, > > For IRTE on VT-d we use cmpxchg16b if available (i.e. virtually always). > > > so there's a window where there's a mismatch between the > > destination CPU and the vector: > > > > (XEN) CPU1: No irq handler for vector b5 (IRQ -11, LAPIC) > > (XEN) IRQ10 a=0002[0002,0008] v=bd[b5] t=IO-APIC-edge s=00000030 > > I think this needs some further explanation, as we generally move > edge IRQs only when an un-acked interrupt is in flight (and hence > no further one can arrive). A further one can arrive as soon as you modify either the vector or the destination fields of the IO-APIC RTE, as then the non-EOI'ed lapic vector is no longer there (because you have moved to a different destination or vector). This is the issue with updating the IO-APIC RTE using two separate writes: even when using interrupt remapping the IRTE cannot be atomically updated and there's a window where the interrupt is not masked, but the destination and vector fields are not in sync, because they reside in different parts of the RTE (destination is in the high 32bits, vector in the low 32bits part of the RTE). Thanks, Roger.
On 03.06.2022 16:53, Roger Pau Monné wrote: > On Fri, Jun 03, 2022 at 03:19:34PM +0200, Jan Beulich wrote: >> On 21.04.2022 15:21, Roger Pau Monne wrote: >>> Allow disabling (masking) IO-APIC pins set to edge trigger mode. This >>> is required in order to safely migrate such interrupts between CPUs, >>> as the write to update the IO-APIC RTE (or the IRTE) is not done >>> atomically, >> >> For IRTE on VT-d we use cmpxchg16b if available (i.e. virtually always). >> >>> so there's a window where there's a mismatch between the >>> destination CPU and the vector: >>> >>> (XEN) CPU1: No irq handler for vector b5 (IRQ -11, LAPIC) >>> (XEN) IRQ10 a=0002[0002,0008] v=bd[b5] t=IO-APIC-edge s=00000030 >> >> I think this needs some further explanation, as we generally move >> edge IRQs only when an un-acked interrupt is in flight (and hence >> no further one can arrive). > > A further one can arrive as soon as you modify either the vector or > the destination fields of the IO-APIC RTE, as then the non-EOI'ed > lapic vector is no longer there (because you have moved to a different > destination or vector). Right - this is what I'm asking you to spell out in the description. Jan > This is the issue with updating the IO-APIC RTE using two separate > writes: even when using interrupt remapping the IRTE cannot be > atomically updated and there's a window where the interrupt is not > masked, but the destination and vector fields are not in sync, because > they reside in different parts of the RTE (destination is in the high > 32bits, vector in the low 32bits part of the RTE). > > Thanks, Roger. >
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c index c086f40f63..2e5964640b 100644 --- a/xen/arch/x86/io_apic.c +++ b/xen/arch/x86/io_apic.c @@ -1782,7 +1782,7 @@ static hw_irq_controller ioapic_edge_type = { .startup = startup_edge_ioapic_irq, .shutdown = irq_shutdown_none, .enable = unmask_IO_APIC_irq, - .disable = irq_disable_none, + .disable = mask_IO_APIC_irq, .ack = ack_edge_ioapic_irq, .set_affinity = set_ioapic_affinity_irq, };
Allow disabling (masking) IO-APIC pins set to edge trigger mode. This is required in order to safely migrate such interrupts between CPUs, as the write to update the IO-APIC RTE (or the IRTE) is not done atomically, so there's a window where there's a mismatch between the destination CPU and the vector: (XEN) CPU1: No irq handler for vector b5 (IRQ -11, LAPIC) (XEN) IRQ10 a=0002[0002,0008] v=bd[b5] t=IO-APIC-edge s=00000030 The main risk with masking edge triggered interrupts is losing them, but getting them injected while in the process of updating the RTE is equally harmful as the interrupts gets lost anyway. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- xen/arch/x86/io_apic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)