Message ID | 20140228040617.GA22103@crash.ini.cmu.edu (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Feb 27, 2014 at 11:06:17PM -0500, Gabriel L. Somlo wrote: > Both QEMU and KVM have already accumulated a significant number of > optimizations based on the hard-coded assumption that ioapic polarity > will always use the ActiveHigh convention, where the logical and > physical states of level-triggered irq lines always match (i.e., > active(asserted) == high == 1, inactive == low == 0). QEMU guests > are expected to follow directions given via ACPI and configure the > ioapic with polarity 0 (ActiveHigh). However, even when misbehaving > guests (e.g. OS X <= 10.9) set the ioapic polarity to 1 (ActiveLow), > QEMU will still use the ActiveHigh signaling convention when > interfacing with KVM. > > This patch modifies KVM to completely ignore ioapic polarity as set by > the guest OS, enabling misbehaving guests to work alongside those which > comply with the ActiveHigh polarity specified by QEMU's ACPI tables. > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com> > Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu> > --- > > On Fri, Feb 28, 2014 at 12:31:27AM +0100, Paolo Bonzini wrote: > >>>This is a much better description. Can you turn it into a patch to > >>>Documentation/virtual/kvm/api.txt and a more complete commit > >>>message? > > OK, let me know what you all think of this version. Looks good to me. Acked-by: Michael S. Tsirkin <mst@redhat.com> > > If you change ACPI tables to ActiveLow, and leave the polarity > > inversion in hw/intc/ioapic.c, then it will matter. > > I only changed ACPI to ActiveLow to cause Linux to misbehave in the > same way OS X does, it was a temporary/test hack. ACPI should stay > ActiveHigh to reflect the rest of QEMU/KVM, and misbehaving guest OSs > should ignore it at their own peril :) I actually think now that you've fixed intc.c as well, we should change it as the next step. > > >(Hmmm, maybe this one of the reasons I never got OS X to boot without > > >-enable-kvm... I should look at the QEMU hw/intc/ioapic*.c, and see > > >if *it* cares about guest-configured polarity, and maybe get it to > > >*stop* caring :) > > > Exactly my point. :) > > So this patch gets the accelerated path to work regardless of how > well-behaved a guest OS may or may not be w.r.t. ACPI and polarity. > I'll try to find out whether it's possible to be *extra* nice to these > types of guests by also allowing them to ignore ACPI polarity when not > using acceleration :) > > Thanks again, > Gabriel > > Documentation/virtual/kvm/api.txt | 18 ++++++++++++++++++ > include/uapi/linux/kvm.h | 1 + > virt/kvm/ioapic.c | 1 - > 3 files changed, 19 insertions(+), 1 deletion(-) > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt > index 6cd63a9..0492a94 100644 > --- a/Documentation/virtual/kvm/api.txt > +++ b/Documentation/virtual/kvm/api.txt > @@ -1365,6 +1365,24 @@ struct kvm_irq_routing_msi { > __u32 pad; > }; > > +NOTE: For each level-triggered interrupt managed by a virtual ioapic, > +the guest OS may set a polarity value (bit 13 of each corresponding I/O > +redirection table register). The polarity bit defines the relationship > +between an irq line's logical state (active/asserted = 1, inactive = 0) > +and its physical state (high = 1, low = 0). When the polarity bit is 0 > +(ActiveHigh), logical and physical states are matched (active == high == 1, > +inactive == low == 0). When polarity is set to 1 (ActiveLow), logical and > +physical state are opposites (logical == !physical). Typically, guests are > +expected to use the same polarity across all level-triggered interrupts, > +as directed by ACPI. Historically, both KVM and QEMU have accumulated a > +significant number of optimizations based on the hard-coded assumption > +that polarity will always be set to ActiveHigh. When interfacing with KVM, > +QEMU will use the ActiveHigh convention for all level-triggered irqs, > +regardless of how the guest OS has configured the polarity bits in the > +ioapic registers. As such, KVM must also ignore these bits, and always act > +as if the logical and physical states of an irq line exactly match each > +other (i.e., follow the ActiveHigh convention regardless of polarity). > + > > 4.53 KVM_ASSIGN_SET_MSIX_NR > > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > index 932d7f2..5bfc0e6 100644 > --- a/include/uapi/linux/kvm.h > +++ b/include/uapi/linux/kvm.h > @@ -675,6 +675,7 @@ struct kvm_ppc_smmu_info { > #define KVM_CAP_SPAPR_MULTITCE 94 > #define KVM_CAP_EXT_EMUL_CPUID 95 > #define KVM_CAP_HYPERV_TIME 96 > +#define KVM_CAP_X86_IOAPIC_POLARITY_IGNORED 97 > > #ifdef KVM_CAP_IRQ_ROUTING > > diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c > index ce9ed99..1539d37 100644 > --- a/virt/kvm/ioapic.c > +++ b/virt/kvm/ioapic.c > @@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id, > irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq], > irq_source_id, level); > entry = ioapic->redirtbl[irq]; > - irq_level ^= entry.fields.polarity; > if (!irq_level) { > ioapic->irr &= ~mask; > ret = 1; > -- > 1.8.1.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Il 28/02/2014 05:06, Gabriel L. Somlo ha scritto: > +NOTE: For each level-triggered interrupt managed by a virtual ioapic, > +the guest OS may set a polarity value (bit 13 of each corresponding I/O > +redirection table register). The polarity bit defines the relationship > +between an irq line's logical state (active/asserted = 1, inactive = 0) > +and its physical state (high = 1, low = 0). When the polarity bit is 0 > +(ActiveHigh), logical and physical states are matched (active == high == 1, > +inactive == low == 0). When polarity is set to 1 (ActiveLow), logical and > +physical state are opposites (logical == !physical). Typically, guests are > +expected to use the same polarity across all level-triggered interrupts, > +as directed by ACPI. Historically, both KVM and QEMU have accumulated a > +significant number of optimizations based on the hard-coded assumption > +that polarity will always be set to ActiveHigh. When interfacing with KVM, > +QEMU will use the ActiveHigh convention for all level-triggered irqs, > +regardless of how the guest OS has configured the polarity bits in the > +ioapic registers. As such, KVM must also ignore these bits, and always act > +as if the logical and physical states of an irq line exactly match each > +other (i.e., follow the ActiveHigh convention regardless of polarity). > + Instead of this, I'm adding the following to the KVM_IRQ_LINE ioctl: +On real hardware, interrupt pins can be active-low or active-high. This +does not matter for the level field of struct kvm_irq_level: 1 always +means active (asserted), 0 means inactive (deasserted). + +x86 allows the operating system to program the interrupt polarity +(active-low/active-high) for level-triggered interrupts, and KVM used +to consider the polarity. However, due to bitrot in the handling of +active-low interrupts, the above convention is now valid on x86 too. +This is signaled by KVM_CAP_X86_IOAPIC_POLARITY_IGNORED. Userspace +should not present interrupts to the guest as active-low unless this +capability is present (or unless it is not using the in-kernel irqchip, +of course). and applying the patch to kvm/queue. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 13, 2014 at 11:53:04AM +0100, Paolo Bonzini wrote: > Instead of this, I'm adding the following to the KVM_IRQ_LINE ioctl: > > +On real hardware, interrupt pins can be active-low or active-high. This > +does not matter for the level field of struct kvm_irq_level: 1 always > +means active (asserted), 0 means inactive (deasserted). > + > +x86 allows the operating system to program the interrupt polarity > +(active-low/active-high) for level-triggered interrupts, and KVM used > +to consider the polarity. However, due to bitrot in the handling of > +active-low interrupts, the above convention is now valid on x86 too. > +This is signaled by KVM_CAP_X86_IOAPIC_POLARITY_IGNORED. Userspace > +should not present interrupts to the guest as active-low unless this > +capability is present (or unless it is not using the in-kernel irqchip, > +of course). > > and applying the patch to kvm/queue. Sounds great to me, thanks ! --Gabriel -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 6cd63a9..0492a94 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1365,6 +1365,24 @@ struct kvm_irq_routing_msi { __u32 pad; }; +NOTE: For each level-triggered interrupt managed by a virtual ioapic, +the guest OS may set a polarity value (bit 13 of each corresponding I/O +redirection table register). The polarity bit defines the relationship +between an irq line's logical state (active/asserted = 1, inactive = 0) +and its physical state (high = 1, low = 0). When the polarity bit is 0 +(ActiveHigh), logical and physical states are matched (active == high == 1, +inactive == low == 0). When polarity is set to 1 (ActiveLow), logical and +physical state are opposites (logical == !physical). Typically, guests are +expected to use the same polarity across all level-triggered interrupts, +as directed by ACPI. Historically, both KVM and QEMU have accumulated a +significant number of optimizations based on the hard-coded assumption +that polarity will always be set to ActiveHigh. When interfacing with KVM, +QEMU will use the ActiveHigh convention for all level-triggered irqs, +regardless of how the guest OS has configured the polarity bits in the +ioapic registers. As such, KVM must also ignore these bits, and always act +as if the logical and physical states of an irq line exactly match each +other (i.e., follow the ActiveHigh convention regardless of polarity). + 4.53 KVM_ASSIGN_SET_MSIX_NR diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 932d7f2..5bfc0e6 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -675,6 +675,7 @@ struct kvm_ppc_smmu_info { #define KVM_CAP_SPAPR_MULTITCE 94 #define KVM_CAP_EXT_EMUL_CPUID 95 #define KVM_CAP_HYPERV_TIME 96 +#define KVM_CAP_X86_IOAPIC_POLARITY_IGNORED 97 #ifdef KVM_CAP_IRQ_ROUTING diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c index ce9ed99..1539d37 100644 --- a/virt/kvm/ioapic.c +++ b/virt/kvm/ioapic.c @@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id, irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq], irq_source_id, level); entry = ioapic->redirtbl[irq]; - irq_level ^= entry.fields.polarity; if (!irq_level) { ioapic->irr &= ~mask; ret = 1;