diff mbox

[v5,3/4] KVM: x86: Add EOI exit bitmap inference

Message ID 1438039062-3168-3-git-send-email-srutherford@google.com (mailing list archive)
State New, archived

Commit Message

Steve Rutherford July 27, 2015, 11:17 p.m. UTC
To support a userspace IOAPIC interacting with an in-kernel APIC, the
EOI exit bitmaps need to be configurable.

If the IOAPIC is in userspace (i.e. the irqchip has been split), the
EOI exit bitmaps will be set whenever the GSI routes are configured.
In particular, the low MSI routes are reservable for userspace
IOAPICs. For these MSI routes, the EOI exit bit corresponding to the
destination vector of the route will be set for the destination VCPU.
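
As a rough illustration of the inference described above, the following
standalone sketch decodes the destination ID, destination mode and vector
from an MSI route using the same shifts and masks as the patch, then sets
the matching bit in a 256-bit bitmap. The names struct msi_route,
eoi_bitmap_set() and scan_routes() are hypothetical and do not exist in
the kernel, and the destination match is deliberately simplified to
physical mode with an exact APIC ID comparison, whereas the patch relies
on kvm_apic_match_dest().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one reserved MSI route; not a kernel type. */
struct msi_route {
	uint32_t address_lo;	/* low 32 bits of the MSI address */
	uint32_t data;		/* MSI data word */
};

/* Field extraction mirrors the shifts and masks used in the patch. */
static uint32_t msi_dest_id(const struct msi_route *r)
{
	return (r->address_lo >> 12) & 0xff;
}

static uint32_t msi_dest_mode(const struct msi_route *r)
{
	return (r->address_lo >> 2) & 0x1;
}

static uint32_t msi_vector(const struct msi_route *r)
{
	return r->data & 0xff;
}

/* Set the bit for one vector in a 256-bit bitmap held as four 64-bit words. */
static void eoi_bitmap_set(uint64_t bitmap[4], uint32_t vector)
{
	assert(vector < 256);
	bitmap[vector / 64] |= (uint64_t)1 << (vector % 64);
}

/*
 * Assumption: only physical destination mode (dest_mode == 0) with an exact
 * APIC ID match is modelled here. The patch delegates matching to
 * kvm_apic_match_dest(), which covers more cases.
 */
static void scan_routes(const struct msi_route *routes, size_t nr_routes,
			uint32_t apic_id, uint64_t bitmap[4])
{
	size_t i;

	for (i = 0; i < nr_routes; i++) {
		if (msi_dest_mode(&routes[i]) != 0)
			continue;
		if (msi_dest_id(&routes[i]) != apic_id)
			continue;
		eoi_bitmap_set(bitmap, msi_vector(&routes[i]));
	}
}
```

A caller would zero the four-word bitmap for a given VCPU, run
scan_routes() over the reserved routes, and hand the result to whatever
loads the EOI exit bitmap for that VCPU.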

The intention is for the userspace IOAPICs to use the reservable MSI
routes to inject interrupts into the guest.

This is a slight abuse of the notion of an MSI Route, given that MSIs
classically bypass the IOAPIC. It might be worthwhile to add an
additional route type to improve clarity.

Compile tested for Intel x86.

Signed-off-by: Steve Rutherford <srutherford@google.com>
---
 Documentation/virtual/kvm/api.txt |  8 ++++----
 arch/x86/include/asm/kvm_host.h   |  1 +
 arch/x86/kvm/ioapic.h             |  2 ++
 arch/x86/kvm/irq_comm.c           | 42 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/lapic.c              |  3 +--
 arch/x86/kvm/x86.c                | 29 +++++++++++++++++----------
 include/linux/kvm_host.h          | 20 +++++++++++++++++++
 virt/kvm/irqchip.c                | 12 ++---------
 8 files changed, 91 insertions(+), 26 deletions(-)

Comments

Paolo Bonzini July 29, 2015, 12:38 p.m. UTC | #1
On 28/07/2015 01:17, Steve Rutherford wrote:
> diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
> index d8cc54b..f6ce112 100644
> --- a/arch/x86/kvm/ioapic.h
> +++ b/arch/x86/kvm/ioapic.h
> @@ -9,6 +9,7 @@ struct kvm;
>  struct kvm_vcpu;
>  
>  #define IOAPIC_NUM_PINS  KVM_IOAPIC_NUM_PINS
> +#define MAX_NR_RESERVED_IOAPIC_PINS 48

Why is this needed?

Paolo

>  #define IOAPIC_VERSION_ID 0x11	/* IOAPIC version */
>  #define IOAPIC_EDGE_TRIG  0
>  #define IOAPIC_LEVEL_TRIG 1
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steve Rutherford July 29, 2015, 8:27 p.m. UTC | #2
On Wed, Jul 29, 2015 at 02:38:09PM +0200, Paolo Bonzini wrote:
> 
> 
> On 28/07/2015 01:17, Steve Rutherford wrote:
> > diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
> > index d8cc54b..f6ce112 100644
> > --- a/arch/x86/kvm/ioapic.h
> > +++ b/arch/x86/kvm/ioapic.h
> > @@ -9,6 +9,7 @@ struct kvm;
> >  struct kvm_vcpu;
> >  
> >  #define IOAPIC_NUM_PINS  KVM_IOAPIC_NUM_PINS
> > +#define MAX_NR_RESERVED_IOAPIC_PINS 48
> 
> Why is this needed?
This constant is used to bound the number of IOAPIC pins that are
reservable when enabling KVM_CAP_SPLIT_IRQCHIP. IIRC, x86 doesn't
support more than 2 IOAPICs.  

> 
> Paolo
> 
> >  #define IOAPIC_VERSION_ID 0x11	/* IOAPIC version */
> >  #define IOAPIC_EDGE_TRIG  0
> >  #define IOAPIC_LEVEL_TRIG 1
Jan Kiszka July 30, 2015, 6:23 a.m. UTC | #3
On 2015-07-29 22:27, Steve Rutherford wrote:
> On Wed, Jul 29, 2015 at 02:38:09PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 28/07/2015 01:17, Steve Rutherford wrote:
>>> diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
>>> index d8cc54b..f6ce112 100644
>>> --- a/arch/x86/kvm/ioapic.h
>>> +++ b/arch/x86/kvm/ioapic.h
>>> @@ -9,6 +9,7 @@ struct kvm;
>>>  struct kvm_vcpu;
>>>  
>>>  #define IOAPIC_NUM_PINS  KVM_IOAPIC_NUM_PINS
>>> +#define MAX_NR_RESERVED_IOAPIC_PINS 48
>>
>> Why is this needed?
> This constant is used to bound the number of IOAPIC pins that are
> reservable when enabling KVM_CAP_SPLIT_IRQCHIP. IIRC, x86 doesn't
> support more than 2 IOAPICs.  

Huh? Surely not. I've already seen boxes with at least three, and I
think you can even hot-plug them today via extension cards. Not saying
that QEMU supports that already, even without KVM, but we must not limit
ourselves in the kernel API.

So please remove such a static limit on how many IOAPICs userspace can
emulate or raise it to something sufficiently large that will last long
enough.

Jan
Steve Rutherford July 30, 2015, 6:27 a.m. UTC | #4
On Thu, Jul 30, 2015 at 08:23:43AM +0200, Jan Kiszka wrote:
> On 2015-07-29 22:27, Steve Rutherford wrote:
> > On Wed, Jul 29, 2015 at 02:38:09PM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 28/07/2015 01:17, Steve Rutherford wrote:
> >>> diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
> >>> index d8cc54b..f6ce112 100644
> >>> --- a/arch/x86/kvm/ioapic.h
> >>> +++ b/arch/x86/kvm/ioapic.h
> >>> @@ -9,6 +9,7 @@ struct kvm;
> >>>  struct kvm_vcpu;
> >>>  
> >>>  #define IOAPIC_NUM_PINS  KVM_IOAPIC_NUM_PINS
> >>> +#define MAX_NR_RESERVED_IOAPIC_PINS 48
> >>
> >> Why is this needed?
> > This constant is used to bound the number of IOAPIC pins that are
> > reservable when enabling KVM_CAP_SPLIT_IRQCHIP. IIRC, x86 doesn't
> > support more than 2 IOAPICs.  
> 
> Huh? Surely not. I've already seen boxes with at least three, and I
> think you can even hot-plug them today via extension cards. Not saying
> that QEMU supports that already, even without KVM, but we must not limit
> ourselves in the kernel API.
> 
> So please remove such a static limit on how many IOAPICs userspace can
> emulate or raise it to something sufficiently large that will last long
> enough.
I'll go with the latter. I'll set it to the maximum size of the GSI
routing table, which is necessarily an upper bound on it.
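
For reference, a minimal sketch of how the limit under discussion is
applied in this version of the patch: the requested pin count is rejected
if it exceeds the static maximum, the split is only performed once and
only before any VCPU is online, and the scan later clamps the pin count
to the routing table size. struct vm_state, enable_split_irqchip() and
pins_to_scan() are hypothetical names for illustration; the value 48 is
the v5 limit and is expected to change. The irqchip_in_kernel() check and
the empty routing setup from the real ioctl path are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RESERVED_PINS 48	/* static limit used by v5 of the patch */

/* Hypothetical, pared-down VM state; not the kernel's struct kvm. */
struct vm_state {
	bool irqchip_split;
	unsigned int online_vcpus;
	uint8_t nr_reserved_ioapic_pins;
};

/* Models the argument check and state update in the enable-cap path. */
static int enable_split_irqchip(struct vm_state *vm, uint64_t requested_pins)
{
	assert(vm != NULL);

	if (requested_pins > MAX_RESERVED_PINS)
		return -EINVAL;

	if (!vm->irqchip_split) {
		/* The split itself is only allowed before VCPUs exist. */
		if (vm->online_vcpus)
			return -EINVAL;
		vm->irqchip_split = true;
	}

	/* Later calls only update the reserved pin count. */
	vm->nr_reserved_ioapic_pins = (uint8_t)requested_pins;
	return 0;
}

/* The scan never walks past the end of the routing table. */
static uint32_t pins_to_scan(uint32_t nr_rt_entries, uint8_t nr_reserved)
{
	return nr_rt_entries < nr_reserved ? nr_rt_entries : nr_reserved;
}
```

Raising the limit to the routing table's maximum size would make the
first check the only place where the bound matters, since pins_to_scan()
already clamps against the current table.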

> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT RTC ITP SES-DE
> Corporate Competence Center Embedded Linux
Patch

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 6a13dff..39e4c02 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3302,10 +3302,10 @@  Valid values for 'type' are:
    to ignore the request, or to gather VM memory core dump and/or
    reset/shutdown of the VM.
 
-	/* KVM_EXIT_IOAPIC_EOI */
-        struct {
-	       __u8 vector;
-        } eoi;
+		/* KVM_EXIT_IOAPIC_EOI */
+		struct {
+			__u8 vector;
+		} eoi;
 
 Indicates that the VCPU's in-kernel local APIC received an EOI for a
 level-triggered IOAPIC interrupt.  This exit only triggers when the
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f1e0103..ebe7f07 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -674,6 +674,7 @@  struct kvm_arch {
 	u64 disabled_quirks;
 
 	bool irqchip_split;
+	u8 nr_reserved_ioapic_pins;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index d8cc54b..f6ce112 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -9,6 +9,7 @@  struct kvm;
 struct kvm_vcpu;
 
 #define IOAPIC_NUM_PINS  KVM_IOAPIC_NUM_PINS
+#define MAX_NR_RESERVED_IOAPIC_PINS 48
 #define IOAPIC_VERSION_ID 0x11	/* IOAPIC version */
 #define IOAPIC_EDGE_TRIG  0
 #define IOAPIC_LEVEL_TRIG 1
@@ -132,4 +133,5 @@  int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
 void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap,
 			u32 *tmr);
 
+void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
 #endif
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 67f6b62..da4827f 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -335,3 +335,45 @@  int kvm_setup_empty_irq_routing(struct kvm *kvm)
 {
 	return kvm_set_irq_routing(kvm, empty_routing, 0, 0);
 }
+
+void kvm_arch_irq_routing_update(struct kvm *kvm)
+{
+	if (ioapic_in_kernel(kvm) || !irqchip_in_kernel(kvm))
+		return;
+	kvm_make_scan_ioapic_request(kvm);
+}
+
+void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
+{
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_kernel_irq_routing_entry *entry;
+	struct kvm_irq_routing_table *table;
+	u32 i, nr_ioapic_pins;
+	int idx;
+
+	/* kvm->irq_routing must be read after clearing
+	 * KVM_SCAN_IOAPIC. */
+	smp_mb();
+	idx = srcu_read_lock(&kvm->irq_srcu);
+	table = kvm->irq_routing;
+	nr_ioapic_pins = min_t(u32, table->nr_rt_entries,
+			       kvm->arch.nr_reserved_ioapic_pins);
+	for (i = 0; i < nr_ioapic_pins; ++i) {
+		hlist_for_each_entry(entry, &table->map[i], link) {
+			u32 dest_id, dest_mode;
+
+			if (entry->type != KVM_IRQ_ROUTING_MSI)
+				continue;
+			dest_id = (entry->msi.address_lo >> 12) & 0xff;
+			dest_mode = (entry->msi.address_lo >> 2) & 0x1;
+			if (kvm_apic_match_dest(vcpu, NULL, 0, dest_id,
+						dest_mode)) {
+				u32 vector = entry->msi.data & 0xff;
+
+				__set_bit(vector,
+					  (unsigned long *) eoi_exit_bitmap);
+			}
+		}
+	}
+	srcu_read_unlock(&kvm->irq_srcu, idx);
+}
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 37e220d..4dbf6c1 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -209,8 +209,7 @@  out:
 	if (old)
 		kfree_rcu(old, rcu);
 
-	if (!irqchip_split(kvm))
-		kvm_vcpu_request_scan_ioapic(kvm);
+	kvm_make_scan_ioapic_request(kvm);
 }
 
 static inline void apic_set_spiv(struct kvm_lapic *apic, u32 val)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 03ba33a..eef562f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3575,12 +3575,17 @@  static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		if (irqchip_in_kernel(kvm))
 			goto split_irqchip_unlock;
 		r = -EINVAL;
-		if (atomic_read(&kvm->online_vcpus))
-			goto split_irqchip_unlock;
-		r = kvm_setup_empty_irq_routing(kvm);
-		if (r)
+		if (cap->args[0] > MAX_NR_RESERVED_IOAPIC_PINS)
 			goto split_irqchip_unlock;
-		kvm->arch.irqchip_split = true;
+		if (!irqchip_split(kvm)) {
+			if (atomic_read(&kvm->online_vcpus))
+				goto split_irqchip_unlock;
+			r = kvm_setup_empty_irq_routing(kvm);
+			if (r)
+				goto split_irqchip_unlock;
+			kvm->arch.irqchip_split = true;
+		}
+		kvm->arch.nr_reserved_ioapic_pins = cap->args[0];
 		r = 0;
 split_irqchip_unlock:
 		mutex_unlock(&kvm->lock);
@@ -6164,18 +6169,22 @@  static void process_smi(struct kvm_vcpu *vcpu)
 
 static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 {
-	u64 eoi_exit_bitmap[4];
+	struct kvm *kvm = vcpu->kvm;
 	u32 tmr[8];
 
 	if (!kvm_apic_hw_enabled(vcpu->arch.apic))
 		return;
 
-	memset(eoi_exit_bitmap, 0, 32);
+	memset(vcpu->arch.eoi_exit_bitmaps, 0, 32);
 	memset(tmr, 0, 32);
+	if (irqchip_split(kvm))
+		kvm_scan_ioapic_routes(vcpu, vcpu->arch.eoi_exit_bitmaps);
+	else
+		kvm_ioapic_scan_entry(vcpu, vcpu->arch.eoi_exit_bitmaps, tmr);
+	kvm_x86_ops->load_eoi_exitmap(vcpu, vcpu->arch.eoi_exit_bitmaps);
 
-	kvm_ioapic_scan_entry(vcpu, eoi_exit_bitmap, tmr);
-	kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
-	kvm_apic_update_tmr(vcpu, tmr);
+	if (!irqchip_split(kvm))
+		kvm_apic_update_tmr(vcpu, tmr);
 }
 
 static void kvm_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8e12d67..064067e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -329,6 +329,17 @@  struct kvm_kernel_irq_routing_entry {
 	struct hlist_node link;
 };
 
+struct kvm_irq_routing_table {
+	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
+	struct kvm_kernel_irq_routing_entry *rt_entries;
+	u32 nr_rt_entries;
+	/*
+	 * Array indexed by gsi. Each entry contains list of irq chips
+	 * the gsi is connected to.
+	 */
+	struct hlist_head map[0];
+};
+
 #ifndef KVM_PRIVATE_MEM_SLOTS
 #define KVM_PRIVATE_MEM_SLOTS 0
 #endif
@@ -454,10 +465,19 @@  void vcpu_put(struct kvm_vcpu *vcpu);
 
 #ifdef __KVM_HAVE_IOAPIC
 void kvm_vcpu_request_scan_ioapic(struct kvm *kvm);
+void kvm_arch_irq_routing_update(struct kvm *kvm);
+u8 kvm_arch_nr_userspace_ioapic_pins(struct kvm *kvm);
 #else
 static inline void kvm_vcpu_request_scan_ioapic(struct kvm *kvm)
 {
 }
+static inline void kvm_arch_irq_routing_update(struct kvm *kvm)
+{
+}
+static inline u8 kvm_arch_nr_userspace_ioapic_pins(struct kvm *kvm)
+{
+	return 0;
+}
 #endif
 
 #ifdef CONFIG_HAVE_KVM_IRQFD
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index 21c1424..4f85c6e 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -31,16 +31,6 @@ 
 #include <trace/events/kvm.h>
 #include "irq.h"
 
-struct kvm_irq_routing_table {
-	int chip[KVM_NR_IRQCHIPS][KVM_IRQCHIP_NUM_PINS];
-	u32 nr_rt_entries;
-	/*
-	 * Array indexed by gsi. Each entry contains list of irq chips
-	 * the gsi is connected to.
-	 */
-	struct hlist_head map[0];
-};
-
 int kvm_irq_map_gsi(struct kvm *kvm,
 		    struct kvm_kernel_irq_routing_entry *entries, int gsi)
 {
@@ -227,6 +217,8 @@  int kvm_set_irq_routing(struct kvm *kvm,
 	kvm_irq_routing_update(kvm);
 	mutex_unlock(&kvm->irq_lock);
 
+	kvm_arch_irq_routing_update(kvm);
+
 	synchronize_srcu_expedited(&kvm->irq_srcu);
 
 	new = old;