diff mbox series

[v2,03/11] KVM: nSVM: explicitly raise KVM_REQ_EVENT on nested VM exit if L1 doesn't intercept interrupts

Message ID 20221129193717.513824-4-mlevitsk@redhat.com (mailing list archive)
State New, archived
Headers show
Series SVM: vNMI (with my fixes) | expand

Commit Message

Maxim Levitsky Nov. 29, 2022, 7:37 p.m. UTC
If the L2 doesn't intercept interrupts, then the KVM will use vmcb02's
V_IRQ for L1 (to detect the interrupt window)

In this case on the nested VM exit KVM might need to copy the V_IRQ bit
from the vmcb02 to the vmcb01, to continue waiting for the
interrupt window.

To make it simple, just raise the KVM_REQ_EVENT request, which
execution will lead to the reenabling of the interrupt
window if needed.

Note that this is a theoretical bug because the KVM already does raise
the KVM_REQ_EVENT request one each nested VM exit because the nested
VM exit resets RFLAGS and the kvm_set_rflags() raises the
KVM_REQ_EVENT request in the response.

However raising this request explicitly, together with
documenting why this is needed, is still preferred.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

Comments

Sean Christopherson Jan. 28, 2023, 12:56 a.m. UTC | #1
Shortlog is too long, maybe this?

  KVM: nSVM: Raise event on nested VM exit if L1 doesn't intercept IRQs

On Tue, Nov 29, 2022, Maxim Levitsky wrote:
> If the L2 doesn't intercept interrupts, then the KVM will use vmcb02's

s/the L2/L2, though don't you mean L1?

> V_IRQ for L1 (to detect the interrupt window)

"an interrupt window", i.e. there's not just one window.

> In this case on the nested VM exit KVM might need to copy the V_IRQ bit

s/the nested/nested

> from the vmcb02 to the vmcb01, to continue waiting for the
> interrupt window.
> 
> To make it simple, just raise the KVM_REQ_EVENT request, which
> execution will lead to the reenabling of the interrupt
> window if needed.
> 
> Note that this is a theoretical bug because the KVM already does raise

s/the KVM/KVM

> the KVM_REQ_EVENT request one each nested VM exit because the nested

s/the KVM_REQ_EVENT/KVM_REQ_EVENT, and s/one/on

> VM exit resets RFLAGS and the kvm_set_rflags() raises the
> KVM_REQ_EVENT request in the response.
> 
> However raising this request explicitly, together with
> documenting why this is needed, is still preferred.
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/svm/nested.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index aad3145b2f62fe..e891318595113e 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1016,6 +1016,31 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>  
>  	svm_switch_vmcb(svm, &svm->vmcb01);
>  
> +	/* Note about synchronizing some of int_ctl bits from vmcb02 to vmcb01:

	/*
	 * Preferred block comment style...

> +	 *
> +	 * - V_IRQ, V_IRQ_VECTOR, V_INTR_PRIO_MASK, V_IGN_TPR:

Drop the "-" to be consistent with the rest of the paragraphs.

> +	 * If the L2 doesn't intercept interrupts, then
> +	 * (even if the L2 does use virtual interrupt masking),

KVM uses "L2" to refer to the thing running at L2.  I think what you are referring
to here is vmcb12?  And that's controlled by L1.

> +	 * KVM will use the vmcb02's V_INTR to detect interrupt window.

s/the vmcb02/vmcb02

Which of the V_INTR fields does this refer to?  Oooh, you're saying the KVM injects
a virtual interrupt into L2 using vmcb02 in order to determine when L2 has IRQs
enabled.

Why does KVM do that?  Why not pend the actual IRQ directly?  

> +	 *
> +	 * In this case, the KVM raises the KVM_REQ_EVENT to ensure that interrupt window

s/the KVM_REQ_EVENT/KVM_REQ_EVENT

> +	 * is not lost and this implicitly copies these bits from vmcb02 to vmcb01

Too many pronouns.  What do "this" and "these bits" refer to?

> +	 *
> +	 * V_TPR:
> +	 * If the L2 doesn't use virtual interrupt masking, then the L1's vTPR
> +	 * is stored in the vmcb02 but its value doesn't need to be copied from/to
> +	 * vmcb01 because it is copied from/to the TPR APIC's register on
> +	 * each VM entry/exit.
> +	 *
> +	 * V_GIF:
> +	 * - If the nested vGIF is not used, KVM uses vmcb02's V_GIF for L1's V_GIF,

Drop this "-" too.

> +	 * however, the L1 vGIF is reset to false on each VM exit, thus
> +	 * there is no need to copy it from vmcb02 to vmcb01.
> +	 */
> +
> +	if (!nested_exit_on_intr(svm))
> +		kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
> +
>  	if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
>  		svm_copy_lbrs(vmcb12, vmcb02);
>  		svm_update_lbrv(vcpu);
> -- 
> 2.26.3
>
Sean Christopherson Jan. 30, 2023, 6:41 p.m. UTC | #2
On Sat, Jan 28, 2023, Sean Christopherson wrote:
> > +	 * If the L2 doesn't intercept interrupts, then
> > +	 * (even if the L2 does use virtual interrupt masking),
> 
> KVM uses "L2" to refer to the thing running at L2.  I think what you are referring
> to here is vmcb12?  And that's controlled by L1.
> 
> > +	 * KVM will use the vmcb02's V_INTR to detect interrupt window.
> 
> s/the vmcb02/vmcb02
> 
> Which of the V_INTR fields does this refer to?  Oooh, you're saying the KVM injects
> a virtual interrupt into L2 using vmcb02 in order to determine when L2 has IRQs
> enabled.
> 
> Why does KVM do that?  Why not pend the actual IRQ directly?

Duh, because KVM needs to gain control in if there are multiple pending events.
diff mbox series

Patch

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index aad3145b2f62fe..e891318595113e 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1016,6 +1016,31 @@  int nested_svm_vmexit(struct vcpu_svm *svm)
 
 	svm_switch_vmcb(svm, &svm->vmcb01);
 
+	/* Note about synchronizing some of int_ctl bits from vmcb02 to vmcb01:
+	 *
+	 * - V_IRQ, V_IRQ_VECTOR, V_INTR_PRIO_MASK, V_IGN_TPR:
+	 * If the L2 doesn't intercept interrupts, then
+	 * (even if the L2 does use virtual interrupt masking),
+	 * KVM will use the vmcb02's V_INTR to detect interrupt window.
+	 *
+	 * In this case, the KVM raises the KVM_REQ_EVENT to ensure that interrupt window
+	 * is not lost and this implicitly copies these bits from vmcb02 to vmcb01
+	 *
+	 * V_TPR:
+	 * If the L2 doesn't use virtual interrupt masking, then the L1's vTPR
+	 * is stored in the vmcb02 but its value doesn't need to be copied from/to
+	 * vmcb01 because it is copied from/to the TPR APIC's register on
+	 * each VM entry/exit.
+	 *
+	 * V_GIF:
+	 * - If the nested vGIF is not used, KVM uses vmcb02's V_GIF for L1's V_GIF,
+	 * however, the L1 vGIF is reset to false on each VM exit, thus
+	 * there is no need to copy it from vmcb02 to vmcb01.
+	 */
+
+	if (!nested_exit_on_intr(svm))
+		kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
+
 	if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
 		svm_copy_lbrs(vmcb12, vmcb02);
 		svm_update_lbrv(vcpu);