From patchwork Thu Mar 10 21:38:37 2022
From: "Maciej S. Szmigiero"
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, Tom Lendacky, Brijesh Singh, Jon Grimm, David Kaplan,
    Boris Ostrovsky, Liam Merwick, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH 1/5] KVM: nSVM: Sync next_rip field from vmcb12 to vmcb02
Date: Thu, 10 Mar 2022 22:38:37 +0100
Message-Id: <19c757487eeeff5344ff3684fe9c090235b07d05.1646944472.git.maciej.szmigiero@oracle.com>

From: "Maciej S. Szmigiero"

The next_rip field of a VMCB is *not* an output-only field for a VMRUN.
This field value (instead of the saved guest RIP) is used by the CPU
for the return address pushed on the stack when injecting a software
interrupt or an INT3 or INTO exception.

Make sure this field gets synced from vmcb12 to vmcb02 when entering L2
or loading a nested state.
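[Editorial aside: a minimal, hypothetical L1-side sketch of the semantics
being fixed. Vector 0x20 and the return-address variable are made up; the
field and flag names are the real ones from arch/x86/include/asm/svm.h.
With nRIP Save, the CPU pushes next_rip, not the saved RIP, so the value
L1 writes into vmcb12 has to reach vmcb02:

	/* L1 injects software interrupt 0x20 into its L2 guest */
	vmcb12->control.event_inj = 0x20 | SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_SOFT;
	/* Return address the CPU pushes on the L2 stack (hypothetical label) */
	vmcb12->control.next_rip = l2_rip_after_int;
]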
Signed-off-by: Maciej S. Szmigiero
---
 arch/x86/kvm/svm/nested.c | 4 ++++
 arch/x86/kvm/svm/svm.h    | 1 +
 2 files changed, 5 insertions(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index d736ec6514ca..9656f0d6815c 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -366,6 +366,7 @@ void __nested_copy_vmcb_control_to_cache(struct kvm_vcpu *vcpu,
 	to->nested_ctl = from->nested_ctl;
 	to->event_inj = from->event_inj;
 	to->event_inj_err = from->event_inj_err;
+	to->next_rip = from->next_rip;
 	to->nested_cr3 = from->nested_cr3;
 	to->virt_ext = from->virt_ext;
 	to->pause_filter_count = from->pause_filter_count;
@@ -638,6 +639,8 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm)
 	svm->vmcb->control.int_state = svm->nested.ctl.int_state;
 	svm->vmcb->control.event_inj = svm->nested.ctl.event_inj;
 	svm->vmcb->control.event_inj_err = svm->nested.ctl.event_inj_err;
+	/* The return address pushed on stack by the CPU for some injected events */
+	svm->vmcb->control.next_rip = svm->nested.ctl.next_rip;
 
 	if (!nested_vmcb_needs_vls_intercept(svm))
 		svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK;
@@ -1348,6 +1351,7 @@ static void nested_copy_vmcb_cache_to_control(struct vmcb_control_area *dst,
 	dst->nested_ctl = from->nested_ctl;
 	dst->event_inj = from->event_inj;
 	dst->event_inj_err = from->event_inj_err;
+	dst->next_rip = from->next_rip;
 	dst->nested_cr3 = from->nested_cr3;
 	dst->virt_ext = from->virt_ext;
 	dst->pause_filter_count = from->pause_filter_count;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 93502d2a52ce..f757400fc933 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -138,6 +138,7 @@ struct vmcb_ctrl_area_cached {
 	u64 nested_ctl;
 	u32 event_inj;
 	u32 event_inj_err;
+	u64 next_rip;
 	u64 nested_cr3;
 	u64 virt_ext;
 	u32 clean;
From patchwork Thu Mar 10 21:38:38 2022
From: "Maciej S. Szmigiero"
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, Tom Lendacky, Brijesh Singh, Jon Grimm, David Kaplan,
    Boris Ostrovsky, Liam Merwick, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH 2/5] KVM: SVM: Downgrade BUG_ON() to WARN_ON() in svm_inject_irq()
Date: Thu, 10 Mar 2022 22:38:38 +0100
Message-Id: <3f8422d9185477148e53440a4c6d66acbf387f65.1646944472.git.maciej.szmigiero@oracle.com>

From: "Maciej S. Szmigiero"

There is no need to bring down the whole host just because there might
be some issue with respect to guest GIF handling in KVM.

Signed-off-by: Maciej S. Szmigiero
Reviewed-by: Maxim Levitsky
---
 arch/x86/kvm/svm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b069493ad5c7..1e5d904aeec3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3322,7 +3322,7 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	BUG_ON(!(gif_set(svm)));
+	WARN_ON(!gif_set(svm));
 
 	trace_kvm_inj_virq(vcpu->arch.interrupt.nr);
 	++vcpu->stat.irq_injections;
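[Editorial aside: a hedged illustration of the failure-mode difference,
not part of the patch itself:

	BUG_ON(!gif_set(svm));	/* condition true -> kernel panic, whole host (and all guests) down */
	WARN_ON(!gif_set(svm));	/* condition true -> stack trace in dmesg, execution continues */
]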
From patchwork Thu Mar 10 21:38:39 2022
From: "Maciej S. Szmigiero"
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, Tom Lendacky, Brijesh Singh, Jon Grimm, David Kaplan,
    Boris Ostrovsky, Liam Merwick, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH 3/5] KVM: nSVM: Don't forget about L1-injected events
Date: Thu, 10 Mar 2022 22:38:39 +0100

From: "Maciej S. Szmigiero"

In SVM, synthetic software interrupts and INT3 or INTO exceptions that
L1 wants to inject into its L2 guest are forgotten if there is an
intervening L0 VMEXIT during their delivery (with VMX, by contrast,
they are re-injected correctly).

This is because SVM code assumes that such exceptions will be
re-delivered by simply re-executing the current instruction, which is
not true for a synthetic exception injected by L1: in that case the
re-executed instruction will be one already in L2, not the VMRUN
instruction in L1 that attempted the injection.

Leave the pending L1 -> L2 event in svm->nested.ctl.event_inj{,err}
until it is either re-injected successfully or returned to L1 upon a
nested VMEXIT, and make sure to always re-queue such an event if it is
returned in EXITINTINFO.

The handling of L0 -> {L1, L2} event re-injection is left as-is to
avoid unforeseen regressions.
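[Editorial aside: an illustrative sketch (vector 0x20 is hypothetical;
the constants come from arch/x86/include/asm/svm.h). The approach relies
on EVTINJ and EXITINTINFO sharing one field layout, so a pending
injection can be compared bit-for-bit against what the CPU reports on
VMEXIT:

	/* What L1 asked to inject (a software interrupt, vector 0x20)... */
	u32 event_inj     = 0x20 | SVM_EVTINJ_TYPE_SOFT | SVM_EVTINJ_VALID;
	/* ...and what the CPU reports if an L0 VMEXIT interrupts its delivery */
	u32 exit_int_info = 0x20 | SVM_EXITINTINFO_TYPE_SOFT | SVM_EXITINTINFO_VALID;
	/* Identical encodings, as the BUILD_BUG_ON() in
	 * exit_during_event_injection() below verifies */
]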
Signed-off-by: Maciej S. Szmigiero
---
 arch/x86/kvm/svm/nested.c | 65 +++++++++++++++++++++++++++++++++++++--
 arch/x86/kvm/svm/svm.c    | 17 ++++++++--
 arch/x86/kvm/svm/svm.h    | 47 ++++++++++++++++++++++++++++
 3 files changed, 125 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 9656f0d6815c..75017bf77955 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -420,8 +420,17 @@ void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm,
 void nested_sync_control_from_vmcb02(struct vcpu_svm *svm)
 {
 	u32 mask;
-	svm->nested.ctl.event_inj = svm->vmcb->control.event_inj;
-	svm->nested.ctl.event_inj_err = svm->vmcb->control.event_inj_err;
+
+	/*
+	 * Leave the pending L1 -> L2 event in svm->nested.ctl.event_inj{,err}
+	 * if its re-injection is needed
+	 */
+	if (!exit_during_event_injection(svm, svm->nested.ctl.event_inj,
+					 svm->nested.ctl.event_inj_err)) {
+		WARN_ON_ONCE(svm->vmcb->control.event_inj & SVM_EVTINJ_VALID);
+		svm->nested.ctl.event_inj = svm->vmcb->control.event_inj;
+		svm->nested.ctl.event_inj_err = svm->vmcb->control.event_inj_err;
+	}
 
 	/* Only a few fields of int_ctl are written by the processor. */
 	mask = V_IRQ_MASK | V_TPR_MASK;
@@ -669,6 +678,54 @@ static void nested_svm_copy_common_state(struct vmcb *from_vmcb, struct vmcb *to
 	to_vmcb->save.spec_ctrl = from_vmcb->save.spec_ctrl;
 }
 
+void nested_svm_maybe_reinject(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	unsigned int vector, type;
+	u32 exitintinfo = svm->vmcb->control.exit_int_info;
+
+	if (WARN_ON_ONCE(!is_guest_mode(vcpu)))
+		return;
+
+	/*
+	 * No L1 -> L2 event to re-inject?
+	 *
+	 * In this case event_inj will be cleared by
+	 * nested_sync_control_from_vmcb02().
+	 */
+	if (!(svm->nested.ctl.event_inj & SVM_EVTINJ_VALID))
+		return;
+
+	/* If the last event injection was successful there shouldn't be any pending event */
+	if (WARN_ON_ONCE(!(exitintinfo & SVM_EXITINTINFO_VALID)))
+		return;
+
+	kvm_make_request(KVM_REQ_EVENT, vcpu);
+
+	vector = exitintinfo & SVM_EXITINTINFO_VEC_MASK;
+	type = exitintinfo & SVM_EXITINTINFO_TYPE_MASK;
+
+	switch (type) {
+	case SVM_EXITINTINFO_TYPE_NMI:
+		vcpu->arch.nmi_injected = true;
+		break;
+	case SVM_EXITINTINFO_TYPE_EXEPT:
+		if (exitintinfo & SVM_EXITINTINFO_VALID_ERR)
+			kvm_requeue_exception_e(vcpu, vector,
+						svm->vmcb->control.exit_int_info_err);
+		else
+			kvm_requeue_exception(vcpu, vector);
+		break;
+	case SVM_EXITINTINFO_TYPE_SOFT:
+	case SVM_EXITINTINFO_TYPE_INTR:
+		kvm_queue_interrupt(vcpu, vector, type == SVM_EXITINTINFO_TYPE_SOFT);
+		break;
+	default:
+		vcpu_unimpl(vcpu, "unknown L1 -> L2 exitintinfo type 0x%x\n", type);
+		break;
+	}
+}
+
 int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
 			 struct vmcb *vmcb12, bool from_vmrun)
 {
@@ -898,6 +955,10 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	if (svm->nrips_enabled)
 		vmcb12->control.next_rip = vmcb->control.next_rip;
 
+	/* Forget about any pending L1 event injection since it's a L1 worry now */
+	svm->nested.ctl.event_inj = 0;
+	svm->nested.ctl.event_inj_err = 0;
+
 	vmcb12->control.int_ctl = svm->nested.ctl.int_ctl;
 	vmcb12->control.tlb_ctl = svm->nested.ctl.tlb_ctl;
 	vmcb12->control.event_inj = svm->nested.ctl.event_inj;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1e5d904aeec3..5b128baa5e57 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3322,13 +3322,18 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	WARN_ON(!gif_set(svm));
+	WARN_ON(!(vcpu->arch.interrupt.soft || gif_set(svm)));
 
 	trace_kvm_inj_virq(vcpu->arch.interrupt.nr);
 	++vcpu->stat.irq_injections;
 
 	svm->vmcb->control.event_inj = vcpu->arch.interrupt.nr |
-		SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_INTR;
+		SVM_EVTINJ_VALID;
+	if (vcpu->arch.interrupt.soft) {
+		svm->vmcb->control.event_inj |= SVM_EVTINJ_TYPE_SOFT;
+	} else {
+		svm->vmcb->control.event_inj |= SVM_EVTINJ_TYPE_INTR;
+	}
 }
 
 void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode,
@@ -3627,6 +3632,14 @@ static void svm_complete_interrupts(struct kvm_vcpu *vcpu)
 	if (!(exitintinfo & SVM_EXITINTINFO_VALID))
 		return;
 
+	/* L1 -> L2 event re-injection needs a different handling */
+	if (is_guest_mode(vcpu) &&
+	    exit_during_event_injection(svm, svm->nested.ctl.event_inj,
+					svm->nested.ctl.event_inj_err)) {
+		nested_svm_maybe_reinject(vcpu);
+		return;
+	}
+
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 
 	vector = exitintinfo & SVM_EXITINTINFO_VEC_MASK;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index f757400fc933..7cafc2e6c82a 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -488,6 +488,52 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
 	return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE;
 }
 
+static inline bool event_inj_same(u32 event_inj1, u32 event_inj_err1,
+				  u32 event_inj2, u32 event_inj_err2)
+{
+	unsigned int vector_1, vector_2, type_1, type_2;
+
+	/* Either of them not valid? */
+	if (!(event_inj1 & SVM_EVTINJ_VALID) ||
+	    !(event_inj2 & SVM_EVTINJ_VALID))
+		return false;
+
+	vector_1 = event_inj1 & SVM_EVTINJ_VEC_MASK;
+	type_1 = event_inj1 & SVM_EVTINJ_TYPE_MASK;
+	vector_2 = event_inj2 & SVM_EVTINJ_VEC_MASK;
+	type_2 = event_inj2 & SVM_EVTINJ_TYPE_MASK;
+
+	/* Different vector or type? */
+	if (vector_1 != vector_2 || type_1 != type_2)
+		return false;
+
+	/* Different error code presence flag? */
+	if ((event_inj1 & SVM_EVTINJ_VALID_ERR) !=
+	    (event_inj2 & SVM_EVTINJ_VALID_ERR))
+		return false;
+
+	/* No error code? */
+	if (!(event_inj1 & SVM_EVTINJ_VALID_ERR))
+		return true;
+
+	/* Same error code? */
+	return event_inj_err1 == event_inj_err2;
+}
+
+/* Did the last VMEXIT happen when attempting to inject that event? */
+static inline bool exit_during_event_injection(struct vcpu_svm *svm,
+					       u32 event_inj, u32 event_inj_err)
+{
+	BUILD_BUG_ON(SVM_EXITINTINFO_VEC_MASK != SVM_EVTINJ_VEC_MASK ||
+		     SVM_EXITINTINFO_TYPE_MASK != SVM_EVTINJ_TYPE_MASK ||
+		     SVM_EXITINTINFO_VALID != SVM_EVTINJ_VALID ||
+		     SVM_EXITINTINFO_VALID_ERR != SVM_EVTINJ_VALID_ERR);
+
+	return event_inj_same(svm->vmcb->control.exit_int_info,
+			      svm->vmcb->control.exit_int_info_err,
+			      event_inj, event_inj_err);
+}
+
 /* svm.c */
 #define MSR_INVALID 0xffffffffU
@@ -540,6 +586,7 @@ static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
 	return vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_NMI);
 }
 
+void nested_svm_maybe_reinject(struct kvm_vcpu *vcpu);
 int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb_gpa,
 			 struct vmcb *vmcb12, bool from_vmrun);
 void svm_leave_nested(struct kvm_vcpu *vcpu);
From patchwork Thu Mar 10 21:38:40 2022
From: "Maciej S. Szmigiero"
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, Tom Lendacky, Brijesh Singh, Jon Grimm, David Kaplan,
    Boris Ostrovsky, Liam Merwick, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH 4/5] KVM: nSVM: Restore next_rip when doing L1 -> L2 event re-injection
Date: Thu, 10 Mar 2022 22:38:40 +0100

From: "Maciej S. Szmigiero"

According to APM 15.7.1 "State Saved on Exit", the next_rip field can
be zero after a VMEXIT in some cases. Yet it is used by the CPU as the
return address pushed on the stack when injecting an INT3 or INTO
exception or a software interrupt.

Restore this field to the L1-provided value if the CPU zeroed it when
re-injecting an L1-provided event into L2.
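[Editorial aside: a condensed restatement of the rule the patch adds,
as a sketch reusing the patch's own names; it deliberately omits the
event_inj_same() match that guards against touching an unrelated
injection:

	/* Re-injecting an L1 event into L2 with nRIP Save enabled, and the
	 * CPU left next_rip zeroed on the last VMEXIT? Fall back to the
	 * value L1 originally supplied, cached from vmcb12: */
	if (nrips && is_guest_mode(vcpu) && !svm->vmcb->control.next_rip)
		svm->vmcb->control.next_rip = svm->nested.ctl.next_rip;
]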
Signed-off-by: Maciej S. Szmigiero
---
 arch/x86/kvm/svm/svm.c | 43 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 5b128baa5e57..760dd0e070ea 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -385,6 +385,44 @@ static int svm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+/*
+ * According to APM 15.7.1 "State Saved on Exit" the next_rip field can
+ * be zero after a VMEXIT in some cases.
+ * Yet, it is used by the CPU for the return address pushed on stack when
+ * injecting INT3 or INTO exception or a software interrupt.
+ *
+ * Restore this field to the L1-provided value if zeroed by the CPU when
+ * re-injecting a L1-provided event into L2.
+ */
+static void maybe_fixup_next_rip(struct kvm_vcpu *vcpu, bool uses_err)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u32 err_vmcb = uses_err ? svm->vmcb->control.event_inj_err : 0;
+	u32 err_inject = uses_err ? svm->nested.ctl.event_inj_err : 0;
+
+	/* No nRIP Save feature? Then nothing to fix up. */
+	if (!nrips)
+		return;
+
+	/* The fix only applies to event injection into a L2. */
+	if (!is_guest_mode(vcpu))
+		return;
+
+	/*
+	 * If the current next_rip field is already non-zero assume the CPU had
+	 * returned the correct address during the last VMEXIT.
+	 */
+	if (svm->vmcb->control.next_rip)
+		return;
+
+	/* Is this a L1 -> L2 event re-injection? */
+	if (!event_inj_same(svm->vmcb->control.event_inj, err_vmcb,
+			    svm->nested.ctl.event_inj, err_inject))
+		return;
+
+	svm->vmcb->control.next_rip = svm->nested.ctl.next_rip;
+}
+
 static void svm_queue_exception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -415,6 +453,9 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
 		| (has_error_code ? SVM_EVTINJ_VALID_ERR : 0)
 		| SVM_EVTINJ_TYPE_EXEPT;
 	svm->vmcb->control.event_inj_err = error_code;
+
+	if (kvm_exception_is_soft(nr))
+		maybe_fixup_next_rip(vcpu, true);
 }
 
 static void svm_init_erratum_383(void)
@@ -3331,6 +3372,8 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu)
 		SVM_EVTINJ_VALID;
 	if (vcpu->arch.interrupt.soft) {
 		svm->vmcb->control.event_inj |= SVM_EVTINJ_TYPE_SOFT;
+
+		maybe_fixup_next_rip(vcpu, false);
 	} else {
 		svm->vmcb->control.event_inj |= SVM_EVTINJ_TYPE_INTR;
 	}
From patchwork Thu Mar 10 21:38:41 2022
From: "Maciej S. Szmigiero"
To: Paolo Bonzini
Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, Tom Lendacky, Brijesh Singh, Jon Grimm, David Kaplan,
    Boris Ostrovsky, Liam Merwick, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH 5/5] KVM: selftests: nSVM: Add svm_nested_soft_inject_test
Date: Thu, 10 Mar 2022 22:38:41 +0100

From: "Maciej S. Szmigiero"

Add a KVM self-test that checks whether an nSVM L1 is able to
successfully inject a software interrupt and a soft exception into its
L2 guest.

In practice, this tests both next_rip field consistency and the
handling of an L1-injected event with an intervening L0 VMEXIT during
its delivery: the first nested VMRUN (which also attempts to inject a
software interrupt) immediately triggers an L0 NPF. This L0 NPF will
have zero in its CPU-returned next_rip field; if KVM incorrectly reuses
that zero, the guest will take a #PF when the interrupt handler tries
to return to address 0.

Signed-off-by: Maciej S. Szmigiero
---
 tools/testing/selftests/kvm/.gitignore                            |   1 +
 tools/testing/selftests/kvm/Makefile                              |   1 +
 tools/testing/selftests/kvm/include/x86_64/svm_util.h             |   2 +
 tools/testing/selftests/kvm/x86_64/svm_nested_soft_inject_test.c  | 147 ++++++++++++++++++
 4 files changed, 151 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/svm_nested_soft_inject_test.c

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index 9b67343dc4ab..bc7e2c5a8560 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -32,6 +32,7 @@
 /x86_64/state_test
 /x86_64/svm_vmcall_test
 /x86_64/svm_int_ctl_test
+/x86_64/svm_nested_soft_inject_test
 /x86_64/sync_regs_test
 /x86_64/tsc_msrs_test
 /x86_64/userspace_io_test
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 04099f453b59..ff63e3caac9b 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -65,6 +65,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/state_test
 TEST_GEN_PROGS_x86_64 += x86_64/vmx_preemption_timer_test
 TEST_GEN_PROGS_x86_64 += x86_64/svm_vmcall_test
 TEST_GEN_PROGS_x86_64 += x86_64/svm_int_ctl_test
+TEST_GEN_PROGS_x86_64 += x86_64/svm_nested_soft_inject_test
 TEST_GEN_PROGS_x86_64 += x86_64/sync_regs_test
 TEST_GEN_PROGS_x86_64 += x86_64/userspace_io_test
 TEST_GEN_PROGS_x86_64 += x86_64/userspace_msr_exit_test
diff --git a/tools/testing/selftests/kvm/include/x86_64/svm_util.h b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
index a25aabd8f5e7..d49f7c9b4564 100644
--- a/tools/testing/selftests/kvm/include/x86_64/svm_util.h
+++ b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
@@ -16,6 +16,8 @@
 #define CPUID_SVM_BIT		2
 #define CPUID_SVM		BIT_ULL(CPUID_SVM_BIT)
 
+#define SVM_EXIT_EXCP_BASE	0x040
+#define SVM_EXIT_HLT		0x078
 #define SVM_EXIT_MSR		0x07c
 #define SVM_EXIT_VMMCALL	0x081
 
diff --git a/tools/testing/selftests/kvm/x86_64/svm_nested_soft_inject_test.c b/tools/testing/selftests/kvm/x86_64/svm_nested_soft_inject_test.c
new file mode 100644
index 000000000000..d39be5d885c1
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/svm_nested_soft_inject_test.c
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2022 Oracle and/or its affiliates.
+ *
+ * Based on:
+ *   svm_int_ctl_test
+ *
+ *   Copyright (C) 2021, Red Hat, Inc.
+ *
+ */
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+#include "svm_util.h"
+
+#define VCPU_ID		0
+#define INT_NR		0x20
+#define X86_FEATURE_NRIPS	BIT(3)
+
+#define vmcall()		\
+	__asm__ __volatile__(	\
+		"vmmcall\n"	\
+		)
+
+#define ud2()			\
+	__asm__ __volatile__(	\
+		"ud2\n"		\
+		)
+
+#define hlt()			\
+	__asm__ __volatile__(	\
+		"hlt\n"		\
+		)
+
+static unsigned int bp_fired;
+static void guest_bp_handler(struct ex_regs *regs)
+{
+	bp_fired++;
+}
+
+static unsigned int int_fired;
+static void guest_int_handler(struct ex_regs *regs)
+{
+	int_fired++;
+}
+
+static void l2_guest_code(void)
+{
+	GUEST_ASSERT(int_fired == 1);
+	vmcall();
+	ud2();
+
+	GUEST_ASSERT(bp_fired == 1);
+	hlt();
+}
+
+static void l1_guest_code(struct svm_test_data *svm)
+{
+	#define L2_GUEST_STACK_SIZE 64
+	unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+	struct vmcb *vmcb = svm->vmcb;
+
+	/* Prepare for L2 execution. */
+	generic_svm_setup(svm, l2_guest_code,
+			  &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+	vmcb->control.intercept_exceptions |= BIT(PF_VECTOR) | BIT(UD_VECTOR);
+	vmcb->control.intercept |= BIT(INTERCEPT_HLT);
+
+	vmcb->control.event_inj = INT_NR | SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_SOFT;
+	/* The return address pushed on stack */
+	vmcb->control.next_rip = vmcb->save.rip;
+
+	run_guest(vmcb, svm->vmcb_gpa);
+	GUEST_ASSERT_3(vmcb->control.exit_code == SVM_EXIT_VMMCALL,
+		       vmcb->control.exit_code,
+		       vmcb->control.exit_info_1, vmcb->control.exit_info_2);
+
+	/* Skip over VMCALL */
+	vmcb->save.rip += 3;
+
+	vmcb->control.event_inj = BP_VECTOR | SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_EXEPT;
+	/* The return address pushed on stack, skip over UD2 */
+	vmcb->control.next_rip = vmcb->save.rip + 2;
+
+	run_guest(vmcb, svm->vmcb_gpa);
+	GUEST_ASSERT_3(vmcb->control.exit_code == SVM_EXIT_HLT,
+		       vmcb->control.exit_code,
+		       vmcb->control.exit_info_1, vmcb->control.exit_info_2);
+
+	GUEST_DONE();
+}
+
+int main(int argc, char *argv[])
+{
+	struct kvm_cpuid_entry2 *cpuid;
+	struct kvm_vm *vm;
+	vm_vaddr_t svm_gva;
+	struct kvm_guest_debug debug;
+
+	nested_svm_check_supported();
+
+	cpuid = kvm_get_supported_cpuid_entry(0x8000000a);
+	if (!(cpuid->edx & X86_FEATURE_NRIPS)) {
+		print_skip("nRIP Save unavailable");
+		exit(KSFT_SKIP);
+	}
+
+	vm = vm_create_default(VCPU_ID, 0, (void *) l1_guest_code);
+
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(vm, VCPU_ID);
+
+	vm_install_exception_handler(vm, BP_VECTOR, guest_bp_handler);
+	vm_install_exception_handler(vm, INT_NR, guest_int_handler);
+
+	vcpu_alloc_svm(vm, &svm_gva);
+	vcpu_args_set(vm, VCPU_ID, 1, svm_gva);
+
+	memset(&debug, 0, sizeof(debug));
+	vcpu_set_guest_debug(vm, VCPU_ID, &debug);
+
+	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+	struct ucall uc;
+
+	vcpu_run(vm, VCPU_ID);
+	TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+		    "Got exit_reason other than KVM_EXIT_IO: %u (%s)\n",
+		    run->exit_reason,
+		    exit_reason_str(run->exit_reason));
+
+	switch (get_ucall(vm, VCPU_ID, &uc)) {
+	case UCALL_ABORT:
+		TEST_FAIL("%s at %s:%ld, vals = 0x%lx 0x%lx 0x%lx",
+			  (const char *)uc.args[0], __FILE__, uc.args[1],
+			  uc.args[2], uc.args[3], uc.args[4]);
+		break;
+		/* NOT REACHED */
+	case UCALL_DONE:
+		goto done;
+	default:
+		TEST_FAIL("Unknown ucall 0x%lx.", uc.cmd);
+	}
+done:
+	kvm_vm_free(vm);
+	return 0;
+}
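[Editorial aside: for reference, a typical way to build and run the new
test from a kernel tree, assuming the usual kselftests flow:

	$ make -C tools/testing/selftests/kvm
	$ ./tools/testing/selftests/kvm/x86_64/svm_nested_soft_inject_test
]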