Message ID | 4a96a33c01b547f6e89ecf40224c80afa59c6aa4.1708933498.git.isaku.yamahata@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v19,001/130] x86/virt/tdx: Rename _offset to _member for TD_SYSINFO_MAP() macro | expand |
On Mon, Feb 26, 2024 at 12:26:45AM -0800, isaku.yamahata@intel.com wrote: >From: Isaku Yamahata <isaku.yamahata@intel.com> > >When BIOS eMCA MCE-SMI morphing is enabled, the #MC is morphed to MSMI >(Machine Check System Management Interrupt). Then the SMI causes TD exit >with the read reason of EXIT_REASON_OTHER_SMI with MSMI bit set in the exit >qualification to KVM instead of EXIT_REASON_EXCEPTION_NMI with MC >exception. > >Handle EXIT_REASON_OTHER_SMI with MSMI bit set in the exit qualification as >MCE(Machine Check Exception) happened during TD guest running. > >Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> >--- > arch/x86/kvm/vmx/tdx.c | 40 ++++++++++++++++++++++++++++++++++--- > arch/x86/kvm/vmx/tdx_arch.h | 2 ++ > 2 files changed, 39 insertions(+), 3 deletions(-) > >diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c >index bdd74682b474..117c2315f087 100644 >--- a/arch/x86/kvm/vmx/tdx.c >+++ b/arch/x86/kvm/vmx/tdx.c >@@ -916,6 +916,30 @@ void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu) > tdexit_intr_info(vcpu)); > else if (exit_reason == EXIT_REASON_EXCEPTION_NMI) > vmx_handle_exception_irqoff(vcpu, tdexit_intr_info(vcpu)); >+ else if (unlikely(tdx->exit_reason.non_recoverable || >+ tdx->exit_reason.error)) { why not just: else if (tdx->exit_reason.basic == EXIT_REASON_OTHER_SMI) { i.e., does EXIT_REASON_OTHER_SMI imply exit_reason.non_recoverable or exit_reason.error? >+ /* >+ * The only reason it gets EXIT_REASON_OTHER_SMI is there is an >+ * #MSMI(Machine Check System Management Interrupt) with >+ * exit_qualification bit 0 set in TD guest. >+ * The #MSMI is delivered right after SEAMCALL returns, >+ * and an #MC is delivered to host kernel after SMI handler >+ * returns. >+ * >+ * The #MC right after SEAMCALL is fixed up and skipped in #MC Looks fixing up and skipping #MC on the first instruction after TD-exit is missing in v19? >+ * handler because it's an #MC happens in TD guest we cannot >+ * handle it with host's context. >+ * >+ * Call KVM's machine check handler explicitly here. >+ */ >+ if (tdx->exit_reason.basic == EXIT_REASON_OTHER_SMI) { >+ unsigned long exit_qual; >+ >+ exit_qual = tdexit_exit_qual(vcpu); >+ if (exit_qual & TD_EXIT_OTHER_SMI_IS_MSMI) >+ kvm_machine_check(); >+ } >+ } > } > > static int tdx_handle_exception(struct kvm_vcpu *vcpu) >@@ -1381,6 +1405,11 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) > exit_reason.full, exit_reason.basic, > to_kvm_tdx(vcpu->kvm)->hkid, > set_hkid_to_hpa(0, to_kvm_tdx(vcpu->kvm)->hkid)); >+ >+ /* >+ * tdx_handle_exit_irqoff() handled EXIT_REASON_OTHER_SMI. It >+ * must be handled before enabling preemption because it's #MC. >+ */ Then EXIT_REASON_OTHER_SMI is handled, why still go to unhandled_exit? > goto unhandled_exit; > } > >@@ -1419,9 +1448,14 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) > return tdx_handle_ept_misconfig(vcpu); > case EXIT_REASON_OTHER_SMI: > /* >- * If reach here, it's not a Machine Check System Management >- * Interrupt(MSMI). #SMI is delivered and handled right after >- * SEAMRET, nothing needs to be done in KVM. >+ * Unlike VMX, all the SMI in SEAM non-root mode (i.e. when >+ * TD guest vcpu is running) will cause TD exit to TDX module, >+ * then SEAMRET to KVM. Once it exits to KVM, SMI is delivered >+ * and handled right away. >+ * >+ * - If it's an Machine Check System Management Interrupt >+ * (MSMI), it's handled above due to non_recoverable bit set. >+ * - If it's not an MSMI, don't need to do anything here. This corrects a comment added in patch 100. Maybe we can just merge patch 100 into this one? > */ > return 1; > default: >diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h >index efc3c61c14ab..87ef22e9cd49 100644 >--- a/arch/x86/kvm/vmx/tdx_arch.h >+++ b/arch/x86/kvm/vmx/tdx_arch.h >@@ -42,6 +42,8 @@ > #define TDH_VP_WR 43 > #define TDH_SYS_LP_SHUTDOWN 44 > >+#define TD_EXIT_OTHER_SMI_IS_MSMI BIT(1) >+ > /* TDX control structure (TDR/TDCS/TDVPS) field access codes */ > #define TDX_NON_ARCH BIT_ULL(63) > #define TDX_CLASS_SHIFT 56 >-- >2.25.1 > >
On Mon, Apr 01, 2024 at 05:14:02PM +0800, Chao Gao <chao.gao@intel.com> wrote: > On Mon, Feb 26, 2024 at 12:26:45AM -0800, isaku.yamahata@intel.com wrote: > >From: Isaku Yamahata <isaku.yamahata@intel.com> > > > >When BIOS eMCA MCE-SMI morphing is enabled, the #MC is morphed to MSMI > >(Machine Check System Management Interrupt). Then the SMI causes TD exit > >with the read reason of EXIT_REASON_OTHER_SMI with MSMI bit set in the exit > >qualification to KVM instead of EXIT_REASON_EXCEPTION_NMI with MC > >exception. > > > >Handle EXIT_REASON_OTHER_SMI with MSMI bit set in the exit qualification as > >MCE(Machine Check Exception) happened during TD guest running. > > > >Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> > >--- > > arch/x86/kvm/vmx/tdx.c | 40 ++++++++++++++++++++++++++++++++++--- > > arch/x86/kvm/vmx/tdx_arch.h | 2 ++ > > 2 files changed, 39 insertions(+), 3 deletions(-) > > > >diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c > >index bdd74682b474..117c2315f087 100644 > >--- a/arch/x86/kvm/vmx/tdx.c > >+++ b/arch/x86/kvm/vmx/tdx.c > >@@ -916,6 +916,30 @@ void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu) > > tdexit_intr_info(vcpu)); > > else if (exit_reason == EXIT_REASON_EXCEPTION_NMI) > > vmx_handle_exception_irqoff(vcpu, tdexit_intr_info(vcpu)); > >+ else if (unlikely(tdx->exit_reason.non_recoverable || > >+ tdx->exit_reason.error)) { > > why not just: > else if (tdx->exit_reason.basic == EXIT_REASON_OTHER_SMI) { > > > i.e., does EXIT_REASON_OTHER_SMI imply exit_reason.non_recoverable or > exit_reason.error? Yes, this should be refined. > >+ /* > >+ * The only reason it gets EXIT_REASON_OTHER_SMI is there is an > >+ * #MSMI(Machine Check System Management Interrupt) with > >+ * exit_qualification bit 0 set in TD guest. > >+ * The #MSMI is delivered right after SEAMCALL returns, > >+ * and an #MC is delivered to host kernel after SMI handler > >+ * returns. > >+ * > >+ * The #MC right after SEAMCALL is fixed up and skipped in #MC > > Looks fixing up and skipping #MC on the first instruction after TD-exit is > missing in v19? Right. We removed it as MSMI will provides if #MC happened in SEAM or not. > > >+ * handler because it's an #MC happens in TD guest we cannot > >+ * handle it with host's context. > >+ * > >+ * Call KVM's machine check handler explicitly here. > >+ */ > >+ if (tdx->exit_reason.basic == EXIT_REASON_OTHER_SMI) { > >+ unsigned long exit_qual; > >+ > >+ exit_qual = tdexit_exit_qual(vcpu); > >+ if (exit_qual & TD_EXIT_OTHER_SMI_IS_MSMI) > > >+ kvm_machine_check(); > >+ } > >+ } > > } > > > > static int tdx_handle_exception(struct kvm_vcpu *vcpu) > >@@ -1381,6 +1405,11 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) > > exit_reason.full, exit_reason.basic, > > to_kvm_tdx(vcpu->kvm)->hkid, > > set_hkid_to_hpa(0, to_kvm_tdx(vcpu->kvm)->hkid)); > >+ > >+ /* > >+ * tdx_handle_exit_irqoff() handled EXIT_REASON_OTHER_SMI. It > >+ * must be handled before enabling preemption because it's #MC. > >+ */ > > Then EXIT_REASON_OTHER_SMI is handled, why still go to unhandled_exit? Let me update the comment. exit_irqoff() doesn't return value to tell vcpu_run loop to continue or exit to user-space. As the guest is dead, we'd like to exit to the user-space. > > goto unhandled_exit; > > } > > > >@@ -1419,9 +1448,14 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) > > return tdx_handle_ept_misconfig(vcpu); > > case EXIT_REASON_OTHER_SMI: > > /* > >- * If reach here, it's not a Machine Check System Management > >- * Interrupt(MSMI). #SMI is delivered and handled right after > >- * SEAMRET, nothing needs to be done in KVM. > >+ * Unlike VMX, all the SMI in SEAM non-root mode (i.e. when > >+ * TD guest vcpu is running) will cause TD exit to TDX module, > >+ * then SEAMRET to KVM. Once it exits to KVM, SMI is delivered > >+ * and handled right away. > >+ * > >+ * - If it's an Machine Check System Management Interrupt > >+ * (MSMI), it's handled above due to non_recoverable bit set. > >+ * - If it's not an MSMI, don't need to do anything here. > > This corrects a comment added in patch 100. Maybe we can just merge patch 100 into > this one? Yes. Will do. > > */ > > return 1; > > default: > >diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h > >index efc3c61c14ab..87ef22e9cd49 100644 > >--- a/arch/x86/kvm/vmx/tdx_arch.h > >+++ b/arch/x86/kvm/vmx/tdx_arch.h > >@@ -42,6 +42,8 @@ > > #define TDH_VP_WR 43 > > #define TDH_SYS_LP_SHUTDOWN 44 > > > >+#define TD_EXIT_OTHER_SMI_IS_MSMI BIT(1) > >+ > > /* TDX control structure (TDR/TDCS/TDVPS) field access codes */ > > #define TDX_NON_ARCH BIT_ULL(63) > > #define TDX_CLASS_SHIFT 56 > >-- > >2.25.1 > > > > >
On 4/4/2024 6:23 AM, Isaku Yamahata wrote: > On Mon, Apr 01, 2024 at 05:14:02PM +0800, > Chao Gao <chao.gao@intel.com> wrote: > >> On Mon, Feb 26, 2024 at 12:26:45AM -0800, isaku.yamahata@intel.com wrote: >>> From: Isaku Yamahata <isaku.yamahata@intel.com> >>> >>> When BIOS eMCA MCE-SMI morphing is enabled, the #MC is morphed to MSMI >>> (Machine Check System Management Interrupt). Then the SMI causes TD exit >>> with the read reason of EXIT_REASON_OTHER_SMI with MSMI bit set in the exit >>> qualification to KVM instead of EXIT_REASON_EXCEPTION_NMI with MC >>> exception. >>> >>> Handle EXIT_REASON_OTHER_SMI with MSMI bit set in the exit qualification as >>> MCE(Machine Check Exception) happened during TD guest running. >>> >>> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> >>> --- >>> arch/x86/kvm/vmx/tdx.c | 40 ++++++++++++++++++++++++++++++++++--- >>> arch/x86/kvm/vmx/tdx_arch.h | 2 ++ >>> 2 files changed, 39 insertions(+), 3 deletions(-) >>> >>> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c >>> index bdd74682b474..117c2315f087 100644 >>> --- a/arch/x86/kvm/vmx/tdx.c >>> +++ b/arch/x86/kvm/vmx/tdx.c >>> @@ -916,6 +916,30 @@ void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu) >>> tdexit_intr_info(vcpu)); >>> else if (exit_reason == EXIT_REASON_EXCEPTION_NMI) >>> vmx_handle_exception_irqoff(vcpu, tdexit_intr_info(vcpu)); >>> + else if (unlikely(tdx->exit_reason.non_recoverable || >>> + tdx->exit_reason.error)) { >> why not just: >> else if (tdx->exit_reason.basic == EXIT_REASON_OTHER_SMI) { >> >> >> i.e., does EXIT_REASON_OTHER_SMI imply exit_reason.non_recoverable or >> exit_reason.error? > Yes, this should be refined. > > >>> + /* >>> + * The only reason it gets EXIT_REASON_OTHER_SMI is there is an >>> + * #MSMI(Machine Check System Management Interrupt) with >>> + * exit_qualification bit 0 set in TD guest. >>> + * The #MSMI is delivered right after SEAMCALL returns, >>> + * and an #MC is delivered to host kernel after SMI handler >>> + * returns. >>> + * >>> + * The #MC right after SEAMCALL is fixed up and skipped in #MC >> Looks fixing up and skipping #MC on the first instruction after TD-exit is >> missing in v19? > Right. We removed it as MSMI will provides if #MC happened in SEAM or not. According to the patch of host #MC handler https://lore.kernel.org/lkml/171265126376.10875.16864387954272613660.tip-bot2@tip-bot2/, the #MC triggered by MSMI can be handled by kernel #MC handler. There is no need to call kvm_machine_check(). Does the following fixup make sense to you? -------- KVM: TDX: Handle EXIT_REASON_OTHER_SMI Handle "Other SMI" VM exit for TDX. Unlike VMX, an SMI occurs in SEAM non-root mode cause VM exit to TDX module, then SEAMRET to KVM. Once it exits to KVM, SMI is delivered and handled by kernel handler right away. Specifically, when BIOS eMCA MCE-SMI morphing is enabled, the #MC occurs in TDX guest is delivered as an Machine Check System Management Interrupt (MSMI) with the exit reason of EXIT_REASON_OTHER_SMI with MSMI (bit 0) set in the exit qualification. On VM exit, TDX module checks whether the "Other SMI" is caused by a MSMI or not. If so, TDX module makes TD as fatal, preventing further TD entries, and then completes the TD exit flow to KVM with the TDH.VP.ENTER outputs indicating TDX_NON_RECOVERABLE_TD. After TD exit, the MSMI is delivered and eventually handled by the kernel #MC handler[1]. So, to handle "Other SMI" VM exit: - For non-MSMI case, KVM doesn't need to do anything, just continue TDX vCPU execution. - For MSMI case, since the TDX guest is dead, follow other non-recoverable cases, exit to userspace. [1] The patch supports handling MSMI signaled during SEAM operation. It's already in tip tree. https://lore.kernel.org/lkml/171265126376.10875.16864387954272613660.tip-bot2@tip-bot2/ diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 4ee94bfb17e2..fd756d231204 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -975,30 +975,6 @@ void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu) tdexit_intr_info(vcpu)); else if (exit_reason == EXIT_REASON_EXCEPTION_NMI) vmx_handle_exception_irqoff(vcpu, tdexit_intr_info(vcpu)); - else if (unlikely(tdx->exit_reason.non_recoverable || - tdx->exit_reason.error)) { - /* - * The only reason it gets EXIT_REASON_OTHER_SMI is there is an - * #MSMI(Machine Check System Management Interrupt) with - * exit_qualification bit 0 set in TD guest. - * The #MSMI is delivered right after SEAMCALL returns, - * and an #MC is delivered to host kernel after SMI handler - * returns. - * - * The #MC right after SEAMCALL is fixed up and skipped in #MC - * handler because it's an #MC happens in TD guest we cannot - * handle it with host's context. - * - * Call KVM's machine check handler explicitly here. - */ - if (tdx->exit_reason.basic == EXIT_REASON_OTHER_SMI) { - unsigned long exit_qual; - - exit_qual = tdexit_exit_qual(vcpu); - if (exit_qual & TD_EXIT_OTHER_SMI_IS_MSMI) - kvm_machine_check(); - } - } } static int tdx_handle_exception(struct kvm_vcpu *vcpu) @@ -1923,10 +1899,6 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) to_kvm_tdx(vcpu->kvm)->hkid, set_hkid_to_hpa(0, to_kvm_tdx(vcpu->kvm)->hkid)); - /* - * tdx_handle_exit_irqoff() handled EXIT_REASON_OTHER_SMI. It - * must be handled before enabling preemption because it's #MC. - */ goto unhandled_exit; } @@ -1970,14 +1942,14 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) return tdx_handle_ept_misconfig(vcpu); case EXIT_REASON_OTHER_SMI: /* - * Unlike VMX, all the SMI in SEAM non-root mode (i.e. when - * TD guest vcpu is running) will cause TD exit to TDX module, - * then SEAMRET to KVM. Once it exits to KVM, SMI is delivered - * and handled right away. + * Unlike VMX, SMI occurs in SEAM non-root mode (i.e. when + * TD guest vCPU is running) will cause VM exit to TDX module, + * then SEAMRET to KVM. Once it exits to KVM, SMI is delivered + * and handled by kernel handler right away. * - * - If it's an Machine Check System Management Interrupt - * (MSMI), it's handled above due to non_recoverable bit set. - * - If it's not an MSMI, don't need to do anything here. + * - A MSMI will not reach here, it's handled as non_recoverable + * case above. + * - If it's not a MSMI, no need to do anything here. */ return 1; default: diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h index 2aecffe9f276..aa2fea7b2652 100644 --- a/arch/x86/kvm/vmx/tdx_arch.h +++ b/arch/x86/kvm/vmx/tdx_arch.h @@ -41,8 +41,6 @@ #define TDH_PHYMEM_PAGE_WBINVD 41 #define TDH_VP_WR 43 -#define TD_EXIT_OTHER_SMI_IS_MSMI BIT_ULL(1) - /* TDX control structure (TDR/TDCS/TDVPS) field access codes */ #define TDX_NON_ARCH BIT_ULL(63) #define TDX_CLASS_SHIFT 56
On Tue, May 07, 2024 at 03:06:57PM +0800, Binbin Wu <binbin.wu@linux.intel.com> wrote: > > > > + /* > > > > + * The only reason it gets EXIT_REASON_OTHER_SMI is there is an > > > > + * #MSMI(Machine Check System Management Interrupt) with > > > > + * exit_qualification bit 0 set in TD guest. > > > > + * The #MSMI is delivered right after SEAMCALL returns, > > > > + * and an #MC is delivered to host kernel after SMI handler > > > > + * returns. > > > > + * > > > > + * The #MC right after SEAMCALL is fixed up and skipped in #MC > > > Looks fixing up and skipping #MC on the first instruction after TD-exit is > > > missing in v19? > > Right. We removed it as MSMI will provides if #MC happened in SEAM or not. > > According to the patch of host #MC handler > https://lore.kernel.org/lkml/171265126376.10875.16864387954272613660.tip-bot2@tip-bot2/, > the #MC triggered by MSMI can be handled by kernel #MC handler. > There is no need to call kvm_machine_check(). > > Does the following fixup make sense to you? Yes. Now this patch becomes mostly none. So it makes more sense to squash this patch into [PATCH v19 100/130] KVM: TDX: handle EXIT_REASON_OTHER_SMI. > -------- > > KVM: TDX: Handle EXIT_REASON_OTHER_SMI > > Handle "Other SMI" VM exit for TDX. > > Unlike VMX, an SMI occurs in SEAM non-root mode cause VM exit to TDX > module, then SEAMRET to KVM. Once it exits to KVM, SMI is delivered and > handled by kernel handler right away. > > Specifically, when BIOS eMCA MCE-SMI morphing is enabled, the #MC occurs > in TDX guest is delivered as an Machine Check System Management Interrupt > (MSMI) with the exit reason of EXIT_REASON_OTHER_SMI with MSMI (bit 0) set > in the exit qualification. On VM exit, TDX module checks whether the "Other > SMI" is caused by a MSMI or not. If so, TDX module makes TD as fatal, > preventing further TD entries, and then completes the TD exit flow to KVM > with the TDH.VP.ENTER outputs indicating TDX_NON_RECOVERABLE_TD. > After TD exit, the MSMI is delivered and eventually handled by the kernel > #MC handler[1]. > > So, to handle "Other SMI" VM exit: > - For non-MSMI case, KVM doesn't need to do anything, just continue TDX vCPU > execution. > - For MSMI case, since the TDX guest is dead, follow other non-recoverable > cases, exit to userspace. > > [1] The patch supports handling MSMI signaled during SEAM operation. > It's already in tip tree. > https://lore.kernel.org/lkml/171265126376.10875.16864387954272613660.tip-bot2@tip-bot2/ > > diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c > index 4ee94bfb17e2..fd756d231204 100644 > --- a/arch/x86/kvm/vmx/tdx.c > +++ b/arch/x86/kvm/vmx/tdx.c > @@ -975,30 +975,6 @@ void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu) > tdexit_intr_info(vcpu)); > else if (exit_reason == EXIT_REASON_EXCEPTION_NMI) > vmx_handle_exception_irqoff(vcpu, tdexit_intr_info(vcpu)); > - else if (unlikely(tdx->exit_reason.non_recoverable || > - tdx->exit_reason.error)) { > - /* > - * The only reason it gets EXIT_REASON_OTHER_SMI is there is > an > - * #MSMI(Machine Check System Management Interrupt) with > - * exit_qualification bit 0 set in TD guest. > - * The #MSMI is delivered right after SEAMCALL returns, > - * and an #MC is delivered to host kernel after SMI handler > - * returns. > - * > - * The #MC right after SEAMCALL is fixed up and skipped in > #MC > - * handler because it's an #MC happens in TD guest we cannot > - * handle it with host's context. > - * > - * Call KVM's machine check handler explicitly here. > - */ > - if (tdx->exit_reason.basic == EXIT_REASON_OTHER_SMI) { > - unsigned long exit_qual; > - > - exit_qual = tdexit_exit_qual(vcpu); > - if (exit_qual & TD_EXIT_OTHER_SMI_IS_MSMI) > - kvm_machine_check(); > - } > - } > } > > static int tdx_handle_exception(struct kvm_vcpu *vcpu) > @@ -1923,10 +1899,6 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t > fastpath) > to_kvm_tdx(vcpu->kvm)->hkid, > set_hkid_to_hpa(0, > to_kvm_tdx(vcpu->kvm)->hkid)); > > - /* > - * tdx_handle_exit_irqoff() handled EXIT_REASON_OTHER_SMI. > It > - * must be handled before enabling preemption because it's > #MC. > - */ > goto unhandled_exit; > } > > @@ -1970,14 +1942,14 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, > fastpath_t fastpath) > return tdx_handle_ept_misconfig(vcpu); > case EXIT_REASON_OTHER_SMI: > /* > - * Unlike VMX, all the SMI in SEAM non-root mode (i.e. when > - * TD guest vcpu is running) will cause TD exit to TDX > module, > - * then SEAMRET to KVM. Once it exits to KVM, SMI is > delivered > - * and handled right away. > + * Unlike VMX, SMI occurs in SEAM non-root mode (i.e. when > + * TD guest vCPU is running) will cause VM exit to TDX > module, > + * then SEAMRET to KVM. Once it exits to KVM, SMI is > delivered > + * and handled by kernel handler right away. > * > - * - If it's an Machine Check System Management Interrupt > - * (MSMI), it's handled above due to non_recoverable bit > set. > - * - If it's not an MSMI, don't need to do anything here. > + * - A MSMI will not reach here, it's handled as > non_recoverable > + * case above. > + * - If it's not a MSMI, no need to do anything here. > */ > return 1; > default: > diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h > index 2aecffe9f276..aa2fea7b2652 100644 > --- a/arch/x86/kvm/vmx/tdx_arch.h > +++ b/arch/x86/kvm/vmx/tdx_arch.h > @@ -41,8 +41,6 @@ > #define TDH_PHYMEM_PAGE_WBINVD 41 > #define TDH_VP_WR 43 > > -#define TD_EXIT_OTHER_SMI_IS_MSMI BIT_ULL(1) > - > /* TDX control structure (TDR/TDCS/TDVPS) field access codes */ > #define TDX_NON_ARCH BIT_ULL(63) > #define TDX_CLASS_SHIFT 56 > > >
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index bdd74682b474..117c2315f087 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -916,6 +916,30 @@ void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu) tdexit_intr_info(vcpu)); else if (exit_reason == EXIT_REASON_EXCEPTION_NMI) vmx_handle_exception_irqoff(vcpu, tdexit_intr_info(vcpu)); + else if (unlikely(tdx->exit_reason.non_recoverable || + tdx->exit_reason.error)) { + /* + * The only reason it gets EXIT_REASON_OTHER_SMI is there is an + * #MSMI(Machine Check System Management Interrupt) with + * exit_qualification bit 0 set in TD guest. + * The #MSMI is delivered right after SEAMCALL returns, + * and an #MC is delivered to host kernel after SMI handler + * returns. + * + * The #MC right after SEAMCALL is fixed up and skipped in #MC + * handler because it's an #MC happens in TD guest we cannot + * handle it with host's context. + * + * Call KVM's machine check handler explicitly here. + */ + if (tdx->exit_reason.basic == EXIT_REASON_OTHER_SMI) { + unsigned long exit_qual; + + exit_qual = tdexit_exit_qual(vcpu); + if (exit_qual & TD_EXIT_OTHER_SMI_IS_MSMI) + kvm_machine_check(); + } + } } static int tdx_handle_exception(struct kvm_vcpu *vcpu) @@ -1381,6 +1405,11 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) exit_reason.full, exit_reason.basic, to_kvm_tdx(vcpu->kvm)->hkid, set_hkid_to_hpa(0, to_kvm_tdx(vcpu->kvm)->hkid)); + + /* + * tdx_handle_exit_irqoff() handled EXIT_REASON_OTHER_SMI. It + * must be handled before enabling preemption because it's #MC. + */ goto unhandled_exit; } @@ -1419,9 +1448,14 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) return tdx_handle_ept_misconfig(vcpu); case EXIT_REASON_OTHER_SMI: /* - * If reach here, it's not a Machine Check System Management - * Interrupt(MSMI). #SMI is delivered and handled right after - * SEAMRET, nothing needs to be done in KVM. + * Unlike VMX, all the SMI in SEAM non-root mode (i.e. when + * TD guest vcpu is running) will cause TD exit to TDX module, + * then SEAMRET to KVM. Once it exits to KVM, SMI is delivered + * and handled right away. + * + * - If it's an Machine Check System Management Interrupt + * (MSMI), it's handled above due to non_recoverable bit set. + * - If it's not an MSMI, don't need to do anything here. */ return 1; default: diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h index efc3c61c14ab..87ef22e9cd49 100644 --- a/arch/x86/kvm/vmx/tdx_arch.h +++ b/arch/x86/kvm/vmx/tdx_arch.h @@ -42,6 +42,8 @@ #define TDH_VP_WR 43 #define TDH_SYS_LP_SHUTDOWN 44 +#define TD_EXIT_OTHER_SMI_IS_MSMI BIT(1) + /* TDX control structure (TDR/TDCS/TDVPS) field access codes */ #define TDX_NON_ARCH BIT_ULL(63) #define TDX_CLASS_SHIFT 56