From patchwork Fri Mar 29 13:54:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiaoyao Li X-Patchwork-Id: 10877099 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 839131708 for ; Fri, 29 Mar 2019 13:55:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 70D8228EE3 for ; Fri, 29 Mar 2019 13:55:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 64CC229054; Fri, 29 Mar 2019 13:55:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E4FCA28EE3 for ; Fri, 29 Mar 2019 13:55:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729746AbfC2Nyy (ORCPT ); Fri, 29 Mar 2019 09:54:54 -0400 Received: from mga03.intel.com ([134.134.136.65]:44836 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729688AbfC2Nyx (ORCPT ); Fri, 29 Mar 2019 09:54:53 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 29 Mar 2019 06:54:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.60,284,1549958400"; d="scan'208";a="138342096" Received: from lxy-server.sh.intel.com ([10.239.48.11]) by orsmga003.jf.intel.com with ESMTP; 29 Mar 2019 06:54:50 -0700 From: Xiaoyao Li To: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= , linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Xiaoyao Li , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , x86@kernel.org, chao.gao@intel.com, Sean Christopherson Subject: [PATCH v4 1/2] kvm/vmx: Switch MSR_MISC_FEATURES_ENABLES between host and guest Date: Fri, 29 Mar 2019 21:54:21 +0800 Message-Id: <20190329135422.15046-2-xiaoyao.li@linux.intel.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20190329135422.15046-1-xiaoyao.li@linux.intel.com> References: <20190329135422.15046-1-xiaoyao.li@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP There are two defined bits in MSR_MISC_FEATURES_ENABLES, bit 0 for cpuid faulting and bit 1 for ring3mwait. == cpuid Faulting == cpuid faulting is a feature about CPUID instruction. When cpuid faulting is enabled, all execution of the CPUID instruction outside system-management mode (SMM) cause a general-protection (#GP) if the CPL > 0. About this feature, detailed information can be found at https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf Current KVM provides software emulation of this feature for guest. However, because cpuid faulting takes higher priority over CPUID vm exit (Intel SDM vol3.25.1.1), there is a risk of leaking cpuid faulting to guest when host enables it. If host enables cpuid faulting by setting the bit 0 of MSR_MISC_FEATURES_ENABLES, it will pass to guest since there is no switch of MSR_MISC_FEATURES_ENABLES yet. As a result, when guest calls CPUID instruction in CPL > 0, it will generate a #GP instead of CPUID vm eixt. This issue will cause guest boot failure when guest uses *modprobe* to load modules. *modprobe* calls CPUID instruction, thus causing #GP in guest. Since there is no handling of cpuid faulting in #GP handler, guest fails boot. == ring3mwait == Ring3mwait is a Xeon-Phi Product Family x200 series specific feature, which allows the MONITOR and MWAIT instructions to be executed in rings other than ring 0. The feature can be enabled by setting bit 1 in MSR_MISC_FEATURES_ENABLES. The register can also be read to determine whether the instructions are enabled at other than ring 0. About this feature, description can be found at https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait Current kvm doesn't expose feature ring3mwait to guest. However, there is also a risk of leaking ring3mwait to guest if host enables it since there is no switch of MSR_MISC_FEATURES_ENABLES. == solution == From above analysis, both cpuid faulting and ring3mwait can be leaked to guest. To fix this issue, MSR_MISC_FEATURES_ENABLES should be switched between host and guest. Since MSR_MISC_FEATURES_ENABLES is intel-specific, this patch implement the switching only in vmx. For the reason that kvm provides the software emulation of cpuid faulting and kvm doesn't expose ring3mwait to guest. MSR_MISC_FEATURES_ENABLES can be just cleared to zero for guest when any of the features is enabled in host. Signed-off-by: Xiaoyao Li --- arch/x86/kernel/process.c | 1 + arch/x86/kvm/vmx/vmx.c | 8 ++++++++ 2 files changed, 9 insertions(+) diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 1bba1a3c0b01..94a566e79b6c 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -191,6 +191,7 @@ int set_tsc_mode(unsigned int val) } DEFINE_PER_CPU(u64, msr_misc_features_shadow); +EXPORT_PER_CPU_SYMBOL_GPL(msr_misc_features_shadow); static void set_cpuid_faulting(bool on) { diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 270c6566fd5a..cb0f63879a25 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1064,6 +1064,9 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) vmx->loaded_cpu_state = vmx->loaded_vmcs; host_state = &vmx->loaded_cpu_state->host_state; + if (this_cpu_read(msr_misc_features_shadow)) + wrmsrl(MSR_MISC_FEATURES_ENABLES, 0ULL); + /* * Set host fs and gs selectors. Unfortunately, 22.2.3 does not * allow segment selectors with cpl > 0 or ti == 1. @@ -1123,6 +1126,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx) { struct vmcs_host_state *host_state; + u64 msrval; if (!vmx->loaded_cpu_state) return; @@ -1133,6 +1137,10 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx) ++vmx->vcpu.stat.host_state_reload; vmx->loaded_cpu_state = NULL; + msrval = this_cpu_read(msr_misc_features_shadow); + if (msrval) + wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval); + #ifdef CONFIG_X86_64 rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base); #endif From patchwork Fri Mar 29 13:54:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiaoyao Li X-Patchwork-Id: 10877097 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 97A061708 for ; Fri, 29 Mar 2019 13:55:03 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8515928EE3 for ; Fri, 29 Mar 2019 13:55:03 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7967129054; Fri, 29 Mar 2019 13:55:03 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 31B9F28EE3 for ; Fri, 29 Mar 2019 13:55:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729768AbfC2Ny5 (ORCPT ); Fri, 29 Mar 2019 09:54:57 -0400 Received: from mga03.intel.com ([134.134.136.65]:44836 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729688AbfC2Ny4 (ORCPT ); Fri, 29 Mar 2019 09:54:56 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 29 Mar 2019 06:54:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.60,284,1549958400"; d="scan'208";a="138342106" Received: from lxy-server.sh.intel.com ([10.239.48.11]) by orsmga003.jf.intel.com with ESMTP; 29 Mar 2019 06:54:53 -0700 From: Xiaoyao Li To: Paolo Bonzini , =?utf-8?b?UmFkaW0gS3LEjW3DocWZ?= , linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Xiaoyao Li , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , x86@kernel.org, chao.gao@intel.com, Sean Christopherson Subject: [PATCH v4 2/2] x86/vmx: optimize MSR_MISC_FEATURES_ENABLES switch Date: Fri, 29 Mar 2019 21:54:22 +0800 Message-Id: <20190329135422.15046-3-xiaoyao.li@linux.intel.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20190329135422.15046-1-xiaoyao.li@linux.intel.com> References: <20190329135422.15046-1-xiaoyao.li@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP KVM needs to switch MSR_MISC_FEATURES_ENABLES between host and guest in every pcpu/vcpu context switch. Since WRMSR is expensive, this patch tries to save cycles by avoiding WRMSR MSR_MISC_FEATURES_ENABLES whenever possible. If host's value is zero, nothing needs to do, since guest can use kvm emulated cpuid faulting. If host's value is non-zero, it need not clear MSR_MISC_FEATURES_ENABLES unconditionally. We can use hardware cpuid faulting if guest's value is equal to host'value, thus avoid WRMSR MSR_MISC_FEATURES_ENABLES. Since hardware cpuid faulting takes higher priority than CPUID vm exit, it should be updated to hardware while guest wrmsr and hardware cpuid faulting is used for guest. Note that MSR_MISC_FEATURES_ENABLES only exists in Intel CPU, only applying this optimization to vmx. Signed-off-by: Xiaoyao Li --- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/vmx/vmx.c | 15 ++++++++++++--- arch/x86/kvm/x86.c | 11 ++++++++--- 3 files changed, 22 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 2c53df4a5a2a..105691e069d9 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1343,6 +1343,8 @@ void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw); void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l); int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr); +bool kvm_misc_features_enables_msr_invalid(struct kvm_vcpu *vcpu, u64 data); + int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr); int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index cb0f63879a25..07a598663ace 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1041,6 +1041,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) unsigned long fs_base, gs_base; u16 fs_sel, gs_sel; int i; + u64 msrval; vmx->req_immediate_exit = false; @@ -1064,8 +1065,9 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) vmx->loaded_cpu_state = vmx->loaded_vmcs; host_state = &vmx->loaded_cpu_state->host_state; - if (this_cpu_read(msr_misc_features_shadow)) - wrmsrl(MSR_MISC_FEATURES_ENABLES, 0ULL); + msrval = this_cpu_read(msr_misc_features_shadow); + if (msrval && msrval != vcpu->arch.msr_misc_features_enables) + wrmsrl(MSR_MISC_FEATURES_ENABLES, vcpu->arch.msr_misc_features_enables); /* * Set host fs and gs selectors. Unfortunately, 22.2.3 does not @@ -1138,7 +1140,7 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx) vmx->loaded_cpu_state = NULL; msrval = this_cpu_read(msr_misc_features_shadow); - if (msrval) + if (msrval && msrval != vmx->vcpu.arch.msr_misc_features_enables) wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval); #ifdef CONFIG_X86_64 @@ -2027,6 +2029,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) else vmx->pt_desc.guest.addr_a[index / 2] = data; break; + case MSR_MISC_FEATURES_ENABLES: + if (kvm_misc_features_enables_msr_invalid(vcpu, data)) + return 1; + if (vmx->loaded_cpu_state && this_cpu_read(msr_misc_features_shadow)) + wrmsrl(MSR_MISC_FEATURES_ENABLES, data); + vcpu->arch.msr_misc_features_enables = data; + break; case MSR_TSC_AUX: if (!msr_info->host_initiated && !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP)) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ad1df965574e..749aa4c9437a 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2449,6 +2449,13 @@ static void record_steal_time(struct kvm_vcpu *vcpu) &vcpu->arch.st.steal, sizeof(struct kvm_steal_time)); } +bool kvm_misc_features_enables_msr_invalid(struct kvm_vcpu *vcpu, u64 data) +{ + return (data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT) || + (data && !supports_cpuid_fault(vcpu)); +} +EXPORT_SYMBOL_GPL(kvm_misc_features_enables_msr_invalid); + int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { bool pr = false; @@ -2669,9 +2676,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vcpu->arch.msr_platform_info = data; break; case MSR_MISC_FEATURES_ENABLES: - if (data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT || - (data & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT && - !supports_cpuid_fault(vcpu))) + if (kvm_misc_features_enables_msr_invalid(vcpu, data)) return 1; vcpu->arch.msr_misc_features_enables = data; break;