From patchwork Tue Aug 2 12:54:52 2011
X-Patchwork-Submitter: Nadav Har'El
X-Patchwork-Id: 1029392
Date: Tue, 2 Aug 2011 15:54:52 +0300
From: "Nadav Har'El"
To: kvm@vger.kernel.org
Cc: "Roedel, Joerg", Zachary Amsden, Bandan Das, Marcelo Tosatti, avi@redhat.com
Message-Id: <201108021254.p72Csqn1002137@rice.haifa.ibm.com>
References: <1312289591-nyh@il.ibm.com>
Subject: [PATCH 2/3] Fix nested VMX TSC emulation

This patch fixes two corner cases in nested (L2) TSC handling:

1. Somewhat surprisingly, according to the Intel spec, if L1 allows WRMSR to
the TSC MSR without an exit, then this should set L1's TSC value itself, not
a value offset by vmcs12.TSC_OFFSET (as the previous code wrongly did).

2. Allow L1 to disable the TSC offsetting control
(CPU_BASED_USE_TSC_OFFSETING), and then correctly ignore vmcs12.TSC_OFFSET.
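In other words, the TSC visible at each level is derived by summing the
offsets that are in effect. The sketch below (illustrative only, not part of
the patch; vmcs02_tsc_offset() is a hypothetical helper name, while
nested_cpu_has() and struct vmcs12 are the ones used in the diff below)
spells out the vmcs02 value the patch computes:

	/*
	 * Illustrative sketch, not kernel code:
	 *
	 *   L1's TSC = host TSC + vmcs01.TSC_OFFSET
	 *   L2's TSC = L1's TSC + vmcs12.TSC_OFFSET  (offsetting enabled in vmcs12)
	 *   L2's TSC = L1's TSC                      (offsetting disabled in vmcs12)
	 *
	 * A WRMSR to the TSC that L1 chose not to trap replaces the vmcs01
	 * offset, so the effective vmcs02.TSC_OFFSET must be recomputed as:
	 */
	static u64 vmcs02_tsc_offset(u64 vmcs01_tsc_offset,
				     struct vmcs12 *vmcs12)
	{
		if (nested_cpu_has(vmcs12, CPU_BASED_USE_TSC_OFFSETING))
			return vmcs01_tsc_offset + vmcs12->tsc_offset;
		return vmcs01_tsc_offset;
	}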
Signed-off-by: Nadav Har'El
---
 arch/x86/kvm/vmx.c |   31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

--- .before/arch/x86/kvm/vmx.c	2011-08-02 15:51:02.000000000 +0300
+++ .after/arch/x86/kvm/vmx.c	2011-08-02 15:51:02.000000000 +0300
@@ -1777,15 +1777,23 @@ static void vmx_set_tsc_khz(struct kvm_v
  */
 static void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
 {
-	vmcs_write64(TSC_OFFSET, offset);
-	if (is_guest_mode(vcpu))
+	if (is_guest_mode(vcpu)) {
 		/*
-		 * We're here if L1 chose not to trap the TSC MSR. Since
-		 * prepare_vmcs12() does not copy tsc_offset, we need to also
-		 * set the vmcs12 field here.
+		 * We're here if L1 chose not to trap WRMSR to TSC. According
+		 * to the spec, this should set L1's TSC; The offset that L1
+		 * set for L2 remains unchanged, and still needs to be added
+		 * to the newly set TSC to get L2's TSC.
 		 */
-		get_vmcs12(vcpu)->tsc_offset = offset -
-			to_vmx(vcpu)->nested.vmcs01_tsc_offset;
+		struct vmcs12 *vmcs12;
+		to_vmx(vcpu)->nested.vmcs01_tsc_offset = offset;
+		/* recalculate vmcs02.TSC_OFFSET: */
+		vmcs12 = get_vmcs12(vcpu);
+		vmcs_write64(TSC_OFFSET, offset +
+			(nested_cpu_has(vmcs12, CPU_BASED_USE_TSC_OFFSETING) ?
+			 vmcs12->tsc_offset : 0));
+	} else {
+		vmcs_write64(TSC_OFFSET, offset);
+	}
 }
 
 static void vmx_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment)
@@ -6529,8 +6537,11 @@ static void prepare_vmcs02(struct kvm_vc
 
 	set_cr4_guest_host_mask(vmx);
 
-	vmcs_write64(TSC_OFFSET,
-		vmx->nested.vmcs01_tsc_offset + vmcs12->tsc_offset);
+	if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETING)
+		vmcs_write64(TSC_OFFSET,
+			vmx->nested.vmcs01_tsc_offset + vmcs12->tsc_offset);
+	else
+		vmcs_write64(TSC_OFFSET, vmx->nested.vmcs01_tsc_offset);
 
 	if (enable_vpid) {
 		/*
@@ -6937,7 +6948,7 @@ static void nested_vmx_vmexit(struct kvm
 
 	load_vmcs12_host_state(vcpu, vmcs12);
 
-	/* Update TSC_OFFSET if vmx_adjust_tsc_offset() was used while L2 ran */
+	/* Update TSC_OFFSET if TSC was changed while L2 ran */
 	vmcs_write64(TSC_OFFSET, vmx->nested.vmcs01_tsc_offset);
 
 	/* This is needed for same reason as it was needed in prepare_vmcs02 */
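As a sanity check of the arithmetic, a hypothetical walk-through (all values
invented for illustration):

	/*
	 * Suppose vmcs01_tsc_offset = 100 (L1 relative to the host) and
	 * vmcs12->tsc_offset = 50 (L2 relative to L1). While L2 runs,
	 * vmcs02.TSC_OFFSET = 100 + 50 = 150.
	 *
	 * If L1 now does a non-trapped WRMSR to the TSC and KVM computes a
	 * new offset of 200 for it:
	 *   - before this patch: vmcs02.TSC_OFFSET became 200, i.e. the
	 *     write moved L2's TSC and L1's intended value was lost;
	 *   - with this patch: vmcs01_tsc_offset = 200 and
	 *     vmcs02.TSC_OFFSET = 200 + 50 = 250, i.e. L1's TSC is set as
	 *     the spec requires and L2 keeps its +50 on top of it.
	 */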