From patchwork Thu Oct 10 16:12:32 2013
X-Patchwork-Submitter: Jan Kiszka
X-Patchwork-Id: 3017471
Message-ID: <5256D1F0.7000905@siemens.com>
Date: Thu, 10 Oct 2013 18:12:32 +0200
From: Jan Kiszka
To: Paolo Bonzini, Arthur Chunqi Li
CC: kvm@vger.kernel.org, gleb@redhat.com, "Zhang, Yang Z"
Subject: Re: [PATCH v5] KVM: nVMX: Fully support of nested VMX preemption timer
References: <1379319104-10266-1-git-send-email-yzt356@gmail.com>
 <52444CF6.1020102@redhat.com> <52493F8C.6040009@web.de>
 <524C6A30.9090403@web.de>
In-Reply-To: <524C6A30.9090403@web.de>
Sender: kvm-owner@vger.kernel.org
X-Mailing-List: kvm@vger.kernel.org

On 2013-10-02 20:47, Jan Kiszka wrote:
> On 2013-09-30 11:08, Jan Kiszka wrote:
>> On 2013-09-26 17:04, Paolo Bonzini wrote:
>>> On 16/09/2013 10:11, Arthur Chunqi Li wrote:
>>>> This patch contains the following two changes:
>>>> 1. Fix the bug in nested preemption timer support. If vmexit L2->L0
>>>> with some reasons not emulated by L1, preemption timer value should
>>>> be saved in such exits.
>>>> 2. Add support of "Save VMX-preemption timer value" VM-Exit controls
>>>> to nVMX.
>>>>
>>>> With this patch, nested VMX preemption timer features are fully
>>>> supported.
>>>>
>>>> Signed-off-by: Arthur Chunqi Li
>>>> ---
>>>> ChangeLog to v4:
>>>> Format changes and remove a flag in nested_vmx.
>>>> arch/x86/include/uapi/asm/msr-index.h | 1 +
>>>> arch/x86/kvm/vmx.c | 44 +++++++++++++++++++++++++++++++--
>>>> 2 files changed, 43 insertions(+), 2 deletions(-)
>>>
>>> Hi all,
>>>
>>> the test fails for me if the preemption timer value is set to a value
>>> that is above ~2000 (which means ~65000 TSC cycles on this machine).
>>> The preemption timer seems to count faster than what is expected, for
>>> example only up to 4 million cycles if you set it to one million.
>>> So, I am leaving the patch out of kvm/queue for now, until I can
>>> test it on more processors.
>>
>> I've done some measurements with the help of ftrace on the time it takes
>> to let the preemption timer trigger (no adjustments via Arthur's patch
>> were involved): On my Core i7-620M, the preemption timer seems to tick
>> almost 10 times faster than spec and scale value (5) suggests. I've
>> loaded a value of 100000, and it took about 130 µs until I got a vmexit
>> with reason PREEMPTION_TIMER (no other exits in between).
>>
>> qemu-system-x86-13765 [003] 298562.966079: bprint: prepare_vmcs02: preempt val 100000
>> qemu-system-x86-13765 [003] 298562.966083: kvm_entry: vcpu 0
>> qemu-system-x86-13765 [003] 298562.966212: kvm_exit: reason PREEMPTION_TIMER rip 0x401fea info 0 0
>>
>> That's a frequency of ~769 MHz. The TSC ticks at 2.66 GHz. But 769 MHz *
>> 2^5 is 24.6 GHz. I've read the spec several times, but it seems pretty
>> clear on this. It just doesn't match reality. Very strange.
>
> ...but documented: I found a related erratum for my processor (AAT59)
> and also for Xeon 5500 (AAK139). At least the current Haswell generation
> is not affected. I can test the patch on a Haswell board I have at work
> later this week.

To complete this story: Arthur's patch works fine on a non-broken CPU
(here: i7-4770S).

Arthur, find some fix-ups for your test case below. They avoid printing
from within L2, as this could deadlock when the timer fires and L1 then
tries to print something. They also disable the preemption timer when
the test leaves, so that it cannot fire again later on. If you want to
fold this into your patch, feel free. Otherwise I can post a separate
patch on top.
Jan

diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 4372878..66a4201 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -141,6 +141,9 @@ void preemption_timer_init()
 	preempt_val = 10000000;
 	vmcs_write(PREEMPT_TIMER_VALUE, preempt_val);
 	preempt_scale = rdmsr(MSR_IA32_VMX_MISC) & 0x1F;
+
+	if (!(ctrl_exit_rev.clr & EXI_SAVE_PREEMPT))
+		printf("\tSave preemption value is not supported\n");
 }
 
 void preemption_timer_main()
@@ -150,9 +153,7 @@ void preemption_timer_main()
 		printf("\tPreemption timer is not supported\n");
 		return;
 	}
-	if (!(ctrl_exit_rev.clr & EXI_SAVE_PREEMPT))
-		printf("\tSave preemption value is not supported\n");
-	else {
+	if (ctrl_exit_rev.clr & EXI_SAVE_PREEMPT) {
 		set_stage(0);
 		vmcall();
 		if (get_stage() == 1)
@@ -161,8 +162,8 @@ void preemption_timer_main()
 	while (1) {
 		if (((rdtsc() - tsc_val) >> preempt_scale)
 				> 10 * preempt_val) {
-			report("Preemption timer", 0);
-			break;
+			set_stage(2);
+			vmcall();
 		}
 	}
 }
@@ -183,7 +184,7 @@ int preemption_timer_exit_handler()
 			report("Preemption timer", 0);
 		else
 			report("Preemption timer", 1);
-		return VMX_TEST_VMEXIT;
+		break;
 	case VMX_VMCALL:
 		switch (get_stage()) {
 		case 0:
@@ -195,24 +196,29 @@ int preemption_timer_exit_handler()
 					EXI_SAVE_PREEMPT) & ctrl_exit_rev.clr;
 				vmcs_write(EXI_CONTROLS, ctrl_exit);
 			}
-			break;
+			vmcs_write(GUEST_RIP, guest_rip + insn_len);
+			return VMX_TEST_RESUME;
 		case 1:
 			if (vmcs_read(PREEMPT_TIMER_VALUE) >= preempt_val)
 				report("Save preemption value", 0);
 			else
 				report("Save preemption value", 1);
+			vmcs_write(GUEST_RIP, guest_rip + insn_len);
+			return VMX_TEST_RESUME;
+		case 2:
+			report("Preemption timer", 0);
 			break;
 		default:
 			printf("Invalid stage.\n");
 			print_vmexit_info();
-			return VMX_TEST_VMEXIT;
+			break;
 		}
-		vmcs_write(GUEST_RIP, guest_rip + insn_len);
-		return VMX_TEST_RESUME;
+		break;
 	default:
 		printf("Unknown exit reason, %d\n", reason);
 		print_vmexit_info();
 	}
+	vmcs_write(PIN_CONTROLS, vmcs_read(PIN_CONTROLS) & ~PIN_PREEMPT);
 	return VMX_TEST_VMEXIT;
 }
 
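
For anyone who wants to re-check the numbers quoted in this thread, here is
a minimal sketch in C of the arithmetic involved. It is only an
illustration: the function names (observed_timer_hz, expected_timer_hz,
timer_overdue) are hypothetical and are not part of kvm-unit-tests or of
the patch. The last helper mirrors the give-up condition used in
preemption_timer_main(), with the TSC readings passed in as plain
arguments instead of being read via rdtsc().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Observed preemption timer rate in Hz: the number of timer ticks that
 * were loaded, divided by the time it took until the vmexit arrived.
 * (Hypothetical helper, for illustration only.)
 */
double observed_timer_hz(uint64_t timer_ticks, double elapsed_us)
{
	assert(elapsed_us > 0.0);
	return (double)timer_ticks / (elapsed_us * 1e-6);
}

/*
 * Rate the timer should have when it counts down once every 2^scale
 * TSC cycles, where scale is bits 4:0 of IA32_VMX_MISC.
 * (Hypothetical helper, for illustration only.)
 */
double expected_timer_hz(double tsc_hz, unsigned int scale)
{
	assert(scale < 32);
	return tsc_hz / (double)(1ULL << scale);
}

/*
 * Same comparison as the guest loop in the test: convert elapsed TSC
 * cycles to timer ticks and give up once ten times the programmed
 * value has passed without a timer exit.
 */
int timer_overdue(uint64_t tsc_now, uint64_t tsc_start,
		  unsigned int scale, uint64_t preempt_val)
{
	assert(scale < 32);
	return ((tsc_now - tsc_start) >> scale) > 10 * preempt_val;
}
```

With the values from the trace (100000 ticks in about 130 µs, a 2.66 GHz
TSC and a scale of 5), observed_timer_hz() comes out at roughly 769 MHz,
while expected_timer_hz() gives about 83 MHz, which is the "almost 10
times faster" discrepancy described above.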