Message ID | 5256D1F0.7000905@siemens.com (mailing list archive) |
---|---|
State | New, archived |
On 10/10/2013 18:12, Jan Kiszka wrote:
> On 2013-10-02 20:47, Jan Kiszka wrote:
>> On 2013-09-30 11:08, Jan Kiszka wrote:
>>> On 2013-09-26 17:04, Paolo Bonzini wrote:
>>>> On 16/09/2013 10:11, Arthur Chunqi Li wrote:
>>>>> This patch contains the following two changes:
>>>>> 1. Fix the bug in nested preemption timer support. If vmexit L2->L0
>>>>> with some reasons not emulated by L1, preemption timer value should
>>>>> be saved in such exits.
>>>>> 2. Add support of "Save VMX-preemption timer value" VM-Exit controls
>>>>> to nVMX.
>>>>>
>>>>> With this patch, nested VMX preemption timer features are fully
>>>>> supported.
>>>>>
>>>>> Signed-off-by: Arthur Chunqi Li <yzt356@gmail.com>
>>>>> ---
>>>>> ChangeLog to v4:
>>>>> Format changes and remove a flag in nested_vmx.
>>>>> arch/x86/include/uapi/asm/msr-index.h | 1 +
>>>>> arch/x86/kvm/vmx.c | 44 +++++++++++++++++++++++++++++++--
>>>>> 2 files changed, 43 insertions(+), 2 deletions(-)
>>>>
>>>> Hi all,
>>>>
>>>> the test fails for me if the preemption timer value is set to a value
>>>> that is above ~2000 (which means ~65000 TSC cycles on this machine).
>>>> The preemption timer seems to count faster than what is expected, for
>>>> example only up to 4 million cycles if you set it to one million.
>>>> So, I am leaving the patch out of kvm/queue for now, until I can
>>>> test it on more processors.
>>>
>>> I've done some measurements with the help of ftrace on the time it takes
>>> to let the preemption timer trigger (no adjustments via Arthur's patch
>>> were involved): On my Core i7-620M, the preemption timer seems to tick
>>> almost 10 times faster than spec and scale value (5) suggests. I've
>>> loaded a value of 100000, and it took about 130 µs until I got a vmexit
>>> with reason PREEMPTION_TIMER (no other exits in between).
>>>
>>> qemu-system-x86-13765 [003] 298562.966079: bprint: prepare_vmcs02: preempt val 100000
>>> qemu-system-x86-13765 [003] 298562.966083: kvm_entry: vcpu 0
>>> qemu-system-x86-13765 [003] 298562.966212: kvm_exit: reason PREEMPTION_TIMER rip 0x401fea info 0 0
>>>
>>> That's a frequency of ~769 MHz. The TSC ticks at 2.66 GHz. But 769 MHz *
>>> 2^5 is 24.6 GHz. I've read the spec several times, but it seems pretty
>>> clear on this. It just doesn't match reality. Very strange.
>>
>> ...but documented: I found a related erratum for my processor (AAT59)
>> and also for Xeon 5500 (AAK139). At least current Haswell generation is
>> not affected. I can test the patch on a Haswell board I have at work
>> later this week.
>
> To complete this story: Arthur's patch works fine on a non-broken CPU
> (here: i7-4770S).
>
> Arthur, find some fix-ups for your test case below. It avoids printing
> from within L2 as this could deadlock when the timer fires and L1 then
> tries to print something. Also, it disables the preemption timer on
> leave so that it cannot fire later on again. If you want to fold this
> into your patch, feel free. Otherwise I can post a separate patch on
> top.

Is that a Signed-off-by? :)

BTW, VirtualBox has a test for this erratum. It would be nice to skip
the test when the processor is found to be buggy.

I'll put Arthur's patch back. Thanks for testing!

Paolo

static bool hmR0InitIntelIsSubjectToVmxPreemptionTimerErratum(void)
{
    uint32_t u = ASMCpuId_EAX(1);
    u &= ~(RT_BIT_32(14) | RT_BIT_32(15) | RT_BIT_32(28) | RT_BIT_32(29) | RT_BIT_32(30) | RT_BIT_32(31));
    if (   u == UINT32_C(0x000206E6) /* 323344.pdf - BA86 - D0 - Intel Xeon Processor 7500 Series */
        || u == UINT32_C(0x00020652) /* 323056.pdf - AAX65 - C2 - Intel Xeon Processor L3406 */
        || u == UINT32_C(0x00020652) /* 322814.pdf - AAT59 - C2 - Intel CoreTM i7-600, i5-500, i5-400 and i3-300 Mobile Processor Series */
        || u == UINT32_C(0x00020652) /* 322911.pdf - AAU65 - C2 - Intel CoreTM i5-600, i3-500 Desktop Processor Series and Intel Pentium Processor G6950 */
        || u == UINT32_C(0x00020655) /* 322911.pdf - AAU65 - K0 - Intel CoreTM i5-600, i3-500 Desktop Processor Series and Intel Pentium Processor G6950 */
        || u == UINT32_C(0x000106E5) /* 322373.pdf - AAO95 - B1 - Intel Xeon Processor 3400 Series */
        || u == UINT32_C(0x000106E5) /* 322166.pdf - AAN92 - B1 - Intel CoreTM i7-800 and i5-700 Desktop Processor Series */
        || u == UINT32_C(0x000106E5) /* 320767.pdf - AAP86 - B1 - Intel Core i7-900 Mobile Processor Extreme Edition Series, Intel Core i7-800 and i7-700 Mobile Processor Series */
        || u == UINT32_C(0x000106A0) /*?321333.pdf - AAM126 - C0 - Intel Xeon Processor 3500 Series Specification */
        || u == UINT32_C(0x000106A1) /*?321333.pdf - AAM126 - C1 - Intel Xeon Processor 3500 Series Specification */
        || u == UINT32_C(0x000106A4) /* 320836.pdf - AAJ124 - C0 - Intel Core i7-900 Desktop Processor Extreme Edition Series and Intel Core i7-900 Desktop Processor Series */
        || u == UINT32_C(0x000106A5) /* 321333.pdf - AAM126 - D0 - Intel Xeon Processor 3500 Series Specification */
        || u == UINT32_C(0x000106A5) /* 321324.pdf - AAK139 - D0 - Intel Xeon Processor 5500 Series Specification */
        || u == UINT32_C(0x000106A5) /* 320836.pdf - AAJ124 - D0 - Intel Core i7-900 Desktop Processor Extreme Edition Series and Intel Core i7-900 Desktop Processor Series */
       )
        return true;
    return false;
}

> Jan
>
> diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
> index 4372878..66a4201 100644
> --- a/x86/vmx_tests.c
> +++ b/x86/vmx_tests.c
> @@ -141,6 +141,9 @@ void preemption_timer_init()
>  	preempt_val = 10000000;
>  	vmcs_write(PREEMPT_TIMER_VALUE, preempt_val);
>  	preempt_scale = rdmsr(MSR_IA32_VMX_MISC) & 0x1F;
> +
> +	if (!(ctrl_exit_rev.clr & EXI_SAVE_PREEMPT))
> +		printf("\tSave preemption value is not supported\n");
>  }
>
>  void preemption_timer_main()
> @@ -150,9 +153,7 @@ void preemption_timer_main()
>  		printf("\tPreemption timer is not supported\n");
>  		return;
>  	}
> -	if (!(ctrl_exit_rev.clr & EXI_SAVE_PREEMPT))
> -		printf("\tSave preemption value is not supported\n");
> -	else {
> +	if (ctrl_exit_rev.clr & EXI_SAVE_PREEMPT) {
>  		set_stage(0);
>  		vmcall();
>  		if (get_stage() == 1)
> @@ -161,8 +162,8 @@ void preemption_timer_main()
>  	while (1) {
>  		if (((rdtsc() - tsc_val) >> preempt_scale)
>  				> 10 * preempt_val) {
> -			report("Preemption timer", 0);
> -			break;
> +			set_stage(2);
> +			vmcall();
>  		}
>  	}
>  }
> @@ -183,7 +184,7 @@ int preemption_timer_exit_handler()
>  			report("Preemption timer", 0);
>  		else
>  			report("Preemption timer", 1);
> -		return VMX_TEST_VMEXIT;
> +		break;
>  	case VMX_VMCALL:
>  		switch (get_stage()) {
>  		case 0:
> @@ -195,24 +196,29 @@ int preemption_timer_exit_handler()
>  					EXI_SAVE_PREEMPT) & ctrl_exit_rev.clr;
>  				vmcs_write(EXI_CONTROLS, ctrl_exit);
>  			}
> -			break;
> +			vmcs_write(GUEST_RIP, guest_rip + insn_len);
> +			return VMX_TEST_RESUME;
>  		case 1:
>  			if (vmcs_read(PREEMPT_TIMER_VALUE) >= preempt_val)
>  				report("Save preemption value", 0);
>  			else
>  				report("Save preemption value", 1);
> +			vmcs_write(GUEST_RIP, guest_rip + insn_len);
> +			return VMX_TEST_RESUME;
> +		case 2:
> +			report("Preemption timer", 0);
>  			break;
>  		default:
>  			printf("Invalid stage.\n");
>  			print_vmexit_info();
> -			return VMX_TEST_VMEXIT;
> +			break;
>  		}
> -		vmcs_write(GUEST_RIP, guest_rip + insn_len);
> -		return VMX_TEST_RESUME;
> +		break;
>  	default:
>  		printf("Unknown exit reason, %d\n", reason);
>  		print_vmexit_info();
>  	}
> +	vmcs_write(PIN_CONTROLS, vmcs_read(PIN_CONTROLS) & ~PIN_PREEMPT);
>  	return VMX_TEST_VMEXIT;
>  }
>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
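For readers who want to retrace the arithmetic behind Jan's measurement (100000 timer ticks in about 130 µs, a 2.66 GHz TSC and a scale value of 5), here is a minimal sketch in C. The function names are hypothetical and are not part of KVM or the test suite; integer kHz is used only to keep the example free of floating point.

```c
#include <stdint.h>

/* Hypothetical helper: observed timer rate in kHz, given the number of
 * ticks that elapsed over elapsed_ns nanoseconds. */
static inline uint64_t observed_rate_khz(uint64_t ticks, uint64_t elapsed_ns)
{
	return ticks * 1000000ULL / elapsed_ns;
}

/* Rate the spec implies: one timer tick every 2^scale TSC cycles. */
static inline uint64_t expected_rate_khz(uint64_t tsc_khz, unsigned int scale)
{
	return tsc_khz >> scale;
}

/* TSC frequency that the observed rate would imply if the spec held. */
static inline uint64_t implied_tsc_khz(uint64_t timer_khz, unsigned int scale)
{
	return timer_khz << scale;
}

/* Ratio of observed to expected rate, in percent (100 means "as specified"). */
static inline uint64_t speedup_percent(uint64_t observed_khz, uint64_t expected_khz)
{
	return observed_khz * 100 / expected_khz;
}
```

With the numbers from the thread, observed_rate_khz(100000, 130000) gives about 769 MHz, while expected_rate_khz(2660000, 5) gives about 83 MHz, so the timer runs a bit more than nine times too fast, consistent with "almost 10 times faster".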
Hi Jan,

On Fri, Oct 11, 2013 at 12:12 AM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> To complete this story: Arthur's patch works fine on a non-broken CPU
> (here: i7-4770S).
>
> Arthur, find some fix-ups for your test case below. It avoids printing
> from within L2 as this could deadlock when the timer fires and L1 then
> tries to print something. Also, it disables the preemption timer on
> leave so that it cannot fire later on again. If you want to fold this
> into your patch, feel free. Otherwise I can post a separate patch on
> top.

I think this can be treated as a separate patch to our test suite. You
can post it on top. I have tested it and it works fine.

Arthur

> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SES-DE
> Corporate Competence Center Embedded Linux
On 10/10/2013 17:20, Paolo Bonzini wrote:
>> > Arthur, find some fix-ups for your test case below. It avoids printing
>> > from within L2 as this could deadlock when the timer fires and L1 then
>> > tries to print something. Also, it disables the preemption timer on
>> > leave so that it cannot fire later on again. If you want to fold this
>> > into your patch, feel free. Otherwise I can post a separate patch on
>> > top.
> Is that a Signed-off-by? :)

Ping?

Paolo
On 2013-10-25 10:56, Paolo Bonzini wrote:
> On 10/10/2013 17:20, Paolo Bonzini wrote:
>>>> Arthur, find some fix-ups for your test case below. It avoids printing
>>>> from within L2 as this could deadlock when the timer fires and L1 then
>>>> tries to print something. Also, it disables the preemption timer on
>>>> leave so that it cannot fire later on again. If you want to fold this
>>>> into your patch, feel free. Otherwise I can post a separate patch on
>>>> top.
>> Is that a Signed-off-by? :)
>
> Ping?

Sent as separate patch two days ago.

Jan
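Earlier in the thread Paolo suggested skipping the test on processors with the erratum. One way this might look is a table-driven check that applies the same mask as the quoted VirtualBox function (bits 14-15 and 28-31 of CPUID.1:EAX cleared) and compares against the signatures listed there. This is only a sketch: `sig_has_preempt_timer_erratum`, `affected_sigs` and `SIG_IGNORED_BITS` are hypothetical names, and the function takes the raw EAX value as a parameter, so how the test suite would obtain it is left open.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bits 14-15 and 28-31 of CPUID.1:EAX are ignored, mirroring the mask in
 * the VirtualBox function quoted in the thread. */
#define SIG_IGNORED_BITS ((3u << 14) | (0xFu << 28))

/* Distinct signatures from the quoted VirtualBox list. */
static const uint32_t affected_sigs[] = {
	0x000206E6, 0x00020652, 0x00020655, 0x000106E5,
	0x000106A0, 0x000106A1, 0x000106A4, 0x000106A5,
};

/* Hypothetical helper: true if the given CPUID.1:EAX value matches one of
 * the signatures listed as affected by the preemption timer erratum. */
static inline bool sig_has_preempt_timer_erratum(uint32_t cpuid1_eax)
{
	uint32_t sig = cpuid1_eax & ~SIG_IGNORED_BITS;
	size_t i;

	for (i = 0; i < sizeof(affected_sigs) / sizeof(affected_sigs[0]); i++) {
		if (sig == affected_sigs[i])
			return true;
	}
	return false;
}
```

A test could call this once during initialization and report a skip instead of a failure when it returns true. Whether the list is complete is not something this thread establishes.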
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 4372878..66a4201 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -141,6 +141,9 @@ void preemption_timer_init()
 	preempt_val = 10000000;
 	vmcs_write(PREEMPT_TIMER_VALUE, preempt_val);
 	preempt_scale = rdmsr(MSR_IA32_VMX_MISC) & 0x1F;
+
+	if (!(ctrl_exit_rev.clr & EXI_SAVE_PREEMPT))
+		printf("\tSave preemption value is not supported\n");
 }
 
 void preemption_timer_main()
@@ -150,9 +153,7 @@ void preemption_timer_main()
 		printf("\tPreemption timer is not supported\n");
 		return;
 	}
-	if (!(ctrl_exit_rev.clr & EXI_SAVE_PREEMPT))
-		printf("\tSave preemption value is not supported\n");
-	else {
+	if (ctrl_exit_rev.clr & EXI_SAVE_PREEMPT) {
 		set_stage(0);
 		vmcall();
 		if (get_stage() == 1)
@@ -161,8 +162,8 @@ void preemption_timer_main()
 	while (1) {
 		if (((rdtsc() - tsc_val) >> preempt_scale)
 				> 10 * preempt_val) {
-			report("Preemption timer", 0);
-			break;
+			set_stage(2);
+			vmcall();
 		}
 	}
 }
@@ -183,7 +184,7 @@ int preemption_timer_exit_handler()
 			report("Preemption timer", 0);
 		else
 			report("Preemption timer", 1);
-		return VMX_TEST_VMEXIT;
+		break;
 	case VMX_VMCALL:
 		switch (get_stage()) {
 		case 0:
@@ -195,24 +196,29 @@ int preemption_timer_exit_handler()
 					EXI_SAVE_PREEMPT) & ctrl_exit_rev.clr;
 				vmcs_write(EXI_CONTROLS, ctrl_exit);
 			}
-			break;
+			vmcs_write(GUEST_RIP, guest_rip + insn_len);
+			return VMX_TEST_RESUME;
 		case 1:
 			if (vmcs_read(PREEMPT_TIMER_VALUE) >= preempt_val)
 				report("Save preemption value", 0);
 			else
 				report("Save preemption value", 1);
+			vmcs_write(GUEST_RIP, guest_rip + insn_len);
+			return VMX_TEST_RESUME;
+		case 2:
+			report("Preemption timer", 0);
 			break;
 		default:
 			printf("Invalid stage.\n");
 			print_vmexit_info();
-			return VMX_TEST_VMEXIT;
+			break;
 		}
-		vmcs_write(GUEST_RIP, guest_rip + insn_len);
-		return VMX_TEST_RESUME;
+		break;
 	default:
 		printf("Unknown exit reason, %d\n", reason);
 		print_vmexit_info();
 	}
+	vmcs_write(PIN_CONTROLS, vmcs_read(PIN_CONTROLS) & ~PIN_PREEMPT);
 	return VMX_TEST_VMEXIT;
 }
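The timing logic of the test can be isolated into two small predicates, sketched below in C. The guest-side condition is taken from the loop visible in the patch (`((rdtsc() - tsc_val) >> preempt_scale) > 10 * preempt_val`). The host-side condition is an assumption, since the corresponding `if` lies outside the diff context; it is written here as "at least `preempt_val` ticks must have elapsed". All function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a span of TSC cycles into preemption timer ticks:
 * the timer counts down once every 2^scale TSC cycles. */
static inline uint64_t tsc_to_timer_ticks(uint64_t tsc_delta, unsigned int scale)
{
	return tsc_delta >> scale;
}

/* Host side (assumed check): a preemption timer exit is plausible only if
 * at least preempt_val ticks have elapsed since the timer was armed. */
static inline bool timer_exit_is_valid(uint64_t tsc_delta, unsigned int scale,
				       uint64_t preempt_val)
{
	return tsc_to_timer_ticks(tsc_delta, scale) >= preempt_val;
}

/* Guest side (from the patch): give up once more than ten times the
 * programmed value has elapsed without a timer exit. */
static inline bool guest_should_give_up(uint64_t tsc_delta, unsigned int scale,
					uint64_t preempt_val)
{
	return tsc_to_timer_ticks(tsc_delta, scale) > 10 * preempt_val;
}
```

On a processor with the erratum discussed in this thread, the exit arrives after far fewer TSC cycles than `preempt_val << scale`, so a check of the first kind fails even though the nested code path is behaving as intended.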