Message ID | 20160706051051.7869-1-haozhong.zhang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Cc Jan, 2016-07-06 13:10 GMT+08:00 Haozhong Zhang <haozhong.zhang@intel.com>: > A different VMCS is loaded when L2 guest is running, so it's incorrect > to use the VMX preemption timer for L1 TSC deadline timer. This patch > switches to hrtimer for L1 TSC deadline timer when entering L2 guest, > and switches back to VMX preemption timer when nested VMEXIT from L2 to > L1. Nested preemption timer is emulated by hrtimer, so it doesn't influence vmcs02, why this is needed? Regards, Wanpeng Li -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/07/2016 07:37, Wanpeng Li wrote: > Cc Jan, > 2016-07-06 13:10 GMT+08:00 Haozhong Zhang <haozhong.zhang@intel.com>: >> A different VMCS is loaded when L2 guest is running, so it's incorrect >> to use the VMX preemption timer for L1 TSC deadline timer. This patch >> switches to hrtimer for L1 TSC deadline timer when entering L2 guest, >> and switches back to VMX preemption timer when nested VMEXIT from L2 to >> L1. > > Nested preemption timer is emulated by hrtimer, so it doesn't > influence vmcs02, why this is needed? I agree (and nested APIC timer is emulated by L1). Because the L1 APIC timer is set in the preemption timer before vmentry, this should not be necessary. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/06/16 13:37, Wanpeng Li wrote: > Cc Jan, > 2016-07-06 13:10 GMT+08:00 Haozhong Zhang <haozhong.zhang@intel.com>: > > A different VMCS is loaded when L2 guest is running, so it's incorrect > > to use the VMX preemption timer for L1 TSC deadline timer. This patch > > switches to hrtimer for L1 TSC deadline timer when entering L2 guest, > > and switches back to VMX preemption timer when nested VMEXIT from L2 to > > L1. > > Nested preemption timer is emulated by hrtimer, so it doesn't > influence vmcs02, why this is needed? Nested (L2) preemption timer is not affected for sure, and this patch is to fix another problem caused by using L1 preemption timer for L1 TSC deadline timer. When we use L1 VMX preemption timer for L1 TSC deadline timer, we intend to use the pin-based exec config and the VMX preemption timer value in vmcs01. However, when L2 guest is running, vmcs02 is loaded as the current VMCS so that a different VMX preemption timer config is used (i.e. VMX preemption timer is disabled in prepare_vmcs02()). If we still use preemption timer for L1 TSC deadline timer at this moment, then L1 TSC deadline timer will not be able to be triggered when L2 guest is running. Thanks, Haozhong -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/07/2016 08:05, Haozhong Zhang wrote: >> > >> > Nested preemption timer is emulated by hrtimer, so it doesn't >> > influence vmcs02, why this is needed? > Nested (L2) preemption timer is not affected for sure, and this patch > is to fix another problem caused by using L1 preemption timer for L1 > TSC deadline timer. When we use L1 VMX preemption timer for L1 TSC > deadline timer, we intend to use the pin-based exec config and the VMX > preemption timer value in vmcs01. However, when L2 guest is running, > vmcs02 is loaded as the current VMCS so that a different VMX > preemption timer config is used (i.e. VMX preemption timer is disabled > in prepare_vmcs02()). If we still use preemption timer for L1 TSC > deadline timer at this moment, then L1 TSC deadline timer will not be > able to be triggered when L2 guest is running. The right fix then is to set the pin-based controls in prepare_vmcs02, based on vcpu->arch.hv_deadline_tsc. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/06/16 09:46, Paolo Bonzini wrote: > > > On 06/07/2016 08:05, Haozhong Zhang wrote: > >> > > >> > Nested preemption timer is emulated by hrtimer, so it doesn't > >> > influence vmcs02, why this is needed? > > Nested (L2) preemption timer is not affected for sure, and this patch > > is to fix another problem caused by using L1 preemption timer for L1 > > TSC deadline timer. When we use L1 VMX preemption timer for L1 TSC > > deadline timer, we intend to use the pin-based exec config and the VMX > > preemption timer value in vmcs01. However, when L2 guest is running, > > vmcs02 is loaded as the current VMCS so that a different VMX > > preemption timer config is used (i.e. VMX preemption timer is disabled > > in prepare_vmcs02()). If we still use preemption timer for L1 TSC > > deadline timer at this moment, then L1 TSC deadline timer will not be > > able to be triggered when L2 guest is running. > > The right fix then is to set the pin-based controls in prepare_vmcs02, > based on vcpu->arch.hv_deadline_tsc. Then KVM needs to distinguish L1 and L2 VMEXITs for preemption timer (or does KVM already have this?). The current implementation which uses a separate hrtimer for L2 preemption timer can easily distinguish them. Haozhong -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/06/16 16:01, Haozhong Zhang wrote: > On 07/06/16 09:46, Paolo Bonzini wrote: > > > > > > On 06/07/2016 08:05, Haozhong Zhang wrote: > > >> > > > >> > Nested preemption timer is emulated by hrtimer, so it doesn't > > >> > influence vmcs02, why this is needed? > > > Nested (L2) preemption timer is not affected for sure, and this patch > > > is to fix another problem caused by using L1 preemption timer for L1 > > > TSC deadline timer. When we use L1 VMX preemption timer for L1 TSC > > > deadline timer, we intend to use the pin-based exec config and the VMX > > > preemption timer value in vmcs01. However, when L2 guest is running, > > > vmcs02 is loaded as the current VMCS so that a different VMX > > > preemption timer config is used (i.e. VMX preemption timer is disabled > > > in prepare_vmcs02()). If we still use preemption timer for L1 TSC > > > deadline timer at this moment, then L1 TSC deadline timer will not be > > > able to be triggered when L2 guest is running. > > > > The right fix then is to set the pin-based controls in prepare_vmcs02, > > based on vcpu->arch.hv_deadline_tsc. > > Then KVM needs to distinguish L1 and L2 VMEXITs for preemption timer > (or does KVM already have this?). The current implementation which > uses a separate hrtimer for L2 preemption timer can easily distinguish > them. > And L1 hypervisor may also want to use preemption timer for L2 guest. In this case, L0 KVM can not reuse preemption timer for both L1 and L2. Haozhong -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On 07/06/16 16:01, Haozhong Zhang wrote: > > On 07/06/16 09:46, Paolo Bonzini wrote: > > > On 06/07/2016 08:05, Haozhong Zhang wrote: > > > >> > Nested preemption timer is emulated by hrtimer, so it doesn't > > > >> > influence vmcs02, why this is needed? > > > > Nested (L2) preemption timer is not affected for sure, and this patch > > > > is to fix another problem caused by using L1 preemption timer for L1 > > > > TSC deadline timer. When we use L1 VMX preemption timer for L1 TSC > > > > deadline timer, we intend to use the pin-based exec config and the VMX > > > > preemption timer value in vmcs01. However, when L2 guest is running, > > > > vmcs02 is loaded as the current VMCS so that a different VMX > > > > preemption timer config is used (i.e. VMX preemption timer is disabled > > > > in prepare_vmcs02()). If we still use preemption timer for L1 TSC > > > > deadline timer at this moment, then L1 TSC deadline timer will not be > > > > able to be triggered when L2 guest is running. > > > > > > The right fix then is to set the pin-based controls in prepare_vmcs02, > > > based on vcpu->arch.hv_deadline_tsc. > > > > Then KVM needs to distinguish L1 and L2 VMEXITs for preemption timer > > (or does KVM already have this?). The current implementation which > > uses a separate hrtimer for L2 preemption timer can easily distinguish > > them. > > And L1 hypervisor may also want to use preemption timer for L2 > guest. In this case, L0 KVM can not reuse preemption timer for both L1 > and L2. The vmcs12's preemption timer is emulated by KVM through a separate hrtimer, so KVM would only use the vmcs02 preemption timer for the L1 APIC timer. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/06/16 04:14, Paolo Bonzini wrote: > > On 07/06/16 16:01, Haozhong Zhang wrote: > > > On 07/06/16 09:46, Paolo Bonzini wrote: > > > > On 06/07/2016 08:05, Haozhong Zhang wrote: > > > > >> > Nested preemption timer is emulated by hrtimer, so it doesn't > > > > >> > influence vmcs02, why this is needed? > > > > > Nested (L2) preemption timer is not affected for sure, and this patch > > > > > is to fix another problem caused by using L1 preemption timer for L1 > > > > > TSC deadline timer. When we use L1 VMX preemption timer for L1 TSC > > > > > deadline timer, we intend to use the pin-based exec config and the VMX > > > > > preemption timer value in vmcs01. However, when L2 guest is running, > > > > > vmcs02 is loaded as the current VMCS so that a different VMX > > > > > preemption timer config is used (i.e. VMX preemption timer is disabled > > > > > in prepare_vmcs02()). If we still use preemption timer for L1 TSC > > > > > deadline timer at this moment, then L1 TSC deadline timer will not be > > > > > able to be triggered when L2 guest is running. > > > > > > > > The right fix then is to set the pin-based controls in prepare_vmcs02, > > > > based on vcpu->arch.hv_deadline_tsc. > > > > > > Then KVM needs to distinguish L1 and L2 VMEXITs for preemption timer > > > (or does KVM already have this?). The current implementation which > > > uses a separate hrtimer for L2 preemption timer can easily distinguish > > > them. > > > > And L1 hypervisor may also want to use preemption timer for L2 > > guest. In this case, L0 KVM can not reuse preemption timer for both L1 > > and L2. > > The vmcs12's preemption timer is emulated by KVM through a separate hrtimer, > so KVM would only use the vmcs02 preemption timer for the L1 APIC timer. Ah, I got what you mean. Please drop this patch, and I'll send another one later. Thanks, Haozhong -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 85e2f0a..cc29c2a 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -10203,6 +10203,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) if (!vmcs02) return -ENOMEM; + if (kvm_lapic_hv_timer_in_use(vcpu)) + kvm_lapic_switch_to_sw_timer(vcpu); + enter_guest_mode(vcpu); vmx->nested.vmcs01_tsc_offset = vmcs_read64(TSC_OFFSET); @@ -10227,6 +10230,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch) if (msr_entry_idx) { leave_guest_mode(vcpu); vmx_load_vmcs01(vcpu); + kvm_lapic_switch_to_hv_timer(vcpu); nested_vmx_entry_failure(vcpu, vmcs12, EXIT_REASON_MSR_LOAD_FAIL, msr_entry_idx); return 1; @@ -10700,6 +10704,7 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason, nested_vmx_abort(vcpu, VMX_ABORT_SAVE_GUEST_MSR_FAIL); vmx_load_vmcs01(vcpu); + kvm_lapic_switch_to_hv_timer(vcpu); if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT) && nested_exit_intr_ack_set(vcpu)) {
A different VMCS is loaded when L2 guest is running, so it's incorrect to use the VMX preemption timer for L1 TSC deadline timer. This patch switches to hrtimer for L1 TSC deadline timer when entering L2 guest, and switches back to VMX preemption timer when nested VMEXIT from L2 to L1. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> --- arch/x86/kvm/vmx.c | 5 +++++ 1 file changed, 5 insertions(+)