Message ID | 1502095466-21312-3-git-send-email-longpeng2@huawei.com (mailing list archive) |
---|---|
State | New, archived |
On 07/08/2017 10:44, Longpeng(Mike) wrote:
> +
> +	/*
> +	 * Intel sdm vol3 ch-25.1.3 says: The “PAUSE-loop exiting”
> +	 * VM-execution control is ignored if CPL > 0. So the vcpu
> +	 * is always exiting with CPL=0 if it uses PLE.

This is not true (how can it be?). What 25.1.3 says is, the VCPU is
always at CPL=0 if you get a PAUSE exit (reason 40) and PAUSE exiting is
0 (it always is for KVM). But here you're looking for a VCPU that
didn't get a PAUSE exit, so the CPL can certainly be 3.

However, I understand that vmx_get_cpl can be a bit slow here. You can
actually read SS's access rights directly in this function and get the
DPL from there, that's going to be just a single VMREAD.

The only difference is when vmx->rmode.vm86_active=1. However,
pause-loop exiting is not working properly anyway if
vmx->rmode.vm86_active=1, because CPL=3 according to the processor.

Paolo

> +	 * The following block needs less cycles than vmx_get_cpl().
> +	 */
> +	if (cpu_has_secondary_exec_ctrls())
> +		secondary_exec_ctrl = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
> +	if (secondary_exec_ctrl & SECONDARY_EXEC_PAUSE_LOOP_EXITING)
> +		return true;
> +

Paolo
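For readers following along, a minimal sketch of the single-VMREAD approach Paolo describes (not part of the posted patch): it assumes the GUEST_SS_AR_BYTES VMCS field, takes the DPL from bits 6:5 of the access-rights value, and deliberately ignores the vmx->rmode.vm86_active case he notes.

/*
 * Sketch of the suggestion above, not the posted patch: a single VMREAD
 * of the guest SS access rights, with the DPL in bits 6:5.  Does not
 * handle vmx->rmode.vm86_active.
 */
static bool vmx_spin_in_kernel(struct kvm_vcpu *vcpu)
{
	u32 ss_ar = vmcs_read32(GUEST_SS_AR_BYTES);

	/* CPL equals SS.DPL outside virtual-8086 mode. */
	return ((ss_ar >> 5) & 3) == 0;
}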
On 08/07/2017 06:45 PM, Paolo Bonzini wrote:
> On 07/08/2017 10:44, Longpeng(Mike) wrote:
>> +
>> +	/*
>> +	 * Intel sdm vol3 ch-25.1.3 says: The “PAUSE-loop exiting”
>> +	 * VM-execution control is ignored if CPL > 0. So the vcpu
>> +	 * is always exiting with CPL=0 if it uses PLE.
>
> This is not true (how can it be?). What 25.1.3 says is, the VCPU is
> always at CPL=0 if you get a PAUSE exit (reason 40) and PAUSE exiting is
> 0 (it always is for KVM). But here you're looking for a VCPU that
> didn't get a PAUSE exit, so the CPL can certainly be 3.
>

Hi Paolo,

My comment above is wrong (please forgive my poor English); my original
meaning is: the “PAUSE-loop exiting” VM-execution control is ignored if
CPL > 0, so the vcpu's CPL must be 0 if it exits due to PLE.

* kvm_arch_spin_in_kernel() returns whether the vcpu (which exits due to
a spinlock) is at CPL=0. It is only called by kvm_vcpu_on_spin(), and
the input vcpu is 'me', which got a PAUSE exit just now. *

I split kvm_arch_vcpu_in_kernel() (in the RFC) into two functions:
kvm_arch_spin_in_kernel and kvm_arch_preempt_in_kernel

Because KVM/VMX L1 never sets CPU_BASED_PAUSE_EXITING and only sets
SECONDARY_EXEC_PAUSE_LOOP_EXITING if supported, L1 will:

1. get a PAUSE exit with CPL=0 if PLE is supported
2. never get a PAUSE exit if PLE is not supported

So I think it can directly return true (CPL=0) if PLE is supported.

But nested KVM/VMX (I'm not familiar with nested) could set
CPU_BASED_PAUSE_EXITING, so I think get_cpl() is also needed.

If the above is correct, what about this way (we can save a vmcs_read
operation for L1):

kvm_arch_vcpu_spin_in_kernel(vcpu)
{
	if (!is_guest_mode(vcpu))
		return true;

	return vmx_get_cpl(vcpu) == 0;
}

kvm_vcpu_on_spin()
{
	/* @me got a PAUSE exit */
	me_in_kernel = kvm_arch_vcpu_spin_in_kernel(me);
	...
	for each vcpu {
		...
		if (me_in_kernel && !...preempt_in_kernel(vcpu))
			continue;
		...
	}
	...
}

---
Regards,
Longpeng(Mike)

> However, I understand that vmx_get_cpl can be a bit slow here. You can
> actually read SS's access rights directly in this function and get the
> DPL from there, that's going to be just a single VMREAD.
>
> The only difference is when vmx->rmode.vm86_active=1. However,
> pause-loop exiting is not working properly anyway if
> vmx->rmode.vm86_active=1, because CPL=3 according to the processor.
>
> Paolo
>
>> +	 * The following block needs less cycles than vmx_get_cpl().
>> +	 */
>> +	if (cpu_has_secondary_exec_ctrls())
>> +		secondary_exec_ctrl = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
>> +	if (secondary_exec_ctrl & SECONDARY_EXEC_PAUSE_LOOP_EXITING)
>> +		return true;
>> +
>
> Paolo
On 07/08/2017 14:28, Longpeng(Mike) wrote:
> kvm_arch_spin_in_kernel() returns whether the vcpu (which exits due to
> a spinlock) is at CPL=0. It is only called by kvm_vcpu_on_spin(), and
> the input vcpu is 'me', which got a PAUSE exit just now.
>
> I split kvm_arch_vcpu_in_kernel() (in the RFC) into two functions:
> kvm_arch_spin_in_kernel and kvm_arch_preempt_in_kernel
>
> Because KVM/VMX L1 never sets CPU_BASED_PAUSE_EXITING and only sets
> SECONDARY_EXEC_PAUSE_LOOP_EXITING if supported, L1 will:

I understand better now. I think vmx.c should just return true from
vmx_spin_in_kernel.

However, kvm_arch_vcpu_spin_in_kernel is not necessary. Instead you
should make "in_kern" an argument to kvm_vcpu_on_spin (maybe renamed to
"yield_to_kernel_mode_vcpu"). Then vmx.c can just call
"kvm_vcpu_on_spin(vcpu, true)".

> 1. get a PAUSE exit with CPL=0 if PLE is supported
> 2. never get a PAUSE exit if PLE is not supported
>
> So I think it can directly return true (CPL=0) if PLE is supported.
>
> But nested KVM/VMX (I'm not familiar with nested) could set
> CPU_BASED_PAUSE_EXITING, so I think get_cpl() is also needed.

If the nested hypervisor sets CPU_BASED_PAUSE_EXITING, a PAUSE vmexit
while running a nested guest would be reflected to the nested
hypervisor. So you wouldn't get to handle_pause and thus to
kvm_vcpu_on_spin.

Thanks,

Paolo
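Concretely, Paolo's suggestion would look roughly like the following on the VMX side (a sketch only, assuming the two-argument kvm_vcpu_on_spin() he proposes rather than anything in the posted patch):

/*
 * Sketch: handle_pause() passing "in kernel mode" directly.  A PAUSE-loop
 * exit implies CPL=0, so no CPL read is needed here.  The two-argument
 * kvm_vcpu_on_spin() is the suggested interface, not the current one.
 */
static int handle_pause(struct kvm_vcpu *vcpu)
{
	if (ple_gap)
		grow_ple_window(vcpu);

	kvm_vcpu_on_spin(vcpu, true);
	return kvm_skip_emulated_instruction(vcpu);
}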
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 87ac4fb..d2b2d57 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -688,6 +688,9 @@ struct kvm_vcpu_arch {
 
 	/* GPA available (AMD only) */
 	bool gpa_available;
+
+	/* be preempted when it's in kernel-mode(cpl=0) */
+	bool preempted_in_kernel;
 };
 
 struct kvm_lpage_info {
@@ -1057,6 +1060,8 @@ struct kvm_x86_ops {
 	void (*cancel_hv_timer)(struct kvm_vcpu *vcpu);
 
 	void (*setup_mce)(struct kvm_vcpu *vcpu);
+
+	bool (*spin_in_kernel)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 4d8141e..552ab4c 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5352,6 +5352,11 @@ static void svm_setup_mce(struct kvm_vcpu *vcpu)
 	vcpu->arch.mcg_cap &= 0x1ff;
 }
 
+static bool svm_spin_in_kernel(struct kvm_vcpu *vcpu)
+{
+	return svm_get_cpl(vcpu) == 0;
+}
+
 static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
@@ -5464,6 +5469,7 @@ static void svm_setup_mce(struct kvm_vcpu *vcpu)
 	.deliver_posted_interrupt = svm_deliver_avic_intr,
 	.update_pi_irte = svm_update_pi_irte,
 	.setup_mce = svm_setup_mce,
+	.spin_in_kernel = svm_spin_in_kernel,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 39a6222..d0dfe2e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11547,6 +11547,25 @@ static void vmx_setup_mce(struct kvm_vcpu *vcpu)
 			~FEATURE_CONTROL_LMCE;
 }
 
+static bool vmx_spin_in_kernel(struct kvm_vcpu *vcpu)
+{
+	u32 secondary_exec_ctrl = 0;
+
+	/*
+	 * Intel sdm vol3 ch-25.1.3 says: The “PAUSE-loop exiting”
+	 * VM-execution control is ignored if CPL > 0. So the vcpu
+	 * is always exiting with CPL=0 if it uses PLE.
+	 *
+	 * The following block needs less cycles than vmx_get_cpl().
+	 */
+	if (cpu_has_secondary_exec_ctrls())
+		secondary_exec_ctrl = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
+	if (secondary_exec_ctrl & SECONDARY_EXEC_PAUSE_LOOP_EXITING)
+		return true;
+
+	return vmx_get_cpl(vcpu) == 0;
+}
+
 static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
 	.cpu_has_kvm_support = cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
@@ -11674,6 +11693,7 @@ static void vmx_setup_mce(struct kvm_vcpu *vcpu)
 #endif
 
 	.setup_mce = vmx_setup_mce,
+	.spin_in_kernel = vmx_spin_in_kernel,
 };
 
 static int __init vmx_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 04c6a1f..fa79a60 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2881,6 +2881,10 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
 	int idx;
+
+	if (vcpu->preempted)
+		vcpu->arch.preempted_in_kernel = !kvm_x86_ops->get_cpl(vcpu);
+
 	/*
 	 * Disable page faults because we're in atomic context here.
 	 * kvm_write_guest_offset_cached() would call might_fault()
@@ -7988,6 +7992,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	kvm_pmu_init(vcpu);
 
 	vcpu->arch.pending_external_vector = -1;
+	vcpu->arch.preempted_in_kernel = false;
 
 	kvm_hv_vcpu_init(vcpu);
 
@@ -8437,12 +8442,12 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 
 bool kvm_arch_vcpu_spin_in_kernel(struct kvm_vcpu *vcpu)
 {
-	return false;
+	return kvm_x86_ops->spin_in_kernel(vcpu);
 }
 
 bool kvm_arch_vcpu_preempt_in_kernel(struct kvm_vcpu *vcpu)
 {
-	return false;
+	return vcpu->arch.preempted_in_kernel;
 }
 
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
Implement kvm_arch_vcpu_spin/preempt_in_kernel(). Because get_cpl
requires vcpu_load, we must cache the result (whether the vcpu was
preempted while its CPL=0) in struct kvm_vcpu_arch.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
---
 arch/x86/include/asm/kvm_host.h |  5 +++++
 arch/x86/kvm/svm.c              |  6 ++++++
 arch/x86/kvm/vmx.c              | 20 ++++++++++++++++++++
 arch/x86/kvm/x86.c              |  9 +++++++--
 4 files changed, 38 insertions(+), 2 deletions(-)
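For context, the generic consumer of the two new hooks is not part of this patch. The heavily abridged sketch below only illustrates the intended call pattern, following the pseudocode earlier in the thread; the real kvm_vcpu_on_spin() in virt/kvm/kvm_main.c keeps its existing candidate-selection and yield logic.

/* Abridged illustration only, not the series' actual kvm_main.c change. */
void kvm_vcpu_on_spin(struct kvm_vcpu *me)
{
	bool me_in_kernel = kvm_arch_vcpu_spin_in_kernel(me);
	struct kvm_vcpu *vcpu;
	int i;

	kvm_for_each_vcpu(i, vcpu, me->kvm) {
		if (vcpu == me)
			continue;
		/* Don't boost a vCPU that was preempted in user mode. */
		if (me_in_kernel && !kvm_arch_vcpu_preempt_in_kernel(vcpu))
			continue;
		/* ... existing eligibility checks and kvm_vcpu_yield_to(vcpu) ... */
	}
}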