Message ID | 20211221090449.15337-3-kechenl@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: x86: add per-vCPU exits disable capability |
The shortlog is weird, I get what you're going for with the "()", but it honestly
looks like a typo :-)  And add "power management" so that there's a bit more
context in the shortlog?  Maybe this?

  KVM: x86: Move *_in_guest power management flags to vCPU scope

On Tue, Dec 21, 2021, Kechen Lu wrote:
> For further extensions on finer-grained control on per-vCPU exits
> disable control, and because VM-scoped restricted to set before
> vCPUs creation, runtime disabled exits flag check could be purely
> vCPU scope.

State what the patch does, not what it "could" do.  E.g.

  Make the runtime disabled mwait/hlt/pause/cstate exits flags vCPU scope to
  allow finer-grained, per-vCPU control.  The VM-scoped control is only
  allowed before vCPUs are created, thus preserving the existing behavior is a
  simple matter of snapshotting the flags at vCPU creation.

A few nits below that aren't even from this patch, but otherwise,

Reviewed-by: Sean Christopherson <seanjc@google.com>

> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Kechen Lu <kechenl@nvidia.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  5 +++++
>  arch/x86/kvm/cpuid.c            |  2 +-
>  arch/x86/kvm/lapic.c            |  2 +-
>  arch/x86/kvm/svm/svm.c          | 10 +++++-----
>  arch/x86/kvm/vmx/vmx.c          | 16 ++++++++--------
>  arch/x86/kvm/x86.c              |  6 +++++-
>  arch/x86/kvm/x86.h              | 16 ++++++++--------
>  7 files changed, 33 insertions(+), 24 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 2164b9f4c7b0..edc5fca4d8c8 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -908,6 +908,11 @@ struct kvm_vcpu_arch {
>  #if IS_ENABLED(CONFIG_HYPERV)
>  	hpa_t hv_root_tdp;
>  #endif
> +
> +	bool mwait_in_guest;
> +	bool hlt_in_guest;
> +	bool pause_in_guest;
> +	bool cstate_in_guest;
>  };
>
>  struct kvm_lpage_info {
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 07e9215e911d..6291e15710ba 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -177,7 +177,7 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
>  		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
>
>  	best = kvm_find_kvm_cpuid_features(vcpu);
> -	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> +	if (kvm_hlt_in_guest(vcpu) && best &&
>  		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))

Can you (or Paolo?) opportunistically align this?  And maybe even shuffle the
check on "best" to pair the !NULL check with the functional check?  E.g.

	if (kvm_hlt_in_guest(vcpu) &&
	    best && (best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);

>  		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index f206fc35deff..effb994e6642 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -119,7 +119,7 @@ static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
>  bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
>  {
>  	return kvm_x86_ops.set_hv_timer
> -		&& !(kvm_mwait_in_guest(vcpu->kvm) ||
> +		&& !(kvm_mwait_in_guest(vcpu) ||

And another opportunistic tweak?  I've been itching for an excuse to "fix" this
particular helper for quite some time :-)

	return kvm_x86_ops.set_hv_timer &&
	       !(kvm_mwait_in_guest(vcpu) || kvm_can_post_timer_interrupt(vcpu));

That overruns the 80 char soft limit, but IMO it's worth it in this case as the
resulting code is more readable.

>  		     kvm_can_post_timer_interrupt(vcpu));
>  }
>  EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);
> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Monday, January 10, 2022 11:36 AM
> To: Kechen Lu <kechenl@nvidia.com>
> Cc: kvm@vger.kernel.org; pbonzini@redhat.com; wanpengli@tencent.com;
> vkuznets@redhat.com; mst@redhat.com; Somdutta Roy
> <somduttar@nvidia.com>; linux-kernel@vger.kernel.org
> Subject: Re: [RFC PATCH v2 2/3] KVM: x86: move ()_in_guest checking to
> vCPU scope
>
> The shortlog is weird, I get what you're going for with the "()", but it honestly
> looks like a typo :-) And add "power management" so that there's a bit more
> context in the shortlog? Maybe this?
>
>   KVM: x86: Move *_in_guest power management flags to vCPU scope
>

Ack. Yeah, it's a typo :)

> On Tue, Dec 21, 2021, Kechen Lu wrote:
> > For further extensions on finer-grained control on per-vCPU exits
> > disable control, and because VM-scoped restricted to set before vCPUs
> > creation, runtime disabled exits flag check could be purely vCPU
> > scope.
>
> State what the patch does, not what it "could" do. E.g.
>
>   Make the runtime disabled mwait/hlt/pause/cstate exits flags vCPU scope to
>   allow finer-grained, per-vCPU control. The VM-scoped control is only
>   allowed before vCPUs are created, thus preserving the existing behavior is a
>   simple matter of snapshotting the flags at vCPU creation.
>

Ack! Thanks for the patient refinement of the description wording :P

> A few nits below that aren't even from this patch, but otherwise,
>
> Reviewed-by: Sean Christopherson <seanjc@google.com>
>
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Kechen Lu <kechenl@nvidia.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h |  5 +++++
> >  arch/x86/kvm/cpuid.c            |  2 +-
> >  arch/x86/kvm/lapic.c            |  2 +-
> >  arch/x86/kvm/svm/svm.c          | 10 +++++-----
> >  arch/x86/kvm/vmx/vmx.c          | 16 ++++++++--------
> >  arch/x86/kvm/x86.c              |  6 +++++-
> >  arch/x86/kvm/x86.h              | 16 ++++++++--------
> >  7 files changed, 33 insertions(+), 24 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h
> > b/arch/x86/include/asm/kvm_host.h index 2164b9f4c7b0..edc5fca4d8c8
> > 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -908,6 +908,11 @@ struct kvm_vcpu_arch { #if
> > IS_ENABLED(CONFIG_HYPERV)
> >         hpa_t hv_root_tdp;
> > #endif
> > +
> > +       bool mwait_in_guest;
> > +       bool hlt_in_guest;
> > +       bool pause_in_guest;
> > +       bool cstate_in_guest;
> >  };
> >
> >  struct kvm_lpage_info {
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index
> > 07e9215e911d..6291e15710ba 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -177,7 +177,7 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu
> *vcpu)
> >         best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
> >
> >         best = kvm_find_kvm_cpuid_features(vcpu);
> > -       if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> > +       if (kvm_hlt_in_guest(vcpu) && best &&
> >                 (best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
>
> Can you (or Paolo?) opportunistically align this? And maybe even shuffle the
> check on "best" to pair the !NULL check with the functional check? E.g.
>
>         if (kvm_hlt_in_guest(vcpu) &&
>             best && (best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
>                 best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
>

Makes sense. Let me align and rework this in this patch.

> >         best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index
> > f206fc35deff..effb994e6642 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -119,7 +119,7 @@ static bool kvm_can_post_timer_interrupt(struct
> > kvm_vcpu *vcpu) bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu) {
> >         return kvm_x86_ops.set_hv_timer
> > -               && !(kvm_mwait_in_guest(vcpu->kvm) ||
> > +               && !(kvm_mwait_in_guest(vcpu) ||
>
> And another opportunistic tweak? I've been itching for an excuse to "fix" this
> particular helper for quite some time :-)
>
>         return kvm_x86_ops.set_hv_timer &&
>                !(kvm_mwait_in_guest(vcpu) ||
>                  kvm_can_post_timer_interrupt(vcpu));
>
> That overruns the 80 char soft limit, but IMO it's worth it in this case as the
> resulting code is more readable.
>

Sure!

Best Regards,
Kechen

> >                      kvm_can_post_timer_interrupt(vcpu));
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2164b9f4c7b0..edc5fca4d8c8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -908,6 +908,11 @@ struct kvm_vcpu_arch {
 #if IS_ENABLED(CONFIG_HYPERV)
 	hpa_t hv_root_tdp;
 #endif
+
+	bool mwait_in_guest;
+	bool hlt_in_guest;
+	bool pause_in_guest;
+	bool cstate_in_guest;
 };
 
 struct kvm_lpage_info {
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 07e9215e911d..6291e15710ba 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -177,7 +177,7 @@ void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
 
 	best = kvm_find_kvm_cpuid_features(vcpu);
-	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
+	if (kvm_hlt_in_guest(vcpu) && best &&
 		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
 		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index f206fc35deff..effb994e6642 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -119,7 +119,7 @@ static bool kvm_can_post_timer_interrupt(struct kvm_vcpu *vcpu)
 bool kvm_can_use_hv_timer(struct kvm_vcpu *vcpu)
 {
 	return kvm_x86_ops.set_hv_timer
-		&& !(kvm_mwait_in_guest(vcpu->kvm) ||
+		&& !(kvm_mwait_in_guest(vcpu) ||
 		     kvm_can_post_timer_interrupt(vcpu));
 }
 EXPORT_SYMBOL_GPL(kvm_can_use_hv_timer);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d0f68d11ec70..70e393c6dfb5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1271,12 +1271,12 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	svm_set_intercept(svm, INTERCEPT_RDPRU);
 	svm_set_intercept(svm, INTERCEPT_RSM);
 
-	if (!kvm_mwait_in_guest(vcpu->kvm)) {
+	if (!kvm_mwait_in_guest(vcpu)) {
 		svm_set_intercept(svm, INTERCEPT_MONITOR);
 		svm_set_intercept(svm, INTERCEPT_MWAIT);
 	}
 
-	if (!kvm_hlt_in_guest(vcpu->kvm))
+	if (!kvm_hlt_in_guest(vcpu))
 		svm_set_intercept(svm, INTERCEPT_HLT);
 
 	control->iopm_base_pa = __sme_set(iopm_base);
@@ -1320,7 +1320,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	svm->nested.vmcb12_gpa = INVALID_GPA;
 	svm->nested.last_vmcb12_gpa = INVALID_GPA;
 
-	if (!kvm_pause_in_guest(vcpu->kvm)) {
+	if (!kvm_pause_in_guest(vcpu)) {
 		control->pause_filter_count = pause_filter_count;
 		if (pause_filter_thresh)
 			control->pause_filter_thresh = pause_filter_thresh;
@@ -3095,7 +3095,7 @@ static int pause_interception(struct kvm_vcpu *vcpu)
 	 */
 	in_kernel = !sev_es_guest(vcpu->kvm) && svm_get_cpl(vcpu) == 0;
 
-	if (!kvm_pause_in_guest(vcpu->kvm))
+	if (!kvm_pause_in_guest(vcpu))
 		grow_ple_window(vcpu);
 
 	kvm_vcpu_on_spin(vcpu, in_kernel);
@@ -4308,7 +4308,7 @@ static void svm_handle_exit_irqoff(struct kvm_vcpu *vcpu)
 
 static void svm_sched_in(struct kvm_vcpu *vcpu, int cpu)
 {
-	if (!kvm_pause_in_guest(vcpu->kvm))
+	if (!kvm_pause_in_guest(vcpu))
 		shrink_ple_window(vcpu);
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5aadad3e7367..b5133619dea1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1585,7 +1585,7 @@ static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
 	 * then the instruction is already executing and RIP has already been
 	 * advanced.
 	 */
-	if (kvm_hlt_in_guest(vcpu->kvm) &&
+	if (kvm_hlt_in_guest(vcpu) &&
 	    vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT)
 		vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
 }
@@ -4120,10 +4120,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 		exec_control |= CPU_BASED_CR3_STORE_EXITING |
 				CPU_BASED_CR3_LOAD_EXITING |
 				CPU_BASED_INVLPG_EXITING;
-	if (kvm_mwait_in_guest(vmx->vcpu.kvm))
+	if (kvm_mwait_in_guest(&vmx->vcpu))
 		exec_control &= ~(CPU_BASED_MWAIT_EXITING |
 				  CPU_BASED_MONITOR_EXITING);
-	if (kvm_hlt_in_guest(vmx->vcpu.kvm))
+	if (kvm_hlt_in_guest(&vmx->vcpu))
 		exec_control &= ~CPU_BASED_HLT_EXITING;
 	return exec_control;
 }
@@ -4202,7 +4202,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
 	}
 	if (!enable_unrestricted_guest)
 		exec_control &= ~SECONDARY_EXEC_UNRESTRICTED_GUEST;
-	if (kvm_pause_in_guest(vmx->vcpu.kvm))
+	if (kvm_pause_in_guest(vcpu))
 		exec_control &= ~SECONDARY_EXEC_PAUSE_LOOP_EXITING;
 	if (!kvm_vcpu_apicv_active(vcpu))
 		exec_control &= ~(SECONDARY_EXEC_APIC_REGISTER_VIRT |
@@ -4305,7 +4305,7 @@ static void init_vmcs(struct vcpu_vmx *vmx)
 		vmcs_write64(POSTED_INTR_DESC_ADDR, __pa((&vmx->pi_desc)));
 	}
 
-	if (!kvm_pause_in_guest(vmx->vcpu.kvm)) {
+	if (!kvm_pause_in_guest(&vmx->vcpu)) {
 		vmcs_write32(PLE_GAP, ple_gap);
 		vmx->ple_window = ple_window;
 		vmx->ple_window_dirty = true;
@@ -5412,7 +5412,7 @@ static void shrink_ple_window(struct kvm_vcpu *vcpu)
  */
 static int handle_pause(struct kvm_vcpu *vcpu)
 {
-	if (!kvm_pause_in_guest(vcpu->kvm))
+	if (!kvm_pause_in_guest(vcpu))
 		grow_ple_window(vcpu);
 
 	/*
@@ -6859,7 +6859,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
 	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
 	vmx_disable_intercept_for_msr(vcpu, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
-	if (kvm_cstate_in_guest(vcpu->kvm)) {
+	if (kvm_cstate_in_guest(vcpu)) {
 		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C1_RES, MSR_TYPE_R);
 		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R);
 		vmx_disable_intercept_for_msr(vcpu, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
@@ -7401,7 +7401,7 @@ static void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu)
 
 static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu)
 {
-	if (!kvm_pause_in_guest(vcpu->kvm))
+	if (!kvm_pause_in_guest(vcpu))
 		shrink_ple_window(vcpu);
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 37529c0c279d..d5d0d99b584e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10941,6 +10941,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 #if IS_ENABLED(CONFIG_HYPERV)
 	vcpu->arch.hv_root_tdp = INVALID_PAGE;
 #endif
+	vcpu->arch.mwait_in_guest = vcpu->kvm->arch.mwait_in_guest;
+	vcpu->arch.hlt_in_guest = vcpu->kvm->arch.hlt_in_guest;
+	vcpu->arch.pause_in_guest = vcpu->kvm->arch.pause_in_guest;
+	vcpu->arch.cstate_in_guest = vcpu->kvm->arch.cstate_in_guest;
 
 	r = static_call(kvm_x86_vcpu_create)(vcpu);
 	if (r)
@@ -12086,7 +12090,7 @@ bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu)
 	     vcpu->arch.exception.pending))
 		return false;
 
-	if (kvm_hlt_in_guest(vcpu->kvm) && !kvm_can_deliver_async_pf(vcpu))
+	if (kvm_hlt_in_guest(vcpu) && !kvm_can_deliver_async_pf(vcpu))
 		return false;
 
 	/*
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 4abcd8d9836d..0da4b09535a5 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -372,24 +372,24 @@ static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
 	__rem;						\
 	})
 
-static inline bool kvm_mwait_in_guest(struct kvm *kvm)
+static inline bool kvm_mwait_in_guest(struct kvm_vcpu *vcpu)
 {
-	return kvm->arch.mwait_in_guest;
+	return vcpu->arch.mwait_in_guest;
 }
 
-static inline bool kvm_hlt_in_guest(struct kvm *kvm)
+static inline bool kvm_hlt_in_guest(struct kvm_vcpu *vcpu)
 {
-	return kvm->arch.hlt_in_guest;
+	return vcpu->arch.hlt_in_guest;
 }
 
-static inline bool kvm_pause_in_guest(struct kvm *kvm)
+static inline bool kvm_pause_in_guest(struct kvm_vcpu *vcpu)
 {
-	return kvm->arch.pause_in_guest;
+	return vcpu->arch.pause_in_guest;
 }
 
-static inline bool kvm_cstate_in_guest(struct kvm *kvm)
+static inline bool kvm_cstate_in_guest(struct kvm_vcpu *vcpu)
 {
-	return kvm->arch.cstate_in_guest;
+	return vcpu->arch.cstate_in_guest;
 }
 
 DECLARE_PER_CPU(struct kvm_vcpu *, current_vcpu);
For further extensions on finer-grained control on per-vCPU exits disable
control, and because the VM-scoped control is restricted to being set before
vCPU creation, the runtime disabled-exits flag check could be purely vCPU
scope.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kechen Lu <kechenl@nvidia.com>
---
 arch/x86/include/asm/kvm_host.h |  5 +++++
 arch/x86/kvm/cpuid.c            |  2 +-
 arch/x86/kvm/lapic.c            |  2 +-
 arch/x86/kvm/svm/svm.c          | 10 +++++-----
 arch/x86/kvm/vmx/vmx.c          | 16 ++++++++--------
 arch/x86/kvm/x86.c              |  6 +++++-
 arch/x86/kvm/x86.h              | 16 ++++++++--------
 7 files changed, 33 insertions(+), 24 deletions(-)
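As a rough illustration of where the series is heading (the per-vCPU control
itself is added by a later patch in this series and is not shown here), a
vCPU-scoped KVM_ENABLE_CAP could then override the snapshotted flags for a single
vCPU. The sketch below is an assumption about that follow-up interface, not the
final UAPI:

/*
 * Hypothetical per-vCPU override built on the flags made vCPU-scoped by this
 * patch; the vCPU-fd target and the single-flag choice are assumptions.
 */
#include <linux/kvm.h>
#include <sys/ioctl.h>

static int disable_hlt_exits_for_vcpu(int vcpu_fd)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_X86_DISABLE_EXITS,
		.args[0] = KVM_X86_DISABLE_EXITS_HLT,
	};

	return ioctl(vcpu_fd, KVM_ENABLE_CAP, &cap);
}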