Message ID | 20211009021236.4122790-33-seanjc@google.com (mailing list archive) |
---|---|
State | Not Applicable |
Series | KVM: Halt-polling and x86 APICv overhaul |
On Fri, 2021-10-08 at 19:12 -0700, Sean Christopherson wrote:
> Handle the switch to/from the hypervisor/software timer when a vCPU is
> blocking in common x86 instead of in VMX.  Even though VMX is the only
> user of a hypervisor timer, the logic and all functions involved are
> generic x86 (unless future CPUs do something completely different and
> implement a hypervisor timer that runs regardless of mode).
>
> Handling the switch in common x86 will allow for the elimination of the
> pre/post_blocks hooks, and also lets KVM switch back to the hypervisor
> timer if and only if it was in use (without additional params).  Add a
> comment explaining why the switch cannot be deferred to kvm_sched_out()
> or kvm_vcpu_block().
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c |  6 +-----
>  arch/x86/kvm/x86.c     | 21 +++++++++++++++++++++
>  2 files changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index b3bb2031a7ac..a24f19874716 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7464,16 +7464,12 @@ void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
>
>  static int vmx_pre_block(struct kvm_vcpu *vcpu)
>  {
> -	if (kvm_lapic_hv_timer_in_use(vcpu))
> -		kvm_lapic_switch_to_sw_timer(vcpu);
> -
>  	return 0;
>  }
>
>  static void vmx_post_block(struct kvm_vcpu *vcpu)
>  {
> -	if (kvm_x86_ops.set_hv_timer)
> -		kvm_lapic_switch_to_hv_timer(vcpu);
> +
>  }
>
>  static void vmx_setup_mce(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e0219acfd9cf..909e932a7ae7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9896,8 +9896,21 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>
>  static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
>  {
> +	bool hv_timer;
> +
>  	if (!kvm_arch_vcpu_runnable(vcpu) &&
>  	    (!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
> +		/*
> +		 * Switch to the software timer before halt-polling/blocking as
> +		 * the guest's timer may be a break event for the vCPU, and the
> +		 * hypervisor timer runs only when the CPU is in guest mode.
> +		 * Switch before halt-polling so that KVM recognizes an expired
> +		 * timer before blocking.
> +		 */

I didn't know about this until now, but it all makes sense.
The comment is very good.

> +		hv_timer = kvm_lapic_hv_timer_in_use(vcpu);
> +		if (hv_timer)
> +			kvm_lapic_switch_to_sw_timer(vcpu);
> +
>  		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
>  		if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
>  			kvm_vcpu_halt(vcpu);
> @@ -9905,6 +9918,9 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
>  			kvm_vcpu_block(vcpu);
>  		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>
> +		if (hv_timer)
> +			kvm_lapic_switch_to_hv_timer(vcpu);
> +
>  		if (kvm_x86_ops.post_block)
>  			static_call(kvm_x86_post_block)(vcpu);
>
> @@ -10136,6 +10152,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  			r = -EINTR;
>  			goto out;
>  		}
> +		/*
> +		 * It should be impossible for the hypervisor timer to be in
> +		 * use before KVM has ever run the vCPU.
> +		 */
> +		WARN_ON_ONCE(kvm_lapic_hv_timer_in_use(vcpu));
>  		kvm_vcpu_block(vcpu);
>  		if (kvm_apic_accept_events(vcpu) < 0) {
>  			r = 0;

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b3bb2031a7ac..a24f19874716 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7464,16 +7464,12 @@ void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu)
 
 static int vmx_pre_block(struct kvm_vcpu *vcpu)
 {
-	if (kvm_lapic_hv_timer_in_use(vcpu))
-		kvm_lapic_switch_to_sw_timer(vcpu);
-
 	return 0;
 }
 
 static void vmx_post_block(struct kvm_vcpu *vcpu)
 {
-	if (kvm_x86_ops.set_hv_timer)
-		kvm_lapic_switch_to_hv_timer(vcpu);
+
 }
 
 static void vmx_setup_mce(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e0219acfd9cf..909e932a7ae7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9896,8 +9896,21 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
 {
+	bool hv_timer;
+
 	if (!kvm_arch_vcpu_runnable(vcpu) &&
 	    (!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
+		/*
+		 * Switch to the software timer before halt-polling/blocking as
+		 * the guest's timer may be a break event for the vCPU, and the
+		 * hypervisor timer runs only when the CPU is in guest mode.
+		 * Switch before halt-polling so that KVM recognizes an expired
+		 * timer before blocking.
+		 */
+		hv_timer = kvm_lapic_hv_timer_in_use(vcpu);
+		if (hv_timer)
+			kvm_lapic_switch_to_sw_timer(vcpu);
+
 		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
 		if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
 			kvm_vcpu_halt(vcpu);
@@ -9905,6 +9918,9 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
 			kvm_vcpu_block(vcpu);
 		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 
+		if (hv_timer)
+			kvm_lapic_switch_to_hv_timer(vcpu);
+
 		if (kvm_x86_ops.post_block)
 			static_call(kvm_x86_post_block)(vcpu);
 
@@ -10136,6 +10152,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 			r = -EINTR;
 			goto out;
 		}
+		/*
+		 * It should be impossible for the hypervisor timer to be in
+		 * use before KVM has ever run the vCPU.
+		 */
+		WARN_ON_ONCE(kvm_lapic_hv_timer_in_use(vcpu));
 		kvm_vcpu_block(vcpu);
 		if (kvm_apic_accept_events(vcpu) < 0) {
 			r = 0;
Handle the switch to/from the hypervisor/software timer when a vCPU is
blocking in common x86 instead of in VMX.  Even though VMX is the only
user of a hypervisor timer, the logic and all functions involved are
generic x86 (unless future CPUs do something completely different and
implement a hypervisor timer that runs regardless of mode).

Handling the switch in common x86 will allow for the elimination of the
pre/post_blocks hooks, and also lets KVM switch back to the hypervisor
timer if and only if it was in use (without additional params).  Add a
comment explaining why the switch cannot be deferred to kvm_sched_out()
or kvm_vcpu_block().

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c |  6 +-----
 arch/x86/kvm/x86.c     | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+), 5 deletions(-)
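
For readers who want to see the "if and only if it was in use" behaviour in isolation, here is a minimal userspace sketch of the idea, not kernel code. Every name in it (`model_vcpu`, `model_block()`, `model_block_old()`, and so on) is hypothetical and invented for illustration. It also simplifies one thing on purpose: the switch to the hypervisor timer is modelled as always succeeding, which is an assumption of the sketch rather than a statement about KVM. The new scheme records whether the hypervisor timer was in use before blocking and restores it only in that case; the old scheme switched back whenever hypervisor-timer support existed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum timer_backend { TIMER_SW, TIMER_HV };

struct model_vcpu {
	enum timer_backend timer; /* backend currently driving the guest timer */
	bool hv_supported;        /* stands in for "set_hv_timer is implemented" */
	int switches;             /* number of backend switches, for illustration */
};

static bool model_hv_timer_in_use(const struct model_vcpu *v)
{
	return v->timer == TIMER_HV;
}

static void model_switch_to_sw(struct model_vcpu *v)
{
	v->timer = TIMER_SW;
	v->switches++;
}

/* Assumption of this sketch: the switch to the HV timer always succeeds. */
static void model_switch_to_hv(struct model_vcpu *v)
{
	assert(v->hv_supported);
	v->timer = TIMER_HV;
	v->switches++;
}

/* New scheme: remember the state before blocking, restore only that state. */
static void model_block(struct model_vcpu *v)
{
	bool hv_timer = model_hv_timer_in_use(v);

	if (hv_timer)
		model_switch_to_sw(v);

	/* Halt-polling/blocking would happen here, always on the SW timer. */
	assert(v->timer == TIMER_SW);

	if (hv_timer)
		model_switch_to_hv(v);
}

/* Old scheme: switch back whenever HV timer support exists at all. */
static void model_block_old(struct model_vcpu *v)
{
	if (model_hv_timer_in_use(v))
		model_switch_to_sw(v);

	assert(v->timer == TIMER_SW);

	if (v->hv_supported)
		model_switch_to_hv(v);
}
```

In this model, a vCPU that starts on the software timer goes through `model_block()` with no switches at all, whereas `model_block_old()` moves it to the hypervisor timer as a side effect of blocking. The local `hv_timer` flag is what makes the restore conditional without passing extra parameters between hooks.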