Message ID | 20240522001817.619072-17-dwmw2@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Series | Cleaning up the KVM clock mess |
On 22/05/2024 01:17, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> Both kvm_track_tsc_matching() and pvclock_update_vm_gtod_copy() make a
> decision about whether the KVM clock should be in master clock mode.
>
> They use *different* criteria for the decision though. This isn't really
> a problem; it only has the potential to cause unnecessary invocations of
> KVM_REQ_MASTERCLOCK_UPDATE if the masterclock was disabled due to TSC
> going backwards, or the guest using the old MSR. But it isn't pretty.
>
> Factor the decision out to a single function. And document the historical
> reason why it's disabled for guests that use the old MSR_KVM_SYSTEM_TIME.
>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  arch/x86/kvm/x86.c | 27 +++++++++++++++++++++++----
>  1 file changed, 23 insertions(+), 4 deletions(-)
>

Reviewed-by: Paul Durrant <paul@xen.org>

The shortlog is rather misleading.  This is more than just a refactor, and I
would argue the refactor aspect is secondary, i.e. the main goal of this patch
is to apply the exceptions to kvm_track_tsc_matching().

On Wed, May 22, 2024, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> Both kvm_track_tsc_matching() and pvclock_update_vm_gtod_copy() make a
> decision about whether the KVM clock should be in master clock mode.
>
> They use *different* criteria for the decision though. This isn't really
> a problem; it only has the potential to cause unnecessary invocations of
> KVM_REQ_MASTERCLOCK_UPDATE if the masterclock was disabled due to TSC
> going backwards, or the guest using the old MSR. But it isn't pretty.
>
> Factor the decision out to a single function. And document the historical
> reason why it's disabled for guests that use the old MSR_KVM_SYSTEM_TIME.
>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  arch/x86/kvm/x86.c | 27 +++++++++++++++++++++++----
>  1 file changed, 23 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e21b8c075bf6..437412b36cae 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2518,6 +2518,27 @@ static inline bool gtod_is_based_on_tsc(int mode)
>  }
>  #endif
>
> +static bool kvm_use_master_clock(struct kvm *kvm)

Maybe kvm_can_use_master_clock() so that this isn't misconstrued with the actual
ka->use_master_clock field.

> +{
> +	struct kvm_arch *ka = &kvm->arch;
> +
> +	/*
> +	 * The 'old kvmclock' check is a workaround (from 2015) for a
> +	 * SUSE 2.6.16 kernel that didn't boot if the system_time in
> +	 * its kvmclock was too far behind the current time. So the
> +	 * mode of just setting the reference point and allowing time
> +	 * to proceed linearly from there makes it fail to boot.
> +	 * Despite that being kind of the *point* of the way the clock
> +	 * is exposed to the guest. By coincidence, the offending
> +	 * kernels used the old MSR_KVM_SYSTEM_TIME, which was moved
> +	 * only because it resided in the wrong number range. So the
> +	 * workaround is activated for *all* guests using the old MSR.
> +	 */
> +	return ka->all_vcpus_matched_tsc &&
> +		!ka->backwards_tsc_observed &&
> +		!ka->boot_vcpu_runs_old_kvmclock;

Please align indentation:

	return ka->all_vcpus_matched_tsc &&
	       !ka->backwards_tsc_observed &&
	       !ka->boot_vcpu_runs_old_kvmclock;

> +}
> +
>  static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu)
>  {
>  #ifdef CONFIG_X86_64
> @@ -2550,7 +2571,7 @@ static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu)
>  	 * To use the masterclock, the host clocksource must be based on TSC
>  	 * and all vCPUs must have matching TSC frequencies.
>  	 */
> -	bool use_master_clock = ka->all_vcpus_matched_tsc &&
> +	bool use_master_clock = kvm_use_master_clock(vcpu->kvm) &&
>  		gtod_is_based_on_tsc(gtod->clock.vclock_mode);
>
>  	/*
> @@ -3096,9 +3117,7 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm)
>  					&ka->master_cycle_now);
>
>  	ka->use_master_clock = host_tsc_clocksource
> -				&& ka->all_vcpus_matched_tsc
> -				&& !ka->backwards_tsc_observed
> -				&& !ka->boot_vcpu_runs_old_kvmclock;
> +				&& kvm_use_master_clock(kvm);

Perfect opportunity to put the "&&" on the preceding line.

>
>  	/*
>  	 * When TSC scaling is in use (which can thankfully only happen
> --
> 2.44.0
>

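To make the behavioural change in this patch concrete, here is a minimal userspace sketch of the unified decision. It is not the kernel code: `struct mock_kvm_arch`, `master_clock_decision()`, `old_track_tsc_matching_decision()` and `example_old_msr_guest()` are hypothetical names invented for illustration, and the host-side clocksource check is reduced to a plain boolean. The helper uses the `kvm_can_use_master_clock()` name and the aligned layout suggested in the review above, which is an assumption about how a respin might look.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three struct kvm_arch fields the patch reads. */
struct mock_kvm_arch {
	bool all_vcpus_matched_tsc;
	bool backwards_tsc_observed;
	bool boot_vcpu_runs_old_kvmclock;
};

/* VM-wide conditions only; name and alignment follow the review suggestions. */
static bool kvm_can_use_master_clock(const struct mock_kvm_arch *ka)
{
	return ka->all_vcpus_matched_tsc &&
	       !ka->backwards_tsc_observed &&
	       !ka->boot_vcpu_runs_old_kvmclock;
}

/*
 * After the patch, both call sites AND the shared predicate with a
 * host-side "clocksource is TSC based" condition (modelled as a bool here).
 */
static bool master_clock_decision(const struct mock_kvm_arch *ka,
				  bool host_clocksource_is_tsc)
{
	return host_clocksource_is_tsc &&
	       kvm_can_use_master_clock(ka);
}

/* What kvm_track_tsc_matching() evaluated before the patch, per the diff. */
static bool old_track_tsc_matching_decision(const struct mock_kvm_arch *ka,
					    bool host_clocksource_is_tsc)
{
	return ka->all_vcpus_matched_tsc && host_clocksource_is_tsc;
}

/* A guest on the old MSR: the old check says yes, the unified one says no. */
static void example_old_msr_guest(void)
{
	struct mock_kvm_arch ka = {
		.all_vcpus_matched_tsc = true,
		.boot_vcpu_runs_old_kvmclock = true,
	};

	assert(old_track_tsc_matching_decision(&ka, true));
	assert(!master_clock_decision(&ka, true));
}
```

The disagreement exercised in `example_old_msr_guest()` is the case the commit message describes as causing unnecessary KVM_REQ_MASTERCLOCK_UPDATE requests, and it is why the review calls the change more than a refactor.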
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e21b8c075bf6..437412b36cae 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2518,6 +2518,27 @@ static inline bool gtod_is_based_on_tsc(int mode)
 }
 #endif
 
+static bool kvm_use_master_clock(struct kvm *kvm)
+{
+	struct kvm_arch *ka = &kvm->arch;
+
+	/*
+	 * The 'old kvmclock' check is a workaround (from 2015) for a
+	 * SUSE 2.6.16 kernel that didn't boot if the system_time in
+	 * its kvmclock was too far behind the current time. So the
+	 * mode of just setting the reference point and allowing time
+	 * to proceed linearly from there makes it fail to boot.
+	 * Despite that being kind of the *point* of the way the clock
+	 * is exposed to the guest. By coincidence, the offending
+	 * kernels used the old MSR_KVM_SYSTEM_TIME, which was moved
+	 * only because it resided in the wrong number range. So the
+	 * workaround is activated for *all* guests using the old MSR.
+	 */
+	return ka->all_vcpus_matched_tsc &&
+		!ka->backwards_tsc_observed &&
+		!ka->boot_vcpu_runs_old_kvmclock;
+}
+
 static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_X86_64
@@ -2550,7 +2571,7 @@ static void kvm_track_tsc_matching(struct kvm_vcpu *vcpu)
 	 * To use the masterclock, the host clocksource must be based on TSC
 	 * and all vCPUs must have matching TSC frequencies.
 	 */
-	bool use_master_clock = ka->all_vcpus_matched_tsc &&
+	bool use_master_clock = kvm_use_master_clock(vcpu->kvm) &&
 		gtod_is_based_on_tsc(gtod->clock.vclock_mode);
 
 	/*
@@ -3096,9 +3117,7 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm)
 					&ka->master_cycle_now);
 
 	ka->use_master_clock = host_tsc_clocksource
-				&& ka->all_vcpus_matched_tsc
-				&& !ka->backwards_tsc_observed
-				&& !ka->boot_vcpu_runs_old_kvmclock;
+				&& kvm_use_master_clock(kvm);
 
 	/*
 	 * When TSC scaling is in use (which can thankfully only happen