Message ID | 20250227021855.3257188-25-seanjc@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | x86: Try to wrangle PV clocks vs. TSC | expand |
On Wed, Feb 26 2025 at 18:18, Sean Christopherson wrote: > When resuming timekeeping after suspend, restore clocksources prior to > reading the persistent clock. Paravirt clocks, e.g. kvmclock, tie the > validity of a PV persistent clock to a clocksource, i.e. reading the PV > persistent clock will return garbage if the underlying PV clocksource > hasn't been enabled. The flaw has gone unnoticed because kvmclock is a > mess and uses its own suspend/resume hooks instead of the clocksource > suspend/resume hooks, which happens to work by sheer dumb luck (the > kvmclock resume hook runs before timekeeping_resume()). > > Note, there is no evidence that any clocksource supported by the kernel > depends on a persistent clock. > > Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index 1e67d076f195..332d053fa9ce 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -1794,11 +1794,16 @@ void timekeeping_resume(void) u64 cycle_now, nsec; unsigned long flags; - read_persistent_clock64(&ts_new); - clockevents_resume(); clocksource_resume(); + /* + * Read persistent time after clocksources have been resumed. Paravirt + * clocks have a nasty habit of piggybacking a persistent clock on a + * system clock, and may return garbage if the system clock is suspended. + */ + read_persistent_clock64(&ts_new); + raw_spin_lock_irqsave(&tk_core.lock, flags); /*
When resuming timekeeping after suspend, restore clocksources prior to reading the persistent clock. Paravirt clocks, e.g. kvmclock, tie the validity of a PV persistent clock to a clocksource, i.e. reading the PV persistent clock will return garbage if the underlying PV clocksource hasn't been enabled. The flaw has gone unnoticed because kvmclock is a mess and uses its own suspend/resume hooks instead of the clocksource suspend/resume hooks, which happens to work by sheer dumb luck (the kvmclock resume hook runs before timekeeping_resume()). Note, there is no evidence that any clocksource supported by the kernel depends on a persistent clock. Signed-off-by: Sean Christopherson <seanjc@google.com> --- kernel/time/timekeeping.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)