[v2,08/13] x86/vdso: Fix gettimeofday masking

Message ID 20230519102715.704767397@infradead.org
State Handled Elsewhere, archived
Series local_clock() vs noinstr

Commit Message

Peter Zijlstra May 19, 2023, 10:21 a.m. UTC
Because of how the virtual clocks use U64_MAX as an exception value
instead of a valid time, the clocks can no longer be assumed to wrap
cleanly. This is then compounded by arch_vdso_cycles_ok() rejecting
everything with the MSB/Sign-bit set.

Therefore, the effective mask becomes S64_MAX, and the comment with
vdso_calc_delta() that states the mask is U64_MAX and isn't optimized
out is just plain silly.

Now, the code has a negative filter -- to deal with TSC wobbles:

	if (cycles > last)

which is just plain wrong, because it should've been written as:

	if ((s64)(cycles - last) > 0)

to take wrapping into account, but per all the above, we don't
actually wrap on u64 anymore.
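
(Illustration only, not part of the patch: a tiny user-space sketch of the
two filters around a hypothetical u64 wrap; the values are made up.)

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		uint64_t last   = UINT64_MAX - 100;	/* just before the wrap */
		uint64_t cycles = 50;			/* 151 cycles later, post-wrap */

		/* naive filter: post-wrap 'cycles' compares as smaller, motion is lost */
		printf("cycles > last            : %d\n", cycles > last);

		/* wrap-safe filter: the unsigned subtraction yields the true delta,
		 * and the signed cast only goes negative on real backward motion */
		printf("(s64)(cycles - last) > 0 : %d\n",
		       (int64_t)(cycles - last) > 0);

		return 0;
	}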

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/vdso/gettimeofday.h |   39 ++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 11 deletions(-)

Comments

Thomas Gleixner May 31, 2023, 3:27 p.m. UTC | #1
On Fri, May 19 2023 at 12:21, Peter Zijlstra wrote:
> Because of how the virtual clocks use U64_MAX as an exception value
> instead of a valid time, the clocks can no longer be assumed to wrap
> cleanly. This is then compounded by arch_vdso_cycles_ok() rejecting
> everything with the MSB/Sign-bit set.
>
> Therefore, the effective mask becomes S64_MAX, and the comment with
> vdso_calc_delta() that states the mask is U64_MAX and isn't optimized
> out is just plain silly.
>
> Now, the code has a negative filter -- to deal with TSC wobbles:
>
> 	if (cycles > last)
>
> which is just plain wrong, because it should've been written as:
>
> 	if ((s64)(cycles - last) > 0)
>
> to take wrapping into account, but per all the above, we don't
> actually wrap on u64 anymore.

Indeed. The rationale was that you need ~146 years uptime with a 4GHz
TSC or ~584 years with 1GHz to actually reach the wrap around point.
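
(Back of the envelope, assuming a free-running counter that only wraps at
2^64: 2^64 / 4.0e9 Hz ~= 4.6e9 s ~= 146 years, and 2^64 / 1.0e9 Hz ~=
1.8e10 s ~= 584 years.)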

Though I can see your point to make sure that silly BIOSes or VMMs
cannot cause havoc by accident or malice.

Did anyone ever validate that wrap around on TSC including TSC deadline
timer works correctly?

I have faint memories of TSC_ADJUST, which I prefer not to bring back to
main memory :)

Thanks,

        tglx
Thomas Gleixner May 31, 2023, 10:46 p.m. UTC | #2
On Wed, May 31 2023 at 17:27, Thomas Gleixner wrote:
> On Fri, May 19 2023 at 12:21, Peter Zijlstra wrote:
>> to take wrapping into account, but per all the above, we don't
>> actually wrap on u64 anymore.
>
> Indeed. The rationale was that you need ~146 years uptime with a 4GHz
> TSC or ~584 years with 1GHz to actually reach the wrap around point.
>
> Though I can see your point to make sure that silly BIOSes or VMMs
> cannot cause havoc by accident or malice.
>
> Did anyone ever validate that wrap around on TSC including TSC deadline
> timer works correctly?
>
> I have faint memories of TSC_ADJUST, which I prefer not to bring back to
> main memory :)

It seems my fears have been unjustified.

At least in a quick test which sets the TSC to ~ -8min @2.1GHz, the machine
seems to survive without the colourful explosions I expected due to my
early exposure to TSC_ADJUST and TSC_DEADLINE_TIMER :)
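
(That is, starting the TSC roughly 480 s * 2.1e9 Hz ~= 1.0e12 cycles below
the u64 wrap-around point, so the wrap actually happens during the test.)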

Thanks,

        tglx
Thomas Gleixner May 31, 2023, 10:46 p.m. UTC | #3
On Fri, May 19 2023 at 12:21, Peter Zijlstra wrote:
> to take wrapping into account, but per all the above, we don't
> actually wrap on u64 anymore.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Tested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

Patch

--- a/arch/x86/include/asm/vdso/gettimeofday.h
+++ b/arch/x86/include/asm/vdso/gettimeofday.h
@@ -231,14 +231,17 @@  static u64 vread_pvclock(void)
 		ret = __pvclock_read_cycles(pvti, rdtsc_ordered());
 	} while (pvclock_read_retry(pvti, version));
 
-	return ret;
+	return ret & S64_MAX;
 }
 #endif
 
 #ifdef CONFIG_HYPERV_TIMER
 static u64 vread_hvclock(void)
 {
-	return hv_read_tsc_page(&hvclock_page);
+	u64 ret = hv_read_tsc_page(&hvclock_page);
+	if (likely(ret != U64_MAX))
+		ret &= S64_MAX;
+	return ret;
 }
 #endif
 
@@ -246,7 +249,7 @@  static inline u64 __arch_get_hw_counter(
 					const struct vdso_data *vd)
 {
 	if (likely(clock_mode == VDSO_CLOCKMODE_TSC))
-		return (u64)rdtsc_ordered();
+		return (u64)rdtsc_ordered() & S64_MAX;
 	/*
 	 * For any memory-mapped vclock type, we need to make sure that gcc
 	 * doesn't cleverly hoist a load before the mode check.  Otherwise we
@@ -284,6 +287,9 @@  static inline bool arch_vdso_clocksource
  * which can be invalidated asynchronously and indicate invalidation by
  * returning U64_MAX, which can be effectively tested by checking for a
  * negative value after casting it to s64.
+ *
+ * This effectively forces a S64_MAX mask on the calculations, unlike the
+ * U64_MAX mask normally used by x86 clocksources.
  */
 static inline bool arch_vdso_cycles_ok(u64 cycles)
 {
@@ -303,18 +309,29 @@  static inline bool arch_vdso_cycles_ok(u
  * @last. If not then use @last, which is the base time of the current
  * conversion period.
  *
- * This variant also removes the masking of the subtraction because the
- * clocksource mask of all VDSO capable clocksources on x86 is U64_MAX
- * which would result in a pointless operation. The compiler cannot
- * optimize it away as the mask comes from the vdso data and is not compile
- * time constant.
+ * This variant also uses a custom mask because while the clocksource mask of
+ * all the VDSO capable clocksources on x86 is U64_MAX, the above code uses
+ * U64_MAX as an exception value; additionally, arch_vdso_cycles_ok() above
+ * declares everything with the MSB/Sign-bit set as invalid. Therefore the
+ * effective mask is S64_MAX.
  */
 static __always_inline
 u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
 {
-	if (cycles > last)
-		return (cycles - last) * mult;
-	return 0;
+	/*
+	 * Due to the MSB/Sign-bit being used as an invalid marker (see
+	 * arch_vdso_cycles_ok() above), the effective mask is S64_MAX.
+	 */
+	u64 delta = (cycles - last) & S64_MAX;
+
+	/*
+	 * Due to the above-mentioned TSC wobbles, filter out negative motion.
+	 * Per the above masking, the effective sign bit is now bit 62.
+	 */
+	if (unlikely(delta & (1ULL << 62)))
+		return 0;
+
+	return delta * mult;
 }
 #define vdso_calc_delta vdso_calc_delta
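
(Illustration only, not part of the patch: a self-contained user-space sketch
mirroring the new filter, to show how bit 62 acts as the sign bit once both
inputs are masked to S64_MAX; names and values here are made up.)

	#include <stdint.h>
	#include <stdio.h>

	#define MASK_S64_MAX 0x7fffffffffffffffULL	/* stand-in for S64_MAX */

	static uint64_t calc_delta(uint64_t cycles, uint64_t last, uint32_t mult)
	{
		/* both inputs have the MSB cleared, so after masking the
		 * subtraction result, bit 62 flags "negative" motion */
		uint64_t delta = (cycles - last) & MASK_S64_MAX;

		if (delta & (1ULL << 62))
			return 0;

		return delta * mult;
	}

	int main(void)
	{
		/* forward motion: the usual case, prints 300 */
		printf("%llu\n", (unsigned long long)calc_delta(1000, 900, 3));
		/* small backward motion (TSC wobble): filtered to 0 */
		printf("%llu\n", (unsigned long long)calc_delta(900, 1000, 3));
		return 0;
	}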