Message ID: 87997f62-a6e2-1812-ccf5-d7d2e65fd50e@suse.com (mailing list archive)
State: Superseded
Series: x86/time: switch platform timer hooks to altcall
On 12/01/2022 08:58, Jan Beulich wrote:
> Except in the "clocksource=tsc" case we can replace the indirect calls
> involved in accessing the platform timers by direct ones, as they get
> established once and never changed.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Sort of RFC, for both the whether and the how aspects.
>
> TBD: Overriding X86_FEATURE_ALWAYS is somewhat dangerous; there's no
>      issue with e.g. hvm_set_tsc_offset() used later in time.c only
>      because that's an inline function which already "latched" the
>      usual value of the macro. But the alternative of introducing an
>      alternative_call() variant allowing to specify the controlling
>      feature doesn't look overly nice to me either. Then again the
>      .resume hook invocation could be patched unconditionally, as the
>      TSC clocksource leaves this hook set to NULL.
>
> --- a/xen/arch/x86/alternative.c
> +++ b/xen/arch/x86/alternative.c
> @@ -268,8 +268,7 @@ static void init_or_livepatch _apply_alt
>       * point the branch destination is still NULL, insert "UD2; UD0"
>       * (for ease of recognition) instead of CALL/JMP.
>       */
> -    if ( a->cpuid == X86_FEATURE_ALWAYS &&
> -         *(int32_t *)(buf + 1) == -5 &&
> +    if ( *(int32_t *)(buf + 1) == -5 &&

I'm afraid that this must not become conditional.

One of the reasons I was hesitant towards the mechanics of altcall in
the first place was that it intentionally broke Spectre v2 protections
by manually writing out a non-retpoline'd indirect call.

This is made safe in practice because all altcall sites either get
converted to a direct call, or rewritten to be a UD2.

If you want to make altcall conversions conditional, then the code gen
must be rearranged to use INDIRECT_CALL first, but that comes with
overheads too, because the callsites would then load the function
pointer into a register just to not use it in the patched case.

I suspect it will be easier to figure out how to make the TSC case also
invariant after boot.

~Andrew
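For illustration, a five-byte CALL/JMP whose rel32 displacement is -5
targets its own first byte, which is what makes the build-time "call ."
placeholder recognizable in the quoted hunk; the check then decides
between writing the real direct call and inserting UD2. A minimal
standalone C sketch of that displacement test (illustrative only, not
Xen code):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* 0xe8 is CALL rel32; the bytes fb ff ff ff are -5 in little
         * endian, i.e. this is a "call ." placeholder. */
        const uint8_t buf[5] = { 0xe8, 0xfb, 0xff, 0xff, 0xff };
        int32_t disp;

        memcpy(&disp, buf + 1, sizeof(disp));

        /* A CALL's target is the next instruction (start + 5) plus the
         * displacement, so disp == -5 means the call targets its own
         * first byte: a placeholder awaiting patching to a real direct
         * call, or to UD2 if the function pointer was still NULL. */
        printf("disp = %d -> %s\n", (int)disp,
               disp == -5 ? "self-targeting placeholder" : "real destination");
        return 0;
    }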
On 12.01.2022 10:17, Andrew Cooper wrote:
> On 12/01/2022 08:58, Jan Beulich wrote:
>> Except in the "clocksource=tsc" case we can replace the indirect calls
>> involved in accessing the platform timers by direct ones, as they get
>> established once and never changed.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> Sort of RFC, for both the whether and the how aspects.
>>
>> TBD: Overriding X86_FEATURE_ALWAYS is somewhat dangerous; there's no
>>      issue with e.g. hvm_set_tsc_offset() used later in time.c only
>>      because that's an inline function which already "latched" the
>>      usual value of the macro. But the alternative of introducing an
>>      alternative_call() variant allowing to specify the controlling
>>      feature doesn't look overly nice to me either. Then again the
>>      .resume hook invocation could be patched unconditionally, as the
>>      TSC clocksource leaves this hook set to NULL.
>>
>> --- a/xen/arch/x86/alternative.c
>> +++ b/xen/arch/x86/alternative.c
>> @@ -268,8 +268,7 @@ static void init_or_livepatch _apply_alt
>>       * point the branch destination is still NULL, insert "UD2; UD0"
>>       * (for ease of recognition) instead of CALL/JMP.
>>       */
>> -    if ( a->cpuid == X86_FEATURE_ALWAYS &&
>> -         *(int32_t *)(buf + 1) == -5 &&
>> +    if ( *(int32_t *)(buf + 1) == -5 &&
>
> I'm afraid that this must not become conditional.
>
> One of the reasons I was hesitant towards the mechanics of altcall in
> the first place was that it intentionally broke Spectre v2 protections
> by manually writing out a non-retpoline'd indirect call.
>
> This is made safe in practice because all altcall sites either get
> converted to a direct call, or rewritten to be a UD2.

Oh, sorry, I really should have realized this.

> If you want to make altcall conversions conditional, then the code gen
> must be rearranged to use INDIRECT_CALL first, but that comes with
> overheads too, because the callsites would then load the function
> pointer into a register just to not use it in the patched case.

I don't view this as desirable.

> I suspect it will be easier to figure out how to make the TSC case also
> invariant after boot.

Since switching to "tsc" happens only after bringing up all CPUs, I
don't see how this could become possible; in particular I don't fancy
an alternatives patching pass with all CPUs online (albeit technically
this could be an option). The solution (workaround) I can see for now
is using

    if ( plt_src.read_counter != read_tsc )
        count = alternative_call(plt_src.read_counter);
    else
        count = rdtsc_ordered();

everywhere. Potentially we'd then want to hide this in a static
(inline?) function.

Jan
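A minimal sketch of the static helper floated above, assuming Xen's
alternative_call() and rdtsc_ordered() exactly as used in the snippet;
the helper's name is hypothetical:

    /* Hypothetical wrapper for the suggested pattern (sketch only). */
    static inline uint64_t plt_read_counter(void)
    {
        /*
         * "clocksource=tsc" takes effect only after all CPUs are
         * brought up, so its read_counter hook can never have been
         * patched to a direct call; special-case the TSC at run time
         * and take the (patchable) alternative_call() path for every
         * other platform timer.
         */
        if ( plt_src.read_counter != read_tsc )
            return alternative_call(plt_src.read_counter);

        return rdtsc_ordered();
    }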
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -268,8 +268,7 @@ static void init_or_livepatch _apply_alt
          * point the branch destination is still NULL, insert "UD2; UD0"
          * (for ease of recognition) instead of CALL/JMP.
          */
-        if ( a->cpuid == X86_FEATURE_ALWAYS &&
-             *(int32_t *)(buf + 1) == -5 &&
+        if ( *(int32_t *)(buf + 1) == -5 &&
              a->orig_len >= 6 &&
              orig[0] == 0xff &&
              orig[1] == (*buf & 1 ? 0x25 : 0x15) )
--- a/xen/arch/x86/include/asm/cpufeatures.h
+++ b/xen/arch/x86/include/asm/cpufeatures.h
@@ -24,7 +24,7 @@ XEN_CPUFEATURE(APERFMPERF, X86_SY
 XEN_CPUFEATURE(MFENCE_RDTSC,      X86_SYNTH( 9)) /* MFENCE synchronizes RDTSC */
 XEN_CPUFEATURE(XEN_SMEP,          X86_SYNTH(10)) /* SMEP gets used by Xen itself */
 XEN_CPUFEATURE(XEN_SMAP,          X86_SYNTH(11)) /* SMAP gets used by Xen itself */
-/* Bit 12 - unused. */
+XEN_CPUFEATURE(CS_NOT_TSC,        X86_SYNTH(12)) /* Clocksource is not TSC */
 XEN_CPUFEATURE(IND_THUNK_LFENCE,  X86_SYNTH(13)) /* Use IND_THUNK_LFENCE */
 XEN_CPUFEATURE(IND_THUNK_JMP,     X86_SYNTH(14)) /* Use IND_THUNK_JMP */
 XEN_CPUFEATURE(SC_NO_BRANCH_HARDEN, X86_SYNTH(15)) /* (Disable) Conditional branch hardening */
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -786,6 +786,14 @@ static struct platform_timesource __init
  * GENERIC PLATFORM TIMER INFRASTRUCTURE
  */
 
+/*
+ * Override for alternative call patching: Since "clocksource=tsc" is honored
+ * only after all CPUs have been brought up, we can't patch indirect calls in
+ * that case.
+ */
+#undef X86_FEATURE_ALWAYS
+#define X86_FEATURE_ALWAYS X86_FEATURE_CS_NOT_TSC
+
 /* details of chosen timesource */
 static struct platform_timesource __read_mostly plt_src;
 /* hardware-width mask */
@@ -818,7 +826,7 @@ static void plt_overflow(void *unused)
 
     spin_lock_irq(&platform_timer_lock);
 
-    count = plt_src.read_counter();
+    count = alternative_call(plt_src.read_counter);
     plt_stamp64 += (count - plt_stamp) & plt_mask;
     plt_stamp = count;
@@ -854,7 +862,7 @@ static s_time_t read_platform_stime(u64
     ASSERT(!local_irq_is_enabled());
 
     spin_lock(&platform_timer_lock);
-    plt_counter = plt_src.read_counter();
+    plt_counter = alternative_call(plt_src.read_counter);
     count = plt_stamp64 + ((plt_counter - plt_stamp) & plt_mask);
     stime = __read_platform_stime(count);
     spin_unlock(&platform_timer_lock);
@@ -872,7 +880,8 @@ static void platform_time_calibration(vo
     unsigned long flags;
 
     spin_lock_irqsave(&platform_timer_lock, flags);
-    count = plt_stamp64 + ((plt_src.read_counter() - plt_stamp) & plt_mask);
+    count = plt_stamp64 + ((alternative_call(plt_src.read_counter) -
+                            plt_stamp) & plt_mask);
     stamp = __read_platform_stime(count);
     stime_platform_stamp = stamp;
     platform_timer_stamp = count;
@@ -883,10 +892,10 @@ static void resume_platform_timer(void)
 {
     /* Timer source can be reset when backing from S3 to S0 */
     if ( plt_src.resume )
-        plt_src.resume(&plt_src);
+        alternative_vcall(plt_src.resume, &plt_src);
 
     plt_stamp64 = platform_timer_stamp;
-    plt_stamp = plt_src.read_counter();
+    plt_stamp = alternative_call(plt_src.read_counter);
 }
 
 static void __init reset_platform_timer(void)
@@ -975,6 +984,10 @@ static u64 __init init_platform_timer(vo
     printk("Platform timer is %s %s\n",
            freq_string(pts->frequency), pts->name);
 
+    if ( strcmp(opt_clocksource, "tsc") ||
+         !boot_cpu_has(X86_FEATURE_TSC_RELIABLE) )
+        setup_force_cpu_cap(X86_FEATURE_CS_NOT_TSC);
+
     return rc;
 }
Except in the "clocksource=tsc" case we can replace the indirect calls
involved in accessing the platform timers by direct ones, as they get
established once and never changed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Sort of RFC, for both the whether and the how aspects.

TBD: Overriding X86_FEATURE_ALWAYS is somewhat dangerous; there's no
     issue with e.g. hvm_set_tsc_offset() used later in time.c only
     because that's an inline function which already "latched" the
     usual value of the macro. But the alternative of introducing an
     alternative_call() variant allowing to specify the controlling
     feature doesn't look overly nice to me either. Then again the
     .resume hook invocation could be patched unconditionally, as the
     TSC clocksource leaves this hook set to NULL.
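The "latch" remark comes down to expansion time: an inline function's
body expands macros where the function is defined, while a
function-like macro expands at each use site, so the alternative_call()
uses in time.c see the override but the earlier-defined inline
hvm_set_tsc_offset() does not. A standalone sketch of the distinction
(feature numbers are made up):

    #include <stdio.h>

    #define FEATURE_ALWAYS 1

    /* Inline function: the macro is expanded here, at definition time,
     * so the value is "latched" regardless of later redefinitions (the
     * hvm_set_tsc_offset() situation). */
    static inline int latched_feature(void) { return FEATURE_ALWAYS; }

    /* Function-like macro: expands at each use site, so it sees the
     * override (the alternative_call() situation in time.c). */
    #define CURRENT_FEATURE() FEATURE_ALWAYS

    /* The time.c-style override. */
    #undef FEATURE_ALWAYS
    #define FEATURE_ALWAYS 12 /* stand-in for X86_FEATURE_CS_NOT_TSC */

    int main(void)
    {
        /* Prints "inline: 1, macro: 12". */
        printf("inline: %d, macro: %d\n",
               latched_feature(), CURRENT_FEATURE());
        return 0;
    }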