Message ID | 20250213191457.12377-1-ubizjak@gmail.com (mailing list archive) |
---|---|
State | New |
Series | [RESEND,1/2] x86/locking: Use ALT_OUTPUT_SP() for percpu_{,try_}cmpxchg{64,128}_op() |
On 2/13/25 11:14, Uros Bizjak wrote:
> percpu_{,try_}cmpxchg{64,128}() macros use CALL instruction inside
> asm statement in one of their alternatives. Use ALT_OUTPUT_SP()
> macro to add required dependence on %esp register.

Is this just a pedantic fix? Or is there an actual impact to end users
that needs to be considered?

Basically, you've told me what the patch does, but not why anyone should
care or why it should be applied.
On Thu, Feb 13, 2025 at 9:43 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 2/13/25 11:14, Uros Bizjak wrote:
> > percpu_{,try_}cmpxchg{64,128}() macros use CALL instruction inside
> > asm statement in one of their alternatives. Use ALT_OUTPUT_SP()
> > macro to add required dependence on %esp register.
>
> Is this just a pedantic fix? Or is there an actual impact to end users
> that needs to be considered?

When a call insn is embedded in the asm, the compiler doesn't know
that, because of the call insn, the asm now depends on the stack pointer
or frame pointer, so it is free to schedule the instruction outside the
function frame prologue/epilogue. Currently, this only triggers an
objtool warning, but if we ever compile the kernel with a redzone (IIRC,
it was mentioned that this is possible with a FRED-enabled kernel), the
call will clobber the redzone.

Please note that alternative_call() family of functions,
__alternative_atomic64() and __arch_{,try_}cmpxchg64_emu() all use the
same macro exactly for the reason explained above.

OTOH, all recent x86_64 processors support CMPXCHG128 insn, so the
call alternative will be rarely used.

> Basically, you've told me what the patch does, but not why anyone should
> care or why it should be applied.

This is actually explained at length in the comment for
ASM_CALL_CONSTRAINT, which ALT_OUTPUT_SP macro uses.

Uros.
On 2/13/25 13:17, Uros Bizjak wrote:
>> Basically, you've told me what the patch does, but not why anyone should
>> care or why it should be applied.
> This is actually explained at length in the comment for
> ASM_CALL_CONSTRAINT, which ALT_OUTPUT_SP macro uses.

Great info, thanks! Could you give the patch another shot and include
this in the changelog, please? Better yet, you could paraphrase the
comment so that we don't have to go searching for it.
On Thu, 13 Feb 2025, Uros Bizjak wrote:

> OTOH, all recent x86_64 processors support CMPXCHG128 insn, so the
> call alternative will be rarely used.

Do we still support processors without cmpxchg128? If not then let's just
drop the calls from the kernel.
On Fri, Feb 14, 2025 at 7:22 PM Christoph Lameter (Ampere) <cl@gentwo.org> wrote:
>
> On Thu, 13 Feb 2025, Uros Bizjak wrote:
>
> > OTOH, all recent x86_64 processors support CMPXCHG128 insn, so the
> > call alternative will be rarely used.
>
> Do we still support processors without cmpxchg128? If not then let's just
> drop the calls from the kernel.

I'm not aware of any discussion about that decision.

Thanks,
Uros.
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index e525cd85f999..0ab991fba7de 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -350,9 +350,9 @@ do { \
 \
 	asm qual (ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
 		"cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
-		: [var] "+m" (__my_cpu_var(_var)), \
-		  "+a" (old__.low), \
-		  "+d" (old__.high) \
+		: ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)), \
+				"+a" (old__.low), \
+				"+d" (old__.high)) \
 		: "b" (new__.low), \
 		  "c" (new__.high), \
 		  "S" (&(_var)) \
@@ -381,10 +381,10 @@ do { \
 	asm qual (ALTERNATIVE("call this_cpu_cmpxchg8b_emu", \
 		"cmpxchg8b " __percpu_arg([var]), X86_FEATURE_CX8) \
 		CC_SET(z) \
-		: CC_OUT(z) (success), \
-		  [var] "+m" (__my_cpu_var(_var)), \
-		  "+a" (old__.low), \
-		  "+d" (old__.high) \
+		: ALT_OUTPUT_SP(CC_OUT(z) (success), \
+				[var] "+m" (__my_cpu_var(_var)), \
+				"+a" (old__.low), \
+				"+d" (old__.high)) \
 		: "b" (new__.low), \
 		  "c" (new__.high), \
 		  "S" (&(_var)) \
@@ -421,9 +421,9 @@ do { \
 \
 	asm qual (ALTERNATIVE("call this_cpu_cmpxchg16b_emu", \
 		"cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
-		: [var] "+m" (__my_cpu_var(_var)), \
-		  "+a" (old__.low), \
-		  "+d" (old__.high) \
+		: ALT_OUTPUT_SP([var] "+m" (__my_cpu_var(_var)), \
+				"+a" (old__.low), \
+				"+d" (old__.high)) \
 		: "b" (new__.low), \
 		  "c" (new__.high), \
 		  "S" (&(_var)) \
@@ -452,10 +452,10 @@ do { \
 	asm qual (ALTERNATIVE("call this_cpu_cmpxchg16b_emu", \
 		"cmpxchg16b " __percpu_arg([var]), X86_FEATURE_CX16) \
 		CC_SET(z) \
-		: CC_OUT(z) (success), \
-		  [var] "+m" (__my_cpu_var(_var)), \
-		  "+a" (old__.low), \
-		  "+d" (old__.high) \
+		: ALT_OUTPUT_SP(CC_OUT(z) (success), \
+				[var] "+m" (__my_cpu_var(_var)), \
+				"+a" (old__.low), \
+				"+d" (old__.high)) \
 		: "b" (new__.low), \
 		  "c" (new__.high), \
 		  "S" (&(_var)) \
percpu_{,try_}cmpxchg{64,128}() macros use CALL instruction inside
asm statement in one of their alternatives. Use ALT_OUTPUT_SP()
macro to add required dependence on %esp register.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
---
 arch/x86/include/asm/percpu.h | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)
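
The stack pointer dependence discussed in this thread can be sketched in user space. This is only an illustration, not the kernel's code: every demo_* / DEMO_* name below is hypothetical, the stack pointer operand is assumed to work the way the thread describes ASM_CALL_CONSTRAINT (a read-write register operand bound to the stack pointer), and the asm path is assumed to be built with GCC or Clang for x86_64 using the System V calling convention. On other targets the sketch falls back to a plain C call.

```c
/* Hypothetical user-space demo; all demo_* / DEMO_* names are illustrative. */
#if defined(__x86_64__) && defined(__GNUC__) && !defined(_WIN32)
#define DEMO_HAVE_ASM_CALL 1
/* Global register variable bound to the stack pointer. */
register unsigned long demo_stack_pointer __asm__("rsp");
/* Read-write operand that makes an asm statement depend on the stack pointer. */
#define DEMO_CALL_CONSTRAINT "+r" (demo_stack_pointer)
/* Prepend the stack pointer operand to a non-empty list of output operands. */
#define DEMO_OUTPUT_SP(...) DEMO_CALL_CONSTRAINT, __VA_ARGS__
#endif

/* Stand-in for an out-of-line routine that is reached through CALL. */
long demo_add_one(long x)
{
	return x + 1;
}

long demo_call_add_one(long x)
{
#ifdef DEMO_HAVE_ASM_CALL
	long ret;

	/*
	 * The compiler cannot see the CALL inside the asm string. The
	 * stack pointer output operand tells it that the statement uses
	 * the stack. The argument goes in %rdi and the result comes back
	 * in %rax; the remaining caller-saved integer registers are
	 * listed as clobbers. Assumption: the callee uses integer
	 * registers only and does not need a 16-byte aligned stack, so
	 * no vector registers are listed.
	 */
	__asm__ __volatile__("call *%[fn]"
			     : DEMO_OUTPUT_SP("=a" (ret), "+D" (x))
			     : [fn] "r" (demo_add_one)
			     : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11",
			       "memory", "cc");
	return ret;
#else
	/* Fallback for other targets: an ordinary C call. */
	return demo_add_one(x);
#endif
}
```

Calling demo_call_add_one(41) is expected to return 42 on either path. Wrapping the output list in a macro, rather than repeating the stack pointer operand at each call site, mirrors the shape of the change in the diff above, where the existing outputs are passed through ALT_OUTPUT_SP().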