Message ID | 20240228135908.13319-1-roger.pau@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/altcall: always use a temporary parameter stashing variable | expand |
On 28.02.2024 14:59, Roger Pau Monne wrote: > The usage in ALT_CALL_ARG() on clang of: > > register union { > typeof(arg) e; > const unsigned long r; > } ... > > When `arg` is the first argument to alternative_{,v}call() and > const_vlapic_vcpu() is used results in clang 3.5.0 complaining with: > > arch/x86/hvm/vlapic.c:141:47: error: non-const static data member must be initialized out of line > alternative_call(hvm_funcs.test_pir, const_vlapic_vcpu(vlapic), vec) ) > > Workaround this by pulling `arg1` into a local variable, like it's done for > further arguments (arg2, arg3...) > > Originally arg1 wasn't pulled into a variable because for the a1_ register > local variable the possible clobbering as a result of operators on other > variables don't matter: > > https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables > > Note clang version 3.8.1 seems to already be fixed and don't require the > workaround, but since it's harmless do it uniformly everywhere. > > Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> > Fixes: 2ce562b2a413 ('x86/altcall: use a union as register type for function parameters on clang') > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> I'm okay with this change, but since you don't mention anything in this regard: Did you look at whether / how generated code (with gcc) changes? Jan
On Wed, Feb 28, 2024 at 03:28:25PM +0100, Jan Beulich wrote: > On 28.02.2024 14:59, Roger Pau Monne wrote: > > The usage in ALT_CALL_ARG() on clang of: > > > > register union { > > typeof(arg) e; > > const unsigned long r; > > } ... > > > > When `arg` is the first argument to alternative_{,v}call() and > > const_vlapic_vcpu() is used results in clang 3.5.0 complaining with: > > > > arch/x86/hvm/vlapic.c:141:47: error: non-const static data member must be initialized out of line > > alternative_call(hvm_funcs.test_pir, const_vlapic_vcpu(vlapic), vec) ) > > > > Workaround this by pulling `arg1` into a local variable, like it's done for > > further arguments (arg2, arg3...) > > > > Originally arg1 wasn't pulled into a variable because for the a1_ register > > local variable the possible clobbering as a result of operators on other > > variables don't matter: > > > > https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables > > > > Note clang version 3.8.1 seems to already be fixed and don't require the > > workaround, but since it's harmless do it uniformly everywhere. > > > > Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> > > Fixes: 2ce562b2a413 ('x86/altcall: use a union as register type for function parameters on clang') > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > I'm okay with this change, but since you don't mention anything in this > regard: Did you look at whether / how generated code (with gcc) changes? So the specific example of vlapic_test_irq() shows no changes to the generated code. bloat-o-meter shows a 0 delta: add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0) Function old new delta Total: Before=3570098, After=3570098, chg +0.00% I assume there can be some reordering of instructions, as we now force arg1 temporary variable (v1_) to be initialized ahead of the rest of the arguments. Thanks, Roger.
On 28.02.2024 17:56, Roger Pau Monné wrote: > On Wed, Feb 28, 2024 at 03:28:25PM +0100, Jan Beulich wrote: >> On 28.02.2024 14:59, Roger Pau Monne wrote: >>> The usage in ALT_CALL_ARG() on clang of: >>> >>> register union { >>> typeof(arg) e; >>> const unsigned long r; >>> } ... >>> >>> When `arg` is the first argument to alternative_{,v}call() and >>> const_vlapic_vcpu() is used results in clang 3.5.0 complaining with: >>> >>> arch/x86/hvm/vlapic.c:141:47: error: non-const static data member must be initialized out of line >>> alternative_call(hvm_funcs.test_pir, const_vlapic_vcpu(vlapic), vec) ) >>> >>> Workaround this by pulling `arg1` into a local variable, like it's done for >>> further arguments (arg2, arg3...) >>> >>> Originally arg1 wasn't pulled into a variable because for the a1_ register >>> local variable the possible clobbering as a result of operators on other >>> variables don't matter: >>> >>> https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables >>> >>> Note clang version 3.8.1 seems to already be fixed and don't require the >>> workaround, but since it's harmless do it uniformly everywhere. >>> >>> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> >>> Fixes: 2ce562b2a413 ('x86/altcall: use a union as register type for function parameters on clang') >>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> >> >> I'm okay with this change, but since you don't mention anything in this >> regard: Did you look at whether / how generated code (with gcc) changes? > > So the specific example of vlapic_test_irq() shows no changes to the > generated code. bloat-o-meter shows a 0 delta: > > add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0) > Function old new delta > Total: Before=3570098, After=3570098, chg +0.00% > > I assume there can be some reordering of instructions, as we now force > arg1 temporary variable (v1_) to be initialized ahead of the rest of > the arguments. Yeah, re-ordering is okay. I was more worried of new unnecessary register copying, for example. Thanks for confirming. Jan
diff --git a/xen/arch/x86/include/asm/alternative.h b/xen/arch/x86/include/asm/alternative.h index 3c14db5078ba..0d3697f1de49 100644 --- a/xen/arch/x86/include/asm/alternative.h +++ b/xen/arch/x86/include/asm/alternative.h @@ -253,21 +253,24 @@ extern void alternative_branches(void); }) #define alternative_vcall1(func, arg) ({ \ - ALT_CALL_ARG(arg, 1); \ + typeof(arg) v1_ = (arg); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_NO_ARG2; \ (void)sizeof(func(arg)); \ (void)alternative_callN(1, int, func); \ }) #define alternative_call1(func, arg) ({ \ - ALT_CALL_ARG(arg, 1); \ + typeof(arg) v1_ = (arg); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_NO_ARG2; \ alternative_callN(1, typeof(func(arg)), func); \ }) #define alternative_vcall2(func, arg1, arg2) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_NO_ARG3; \ (void)sizeof(func(arg1, arg2)); \ @@ -275,17 +278,19 @@ extern void alternative_branches(void); }) #define alternative_call2(func, arg1, arg2) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_NO_ARG3; \ alternative_callN(2, typeof(func(arg1, arg2)), func); \ }) #define alternative_vcall3(func, arg1, arg2, arg3) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ typeof(arg3) v3_ = (arg3); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_ARG(v3_, 3); \ ALT_CALL_NO_ARG4; \ @@ -294,9 +299,10 @@ extern void alternative_branches(void); }) #define alternative_call3(func, arg1, arg2, arg3) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ typeof(arg3) v3_ = (arg3); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_ARG(v3_, 3); \ ALT_CALL_NO_ARG4; \ @@ -305,10 +311,11 @@ extern void alternative_branches(void); }) #define alternative_vcall4(func, arg1, arg2, arg3, arg4) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ typeof(arg3) v3_ = (arg3); \ typeof(arg4) v4_ = (arg4); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_ARG(v3_, 3); \ ALT_CALL_ARG(v4_, 4); \ @@ -318,10 +325,11 @@ extern void alternative_branches(void); }) #define alternative_call4(func, arg1, arg2, arg3, arg4) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ typeof(arg3) v3_ = (arg3); \ typeof(arg4) v4_ = (arg4); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_ARG(v3_, 3); \ ALT_CALL_ARG(v4_, 4); \ @@ -332,11 +340,12 @@ extern void alternative_branches(void); }) #define alternative_vcall5(func, arg1, arg2, arg3, arg4, arg5) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ typeof(arg3) v3_ = (arg3); \ typeof(arg4) v4_ = (arg4); \ typeof(arg5) v5_ = (arg5); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_ARG(v3_, 3); \ ALT_CALL_ARG(v4_, 4); \ @@ -347,11 +356,12 @@ extern void alternative_branches(void); }) #define alternative_call5(func, arg1, arg2, arg3, arg4, arg5) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ typeof(arg3) v3_ = (arg3); \ typeof(arg4) v4_ = (arg4); \ typeof(arg5) v5_ = (arg5); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_ARG(v3_, 3); \ ALT_CALL_ARG(v4_, 4); \ @@ -363,12 +373,13 @@ extern void alternative_branches(void); }) #define alternative_vcall6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ typeof(arg3) v3_ = (arg3); \ typeof(arg4) v4_ = (arg4); \ typeof(arg5) v5_ = (arg5); \ typeof(arg6) v6_ = (arg6); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_ARG(v3_, 3); \ ALT_CALL_ARG(v4_, 4); \ @@ -379,12 +390,13 @@ extern void alternative_branches(void); }) #define alternative_call6(func, arg1, arg2, arg3, arg4, arg5, arg6) ({ \ + typeof(arg1) v1_ = (arg1); \ typeof(arg2) v2_ = (arg2); \ typeof(arg3) v3_ = (arg3); \ typeof(arg4) v4_ = (arg4); \ typeof(arg5) v5_ = (arg5); \ typeof(arg6) v6_ = (arg6); \ - ALT_CALL_ARG(arg1, 1); \ + ALT_CALL_ARG(v1_, 1); \ ALT_CALL_ARG(v2_, 2); \ ALT_CALL_ARG(v3_, 3); \ ALT_CALL_ARG(v4_, 4); \
The usage in ALT_CALL_ARG() on clang of: register union { typeof(arg) e; const unsigned long r; } ... When `arg` is the first argument to alternative_{,v}call() and const_vlapic_vcpu() is used results in clang 3.5.0 complaining with: arch/x86/hvm/vlapic.c:141:47: error: non-const static data member must be initialized out of line alternative_call(hvm_funcs.test_pir, const_vlapic_vcpu(vlapic), vec) ) Workaround this by pulling `arg1` into a local variable, like it's done for further arguments (arg2, arg3...) Originally arg1 wasn't pulled into a variable because for the a1_ register local variable the possible clobbering as a result of operators on other variables don't matter: https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables Note clang version 3.8.1 seems to already be fixed and don't require the workaround, but since it's harmless do it uniformly everywhere. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Fixes: 2ce562b2a413 ('x86/altcall: use a union as register type for function parameters on clang') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- Gitlab CI seems OK on both the 4.17 and staging branches with this applied: 4.17: https://gitlab.com/xen-project/people/royger/xen/-/pipelines/1193801183 staging: https://gitlab.com/xen-project/people/royger/xen/-/pipelines/1193801881 --- xen/arch/x86/include/asm/alternative.h | 36 +++++++++++++++++--------- 1 file changed, 24 insertions(+), 12 deletions(-)