Message ID | 4d66eb4d-4044-8b48-d7cc-354a236e6b26@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86: some assembler macro rework | expand |
On Mon, Sep 28, 2020 at 02:32:24PM +0200, Jan Beulich wrote: > There's no point having every replacement variant to also specify the > INT3 - just have it once in the base macro. When patching, NOPs will get > inserted, which are fine to speculate through (until reaching the INT3). > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> > --- > I also wonder whether the LFENCE in IND_THUNK_RETPOLINE couldn't be > replaced by INT3 as well. Of course the effect will be marginal, as the > size of the thunk will still be 16 bytes when including tail padding > resulting from alignment. I think Andrew is the best one to have an opinion on this. Thanks, Roger.
On 28/09/2020 13:32, Jan Beulich wrote: > There's no point having every replacement variant to also specify the > INT3 - just have it once in the base macro. When patching, NOPs will get > inserted, which are fine to speculate through (until reaching the INT3). > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > I also wonder whether the LFENCE in IND_THUNK_RETPOLINE couldn't be > replaced by INT3 as well. Of course the effect will be marginal, as the > size of the thunk will still be 16 bytes when including tail padding > resulting from alignment. There are surprising performance implications from the choice of speculation blocker. RSB filling in particular had a benefit (up to 6% iirc) from unrolling the loop. Any differences here are likely to be marginal, whereas for inline retpoline, the code volume reduction might easily be the winning factor. > --- > v2: New. > > --- a/xen/arch/x86/indirect-thunk.S > +++ b/xen/arch/x86/indirect-thunk.S > @@ -11,6 +11,8 @@ > > #include <asm/asm_defns.h> > > +.purgem ret This needs a comment. ~Andrew
--- a/xen/arch/x86/indirect-thunk.S +++ b/xen/arch/x86/indirect-thunk.S @@ -11,6 +11,8 @@ #include <asm/asm_defns.h> +.purgem ret + .macro IND_THUNK_RETPOLINE reg:req call 2f 1: @@ -24,12 +26,10 @@ .macro IND_THUNK_LFENCE reg:req lfence jmp *%\reg - int3 /* Halt straight-line speculation */ .endm .macro IND_THUNK_JMP reg:req jmp *%\reg - int3 /* Halt straight-line speculation */ .endm /* @@ -44,6 +44,8 @@ ENTRY(__x86_indirect_thunk_\reg) __stringify(IND_THUNK_LFENCE \reg), X86_FEATURE_IND_THUNK_LFENCE, \ __stringify(IND_THUNK_JMP \reg), X86_FEATURE_IND_THUNK_JMP + int3 /* Halt straight-line speculation */ + .size __x86_indirect_thunk_\reg, . - __x86_indirect_thunk_\reg .type __x86_indirect_thunk_\reg, @function .endm
There's no point having every replacement variant to also specify the INT3 - just have it once in the base macro. When patching, NOPs will get inserted, which are fine to speculate through (until reaching the INT3). Signed-off-by: Jan Beulich <jbeulich@suse.com> --- I also wonder whether the LFENCE in IND_THUNK_RETPOLINE couldn't be replaced by INT3 as well. Of course the effect will be marginal, as the size of the thunk will still be 16 bytes when including tail padding resulting from alignment. --- v2: New.