Message ID | 20211026120132.613201817@infradead.org (mailing list archive) |
---|---|
Headers | show |
Series | x86: Rewrite the retpoline rewrite logic | expand |
On Tue, Oct 26, 2021 at 5:05 AM Peter Zijlstra <peterz@infradead.org> wrote: > > Hi, > > These patches rewrite the way retpolines are rewritten. Currently objtool emits > alternative entries for most retpoline calls. However trying to extend that led > to trouble (ELF files are horrid). > > Therefore completely overhaul this and have objtool emit a .retpoline_sites > section that lists all compiler generated retpoline thunk calls. Then the > kernel can do with them as it pleases. > > Notably it will: > > - rewrite them to indirect instructions for !RETPOLINE > - rewrite them to lfence; indirect; for RETPOLINE_AMD, > where size allows (boo clang!) > > Specifically, the !RETPOLINE case can now also deal with the clang-special > conditional-indirect-tail-call: > > Jcc __x86_indirect_thunk_\reg. > > Finally, also update the x86 BPF jit to catch up to recent times and do these > same things. > > All this should help improve performance by removing an indirection. > > Patches can (soon) be found here: > > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git objtool/core > > Changes since v2: > > - rewrite the __x86_indirect_thunk_array[] stuff again > - rewrite the retpoline,amd rewrite logic, it now also supports > rewriting the Jcc case, if the original instruction is long enough, but > more importantly, it's simpler code. > - bpf label simplification patch > - random assorted cleanups > - actually managed to get bpf selftests working Great. The patchset didn't go through BPF CI though. See https://patchwork.kernel.org/project/netdevbpf/patch/20211026120309.658539311@infradead.org/ It's a merge conflict. The patchset failed to apply to both bpf and bpf-next trees: Cmd('git') failed due to: exit code(128) cmdline: git am -3 stdout: 'Applying: objtool: Classify symbols Patch failed at 0001 objtool: Classify symbols When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort".' stderr: 'error: sha1 information is lacking or useless (tools/objtool/check.c). error: could not build fake ancestor hint: Use 'git am --show-current-patch=diff' to see the failed patch'
On Tue, Oct 26, 2021 at 11:26:57AM -0700, Alexei Starovoitov wrote: > It's a merge conflict. The patchset failed to apply to both bpf and > bpf-next trees: Figures :/ I suspect it relies on tip/objtool/core at the very least and possibly some of the x86 trees as well. I can locally merge tip/master with bpf, but getting a CI to do that might be tricky.
On Tue, Oct 26, 2021 at 11:45 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Tue, Oct 26, 2021 at 11:26:57AM -0700, Alexei Starovoitov wrote: > > > It's a merge conflict. The patchset failed to apply to both bpf and > > bpf-next trees: > > Figures :/ I suspect it relies on tip/objtool/core at the very least and > possibly some of the x86 trees as well. > > I can locally merge tip/master with bpf, but getting a CI to do that > might be tricky. We have an ability in CI to supply few additional patches on top bpf/bpf-next trees, but that's usually done for the cases where we've merged a fix into one tree, but it's needed in both while bpf->net->linus->net-next->bpf-next circle is still pending. Does tip/objtool/core dependency relevant for this set? Can you rebase the current set on top of bpf-next and send it to the list just to get CI to run it? We won't be merging it into bpf-next, of course. I'm mainly interested in seeing all that additional tests passing that we have in bpf-next.
On Tue, Oct 26, 2021 at 01:00:04PM -0700, Alexei Starovoitov wrote: > On Tue, Oct 26, 2021 at 11:45 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Tue, Oct 26, 2021 at 11:26:57AM -0700, Alexei Starovoitov wrote: > > > > > It's a merge conflict. The patchset failed to apply to both bpf and > > > bpf-next trees: > > > > Figures :/ I suspect it relies on tip/objtool/core at the very least and > > possibly some of the x86 trees as well. > > > > I can locally merge tip/master with bpf, but getting a CI to do that > > might be tricky. > > We have an ability in CI to supply few additional patches on top bpf/bpf-next > trees, but that's usually done for the cases where we've merged a fix into > one tree, but it's needed in both while bpf->net->linus->net-next->bpf-next > circle is still pending. > > Does tip/objtool/core dependency relevant for this set? > Can you rebase the current set on top of bpf-next and send it to the list > just to get CI to run it? We won't be merging it into bpf-next, of course. > I'm mainly interested in seeing all that additional tests passing that > we have in bpf-next. I should be able to rebase it just to that, let me try that in the am though, brain is fairly fried atm. Do you really want me to post it to the list, or is a git repo good enough?
On Tue, Oct 26, 2021 at 2:05 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Tue, Oct 26, 2021 at 01:00:04PM -0700, Alexei Starovoitov wrote: > > On Tue, Oct 26, 2021 at 11:45 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Tue, Oct 26, 2021 at 11:26:57AM -0700, Alexei Starovoitov wrote: > > > > > > > It's a merge conflict. The patchset failed to apply to both bpf and > > > > bpf-next trees: > > > > > > Figures :/ I suspect it relies on tip/objtool/core at the very least and > > > possibly some of the x86 trees as well. > > > > > > I can locally merge tip/master with bpf, but getting a CI to do that > > > might be tricky. > > > > We have an ability in CI to supply few additional patches on top bpf/bpf-next > > trees, but that's usually done for the cases where we've merged a fix into > > one tree, but it's needed in both while bpf->net->linus->net-next->bpf-next > > circle is still pending. > > > > Does tip/objtool/core dependency relevant for this set? > > Can you rebase the current set on top of bpf-next and send it to the list > > just to get CI to run it? We won't be merging it into bpf-next, of course. > > I'm mainly interested in seeing all that additional tests passing that > > we have in bpf-next. > > I should be able to rebase it just to that, let me try that in the am > though, brain is fairly fried atm. Do you really want me to post it to > the list, or is a git repo good enough? Please post it. CI cannot pull it from the repo.
On Tue, Oct 26, 2021 at 02:05:55PM -0700, Alexei Starovoitov wrote:
> Please post it. CI cannot pull it from the repo.
Done:
https://lore.kernel.org/bpf/20211027085243.008677168@infradead.org/T/#t
On Wed, Oct 27, 2021 at 2:02 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Tue, Oct 26, 2021 at 02:05:55PM -0700, Alexei Starovoitov wrote: > > > Please post it. CI cannot pull it from the repo. > > Done: > > https://lore.kernel.org/bpf/20211027085243.008677168@infradead.org/T/#t Perfect. Thanks! Looks like vger got into 'delete all users' mode. Looks like my and many other emails were automatically unsubscribed from many mailing lists. bpf@vger, netdev@vger, linux-kernel@vger user count looks very low compared to a few weeks ago. I doubt all these users decided to unsubscribe themselves. Anyway looks like BPF CI doesn't see the patchset yet for some reason (we're debugging), but I've tested your patchset manually the same way CI would do and reviewed most of the patches. Please add: Acked-by: Alexei Starovoitov <ast@kernel.org> to bpf patches 16 and 17 and Tested-by: Alexei Starovoitov <ast@kernel.org> for the whole set. Thanks!
On Tue, Oct 26, 2021 at 02:01:32PM +0200, Peter Zijlstra wrote: > Hi, > > These patches rewrite the way retpolines are rewritten. Currently objtool emits > alternative entries for most retpoline calls. However trying to extend that led > to trouble (ELF files are horrid). > > Therefore completely overhaul this and have objtool emit a .retpoline_sites > section that lists all compiler generated retpoline thunk calls. Then the > kernel can do with them as it pleases. > > Notably it will: > > - rewrite them to indirect instructions for !RETPOLINE > - rewrite them to lfence; indirect; for RETPOLINE_AMD, > where size allows (boo clang!) > > Specifically, the !RETPOLINE case can now also deal with the clang-special > conditional-indirect-tail-call: > > Jcc __x86_indirect_thunk_\reg. > > Finally, also update the x86 BPF jit to catch up to recent times and do these > same things. > > All this should help improve performance by removing an indirection. > > Patches can (soon) be found here: > > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git objtool/core > > Changes since v2: > > - rewrite the __x86_indirect_thunk_array[] stuff again > - rewrite the retpoline,amd rewrite logic, it now also supports > rewriting the Jcc case, if the original instruction is long enough, but > more importantly, it's simpler code. > - bpf label simplification patch > - random assorted cleanups > - actually managed to get bpf selftests working Good stuff! Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
On Tue, Oct 26, 2021 at 02:01:32PM +0200, Peter Zijlstra wrote: > Hi, > > These patches rewrite the way retpolines are rewritten. Currently objtool emits > alternative entries for most retpoline calls. However trying to extend that led > to trouble (ELF files are horrid). > > Therefore completely overhaul this and have objtool emit a .retpoline_sites > section that lists all compiler generated retpoline thunk calls. Then the > kernel can do with them as it pleases. > > Notably it will: > > - rewrite them to indirect instructions for !RETPOLINE > - rewrite them to lfence; indirect; for RETPOLINE_AMD, > where size allows (boo clang!) > > Specifically, the !RETPOLINE case can now also deal with the clang-special > conditional-indirect-tail-call: > > Jcc __x86_indirect_thunk_\reg. > > Finally, also update the x86 BPF jit to catch up to recent times and do these > same things. > > All this should help improve performance by removing an indirection. > > Patches can (soon) be found here: > > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git objtool/core > > Changes since v2: > > - rewrite the __x86_indirect_thunk_array[] stuff again > - rewrite the retpoline,amd rewrite logic, it now also supports > rewriting the Jcc case, if the original instruction is long enough, but > more importantly, it's simpler code. > - bpf label simplification patch > - random assorted cleanups > - actually managed to get bpf selftests working > > --- > arch/um/kernel/um_arch.c | 4 + > arch/x86/include/asm/GEN-for-each-reg.h | 14 ++- > arch/x86/include/asm/alternative.h | 1 + > arch/x86/include/asm/asm-prototypes.h | 18 --- > arch/x86/include/asm/nospec-branch.h | 72 ++--------- > arch/x86/kernel/alternative.c | 189 ++++++++++++++++++++++++++++- > arch/x86/kernel/cpu/bugs.c | 7 -- > arch/x86/kernel/module.c | 9 +- > arch/x86/kernel/vmlinux.lds.S | 14 +++ > arch/x86/lib/retpoline.S | 56 ++------- > arch/x86/net/bpf_jit_comp.c | 160 +++++++++--------------- > arch/x86/net/bpf_jit_comp32.c | 22 +++- > tools/objtool/arch/x86/decode.c | 120 ------------------ > tools/objtool/check.c | 208 ++++++++++++++++++++++---------- > tools/objtool/elf.c | 84 ------------- > tools/objtool/include/objtool/check.h | 1 - > tools/objtool/include/objtool/elf.h | 6 +- > tools/objtool/special.c | 8 -- > 18 files changed, 472 insertions(+), 521 deletions(-) Ok, this all looks real nice, thx! Reviewed-by: Borislav Petkov <bp@suse.de>
On Tue, 26 Oct 2021, Peter Zijlstra wrote: > Hi, > > These patches rewrite the way retpolines are rewritten. Currently objtool emits > alternative entries for most retpoline calls. However trying to extend that led > to trouble (ELF files are horrid). > > Therefore completely overhaul this and have objtool emit a .retpoline_sites > section that lists all compiler generated retpoline thunk calls. Then the > kernel can do with them as it pleases. > > Notably it will: > > - rewrite them to indirect instructions for !RETPOLINE > - rewrite them to lfence; indirect; for RETPOLINE_AMD, > where size allows (boo clang!) > > Specifically, the !RETPOLINE case can now also deal with the clang-special > conditional-indirect-tail-call: > > Jcc __x86_indirect_thunk_\reg. > > Finally, also update the x86 BPF jit to catch up to recent times and do these > same things. > > All this should help improve performance by removing an indirection. > > Patches can (soon) be found here: > > git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git objtool/core > > Changes since v2: > > - rewrite the __x86_indirect_thunk_array[] stuff again > - rewrite the retpoline,amd rewrite logic, it now also supports > rewriting the Jcc case, if the original instruction is long enough, but > more importantly, it's simpler code. > - bpf label simplification patch > - random assorted cleanups > - actually managed to get bpf selftests working It is already in tip, but FWIW the objtool changes look good to me Reviewed-by: Miroslav Benes <mbenes@suse.cz> M