Message ID | 20190208150826.44EBC68DD2@newverein.lst.de (mailing list archive) |
---|---|
Series | arm64: ftrace with regs |

Hi Torsten,

On 08/02/2019 15:08, Torsten Duwe wrote:
> Patch series v8, as discussed.
> The whole series applies cleanly on 5.0-rc5
>

For this series:

Tested-by: Julien Thierry <julien.thierry@arm.com>

> ---
> arch/arm64/Kconfig                    |   4 +
> arch/arm64/Makefile                   |  10 ++
> arch/arm64/include/asm/ftrace.h       |  16 ++++
> arch/arm64/include/asm/module.h       |   3
> arch/arm64/kernel/Makefile            |  12 +--
> arch/arm64/kernel/entry-ftrace.S      | 125 ++++++++++++++++++++++++++++++++--
> arch/arm64/kernel/ftrace.c            | 117 ++++++++++++++++++++++++-------
> arch/arm64/kernel/module-plts.c       |   3
> arch/arm64/kernel/module.c            |   2
> arch/arm64/lib/Makefile               |   4 -
> drivers/firmware/efi/libstub/Makefile |  12 +--
> include/asm-generic/vmlinux.lds.h     |   2
> include/linux/compiler_types.h        |   4 +
> kernel/module.c                       |  14 +++
> mm/kasan/Makefile                     |   8 +-
> 15 files changed, 281 insertions(+), 55 deletions(-)
> ---
> changes since v7:
>
> * -pg -> $(CC_FLAGS_FTRACE) cleanup now split according to subtree
>   maintainership.
>
> * REC_IP_BRANCH_OFFSET is gone, the functionality went into
>   ftrace_call_adjust(), where it belongs.
>
> * MOV_X9_X30 macro is gone (why did we argue about its name anyway?);
>   it is only used once now in the initial ftrace_make_nop
>   new helper function ftrace_setup_lr_saver(), suggested by Julien.
>
> * call site processing was missing for modules. Fixed.
>
> changes since v6:
>
> * change the stack layout once more; I hope I have it the "standard" way now.
>   And yes, it looks simpler and cleaner; thanks, Mark, for nagging.
>
> * split out the independent Kconfig and Makefile changes
>
> * fixed style issues
>
> * s/fp/x29/g
>
> * MCOUNT_ADDR is now merely a 64-bit magic, as this is totally sufficient.
>
> * QUICK_LR_SAVE renamed back to MOV_X9_X30.
>
> * place MOV_X9_X30 insns on bootup, and only flip b <-> nop at runtime
>
> * graph tracer "ifdeffery" reshuffle
>
> Torsten

On Wed, Feb 13, 2019 at 11:11:04AM +0000, Julien Thierry wrote: > Hi Torsten, > > On 08/02/2019 15:08, Torsten Duwe wrote: > > Patch series v8, as discussed. > > The whole series applies cleanly on 5.0-rc5 So what's the status now? Besides debatable minor style issues there were no more objections to v8. Would this go through the ARM repo or via the ftrace repo? Torsten
Hi Torsten, On Mon, Mar 11, 2019 at 12:49:46PM +0100, Torsten Duwe wrote: > On Wed, Feb 13, 2019 at 11:11:04AM +0000, Julien Thierry wrote: > > Hi Torsten, > > > > On 08/02/2019 15:08, Torsten Duwe wrote: > > > Patch series v8, as discussed. > > > The whole series applies cleanly on 5.0-rc5 > > So what's the status now? Besides debatable minor style > issues there were no more objections to v8. Would this > go through the ARM repo or via the ftrace repo? Sorry, I have some half-written review comments that I will clean up and send shortly. As commented on prior versions, I'd very much like to see the MCOUNT_ADDR hack go, by teaching the core ftrace code to not assume that an mcount symbol exists. We should be able to do that by separating the notion of NOPing a patch site from the notion of initializing it for the first time. Thanks, Mark.
On Mon, Mar 11, 2019 at 12:18:05PM +0000, Mark Rutland wrote: > Hi Torsten, > > On Mon, Mar 11, 2019 at 12:49:46PM +0100, Torsten Duwe wrote: > > On Wed, Feb 13, 2019 at 11:11:04AM +0000, Julien Thierry wrote: > > > Hi Torsten, > > > > > > On 08/02/2019 15:08, Torsten Duwe wrote: > > > > Patch series v8, as discussed. > > > > The whole series applies cleanly on 5.0-rc5 > > > > So what's the status now? Besides debatable minor style > > issues there were no more objections to v8. Would this > > go through the ARM repo or via the ftrace repo? > > Sorry, I have some half-written review comments that I will clean up and > send shortly. Ping? > As commented on prior versions, I'd very much like to see the > MCOUNT_ADDR hack go, by teaching the core ftrace code to not assume that > an mcount symbol exists. > We should be able to do that by separating the notion of NOPing a patch > site from the notion of initializing it for the first time. This is generally a good idea, and would affect other architectures as well, see arch/s390/kernel/ftrace.c ftrace_make_nop(...) I propose to do this in a second round. Torsten
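
The separation Mark asks for, between initializing a patch site once and toggling it at runtime, can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct patch_site`, `site_init()`, `site_make_call()` and `site_make_nop()` are hypothetical names and not the core ftrace API, and the BL encoding uses a placeholder zero offset instead of a real trampoline address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INSN_NOP 0xd503201fu /* AArch64 NOP */
#define INSN_BL  0x94000000u /* AArch64 BL with a placeholder zero offset */

/* Hypothetical life cycle of one patchable call site. */
enum site_state { SITE_UNINIT, SITE_NOP, SITE_CALL };

struct patch_site {
    enum site_state state;
    uint32_t insn; /* instruction currently at the site */
};

/*
 * One-time setup. Nothing is assumed about what the compiler left at the
 * site, so no "old" mcount call (and no MCOUNT_ADDR) has to be matched.
 */
int site_init(struct patch_site *s)
{
    assert(s != NULL);
    if (s->state != SITE_UNINIT)
        return -1;
    s->insn = INSN_NOP;
    s->state = SITE_NOP;
    return 0;
}

/* Runtime transition nop -> call; the old instruction is verified first. */
int site_make_call(struct patch_site *s)
{
    assert(s != NULL);
    if (s->state != SITE_NOP || s->insn != INSN_NOP)
        return -1;
    s->insn = INSN_BL;
    s->state = SITE_CALL;
    return 0;
}

/* Runtime transition call -> nop; refuses to touch an uninitialized site. */
int site_make_nop(struct patch_site *s)
{
    assert(s != NULL);
    if (s->state != SITE_CALL || s->insn != INSN_BL)
        return -1;
    s->insn = INSN_NOP;
    s->state = SITE_NOP;
    return 0;
}
```

In this model the runtime "make nop" path never needs to know what an unpatched site looks like, which is the part of the mcount assumption that the MCOUNT_ADDR magic works around.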

On Mon, Mar 11, 2019 at 12:49:46PM +0100, Torsten Duwe wrote:
> On Wed, Feb 13, 2019 at 11:11:04AM +0000, Julien Thierry wrote:
> > Hi Torsten,
> >
> > On 08/02/2019 15:08, Torsten Duwe wrote:
> > > Patch series v8, as discussed.
> > > The whole series applies cleanly on 5.0-rc5
>
> So what's the status now? Besides debatable minor style
> issues there were no more objections to v8. Would this
> go through the ARM repo or via the ftrace repo?

Sorry again for the delay on this. I'm now back in the office and in
front of a computer daily, so I can spend a bit more time on this.

Regardless of anything else, I think that we should queue the first
three patches now. I've poked the relevant maintainers for their acks so
that those can be taken via the arm64 tree.

I'm happy to do the trivial cleanups on the last couple of patches (e.g.
s/lr/x30), and I'm actively looking at the API rework I requested.

Thanks,
Mark.

On Mon, Apr 08, 2019 at 04:36:28PM +0100, Mark Rutland wrote: > On Mon, Mar 11, 2019 at 12:49:46PM +0100, Torsten Duwe wrote: > > On Wed, Feb 13, 2019 at 11:11:04AM +0000, Julien Thierry wrote: > > > Hi Torsten, > > > > > > On 08/02/2019 15:08, Torsten Duwe wrote: > > > > Patch series v8, as discussed. > > > > The whole series applies cleanly on 5.0-rc5 > > > > So what's the status now? Besides debatable minor style > > issues there were no more objections to v8. Would this > > go through the ARM repo or via the ftrace repo? > > Sorry agains for the delay on this. I'm now back in the office and in > front of a computer daily, so I can spend a bit more time on this. > > Regardless of anything else, I think that we should queue the first > three patches now. I've poked the relevant maintainers for their acks so > that those can be taken via the arm64 tree. > > I'm happy to do the trivial cleanups on the last couple of patches (e.g. > s/lr/x30), and I'm actively looking at the API rework I requested. Ok, I've picked up patches 1-3 and I'll wait for you to spin updates to the last two. Will
On Tue, Apr 9, 2019 at 8:52 PM Will Deacon <will.deacon@arm.com> wrote: > > On Mon, Apr 08, 2019 at 04:36:28PM +0100, Mark Rutland wrote: > > On Mon, Mar 11, 2019 at 12:49:46PM +0100, Torsten Duwe wrote: > > > On Wed, Feb 13, 2019 at 11:11:04AM +0000, Julien Thierry wrote: > > > > Hi Torsten, > > > > > > > > On 08/02/2019 15:08, Torsten Duwe wrote: > > > > > Patch series v8, as discussed. > > > > > The whole series applies cleanly on 5.0-rc5 > > > > > > So what's the status now? Besides debatable minor style > > > issues there were no more objections to v8. Would this > > > go through the ARM repo or via the ftrace repo? > > > > Sorry agains for the delay on this. I'm now back in the office and in > > front of a computer daily, so I can spend a bit more time on this. > > > > Regardless of anything else, I think that we should queue the first > > three patches now. I've poked the relevant maintainers for their acks so > > that those can be taken via the arm64 tree. > > > > I'm happy to do the trivial cleanups on the last couple of patches (e.g. > > s/lr/x30), and I'm actively looking at the API rework I requested. > > Ok, I've picked up patches 1-3 and I'll wait for you to spin updates to the > last two. Ok, I see that patches 1-3 are picked up and are already present in recent kernels. Is there any progress on remaining two patches? Any help required? Thanks, Ruslan
Hi Ruslan, On Wed, Jul 10, 2019 at 03:27:58PM +0300, Ruslan Bilovol wrote: > On Tue, Apr 9, 2019 at 8:52 PM Will Deacon <will.deacon@arm.com> wrote: > > > > On Mon, Apr 08, 2019 at 04:36:28PM +0100, Mark Rutland wrote: > > > On Mon, Mar 11, 2019 at 12:49:46PM +0100, Torsten Duwe wrote: > > > > On Wed, Feb 13, 2019 at 11:11:04AM +0000, Julien Thierry wrote: > > > > > Hi Torsten, > > > > > > > > > > On 08/02/2019 15:08, Torsten Duwe wrote: > > > > > > Patch series v8, as discussed. > > > > > > The whole series applies cleanly on 5.0-rc5 > > > > > > > > So what's the status now? Besides debatable minor style > > > > issues there were no more objections to v8. Would this > > > > go through the ARM repo or via the ftrace repo? > > > > > > Sorry agains for the delay on this. I'm now back in the office and in > > > front of a computer daily, so I can spend a bit more time on this. > > > > > > Regardless of anything else, I think that we should queue the first > > > three patches now. I've poked the relevant maintainers for their acks so > > > that those can be taken via the arm64 tree. > > > > > > I'm happy to do the trivial cleanups on the last couple of patches (e.g. > > > s/lr/x30), and I'm actively looking at the API rework I requested. > > > > Ok, I've picked up patches 1-3 and I'll wait for you to spin updates to the > > last two. > > Ok, I see that patches 1-3 are picked up and are already present in recent > kernels. > > Is there any progress on remaining two patches? I'm afraid that I've been distracted on other fronts, so I haven't made progress there. > Any help required? If you'd be happy to look at the cleanup I previously suggested for the core, that would be great. When I last looked, it was simple to rework things so that arch code doesn't have to define MCOUNT_ADDR, but I hadn't figured out exactly how to split the core mcount assumptions from the important state machine bits. I'll take another look and see if I can provide more detail. :) Thanks, Mark.
On Wed, 24 Jul 2019, Mark Rutland wrote: > > > > > So what's the status now? Besides debatable minor style > > > > > issues there were no more objections to v8. Would this > > > > > go through the ARM repo or via the ftrace repo? > > > > > > > > Sorry agains for the delay on this. I'm now back in the office and in > > > > front of a computer daily, so I can spend a bit more time on this. > > > > > > > > Regardless of anything else, I think that we should queue the first > > > > three patches now. I've poked the relevant maintainers for their acks so > > > > that those can be taken via the arm64 tree. > > > > > > > > I'm happy to do the trivial cleanups on the last couple of patches (e.g. > > > > s/lr/x30), and I'm actively looking at the API rework I requested. > > > > > > Ok, I've picked up patches 1-3 and I'll wait for you to spin updates to the > > > last two. > > > > Ok, I see that patches 1-3 are picked up and are already present in recent > > kernels. > > > > Is there any progress on remaining two patches? > > I'm afraid that I've been distracted on other fronts, so I haven't made > progress there. > > > Any help required? > > If you'd be happy to look at the cleanup I previously suggested for the > core, that would be great. When I last looked, it was simple to rework > things so that arch code doesn't have to define MCOUNT_ADDR, but I > hadn't figured out exactly how to split the core mcount assumptions from > the important state machine bits. > > I'll take another look and see if I can provide more detail. :) Hi Mark, has any progress been made on any front? Feels like this got stuck a bit. Thanks,
On Wed, Oct 16, 2019 at 01:42:59PM +0200, Jiri Kosina wrote: > On Wed, 24 Jul 2019, Mark Rutland wrote: > > > > > > > So what's the status now? Besides debatable minor style > > > > > > issues there were no more objections to v8. Would this > > > > > > go through the ARM repo or via the ftrace repo? > > > > > > > > > > Sorry agains for the delay on this. I'm now back in the office and in > > > > > front of a computer daily, so I can spend a bit more time on this. > > > > > > > > > > Regardless of anything else, I think that we should queue the first > > > > > three patches now. I've poked the relevant maintainers for their acks so > > > > > that those can be taken via the arm64 tree. > > > > > > > > > > I'm happy to do the trivial cleanups on the last couple of patches (e.g. > > > > > s/lr/x30), and I'm actively looking at the API rework I requested. > > > > > > > > Ok, I've picked up patches 1-3 and I'll wait for you to spin updates to the > > > > last two. > > > > > > Ok, I see that patches 1-3 are picked up and are already present in recent > > > kernels. > > > > > > Is there any progress on remaining two patches? > > > > I'm afraid that I've been distracted on other fronts, so I haven't made > > progress there. > > > > > Any help required? > > > > If you'd be happy to look at the cleanup I previously suggested for the > > core, that would be great. When I last looked, it was simple to rework > > things so that arch code doesn't have to define MCOUNT_ADDR, but I > > hadn't figured out exactly how to split the core mcount assumptions from > > the important state machine bits. > > > > I'll take another look and see if I can provide more detail. :) > > Hi Mark, Hi Jiri, > has any progress been made on any front? Feels like this got stuck a bit. Sorry about this; I've been a bit distracted. I've just done the core (non-arm64) bits today, and pushed that out: https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/ftrace-with-regs ... 
I'll fold the remaining bits of patches 4 and 5 together tomorrow atop of that. Thanks, Mark.

On Wed, Oct 16, 2019 at 06:58:42PM +0100, Mark Rutland wrote:
> I've just done the core (non-arm64) bits today, and pushed that out:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/ftrace-with-regs
>
> ... I'll fold the remaining bits of patches 4 and 5 together tomorrow
> atop of that.

I've just force-pushed an updated version with the actual arm64
FTRACE_WITH_REGS bits. There are a couple of bits I still need to
verify, but I'm hoping that I can send this out for real next week.

In the process of reworking this I spotted some issues that will get in
the way of livepatching. Notably:

* When modules can be loaded far away from the kernel, we'll potentially
  need a PLT for each function within a module, if each can be patched
  to a unique function. Currently we have a fixed number, which is only
  sufficient for the two ftrace entry trampolines.

  IIUC, the new code being patched in is itself a module, in which case
  we'd need a PLT for each function in the main kernel image.

  We have a few options here, e.g. changing which memory size model we
  use, or reserving space for a PLT before each function using
  -fpatchable-function-entry=N,M.

* There are windows where backtracing will miss the callsite's caller,
  as its address is not live in the LR or existing chain of frame
  records. Thus we cannot claim to have a reliable stacktrace.

  I suspect we'll have to teach the stacktrace code to handle this as a
  special-case.

I'll try to write these up, as similar probably applies to other
architectures with a link register.

Thanks,
Mark.

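
The PLT concern follows from the reach of the AArch64 BL instruction, which encodes a signed 26-bit word offset and therefore covers roughly 128 MiB in either direction from the call site. A minimal sketch of the range check is given here; `bl_in_range()` and `encode_bl()` are hypothetical helper names, not the kernel's instruction-generation helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_128M (128LL * 1024 * 1024)

/* True if a BL placed at pc can reach target without going through a PLT. */
bool bl_in_range(uint64_t pc, uint64_t target)
{
    int64_t off = (int64_t)(target - pc);

    return off >= -SZ_128M && off < SZ_128M;
}

/* Encode "BL target" for an instruction located at pc. */
uint32_t encode_bl(uint64_t pc, uint64_t target)
{
    uint64_t off = target - pc;

    /* Caller must have checked the range; both addresses are 4-byte aligned. */
    assert(bl_in_range(pc, target) && (off & 3) == 0);

    /* imm26 holds bits [27:2] of the byte offset. */
    return 0x94000000u | (uint32_t)((off >> 2) & 0x03ffffffu);
}
```

When `bl_in_range()` fails for a call site and its target, the branch has to be routed through a veneer within reach of the call site, which is why a fixed number of PLT entries per module stops being sufficient once every function may be redirected to a different target.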
Hi Mark! On Fri, 18 Oct 2019 18:41:02 +0100 Mark Rutland <mark.rutland@arm.com> wrote: > In the process of reworking this I spotted some issues that will get > in the way of livepatching. Notably: > > * When modules can be loaded far away from the kernel, we'll > potentially need a PLT for each function within a module, if each can > be patched to a unique function. Currently we have a fixed number, > which is only sufficient for the two ftrace entry trampolines. > > IIUC, the new code being patched in is itself a module, in which > case we'd need a PLT for each function in the main kernel image. When no live patching is involved, obviously all cases need to have been handled so far. And when a live patching module comes in, there are calls in and out of the new patch code: Calls going into the live patch are not aware of this. They are caught by an active ftrace intercept, and the actual call into the LP module is done in klp_arch_set_pc, by manipulating the intercept (call site) return address (in case thread lives in the "new world", for completeness' sake). This is an unsigned long write in C. All calls going _out_ from the KLP module are newly generated, as part of the KLP module building process, and are thus aware of them being "extern" -- a PLT entry should be generated and accounted for in the KLP module. > We have a few options here, e.g. changing which memory size model we > use, or reserving space for a PLT before each function using > -f patchable-function-entry=N,M. Nonetheless I'm happy I once added the ,M option here. You never know :) > * There are windows where backtracing will miss the callsite's caller, > as its address is not live in the LR or existing chain of frame > records. Thus we cannot claim to have a reliable stacktrace. > > I suspect we'll have to teach the stacktrace code to handle this as > a special-case. Yes, that's where I had to step back. The unwinder needs to stop where the chain is even questionable. In _all_ cases. 
Missing only one race condition means a lurking inconsistency. OTOH it's not a problem to report "not reliable" when in doubt; the thread in question will then get woken up and unwind itself. It is only an optimisation to let all kernel threads which are guaranteed to not contain any patched functions sleep on. > I'll try to write these up, as similar probably applies to other > architectures with a link register. I thought I'd quickly give you my feedback upfront here. Torsten
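
The redirection Torsten describes, where the caller is unaware and the ftrace intercept resumes execution somewhere else, can be modelled in user space as a single write to a saved program counter. This is a sketch only: `struct fake_regs`, `sketch_set_pc()` and `traced_call()` are hypothetical stand-ins for the saved register state, `klp_arch_set_pc` and the ftrace entry code, and none of the real register save/restore is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the register state saved by the ftrace entry. */
struct fake_regs {
    uintptr_t pc; /* where execution resumes after the intercept */
};

typedef int (*func_t)(int);

int old_func(int x) { return x + 1; } /* the function being patched */
int new_func(int x) { return x + 2; } /* its replacement in the LP module */

/* The whole redirection is one integer write, as noted in the mail above. */
void sketch_set_pc(struct fake_regs *regs, uintptr_t ip)
{
    assert(regs != NULL);
    regs->pc = ip;
}

/*
 * Model of a call passing through the intercept: save the intended
 * target, let the handler optionally rewrite it, then resume at regs.pc.
 */
int traced_call(func_t callee, int patched, int arg)
{
    struct fake_regs regs = { .pc = (uintptr_t)callee };

    if (patched)
        sketch_set_pc(&regs, (uintptr_t)new_func);

    return ((func_t)regs.pc)(arg);
}
```

The call site itself is identical in the patched and unpatched case; only the intercept decides where control ends up, so no extra PLT entry is needed at the caller for this direction.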
On Sat, Oct 19, 2019 at 01:01:35PM +0200, Torsten Duwe wrote: > Hi Mark! Hi Torsten! > On Fri, 18 Oct 2019 18:41:02 +0100 Mark Rutland > <mark.rutland@arm.com> wrote: > > > In the process of reworking this I spotted some issues that will get > > in the way of livepatching. Notably: > > > > * When modules can be loaded far away from the kernel, we'll > > potentially need a PLT for each function within a module, if each can > > be patched to a unique function. Currently we have a fixed number, > > which is only sufficient for the two ftrace entry trampolines. > > > > IIUC, the new code being patched in is itself a module, in which > > case we'd need a PLT for each function in the main kernel image. > > When no live patching is involved, obviously all cases need to have > been handled so far. And when a live patching module comes in, there > are calls in and out of the new patch code: > > Calls going into the live patch are not aware of this. They are caught > by an active ftrace intercept, and the actual call into the LP module > is done in klp_arch_set_pc, by manipulating the intercept (call site) > return address (in case thread lives in the "new world", for > completeness' sake). This is an unsigned long write in C. I was under the impression that (at some point) the patch site would be patched to call the LP code directly. From the above I understand that's not the case, and it will always be directed via the regular ftrace entry code -- have I got that right? Assuming that is the case, that sounds fine to me, and sorry for the noise. > All calls going _out_ from the KLP module are newly generated, as part > of the KLP module building process, and are thus aware of them being > "extern" -- a PLT entry should be generated and accounted for in the > KLP module. Yup; understood. > > We have a few options here, e.g. changing which memory size model we > > use, or reserving space for a PLT before each function using > > -f patchable-function-entry=N,M. 
> > Nonetheless I'm happy I once added the ,M option here. You never know :) Yup; we may have other reasons to need this in future (and I see parisc uses this today). > > * There are windows where backtracing will miss the callsite's caller, > > as its address is not live in the LR or existing chain of frame > > records. Thus we cannot claim to have a reliable stacktrace. > > > > I suspect we'll have to teach the stacktrace code to handle this as > > a special-case. > > Yes, that's where I had to step back. The unwinder needs to stop where > the chain is even questionable. In _all_ cases. Missing only one race > condition means a lurking inconsistency. Sure. I'm calling this out now so that we don't miss this in future. I've added comments to the ftrace entry asm to this effect for now. > OTOH it's not a problem to report "not reliable" when in doubt; the > thread in question will then get woken up and unwind itself. > It is only an optimisation to let all kernel threads which are > guaranteed to not contain any patched functions sleep on. I just want to make it clear that some care will be needed if/when adding CONFIG_HAVE_RELIABLE_STACKTRACE so that we handle this case correctly. > > I'll try to write these up, as similar probably applies to other > > architectures with a link register. > > I thought I'd quickly give you my feedback upfront here. Thanks; it's much appreciated! Mark.
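
The "report not reliable when in doubt" policy can be illustrated with a toy unwinder over an array of frame records. This is a simplified model, not the arm64 stacktrace code: `struct frame_record`, its index-based `next` link and `unwind_reliable()` are assumptions made for the sketch, and the checks shown are examples of conservative conditions, not a complete list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified frame record: index of the caller's record and a saved LR. */
struct frame_record {
    int next;    /* index of the caller's record, or -1 at the outermost frame */
    uint64_t lr; /* saved return address */
};

/*
 * Walk the chain starting at record 'start'. Returns the number of return
 * addresses stored in 'out', or -1 as soon as anything looks questionable.
 * A partial trace is never reported as a success.
 */
int unwind_reliable(const struct frame_record *stack, int nrec, int start,
                    uint64_t *out, int max)
{
    int cur = start;
    int n = 0;

    assert(stack != NULL && out != NULL);

    while (cur != -1) {
        int next;

        if (cur < 0 || cur >= nrec)
            return -1; /* record lies outside the stack */
        if (stack[cur].lr == 0)
            return -1; /* no usable return address */
        if (n == max)
            return -1; /* a truncated trace is not a reliable one */

        out[n++] = stack[cur].lr;

        next = stack[cur].next;
        if (next != -1 && next <= cur)
            return -1; /* chain must move strictly toward the stack base */
        cur = next;
    }

    return n;
}
```

A window in which the caller's address is in neither the LR nor a frame record would need its own check of this kind, so that such a task is reported as unreliable instead of silently yielding a trace with a missing entry.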
On Sat, Oct 19, 2019 at 01:01:35PM +0200, Torsten Duwe wrote: > All calls going _out_ from the KLP module are newly generated, as part > of the KLP module building process, and are thus aware of them being > "extern" -- a PLT entry should be generated and accounted for in the > KLP module. Hm... for kpatch-build I assume we may need a GCC plugin to convert local calls to global somehow?