Message ID | 20210127110944.41813-3-alexander.sverdlin@nokia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ARM: Implement MODULE_PLT support in FTRACE | expand |
On 1/27/2021 3:09 AM, Alexander A Sverdlin wrote: > From: Alexander Sverdlin <alexander.sverdlin@nokia.com> > > Teach ftrace_make_call() and ftrace_make_nop() about PLTs. > Teach PLT code about FTRACE and all its callbacks. > Otherwise the following might happen: > > ------------[ cut here ]------------ > WARNING: CPU: 14 PID: 2265 at .../arch/arm/kernel/insn.c:14 __arm_gen_branch+0x83/0x8c() > ... > Hardware name: LSI Axxia AXM55XX > [<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14) > [<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8) > [<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90) > [<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c) > [<c03218f3>] (warn_slowpath_null) from [<c03143cf>] (__arm_gen_branch+0x83/0x8c) > [<c03143cf>] (__arm_gen_branch) from [<c0314337>] (ftrace_make_nop+0xf/0x24) > [<c0314337>] (ftrace_make_nop) from [<c038ebcb>] (ftrace_process_locs+0x27b/0x3e8) > [<c038ebcb>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44) > [<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84) > [<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18) > ---[ end trace e1b64ced7a89adcc ]--- > ------------[ cut here ]------------ > WARNING: CPU: 14 PID: 2265 at .../kernel/trace/ftrace.c:1979 ftrace_bug+0x1b1/0x234() > ... > Hardware name: LSI Axxia AXM55XX > [<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14) > [<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8) > [<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90) > [<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c) > [<c03218f3>] (warn_slowpath_null) from [<c038e87d>] (ftrace_bug+0x1b1/0x234) > [<c038e87d>] (ftrace_bug) from [<c038ebd5>] (ftrace_process_locs+0x285/0x3e8) > [<c038ebd5>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44) > [<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84) > [<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18) > ---[ end trace e1b64ced7a89adcd ]--- > ftrace failed to modify [<e9ef7006>] 0xe9ef7006 > actual: 02:f0:3b:fa > ftrace record flags: 0 > (0) expected tramp: c0314265 > > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Hi Alexander Sorry if I'm butting in out of the blue. I got curious so decided to have a go at reproducing the issue :-) On 01/27/21 12:09, Alexander A Sverdlin wrote: > From: Alexander Sverdlin <alexander.sverdlin@nokia.com> > > Teach ftrace_make_call() and ftrace_make_nop() about PLTs. > Teach PLT code about FTRACE and all its callbacks. > Otherwise the following might happen: > > ------------[ cut here ]------------ > WARNING: CPU: 14 PID: 2265 at .../arch/arm/kernel/insn.c:14 __arm_gen_branch+0x83/0x8c() > ... > Hardware name: LSI Axxia AXM55XX > [<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14) > [<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8) > [<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90) > [<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c) > [<c03218f3>] (warn_slowpath_null) from [<c03143cf>] (__arm_gen_branch+0x83/0x8c) > [<c03143cf>] (__arm_gen_branch) from [<c0314337>] (ftrace_make_nop+0xf/0x24) > [<c0314337>] (ftrace_make_nop) from [<c038ebcb>] (ftrace_process_locs+0x27b/0x3e8) > [<c038ebcb>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44) > [<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84) > [<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18) > ---[ end trace e1b64ced7a89adcc ]--- > ------------[ cut here ]------------ > WARNING: CPU: 14 PID: 2265 at .../kernel/trace/ftrace.c:1979 ftrace_bug+0x1b1/0x234() > ... > Hardware name: LSI Axxia AXM55XX > [<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14) > [<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8) > [<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90) > [<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c) > [<c03218f3>] (warn_slowpath_null) from [<c038e87d>] (ftrace_bug+0x1b1/0x234) > [<c038e87d>] (ftrace_bug) from [<c038ebd5>] (ftrace_process_locs+0x285/0x3e8) > [<c038ebd5>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44) > [<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84) > [<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18) > ---[ end trace e1b64ced7a89adcd ]--- > ftrace failed to modify [<e9ef7006>] 0xe9ef7006 > actual: 02:f0:3b:fa > ftrace record flags: 0 > (0) expected tramp: c0314265 > > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your instructions on the other email. But most likely because I'm hitting another problem that could be masking it. I'm not sure it is related or just randomly happened to hit it. Did you see something similar? [ 0.000000] ftrace: allocating 82941 entries in 244 pages [ 0.000000] ------------[ ftrace bug ]------------ [ 0.000000] ftrace failed to modify [ 0.000000] [<c2b00350>] set_reset_devices+0x10/0x28 [ 0.000000] actual: 0e:28:03:eb [ 0.000000] Initializing ftrace call sites [ 0.000000] ftrace record flags: 0 [ 0.000000] (0) [ 0.000000] expected tramp: c0312eb8 [ 0.000000] ------------[ cut here ]------------ [ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2066 ftrace_bug+0x240/0x268 [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.12.0-rc2-00002-gcb8fe87aa3fa-dirty #12 [ 0.000000] Hardware name: Generic DT based system [ 0.000000] Backtrace: [ 0.000000] [<c1afd8c4>] (dump_backtrace) from [<c1afdca4>] (show_stack+0x20/0x24) [ 0.000000] r7:c2e13c1c r6:600000d3 r5:00000000 r4:c2e13c1c [ 0.000000] [<c1afdc84>] (show_stack) from [<c1b04c94>] (dump_stack+0xf4/0x124) [ 0.000000] [<c1b04ba0>] (dump_stack) from [<c035f33c>] (__warn+0xfc/0x128) [ 0.000000] r10:c2d0908c r9:00000000 r8:00000009 r7:00000812 r6:00000009 r5:c1b01c78 [ 0.000000] r4:c27d0f70 r3:c2d08f50 [ 0.000000] [<c035f240>] (__warn) from [<c1afee14>] (warn_slowpath_fmt+0x74/0xd0) [ 0.000000] r7:c1b01c78 r6:00000812 r5:c27d0f70 r4:00000000 [ 0.000000] [<c1afeda4>] (warn_slowpath_fmt) from [<c1b01c78>] (ftrace_bug+0x240/0x268) [ 0.000000] r8:c0312eac r7:c362f518 r6:ffffffea r5:c2b00350 r4:c3817ac4 [ 0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518) [ 0.000000] r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4 [ 0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174) [ 0.000000] r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c [ 0.000000] r4:c362f518 [ 0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8) [ 0.000000] r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000 [ 0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0) [ 0.000000] r10:10c5387d r9:412fc0f1 r8:48000000 r7:ffffffff r6:10c0387d r5:00000051 [ 0.000000] r4:c2b00330 [ 0.000000] irq event stamp: 0 [ 0.000000] hardirqs last enabled at (0): [<00000000>] 0x0 [ 0.000000] hardirqs last disabled at (0): [<00000000>] 0x0 [ 0.000000] softirqs last enabled at (0): [<00000000>] 0x0 [ 0.000000] softirqs last disabled at (0): [<00000000>] 0x0 [ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x34/0xa0 with crng_init=0 [ 0.000000] ---[ end trace 0000000000000000 ]--- [ 0.000000] ftrace: allocated 243 pages with 6 groups > diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c > index 9a79ef6..fa867a5 100644 > --- a/arch/arm/kernel/ftrace.c > +++ b/arch/arm/kernel/ftrace.c > @@ -70,6 +70,19 @@ int ftrace_arch_code_modify_post_process(void) > > static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr) > { > + s32 offset = addr - pc; > + s32 blim = 0xfe000008; > + s32 flim = 0x02000004; This look like magic numbers to me.. > + > + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { > + blim = 0xff000004; > + flim = 0x01000002; .. ditto .. > + } > + > + if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && > + (offset < blim || offset > flim)) > + return 0; .. I could have missed something, but wouldn't something like below be clearer? Only compile tested. I think abs() will do the right thing here given the passed types. I admit I don't understand why you have the '4' and '8' at the lowest nibble.. diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c index fa867a57100f..b44aee87c53a 100644 --- a/arch/arm/kernel/ftrace.c +++ b/arch/arm/kernel/ftrace.c @@ -70,17 +70,13 @@ int ftrace_arch_code_modify_post_process(void) static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr) { - s32 offset = addr - pc; - s32 blim = 0xfe000008; - s32 flim = 0x02000004; + u32 offset = abs(addr - pc); + u32 range = 0x02000000; /* +-32MiB */ - if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { - blim = 0xff000004; - flim = 0x01000002; - } + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) + range = 0x01000000; /* +-16MiB */ - if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && - (offset < blim || offset > flim)) + if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && offset > range) return 0; return arm_gen_branch_link(pc, addr); In case CONFIG_ARM_MODULE_PLTS is not enabled what would happen? Is it impossible to hit this corner case or we could fail one way or another? IOW, should this check be always compiled in? > + > return arm_gen_branch_link(pc, addr); > } > > @@ -124,10 +137,22 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) > { > unsigned long new, old; > unsigned long ip = rec->ip; > + unsigned long aaddr = adjust_address(rec, addr); > > old = ftrace_nop_replace(rec); > > - new = ftrace_call_replace(ip, adjust_address(rec, addr)); > + new = ftrace_call_replace(ip, aaddr); > + > +#ifdef CONFIG_ARM_MODULE_PLTS > + if (!new) { > + struct module *mod = rec->arch.mod; > + > + if (mod) { What would happen if !new and !mod? > + aaddr = get_module_plt(mod, ip, aaddr); > + new = ftrace_call_replace(ip, aaddr); I assume we're guaranteed to have a sensible value returned in 'new' here? Thanks -- Qais Yousef > + } > + } > +#endif
Hi! On 07/03/2021 18:26, Qais Yousef wrote: > I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your > instructions on the other email. But most likely because I'm hitting another > problem that could be masking it. I'm not sure it is related or just randomly > happened to hit it. > > Did you see something similar? [...] > [ 0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518) > [ 0.000000] r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4 > [ 0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174) > [ 0.000000] r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c > [ 0.000000] r4:c362f518 > [ 0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8) > [ 0.000000] r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000 > [ 0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0) This means, FTRACE has more problems with your kernel/compiler/platform, I've addressed similar issue in the past, but my patch should be long merged: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1817963.html Could it be the same problem as here: https://www.spinics.net/lists/arm-kernel/msg854022.html Seems that the size check deserves something line BUILD_BUG_ON() with FTRACE... >> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c >> index 9a79ef6..fa867a5 100644 >> --- a/arch/arm/kernel/ftrace.c >> +++ b/arch/arm/kernel/ftrace.c >> @@ -70,6 +70,19 @@ int ftrace_arch_code_modify_post_process(void) >> >> static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr) >> { >> + s32 offset = addr - pc; >> + s32 blim = 0xfe000008; >> + s32 flim = 0x02000004; > > This look like magic numbers to me.. These magic numbers are most probably the reason for your FTRACE to resign... Those are backward- and forward-branch limits. I didn't find the matching DEFINEs in the kernel, but I would be happy to learn them. I can also put some comments, but I actually thought the purpose would be obvious from the code... >> + >> + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { >> + blim = 0xff000004; >> + flim = 0x01000002; > > .. ditto .. > >> + } >> + >> + if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && >> + (offset < blim || offset > flim)) >> + return 0; > > .. I could have missed something, but wouldn't something like below be clearer? > Only compile tested. I think abs() will do the right thing here given the > passed types. I admit I don't understand why you have the '4' and '8' at the > lowest nibble.. Yes, the limits are not symmetrical. These "magic numbers" have been checked many times by me, but I admit I'm not expert in ARM assembly. I'm however still quite sure about them. > diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c > index fa867a57100f..b44aee87c53a 100644 > --- a/arch/arm/kernel/ftrace.c > +++ b/arch/arm/kernel/ftrace.c > @@ -70,17 +70,13 @@ int ftrace_arch_code_modify_post_process(void) > > static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr) > { > - s32 offset = addr - pc; > - s32 blim = 0xfe000008; > - s32 flim = 0x02000004; > + u32 offset = abs(addr - pc); > + u32 range = 0x02000000; /* +-32MiB */ > > - if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { > - blim = 0xff000004; > - flim = 0x01000002; > - } > + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) > + range = 0x01000000; /* +-16MiB */ > > - if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && > - (offset < blim || offset > flim)) > + if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && offset > range) > return 0; See above, the limits are not symmetrical. > return arm_gen_branch_link(pc, addr); > > In case CONFIG_ARM_MODULE_PLTS is not enabled what would happen? Is it > impossible to hit this corner case or we could fail one way or another? IOW, > should this check be always compiled in? I didn't want to modify the original behavior and the limits are again checked in either ARM or THUMB implementations of __arm_gen_branch() (there you will again find a nice set of "magic numbers". >> + >> return arm_gen_branch_link(pc, addr); >> } >> >> @@ -124,10 +137,22 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) >> { >> unsigned long new, old; >> unsigned long ip = rec->ip; >> + unsigned long aaddr = adjust_address(rec, addr); >> >> old = ftrace_nop_replace(rec); >> >> - new = ftrace_call_replace(ip, adjust_address(rec, addr)); >> + new = ftrace_call_replace(ip, aaddr); >> + >> +#ifdef CONFIG_ARM_MODULE_PLTS >> + if (!new) { >> + struct module *mod = rec->arch.mod; >> + >> + if (mod) { > > What would happen if !new and !mod? I believe, that's exactly what happens in the dump you experience with your kernel. This is not covered by this patch, this patch covers the issue with modules in vmalloc area. >> + aaddr = get_module_plt(mod, ip, aaddr); >> + new = ftrace_call_replace(ip, aaddr); > > I assume we're guaranteed to have a sensible value returned in 'new' here? Otherwise you'd see the dump you see :) It relies on the already existing error handling. >> + } >> + } >> +#endif
On 03/08/21 08:58, Alexander Sverdlin wrote: > Hi! > > On 07/03/2021 18:26, Qais Yousef wrote: > > I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your I still can't reproduce on 5.12-rc2. I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after loading the module? I tried starting ftrace, but maybe there's a particular combination required? > > instructions on the other email. But most likely because I'm hitting another > > problem that could be masking it. I'm not sure it is related or just randomly > > happened to hit it. > > > > Did you see something similar? > > [...] > > > [ 0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518) > > [ 0.000000] r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4 > > [ 0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174) > > [ 0.000000] r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c > > [ 0.000000] r4:c362f518 > > [ 0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8) > > [ 0.000000] r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000 > > [ 0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0) > > This means, FTRACE has more problems with your kernel/compiler/platform, I've addressed similar issue > in the past, but my patch should be long merged: > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1817963.html > > Could it be the same problem as here: > https://www.spinics.net/lists/arm-kernel/msg854022.html > > Seems that the size check deserves something line BUILD_BUG_ON() with FTRACE... So I only see this when I convert all modules to be built-in sed -i 's/=m/=y/' .config FWIW, I see the problem with your patch applied too. Trying to dig more into it.. > > >> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c > >> index 9a79ef6..fa867a5 100644 > >> --- a/arch/arm/kernel/ftrace.c > >> +++ b/arch/arm/kernel/ftrace.c > >> @@ -70,6 +70,19 @@ int ftrace_arch_code_modify_post_process(void) > >> > >> static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr) > >> { > >> + s32 offset = addr - pc; > >> + s32 blim = 0xfe000008; > >> + s32 flim = 0x02000004; > > > > This look like magic numbers to me.. > > These magic numbers are most probably the reason for your FTRACE to resign... > Those are backward- and forward-branch limits. I didn't find the matching DEFINEs > in the kernel, but I would be happy to learn them. I can also put some comments, > but I actually thought the purpose would be obvious from the code... So I did dig more into it. The range is asymmetrical indeed. And the strange offset is to cater for the pc being incremented by +8 (+4 for thumb2). You're duplicating the checks in __arm_gen_branch_{thumb2, arm}(). As you noted __arm_gen_branch() which is called by arm_gen_branch_link() will end up doing the exact same check and return 0. So why do you need to duplicate the check here? We can do something about the WARN_ON_ONCE(1). [...] > >> + > >> return arm_gen_branch_link(pc, addr); > >> } > >> > >> @@ -124,10 +137,22 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) > >> { > >> unsigned long new, old; > >> unsigned long ip = rec->ip; > >> + unsigned long aaddr = adjust_address(rec, addr); > >> > >> old = ftrace_nop_replace(rec); > >> > >> - new = ftrace_call_replace(ip, adjust_address(rec, addr)); > >> + new = ftrace_call_replace(ip, aaddr); > >> + > >> +#ifdef CONFIG_ARM_MODULE_PLTS > >> + if (!new) { > >> + struct module *mod = rec->arch.mod; > >> + > >> + if (mod) { > > > > What would happen if !new and !mod? > > I believe, that's exactly what happens in the dump you experience with your kernel. > This is not covered by this patch, this patch covers the issue with modules in vmalloc area. > > >> + aaddr = get_module_plt(mod, ip, aaddr); > >> + new = ftrace_call_replace(ip, aaddr); > > > > I assume we're guaranteed to have a sensible value returned in 'new' here? > > Otherwise you'd see the dump you see :) > It relies on the already existing error handling. I understand from this there are still loose ends to be handled in this area of the code. I admit I need to spend more time to understand why I get the failure above and how this overlaps with your proposal. But as it stands it seems there's more work to be done here. Thanks -- Qais Yousef
Hi! On 09/03/2021 18:42, Qais Yousef wrote: >>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your > I still can't reproduce on 5.12-rc2. > > I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after > loading the module? I tried starting ftrace, but maybe there's a particular > combination required? You need to load a BIG module, so big that it has no place in the modules area any more and goes to vmalloc area. >>> instructions on the other email. But most likely because I'm hitting another >>> problem that could be masking it. I'm not sure it is related or just randomly >>> happened to hit it. >>> >>> Did you see something similar? >> [...] >> >>> [ 0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518) >>> [ 0.000000] r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4 >>> [ 0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174) >>> [ 0.000000] r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c >>> [ 0.000000] r4:c362f518 >>> [ 0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8) >>> [ 0.000000] r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000 >>> [ 0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0) >> This means, FTRACE has more problems with your kernel/compiler/platform, I've addressed similar issue >> in the past, but my patch should be long merged: >> >> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1817963.html >> >> Could it be the same problem as here: >> https://www.spinics.net/lists/arm-kernel/msg854022.html >> >> Seems that the size check deserves something line BUILD_BUG_ON() with FTRACE... > So I only see this when I convert all modules to be built-in > > sed -i 's/=m/=y/' .config > > FWIW, I see the problem with your patch applied too. Trying to dig more into > it.. Then it's definitely the problem explained in the second link. If you have THUMB2 kernel, maybe you have to switch to ARM.
On 3/9/2021 11:23 PM, Alexander Sverdlin wrote: > Hi! > > On 09/03/2021 18:42, Qais Yousef wrote: >>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your >> I still can't reproduce on 5.12-rc2. >> >> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after >> loading the module? I tried starting ftrace, but maybe there's a particular >> combination required? > > You need to load a BIG module, so big that it has no place in the modules area > any more and goes to vmalloc area. You absolutely need a very big module maybe more than one. When I tested this, I could use the two proprietary modules (*sigh*) that I needed to exercise against and loading one but not the other was not enough to make the second module loading spill into vmalloc space. > >>>> instructions on the other email. But most likely because I'm hitting another >>>> problem that could be masking it. I'm not sure it is related or just randomly >>>> happened to hit it. >>>> >>>> Did you see something similar? >>> [...] >>> >>>> [ 0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518) >>>> [ 0.000000] r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4 >>>> [ 0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174) >>>> [ 0.000000] r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c >>>> [ 0.000000] r4:c362f518 >>>> [ 0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8) >>>> [ 0.000000] r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000 >>>> [ 0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0) >>> This means, FTRACE has more problems with your kernel/compiler/platform, I've addressed similar issue >>> in the past, but my patch should be long merged: >>> >>> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1817963.html >>> >>> Could it be the same problem as here: >>> https://www.spinics.net/lists/arm-kernel/msg854022.html >>> >>> Seems that the size check deserves something line BUILD_BUG_ON() with FTRACE... >> So I only see this when I convert all modules to be built-in >> >> sed -i 's/=m/=y/' .config >> >> FWIW, I see the problem with your patch applied too. Trying to dig more into >> it.. > > Then it's definitely the problem explained in the second link. If you have THUMB2 kernel, maybe > you have to switch to ARM. >
Hi! On 10/03/2021 17:14, Florian Fainelli wrote: >>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your >>> I still can't reproduce on 5.12-rc2. >>> >>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after >>> loading the module? I tried starting ftrace, but maybe there's a particular >>> combination required? >> You need to load a BIG module, so big that it has no place in the modules area >> any more and goes to vmalloc area. > You absolutely need a very big module maybe more than one. When I tested > this, I could use the two proprietary modules (*sigh*) that I needed to > exercise against and loading one but not the other was not enough to > make the second module loading spill into vmalloc space. Here is what I use instead of these real world "proprietary" modules (which of course were the real trigger for the patch): https://www.spinics.net/lists/arm-kernel/msg878599.html
Hi Alexander On 03/10/21 18:17, Alexander Sverdlin wrote: > Hi! > > On 10/03/2021 17:14, Florian Fainelli wrote: > >>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your > >>> I still can't reproduce on 5.12-rc2. > >>> > >>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after > >>> loading the module? I tried starting ftrace, but maybe there's a particular > >>> combination required? > >> You need to load a BIG module, so big that it has no place in the modules area > >> any more and goes to vmalloc area. > > You absolutely need a very big module maybe more than one. When I tested > > this, I could use the two proprietary modules (*sigh*) that I needed to > > exercise against and loading one but not the other was not enough to > > make the second module loading spill into vmalloc space. > > Here is what I use instead of these real world "proprietary" modules (which of course > were the real trigger for the patch): > > https://www.spinics.net/lists/arm-kernel/msg878599.html I am testing with your module. I can't reproduce the problem you describe with it as I stated. I will try to spend more time on it on the weekend. Thanks -- Qais Yousef
On 3/12/21 9:24 AM, Qais Yousef wrote: > Hi Alexander > > On 03/10/21 18:17, Alexander Sverdlin wrote: >> Hi! >> >> On 10/03/2021 17:14, Florian Fainelli wrote: >>>>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your >>>>> I still can't reproduce on 5.12-rc2. >>>>> >>>>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after >>>>> loading the module? I tried starting ftrace, but maybe there's a particular >>>>> combination required? >>>> You need to load a BIG module, so big that it has no place in the modules area >>>> any more and goes to vmalloc area. >>> You absolutely need a very big module maybe more than one. When I tested >>> this, I could use the two proprietary modules (*sigh*) that I needed to >>> exercise against and loading one but not the other was not enough to >>> make the second module loading spill into vmalloc space. >> >> Here is what I use instead of these real world "proprietary" modules (which of course >> were the real trigger for the patch): >> >> https://www.spinics.net/lists/arm-kernel/msg878599.html > > I am testing with your module. I can't reproduce the problem you describe with > it as I stated. > > I will try to spend more time on it on the weekend. Alexander, do you load one or multiple instances of that fat module? The test module does a 6 * 1024 * 1024 / 2 = 3 million repetitions of the "nop" instruction which should be 32-bits wide in ARM mode and 16-bits wide in Thumb mode, right? In ARM mode we have a 14MB module space, so 3 * 1024 * 1024 * 4 = 12MB, which should still fit within if you have no module loaded, however a second instance of the module should make us spill into vmalloc space. In Thumb mode, we have a 6MB module space, so 3 * 1024 * 1024 * 2 = 6MB so we may spill, but maybe not. I was not able to reproduce the warning with just one module, but with two (cannot have the same name BTW), it kicked in.
I fixed Ard's email as it kept bouncing back. +CC Linus Walleij On 03/12/21 10:35, Florian Fainelli wrote: > On 3/12/21 9:24 AM, Qais Yousef wrote: > > Hi Alexander > > > > On 03/10/21 18:17, Alexander Sverdlin wrote: > >> Hi! > >> > >> On 10/03/2021 17:14, Florian Fainelli wrote: > >>>>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your > >>>>> I still can't reproduce on 5.12-rc2. > >>>>> > >>>>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after > >>>>> loading the module? I tried starting ftrace, but maybe there's a particular > >>>>> combination required? > >>>> You need to load a BIG module, so big that it has no place in the modules area > >>>> any more and goes to vmalloc area. > >>> You absolutely need a very big module maybe more than one. When I tested > >>> this, I could use the two proprietary modules (*sigh*) that I needed to > >>> exercise against and loading one but not the other was not enough to > >>> make the second module loading spill into vmalloc space. > >> > >> Here is what I use instead of these real world "proprietary" modules (which of course > >> were the real trigger for the patch): > >> > >> https://www.spinics.net/lists/arm-kernel/msg878599.html > > > > I am testing with your module. I can't reproduce the problem you describe with > > it as I stated. > > > > I will try to spend more time on it on the weekend. > > Alexander, do you load one or multiple instances of that fat module? > > The test module does a 6 * 1024 * 1024 / 2 = 3 million repetitions of > the "nop" instruction which should be 32-bits wide in ARM mode and > 16-bits wide in Thumb mode, right? > > In ARM mode we have a 14MB module space, so 3 * 1024 * 1024 * 4 = 12MB, > which should still fit within if you have no module loaded, however a > second instance of the module should make us spill into vmalloc space. > > In Thumb mode, we have a 6MB module space, so 3 * 1024 * 1024 * 2 = 6MB > so we may spill, but maybe not. > > I was not able to reproduce the warning with just one module, but with > two (cannot have the same name BTW), it kicked in. Thanks Florian. With 2 modules I can indeed reproduce the issue. Though for thumb I still hit the other problem so I couldn't properly test it. I have come up with an alternative solution (hacky patch at the end) which seem to fix both problems for me. That is * sed -i 's/=m/=y/ .config' which causes a lot of veneers to be generated is handled correctly now AFAICT * Loading a fat module works the same. I have tested it in both arm and thumb modes. Running trace-cmd start -p function executes successfully and I can see functions being traced. The solution makes a lot of assumptions that I yet to verify though, but I hope it could form the basis for a proper one that fixes both issues. It's faster too. I'm just not sure if I need to take this handling one layer above or it's okay to be done at __arm_gen_branch(). Also I had a version that was more strict about verifying the veneer @pc points to @addr before using the veneer address but that fails for modules. The logic didn't work to verify for module, but we do get the correct veneer function as returned by get_module_plt(). Since for ftrace we really care about mcount, I considered reading the plt once at load time to get the address and save it in rec->arch and use that later instead of continuously reading it + having to store the plt. We could revisit that if the below has a fundamental flaw. Alexander, have you considered these options before? I could have easily missed something :-) Thanks -- Qais Yousef ------->8--------- diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c index 9a79ef6b1876..70f6e197c306 100644 --- a/arch/arm/kernel/ftrace.c +++ b/arch/arm/kernel/ftrace.c @@ -157,6 +157,9 @@ int ftrace_make_nop(struct module *mod, unsigned long new; int ret; + if (mod) + pr_info("VENEER_DBG: mcount_veneedr_addr = 0x%x\n", get_module_plt(mod, ip, addr)); + old = ftrace_call_replace(ip, adjust_address(rec, addr)); new = ftrace_nop_replace(rec); ret = ftrace_modify_code(ip, old, new, true); diff --git a/arch/arm/kernel/insn.c b/arch/arm/kernel/insn.c index 2e844b70386b..0d6abf8d726f 100644 --- a/arch/arm/kernel/insn.c +++ b/arch/arm/kernel/insn.c @@ -3,17 +3,53 @@ #include <linux/kernel.h> #include <asm/opcodes.h> +/* + * Checks if a @pc points to a veneer to @addr + * + * Returns addr to veneer if true, 0 otherwise. + */ +static unsigned long + __arm_addr_is_veneer_thumb2(unsigned long pc, unsigned long addr, bool *exchange) +{ + unsigned long insn = __mem_to_opcode_thumb32(*(unsigned long *)pc); + + unsigned long s = insn >> 26 & 0x1; + unsigned long j1 = insn >> 13 & 0x1; + unsigned long j2 = insn >> 11 & 0x1; + unsigned long i1 = !(j1 ^ s); + unsigned long i2 = !(j2 ^ s); + unsigned long imm10 = insn >> 16 & 0x3ff; + unsigned long imm11 = (insn & 0x7ff) + 1; + + long offset = ((s ? 0xff : s) << 24) | (i1 << 23) | + (i2 << 22) | (imm10 << 12) | (imm11 << 1); + + unsigned long veneer_fn = pc + 2 + offset; + long voffset = *(long *)(veneer_fn + 4 + 4 + 4); + unsigned long veneer_addr = (long)veneer_fn + 4 + 4 + 4 + voffset; + + *exchange = !(insn & 1 << 12); + + pr_info("VENEER_DBG: insn=0x%lx, offset=0x%lx\n", insn, offset); + pr_info("VENEER_DBG: pc=0x%lx, addr=0x%lx\n", pc, addr); + pr_info("VENEER_DBG: veneer_fn=0x%lx, veneer_addr=0x%lx\n", veneer_fn, veneer_addr); + + return veneer_fn; + +} + static unsigned long __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) { unsigned long s, j1, j2, i1, i2, imm10, imm11; unsigned long first, second; + bool exchange = false; long offset; offset = (long)addr - (long)(pc + 4); if (offset < -16777216 || offset > 16777214) { - WARN_ON_ONCE(1); - return 0; + unsigned long vaddr = __arm_addr_is_veneer_thumb2(pc, addr, &exchange); + offset = (long)vaddr - (long)(pc + 4); } s = (offset >> 24) & 0x1; @@ -30,9 +66,34 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) if (link) second |= 1 << 14; + if (exchange) + second &= ~(1 << 12); + + //pr_info("VENEER_DBG: thumb32 opcode = 0x%x\n", __opcode_thumb32_compose(first, second)); return __opcode_thumb32_compose(first, second); } +/* + * Checks if a @pc points to a veneer to @addr + * + * Returns addr to veneer if true, 0 otherwise. + */ +static unsigned long + __arm_addr_is_veneer_arm(unsigned long pc, unsigned long addr) +{ + unsigned long insn = __mem_to_opcode_arm(*(unsigned long *)pc); + long offset = (insn & 0xffffff) << 2; + unsigned long veneer_fn = pc + 8 + offset; + long voffset = 0; //*(long *)(veneer_fn + 8); + unsigned long veneer_addr = (long)veneer_fn + 4 + 8 + voffset; + + pr_info("VENEER_DBG: insn=0x%lx, offset=0x%lx\n", insn, offset); + pr_info("VENEER_DBG: pc=0x%lx, addr=0x%lx\n", pc, addr); + pr_info("VENEER_DBG: veneer_fn=0x%lx, veneer_addr=0x%lx\n", veneer_fn, veneer_addr); + + return veneer_fn; +} + static unsigned long __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) { @@ -44,8 +105,8 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) offset = (long)addr - (long)(pc + 8); if (unlikely(offset < -33554432 || offset > 33554428)) { - WARN_ON_ONCE(1); - return 0; + unsigned long vaddr = __arm_addr_is_veneer_arm(pc, addr); + offset = (long)vaddr - (long)(pc + 8); } offset = (offset >> 2) & 0x00ffffff;
Hello Florian, On 12/03/2021 19:35, Florian Fainelli wrote: >>> https://www.spinics.net/lists/arm-kernel/msg878599.html >> I am testing with your module. I can't reproduce the problem you describe with >> it as I stated. >> >> I will try to spend more time on it on the weekend. > Alexander, do you load one or multiple instances of that fat module? One cannot have "multiple instances of the same module"... > The test module does a 6 * 1024 * 1024 / 2 = 3 million repetitions of > the "nop" instruction which should be 32-bits wide in ARM mode and > 16-bits wide in Thumb mode, right? > > In ARM mode we have a 14MB module space, so 3 * 1024 * 1024 * 4 = 12MB, > which should still fit within if you have no module loaded, however a > second instance of the module should make us spill into vmalloc space. > > In Thumb mode, we have a 6MB module space, so 3 * 1024 * 1024 * 2 = 6MB > so we may spill, but maybe not. > > I was not able to reproduce the warning with just one module, but with > two (cannot have the same name BTW), it kicked in. ... well, may be the size was arbitrary chosen to not fit into our module space and we have more modules already loaded. But you are free to adjust the amount of NOPs! :)
Hi Alexander On 03/14/21 22:02, Qais Yousef wrote: > I fixed Ard's email as it kept bouncing back. > > +CC Linus Walleij > > On 03/12/21 10:35, Florian Fainelli wrote: > > On 3/12/21 9:24 AM, Qais Yousef wrote: > > > Hi Alexander > > > > > > On 03/10/21 18:17, Alexander Sverdlin wrote: > > >> Hi! > > >> > > >> On 10/03/2021 17:14, Florian Fainelli wrote: > > >>>>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your > > >>>>> I still can't reproduce on 5.12-rc2. > > >>>>> > > >>>>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after > > >>>>> loading the module? I tried starting ftrace, but maybe there's a particular > > >>>>> combination required? > > >>>> You need to load a BIG module, so big that it has no place in the modules area > > >>>> any more and goes to vmalloc area. > > >>> You absolutely need a very big module maybe more than one. When I tested > > >>> this, I could use the two proprietary modules (*sigh*) that I needed to > > >>> exercise against and loading one but not the other was not enough to > > >>> make the second module loading spill into vmalloc space. > > >> > > >> Here is what I use instead of these real world "proprietary" modules (which of course > > >> were the real trigger for the patch): > > >> > > >> https://www.spinics.net/lists/arm-kernel/msg878599.html > > > > > > I am testing with your module. I can't reproduce the problem you describe with > > > it as I stated. > > > > > > I will try to spend more time on it on the weekend. > > > > Alexander, do you load one or multiple instances of that fat module? > > > > The test module does a 6 * 1024 * 1024 / 2 = 3 million repetitions of > > the "nop" instruction which should be 32-bits wide in ARM mode and > > 16-bits wide in Thumb mode, right? > > > > In ARM mode we have a 14MB module space, so 3 * 1024 * 1024 * 4 = 12MB, > > which should still fit within if you have no module loaded, however a > > second instance of the module should make us spill into vmalloc space. > > > > In Thumb mode, we have a 6MB module space, so 3 * 1024 * 1024 * 2 = 6MB > > so we may spill, but maybe not. > > > > I was not able to reproduce the warning with just one module, but with > > two (cannot have the same name BTW), it kicked in. > > Thanks Florian. With 2 modules I can indeed reproduce the issue. Though for > thumb I still hit the other problem so I couldn't properly test it. > > I have come up with an alternative solution (hacky patch at the end) which seem > to fix both problems for me. That is > > * sed -i 's/=m/=y/ .config' which causes a lot of veneers to be > generated is handled correctly now AFAICT > * Loading a fat module works the same. > > I have tested it in both arm and thumb modes. Running > > trace-cmd start -p function > > executes successfully and I can see functions being traced. > > The solution makes a lot of assumptions that I yet to verify though, but I hope > it could form the basis for a proper one that fixes both issues. It's faster > too. I'm just not sure if I need to take this handling one layer above or it's > okay to be done at __arm_gen_branch(). Also I had a version that was more > strict about verifying the veneer @pc points to @addr before using the veneer > address but that fails for modules. The logic didn't work to verify for module, > but we do get the correct veneer function as returned by get_module_plt(). > > Since for ftrace we really care about mcount, I considered reading the plt once > at load time to get the address and save it in rec->arch and use that later > instead of continuously reading it + having to store the plt. We could revisit > that if the below has a fundamental flaw. > > Alexander, have you considered these options before? I could have easily missed > something :-) So the answer is no, that wouldn't have worked. After spending more time on this your approach looks goot to me now. If you send v8 addressing my comment about removing the range check from ftrace_call_repalce() as it is rendundant to what arm_gen_branch_link() is already doing I'd be happy to give this series my reviewed-and-tested-by. You'd need to ensure the warning is not triggered when we expect the call to fail. FWIW, I do have this to address that range check issue and reduce the ifdefery too if it helps ------>8--------- diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h index a4dbac07e4ef..ee2707c613bf 100644 --- a/arch/arm/include/asm/ftrace.h +++ b/arch/arm/include/asm/ftrace.h @@ -15,9 +15,7 @@ extern void __gnu_mcount_nc(void); #ifdef CONFIG_DYNAMIC_FTRACE struct dyn_arch_ftrace { -#ifdef CONFIG_ARM_MODULE_PLTS struct module *mod; -#endif }; static inline unsigned long ftrace_call_adjust(unsigned long addr) diff --git a/arch/arm/include/asm/insn.h b/arch/arm/include/asm/insn.h index f20e08ac85ae..71c3edefe629 100644 --- a/arch/arm/include/asm/insn.h +++ b/arch/arm/include/asm/insn.h @@ -13,18 +13,24 @@ arm_gen_nop(void) } unsigned long -__arm_gen_branch(unsigned long pc, unsigned long addr, bool link); +__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check); static inline unsigned long arm_gen_branch(unsigned long pc, unsigned long addr) { - return __arm_gen_branch(pc, addr, false); + return __arm_gen_branch(pc, addr, false, true); } static inline unsigned long arm_gen_branch_link(unsigned long pc, unsigned long addr) { - return __arm_gen_branch(pc, addr, true); + return __arm_gen_branch(pc, addr, true, true); +} + +static inline unsigned long +arm_gen_branch_link_nocheck(unsigned long pc, unsigned long addr) +{ + return __arm_gen_branch(pc, addr, true, false); } #endif diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c index fa867a57100f..a30b1a8e5908 100644 --- a/arch/arm/kernel/ftrace.c +++ b/arch/arm/kernel/ftrace.c @@ -70,20 +70,28 @@ int ftrace_arch_code_modify_post_process(void) static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr) { - s32 offset = addr - pc; - s32 blim = 0xfe000008; - s32 flim = 0x02000004; + return arm_gen_branch_link(pc, addr); +} - if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { - blim = 0xff000004; - flim = 0x01000002; - } +static unsigned long +ftrace_call_replace_mod(struct module *mod, unsigned long pc, unsigned long addr) +{ +#ifdef CONFIG_ARM_MODULE_PLTS + unsigned long new; - if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && - (offset < blim || offset > flim)) - return 0; + if (likely(!mod)) + return arm_gen_branch_link(pc, addr); + new = arm_gen_branch_link_nocheck(pc, addr); + if (!new) { + addr = get_module_plt(mod, pc, addr); + new = arm_gen_branch_link(pc, addr); + } + + return new; +#else return arm_gen_branch_link(pc, addr); +#endif } static int ftrace_modify_code(unsigned long pc, unsigned long old, @@ -141,18 +149,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) old = ftrace_nop_replace(rec); - new = ftrace_call_replace(ip, aaddr); - -#ifdef CONFIG_ARM_MODULE_PLTS - if (!new) { - struct module *mod = rec->arch.mod; - - if (mod) { - aaddr = get_module_plt(mod, ip, aaddr); - new = ftrace_call_replace(ip, aaddr); - } - } -#endif + new = ftrace_call_replace_mod(rec->arch.mod, ip, aaddr); return ftrace_modify_code(rec->ip, old, new, true); } @@ -183,23 +180,11 @@ int ftrace_make_nop(struct module *mod, unsigned long new; int ret; -#ifdef CONFIG_ARM_MODULE_PLTS /* mod is only supplied during module loading */ - if (!mod) - mod = rec->arch.mod; - else + if (mod) rec->arch.mod = mod; -#endif - - old = ftrace_call_replace(ip, aaddr); - -#ifdef CONFIG_ARM_MODULE_PLTS - if (!old && mod) { - aaddr = get_module_plt(mod, ip, aaddr); - old = ftrace_call_replace(ip, aaddr); - } -#endif + old = ftrace_call_replace_mod(rec->arch.mod, ip, aaddr); new = ftrace_nop_replace(rec); ret = ftrace_modify_code(ip, old, new, true); diff --git a/arch/arm/kernel/insn.c b/arch/arm/kernel/insn.c index 2e844b70386b..37ec5734309e 100644 --- a/arch/arm/kernel/insn.c +++ b/arch/arm/kernel/insn.c @@ -4,7 +4,7 @@ #include <asm/opcodes.h> static unsigned long -__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) +__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link, bool check) { unsigned long s, j1, j2, i1, i2, imm10, imm11; unsigned long first, second; @@ -12,7 +12,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) offset = (long)addr - (long)(pc + 4); if (offset < -16777216 || offset > 16777214) { - WARN_ON_ONCE(1); + WARN_ON_ONCE(check); return 0; } @@ -34,7 +34,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) } static unsigned long -__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) +__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link, bool check) { unsigned long opcode = 0xea000000; long offset; @@ -44,7 +44,7 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) offset = (long)addr - (long)(pc + 8); if (unlikely(offset < -33554432 || offset > 33554428)) { - WARN_ON_ONCE(1); + WARN_ON_ONCE(check); return 0; } @@ -54,10 +54,10 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) } unsigned long -__arm_gen_branch(unsigned long pc, unsigned long addr, bool link) +__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check) { if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) - return __arm_gen_branch_thumb2(pc, addr, link); + return __arm_gen_branch_thumb2(pc, addr, link, check); else - return __arm_gen_branch_arm(pc, addr, link); + return __arm_gen_branch_arm(pc, addr, link, check); } ------>8--------- Thanks -- Qais Yousef
On Sun, 21 Mar 2021 19:06:11 +0000 Qais Yousef <qais.yousef@arm.com> wrote: > #ifdef CONFIG_DYNAMIC_FTRACE > struct dyn_arch_ftrace { > -#ifdef CONFIG_ARM_MODULE_PLTS > struct module *mod; > -#endif > }; > I know you want to reduce the "ifdefery", but please note that the dyn_arch_ftrace is defined once for every function that can be traced. If you have 40,000 functions that can be traced, that pointer is created 40,000 times. Thus, you really only want fields in the struct dyn_arch_ftrace if you really need them, otherwise, that's a lot of memory that is wasted. -- Steve
On 03/22/21 11:01, Steven Rostedt wrote: > On Sun, 21 Mar 2021 19:06:11 +0000 > Qais Yousef <qais.yousef@arm.com> wrote: > > > #ifdef CONFIG_DYNAMIC_FTRACE > > struct dyn_arch_ftrace { > > -#ifdef CONFIG_ARM_MODULE_PLTS > > struct module *mod; > > -#endif > > }; > > > > I know you want to reduce the "ifdefery", but please note that the > dyn_arch_ftrace is defined once for every function that can be traced. If > you have 40,000 functions that can be traced, that pointer is created > 40,000 times. Thus, you really only want fields in the struct > dyn_arch_ftrace if you really need them, otherwise, that's a lot of memory > that is wasted. Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply CONFIG_ARM_MODULE_PLTS is enabled too. It only has an impact on reducing ifdefery when calling ftrace_call_replace_mod(rec->arch.mod, ...) Should be easy to wrap rec->arch.mod with its own accessor that will return NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions. Up to Alexander to pick what he prefers :-) Thanks -- Qais Yousef
Hi Qais, On 22/03/2021 17:32, Qais Yousef wrote: > Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply > CONFIG_ARM_MODULE_PLTS is enabled too. > > It only has an impact on reducing ifdefery when calling > > ftrace_call_replace_mod(rec->arch.mod, ...) > > Should be easy to wrap rec->arch.mod with its own accessor that will return > NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions. > > Up to Alexander to pick what he prefers :-) well, I of course prefer v7 as-is, because this review is running longer than two years and I actually hope these patches to be finally merged at some point. But you are welcome to optimize them with follow up patches :)
Hi Alexander On 03/22/21 18:02, Alexander Sverdlin wrote: > Hi Qais, > > On 22/03/2021 17:32, Qais Yousef wrote: > > Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply > > CONFIG_ARM_MODULE_PLTS is enabled too. > > > > It only has an impact on reducing ifdefery when calling > > > > ftrace_call_replace_mod(rec->arch.mod, ...) > > > > Should be easy to wrap rec->arch.mod with its own accessor that will return > > NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions. > > > > Up to Alexander to pick what he prefers :-) > > well, I of course prefer v7 as-is, because this review is running longer than two > years and I actually hope these patches to be finally merged at some point. > But you are welcome to optimize them with follow up patches :) I appreciate that and thanks a lot for your effort. My attempt to review and test here is to help in getting this merged. FWIW my main concern is about duplicating the range check in ftrace_call_replace() and using magic values that already exist in __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there. Thanks -- Qais Yousef ----->8------ diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h index a4dbac07e4ef..8545b3ff8317 100644 --- a/arch/arm/include/asm/ftrace.h +++ b/arch/arm/include/asm/ftrace.h @@ -25,6 +25,27 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr) /* With Thumb-2, the recorded addresses have the lsb set */ return addr & ~1; } + +#ifdef CONFIG_ARM_MODULE_PLTS +static inline void ftrace_set_mod(struct dyn_arch_ftrace *arch, struct module *mod) +{ + arch->mod = mod; +} + +static inline struct module *ftrace_get_mod(struct dyn_arch_ftrace *arch) +{ + return arch->mod; +} +#else +static inline void ftrace_set_mod(struct dyn_arch_ftrace *arch, struct module *mod) +{ +} + +static inline struct module *ftrace_get_mod(struct dyn_arch_ftrace *arch) +{ + return NULL; +} +#endif #endif #endif diff --git a/arch/arm/include/asm/insn.h b/arch/arm/include/asm/insn.h index f20e08ac85ae..71c3edefe629 100644 --- a/arch/arm/include/asm/insn.h +++ b/arch/arm/include/asm/insn.h @@ -13,18 +13,24 @@ arm_gen_nop(void) } unsigned long -__arm_gen_branch(unsigned long pc, unsigned long addr, bool link); +__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check); static inline unsigned long arm_gen_branch(unsigned long pc, unsigned long addr) { - return __arm_gen_branch(pc, addr, false); + return __arm_gen_branch(pc, addr, false, true); } static inline unsigned long arm_gen_branch_link(unsigned long pc, unsigned long addr) { - return __arm_gen_branch(pc, addr, true); + return __arm_gen_branch(pc, addr, true, true); +} + +static inline unsigned long +arm_gen_branch_link_nocheck(unsigned long pc, unsigned long addr) +{ + return __arm_gen_branch(pc, addr, true, false); } #endif diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c index fa867a57100f..63ea34edd222 100644 --- a/arch/arm/kernel/ftrace.c +++ b/arch/arm/kernel/ftrace.c @@ -70,20 +70,28 @@ int ftrace_arch_code_modify_post_process(void) static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr) { - s32 offset = addr - pc; - s32 blim = 0xfe000008; - s32 flim = 0x02000004; + return arm_gen_branch_link(pc, addr); +} - if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { - blim = 0xff000004; - flim = 0x01000002; - } +static unsigned long +ftrace_call_replace_mod(struct module *mod, unsigned long pc, unsigned long addr) +{ +#ifdef CONFIG_ARM_MODULE_PLTS + unsigned long new; - if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && - (offset < blim || offset > flim)) - return 0; + if (likely(!mod)) + return arm_gen_branch_link(pc, addr); + new = arm_gen_branch_link_nocheck(pc, addr); + if (!new) { + addr = get_module_plt(mod, pc, addr); + new = arm_gen_branch_link(pc, addr); + } + + return new; +#else return arm_gen_branch_link(pc, addr); +#endif } static int ftrace_modify_code(unsigned long pc, unsigned long old, @@ -141,18 +149,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) old = ftrace_nop_replace(rec); - new = ftrace_call_replace(ip, aaddr); - -#ifdef CONFIG_ARM_MODULE_PLTS - if (!new) { - struct module *mod = rec->arch.mod; - - if (mod) { - aaddr = get_module_plt(mod, ip, aaddr); - new = ftrace_call_replace(ip, aaddr); - } - } -#endif + new = ftrace_call_replace_mod(ftrace_get_mod(&rec->arch), ip, aaddr); return ftrace_modify_code(rec->ip, old, new, true); } @@ -183,23 +180,11 @@ int ftrace_make_nop(struct module *mod, unsigned long new; int ret; -#ifdef CONFIG_ARM_MODULE_PLTS /* mod is only supplied during module loading */ - if (!mod) - mod = rec->arch.mod; - else - rec->arch.mod = mod; -#endif - - old = ftrace_call_replace(ip, aaddr); - -#ifdef CONFIG_ARM_MODULE_PLTS - if (!old && mod) { - aaddr = get_module_plt(mod, ip, aaddr); - old = ftrace_call_replace(ip, aaddr); - } -#endif + if (mod) + ftrace_set_mod(&rec->arch, mod); + old = ftrace_call_replace_mod(ftrace_get_mod(&rec->arch), ip, aaddr); new = ftrace_nop_replace(rec); ret = ftrace_modify_code(ip, old, new, true); diff --git a/arch/arm/kernel/insn.c b/arch/arm/kernel/insn.c index 2e844b70386b..37ec5734309e 100644 --- a/arch/arm/kernel/insn.c +++ b/arch/arm/kernel/insn.c @@ -4,7 +4,7 @@ #include <asm/opcodes.h> static unsigned long -__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) +__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link, bool check) { unsigned long s, j1, j2, i1, i2, imm10, imm11; unsigned long first, second; @@ -12,7 +12,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) offset = (long)addr - (long)(pc + 4); if (offset < -16777216 || offset > 16777214) { - WARN_ON_ONCE(1); + WARN_ON_ONCE(check); return 0; } @@ -34,7 +34,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) } static unsigned long -__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) +__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link, bool check) { unsigned long opcode = 0xea000000; long offset; @@ -44,7 +44,7 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) offset = (long)addr - (long)(pc + 8); if (unlikely(offset < -33554432 || offset > 33554428)) { - WARN_ON_ONCE(1); + WARN_ON_ONCE(check); return 0; } @@ -54,10 +54,10 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) } unsigned long -__arm_gen_branch(unsigned long pc, unsigned long addr, bool link) +__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check) { if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) - return __arm_gen_branch_thumb2(pc, addr, link); + return __arm_gen_branch_thumb2(pc, addr, link, check); else - return __arm_gen_branch_arm(pc, addr, link); + return __arm_gen_branch_arm(pc, addr, link, check); }
Hi Qais, On 3/23/2021 3:22 PM, Qais Yousef wrote: > Hi Alexander > > On 03/22/21 18:02, Alexander Sverdlin wrote: >> Hi Qais, >> >> On 22/03/2021 17:32, Qais Yousef wrote: >>> Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply >>> CONFIG_ARM_MODULE_PLTS is enabled too. >>> >>> It only has an impact on reducing ifdefery when calling >>> >>> ftrace_call_replace_mod(rec->arch.mod, ...) >>> >>> Should be easy to wrap rec->arch.mod with its own accessor that will return >>> NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions. >>> >>> Up to Alexander to pick what he prefers :-) >> >> well, I of course prefer v7 as-is, because this review is running longer than two >> years and I actually hope these patches to be finally merged at some point. >> But you are welcome to optimize them with follow up patches :) > > I appreciate that and thanks a lot for your effort. My attempt to review and > test here is to help in getting this merged. > > FWIW my main concern is about duplicating the range check in > ftrace_call_replace() and using magic values that already exist in > __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there. Your patch in addition to Alexander's patch work for me as well, so feel free to add a: Tested-by: Florian Fainelli <f.fainelli@gmail.com> FWIW, what is nice about Alexander's original patch is that it applies relatively cleanly to older kernels as well where this is equally needed. There is not currently any Fixes: tag being provided but maybe we should amend the second patch with one? Thanks! > > Thanks > > -- > Qais Yousef > > ----->8------ > > > diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h > index a4dbac07e4ef..8545b3ff8317 100644 > --- a/arch/arm/include/asm/ftrace.h > +++ b/arch/arm/include/asm/ftrace.h > @@ -25,6 +25,27 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr) > /* With Thumb-2, the recorded addresses have the lsb set */ > return addr & ~1; > } > + > +#ifdef CONFIG_ARM_MODULE_PLTS > +static inline void ftrace_set_mod(struct dyn_arch_ftrace *arch, struct module *mod) > +{ > + arch->mod = mod; > +} > + > +static inline struct module *ftrace_get_mod(struct dyn_arch_ftrace *arch) > +{ > + return arch->mod; > +} > +#else > +static inline void ftrace_set_mod(struct dyn_arch_ftrace *arch, struct module *mod) > +{ > +} > + > +static inline struct module *ftrace_get_mod(struct dyn_arch_ftrace *arch) > +{ > + return NULL; > +} > +#endif > #endif > > #endif > diff --git a/arch/arm/include/asm/insn.h b/arch/arm/include/asm/insn.h > index f20e08ac85ae..71c3edefe629 100644 > --- a/arch/arm/include/asm/insn.h > +++ b/arch/arm/include/asm/insn.h > @@ -13,18 +13,24 @@ arm_gen_nop(void) > } > > unsigned long > -__arm_gen_branch(unsigned long pc, unsigned long addr, bool link); > +__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check); > > static inline unsigned long > arm_gen_branch(unsigned long pc, unsigned long addr) > { > - return __arm_gen_branch(pc, addr, false); > + return __arm_gen_branch(pc, addr, false, true); > } > > static inline unsigned long > arm_gen_branch_link(unsigned long pc, unsigned long addr) > { > - return __arm_gen_branch(pc, addr, true); > + return __arm_gen_branch(pc, addr, true, true); > +} > + > +static inline unsigned long > +arm_gen_branch_link_nocheck(unsigned long pc, unsigned long addr) > +{ > + return __arm_gen_branch(pc, addr, true, false); > } > > #endif > diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c > index fa867a57100f..63ea34edd222 100644 > --- a/arch/arm/kernel/ftrace.c > +++ b/arch/arm/kernel/ftrace.c > @@ -70,20 +70,28 @@ int ftrace_arch_code_modify_post_process(void) > > static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr) > { > - s32 offset = addr - pc; > - s32 blim = 0xfe000008; > - s32 flim = 0x02000004; > + return arm_gen_branch_link(pc, addr); > +} > > - if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { > - blim = 0xff000004; > - flim = 0x01000002; > - } > +static unsigned long > +ftrace_call_replace_mod(struct module *mod, unsigned long pc, unsigned long addr) > +{ > +#ifdef CONFIG_ARM_MODULE_PLTS > + unsigned long new; > > - if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && > - (offset < blim || offset > flim)) > - return 0; > + if (likely(!mod)) > + return arm_gen_branch_link(pc, addr); > > + new = arm_gen_branch_link_nocheck(pc, addr); > + if (!new) { > + addr = get_module_plt(mod, pc, addr); > + new = arm_gen_branch_link(pc, addr); > + } > + > + return new; > +#else > return arm_gen_branch_link(pc, addr); > +#endif > } > > static int ftrace_modify_code(unsigned long pc, unsigned long old, > @@ -141,18 +149,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) > > old = ftrace_nop_replace(rec); > > - new = ftrace_call_replace(ip, aaddr); > - > -#ifdef CONFIG_ARM_MODULE_PLTS > - if (!new) { > - struct module *mod = rec->arch.mod; > - > - if (mod) { > - aaddr = get_module_plt(mod, ip, aaddr); > - new = ftrace_call_replace(ip, aaddr); > - } > - } > -#endif > + new = ftrace_call_replace_mod(ftrace_get_mod(&rec->arch), ip, aaddr); > > return ftrace_modify_code(rec->ip, old, new, true); > } > @@ -183,23 +180,11 @@ int ftrace_make_nop(struct module *mod, > unsigned long new; > int ret; > > -#ifdef CONFIG_ARM_MODULE_PLTS > /* mod is only supplied during module loading */ > - if (!mod) > - mod = rec->arch.mod; > - else > - rec->arch.mod = mod; > -#endif > - > - old = ftrace_call_replace(ip, aaddr); > - > -#ifdef CONFIG_ARM_MODULE_PLTS > - if (!old && mod) { > - aaddr = get_module_plt(mod, ip, aaddr); > - old = ftrace_call_replace(ip, aaddr); > - } > -#endif > + if (mod) > + ftrace_set_mod(&rec->arch, mod); > > + old = ftrace_call_replace_mod(ftrace_get_mod(&rec->arch), ip, aaddr); > new = ftrace_nop_replace(rec); > ret = ftrace_modify_code(ip, old, new, true); > > diff --git a/arch/arm/kernel/insn.c b/arch/arm/kernel/insn.c > index 2e844b70386b..37ec5734309e 100644 > --- a/arch/arm/kernel/insn.c > +++ b/arch/arm/kernel/insn.c > @@ -4,7 +4,7 @@ > #include <asm/opcodes.h> > > static unsigned long > -__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) > +__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link, bool check) > { > unsigned long s, j1, j2, i1, i2, imm10, imm11; > unsigned long first, second; > @@ -12,7 +12,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) > > offset = (long)addr - (long)(pc + 4); > if (offset < -16777216 || offset > 16777214) { > - WARN_ON_ONCE(1); > + WARN_ON_ONCE(check); > return 0; > } > > @@ -34,7 +34,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link) > } > > static unsigned long > -__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) > +__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link, bool check) > { > unsigned long opcode = 0xea000000; > long offset; > @@ -44,7 +44,7 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) > > offset = (long)addr - (long)(pc + 8); > if (unlikely(offset < -33554432 || offset > 33554428)) { > - WARN_ON_ONCE(1); > + WARN_ON_ONCE(check); > return 0; > } > > @@ -54,10 +54,10 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link) > } > > unsigned long > -__arm_gen_branch(unsigned long pc, unsigned long addr, bool link) > +__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check) > { > if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) > - return __arm_gen_branch_thumb2(pc, addr, link); > + return __arm_gen_branch_thumb2(pc, addr, link, check); > else > - return __arm_gen_branch_arm(pc, addr, link); > + return __arm_gen_branch_arm(pc, addr, link, check); > } >
Hi Qais, On 23/03/2021 23:22, Qais Yousef wrote: >>> Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply >>> CONFIG_ARM_MODULE_PLTS is enabled too. >>> >>> It only has an impact on reducing ifdefery when calling >>> >>> ftrace_call_replace_mod(rec->arch.mod, ...) >>> >>> Should be easy to wrap rec->arch.mod with its own accessor that will return >>> NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions. >>> >>> Up to Alexander to pick what he prefers :-) >> well, I of course prefer v7 as-is, because this review is running longer than two >> years and I actually hope these patches to be finally merged at some point. >> But you are welcome to optimize them with follow up patches :) > I appreciate that and thanks a lot for your effort. My attempt to review and > test here is to help in getting this merged. > > FWIW my main concern is about duplicating the range check in > ftrace_call_replace() and using magic values that already exist in > __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there. could you please check the negative limits? I have an opinion, my limits are correct. I could add extra parameter to arm_gen_branch_link(), but for this I first need to fix its negative limits, which, I believe, well... Approximate :)
Hey Alexander On 03/24/21 10:04, Alexander Sverdlin wrote: > Hi Qais, > > On 23/03/2021 23:22, Qais Yousef wrote: > >>> Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply > >>> CONFIG_ARM_MODULE_PLTS is enabled too. > >>> > >>> It only has an impact on reducing ifdefery when calling > >>> > >>> ftrace_call_replace_mod(rec->arch.mod, ...) > >>> > >>> Should be easy to wrap rec->arch.mod with its own accessor that will return > >>> NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions. > >>> > >>> Up to Alexander to pick what he prefers :-) > >> well, I of course prefer v7 as-is, because this review is running longer than two > >> years and I actually hope these patches to be finally merged at some point. > >> But you are welcome to optimize them with follow up patches :) > > I appreciate that and thanks a lot for your effort. My attempt to review and > > test here is to help in getting this merged. > > > > FWIW my main concern is about duplicating the range check in > > ftrace_call_replace() and using magic values that already exist in > > __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there. > > could you please check the negative limits? I have an opinion, my limits are > correct. I could add extra parameter to arm_gen_branch_link(), but for this > I first need to fix its negative limits, which, I believe, well... Approximate :) Can you elaborate please? If you look at Arm ARM [1] the ranges are defined in page 347 Encoding T1 Even numbers in the range –16777216 to 16777214. Encoding T2 Multiples of 4 in the range –16777216 to 16777212. Encoding A1 Multiples of 4 in the range –33554432 to 33554428. Encoding A2 Even numbers in the range –33554432 to 33554430. which matches what's in the code (T1 for thumb2 and A1 for arm). Why do you think it's wrong? Thanks -- Qais Yousef [1] https://developer.arm.com/documentation/ddi0406/latest/
Hi Florian On 03/23/21 20:37, Florian Fainelli wrote: > Hi Qais, > > On 3/23/2021 3:22 PM, Qais Yousef wrote: > > Hi Alexander > > > > On 03/22/21 18:02, Alexander Sverdlin wrote: > >> Hi Qais, > >> > >> On 22/03/2021 17:32, Qais Yousef wrote: > >>> Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply > >>> CONFIG_ARM_MODULE_PLTS is enabled too. > >>> > >>> It only has an impact on reducing ifdefery when calling > >>> > >>> ftrace_call_replace_mod(rec->arch.mod, ...) > >>> > >>> Should be easy to wrap rec->arch.mod with its own accessor that will return > >>> NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions. > >>> > >>> Up to Alexander to pick what he prefers :-) > >> > >> well, I of course prefer v7 as-is, because this review is running longer than two > >> years and I actually hope these patches to be finally merged at some point. > >> But you are welcome to optimize them with follow up patches :) > > > > I appreciate that and thanks a lot for your effort. My attempt to review and > > test here is to help in getting this merged. > > > > FWIW my main concern is about duplicating the range check in > > ftrace_call_replace() and using magic values that already exist in > > __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there. > > Your patch in addition to Alexander's patch work for me as well, so feel > free to add a: > > Tested-by: Florian Fainelli <f.fainelli@gmail.com> > > FWIW, what is nice about Alexander's original patch is that it applies > relatively cleanly to older kernels as well where this is equally How old are we talking? Was the conflict that bad for the stable maintainers to deal with it? ie: would it require sending the backport separately? > needed. There is not currently any Fixes: tag being provided but maybe > we should amend the second patch with one? I'm not sure if this will be considered new feature or a bug fix. FWIW, tagging it for stable sounds reasonable to me. Thanks! -- Qais Yosuef
Hello Qais, On 24/03/2021 16:57, Qais Yousef wrote: >>> FWIW my main concern is about duplicating the range check in >>> ftrace_call_replace() and using magic values that already exist in >>> __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there. >> could you please check the negative limits? I have an opinion, my limits are >> correct. I could add extra parameter to arm_gen_branch_link(), but for this >> I first need to fix its negative limits, which, I believe, well... Approximate :) > Can you elaborate please? > > If you look at Arm ARM [1] the ranges are defined in page 347 > > Encoding T1 Even numbers in the range –16777216 to 16777214. > Encoding T2 Multiples of 4 in the range –16777216 to 16777212. > Encoding A1 Multiples of 4 in the range –33554432 to 33554428. > Encoding A2 Even numbers in the range –33554432 to 33554430. > > which matches what's in the code (T1 for thumb2 and A1 for arm). > > Why do you think it's wrong? thanks for checking this! I'll re-send v8 with your proposal.
On 03/24/21 17:33, Alexander Sverdlin wrote: > Hello Qais, > > On 24/03/2021 16:57, Qais Yousef wrote: > >>> FWIW my main concern is about duplicating the range check in > >>> ftrace_call_replace() and using magic values that already exist in > >>> __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there. > >> could you please check the negative limits? I have an opinion, my limits are > >> correct. I could add extra parameter to arm_gen_branch_link(), but for this > >> I first need to fix its negative limits, which, I believe, well... Approximate :) > > Can you elaborate please? > > > > If you look at Arm ARM [1] the ranges are defined in page 347 > > > > Encoding T1 Even numbers in the range –16777216 to 16777214. > > Encoding T2 Multiples of 4 in the range –16777216 to 16777212. > > Encoding A1 Multiples of 4 in the range –33554432 to 33554428. > > Encoding A2 Even numbers in the range –33554432 to 33554430. > > > > which matches what's in the code (T1 for thumb2 and A1 for arm). > > > > Why do you think it's wrong? > > thanks for checking this! I'll re-send v8 with your proposal. If you felt some details need tweaking I don't mind, my proposal was an attempt to help rather than impose. Thanks! -- Qais Yousef
diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h index 48ec1d0..a4dbac0 100644 --- a/arch/arm/include/asm/ftrace.h +++ b/arch/arm/include/asm/ftrace.h @@ -15,6 +15,9 @@ extern void __gnu_mcount_nc(void); #ifdef CONFIG_DYNAMIC_FTRACE struct dyn_arch_ftrace { +#ifdef CONFIG_ARM_MODULE_PLTS + struct module *mod; +#endif }; static inline unsigned long ftrace_call_adjust(unsigned long addr) diff --git a/arch/arm/include/asm/module.h b/arch/arm/include/asm/module.h index 09b9ad5..cfffae6 100644 --- a/arch/arm/include/asm/module.h +++ b/arch/arm/include/asm/module.h @@ -30,6 +30,7 @@ struct plt_entries { struct mod_plt_sec { struct elf32_shdr *plt; + struct plt_entries *plt_ent; int plt_count; }; diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c index 9a79ef6..fa867a5 100644 --- a/arch/arm/kernel/ftrace.c +++ b/arch/arm/kernel/ftrace.c @@ -70,6 +70,19 @@ int ftrace_arch_code_modify_post_process(void) static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr) { + s32 offset = addr - pc; + s32 blim = 0xfe000008; + s32 flim = 0x02000004; + + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { + blim = 0xff000004; + flim = 0x01000002; + } + + if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && + (offset < blim || offset > flim)) + return 0; + return arm_gen_branch_link(pc, addr); } @@ -124,10 +137,22 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) { unsigned long new, old; unsigned long ip = rec->ip; + unsigned long aaddr = adjust_address(rec, addr); old = ftrace_nop_replace(rec); - new = ftrace_call_replace(ip, adjust_address(rec, addr)); + new = ftrace_call_replace(ip, aaddr); + +#ifdef CONFIG_ARM_MODULE_PLTS + if (!new) { + struct module *mod = rec->arch.mod; + + if (mod) { + aaddr = get_module_plt(mod, ip, aaddr); + new = ftrace_call_replace(ip, aaddr); + } + } +#endif return ftrace_modify_code(rec->ip, old, new, true); } @@ -152,12 +177,29 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr, int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, unsigned long addr) { + unsigned long aaddr = adjust_address(rec, addr); unsigned long ip = rec->ip; unsigned long old; unsigned long new; int ret; - old = ftrace_call_replace(ip, adjust_address(rec, addr)); +#ifdef CONFIG_ARM_MODULE_PLTS + /* mod is only supplied during module loading */ + if (!mod) + mod = rec->arch.mod; + else + rec->arch.mod = mod; +#endif + + old = ftrace_call_replace(ip, aaddr); + +#ifdef CONFIG_ARM_MODULE_PLTS + if (!old && mod) { + aaddr = get_module_plt(mod, ip, aaddr); + old = ftrace_call_replace(ip, aaddr); + } +#endif + new = ftrace_nop_replace(rec); ret = ftrace_modify_code(ip, old, new, true); diff --git a/arch/arm/kernel/module-plts.c b/arch/arm/kernel/module-plts.c index d330e9e..a0524ad 100644 --- a/arch/arm/kernel/module-plts.c +++ b/arch/arm/kernel/module-plts.c @@ -4,6 +4,7 @@ */ #include <linux/elf.h> +#include <linux/ftrace.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/sort.h> @@ -20,19 +21,52 @@ (PLT_ENT_STRIDE - 8)) #endif +static const u32 fixed_plts[] = { +#ifdef CONFIG_FUNCTION_TRACER + FTRACE_ADDR, + MCOUNT_ADDR, +#endif +}; + static bool in_init(const struct module *mod, unsigned long loc) { return loc - (u32)mod->init_layout.base < mod->init_layout.size; } +static void prealloc_fixed(struct mod_plt_sec *pltsec, struct plt_entries *plt) +{ + int i; + + if (!ARRAY_SIZE(fixed_plts) || pltsec->plt_count) + return; + pltsec->plt_count = ARRAY_SIZE(fixed_plts); + + for (i = 0; i < ARRAY_SIZE(plt->ldr); ++i) + plt->ldr[i] = PLT_ENT_LDR; + + BUILD_BUG_ON(sizeof(fixed_plts) > sizeof(plt->lit)); + memcpy(plt->lit, fixed_plts, sizeof(fixed_plts)); +} + u32 get_module_plt(struct module *mod, unsigned long loc, Elf32_Addr val) { struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core : &mod->arch.init; + struct plt_entries *plt; + int idx; + + /* cache the address, ELF header is available only during module load */ + if (!pltsec->plt_ent) + pltsec->plt_ent = (struct plt_entries *)pltsec->plt->sh_addr; + plt = pltsec->plt_ent; - struct plt_entries *plt = (struct plt_entries *)pltsec->plt->sh_addr; - int idx = 0; + prealloc_fixed(pltsec, plt); + + for (idx = 0; idx < ARRAY_SIZE(fixed_plts); ++idx) + if (plt->lit[idx] == val) + return (u32)&plt->ldr[idx]; + idx = 0; /* * Look for an existing entry pointing to 'val'. Given that the * relocations are sorted, this will be the last entry we allocated. @@ -180,8 +214,8 @@ static unsigned int count_plts(const Elf32_Sym *syms, Elf32_Addr base, int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, char *secstrings, struct module *mod) { - unsigned long core_plts = 0; - unsigned long init_plts = 0; + unsigned long core_plts = ARRAY_SIZE(fixed_plts); + unsigned long init_plts = ARRAY_SIZE(fixed_plts); Elf32_Shdr *s, *sechdrs_end = sechdrs + ehdr->e_shnum; Elf32_Sym *syms = NULL; @@ -236,6 +270,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, mod->arch.core.plt->sh_size = round_up(core_plts * PLT_ENT_SIZE, sizeof(struct plt_entries)); mod->arch.core.plt_count = 0; + mod->arch.core.plt_ent = NULL; mod->arch.init.plt->sh_type = SHT_NOBITS; mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC; @@ -243,6 +278,7 @@ int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, mod->arch.init.plt->sh_size = round_up(init_plts * PLT_ENT_SIZE, sizeof(struct plt_entries)); mod->arch.init.plt_count = 0; + mod->arch.init.plt_ent = NULL; pr_debug("%s: plt=%x, init.plt=%x\n", __func__, mod->arch.core.plt->sh_size, mod->arch.init.plt->sh_size);