diff mbox series

[v7,2/2] ARM: ftrace: Add MODULE_PLTS support

Message ID 20210127110944.41813-3-alexander.sverdlin@nokia.com (mailing list archive)
State New, archived
Headers show
Series ARM: Implement MODULE_PLT support in FTRACE | expand

Commit Message

Alexander Sverdlin Jan. 27, 2021, 11:09 a.m. UTC
From: Alexander Sverdlin <alexander.sverdlin@nokia.com>

Teach ftrace_make_call() and ftrace_make_nop() about PLTs.
Teach PLT code about FTRACE and all its callbacks.
Otherwise the following might happen:

------------[ cut here ]------------
WARNING: CPU: 14 PID: 2265 at .../arch/arm/kernel/insn.c:14 __arm_gen_branch+0x83/0x8c()
...
Hardware name: LSI Axxia AXM55XX
[<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14)
[<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8)
[<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90)
[<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c)
[<c03218f3>] (warn_slowpath_null) from [<c03143cf>] (__arm_gen_branch+0x83/0x8c)
[<c03143cf>] (__arm_gen_branch) from [<c0314337>] (ftrace_make_nop+0xf/0x24)
[<c0314337>] (ftrace_make_nop) from [<c038ebcb>] (ftrace_process_locs+0x27b/0x3e8)
[<c038ebcb>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44)
[<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84)
[<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18)
---[ end trace e1b64ced7a89adcc ]---
------------[ cut here ]------------
WARNING: CPU: 14 PID: 2265 at .../kernel/trace/ftrace.c:1979 ftrace_bug+0x1b1/0x234()
...
Hardware name: LSI Axxia AXM55XX
[<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14)
[<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8)
[<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90)
[<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c)
[<c03218f3>] (warn_slowpath_null) from [<c038e87d>] (ftrace_bug+0x1b1/0x234)
[<c038e87d>] (ftrace_bug) from [<c038ebd5>] (ftrace_process_locs+0x285/0x3e8)
[<c038ebd5>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44)
[<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84)
[<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18)
---[ end trace e1b64ced7a89adcd ]---
ftrace failed to modify [<e9ef7006>] 0xe9ef7006
actual: 02:f0:3b:fa
ftrace record flags: 0
(0) expected tramp: c0314265

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
---
 arch/arm/include/asm/ftrace.h |  3 +++
 arch/arm/include/asm/module.h |  1 +
 arch/arm/kernel/ftrace.c      | 46 +++++++++++++++++++++++++++++++++++++++++--
 arch/arm/kernel/module-plts.c | 44 +++++++++++++++++++++++++++++++++++++----
 4 files changed, 88 insertions(+), 6 deletions(-)

Comments

Florian Fainelli Jan. 27, 2021, 7:36 p.m. UTC | #1
On 1/27/2021 3:09 AM, Alexander A Sverdlin wrote:
> From: Alexander Sverdlin <alexander.sverdlin@nokia.com>
> 
> Teach ftrace_make_call() and ftrace_make_nop() about PLTs.
> Teach PLT code about FTRACE and all its callbacks.
> Otherwise the following might happen:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 14 PID: 2265 at .../arch/arm/kernel/insn.c:14 __arm_gen_branch+0x83/0x8c()
> ...
> Hardware name: LSI Axxia AXM55XX
> [<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14)
> [<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8)
> [<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90)
> [<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c)
> [<c03218f3>] (warn_slowpath_null) from [<c03143cf>] (__arm_gen_branch+0x83/0x8c)
> [<c03143cf>] (__arm_gen_branch) from [<c0314337>] (ftrace_make_nop+0xf/0x24)
> [<c0314337>] (ftrace_make_nop) from [<c038ebcb>] (ftrace_process_locs+0x27b/0x3e8)
> [<c038ebcb>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44)
> [<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84)
> [<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18)
> ---[ end trace e1b64ced7a89adcc ]---
> ------------[ cut here ]------------
> WARNING: CPU: 14 PID: 2265 at .../kernel/trace/ftrace.c:1979 ftrace_bug+0x1b1/0x234()
> ...
> Hardware name: LSI Axxia AXM55XX
> [<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14)
> [<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8)
> [<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90)
> [<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c)
> [<c03218f3>] (warn_slowpath_null) from [<c038e87d>] (ftrace_bug+0x1b1/0x234)
> [<c038e87d>] (ftrace_bug) from [<c038ebd5>] (ftrace_process_locs+0x285/0x3e8)
> [<c038ebd5>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44)
> [<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84)
> [<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18)
> ---[ end trace e1b64ced7a89adcd ]---
> ftrace failed to modify [<e9ef7006>] 0xe9ef7006
> actual: 02:f0:3b:fa
> ftrace record flags: 0
> (0) expected tramp: c0314265
> 
> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>

Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Qais Yousef March 7, 2021, 5:26 p.m. UTC | #2
Hi Alexander

Sorry if I'm butting in out of the blue. I got curious so decided to have a go
at reproducing the issue :-)

On 01/27/21 12:09, Alexander A Sverdlin wrote:
> From: Alexander Sverdlin <alexander.sverdlin@nokia.com>
> 
> Teach ftrace_make_call() and ftrace_make_nop() about PLTs.
> Teach PLT code about FTRACE and all its callbacks.
> Otherwise the following might happen:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 14 PID: 2265 at .../arch/arm/kernel/insn.c:14 __arm_gen_branch+0x83/0x8c()
> ...
> Hardware name: LSI Axxia AXM55XX
> [<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14)
> [<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8)
> [<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90)
> [<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c)
> [<c03218f3>] (warn_slowpath_null) from [<c03143cf>] (__arm_gen_branch+0x83/0x8c)
> [<c03143cf>] (__arm_gen_branch) from [<c0314337>] (ftrace_make_nop+0xf/0x24)
> [<c0314337>] (ftrace_make_nop) from [<c038ebcb>] (ftrace_process_locs+0x27b/0x3e8)
> [<c038ebcb>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44)
> [<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84)
> [<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18)
> ---[ end trace e1b64ced7a89adcc ]---
> ------------[ cut here ]------------
> WARNING: CPU: 14 PID: 2265 at .../kernel/trace/ftrace.c:1979 ftrace_bug+0x1b1/0x234()
> ...
> Hardware name: LSI Axxia AXM55XX
> [<c0314a49>] (unwind_backtrace) from [<c03115e9>] (show_stack+0x11/0x14)
> [<c03115e9>] (show_stack) from [<c0519f51>] (dump_stack+0x81/0xa8)
> [<c0519f51>] (dump_stack) from [<c032185d>] (warn_slowpath_common+0x69/0x90)
> [<c032185d>] (warn_slowpath_common) from [<c03218f3>] (warn_slowpath_null+0x17/0x1c)
> [<c03218f3>] (warn_slowpath_null) from [<c038e87d>] (ftrace_bug+0x1b1/0x234)
> [<c038e87d>] (ftrace_bug) from [<c038ebd5>] (ftrace_process_locs+0x285/0x3e8)
> [<c038ebd5>] (ftrace_process_locs) from [<c0378d79>] (load_module+0x11e9/0x1a44)
> [<c0378d79>] (load_module) from [<c037974d>] (SyS_finit_module+0x59/0x84)
> [<c037974d>] (SyS_finit_module) from [<c030e981>] (ret_fast_syscall+0x1/0x18)
> ---[ end trace e1b64ced7a89adcd ]---
> ftrace failed to modify [<e9ef7006>] 0xe9ef7006
> actual: 02:f0:3b:fa
> ftrace record flags: 0
> (0) expected tramp: c0314265
> 
> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>

I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your
instructions on the other email. But most likely because I'm hitting another
problem that could be masking it. I'm not sure it is related or just randomly
happened to hit it.

Did you see something similar?


	[    0.000000] ftrace: allocating 82941 entries in 244 pages
	[    0.000000] ------------[ ftrace bug ]------------
	[    0.000000] ftrace failed to modify
	[    0.000000] [<c2b00350>] set_reset_devices+0x10/0x28
	[    0.000000]  actual:   0e:28:03:eb
	[    0.000000] Initializing ftrace call sites
	[    0.000000] ftrace record flags: 0
	[    0.000000]  (0)
	[    0.000000]  expected tramp: c0312eb8
	[    0.000000] ------------[ cut here ]------------
	[    0.000000] WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2066 ftrace_bug+0x240/0x268
	[    0.000000] Modules linked in:
	[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.12.0-rc2-00002-gcb8fe87aa3fa-dirty #12
	[    0.000000] Hardware name: Generic DT based system
	[    0.000000] Backtrace:
	[    0.000000] [<c1afd8c4>] (dump_backtrace) from [<c1afdca4>] (show_stack+0x20/0x24)
	[    0.000000]  r7:c2e13c1c r6:600000d3 r5:00000000 r4:c2e13c1c
	[    0.000000] [<c1afdc84>] (show_stack) from [<c1b04c94>] (dump_stack+0xf4/0x124)
	[    0.000000] [<c1b04ba0>] (dump_stack) from [<c035f33c>] (__warn+0xfc/0x128)
	[    0.000000]  r10:c2d0908c r9:00000000 r8:00000009 r7:00000812 r6:00000009 r5:c1b01c78
	[    0.000000]  r4:c27d0f70 r3:c2d08f50
	[    0.000000] [<c035f240>] (__warn) from [<c1afee14>] (warn_slowpath_fmt+0x74/0xd0)
	[    0.000000]  r7:c1b01c78 r6:00000812 r5:c27d0f70 r4:00000000
	[    0.000000] [<c1afeda4>] (warn_slowpath_fmt) from [<c1b01c78>] (ftrace_bug+0x240/0x268)
	[    0.000000]  r8:c0312eac r7:c362f518 r6:ffffffea r5:c2b00350 r4:c3817ac4
	[    0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518)
	[    0.000000]  r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4
	[    0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174)
	[    0.000000]  r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c
	[    0.000000]  r4:c362f518
	[    0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8)
	[    0.000000]  r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000
	[    0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0)
	[    0.000000]  r10:10c5387d r9:412fc0f1 r8:48000000 r7:ffffffff r6:10c0387d r5:00000051
	[    0.000000]  r4:c2b00330
	[    0.000000] irq event stamp: 0
	[    0.000000] hardirqs last  enabled at (0): [<00000000>] 0x0
	[    0.000000] hardirqs last disabled at (0): [<00000000>] 0x0
	[    0.000000] softirqs last  enabled at (0): [<00000000>] 0x0
	[    0.000000] softirqs last disabled at (0): [<00000000>] 0x0
	[    0.000000] random: get_random_bytes called from print_oops_end_marker+0x34/0xa0 with crng_init=0
	[    0.000000] ---[ end trace 0000000000000000 ]---
	[    0.000000] ftrace: allocated 243 pages with 6 groups


> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index 9a79ef6..fa867a5 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -70,6 +70,19 @@ int ftrace_arch_code_modify_post_process(void)
>  
>  static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr)
>  {
> +	s32 offset = addr - pc;
> +	s32 blim = 0xfe000008;
> +	s32 flim = 0x02000004;

This look like magic numbers to me..

> +
> +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
> +		blim = 0xff000004;
> +		flim = 0x01000002;

.. ditto ..

> +	}
> +
> +	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) &&
> +	    (offset < blim || offset > flim))
> +		return 0;

.. I could have missed something, but wouldn't something like below be clearer?
Only compile tested. I think abs() will do the right thing here given the
passed types. I admit I don't understand why you have the '4' and '8' at the
lowest nibble..

	diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
	index fa867a57100f..b44aee87c53a 100644
	--- a/arch/arm/kernel/ftrace.c
	+++ b/arch/arm/kernel/ftrace.c
	@@ -70,17 +70,13 @@ int ftrace_arch_code_modify_post_process(void)

	 static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr)
	 {
	-       s32 offset = addr - pc;
	-       s32 blim = 0xfe000008;
	-       s32 flim = 0x02000004;
	+       u32 offset = abs(addr - pc);
	+       u32 range = 0x02000000; /* +-32MiB */

	-       if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
	-               blim = 0xff000004;
	-               flim = 0x01000002;
	-       }
	+       if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
	+               range = 0x01000000; /* +-16MiB */

	-       if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) &&
	-           (offset < blim || offset > flim))
	+       if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && offset > range)
			return 0;

		return arm_gen_branch_link(pc, addr);

In case CONFIG_ARM_MODULE_PLTS is not enabled what would happen? Is it
impossible to hit this corner case or we could fail one way or another? IOW,
should this check be always compiled in?

> +
>  	return arm_gen_branch_link(pc, addr);
>  }
>  
> @@ -124,10 +137,22 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>  {
>  	unsigned long new, old;
>  	unsigned long ip = rec->ip;
> +	unsigned long aaddr = adjust_address(rec, addr);
>  
>  	old = ftrace_nop_replace(rec);
>  
> -	new = ftrace_call_replace(ip, adjust_address(rec, addr));
> +	new = ftrace_call_replace(ip, aaddr);
> +
> +#ifdef CONFIG_ARM_MODULE_PLTS
> +	if (!new) {
> +		struct module *mod = rec->arch.mod;
> +
> +		if (mod) {

What would happen if !new and !mod?

> +			aaddr = get_module_plt(mod, ip, aaddr);
> +			new = ftrace_call_replace(ip, aaddr);

I assume we're guaranteed to have a sensible value returned in 'new' here?

Thanks

--
Qais Yousef

> +		}
> +	}
> +#endif
Alexander Sverdlin March 8, 2021, 7:58 a.m. UTC | #3
Hi!

On 07/03/2021 18:26, Qais Yousef wrote:
> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your
> instructions on the other email. But most likely because I'm hitting another
> problem that could be masking it. I'm not sure it is related or just randomly
> happened to hit it.
> 
> Did you see something similar?

[...]

> 	[    0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518)
> 	[    0.000000]  r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4
> 	[    0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174)
> 	[    0.000000]  r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c
> 	[    0.000000]  r4:c362f518
> 	[    0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8)
> 	[    0.000000]  r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000
> 	[    0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0)

This means, FTRACE has more problems with your kernel/compiler/platform, I've addressed similar issue
in the past, but my patch should be long merged:

https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1817963.html

Could it be the same problem as here:
https://www.spinics.net/lists/arm-kernel/msg854022.html

Seems that the size check deserves something line BUILD_BUG_ON() with FTRACE...

>> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
>> index 9a79ef6..fa867a5 100644
>> --- a/arch/arm/kernel/ftrace.c
>> +++ b/arch/arm/kernel/ftrace.c
>> @@ -70,6 +70,19 @@ int ftrace_arch_code_modify_post_process(void)
>>  
>>  static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr)
>>  {
>> +	s32 offset = addr - pc;
>> +	s32 blim = 0xfe000008;
>> +	s32 flim = 0x02000004;
> 
> This look like magic numbers to me..

These magic numbers are most probably the reason for your FTRACE to resign...
Those are backward- and forward-branch limits. I didn't find the matching DEFINEs
in the kernel, but I would be happy to learn them. I can also put some comments,
but I actually thought the purpose would be obvious from the code...

>> +
>> +	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
>> +		blim = 0xff000004;
>> +		flim = 0x01000002;
> 
> .. ditto ..
> 
>> +	}
>> +
>> +	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) &&
>> +	    (offset < blim || offset > flim))
>> +		return 0;
> 
> .. I could have missed something, but wouldn't something like below be clearer?
> Only compile tested. I think abs() will do the right thing here given the
> passed types. I admit I don't understand why you have the '4' and '8' at the
> lowest nibble..

Yes, the limits are not symmetrical. These "magic numbers" have been checked many
times by me, but I admit I'm not expert in ARM assembly. I'm however still quite
sure about them.

> 	diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> 	index fa867a57100f..b44aee87c53a 100644
> 	--- a/arch/arm/kernel/ftrace.c
> 	+++ b/arch/arm/kernel/ftrace.c
> 	@@ -70,17 +70,13 @@ int ftrace_arch_code_modify_post_process(void)
> 
> 	 static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr)
> 	 {
> 	-       s32 offset = addr - pc;
> 	-       s32 blim = 0xfe000008;
> 	-       s32 flim = 0x02000004;
> 	+       u32 offset = abs(addr - pc);
> 	+       u32 range = 0x02000000; /* +-32MiB */
> 
> 	-       if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
> 	-               blim = 0xff000004;
> 	-               flim = 0x01000002;
> 	-       }
> 	+       if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
> 	+               range = 0x01000000; /* +-16MiB */
> 
> 	-       if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) &&
> 	-           (offset < blim || offset > flim))
> 	+       if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && offset > range)
> 			return 0;

See above, the limits are not symmetrical.

> 		return arm_gen_branch_link(pc, addr);
> 
> In case CONFIG_ARM_MODULE_PLTS is not enabled what would happen? Is it
> impossible to hit this corner case or we could fail one way or another? IOW,
> should this check be always compiled in?

I didn't want to modify the original behavior and the limits are again checked
in either ARM or THUMB implementations of __arm_gen_branch() (there you will
again find a nice set of "magic numbers". 

>> +
>>  	return arm_gen_branch_link(pc, addr);
>>  }
>>  
>> @@ -124,10 +137,22 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>>  {
>>  	unsigned long new, old;
>>  	unsigned long ip = rec->ip;
>> +	unsigned long aaddr = adjust_address(rec, addr);
>>  
>>  	old = ftrace_nop_replace(rec);
>>  
>> -	new = ftrace_call_replace(ip, adjust_address(rec, addr));
>> +	new = ftrace_call_replace(ip, aaddr);
>> +
>> +#ifdef CONFIG_ARM_MODULE_PLTS
>> +	if (!new) {
>> +		struct module *mod = rec->arch.mod;
>> +
>> +		if (mod) {
> 
> What would happen if !new and !mod?

I believe, that's exactly what happens in the dump you experience with your kernel.
This is not covered by this patch, this patch covers the issue with modules in vmalloc area.

>> +			aaddr = get_module_plt(mod, ip, aaddr);
>> +			new = ftrace_call_replace(ip, aaddr);
> 
> I assume we're guaranteed to have a sensible value returned in 'new' here?

Otherwise you'd see the dump you see :)
It relies on the already existing error handling.

>> +		}
>> +	}
>> +#endif
Qais Yousef March 9, 2021, 5:42 p.m. UTC | #4
On 03/08/21 08:58, Alexander Sverdlin wrote:
> Hi!
> 
> On 07/03/2021 18:26, Qais Yousef wrote:
> > I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your

I still can't reproduce on 5.12-rc2.

I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after
loading the module? I tried starting ftrace, but maybe there's a particular
combination required?

> > instructions on the other email. But most likely because I'm hitting another
> > problem that could be masking it. I'm not sure it is related or just randomly
> > happened to hit it.
> > 
> > Did you see something similar?
> 
> [...]
> 
> > 	[    0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518)
> > 	[    0.000000]  r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4
> > 	[    0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174)
> > 	[    0.000000]  r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c
> > 	[    0.000000]  r4:c362f518
> > 	[    0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8)
> > 	[    0.000000]  r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000
> > 	[    0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0)
> 
> This means, FTRACE has more problems with your kernel/compiler/platform, I've addressed similar issue
> in the past, but my patch should be long merged:
> 
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1817963.html
> 
> Could it be the same problem as here:
> https://www.spinics.net/lists/arm-kernel/msg854022.html
> 
> Seems that the size check deserves something line BUILD_BUG_ON() with FTRACE...

So I only see this when I convert all modules to be built-in

	sed -i 's/=m/=y/' .config

FWIW, I see the problem with your patch applied too. Trying to dig more into
it..

> 
> >> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> >> index 9a79ef6..fa867a5 100644
> >> --- a/arch/arm/kernel/ftrace.c
> >> +++ b/arch/arm/kernel/ftrace.c
> >> @@ -70,6 +70,19 @@ int ftrace_arch_code_modify_post_process(void)
> >>  
> >>  static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr)
> >>  {
> >> +	s32 offset = addr - pc;
> >> +	s32 blim = 0xfe000008;
> >> +	s32 flim = 0x02000004;
> > 
> > This look like magic numbers to me..
> 
> These magic numbers are most probably the reason for your FTRACE to resign...
> Those are backward- and forward-branch limits. I didn't find the matching DEFINEs
> in the kernel, but I would be happy to learn them. I can also put some comments,
> but I actually thought the purpose would be obvious from the code...

So I did dig more into it. The range is asymmetrical indeed. And the strange
offset is to cater for the pc being incremented by +8 (+4 for thumb2).

You're duplicating the checks in __arm_gen_branch_{thumb2, arm}(). As you noted
__arm_gen_branch() which is called by arm_gen_branch_link() will end up doing
the exact same check and return 0. So why do you need to duplicate the check
here? We can do something about the WARN_ON_ONCE(1).

[...]

> >> +
> >>  	return arm_gen_branch_link(pc, addr);
> >>  }
> >>  
> >> @@ -124,10 +137,22 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
> >>  {
> >>  	unsigned long new, old;
> >>  	unsigned long ip = rec->ip;
> >> +	unsigned long aaddr = adjust_address(rec, addr);
> >>  
> >>  	old = ftrace_nop_replace(rec);
> >>  
> >> -	new = ftrace_call_replace(ip, adjust_address(rec, addr));
> >> +	new = ftrace_call_replace(ip, aaddr);
> >> +
> >> +#ifdef CONFIG_ARM_MODULE_PLTS
> >> +	if (!new) {
> >> +		struct module *mod = rec->arch.mod;
> >> +
> >> +		if (mod) {
> > 
> > What would happen if !new and !mod?
> 
> I believe, that's exactly what happens in the dump you experience with your kernel.
> This is not covered by this patch, this patch covers the issue with modules in vmalloc area.
> 
> >> +			aaddr = get_module_plt(mod, ip, aaddr);
> >> +			new = ftrace_call_replace(ip, aaddr);
> > 
> > I assume we're guaranteed to have a sensible value returned in 'new' here?
> 
> Otherwise you'd see the dump you see :)
> It relies on the already existing error handling.

I understand from this there are still loose ends to be handled in this area of
the code.

I admit I need to spend more time to understand why I get the failure above and
how this overlaps with your proposal. But as it stands it seems there's more
work to be done here.

Thanks

--
Qais Yousef
Alexander Sverdlin March 10, 2021, 7:23 a.m. UTC | #5
Hi!

On 09/03/2021 18:42, Qais Yousef wrote:
>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your
> I still can't reproduce on 5.12-rc2.
> 
> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after
> loading the module? I tried starting ftrace, but maybe there's a particular
> combination required?

You need to load a BIG module, so big that it has no place in the modules area
any more and goes to vmalloc area.

>>> instructions on the other email. But most likely because I'm hitting another
>>> problem that could be masking it. I'm not sure it is related or just randomly
>>> happened to hit it.
>>>
>>> Did you see something similar?
>> [...]
>>
>>> 	[    0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518)
>>> 	[    0.000000]  r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4
>>> 	[    0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174)
>>> 	[    0.000000]  r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c
>>> 	[    0.000000]  r4:c362f518
>>> 	[    0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8)
>>> 	[    0.000000]  r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000
>>> 	[    0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0)
>> This means, FTRACE has more problems with your kernel/compiler/platform, I've addressed similar issue
>> in the past, but my patch should be long merged:
>>
>> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1817963.html
>>
>> Could it be the same problem as here:
>> https://www.spinics.net/lists/arm-kernel/msg854022.html
>>
>> Seems that the size check deserves something line BUILD_BUG_ON() with FTRACE...
> So I only see this when I convert all modules to be built-in
> 
> 	sed -i 's/=m/=y/' .config
> 
> FWIW, I see the problem with your patch applied too. Trying to dig more into
> it..

Then it's definitely the problem explained in the second link. If you have THUMB2 kernel, maybe
you have to switch to ARM.
Florian Fainelli March 10, 2021, 4:14 p.m. UTC | #6
On 3/9/2021 11:23 PM, Alexander Sverdlin wrote:
> Hi!
> 
> On 09/03/2021 18:42, Qais Yousef wrote:
>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your
>> I still can't reproduce on 5.12-rc2.
>>
>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after
>> loading the module? I tried starting ftrace, but maybe there's a particular
>> combination required?
> 
> You need to load a BIG module, so big that it has no place in the modules area
> any more and goes to vmalloc area.

You absolutely need a very big module maybe more than one. When I tested
this, I could use the two proprietary modules (*sigh*) that I needed to
exercise against and loading one but not the other was not enough to
make the second module loading spill into vmalloc space.

> 
>>>> instructions on the other email. But most likely because I'm hitting another
>>>> problem that could be masking it. I'm not sure it is related or just randomly
>>>> happened to hit it.
>>>>
>>>> Did you see something similar?
>>> [...]
>>>
>>>> 	[    0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] (ftrace_process_locs+0x2b0/0x518)
>>>> 	[    0.000000]  r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4
>>>> 	[    0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] (ftrace_init+0xc8/0x174)
>>>> 	[    0.000000]  r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 r6:00000001 r5:c2d0908c
>>>> 	[    0.000000]  r4:c362f518
>>>> 	[    0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] (start_kernel+0x2f4/0x5b8)
>>>> 	[    0.000000]  r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc r5:c2d08f00 r4:c2ffa000
>>>> 	[    0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0)
>>> This means, FTRACE has more problems with your kernel/compiler/platform, I've addressed similar issue
>>> in the past, but my patch should be long merged:
>>>
>>> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1817963.html
>>>
>>> Could it be the same problem as here:
>>> https://www.spinics.net/lists/arm-kernel/msg854022.html
>>>
>>> Seems that the size check deserves something line BUILD_BUG_ON() with FTRACE...
>> So I only see this when I convert all modules to be built-in
>>
>> 	sed -i 's/=m/=y/' .config
>>
>> FWIW, I see the problem with your patch applied too. Trying to dig more into
>> it..
> 
> Then it's definitely the problem explained in the second link. If you have THUMB2 kernel, maybe
> you have to switch to ARM.
>
Alexander Sverdlin March 10, 2021, 5:17 p.m. UTC | #7
Hi!

On 10/03/2021 17:14, Florian Fainelli wrote:
>>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your
>>> I still can't reproduce on 5.12-rc2.
>>>
>>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after
>>> loading the module? I tried starting ftrace, but maybe there's a particular
>>> combination required?
>> You need to load a BIG module, so big that it has no place in the modules area
>> any more and goes to vmalloc area.
> You absolutely need a very big module maybe more than one. When I tested
> this, I could use the two proprietary modules (*sigh*) that I needed to
> exercise against and loading one but not the other was not enough to
> make the second module loading spill into vmalloc space.

Here is what I use instead of these real world "proprietary" modules (which of course
were the real trigger for the patch):

https://www.spinics.net/lists/arm-kernel/msg878599.html
Qais Yousef March 12, 2021, 5:24 p.m. UTC | #8
Hi Alexander

On 03/10/21 18:17, Alexander Sverdlin wrote:
> Hi!
> 
> On 10/03/2021 17:14, Florian Fainelli wrote:
> >>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your
> >>> I still can't reproduce on 5.12-rc2.
> >>>
> >>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after
> >>> loading the module? I tried starting ftrace, but maybe there's a particular
> >>> combination required?
> >> You need to load a BIG module, so big that it has no place in the modules area
> >> any more and goes to vmalloc area.
> > You absolutely need a very big module maybe more than one. When I tested
> > this, I could use the two proprietary modules (*sigh*) that I needed to
> > exercise against and loading one but not the other was not enough to
> > make the second module loading spill into vmalloc space.
> 
> Here is what I use instead of these real world "proprietary" modules (which of course
> were the real trigger for the patch):
> 
> https://www.spinics.net/lists/arm-kernel/msg878599.html

I am testing with your module. I can't reproduce the problem you describe with
it as I stated.

I will try to spend more time on it on the weekend.

Thanks

--
Qais Yousef
Florian Fainelli March 12, 2021, 6:35 p.m. UTC | #9
On 3/12/21 9:24 AM, Qais Yousef wrote:
> Hi Alexander
> 
> On 03/10/21 18:17, Alexander Sverdlin wrote:
>> Hi!
>>
>> On 10/03/2021 17:14, Florian Fainelli wrote:
>>>>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your
>>>>> I still can't reproduce on 5.12-rc2.
>>>>>
>>>>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after
>>>>> loading the module? I tried starting ftrace, but maybe there's a particular
>>>>> combination required?
>>>> You need to load a BIG module, so big that it has no place in the modules area
>>>> any more and goes to vmalloc area.
>>> You absolutely need a very big module maybe more than one. When I tested
>>> this, I could use the two proprietary modules (*sigh*) that I needed to
>>> exercise against and loading one but not the other was not enough to
>>> make the second module loading spill into vmalloc space.
>>
>> Here is what I use instead of these real world "proprietary" modules (which of course
>> were the real trigger for the patch):
>>
>> https://www.spinics.net/lists/arm-kernel/msg878599.html
> 
> I am testing with your module. I can't reproduce the problem you describe with
> it as I stated.
> 
> I will try to spend more time on it on the weekend.

Alexander, do you load one or multiple instances of that fat module?

The test module does a 6 * 1024 * 1024 / 2 = 3 million repetitions of
the "nop" instruction which should be 32-bits wide in ARM mode and
16-bits wide in Thumb mode, right?

In ARM mode we have a 14MB module space, so 3 * 1024 * 1024 * 4 = 12MB,
which should still fit within if you have no module loaded, however a
second instance of the module should make us spill into vmalloc space.

In Thumb mode, we have a 6MB module space, so 3 * 1024 * 1024 * 2 = 6MB
so we may spill, but maybe not.

I was not able to reproduce the warning with just one module, but with
two (cannot have the same name BTW), it kicked in.
Qais Yousef March 14, 2021, 10:02 p.m. UTC | #10
I fixed Ard's email as it kept bouncing back.

+CC Linus Walleij

On 03/12/21 10:35, Florian Fainelli wrote:
> On 3/12/21 9:24 AM, Qais Yousef wrote:
> > Hi Alexander
> > 
> > On 03/10/21 18:17, Alexander Sverdlin wrote:
> >> Hi!
> >>
> >> On 10/03/2021 17:14, Florian Fainelli wrote:
> >>>>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your
> >>>>> I still can't reproduce on 5.12-rc2.
> >>>>>
> >>>>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after
> >>>>> loading the module? I tried starting ftrace, but maybe there's a particular
> >>>>> combination required?
> >>>> You need to load a BIG module, so big that it has no place in the modules area
> >>>> any more and goes to vmalloc area.
> >>> You absolutely need a very big module maybe more than one. When I tested
> >>> this, I could use the two proprietary modules (*sigh*) that I needed to
> >>> exercise against and loading one but not the other was not enough to
> >>> make the second module loading spill into vmalloc space.
> >>
> >> Here is what I use instead of these real world "proprietary" modules (which of course
> >> were the real trigger for the patch):
> >>
> >> https://www.spinics.net/lists/arm-kernel/msg878599.html
> > 
> > I am testing with your module. I can't reproduce the problem you describe with
> > it as I stated.
> > 
> > I will try to spend more time on it on the weekend.
> 
> Alexander, do you load one or multiple instances of that fat module?
> 
> The test module does a 6 * 1024 * 1024 / 2 = 3 million repetitions of
> the "nop" instruction which should be 32-bits wide in ARM mode and
> 16-bits wide in Thumb mode, right?
> 
> In ARM mode we have a 14MB module space, so 3 * 1024 * 1024 * 4 = 12MB,
> which should still fit within if you have no module loaded, however a
> second instance of the module should make us spill into vmalloc space.
> 
> In Thumb mode, we have a 6MB module space, so 3 * 1024 * 1024 * 2 = 6MB
> so we may spill, but maybe not.
> 
> I was not able to reproduce the warning with just one module, but with
> two (cannot have the same name BTW), it kicked in.

Thanks Florian. With 2 modules I can indeed reproduce the issue. Though for
thumb I still hit the other problem so I couldn't properly test it.

I have come up with an alternative solution (hacky patch at the end) which seem
to fix both problems for me. That is

	* sed -i 's/=m/=y/ .config' which causes a lot of veneers to be
	  generated is handled correctly now AFAICT
	* Loading a fat module works the same.

I have tested it in both arm and thumb modes. Running

	trace-cmd start -p function

executes successfully and I can see functions being traced.

The solution makes a lot of assumptions that I yet to verify though, but I hope
it could form the basis for a proper one that fixes both issues. It's faster
too. I'm just not sure if I need to take this handling one layer above or it's
okay to be done at __arm_gen_branch(). Also I had a version that was more
strict about verifying the veneer @pc points to @addr before using the veneer
address but that fails for modules. The logic didn't work to verify for module,
but we do get the correct veneer function as returned by get_module_plt().

Since for ftrace we really care about mcount, I considered reading the plt once
at load time to get the address and save it in rec->arch and use that later
instead of continuously reading it + having to store the plt. We could revisit
that if the below has a fundamental flaw.

Alexander, have you considered these options before? I could have easily missed
something :-)

Thanks

--
Qais Yousef


------->8---------


diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index 9a79ef6b1876..70f6e197c306 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -157,6 +157,9 @@ int ftrace_make_nop(struct module *mod,
 	unsigned long new;
 	int ret;
 
+	if (mod)
+		pr_info("VENEER_DBG: mcount_veneedr_addr = 0x%x\n", get_module_plt(mod, ip, addr));
+
 	old = ftrace_call_replace(ip, adjust_address(rec, addr));
 	new = ftrace_nop_replace(rec);
 	ret = ftrace_modify_code(ip, old, new, true);
diff --git a/arch/arm/kernel/insn.c b/arch/arm/kernel/insn.c
index 2e844b70386b..0d6abf8d726f 100644
--- a/arch/arm/kernel/insn.c
+++ b/arch/arm/kernel/insn.c
@@ -3,17 +3,53 @@
 #include <linux/kernel.h>
 #include <asm/opcodes.h>
 
+/*
+ *  Checks if a @pc points to a veneer to @addr
+ *
+ *  Returns addr to veneer if true, 0 otherwise.
+ */
+static unsigned long
+ __arm_addr_is_veneer_thumb2(unsigned long pc, unsigned long addr, bool *exchange)
+{
+	unsigned long insn = __mem_to_opcode_thumb32(*(unsigned long *)pc);
+
+	unsigned long s = insn >> 26 & 0x1;
+	unsigned long j1 = insn >> 13 & 0x1;
+	unsigned long j2 = insn >> 11 & 0x1;
+	unsigned long i1 = !(j1 ^ s);
+	unsigned long i2 = !(j2 ^ s);
+	unsigned long imm10 = insn >> 16 & 0x3ff;
+	unsigned long imm11 = (insn & 0x7ff) + 1;
+
+	long offset = ((s ? 0xff : s) << 24) | (i1 << 23) |
+			(i2 << 22) | (imm10 << 12) | (imm11 << 1);
+
+	unsigned long veneer_fn = pc + 2 + offset;
+	long voffset = *(long *)(veneer_fn + 4 + 4 + 4);
+	unsigned long veneer_addr = (long)veneer_fn + 4 + 4 + 4 + voffset;
+
+	*exchange = !(insn & 1 << 12);
+
+	pr_info("VENEER_DBG: insn=0x%lx, offset=0x%lx\n", insn, offset);
+	pr_info("VENEER_DBG: pc=0x%lx, addr=0x%lx\n", pc, addr);
+	pr_info("VENEER_DBG: veneer_fn=0x%lx, veneer_addr=0x%lx\n", veneer_fn, veneer_addr);
+
+	return veneer_fn;
+
+}
+
 static unsigned long
 __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
 {
 	unsigned long s, j1, j2, i1, i2, imm10, imm11;
 	unsigned long first, second;
+	bool exchange = false;
 	long offset;
 
 	offset = (long)addr - (long)(pc + 4);
 	if (offset < -16777216 || offset > 16777214) {
-		WARN_ON_ONCE(1);
-		return 0;
+		unsigned long vaddr = __arm_addr_is_veneer_thumb2(pc, addr, &exchange);
+		offset = (long)vaddr - (long)(pc + 4);
 	}
 
 	s	= (offset >> 24) & 0x1;
@@ -30,9 +66,34 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
 	if (link)
 		second |= 1 << 14;
 
+	if (exchange)
+		second &= ~(1 << 12);
+
+	//pr_info("VENEER_DBG: thumb32 opcode = 0x%x\n", __opcode_thumb32_compose(first, second));
 	return __opcode_thumb32_compose(first, second);
 }
 
+/*
+ *  Checks if a @pc points to a veneer to @addr
+ *
+ *  Returns addr to veneer if true, 0 otherwise.
+ */
+static unsigned long
+ __arm_addr_is_veneer_arm(unsigned long pc, unsigned long addr)
+{
+	unsigned long insn = __mem_to_opcode_arm(*(unsigned long *)pc);
+	long offset = (insn & 0xffffff) << 2;
+	unsigned long veneer_fn = pc + 8 + offset;
+	long voffset = 0; //*(long *)(veneer_fn + 8);
+	unsigned long veneer_addr = (long)veneer_fn + 4 + 8 + voffset;
+
+	pr_info("VENEER_DBG: insn=0x%lx, offset=0x%lx\n", insn, offset);
+	pr_info("VENEER_DBG: pc=0x%lx, addr=0x%lx\n", pc, addr);
+	pr_info("VENEER_DBG: veneer_fn=0x%lx, veneer_addr=0x%lx\n", veneer_fn, veneer_addr);
+
+	return veneer_fn;
+}
+
 static unsigned long
 __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
 {
@@ -44,8 +105,8 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
 
 	offset = (long)addr - (long)(pc + 8);
 	if (unlikely(offset < -33554432 || offset > 33554428)) {
-		WARN_ON_ONCE(1);
-		return 0;
+		unsigned long vaddr = __arm_addr_is_veneer_arm(pc, addr);
+		offset = (long)vaddr - (long)(pc + 8);
 	}
 
 	offset = (offset >> 2) & 0x00ffffff;
Alexander Sverdlin March 15, 2021, 9:19 a.m. UTC | #11
Hello Florian,

On 12/03/2021 19:35, Florian Fainelli wrote:
>>> https://www.spinics.net/lists/arm-kernel/msg878599.html
>> I am testing with your module. I can't reproduce the problem you describe with
>> it as I stated.
>>
>> I will try to spend more time on it on the weekend.
> Alexander, do you load one or multiple instances of that fat module?

One cannot have "multiple instances of the same module"...

> The test module does a 6 * 1024 * 1024 / 2 = 3 million repetitions of
> the "nop" instruction which should be 32-bits wide in ARM mode and
> 16-bits wide in Thumb mode, right?
> 
> In ARM mode we have a 14MB module space, so 3 * 1024 * 1024 * 4 = 12MB,
> which should still fit within if you have no module loaded, however a
> second instance of the module should make us spill into vmalloc space.
> 
> In Thumb mode, we have a 6MB module space, so 3 * 1024 * 1024 * 2 = 6MB
> so we may spill, but maybe not.
> 
> I was not able to reproduce the warning with just one module, but with
> two (cannot have the same name BTW), it kicked in.

... well, may be the size was arbitrary chosen to not fit into our module space
and we have more modules already loaded. But you are free to adjust the 
amount of NOPs! :)
Qais Yousef March 21, 2021, 7:06 p.m. UTC | #12
Hi Alexander

On 03/14/21 22:02, Qais Yousef wrote:
> I fixed Ard's email as it kept bouncing back.
> 
> +CC Linus Walleij
> 
> On 03/12/21 10:35, Florian Fainelli wrote:
> > On 3/12/21 9:24 AM, Qais Yousef wrote:
> > > Hi Alexander
> > > 
> > > On 03/10/21 18:17, Alexander Sverdlin wrote:
> > >> Hi!
> > >>
> > >> On 10/03/2021 17:14, Florian Fainelli wrote:
> > >>>>>>> I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your
> > >>>>> I still can't reproduce on 5.12-rc2.
> > >>>>>
> > >>>>> I do have CONFIG_ARM_MODULE_PLTS=y. Do you need to do something else after
> > >>>>> loading the module? I tried starting ftrace, but maybe there's a particular
> > >>>>> combination required?
> > >>>> You need to load a BIG module, so big that it has no place in the modules area
> > >>>> any more and goes to vmalloc area.
> > >>> You absolutely need a very big module maybe more than one. When I tested
> > >>> this, I could use the two proprietary modules (*sigh*) that I needed to
> > >>> exercise against and loading one but not the other was not enough to
> > >>> make the second module loading spill into vmalloc space.
> > >>
> > >> Here is what I use instead of these real world "proprietary" modules (which of course
> > >> were the real trigger for the patch):
> > >>
> > >> https://www.spinics.net/lists/arm-kernel/msg878599.html
> > > 
> > > I am testing with your module. I can't reproduce the problem you describe with
> > > it as I stated.
> > > 
> > > I will try to spend more time on it on the weekend.
> > 
> > Alexander, do you load one or multiple instances of that fat module?
> > 
> > The test module does a 6 * 1024 * 1024 / 2 = 3 million repetitions of
> > the "nop" instruction which should be 32-bits wide in ARM mode and
> > 16-bits wide in Thumb mode, right?
> > 
> > In ARM mode we have a 14MB module space, so 3 * 1024 * 1024 * 4 = 12MB,
> > which should still fit within if you have no module loaded, however a
> > second instance of the module should make us spill into vmalloc space.
> > 
> > In Thumb mode, we have a 6MB module space, so 3 * 1024 * 1024 * 2 = 6MB
> > so we may spill, but maybe not.
> > 
> > I was not able to reproduce the warning with just one module, but with
> > two (cannot have the same name BTW), it kicked in.
> 
> Thanks Florian. With 2 modules I can indeed reproduce the issue. Though for
> thumb I still hit the other problem so I couldn't properly test it.
> 
> I have come up with an alternative solution (hacky patch at the end) which seem
> to fix both problems for me. That is
> 
> 	* sed -i 's/=m/=y/ .config' which causes a lot of veneers to be
> 	  generated is handled correctly now AFAICT
> 	* Loading a fat module works the same.
> 
> I have tested it in both arm and thumb modes. Running
> 
> 	trace-cmd start -p function
> 
> executes successfully and I can see functions being traced.
> 
> The solution makes a lot of assumptions that I yet to verify though, but I hope
> it could form the basis for a proper one that fixes both issues. It's faster
> too. I'm just not sure if I need to take this handling one layer above or it's
> okay to be done at __arm_gen_branch(). Also I had a version that was more
> strict about verifying the veneer @pc points to @addr before using the veneer
> address but that fails for modules. The logic didn't work to verify for module,
> but we do get the correct veneer function as returned by get_module_plt().
> 
> Since for ftrace we really care about mcount, I considered reading the plt once
> at load time to get the address and save it in rec->arch and use that later
> instead of continuously reading it + having to store the plt. We could revisit
> that if the below has a fundamental flaw.
> 
> Alexander, have you considered these options before? I could have easily missed
> something :-)

So the answer is no, that wouldn't have worked.

After spending more time on this your approach looks goot to me now.  If you
send v8 addressing my comment about removing the range check from
ftrace_call_repalce() as it is rendundant to what arm_gen_branch_link() is
already doing I'd be happy to give this series my reviewed-and-tested-by.
You'd need to ensure the warning is not triggered when we expect the call to
fail.

FWIW, I do have this to address that range check issue and reduce the ifdefery
too if it helps

------>8---------


diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h
index a4dbac07e4ef..ee2707c613bf 100644
--- a/arch/arm/include/asm/ftrace.h
+++ b/arch/arm/include/asm/ftrace.h
@@ -15,9 +15,7 @@ extern void __gnu_mcount_nc(void);
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 struct dyn_arch_ftrace {
-#ifdef CONFIG_ARM_MODULE_PLTS
 	struct module *mod;
-#endif
 };
 
 static inline unsigned long ftrace_call_adjust(unsigned long addr)
diff --git a/arch/arm/include/asm/insn.h b/arch/arm/include/asm/insn.h
index f20e08ac85ae..71c3edefe629 100644
--- a/arch/arm/include/asm/insn.h
+++ b/arch/arm/include/asm/insn.h
@@ -13,18 +13,24 @@ arm_gen_nop(void)
 }
 
 unsigned long
-__arm_gen_branch(unsigned long pc, unsigned long addr, bool link);
+__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check);
 
 static inline unsigned long
 arm_gen_branch(unsigned long pc, unsigned long addr)
 {
-	return __arm_gen_branch(pc, addr, false);
+	return __arm_gen_branch(pc, addr, false, true);
 }
 
 static inline unsigned long
 arm_gen_branch_link(unsigned long pc, unsigned long addr)
 {
-	return __arm_gen_branch(pc, addr, true);
+	return __arm_gen_branch(pc, addr, true, true);
+}
+
+static inline unsigned long
+arm_gen_branch_link_nocheck(unsigned long pc, unsigned long addr)
+{
+	return __arm_gen_branch(pc, addr, true, false);
 }
 
 #endif
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index fa867a57100f..a30b1a8e5908 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -70,20 +70,28 @@ int ftrace_arch_code_modify_post_process(void)
 
 static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr)
 {
-	s32 offset = addr - pc;
-	s32 blim = 0xfe000008;
-	s32 flim = 0x02000004;
+	return arm_gen_branch_link(pc, addr);
+}
 
-	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
-		blim = 0xff000004;
-		flim = 0x01000002;
-	}
+static unsigned long
+ftrace_call_replace_mod(struct module *mod, unsigned long pc, unsigned long addr)
+{
+#ifdef CONFIG_ARM_MODULE_PLTS
+	unsigned long new;
 
-	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) &&
-	    (offset < blim || offset > flim))
-		return 0;
+	if (likely(!mod))
+		return arm_gen_branch_link(pc, addr);
 
+	new = arm_gen_branch_link_nocheck(pc, addr);
+	if (!new) {
+		addr = get_module_plt(mod, pc, addr);
+		new = arm_gen_branch_link(pc, addr);
+	}
+
+	return new;
+#else
 	return arm_gen_branch_link(pc, addr);
+#endif
 }
 
 static int ftrace_modify_code(unsigned long pc, unsigned long old,
@@ -141,18 +149,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 
 	old = ftrace_nop_replace(rec);
 
-	new = ftrace_call_replace(ip, aaddr);
-
-#ifdef CONFIG_ARM_MODULE_PLTS
-	if (!new) {
-		struct module *mod = rec->arch.mod;
-
-		if (mod) {
-			aaddr = get_module_plt(mod, ip, aaddr);
-			new = ftrace_call_replace(ip, aaddr);
-		}
-	}
-#endif
+	new = ftrace_call_replace_mod(rec->arch.mod, ip, aaddr);
 
 	return ftrace_modify_code(rec->ip, old, new, true);
 }
@@ -183,23 +180,11 @@ int ftrace_make_nop(struct module *mod,
 	unsigned long new;
 	int ret;
 
-#ifdef CONFIG_ARM_MODULE_PLTS
 	/* mod is only supplied during module loading */
-	if (!mod)
-		mod = rec->arch.mod;
-	else
+	if (mod)
 		rec->arch.mod = mod;
-#endif
-
-	old = ftrace_call_replace(ip, aaddr);
-
-#ifdef CONFIG_ARM_MODULE_PLTS
-	if (!old && mod) {
-		aaddr = get_module_plt(mod, ip, aaddr);
-		old = ftrace_call_replace(ip, aaddr);
-	}
-#endif
 
+	old = ftrace_call_replace_mod(rec->arch.mod, ip, aaddr);
 	new = ftrace_nop_replace(rec);
 	ret = ftrace_modify_code(ip, old, new, true);
 
diff --git a/arch/arm/kernel/insn.c b/arch/arm/kernel/insn.c
index 2e844b70386b..37ec5734309e 100644
--- a/arch/arm/kernel/insn.c
+++ b/arch/arm/kernel/insn.c
@@ -4,7 +4,7 @@
 #include <asm/opcodes.h>
 
 static unsigned long
-__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
+__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link, bool check)
 {
 	unsigned long s, j1, j2, i1, i2, imm10, imm11;
 	unsigned long first, second;
@@ -12,7 +12,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
 
 	offset = (long)addr - (long)(pc + 4);
 	if (offset < -16777216 || offset > 16777214) {
-		WARN_ON_ONCE(1);
+		WARN_ON_ONCE(check);
 		return 0;
 	}
 
@@ -34,7 +34,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
 }
 
 static unsigned long
-__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
+__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link, bool check)
 {
 	unsigned long opcode = 0xea000000;
 	long offset;
@@ -44,7 +44,7 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
 
 	offset = (long)addr - (long)(pc + 8);
 	if (unlikely(offset < -33554432 || offset > 33554428)) {
-		WARN_ON_ONCE(1);
+		WARN_ON_ONCE(check);
 		return 0;
 	}
 
@@ -54,10 +54,10 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
 }
 
 unsigned long
-__arm_gen_branch(unsigned long pc, unsigned long addr, bool link)
+__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check)
 {
 	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
-		return __arm_gen_branch_thumb2(pc, addr, link);
+		return __arm_gen_branch_thumb2(pc, addr, link, check);
 	else
-		return __arm_gen_branch_arm(pc, addr, link);
+		return __arm_gen_branch_arm(pc, addr, link, check);
 }

------>8---------


Thanks

--
Qais Yousef
Steven Rostedt March 22, 2021, 3:01 p.m. UTC | #13
On Sun, 21 Mar 2021 19:06:11 +0000
Qais Yousef <qais.yousef@arm.com> wrote:

>  #ifdef CONFIG_DYNAMIC_FTRACE
>  struct dyn_arch_ftrace {
> -#ifdef CONFIG_ARM_MODULE_PLTS
>  	struct module *mod;
> -#endif
>  };
>  

I know you want to reduce the "ifdefery", but please note that the
dyn_arch_ftrace is defined once for every function that can be traced. If
you have 40,000 functions that can be traced, that pointer is created
40,000 times. Thus, you really only want fields in the struct
dyn_arch_ftrace if you really need them, otherwise, that's a lot of memory
that is wasted.

-- Steve
Qais Yousef March 22, 2021, 4:32 p.m. UTC | #14
On 03/22/21 11:01, Steven Rostedt wrote:
> On Sun, 21 Mar 2021 19:06:11 +0000
> Qais Yousef <qais.yousef@arm.com> wrote:
> 
> >  #ifdef CONFIG_DYNAMIC_FTRACE
> >  struct dyn_arch_ftrace {
> > -#ifdef CONFIG_ARM_MODULE_PLTS
> >  	struct module *mod;
> > -#endif
> >  };
> >  
> 
> I know you want to reduce the "ifdefery", but please note that the
> dyn_arch_ftrace is defined once for every function that can be traced. If
> you have 40,000 functions that can be traced, that pointer is created
> 40,000 times. Thus, you really only want fields in the struct
> dyn_arch_ftrace if you really need them, otherwise, that's a lot of memory
> that is wasted.

Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply
CONFIG_ARM_MODULE_PLTS is enabled too.

It only has an impact on reducing ifdefery when calling

	ftrace_call_replace_mod(rec->arch.mod, ...)

Should be easy to wrap rec->arch.mod with its own accessor that will return
NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions.

Up to Alexander to pick what he prefers :-)

Thanks

--
Qais Yousef
Alexander Sverdlin March 22, 2021, 5:02 p.m. UTC | #15
Hi Qais,

On 22/03/2021 17:32, Qais Yousef wrote:
> Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply
> CONFIG_ARM_MODULE_PLTS is enabled too.
> 
> It only has an impact on reducing ifdefery when calling
> 
> 	ftrace_call_replace_mod(rec->arch.mod, ...)
> 
> Should be easy to wrap rec->arch.mod with its own accessor that will return
> NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions.
> 
> Up to Alexander to pick what he prefers :-)

well, I of course prefer v7 as-is, because this review is running longer than two
years and I actually hope these patches to be finally merged at some point.
But you are welcome to optimize them with follow up patches :)
Qais Yousef March 23, 2021, 10:22 p.m. UTC | #16
Hi Alexander

On 03/22/21 18:02, Alexander Sverdlin wrote:
> Hi Qais,
> 
> On 22/03/2021 17:32, Qais Yousef wrote:
> > Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply
> > CONFIG_ARM_MODULE_PLTS is enabled too.
> > 
> > It only has an impact on reducing ifdefery when calling
> > 
> > 	ftrace_call_replace_mod(rec->arch.mod, ...)
> > 
> > Should be easy to wrap rec->arch.mod with its own accessor that will return
> > NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions.
> > 
> > Up to Alexander to pick what he prefers :-)
> 
> well, I of course prefer v7 as-is, because this review is running longer than two
> years and I actually hope these patches to be finally merged at some point.
> But you are welcome to optimize them with follow up patches :)

I appreciate that and thanks a lot for your effort. My attempt to review and
test here is to help in getting this merged.

FWIW my main concern is about duplicating the range check in
ftrace_call_replace() and using magic values that already exist in
__arm_gen_branch_{arm, thumb2}() and better remain encapsulated there.

Thanks

--
Qais Yousef

----->8------


diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h
index a4dbac07e4ef..8545b3ff8317 100644
--- a/arch/arm/include/asm/ftrace.h
+++ b/arch/arm/include/asm/ftrace.h
@@ -25,6 +25,27 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 	/* With Thumb-2, the recorded addresses have the lsb set */
 	return addr & ~1;
 }
+
+#ifdef CONFIG_ARM_MODULE_PLTS
+static inline void ftrace_set_mod(struct dyn_arch_ftrace *arch, struct module *mod)
+{
+	arch->mod = mod;
+}
+
+static inline struct module *ftrace_get_mod(struct dyn_arch_ftrace *arch)
+{
+	return arch->mod;
+}
+#else
+static inline void ftrace_set_mod(struct dyn_arch_ftrace *arch, struct module *mod)
+{
+}
+
+static inline struct module *ftrace_get_mod(struct dyn_arch_ftrace *arch)
+{
+	return NULL;
+}
+#endif
 #endif
 
 #endif
diff --git a/arch/arm/include/asm/insn.h b/arch/arm/include/asm/insn.h
index f20e08ac85ae..71c3edefe629 100644
--- a/arch/arm/include/asm/insn.h
+++ b/arch/arm/include/asm/insn.h
@@ -13,18 +13,24 @@ arm_gen_nop(void)
 }
 
 unsigned long
-__arm_gen_branch(unsigned long pc, unsigned long addr, bool link);
+__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check);
 
 static inline unsigned long
 arm_gen_branch(unsigned long pc, unsigned long addr)
 {
-	return __arm_gen_branch(pc, addr, false);
+	return __arm_gen_branch(pc, addr, false, true);
 }
 
 static inline unsigned long
 arm_gen_branch_link(unsigned long pc, unsigned long addr)
 {
-	return __arm_gen_branch(pc, addr, true);
+	return __arm_gen_branch(pc, addr, true, true);
+}
+
+static inline unsigned long
+arm_gen_branch_link_nocheck(unsigned long pc, unsigned long addr)
+{
+	return __arm_gen_branch(pc, addr, true, false);
 }
 
 #endif
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index fa867a57100f..63ea34edd222 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -70,20 +70,28 @@ int ftrace_arch_code_modify_post_process(void)
 
 static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr)
 {
-	s32 offset = addr - pc;
-	s32 blim = 0xfe000008;
-	s32 flim = 0x02000004;
+	return arm_gen_branch_link(pc, addr);
+}
 
-	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
-		blim = 0xff000004;
-		flim = 0x01000002;
-	}
+static unsigned long
+ftrace_call_replace_mod(struct module *mod, unsigned long pc, unsigned long addr)
+{
+#ifdef CONFIG_ARM_MODULE_PLTS
+	unsigned long new;
 
-	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) &&
-	    (offset < blim || offset > flim))
-		return 0;
+	if (likely(!mod))
+		return arm_gen_branch_link(pc, addr);
 
+	new = arm_gen_branch_link_nocheck(pc, addr);
+	if (!new) {
+		addr = get_module_plt(mod, pc, addr);
+		new = arm_gen_branch_link(pc, addr);
+	}
+
+	return new;
+#else
 	return arm_gen_branch_link(pc, addr);
+#endif
 }
 
 static int ftrace_modify_code(unsigned long pc, unsigned long old,
@@ -141,18 +149,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 
 	old = ftrace_nop_replace(rec);
 
-	new = ftrace_call_replace(ip, aaddr);
-
-#ifdef CONFIG_ARM_MODULE_PLTS
-	if (!new) {
-		struct module *mod = rec->arch.mod;
-
-		if (mod) {
-			aaddr = get_module_plt(mod, ip, aaddr);
-			new = ftrace_call_replace(ip, aaddr);
-		}
-	}
-#endif
+	new = ftrace_call_replace_mod(ftrace_get_mod(&rec->arch), ip, aaddr);
 
 	return ftrace_modify_code(rec->ip, old, new, true);
 }
@@ -183,23 +180,11 @@ int ftrace_make_nop(struct module *mod,
 	unsigned long new;
 	int ret;
 
-#ifdef CONFIG_ARM_MODULE_PLTS
 	/* mod is only supplied during module loading */
-	if (!mod)
-		mod = rec->arch.mod;
-	else
-		rec->arch.mod = mod;
-#endif
-
-	old = ftrace_call_replace(ip, aaddr);
-
-#ifdef CONFIG_ARM_MODULE_PLTS
-	if (!old && mod) {
-		aaddr = get_module_plt(mod, ip, aaddr);
-		old = ftrace_call_replace(ip, aaddr);
-	}
-#endif
+	if (mod)
+		ftrace_set_mod(&rec->arch, mod);
 
+	old = ftrace_call_replace_mod(ftrace_get_mod(&rec->arch), ip, aaddr);
 	new = ftrace_nop_replace(rec);
 	ret = ftrace_modify_code(ip, old, new, true);
 
diff --git a/arch/arm/kernel/insn.c b/arch/arm/kernel/insn.c
index 2e844b70386b..37ec5734309e 100644
--- a/arch/arm/kernel/insn.c
+++ b/arch/arm/kernel/insn.c
@@ -4,7 +4,7 @@
 #include <asm/opcodes.h>
 
 static unsigned long
-__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
+__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link, bool check)
 {
 	unsigned long s, j1, j2, i1, i2, imm10, imm11;
 	unsigned long first, second;
@@ -12,7 +12,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
 
 	offset = (long)addr - (long)(pc + 4);
 	if (offset < -16777216 || offset > 16777214) {
-		WARN_ON_ONCE(1);
+		WARN_ON_ONCE(check);
 		return 0;
 	}
 
@@ -34,7 +34,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
 }
 
 static unsigned long
-__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
+__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link, bool check)
 {
 	unsigned long opcode = 0xea000000;
 	long offset;
@@ -44,7 +44,7 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
 
 	offset = (long)addr - (long)(pc + 8);
 	if (unlikely(offset < -33554432 || offset > 33554428)) {
-		WARN_ON_ONCE(1);
+		WARN_ON_ONCE(check);
 		return 0;
 	}
 
@@ -54,10 +54,10 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
 }
 
 unsigned long
-__arm_gen_branch(unsigned long pc, unsigned long addr, bool link)
+__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check)
 {
 	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
-		return __arm_gen_branch_thumb2(pc, addr, link);
+		return __arm_gen_branch_thumb2(pc, addr, link, check);
 	else
-		return __arm_gen_branch_arm(pc, addr, link);
+		return __arm_gen_branch_arm(pc, addr, link, check);
 }
Florian Fainelli March 24, 2021, 3:37 a.m. UTC | #17
Hi Qais,

On 3/23/2021 3:22 PM, Qais Yousef wrote:
> Hi Alexander
> 
> On 03/22/21 18:02, Alexander Sverdlin wrote:
>> Hi Qais,
>>
>> On 22/03/2021 17:32, Qais Yousef wrote:
>>> Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply
>>> CONFIG_ARM_MODULE_PLTS is enabled too.
>>>
>>> It only has an impact on reducing ifdefery when calling
>>>
>>> 	ftrace_call_replace_mod(rec->arch.mod, ...)
>>>
>>> Should be easy to wrap rec->arch.mod with its own accessor that will return
>>> NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions.
>>>
>>> Up to Alexander to pick what he prefers :-)
>>
>> well, I of course prefer v7 as-is, because this review is running longer than two
>> years and I actually hope these patches to be finally merged at some point.
>> But you are welcome to optimize them with follow up patches :)
> 
> I appreciate that and thanks a lot for your effort. My attempt to review and
> test here is to help in getting this merged.
> 
> FWIW my main concern is about duplicating the range check in
> ftrace_call_replace() and using magic values that already exist in
> __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there.

Your patch in addition to Alexander's patch work for me as well, so feel
free to add a:

Tested-by: Florian Fainelli <f.fainelli@gmail.com>

FWIW, what is nice about Alexander's original patch is that it applies
relatively cleanly to older kernels as well where this is equally
needed. There is not currently any Fixes: tag being provided but maybe
we should amend the second patch with one?

Thanks!

> 
> Thanks
> 
> --
> Qais Yousef
> 
> ----->8------
> 
> 
> diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h
> index a4dbac07e4ef..8545b3ff8317 100644
> --- a/arch/arm/include/asm/ftrace.h
> +++ b/arch/arm/include/asm/ftrace.h
> @@ -25,6 +25,27 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
>  	/* With Thumb-2, the recorded addresses have the lsb set */
>  	return addr & ~1;
>  }
> +
> +#ifdef CONFIG_ARM_MODULE_PLTS
> +static inline void ftrace_set_mod(struct dyn_arch_ftrace *arch, struct module *mod)
> +{
> +	arch->mod = mod;
> +}
> +
> +static inline struct module *ftrace_get_mod(struct dyn_arch_ftrace *arch)
> +{
> +	return arch->mod;
> +}
> +#else
> +static inline void ftrace_set_mod(struct dyn_arch_ftrace *arch, struct module *mod)
> +{
> +}
> +
> +static inline struct module *ftrace_get_mod(struct dyn_arch_ftrace *arch)
> +{
> +	return NULL;
> +}
> +#endif
>  #endif
>  
>  #endif
> diff --git a/arch/arm/include/asm/insn.h b/arch/arm/include/asm/insn.h
> index f20e08ac85ae..71c3edefe629 100644
> --- a/arch/arm/include/asm/insn.h
> +++ b/arch/arm/include/asm/insn.h
> @@ -13,18 +13,24 @@ arm_gen_nop(void)
>  }
>  
>  unsigned long
> -__arm_gen_branch(unsigned long pc, unsigned long addr, bool link);
> +__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check);
>  
>  static inline unsigned long
>  arm_gen_branch(unsigned long pc, unsigned long addr)
>  {
> -	return __arm_gen_branch(pc, addr, false);
> +	return __arm_gen_branch(pc, addr, false, true);
>  }
>  
>  static inline unsigned long
>  arm_gen_branch_link(unsigned long pc, unsigned long addr)
>  {
> -	return __arm_gen_branch(pc, addr, true);
> +	return __arm_gen_branch(pc, addr, true, true);
> +}
> +
> +static inline unsigned long
> +arm_gen_branch_link_nocheck(unsigned long pc, unsigned long addr)
> +{
> +	return __arm_gen_branch(pc, addr, true, false);
>  }
>  
>  #endif
> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index fa867a57100f..63ea34edd222 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -70,20 +70,28 @@ int ftrace_arch_code_modify_post_process(void)
>  
>  static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr)
>  {
> -	s32 offset = addr - pc;
> -	s32 blim = 0xfe000008;
> -	s32 flim = 0x02000004;
> +	return arm_gen_branch_link(pc, addr);
> +}
>  
> -	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
> -		blim = 0xff000004;
> -		flim = 0x01000002;
> -	}
> +static unsigned long
> +ftrace_call_replace_mod(struct module *mod, unsigned long pc, unsigned long addr)
> +{
> +#ifdef CONFIG_ARM_MODULE_PLTS
> +	unsigned long new;
>  
> -	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) &&
> -	    (offset < blim || offset > flim))
> -		return 0;
> +	if (likely(!mod))
> +		return arm_gen_branch_link(pc, addr);
>  
> +	new = arm_gen_branch_link_nocheck(pc, addr);
> +	if (!new) {
> +		addr = get_module_plt(mod, pc, addr);
> +		new = arm_gen_branch_link(pc, addr);
> +	}
> +
> +	return new;
> +#else
>  	return arm_gen_branch_link(pc, addr);
> +#endif
>  }
>  
>  static int ftrace_modify_code(unsigned long pc, unsigned long old,
> @@ -141,18 +149,7 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
>  
>  	old = ftrace_nop_replace(rec);
>  
> -	new = ftrace_call_replace(ip, aaddr);
> -
> -#ifdef CONFIG_ARM_MODULE_PLTS
> -	if (!new) {
> -		struct module *mod = rec->arch.mod;
> -
> -		if (mod) {
> -			aaddr = get_module_plt(mod, ip, aaddr);
> -			new = ftrace_call_replace(ip, aaddr);
> -		}
> -	}
> -#endif
> +	new = ftrace_call_replace_mod(ftrace_get_mod(&rec->arch), ip, aaddr);
>  
>  	return ftrace_modify_code(rec->ip, old, new, true);
>  }
> @@ -183,23 +180,11 @@ int ftrace_make_nop(struct module *mod,
>  	unsigned long new;
>  	int ret;
>  
> -#ifdef CONFIG_ARM_MODULE_PLTS
>  	/* mod is only supplied during module loading */
> -	if (!mod)
> -		mod = rec->arch.mod;
> -	else
> -		rec->arch.mod = mod;
> -#endif
> -
> -	old = ftrace_call_replace(ip, aaddr);
> -
> -#ifdef CONFIG_ARM_MODULE_PLTS
> -	if (!old && mod) {
> -		aaddr = get_module_plt(mod, ip, aaddr);
> -		old = ftrace_call_replace(ip, aaddr);
> -	}
> -#endif
> +	if (mod)
> +		ftrace_set_mod(&rec->arch, mod);
>  
> +	old = ftrace_call_replace_mod(ftrace_get_mod(&rec->arch), ip, aaddr);
>  	new = ftrace_nop_replace(rec);
>  	ret = ftrace_modify_code(ip, old, new, true);
>  
> diff --git a/arch/arm/kernel/insn.c b/arch/arm/kernel/insn.c
> index 2e844b70386b..37ec5734309e 100644
> --- a/arch/arm/kernel/insn.c
> +++ b/arch/arm/kernel/insn.c
> @@ -4,7 +4,7 @@
>  #include <asm/opcodes.h>
>  
>  static unsigned long
> -__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
> +__arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link, bool check)
>  {
>  	unsigned long s, j1, j2, i1, i2, imm10, imm11;
>  	unsigned long first, second;
> @@ -12,7 +12,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
>  
>  	offset = (long)addr - (long)(pc + 4);
>  	if (offset < -16777216 || offset > 16777214) {
> -		WARN_ON_ONCE(1);
> +		WARN_ON_ONCE(check);
>  		return 0;
>  	}
>  
> @@ -34,7 +34,7 @@ __arm_gen_branch_thumb2(unsigned long pc, unsigned long addr, bool link)
>  }
>  
>  static unsigned long
> -__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
> +__arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link, bool check)
>  {
>  	unsigned long opcode = 0xea000000;
>  	long offset;
> @@ -44,7 +44,7 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
>  
>  	offset = (long)addr - (long)(pc + 8);
>  	if (unlikely(offset < -33554432 || offset > 33554428)) {
> -		WARN_ON_ONCE(1);
> +		WARN_ON_ONCE(check);
>  		return 0;
>  	}
>  
> @@ -54,10 +54,10 @@ __arm_gen_branch_arm(unsigned long pc, unsigned long addr, bool link)
>  }
>  
>  unsigned long
> -__arm_gen_branch(unsigned long pc, unsigned long addr, bool link)
> +__arm_gen_branch(unsigned long pc, unsigned long addr, bool link, bool check)
>  {
>  	if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
> -		return __arm_gen_branch_thumb2(pc, addr, link);
> +		return __arm_gen_branch_thumb2(pc, addr, link, check);
>  	else
> -		return __arm_gen_branch_arm(pc, addr, link);
> +		return __arm_gen_branch_arm(pc, addr, link, check);
>  }
>
Alexander Sverdlin March 24, 2021, 9:04 a.m. UTC | #18
Hi Qais,

On 23/03/2021 23:22, Qais Yousef wrote:
>>> Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply
>>> CONFIG_ARM_MODULE_PLTS is enabled too.
>>>
>>> It only has an impact on reducing ifdefery when calling
>>>
>>> 	ftrace_call_replace_mod(rec->arch.mod, ...)
>>>
>>> Should be easy to wrap rec->arch.mod with its own accessor that will return
>>> NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions.
>>>
>>> Up to Alexander to pick what he prefers :-)
>> well, I of course prefer v7 as-is, because this review is running longer than two
>> years and I actually hope these patches to be finally merged at some point.
>> But you are welcome to optimize them with follow up patches :)
> I appreciate that and thanks a lot for your effort. My attempt to review and
> test here is to help in getting this merged.
> 
> FWIW my main concern is about duplicating the range check in
> ftrace_call_replace() and using magic values that already exist in
> __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there.

could you please check the negative limits? I have an opinion, my limits are
correct. I could add extra parameter to arm_gen_branch_link(), but for this
I first need to fix its negative limits, which, I believe, well... Approximate :)
Qais Yousef March 24, 2021, 3:57 p.m. UTC | #19
Hey Alexander

On 03/24/21 10:04, Alexander Sverdlin wrote:
> Hi Qais,
> 
> On 23/03/2021 23:22, Qais Yousef wrote:
> >>> Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply
> >>> CONFIG_ARM_MODULE_PLTS is enabled too.
> >>>
> >>> It only has an impact on reducing ifdefery when calling
> >>>
> >>> 	ftrace_call_replace_mod(rec->arch.mod, ...)
> >>>
> >>> Should be easy to wrap rec->arch.mod with its own accessor that will return
> >>> NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions.
> >>>
> >>> Up to Alexander to pick what he prefers :-)
> >> well, I of course prefer v7 as-is, because this review is running longer than two
> >> years and I actually hope these patches to be finally merged at some point.
> >> But you are welcome to optimize them with follow up patches :)
> > I appreciate that and thanks a lot for your effort. My attempt to review and
> > test here is to help in getting this merged.
> > 
> > FWIW my main concern is about duplicating the range check in
> > ftrace_call_replace() and using magic values that already exist in
> > __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there.
> 
> could you please check the negative limits? I have an opinion, my limits are
> correct. I could add extra parameter to arm_gen_branch_link(), but for this
> I first need to fix its negative limits, which, I believe, well... Approximate :)

Can you elaborate please?

If you look at Arm ARM [1] the ranges are defined in page 347

	Encoding T1 Even numbers in the range –16777216 to 16777214.
	Encoding T2 Multiples of 4 in the range –16777216 to 16777212.
	Encoding A1 Multiples of 4 in the range –33554432 to 33554428.
	Encoding A2 Even numbers in the range –33554432 to 33554430.

which matches what's in the code (T1 for thumb2 and A1 for arm).

Why do you think it's wrong?

Thanks

--
Qais Yousef

[1] https://developer.arm.com/documentation/ddi0406/latest/
Qais Yousef March 24, 2021, 4:10 p.m. UTC | #20
Hi Florian

On 03/23/21 20:37, Florian Fainelli wrote:
> Hi Qais,
> 
> On 3/23/2021 3:22 PM, Qais Yousef wrote:
> > Hi Alexander
> > 
> > On 03/22/21 18:02, Alexander Sverdlin wrote:
> >> Hi Qais,
> >>
> >> On 22/03/2021 17:32, Qais Yousef wrote:
> >>> Yes you're right. I was a bit optimistic on CONFIG_DYNAMIC_FTRACE will imply
> >>> CONFIG_ARM_MODULE_PLTS is enabled too.
> >>>
> >>> It only has an impact on reducing ifdefery when calling
> >>>
> >>> 	ftrace_call_replace_mod(rec->arch.mod, ...)
> >>>
> >>> Should be easy to wrap rec->arch.mod with its own accessor that will return
> >>> NULL if !CONFIG_ARM_MODULE_PLTS or just ifdef the functions.
> >>>
> >>> Up to Alexander to pick what he prefers :-)
> >>
> >> well, I of course prefer v7 as-is, because this review is running longer than two
> >> years and I actually hope these patches to be finally merged at some point.
> >> But you are welcome to optimize them with follow up patches :)
> > 
> > I appreciate that and thanks a lot for your effort. My attempt to review and
> > test here is to help in getting this merged.
> > 
> > FWIW my main concern is about duplicating the range check in
> > ftrace_call_replace() and using magic values that already exist in
> > __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there.
> 
> Your patch in addition to Alexander's patch work for me as well, so feel
> free to add a:
> 
> Tested-by: Florian Fainelli <f.fainelli@gmail.com>
> 
> FWIW, what is nice about Alexander's original patch is that it applies
> relatively cleanly to older kernels as well where this is equally

How old are we talking? Was the conflict that bad for the stable maintainers to
deal with it? ie: would it require sending the backport separately?

> needed. There is not currently any Fixes: tag being provided but maybe
> we should amend the second patch with one?

I'm not sure if this will be considered new feature or a bug fix. FWIW,
tagging it for stable sounds reasonable to me.

Thanks!

--
Qais Yosuef
Alexander Sverdlin March 24, 2021, 4:33 p.m. UTC | #21
Hello Qais,

On 24/03/2021 16:57, Qais Yousef wrote:
>>> FWIW my main concern is about duplicating the range check in
>>> ftrace_call_replace() and using magic values that already exist in
>>> __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there.
>> could you please check the negative limits? I have an opinion, my limits are
>> correct. I could add extra parameter to arm_gen_branch_link(), but for this
>> I first need to fix its negative limits, which, I believe, well... Approximate :)
> Can you elaborate please?
> 
> If you look at Arm ARM [1] the ranges are defined in page 347
> 
> 	Encoding T1 Even numbers in the range –16777216 to 16777214.
> 	Encoding T2 Multiples of 4 in the range –16777216 to 16777212.
> 	Encoding A1 Multiples of 4 in the range –33554432 to 33554428.
> 	Encoding A2 Even numbers in the range –33554432 to 33554430.
> 
> which matches what's in the code (T1 for thumb2 and A1 for arm).
> 
> Why do you think it's wrong?

thanks for checking this! I'll re-send v8 with your proposal.
Qais Yousef March 24, 2021, 4:46 p.m. UTC | #22
On 03/24/21 17:33, Alexander Sverdlin wrote:
> Hello Qais,
> 
> On 24/03/2021 16:57, Qais Yousef wrote:
> >>> FWIW my main concern is about duplicating the range check in
> >>> ftrace_call_replace() and using magic values that already exist in
> >>> __arm_gen_branch_{arm, thumb2}() and better remain encapsulated there.
> >> could you please check the negative limits? I have an opinion, my limits are
> >> correct. I could add extra parameter to arm_gen_branch_link(), but for this
> >> I first need to fix its negative limits, which, I believe, well... Approximate :)
> > Can you elaborate please?
> > 
> > If you look at Arm ARM [1] the ranges are defined in page 347
> > 
> > 	Encoding T1 Even numbers in the range –16777216 to 16777214.
> > 	Encoding T2 Multiples of 4 in the range –16777216 to 16777212.
> > 	Encoding A1 Multiples of 4 in the range –33554432 to 33554428.
> > 	Encoding A2 Even numbers in the range –33554432 to 33554430.
> > 
> > which matches what's in the code (T1 for thumb2 and A1 for arm).
> > 
> > Why do you think it's wrong?
> 
> thanks for checking this! I'll re-send v8 with your proposal.

If you felt some details need tweaking I don't mind, my proposal was an attempt
to help rather than impose.

Thanks!

--
Qais Yousef
diff mbox series

Patch

diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h
index 48ec1d0..a4dbac0 100644
--- a/arch/arm/include/asm/ftrace.h
+++ b/arch/arm/include/asm/ftrace.h
@@ -15,6 +15,9 @@  extern void __gnu_mcount_nc(void);
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 struct dyn_arch_ftrace {
+#ifdef CONFIG_ARM_MODULE_PLTS
+	struct module *mod;
+#endif
 };
 
 static inline unsigned long ftrace_call_adjust(unsigned long addr)
diff --git a/arch/arm/include/asm/module.h b/arch/arm/include/asm/module.h
index 09b9ad5..cfffae6 100644
--- a/arch/arm/include/asm/module.h
+++ b/arch/arm/include/asm/module.h
@@ -30,6 +30,7 @@  struct plt_entries {
 
 struct mod_plt_sec {
 	struct elf32_shdr	*plt;
+	struct plt_entries	*plt_ent;
 	int			plt_count;
 };
 
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index 9a79ef6..fa867a5 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -70,6 +70,19 @@  int ftrace_arch_code_modify_post_process(void)
 
 static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr)
 {
+	s32 offset = addr - pc;
+	s32 blim = 0xfe000008;
+	s32 flim = 0x02000004;
+
+	if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
+		blim = 0xff000004;
+		flim = 0x01000002;
+	}
+
+	if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) &&
+	    (offset < blim || offset > flim))
+		return 0;
+
 	return arm_gen_branch_link(pc, addr);
 }
 
@@ -124,10 +137,22 @@  int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 {
 	unsigned long new, old;
 	unsigned long ip = rec->ip;
+	unsigned long aaddr = adjust_address(rec, addr);
 
 	old = ftrace_nop_replace(rec);
 
-	new = ftrace_call_replace(ip, adjust_address(rec, addr));
+	new = ftrace_call_replace(ip, aaddr);
+
+#ifdef CONFIG_ARM_MODULE_PLTS
+	if (!new) {
+		struct module *mod = rec->arch.mod;
+
+		if (mod) {
+			aaddr = get_module_plt(mod, ip, aaddr);
+			new = ftrace_call_replace(ip, aaddr);
+		}
+	}
+#endif
 
 	return ftrace_modify_code(rec->ip, old, new, true);
 }
@@ -152,12 +177,29 @@  int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 int ftrace_make_nop(struct module *mod,
 		    struct dyn_ftrace *rec, unsigned long addr)
 {
+	unsigned long aaddr = adjust_address(rec, addr);
 	unsigned long ip = rec->ip;
 	unsigned long old;
 	unsigned long new;
 	int ret;
 
-	old = ftrace_call_replace(ip, adjust_address(rec, addr));
+#ifdef CONFIG_ARM_MODULE_PLTS
+	/* mod is only supplied during module loading */
+	if (!mod)
+		mod = rec->arch.mod;
+	else
+		rec->arch.mod = mod;
+#endif
+
+	old = ftrace_call_replace(ip, aaddr);
+
+#ifdef CONFIG_ARM_MODULE_PLTS
+	if (!old && mod) {
+		aaddr = get_module_plt(mod, ip, aaddr);
+		old = ftrace_call_replace(ip, aaddr);
+	}
+#endif
+
 	new = ftrace_nop_replace(rec);
 	ret = ftrace_modify_code(ip, old, new, true);
 
diff --git a/arch/arm/kernel/module-plts.c b/arch/arm/kernel/module-plts.c
index d330e9e..a0524ad 100644
--- a/arch/arm/kernel/module-plts.c
+++ b/arch/arm/kernel/module-plts.c
@@ -4,6 +4,7 @@ 
  */
 
 #include <linux/elf.h>
+#include <linux/ftrace.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/sort.h>
@@ -20,19 +21,52 @@ 
 						    (PLT_ENT_STRIDE - 8))
 #endif
 
+static const u32 fixed_plts[] = {
+#ifdef CONFIG_FUNCTION_TRACER
+	FTRACE_ADDR,
+	MCOUNT_ADDR,
+#endif
+};
+
 static bool in_init(const struct module *mod, unsigned long loc)
 {
 	return loc - (u32)mod->init_layout.base < mod->init_layout.size;
 }
 
+static void prealloc_fixed(struct mod_plt_sec *pltsec, struct plt_entries *plt)
+{
+	int i;
+
+	if (!ARRAY_SIZE(fixed_plts) || pltsec->plt_count)
+		return;
+	pltsec->plt_count = ARRAY_SIZE(fixed_plts);
+
+	for (i = 0; i < ARRAY_SIZE(plt->ldr); ++i)
+		plt->ldr[i] = PLT_ENT_LDR;
+
+	BUILD_BUG_ON(sizeof(fixed_plts) > sizeof(plt->lit));
+	memcpy(plt->lit, fixed_plts, sizeof(fixed_plts));
+}
+
 u32 get_module_plt(struct module *mod, unsigned long loc, Elf32_Addr val)
 {
 	struct mod_plt_sec *pltsec = !in_init(mod, loc) ? &mod->arch.core :
 							  &mod->arch.init;
+	struct plt_entries *plt;
+	int idx;
+
+	/* cache the address, ELF header is available only during module load */
+	if (!pltsec->plt_ent)
+		pltsec->plt_ent = (struct plt_entries *)pltsec->plt->sh_addr;
+	plt = pltsec->plt_ent;
 
-	struct plt_entries *plt = (struct plt_entries *)pltsec->plt->sh_addr;
-	int idx = 0;
+	prealloc_fixed(pltsec, plt);
+
+	for (idx = 0; idx < ARRAY_SIZE(fixed_plts); ++idx)
+		if (plt->lit[idx] == val)
+			return (u32)&plt->ldr[idx];
 
+	idx = 0;
 	/*
 	 * Look for an existing entry pointing to 'val'. Given that the
 	 * relocations are sorted, this will be the last entry we allocated.
@@ -180,8 +214,8 @@  static unsigned int count_plts(const Elf32_Sym *syms, Elf32_Addr base,
 int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 			      char *secstrings, struct module *mod)
 {
-	unsigned long core_plts = 0;
-	unsigned long init_plts = 0;
+	unsigned long core_plts = ARRAY_SIZE(fixed_plts);
+	unsigned long init_plts = ARRAY_SIZE(fixed_plts);
 	Elf32_Shdr *s, *sechdrs_end = sechdrs + ehdr->e_shnum;
 	Elf32_Sym *syms = NULL;
 
@@ -236,6 +270,7 @@  int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	mod->arch.core.plt->sh_size = round_up(core_plts * PLT_ENT_SIZE,
 					       sizeof(struct plt_entries));
 	mod->arch.core.plt_count = 0;
+	mod->arch.core.plt_ent = NULL;
 
 	mod->arch.init.plt->sh_type = SHT_NOBITS;
 	mod->arch.init.plt->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
@@ -243,6 +278,7 @@  int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs,
 	mod->arch.init.plt->sh_size = round_up(init_plts * PLT_ENT_SIZE,
 					       sizeof(struct plt_entries));
 	mod->arch.init.plt_count = 0;
+	mod->arch.init.plt_ent = NULL;
 
 	pr_debug("%s: plt=%x, init.plt=%x\n", __func__,
 		 mod->arch.core.plt->sh_size, mod->arch.init.plt->sh_size);