
arm64: ftrace: stop using kstop_machine to enable/disable tracing

Message ID 1448697009-17211-1-git-send-email-huawei.libin@huawei.com (mailing list archive)
State New, archived

Commit Message

Li Bin Nov. 28, 2015, 7:50 a.m. UTC
On arm64, kstop_machine, which is hugely disruptive to a running
system, is not needed to convert nops to ftrace calls or back,
because the modified code is a single 32-bit instruction which
cannot cross cache-line (or page) boundaries, and the str
instruction used to write it is single-copy atomic.

Cc: <stable@vger.kernel.org> # 3.18+
Signed-off-by: Li Bin <huawei.libin@huawei.com>
---
 arch/arm64/kernel/ftrace.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

Comments

Steven Rostedt Nov. 28, 2015, 3:58 p.m. UTC | #1
On Sat, 28 Nov 2015 15:50:09 +0800
Li Bin <huawei.libin@huawei.com> wrote:

> On arm64, kstop_machine which is hugely disruptive to a running
> system is not needed to convert nops to ftrace calls or back,
> because that modifed code is a single 32bit instructions which
> is impossible to cross cache (or page) boundaries, and the used str
> instruction is single-copy atomic.

Is this really true? I thought that arm (and perhaps arm64) has
some 2-byte instructions. If that's the case, it is quite possible
that a 4-byte instruction can cross cache lines.

-- Steve

> 
> Cc: <stable@vger.kernel.org> # 3.18+
> Signed-off-by: Li Bin <huawei.libin@huawei.com>
> ---
>  arch/arm64/kernel/ftrace.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
Li Bin Nov. 30, 2015, 2:03 a.m. UTC | #2
on 2015/11/28 23:58, Steven Rostedt wrote:
> On Sat, 28 Nov 2015 15:50:09 +0800
> Li Bin <huawei.libin@huawei.com> wrote:
>
>> On arm64, kstop_machine which is hugely disruptive to a running
>> system is not needed to convert nops to ftrace calls or back,
>> because that modifed code is a single 32bit instructions which
>> is impossible to cross cache (or page) boundaries, and the used str
>> instruction is single-copy atomic.
> Is this really true? I thought that arm (and then perhaps arm64) has
> some 2 byte instructions. If that's the case it is very well possible
> that a 4 byte instruction can cross cache lines.

When the system is in AArch32 state, it uses the A32 or T32 instruction
set, and T32 (Thumb) has 16-bit instructions. But arm64 in AArch64 state
only uses the A64 instruction set, which is a clean, fixed-length
instruction set whose instructions are always 32 bits wide. Right?

Thanks,
Li Bin

> -- Steve
>
>> Cc: <stable@vger.kernel.org> # 3.18+
>> Signed-off-by: Li Bin <huawei.libin@huawei.com>
>> ---
>>  arch/arm64/kernel/ftrace.c |    5 +++++
>>  1 files changed, 5 insertions(+), 0 deletions(-)
> .
>
Will Deacon Dec. 2, 2015, 12:36 p.m. UTC | #3
On Sat, Nov 28, 2015 at 03:50:09PM +0800, Li Bin wrote:
> On arm64, kstop_machine which is hugely disruptive to a running
> system is not needed to convert nops to ftrace calls or back,
> because that modifed code is a single 32bit instructions which
> is impossible to cross cache (or page) boundaries, and the used str
> instruction is single-copy atomic.

This commit message is misleading, since the single-copy atomicity
guarantees don't apply to the instruction-side. Instead, the architecture
calls out a handful of safe instructions in "Concurrent modification and
execution of instructions".

Now, those safe instructions *do* include NOP, B and BL, so that should
be sufficient for ftrace provided that we don't patch condition codes
(and I don't think we do).

> Cc: <stable@vger.kernel.org> # 3.18+

I don't think this is stable material.

Will

> Signed-off-by: Li Bin <huawei.libin@huawei.com>
> ---
>  arch/arm64/kernel/ftrace.c |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> index c851be7..9669b33 100644
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -93,6 +93,11 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
>  	return ftrace_modify_code(pc, old, new, true);
>  }
>  
> +void arch_ftrace_update_code(int command)
> +{
> +	ftrace_modify_all_code(command);
> +}
> +
>  int __init ftrace_dyn_arch_init(void)
>  {
>  	return 0;
> -- 
> 1.7.1
>
Will Deacon Dec. 2, 2015, 1:16 p.m. UTC | #4
On Wed, Dec 02, 2015 at 12:36:54PM +0000, Will Deacon wrote:
> On Sat, Nov 28, 2015 at 03:50:09PM +0800, Li Bin wrote:
> > On arm64, kstop_machine which is hugely disruptive to a running
> > system is not needed to convert nops to ftrace calls or back,
> > because that modifed code is a single 32bit instructions which
> > is impossible to cross cache (or page) boundaries, and the used str
> > instruction is single-copy atomic.
> 
> This commit message is misleading, since the single-copy atomicity
> guarantees don't apply to the instruction-side. Instead, the architecture
> calls out a handful of safe instructions in "Concurrent modification and
> execution of instructions".
> 
> Now, those safe instructions *do* include NOP, B and BL, so that should
> be sufficient for ftrace provided that we don't patch condition codes
> (and I don't think we do).

Thinking about this some more, you also need to fix the validate=1 case
in ftrace_modify_code so that it can run outside of stop_machine. We
currently rely on that to deal with concurrent modifications (e.g.
module unloading).

Will
Steven Rostedt Dec. 2, 2015, 2:02 p.m. UTC | #5
On Wed, 2 Dec 2015 12:36:55 +0000
Will Deacon <will.deacon@arm.com> wrote:


> > Cc: <stable@vger.kernel.org> # 3.18+  
> 
> I don't think this is stable material.
> 

No, it definitely isn't. This is a new feature, not a fix.

-- Steve
Li Bin Dec. 3, 2015, 9:21 a.m. UTC | #6
on 2015/12/2 20:36, Will Deacon wrote:
> On Sat, Nov 28, 2015 at 03:50:09PM +0800, Li Bin wrote:
>> On arm64, kstop_machine which is hugely disruptive to a running
>> system is not needed to convert nops to ftrace calls or back,
>> because that modifed code is a single 32bit instructions which
>> is impossible to cross cache (or page) boundaries, and the used str
>> instruction is single-copy atomic.
> This commit message is misleading, since the single-copy atomicity
> guarantees don't apply to the instruction-side. Instead, the architecture
> calls out a handful of safe instructions in "Concurrent modification and
> execution of instructions".

Right, thank you for your comments.

> Now, those safe instructions *do* include NOP, B and BL, so that should
> be sufficient for ftrace provided that we don't patch condition codes
> (and I don't think we do).

Yes, and so far this assumption has caused no problem, but in order to avoid
exceeding these safe instructions in the future, we could use
aarch64_insn_hotpatch_safe() to verify the instruction and determine whether it
needs stop_machine() to synchronize, or can use aarch64_insn_patch_text directly.
Is that right, or am I missing something?

Thanks,
Li Bin

>> Cc: <stable@vger.kernel.org> # 3.18+
> I don't think this is stable material.
>
> Will
>
>> Signed-off-by: Li Bin <huawei.libin@huawei.com>
>> ---
>>  arch/arm64/kernel/ftrace.c |    5 +++++
>>  1 files changed, 5 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
>> index c851be7..9669b33 100644
>> --- a/arch/arm64/kernel/ftrace.c
>> +++ b/arch/arm64/kernel/ftrace.c
>> @@ -93,6 +93,11 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
>>  	return ftrace_modify_code(pc, old, new, true);
>>  }
>>  
>> +void arch_ftrace_update_code(int command)
>> +{
>> +	ftrace_modify_all_code(command);
>> +}
>> +
>>  int __init ftrace_dyn_arch_init(void)
>>  {
>>  	return 0;
>> -- 
>> 1.7.1
>>
> .
>
Will Deacon Dec. 3, 2015, 9:38 a.m. UTC | #7
On Thu, Dec 03, 2015 at 05:21:22PM +0800, libin wrote:
> 
> on 2015/12/2 20:36, Will Deacon wrote:
> > On Sat, Nov 28, 2015 at 03:50:09PM +0800, Li Bin wrote:
> >> On arm64, kstop_machine which is hugely disruptive to a running
> >> system is not needed to convert nops to ftrace calls or back,
> >> because that modifed code is a single 32bit instructions which
> >> is impossible to cross cache (or page) boundaries, and the used str
> >> instruction is single-copy atomic.
> > This commit message is misleading, since the single-copy atomicity
> > guarantees don't apply to the instruction-side. Instead, the architecture
> > calls out a handful of safe instructions in "Concurrent modification and
> > execution of instructions".
> 
> Right, thank you for your comments.
> 
> > Now, those safe instructions *do* include NOP, B and BL, so that should
> > be sufficient for ftrace provided that we don't patch condition codes
> > (and I don't think we do).
> 
> Yes, and so far this assumption has no probem, but in order to avoid exceeding these
> safe insturctions in the future, we can use aarch64_insn_hotpatch_safe() to verify the
> instruction to determine whether needs stop_machine() to synchronize or use
> aarch64_insn_patch_text directly. Right or I am missing something?

I think you're missing the case where the instruction changes under our
feet after we've read it but before we've replaced it (e.g. due to module
unloading). I think that's why ftrace_modify_code has the comment about
lack of locking thanks to stop_machine.

Will
Li Bin Dec. 3, 2015, 9:39 a.m. UTC | #8
on 2015/12/2 21:16, Will Deacon wrote:
> On Wed, Dec 02, 2015 at 12:36:54PM +0000, Will Deacon wrote:
>> On Sat, Nov 28, 2015 at 03:50:09PM +0800, Li Bin wrote:
>>> On arm64, kstop_machine which is hugely disruptive to a running
>>> system is not needed to convert nops to ftrace calls or back,
>>> because that modifed code is a single 32bit instructions which
>>> is impossible to cross cache (or page) boundaries, and the used str
>>> instruction is single-copy atomic.
>> This commit message is misleading, since the single-copy atomicity
>> guarantees don't apply to the instruction-side. Instead, the architecture
>> calls out a handful of safe instructions in "Concurrent modification and
>> execution of instructions".
>>
>> Now, those safe instructions *do* include NOP, B and BL, so that should
>> be sufficient for ftrace provided that we don't patch condition codes
>> (and I don't think we do).
> Thinking about this some more, you also need to fix the validate=1 case
> in ftrace_modify_code so that it can run outside of stop_machine. We
> currently rely on that to deal with concurrent modifications (e.g.
> module unloading).

I'm not sure it is really a problem, but on x86, which uses the breakpoint
method, add_break(), which also runs outside of stop_machine, has similar code.

static int add_break(unsigned long ip, const char *old)
{
        unsigned char replaced[MCOUNT_INSN_SIZE];
        unsigned char brk = BREAKPOINT_INSTRUCTION;

        if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
                return -EFAULT;

        /* Make sure it is what we expect it to be */
        if (memcmp(replaced, old, MCOUNT_INSN_SIZE) != 0)
                return -EINVAL;

        return ftrace_write(ip, &brk, 1);
}

Or have I misunderstood what you mean?

Thanks,
Li Bin

> Will
>
> .
>
Will Deacon Dec. 3, 2015, 11:48 a.m. UTC | #9
On Thu, Dec 03, 2015 at 05:39:56PM +0800, libin wrote:
> on 2015/12/2 21:16, Will Deacon wrote:
> > On Wed, Dec 02, 2015 at 12:36:54PM +0000, Will Deacon wrote:
> >> On Sat, Nov 28, 2015 at 03:50:09PM +0800, Li Bin wrote:
> >>> On arm64, kstop_machine which is hugely disruptive to a running
> >>> system is not needed to convert nops to ftrace calls or back,
> >>> because that modifed code is a single 32bit instructions which
> >>> is impossible to cross cache (or page) boundaries, and the used str
> >>> instruction is single-copy atomic.
> >> This commit message is misleading, since the single-copy atomicity
> >> guarantees don't apply to the instruction-side. Instead, the architecture
> >> calls out a handful of safe instructions in "Concurrent modification and
> >> execution of instructions".
> >>
> >> Now, those safe instructions *do* include NOP, B and BL, so that should
> >> be sufficient for ftrace provided that we don't patch condition codes
> >> (and I don't think we do).
> > Thinking about this some more, you also need to fix the validate=1 case
> > in ftrace_modify_code so that it can run outside of stop_machine. We
> > currently rely on that to deal with concurrent modifications (e.g.
> > module unloading).
> 
> I'm not sure it is really a problem, but on x86, which using breakpoints method,
> add_break() that run outside of stop_machine also has similar code.

Yeah, having now read through that, I also can't see any locking issues.
We should remove the comment suggesting otherwise.

> static int add_break(unsigned long ip, const char *old)
> {
>         unsigned char replaced[MCOUNT_INSN_SIZE];
>         unsigned char brk = BREAKPOINT_INSTRUCTION;
> 
>         if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
>                 return -EFAULT;
> 
>         /* Make sure it is what we expect it to be */
>         if (memcmp(replaced, old, MCOUNT_INSN_SIZE) != 0)
>                 return -EINVAL;
> 
>         return ftrace_write(ip, &brk, 1);
> }
> 
> Or I misunderstand what you mean?

Hmm, so this should all be fine if we exclusively use the probe_kernel_*
functions and handle the -EFAULT gracefully. Now, that leaves an
interesting scenario with the flush_icache_range call in
aarch64_insn_patch_text_nosync, since that's not run with
KERNEL_DS/pagefault_disable() and so we'll panic if the text disappears
underneath us.

So we probably need to add that code and call __flush_cache_user_range
instead.

What do you think?

Will
Steven Rostedt Dec. 3, 2015, 3:05 p.m. UTC | #10
On Thu, 3 Dec 2015 09:38:21 +0000
Will Deacon <will.deacon@arm.com> wrote:


> I think you're missing the case where the instruction changes under our
> feet after we've read it but before we've replaced it (e.g. due to module
> unloading). I think that's why ftrace_modify_code has the comment about
> lack of locking thanks to stop_machine.

Note, ftrace has a module notifier that is called when a module is
being unloaded and before the text goes away. This code grabs the
ftrace_lock mutex and removes the module functions from the ftrace
list, such that it will no longer do any modifications to that module's
text.

The update to make functions be traced is done under the ftrace_lock
mutex as well.

You do not need to worry about module text disappearing from
underneath you while you do your modifications.

Now, if there's comments that suggest otherwise, they need to be
updated.

-- Steve
Steven Rostedt Dec. 3, 2015, 3:07 p.m. UTC | #11
On Thu, 3 Dec 2015 11:48:24 +0000
Will Deacon <will.deacon@arm.com> wrote:

> Hmm, so this should all be fine if we exclusively use the probe_kernel_*
> functions and handle the -EFAULT gracefully. Now, that leaves an
> interesting scenario with the flush_icache_range call in
> aarch64_insn_patch_text_nosync, since that's not run with
> KERNEL_DS/pagefault_disable() and so we'll panic if the text disappears
> underneath us.

Nothing should remove the text from underneath you if everything
matches up fine before that. Module unloading will block on the
ftrace_lock if it has any functions that can be traced.

-- Steve


> 
> So we probably need to add that code and call __flush_cache_user_range
> instead.
> 
> What do you think?
> 
> Will
Will Deacon Dec. 3, 2015, 3:09 p.m. UTC | #12
On Thu, Dec 03, 2015 at 10:05:25AM -0500, Steven Rostedt wrote:
> On Thu, 3 Dec 2015 09:38:21 +0000
> Will Deacon <will.deacon@arm.com> wrote:
> > I think you're missing the case where the instruction changes under our
> > feet after we've read it but before we've replaced it (e.g. due to module
> > unloading). I think that's why ftrace_modify_code has the comment about
> > lack of locking thanks to stop_machine.
> 
> Note, ftrace has a module notifier that is called when a module is
> being unloaded and before the text goes away. This code grabs the
> ftrace_lock mutex and removes the module functions from the ftrace
> list, such that it will no longer do any modifications to that module's
> text.
> 
> The update to make functions be traced is done under the ftrace_lock
> mutex as well.
> 
> You do not need to worry about module text disappearing from
> underneath you while you do your modifications.

Good.

> Now, if there's comments that suggest otherwise, they need to be
> updated.

Yeah, I think the comments on x86 and arm64 are out of date. They also
mention the freeing of __init sections -- is that still a concern?

Will
Steven Rostedt Dec. 3, 2015, 3:31 p.m. UTC | #13
On Thu, 3 Dec 2015 15:09:26 +0000
Will Deacon <will.deacon@arm.com> wrote:

> Yeah, I think the comments on x86 and arm64 are out of date. They also
> mention the freeing of __init sections -- is that still a concern?

No, we blacklist them: for any section that we are not sure will be there
when we expect it to be, recordmcount.c nops out the calls to mcount and
they are ignored. Remember the e1000e bug? That was the reproducer.

Anyway, on my todo list is to allow init sections to be traced. To do
so would mean that I need to add generic code that lets ftrace know to
remove init sections at boot up. Right now (or at least the last time I
checked, which was back in 2009), every arch had its own way of freeing
init memory. If that has changed, or I can just place a hook where it
happens (which is probably the easy part), I can allow init code to be
traced too.

-- Steve
Li Bin Dec. 4, 2015, 1 a.m. UTC | #14
Thanks very much to Will and Steve for the wonderful comments. I will modify
the commit message, and remove the misleading comments about the module text
disappearing case.

Thanks again,
Li Bin

on 2015/12/3 23:31, Steven Rostedt wrote:
> On Thu, 3 Dec 2015 15:09:26 +0000
> Will Deacon <will.deacon@arm.com> wrote:
>
>> Yeah, I think the comments on x86 and arm64 are out of date. They also
>> mention the freeing of __init sections -- is that still a concern?
> No we black list them, any section that we are not sure will be there
> when we expect it to has recordmcount.c nop out the calls to mcount and
> they are ignored. Remember the e1000e bug? This was the reproducer.
>
> Anyway, on my todo list is to allow init sections to be traced. To do
> so would mean that I need to add generic code that lets ftrace know to
> remove init sections at boot up. Right now (or at least the last time I
> checked, which was back in 2009), every arch had its own way of freeing
> init memory. If that has changed, or I can just place a hook where it
> happens (which is probably the easy part), I can allow init code to be
> traced too.
>
> -- Steve
>
>
> .
>

Patch

diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index c851be7..9669b33 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -93,6 +93,11 @@  int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
 	return ftrace_modify_code(pc, old, new, true);
 }
 
+void arch_ftrace_update_code(int command)
+{
+	ftrace_modify_all_code(command);
+}
+
 int __init ftrace_dyn_arch_init(void)
 {
 	return 0;