[v15,04/10] arm64: Kprobes with single stepping support

Message ID 57A2C8DF.7050401@linaro.org (mailing list archive)
State New, archived

Commit Message

David Long Aug. 4, 2016, 4:47 a.m. UTC
On 07/29/2016 05:01 AM, Daniel Thompson wrote:
> On 28/07/16 15:40, Catalin Marinas wrote:
>> On Wed, Jul 27, 2016 at 06:13:37PM -0400, David Long wrote:
>>> On 07/27/2016 07:50 AM, Daniel Thompson wrote:
>>>> On 25/07/16 23:27, David Long wrote:
>>>>> On 07/25/2016 01:13 PM, Catalin Marinas wrote:
>>>>>> The problem is that the original design was done on x86 for its 
>>>>>> PCS and
>>>>>> it doesn't always fit other architectures. So we could either 
>>>>>> ignore the
>>>>>> problem, hoping that no probed function requires argument passing on
>>>>>> stack or we copy all the valid data on the kernel stack:
>>>>>>
>>>>>> diff --git a/arch/arm64/include/asm/kprobes.h
>>>>>> b/arch/arm64/include/asm/kprobes.h
>>>>>> index 61b49150dfa3..157fd0d0aa08 100644
>>>>>> --- a/arch/arm64/include/asm/kprobes.h
>>>>>> +++ b/arch/arm64/include/asm/kprobes.h
>>>>>> @@ -22,7 +22,7 @@
>>>>>>
>>>>>>  #define __ARCH_WANT_KPROBES_INSN_SLOT
>>>>>>  #define MAX_INSN_SIZE            1
>>>>>> -#define MAX_STACK_SIZE            128
>>>>>> +#define MAX_STACK_SIZE            THREAD_SIZE
>>>>>>
>>>>>>  #define flush_insn_slot(p)        do { } while (0)
>>>>>>  #define kretprobe_blacklist_size    0
>>>>>
>>>>> I doubt the ARM PCS is unusual.  At any rate I'm certain there are 
>>>>> other
>>>>> architectures that pass aggregate parameters on the stack. I suspect
>>>>> other RISC(-ish) architectures have similar PCS issues and I think 
>>>>> this
>>>>> is at least a big part of where this simple copy with a 64/128 limit
>>>>> comes from, or at least why it continues to exist.  That said, I'm not
>>>>> enthusiastic about researching that assertion in detail as it could be
>>>>> time consuming.
>>>>
>>>> Given Mark shared a test program I *was* curious enough to take a look
>>>> at this.
>>>>
>>>> The only architecture I can find that behaves like arm64 with the
>>>> implicit pass-by-reference described by Catalin/Mark is sparc64.
>>>>
>>>> In contrast alpha, arm (32-bit), hppa64, mips64 and powerpc64 all use a
>>>> hybrid approach where the first fragments of the structure are 
>>>> passed in
>>>> registers and the remainder on the stack.
>>>
>>> That's interesting.  It also looks like sparc64 does not copy any 
>>> stack for
>>> jprobes. I guess that approach at least makes it clear what will and 
>>> won't
>>> work.
>>
>> I suggest we do the same for arm64 - avoid the copying entirely as it's
>> not safe anyway. We don't know how much to copy, nor can we be sure it
>> is safe (see Dave's DMA to the stack example). This would need to be
>> documented in the kprobes.txt file and MAX_STACK_SIZE removed from the
>> arm64 kprobes support.
>>
>> There is also the case that Daniel was talking about - passing more than
>> 8 arguments. I don't think it's worth handling this
> 
> It's actually quite hard to document the (architecture specific) "no big 
> structures" *and* the "8 argument" limits. It ends up as something like:
> 
>    Structures/unions >16 bytes must not be passed by value and the
>    size of all arguments, after padding each to an 8 byte boundary, must
>    be less than 64 bytes.
> 
> We cannot avoid tackling big structures through documentation but when 
> we impose additional limits like "only 8 arguments" we are swapping an 
> architecture neutral "gotcha" that affects almost all jprobes uses (and 
> can be inferred from the documentation) with an architecture specific one!
> 

See new patch below.  The documentation change in it could use some scrutiny.
I've tested with one-off jprobes functions in a test module and I've
verified NET_TCPPROBE doesn't cause misbehavior.

> 
>> but we should at
>> least add a warning and skip the probe:
>>
>> diff --git a/arch/arm64/kernel/probes/kprobes.c 
>> b/arch/arm64/kernel/probes/kprobes.c
>> index bf9768588288..84e02606ec3d 100644
>> --- a/arch/arm64/kernel/probes/kprobes.c
>> +++ b/arch/arm64/kernel/probes/kprobes.c
>> @@ -491,6 +491,10 @@ int __kprobes setjmp_pre_handler(struct kprobe 
>> *p, struct pt_regs *regs)
>>      struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
>>      long stack_ptr = kernel_stack_pointer(regs);
>>
>> +    /* do not allow arguments passed on the stack */
>> +    if (WARN_ON_ONCE(regs->sp != regs->regs[29]))
>> +        return 0;
>> +
> 
> I don't really understand this test.
> 
> If we could reliably assume that the frame record was at the lowest 
> address within a stack frame then we could exploit that to store the 
> stacked arguments without risking overwriting volatile variables on the 
> stack.
> 
> 
> Daniel.
> 

I'm assuming the consensus is to not use the above snippet of code.

Thanks,
-dl

----------cut here--------


From b451caa1adaf1d03e08a44b5dad3fca31cebd97a Mon Sep 17 00:00:00 2001
From: "David A. Long" <dave.long@linaro.org>
Date: Thu, 4 Aug 2016 00:35:33 -0400
Subject: [PATCH] arm64: Remove stack duplicating code from jprobes

Because the arm64 calling standard allows stacked function arguments to be
anywhere in the stack frame, do not attempt to duplicate the stack frame for
jprobes handler functions.

Signed-off-by: David A. Long <dave.long@linaro.org>
---
 Documentation/kprobes.txt          |  7 +++++++
 arch/arm64/include/asm/kprobes.h   |  2 --
 arch/arm64/kernel/probes/kprobes.c | 31 +++++--------------------------
 3 files changed, 12 insertions(+), 28 deletions(-)

Comments

Daniel Thompson Aug. 8, 2016, 11:13 a.m. UTC | #1
On 04/08/16 05:47, David Long wrote:
> From b451caa1adaf1d03e08a44b5dad3fca31cebd97a Mon Sep 17 00:00:00 2001
> From: "David A. Long" <dave.long@linaro.org>
> Date: Thu, 4 Aug 2016 00:35:33 -0400
> Subject: [PATCH] arm64: Remove stack duplicating code from jprobes
>
> Because the arm64 calling standard allows stacked function arguments to be
> anywhere in the stack frame, do not attempt to duplicate the stack frame for
> jprobes handler functions.
>
> Signed-off-by: David A. Long <dave.long@linaro.org>
> ---
>  Documentation/kprobes.txt          |  7 +++++++
>  arch/arm64/include/asm/kprobes.h   |  2 --
>  arch/arm64/kernel/probes/kprobes.c | 31 +++++--------------------------
>  3 files changed, 12 insertions(+), 28 deletions(-)
>
> diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
> index 1f9b3e2..bd01839 100644
> --- a/Documentation/kprobes.txt
> +++ b/Documentation/kprobes.txt
> @@ -103,6 +103,13 @@ Note that the probed function's args may be passed on the stack
>  or in registers.  The jprobe will work in either case, so long as the
>  handler's prototype matches that of the probed function.
>
> +Note that in some architectures (e.g.: arm64) the stack copy is not

Could sparc64 be added to this list?

   For the sparc folks who are new to the thread, we've previously
   established that the sparc64 ABI passes large structures by
   allocating them from the caller's stack frame and passing a pointer
   to the stack frame (i.e. arguments may not be at top of the stack).
   We also noticed that sparc code does not save/restore anything from
   the stack.


> +done, as the actual location of stacked parameters may be outside of
> +a reasonable MAX_STACK_SIZE value and because that location cannot be
> +determined by the jprobes code. In this case the jprobes user must be
> +careful to make certain the calling signature of the function does
> +not cause parameters to be passed on the stack.
> +
>  1.3 Return Probes
>
>  1.3.1 How Does a Return Probe Work?
> diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
> index 61b4915..1737aec 100644
> --- a/arch/arm64/include/asm/kprobes.h
> +++ b/arch/arm64/include/asm/kprobes.h
> @@ -22,7 +22,6 @@
>
>  #define __ARCH_WANT_KPROBES_INSN_SLOT
>  #define MAX_INSN_SIZE			1
> -#define MAX_STACK_SIZE			128
>
>  #define flush_insn_slot(p)		do { } while (0)
>  #define kretprobe_blacklist_size	0
> @@ -47,7 +46,6 @@ struct kprobe_ctlblk {
>  	struct prev_kprobe prev_kprobe;
>  	struct kprobe_step_ctx ss_ctx;
>  	struct pt_regs jprobe_saved_regs;
> -	char jprobes_stack[MAX_STACK_SIZE];
>  };
>
>  void arch_remove_kprobe(struct kprobe *);
> diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> index bf97685..c6b0f40 100644
> --- a/arch/arm64/kernel/probes/kprobes.c
> +++ b/arch/arm64/kernel/probes/kprobes.c
> @@ -41,18 +41,6 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
>  static void __kprobes
>  post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
>
> -static inline unsigned long min_stack_size(unsigned long addr)
> -{
> -	unsigned long size;
> -
> -	if (on_irq_stack(addr, raw_smp_processor_id()))
> -		size = IRQ_STACK_PTR(raw_smp_processor_id()) - addr;
> -	else
> -		size = (unsigned long)current_thread_info() + THREAD_START_SP - addr;
> -
> -	return min(size, FIELD_SIZEOF(struct kprobe_ctlblk, jprobes_stack));
> -}
> -
>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>  {
>  	/* prepare insn slot */
> @@ -489,20 +477,15 @@ int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
>  {
>  	struct jprobe *jp = container_of(p, struct jprobe, kp);
>  	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> -	long stack_ptr = kernel_stack_pointer(regs);
>
>  	kcb->jprobe_saved_regs = *regs;
>  	/*
> -	 * As Linus pointed out, gcc assumes that the callee
> -	 * owns the argument space and could overwrite it, e.g.
> -	 * tailcall optimization. So, to be absolutely safe
> -	 * we also save and restore enough stack bytes to cover
> -	 * the argument area.
> +	 * Since we can't be sure where in the stack frame "stacked"
> +	 * pass-by-value arguments are stored we just don't try to
> +	 * duplicate any of the stack.
 > ...
>                                       Do not use jprobes on functions that
> +	 * use more than 64 bytes (after padding each to an 8 byte boundary)
> +	 * of arguments, or pass individual arguments larger than 16 bytes.

I like this wording. So much so that it really would be great to repeat 
this in the Documentation/. Could this be included in the list of 
architecture support/restrictions?


Daniel.
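
To make the limit quoted above concrete, here is a minimal user-space sketch
in C of the rule as worded in the patch comment: no individual argument larger
than 16 bytes, and no more than 64 bytes of arguments once each is padded to
an 8 byte boundary (eight 8-byte registers). The names pad8() and
jprobe_args_fit() are hypothetical and do not exist in the kernel. The sketch
also ignores floating-point registers and the alignment rules of the real
procedure call standard, so it is only an approximation of what the compiler
does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: round a size up to the next 8 byte boundary. */
static size_t pad8(size_t size)
{
	return (size + 7) & ~(size_t)7;
}

/*
 * Hypothetical check of the documented jprobes limit for arm64.
 * Returns true when no argument is larger than 16 bytes and the padded
 * total does not exceed 64 bytes, i.e. when nothing should need to be
 * passed on the stack under the simplified rule from the patch comment.
 */
static bool jprobe_args_fit(const size_t *arg_sizes, size_t nargs)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nargs; i++) {
		if (arg_sizes[i] > 16)
			return false;	/* passed by reference to caller memory */
		total += pad8(arg_sizes[i]);
	}

	return total <= 64;
}
```

Eight pointer-sized arguments pass this check, a ninth does not, and a single
24 byte structure passed by value fails regardless of how many other
arguments there are.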
David Long Aug. 8, 2016, 2:29 p.m. UTC | #2
On 08/08/2016 07:13 AM, Daniel Thompson wrote:
> On 04/08/16 05:47, David Long wrote:
>> From b451caa1adaf1d03e08a44b5dad3fca31cebd97a Mon Sep 17 00:00:00 2001
>> From: "David A. Long" <dave.long@linaro.org>
>> Date: Thu, 4 Aug 2016 00:35:33 -0400
>> Subject: [PATCH] arm64: Remove stack duplicating code from jprobes
>>
>> Because the arm64 calling standard allows stacked function arguments
>> to be
>> anywhere in the stack frame, do not attempt to duplicate the stack
>> frame for
>> jprobes handler functions.
>>
>> Signed-off-by: David A. Long <dave.long@linaro.org>
>> ---
>>  Documentation/kprobes.txt          |  7 +++++++
>>  arch/arm64/include/asm/kprobes.h   |  2 --
>>  arch/arm64/kernel/probes/kprobes.c | 31 +++++--------------------------
>>  3 files changed, 12 insertions(+), 28 deletions(-)
>>
>> diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
>> index 1f9b3e2..bd01839 100644
>> --- a/Documentation/kprobes.txt
>> +++ b/Documentation/kprobes.txt
>> @@ -103,6 +103,13 @@ Note that the probed function's args may be
>> passed on the stack
>>  or in registers.  The jprobe will work in either case, so long as the
>>  handler's prototype matches that of the probed function.
>>
>> +Note that in some architectures (e.g.: arm64) the stack copy is not
>
> Could sparc64 be added to this list?
>
>    For the sparc folks who are new to the thread, we've previously
>    established that the sparc64 ABI passes large structures by
>    allocating them from the caller's stack frame and passing a pointer
>    to the stack frame (i.e. arguments may not be at top of the stack).
>    We also noticed that sparc code does not save/restore anything from
>    the stack.
>

I was reluctant to do that in the context of late changes to v4.8 for 
arm64 but now that any changes for this are going in as a new patch it 
would indeed be useful to get involvement from sparc maintainers.

>
>> +done, as the actual location of stacked parameters may be outside of
>> +a reasonable MAX_STACK_SIZE value and because that location cannot be
>> +determined by the jprobes code. In this case the jprobes user must be
>> +careful to make certain the calling signature of the function does
>> +not cause parameters to be passed on the stack.
>> +
>>  1.3 Return Probes
>>
>>  1.3.1 How Does a Return Probe Work?
>> diff --git a/arch/arm64/include/asm/kprobes.h
>> b/arch/arm64/include/asm/kprobes.h
>> index 61b4915..1737aec 100644
>> --- a/arch/arm64/include/asm/kprobes.h
>> +++ b/arch/arm64/include/asm/kprobes.h
>> @@ -22,7 +22,6 @@
>>
>>  #define __ARCH_WANT_KPROBES_INSN_SLOT
>>  #define MAX_INSN_SIZE            1
>> -#define MAX_STACK_SIZE            128
>>
>>  #define flush_insn_slot(p)        do { } while (0)
>>  #define kretprobe_blacklist_size    0
>> @@ -47,7 +46,6 @@ struct kprobe_ctlblk {
>>      struct prev_kprobe prev_kprobe;
>>      struct kprobe_step_ctx ss_ctx;
>>      struct pt_regs jprobe_saved_regs;
>> -    char jprobes_stack[MAX_STACK_SIZE];
>>  };
>>
>>  void arch_remove_kprobe(struct kprobe *);
>> diff --git a/arch/arm64/kernel/probes/kprobes.c
>> b/arch/arm64/kernel/probes/kprobes.c
>> index bf97685..c6b0f40 100644
>> --- a/arch/arm64/kernel/probes/kprobes.c
>> +++ b/arch/arm64/kernel/probes/kprobes.c
>> @@ -41,18 +41,6 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
>>  static void __kprobes
>>  post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
>>
>> -static inline unsigned long min_stack_size(unsigned long addr)
>> -{
>> -    unsigned long size;
>> -
>> -    if (on_irq_stack(addr, raw_smp_processor_id()))
>> -        size = IRQ_STACK_PTR(raw_smp_processor_id()) - addr;
>> -    else
>> -        size = (unsigned long)current_thread_info() + THREAD_START_SP
>> - addr;
>> -
>> -    return min(size, FIELD_SIZEOF(struct kprobe_ctlblk, jprobes_stack));
>> -}
>> -
>>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>>  {
>>      /* prepare insn slot */
>> @@ -489,20 +477,15 @@ int __kprobes setjmp_pre_handler(struct kprobe
>> *p, struct pt_regs *regs)
>>  {
>>      struct jprobe *jp = container_of(p, struct jprobe, kp);
>>      struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
>> -    long stack_ptr = kernel_stack_pointer(regs);
>>
>>      kcb->jprobe_saved_regs = *regs;
>>      /*
>> -     * As Linus pointed out, gcc assumes that the callee
>> -     * owns the argument space and could overwrite it, e.g.
>> -     * tailcall optimization. So, to be absolutely safe
>> -     * we also save and restore enough stack bytes to cover
>> -     * the argument area.
>> +     * Since we can't be sure where in the stack frame "stacked"
>> +     * pass-by-value arguments are stored we just don't try to
>> +     * duplicate any of the stack.
>  > ...
>>                                       Do not use jprobes on functions
>> that
>> +     * use more than 64 bytes (after padding each to an 8 byte boundary)
>> +     * of arguments, or pass individual arguments larger than 16 bytes.
>
> I like this wording. So much so that it really would be great to repeat
> this in the Documentation/. Could this be included in the list of
> architecture support/restrictions?
>

Are you thinking specifically of the "5. Kprobes Features and 
Limitations" section in Documentation/kprobes.txt?

>
> Daniel.
>

Thanks,
-dl
Masami Hiramatsu (Google) Aug. 8, 2016, 10:19 p.m. UTC | #3
On Thu, 4 Aug 2016 00:47:27 -0400
David Long <dave.long@linaro.org> wrote:

> On 07/29/2016 05:01 AM, Daniel Thompson wrote:
> > On 28/07/16 15:40, Catalin Marinas wrote:
> >> On Wed, Jul 27, 2016 at 06:13:37PM -0400, David Long wrote:
> >>> On 07/27/2016 07:50 AM, Daniel Thompson wrote:
> >>>> On 25/07/16 23:27, David Long wrote:
> >>>>> On 07/25/2016 01:13 PM, Catalin Marinas wrote:
> >>>>>> The problem is that the original design was done on x86 for its 
> >>>>>> PCS and
> >>>>>> it doesn't always fit other architectures. So we could either 
> >>>>>> ignore the
> >>>>>> problem, hoping that no probed function requires argument passing on
> >>>>>> stack or we copy all the valid data on the kernel stack:
> >>>>>>
> >>>>>> diff --git a/arch/arm64/include/asm/kprobes.h
> >>>>>> b/arch/arm64/include/asm/kprobes.h
> >>>>>> index 61b49150dfa3..157fd0d0aa08 100644
> >>>>>> --- a/arch/arm64/include/asm/kprobes.h
> >>>>>> +++ b/arch/arm64/include/asm/kprobes.h
> >>>>>> @@ -22,7 +22,7 @@
> >>>>>>
> >>>>>>  #define __ARCH_WANT_KPROBES_INSN_SLOT
> >>>>>>  #define MAX_INSN_SIZE            1
> >>>>>> -#define MAX_STACK_SIZE            128
> >>>>>> +#define MAX_STACK_SIZE            THREAD_SIZE
> >>>>>>
> >>>>>>  #define flush_insn_slot(p)        do { } while (0)
> >>>>>>  #define kretprobe_blacklist_size    0
> >>>>>
> >>>>> I doubt the ARM PCS is unusual.  At any rate I'm certain there are 
> >>>>> other
> >>>>> architectures that pass aggregate parameters on the stack. I suspect
> >>>>> other RISC(-ish) architectures have similar PCS issues and I think 
> >>>>> this
> >>>>> is at least a big part of where this simple copy with a 64/128 limit
> >>>>> comes from, or at least why it continues to exist.  That said, I'm not
> >>>>> enthusiastic about researching that assertion in detail as it could be
> >>>>> time consuming.
> >>>>
> >>>> Given Mark shared a test program I *was* curious enough to take a look
> >>>> at this.
> >>>>
> >>>> The only architecture I can find that behaves like arm64 with the
> >>>> implicit pass-by-reference described by Catalin/Mark is sparc64.
> >>>>
> >>>> In contrast alpha, arm (32-bit), hppa64, mips64 and powerpc64 all use a
> >>>> hybrid approach where the first fragments of the structure are 
> >>>> passed in
> >>>> registers and the remainder on the stack.
> >>>
> >>> That's interesting.  It also looks like sparc64 does not copy any 
> >>> stack for
> >>> jprobes. I guess that approach at least makes it clear what will and 
> >>> won't
> >>> work.
> >>
> >> I suggest we do the same for arm64 - avoid the copying entirely as it's
> >> not safe anyway. We don't know how much to copy, nor can we be sure it
> >> is safe (see Dave's DMA to the stack example). This would need to be
> >> documented in the kprobes.txt file and MAX_STACK_SIZE removed from the
> >> arm64 kprobes support.
> >>
> >> There is also the case that Daniel was talking about - passing more than
> >> 8 arguments. I don't think it's worth handling this
> > 
> > > It's actually quite hard to document the (architecture specific) "no big 
> > structures" *and* the "8 argument" limits. It ends up as something like:
> > 
> >    Structures/unions >16 bytes must not be passed by value and the
> >    size of all arguments, after padding each to an 8 byte boundary, must
> >    be less than 64 bytes.
> > 
> > We cannot avoid tackling big structures through documentation but when 
> > we impose additional limits like "only 8 arguments" we are swapping an 
> > architecture neutral "gotcha" that affects almost all jprobes uses (and 
> > can be inferred from the documentation) with an architecture specific one!
> > 
> 
> See new patch below.  The documentation change in it could use some scrutiny.
> I've tested with one-off jprobes functions in a test module and I've
> verified NET_TCPPROBE doesn't cause misbehavior.
> 
> > 
> >> but we should at
> >> least add a warning and skip the probe:
> >>
> >> diff --git a/arch/arm64/kernel/probes/kprobes.c 
> >> b/arch/arm64/kernel/probes/kprobes.c
> >> index bf9768588288..84e02606ec3d 100644
> >> --- a/arch/arm64/kernel/probes/kprobes.c
> >> +++ b/arch/arm64/kernel/probes/kprobes.c
> >> @@ -491,6 +491,10 @@ int __kprobes setjmp_pre_handler(struct kprobe 
> >> *p, struct pt_regs *regs)
> >>      struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> >>      long stack_ptr = kernel_stack_pointer(regs);
> >>
> >> +    /* do not allow arguments passed on the stack */
> >> +    if (WARN_ON_ONCE(regs->sp != regs->regs[29]))
> >> +        return 0;
> >> +
> > 
> > I don't really understand this test.
> > 
> > If we could reliably assume that the frame record was at the lowest 
> > address within a stack frame then we could exploit that to store the 
> > stacked arguments without risking overwriting volatile variables on the 
> > stack.
> > 
> > 
> > Daniel.
> > 
> 
> I'm assuming the consensus is to not use the above snippet of code.
> 
> Thanks,
> -dl
> 
> ----------cut here--------
> 
> 
> From b451caa1adaf1d03e08a44b5dad3fca31cebd97a Mon Sep 17 00:00:00 2001
> From: "David A. Long" <dave.long@linaro.org>
> Date: Thu, 4 Aug 2016 00:35:33 -0400
> Subject: [PATCH] arm64: Remove stack duplicating code from jprobes
> 
> Because the arm64 calling standard allows stacked function arguments to be
> anywhere in the stack frame, do not attempt to duplicate the stack frame for
> jprobes handler functions.
> 
> Signed-off-by: David A. Long <dave.long@linaro.org>

Looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thanks,

> ---
>  Documentation/kprobes.txt          |  7 +++++++
>  arch/arm64/include/asm/kprobes.h   |  2 --
>  arch/arm64/kernel/probes/kprobes.c | 31 +++++--------------------------
>  3 files changed, 12 insertions(+), 28 deletions(-)
> 
> diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
> index 1f9b3e2..bd01839 100644
> --- a/Documentation/kprobes.txt
> +++ b/Documentation/kprobes.txt
> @@ -103,6 +103,13 @@ Note that the probed function's args may be passed on the stack
>  or in registers.  The jprobe will work in either case, so long as the
>  handler's prototype matches that of the probed function.
>  
> +Note that in some architectures (e.g.: arm64) the stack copy is not
> +done, as the actual location of stacked parameters may be outside of
> +a reasonable MAX_STACK_SIZE value and because that location cannot be
> +determined by the jprobes code. In this case the jprobes user must be
> +careful to make certain the calling signature of the function does
> +not cause parameters to be passed on the stack.
> +
>  1.3 Return Probes
>  
>  1.3.1 How Does a Return Probe Work?
> diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
> index 61b4915..1737aec 100644
> --- a/arch/arm64/include/asm/kprobes.h
> +++ b/arch/arm64/include/asm/kprobes.h
> @@ -22,7 +22,6 @@
>  
>  #define __ARCH_WANT_KPROBES_INSN_SLOT
>  #define MAX_INSN_SIZE			1
> -#define MAX_STACK_SIZE			128
>  
>  #define flush_insn_slot(p)		do { } while (0)
>  #define kretprobe_blacklist_size	0
> @@ -47,7 +46,6 @@ struct kprobe_ctlblk {
>  	struct prev_kprobe prev_kprobe;
>  	struct kprobe_step_ctx ss_ctx;
>  	struct pt_regs jprobe_saved_regs;
> -	char jprobes_stack[MAX_STACK_SIZE];
>  };
>  
>  void arch_remove_kprobe(struct kprobe *);
> diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> index bf97685..c6b0f40 100644
> --- a/arch/arm64/kernel/probes/kprobes.c
> +++ b/arch/arm64/kernel/probes/kprobes.c
> @@ -41,18 +41,6 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
>  static void __kprobes
>  post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
>  
> -static inline unsigned long min_stack_size(unsigned long addr)
> -{
> -	unsigned long size;
> -
> -	if (on_irq_stack(addr, raw_smp_processor_id()))
> -		size = IRQ_STACK_PTR(raw_smp_processor_id()) - addr;
> -	else
> -		size = (unsigned long)current_thread_info() + THREAD_START_SP - addr;
> -
> -	return min(size, FIELD_SIZEOF(struct kprobe_ctlblk, jprobes_stack));
> -}
> -
>  static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>  {
>  	/* prepare insn slot */
> @@ -489,20 +477,15 @@ int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
>  {
>  	struct jprobe *jp = container_of(p, struct jprobe, kp);
>  	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> -	long stack_ptr = kernel_stack_pointer(regs);
>  
>  	kcb->jprobe_saved_regs = *regs;
>  	/*
> -	 * As Linus pointed out, gcc assumes that the callee
> -	 * owns the argument space and could overwrite it, e.g.
> -	 * tailcall optimization. So, to be absolutely safe
> -	 * we also save and restore enough stack bytes to cover
> -	 * the argument area.
> +	 * Since we can't be sure where in the stack frame "stacked"
> +	 * pass-by-value arguments are stored we just don't try to
> +	 * duplicate any of the stack. Do not use jprobes on functions that
> +	 * use more than 64 bytes (after padding each to an 8 byte boundary)
> +	 * of arguments, or pass individual arguments larger than 16 bytes.
>  	 */
> -	kasan_disable_current();
> -	memcpy(kcb->jprobes_stack, (void *)stack_ptr,
> -	       min_stack_size(stack_ptr));
> -	kasan_enable_current();
>  
>  	instruction_pointer_set(regs, (unsigned long) jp->entry);
>  	preempt_disable();
> @@ -554,10 +537,6 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
>  	}
>  	unpause_graph_tracing();
>  	*regs = kcb->jprobe_saved_regs;
> -	kasan_disable_current();
> -	memcpy((void *)stack_addr, kcb->jprobes_stack,
> -	       min_stack_size(stack_addr));
> -	kasan_enable_current();
>  	preempt_enable_no_resched();
>  	return 1;
>  }
> -- 
> 2.5.0
>
Masami Hiramatsu (Google) Aug. 8, 2016, 10:49 p.m. UTC | #4
On Mon, 8 Aug 2016 10:29:05 -0400
David Long <dave.long@linaro.org> wrote:

> >> @@ -489,20 +477,15 @@ int __kprobes setjmp_pre_handler(struct kprobe
> >> *p, struct pt_regs *regs)
> >>  {
> >>      struct jprobe *jp = container_of(p, struct jprobe, kp);
> >>      struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> >> -    long stack_ptr = kernel_stack_pointer(regs);
> >>
> >>      kcb->jprobe_saved_regs = *regs;
> >>      /*
> >> -     * As Linus pointed out, gcc assumes that the callee
> >> -     * owns the argument space and could overwrite it, e.g.
> >> -     * tailcall optimization. So, to be absolutely safe
> >> -     * we also save and restore enough stack bytes to cover
> >> -     * the argument area.
> >> +     * Since we can't be sure where in the stack frame "stacked"
> >> +     * pass-by-value arguments are stored we just don't try to
> >> +     * duplicate any of the stack.
> >  > ...
> >>                                       Do not use jprobes on functions
> >> that
> >> +     * use more than 64 bytes (after padding each to an 8 byte boundary)
> >> +     * of arguments, or pass individual arguments larger than 16 bytes.
> >
> > I like this wording. So much so that it really would be great to repeat
> > this in the Documentation/. Could this be included in the list of
> > architecture support/restrictions?
> >
> 
> Are you thinking specifically of the "5. Kprobes Features and 
> Limitations" section in Documentation/kprobes.txt?

OK, that's a good idea :)

If you update the patch for that, please feel free to add my Ack.

Thank you,
Catalin Marinas Aug. 9, 2016, 5:23 p.m. UTC | #5
On Mon, Aug 08, 2016 at 10:29:05AM -0400, David Long wrote:
> On 08/08/2016 07:13 AM, Daniel Thompson wrote:
> >On 04/08/16 05:47, David Long wrote:
> >>From b451caa1adaf1d03e08a44b5dad3fca31cebd97a Mon Sep 17 00:00:00 2001
> >>From: "David A. Long" <dave.long@linaro.org>
> >>Date: Thu, 4 Aug 2016 00:35:33 -0400
> >>Subject: [PATCH] arm64: Remove stack duplicating code from jprobes
> >>
> >>Because the arm64 calling standard allows stacked function arguments
> >>to be
> >>anywhere in the stack frame, do not attempt to duplicate the stack
> >>frame for
> >>jprobes handler functions.
> >>
> >>Signed-off-by: David A. Long <dave.long@linaro.org>
> >>---
> >> Documentation/kprobes.txt          |  7 +++++++
> >> arch/arm64/include/asm/kprobes.h   |  2 --
> >> arch/arm64/kernel/probes/kprobes.c | 31 +++++--------------------------
> >> 3 files changed, 12 insertions(+), 28 deletions(-)
> >>
> >>diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
> >>index 1f9b3e2..bd01839 100644
> >>--- a/Documentation/kprobes.txt
> >>+++ b/Documentation/kprobes.txt
> >>@@ -103,6 +103,13 @@ Note that the probed function's args may be
> >>passed on the stack
> >> or in registers.  The jprobe will work in either case, so long as the
> >> handler's prototype matches that of the probed function.
> >>
> >>+Note that in some architectures (e.g.: arm64) the stack copy is not
> >
> >Could sparc64 be added to this list?
> >
> >   For the sparc folks who are new to the thread, we've previously
> >   established that the sparc64 ABI passes large structures by
> >   allocating them from the caller's stack frame and passing a pointer
> >   to the stack frame (i.e. arguments may not be at top of the stack).
> >   We also noticed that sparc code does not save/restore anything from
> >   the stack.
> 
> I was reluctant to do that in the context of late changes to v4.8 for arm64
> but now that any changes for this are going in as a new patch it would
> indeed be useful to get involvement from sparc maintainers.

I'm happy to take the arm64 patch for 4.8 as it's mainly a clean-up.
Whether you can mention sparc64 as well depends on the sparc
maintainers. You can either cc them or send the series as two patches,
one for documentation and the other for arm64.
David Long Aug. 10, 2016, 8:41 p.m. UTC | #6
On 08/09/2016 01:23 PM, Catalin Marinas wrote:
> On Mon, Aug 08, 2016 at 10:29:05AM -0400, David Long wrote:
>> On 08/08/2016 07:13 AM, Daniel Thompson wrote:
>>> On 04/08/16 05:47, David Long wrote:
>>>> From b451caa1adaf1d03e08a44b5dad3fca31cebd97a Mon Sep 17 00:00:00 2001
>>>> From: "David A. Long" <dave.long@linaro.org>
>>>> Date: Thu, 4 Aug 2016 00:35:33 -0400
>>>> Subject: [PATCH] arm64: Remove stack duplicating code from jprobes
>>>>
>>>> Because the arm64 calling standard allows stacked function arguments
>>>> to be
>>>> anywhere in the stack frame, do not attempt to duplicate the stack
>>>> frame for
>>>> jprobes handler functions.
>>>>
>>>> Signed-off-by: David A. Long <dave.long@linaro.org>
>>>> ---
>>>> Documentation/kprobes.txt          |  7 +++++++
>>>> arch/arm64/include/asm/kprobes.h   |  2 --
>>>> arch/arm64/kernel/probes/kprobes.c | 31 +++++--------------------------
>>>> 3 files changed, 12 insertions(+), 28 deletions(-)
>>>>
>>>> diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
>>>> index 1f9b3e2..bd01839 100644
>>>> --- a/Documentation/kprobes.txt
>>>> +++ b/Documentation/kprobes.txt
>>>> @@ -103,6 +103,13 @@ Note that the probed function's args may be
>>>> passed on the stack
>>>> or in registers.  The jprobe will work in either case, so long as the
>>>> handler's prototype matches that of the probed function.
>>>>
>>>> +Note that in some architectures (e.g.: arm64) the stack copy is not
>>>
>>> Could sparc64 be added to this list?
>>>
>>>    For the sparc folks who are new to the thread, we've previously
>>>    established that the sparc64 ABI passes large structures by
>>>    allocating them from the caller's stack frame and passing a pointer
>>>    to the stack frame (i.e. arguments may not be at top of the stack).
>>>    We also noticed that sparc code does not save/restore anything from
>>>    the stack.
>>
>> I was reluctant to do that in the context of late changes to v4.8 for arm64
>> but now that any changes for this are going in as a new patch it would
>> indeed be useful to get involvement from sparc maintainers.
>
> I'm happy to take the arm64 patch for 4.8 as it's mainly a clean-up.
> Whether you can mention sparc64 as well depends on the sparc
> maintainers. You can either cc them or send the series as two patches,
> one for documentation and the other for arm64.
>

I didn't think that was going to be possible after v4.8-rc1.  I have 
separated the documentation and code changes.  I will send out the new 
code-only patch (otherwise unchanged in content) momentarily.

Thanks,
-dl
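
For readers joining the thread late, the "implicit pass-by-reference" case
discussed above can be illustrated with two probe targets. This is a
hypothetical sketch, not code from the patch or from Mark's test program:
the struct and function names are made up. On arm64 a structure of up to
16 bytes passed by value travels in registers, while a larger one is copied
to memory owned by the caller and a pointer to that copy is passed instead,
which is the case a jprobe cannot handle once the stack copy is gone.

```c
#include <stdint.h>

/* 16 bytes: small enough to be passed in two general registers on arm64. */
struct pair {
	uint64_t a;
	uint64_t b;
};

/* 24 bytes: larger than 16, so arm64 passes a pointer to a caller-made copy. */
struct triple {
	uint64_t a;
	uint64_t b;
	uint64_t c;
};

/* Hypothetical probe target that stays within the documented limits. */
static uint64_t sum_pair(struct pair p)
{
	return p.a + p.b;
}

/* Hypothetical probe target that a jprobe should not be attached to. */
static uint64_t sum_triple(struct triple t)
{
	return t.a + t.b + t.c;
}
```

Both functions behave identically from the C point of view; the difference
is only visible in where the compiler places the argument, which is why the
restriction has to be documented rather than detected.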
Patch

diff --git a/Documentation/kprobes.txt b/Documentation/kprobes.txt
index 1f9b3e2..bd01839 100644
--- a/Documentation/kprobes.txt
+++ b/Documentation/kprobes.txt
@@ -103,6 +103,13 @@  Note that the probed function's args may be passed on the stack
 or in registers.  The jprobe will work in either case, so long as the
 handler's prototype matches that of the probed function.
 
+Note that in some architectures (e.g.: arm64) the stack copy is not
+done, as the actual location of stacked parameters may be outside of
+a reasonable MAX_STACK_SIZE value and because that location cannot be
+determined by the jprobes code. In this case the jprobes user must be
+careful to make certain the calling signature of the function does
+not cause parameters to be passed on the stack.
+
 1.3 Return Probes
 
 1.3.1 How Does a Return Probe Work?
diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
index 61b4915..1737aec 100644
--- a/arch/arm64/include/asm/kprobes.h
+++ b/arch/arm64/include/asm/kprobes.h
@@ -22,7 +22,6 @@ 
 
 #define __ARCH_WANT_KPROBES_INSN_SLOT
 #define MAX_INSN_SIZE			1
-#define MAX_STACK_SIZE			128
 
 #define flush_insn_slot(p)		do { } while (0)
 #define kretprobe_blacklist_size	0
@@ -47,7 +46,6 @@  struct kprobe_ctlblk {
 	struct prev_kprobe prev_kprobe;
 	struct kprobe_step_ctx ss_ctx;
 	struct pt_regs jprobe_saved_regs;
-	char jprobes_stack[MAX_STACK_SIZE];
 };
 
 void arch_remove_kprobe(struct kprobe *);
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index bf97685..c6b0f40 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -41,18 +41,6 @@  DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
 static void __kprobes
 post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
 
-static inline unsigned long min_stack_size(unsigned long addr)
-{
-	unsigned long size;
-
-	if (on_irq_stack(addr, raw_smp_processor_id()))
-		size = IRQ_STACK_PTR(raw_smp_processor_id()) - addr;
-	else
-		size = (unsigned long)current_thread_info() + THREAD_START_SP - addr;
-
-	return min(size, FIELD_SIZEOF(struct kprobe_ctlblk, jprobes_stack));
-}
-
 static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
 {
 	/* prepare insn slot */
@@ -489,20 +477,15 @@  int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
 {
 	struct jprobe *jp = container_of(p, struct jprobe, kp);
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
-	long stack_ptr = kernel_stack_pointer(regs);
 
 	kcb->jprobe_saved_regs = *regs;
 	/*
-	 * As Linus pointed out, gcc assumes that the callee
-	 * owns the argument space and could overwrite it, e.g.
-	 * tailcall optimization. So, to be absolutely safe
-	 * we also save and restore enough stack bytes to cover
-	 * the argument area.
+	 * Since we can't be sure where in the stack frame "stacked"
+	 * pass-by-value arguments are stored we just don't try to
+	 * duplicate any of the stack. Do not use jprobes on functions that
+	 * use more than 64 bytes (after padding each to an 8 byte boundary)
+	 * of arguments, or pass individual arguments larger than 16 bytes.
 	 */
-	kasan_disable_current();
-	memcpy(kcb->jprobes_stack, (void *)stack_ptr,
-	       min_stack_size(stack_ptr));
-	kasan_enable_current();
 
 	instruction_pointer_set(regs, (unsigned long) jp->entry);
 	preempt_disable();
@@ -554,10 +537,6 @@  int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
 	}
 	unpause_graph_tracing();
 	*regs = kcb->jprobe_saved_regs;
-	kasan_disable_current();
-	memcpy((void *)stack_addr, kcb->jprobes_stack,
-	       min_stack_size(stack_addr));
-	kasan_enable_current();
 	preempt_enable_no_resched();
 	return 1;
 }
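
As a rough illustration of why the bounded copy removed above could not be
relied on, the following user-space sketch models the old behaviour with
plain integers standing in for addresses. It assumes, as the removed
min_stack_size() did, that at most 128 bytes starting at the stack pointer
are saved, clamped to the top of the stack. The names sketch_min_stack_size()
and sketch_arg_is_covered() are hypothetical and are not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_STACK_SIZE 128	/* mirrors the removed MAX_STACK_SIZE */

/*
 * Hypothetical model of the removed helper: number of bytes that would be
 * saved starting at sp, limited by the distance to the top of the stack.
 */
static size_t sketch_min_stack_size(size_t sp, size_t stack_top)
{
	size_t size = stack_top - sp;

	return size < SKETCH_MAX_STACK_SIZE ? size : SKETCH_MAX_STACK_SIZE;
}

/*
 * Returns true when the argument occupying [arg_addr, arg_addr + arg_len)
 * lies entirely inside the region the bounded copy would have saved.
 */
static bool sketch_arg_is_covered(size_t sp, size_t stack_top,
				  size_t arg_addr, size_t arg_len)
{
	size_t saved = sketch_min_stack_size(sp, stack_top);

	return arg_addr >= sp && arg_addr + arg_len <= sp + saved;
}
```

An argument sitting at the stack pointer is covered, but one that the caller
placed 256 bytes further up the frame is not, and nothing in the jprobes
code can tell the two cases apart.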