Message ID | 20220602193706.2607681-2-song@kernel.org (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | ftrace: host klp and bpf trampoline together | expand |
On Thu, 2 Jun 2022 12:37:02 -0700 Song Liu <song@kernel.org> wrote: > This enables users of ftrace_direct_multi to specify the flags based on > the actual use case. For example, some users may not set flag IPMODIFY. If we apply this patch without any of the others, then we are relying on the caller to get it right? That is, can we register a direct function with this function and pick a function with IPMODIFY already attached? -- Steve > > Signed-off-by: Song Liu <song@kernel.org> > --- > kernel/trace/ftrace.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > index 2fcd17857ff6..afe782ae28d3 100644 > --- a/kernel/trace/ftrace.c > +++ b/kernel/trace/ftrace.c > @@ -5456,8 +5456,7 @@ int modify_ftrace_direct(unsigned long ip, > } > EXPORT_SYMBOL_GPL(modify_ftrace_direct); > > -#define MULTI_FLAGS (FTRACE_OPS_FL_IPMODIFY | FTRACE_OPS_FL_DIRECT | \ > - FTRACE_OPS_FL_SAVE_REGS) > +#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS) > > static int check_direct_multi(struct ftrace_ops *ops) > { > @@ -5547,7 +5546,7 @@ int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr) > } > > ops->func = call_direct_funcs; > - ops->flags = MULTI_FLAGS; > + ops->flags |= MULTI_FLAGS; > ops->trampoline = FTRACE_REGS_ADDR; > > err = register_ftrace_function(ops);
> On Jul 13, 2022, at 4:18 PM, Steven Rostedt <rostedt@goodmis.org> wrote: > > On Thu, 2 Jun 2022 12:37:02 -0700 > Song Liu <song@kernel.org> wrote: > >> This enables users of ftrace_direct_multi to specify the flags based on >> the actual use case. For example, some users may not set flag IPMODIFY. > > If we apply this patch without any of the others, then we are relying on > the caller to get it right? > > That is, can we register a direct function with this function and pick a > function with IPMODIFY already attached? Yes, if the direct function follows regs->ip, it works. For example, BPF trampoline with only fentry calls will just work with only this patch. Thanks, Song > > -- Steve > > >> >> Signed-off-by: Song Liu <song@kernel.org> >> --- >> kernel/trace/ftrace.c | 5 ++--- >> 1 file changed, 2 insertions(+), 3 deletions(-) >> >> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c >> index 2fcd17857ff6..afe782ae28d3 100644 >> --- a/kernel/trace/ftrace.c >> +++ b/kernel/trace/ftrace.c >> @@ -5456,8 +5456,7 @@ int modify_ftrace_direct(unsigned long ip, >> } >> EXPORT_SYMBOL_GPL(modify_ftrace_direct); >> >> -#define MULTI_FLAGS (FTRACE_OPS_FL_IPMODIFY | FTRACE_OPS_FL_DIRECT | \ >> - FTRACE_OPS_FL_SAVE_REGS) >> +#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS) >> >> static int check_direct_multi(struct ftrace_ops *ops) >> { >> @@ -5547,7 +5546,7 @@ int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr) >> } >> >> ops->func = call_direct_funcs; >> - ops->flags = MULTI_FLAGS; >> + ops->flags |= MULTI_FLAGS; >> ops->trampoline = FTRACE_REGS_ADDR; >> >> err = register_ftrace_function(ops); >
On Thu, 14 Jul 2022 00:11:53 +0000 Song Liu <songliubraving@fb.com> wrote: > > That is, can we register a direct function with this function and pick a > > function with IPMODIFY already attached? > > Yes, if the direct function follows regs->ip, it works. > > For example, BPF trampoline with only fentry calls will just work with only this patch. I replied with my thoughts on this to patch 3, but I disagree with this. ftrace has no idea if the direct trampoline modifies the IP or not. ftrace must assume that it does, and the management should be done in the infrastructure. As I replied to patch 3, here's my thoughts. DIRECT is treated as though it changes the IP. If you register it to a function that has an IPMODIFY already set to it, it will call the ops->ops_func() asking if the ops can use SHARED_IPMODIFY (which means a DIRECT can be shared with IPMODIFY). If it can, then it returns true, and the SHARED_IPMODIFY is set *by ftrace*. The user of the ftrace APIs should not have to manage this. It should be managed by the ftrace infrastructure. Make sense? -- Steve
> On Jul 13, 2022, at 5:38 PM, Steven Rostedt <rostedt@goodmis.org> wrote: > > On Thu, 14 Jul 2022 00:11:53 +0000 > Song Liu <songliubraving@fb.com> wrote: > >>> That is, can we register a direct function with this function and pick a >>> function with IPMODIFY already attached? >> >> Yes, if the direct function follows regs->ip, it works. >> >> For example, BPF trampoline with only fentry calls will just work with only this patch. > > I replied with my thoughts on this to patch 3, but I disagree with this. > > ftrace has no idea if the direct trampoline modifies the IP or not. > ftrace must assume that it does, and the management should be done in > the infrastructure. > > As I replied to patch 3, here's my thoughts. > > DIRECT is treated as though it changes the IP. If you register it to a > function that has an IPMODIFY already set to it, it will call the > ops->ops_func() asking if the ops can use SHARED_IPMODIFY (which means > a DIRECT can be shared with IPMODIFY). If it can, then it returns true, > and the SHARED_IPMODIFY is set *by ftrace*. The user of the ftrace APIs > should not have to manage this. It should be managed by the ftrace > infrastructure. Hmm... I don't think this gonna work. First, two IPMODIFY ftrace ops cannot work together on the same kernel function. So there won't be a ops with both IPMODIFY and SHARE_IPMODIFY. non-direct ops without IPMODIFY can already share with IPMODIFY ops. Only direct ops need SHARE_IPMODIFY flag, and it means "I can share with another ops with IPMODIFY". So there will be different flavors of direct ops: 1. w/ IPMODIFY, w/o SHARE_IPMODIFY; 2. w/o IPMODIFY, w/o SHARE_IPMODIFY; 3. w/o IPMODIFY, w/ SHARE_IPMODIFY. #1 can never work on the same function with another IPMODIFY ops, and we don't plan to make it work. #2 does not work with another IPMODIFY ops. And #3 works with another IPMODIFY ops. The owner of the direct trampoline uses these flags to tell ftrace infrastructure the property of this trampoline. BPF trampolines with only fentry calls are #3 direct ops. BPF trampolines with fexit or fmod_ret calls will be #2 trampoline by default, but it is also possible to generate #3 trampoline for it. BPF side will try to register #2 trampoline, If ftrace detects another IPMODIFY ops on the same function, it will let BPF trampoline know with -EAGAIN from register_ftrace_direct_multi(). Then, BPF side will regenerate a #3 trampoline and register it again. I know this somehow changes the policy with direct ops, but it is the only way this can work, AFAICT. Does this make sense? Did I miss something? Thanks, Song
On Thu, 14 Jul 2022 01:42:59 +0000 Song Liu <songliubraving@fb.com> wrote: > > As I replied to patch 3, here's my thoughts. > > > > DIRECT is treated as though it changes the IP. If you register it to a > > function that has an IPMODIFY already set to it, it will call the > > ops->ops_func() asking if the ops can use SHARED_IPMODIFY (which means > > a DIRECT can be shared with IPMODIFY). If it can, then it returns true, > > and the SHARED_IPMODIFY is set *by ftrace*. The user of the ftrace APIs > > should not have to manage this. It should be managed by the ftrace > > infrastructure. > > Hmm... I don't think this gonna work. > > First, two IPMODIFY ftrace ops cannot work together on the same kernel > function. So there won't be a ops with both IPMODIFY and SHARE_IPMODIFY. I'm not saying that. I'm saying that ftrace does not have any clue (nor cares) about what a DIRECT ops does. It might modify the IP or it might not. It doesn't know. But ftrace has control over the callbacks it does control. A DIRECT ops knows if it can work with another ops that has IPMODIFY set. If the DIRECT ops does not work with IPMODIFY (perhaps it wants to modify the IP), then it will tell ftrace that it can't and ftrace will not allow it. That is, ftrace doesn't care if the DIRECT ops modifies the IP or not. It can only ask (through the ops->ops_func()) if the direct trampoline can honor the IP that is modified. If it can, then it reports back that it can, and then ftrace will set that ops to SHARED_MODIFY, and the direct function can switch the ops->func() to one that uses SHARED_MODIFY. > > non-direct ops without IPMODIFY can already share with IPMODIFY ops. It can? ftrace sets IPMODIFY for all DIRECT callers to prevent that. Except for this patch that removes that restriction (which I believe is broken). > Only direct ops need SHARE_IPMODIFY flag, and it means "I can share with > another ops with IPMODIFY". So there will be different flavors of > direct ops: I agree that only DIRECT ops can have SHARED_IPMODIFY set. That's what I'm saying. But I'm saying it gets set by ftrace. > > 1. w/ IPMODIFY, w/o SHARE_IPMODIFY; > 2. w/o IPMODIFY, w/o SHARE_IPMODIFY; > 3. w/o IPMODIFY, w/ SHARE_IPMODIFY. > > #1 can never work on the same function with another IPMODIFY ops, and > we don't plan to make it work. #2 does not work with another IPMODIFY > ops. And #3 works with another IPMODIFY ops. Lets look at this differently. What I'm proposing is that registering a direct ops does not need to tell ftrace if it modifies the IP or not. That's because ftrace doesn't care. Once ftrace calls the direct trampoline it loses all control. With the ftrace ops callbacks, it is the one responsible for setting up the modified IP. That's not the case with the direct trampolines. I'm saying that ftrace doesn't care what the DIRECT ops does. But it will not, by default, allow an IPMODIFY to happen when a DIRECT ops is on the same function, or vice versa. What I'm suggesting is when a IPMODIFY tries to attach to a function that also has a DIRECT ops, or a DIRECT tries to attach to a function that already has an IPMODIFY ops on it, that ftrace calls the direct ops->ops_func() asking if it is safe to use with an IPMODIFY function. If the direct ops modifies the IP itself, it will return a "no", and ftrace will reject the attachment. If the direct ops can, it returns a "yes" and then ftrace will set the SHARED_IPMODIFY flag to that ops and continue. The fact that the ops->ops_func was called will let the caller (bpf) know that it needs to use the stack to return to the function, or to call it if it is also tracing the return. If the IPMODIFY ops is removed, then ftrace will call the ops->ops_func() telling it it no longer has the IPMODIFY set, and will clear the SHARED_IPMODIFY flag, and then the owner of the direct ops can go back to doing whatever it did before (the calling the function directly, or whatever). > > The owner of the direct trampoline uses these flags to tell ftrace > infrastructure the property of this trampoline. I disagree. The owner shouldn't have to care about the flags. Let ftrace handle it. This will make things so much more simple for both BPF and ftrace. > > BPF trampolines with only fentry calls are #3 direct ops. BPF > trampolines with fexit or fmod_ret calls will be #2 trampoline by > default, but it is also possible to generate #3 trampoline for it. And ftrace doesn't care about this. But bpf needs to care if the IP is being modified or not. Which can be done by the ops->ops_func() that you added. > > BPF side will try to register #2 trampoline, If ftrace detects another > IPMODIFY ops on the same function, it will let BPF trampoline know > with -EAGAIN from register_ftrace_direct_multi(). Then, BPF side will > regenerate a #3 trampoline and register it again. This is too complex. You are missing the simple way. > > I know this somehow changes the policy with direct ops, but it is the > only way this can work, AFAICT. I disagree. There's a much better way that this can work. > > Does this make sense? Did I miss something? Let me start from the beginning. 1. Live kernel patching patches function foo. 2. lkp has an ops->flags | IPMODIFY set when it registers. 3. bpf registers a direct trampoline to function foo. 4. bpf has an ops->flags | DIRECT set when it registers 5. ftrace sees that the function already has an ops->flag = IPMODIFY on it, so it calls bpf ops->ops_func(SHARE_WITH_IPMODIFY) 6. bpf can and does the following a. if it's the simple #1 trampoline (just traces the start of a function) it doesn't need to do anything special returns "yes". b. if it's the #2 trampoline, it will change the trampoline to use the stack to find what to call, and returns "yes". 7. ftrace gets "yes" and sets the *ipmodify* ops with SHARED_IPMODIFY (there's a reason for setting this flag for the ipmodify ops and not the direct ops). 8. LKP is removed from the function foo. 9. ftrace sees the lkp IPMODIFY ops has SHARED_IPMODIFY on it, and knows that there's a direct call here too. It removes the IPMODIFY ops, and then calls the direct ops->ops_func(STOP_SHARE_WITH_IPMODIFY) to let the direct code know that it is no longer sharing with an IPMODIFY such that it can change to call the function directly and not use the stack. See how simple this is? ftrace doesn't have to care if the direct caller changes the IP or not. It just wants to know if it can be shared with an IPMODIFY ops. And BPF doesn't have to care about extra flags used to manage the ftrace infrastructure. Does this make sense now? -- Steve
> On Jul 13, 2022, at 7:55 PM, Steven Rostedt <rostedt@goodmis.org> wrote: > > On Thu, 14 Jul 2022 01:42:59 +0000 > Song Liu <songliubraving@fb.com> wrote: > >>> As I replied to patch 3, here's my thoughts. >>> >>> DIRECT is treated as though it changes the IP. If you register it to a >>> function that has an IPMODIFY already set to it, it will call the >>> ops->ops_func() asking if the ops can use SHARED_IPMODIFY (which means >>> a DIRECT can be shared with IPMODIFY). If it can, then it returns true, >>> and the SHARED_IPMODIFY is set *by ftrace*. The user of the ftrace APIs >>> should not have to manage this. It should be managed by the ftrace >>> infrastructure. >> >> Hmm... I don't think this gonna work. >> >> First, two IPMODIFY ftrace ops cannot work together on the same kernel >> function. So there won't be a ops with both IPMODIFY and SHARE_IPMODIFY. > > I'm not saying that. > > I'm saying that ftrace does not have any clue (nor cares) about what a > DIRECT ops does. It might modify the IP or it might not. It doesn't know. > > But ftrace has control over the callbacks it does control. > > A DIRECT ops knows if it can work with another ops that has IPMODIFY set. > If the DIRECT ops does not work with IPMODIFY (perhaps it wants to modify > the IP), then it will tell ftrace that it can't and ftrace will not allow > it. > > That is, ftrace doesn't care if the DIRECT ops modifies the IP or not. It > can only ask (through the ops->ops_func()) if the direct trampoline can > honor the IP that is modified. If it can, then it reports back that it can, > and then ftrace will set that ops to SHARED_MODIFY, and the direct function > can switch the ops->func() to one that uses SHARED_MODIFY. > >> >> non-direct ops without IPMODIFY can already share with IPMODIFY ops. > > It can? ftrace sets IPMODIFY for all DIRECT callers to prevent that. Except > for this patch that removes that restriction (which I believe is broken). I mean "non-direct" ftrace ops, not direct ftrace ops. > >> Only direct ops need SHARE_IPMODIFY flag, and it means "I can share with >> another ops with IPMODIFY". So there will be different flavors of >> direct ops: > > I agree that only DIRECT ops can have SHARED_IPMODIFY set. That's what I'm > saying. But I'm saying it gets set by ftrace. > >> >> 1. w/ IPMODIFY, w/o SHARE_IPMODIFY; >> 2. w/o IPMODIFY, w/o SHARE_IPMODIFY; >> 3. w/o IPMODIFY, w/ SHARE_IPMODIFY. >> >> #1 can never work on the same function with another IPMODIFY ops, and >> we don't plan to make it work. #2 does not work with another IPMODIFY >> ops. And #3 works with another IPMODIFY ops. > > Lets look at this differently. What I'm proposing is that registering a > direct ops does not need to tell ftrace if it modifies the IP or not. > That's because ftrace doesn't care. Once ftrace calls the direct trampoline > it loses all control. With the ftrace ops callbacks, it is the one > responsible for setting up the modified IP. That's not the case with the > direct trampolines. > > I'm saying that ftrace doesn't care what the DIRECT ops does. But it will > not, by default, allow an IPMODIFY to happen when a DIRECT ops is on the > same function, or vice versa. > > What I'm suggesting is when a IPMODIFY tries to attach to a function that > also has a DIRECT ops, or a DIRECT tries to attach to a function that > already has an IPMODIFY ops on it, that ftrace calls the direct > ops->ops_func() asking if it is safe to use with an IPMODIFY function. > > If the direct ops modifies the IP itself, it will return a "no", and ftrace > will reject the attachment. If the direct ops can, it returns a "yes" and > then ftrace will set the SHARED_IPMODIFY flag to that ops and continue. > > The fact that the ops->ops_func was called will let the caller (bpf) know > that it needs to use the stack to return to the function, or to call it if > it is also tracing the return. > > If the IPMODIFY ops is removed, then ftrace will call the ops->ops_func() > telling it it no longer has the IPMODIFY set, and will clear the > SHARED_IPMODIFY flag, and then the owner of the direct ops can go back to > doing whatever it did before (the calling the function directly, or > whatever). > >> >> The owner of the direct trampoline uses these flags to tell ftrace >> infrastructure the property of this trampoline. > > I disagree. The owner shouldn't have to care about the flags. Let ftrace > handle it. This will make things so much more simple for both BPF and > ftrace. > >> >> BPF trampolines with only fentry calls are #3 direct ops. BPF >> trampolines with fexit or fmod_ret calls will be #2 trampoline by >> default, but it is also possible to generate #3 trampoline for it. > > And ftrace doesn't care about this. But bpf needs to care if the IP is > being modified or not. Which can be done by the ops->ops_func() that you > added. > >> >> BPF side will try to register #2 trampoline, If ftrace detects another >> IPMODIFY ops on the same function, it will let BPF trampoline know >> with -EAGAIN from register_ftrace_direct_multi(). Then, BPF side will >> regenerate a #3 trampoline and register it again. > > This is too complex. You are missing the simple way. > >> >> I know this somehow changes the policy with direct ops, but it is the >> only way this can work, AFAICT. > > I disagree. There's a much better way that this can work. > >> >> Does this make sense? Did I miss something? > > > Let me start from the beginning. I got your point now. We replace the flag on direct trampoline with a callback check. So yes, this works. > > 1. Live kernel patching patches function foo. > > 2. lkp has an ops->flags | IPMODIFY set when it registers. > > 3. bpf registers a direct trampoline to function foo. > > 4. bpf has an ops->flags | DIRECT set when it registers > > 5. ftrace sees that the function already has an ops->flag = IPMODIFY on it, > so it calls bpf ops->ops_func(SHARE_WITH_IPMODIFY) > > 6. bpf can and does the following > > a. if it's the simple #1 trampoline (just traces the start of a function) > it doesn't need to do anything special returns "yes". > > b. if it's the #2 trampoline, it will change the trampoline to use the > stack to find what to call, and returns "yes". > > 7. ftrace gets "yes" and sets the *ipmodify* ops with SHARED_IPMODIFY > (there's a reason for setting this flag for the ipmodify ops and not the > direct ops). > > > 8. LKP is removed from the function foo. > > 9. ftrace sees the lkp IPMODIFY ops has SHARED_IPMODIFY on it, and knows > that there's a direct call here too. It removes the IPMODIFY ops, and > then calls the direct ops->ops_func(STOP_SHARE_WITH_IPMODIFY) to let the > direct code know that it is no longer sharing with an IPMODIFY such that > it can change to call the function directly and not use the stack. I wonder whether we still need this flag. Alternatively, we can always find direct calls on the function and calls ops_func(STOP_SHARE_WITH_IPMODIFY). What do you think about this? Thanks, Song
On Thu, 14 Jul 2022 04:37:43 +0000 Song Liu <songliubraving@fb.com> wrote: > > > >> > >> non-direct ops without IPMODIFY can already share with IPMODIFY ops. > > > > It can? ftrace sets IPMODIFY for all DIRECT callers to prevent that. Except > > for this patch that removes that restriction (which I believe is broken). > > I mean "non-direct" ftrace ops, not direct ftrace ops. Ah, sorry misunderstood that. > > Let me start from the beginning. > > I got your point now. We replace the flag on direct trampoline with a > callback check. So yes, this works. I'm glad we are on the same page :-) > > 9. ftrace sees the lkp IPMODIFY ops has SHARED_IPMODIFY on it, and knows > > that there's a direct call here too. It removes the IPMODIFY ops, and > > then calls the direct ops->ops_func(STOP_SHARE_WITH_IPMODIFY) to let the > > direct code know that it is no longer sharing with an IPMODIFY such that > > it can change to call the function directly and not use the stack. > > I wonder whether we still need this flag. Alternatively, we can always > find direct calls on the function and calls ops_func(STOP_SHARE_WITH_IPMODIFY). Actually we don't need the new flag and we don't need to always search. When a direct is attached to the function then the rec->flags will have FTRACE_FL_DIRECT attached to it. Then if an IPMODIFY is being removed and the rec->flags has FTRACE_FL_DIRECT set, then we know to search the ops for the one that has a DIRECT flag attached and we can call the ops_func() on that one. We should also add a FTRACE_WARN_ON() if a direct is not found but the flag was set. > > What do you think about this? > I think this works. Also, on the patch that implements this in the next version, please add to the change log: Link: https://lore.kernel.org/all/20220602193706.2607681-2-song@kernel.org/ so that we have a link to this discussion. Thanks, -- Steve
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 2fcd17857ff6..afe782ae28d3 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -5456,8 +5456,7 @@ int modify_ftrace_direct(unsigned long ip, } EXPORT_SYMBOL_GPL(modify_ftrace_direct); -#define MULTI_FLAGS (FTRACE_OPS_FL_IPMODIFY | FTRACE_OPS_FL_DIRECT | \ - FTRACE_OPS_FL_SAVE_REGS) +#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS) static int check_direct_multi(struct ftrace_ops *ops) { @@ -5547,7 +5546,7 @@ int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr) } ops->func = call_direct_funcs; - ops->flags = MULTI_FLAGS; + ops->flags |= MULTI_FLAGS; ops->trampoline = FTRACE_REGS_ADDR; err = register_ftrace_function(ops);
This enables users of ftrace_direct_multi to specify the flags based on the actual use case. For example, some users may not set flag IPMODIFY. Signed-off-by: Song Liu <song@kernel.org> --- kernel/trace/ftrace.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)