Message ID | 20240525023652.903909489@goodmis.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Fri, 24 May 2024 22:36:52 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > [ > Resend for some of you as I missed a comma in the Cc and > quilt died sending this out. > ] > > This is a continuation of the function graph multi user code. > I wrote a proof of concept back in 2019 of this code[1] and > Masami started cleaning it up. I started from Masami's work v10 > that can be found here: > > https://lore.kernel.org/linux-trace-kernel/171509088006.162236.7227326999861366050.stgit@devnote2/ > > This is *only* the code that allows multiple users of function > graph tracing. This is not the fprobe work that Masami is working > to add on top of it. As Masami took my proof of concept, there > was still several things I disliked about that code. Instead of > having Masami clean it up even more, I decided to take over on just > my code and change it up a bit. > > The biggest changes from where Masami left off is mostly renaming more > variables, macros, and function names. I fixed up the current comments > and added more to make the code a bit more understandable. > > At the end of the series, I added two patches to optimize the entry > and exit. On entry, there was a loop that iterated the 16 elements > of the fgraph_array[] looking for any that may have a gops registered > to it. It's quite a waste to do that loop if there's only one > registered user. To fix that, I added a fgraph_array_bitmask that has > its bits set that correspond to the elements of the array. Then > a simple for_each_set_bit() is used for the iteration. I do the same > thing at the exit callback of the function where it iterates over the > bits of the bitmap saved on the ret_stack. > > I also noticed that Masami added code to handle tail calls in the > unwinder and had that in one of my patches. I took that code out > and made it a separate patch with Masami as the author. > > The diff between this and Masami's last update is at the end of this email. Thanks for update the patches! I think your changes are good. I just have some comments and replied. So, the plan is to push this series in the tracing/for-next? I will port my fprobe part on it and run some tests. Thank you, > > Based on Linus commit: 0eb03c7e8e2a4cc3653eb5eeb2d2001182071215 > > [1] https://lore.kernel.org/all/20190525031633.811342628@goodmis.org/ > > Masami Hiramatsu (Google) (3): > function_graph: Handle tail calls for stack unwinding > function_graph: Use a simple LRU for fgraph_array index number > ftrace: Add multiple fgraph storage selftest > > Steven Rostedt (Google) (2): > function_graph: Use for_each_set_bit() in __ftrace_return_to_handler() > function_graph: Use bitmask to loop on fgraph entry > > Steven Rostedt (VMware) (15): > function_graph: Convert ret_stack to a series of longs > fgraph: Use BUILD_BUG_ON() to make sure we have structures divisible by long > function_graph: Add an array structure that will allow multiple callbacks > function_graph: Allow multiple users to attach to function graph > function_graph: Remove logic around ftrace_graph_entry and return > ftrace/function_graph: Pass fgraph_ops to function graph callbacks > ftrace: Allow function_graph tracer to be enabled in instances > ftrace: Allow ftrace startup flags to exist without dynamic ftrace > function_graph: Have the instances use their own ftrace_ops for filtering > function_graph: Add "task variables" per task for fgraph_ops > function_graph: Move set_graph_function tests to shadow stack global var > function_graph: Move graph depth stored data to shadow stack global var > function_graph: Move graph notrace bit to shadow stack global var > function_graph: Implement fgraph_reserve_data() and fgraph_retrieve_data() > function_graph: Add selftest for passing local variables > > ---- > arch/arm64/kernel/ftrace.c | 21 +- > arch/loongarch/kernel/ftrace_dyn.c | 15 +- > arch/powerpc/kernel/trace/ftrace.c | 3 +- > arch/riscv/kernel/ftrace.c | 15 +- > arch/x86/kernel/ftrace.c | 19 +- > include/linux/ftrace.h | 42 +- > include/linux/sched.h | 2 +- > include/linux/trace_recursion.h | 39 -- > kernel/trace/fgraph.c | 994 ++++++++++++++++++++++++++++------- > kernel/trace/ftrace.c | 11 +- > kernel/trace/ftrace_internal.h | 2 - > kernel/trace/trace.h | 94 +++- > kernel/trace/trace_functions.c | 8 + > kernel/trace/trace_functions_graph.c | 96 ++-- > kernel/trace/trace_irqsoff.c | 10 +- > kernel/trace/trace_sched_wakeup.c | 10 +- > kernel/trace/trace_selftest.c | 259 ++++++++- > 17 files changed, 1330 insertions(+), 310 deletions(-) > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > index 3313e4b83aa2..1aae521e5997 100644 > --- a/kernel/trace/fgraph.c > +++ b/kernel/trace/fgraph.c > @@ -18,29 +18,36 @@ > #include "ftrace_internal.h" > #include "trace.h" > > -#define FGRAPH_FRAME_SIZE sizeof(struct ftrace_ret_stack) > -#define FGRAPH_FRAME_OFFSET DIV_ROUND_UP(FGRAPH_FRAME_SIZE, sizeof(long)) > > /* > - * On entry to a function (via function_graph_enter()), a new ftrace_ret_stack > - * is allocated on the task's ret_stack with bitmap entry, then each > - * fgraph_ops on the fgraph_array[]'s entryfunc is called and if that returns > - * non-zero, the index into the fgraph_array[] for that fgraph_ops is recorded > - * on the bitmap entry as a bit flag. > - * > - * The top of the ret_stack (when not empty) will always have a reference > - * to the last ftrace_ret_stack saved. All references to the > - * ftrace_ret_stack has the format of: > + * FGRAPH_FRAME_SIZE: Size in bytes of the meta data on the shadow stack > + * FGRAPH_FRAME_OFFSET: Size in long words of the meta data frame > + */ > +#define FGRAPH_FRAME_SIZE sizeof(struct ftrace_ret_stack) > +#define FGRAPH_FRAME_OFFSET DIV_ROUND_UP(FGRAPH_FRAME_SIZE, sizeof(long)) > + > +/* > + * On entry to a function (via function_graph_enter()), a new fgraph frame > + * (ftrace_ret_stack) is pushed onto the stack as well as a word that > + * holds a bitmask and a type (called "bitmap"). The bitmap is defined as: > * > * bits: 0 - 9 offset in words from the previous ftrace_ret_stack > - * (bitmap type should have FGRAPH_FRAME_OFFSET always) > + * > * bits: 10 - 11 Type of storage > * 0 - reserved > * 1 - bitmap of fgraph_array index > * 2 - reserved data > * > - * For bitmap of fgraph_array index > + * For type with "bitmap of fgraph_array index" (FGRAPH_TYPE_BITMAP): > * bits: 12 - 27 The bitmap of fgraph_ops fgraph_array index > + * That is, it's a bitmask of 0-15 (16 bits) > + * where if a corresponding ops in the fgraph_array[] > + * expects a callback from the return of the function > + * it's corresponding bit will be set. > + * > + * > + * The top of the ret_stack (when not empty) will always have a reference > + * word that points to the last fgraph frame that was saved. > * > * For reserved data: > * bits: 12 - 17 The size in words that is stored > @@ -52,7 +59,7 @@ > * stored two words of data, this is what will be on the task's shadow > * ret_stack: (the stack grows upward) > * > - * ret_stack[SHADOW_STACK_IN_WORD] > + * ret_stack[SHADOW_STACK_OFFSET] > * | SHADOW_STACK_TASK_VARS(ret_stack)[15] | > * ... > * | SHADOW_STACK_TASK_VARS(ret_stack)[0] | > @@ -60,6 +67,8 @@ > * ... > * | | <- task->curr_ret_stack > * +--------------------------------------------+ > + * | (3 << 12) | (3 << 10) | FGRAPH_FRAME_OFFSET| > + * | *or put another way* | > * | (3 << FGRAPH_DATA_INDEX_SHIFT)| \ | This is for fgraph_ops[3]. > * | ((2 - 1) << FGRAPH_DATA_SHIFT)| \ | The data size is 2 words. > * | (FGRAPH_TYPE_DATA << FGRAPH_TYPE_SHIFT)| \ | > @@ -67,7 +76,9 @@ > * +--------------------------------------------+ ( It is 4 words from the ret_stack) > * | STORED DATA WORD 2 | > * | STORED DATA WORD 1 | > - * +------i-------------------------------------+ > + * +--------------------------------------------+ > + * | (9 << 12) | (1 << 10) | FGRAPH_FRAME_OFFSET| > + * | *or put another way* | > * | (BIT(3)|BIT(0)) << FGRAPH_INDEX_SHIFT | \ | > * | FGRAPH_TYPE_BITMAP << FGRAPH_TYPE_SHIFT| \ | > * | (offset1:FGRAPH_FRAME_OFFSET) | <- the offset1 is from here > @@ -82,17 +93,24 @@ > * > * If a backtrace is required, and the real return pointer needs to be > * fetched, then it looks at the task's curr_ret_stack offset, if it > - * is greater than zero (reserved, or right before poped), it would mask > + * is greater than zero (reserved, or right before popped), it would mask > * the value by FGRAPH_FRAME_OFFSET_MASK to get the offset of the > * ftrace_ret_stack structure stored on the shadow stack. > */ > > -#define FGRAPH_FRAME_OFFSET_SIZE 10 > -#define FGRAPH_FRAME_OFFSET_MASK GENMASK(FGRAPH_FRAME_OFFSET_SIZE - 1, 0) > +/* > + * The following is for the top word on the stack: > + * > + * FGRAPH_FRAME_OFFSET (0-9) holds the offset delta to the fgraph frame > + * FGRAPH_TYPE (10-11) holds the type of word this is. > + * (RESERVED or BITMAP) > + */ > +#define FGRAPH_FRAME_OFFSET_BITS 10 > +#define FGRAPH_FRAME_OFFSET_MASK GENMASK(FGRAPH_FRAME_OFFSET_BITS - 1, 0) > > -#define FGRAPH_TYPE_SIZE 2 > -#define FGRAPH_TYPE_MASK GENMASK(FGRAPH_TYPE_SIZE - 1, 0) > -#define FGRAPH_TYPE_SHIFT FGRAPH_FRAME_OFFSET_SIZE > +#define FGRAPH_TYPE_BITS 2 > +#define FGRAPH_TYPE_MASK GENMASK(FGRAPH_TYPE_BITS - 1, 0) > +#define FGRAPH_TYPE_SHIFT FGRAPH_FRAME_OFFSET_BITS > > enum { > FGRAPH_TYPE_RESERVED = 0, > @@ -100,31 +118,48 @@ enum { > FGRAPH_TYPE_DATA = 2, > }; > > -#define FGRAPH_INDEX_SIZE 16 > -#define FGRAPH_INDEX_MASK GENMASK(FGRAPH_INDEX_SIZE - 1, 0) > -#define FGRAPH_INDEX_SHIFT (FGRAPH_TYPE_SHIFT + FGRAPH_TYPE_SIZE) > +/* > + * For BITMAP type: > + * FGRAPH_INDEX (12-27) bits holding the gops index wanting return callback called > + */ > +#define FGRAPH_INDEX_BITS 16 > +#define FGRAPH_INDEX_MASK GENMASK(FGRAPH_INDEX_BITS - 1, 0) > +#define FGRAPH_INDEX_SHIFT (FGRAPH_TYPE_SHIFT + FGRAPH_TYPE_BITS) > > -/* The data size == 0 means 1 word, and 31 (=2^5 - 1) means 32 words. */ > -#define FGRAPH_DATA_SIZE 5 > -#define FGRAPH_DATA_MASK GENMASK(FGRAPH_DATA_SIZE - 1, 0) > -#define FGRAPH_DATA_SHIFT (FGRAPH_TYPE_SHIFT + FGRAPH_TYPE_SIZE) > -#define FGRAPH_MAX_DATA_SIZE (sizeof(long) * (1 << FGRAPH_DATA_SIZE)) > +/* > + * For DATA type: > + * FGRAPH_DATA (12-17) bits hold the size of data (in words) > + * FGRAPH_INDEX (18-23) bits hold the index for which gops->idx the data is for > + * > + * Note: > + * data_size == 0 means 1 word, and 31 (=2^5 - 1) means 32 words. > + */ > +#define FGRAPH_DATA_BITS 5 > +#define FGRAPH_DATA_MASK GENMASK(FGRAPH_DATA_BITS - 1, 0) > +#define FGRAPH_DATA_SHIFT (FGRAPH_TYPE_SHIFT + FGRAPH_TYPE_BITS) > +#define FGRAPH_MAX_DATA_SIZE (sizeof(long) * (1 << FGRAPH_DATA_BITS)) > > -#define FGRAPH_DATA_INDEX_SIZE 4 > -#define FGRAPH_DATA_INDEX_MASK GENMASK(FGRAPH_DATA_INDEX_SIZE - 1, 0) > -#define FGRAPH_DATA_INDEX_SHIFT (FGRAPH_DATA_SHIFT + FGRAPH_DATA_SIZE) > +#define FGRAPH_DATA_INDEX_BITS 4 > +#define FGRAPH_DATA_INDEX_MASK GENMASK(FGRAPH_DATA_INDEX_BITS - 1, 0) > +#define FGRAPH_DATA_INDEX_SHIFT (FGRAPH_DATA_SHIFT + FGRAPH_DATA_BITS) > > #define FGRAPH_MAX_INDEX \ > - ((FGRAPH_INDEX_SIZE << FGRAPH_DATA_SIZE) + FGRAPH_RET_INDEX) > + ((FGRAPH_INDEX_SIZE << FGRAPH_DATA_BITS) + FGRAPH_RET_INDEX) > > -#define FGRAPH_ARRAY_SIZE FGRAPH_INDEX_SIZE > +#define FGRAPH_ARRAY_SIZE FGRAPH_INDEX_BITS > > -#define SHADOW_STACK_SIZE (PAGE_SIZE) > -#define SHADOW_STACK_IN_WORD (SHADOW_STACK_SIZE / sizeof(long)) > +/* > + * SHADOW_STACK_SIZE: The size in bytes of the entire shadow stack > + * SHADOW_STACK_OFFSET: The size in long words of the shadow stack > + * SHADOW_STACK_MAX_OFFSET: The max offset of the stack for a new frame to be added > + */ > +#define SHADOW_STACK_SIZE (PAGE_SIZE) > +#define SHADOW_STACK_OFFSET (SHADOW_STACK_SIZE / sizeof(long)) > /* Leave on a buffer at the end */ > #define SHADOW_STACK_MAX_OFFSET \ > - (SHADOW_STACK_IN_WORD - (FGRAPH_FRAME_OFFSET + 1 + FGRAPH_ARRAY_SIZE)) > + (SHADOW_STACK_OFFSET - (FGRAPH_FRAME_OFFSET + 1 + FGRAPH_ARRAY_SIZE)) > > +/* RET_STACK(): Return the frame from a given @offset from task @t */ > #define RET_STACK(t, offset) ((struct ftrace_ret_stack *)(&(t)->ret_stack[offset])) > > /* > @@ -132,12 +167,13 @@ enum { > * ret_stack to store task specific state. > */ > #define SHADOW_STACK_TASK_VARS(ret_stack) \ > - ((unsigned long *)(&(ret_stack)[SHADOW_STACK_IN_WORD - FGRAPH_ARRAY_SIZE])) > + ((unsigned long *)(&(ret_stack)[SHADOW_STACK_OFFSET - FGRAPH_ARRAY_SIZE])) > > DEFINE_STATIC_KEY_FALSE(kill_ftrace_graph); > int ftrace_graph_active; > > static struct fgraph_ops *fgraph_array[FGRAPH_ARRAY_SIZE]; > +static unsigned long fgraph_array_bitmask; > > /* LRU index table for fgraph_array */ > static int fgraph_lru_table[FGRAPH_ARRAY_SIZE]; > @@ -162,6 +198,8 @@ static int fgraph_lru_release_index(int idx) > > fgraph_lru_table[fgraph_lru_last] = idx; > fgraph_lru_last = (fgraph_lru_last + 1) % FGRAPH_ARRAY_SIZE; > + > + clear_bit(idx, &fgraph_array_bitmask); > return 0; > } > > @@ -176,70 +214,77 @@ static int fgraph_lru_alloc_index(void) > > fgraph_lru_table[fgraph_lru_next] = -1; > fgraph_lru_next = (fgraph_lru_next + 1) % FGRAPH_ARRAY_SIZE; > + > + set_bit(idx, &fgraph_array_bitmask); > return idx; > } > > +/* Get the offset to the fgraph frame from a ret_stack value */ > static inline int __get_offset(unsigned long val) > { > return val & FGRAPH_FRAME_OFFSET_MASK; > } > > +/* Get the type of word from a ret_stack value */ > static inline int __get_type(unsigned long val) > { > return (val >> FGRAPH_TYPE_SHIFT) & FGRAPH_TYPE_MASK; > } > > +/* Get the data_index for a DATA type ret_stack word */ > static inline int __get_data_index(unsigned long val) > { > return (val >> FGRAPH_DATA_INDEX_SHIFT) & FGRAPH_DATA_INDEX_MASK; > } > > +/* Get the data_size for a DATA type ret_stack word */ > static inline int __get_data_size(unsigned long val) > { > return ((val >> FGRAPH_DATA_SHIFT) & FGRAPH_DATA_MASK) + 1; > } > > +/* Get the word from the ret_stack at @offset */ > static inline unsigned long get_fgraph_entry(struct task_struct *t, int offset) > { > return t->ret_stack[offset]; > } > > +/* Get the FRAME_OFFSET from the word from the @offset on ret_stack */ > static inline int get_frame_offset(struct task_struct *t, int offset) > { > return __get_offset(t->ret_stack[offset]); > } > > +/* Get FGRAPH_TYPE from the word from the @offset at ret_stack */ > static inline int get_fgraph_type(struct task_struct *t, int offset) > { > return __get_type(t->ret_stack[offset]); > } > > +/* For BITMAP type: get the bitmask from the @offset at ret_stack */ > static inline unsigned long > -get_fgraph_index_bitmap(struct task_struct *t, int offset) > +get_bitmap_bits(struct task_struct *t, int offset) > { > return (t->ret_stack[offset] >> FGRAPH_INDEX_SHIFT) & FGRAPH_INDEX_MASK; > } > > +/* For BITMAP type: set the bits in the bitmap bitmask at @offset on ret_stack */ > static inline void > -set_fgraph_index_bitmap(struct task_struct *t, int offset, unsigned long bitmap) > -{ > - t->ret_stack[offset] = (bitmap << FGRAPH_INDEX_SHIFT) | > - (FGRAPH_TYPE_BITMAP << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET; > -} > - > -static inline bool is_fgraph_index_set(struct task_struct *t, int offset, int idx) > +set_bitmap_bits(struct task_struct *t, int offset, unsigned long bitmap) > { > - return !!(get_fgraph_index_bitmap(t, offset) & BIT(idx)); > + t->ret_stack[offset] |= (bitmap << FGRAPH_INDEX_SHIFT); > } > > +/* Write the bitmap to the ret_stack at @offset (does index, offset and bitmask) */ > static inline void > -add_fgraph_index_bitmap(struct task_struct *t, int offset, unsigned long bitmap) > +set_bitmap(struct task_struct *t, int offset, unsigned long bitmap) > { > - t->ret_stack[offset] |= (bitmap << FGRAPH_INDEX_SHIFT); > + t->ret_stack[offset] = (bitmap << FGRAPH_INDEX_SHIFT) | > + (FGRAPH_TYPE_BITMAP << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET; > } > > -/* Note: @offset is the offset of FGRAPH_TYPE_DATA entry. */ > -static inline void *get_fgraph_data(struct task_struct *t, int offset) > +/* For DATA type: get the data saved under the ret_stack word at @offset */ > +static inline void *get_data_type_data(struct task_struct *t, int offset) > { > unsigned long val = t->ret_stack[offset]; > > @@ -249,7 +294,8 @@ static inline void *get_fgraph_data(struct task_struct *t, int offset) > return (void *)&t->ret_stack[offset]; > } > > -static inline unsigned long make_fgraph_data(int idx, int size, int offset) > +/* Create the ret_stack word for a DATA type */ > +static inline unsigned long make_data_type_val(int idx, int size, int offset) > { > return (idx << FGRAPH_DATA_INDEX_SHIFT) | > ((size - 1) << FGRAPH_DATA_SHIFT) | > @@ -326,7 +372,7 @@ void *fgraph_reserve_data(int idx, int size_bytes) > if (unlikely(curr_ret_stack >= SHADOW_STACK_MAX_OFFSET)) > return NULL; > > - val = make_fgraph_data(idx, data_size, __get_offset(val) + data_size + 1); > + val = make_data_type_val(idx, data_size, __get_offset(val) + data_size + 1); > > /* Set the last word to be reserved */ > current->ret_stack[curr_ret_stack - 1] = val; > @@ -371,7 +417,7 @@ void *fgraph_retrieve_data(int idx, int *size_bytes) > found: > if (size_bytes) > *size_bytes = __get_data_size(val) * sizeof(long); > - return get_fgraph_data(current, offset); > + return get_data_type_data(current, offset); > } > > /** > @@ -394,6 +440,9 @@ unsigned long *fgraph_get_task_var(struct fgraph_ops *gops) > * @offset: The offset into @t->ret_stack to find the ret_stack entry > * @frame_offset: Where to place the offset into @t->ret_stack of that entry > * > + * Returns a pointer to the previous ret_stack below @offset or NULL > + * when it reaches the bottom of the stack. > + * > * Calling this with: > * > * offset = task->curr_ret_stack; > @@ -493,30 +542,21 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, > if (!current->ret_stack) > return -EBUSY; > > - /* > - * At first, check whether the previous fgraph callback is pushed by > - * the fgraph on the same function entry. > - * But if @func is the self tail-call function, we also need to ensure > - * the ret_stack is not for the previous call by checking whether the > - * bit of @fgraph_idx is set or not. > - */ > - ret_stack = get_ret_stack(current, current->curr_ret_stack, &offset); > - if (ret_stack && ret_stack->func == func && > - get_fgraph_type(current, offset + FGRAPH_FRAME_OFFSET) == FGRAPH_TYPE_BITMAP && > - !is_fgraph_index_set(current, offset + FGRAPH_FRAME_OFFSET, fgraph_idx)) > - return offset + FGRAPH_FRAME_OFFSET; > + BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long)); > > + /* Set val to "reserved" with the delta to the new fgraph frame */ > val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET; > > - BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long)); > - > /* > * We must make sure the ret_stack is tested before we read > * anything else. > */ > smp_rmb(); > > - /* The return trace stack is full */ > + /* > + * Check if there's room on the shadow stack to fit a fraph frame > + * and a bitmap word. > + */ > if (current->curr_ret_stack + FGRAPH_FRAME_OFFSET + 1 >= SHADOW_STACK_MAX_OFFSET) { > atomic_inc(¤t->trace_overrun); > return -EBUSY; > @@ -545,7 +585,7 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, > * at least a correct offset! > */ > barrier(); > - current->curr_ret_stack = offset + 1; > + WRITE_ONCE(current->curr_ret_stack, offset + 1); > /* > * This next barrier is to ensure that an interrupt coming in > * will not corrupt what we are about to write. > @@ -597,7 +637,8 @@ int function_graph_enter(unsigned long ret, unsigned long func, > if (offset < 0) > goto out; > > - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) { > + for_each_set_bit(i, &fgraph_array_bitmask, > + sizeof(fgraph_array_bitmask) * BITS_PER_BYTE) { > struct fgraph_ops *gops = fgraph_array[i]; > int save_curr_ret_stack; > > @@ -620,7 +661,7 @@ int function_graph_enter(unsigned long ret, unsigned long func, > * Since this function uses fgraph_idx = 0 as a tail-call checking > * flag, set that bit always. > */ > - set_fgraph_index_bitmap(current, offset, bitmap | BIT(0)); > + set_bitmap(current, offset, bitmap | BIT(0)); > > return 0; > out_ret: > @@ -659,9 +700,9 @@ int function_graph_enter_ops(unsigned long ret, unsigned long func, > save_curr_ret_stack = current->curr_ret_stack; > if (gops->entryfunc(&trace, gops)) { > if (type == FGRAPH_TYPE_RESERVED) > - set_fgraph_index_bitmap(current, offset, BIT(gops->idx)); > + set_bitmap(current, offset, BIT(gops->idx)); > else > - add_fgraph_index_bitmap(current, offset, BIT(gops->idx)); > + set_bitmap_bits(current, offset, BIT(gops->idx)); > return 0; > } else > current->curr_ret_stack = save_curr_ret_stack; > @@ -791,12 +832,11 @@ static unsigned long __ftrace_return_to_handler(struct fgraph_ret_regs *ret_regs > trace.retval = fgraph_ret_regs_return_value(ret_regs); > #endif > > - bitmap = get_fgraph_index_bitmap(current, offset); > - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) { > + bitmap = get_bitmap_bits(current, offset); > + > + for_each_set_bit(i, &bitmap, sizeof(bitmap) * BITS_PER_BYTE) { > struct fgraph_ops *gops = fgraph_array[i]; > > - if (!(bitmap & BIT(i))) > - continue; > if (gops == &fgraph_stub) > continue; > > @@ -879,9 +919,10 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx, > unsigned long ret, unsigned long *retp) > { > struct ftrace_ret_stack *ret_stack; > + unsigned long return_handler = (unsigned long)dereference_kernel_function_descriptor(return_to_handler); > int i = task->curr_ret_stack; > > - if (ret != (unsigned long)dereference_kernel_function_descriptor(return_to_handler)) > + if (ret != return_handler) > return ret; > > while (i > 0) { > @@ -891,14 +932,13 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx, > /* > * For the tail-call, there would be 2 or more ftrace_ret_stacks on > * the ret_stack, which records "return_to_handler" as the return > - * address excpt for the last one. > + * address except for the last one. > * But on the real stack, there should be 1 entry because tail-call > * reuses the return address on the stack and jump to the next function. > * Thus we will continue to find real return address. > */ > if (ret_stack->retp == retp && > - ret_stack->ret != > - (unsigned long)dereference_kernel_function_descriptor(return_to_handler)) > + ret_stack->ret != return_handler) > return ret_stack->ret; > } > > @@ -909,10 +949,11 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx, > unsigned long ret, unsigned long *retp) > { > struct ftrace_ret_stack *ret_stack; > + unsigned long return_handler = (unsigned long)dereference_kernel_function_descriptor(return_to_handler); > int offset = task->curr_ret_stack; > int i; > > - if (ret != (unsigned long)dereference_kernel_function_descriptor(return_to_handler)) > + if (ret != return_handler) > return ret; > > if (!idx) > @@ -921,8 +962,7 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx, > i = *idx; > do { > ret_stack = get_ret_stack(task, offset, &offset); > - if (ret_stack && ret_stack->ret == > - (unsigned long)dereference_kernel_function_descriptor(return_to_handler)) > + if (ret_stack && ret_stack->ret == return_handler) > continue; > i--; > } while (i >= 0 && ret_stack);
On Mon, 27 May 2024 09:37:47 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > The diff between this and Masami's last update is at the end of this email. > > Thanks for update the patches! > I think your changes are good. I just have some comments and replied. > So, the plan is to push this series in the tracing/for-next? I will > port my fprobe part on it and run some tests. Yeah. I'll probably push it after -rc2 comes out and base it on top of that. I usually wait till rc2 as it tends to be more stable than rc1. I'll send an updated series later this week that addresses your comments. -- Steve
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c index 3313e4b83aa2..1aae521e5997 100644 --- a/kernel/trace/fgraph.c +++ b/kernel/trace/fgraph.c @@ -18,29 +18,36 @@ #include "ftrace_internal.h" #include "trace.h" -#define FGRAPH_FRAME_SIZE sizeof(struct ftrace_ret_stack) -#define FGRAPH_FRAME_OFFSET DIV_ROUND_UP(FGRAPH_FRAME_SIZE, sizeof(long)) /* - * On entry to a function (via function_graph_enter()), a new ftrace_ret_stack - * is allocated on the task's ret_stack with bitmap entry, then each - * fgraph_ops on the fgraph_array[]'s entryfunc is called and if that returns - * non-zero, the index into the fgraph_array[] for that fgraph_ops is recorded - * on the bitmap entry as a bit flag. - * - * The top of the ret_stack (when not empty) will always have a reference - * to the last ftrace_ret_stack saved. All references to the - * ftrace_ret_stack has the format of: + * FGRAPH_FRAME_SIZE: Size in bytes of the meta data on the shadow stack + * FGRAPH_FRAME_OFFSET: Size in long words of the meta data frame + */ +#define FGRAPH_FRAME_SIZE sizeof(struct ftrace_ret_stack) +#define FGRAPH_FRAME_OFFSET DIV_ROUND_UP(FGRAPH_FRAME_SIZE, sizeof(long)) + +/* + * On entry to a function (via function_graph_enter()), a new fgraph frame + * (ftrace_ret_stack) is pushed onto the stack as well as a word that + * holds a bitmask and a type (called "bitmap"). The bitmap is defined as: * * bits: 0 - 9 offset in words from the previous ftrace_ret_stack - * (bitmap type should have FGRAPH_FRAME_OFFSET always) + * * bits: 10 - 11 Type of storage * 0 - reserved * 1 - bitmap of fgraph_array index * 2 - reserved data * - * For bitmap of fgraph_array index + * For type with "bitmap of fgraph_array index" (FGRAPH_TYPE_BITMAP): * bits: 12 - 27 The bitmap of fgraph_ops fgraph_array index + * That is, it's a bitmask of 0-15 (16 bits) + * where if a corresponding ops in the fgraph_array[] + * expects a callback from the return of the function + * it's corresponding bit will be set. + * + * + * The top of the ret_stack (when not empty) will always have a reference + * word that points to the last fgraph frame that was saved. * * For reserved data: * bits: 12 - 17 The size in words that is stored @@ -52,7 +59,7 @@ * stored two words of data, this is what will be on the task's shadow * ret_stack: (the stack grows upward) * - * ret_stack[SHADOW_STACK_IN_WORD] + * ret_stack[SHADOW_STACK_OFFSET] * | SHADOW_STACK_TASK_VARS(ret_stack)[15] | * ... * | SHADOW_STACK_TASK_VARS(ret_stack)[0] | @@ -60,6 +67,8 @@ * ... * | | <- task->curr_ret_stack * +--------------------------------------------+ + * | (3 << 12) | (3 << 10) | FGRAPH_FRAME_OFFSET| + * | *or put another way* | * | (3 << FGRAPH_DATA_INDEX_SHIFT)| \ | This is for fgraph_ops[3]. * | ((2 - 1) << FGRAPH_DATA_SHIFT)| \ | The data size is 2 words. * | (FGRAPH_TYPE_DATA << FGRAPH_TYPE_SHIFT)| \ | @@ -67,7 +76,9 @@ * +--------------------------------------------+ ( It is 4 words from the ret_stack) * | STORED DATA WORD 2 | * | STORED DATA WORD 1 | - * +------i-------------------------------------+ + * +--------------------------------------------+ + * | (9 << 12) | (1 << 10) | FGRAPH_FRAME_OFFSET| + * | *or put another way* | * | (BIT(3)|BIT(0)) << FGRAPH_INDEX_SHIFT | \ | * | FGRAPH_TYPE_BITMAP << FGRAPH_TYPE_SHIFT| \ | * | (offset1:FGRAPH_FRAME_OFFSET) | <- the offset1 is from here @@ -82,17 +93,24 @@ * * If a backtrace is required, and the real return pointer needs to be * fetched, then it looks at the task's curr_ret_stack offset, if it - * is greater than zero (reserved, or right before poped), it would mask + * is greater than zero (reserved, or right before popped), it would mask * the value by FGRAPH_FRAME_OFFSET_MASK to get the offset of the * ftrace_ret_stack structure stored on the shadow stack. */ -#define FGRAPH_FRAME_OFFSET_SIZE 10 -#define FGRAPH_FRAME_OFFSET_MASK GENMASK(FGRAPH_FRAME_OFFSET_SIZE - 1, 0) +/* + * The following is for the top word on the stack: + * + * FGRAPH_FRAME_OFFSET (0-9) holds the offset delta to the fgraph frame + * FGRAPH_TYPE (10-11) holds the type of word this is. + * (RESERVED or BITMAP) + */ +#define FGRAPH_FRAME_OFFSET_BITS 10 +#define FGRAPH_FRAME_OFFSET_MASK GENMASK(FGRAPH_FRAME_OFFSET_BITS - 1, 0) -#define FGRAPH_TYPE_SIZE 2 -#define FGRAPH_TYPE_MASK GENMASK(FGRAPH_TYPE_SIZE - 1, 0) -#define FGRAPH_TYPE_SHIFT FGRAPH_FRAME_OFFSET_SIZE +#define FGRAPH_TYPE_BITS 2 +#define FGRAPH_TYPE_MASK GENMASK(FGRAPH_TYPE_BITS - 1, 0) +#define FGRAPH_TYPE_SHIFT FGRAPH_FRAME_OFFSET_BITS enum { FGRAPH_TYPE_RESERVED = 0, @@ -100,31 +118,48 @@ enum { FGRAPH_TYPE_DATA = 2, }; -#define FGRAPH_INDEX_SIZE 16 -#define FGRAPH_INDEX_MASK GENMASK(FGRAPH_INDEX_SIZE - 1, 0) -#define FGRAPH_INDEX_SHIFT (FGRAPH_TYPE_SHIFT + FGRAPH_TYPE_SIZE) +/* + * For BITMAP type: + * FGRAPH_INDEX (12-27) bits holding the gops index wanting return callback called + */ +#define FGRAPH_INDEX_BITS 16 +#define FGRAPH_INDEX_MASK GENMASK(FGRAPH_INDEX_BITS - 1, 0) +#define FGRAPH_INDEX_SHIFT (FGRAPH_TYPE_SHIFT + FGRAPH_TYPE_BITS) -/* The data size == 0 means 1 word, and 31 (=2^5 - 1) means 32 words. */ -#define FGRAPH_DATA_SIZE 5 -#define FGRAPH_DATA_MASK GENMASK(FGRAPH_DATA_SIZE - 1, 0) -#define FGRAPH_DATA_SHIFT (FGRAPH_TYPE_SHIFT + FGRAPH_TYPE_SIZE) -#define FGRAPH_MAX_DATA_SIZE (sizeof(long) * (1 << FGRAPH_DATA_SIZE)) +/* + * For DATA type: + * FGRAPH_DATA (12-17) bits hold the size of data (in words) + * FGRAPH_INDEX (18-23) bits hold the index for which gops->idx the data is for + * + * Note: + * data_size == 0 means 1 word, and 31 (=2^5 - 1) means 32 words. + */ +#define FGRAPH_DATA_BITS 5 +#define FGRAPH_DATA_MASK GENMASK(FGRAPH_DATA_BITS - 1, 0) +#define FGRAPH_DATA_SHIFT (FGRAPH_TYPE_SHIFT + FGRAPH_TYPE_BITS) +#define FGRAPH_MAX_DATA_SIZE (sizeof(long) * (1 << FGRAPH_DATA_BITS)) -#define FGRAPH_DATA_INDEX_SIZE 4 -#define FGRAPH_DATA_INDEX_MASK GENMASK(FGRAPH_DATA_INDEX_SIZE - 1, 0) -#define FGRAPH_DATA_INDEX_SHIFT (FGRAPH_DATA_SHIFT + FGRAPH_DATA_SIZE) +#define FGRAPH_DATA_INDEX_BITS 4 +#define FGRAPH_DATA_INDEX_MASK GENMASK(FGRAPH_DATA_INDEX_BITS - 1, 0) +#define FGRAPH_DATA_INDEX_SHIFT (FGRAPH_DATA_SHIFT + FGRAPH_DATA_BITS) #define FGRAPH_MAX_INDEX \ - ((FGRAPH_INDEX_SIZE << FGRAPH_DATA_SIZE) + FGRAPH_RET_INDEX) + ((FGRAPH_INDEX_SIZE << FGRAPH_DATA_BITS) + FGRAPH_RET_INDEX) -#define FGRAPH_ARRAY_SIZE FGRAPH_INDEX_SIZE +#define FGRAPH_ARRAY_SIZE FGRAPH_INDEX_BITS -#define SHADOW_STACK_SIZE (PAGE_SIZE) -#define SHADOW_STACK_IN_WORD (SHADOW_STACK_SIZE / sizeof(long)) +/* + * SHADOW_STACK_SIZE: The size in bytes of the entire shadow stack + * SHADOW_STACK_OFFSET: The size in long words of the shadow stack + * SHADOW_STACK_MAX_OFFSET: The max offset of the stack for a new frame to be added + */ +#define SHADOW_STACK_SIZE (PAGE_SIZE) +#define SHADOW_STACK_OFFSET (SHADOW_STACK_SIZE / sizeof(long)) /* Leave on a buffer at the end */ #define SHADOW_STACK_MAX_OFFSET \ - (SHADOW_STACK_IN_WORD - (FGRAPH_FRAME_OFFSET + 1 + FGRAPH_ARRAY_SIZE)) + (SHADOW_STACK_OFFSET - (FGRAPH_FRAME_OFFSET + 1 + FGRAPH_ARRAY_SIZE)) +/* RET_STACK(): Return the frame from a given @offset from task @t */ #define RET_STACK(t, offset) ((struct ftrace_ret_stack *)(&(t)->ret_stack[offset])) /* @@ -132,12 +167,13 @@ enum { * ret_stack to store task specific state. */ #define SHADOW_STACK_TASK_VARS(ret_stack) \ - ((unsigned long *)(&(ret_stack)[SHADOW_STACK_IN_WORD - FGRAPH_ARRAY_SIZE])) + ((unsigned long *)(&(ret_stack)[SHADOW_STACK_OFFSET - FGRAPH_ARRAY_SIZE])) DEFINE_STATIC_KEY_FALSE(kill_ftrace_graph); int ftrace_graph_active; static struct fgraph_ops *fgraph_array[FGRAPH_ARRAY_SIZE]; +static unsigned long fgraph_array_bitmask; /* LRU index table for fgraph_array */ static int fgraph_lru_table[FGRAPH_ARRAY_SIZE]; @@ -162,6 +198,8 @@ static int fgraph_lru_release_index(int idx) fgraph_lru_table[fgraph_lru_last] = idx; fgraph_lru_last = (fgraph_lru_last + 1) % FGRAPH_ARRAY_SIZE; + + clear_bit(idx, &fgraph_array_bitmask); return 0; } @@ -176,70 +214,77 @@ static int fgraph_lru_alloc_index(void) fgraph_lru_table[fgraph_lru_next] = -1; fgraph_lru_next = (fgraph_lru_next + 1) % FGRAPH_ARRAY_SIZE; + + set_bit(idx, &fgraph_array_bitmask); return idx; } +/* Get the offset to the fgraph frame from a ret_stack value */ static inline int __get_offset(unsigned long val) { return val & FGRAPH_FRAME_OFFSET_MASK; } +/* Get the type of word from a ret_stack value */ static inline int __get_type(unsigned long val) { return (val >> FGRAPH_TYPE_SHIFT) & FGRAPH_TYPE_MASK; } +/* Get the data_index for a DATA type ret_stack word */ static inline int __get_data_index(unsigned long val) { return (val >> FGRAPH_DATA_INDEX_SHIFT) & FGRAPH_DATA_INDEX_MASK; } +/* Get the data_size for a DATA type ret_stack word */ static inline int __get_data_size(unsigned long val) { return ((val >> FGRAPH_DATA_SHIFT) & FGRAPH_DATA_MASK) + 1; } +/* Get the word from the ret_stack at @offset */ static inline unsigned long get_fgraph_entry(struct task_struct *t, int offset) { return t->ret_stack[offset]; } +/* Get the FRAME_OFFSET from the word from the @offset on ret_stack */ static inline int get_frame_offset(struct task_struct *t, int offset) { return __get_offset(t->ret_stack[offset]); } +/* Get FGRAPH_TYPE from the word from the @offset at ret_stack */ static inline int get_fgraph_type(struct task_struct *t, int offset) { return __get_type(t->ret_stack[offset]); } +/* For BITMAP type: get the bitmask from the @offset at ret_stack */ static inline unsigned long -get_fgraph_index_bitmap(struct task_struct *t, int offset) +get_bitmap_bits(struct task_struct *t, int offset) { return (t->ret_stack[offset] >> FGRAPH_INDEX_SHIFT) & FGRAPH_INDEX_MASK; } +/* For BITMAP type: set the bits in the bitmap bitmask at @offset on ret_stack */ static inline void -set_fgraph_index_bitmap(struct task_struct *t, int offset, unsigned long bitmap) -{ - t->ret_stack[offset] = (bitmap << FGRAPH_INDEX_SHIFT) | - (FGRAPH_TYPE_BITMAP << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET; -} - -static inline bool is_fgraph_index_set(struct task_struct *t, int offset, int idx) +set_bitmap_bits(struct task_struct *t, int offset, unsigned long bitmap) { - return !!(get_fgraph_index_bitmap(t, offset) & BIT(idx)); + t->ret_stack[offset] |= (bitmap << FGRAPH_INDEX_SHIFT); } +/* Write the bitmap to the ret_stack at @offset (does index, offset and bitmask) */ static inline void -add_fgraph_index_bitmap(struct task_struct *t, int offset, unsigned long bitmap) +set_bitmap(struct task_struct *t, int offset, unsigned long bitmap) { - t->ret_stack[offset] |= (bitmap << FGRAPH_INDEX_SHIFT); + t->ret_stack[offset] = (bitmap << FGRAPH_INDEX_SHIFT) | + (FGRAPH_TYPE_BITMAP << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET; } -/* Note: @offset is the offset of FGRAPH_TYPE_DATA entry. */ -static inline void *get_fgraph_data(struct task_struct *t, int offset) +/* For DATA type: get the data saved under the ret_stack word at @offset */ +static inline void *get_data_type_data(struct task_struct *t, int offset) { unsigned long val = t->ret_stack[offset]; @@ -249,7 +294,8 @@ static inline void *get_fgraph_data(struct task_struct *t, int offset) return (void *)&t->ret_stack[offset]; } -static inline unsigned long make_fgraph_data(int idx, int size, int offset) +/* Create the ret_stack word for a DATA type */ +static inline unsigned long make_data_type_val(int idx, int size, int offset) { return (idx << FGRAPH_DATA_INDEX_SHIFT) | ((size - 1) << FGRAPH_DATA_SHIFT) | @@ -326,7 +372,7 @@ void *fgraph_reserve_data(int idx, int size_bytes) if (unlikely(curr_ret_stack >= SHADOW_STACK_MAX_OFFSET)) return NULL; - val = make_fgraph_data(idx, data_size, __get_offset(val) + data_size + 1); + val = make_data_type_val(idx, data_size, __get_offset(val) + data_size + 1); /* Set the last word to be reserved */ current->ret_stack[curr_ret_stack - 1] = val; @@ -371,7 +417,7 @@ void *fgraph_retrieve_data(int idx, int *size_bytes) found: if (size_bytes) *size_bytes = __get_data_size(val) * sizeof(long); - return get_fgraph_data(current, offset); + return get_data_type_data(current, offset); } /** @@ -394,6 +440,9 @@ unsigned long *fgraph_get_task_var(struct fgraph_ops *gops) * @offset: The offset into @t->ret_stack to find the ret_stack entry * @frame_offset: Where to place the offset into @t->ret_stack of that entry * + * Returns a pointer to the previous ret_stack below @offset or NULL + * when it reaches the bottom of the stack. + * * Calling this with: * * offset = task->curr_ret_stack; @@ -493,30 +542,21 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, if (!current->ret_stack) return -EBUSY; - /* - * At first, check whether the previous fgraph callback is pushed by - * the fgraph on the same function entry. - * But if @func is the self tail-call function, we also need to ensure - * the ret_stack is not for the previous call by checking whether the - * bit of @fgraph_idx is set or not. - */ - ret_stack = get_ret_stack(current, current->curr_ret_stack, &offset); - if (ret_stack && ret_stack->func == func && - get_fgraph_type(current, offset + FGRAPH_FRAME_OFFSET) == FGRAPH_TYPE_BITMAP && - !is_fgraph_index_set(current, offset + FGRAPH_FRAME_OFFSET, fgraph_idx)) - return offset + FGRAPH_FRAME_OFFSET; + BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long)); + /* Set val to "reserved" with the delta to the new fgraph frame */ val = (FGRAPH_TYPE_RESERVED << FGRAPH_TYPE_SHIFT) | FGRAPH_FRAME_OFFSET; - BUILD_BUG_ON(SHADOW_STACK_SIZE % sizeof(long)); - /* * We must make sure the ret_stack is tested before we read * anything else. */ smp_rmb(); - /* The return trace stack is full */ + /* + * Check if there's room on the shadow stack to fit a fraph frame + * and a bitmap word. + */ if (current->curr_ret_stack + FGRAPH_FRAME_OFFSET + 1 >= SHADOW_STACK_MAX_OFFSET) { atomic_inc(¤t->trace_overrun); return -EBUSY; @@ -545,7 +585,7 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, * at least a correct offset! */ barrier(); - current->curr_ret_stack = offset + 1; + WRITE_ONCE(current->curr_ret_stack, offset + 1); /* * This next barrier is to ensure that an interrupt coming in * will not corrupt what we are about to write. @@ -597,7 +637,8 @@ int function_graph_enter(unsigned long ret, unsigned long func, if (offset < 0) goto out; - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) { + for_each_set_bit(i, &fgraph_array_bitmask, + sizeof(fgraph_array_bitmask) * BITS_PER_BYTE) { struct fgraph_ops *gops = fgraph_array[i]; int save_curr_ret_stack; @@ -620,7 +661,7 @@ int function_graph_enter(unsigned long ret, unsigned long func, * Since this function uses fgraph_idx = 0 as a tail-call checking * flag, set that bit always. */ - set_fgraph_index_bitmap(current, offset, bitmap | BIT(0)); + set_bitmap(current, offset, bitmap | BIT(0)); return 0; out_ret: @@ -659,9 +700,9 @@ int function_graph_enter_ops(unsigned long ret, unsigned long func, save_curr_ret_stack = current->curr_ret_stack; if (gops->entryfunc(&trace, gops)) { if (type == FGRAPH_TYPE_RESERVED) - set_fgraph_index_bitmap(current, offset, BIT(gops->idx)); + set_bitmap(current, offset, BIT(gops->idx)); else - add_fgraph_index_bitmap(current, offset, BIT(gops->idx)); + set_bitmap_bits(current, offset, BIT(gops->idx)); return 0; } else current->curr_ret_stack = save_curr_ret_stack; @@ -791,12 +832,11 @@ static unsigned long __ftrace_return_to_handler(struct fgraph_ret_regs *ret_regs trace.retval = fgraph_ret_regs_return_value(ret_regs); #endif - bitmap = get_fgraph_index_bitmap(current, offset); - for (i = 0; i < FGRAPH_ARRAY_SIZE; i++) { + bitmap = get_bitmap_bits(current, offset); + + for_each_set_bit(i, &bitmap, sizeof(bitmap) * BITS_PER_BYTE) { struct fgraph_ops *gops = fgraph_array[i]; - if (!(bitmap & BIT(i))) - continue; if (gops == &fgraph_stub) continue; @@ -879,9 +919,10 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx, unsigned long ret, unsigned long *retp) { struct ftrace_ret_stack *ret_stack; + unsigned long return_handler = (unsigned long)dereference_kernel_function_descriptor(return_to_handler); int i = task->curr_ret_stack; - if (ret != (unsigned long)dereference_kernel_function_descriptor(return_to_handler)) + if (ret != return_handler) return ret; while (i > 0) { @@ -891,14 +932,13 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx, /* * For the tail-call, there would be 2 or more ftrace_ret_stacks on * the ret_stack, which records "return_to_handler" as the return - * address excpt for the last one. + * address except for the last one. * But on the real stack, there should be 1 entry because tail-call * reuses the return address on the stack and jump to the next function. * Thus we will continue to find real return address. */ if (ret_stack->retp == retp && - ret_stack->ret != - (unsigned long)dereference_kernel_function_descriptor(return_to_handler)) + ret_stack->ret != return_handler) return ret_stack->ret; } @@ -909,10 +949,11 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx, unsigned long ret, unsigned long *retp) { struct ftrace_ret_stack *ret_stack; + unsigned long return_handler = (unsigned long)dereference_kernel_function_descriptor(return_to_handler); int offset = task->curr_ret_stack; int i; - if (ret != (unsigned long)dereference_kernel_function_descriptor(return_to_handler)) + if (ret != return_handler) return ret; if (!idx) @@ -921,8 +962,7 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx, i = *idx; do { ret_stack = get_ret_stack(task, offset, &offset); - if (ret_stack && ret_stack->ret == - (unsigned long)dereference_kernel_function_descriptor(return_to_handler)) + if (ret_stack && ret_stack->ret == return_handler) continue; i--; } while (i >= 0 && ret_stack);