Message ID | 38191e96ec6d331662ee7278e26edb79cdf95402.1644482771.git.christophe.leroy@csgroup.eu (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | tracing: uninline trace_trigger_soft_disabled() | expand |
On Thu, 10 Feb 2022 09:47:52 +0100 Christophe Leroy <christophe.leroy@csgroup.eu> wrote: > On a ppc32 build with CONFIG_CC_OPTIMISE_FOR_SIZE, > trace_trigger_soft_disabled() appears more than 50 times in vmlinux. > > That function is rather big for an inlined function, and > it doesn't benefit much from inlining as its only parameter > is a pointer to a struct in memory: The number of parameters is not the reason for it being inlined. It's in a *very* hot path, and a function call causes a noticeable performance hit. > > c003df60 <trace_trigger_soft_disabled>: > c003df60: 94 21 ff f0 stwu r1,-16(r1) > c003df64: 7c 08 02 a6 mflr r0 > c003df68: 90 01 00 14 stw r0,20(r1) > c003df6c: bf c1 00 08 stmw r30,8(r1) > c003df70: 83 e3 00 24 lwz r31,36(r3) > c003df74: 73 e9 01 00 andi. r9,r31,256 > c003df78: 41 82 00 10 beq c003df88 <trace_trigger_soft_disabled+0x28> > c003df7c: 38 60 00 00 li r3,0 > c003df80: 39 61 00 10 addi r11,r1,16 > c003df84: 4b fd 60 ac b c0014030 <_rest32gpr_30_x> > c003df88: 73 e9 00 80 andi. r9,r31,128 > c003df8c: 7c 7e 1b 78 mr r30,r3 > c003df90: 41 a2 00 14 beq c003dfa4 <trace_trigger_soft_disabled+0x44> > c003df94: 38 c0 00 00 li r6,0 > c003df98: 38 a0 00 00 li r5,0 > c003df9c: 38 80 00 00 li r4,0 > c003dfa0: 48 05 c5 f1 bl c009a590 <event_triggers_call> > c003dfa4: 73 e9 00 40 andi. r9,r31,64 > c003dfa8: 40 82 00 28 bne c003dfd0 <trace_trigger_soft_disabled+0x70> > c003dfac: 73 ff 02 00 andi. r31,r31,512 > c003dfb0: 41 82 ff cc beq c003df7c <trace_trigger_soft_disabled+0x1c> > c003dfb4: 80 01 00 14 lwz r0,20(r1) > c003dfb8: 83 e1 00 0c lwz r31,12(r1) > c003dfbc: 7f c3 f3 78 mr r3,r30 > c003dfc0: 83 c1 00 08 lwz r30,8(r1) > c003dfc4: 7c 08 03 a6 mtlr r0 > c003dfc8: 38 21 00 10 addi r1,r1,16 > c003dfcc: 48 05 6f 6c b c0094f38 <trace_event_ignore_this_pid> > c003dfd0: 38 60 00 01 li r3,1 > c003dfd4: 4b ff ff ac b c003df80 <trace_trigger_soft_disabled+0x20> > > It doesn't benefit much from inlining as its only parameter is a > pointer to a struct in memory so no constant folding is involved. > > Uninline it and move it into kernel/trace/trace_events_trigger.c > > It reduces the size of vmlinux by approximately 10 kbytes. If you have an issue with the size, perhaps the function can be modified to condense it. I'm happy to have a size reduction, but I will NACK making it into a function call. -- Steve
Le 10/02/2022 à 15:26, Steven Rostedt a écrit : > On Thu, 10 Feb 2022 09:47:52 +0100 > Christophe Leroy <christophe.leroy@csgroup.eu> wrote: > >> On a ppc32 build with CONFIG_CC_OPTIMISE_FOR_SIZE, >> trace_trigger_soft_disabled() appears more than 50 times in vmlinux. >> >> That function is rather big for an inlined function, and >> it doesn't benefit much from inlining as its only parameter >> is a pointer to a struct in memory: > > The number of parameters is not the reason for it being inlined. It's in a > *very* hot path, and a function call causes a noticeable performance hit. Fair enough > > >> >> >> It doesn't benefit much from inlining as its only parameter is a >> pointer to a struct in memory so no constant folding is involved. >> >> Uninline it and move it into kernel/trace/trace_events_trigger.c >> >> It reduces the size of vmlinux by approximately 10 kbytes. > > If you have an issue with the size, perhaps the function can be modified to > condense it. I'm happy to have a size reduction, but I will NACK making it > into a function call. > Well, my first issue is that I get it duplicated 50 times because GCC find it too big to get inlined. So it is a function call anyway. For instance, when building arch/powerpc/kernel/irq.o which -Winline, I get: ./include/linux/perf_event.h:1169:20: error: inlining failed in call to 'perf_fetch_caller_regs': call is unlikely and code size would grow [-Werror=inline] ./include/linux/perf_event.h:1169:20: error: inlining failed in call to 'perf_fetch_caller_regs': call is unlikely and code size would grow [-Werror=inline] ./include/linux/perf_event.h:1169:20: error: inlining failed in call to 'perf_fetch_caller_regs': call is unlikely and code size would grow [-Werror=inline] ./include/linux/perf_event.h:1169:20: error: inlining failed in call to 'perf_fetch_caller_regs': call is unlikely and code size would grow [-Werror=inline] ./include/linux/trace_events.h:712:1: error: inlining failed in call to 'trace_trigger_soft_disabled': call is unlikely and code size would grow [-Werror=inline] ./include/linux/trace_events.h:712:1: error: inlining failed in call to 'trace_trigger_soft_disabled': call is unlikely and code size would grow [-Werror=inline] ./include/linux/trace_events.h:712:1: error: inlining failed in call to 'trace_trigger_soft_disabled': call is unlikely and code size would grow [-Werror=inline] ./include/linux/trace_events.h:712:1: error: inlining failed in call to 'trace_trigger_soft_disabled': call is unlikely and code size would grow [-Werror=inline] If having it a function call is really an issue, then it should be __always_inline I will see the impact on size when we do so. What is in the hot path really ? It is the entire function or only the first test ? What about: static inline bool trace_trigger_soft_disabled(struct trace_event_file *file) { unsigned long eflags = file->flags; if (!(eflags & EVENT_FILE_FL_TRIGGER_COND)) return outlined_trace_trigger_soft_disabled(...); return false; } Or is there more in the hot path ? Thanks Christophe
On Thu, 10 Feb 2022 15:05:51 +0000 Christophe Leroy <christophe.leroy@csgroup.eu> wrote: > > Well, my first issue is that I get it duplicated 50 times because GCC > find it too big to get inlined. So it is a function call anyway. > > For instance, when building arch/powerpc/kernel/irq.o which -Winline, I get: > > ./include/linux/perf_event.h:1169:20: error: inlining failed in call to > 'perf_fetch_caller_regs': call is unlikely and code size would grow > [-Werror=inline] > ./include/linux/perf_event.h:1169:20: error: inlining failed in call to > 'perf_fetch_caller_regs': call is unlikely and code size would grow > [-Werror=inline] > ./include/linux/perf_event.h:1169:20: error: inlining failed in call to > 'perf_fetch_caller_regs': call is unlikely and code size would grow > [-Werror=inline] > ./include/linux/perf_event.h:1169:20: error: inlining failed in call to > 'perf_fetch_caller_regs': call is unlikely and code size would grow > [-Werror=inline] > ./include/linux/trace_events.h:712:1: error: inlining failed in call to > 'trace_trigger_soft_disabled': call is unlikely and code size would grow > [-Werror=inline] > ./include/linux/trace_events.h:712:1: error: inlining failed in call to > 'trace_trigger_soft_disabled': call is unlikely and code size would grow > [-Werror=inline] > ./include/linux/trace_events.h:712:1: error: inlining failed in call to > 'trace_trigger_soft_disabled': call is unlikely and code size would grow > [-Werror=inline] > ./include/linux/trace_events.h:712:1: error: inlining failed in call to > 'trace_trigger_soft_disabled': call is unlikely and code size would grow > [-Werror=inline] > > > > If having it a function call is really an issue, then it should be > __always_inline Yes you are correct. > > I will see the impact on size when we do so. > > > What is in the hot path really ? It is the entire function or only the > first test ? > > What about: > > static inline bool > trace_trigger_soft_disabled(struct trace_event_file *file) > { > unsigned long eflags = file->flags; > > if (!(eflags & EVENT_FILE_FL_TRIGGER_COND)) > return outlined_trace_trigger_soft_disabled(...); > return false; > } > > > Or is there more in the hot path ? Actually, the condition should be: if (!(eflags & EVENT_FILE_FL_TRIGGER_COND) && (eflags & (EVENT_FILE_FL_TRIGGER_MODE | EVENT_FILE_FL_SOFT_DISABLE | EVENT_FILE_FL_PID_FILTER)) { return outlined_trace_trigger_soft_disabled(...); } return false; As we only want to call the function when TRIGGER_COND is not set and one of those bits are. Because the most common case is !eflags, which your version would call the function every time. Maybe even switch the condition, to the most common case: if ((eflags & (EVENT_FILE_FL_TRIGGER_MODE | EVENT_FILE_FL_SOFT_DISABLE | EVENT_FILE_FL_PID_FILTER) && !(eflags & EVENT_FILE_FL_TRIGGER_COND)) { Because if one of those bits are not set, no reason to look further. And the TRIGGER_COND being set is actually the unlikely case, so it should be checked last. Would that work for you? -- Steve
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h index 70c069aef02c..23dc8a12008b 100644 --- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -708,21 +708,7 @@ bool trace_event_ignore_this_pid(struct trace_event_file *trace_file); * triggers that require testing the fields, it will return true, * otherwise false. */ -static inline bool -trace_trigger_soft_disabled(struct trace_event_file *file) -{ - unsigned long eflags = file->flags; - - if (!(eflags & EVENT_FILE_FL_TRIGGER_COND)) { - if (eflags & EVENT_FILE_FL_TRIGGER_MODE) - event_triggers_call(file, NULL, NULL, NULL); - if (eflags & EVENT_FILE_FL_SOFT_DISABLED) - return true; - if (eflags & EVENT_FILE_FL_PID_FILTER) - return trace_event_ignore_this_pid(file); - } - return false; -} +bool trace_trigger_soft_disabled(struct trace_event_file *file); #ifdef CONFIG_BPF_EVENTS unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx); diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c index d00fee705f9c..41b766bc56b9 100644 --- a/kernel/trace/trace_events_trigger.c +++ b/kernel/trace/trace_events_trigger.c @@ -84,6 +84,22 @@ event_triggers_call(struct trace_event_file *file, } EXPORT_SYMBOL_GPL(event_triggers_call); +bool trace_trigger_soft_disabled(struct trace_event_file *file) +{ + unsigned long eflags = file->flags; + + if (!(eflags & EVENT_FILE_FL_TRIGGER_COND)) { + if (eflags & EVENT_FILE_FL_TRIGGER_MODE) + event_triggers_call(file, NULL, NULL, NULL); + if (eflags & EVENT_FILE_FL_SOFT_DISABLED) + return true; + if (eflags & EVENT_FILE_FL_PID_FILTER) + return trace_event_ignore_this_pid(file); + } + return false; +} +EXPORT_SYMBOL_GPL(trace_trigger_soft_disabled); + /** * event_triggers_post_call - Call 'post_triggers' for a trace event * @file: The trace_event_file associated with the event
On a ppc32 build with CONFIG_CC_OPTIMISE_FOR_SIZE, trace_trigger_soft_disabled() appears more than 50 times in vmlinux. That function is rather big for an inlined function, and it doesn't benefit much from inlining as its only parameter is a pointer to a struct in memory: c003df60 <trace_trigger_soft_disabled>: c003df60: 94 21 ff f0 stwu r1,-16(r1) c003df64: 7c 08 02 a6 mflr r0 c003df68: 90 01 00 14 stw r0,20(r1) c003df6c: bf c1 00 08 stmw r30,8(r1) c003df70: 83 e3 00 24 lwz r31,36(r3) c003df74: 73 e9 01 00 andi. r9,r31,256 c003df78: 41 82 00 10 beq c003df88 <trace_trigger_soft_disabled+0x28> c003df7c: 38 60 00 00 li r3,0 c003df80: 39 61 00 10 addi r11,r1,16 c003df84: 4b fd 60 ac b c0014030 <_rest32gpr_30_x> c003df88: 73 e9 00 80 andi. r9,r31,128 c003df8c: 7c 7e 1b 78 mr r30,r3 c003df90: 41 a2 00 14 beq c003dfa4 <trace_trigger_soft_disabled+0x44> c003df94: 38 c0 00 00 li r6,0 c003df98: 38 a0 00 00 li r5,0 c003df9c: 38 80 00 00 li r4,0 c003dfa0: 48 05 c5 f1 bl c009a590 <event_triggers_call> c003dfa4: 73 e9 00 40 andi. r9,r31,64 c003dfa8: 40 82 00 28 bne c003dfd0 <trace_trigger_soft_disabled+0x70> c003dfac: 73 ff 02 00 andi. r31,r31,512 c003dfb0: 41 82 ff cc beq c003df7c <trace_trigger_soft_disabled+0x1c> c003dfb4: 80 01 00 14 lwz r0,20(r1) c003dfb8: 83 e1 00 0c lwz r31,12(r1) c003dfbc: 7f c3 f3 78 mr r3,r30 c003dfc0: 83 c1 00 08 lwz r30,8(r1) c003dfc4: 7c 08 03 a6 mtlr r0 c003dfc8: 38 21 00 10 addi r1,r1,16 c003dfcc: 48 05 6f 6c b c0094f38 <trace_event_ignore_this_pid> c003dfd0: 38 60 00 01 li r3,1 c003dfd4: 4b ff ff ac b c003df80 <trace_trigger_soft_disabled+0x20> It doesn't benefit much from inlining as its only parameter is a pointer to a struct in memory so no constant folding is involved. Uninline it and move it into kernel/trace/trace_events_trigger.c It reduces the size of vmlinux by approximately 10 kbytes. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> --- include/linux/trace_events.h | 16 +--------------- kernel/trace/trace_events_trigger.c | 16 ++++++++++++++++ 2 files changed, 17 insertions(+), 15 deletions(-)