Message ID | 20240805055922.6277-1-Tze-nan.Wu@mediatek.com (mailing list archive)
---|---
State | Accepted
Series | [RESEND] tracing: Fix overflow in get_free_elt()
On Mon, 5 Aug 2024 13:59:22 +0800
Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:

> "tracing_map->next_elt" in get_free_elt() is at risk of overflowing.
>
> Once it overflows, new elements can still be inserted into the tracing_map
> even though the maximum number of elements (`max_elts`) has been reached.
> Continuing to insert elements after the overflow could result in the
> tracing_map containing "tracing_map->max_size" elements, leaving no empty
> entries.
> If any attempt is made to insert an element into a full tracing_map using
> `__tracing_map_insert()`, it will cause an infinite loop with preemption
> disabled, leading to a CPU hang problem.
>
> Fix this by preventing any further increments to "tracing_map->next_elt"
> once it reaches "tracing_map->max_elts".
>
> Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> ---
> We encountered this issue internally after enabling the throttle_rss_stat
> feature provided by Perfetto in the background for more than two days,
> during which the `rss_stat` tracepoint was invoked over 2^32 times.
> After tracing_map->next_elt overflows, new elements can continue to be
> inserted into the tracing_map belonging to `rss_stat`.
> The CPU can then hang in the dead while loop in `__tracing_map_insert()`
> once it is called with the tracing_map left with no empty entries.
>
> Call trace during hang:
>   __tracing_map_insert()
>   tracing_map_insert()
>   event_hist_trigger()
>   event_triggers_call()
>   __event_trigger_test_discard()
>   trace_event_buffer_commit()
>   trace_event_raw_event_rss_stat()
>   __traceiter_rss_stat()
>   trace_rss_stat()
>   mm_trace_rss_stat()
>   inc_mm_counter()
>   do_swap_page()
>
> throttle_rss_stat is a synthetic event triggered by `rss_stat`, set up as:
> 1. $ echo "rss_stat_throttled unsigned int mm_id unsigned int curr int member long size" >> /sys/kernel/tracing/synthetic_events
> 2. $ echo 'hist:keys=mm_id,member:bucket=size/0x80000:onchange($bucket).rss_stat_throttled(mm_id,curr,member,size)' > /sys/kernel/tracing/events/kmem/rss_stat/trigger
>
> The hang can also be reproduced easily with a custom trace event
> `myevent(u64 mycount)`: enable its histogram with the key set to the
> variable "mycount", then call `myevent` more than 2^32+(map_size-max_elts)
> times, passing a different "mycount" argument every time.
>
> BTW, I added Cheng-Jui as Co-developed-by because we had a lot of
> discussions while debugging this.
> ---
>  kernel/trace/tracing_map.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/trace/tracing_map.c b/kernel/trace/tracing_map.c
> index a4dcf0f24352..3a56e7c8aa4f 100644
> --- a/kernel/trace/tracing_map.c
> +++ b/kernel/trace/tracing_map.c
> @@ -454,7 +454,7 @@ static struct tracing_map_elt *get_free_elt(struct tracing_map *map)
>  	struct tracing_map_elt *elt = NULL;
>  	int idx;
>
> -	idx = atomic_inc_return(&map->next_elt);
> +	idx = atomic_fetch_add_unless(&map->next_elt, 1, map->max_elts);

I guess we need to add (with a comment):

	idx--;

>  	if (idx < map->max_elts) {

Otherwise the max elements will be off by one.

>  		elt = *(TRACING_MAP_ELT(map->elts, idx));

And the index will skip position zero.

-- Steve

>  		if (map->ops && map->ops->elt_init)
> @@ -699,7 +699,7 @@ void tracing_map_clear(struct tracing_map *map)
>  {
>  	unsigned int i;
>
> -	atomic_set(&map->next_elt, -1);
> +	atomic_set(&map->next_elt, 0);
>  	atomic64_set(&map->hits, 0);
>  	atomic64_set(&map->drops, 0);
>
> @@ -783,7 +783,7 @@ struct tracing_map *tracing_map_create(unsigned int map_bits,
>
>  	map->map_bits = map_bits;
>  	map->max_elts = (1 << map_bits);
> -	atomic_set(&map->next_elt, -1);
> +	atomic_set(&map->next_elt, 0);
>
>  	map->map_size = (1 << (map_bits + 1));
>  	map->ops = ops;
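To make the hang concrete, below is a compilable toy model of an open-addressing insert loop. It is illustrative only: the names and structure are hypothetical, and it is not the kernel's __tracing_map_insert(), which probes hashed keys over tracing_map entries. It only mirrors the probing idea that makes a completely full table fatal.

	#include <stddef.h>
	#include <stdint.h>

	/* Toy open-addressing table; key 0 means "empty" in this model. */
	struct slot {
		uint64_t key;
	};

	static struct slot *toy_insert(struct slot *table, size_t size, uint64_t key)
	{
		size_t idx = key % size;

		for (;;) {	/* probe until an empty slot or a matching key */
			if (table[idx].key == key)
				return &table[idx];	/* already present */
			if (table[idx].key == 0) {
				table[idx].key = key;	/* claim the empty slot */
				return &table[idx];
			}
			/* With no empty slot and no match anywhere, this wraps
			 * around the table forever: the CPU hang in the call
			 * trace above. */
			idx = (idx + 1) % size;
		}
	}

	int main(void)
	{
		struct slot table[4] = { 0 };

		/* Three distinct keys into four slots terminate fine; a fourth
		 * distinct key fills the table, and a fifth would then spin
		 * forever in the probe loop. */
		toy_insert(table, 4, 10);
		toy_insert(table, 4, 11);
		toy_insert(table, 4, 12);
		return 0;
	}

This is why tracing_map keeps map_size at twice max_elts: capping next_elt at max_elts guarantees the probe loop always has empty entries left to find.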
On Tue, 2024-08-06 at 15:40 -0400, Steven Rostedt wrote:
> On Mon, 5 Aug 2024 13:59:22 +0800
> Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:
>
> > "tracing_map->next_elt" in get_free_elt() is at risk of overflowing.
> >
> > [...]
> >
> > -	idx = atomic_inc_return(&map->next_elt);
> > +	idx = atomic_fetch_add_unless(&map->next_elt, 1, map->max_elts);
>
> I guess we need to add (with a comment):
>
> 	idx--;
>
Thanks for reviewing this!

Referring to "include/linux/atomic/atomic-instrumented.h",
`atomic_fetch_add_unless` returns the original value, before the
addition happens on the operands.

The values returned by `atomic_fetch_add_unless` will therefore
sequentially be:

  0, 1, 2, ..., max_elts-1, max_elts, max_elts, max_elts, ... (in this case)

I've also observed this result in my local test. And I guess the
result is already what we want? (Elements are inserted into the map
starting from index 0.)

Here's the comment from "atomic-instrumented.h":

  atomic_fetch_add_unless() - atomic add unless value with full ordering
  @v: pointer to atomic_t
  @a: int value to add
  @u: int value to compare with

  If (@v != @u), atomically updates @v to (@v + @a) with full ordering.
  Otherwise, @v is not modified and relaxed ordering is provided.

  Unsafe to use in noinstr code; use raw_atomic_fetch_add_unless() there.

  Return: The original value of @v.

Thanks
-- Tze-nan

> > 	if (idx < map->max_elts) {
>
> Otherwise the max elements will be off by one.
>
> > 		elt = *(TRACING_MAP_ELT(map->elts, idx));
>
> And the index will skip position zero.
>
> -- Steve
>
> [...]
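The return sequence Tze-nan describes can be checked with a minimal userspace model. This is not kernel code: C11 atomics stand in for the kernel's atomic_t API, and `fetch_add_unless` is a hypothetical helper mirroring the documented semantics.

	#include <stdatomic.h>
	#include <stdio.h>

	/* Hypothetical userspace mirror of atomic_fetch_add_unless():
	 * add @a to *v unless *v == @u, and return the value *v held
	 * before the call. */
	static int fetch_add_unless(atomic_int *v, int a, int u)
	{
		int old = atomic_load(v);

		/* CAS loop: on failure, `old` is reloaded with the current
		 * value and the u-check is repeated. */
		while (old != u && !atomic_compare_exchange_weak(v, &old, old + a))
			;
		return old;
	}

	int main(void)
	{
		atomic_int next_elt = 0;	/* as after tracing_map_clear() */
		int max_elts = 4;

		/* Prints "0 1 2 3 4 4 4": indices 0..max_elts-1 are handed
		 * out exactly once, then the counter saturates at max_elts
		 * and the caller's "idx < max_elts" check rejects further
		 * inserts. No idx-- is needed and index zero is not skipped. */
		for (int i = 0; i < 7; i++)
			printf("%d ", fetch_add_unless(&next_elt, 1, max_elts));
		printf("\n");
		return 0;
	}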
On Wed, 7 Aug 2024 11:34:43 +0000
Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com> wrote:

> > > -	idx = atomic_inc_return(&map->next_elt);
> > > +	idx = atomic_fetch_add_unless(&map->next_elt, 1, map->max_elts);
> >
> > I guess we need to add (with a comment):
> >
> > 	idx--;
> >
> Thanks for reviewing this!
>
> Referring to "include/linux/atomic/atomic-instrumented.h",
> `atomic_fetch_add_unless` returns the original value, before the
> addition happens on the operands.

Ah, right. I haven't used that before and was thinking this was the
same as the inc_return().

-- Steve
diff --git a/kernel/trace/tracing_map.c b/kernel/trace/tracing_map.c
index a4dcf0f24352..3a56e7c8aa4f 100644
--- a/kernel/trace/tracing_map.c
+++ b/kernel/trace/tracing_map.c
@@ -454,7 +454,7 @@ static struct tracing_map_elt *get_free_elt(struct tracing_map *map)
 	struct tracing_map_elt *elt = NULL;
 	int idx;
 
-	idx = atomic_inc_return(&map->next_elt);
+	idx = atomic_fetch_add_unless(&map->next_elt, 1, map->max_elts);
 	if (idx < map->max_elts) {
 		elt = *(TRACING_MAP_ELT(map->elts, idx));
 		if (map->ops && map->ops->elt_init)
@@ -699,7 +699,7 @@ void tracing_map_clear(struct tracing_map *map)
 {
 	unsigned int i;
 
-	atomic_set(&map->next_elt, -1);
+	atomic_set(&map->next_elt, 0);
 	atomic64_set(&map->hits, 0);
 	atomic64_set(&map->drops, 0);
 
@@ -783,7 +783,7 @@ struct tracing_map *tracing_map_create(unsigned int map_bits,
 
 	map->map_bits = map_bits;
 	map->max_elts = (1 << map_bits);
-	atomic_set(&map->next_elt, -1);
+	atomic_set(&map->next_elt, 0);
 
 	map->map_size = (1 << (map_bits + 1));
 	map->ops = ops;
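For completeness, the overflow the patch fixes can also be modeled in userspace. This sketch is hypothetical: uint32_t stands in for the kernel's 32-bit atomic_t, whose wraparound is well-defined two's complement (plain signed overflow would be undefined behavior in C).

	#include <stdint.h>
	#include <stdio.h>

	int main(void)
	{
		/* Old scheme, modeled: the counter starts at -1 and each
		 * caller uses the incremented value as its element index. */
		int32_t max_elts = 2048;
		uint32_t next_elt = UINT32_MAX;	/* after 2^32 increments: back at -1 */

		for (int i = 0; i < 4; i++) {
			int32_t idx = (int32_t)++next_elt;
			/* idx is 0, 1, 2, 3 again: "idx < max_elts" passes and
			 * already-used elements are handed out a second time,
			 * so the map keeps filling beyond max_elts. */
			printf("idx=%d accepted=%d\n", idx, idx < max_elts);
		}
		return 0;
	}

With the patch applied, the counter saturates at max_elts instead of wrapping, so this recycling of indices can no longer happen.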