Message ID | d7bb31547e9bbf6684801a7bbd857810@umbiko.net (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | rtla osnoise hist: average duration is always zero | expand |
On Sat, Dec 24, 2022 at 7:48 AM Andreas Ziegler <br015@umbiko.net> wrote: > > -- Observed in, but not limited to, Linux 6.1.1 Wait, "but not limited to"? What does that mean? Are there more versions affected? -- Slade
On 2022-12-24 21:17, Slade Watkins wrote: > On Sat, Dec 24, 2022 at 7:48 AM Andreas Ziegler <br015@umbiko.net> > wrote: >> >> -- Observed in, but not limited to, Linux 6.1.1 > > Wait, "but not limited to"? What does that mean? Are there more > versions affected? This was meant to indicate that the bug is not a regression; it can be found in every release since introduction in 5.17. Currently affected are 6.1.y and 6.0.y kernel trees. Regards, Andreas > -- Slade
Hi Andreas, On 12/24/22 13:39, Andreas Ziegler wrote: > -- Observed in, but not limited to, Linux 6.1.1 Since original commit... The best way to report this is adding a Fixes: tag. For example: Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode") > rtla osnoise hist always outputs '0' as average duration value. Example: > > # rtla osnoise hist -P F:1 -c 0-1 -r 900000 -d 1M -b 1 -E 5000 -T 1 > # RTLA osnoise histogram > # Time unit is microseconds (us) > # Duration: 0 00:01:00 > ... > count: 5629 1364 > min: 1 1 > avg: 0 0 > max: 2955 56 > > This is due to sum_sample in osnoise_hist_update_multiple() being calculated as the sum (duration), not as sum (duration * count). Yeah, that is correct. It works on timerlat hist because timerlat hist collects each trace event. osnoise hist uses in-kernel histograms, so we need to multiply the value with the count. This is a leftover from the development phase, as I started using tracing and then moved to histograms (which is better). > Rounding, instead of truncating, of the average value would be cool. I thought: the values were already rounded up, so it might be rounding too much. But we are in user space. It is just easier to add a two digits precision to the value, no? > The following patch would solve the issue described above: > > > Sampled duration must be weighted by observed quantity, to arrive at a > correct average duration value. > > Fix calculation of total duration by summing (duration * count). > Introduce rounding for calculation of final value. > > Signed-off-by: Andreas Ziegler <br015@umbiko.net> > > --- a/tools/tracing/rtla/src/osnoise_hist.c > +++ b/tools/tracing/rtla/src/osnoise_hist.c > @@ -121,6 +121,7 @@ > { > struct osnoise_hist_params *params = tool->params; > struct osnoise_hist_data *data = tool->data; > + unsigned long long total_duration; > int entries = data->entries; > int bucket; > int *hist; > @@ -131,10 +132,12 @@ > if (data->bucket_size) > bucket = duration / data->bucket_size; > > + total_duration = duration * count; > + > hist = data->hist[cpu].samples; > data->hist[cpu].count += count; > update_min(&data->hist[cpu].min_sample, &duration); > - update_sum(&data->hist[cpu].sum_sample, &duration); > + update_sum(&data->hist[cpu].sum_sample, &total_duration); How about re-seding a patch with the code above, adding the: Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode") and... > update_max(&data->hist[cpu].max_sample, &duration); > > if (bucket < entries) > @@ -333,7 +336,7 @@ > > if (data->hist[cpu].count) > trace_seq_printf(trace->seq, "%9llu ", > - data->hist[cpu].sum_sample / data->hist[cpu].count); > + (data->hist[cpu].sum_sample + data->hist[cpu].count / 2) / data->hist[cpu].count); another patch with this part, adding two digits precision? > else > trace_seq_printf(trace->seq, " - "); > } > Thanks! -- Daniel > Kind regards, > Andreas
--- a/tools/tracing/rtla/src/osnoise_hist.c +++ b/tools/tracing/rtla/src/osnoise_hist.c @@ -121,6 +121,7 @@ { struct osnoise_hist_params *params = tool->params; struct osnoise_hist_data *data = tool->data; + unsigned long long total_duration; int entries = data->entries; int bucket; int *hist; @@ -131,10 +132,12 @@ if (data->bucket_size) bucket = duration / data->bucket_size; + total_duration = duration * count; + hist = data->hist[cpu].samples; data->hist[cpu].count += count; update_min(&data->hist[cpu].min_sample, &duration); - update_sum(&data->hist[cpu].sum_sample, &duration); + update_sum(&data->hist[cpu].sum_sample, &total_duration); update_max(&data->hist[cpu].max_sample, &duration); if (bucket < entries) @@ -333,7 +336,7 @@ if (data->hist[cpu].count) trace_seq_printf(trace->seq, "%9llu ", - data->hist[cpu].sum_sample / data->hist[cpu].count); + (data->hist[cpu].sum_sample + data->hist[cpu].count / 2) / data->hist[cpu].count); else trace_seq_printf(trace->seq, " - ");
-- Observed in, but not limited to, Linux 6.1.1 Dear all, rtla osnoise hist always outputs '0' as average duration value. Example: # rtla osnoise hist -P F:1 -c 0-1 -r 900000 -d 1M -b 1 -E 5000 -T 1 # RTLA osnoise histogram # Time unit is microseconds (us) # Duration: 0 00:01:00 ... count: 5629 1364 min: 1 1 avg: 0 0 max: 2955 56 This is due to sum_sample in osnoise_hist_update_multiple() being calculated as the sum (duration), not as sum (duration * count). Rounding, instead of truncating, of the average value would be cool. The following patch would solve the issue described above: Sampled duration must be weighted by observed quantity, to arrive at a correct average duration value. Fix calculation of total duration by summing (duration * count). Introduce rounding for calculation of final value. Signed-off-by: Andreas Ziegler <br015@umbiko.net> } Kind regards, Andreas