Message ID | 20170222152510.5039-4-robert@sixbynine.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 22 February 2017 at 15:25, Robert Bragg <robert@sixbynine.org> wrote: > This change is pre-emptively aiming to avoid a potential cause of kernel > logging noise in case some condition were to result in us seeing invalid > OA reports. > > The workaround for the OA unit's tail pointer race condition is what > avoids the primary know cause of invalid reports being seen and with > that in place we aren't expecting to see this notice but it can't be > entirely ruled out. > > Just in case some condition does lead to the notice then it's likely > that it will be triggered repeatedly while attempting to append a > sequence of reports and depending on the configured OA sampling > frequency that might be a large number of repeat notices. > > Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
On Tue, Feb 28, 2017 at 01:28:13PM +0000, Matthew Auld wrote: > On 22 February 2017 at 15:25, Robert Bragg <robert@sixbynine.org> wrote: > > This change is pre-emptively aiming to avoid a potential cause of kernel > > logging noise in case some condition were to result in us seeing invalid > > OA reports. > > > > The workaround for the OA unit's tail pointer race condition is what > > avoids the primary know cause of invalid reports being seen and with > > that in place we aren't expecting to see this notice but it can't be > > entirely ruled out. > > > > Just in case some condition does lead to the notice then it's likely > > that it will be triggered repeatedly while attempting to append a > > sequence of reports and depending on the configured OA sampling > > frequency that might be a large number of repeat notices. > > > > Signed-off-by: Robert Bragg <robert@sixbynine.org> > Reviewed-by: Matthew Auld <matthew.auld@intel.com> printk_ratelimits emits a WARN when triggered, defeating the purpose of using NOTE. -Chris
On Tue, Feb 28, 2017 at 1:33 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote: > On Tue, Feb 28, 2017 at 01:28:13PM +0000, Matthew Auld wrote: >> On 22 February 2017 at 15:25, Robert Bragg <robert@sixbynine.org> wrote: >> > This change is pre-emptively aiming to avoid a potential cause of kernel >> > logging noise in case some condition were to result in us seeing invalid >> > OA reports. >> > >> > The workaround for the OA unit's tail pointer race condition is what >> > avoids the primary know cause of invalid reports being seen and with >> > that in place we aren't expecting to see this notice but it can't be >> > entirely ruled out. >> > >> > Just in case some condition does lead to the notice then it's likely >> > that it will be triggered repeatedly while attempting to append a >> > sequence of reports and depending on the configured OA sampling >> > frequency that might be a large number of repeat notices. >> > >> > Signed-off-by: Robert Bragg <robert@sixbynine.org> >> Reviewed-by: Matthew Auld <matthew.auld@intel.com> > > printk_ratelimits emits a WARN when triggered, defeating the purpose of > using NOTE. Hmm, that's a slightly awkward problem. The warning comes from the common __ratelimit utility api that's used for numerous things but by default will warn if the rate limiting kicked in (once printing resumes again). There's a RATELIMIT_MSG_ON_RELEASE flag that can be set to def the warning until ratelimit_state_exit() is called (which looks optional or at least the flag could also be cleared before calling it to avoid any warnings). In general printk_ratelimit() doesn't seem like an ideal mechanism for throttling messages, even if they are warnings, since the rate limit state is shared across orthogonal components, so maybe it's best anyway to use a custom rate limit state. Considering the _MSG_ON_RELEASE flag, I think I'll init some ratelimit state in dev_priv->perf.oa just for this message, use ratelimit_set_flags() to set _MSG_ON_RELEASE and have a corresponding debug/note in i915_perf_fini after checking the rs->missed counter. Actually at this point I'm slightly doubting whether having a warning for throttling is that terrible in this case. Although I tend to think it could be good to have dedicated ratelimit state here, there conceptually is a point a which a visible warning could be appropriate if we're really seeing a lot of these spurious reports. It's a grey area since it's ok as note if it's rare/intermittent, but somthing's maybe gone wrong if we see *lots* of these. hmm. Incidentally, I suppose this same observation is relevant to many of the DRM_*_RATELIMITED macros too - there will be an inconsistent level used to notify about any throttling? Br, - Robert > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index d04ebaa8406e..a901bcd80263 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -632,7 +632,8 @@ static int gen7_append_oa_reports(struct i915_perf_stream *stream, * copying it to userspace... */ if (report32[0] == 0) { - DRM_NOTE("Skipping spurious, invalid OA report\n"); + if (printk_ratelimit()) + DRM_NOTE("Skipping spurious, invalid OA report\n"); continue; }
This change is pre-emptively aiming to avoid a potential cause of kernel logging noise in case some condition were to result in us seeing invalid OA reports. The workaround for the OA unit's tail pointer race condition is what avoids the primary know cause of invalid reports being seen and with that in place we aren't expecting to see this notice but it can't be entirely ruled out. Just in case some condition does lead to the notice then it's likely that it will be triggered repeatedly while attempting to append a sequence of reports and depending on the configured OA sampling frequency that might be a large number of repeat notices. Signed-off-by: Robert Bragg <robert@sixbynine.org> --- drivers/gpu/drm/i915/i915_perf.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)