Message ID | 20250102110618.174415-1-raag.jadav@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v1] drm/i915/selftest: Change throttle criteria for rps | expand |
On Thu, Jan 02, 2025 at 04:36:18PM +0530, Raag Jadav wrote: > Current live_rps_control() implementation errors out on throttling. > This was done with the assumption that throttling to minimum frequency > is a catastrophic failure, which is incorrect. Throttling can happen > due to variety of reasons and often times out of our control. Also, > the resulting frequency can be at any given point below the maximum > allowed. Change throttle criteria to reflect this logic and drop the > error, as it doesn't necessarily mean selftest failure. > > Signed-off-by: Raag Jadav <raag.jadav@intel.com> > --- Rodrigo, does this look okay? Raag > drivers/gpu/drm/i915/gt/selftest_rps.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c > index dcef8d498919..7aac90c1679e 100644 > --- a/drivers/gpu/drm/i915/gt/selftest_rps.c > +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c > @@ -477,12 +477,13 @@ int live_rps_control(void *arg) > limit, intel_gpu_freq(rps, limit), > min, max, ktime_to_ns(min_dt), ktime_to_ns(max_dt)); > > - if (limit == rps->min_freq) { > - pr_err("%s: GPU throttled to minimum!\n", > - engine->name); > + if (limit != rps->max_freq) { > + u32 throttle = intel_uncore_read(gt->uncore, > + intel_gt_perf_limit_reasons_reg(gt)); > + > + pr_warn("%s: GPU throttled with reasons 0x%08x\n", > + engine->name, throttle & GT0_PERF_LIMIT_REASONS_MASK); > show_pstate_limits(rps); > - err = -ENODEV; > - break; > } > > if (igt_flush_test(gt->i915)) { > -- > 2.34.1 >
On 1/2/2025 3:06 AM, Raag Jadav wrote: > Current live_rps_control() implementation errors out on throttling. > This was done with the assumption that throttling to minimum frequency > is a catastrophic failure, which is incorrect. Throttling can happen > due to variety of reasons and often times out of our control. Also, > the resulting frequency can be at any given point below the maximum > allowed. Change throttle criteria to reflect this logic and drop the > error, as it doesn't necessarily mean selftest failure. LGTM, CI systems are especially susceptible to thermal issues. Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> > > Signed-off-by: Raag Jadav <raag.jadav@intel.com> > --- > drivers/gpu/drm/i915/gt/selftest_rps.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c > index dcef8d498919..7aac90c1679e 100644 > --- a/drivers/gpu/drm/i915/gt/selftest_rps.c > +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c > @@ -477,12 +477,13 @@ int live_rps_control(void *arg) > limit, intel_gpu_freq(rps, limit), > min, max, ktime_to_ns(min_dt), ktime_to_ns(max_dt)); > > - if (limit == rps->min_freq) { > - pr_err("%s: GPU throttled to minimum!\n", > - engine->name); > + if (limit != rps->max_freq) { > + u32 throttle = intel_uncore_read(gt->uncore, > + intel_gt_perf_limit_reasons_reg(gt)); > + > + pr_warn("%s: GPU throttled with reasons 0x%08x\n", > + engine->name, throttle & GT0_PERF_LIMIT_REASONS_MASK); > show_pstate_limits(rps); > - err = -ENODEV; > - break; > } > > if (igt_flush_test(gt->i915)) {
On Wed, Jan 08, 2025 at 11:28:47AM +0200, Raag Jadav wrote: > On Thu, Jan 02, 2025 at 04:36:18PM +0530, Raag Jadav wrote: > > Current live_rps_control() implementation errors out on throttling. > > This was done with the assumption that throttling to minimum frequency > > is a catastrophic failure, which is incorrect. Throttling can happen > > due to variety of reasons and often times out of our control. Also, > > the resulting frequency can be at any given point below the maximum > > allowed. Change throttle criteria to reflect this logic and drop the > > error, as it doesn't necessarily mean selftest failure. > > > > Signed-off-by: Raag Jadav <raag.jadav@intel.com> > > --- > > Rodrigo, does this look okay? Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> > > Raag > > > drivers/gpu/drm/i915/gt/selftest_rps.c | 11 ++++++----- > > 1 file changed, 6 insertions(+), 5 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c > > index dcef8d498919..7aac90c1679e 100644 > > --- a/drivers/gpu/drm/i915/gt/selftest_rps.c > > +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c > > @@ -477,12 +477,13 @@ int live_rps_control(void *arg) > > limit, intel_gpu_freq(rps, limit), > > min, max, ktime_to_ns(min_dt), ktime_to_ns(max_dt)); > > > > - if (limit == rps->min_freq) { > > - pr_err("%s: GPU throttled to minimum!\n", > > - engine->name); > > + if (limit != rps->max_freq) { > > + u32 throttle = intel_uncore_read(gt->uncore, > > + intel_gt_perf_limit_reasons_reg(gt)); > > + > > + pr_warn("%s: GPU throttled with reasons 0x%08x\n", > > + engine->name, throttle & GT0_PERF_LIMIT_REASONS_MASK); > > show_pstate_limits(rps); > > - err = -ENODEV; > > - break; > > } > > > > if (igt_flush_test(gt->i915)) { > > -- > > 2.34.1 > >
Hi Raag, On Thu, Jan 02, 2025 at 04:36:18PM +0530, Raag Jadav wrote: > Current live_rps_control() implementation errors out on throttling. > This was done with the assumption that throttling to minimum frequency > is a catastrophic failure, which is incorrect. Throttling can happen > due to variety of reasons and often times out of our control. Also, > the resulting frequency can be at any given point below the maximum > allowed. Change throttle criteria to reflect this logic and drop the > error, as it doesn't necessarily mean selftest failure. > > Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed and merged to drm-intel-gt-next. Thanks, Andi
On Sat, Jan 11, 2025 at 09:41:47PM +0100, Andi Shyti wrote: > Hi Raag, > > On Thu, Jan 02, 2025 at 04:36:18PM +0530, Raag Jadav wrote: > > Current live_rps_control() implementation errors out on throttling. > > This was done with the assumption that throttling to minimum frequency > > is a catastrophic failure, which is incorrect. Throttling can happen > > due to variety of reasons and often times out of our control. Also, > > the resulting frequency can be at any given point below the maximum > > allowed. Change throttle criteria to reflect this logic and drop the > > error, as it doesn't necessarily mean selftest failure. > > > > Signed-off-by: Raag Jadav <raag.jadav@intel.com> > > Reviewed and merged to drm-intel-gt-next. Thank you. Sorry I didn't pick your tag as I considered this a different patch. Raag
diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c index dcef8d498919..7aac90c1679e 100644 --- a/drivers/gpu/drm/i915/gt/selftest_rps.c +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c @@ -477,12 +477,13 @@ int live_rps_control(void *arg) limit, intel_gpu_freq(rps, limit), min, max, ktime_to_ns(min_dt), ktime_to_ns(max_dt)); - if (limit == rps->min_freq) { - pr_err("%s: GPU throttled to minimum!\n", - engine->name); + if (limit != rps->max_freq) { + u32 throttle = intel_uncore_read(gt->uncore, + intel_gt_perf_limit_reasons_reg(gt)); + + pr_warn("%s: GPU throttled with reasons 0x%08x\n", + engine->name, throttle & GT0_PERF_LIMIT_REASONS_MASK); show_pstate_limits(rps); - err = -ENODEV; - break; } if (igt_flush_test(gt->i915)) {
Current live_rps_control() implementation errors out on throttling. This was done with the assumption that throttling to minimum frequency is a catastrophic failure, which is incorrect. Throttling can happen due to variety of reasons and often times out of our control. Also, the resulting frequency can be at any given point below the maximum allowed. Change throttle criteria to reflect this logic and drop the error, as it doesn't necessarily mean selftest failure. Signed-off-by: Raag Jadav <raag.jadav@intel.com> --- drivers/gpu/drm/i915/gt/selftest_rps.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)