diff mbox series

[v1] drm/i915/selftest: Change throttle criteria for rps

Message ID 20250102110618.174415-1-raag.jadav@intel.com (mailing list archive)
State New, archived
Headers show
Series [v1] drm/i915/selftest: Change throttle criteria for rps | expand

Commit Message

Raag Jadav Jan. 2, 2025, 11:06 a.m. UTC
Current live_rps_control() implementation errors out on throttling.
This was done with the assumption that throttling to minimum frequency
is a catastrophic failure, which is incorrect. Throttling can happen
due to variety of reasons and often times out of our control. Also,
the resulting frequency can be at any given point below the maximum
allowed. Change throttle criteria to reflect this logic and drop the
error, as it doesn't necessarily mean selftest failure.

Signed-off-by: Raag Jadav <raag.jadav@intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_rps.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

Raag Jadav Jan. 8, 2025, 9:28 a.m. UTC | #1
On Thu, Jan 02, 2025 at 04:36:18PM +0530, Raag Jadav wrote:
> Current live_rps_control() implementation errors out on throttling.
> This was done with the assumption that throttling to minimum frequency
> is a catastrophic failure, which is incorrect. Throttling can happen
> due to variety of reasons and often times out of our control. Also,
> the resulting frequency can be at any given point below the maximum
> allowed. Change throttle criteria to reflect this logic and drop the
> error, as it doesn't necessarily mean selftest failure.
> 
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---

Rodrigo, does this look okay?

Raag

>  drivers/gpu/drm/i915/gt/selftest_rps.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c
> index dcef8d498919..7aac90c1679e 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_rps.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c
> @@ -477,12 +477,13 @@ int live_rps_control(void *arg)
>  			limit, intel_gpu_freq(rps, limit),
>  			min, max, ktime_to_ns(min_dt), ktime_to_ns(max_dt));
>  
> -		if (limit == rps->min_freq) {
> -			pr_err("%s: GPU throttled to minimum!\n",
> -			       engine->name);
> +		if (limit != rps->max_freq) {
> +			u32 throttle = intel_uncore_read(gt->uncore,
> +							 intel_gt_perf_limit_reasons_reg(gt));
> +
> +			pr_warn("%s: GPU throttled with reasons 0x%08x\n",
> +				engine->name, throttle & GT0_PERF_LIMIT_REASONS_MASK);
>  			show_pstate_limits(rps);
> -			err = -ENODEV;
> -			break;
>  		}
>  
>  		if (igt_flush_test(gt->i915)) {
> -- 
> 2.34.1
>
Vinay Belgaumkar Jan. 9, 2025, 6:59 p.m. UTC | #2
On 1/2/2025 3:06 AM, Raag Jadav wrote:
> Current live_rps_control() implementation errors out on throttling.
> This was done with the assumption that throttling to minimum frequency
> is a catastrophic failure, which is incorrect. Throttling can happen
> due to variety of reasons and often times out of our control. Also,
> the resulting frequency can be at any given point below the maximum
> allowed. Change throttle criteria to reflect this logic and drop the
> error, as it doesn't necessarily mean selftest failure.

LGTM, CI systems are especially susceptible to thermal issues.

Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>

>
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/selftest_rps.c | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c
> index dcef8d498919..7aac90c1679e 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_rps.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c
> @@ -477,12 +477,13 @@ int live_rps_control(void *arg)
>   			limit, intel_gpu_freq(rps, limit),
>   			min, max, ktime_to_ns(min_dt), ktime_to_ns(max_dt));
>   
> -		if (limit == rps->min_freq) {
> -			pr_err("%s: GPU throttled to minimum!\n",
> -			       engine->name);
> +		if (limit != rps->max_freq) {
> +			u32 throttle = intel_uncore_read(gt->uncore,
> +							 intel_gt_perf_limit_reasons_reg(gt));
> +
> +			pr_warn("%s: GPU throttled with reasons 0x%08x\n",
> +				engine->name, throttle & GT0_PERF_LIMIT_REASONS_MASK);
>   			show_pstate_limits(rps);
> -			err = -ENODEV;
> -			break;
>   		}
>   
>   		if (igt_flush_test(gt->i915)) {
Rodrigo Vivi Jan. 9, 2025, 7:04 p.m. UTC | #3
On Wed, Jan 08, 2025 at 11:28:47AM +0200, Raag Jadav wrote:
> On Thu, Jan 02, 2025 at 04:36:18PM +0530, Raag Jadav wrote:
> > Current live_rps_control() implementation errors out on throttling.
> > This was done with the assumption that throttling to minimum frequency
> > is a catastrophic failure, which is incorrect. Throttling can happen
> > due to variety of reasons and often times out of our control. Also,
> > the resulting frequency can be at any given point below the maximum
> > allowed. Change throttle criteria to reflect this logic and drop the
> > error, as it doesn't necessarily mean selftest failure.
> > 
> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> > ---
> 
> Rodrigo, does this look okay?

Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

> 
> Raag
> 
> >  drivers/gpu/drm/i915/gt/selftest_rps.c | 11 ++++++-----
> >  1 file changed, 6 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c
> > index dcef8d498919..7aac90c1679e 100644
> > --- a/drivers/gpu/drm/i915/gt/selftest_rps.c
> > +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c
> > @@ -477,12 +477,13 @@ int live_rps_control(void *arg)
> >  			limit, intel_gpu_freq(rps, limit),
> >  			min, max, ktime_to_ns(min_dt), ktime_to_ns(max_dt));
> >  
> > -		if (limit == rps->min_freq) {
> > -			pr_err("%s: GPU throttled to minimum!\n",
> > -			       engine->name);
> > +		if (limit != rps->max_freq) {
> > +			u32 throttle = intel_uncore_read(gt->uncore,
> > +							 intel_gt_perf_limit_reasons_reg(gt));
> > +
> > +			pr_warn("%s: GPU throttled with reasons 0x%08x\n",
> > +				engine->name, throttle & GT0_PERF_LIMIT_REASONS_MASK);
> >  			show_pstate_limits(rps);
> > -			err = -ENODEV;
> > -			break;
> >  		}
> >  
> >  		if (igt_flush_test(gt->i915)) {
> > -- 
> > 2.34.1
> >
Andi Shyti Jan. 11, 2025, 8:41 p.m. UTC | #4
Hi Raag,

On Thu, Jan 02, 2025 at 04:36:18PM +0530, Raag Jadav wrote:
> Current live_rps_control() implementation errors out on throttling.
> This was done with the assumption that throttling to minimum frequency
> is a catastrophic failure, which is incorrect. Throttling can happen
> due to variety of reasons and often times out of our control. Also,
> the resulting frequency can be at any given point below the maximum
> allowed. Change throttle criteria to reflect this logic and drop the
> error, as it doesn't necessarily mean selftest failure.
> 
> Signed-off-by: Raag Jadav <raag.jadav@intel.com>

Reviewed and merged to drm-intel-gt-next.

Thanks,
Andi
Raag Jadav Jan. 12, 2025, 7:34 a.m. UTC | #5
On Sat, Jan 11, 2025 at 09:41:47PM +0100, Andi Shyti wrote:
> Hi Raag,
> 
> On Thu, Jan 02, 2025 at 04:36:18PM +0530, Raag Jadav wrote:
> > Current live_rps_control() implementation errors out on throttling.
> > This was done with the assumption that throttling to minimum frequency
> > is a catastrophic failure, which is incorrect. Throttling can happen
> > due to variety of reasons and often times out of our control. Also,
> > the resulting frequency can be at any given point below the maximum
> > allowed. Change throttle criteria to reflect this logic and drop the
> > error, as it doesn't necessarily mean selftest failure.
> > 
> > Signed-off-by: Raag Jadav <raag.jadav@intel.com>
> 
> Reviewed and merged to drm-intel-gt-next.

Thank you.

Sorry I didn't pick your tag as I considered this a different patch.

Raag
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c
index dcef8d498919..7aac90c1679e 100644
--- a/drivers/gpu/drm/i915/gt/selftest_rps.c
+++ b/drivers/gpu/drm/i915/gt/selftest_rps.c
@@ -477,12 +477,13 @@  int live_rps_control(void *arg)
 			limit, intel_gpu_freq(rps, limit),
 			min, max, ktime_to_ns(min_dt), ktime_to_ns(max_dt));
 
-		if (limit == rps->min_freq) {
-			pr_err("%s: GPU throttled to minimum!\n",
-			       engine->name);
+		if (limit != rps->max_freq) {
+			u32 throttle = intel_uncore_read(gt->uncore,
+							 intel_gt_perf_limit_reasons_reg(gt));
+
+			pr_warn("%s: GPU throttled with reasons 0x%08x\n",
+				engine->name, throttle & GT0_PERF_LIMIT_REASONS_MASK);
 			show_pstate_limits(rps);
-			err = -ENODEV;
-			break;
 		}
 
 		if (igt_flush_test(gt->i915)) {