drm/i915/pmu: Increase the live_engine_busy_stats sample period

Message ID 20211112025222.61031-1-umesh.nerlige.ramappa@intel.com (mailing list archive)
State New, archived

Commit Message

Umesh Nerlige Ramappa Nov. 12, 2021, 2:52 a.m. UTC
Irrespective of the backend for request submissions, busyness for an
engine with an active context is calculated using:

busyness = total + (current_time - context_switch_in_time)

In execlists mode of operation, the context switch events are handled
by the CPU. Context switch in/out time and current_time are captured
in CPU time domain using ktime_get().
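
As a minimal sketch of the formula above (not the i915 implementation), the
calculation can be written in plain C. The struct, its field names and the
function name are hypothetical and chosen only for illustration; all
timestamps are assumed to be in nanoseconds and in a single clock domain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-engine stats; these are not the i915 structures. */
struct busy_stats {
	uint64_t total_ns;     /* busy time from completed context runs */
	uint64_t switch_in_ns; /* time of the most recent context switch in */
	bool active;           /* true while a context runs on the engine */
};

/* busyness = total + (current_time - context_switch_in_time) */
static uint64_t busyness_ns(const struct busy_stats *s, uint64_t now_ns)
{
	uint64_t busy = s->total_ns;

	/* Only an engine with an active context adds the in-flight part. */
	if (s->active)
		busy += now_ns - s->switch_in_ns;

	return busy;
}
```

With total_ns = 1000, switch_in_ns = 500 and now_ns = 700, an active engine
reports 1200 ns, while an idle engine reports only the accumulated 1000 ns.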

In GuC mode of submission, context switch events are handled by GuC and
the times in the above formula are captured in GT clock domain. This
information is shared with the CPU through shared memory. This results
in 2 caveats:

1) The time taken between start of a batch and the time that CPU is able
to see the context_switch_in_time in shared memory is dependent on GuC
and memory bandwidth constraints.

2) Determining current_time requires an MMIO read that can take anywhere
from a few us to a couple of ms. A reference CPU time is captured soon
after reading the MMIO so that the caller can compare the CPU delta
between 2 busyness samples. The issue here is that the CPU delta and the
busyness delta can be skewed because of the time taken to read the
register.

These 2 factors affect the accuracy of the selftest -
live_engine_busy_stats. For (1) the selftest waits until busyness stats
are visible to the CPU. The effects of (2) are more prominent for the
current busyness sample period of 100 us. Increase the busyness sample
period from 100 us to 10 ms to overcome (2).
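
To illustrate why a longer sample period helps with (2), here is a rough
sketch under a simplifying assumption: the mismatch between the CPU delta and
the busyness delta is bounded by the MMIO read latency. The function name and
the 50 us latency used in the note below are hypothetical, not measured
values.

```c
#include <stdint.h>

/*
 * Worst-case skew between the CPU delta and the busyness delta, expressed
 * in parts per thousand of the sample period. Assumes the skew is bounded
 * by the register read latency (a simplification for illustration).
 */
static uint64_t skew_permille(uint64_t read_latency_us, uint64_t period_us)
{
	if (period_us == 0)
		return UINT64_MAX; /* no meaningful ratio for a zero period */

	return read_latency_us * 1000 / period_us;
}
```

For an assumed 50 us read latency, a 100 us sample period gives a skew of up
to 500 permille (50%), while a 10 ms period gives 5 permille (0.5%).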

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
 drivers/gpu/drm/i915/gt/selftest_engine_pm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Matthew Brost Nov. 12, 2021, 8:51 p.m. UTC | #1
On Thu, Nov 11, 2021 at 06:52:22PM -0800, Umesh Nerlige Ramappa wrote:
> Irrespective of the backend for request submissions, busyness for an
> engine with an active context is calculated using:
> 
> busyness = total + (current_time - context_switch_in_time)
> 
> In execlists mode of operation, the context switch events are handled
> by the CPU. Context switch in/out time and current_time are captured
> in CPU time domain using ktime_get().
> 
> In GuC mode of submission, context switch events are handled by GuC and
> the times in the above formula are captured in GT clock domain. This
> information is shared with the CPU through shared memory. This results
> in 2 caveats:
> 
> 1) The time taken between start of a batch and the time that CPU is able
> to see the context_switch_in_time in shared memory is dependent on GuC
> and memory bandwidth constraints.
> 
> 2) Determining current_time requires an MMIO read that can take anywhere
> between a few us to a couple ms. A reference CPU time is captured soon
> after reading the MMIO so that the caller can compare the cpu delta
> between 2 busyness samples. The issue here is that the CPU delta and the
> busyness delta can be skewed because of the time taken to read the
> register.
> 
> These 2 factors affect the accuracy of the selftest -
> live_engine_busy_stats. For (1) the selftest waits until busyness stats
> are visible to the CPU. The effects of (2) are more prominent for the
> current busyness sample period of 100 us. Increase the busyness sample
> period from 100 us to 10 ms to overcome (2).
> 
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

Explanation of the increased wait period makes sense to me.

With that:
Reviewed-by: Matthew Brost <matthew.brost@intel.com>

> ---
>  drivers/gpu/drm/i915/gt/selftest_engine_pm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> index 0bfd738dbf3a..96cc565afa78 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
> @@ -316,7 +316,7 @@ static int live_engine_busy_stats(void *arg)
>  		ENGINE_TRACE(engine, "measuring busy time\n");
>  		preempt_disable();
>  		de = intel_engine_get_busy_time(engine, &t[0]);
> -		udelay(100);
> +		udelay(10000);
>  		de = ktime_sub(intel_engine_get_busy_time(engine, &t[1]), de);
>  		preempt_enable();
>  		dt = ktime_sub(t[1], t[0]);
> -- 
> 2.20.1
>
Patch

diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
index 0bfd738dbf3a..96cc565afa78 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_pm.c
@@ -316,7 +316,7 @@  static int live_engine_busy_stats(void *arg)
 		ENGINE_TRACE(engine, "measuring busy time\n");
 		preempt_disable();
 		de = intel_engine_get_busy_time(engine, &t[0]);
-		udelay(100);
+		udelay(10000);
 		de = ktime_sub(intel_engine_get_busy_time(engine, &t[1]), de);
 		preempt_enable();
 		dt = ktime_sub(t[1], t[0]);