| Message ID | 20180228171519.19657-1-tvrtko.ursulin@linux.intel.com (mailing list archive) |
| --- | --- |
| State | New, archived |
Quoting Tvrtko Ursulin (2018-02-28 17:15:19)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Verify that the reported busyness is in line with what we would expect
> from a batch which causes a hang and gets kicked out from the engine.
>
> v2: Change to explicit igt_force_gpu_reset instead of guessing when a spin
>     batch will hang. (Chris Wilson)
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

It's nice and quick, yes. However, sometimes the opposite is true and you
have to wait for the batch you want to start before pulling the trigger.

I'd put a usleep(100) in there, /* Wait for batch to execute */, and we
should put the wait-for-execution ability in igt_spin_t. Unfortunately
that requires MI_STORE_DWORD_IMM (or more creativity), limiting its
availability.

With the sleep issue addressed (commented upon if nothing else),
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
On 01/03/2018 08:08, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-28 17:15:19)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Verify that the reported busyness is in line with what we would expect
>> from a batch which causes a hang and gets kicked out from the engine.
>>
>> v2: Change to explicit igt_force_gpu_reset instead of guessing when a spin
>>     batch will hang. (Chris Wilson)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It's nice and quick, yes. However, sometimes the opposite is true and you
> have to wait for the batch you want to start before pulling the trigger.
>
> I'd put a usleep(100) in there /* Wait for batch to execute */ and we

Hm, but the reset is triggered after the first sleep (which checks for
100% busy). So even the non-hanging test flavour could be affected if the
delay before execution were that long, and by that logic the usleep(100)
(or more, for small-core CI) would then have to go into many tests. The
only difference in the hang flavour is that, if the batch completely
failed to run until after the reset, the idle assert would fail. So maybe
I should just add an assert that the batch is idle before sampling the
PMU after the reset?

Regards,

Tvrtko

> should put the wait-for-execution ability in igt_spin_t. Unfortunately
> that requires MI_STORE_DWORD_IMM (or more creativity), limiting its
> availability.
>
> With the sleep issue addressed (commented upon if nothing else),
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris
Quoting Tvrtko Ursulin (2018-03-01 09:21:52)
> On 01/03/2018 08:08, Chris Wilson wrote:
>> Quoting Tvrtko Ursulin (2018-02-28 17:15:19)
>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>
>>> Verify that the reported busyness is in line with what we would expect
>>> from a batch which causes a hang and gets kicked out from the engine.
>>>
>>> v2: Change to explicit igt_force_gpu_reset instead of guessing when a spin
>>>     batch will hang. (Chris Wilson)
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> It's nice and quick, yes. However, sometimes the opposite is true and you
>> have to wait for the batch you want to start before pulling the trigger.
>>
>> I'd put a usleep(100) in there /* Wait for batch to execute */ and we
>
> Hm, but the reset is triggered after the first sleep (which checks for
> 100% busy). So even the non-hanging test flavour could be affected if the
> delay before execution were that long, and by that logic the usleep(100)
> (or more, for small-core CI) would then have to go into many tests. The
> only difference in the hang flavour is that, if the batch completely
> failed to run until after the reset, the idle assert would fail. So maybe
> I should just add an assert that the batch is idle before sampling the
> PMU after the reset?

Sneaky. Yes, that will work; just add a comment for the case where it may
fail (reset before batch execution).
-Chris
diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
index 3bbb18d2f216..f5c70776e2cf 100644
--- a/tests/perf_pmu.c
+++ b/tests/perf_pmu.c
@@ -168,6 +168,7 @@ static unsigned int e2ring(int gem_fd, const struct intel_execution_engine2 *e)
 #define TEST_TRAILING_IDLE (4)
 #define TEST_RUNTIME_PM (8)
 #define FLAG_LONG (16)
+#define FLAG_HANG (32)
 
 static void end_spin(int fd, igt_spin_t *spin, unsigned int flags)
 {
@@ -204,11 +205,27 @@ single(int gem_fd, const struct intel_execution_engine2 *e, unsigned int flags)
 	end_spin(gem_fd, spin, flags);
 	val = pmu_read_single(fd) - val;
 
-	end_spin(gem_fd, spin, FLAG_SYNC);
+	if (flags & FLAG_HANG)
+		igt_force_gpu_reset(gem_fd);
+	else
+		end_spin(gem_fd, spin, FLAG_SYNC);
+
+	assert_within_epsilon(val, flags & TEST_BUSY ? slept : 0.f, tolerance);
+
+	/* Check for idle after hang. */
+	if (flags & FLAG_HANG) {
+		/* Sleep for a bit for reset unwind to settle. */
+		sleep(1);
+		val = pmu_read_single(fd);
+		slept = measured_usleep(batch_duration_ns / 1000);
+		val = pmu_read_single(fd) - val;
+
+		assert_within_epsilon(val, 0, tolerance);
+	}
+
 	igt_spin_batch_free(gem_fd, spin);
 	close(fd);
 
-	assert_within_epsilon(val, flags & TEST_BUSY ? slept : 0.f, tolerance);
 	gem_quiescent_gpu(gem_fd);
 }
 
@@ -1690,6 +1707,9 @@ igt_main
 				      pct[i], e->name)
 				accuracy(fd, e, pct[i]);
 		}
+
+		igt_subtest_f("busy-hang-%s", e->name)
+			single(fd, e, TEST_BUSY | FLAG_HANG);
 	}
 
 	/*