From patchwork Thu Oct 18 15:28:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10647519 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BD6E1112B for ; Thu, 18 Oct 2018 15:29:07 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AE56328C76 for ; Thu, 18 Oct 2018 15:29:07 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A2AF828D97; Thu, 18 Oct 2018 15:29:07 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id F3DA728C76 for ; Thu, 18 Oct 2018 15:29:06 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C3A7D6E0BC; Thu, 18 Oct 2018 15:28:42 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8D5CB6E052 for ; Thu, 18 Oct 2018 15:28:40 +0000 (UTC) Received: by mail-wr1-x444.google.com with SMTP id n1-v6so34168556wrt.10 for ; Thu, 18 Oct 2018 08:28:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=7WMlJ1auIa7+yo/IAAkai1q06gDhnk2fvcXjM65lY0o=; b=N2faA8R3xzbclqv4PSN4zU2akX91ZOrIYBedxXPEzKA+GBXrm0PTMZSIaIoumEORfN 9ZdeN+2953HWPrAlrATzDSv/4jXk4qRUvZ5A6PCoS0SaGlk1HgQqiA5okuFuhZShVKAz 3aejV7hqCw4QJJhsPaNp5ajXlhF5W7Yi3ESiQIkIBQNoIr0/bdp8UVKujtJ/f9ntSn4b DotZO78i2kUrlDrDu11CdmBNb88gUyEIqDaSbHjqRh5V8qPgOTtK2Tnn1lwthntmnISy lL5dLR0JyyeNWj7DCplX1TnUw3aShy5UhcE5N3ahb9nGIhALF5xV6n+DJAtS7xpsHrQk CP9Q== X-Gm-Message-State: AGRZ1gLoLHQFfjuQHtiD7TPN3ZyIvmn9fx1laU2P7G75GJhwsSQdFR0c l/IlqNIQNlw2kjvzXBEPydaUR1oDW4s= X-Google-Smtp-Source: AJdET5e5m5PYOHmGTGhud3rYMO0OlFhIAfocjUpldBSI9DU2YI/gOMeZjuqnNCsCN4YmiON9+Grr2A== X-Received: by 2002:a5d:498c:: with SMTP id r12-v6mr2642560wrq.232.1539876518928; Thu, 18 Oct 2018 08:28:38 -0700 (PDT) Received: from localhost.localdomain ([91.110.193.16]) by smtp.gmail.com with ESMTPSA id i6-v6sm19530387wrq.4.2018.10.18.08.28.38 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 18 Oct 2018 08:28:38 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: igt-dev@lists.freedesktop.org Date: Thu, 18 Oct 2018 16:28:15 +0100 Message-Id: <20181018152815.31816-18-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> References: <20181018152815.31816-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH i-g-t 17/17] gem_wsim: Infinite batch support X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Intel-gfx@lists.freedesktop.org MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin For simulating frame split workloads it is useful to express a batch which ends at the same time as the parallel submission on the respective bonded engine. For this we add support for infinite batch durations and the batch terminate command ('T'). Syntax looks like this: 1.RCS.*.0.0 T.-1 First step starts an infinite batch, and second command terminates the infinite batch with the usual relative workload step addressing. Signed-off-by: Tvrtko Ursulin --- benchmarks/gem_wsim.c | 119 +++++++++++++++++++------ benchmarks/wsim/README | 9 +- benchmarks/wsim/frame-split-60fps.wsim | 6 +- 3 files changed, 102 insertions(+), 32 deletions(-) diff --git a/benchmarks/gem_wsim.c b/benchmarks/gem_wsim.c index b5ade7b33883..3669c1f7f1c9 100644 --- a/benchmarks/gem_wsim.c +++ b/benchmarks/gem_wsim.c @@ -85,6 +85,7 @@ enum w_type ENGINE_MAP, LOAD_BALANCE, BOND, + TERMINATE, }; struct deps @@ -112,6 +113,7 @@ struct w_step unsigned int context; unsigned int engine; struct duration duration; + bool unbound_duration; struct deps data_deps; struct deps fence_deps; int emit_fence; @@ -142,7 +144,7 @@ struct w_step struct drm_i915_gem_execbuffer2 eb; struct drm_i915_gem_exec_object2 *obj; - struct drm_i915_gem_relocation_entry reloc[4]; + struct drm_i915_gem_relocation_entry reloc[5]; unsigned long bb_sz; uint32_t bb_handle; uint32_t *mapped_batch; @@ -153,6 +155,7 @@ struct w_step uint32_t *rt1_address; uint32_t *latch_value; uint32_t *latch_address; + uint32_t *recursive_bb_start; unsigned int mapped_len; }; @@ -492,6 +495,10 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) step.type = ENGINE_MAP; goto add_step; + } else if (!strcmp(field, "T")) { + int_field(TERMINATE, target, + tmp >= 0 || ((int)nr_steps + tmp) < 0, + "Invalid terminate target at step %u!\n"); } else if (!strcmp(field, "X")) { unsigned int nr = 0; while ((field = strtok_r(fstart, ".", &fctx))) { @@ -598,23 +605,28 @@ parse_workload(struct w_arg *arg, unsigned int flags, struct workload *app_w) fstart = NULL; - tmpl = strtol(field, &sep, 10); - check_arg(tmpl <= 0 || tmpl == LONG_MIN || - tmpl == LONG_MAX, - "Invalid duration at step %u!\n", nr_steps); - step.duration.min = tmpl; - - if (sep && *sep == '-') { - tmpl = strtol(sep + 1, NULL, 10); - check_arg(tmpl <= 0 || - tmpl <= step.duration.min || - tmpl == LONG_MIN || + if (field[0] == '*') { + step.unbound_duration = true; + } else { + tmpl = strtol(field, &sep, 10); + check_arg(tmpl <= 0 || tmpl == LONG_MIN || tmpl == LONG_MAX, - "Invalid duration range at step %u!\n", + "Invalid duration at step %u!\n", nr_steps); - step.duration.max = tmpl; - } else { - step.duration.max = step.duration.min; + step.duration.min = tmpl; + + if (sep && *sep == '-') { + tmpl = strtol(sep + 1, NULL, 10); + check_arg(tmpl <= 0 || + tmpl <= step.duration.min || + tmpl == LONG_MIN || + tmpl == LONG_MAX, + "Invalid duration range at step %u!\n", + nr_steps); + step.duration.max = tmpl; + } else { + step.duration.max = step.duration.min; + } } valid++; @@ -773,7 +785,7 @@ init_bb(struct w_step *w, unsigned int flags) unsigned int i; uint32_t *ptr; - if (!arb_period) + if (w->unbound_duration || !arb_period) return; gem_set_domain(fd, w->bb_handle, @@ -793,6 +805,7 @@ terminate_bb(struct w_step *w, unsigned int flags) const uint32_t bbe = 0xa << 23; unsigned long mmap_start, mmap_len; unsigned long batch_start = w->bb_sz; + unsigned int r = 0; uint32_t *ptr, *cs; igt_assert(((flags & RT) && (flags & SEQNO)) || !(flags & RT)); @@ -803,6 +816,9 @@ terminate_bb(struct w_step *w, unsigned int flags) if (flags & RT) batch_start -= 12 * sizeof(uint32_t); + if (w->unbound_duration) + batch_start -= 4 * sizeof(uint32_t); /* MI_ARB_CHK + MI_BATCH_BUFFER_START */ + mmap_start = rounddown(batch_start, PAGE_SIZE); mmap_len = w->bb_sz - mmap_start; @@ -812,8 +828,19 @@ terminate_bb(struct w_step *w, unsigned int flags) ptr = gem_mmap__wc(fd, w->bb_handle, mmap_start, mmap_len, PROT_WRITE); cs = (uint32_t *)((char *)ptr + batch_start - mmap_start); + if (w->unbound_duration) { + w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t); + batch_start += 4 * sizeof(uint32_t); + + *cs++ = w->preempt_us ? 0x5 << 23 /* MI_ARB_CHK; */ : MI_NOOP; + w->recursive_bb_start = cs; + *cs++ = MI_BATCH_BUFFER_START | 1 << 8 | 1; + *cs++ = 0; + *cs++ = 0; + } + if (flags & SEQNO) { - w->reloc[0].offset = batch_start + sizeof(uint32_t); + w->reloc[r++].offset = batch_start + sizeof(uint32_t); batch_start += 4 * sizeof(uint32_t); *cs++ = MI_STORE_DWORD_IMM; @@ -825,7 +852,7 @@ terminate_bb(struct w_step *w, unsigned int flags) } if (flags & RT) { - w->reloc[1].offset = batch_start + sizeof(uint32_t); + w->reloc[r++].offset = batch_start + sizeof(uint32_t); batch_start += 4 * sizeof(uint32_t); *cs++ = MI_STORE_DWORD_IMM; @@ -835,7 +862,7 @@ terminate_bb(struct w_step *w, unsigned int flags) w->rt0_value = cs; *cs++ = 0; - w->reloc[2].offset = batch_start + 2 * sizeof(uint32_t); + w->reloc[r++].offset = batch_start + 2 * sizeof(uint32_t); batch_start += 4 * sizeof(uint32_t); *cs++ = 0x24 << 23 | 2; /* MI_STORE_REG_MEM */ @@ -844,7 +871,7 @@ terminate_bb(struct w_step *w, unsigned int flags) *cs++ = 0; *cs++ = 0; - w->reloc[3].offset = batch_start + sizeof(uint32_t); + w->reloc[r++].offset = batch_start + sizeof(uint32_t); batch_start += 4 * sizeof(uint32_t); *cs++ = MI_STORE_DWORD_IMM; @@ -979,19 +1006,28 @@ alloc_step_batch(struct workload *wrk, struct w_step *w, unsigned int flags) } } - w->bb_sz = get_bb_sz(w->duration.max); - w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz); + if (w->unbound_duration) + /* nops + MI_ARB_CHK + MI_BATCH_BUFFER_START */ + w->bb_sz = max(64, get_bb_sz(w->preempt_us)) + + (1 + 3) * sizeof(uint32_t); + else + w->bb_sz = get_bb_sz(w->duration.max); + w->bb_handle = w->obj[j].handle = gem_create(fd, w->bb_sz + (w->unbound_duration ? 4096 : 0)); init_bb(w, flags); terminate_bb(w, flags); - if (flags & SEQNO) { + if ((flags & SEQNO) || w->unbound_duration) { w->obj[j].relocs_ptr = to_user_pointer(&w->reloc); + if (flags & SEQNO) + w->obj[j].relocation_count++; if (flags & RT) - w->obj[j].relocation_count = 4; - else - w->obj[j].relocation_count = 1; + w->obj[j].relocation_count += 3; + if (w->unbound_duration) + w->obj[j].relocation_count++; for (i = 0; i < w->obj[j].relocation_count; i++) w->reloc[i].target_handle = 1; + if (w->unbound_duration) + w->reloc[0].target_handle = j; } w->eb.buffers_ptr = to_user_pointer(w->obj); @@ -1988,6 +2024,18 @@ update_bb_rt(struct w_step *w, enum intel_engine_id engine, uint32_t seqno) } } +static void +update_bb_start(struct w_step *w) +{ + if (!w->unbound_duration) + return; + + gem_set_domain(fd, w->bb_handle, + I915_GEM_DOMAIN_WC, I915_GEM_DOMAIN_WC); + + *w->recursive_bb_start = MI_BATCH_BUFFER_START | (1 << 8) | 1; +} + static void w_sync_to(struct workload *wrk, struct w_step *w, int target) { if (target < 0) @@ -2123,9 +2171,13 @@ do_eb(struct workload *wrk, struct w_step *w, enum intel_engine_id engine, if (flags & RT) update_bb_rt(w, engine, seqno); + update_bb_start(w); + w->eb.batch_start_offset = + w->unbound_duration ? + 0 : ALIGN(w->bb_sz - get_bb_sz(get_duration(w)), - 2 * sizeof(uint32_t)); + 2 * sizeof(uint32_t)); for (i = 0; i < w->fence_deps.nr; i++) { int tgt = w->idx + w->fence_deps.list[i]; @@ -2265,6 +2317,17 @@ static void *run_workload(void *data) w->priority; } continue; + } else if (w->type == TERMINATE) { + unsigned int t_idx = i + w->target; + + igt_assert(t_idx >= 0 && t_idx < i); + igt_assert(wrk->steps[t_idx].type == BATCH); + igt_assert(wrk->steps[t_idx].unbound_duration); + + *wrk->steps[t_idx].recursive_bb_start = + MI_BATCH_BUFFER_END; + __sync_synchronize(); + continue; } else if (w->type == PREEMPTION || w->type == ENGINE_MAP || w->type == LOAD_BALANCE || diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README index f2974992ab68..439ea3650e3d 100644 --- a/benchmarks/wsim/README +++ b/benchmarks/wsim/README @@ -2,11 +2,11 @@ Workload descriptor format ========================== ctx.engine.duration_us.dependency.wait,... -..[-].[/][...].<0|1>,... +..[-]|*.[/][...].<0|1>,... B. M..[|]... P|X.. -d|p|s|t|q|a.,... +d|p|s|t|q|a|T.,... b... f @@ -30,6 +30,7 @@ Additional workload steps are also supported: 'b' - Set up engine bonds. 'M' - Set up engine map. 'P' - Context priority. + 'T' - Terminate an infinite batch. 'X' - Context preemption control. Engine ids: DEFAULT, RCS, BCS, VCS, VCS1, VCS2, VECS @@ -77,6 +78,10 @@ Example: I this case the last step has a data dependency on both first and second steps. +Batch durations can also be specified as infinite by using the '*' in the +duration field. Such batches must be ended by the terminate command ('T') +otherwise they will cause a GPU hang to be reported. + Sync (fd) fences ---------------- diff --git a/benchmarks/wsim/frame-split-60fps.wsim b/benchmarks/wsim/frame-split-60fps.wsim index cfbfcd39be7d..ea89da3add48 100644 --- a/benchmarks/wsim/frame-split-60fps.wsim +++ b/benchmarks/wsim/frame-split-60fps.wsim @@ -6,10 +6,12 @@ M.2.VCS2 B.2 b.2.1.VCS1 f -1.DEFAULT.4000-6000.f-1.0 +1.DEFAULT.*.f-1.0 2.DEFAULT.4000-6000.s-1.0 a.-3 -3.RCS.2000-4000.-3/-2.0 +s.-2 +T.-4 +3.RCS.2000-4000.-5/-4.0 3.VECS.2000.-1.0 4.BCS.1000.-1.0 s.-2