From patchwork Wed May 22 13:05:21 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lionel Landwerlin
X-Patchwork-Id: 10955821
From: Lionel Landwerlin
To: intel-gfx@lists.freedesktop.org
Date: Wed, 22 May 2019 14:05:21 +0100
Message-Id: <20190522130524.10223-3-lionel.g.landwerlin@intel.com>
X-Mailer: git-send-email 2.21.0.392.gf8f6787159e
In-Reply-To: <20190522130524.10223-1-lionel.g.landwerlin@intel.com>
References: <20190522130524.10223-1-lionel.g.landwerlin@intel.com>
Subject: [Intel-gfx] [PATCH v2 2/5] drm/i915/perf: allow holding preemption on filtered ctx
List-Id: Intel graphics driver community testing & development

We would like to make use of perf in Vulkan. The Vulkan API is much
lower level than OpenGL, with applications directly exposed to the
concept of command buffers (pretty much equivalent to our batch
buffers). In Vulkan, queries are always limited in scope to a command
buffer. In OpenGL, the lack of a command buffer concept meant that a
query's duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify
measuring performance just by measuring the deltas between the counter
snapshots written by 2 MI_REPORT_PERF_COUNT commands, rather than the
more complex scheme we currently have in the GL driver, using 2
MI_REPORT_PERF_COUNT commands and doing some post processing on the
stream of OA reports, coming from the global OA buffer, to remove any
unrelated deltas in between the 2 MI_REPORT_PERF_COUNT.

Disabling preemption only applies to the single context whose
performance counters we want to query, and is considered a privileged
operation, by default protected by CAP_SYS_ADMIN. It is possible to
enable it for a normal user by disabling the paranoid stream setting
(the dev.i915.perf_stream_paranoid sysctl).

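A minimal sketch of the simplified measurement described above, assuming (hypothetically) that each counter snapshot can be viewed as a flat array of 32-bit counters. The struct, SNAPSHOT_COUNTERS and both function names are illustrative only; they are not part of this series and do not describe the real OA report layout. With preemption held on the filtered context, other contexts should not be able to preempt the measured batch between the two snapshots, so a plain per-counter subtraction is enough:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: a snapshot is treated as a flat array of 32-bit
 * counters. Real OA report formats differ per platform and per format.
 */
#define SNAPSHOT_COUNTERS 4

struct perf_snapshot {
	uint32_t counters[SNAPSHOT_COUNTERS];
};

/*
 * Unsigned subtraction wraps modulo 2^32, so a single counter overflow
 * between the two snapshots still yields the right delta.
 */
uint32_t counter_delta(uint32_t begin, uint32_t end)
{
	return end - begin;
}

/* Fill deltas[] with end - begin for every counter of the snapshot. */
void snapshot_delta(const struct perf_snapshot *begin,
		    const struct perf_snapshot *end,
		    uint32_t *deltas)
{
	assert(begin && end && deltas);

	for (size_t i = 0; i < SNAPSHOT_COUNTERS; i++)
		deltas[i] = counter_delta(begin->counters[i],
					  end->counters[i]);
}
```

Without the preemption hold, the same subtraction could include work from unrelated contexts, which is what the post processing in the GL driver has to filter out.
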
v2: Store preemption setting in intel_context (Chris)

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/gt/intel_context.c       |  1 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |  3 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  2 +-
 drivers/gpu/drm/i915/i915_drv.c               |  2 +-
 drivers/gpu/drm/i915/i915_perf.c              | 37 +++++++++++++++----
 include/uapi/drm/i915_drm.h                   | 10 +++++
 6 files changed, 46 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 5b31e1e05ddd..68a4b888fb1a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -117,6 +117,7 @@ intel_context_init(struct intel_context *ce,
 	ce->ops = engine->cops;
 	ce->sseu = engine->sseu;
 	ce->saturated = 0;
+	ce->arb_enable = MI_ARB_ENABLE;
 
 	INIT_LIST_HEAD(&ce->signal_link);
 	INIT_LIST_HEAD(&ce->signals);
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 963a312430e6..07f586e3608d 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -65,6 +65,9 @@ struct intel_context {
 
 	/** sseu: Control eu/slice partitioning */
 	struct intel_sseu sseu;
+
+	/** arb_enable: Control preemption */
+	u32 arb_enable;
 };
 
 #endif /* __INTEL_CONTEXT_TYPES__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 13a90404e0f6..4137739daf2d 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2422,7 +2422,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
 
-	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+	*cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
 
 	*cs++ = MI_BATCH_BUFFER_START_GEN8 |
 		(flags & I915_DISPATCH_SECURE ? 0 : BIT(8));
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 74e89f880ea0..72274cbe7d11 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -471,7 +471,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = INTEL_INFO(dev_priv)->has_coherent_ggtt;
 		break;
 	case I915_PARAM_PERF_REVISION:
-		value = 1;
+		value = 2;
 		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c4995d5a16d2..321014ca1592 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -341,6 +341,8 @@ static const struct i915_oa_format gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = {
  * struct perf_open_properties - for validated properties given to open a stream
  * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
  * @single_context: Whether a single or all gpu contexts should be monitored
+ * @context_disable_preemption: Whether the preemption is disabled for the
+ *                              filtered context
  * @ctx_handle: A gem ctx handle for use with @single_context
  * @metrics_set: An ID for an OA unit metric set advertised via sysfs
  * @oa_format: An OA unit HW report format
@@ -355,6 +357,7 @@ struct perf_open_properties {
 	u32 sample_flags;
 
 	u64 single_context:1;
+	u64 context_disable_preemption:1;
 	u64 ctx_handle;
 
 	/* OA sampling state */
@@ -1201,7 +1204,8 @@ static int i915_oa_read(struct i915_perf_stream *stream,
 }
 
 static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
-					    struct i915_gem_context *ctx)
+					    struct i915_gem_context *ctx,
+					    bool disable_preemption)
 {
 	struct i915_gem_engines_iter it;
 	struct intel_context *ce;
@@ -1222,6 +1226,7 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 		err = intel_context_pin(ce);
 		if (err == 0) {
 			i915->perf.oa.pinned_ctx = ce;
+			ce->arb_enable = disable_preemption ? MI_ARB_DISABLE : MI_ARB_ENABLE;
 			break;
 		}
 	}
@@ -1237,19 +1242,22 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 /**
  * oa_get_render_ctx_id - determine and hold ctx hw id
  * @stream: An i915-perf stream opened for OA metrics
+ * @disable_preemption: Whether to disable preemption on the context
  *
  * Determine the render context hw id, and ensure it remains fixed for the
  * lifetime of the stream. This ensures that we don't have to worry about
- * updating the context ID in OACONTROL on the fly.
+ * updating the context ID in OACONTROL on the fly. Also disable preemption on
+ * the context if needed.
  *
  * Returns: zero on success or a negative error code
  */
-static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
+static int oa_get_render_ctx_id(struct i915_perf_stream *stream,
+				bool disable_preemption)
 {
 	struct drm_i915_private *i915 = stream->dev_priv;
 	struct intel_context *ce;
 
-	ce = oa_pin_context(i915, stream->ctx);
+	ce = oa_pin_context(i915, stream->ctx, disable_preemption);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
@@ -1337,6 +1345,7 @@ static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
 	ce = fetch_and_zero(&dev_priv->perf.oa.pinned_ctx);
 	if (ce) {
 		mutex_lock(&dev_priv->drm.struct_mutex);
+		ce->arb_enable = MI_ARB_ENABLE;
 		intel_context_unpin(ce);
 		mutex_unlock(&dev_priv->drm.struct_mutex);
 	}
@@ -2085,7 +2094,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	dev_priv->perf.oa.period_exponent = props->oa_period_exponent;
 
 	if (stream->ctx) {
-		ret = oa_get_render_ctx_id(stream);
+		ret = oa_get_render_ctx_id(stream, props->context_disable_preemption);
 		if (ret) {
 			DRM_DEBUG("Invalid context id to filter with\n");
 			return ret;
@@ -2583,6 +2592,15 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 		}
 	}
 
+	if (props->context_disable_preemption) {
+		if (!props->single_context) {
+			DRM_DEBUG("preemption disable with no context\n");
+			ret = -EINVAL;
+			goto err;
+		}
+		privileged_op = true;
+	}
+
 	/*
 	 * On Haswell the OA unit supports clock gating off for a specific
 	 * context and in this mode there's no visibility of metrics for the
@@ -2597,8 +2615,10 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	 * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to
 	 * enable the OA unit by default.
 	 */
-	if (IS_HASWELL(dev_priv) && specific_ctx)
+	if (IS_HASWELL(dev_priv) && specific_ctx &&
+	    !props->context_disable_preemption) {
 		privileged_op = false;
+	}
 
 	/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
 	 * we check a dev.i915.perf_stream_paranoid sysctl option
@@ -2607,7 +2627,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	 */
 	if (privileged_op &&
 	    i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
-		DRM_DEBUG("Insufficient privileges to open system-wide i915 perf stream\n");
+		DRM_DEBUG("Insufficient privileges to open i915 perf stream\n");
 		ret = -EACCES;
 		goto err_ctx;
 	}
@@ -2799,6 +2819,9 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 			props->oa_periodic = true;
 			props->oa_period_exponent = value;
 			break;
+		case DRM_I915_PERF_PROP_HOLD_PREEMPTION:
+			props->context_disable_preemption = value != 0 ? 1 : 0;
+			break;
 		case DRM_I915_PERF_PROP_MAX:
 			MISSING_CASE(id);
 			return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 9708969f2fcb..9acb455fad39 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1889,6 +1889,16 @@ enum drm_i915_perf_property_id {
 	 */
 	DRM_I915_PERF_PROP_OA_EXPONENT,
 
+	/**
+	 * Specifying this property is only valid when specifying a context to
+	 * filter with DRM_I915_PERF_PROP_CTX_HANDLE. Specifying this property
+	 * will hold preemption of the particular context we want to gather
+	 * performance data about.
+	 *
+	 * This property is available in perf revision 2.
+	 */
+	DRM_I915_PERF_PROP_HOLD_PREEMPTION,
+
 	DRM_I915_PERF_PROP_MAX /* non-ABI */
 };
 
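
For illustration, the property validation and privilege check that this patch adds to i915_perf_open_ioctl_locked() can be modelled in plain userspace C. This is only a sketch: the sketch_* names are hypothetical, the property ids are local stand-ins rather than the values from i915_drm.h, and the Haswell, paranoid and CAP_SYS_ADMIN inputs are passed in as plain integers instead of being queried from the system.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Local stand-ins for the uapi property ids. The numeric values are
 * illustrative; the real ones come from include/uapi/drm/i915_drm.h.
 */
enum sketch_prop_id {
	SKETCH_PROP_CTX_HANDLE = 1,
	SKETCH_PROP_HOLD_PREEMPTION,
};

struct sketch_props {
	unsigned int single_context : 1;
	unsigned int hold_preemption : 1;
	uint64_t ctx_handle;
};

/*
 * Parse n_pairs (id, value) pairs of u64, the shape in which the perf
 * open ioctl receives its properties. Ids this sketch does not model
 * are rejected.
 */
int sketch_read_props(const uint64_t *pairs, size_t n_pairs,
		      struct sketch_props *props)
{
	assert(pairs && props);
	*props = (struct sketch_props){ 0 };

	for (size_t i = 0; i < n_pairs; i++) {
		uint64_t id = pairs[2 * i];
		uint64_t value = pairs[2 * i + 1];

		switch (id) {
		case SKETCH_PROP_CTX_HANDLE:
			props->single_context = 1;
			props->ctx_handle = value;
			break;
		case SKETCH_PROP_HOLD_PREEMPTION:
			props->hold_preemption = value != 0;
			break;
		default:
			return -EINVAL;
		}
	}
	return 0;
}

/* Returns 0 if the stream may be opened, or a negative errno. */
int sketch_check_open(const struct sketch_props *props, int is_haswell,
		      int paranoid, int has_cap_sys_admin)
{
	int privileged_op = 1;

	/* Holding preemption only makes sense on a filtered context. */
	if (props->hold_preemption && !props->single_context)
		return -EINVAL;

	/* The Haswell single-context exemption is lost when holding. */
	if (is_haswell && props->single_context && !props->hold_preemption)
		privileged_op = 0;

	if (privileged_op && paranoid && !has_cap_sys_admin)
		return -EACCES;

	return 0;
}
```

Under these rules, a request that holds preemption without a context handle is rejected with -EINVAL, and one that holds preemption on a filtered context needs CAP_SYS_ADMIN unless dev.i915.perf_stream_paranoid is cleared, even on Haswell.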