From patchwork Fri Nov 13 18:03:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 11904411 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BBE1C63699 for ; Fri, 13 Nov 2020 18:04:33 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 15097208D5 for ; Fri, 13 Nov 2020 18:04:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 15097208D5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1EFC76E7EF; Fri, 13 Nov 2020 18:04:31 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id CAFC06E598 for ; Fri, 13 Nov 2020 18:04:26 +0000 (UTC) IronPort-SDR: aAA6vDvGquPOxGtcy1O4rBOzmSWMPcRKcfLyaiz9RyHH0UNFtdFLhRGSG7Diyf7OMPQqna8MlE Uryj+ZV4jOaw== X-IronPort-AV: E=McAfee;i="6000,8403,9804"; a="167931879" X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="167931879" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 IronPort-SDR: dvcf2XTBSz4LxW+2DnpbFXAsGemtkTwy6FGuB2sTc62IXtQZ3GFAkZHTsoyse4qDefDrdwqVc0 9vxHhmg1/DzA== X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="474752358" Received: from orsosgc001.ra.intel.com ([10.23.184.150]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:03:59 -0800 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Nov 2020 10:03:53 -0800 Message-Id: <20201113180359.54410-2-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> References: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/7] drm/i915/perf: Ensure observation logic is not clock gated X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Piotr Maciejewski A clock gating switch can control if the performance monitoring and observation logic is enaled or not. Ensure that we enable the clocks. v2: Separate code from other patches (Lionel) v3: Reset PMON enable when disabling perf to save power (Lionel) v4: Use intel_uncore_rmw and REG_BIT (Chris) Fixes: 00a7f0d7155c ("drm/i915/tgl: Add perf support on TGL") Signed-off-by: Piotr Maciejewski Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 9 +++++++++ drivers/gpu/drm/i915/i915_reg.h | 2 ++ 2 files changed, 11 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index e94976976571..df5166d89d82 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -2520,6 +2520,12 @@ gen12_enable_metric_set(struct i915_perf_stream *stream, (period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT)) : 0); + /* + * Initialize Super Queue Internal Cnt Register + * Set PMON Enable in order to collect valid metrics. + */ + intel_uncore_rmw(uncore, GEN12_SQCNT1, 0, GEN12_SQCNT1_PMON_ENABLE); + /* * Update all contexts prior writing the mux configurations as we need * to make sure all slices/subslices are ON before writing to NOA @@ -2579,6 +2585,9 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream) /* Make sure we disable noa to save power. */ intel_uncore_rmw(uncore, RPM_CONFIG1, GEN10_GT_NOA_ENABLE, 0); + + /* Reset PMON Enable to save power. */ + intel_uncore_rmw(uncore, GEN12_SQCNT1, GEN12_SQCNT1_PMON_ENABLE, 0); } static void gen7_oa_enable(struct i915_perf_stream *stream) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 7ea70b7ffcc6..4137ca9d3315 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -699,6 +699,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define OABUFFER_SIZE_16M (7 << 3) #define GEN12_OA_TLB_INV_CR _MMIO(0xceec) +#define GEN12_SQCNT1 _MMIO(0x8718) +#define GEN12_SQCNT1_PMON_ENABLE REG_BIT(30) /* Gen12 OAR unit */ #define GEN12_OAR_OACONTROL _MMIO(0x2960) From patchwork Fri Nov 13 18:03:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 11904407 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CFD23C63697 for ; Fri, 13 Nov 2020 18:04:28 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6E616208D5 for ; Fri, 13 Nov 2020 18:04:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6E616208D5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4FEBB6E598; Fri, 13 Nov 2020 18:04:27 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id 04A756E598 for ; Fri, 13 Nov 2020 18:04:25 +0000 (UTC) IronPort-SDR: ZnSy/cEjmNUQVQxCwvU1lDuRl+djwf49/E8icJgnwvJ/O2w15fvccIiY7NEb+6lFDV5CmONCQ6 jzAUpexS4CEg== X-IronPort-AV: E=McAfee;i="6000,8403,9804"; a="167931875" X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="167931875" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 IronPort-SDR: neqdk+DzYg9FJhuVG3aKpIxG8D+CUYcSBJtg9RWhecMvR0v9qSSVj69OQnroX5W8bdTsI61/TN MgSMkT51wMbQ== X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="474752361" Received: from orsosgc001.ra.intel.com ([10.23.184.150]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:03:59 -0800 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Nov 2020 10:03:54 -0800 Message-Id: <20201113180359.54410-3-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> References: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/7] drm/i915/gt: Lock intel_engine_apply_whitelist with uncore->lock X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Refactor intel_engine_apply_whitelist into locked and unlocked versions so that a caller who already has the lock can apply whitelist. v2: Fix sparse warning v3: (Chris) - Drop prefix and suffix for static function - Use longest to shortest line ordering for variable declaration Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 44 +++++++++++++++------ 1 file changed, 31 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index fed9503a7c4e..fa8c9d86847d 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1337,7 +1337,8 @@ void intel_gt_init_workarounds(struct drm_i915_private *i915) } static enum forcewake_domains -wal_get_fw_for_rmw(struct intel_uncore *uncore, const struct i915_wa_list *wal) +wal_get_fw(struct intel_uncore *uncore, const struct i915_wa_list *wal, + unsigned int op) { enum forcewake_domains fw = 0; struct i915_wa *wa; @@ -1346,8 +1347,7 @@ wal_get_fw_for_rmw(struct intel_uncore *uncore, const struct i915_wa_list *wal) for (i = 0, wa = wal->list; i < wal->count; i++, wa++) fw |= intel_uncore_forcewake_for_reg(uncore, wa->reg, - FW_REG_READ | - FW_REG_WRITE); + op); return fw; } @@ -1377,7 +1377,7 @@ wa_list_apply(struct intel_uncore *uncore, const struct i915_wa_list *wal) if (!wal->count) return; - fw = wal_get_fw_for_rmw(uncore, wal); + fw = wal_get_fw(uncore, wal, FW_REG_READ | FW_REG_WRITE); spin_lock_irqsave(&uncore->lock, flags); intel_uncore_forcewake_get__locked(uncore, fw); @@ -1703,27 +1703,45 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine) wa_init_finish(w); } -void intel_engine_apply_whitelist(struct intel_engine_cs *engine) +static void __engine_apply_whitelist(struct intel_engine_cs *engine) { const struct i915_wa_list *wal = &engine->whitelist; struct intel_uncore *uncore = engine->uncore; const u32 base = engine->mmio_base; + enum forcewake_domains fw; struct i915_wa *wa; unsigned int i; - if (!wal->count) - return; + lockdep_assert_held(&uncore->lock); + + fw = wal_get_fw(uncore, wal, FW_REG_WRITE); + intel_uncore_forcewake_get__locked(uncore, fw); for (i = 0, wa = wal->list; i < wal->count; i++, wa++) - intel_uncore_write(uncore, - RING_FORCE_TO_NONPRIV(base, i), - i915_mmio_reg_offset(wa->reg)); + intel_uncore_write_fw(uncore, + RING_FORCE_TO_NONPRIV(base, i), + i915_mmio_reg_offset(wa->reg)); /* And clear the rest just in case of garbage */ for (; i < RING_MAX_NONPRIV_SLOTS; i++) - intel_uncore_write(uncore, - RING_FORCE_TO_NONPRIV(base, i), - i915_mmio_reg_offset(RING_NOPID(base))); + intel_uncore_write_fw(uncore, + RING_FORCE_TO_NONPRIV(base, i), + i915_mmio_reg_offset(RING_NOPID(base))); + + intel_uncore_forcewake_put__locked(uncore, fw); +} + +void intel_engine_apply_whitelist(struct intel_engine_cs *engine) +{ + const struct i915_wa_list *wal = &engine->whitelist; + unsigned long flags; + + if (!wal->count) + return; + + spin_lock_irqsave(&engine->uncore->lock, flags); + __engine_apply_whitelist(engine); + spin_unlock_irqrestore(&engine->uncore->lock, flags); } static void From patchwork Fri Nov 13 18:03:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 11904405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 165DFC63699 for ; Fri, 13 Nov 2020 18:04:30 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C5A6A208D5 for ; Fri, 13 Nov 2020 18:04:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C5A6A208D5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BD4236E7D3; Fri, 13 Nov 2020 18:04:27 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id 89D4E6E598 for ; Fri, 13 Nov 2020 18:04:25 +0000 (UTC) IronPort-SDR: VrxsIJYUBCCAT7Ja7fHiP9ck/M6poitnZrEU0vRR6viglQYeHbEkY1V1XAXRKBW9IurIa0MIv6 5cvID22h4dXA== X-IronPort-AV: E=McAfee;i="6000,8403,9804"; a="167931876" X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="167931876" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 IronPort-SDR: eOjtZJjXGNo+gQ10IQZiv4kSeo+OmjfAwYoaoKAjz6UYikRTuCfkK33c/loKY/L8s/8DDz/rIR WZO5Duzzuo4w== X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="474752364" Received: from orsosgc001.ra.intel.com ([10.23.184.150]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Nov 2020 10:03:55 -0800 Message-Id: <20201113180359.54410-4-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> References: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/7] drm/i915/gt: Add a reference to the engine in i915_wa_list X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Applying OA whitelist dynamically also demands that we update the wal list in sync with engine resume. The lock currently used to synchronize this is engine->uncore->lock. To use this lock, make i915_wa_list aware of the engine. Signed-off-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 12 +++++++----- drivers/gpu/drm/i915/gt/intel_workarounds_types.h | 1 + drivers/gpu/drm/i915/gt/selftest_workarounds.c | 4 ++-- 3 files changed, 10 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index fa8c9d86847d..0f9d2a65dcfe 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -83,10 +83,12 @@ const struct i915_rev_steppings tgl_revids[] = { [1] = { .gt_stepping = TGL_REVID_B0, .disp_stepping = TGL_REVID_D0 }, }; -static void wa_init_start(struct i915_wa_list *wal, const char *name, const char *engine_name) +static void wa_init_start(struct i915_wa_list *wal, const char *name, const char *engine_name, + struct intel_engine_cs *engine) { wal->name = name; wal->engine_name = engine_name; + wal->engine = engine; } #define WA_LIST_CHUNK (1 << 4) @@ -696,7 +698,7 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine, if (engine->class != RENDER_CLASS) return; - wa_init_start(wal, name, engine->name); + wa_init_start(wal, name, engine->name, engine); if (IS_DG1(i915)) dg1_ctx_workarounds_init(engine, wal); @@ -1331,7 +1333,7 @@ void intel_gt_init_workarounds(struct drm_i915_private *i915) { struct i915_wa_list *wal = &i915->gt_wa_list; - wa_init_start(wal, "GT", "global"); + wa_init_start(wal, "GT", "global", NULL); gt_init_workarounds(i915, wal); wa_init_finish(wal); } @@ -1673,7 +1675,7 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine) struct drm_i915_private *i915 = engine->i915; struct i915_wa_list *w = &engine->whitelist; - wa_init_start(w, "whitelist", engine->name); + wa_init_start(w, "whitelist", engine->name, engine); if (IS_DG1(i915)) dg1_whitelist_build(engine); @@ -2073,7 +2075,7 @@ void intel_engine_init_workarounds(struct intel_engine_cs *engine) if (INTEL_GEN(engine->i915) < 4) return; - wa_init_start(wal, "engine", engine->name); + wa_init_start(wal, "engine", engine->name, engine); engine_init_workarounds(engine, wal); wa_init_finish(wal); } diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h index d166a7145720..e562fd43697b 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h +++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h @@ -24,6 +24,7 @@ struct i915_wa_list { struct i915_wa *list; unsigned int count; unsigned int wa_count; + struct intel_engine_cs *engine; }; #endif /* __INTEL_WORKAROUNDS_TYPES_H__ */ diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 61a0532d0f3d..fa616d122a23 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -65,14 +65,14 @@ reference_lists_init(struct intel_gt *gt, struct wa_lists *lists) memset(lists, 0, sizeof(*lists)); - wa_init_start(&lists->gt_wa_list, "GT_REF", "global"); + wa_init_start(&lists->gt_wa_list, "GT_REF", "global", NULL); gt_init_workarounds(gt->i915, &lists->gt_wa_list); wa_init_finish(&lists->gt_wa_list); for_each_engine(engine, gt, id) { struct i915_wa_list *wal = &lists->engine[id].wa_list; - wa_init_start(wal, "REF", engine->name); + wa_init_start(wal, "REF", engine->name, engine); engine_init_workarounds(engine, wal); wa_init_finish(wal); From patchwork Fri Nov 13 18:03:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 11904417 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97DF1C55ABD for ; Fri, 13 Nov 2020 18:04:32 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 30300208D5 for ; Fri, 13 Nov 2020 18:04:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 30300208D5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 13E996E7EA; Fri, 13 Nov 2020 18:04:30 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8B9B26E7D2 for ; Fri, 13 Nov 2020 18:04:27 +0000 (UTC) IronPort-SDR: IgwNVUo6PWif2AFu00ee+4mqmSy/dyX/AqKl77m0FsOlmfA2VzoaDTbUJA0ZhMxRI2x5nFjk9o m+58uJG15D+Q== X-IronPort-AV: E=McAfee;i="6000,8403,9804"; a="167931881" X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="167931881" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 IronPort-SDR: e7A89P9+aTu+foUIw5VjrCQIonizTrC+zsIBHFf+csdmTKaD0tBb2nlLHKfNz19UHnBlF6xHiw GbPGVWJUTFQg== X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="474752368" Received: from orsosgc001.ra.intel.com ([10.23.184.150]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Nov 2020 10:03:56 -0800 Message-Id: <20201113180359.54410-5-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> References: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 4/7] drm/i915/perf: Whitelist OA report trigger registers X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" OA reports can be triggered into the OA buffer by writing into the OAREPORTTRIG registers. Whitelist the registers to allow non-privileged user to trigger reports. Whitelist registers only if perf_stream_paranoid is set to 0. In i915_perf_open_ioctl, this setting is checked and the whitelist is enabled accordingly. On closing the perf fd, the whitelist is removed. This ensures that the access to the whitelist is gated by perf_stream_paranoid. v2: - Move related change to this patch (Lionel) - Bump up perf revision (Lionel) v3: Pardon whitelisted registers for selftest (Umesh) v4: Document supported gens for the feature (Lionel) v5: Whitelist registers only if perf_stream_paranoid is set to 0 (Jon) v6: Move oa whitelist array to i915_perf (Chris) v7: Fix OA writing beyond the wal->list memory (CI) v8: Protect write to engine whitelist registers v9: (Umesh) - Use uncore->lock to protect write to forcepriv regs - In case wal->count falls to zero on _wa_remove, make sure you still clear the registers. Remove wal->count check when applying whitelist. v10: (Umesh) - Split patches modifying intel_workarounds - On some platforms there are no whitelisted regs. intel_engine_resume applies whitelist on these platforms too and the wal->count gates such platforms. Bring back the wal->count check. - intel_engine_allow/deny_user_register_access modifies the engine whitelist and the wal->count. Use uncore->lock to serialize it with intel_engine_apply_whitelist. - Grow the wal->list when adding whitelist registers after driver load. v11: - Fix memory leak in _wa_list_grow (Chris) - Serialize kfree with engine resume using uncore->lock (Umesh) - Grow the list only if wal->count is not aligned (Umesh) Signed-off-by: Piotr Maciejewski Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 138 ++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_workarounds.h | 7 + .../gpu/drm/i915/gt/intel_workarounds_types.h | 5 + drivers/gpu/drm/i915/i915_perf.c | 79 +++++++++- drivers/gpu/drm/i915/i915_perf_types.h | 8 + 5 files changed, 234 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 0f9d2a65dcfe..722f11e43dad 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -114,6 +114,80 @@ static void wa_init_finish(struct i915_wa_list *wal) wal->wa_count, wal->name, wal->engine_name); } +static int _wa_index(struct i915_wa_list *wal, i915_reg_t reg) +{ + unsigned int addr = i915_mmio_reg_offset(reg); + int start = 0, end = wal->count; + + /* addr and wal->list[].reg, both include the R/W flags */ + while (start < end) { + unsigned int mid = start + (end - start) / 2; + + if (i915_mmio_reg_offset(wal->list[mid].reg) < addr) + start = mid + 1; + else if (i915_mmio_reg_offset(wal->list[mid].reg) > addr) + end = mid; + else + return mid; + } + + return -ENOENT; +} + +static int _wa_list_grow(struct i915_wa_list *wal, size_t count) +{ + struct i915_wa *list; + unsigned long flags; + + list = kmalloc_array(ALIGN(count + 1, WA_LIST_CHUNK), sizeof(*list), + GFP_KERNEL); + if (!list) { + DRM_ERROR("No space for workaround init!\n"); + return -ENOMEM; + } + + if (wal->list) + memcpy(list, wal->list, sizeof(*list) * count); + + /* + * Most wal->lists are only modified during driver init and do not + * require to be serialized with application of workarounds. The one + * case where this is required is when OA adds/removes whitelist entries + * from the wal->list and engine resume is in progress. + */ + if (wal->engine) + spin_lock_irqsave(&wal->engine->uncore->lock, flags); + + kfree(wal->list); + wal->list = list; + + if (wal->engine) + spin_unlock_irqrestore(&wal->engine->uncore->lock, flags); + + return 0; +} + +static void _wa_remove(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) +{ + int index; + struct i915_wa *wa = wal->list; + + reg.reg |= flags; + + index = _wa_index(wal, reg); + if (GEM_DEBUG_WARN_ON(index < 0)) + return; + + memset(wa + index, 0, sizeof(*wa)); + + while (index < wal->count - 1) { + swap(wa[index], wa[index + 1]); + index++; + } + + wal->count--; +} + static void _wa_add(struct i915_wa_list *wal, const struct i915_wa *wa) { unsigned int addr = i915_mmio_reg_offset(wa->reg); @@ -2080,6 +2154,70 @@ void intel_engine_init_workarounds(struct intel_engine_cs *engine) wa_init_finish(wal); } +int intel_engine_allow_user_register_access(struct intel_engine_cs *engine, + struct i915_whitelist_reg *reg, + u32 count) +{ + unsigned long flags; + struct i915_wa_list *wal; + int ret; + + if (!engine || !reg || !count) + return -EINVAL; + + wal = &engine->whitelist; + + /* + * i915 compacts the wa list by calling wa_init_finish during driver + * load. If we want to add additional workarounds after driver load, + * we need to grow the list. _wa_list_grow will add at least one free + * slot for a workaround. Any additional slot required are added by + * _wa_add in the below for loop. + * + * Once we remove the workarounds, we compact the list again in + * intel_engine_deny_user_register_access by calling wa_init_finish. + * + * Note that we want to grow the list here only if wal->count is not + * aligned. If it is aligned, whitelist_reg_ext will grow the list. + * + */ + if (!IS_ALIGNED(wal->count, WA_LIST_CHUNK)) { + ret = _wa_list_grow(wal, wal->count); + if (ret < 0) + return ret; + } + + spin_lock_irqsave(&engine->uncore->lock, flags); + for (; count--; reg++) + whitelist_reg_ext(wal, reg->reg, reg->flags); + + __engine_apply_whitelist(engine); + spin_unlock_irqrestore(&engine->uncore->lock, flags); + + return 0; +} + +void intel_engine_deny_user_register_access(struct intel_engine_cs *engine, + struct i915_whitelist_reg *reg, + u32 count) +{ + unsigned long flags; + struct i915_wa_list *wal; + + if (!engine || !reg || !count) + return; + + wal = &engine->whitelist; + spin_lock_irqsave(&engine->uncore->lock, flags); + for (; count--; reg++) + _wa_remove(wal, reg->reg, reg->flags); + + __engine_apply_whitelist(engine); + spin_unlock_irqrestore(&engine->uncore->lock, flags); + + wa_init_finish(wal); +} + void intel_engine_apply_workarounds(struct intel_engine_cs *engine) { wa_list_apply(engine->uncore, &engine->wa_list); diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.h b/drivers/gpu/drm/i915/gt/intel_workarounds.h index 8c9c769c2204..558c21b7d4cb 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.h +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.h @@ -37,4 +37,11 @@ void intel_engine_apply_workarounds(struct intel_engine_cs *engine); int intel_engine_verify_workarounds(struct intel_engine_cs *engine, const char *from); +int intel_engine_allow_user_register_access(struct intel_engine_cs *engine, + struct i915_whitelist_reg *reg, + u32 count); +void intel_engine_deny_user_register_access(struct intel_engine_cs *engine, + struct i915_whitelist_reg *reg, + u32 count); + #endif diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h index e562fd43697b..53372546be9b 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds_types.h +++ b/drivers/gpu/drm/i915/gt/intel_workarounds_types.h @@ -11,6 +11,11 @@ #include "i915_reg.h" +struct i915_whitelist_reg { + i915_reg_t reg; + u32 flags; +}; + struct i915_wa { i915_reg_t reg; u32 clr; diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index df5166d89d82..64cf27187b40 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1362,12 +1362,56 @@ free_noa_wait(struct i915_perf_stream *stream) i915_vma_unpin_and_release(&stream->noa_wait, 0); } +static struct i915_whitelist_reg gen9_oa_wl_regs[] = { + { OAREPORTTRIG2, RING_FORCE_TO_NONPRIV_ACCESS_RW }, + { OAREPORTTRIG6, RING_FORCE_TO_NONPRIV_ACCESS_RW }, +}; + +static struct i915_whitelist_reg gen12_oa_wl_regs[] = { + { GEN12_OAG_OAREPORTTRIG2, RING_FORCE_TO_NONPRIV_ACCESS_RW }, + { GEN12_OAG_OAREPORTTRIG6, RING_FORCE_TO_NONPRIV_ACCESS_RW }, +}; + +static int intel_engine_apply_oa_whitelist(struct i915_perf_stream *stream) +{ + struct intel_engine_cs *engine = stream->engine; + int ret; + + if (i915_perf_stream_paranoid || + !(stream->sample_flags & SAMPLE_OA_REPORT) || + !stream->perf->oa_wl) + return 0; + + ret = intel_engine_allow_user_register_access(engine, + stream->perf->oa_wl, + stream->perf->num_oa_wl); + if (ret < 0) + return ret; + + stream->oa_whitelisted = true; + return 0; +} + +static void intel_engine_remove_oa_whitelist(struct i915_perf_stream *stream) +{ + struct intel_engine_cs *engine = stream->engine; + + if (!stream->oa_whitelisted) + return; + + intel_engine_deny_user_register_access(engine, + stream->perf->oa_wl, + stream->perf->num_oa_wl); +} + static void i915_oa_stream_destroy(struct i915_perf_stream *stream) { struct i915_perf *perf = stream->perf; BUG_ON(stream != perf->exclusive_stream); + intel_engine_remove_oa_whitelist(stream); + /* * Unset exclusive_stream first, it will be checked while disabling * the metric set on gen8+. @@ -1463,7 +1507,8 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream) * bit." */ intel_uncore_write(uncore, GEN8_OABUFFER, gtt_offset | - OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT); + OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT | + GEN7_OABUFFER_EDGE_TRIGGER); intel_uncore_write(uncore, GEN8_OATAILPTR, gtt_offset & GEN8_OATAILPTR_MASK); /* Mark that we need updated tail pointers to read from... */ @@ -1516,7 +1561,8 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream) * bit." */ intel_uncore_write(uncore, GEN12_OAG_OABUFFER, gtt_offset | - OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT); + OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT | + GEN7_OABUFFER_EDGE_TRIGGER); intel_uncore_write(uncore, GEN12_OAG_OATAILPTR, gtt_offset & GEN12_OAG_OATAILPTR_MASK); @@ -3497,6 +3543,20 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf, if (!(param->flags & I915_PERF_FLAG_DISABLED)) i915_perf_enable_locked(stream); + /* + * OA whitelist allows non-privileged access to some OA counters for + * triggering reports into the OA buffer. This is only allowed if + * perf_stream_paranoid is set to 0 by the sysadmin. + * + * We want to make sure this is almost the last thing we do before + * returning the stream fd. If we do end up checking for errors in code + * that follows this, we MUST call intel_engine_remove_oa_whitelist in + * the error handling path to remove the whitelisted registers. + */ + ret = intel_engine_apply_oa_whitelist(stream); + if (ret < 0) + goto err_flags; + /* Take a reference on the driver that will be kept with stream_fd * until its release. */ @@ -4323,6 +4383,8 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ctx_flexeu0_offset = 0x3de; perf->gen8_valid_ctx_bit = BIT(16); + perf->oa_wl = gen9_oa_wl_regs; + perf->num_oa_wl = ARRAY_SIZE(gen9_oa_wl_regs); } } else if (IS_GEN_RANGE(i915, 10, 11)) { perf->oa_formats = gen8_plus_oa_formats; @@ -4340,6 +4402,9 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.disable_metric_set = gen10_disable_metric_set; perf->ops.oa_hw_tail_read = gen8_oa_hw_tail_read; + perf->oa_wl = gen9_oa_wl_regs; + perf->num_oa_wl = ARRAY_SIZE(gen9_oa_wl_regs); + if (IS_GEN(i915, 10)) { perf->ctx_oactxctrl_offset = 0x128; perf->ctx_flexeu0_offset = 0x3de; @@ -4366,6 +4431,9 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ctx_flexeu0_offset = 0; perf->ctx_oactxctrl_offset = 0x144; + + perf->oa_wl = gen12_oa_wl_regs; + perf->num_oa_wl = ARRAY_SIZE(gen12_oa_wl_regs); } } @@ -4468,8 +4536,13 @@ int i915_perf_ioctl_version(void) * * 5: Add DRM_I915_PERF_PROP_POLL_OA_PERIOD parameter that controls the * interval for the hrtimer used to check for OA data. + * + * 6: Whitelist OATRIGGER registers to allow user to trigger reports + * into the OA buffer. This applies only to gen8+. The feature can + * only be accessed if perf_stream_paranoid is set to 0 by privileged + * user. */ - return 5; + return 6; } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index a36a455ae336..cceb54012053 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -311,6 +311,11 @@ struct i915_perf_stream { * buffer should be checked for available data. */ u64 poll_oa_period; + + /** + * @oa_whitelisted: Indicates that the oa registers are whitelisted. + */ + bool oa_whitelisted; }; /** @@ -431,6 +436,9 @@ struct i915_perf { u32 ctx_oactxctrl_offset; u32 ctx_flexeu0_offset; + struct i915_whitelist_reg *oa_wl; + size_t num_oa_wl; + /** * The RPT_ID/reason field for Gen8+ includes a bit * to determine if the CTX ID in the report is valid From patchwork Fri Nov 13 18:03:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 11904419 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B0E0C6369E for ; Fri, 13 Nov 2020 18:04:35 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 24CCF208D5 for ; Fri, 13 Nov 2020 18:04:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 24CCF208D5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5A7AF6E7FE; Fri, 13 Nov 2020 18:04:34 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id D58936E7EC for ; Fri, 13 Nov 2020 18:04:27 +0000 (UTC) IronPort-SDR: 4h5pt63q+QbBqBzdtCfSEcP+A/yH3k2cTOWaxI3bLcAnnqViejkhMVHUIV0quivsKG5D5qEc/k 9JJoAykU7uWg== X-IronPort-AV: E=McAfee;i="6000,8403,9804"; a="167931886" X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="167931886" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 IronPort-SDR: gO/q1fTDrk8sPcep7vuprrJMdkUnQt1Lwz2yIgLfXaogtT4QIVJn1VK5TSj+m6EOElw8BFQOMA oNoY5Y9+hkJw== X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="474752371" Received: from orsosgc001.ra.intel.com ([10.23.184.150]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Nov 2020 10:03:57 -0800 Message-Id: <20201113180359.54410-6-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> References: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 5/7] drm/i915/gt: Refactor _wa_add to reuse wa_index and wa_list_grow X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Switch the search and grow code of the _wa_add to use _wa_index and _wa_list_grow. Signed-off-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 54 +++++++-------------- 1 file changed, 17 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 722f11e43dad..b79a08868fa7 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -190,53 +190,33 @@ static void _wa_remove(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) static void _wa_add(struct i915_wa_list *wal, const struct i915_wa *wa) { - unsigned int addr = i915_mmio_reg_offset(wa->reg); - unsigned int start = 0, end = wal->count; + int index; const unsigned int grow = WA_LIST_CHUNK; struct i915_wa *wa_; GEM_BUG_ON(!is_power_of_2(grow)); - if (IS_ALIGNED(wal->count, grow)) { /* Either uninitialized or full. */ - struct i915_wa *list; - - list = kmalloc_array(ALIGN(wal->count + 1, grow), sizeof(*wa), - GFP_KERNEL); - if (!list) { - DRM_ERROR("No space for workaround init!\n"); + if (IS_ALIGNED(wal->count, grow)) /* Either uninitialized or full. */ + if (_wa_list_grow(wal, wal->count) < 0) return; - } - - if (wal->list) - memcpy(list, wal->list, sizeof(*wa) * wal->count); - wal->list = list; - } + index = _wa_index(wal, wa->reg); + if (index >= 0) { + wa_ = &wal->list[index]; - while (start < end) { - unsigned int mid = start + (end - start) / 2; + if ((wa->clr | wa_->clr) && !(wa->clr & ~wa_->clr)) { + DRM_ERROR("Discarding overwritten w/a for reg %04x (clear: %08x, set: %08x)\n", + i915_mmio_reg_offset(wa_->reg), + wa_->clr, wa_->set); - if (i915_mmio_reg_offset(wal->list[mid].reg) < addr) { - start = mid + 1; - } else if (i915_mmio_reg_offset(wal->list[mid].reg) > addr) { - end = mid; - } else { - wa_ = &wal->list[mid]; - - if ((wa->clr | wa_->clr) && !(wa->clr & ~wa_->clr)) { - DRM_ERROR("Discarding overwritten w/a for reg %04x (clear: %08x, set: %08x)\n", - i915_mmio_reg_offset(wa_->reg), - wa_->clr, wa_->set); - - wa_->set &= ~wa->clr; - } - - wal->wa_count++; - wa_->set |= wa->set; - wa_->clr |= wa->clr; - wa_->read |= wa->read; - return; + wa_->set &= ~wa->clr; } + + wal->wa_count++; + wa_->set |= wa->set; + wa_->clr |= wa->clr; + wa_->read |= wa->read; + return; } wal->wa_count++; From patchwork Fri Nov 13 18:03:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 11904409 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2C12C63697 for ; Fri, 13 Nov 2020 18:04:34 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 54A02208D5 for ; Fri, 13 Nov 2020 18:04:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 54A02208D5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C25146E7F1; Fri, 13 Nov 2020 18:04:33 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id EA2AE6E5D3 for ; Fri, 13 Nov 2020 18:04:26 +0000 (UTC) IronPort-SDR: K1pACFp2f5vQdrgVcOiXukhAMOfe6H66HI4FO2JmS8jj8BTbZccH9Ogo86bwOkrcW+PwnihiKO /s4/6JAtMm2A== X-IronPort-AV: E=McAfee;i="6000,8403,9804"; a="167931889" X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="167931889" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:01 -0800 IronPort-SDR: lkcbB8c0UH+PolnpTHCO2qaUXqtIhkDL43dcvNnA/vZobUkc6HiqGvvPUdZ/O4p2Gqvn+RBYb3 lfiAnCfcLWFA== X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="474752374" Received: from orsosgc001.ra.intel.com ([10.23.184.150]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Nov 2020 10:03:58 -0800 Message-Id: <20201113180359.54410-7-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> References: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 6/7] drm/i915/perf: Whitelist OA counter and buffer registers X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" It is useful to have markers in the OA reports to identify triggered reports. Whitelist some OA counters that can be used as markers. A triggered report can be found faster if we can sample the HW tail and head registers when the report was triggered. Whitelist OA buffer specific registers. v2: - Bump up the perf revision (Lionel) - Use indexing for counters (Lionel) - Fix selftest for oa ticking register (Umesh) v3: Pardon whitelisted registers for selftest (Umesh) v4: - Document whitelisted registers (Lionel) - Fix live isolated whitelist for OA regs (Umesh) v5: - Free up whitelist slots. Remove GPU_TICKS and A20 counter (Piotr) - Whitelist registers only if perf_stream_paranoid is set to 0 (Jon) v6: Move oa whitelist array to i915_perf (Chris) Signed-off-by: Piotr Maciejewski Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/i915/i915_perf.c | 18 +++++++++++++++++- drivers/gpu/drm/i915/i915_reg.h | 8 ++++++++ 2 files changed, 25 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 64cf27187b40..9b4401588572 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1365,11 +1365,19 @@ free_noa_wait(struct i915_perf_stream *stream) static struct i915_whitelist_reg gen9_oa_wl_regs[] = { { OAREPORTTRIG2, RING_FORCE_TO_NONPRIV_ACCESS_RW }, { OAREPORTTRIG6, RING_FORCE_TO_NONPRIV_ACCESS_RW }, + { OA_PERF_COUNTER_A(18), RING_FORCE_TO_NONPRIV_ACCESS_RW | + RING_FORCE_TO_NONPRIV_RANGE_4 }, + { GEN8_OASTATUS, RING_FORCE_TO_NONPRIV_ACCESS_RD | + RING_FORCE_TO_NONPRIV_RANGE_4 }, }; static struct i915_whitelist_reg gen12_oa_wl_regs[] = { { GEN12_OAG_OAREPORTTRIG2, RING_FORCE_TO_NONPRIV_ACCESS_RW }, { GEN12_OAG_OAREPORTTRIG6, RING_FORCE_TO_NONPRIV_ACCESS_RW }, + { GEN12_OAG_PERF_COUNTER_A(18), RING_FORCE_TO_NONPRIV_ACCESS_RW | + RING_FORCE_TO_NONPRIV_RANGE_4 }, + { GEN12_OAG_OASTATUS, RING_FORCE_TO_NONPRIV_ACCESS_RD | + RING_FORCE_TO_NONPRIV_RANGE_4 }, }; static int intel_engine_apply_oa_whitelist(struct i915_perf_stream *stream) @@ -4541,8 +4549,16 @@ int i915_perf_ioctl_version(void) * into the OA buffer. This applies only to gen8+. The feature can * only be accessed if perf_stream_paranoid is set to 0 by privileged * user. + * + * 7: Whitelist below OA registers for user to identify the location of + * triggered reports in the OA buffer. This applies only to gen8+. + * The feature can only be accessed if perf_stream_paranoid is set to + * 0 by privileged user. + * + * - OA buffer head/tail/status/buffer registers for read only + * - OA counters A18, A19, A20 for read/write */ - return 6; + return 7; } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 4137ca9d3315..5201be6f659f 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -977,6 +977,14 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define OAREPORTTRIG8_NOA_SELECT_6_SHIFT 24 #define OAREPORTTRIG8_NOA_SELECT_7_SHIFT 28 +/* Performance counters registers */ +#define OA_PERF_COUNTER_A(idx) _MMIO(0x2800 + 8 * (idx)) +#define OA_PERF_COUNTER_A_UPPER(idx) _MMIO(0x2800 + 8 * (idx) + 4) + +/* Gen12 Performance counters registers */ +#define GEN12_OAG_PERF_COUNTER_A(idx) _MMIO(0xD980 + 8 * (idx)) +#define GEN12_OAG_PERF_COUNTER_A_UPPER(idx) _MMIO(0xD980 + 8 * (idx) + 4) + /* Same layout as OASTARTTRIGX */ #define GEN12_OAG_OASTARTTRIG1 _MMIO(0xd900) #define GEN12_OAG_OASTARTTRIG2 _MMIO(0xd904) From patchwork Fri Nov 13 18:03:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 11904413 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A9B0C61DD8 for ; Fri, 13 Nov 2020 18:04:31 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0DA6A208D5 for ; Fri, 13 Nov 2020 18:04:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0DA6A208D5 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EAB456E7D2; Fri, 13 Nov 2020 18:04:29 +0000 (UTC) Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTPS id CA6426E7EA for ; Fri, 13 Nov 2020 18:04:27 +0000 (UTC) IronPort-SDR: aRLwRpASXY9PcABDlLaq7Ny53xeVkzz6P92ytfy+Rwp7kuq8IpVB9B/PNo2Bs9zVNugePyoMXb f/Kb4+BJriOg== X-IronPort-AV: E=McAfee;i="6000,8403,9804"; a="167931890" X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="167931890" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:01 -0800 IronPort-SDR: LJqr0dJg5XGZxGz1RLBhmATtJ9bBil39yU267qWIn20RMpQfqN3uTEv7DRRufY+0WiDc4SNDHa Nj1wZWCylBFA== X-IronPort-AV: E=Sophos;i="5.77,476,1596524400"; d="scan'208";a="474752376" Received: from orsosgc001.ra.intel.com ([10.23.184.150]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Nov 2020 10:04:00 -0800 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Fri, 13 Nov 2020 10:03:59 -0800 Message-Id: <20201113180359.54410-8-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> References: <20201113180359.54410-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 7/7] drm/i915/perf: Map OA buffer to user space for gen12 performance query X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" i915 used to support time based sampling mode which is good for overall system monitoring, but is not enough for query mode used to measure a single draw call or dispatch. Gen9-Gen11 are using current i915 perf implementation for query, but Gen12+ requires a new approach for query based on triggered reports within oa buffer. Triggering reports into the OA buffer is achieved by writing into a a trigger register. Optionally an unused counter/register is set with a marker value such that a triggered report can be identified in the OA buffer. Reports are usually triggered at the start and end of work that is measured. Since OA buffer is large and queries can be frequent, an efficient way to look for triggered reports is required. By knowing the current head and tail offsets into the OA buffer, it is easier to determine the locality of the reports of interest. Current perf OA interface does not expose head/tail information to the user and it filters out invalid reports before sending data to user. Also considering limited size of user buffer used during a query, creating a 1:1 copy of the OA buffer at the user space added undesired complexity. The solution was to map the OA buffer to user space provided (1) that it is accessed from a privileged user. (2) OA report filtering is not used. These 2 conditions would satisfy the safety criteria that the current perf interface addresses. To enable the query: - Add an ioctl to expose head and tail to the user - Add an ioctl to return size and offset of the OA buffer - Map the OA buffer to the user space v2: - Improve commit message (Chris) - Do not mmap based on gem object filp. Instead, use perf_fd and support mmap syscall (Chris) - Pass non-zero offset in mmap to enforce the right object is mapped (Chris) - Do not expose gpu_address (Chris) - Verify start and length of vma for page alignment (Lionel) - Move SQNTL config out (Lionel) v3: (Chris) - Omit redundant checks - Return VM_FAULT_SIGBUS is old stream is closed - Maintain reference counts to stream in vm_open and vm_close - Use switch to identify object to be mapped v4: Call kref_put on closing perf fd (Chris) v5: - Strip access to OA buffer from unprivileged child of a privileged parent. Use VM_DONTCOPY - Enforce MAP_PRIVATE by checking for VM_MAYSHARE v6: (Chris) - Use len of -1 in unmap_mapping_range - Don't use stream->oa_buffer.vma->obj in vm_fault_oa - Use kernel block comment style - do_mmap gets a reference to the file and puts it in do_munmap, so no need to maintain a reference to i915_perf_stream. Hence, remove vm_open/vm_close and stream->closed hooks/checks. (Umesh) - Do not allow mmap if SAMPLE_OA_REPORT is not set during i915_perf_open_ioctl. - Drop ioctl returning head/tail since this information is already whitelisted. Remove hooks to read head register. v7: (Chris) - unmap before destroy - change ioctl argument struct v8: Documentation and more checks (Chris) Signed-off-by: Piotr Maciejewski Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_mman.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 126 ++++++++++++++++++++++- include/uapi/drm/i915_drm.h | 33 ++++++ 4 files changed, 161 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index 3d69e51f3e4d..2ab08b152b9d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -204,7 +204,7 @@ compute_partial_view(const struct drm_i915_gem_object *obj, return view; } -static vm_fault_t i915_error_to_vmf_fault(int err) +vm_fault_t i915_error_to_vmf_fault(int err) { switch (err) { default: diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.h b/drivers/gpu/drm/i915/gem/i915_gem_mman.h index efee9e0d2508..1190a3a228ea 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.h @@ -29,4 +29,6 @@ void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj); void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj); +vm_fault_t i915_error_to_vmf_fault(int err); + #endif diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 9b4401588572..2ff5bfcb1ab9 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -192,10 +192,12 @@ */ #include +#include #include #include #include "gem/i915_gem_context.h" +#include "gem/i915_gem_mman.h" #include "gt/intel_engine_pm.h" #include "gt/intel_engine_user.h" #include "gt/intel_gt.h" @@ -3289,6 +3291,44 @@ static long i915_perf_config_locked(struct i915_perf_stream *stream, return ret; } +#define I915_PERF_OA_BUFFER_MMAP_OFFSET 1 + +/** + * i915_perf_oa_buffer_info_locked - size and offset of the OA buffer + * @stream: i915 perf stream + * @cmd: ioctl command + * @arg: pointer to oa buffer info filled by this function. + */ +static int i915_perf_oa_buffer_info_locked(struct i915_perf_stream *stream, + unsigned int cmd, + unsigned long arg) +{ + struct drm_i915_perf_oa_buffer_info info; + void __user *output = (void __user *)arg; + + if (i915_perf_stream_paranoid && !perfmon_capable()) { + DRM_DEBUG("Insufficient privileges to access OA buffer info\n"); + return -EACCES; + } + + if (_IOC_SIZE(cmd) != sizeof(info)) + return -EINVAL; + + if (copy_from_user(&info, output, sizeof(info))) + return -EFAULT; + + if (info.type || info.flags || info.rsvd) + return -EINVAL; + + info.size = stream->oa_buffer.vma->size; + info.offset = I915_PERF_OA_BUFFER_MMAP_OFFSET * PAGE_SIZE; + + if (copy_to_user(output, &info, sizeof(info))) + return -EFAULT; + + return 0; +} + /** * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs * @stream: An i915 perf stream @@ -3314,6 +3354,8 @@ static long i915_perf_ioctl_locked(struct i915_perf_stream *stream, return 0; case I915_PERF_IOCTL_CONFIG: return i915_perf_config_locked(stream, arg); + case I915_PERF_IOCTL_GET_OA_BUFFER_INFO: + return i915_perf_oa_buffer_info_locked(stream, cmd, arg); } return -EINVAL; @@ -3385,6 +3427,14 @@ static int i915_perf_release(struct inode *inode, struct file *file) struct i915_perf_stream *stream = file->private_data; struct i915_perf *perf = stream->perf; + /* + * User could have multiple vmas from multiple mmaps. We want to zap + * them all here. Note that a fresh fault cannot occur as the mmap holds + * a reference to the stream via the vma->vm_file, so before user's + * munmap, the stream cannot be destroyed. + */ + unmap_mapping_range(file->f_mapping, 0, -1, 1); + mutex_lock(&perf->lock); i915_perf_destroy_locked(stream); mutex_unlock(&perf->lock); @@ -3395,6 +3445,75 @@ static int i915_perf_release(struct inode *inode, struct file *file) return 0; } +static vm_fault_t vm_fault_oa(struct vm_fault *vmf) +{ + struct vm_area_struct *vma = vmf->vma; + struct i915_perf_stream *stream = vma->vm_private_data; + int err; + + err = remap_io_sg(vma, + vma->vm_start, vma->vm_end - vma->vm_start, + stream->oa_buffer.vma->pages->sgl, -1); + + return i915_error_to_vmf_fault(err); +} + +static const struct vm_operations_struct vm_ops_oa = { + .fault = vm_fault_oa, +}; + +static int i915_perf_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct i915_perf_stream *stream = file->private_data; + + /* mmap-ing OA buffer to user space MUST absolutely be privileged */ + if (i915_perf_stream_paranoid && !perfmon_capable()) { + DRM_DEBUG("Insufficient privileges to map OA buffer\n"); + return -EACCES; + } + + switch (vma->vm_pgoff) { + /* + * A non-zero offset ensures that we are mapping the right object. Also + * leaves room for future objects added to this implementation. + */ + case I915_PERF_OA_BUFFER_MMAP_OFFSET: + if (!(stream->sample_flags & SAMPLE_OA_REPORT)) + return -EINVAL; + + if (vma->vm_end - vma->vm_start > OA_BUFFER_SIZE) + return -EINVAL; + + /* + * Only support VM_READ. Enforce MAP_PRIVATE by checking for + * VM_MAYSHARE. + */ + if (vma->vm_flags & (VM_WRITE | VM_EXEC | + VM_SHARED | VM_MAYSHARE)) + return -EINVAL; + + vma->vm_flags &= ~(VM_MAYWRITE | VM_MAYEXEC); + + /* + * If the privileged parent forks and child drops root + * privilege, we do not want the child to retain access to the + * mapped OA buffer. Explicitly set VM_DONTCOPY to avoid such + * cases. + */ + vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | + VM_DONTDUMP | VM_DONTCOPY; + break; + + default: + return -EINVAL; + } + + vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); + vma->vm_private_data = stream; + vma->vm_ops = &vm_ops_oa; + + return 0; +} static const struct file_operations fops = { .owner = THIS_MODULE, @@ -3407,6 +3526,7 @@ static const struct file_operations fops = { * to handle 32bits compatibility. */ .compat_ioctl = i915_perf_ioctl, + .mmap = i915_perf_mmap, }; @@ -4557,8 +4677,12 @@ int i915_perf_ioctl_version(void) * * - OA buffer head/tail/status/buffer registers for read only * - OA counters A18, A19, A20 for read/write + * + * 8: Added an option to map oa buffer at umd driver level and trigger + * oa reports within oa buffer from command buffer. See + * I915_PERF_IOCTL_GET_OA_BUFFER_INFO. */ - return 7; + return 8; } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index fa1f3d62f9a6..cc1702ddc859 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -2101,6 +2101,39 @@ struct drm_i915_perf_open_param { */ #define I915_PERF_IOCTL_CONFIG _IO('i', 0x2) +/** + * Returns OA buffer properties to be used with mmap. + * + * This ioctl is available in perf revision 8. + */ +#define I915_PERF_IOCTL_GET_OA_BUFFER_INFO _IOWR('i', 0x3, struct drm_i915_perf_oa_buffer_info) + +/** + * OA buffer size and offset. + * + * OA output buffer + * type: 0 + * flags: mbz + * + * After querying the info, pass (size,offset) to mmap(), + * + * mmap(0, info.size, PROT_READ, MAP_PRIVATE, perf_fd, info.offset). + * + * Note that only a private (not shared between processes, or across fork()) + * read-only mmapping is allowed. + * + * Userspace must treat the incoming data as tainted, but it conforms to the OA + * format as specified by user config. The buffer provides reports that have + * OA counters - A, B and C. + */ +struct drm_i915_perf_oa_buffer_info { + __u32 type; /* in */ + __u32 flags; /* in */ + __u64 size; /* out */ + __u64 offset; /* out */ + __u64 rsvd; /* mbz */ +}; + /** * Common to all i915 perf records */