From patchwork Fri Jul 24 00:18:58 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Umesh Nerlige Ramappa
X-Patchwork-Id: 11681859
From: Umesh Nerlige Ramappa
To: intel-gfx@lists.freedesktop.org
Date: Thu, 23 Jul 2020 17:18:58 -0700
Message-Id: <20200724001901.35662-2-umesh.nerlige.ramappa@intel.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200724001901.35662-1-umesh.nerlige.ramappa@intel.com>
References: <20200724001901.35662-1-umesh.nerlige.ramappa@intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH 1/4] drm/i915/perf: Ensure observation logic is not clock gated
Cc: Chris Wilson

From: Piotr Maciejewski

A clock gating switch can control if the performance monitoring and
observation logic is enabled or not. Ensure that we enable the clocks.

v2: Separate code from other patches (Lionel)
v3: Reset PMON enable when disabling perf to save power (Lionel)

Fixes: 00a7f0d7155c ("drm/i915/tgl: Add perf support on TGL")
Signed-off-by: Piotr Maciejewski
Signed-off-by: Umesh Nerlige Ramappa
Reviewed-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_perf.c | 13 +++++++++++++
 drivers/gpu/drm/i915/i915_reg.h  |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c6f6370283cf..fe408c327d3c 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -2493,6 +2493,14 @@ gen12_enable_metric_set(struct i915_perf_stream *stream,
 			    (period_exponent << GEN12_OAG_OAGLBCTXCTRL_TIMER_PERIOD_SHIFT))
 			    : 0);
 
+	/*
+	 * Initialize Super Queue Internal Cnt Register
+	 * Set PMON Enable in order to collect valid metrics.
+	 */
+	intel_uncore_write(uncore, GEN12_SQCNT1,
+			   intel_uncore_read(uncore, GEN12_SQCNT1) |
+			   GEN12_SQCNT1_PMON_ENABLE);
+
 	/*
 	 * Update all contexts prior writing the mux configurations as we need
 	 * to make sure all slices/subslices are ON before writing to NOA
@@ -2552,6 +2560,11 @@ static void gen12_disable_metric_set(struct i915_perf_stream *stream)
 
 	/* Make sure we disable noa to save power. */
 	intel_uncore_rmw(uncore, RPM_CONFIG1, GEN10_GT_NOA_ENABLE, 0);
+
+	/* Reset PMON Enable to save power. */
+	intel_uncore_write(uncore, GEN12_SQCNT1,
+			   intel_uncore_read(uncore, GEN12_SQCNT1) &
+			   ~GEN12_SQCNT1_PMON_ENABLE);
 }
 
 static void gen7_oa_enable(struct i915_perf_stream *stream)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a0d31f3bf634..9cc3e312b6b7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -696,6 +696,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define OABUFFER_SIZE_16M (7 << 3)
 
 #define GEN12_OA_TLB_INV_CR _MMIO(0xceec)
+#define GEN12_SQCNT1 _MMIO(0x8718)
+#define GEN12_SQCNT1_PMON_ENABLE (1 << 30)
 
 /* Gen12 OAR unit */
 #define GEN12_OAR_OACONTROL _MMIO(0x2960)

From patchwork Fri Jul 24 00:18:59 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Umesh Nerlige Ramappa
X-Patchwork-Id: 11681857
From: Umesh Nerlige Ramappa
To: intel-gfx@lists.freedesktop.org
Date: Thu, 23 Jul 2020 17:18:59 -0700
Message-Id: <20200724001901.35662-3-umesh.nerlige.ramappa@intel.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200724001901.35662-1-umesh.nerlige.ramappa@intel.com>
References: <20200724001901.35662-1-umesh.nerlige.ramappa@intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH 2/4] drm/i915/perf: Whitelist OA report trigger registers
Cc: Chris Wilson

From: Piotr Maciejewski

OA reports can be triggered into the OA buffer by writing into the
OAREPORTTRIG registers. Whitelist the registers to allow user to trigger
reports.

v2:
- Move related change to this patch (Lionel)
- Bump up perf revision (Lionel)

v3: Pardon whitelisted registers for selftest (Umesh)
v4: Document supported gens for the feature (Lionel)

Signed-off-by: Piotr Maciejewski
Signed-off-by: Umesh Nerlige Ramappa
Reviewed-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 26 +++++++++++++++++++
 .../gpu/drm/i915/gt/selftest_workarounds.c    |  8 ++++++
 drivers/gpu/drm/i915/i915_perf.c              | 11 +++++---
 3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index cef1c122696f..a72ebfd115e5 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1387,6 +1387,20 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg)
 	whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_ACCESS_RW);
 }
 
+static void gen9_whitelist_build_performance_counters(struct i915_wa_list *w)
+{
+	/* OA buffer trigger report 2/6 used by performance query */
+	whitelist_reg(w, OAREPORTTRIG2);
+	whitelist_reg(w, OAREPORTTRIG6);
+}
+
+static void gen12_whitelist_build_performance_counters(struct i915_wa_list *w)
+{
+	/* OA buffer trigger report 2/6 used by performance query */
+	whitelist_reg(w, GEN12_OAG_OAREPORTTRIG2);
+	whitelist_reg(w, GEN12_OAG_OAREPORTTRIG6);
+}
+
 static void gen9_whitelist_build(struct i915_wa_list *w)
 {
 	/* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
@@ -1400,6 +1414,9 @@ static void gen9_whitelist_build(struct i915_wa_list *w)
 
 	/* WaSendPushConstantsFromMMIO:skl,bxt */
 	whitelist_reg(w, COMMON_SLICE_CHICKEN2);
+
+	/* Performance counters support */
+	gen9_whitelist_build_performance_counters(w);
 }
 
 static void skl_whitelist_build(struct intel_engine_cs *engine)
@@ -1493,6 +1510,9 @@ static void cnl_whitelist_build(struct intel_engine_cs *engine)
 
 	/* WaEnablePreemptionGranularityControlByUMD:cnl */
 	whitelist_reg(w, GEN8_CS_CHICKEN1);
+
+	/* Performance counters support */
+	gen9_whitelist_build_performance_counters(w);
 }
 
 static void icl_whitelist_build(struct intel_engine_cs *engine)
@@ -1522,6 +1542,9 @@ static void icl_whitelist_build(struct intel_engine_cs *engine)
 		whitelist_reg_ext(w, PS_INVOCATION_COUNT,
 				  RING_FORCE_TO_NONPRIV_ACCESS_RD |
 				  RING_FORCE_TO_NONPRIV_RANGE_4);
+
+		/* Performance counters support */
+		gen9_whitelist_build_performance_counters(w);
 		break;
 
 	case VIDEO_DECODE_CLASS:
@@ -1572,6 +1595,9 @@ static void tgl_whitelist_build(struct intel_engine_cs *engine)
 
 		/* Wa_1806527549:tgl */
 		whitelist_reg(w, HIZ_CHICKEN);
+
+		/* Performance counters support */
+		gen12_whitelist_build_performance_counters(w);
 		break;
 	default:
 		whitelist_reg_ext(w,
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index febc9e6692ba..3b1d3dbcd477 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -934,6 +934,10 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg)
 	static const struct regmask pardon[] = {
 		{ GEN9_CTX_PREEMPT_REG, INTEL_GEN_MASK(9, 9) },
 		{ GEN8_L3SQCREG4, INTEL_GEN_MASK(9, 9) },
+		{ OAREPORTTRIG2, INTEL_GEN_MASK(8, 11) },
+		{ OAREPORTTRIG6, INTEL_GEN_MASK(8, 11) },
+		{ GEN12_OAG_OAREPORTTRIG2, INTEL_GEN_MASK(12, 12) },
+		{ GEN12_OAG_OAREPORTTRIG6, INTEL_GEN_MASK(12, 12) },
 	};
 
 	return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon));
@@ -956,6 +960,10 @@ static bool writeonly_reg(struct drm_i915_private *i915, i915_reg_t reg)
 	/* Some registers do not seem to behave and our writes unreadable */
 	static const struct regmask wo[] = {
 		{ GEN9_SLICE_COMMON_ECO_CHICKEN1, INTEL_GEN_MASK(9, 9) },
+		{ OAREPORTTRIG2, INTEL_GEN_MASK(8, 11) },
+		{ OAREPORTTRIG6, INTEL_GEN_MASK(8, 11) },
+		{ GEN12_OAG_OAREPORTTRIG2, INTEL_GEN_MASK(12, 12) },
+		{ GEN12_OAG_OAREPORTTRIG6, INTEL_GEN_MASK(12, 12) },
 	};
 
 	return find_reg(i915, reg, wo, ARRAY_SIZE(wo));
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index fe408c327d3c..30f6aeb819aa 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1448,7 +1448,8 @@ static void gen8_init_oa_buffer(struct i915_perf_stream *stream)
 	 * bit."
 	 */
 	intel_uncore_write(uncore, GEN8_OABUFFER, gtt_offset |
-		   OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT);
+		   OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT |
+		   GEN7_OABUFFER_EDGE_TRIGGER);
 	intel_uncore_write(uncore, GEN8_OATAILPTR, gtt_offset & GEN8_OATAILPTR_MASK);
 
 	/* Mark that we need updated tail pointers to read from... */
@@ -1501,7 +1502,8 @@ static void gen12_init_oa_buffer(struct i915_perf_stream *stream)
 	 * bit."
 	 */
 	intel_uncore_write(uncore, GEN12_OAG_OABUFFER, gtt_offset |
-			   OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT);
+			   OABUFFER_SIZE_16M | GEN8_OABUFFER_MEM_SELECT_GGTT |
+			   GEN7_OABUFFER_EDGE_TRIGGER);
 	intel_uncore_write(uncore, GEN12_OAG_OATAILPTR,
 			   gtt_offset & GEN12_OAG_OATAILPTR_MASK);
 
@@ -4445,8 +4447,11 @@ int i915_perf_ioctl_version(void)
 	 *
 	 * 5: Add DRM_I915_PERF_PROP_POLL_OA_PERIOD parameter that controls the
 	 *    interval for the hrtimer used to check for OA data.
+	 *
+	 * 6: Whitelist OATRIGGER registers to allow user to trigger reports
+	 *    into the OA buffer. This applies only to gen8+.
 	 */
-	return 5;
+	return 6;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)

From patchwork Fri Jul 24 00:19:00 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Umesh Nerlige Ramappa
X-Patchwork-Id: 11681865
From: Umesh Nerlige Ramappa
To: intel-gfx@lists.freedesktop.org
Date: Thu, 23 Jul 2020 17:19:00 -0700
Message-Id: <20200724001901.35662-4-umesh.nerlige.ramappa@intel.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200724001901.35662-1-umesh.nerlige.ramappa@intel.com>
References: <20200724001901.35662-1-umesh.nerlige.ramappa@intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH 3/4] drm/i915/perf: Whitelist OA counter and buffer registers
Cc: Chris Wilson

From: Piotr Maciejewski

It is useful to have markers in the OA reports to identify triggered
reports. Whitelist some OA counters that can be used as markers.

A triggered report can be found faster if we can sample the HW tail and
head registers when the report was triggered. Whitelist OA buffer specific
registers.
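
As an illustration of which registers the marker whitelist touches, the
following sketch evaluates the OA_PERF_COUNTER_A() offsets that this patch
adds to i915_reg.h for the gen8 to gen11 case. The helper names are
hypothetical, and the sketch assumes that a RING_FORCE_TO_NONPRIV_RANGE_4
entry covers four consecutive 32-bit registers starting at the listed
offset; that assumption is not stated in the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors OA_PERF_COUNTER_A(idx) from this patch: lower dword of counter A<idx>. */
static uint32_t oa_perf_counter_a(unsigned int idx)
{
	return 0x2800u + 8u * idx;
}

/* Mirrors OA_PERF_COUNTER_A_UPPER(idx): upper dword of counter A<idx>. */
static uint32_t oa_perf_counter_a_upper(unsigned int idx)
{
	return 0x2800u + 8u * idx + 4u;
}

/*
 * ASSUMPTION (not from the patch): a RANGE_4 whitelist entry spans four
 * consecutive 32-bit registers beginning at 'base'.
 */
static bool range4_covers(uint32_t base, uint32_t reg)
{
	return reg >= base && reg < base + 16u && (reg - base) % 4u == 0;
}
```

Under that assumption, the RANGE_4 entry at A18 (0x2890) would span both
dwords of A18 and A19, which would explain why only A20 (0x28a0) and its
upper dword (0x28a4) are whitelisted individually.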

v2:
- Bump up the perf revision (Lionel)
- Use indexing for counters (Lionel)
- Fix selftest for oa ticking register (Umesh)

v3: Pardon whitelisted registers for selftest (Umesh)

v4:
- Document whitelisted registers (Lionel)
- Fix live isolated whitelist for OA regs (Umesh)

Signed-off-by: Piotr Maciejewski
Signed-off-by: Umesh Nerlige Ramappa
Reviewed-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 34 +++++++++++++++++++
 .../gpu/drm/i915/gt/selftest_workarounds.c    | 30 +++++++++++++++-
 drivers/gpu/drm/i915/i915_perf.c              |  8 ++++-
 drivers/gpu/drm/i915/i915_reg.h               | 10 ++++++
 4 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index a72ebfd115e5..c950d07beec3 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1392,6 +1392,23 @@ static void gen9_whitelist_build_performance_counters(struct i915_wa_list *w)
 	/* OA buffer trigger report 2/6 used by performance query */
 	whitelist_reg(w, OAREPORTTRIG2);
 	whitelist_reg(w, OAREPORTTRIG6);
+
+	/* Performance counters A18-20 used by tbs marker query */
+	whitelist_reg_ext(w, OA_PERF_COUNTER_A(18),
+			  RING_FORCE_TO_NONPRIV_ACCESS_RW |
+			  RING_FORCE_TO_NONPRIV_RANGE_4);
+
+	whitelist_reg(w, OA_PERF_COUNTER_A(20));
+	whitelist_reg(w, OA_PERF_COUNTER_A_UPPER(20));
+
+	/* Read access to gpu ticks */
+	whitelist_reg_ext(w, GEN8_GPU_TICKS,
+			  RING_FORCE_TO_NONPRIV_ACCESS_RD);
+
+	/* Read access to: oa status, head, tail, buffer settings */
+	whitelist_reg_ext(w, GEN8_OASTATUS,
+			  RING_FORCE_TO_NONPRIV_ACCESS_RD |
+			  RING_FORCE_TO_NONPRIV_RANGE_4);
 }
 
 static void gen12_whitelist_build_performance_counters(struct i915_wa_list *w)
@@ -1399,6 +1416,23 @@ static void gen12_whitelist_build_performance_counters(struct i915_wa_list *w)
 	/* OA buffer trigger report 2/6 used by performance query */
 	whitelist_reg(w, GEN12_OAG_OAREPORTTRIG2);
 	whitelist_reg(w, GEN12_OAG_OAREPORTTRIG6);
+
+	/* Performance counters A18-20 used by tbs marker query */
+	whitelist_reg_ext(w, GEN12_OAG_PERF_COUNTER_A(18),
+			  RING_FORCE_TO_NONPRIV_ACCESS_RW |
+			  RING_FORCE_TO_NONPRIV_RANGE_4);
+
+	whitelist_reg(w, GEN12_OAG_PERF_COUNTER_A(20));
+	whitelist_reg(w, GEN12_OAG_PERF_COUNTER_A_UPPER(20));
+
+	/* Read access to gpu ticks */
+	whitelist_reg_ext(w, GEN12_OAG_GPU_TICKS,
+			  RING_FORCE_TO_NONPRIV_ACCESS_RD);
+
+	/* Read access to: oa status, head, tail, buffer settings */
+	whitelist_reg_ext(w, GEN12_OAG_OASTATUS,
+			  RING_FORCE_TO_NONPRIV_ACCESS_RD |
+			  RING_FORCE_TO_NONPRIV_RANGE_4);
 }
 
 static void gen9_whitelist_build(struct i915_wa_list *w)
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 3b1d3dbcd477..7c2c2be8d212 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -431,6 +431,19 @@ static bool timestamp(const struct intel_engine_cs *engine, u32 reg)
 	}
 }
 
+static bool oa_gpu_ticks(u32 reg)
+{
+	reg = reg & ~RING_FORCE_TO_NONPRIV_ACCESS_MASK;
+	switch (reg) {
+	case 0x2910:
+	case 0xda90:
+		return true;
+
+	default:
+		return false;
+	}
+}
+
 static bool ro_register(u32 reg)
 {
 	if ((reg & RING_FORCE_TO_NONPRIV_ACCESS_MASK) ==
@@ -511,7 +524,7 @@ static int check_dirty_whitelist(struct intel_context *ce)
 		if (wo_register(engine, reg))
 			continue;
 
-		if (timestamp(engine, reg))
+		if (timestamp(engine, reg) || oa_gpu_ticks(reg))
 			continue; /* timestamps are expected to autoincrement */
 
 		ro_reg = ro_register(reg);
@@ -918,6 +931,9 @@ static bool find_reg(struct drm_i915_private *i915,
 {
 	u32 offset = i915_mmio_reg_offset(reg);
 
+	/* Clear non priv flags */
+	offset &= RING_FORCE_TO_NONPRIV_ADDRESS_MASK;
+
 	while (count--) {
 		if (INTEL_INFO(i915)->gen_mask & tbl->gen_mask &&
 		    i915_mmio_reg_offset(tbl->reg) == offset)
@@ -938,6 +954,12 @@ static bool pardon_reg(struct drm_i915_private *i915, i915_reg_t reg)
 		{ OAREPORTTRIG6, INTEL_GEN_MASK(8, 11) },
 		{ GEN12_OAG_OAREPORTTRIG2, INTEL_GEN_MASK(12, 12) },
 		{ GEN12_OAG_OAREPORTTRIG6, INTEL_GEN_MASK(12, 12) },
+		{ OA_PERF_COUNTER_A(18), INTEL_GEN_MASK(8, 11) },
+		{ OA_PERF_COUNTER_A(20), INTEL_GEN_MASK(8, 11) },
+		{ OA_PERF_COUNTER_A_UPPER(20), INTEL_GEN_MASK(8, 11) },
+		{ GEN12_OAG_PERF_COUNTER_A(18), INTEL_GEN_MASK(12, 12) },
+		{ GEN12_OAG_PERF_COUNTER_A(20), INTEL_GEN_MASK(12, 12) },
+		{ GEN12_OAG_PERF_COUNTER_A_UPPER(20), INTEL_GEN_MASK(12, 12) },
 	};
 
 	return find_reg(i915, reg, pardon, ARRAY_SIZE(pardon));
@@ -964,6 +986,12 @@ static bool writeonly_reg(struct drm_i915_private *i915, i915_reg_t reg)
 		{ OAREPORTTRIG6, INTEL_GEN_MASK(8, 11) },
 		{ GEN12_OAG_OAREPORTTRIG2, INTEL_GEN_MASK(12, 12) },
 		{ GEN12_OAG_OAREPORTTRIG6, INTEL_GEN_MASK(12, 12) },
+		{ OA_PERF_COUNTER_A(18), INTEL_GEN_MASK(8, 11) },
+		{ OA_PERF_COUNTER_A(20), INTEL_GEN_MASK(8, 11) },
+		{ OA_PERF_COUNTER_A_UPPER(20), INTEL_GEN_MASK(8, 11) },
+		{ GEN12_OAG_PERF_COUNTER_A(18), INTEL_GEN_MASK(12, 12) },
+		{ GEN12_OAG_PERF_COUNTER_A(20), INTEL_GEN_MASK(12, 12) },
+		{ GEN12_OAG_PERF_COUNTER_A_UPPER(20), INTEL_GEN_MASK(12, 12) },
 	};
 
 	return find_reg(i915, reg, wo, ARRAY_SIZE(wo));
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 30f6aeb819aa..2f23aad12c60 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -4450,8 +4450,14 @@ int i915_perf_ioctl_version(void)
 	 *
 	 * 6: Whitelist OATRIGGER registers to allow user to trigger reports
 	 *    into the OA buffer. This applies only to gen8+.
+	 *
+	 * 7: Whitelist below OA registers for user to identify the location of
+	 *    triggered reports in the OA buffer. This applies only to gen8+.
+	 *
+	 *    - OA buffer head/tail/status/buffer registers for read only
+	 *    - OA counters A18, A19, A20 for read/write
 	 */
-	return 6;
+	return 7;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 9cc3e312b6b7..c68dc3f39e62 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -675,6 +675,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN7_OASTATUS2_HEAD_MASK 0xffffffc0
 #define GEN7_OASTATUS2_MEM_SELECT_GGTT (1 << 0) /* 0: PPGTT, 1: GGTT */
 
+#define GEN8_GPU_TICKS _MMIO(0x2910)
 #define GEN8_OASTATUS _MMIO(0x2b08)
 #define GEN8_OASTATUS_OVERRUN_STATUS (1 << 3)
 #define GEN8_OASTATUS_COUNTER_OVERFLOW (1 << 2)
@@ -733,6 +734,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN12_OAG_OA_DEBUG_DISABLE_GO_1_0_REPORTS (1 << 2)
 #define GEN12_OAG_OA_DEBUG_DISABLE_CTX_SWITCH_REPORTS (1 << 1)
 
+#define GEN12_OAG_GPU_TICKS _MMIO(0xda90)
 #define GEN12_OAG_OASTATUS _MMIO(0xdafc)
 #define GEN12_OAG_OASTATUS_COUNTER_OVERFLOW (1 << 2)
 #define GEN12_OAG_OASTATUS_BUFFER_OVERFLOW (1 << 1)
@@ -974,6 +976,14 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define OAREPORTTRIG8_NOA_SELECT_6_SHIFT 24
 #define OAREPORTTRIG8_NOA_SELECT_7_SHIFT 28
 
+/* Performance counters registers */
+#define OA_PERF_COUNTER_A(idx) _MMIO(0x2800 + 8 * (idx))
+#define OA_PERF_COUNTER_A_UPPER(idx) _MMIO(0x2800 + 8 * (idx) + 4)
+
+/* Gen12 Performance counters registers */
+#define GEN12_OAG_PERF_COUNTER_A(idx) _MMIO(0xD980 + 8 * (idx))
+#define GEN12_OAG_PERF_COUNTER_A_UPPER(idx) _MMIO(0xD980 + 8 * (idx) + 4)
+
 /* Same layout as OASTARTTRIGX */
 #define GEN12_OAG_OASTARTTRIG1 _MMIO(0xd900)
 #define GEN12_OAG_OASTARTTRIG2 _MMIO(0xd904)

From patchwork Fri Jul 24 00:19:01 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Umesh Nerlige Ramappa
X-Patchwork-Id: 11681863
From: Umesh Nerlige Ramappa
To: intel-gfx@lists.freedesktop.org
Date: Thu, 23 Jul 2020 17:19:01 -0700
Message-Id: <20200724001901.35662-5-umesh.nerlige.ramappa@intel.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200724001901.35662-1-umesh.nerlige.ramappa@intel.com>
References: <20200724001901.35662-1-umesh.nerlige.ramappa@intel.com>
MIME-Version: 1.0
Subject: [Intel-gfx] [PATCH 4/4] drm/i915/perf: Map OA buffer to user space for gen12 performance query
Cc: Chris Wilson

From: Piotr Maciejewski

i915 used to support a time-based sampling mode, which is good for overall
system monitoring but is not enough for the query mode used to measure a
single draw call or dispatch. Gen9-Gen11 use the current i915 perf
implementation for query, but Gen12+ requires a new approach for query
based on triggered reports within the OA buffer.

Triggering reports into the OA buffer is achieved by writing into a
trigger register. Optionally an unused counter/register is set with a
marker value such that a triggered report can be identified in the OA
buffer. Reports are usually triggered at the start and end of work that is
measured.

Since the OA buffer is large and queries can be frequent, an efficient way
to look for triggered reports is required. By knowing the current head and
tail offsets into the OA buffer, it is easier to determine the locality of
the reports of interest.

The current perf OA interface does not expose head/tail information to the
user, and it filters out invalid reports before sending data to the user.
Also, considering the limited size of the user buffer used during a query,
creating a 1:1 copy of the OA buffer in user space added undesired
complexity.
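
The head/tail-bounded search described above can be sketched in user-space
C as follows. This is only an illustration, not code from this series: the
function name, the fixed 256-byte report size, and the byte offset of the
marker within a report are all assumptions chosen for the example, and the
buffer is treated as a plain ring of equally sized reports.

```c
#include <stdint.h>
#include <string.h>

/* ASSUMPTIONS for illustration only, not taken from this series. */
#define REPORT_SIZE   256u /* assumed fixed OA report size in bytes */
#define MARKER_OFFSET 128u /* hypothetical byte offset of the marker counter */

/*
 * Scan the reports between head and tail (byte offsets relative to the start
 * of the mapped buffer) and return the offset of the first report whose
 * marker dword equals 'marker', or -1 if none is found or the arguments are
 * not report-aligned.
 */
static long find_marked_report(const uint8_t *buf, uint32_t size,
			       uint32_t head, uint32_t tail, uint32_t marker)
{
	uint32_t off;

	if (size == 0 || size % REPORT_SIZE || head % REPORT_SIZE ||
	    tail % REPORT_SIZE || head >= size || tail >= size)
		return -1;

	/* Walk from head to tail, wrapping at the end of the ring. */
	for (off = head; off != tail; off = (off + REPORT_SIZE) % size) {
		uint32_t value;

		/* memcpy avoids unaligned or aliasing-unsafe loads. */
		memcpy(&value, buf + off + MARKER_OFFSET, sizeof(value));
		if (value == marker)
			return (long)off;
	}

	return -1;
}
```

Only the reports between head and tail are visited, so the cost of a lookup
scales with the amount of unread data rather than with the full buffer
size.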

The solution was to map the OA buffer to user space, provided that
(1) it is accessed by a privileged user and
(2) OA report filtering is not used.
These two conditions satisfy the safety criteria that the current perf
interface addresses.

To enable the query:
- Add an ioctl to expose head and tail to the user
- Add an ioctl to return size and offset of the OA buffer
- Map the OA buffer to the user space

v2:
- Improve commit message (Chris)
- Do not mmap based on gem object filp. Instead, use perf_fd and support
  mmap syscall (Chris)
- Pass non-zero offset in mmap to enforce the right object is
  mapped (Chris)
- Do not expose gpu_address (Chris)
- Verify start and length of vma for page alignment (Lionel)
- Move SQNTL config out (Lionel)

v3: (Chris)
- Omit redundant checks
- Return VM_FAULT_SIGBUS if old stream is closed
- Maintain reference counts to stream in vm_open and vm_close
- Use switch to identify object to be mapped

v4: Call kref_put on closing perf fd (Chris)
v5:
- Strip access to OA buffer from unprivileged child of a privileged
  parent. Use VM_DONTCOPY
- Enforce MAP_PRIVATE by checking for VM_MAYSHARE

Signed-off-by: Piotr Maciejewski
Signed-off-by: Umesh Nerlige Ramappa
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_mman.h |   2 +
 drivers/gpu/drm/i915/i915_perf.c         | 226 ++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_perf_types.h   |  17 ++
 include/uapi/drm/i915_drm.h              |  32 ++++
 5 files changed, 276 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index b23368529a40..7c4b9b0c334b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -204,7 +204,7 @@ compute_partial_view(const struct drm_i915_gem_object *obj,
 	return view;
 }
 
-static vm_fault_t i915_error_to_vmf_fault(int err)
+vm_fault_t i915_error_to_vmf_fault(int err)
 {
 	switch (err) {
 	default:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.h b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
index efee9e0d2508..1190a3a228ea 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.h
@@ -29,4 +29,6 @@ void i915_gem_object_release_mmap_gtt(struct drm_i915_gem_object *obj);
 
 void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj);
 
+vm_fault_t i915_error_to_vmf_fault(int err);
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 2f23aad12c60..4a374b96eca6 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -192,10 +192,12 @@
  */
 
 #include
+#include
 #include
 #include
 
 #include "gem/i915_gem_context.h"
+#include "gem/i915_gem_mman.h"
 #include "gt/intel_engine_pm.h"
 #include "gt/intel_engine_user.h"
 #include "gt/intel_gt.h"
@@ -378,6 +380,24 @@ static struct ctl_table_header *sysctl_header;
 
 static enum hrtimer_restart oa_poll_check_timer_cb(struct hrtimer *hrtimer);
 
+static void free_stream(struct kref *kref)
+{
+	struct i915_perf_stream *stream =
+		container_of(kref, typeof(*stream), refcount);
+
+	kfree(stream);
+}
+
+static void perf_stream_get(struct i915_perf_stream *stream)
+{
+	kref_get(&stream->refcount);
+}
+
+static void perf_stream_put(struct i915_perf_stream *stream)
+{
+	kref_put(&stream->refcount, free_stream);
+}
+
 void i915_oa_config_release(struct kref *ref)
 {
 	struct i915_oa_config *oa_config =
@@ -434,6 +454,30 @@ static u32 gen7_oa_hw_tail_read(struct i915_perf_stream *stream)
 	return oastatus1 & GEN7_OASTATUS1_TAIL_MASK;
 }
 
+static u32 gen12_oa_hw_head_read(struct i915_perf_stream *stream)
+{
+	struct intel_uncore *uncore = stream->uncore;
+
+	return intel_uncore_read(uncore, GEN12_OAG_OAHEADPTR) &
+	       GEN12_OAG_OAHEADPTR_MASK;
+}
+
+static u32 gen8_oa_hw_head_read(struct i915_perf_stream *stream)
+{
+	struct intel_uncore *uncore = stream->uncore;
+
+	return intel_uncore_read(uncore, GEN8_OAHEADPTR) &
+	       GEN8_OAHEADPTR_MASK;
+}
+
+static u32 gen7_oa_hw_head_read(struct i915_perf_stream *stream)
+{
+	struct intel_uncore *uncore = stream->uncore;
+	u32 oastatus2 = intel_uncore_read(uncore, GEN7_OASTATUS2);
+
+	return oastatus2 & GEN7_OASTATUS2_HEAD_MASK;
+}
+
 /**
  * oa_buffer_check_unlocked - check for data and update tail ptr state
  * @stream: i915 stream instance
@@ -2934,6 +2978,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	stream->poll_check_timer.function = oa_poll_check_timer_cb;
 	init_waitqueue_head(&stream->poll_wq);
 	spin_lock_init(&stream->oa_buffer.ptr_lock);
+	kref_init(&stream->refcount);
 
 	return 0;
 
@@ -3214,6 +3259,69 @@ static long i915_perf_config_locked(struct i915_perf_stream *stream,
 	return ret;
 }
 
+/**
+ * i915_perf_oa_buffer_head_tail_locked - head and tail of the OA buffer
+ * @stream: i915 perf stream
+ * @arg: pointer to oa buffer head and tail filled by this function.
+ */
+static int i915_perf_oa_buffer_head_tail_locked(struct i915_perf_stream *stream,
+						unsigned long arg)
+{
+	struct drm_i915_perf_oa_buffer_head_tail ht;
+	void __user *output = (void __user *)arg;
+	u32 gtt_offset = i915_ggtt_offset(stream->oa_buffer.vma);
+
+	if (i915_perf_stream_paranoid && !perfmon_capable()) {
+		DRM_DEBUG("Insufficient privileges to access OA buffer info\n");
+		return -EACCES;
+	}
+
+	if (!output)
+		return -EINVAL;
+
+	memset(&ht, 0, sizeof(ht));
+
+	ht.head = stream->perf->ops.oa_hw_head_read(stream) - gtt_offset;
+	ht.tail = stream->perf->ops.oa_hw_tail_read(stream) - gtt_offset;
+
+	if (copy_to_user(output, &ht, sizeof(ht)))
+		return -EFAULT;
+
+	return 0;
+}
+
+#define I915_PERF_OA_BUFFER_MMAP_OFFSET 1
+
+/**
+ * i915_perf_oa_buffer_info_locked - size and offset of the OA buffer
+ * @stream: i915 perf stream
+ * @arg: pointer to oa buffer info filled by this function.
+ */
+static int i915_perf_oa_buffer_info_locked(struct i915_perf_stream *stream,
+					   unsigned long arg)
+{
+	struct drm_i915_perf_oa_buffer_info info;
+	void __user *output = (void __user *)arg;
+
+	if (i915_perf_stream_paranoid && !perfmon_capable()) {
+		DRM_DEBUG("Insufficient privileges to access OA buffer info\n");
+		return -EACCES;
+	}
+
+	if (!output)
+		return -EINVAL;
+
+	memset(&info, 0, sizeof(info));
+
+	info.size = stream->oa_buffer.vma->size;
+	info.offset = I915_PERF_OA_BUFFER_MMAP_OFFSET * PAGE_SIZE;
+
+	if (copy_to_user(output, &info, sizeof(info)))
+		return -EFAULT;
+
+	return 0;
+}
+
 /**
  * i915_perf_ioctl - support ioctl() usage with i915 perf stream FDs
  * @stream: An i915 perf stream
@@ -3239,6 +3347,10 @@ static long i915_perf_ioctl_locked(struct i915_perf_stream *stream,
 		return 0;
 	case I915_PERF_IOCTL_CONFIG:
 		return i915_perf_config_locked(stream, arg);
+	case I915_PERF_IOCTL_GET_OA_BUFFER_INFO:
+		return i915_perf_oa_buffer_info_locked(stream, arg);
+	case I915_PERF_IOCTL_GET_OA_BUFFER_HEAD_TAIL:
+		return i915_perf_oa_buffer_head_tail_locked(stream, arg);
 	}
 
 	return -EINVAL;
@@ -3291,7 +3403,8 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
 	if (stream->ctx)
 		i915_gem_context_put(stream->ctx);
 
-	kfree(stream);
+	WRITE_ONCE(stream->closed, true);
+	perf_stream_put(stream);
 }
 
 /**
@@ -3314,12 +3427,113 @@ static int i915_perf_release(struct inode *inode, struct file *file)
 	i915_perf_destroy_locked(stream);
 	mutex_unlock(&perf->lock);
 
+	unmap_mapping_range(file->f_mapping, 0, OA_BUFFER_SIZE, 1);
+
 	/* Release the reference the perf stream kept on the driver. */
 	drm_dev_put(&perf->i915->drm);
 
 	return 0;
 }
 
+static void vm_open_oa(struct vm_area_struct *vma)
+{
+	struct i915_perf_stream *stream = vma->vm_private_data;
+
+	GEM_BUG_ON(!stream);
+	perf_stream_get(stream);
+}
+
+static void vm_close_oa(struct vm_area_struct *vma)
+{
+	struct i915_perf_stream *stream = vma->vm_private_data;
+
+	GEM_BUG_ON(!stream);
+	perf_stream_put(stream);
+}
+
+static vm_fault_t vm_fault_oa(struct vm_fault *vmf)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	struct i915_perf_stream *stream = vma->vm_private_data;
+	struct i915_perf *perf = stream->perf;
+	struct drm_i915_gem_object *obj = stream->oa_buffer.vma->obj;
+	int err;
+	bool closed;
+
+	mutex_lock(&perf->lock);
+	closed = READ_ONCE(stream->closed);
+	mutex_unlock(&perf->lock);
+
+	if (closed)
+		return VM_FAULT_SIGBUS;
+
+	err = i915_gem_object_pin_pages(obj);
+	if (err)
+		goto out;
+
+	err = remap_io_sg(vma,
+			  vma->vm_start, vma->vm_end - vma->vm_start,
+			  obj->mm.pages->sgl, -1);
+
+	i915_gem_object_unpin_pages(obj);
+
+out:
+	return i915_error_to_vmf_fault(err);
+}
+
+static const struct vm_operations_struct vm_ops_oa = {
+	.open = vm_open_oa,
+	.close = vm_close_oa,
+	.fault = vm_fault_oa,
+};
+
+int i915_perf_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct i915_perf_stream *stream = file->private_data;
+
+	/* mmap-ing OA buffer to user space MUST absolutely be privileged */
+	if (i915_perf_stream_paranoid && !perfmon_capable()) {
+
DRM_DEBUG("Insufficient privileges to map OA buffer\n"); + return -EACCES; + } + + switch (vma->vm_pgoff) { + /* A non-zero offset ensures that we are mapping the right object. Also + * leaves room for future objects added to this implementation. + */ + case I915_PERF_OA_BUFFER_MMAP_OFFSET: + if (vma->vm_end - vma->vm_start > OA_BUFFER_SIZE) + return -EINVAL; + + /* Only support VM_READ. Enforce MAP_PRIVATE by checking for + * VM_MAYSHARE. + */ + if (vma->vm_flags & (VM_WRITE | VM_EXEC | + VM_SHARED | VM_MAYSHARE)) + return -EINVAL; + + vma->vm_flags &= ~(VM_MAYWRITE | VM_MAYEXEC); + + /* If the privileged parent forks and child drops root + * privilege, we do not want the child to retain access to the + * mapped OA buffer. Explicitly set VM_DONTCOPY to avoid such + * cases. + */ + vma->vm_flags |= VM_PFNMAP | VM_DONTEXPAND | + VM_DONTDUMP | VM_DONTCOPY; + break; + + default: + return -EINVAL; + } + + vma->vm_page_prot = vm_get_page_prot(vma->vm_flags); + vma->vm_private_data = stream; + vma->vm_ops = &vm_ops_oa; + vm_open_oa(vma); + + return 0; +} static const struct file_operations fops = { .owner = THIS_MODULE, @@ -3332,6 +3546,7 @@ static const struct file_operations fops = { * to handle 32bits compatibility. 
*/ .compat_ioctl = i915_perf_ioctl, + .mmap = i915_perf_mmap, }; @@ -4260,6 +4475,7 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.oa_disable = gen7_oa_disable; perf->ops.read = gen7_oa_read; perf->ops.oa_hw_tail_read = gen7_oa_hw_tail_read; + perf->ops.oa_hw_head_read = gen7_oa_hw_head_read; perf->oa_formats = hsw_oa_formats; } else if (HAS_LOGICAL_RING_CONTEXTS(i915)) { @@ -4291,6 +4507,7 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.enable_metric_set = gen8_enable_metric_set; perf->ops.disable_metric_set = gen8_disable_metric_set; perf->ops.oa_hw_tail_read = gen8_oa_hw_tail_read; + perf->ops.oa_hw_head_read = gen8_oa_hw_head_read; if (IS_GEN(i915, 8)) { perf->ctx_oactxctrl_offset = 0x120; @@ -4318,6 +4535,7 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.enable_metric_set = gen8_enable_metric_set; perf->ops.disable_metric_set = gen10_disable_metric_set; perf->ops.oa_hw_tail_read = gen8_oa_hw_tail_read; + perf->ops.oa_hw_head_read = gen8_oa_hw_head_read; if (IS_GEN(i915, 10)) { perf->ctx_oactxctrl_offset = 0x128; @@ -4342,6 +4560,7 @@ void i915_perf_init(struct drm_i915_private *i915) perf->ops.enable_metric_set = gen12_enable_metric_set; perf->ops.disable_metric_set = gen12_disable_metric_set; perf->ops.oa_hw_tail_read = gen12_oa_hw_tail_read; + perf->ops.oa_hw_head_read = gen12_oa_hw_head_read; perf->ctx_flexeu0_offset = 0; perf->ctx_oactxctrl_offset = 0x144; @@ -4456,8 +4675,11 @@ int i915_perf_ioctl_version(void) * * - OA buffer head/tail/status/buffer registers for read only * - OA counters A18, A19, A20 for read/write + * + * 8: Added an option to map oa buffer at umd driver level and trigger + * oa reports within oa buffer from command buffer. 
*/ - return 7; + return 8; } #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index a36a455ae336..2efbe35c5fa9 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -311,6 +311,18 @@ struct i915_perf_stream { * buffer should be checked for available data. */ u64 poll_oa_period; + + /** + * @closed: Open or closed state of the stream. + * True if stream is closed. + */ + bool closed; + + /** + * @refcount: References to the mapped OA buffer managed by this + * stream. + */ + struct kref refcount; }; /** @@ -377,6 +389,11 @@ struct i915_oa_ops { * generations. */ u32 (*oa_hw_tail_read)(struct i915_perf_stream *stream); + + /** + * @oa_hw_head_read: read the OA head pointer register + */ + u32 (*oa_hw_head_read)(struct i915_perf_stream *stream); }; struct i915_perf { diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 00546062e023..2042f6339182 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -2048,6 +2048,38 @@ struct drm_i915_perf_open_param { */ #define I915_PERF_IOCTL_CONFIG _IO('i', 0x2) +/** + * Returns OA buffer properties to be used with mmap. + * + * This ioctl is available in perf revision 8. + */ +#define I915_PERF_IOCTL_GET_OA_BUFFER_INFO _IO('i', 0x3) + +/** + * OA buffer size and offset. + */ +struct drm_i915_perf_oa_buffer_info { + __u32 size; + __u32 offset; + __u64 reserved[4]; +}; + +/** + * Returns current position of OA buffer head and tail. + * + * This ioctl is available in perf revision 8. + */ +#define I915_PERF_IOCTL_GET_OA_BUFFER_HEAD_TAIL _IO('i', 0x4) + +/** + * OA buffer head and tail. + */ +struct drm_i915_perf_oa_buffer_head_tail { + __u32 head; + __u32 tail; + __u64 reserved[4]; +}; + /** * Common to all i915 perf records */