From patchwork Thu Mar 30 00:40:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13193325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5EC4DC761A6 for ; Thu, 30 Mar 2023 00:41:08 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E3F7B10ECAB; Thu, 30 Mar 2023 00:41:06 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8617210ECA2 for ; Thu, 30 Mar 2023 00:41:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680136864; x=1711672864; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YjSaHL5A1iolNDo1jBQdSWIT5/XO3JJ/rFAOqXMNLBk=; b=bz6xdLyuxTZWWVImiUyyCjmshoNOsPMwI/KWA4Jop3eAMu7xo1Gfbz/Y GVZKMGZwmiYUfyUxxAVZAr6EGJZTHzxUORU7zI5p0ECpR9mOOT6g9xba/ Ch75OxMzjSJNMitkJ9Mx1jVWcm7gMS7aLOWS2sL+lWtUckFu5zOCTXv5k dIHraHXruLZUhBTGIHHIplyp7H6vC9wGoZUpsqMmyFCuGl10gMEgQixow WhVIg/2E/4HPb9KM1xT0lbXh+BiY6reVr2MxTlOUR4RwvpyuyAJsmrjUp o5c211wYAidv2byoSPV3HjwqdmgwuUk+bRzMkKMo4tcIwCgvG2onuDG2f Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="427310369" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="427310369" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634668666" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; 
d="scan'208";a="634668666" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:03 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Mar 2023 17:40:55 -0700 Message-Id: <20230330004103.1295413-2-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> References: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 1/9] drm/i915/pmu: Support PMU for all engines X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Tvrtko Ursulin Given how the metrics are already exported, we also need to run sampling over engines from all GTs. Problem of GT frequencies is left for later. 
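For illustration only (this is not code from the patch): the control flow described above can be modelled in user space. The sketch assumes a hypothetical `struct toy_gt` whose two counters stand in for calls to engines_sample() and frequency_sample(). Only the loop shape mirrors the diff that follows: every GT gets engine sampling, while frequency sampling stays on GT 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_gt; not a kernel type. */
struct toy_gt {
	unsigned int engine_samples;	/* how often engine sampling ran */
	unsigned int freq_samples;	/* how often frequency sampling ran */
};

/* Model of one sampling tick: visit every GT, sample frequency on GT 0 only. */
static void toy_sample_tick(struct toy_gt *gts, size_t num_gts)
{
	size_t i;

	assert(gts != NULL || num_gts == 0);

	for (i = 0; i < num_gts; i++) {
		gts[i].engine_samples++;

		/* Frequency is still restricted to GT 0 at this point in the series. */
		if (i == 0)
			gts[i].freq_samples++;
	}
}
```

Calling toy_sample_tick() on a two-element array bumps engine_samples on both entries and freq_samples on the first one only.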
Signed-off-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_pmu.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 7ece883a7d95..e274dba58629 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -10,6 +10,7 @@ #include "gt/intel_engine_pm.h" #include "gt/intel_engine_regs.h" #include "gt/intel_engine_user.h" +#include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" #include "gt/intel_gt_regs.h" #include "gt/intel_rc6.h" @@ -414,8 +415,9 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer) struct drm_i915_private *i915 = container_of(hrtimer, struct drm_i915_private, pmu.timer); struct i915_pmu *pmu = &i915->pmu; - struct intel_gt *gt = to_gt(i915); unsigned int period_ns; + struct intel_gt *gt; + unsigned int i; ktime_t now; if (!READ_ONCE(pmu->timer_enabled)) @@ -431,8 +433,14 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer) * grabbing the forcewake. However the potential error from timer call- * back delay greatly dominates this so we keep it simple. 
*/ - engines_sample(gt, period_ns); - frequency_sample(gt, period_ns); + + for_each_gt(gt, i915, i) { + engines_sample(gt, period_ns); + + /* Sample only gt0 until gt support is added for frequency */ + if (i == 0) + frequency_sample(gt, period_ns); + } hrtimer_forward(hrtimer, now, ns_to_ktime(PERIOD)); From patchwork Thu Mar 30 00:40:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13193323 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 70C31C6FD18 for ; Thu, 30 Mar 2023 00:41:06 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E25F010EC94; Thu, 30 Mar 2023 00:41:05 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id A104710EC94 for ; Thu, 30 Mar 2023 00:41:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680136864; x=1711672864; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1phlpWwA7X7CjtHodxaQogAKXJhRp8HyJVCs0dPxQ2Y=; b=U4zNqGYcWqdGMX4kWQPM0+TYKJEoUxBvjPTEAb2AVgYZaEQ850a1ppnl BdUrQpRkO8nduDIx7WHRmHo4qac1kl1ybipoxFwNOgEZs7RKfcuhcFQyh 0/NL6ndkZFqIGyY8IYSVKnG8LRqoYa49ANkdZGxG725zpGBr3546Ku2fL 15oM11qEm1b3kujZrFVa9QNuNcqtkqT1yYpv9emYuRvBfcWMWC9LWQlkr EL6cEbnhWycie1a5WcQGOjOvL7O5ZN6VfwCNNQLFut6ieJAHOhR+9kvZB M/idwsy68vYxN8aF0oB5FnfeCUEAtnbfmGi7o6Q2Lt/U9kJBX/hH75m9D g==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="427310370" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; 
d="scan'208";a="427310370" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634668669" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="634668669" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:03 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Mar 2023 17:40:56 -0700 Message-Id: <20230330004103.1295413-3-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> References: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/9] drm/i915/pmu: Skip sampling engines with no enabled counters X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Tvrtko Ursulin As we have more and more engines, do not waste time sampling the ones no one is monitoring. 
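A minimal user-space sketch of the early-skip idea, under stated assumptions: `struct toy_engine` and toy_engines_sample() are hypothetical names, and the `enable` field only loosely models engine->pmu.enable from the diff that follows. Engines with no enabled samplers are skipped before the (comparatively costly) wakefulness check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an engine; not a kernel type. */
struct toy_engine {
	unsigned int enable;	/* bitmask of enabled samplers, 0 = nobody listening */
	bool awake;		/* models intel_engine_pm_get_if_awake() succeeding */
	unsigned int samples;	/* how often this engine was sampled */
};

/* Sample only engines that are both monitored and awake; return how many. */
static size_t toy_engines_sample(struct toy_engine *engines, size_t count)
{
	size_t sampled = 0;
	size_t i;

	assert(engines != NULL || count == 0);

	for (i = 0; i < count; i++) {
		/* Cheap check first: no enabled counters, nothing to do. */
		if (!engines[i].enable)
			continue;

		if (!engines[i].awake)
			continue;

		engines[i].samples++;
		sampled++;
	}

	return sampled;
}
```

The ordering is the point of the sketch: the mask test is a plain memory read, so doing it first avoids touching power-management state for engines nobody is watching.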
Signed-off-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_pmu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index e274dba58629..6abd5042dea3 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -339,6 +339,9 @@ engines_sample(struct intel_gt *gt, unsigned int period_ns) return; for_each_engine(engine, gt, id) { + if (!engine->pmu.enable) + continue; + if (!intel_engine_pm_get_if_awake(engine)) continue; From patchwork Thu Mar 30 00:40:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13193324 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DC4B4C74A5B for ; Thu, 30 Mar 2023 00:41:06 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5D4BD10ECA2; Thu, 30 Mar 2023 00:41:06 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id BA46210ECA2 for ; Thu, 30 Mar 2023 00:41:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680136864; x=1711672864; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=U6kBJ8Bo2O0IG+5FZvD8CLg3x1X+NY5vXhWEX6uqC3o=; b=PP18pePStNy+jIjWUdR9r2Amr2AlvsAtxuqzgU5hqkP0dmPQEm2lcsOL 7YGQTjAhwe0aR0x8IGpJEZc2jFYgVEAGdqRBRy9a7qx2LZYMRMmlsQ5js PukBYnXSAAdLyKBzlR7kIJOFxZtuE65dPDlE2/h6f22Q90swIM0R0RF6F 1beXuQ8LlfvopBNiPJSdY40ZjF4IiRDHG7i11vZzbPgp4lIN7iFDWiq7q 
3rmQEt7oiNtGeGbhxB9Vi+MRVR+KJKFYgIDo5rZUJFwYOcRdRdirm2Dcj DhaE0HOzcoVXeevwX2fu6dxPujdsOz21AgFOr4JxgzOdC0ENhSgFOdAzf w==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="427310371" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="427310371" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634668672" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="634668672" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:03 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Mar 2023 17:40:57 -0700 Message-Id: <20230330004103.1295413-4-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> References: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 3/9] drm/i915/pmu: Transform PMU parking code to be GT based X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Tvrtko Ursulin Trivial prep work for full multi-tile enablement later. 
Signed-off-by: Tvrtko Ursulin Signed-off-by: Vinay Belgaumkar --- drivers/gpu/drm/i915/gt/intel_gt_pm.c | 4 ++-- drivers/gpu/drm/i915/i915_pmu.c | 16 ++++++++-------- drivers/gpu/drm/i915/i915_pmu.h | 9 +++++---- 3 files changed, 15 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm.c index e02cb90723ae..c2e69bafd02b 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c @@ -87,7 +87,7 @@ static int __gt_unpark(struct intel_wakeref *wf) intel_rc6_unpark(&gt->rc6); intel_rps_unpark(&gt->rps); - i915_pmu_gt_unparked(i915); + i915_pmu_gt_unparked(gt); intel_guc_busyness_unpark(gt); intel_gt_unpark_requests(gt); @@ -109,7 +109,7 @@ static int __gt_park(struct intel_wakeref *wf) intel_guc_busyness_park(gt); i915_vma_parked(gt); - i915_pmu_gt_parked(i915); + i915_pmu_gt_parked(gt); intel_rps_park(&gt->rps); intel_rc6_park(&gt->rc6); diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 6abd5042dea3..6f7f9b40860d 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -217,11 +217,11 @@ static void init_rc6(struct i915_pmu *pmu) } } -static void park_rc6(struct drm_i915_private *i915) +static void park_rc6(struct intel_gt *gt) { - struct i915_pmu *pmu = &i915->pmu; + struct i915_pmu *pmu = &gt->i915->pmu; - pmu->sample[__I915_SAMPLE_RC6].cur = __get_rc6(to_gt(i915)); + pmu->sample[__I915_SAMPLE_RC6].cur = __get_rc6(gt); pmu->sleep_last = ktime_get_raw(); } @@ -236,16 +236,16 @@ static void __i915_pmu_maybe_start_timer(struct i915_pmu *pmu) } } -void i915_pmu_gt_parked(struct drm_i915_private *i915) +void i915_pmu_gt_parked(struct intel_gt *gt) { - struct i915_pmu *pmu = &i915->pmu; + struct i915_pmu *pmu = &gt->i915->pmu; if (!pmu->base.event_init) return; spin_lock_irq(&pmu->lock); - park_rc6(i915); + park_rc6(gt); /* * Signal sampling timer to stop if only engine events are enabled and @@ -256,9 +256,9 @@ void 
i915_pmu_gt_parked(struct drm_i915_private *i915) spin_unlock_irq(&pmu->lock); } -void i915_pmu_gt_unparked(struct drm_i915_private *i915) +void i915_pmu_gt_unparked(struct intel_gt *gt) { - struct i915_pmu *pmu = &i915->pmu; + struct i915_pmu *pmu = &gt->i915->pmu; if (!pmu->base.event_init) return; diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h index 449057648f39..d98fbc7a2f45 100644 --- a/drivers/gpu/drm/i915/i915_pmu.h +++ b/drivers/gpu/drm/i915/i915_pmu.h @@ -13,6 +13,7 @@ #include struct drm_i915_private; +struct intel_gt; /** * Non-engine events that we need to track enabled-disabled transition and @@ -151,15 +152,15 @@ int i915_pmu_init(void); void i915_pmu_exit(void); void i915_pmu_register(struct drm_i915_private *i915); void i915_pmu_unregister(struct drm_i915_private *i915); -void i915_pmu_gt_parked(struct drm_i915_private *i915); -void i915_pmu_gt_unparked(struct drm_i915_private *i915); +void i915_pmu_gt_parked(struct intel_gt *gt); +void i915_pmu_gt_unparked(struct intel_gt *gt); #else static inline int i915_pmu_init(void) { return 0; } static inline void i915_pmu_exit(void) {} static inline void i915_pmu_register(struct drm_i915_private *i915) {} static inline void i915_pmu_unregister(struct drm_i915_private *i915) {} -static inline void i915_pmu_gt_parked(struct drm_i915_private *i915) {} -static inline void i915_pmu_gt_unparked(struct drm_i915_private *i915) {} +static inline void i915_pmu_gt_parked(struct intel_gt *gt) {} +static inline void i915_pmu_gt_unparked(struct intel_gt *gt) {} #endif #endif From patchwork Thu Mar 30 00:40:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13193328 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F071CC74A5B for ; Thu, 30 Mar 2023 00:41:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8D67D10ECC2; Thu, 30 Mar 2023 00:41:13 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id EDF1410ECA2 for ; Thu, 30 Mar 2023 00:41:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680136864; x=1711672864; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ouSJCwNN6PUG1ASC3l4ex50/aQQ8eT0M44+7ZeRF7CA=; b=OqC+UKlo5+3ny6AxrrP8Qs9Ocaio1MHQWcsCY17sGmlP4rsFZvouhzYG F/5qY13Fp4XQ8rOJXtrwuCmXP2+mUvbuqS/XptBxwcb/FYwlSh3F4cXRI djVGTP9YNk7kHXFx6Sfz+A1ktqrfZ+FkDkH65YsBtdVSxbzLP0Pv6E8h2 6K44ebpIB14V06GCMfSRZLsJbj4c94Jj0wA8rrsC0lr6t/CiEOH0hNAdu pIX7avxRYDdUqbWRcuAQc5ke5heH4hit6pbbAaQTAgYU24Rom/kt5CWQz 2wDK8K0dLodExKNRjk7JvHx3H7HncfnuGkKUSUsaoTTncgOcwtqZzwU7O Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="427310373" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="427310373" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634668675" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="634668675" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:03 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Mar 2023 17:40:58 -0700 Message-Id: <20230330004103.1295413-5-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: 
<20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> References: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 4/9] drm/i915/pmu: Add reference counting to the sampling timer X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Tvrtko Ursulin We do not want to have a timer per tile and waste CPU cycles and energy via multiple wake-up sources for the relatively unimportant task of PMU sampling, so keeping a single timer works well. But we also do not want the first GT that goes idle to turn off the timer. Add some reference counting, via a mask of unparked GTs, to solve this. Signed-off-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_pmu.c | 12 ++++++++++-- drivers/gpu/drm/i915/i915_pmu.h | 4 ++++ 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 6f7f9b40860d..c00b94c7f509 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -251,7 +251,9 @@ void i915_pmu_gt_parked(struct intel_gt *gt) * Signal sampling timer to stop if only engine events are enabled and * GPU went idle. */ - pmu->timer_enabled = pmu_needs_timer(pmu, false); + pmu->unparked &= ~BIT(gt->info.id); + if (pmu->unparked == 0) + pmu->timer_enabled = pmu_needs_timer(pmu, false); spin_unlock_irq(&pmu->lock); } @@ -268,7 +270,10 @@ void i915_pmu_gt_unparked(struct intel_gt *gt) /* * Re-enable sampling timer when GPU goes active. 
*/ - __i915_pmu_maybe_start_timer(pmu); + if (pmu->unparked == 0) + __i915_pmu_maybe_start_timer(pmu); + + pmu->unparked |= BIT(gt->info.id); spin_unlock_irq(&pmu->lock); } @@ -438,6 +443,9 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer) */ for_each_gt(gt, i915, i) { + if (!(pmu->unparked & BIT(i))) + continue; + engines_sample(gt, period_ns); /* Sample only gt0 until gt support is added for frequency */ diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h index d98fbc7a2f45..1b04c79907e8 100644 --- a/drivers/gpu/drm/i915/i915_pmu.h +++ b/drivers/gpu/drm/i915/i915_pmu.h @@ -76,6 +76,10 @@ struct i915_pmu { * @lock: Lock protecting enable mask and ref count handling. */ spinlock_t lock; + /** + * @unparked: GT unparked mask. + */ + unsigned int unparked; /** * @timer: Timer for internal i915 PMU sampling. */ From patchwork Thu Mar 30 00:40:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13193331 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BB6F3C74A5B for ; Thu, 30 Mar 2023 00:41:31 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0E67810EC9A; Thu, 30 Mar 2023 00:41:31 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3F05510ECA2 for ; Thu, 30 Mar 2023 00:41:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680136865; x=1711672865; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=9MFsHNhw5mmYFfpPN5faUT6PCKokc/BehiBa2wiDpJ0=; b=cTkwxyGKkZFXzi8gkHhbBiuru9pWNUD48/QBmfftEWDEJAacaFqgrmFH Yaj8ymJ5mpDpuYbjr4Bktth32ogsR+sXXzp3vpmaPOPiH6Yk8Mqfa9vid 8y//aOPvZmvKWeOKsdiXu8neTY/fIKlJuUpW8iBlvzH5lYkYAbSoyJRjH I7FtVhVQXAdP7noDw1Mkjn9zaN5IyXvL0yHdZK311sheYEgaFsCfKKdRB NaQA0qVbUAeBe88ZrTRS1cQPi2teErYTWzDNsSVQgs/tfUFQz3ewZwsEb ctP8zqQXZBrMD21A/cdIP3DhKTEefrZbaKFWw1mpFTJKWvw0XHCU1BFIZ g==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="427310376" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="427310376" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634668679" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="634668679" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:03 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Mar 2023 17:40:59 -0700 Message-Id: <20230330004103.1295413-6-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> References: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 5/9] drm/i915/pmu: Prepare for multi-tile non-engine counters X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Tvrtko Ursulin Reserve some bits in the counter config namespace which will carry the tile id and prepare the code to handle this. No per tile counters have been added yet. 
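To make the encoding concrete, here is a small user-space sketch of the packing and unpacking this patch introduces. It is illustrative only and rests on assumptions: GT_SHIFT mirrors the __I915_PMU_GT_SHIFT value of 60 used in the diff that follows, OTHER_BASE is my reading of what __I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 evaluates to, and toy_pmu_other() is a hypothetical helper standing in for the ___I915_PMU_OTHER() macro. The two decode helpers follow the same arithmetic as config_gt_id() and config_counter() in the diff. Note that with a shift of 60 only the top 4 bits of the 64-bit config carry the GT id.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors __I915_PMU_GT_SHIFT as posted in this patch (marked FIXME there). */
#define GT_SHIFT 60

/* Assumption: first non-engine config value, i.e. 0xfffff + 1. */
#define OTHER_BASE 0x100000ULL

/* Hypothetical helper: pack a non-engine counter index and a GT id. */
static uint64_t toy_pmu_other(unsigned int gt, uint64_t x)
{
	return (OTHER_BASE + x) | ((uint64_t)gt << GT_SHIFT);
}

/* Extract the GT id from the top bits of the config. */
static unsigned int config_gt_id(uint64_t config)
{
	return (unsigned int)(config >> GT_SHIFT);
}

/* Strip the GT id, leaving the GT-agnostic counter value. */
static uint64_t config_counter(uint64_t config)
{
	return config & ~(~0ULL << GT_SHIFT);
}
```

A round trip shows the design choice: a config built for GT 2 decodes to GT id 2, and its counter part equals the GT 0 encoding, which is why the existing switch statements can keep matching on the unqualified I915_PMU_* values.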
Signed-off-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_pmu.c | 153 +++++++++++++++++++++++--------- drivers/gpu/drm/i915/i915_pmu.h | 9 +- include/uapi/drm/i915_drm.h | 18 +++- 3 files changed, 132 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index c00b94c7f509..5d1de98d86b4 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -56,11 +56,21 @@ static bool is_engine_config(u64 config) return config < __I915_PMU_OTHER(0); } +static unsigned int config_gt_id(const u64 config) +{ + return config >> __I915_PMU_GT_SHIFT; +} + +static u64 config_counter(const u64 config) +{ + return config & ~(~0ULL << __I915_PMU_GT_SHIFT); +} + static unsigned int other_bit(const u64 config) { unsigned int val; - switch (config) { + switch (config_counter(config)) { case I915_PMU_ACTUAL_FREQUENCY: val = __I915_PMU_ACTUAL_FREQUENCY_ENABLED; break; @@ -78,15 +88,20 @@ static unsigned int other_bit(const u64 config) return -1; } - return I915_ENGINE_SAMPLE_COUNT + val; + return I915_ENGINE_SAMPLE_COUNT + + config_gt_id(config) * __I915_PMU_TRACKED_EVENT_COUNT + + val; } static unsigned int config_bit(const u64 config) { - if (is_engine_config(config)) + if (is_engine_config(config)) { + GEM_BUG_ON(config_gt_id(config)); + return engine_config_sample(config); - else + } else { return other_bit(config); + } } static u64 config_mask(u64 config) @@ -104,6 +119,18 @@ static unsigned int event_bit(struct perf_event *event) return config_bit(event->attr.config); } +static u64 frequency_enabled_mask(void) +{ + unsigned int i; + u64 mask = 0; + + for (i = 0; i < I915_PMU_MAX_GTS; i++) + mask |= config_mask(__I915_PMU_ACTUAL_FREQUENCY(i)) | + config_mask(__I915_PMU_REQUESTED_FREQUENCY(i)); + + return mask; +} + static bool pmu_needs_timer(struct i915_pmu *pmu, bool gpu_active) { struct drm_i915_private *i915 = container_of(pmu, typeof(*i915), pmu); @@ -120,9 +147,7 @@ static bool 
pmu_needs_timer(struct i915_pmu *pmu, bool gpu_active) * Mask out all the ones which do not need the timer, or in * other words keep all the ones that could need the timer. */ - enable &= config_mask(I915_PMU_ACTUAL_FREQUENCY) | - config_mask(I915_PMU_REQUESTED_FREQUENCY) | - ENGINE_SAMPLE_MASK; + enable &= frequency_enabled_mask() | ENGINE_SAMPLE_MASK; /* * When the GPU is idle per-engine counters do not need to be @@ -164,9 +189,39 @@ static inline s64 ktime_since_raw(const ktime_t kt) return ktime_to_ns(ktime_sub(ktime_get_raw(), kt)); } +static unsigned int +__sample_idx(struct i915_pmu *pmu, unsigned int gt_id, int sample) +{ + unsigned int idx = gt_id * __I915_NUM_PMU_SAMPLERS + sample; + + GEM_BUG_ON(idx >= ARRAY_SIZE(pmu->sample)); + + return idx; +} + +static u64 read_sample(struct i915_pmu *pmu, unsigned int gt_id, int sample) +{ + return pmu->sample[__sample_idx(pmu, gt_id, sample)].cur; +} + +static void +store_sample(struct i915_pmu *pmu, unsigned int gt_id, int sample, u64 val) +{ + pmu->sample[__sample_idx(pmu, gt_id, sample)].cur = val; +} + +static void +add_sample_mult(struct i915_pmu *pmu, unsigned int gt_id, int sample, u32 val, + u32 mul) +{ + pmu->sample[__sample_idx(pmu, gt_id, sample)].cur += + mul_u32_u32(val, mul); +} + static u64 get_rc6(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; + const unsigned int gt_id = gt->info.id; struct i915_pmu *pmu = &i915->pmu; unsigned long flags; bool awake = false; @@ -181,7 +236,7 @@ static u64 get_rc6(struct intel_gt *gt) spin_lock_irqsave(&pmu->lock, flags); if (awake) { - pmu->sample[__I915_SAMPLE_RC6].cur = val; + store_sample(pmu, gt_id, __I915_SAMPLE_RC6, val); } else { /* * We think we are runtime suspended. @@ -190,14 +245,14 @@ static u64 get_rc6(struct intel_gt *gt) * on top of the last known real value, as the approximated RC6 * counter value. 
*/ - val = ktime_since_raw(pmu->sleep_last); - val += pmu->sample[__I915_SAMPLE_RC6].cur; + val = ktime_since_raw(pmu->sleep_last[gt_id]); + val += read_sample(pmu, gt_id, __I915_SAMPLE_RC6); } - if (val < pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur) - val = pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur; + if (val < read_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED)) + val = read_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED); else - pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = val; + store_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED, val); spin_unlock_irqrestore(&pmu->lock, flags); @@ -207,13 +262,20 @@ static u64 get_rc6(struct intel_gt *gt) static void init_rc6(struct i915_pmu *pmu) { struct drm_i915_private *i915 = container_of(pmu, typeof(*i915), pmu); - intel_wakeref_t wakeref; + struct intel_gt *gt; + unsigned int i; + + for_each_gt(gt, i915, i) { + intel_wakeref_t wakeref; - with_intel_runtime_pm(to_gt(i915)->uncore->rpm, wakeref) { - pmu->sample[__I915_SAMPLE_RC6].cur = __get_rc6(to_gt(i915)); - pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = - pmu->sample[__I915_SAMPLE_RC6].cur; - pmu->sleep_last = ktime_get_raw(); + with_intel_runtime_pm(gt->uncore->rpm, wakeref) { + u64 val = __get_rc6(gt); + + store_sample(pmu, i, __I915_SAMPLE_RC6, val); + store_sample(pmu, i, __I915_SAMPLE_RC6_LAST_REPORTED, + val); + pmu->sleep_last[i] = ktime_get_raw(); + } } } @@ -221,8 +283,8 @@ static void park_rc6(struct intel_gt *gt) { struct i915_pmu *pmu = &gt->i915->pmu; - pmu->sample[__I915_SAMPLE_RC6].cur = __get_rc6(gt); - pmu->sleep_last = ktime_get_raw(); + store_sample(pmu, gt->info.id, __I915_SAMPLE_RC6, __get_rc6(gt)); + pmu->sleep_last[gt->info.id] = ktime_get_raw(); } static void __i915_pmu_maybe_start_timer(struct i915_pmu *pmu) @@ -362,34 +424,30 @@ engines_sample(struct intel_gt *gt, unsigned int period_ns) } } -static void -add_sample_mult(struct i915_pmu_sample *sample, u32 val, u32 mul) -{ - sample->cur += mul_u32_u32(val, 
mul); -} - -static bool frequency_sampling_enabled(struct i915_pmu *pmu) +static bool +frequency_sampling_enabled(struct i915_pmu *pmu, unsigned int gt) { return pmu->enable & - (config_mask(I915_PMU_ACTUAL_FREQUENCY) | - config_mask(I915_PMU_REQUESTED_FREQUENCY)); + (config_mask(__I915_PMU_ACTUAL_FREQUENCY(gt)) | + config_mask(__I915_PMU_REQUESTED_FREQUENCY(gt))); } static void frequency_sample(struct intel_gt *gt, unsigned int period_ns) { struct drm_i915_private *i915 = gt->i915; + const unsigned int gt_id = gt->info.id; struct i915_pmu *pmu = &i915->pmu; struct intel_rps *rps = &gt->rps; - if (!frequency_sampling_enabled(pmu)) + if (!frequency_sampling_enabled(pmu, gt_id)) return; /* Report 0/0 (actual/requested) frequency while parked. */ if (!intel_gt_pm_get_if_awake(gt)) return; - if (pmu->enable & config_mask(I915_PMU_ACTUAL_FREQUENCY)) { + if (pmu->enable & config_mask(__I915_PMU_ACTUAL_FREQUENCY(gt_id))) { u32 val; /* @@ -405,12 +463,12 @@ frequency_sample(struct intel_gt *gt, unsigned int period_ns) if (!val) val = intel_gpu_freq(rps, rps->cur_freq); - add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT], + add_sample_mult(pmu, gt_id, __I915_SAMPLE_FREQ_ACT, val, period_ns / 1000); } - if (pmu->enable & config_mask(I915_PMU_REQUESTED_FREQUENCY)) { - add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_REQ], + if (pmu->enable & config_mask(__I915_PMU_REQUESTED_FREQUENCY(gt_id))) { + add_sample_mult(pmu, gt_id, __I915_SAMPLE_FREQ_REQ, intel_rps_get_requested_frequency(rps), period_ns / 1000); } @@ -447,10 +505,7 @@ static enum hrtimer_restart i915_sample(struct hrtimer *hrtimer) continue; engines_sample(gt, period_ns); - - /* Sample only gt0 until gt support is added for frequency */ - if (i == 0) - frequency_sample(gt, period_ns); + frequency_sample(gt, period_ns); } hrtimer_forward(hrtimer, now, ns_to_ktime(PERIOD)); @@ -492,7 +547,12 @@ config_status(struct drm_i915_private *i915, u64 config) { struct intel_gt *gt = to_gt(i915); - switch (config) { + unsigned 
int gt_id = config_gt_id(config); + + if (gt_id) + return -ENOENT; + + switch (config_counter(config)) { case I915_PMU_ACTUAL_FREQUENCY: if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) /* Requires a mutex for sampling! */ @@ -600,22 +660,27 @@ static u64 __i915_pmu_event_read(struct perf_event *event) val = engine->pmu.sample[sample].cur; } } else { - switch (event->attr.config) { + const unsigned int gt_id = config_gt_id(event->attr.config); + const u64 config = config_counter(event->attr.config); + + switch (config) { case I915_PMU_ACTUAL_FREQUENCY: val = - div_u64(pmu->sample[__I915_SAMPLE_FREQ_ACT].cur, + div_u64(read_sample(pmu, gt_id, + __I915_SAMPLE_FREQ_ACT), USEC_PER_SEC /* to MHz */); break; case I915_PMU_REQUESTED_FREQUENCY: val = - div_u64(pmu->sample[__I915_SAMPLE_FREQ_REQ].cur, + div_u64(read_sample(pmu, gt_id, + __I915_SAMPLE_FREQ_REQ), USEC_PER_SEC /* to MHz */); break; case I915_PMU_INTERRUPTS: val = READ_ONCE(pmu->irq_count); break; case I915_PMU_RC6_RESIDENCY: - val = get_rc6(to_gt(i915)); + val = get_rc6(i915->gt[gt_id]); break; case I915_PMU_SOFTWARE_GT_AWAKE_TIME: val = ktime_to_ns(intel_gt_get_awake_time(to_gt(i915))); diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h index 1b04c79907e8..a708e44a227e 100644 --- a/drivers/gpu/drm/i915/i915_pmu.h +++ b/drivers/gpu/drm/i915/i915_pmu.h @@ -38,13 +38,16 @@ enum { __I915_NUM_PMU_SAMPLERS }; +#define I915_PMU_MAX_GTS (4) /* FIXME */ + /** * How many different events we track in the global PMU mask. * * It is also used to know to needed number of event reference counters. */ #define I915_PMU_MASK_BITS \ - (I915_ENGINE_SAMPLE_COUNT + __I915_PMU_TRACKED_EVENT_COUNT) + (I915_ENGINE_SAMPLE_COUNT + \ + I915_PMU_MAX_GTS * __I915_PMU_TRACKED_EVENT_COUNT) #define I915_ENGINE_SAMPLE_COUNT (I915_SAMPLE_SEMA + 1) @@ -124,11 +127,11 @@ struct i915_pmu { * Only global counters are held here, while the per-engine ones are in * struct intel_engine_cs. 
*/ - struct i915_pmu_sample sample[__I915_NUM_PMU_SAMPLERS]; + struct i915_pmu_sample sample[I915_PMU_MAX_GTS * __I915_NUM_PMU_SAMPLERS]; /** * @sleep_last: Last time GT parked for RC6 estimation. */ - ktime_t sleep_last; + ktime_t sleep_last[I915_PMU_MAX_GTS]; /** * @irq_count: Number of interrupts * diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index dba7c5a5b25e..bbab7f3dbeb4 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -280,7 +280,17 @@ enum drm_i915_pmu_engine_sample { #define I915_PMU_ENGINE_SEMA(class, instance) \ __I915_PMU_ENGINE(class, instance, I915_SAMPLE_SEMA) -#define __I915_PMU_OTHER(x) (__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x)) +/* + * Top 8 bits of every non-engine counter are GT id. + * FIXME: __I915_PMU_GT_SHIFT will be changed to 56 + */ +#define __I915_PMU_GT_SHIFT (60) + +#define ___I915_PMU_OTHER(gt, x) \ + (((__u64)__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x)) | \ + ((__u64)(gt) << __I915_PMU_GT_SHIFT)) + +#define __I915_PMU_OTHER(x) ___I915_PMU_OTHER(0, x) #define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0) #define I915_PMU_REQUESTED_FREQUENCY __I915_PMU_OTHER(1) @@ -290,6 +300,12 @@ enum drm_i915_pmu_engine_sample { #define I915_PMU_LAST /* Deprecated - do not use */ I915_PMU_RC6_RESIDENCY +#define __I915_PMU_ACTUAL_FREQUENCY(gt) ___I915_PMU_OTHER(gt, 0) +#define __I915_PMU_REQUESTED_FREQUENCY(gt) ___I915_PMU_OTHER(gt, 1) +#define __I915_PMU_INTERRUPTS(gt) ___I915_PMU_OTHER(gt, 2) +#define __I915_PMU_RC6_RESIDENCY(gt) ___I915_PMU_OTHER(gt, 3) +#define __I915_PMU_SOFTWARE_GT_AWAKE_TIME(gt) ___I915_PMU_OTHER(gt, 4) + /* Each region is a minimum of 16k, and there are at most 255 of them. 
*/ #define I915_NR_TEX_REGIONS 255 /* table size 2k - maximum due to use From patchwork Thu Mar 30 00:41:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13193326 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0D087C74A5B for ; Thu, 30 Mar 2023 00:41:13 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C8FDE10ECAD; Thu, 30 Mar 2023 00:41:11 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5AC2610EC94 for ; Thu, 30 Mar 2023 00:41:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680136865; x=1711672865; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=CSfvlI0W8vv88LSEsjxCJpMW52kc8GUXo574i7RZjB8=; b=ksqQ6QWkWD5r1b7OFNzP7DlWTlUbwcYhZTwYxX6miT0sgUrJKifx37Te v1Q7qddKZFLffHzs81lfcgUmxF3RKNGbfrMGeDGKePzdkTeG9mfYjJcJC 3mPDQZVWNdfTbcLVy5UjYlK6osJf3/Qp+MEamhCRKo3Ie+Sn9HlJW24gU dCY/ugJgwkaDXyRUp2YQC7KEC3fq/9AYWAlIFyVbnu8GlRIymmRUDwsnW LA13X8MExFKBjf1OpgCFaW2DR352l+MmL8eLcKCZQMSS3LkeNAR6HjYlL MfuTwCwe7ih+qD37fc9G50S5AfuGC8lcVxV94jXot/ZINibnIZaYCTz2z Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="427310377" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="427310377" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634668681" 
X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="634668681" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:03 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Mar 2023 17:41:00 -0700 Message-Id: <20230330004103.1295413-7-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> References: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 6/9] drm/i915/pmu: Export counters from all tiles X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Tvrtko Ursulin Start exporting frequency and RC6 counters from all tiles. Existing counters keep their names and config values, and new ones use the namespace added in the previous patch, with a "-gtN" suffix appended to their names. The interrupts counter is the odd one out: because it is a global device counter (not specific to a GT), we choose not to add per-tile versions for now. Signed-off-by: Tvrtko Ursulin Signed-off-by: Aravind Iddamsetty --- drivers/gpu/drm/i915/i915_pmu.c | 96 ++++++++++++++++++++++++++------- 1 file changed, 77 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 5d1de98d86b4..2a5deabff088 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -548,8 +548,9 @@ config_status(struct drm_i915_private *i915, u64 config) struct intel_gt *gt = to_gt(i915); unsigned int gt_id = config_gt_id(config); + unsigned int max_gt_id = HAS_EXTRA_GT_LIST(i915) ?
1 : 0; - if (gt_id) + if (gt_id > max_gt_id) return -ENOENT; switch (config_counter(config)) { @@ -563,6 +564,8 @@ config_status(struct drm_i915_private *i915, u64 config) return -ENODEV; break; case I915_PMU_INTERRUPTS: + if (gt_id) + return -ENOENT; break; case I915_PMU_RC6_RESIDENCY: if (!gt->rc6.supported) @@ -932,9 +935,9 @@ static const struct attribute_group i915_pmu_cpumask_attr_group = { .attrs = i915_cpumask_attrs, }; -#define __event(__config, __name, __unit) \ +#define __event(__counter, __name, __unit) \ { \ - .config = (__config), \ + .counter = (__counter), \ .name = (__name), \ .unit = (__unit), \ } @@ -975,15 +978,21 @@ create_event_attributes(struct i915_pmu *pmu) { struct drm_i915_private *i915 = container_of(pmu, typeof(*i915), pmu); static const struct { - u64 config; + unsigned int counter; const char *name; const char *unit; } events[] = { - __event(I915_PMU_ACTUAL_FREQUENCY, "actual-frequency", "M"), - __event(I915_PMU_REQUESTED_FREQUENCY, "requested-frequency", "M"), - __event(I915_PMU_INTERRUPTS, "interrupts", NULL), - __event(I915_PMU_RC6_RESIDENCY, "rc6-residency", "ns"), - __event(I915_PMU_SOFTWARE_GT_AWAKE_TIME, "software-gt-awake-time", "ns"), + __event(0, "actual-frequency", "M"), + __event(1, "requested-frequency", "M"), + __event(3, "rc6-residency", "ns"), + __event(4, "software-gt-awake-time", "ns"), + }; + static const struct { + unsigned int counter; + const char *name; + const char *unit; + } global_events[] = { + __event(2, "interrupts", NULL), }; static const struct { enum drm_i915_pmu_engine_sample sample; @@ -998,14 +1007,29 @@ create_event_attributes(struct i915_pmu *pmu) struct i915_ext_attribute *i915_attr = NULL, *i915_iter; struct attribute **attr = NULL, **attr_iter; struct intel_engine_cs *engine; - unsigned int i; + struct intel_gt *gt; + unsigned int i, j; /* Count how many counters we will be exposing. 
*/ - for (i = 0; i < ARRAY_SIZE(events); i++) { - if (!config_status(i915, events[i].config)) + /* per gt counters */ + for_each_gt(gt, i915, j) { + for (i = 0; i < ARRAY_SIZE(events); i++) { + u64 config = ___I915_PMU_OTHER(j, events[i].counter); + + if (!config_status(i915, config)) + count++; + } + } + + /* global (per GPU) counters */ + for (i = 0; i < ARRAY_SIZE(global_events); i++) { + u64 config = ___I915_PMU_OTHER(0, global_events[i].counter); + + if (!config_status(i915, config)) count++; } + /* per engine counters */ for_each_uabi_engine(engine, i915) { for (i = 0; i < ARRAY_SIZE(engine_events); i++) { if (!engine_event_status(engine, @@ -1033,26 +1057,60 @@ create_event_attributes(struct i915_pmu *pmu) attr_iter = attr; /* Initialize supported non-engine counters. */ - for (i = 0; i < ARRAY_SIZE(events); i++) { + /* per gt counters */ + for_each_gt(gt, i915, j) { + for (i = 0; i < ARRAY_SIZE(events); i++) { + u64 config = ___I915_PMU_OTHER(j, events[i].counter); + char *str; + + if (config_status(i915, config)) + continue; + + str = kasprintf(GFP_KERNEL, "%s-gt%u", + events[i].name, j); + if (!str) + goto err; + + *attr_iter++ = &i915_iter->attr.attr; + i915_iter = add_i915_attr(i915_iter, str, config); + + if (events[i].unit) { + str = kasprintf(GFP_KERNEL, "%s-gt%u.unit", + events[i].name, j); + if (!str) + goto err; + + *attr_iter++ = &pmu_iter->attr.attr; + pmu_iter = add_pmu_attr(pmu_iter, str, + events[i].unit); + } + } + } + + /* global (per GPU) counters */ + for (i = 0; i < ARRAY_SIZE(global_events); i++) { + u64 config = ___I915_PMU_OTHER(0, global_events[i].counter); char *str; - if (config_status(i915, events[i].config)) + if (config_status(i915, config)) continue; - str = kstrdup(events[i].name, GFP_KERNEL); + str = kstrdup(global_events[i].name, GFP_KERNEL); if (!str) goto err; *attr_iter++ = &i915_iter->attr.attr; - i915_iter = add_i915_attr(i915_iter, str, events[i].config); + i915_iter = add_i915_attr(i915_iter, str, config); - if 
(events[i].unit) { - str = kasprintf(GFP_KERNEL, "%s.unit", events[i].name); + if (global_events[i].unit) { + str = kasprintf(GFP_KERNEL, "%s.unit", + global_events[i].name); if (!str) goto err; *attr_iter++ = &pmu_iter->attr.attr; - pmu_iter = add_pmu_attr(pmu_iter, str, events[i].unit); + pmu_iter = add_pmu_attr(pmu_iter, str, + global_events[i].unit); } } From patchwork Thu Mar 30 00:41:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13193332 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B4BCFC761A6 for ; Thu, 30 Mar 2023 00:41:33 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D966510ECB3; Thu, 30 Mar 2023 00:41:32 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id D3FBC10EC94 for ; Thu, 30 Mar 2023 00:41:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680136864; x=1711672864; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=kOlt+mcNeawYIB7VH8Yu5/GuIe1Ecre/uDKHAcmAvLY=; b=BhTa/kCKGIES8/BHY4j2I6LuA7cXHDJoSxyXBeDtNkEwjw5SsmrZ+yFe qA61QTt9ht70n+hhU480S01Szho0J2HPOHWLLP0KF0a/QvTxd1OZ3dF2o W+v2OkbSvtjta4LOLSYfEaSBjGGC4Asto2QQbybzrIV72tEePnKrszM2y aRHPFuXPyXAoXeaI04J2YZM2ffkS7NwVxrU3FKnvYGgGK1eUD9ap8ASy+ 3aNL35p+R17uunawexdtDQij2MlLpTTgdMBQQ+TWTssMmqnXLKA7G/ER3 32AqlF9u6MXHPwyDijrZ/y74h2MFBCVeYxqGO9k3jrHB3PIgm2+uTT0L+ g==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="427310372" 
X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="427310372" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634668684" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="634668684" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:03 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Mar 2023 17:41:01 -0700 Message-Id: <20230330004103.1295413-8-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> References: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 7/9] drm/i915/pmu: Use a helper to convert to MHz X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Use a helper to convert frequency values to MHz. 
Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_pmu.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 2a5deabff088..40ce1dc00067 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -636,6 +636,11 @@ static int i915_pmu_event_init(struct perf_event *event) return 0; } +static u64 read_sample_us(struct i915_pmu *pmu, unsigned int gt_id, int sample) +{ + return div_u64(read_sample(pmu, gt_id, sample), USEC_PER_SEC); +} + static u64 __i915_pmu_event_read(struct perf_event *event) { struct drm_i915_private *i915 = @@ -668,16 +673,10 @@ static u64 __i915_pmu_event_read(struct perf_event *event) switch (config) { case I915_PMU_ACTUAL_FREQUENCY: - val = - div_u64(read_sample(pmu, gt_id, - __I915_SAMPLE_FREQ_ACT), - USEC_PER_SEC /* to MHz */); + val = read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_ACT); break; case I915_PMU_REQUESTED_FREQUENCY: - val = - div_u64(read_sample(pmu, gt_id, - __I915_SAMPLE_FREQ_REQ), - USEC_PER_SEC /* to MHz */); + val = read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_REQ); break; case I915_PMU_INTERRUPTS: val = READ_ONCE(pmu->irq_count); From patchwork Thu Mar 30 00:41:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13193329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AC52AC761A6 for ; Thu, 30 Mar 2023 00:41:14 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3014D10ECCD; Thu, 30 Mar 
2023 00:41:13 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1376310EC94 for ; Thu, 30 Mar 2023 00:41:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680136865; x=1711672865; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TplmR7BiFldWX+J5NStJ5yy09BcNKireEkoAw1wEEsI=; b=FDgViSGeTlvEZ+JZB0PmPEz4ur7LSW4JvjrpKf1T+5WYiDKLNjVesPBZ MdwnQddinhcVOuU/TLmfc9rrK1RTD48iHu6E7aPzElOWJiiFpJ4qVhVYL 4InwqdXTtYkp/DnzTbD9vCg0kHN0RK438Rj+1313Y1KrmNwNXH66CyUMs pFr1Kf1eVy/jUv8ma4/lTxv2G4WgmqqulmneBKJtu4D7IJ8ysO9c47Qat KR/MewYvS6i1J2QStQtvpVTC4qnCHv0btOWznRPb92M/VGQ6ab5HkQ4gy JRRxboFIxON6vxqeR+Mx700qU7Lr4zGUSEqCU8BGqnpCjYjfxBAlIOoaW g==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="427310375" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="427310375" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:04 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634668687" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="634668687" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:03 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Mar 2023 17:41:02 -0700 Message-Id: <20230330004103.1295413-9-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> References: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 8/9] drm/i915/pmu: Split reading engine and other events into helpers X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics 
driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Split the event reading function into engine and other helpers. Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_pmu.c | 93 ++++++++++++++++++--------------- 1 file changed, 52 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 40ce1dc00067..9bd9605d2662 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -641,58 +641,69 @@ static u64 read_sample_us(struct i915_pmu *pmu, unsigned int gt_id, int sample) return div_u64(read_sample(pmu, gt_id, sample), USEC_PER_SEC); } -static u64 __i915_pmu_event_read(struct perf_event *event) +static u64 __i915_pmu_event_read_engine(struct perf_event *event) { - struct drm_i915_private *i915 = - container_of(event->pmu, typeof(*i915), pmu.base); - struct i915_pmu *pmu = &i915->pmu; + struct drm_i915_private *i915 = container_of(event->pmu, typeof(*i915), pmu.base); + u8 sample = engine_event_sample(event); + struct intel_engine_cs *engine; u64 val = 0; - if (is_engine_event(event)) { - u8 sample = engine_event_sample(event); - struct intel_engine_cs *engine; - - engine = intel_engine_lookup_user(i915, - engine_event_class(event), - engine_event_instance(event)); + engine = intel_engine_lookup_user(i915, + engine_event_class(event), + engine_event_instance(event)); - if (drm_WARN_ON_ONCE(&i915->drm, !engine)) { - /* Do nothing */ - } else if (sample == I915_SAMPLE_BUSY && - intel_engine_supports_stats(engine)) { - ktime_t unused; + if (drm_WARN_ON_ONCE(&i915->drm, !engine)) { + /* Do nothing */ + } else if (sample == I915_SAMPLE_BUSY && + intel_engine_supports_stats(engine)) { + ktime_t unused; - val = ktime_to_ns(intel_engine_get_busy_time(engine, - &unused)); - } else { - val = engine->pmu.sample[sample].cur; 
- } + val = ktime_to_ns(intel_engine_get_busy_time(engine, + &unused)); } else { - const unsigned int gt_id = config_gt_id(event->attr.config); - const u64 config = config_counter(event->attr.config); - - switch (config) { - case I915_PMU_ACTUAL_FREQUENCY: - val = read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_ACT); - break; - case I915_PMU_REQUESTED_FREQUENCY: - val = read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_REQ); - break; - case I915_PMU_INTERRUPTS: - val = READ_ONCE(pmu->irq_count); - break; - case I915_PMU_RC6_RESIDENCY: - val = get_rc6(i915->gt[gt_id]); - break; - case I915_PMU_SOFTWARE_GT_AWAKE_TIME: - val = ktime_to_ns(intel_gt_get_awake_time(to_gt(i915))); - break; - } + val = engine->pmu.sample[sample].cur; } return val; } +static u64 __i915_pmu_event_read_other(struct perf_event *event) +{ + struct drm_i915_private *i915 = container_of(event->pmu, typeof(*i915), pmu.base); + const unsigned int gt_id = config_gt_id(event->attr.config); + const u64 config = config_counter(event->attr.config); + struct i915_pmu *pmu = &i915->pmu; + u64 val = 0; + + switch (config) { + case I915_PMU_ACTUAL_FREQUENCY: + val = read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_ACT); + break; + case I915_PMU_REQUESTED_FREQUENCY: + val = read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_REQ); + break; + case I915_PMU_INTERRUPTS: + val = READ_ONCE(pmu->irq_count); + break; + case I915_PMU_RC6_RESIDENCY: + val = get_rc6(i915->gt[gt_id]); + break; + case I915_PMU_SOFTWARE_GT_AWAKE_TIME: + val = ktime_to_ns(intel_gt_get_awake_time(to_gt(i915))); + break; + } + + return val; +} + +static u64 __i915_pmu_event_read(struct perf_event *event) +{ + if (is_engine_event(event)) + return __i915_pmu_event_read_engine(event); + else + return __i915_pmu_event_read_other(event); +} + static void i915_pmu_event_read(struct perf_event *event) { struct drm_i915_private *i915 = From patchwork Thu Mar 30 00:41:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 
7bit X-Patchwork-Submitter: Umesh Nerlige Ramappa X-Patchwork-Id: 13193327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 55374C6FD18 for ; Thu, 30 Mar 2023 00:41:12 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9F01F10ECAC; Thu, 30 Mar 2023 00:41:11 +0000 (UTC) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by gabe.freedesktop.org (Postfix) with ESMTPS id 73DA910ECA2 for ; Thu, 30 Mar 2023 00:41:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680136865; x=1711672865; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UjbXSdzSG9mpHdVpUCXOsCof6Z27giUadqbQbdH4IR0=; b=JyiVbEU3WnjuTKdC/7dJJTNS8E2qXJyXN5k+H93h4pEOTwXS/nYXTqrr 0RWrRzaPgiPFuwyBSuFDJ0iQxvy+d8/4b2B7KalIKk5lCaDUy51iteqXH f8p01iSFvIbXcG9jcc8uRSfA0r/WfcKh6RZcLpBMjLP4Yyxrf5JbOGT4Q 1gxkHUlqBhp8sukcWEUJ493aC2VEJUZ6ZBWLnNY4/tOiSbN1Pfe3sobfn PolUaHkNerAPzqw/1RHqkaeAzCU9Z7nV6nlMQ4qscwkQZyleCqjXjmSTL PKNYB7ORZ4TNt/fAnrOrhdBN1POEUtSE+CeICqvc5jZ2jWN8kOAGzQ4wm g==; X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="427310378" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="427310378" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:05 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="634668690" X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="634668690" Received: from orsosgc001.jf.intel.com ([10.165.21.138]) by orsmga003-auth.jf.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Mar 2023 17:41:03 -0700 From: Umesh Nerlige Ramappa To: intel-gfx@lists.freedesktop.org Date: Wed, 29 Mar 2023 17:41:03 -0700 Message-Id: <20230330004103.1295413-10-umesh.nerlige.ramappa@intel.com> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> References: <20230330004103.1295413-1-umesh.nerlige.ramappa@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 9/9] drm/i915/pmu: Enable legacy PMU events for MTL X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" MTL introduces separate GTs for render and media. This complicates the definition of the frequency and RC6 counters for the GPU as a whole, since each GT has its own independent counter. The cleanest way to support this change would be to deprecate the GPU-level counters and create GT-specific counters, but that breaks ABI. Since perf tools and scripts are decentralized and probably have many users, it is hard to deprecate the legacy counters and get all users on board with that. Re-introduce the legacy counters and support them as the min/max of the GT-specific counters, as necessary, to ensure backwards compatibility. I915_PMU_ACTUAL_FREQUENCY - will show max of GT-specific counters I915_PMU_REQUESTED_FREQUENCY - will show max of GT-specific counters I915_PMU_INTERRUPTS - no changes since it is GPU specific on all platforms I915_PMU_RC6_RESIDENCY - will show min of GT-specific counters I915_PMU_SOFTWARE_GT_AWAKE_TIME - will show max of GT-specific counters Note: - For deeper debugging of performance issues, tools must be upgraded to read the GT-specific counters.
- This patch deserves to be separate from the other PMU features so that it can be easily dropped if legacy events are ever deprecated. - The internal implementation relies on creating an extra entry in the arrays used for GT-specific counters. Index 0 is empty. Indices 1 through N are mapped to GTs 0 through N - 1. - The user interface will use GT numbers indexed from 0 to specify the GT of interest. Signed-off-by: Umesh Nerlige Ramappa --- drivers/gpu/drm/i915/i915_pmu.c | 134 +++++++++++++++++++++++++++----- drivers/gpu/drm/i915/i915_pmu.h | 2 +- include/uapi/drm/i915_drm.h | 14 ++-- 3 files changed, 125 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 9bd9605d2662..0dc7711c3b4b 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -221,7 +221,7 @@ add_sample_mult(struct i915_pmu *pmu, unsigned int gt_id, int sample, u32 val, static u64 get_rc6(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; - const unsigned int gt_id = gt->info.id; + const unsigned int gt_id = gt->info.id + 1; struct i915_pmu *pmu = &i915->pmu; unsigned long flags; bool awake = false; @@ -267,24 +267,26 @@ static void init_rc6(struct i915_pmu *pmu) for_each_gt(gt, i915, i) { intel_wakeref_t wakeref; + const unsigned int gt_id = i + 1; with_intel_runtime_pm(gt->uncore->rpm, wakeref) { u64 val = __get_rc6(gt); - store_sample(pmu, i, __I915_SAMPLE_RC6, val); - store_sample(pmu, i, __I915_SAMPLE_RC6_LAST_REPORTED, + store_sample(pmu, gt_id, __I915_SAMPLE_RC6, val); + store_sample(pmu, gt_id, __I915_SAMPLE_RC6_LAST_REPORTED, val); - pmu->sleep_last[i] = ktime_get_raw(); + pmu->sleep_last[gt_id] = ktime_get_raw(); } } } static void park_rc6(struct intel_gt *gt) { + const unsigned int gt_id = gt->info.id + 1; struct i915_pmu *pmu = &gt->i915->pmu; - store_sample(pmu, gt->info.id, __I915_SAMPLE_RC6, __get_rc6(gt)); - pmu->sleep_last[gt->info.id] = ktime_get_raw(); + store_sample(pmu, gt_id,
__I915_SAMPLE_RC6, __get_rc6(gt)); + pmu->sleep_last[gt_id] = ktime_get_raw(); } static void __i915_pmu_maybe_start_timer(struct i915_pmu *pmu) @@ -436,18 +438,18 @@ static void frequency_sample(struct intel_gt *gt, unsigned int period_ns) { struct drm_i915_private *i915 = gt->i915; - const unsigned int gt_id = gt->info.id; + const unsigned int gt_id = gt->info.id + 1; struct i915_pmu *pmu = &i915->pmu; struct intel_rps *rps = &gt->rps; - if (!frequency_sampling_enabled(pmu, gt_id)) + if (!frequency_sampling_enabled(pmu, gt->info.id)) return; /* Report 0/0 (actual/requested) frequency while parked. */ if (!intel_gt_pm_get_if_awake(gt)) return; - if (pmu->enable & config_mask(__I915_PMU_ACTUAL_FREQUENCY(gt_id))) { + if (pmu->enable & config_mask(__I915_PMU_ACTUAL_FREQUENCY(gt->info.id))) { u32 val; /* @@ -467,7 +469,7 @@ frequency_sample(struct intel_gt *gt, unsigned int period_ns) val, period_ns / 1000); } - if (pmu->enable & config_mask(__I915_PMU_REQUESTED_FREQUENCY(gt_id))) { + if (pmu->enable & config_mask(__I915_PMU_REQUESTED_FREQUENCY(gt->info.id))) { add_sample_mult(pmu, gt_id, __I915_SAMPLE_FREQ_REQ, intel_rps_get_requested_frequency(rps), period_ns / 1000); @@ -545,14 +547,15 @@ engine_event_status(struct intel_engine_cs *engine, static int config_status(struct drm_i915_private *i915, u64 config) { - struct intel_gt *gt = to_gt(i915); - unsigned int gt_id = config_gt_id(config); - unsigned int max_gt_id = HAS_EXTRA_GT_LIST(i915) ? 2 : 1; + unsigned int max_gt_id = HAS_EXTRA_GT_LIST(i915) ? 2 : 1; + struct intel_gt *gt; if (gt_id > max_gt_id) return -ENOENT; + gt = !gt_id ?
to_gt(i915) : i915->gt[gt_id - 1]; + switch (config_counter(config)) { case I915_PMU_ACTUAL_FREQUENCY: if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) @@ -673,23 +676,58 @@ static u64 __i915_pmu_event_read_other(struct perf_event *event) const unsigned int gt_id = config_gt_id(event->attr.config); const u64 config = config_counter(event->attr.config); struct i915_pmu *pmu = &i915->pmu; + struct intel_gt *gt; u64 val = 0; + int i; switch (config) { case I915_PMU_ACTUAL_FREQUENCY: - val = read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_ACT); + if (gt_id) + return read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_ACT); + + if (!HAS_EXTRA_GT_LIST(i915)) + return read_sample_us(pmu, 1, __I915_SAMPLE_FREQ_ACT); + + for_each_gt(gt, i915, i) + val = max(val, read_sample_us(pmu, i + 1, __I915_SAMPLE_FREQ_ACT)); + break; case I915_PMU_REQUESTED_FREQUENCY: - val = read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_REQ); + if (gt_id) + return read_sample_us(pmu, gt_id, __I915_SAMPLE_FREQ_REQ); + + if (!HAS_EXTRA_GT_LIST(i915)) + return read_sample_us(pmu, 1, __I915_SAMPLE_FREQ_REQ); + + for_each_gt(gt, i915, i) + val = max(val, read_sample_us(pmu, i + 1, __I915_SAMPLE_FREQ_REQ)); + break; case I915_PMU_INTERRUPTS: val = READ_ONCE(pmu->irq_count); break; case I915_PMU_RC6_RESIDENCY: - val = get_rc6(i915->gt[gt_id]); + if (gt_id) + return get_rc6(i915->gt[gt_id - 1]); + + if (!HAS_EXTRA_GT_LIST(i915)) + return get_rc6(i915->gt[0]); + + val = U64_MAX; + for_each_gt(gt, i915, i) + val = min(val, get_rc6(gt)); + break; case I915_PMU_SOFTWARE_GT_AWAKE_TIME: - val = ktime_to_ns(intel_gt_get_awake_time(to_gt(i915))); + if (gt_id) + return ktime_to_ns(intel_gt_get_awake_time(i915->gt[gt_id - 1])); + + if (!HAS_EXTRA_GT_LIST(i915)) + return ktime_to_ns(intel_gt_get_awake_time(i915->gt[0])); + + val = 0; + for_each_gt(gt, i915, i) + val = max((s64)val, ktime_to_ns(intel_gt_get_awake_time(gt))); break; } @@ -728,11 +766,14 @@ static void i915_pmu_event_read(struct perf_event *event) static void 
i915_pmu_enable(struct perf_event *event) { + const unsigned int gt_id = config_gt_id(event->attr.config); struct drm_i915_private *i915 = container_of(event->pmu, typeof(*i915), pmu.base); struct i915_pmu *pmu = &i915->pmu; + struct intel_gt *gt; unsigned long flags; unsigned int bit; + u64 i; bit = event_bit(event); if (bit == -1) @@ -745,12 +786,42 @@ static void i915_pmu_enable(struct perf_event *event) * the event reference counter. */ BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS); + BUILD_BUG_ON(BITS_PER_TYPE(pmu->enable) < I915_PMU_MASK_BITS); GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count)); GEM_BUG_ON(pmu->enable_count[bit] == ~0); pmu->enable |= BIT_ULL(bit); pmu->enable_count[bit]++; + /* + * The arrays that i915_pmu maintains are now indexed as + * + * 0 - aggregate events (a.k.a !gt_id) + * 1 - gt0 + * 2 - gt1 + * + * The same logic applies to event_bit masks. The first set of mask are + * for aggregate, followed by gt0 and gt1 masks. The idea here is to + * enable the event on all gts if the aggregate event bit is set. This + * applies only to the non-engine-events. + */ + if (!gt_id && !is_engine_event(event)) { + for_each_gt(gt, i915, i) { + u64 counter = config_counter(event->attr.config); + u64 config = ((i + 1) << __I915_PMU_GT_SHIFT) | counter; + unsigned int bit = config_bit(config); + + if (bit == -1) + continue; + + GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count)); + GEM_BUG_ON(pmu->enable_count[bit] == ~0); + + pmu->enable |= BIT_ULL(bit); + pmu->enable_count[bit]++; + } + } + /* * Start the sampling timer if needed and not already enabled. 
*/ @@ -793,6 +864,7 @@ static void i915_pmu_enable(struct perf_event *event) static void i915_pmu_disable(struct perf_event *event) { + const unsigned int gt_id = config_gt_id(event->attr.config); struct drm_i915_private *i915 = container_of(event->pmu, typeof(*i915), pmu.base); unsigned int bit = event_bit(event); @@ -822,6 +894,26 @@ static void i915_pmu_disable(struct perf_event *event) */ if (--engine->pmu.enable_count[sample] == 0) engine->pmu.enable &= ~BIT(sample); + } else if (!gt_id) { + struct intel_gt *gt; + u64 i; + + for_each_gt(gt, i915, i) { + u64 counter = config_counter(event->attr.config); + u64 config = ((i + 1) << __I915_PMU_GT_SHIFT) | counter; + unsigned int bit = config_bit(config); + + if (bit == -1) + continue; + + GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count)); + GEM_BUG_ON(pmu->enable_count[bit] == 0); + + if (--pmu->enable_count[bit] == 0) { + pmu->enable &= ~BIT_ULL(bit); + pmu->timer_enabled &= pmu_needs_timer(pmu, true); + } + } } GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count)); @@ -1002,7 +1094,11 @@ create_event_attributes(struct i915_pmu *pmu) const char *name; const char *unit; } global_events[] = { + __event(0, "actual-frequency", "M"), + __event(1, "requested-frequency", "M"), __event(2, "interrupts", NULL), + __event(3, "rc6-residency", "ns"), + __event(4, "software-gt-awake-time", "ns"), }; static const struct { enum drm_i915_pmu_engine_sample sample; @@ -1024,7 +1120,7 @@ create_event_attributes(struct i915_pmu *pmu) /* per gt counters */ for_each_gt(gt, i915, j) { for (i = 0; i < ARRAY_SIZE(events); i++) { - u64 config = ___I915_PMU_OTHER(j, events[i].counter); + u64 config = ___I915_PMU_OTHER(j + 1, events[i].counter); if (!config_status(i915, config)) count++; @@ -1070,7 +1166,7 @@ create_event_attributes(struct i915_pmu *pmu) /* per gt counters */ for_each_gt(gt, i915, j) { for (i = 0; i < ARRAY_SIZE(events); i++) { - u64 config = ___I915_PMU_OTHER(j, events[i].counter); + u64 config = ___I915_PMU_OTHER(j + 1, 
events[i].counter); char *str; if (config_status(i915, config)) diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h index a708e44a227e..a4cc1eb218fc 100644 --- a/drivers/gpu/drm/i915/i915_pmu.h +++ b/drivers/gpu/drm/i915/i915_pmu.h @@ -38,7 +38,7 @@ enum { __I915_NUM_PMU_SAMPLERS }; -#define I915_PMU_MAX_GTS (4) /* FIXME */ +#define I915_PMU_MAX_GTS (4 + 1) /* FIXME */ /** * How many different events we track in the global PMU mask. diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index bbab7f3dbeb4..18794c30027f 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -290,6 +290,7 @@ enum drm_i915_pmu_engine_sample { (((__u64)__I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 + (x)) | \ ((__u64)(gt) << __I915_PMU_GT_SHIFT)) +/* Aggregate from all gts */ #define __I915_PMU_OTHER(x) ___I915_PMU_OTHER(0, x) #define I915_PMU_ACTUAL_FREQUENCY __I915_PMU_OTHER(0) @@ -300,11 +301,14 @@ enum drm_i915_pmu_engine_sample { #define I915_PMU_LAST /* Deprecated - do not use */ I915_PMU_RC6_RESIDENCY -#define __I915_PMU_ACTUAL_FREQUENCY(gt) ___I915_PMU_OTHER(gt, 0) -#define __I915_PMU_REQUESTED_FREQUENCY(gt) ___I915_PMU_OTHER(gt, 1) -#define __I915_PMU_INTERRUPTS(gt) ___I915_PMU_OTHER(gt, 2) -#define __I915_PMU_RC6_RESIDENCY(gt) ___I915_PMU_OTHER(gt, 3) -#define __I915_PMU_SOFTWARE_GT_AWAKE_TIME(gt) ___I915_PMU_OTHER(gt, 4) +/* GT specific counters */ +#define ____I915_PMU_OTHER(gt, x) ___I915_PMU_OTHER(((gt) + 1), x) + +#define __I915_PMU_ACTUAL_FREQUENCY(gt) ____I915_PMU_OTHER(gt, 0) +#define __I915_PMU_REQUESTED_FREQUENCY(gt) ____I915_PMU_OTHER(gt, 1) +#define __I915_PMU_INTERRUPTS(gt) ____I915_PMU_OTHER(gt, 2) +#define __I915_PMU_RC6_RESIDENCY(gt) ____I915_PMU_OTHER(gt, 3) +#define __I915_PMU_SOFTWARE_GT_AWAKE_TIME(gt) ____I915_PMU_OTHER(gt, 4) /* Each region is a minimum of 16k, and there are at most 255 of them. */
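
For readers following the config encoding across the series, here is a minimal user-space sketch of how a non-engine PMU config value appears to be composed once patch 9/9 is applied. It is not code from the series. The helper names (pmu_other_config, pmu_gt_config, pmu_config_slot, pmu_config_counter, aggregate_max, aggregate_min) are hypothetical, and the base value 0x100000 is an assumption derived from __I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1 in the existing uapi header, which this series does not show. The shift of 60 is taken from __I915_PMU_GT_SHIFT, which the series itself marks as subject to change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors __I915_PMU_GT_SHIFT from the series (a FIXME says it may move to 56). */
#define PMU_GT_SHIFT 60
/* Assumed value of __I915_PMU_ENGINE(0xff, 0xff, 0xf) + 1; not shown in the series. */
#define PMU_OTHER_BASE 0x100000ULL

/* Slot 0 is the legacy aggregate counter; slot N + 1 addresses GT N. */
static uint64_t pmu_other_config(unsigned int slot, unsigned int counter)
{
	assert(slot < (1u << (64 - PMU_GT_SHIFT)));
	return (PMU_OTHER_BASE + counter) | ((uint64_t)slot << PMU_GT_SHIFT);
}

/* Equivalent in spirit to ____I915_PMU_OTHER(gt, x): GT ids are 0-based for users. */
static uint64_t pmu_gt_config(unsigned int gt, unsigned int counter)
{
	return pmu_other_config(gt + 1, counter);
}

/* Hypothetical decoders; the kernel's config_gt_id()/config_counter() are not shown. */
static unsigned int pmu_config_slot(uint64_t config)
{
	return (unsigned int)(config >> PMU_GT_SHIFT);
}

static uint64_t pmu_config_counter(uint64_t config)
{
	return config & ((1ULL << PMU_GT_SHIFT) - 1);
}

/* Legacy frequency and awake-time counters report the max over all GTs. */
static uint64_t aggregate_max(const uint64_t *vals, size_t n)
{
	uint64_t val = 0;

	for (size_t i = 0; i < n; i++)
		if (vals[i] > val)
			val = vals[i];
	return val;
}

/* Legacy RC6 residency reports the min over all GTs. */
static uint64_t aggregate_min(const uint64_t *vals, size_t n)
{
	uint64_t val = UINT64_MAX;

	for (size_t i = 0; i < n; i++)
		if (vals[i] < val)
			val = vals[i];
	return val;
}
```

Under these assumptions, pmu_gt_config(0, 3) gives the config for RC6 residency on gt0, while pmu_other_config(0, 3) keeps the legacy value, so existing tools that open I915_PMU_RC6_RESIDENCY would continue to work and would read the minimum across GTs.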