From patchwork Thu Sep 7 10:06:12 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: sagar.a.kamble@intel.com
X-Patchwork-Id: 9942013
From: Sagar Arun Kamble
To: intel-gfx@lists.freedesktop.org
Date: Thu, 7 Sep 2017 15:36:12 +0530
Message-Id: <1504778774-18117-13-git-send-email-sagar.a.kamble@intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1504778774-18117-1-git-send-email-sagar.a.kamble@intel.com>
References: <1504778774-18117-1-git-send-email-sagar.a.kamble@intel.com>
Cc: Sourab Gupta
Subject: [Intel-gfx] [PATCH 12/14] drm/i915: Extract raw GPU timestamps from OA reports to forward in perf samples

From: Sourab Gupta

The OA reports contain the least significant 32 bits of the gpu timestamp.
This patch enables retrieval of the timestamp field from OA reports, to
forward as 64 bit raw gpu timestamps in the perf samples.

Signed-off-by: Sourab Gupta
Signed-off-by: Sagar Arun Kamble
---
 drivers/gpu/drm/i915/i915_drv.h  |  1 +
 drivers/gpu/drm/i915/i915_perf.c | 48 ++++++++++++++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_reg.h  |  4 ++++
 3 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2d5f20a..d9f12a5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2739,6 +2739,7 @@ struct drm_i915_private {
 			u32 ctx_flexeu0_offset;
 			u32 n_pending_periodic_samples;
 			u32 pending_periodic_ts;
+			u64 last_gpu_ts;
 
 			/**
 			 * The RPT_ID/reason field for Gen8+ includes a bit
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 8243246..3a72705 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1121,6 +1121,26 @@ static int append_perf_sample(struct i915_perf_stream *stream,
 }
 
 /**
+ * get_gpu_ts_from_oa_report - Retrieve absolute gpu timestamp from OA report
+ *
+ * Note: We are assuming that we're updating last_gpu_ts frequently enough so
+ * that it's never possible to see multiple overflows before we compare
+ * sample_ts to last_gpu_ts. Since this is significantly large duration
+ * (~6min for 80ns ts base), we can safely assume so.
+ */
+static u64 get_gpu_ts_from_oa_report(struct drm_i915_private *dev_priv,
+					const u8 *report)
+{
+	u32 sample_ts = *(u32 *)(report + 4);
+	u32 delta;
+
+	delta = sample_ts - (u32)dev_priv->perf.oa.last_gpu_ts;
+	dev_priv->perf.oa.last_gpu_ts += delta;
+
+	return dev_priv->perf.oa.last_gpu_ts;
+}
+
+/**
  * append_oa_buffer_sample - Copies single periodic OA report into userspace
  * read() buffer.
  * @stream: An i915-perf stream opened for OA metrics
@@ -1152,11 +1172,8 @@ static int append_oa_buffer_sample(struct i915_perf_stream *stream,
 	if (sample_flags & SAMPLE_TAG)
 		data.tag = stream->last_tag;
 
-	/* TODO: Derive timestamp from OA report,
-	 * after scaling with the ts base
-	 */
 	if (sample_flags & SAMPLE_TS)
-		data.ts = 0;
+		data.ts = get_gpu_ts_from_oa_report(dev_priv, report);
 
 	if (sample_flags & SAMPLE_OA_REPORT)
 		data.report = report;
@@ -1730,6 +1747,7 @@ static int append_cs_buffer_sample(struct i915_perf_stream *stream,
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 	struct i915_perf_sample_data data = { 0 };
 	u32 sample_flags = stream->sample_flags;
+	u64 gpu_ts = 0;
 	int ret = 0;
 
 	if (sample_flags & SAMPLE_OA_REPORT) {
@@ -1745,6 +1763,9 @@ static int append_cs_buffer_sample(struct i915_perf_stream *stream,
 						  sample_ts, U32_MAX);
 		if (ret)
 			return ret;
+
+		if (sample_flags & SAMPLE_TS)
+			gpu_ts = get_gpu_ts_from_oa_report(dev_priv, report);
 	}
 
 	if (sample_flags & SAMPLE_OA_SOURCE)
@@ -1783,16 +1804,13 @@ static int append_cs_buffer_sample(struct i915_perf_stream *stream,
 	}
 
 	if (sample_flags & SAMPLE_TS) {
-		/* For RCS, if OA samples are also being collected, derive the
-		 * timestamp from OA report, after scaling with the TS base.
+		/* If OA sampling is enabled, derive the ts from OA report.
 		 * Else, forward the timestamp collected via command stream.
 		 */
-		/* TODO: derive the timestamp from OA report */
-		if (sample_flags & SAMPLE_OA_REPORT)
-			data.ts = 0;
-		else
-			data.ts = *(u64 *) (stream->cs_buffer.vaddr +
+		if (!(sample_flags & SAMPLE_OA_REPORT))
+			gpu_ts = *(u64 *) (stream->cs_buffer.vaddr +
 					node->ts_offset);
+		data.ts = gpu_ts;
 	}
 
 	return append_perf_sample(stream, buf, count, offset, &data);
@@ -2959,9 +2977,15 @@ static void i915_perf_stream_enable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 
-	if (stream->sample_flags & SAMPLE_OA_REPORT)
+	if (stream->sample_flags & SAMPLE_OA_REPORT) {
 		dev_priv->perf.oa.ops.oa_enable(dev_priv);
 
+		if (stream->sample_flags & SAMPLE_TS)
+			dev_priv->perf.oa.last_gpu_ts =
+				I915_READ64_2x32(GT_TIMESTAMP_COUNT,
+					GT_TIMESTAMP_COUNT_UDW);
+	}
+
 	if (stream->cs_mode || dev_priv->perf.oa.periodic)
 		hrtimer_start(&dev_priv->perf.poll_check_timer,
 			      ns_to_ktime(POLL_PERIOD),
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a24d391..7958a15 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -730,6 +730,10 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define PS_DEPTH_COUNT		_MMIO(0x2350)
 #define PS_DEPTH_COUNT_UDW	_MMIO(0x2350 + 4)
 
+/* Timestamp count register */
+#define GT_TIMESTAMP_COUNT		_MMIO(0x2358)
+#define GT_TIMESTAMP_COUNT_UDW		_MMIO(0x2358 + 4)
+
 /* There are the 4 64-bit counter registers, one for each stream output */
 #define GEN7_SO_NUM_PRIMS_WRITTEN(n)		_MMIO(0x5200 + (n) * 8)
 #define GEN7_SO_NUM_PRIMS_WRITTEN_UDW(n)	_MMIO(0x5200 + (n) * 8 + 4)
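
For reference, the 32-bit to 64-bit extension done by
get_gpu_ts_from_oa_report() can be tried outside the kernel with the
rough sketch below. It is a standalone userspace illustration, not driver
code: struct ts_state and extend_gpu_ts() are hypothetical names standing
in for dev_priv->perf.oa.last_gpu_ts and the helper added by this patch.
The unsigned subtraction wraps modulo 2^32, so a single rollover of the
OA timestamp still yields the correct forward delta. With the 80ns
timestamp base mentioned in the comment, 2^32 ticks is about 343.6s
(roughly 5.7 minutes), which matches the "~6min" figure.

```c
#include <stdint.h>

/* Hypothetical stand-in for dev_priv->perf.oa.last_gpu_ts. */
struct ts_state {
	uint64_t last_gpu_ts;	/* last known full 64-bit GPU timestamp */
};

/*
 * Extend a 32-bit OA report timestamp to 64 bits.
 *
 * Assumes at most one 32-bit rollover has happened since last_gpu_ts was
 * last updated, and that samples arrive in increasing time order.
 */
static uint64_t extend_gpu_ts(struct ts_state *st, uint32_t sample_ts)
{
	/* Modulo-2^32 difference between the sample and the low 32 bits. */
	uint32_t delta = sample_ts - (uint32_t)st->last_gpu_ts;

	/* Advance the 64-bit value; a carry into the upper half is automatic. */
	st->last_gpu_ts += delta;

	return st->last_gpu_ts;
}
```

Seeding last_gpu_ts with 0x1FFFFFFF0 and feeding the samples 0xFFFFFFF8
and then 0x00000004 gives 0x1FFFFFFF8 followed by 0x200000004, i.e. the
rollover of the low word carries into the upper word.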