From patchwork Wed Jul 15 08:46:59 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: sourab.gupta@intel.com
X-Patchwork-Id: 6794591
From: sourab.gupta@intel.com
To: intel-gfx@lists.freedesktop.org
Date: Wed, 15 Jul 2015 14:16:59 +0530
Message-Id: <1436950023-13940-5-git-send-email-sourab.gupta@intel.com>
X-Mailer: git-send-email 1.8.5.1
In-Reply-To: <1436950023-13940-1-git-send-email-sourab.gupta@intel.com>
References: <1436950023-13940-1-git-send-email-sourab.gupta@intel.com>
Cc: Insoo Woo, Peter Zijlstra, Jabin Wu, Sourab Gupta
Subject: [Intel-gfx] [RFC 4/8] drm/i915: Forward periodic and CS based OA reports sorted acc to timestamps
List-Id: Intel graphics driver community testing & development

From: Sourab Gupta

The periodic OA reports and the RCS (command streamer) based OA reports
are collected in two separate buffers. When forwarding to userspace, both
have to be sent to a single perf event ring buffer. From a userspace
perspective, it is preferable to have the reports in that single buffer
ordered by their timestamps.

This patch addresses the problem by first forwarding the periodic OA
reports whose timestamps are not later than that of the command streamer
based report, whenever a command streamer based report is forwarded.

Signed-off-by: Sourab Gupta
---
 drivers/gpu/drm/i915/i915_oa_perf.c | 38 ++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_oa_perf.c b/drivers/gpu/drm/i915/i915_oa_perf.c
index a4fdca3..491496b 100644
--- a/drivers/gpu/drm/i915/i915_oa_perf.c
+++ b/drivers/gpu/drm/i915/i915_oa_perf.c
@@ -48,8 +48,7 @@ static void forward_one_oa_snapshot_to_event(struct drm_i915_private *dev_priv,
 }
 
 static u32 forward_oa_snapshots(struct drm_i915_private *dev_priv,
-				u32 head,
-				u32 tail)
+				u32 head, u32 tail, u64 gpu_ts)
 {
 	struct perf_event *exclusive_event = dev_priv->oa_pmu.exclusive_event;
 	int snapshot_size = dev_priv->oa_pmu.oa_buffer.format_size;
@@ -58,14 +57,6 @@ static u32 forward_oa_snapshots(struct drm_i915_private *dev_priv,
 	u8 *snapshot;
 	u32 taken;
 
-	/*
-	 * Schedule a worker to forward the RCS based OA reports collected.
-	 * A worker is needed since it requires device mutex to be taken
-	 * which can't be done here because of atomic context
-	 */
-	if (dev_priv->oa_pmu.multiple_ctx_mode)
-		schedule_work(&dev_priv->oa_pmu.work_timer);
-
 	head -= dev_priv->oa_pmu.oa_buffer.gtt_offset;
 	tail -= dev_priv->oa_pmu.oa_buffer.gtt_offset;
 
@@ -75,12 +66,19 @@ static u32 forward_oa_snapshots(struct drm_i915_private *dev_priv,
 	 */
 
 	while ((taken = OA_TAKEN(tail, head))) {
+		u64 snapshot_ts;
+
 		/* The tail increases in 64 byte increments, not in
 		 * format_size steps. */
 		if (taken < snapshot_size)
 			break;
 
 		snapshot = oa_buf_base + (head & mask);
+
+		snapshot_ts = *(u64 *)(snapshot + 4);
+		if (snapshot_ts > gpu_ts)
+			break;
+
 		head += snapshot_size;
 
 		/* We currently only allow exclusive access to the counters
@@ -122,7 +120,7 @@ static void log_oa_status(struct drm_i915_private *dev_priv,
 }
 
 static void flush_oa_snapshots(struct drm_i915_private *dev_priv,
-			       bool skip_if_flushing)
+			       bool skip_if_flushing, u64 gpu_ts)
 {
 	unsigned long flags;
 	u32 oastatus2;
@@ -165,7 +163,7 @@ static void flush_oa_snapshots(struct drm_i915_private *dev_priv,
 					  GEN7_OASTATUS1_REPORT_LOST));
 	}
 
-	head = forward_oa_snapshots(dev_priv, head, tail);
+	head = forward_oa_snapshots(dev_priv, head, tail, gpu_ts);
 
 	I915_WRITE(GEN7_OASTATUS2, (head & GEN7_OASTATUS2_HEAD_MASK) |
 				    GEN7_OASTATUS2_GGTT);
@@ -215,6 +213,7 @@ static void forward_one_oa_rcs_sample(struct drm_i915_private *dev_priv,
 	u8 *snapshot;
 	struct drm_i915_oa_node_ctx_id *ctx_info;
 	struct perf_raw_record raw;
+	u64 snapshot_ts;
 
 	format_size = dev_priv->oa_pmu.oa_rcs_buffer.format_size;
 	snapshot_size = format_size + sizeof(*ctx_info);
@@ -223,6 +222,10 @@ static void forward_one_oa_rcs_sample(struct drm_i915_private *dev_priv,
 	ctx_info = (struct drm_i915_oa_node_ctx_id *)(snapshot + format_size);
 	ctx_info->ctx_id = node->ctx_id;
 
+	/* Flush the periodic snapshots till the ts of this OA report */
+	snapshot_ts = *(u64 *)(snapshot + 4);
+	flush_oa_snapshots(dev_priv, true, snapshot_ts);
+
 	perf_sample_data_init(&data, 0, event->hw.last_period);
 
 	/* Note: the combined u32 raw->size member + raw data itself must be 8
@@ -502,7 +505,10 @@ static enum hrtimer_restart hrtimer_sample(struct hrtimer *hrtimer)
 	struct drm_i915_private *i915 =
 		container_of(hrtimer, typeof(*i915), oa_pmu.timer);
 
-	flush_oa_snapshots(i915, true);
+	if (i915->oa_pmu.multiple_ctx_mode)
+		schedule_work(&i915->oa_pmu.work_timer);
+	else
+		flush_oa_snapshots(i915, true, U64_MAX);
 
 	hrtimer_forward_now(hrtimer, ns_to_ktime(PERIOD));
 	return HRTIMER_RESTART;
@@ -931,7 +937,9 @@ static void i915_oa_event_stop(struct perf_event *event, int flags)
 
 	if (event->attr.sample_period) {
 		hrtimer_cancel(&dev_priv->oa_pmu.timer);
-		flush_oa_snapshots(dev_priv, false);
+		if (dev_priv->oa_pmu.multiple_ctx_mode)
+			schedule_work(&dev_priv->oa_pmu.work_timer);
+		flush_oa_snapshots(dev_priv, false, U64_MAX);
 	}
 
 	event->hw.state = PERF_HES_STOPPED;
@@ -971,7 +979,7 @@ static int i915_oa_event_flush(struct perf_event *event)
 			if (ret)
 				return ret;
 		}
-		flush_oa_snapshots(i915, true);
+		flush_oa_snapshots(i915, true, U64_MAX);
 	}
 
 	return 0;
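
Note for readers: the ordering scheme in this patch can be illustrated
outside the driver with a small userspace sketch. It is not the i915 code.
struct report, read_report_ts(), forward_periodic_until() and
merge_reports() are hypothetical names introduced only for illustration,
and plain arrays stand in for the OA buffer and the RCS sample list. The
sketch assumes, as the patch does, that a report carries a 64-bit
timestamp starting at byte offset 4, and it uses UINT64_MAX where the
patch passes U64_MAX to flush everything that is left.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one OA report; only the timestamp matters. */
struct report {
	uint64_t ts;
	int from_cs;	/* 1: command streamer report, 0: periodic report */
};

/*
 * Read the 64-bit timestamp assumed to start at byte offset 4 of a raw
 * report. The patch uses a direct cast; memcpy() is used here only
 * because it is the portable way to do an unaligned read in userspace C.
 */
static uint64_t read_report_ts(const uint8_t *snapshot)
{
	uint64_t ts;

	assert(snapshot != NULL);
	memcpy(&ts, snapshot + 4, sizeof(ts));
	return ts;
}

/*
 * Forward periodic reports starting at 'head' while their timestamp does
 * not exceed 'limit' (the same cutoff as the "snapshot_ts > gpu_ts" break
 * in the patch). Returns the new head.
 */
static size_t forward_periodic_until(const struct report *periodic, size_t n,
				     size_t head, uint64_t limit,
				     struct report *out, size_t *out_len)
{
	while (head < n && periodic[head].ts <= limit)
		out[(*out_len)++] = periodic[head++];
	return head;
}

/*
 * Before each command streamer report is forwarded, flush the periodic
 * reports up to its timestamp. A final flush with UINT64_MAX drains the
 * remainder, as the stop and flush paths do. 'out' must have room for
 * n_periodic + n_cs entries. Returns the number of reports written.
 */
static size_t merge_reports(const struct report *periodic, size_t n_periodic,
			    const struct report *cs, size_t n_cs,
			    struct report *out)
{
	size_t head = 0, out_len = 0, i;

	for (i = 0; i < n_cs; i++) {
		head = forward_periodic_until(periodic, n_periodic, head,
					      cs[i].ts, out, &out_len);
		out[out_len++] = cs[i];
	}
	forward_periodic_until(periodic, n_periodic, head, UINT64_MAX,
			       out, &out_len);
	return out_len;
}
```

Both inputs are assumed to be sorted by timestamp already, which is what
makes a single forward pass sufficient. Note that in the patch itself the
timer path in multiple context mode only schedules the worker, so periodic
reports newer than the last command streamer report stay in the OA buffer
until the next such report, or until the event is stopped or flushed.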