From patchwork Mon Jun 22 09:55:08 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: sourab.gupta@intel.com
X-Patchwork-Id: 6654551
Return-Path:
X-Original-To: patchwork-intel-gfx@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136])
	by patchwork2.web.kernel.org (Postfix) with ESMTP id 55657C05AC
	for ; Mon, 22 Jun 2015 09:53:32 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1])
	by mail.kernel.org (Postfix) with ESMTP id 4575620634
	for ; Mon, 22 Jun 2015 09:53:31 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	by mail.kernel.org (Postfix) with ESMTP id 2DD472050B
	for ; Mon, 22 Jun 2015 09:53:30 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id A776D6E62C;
	Mon, 22 Jun 2015 02:53:29 -0700 (PDT)
X-Original-To: intel-gfx@lists.freedesktop.org
Delivered-To: intel-gfx@lists.freedesktop.org
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
	by gabe.freedesktop.org (Postfix) with ESMTP id DA9EA6E62C
	for ; Mon, 22 Jun 2015 02:53:28 -0700 (PDT)
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
	by fmsmga103.fm.intel.com with ESMTP; 22 Jun 2015 02:53:29 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.13,658,1427785200";
	d="scan'208";a="732124397"
Received: from sourabgu-desktop.iind.intel.com ([10.223.82.35])
	by fmsmga001.fm.intel.com with ESMTP; 22 Jun 2015 02:53:26 -0700
From: sourab.gupta@intel.com
To: intel-gfx@lists.freedesktop.org
Date: Mon, 22 Jun 2015 15:25:08 +0530
Message-Id: <1434966909-4113-7-git-send-email-sourab.gupta@intel.com>
X-Mailer: git-send-email 1.8.5.1
In-Reply-To: <1434966909-4113-1-git-send-email-sourab.gupta@intel.com>
References: <1434966909-4113-1-git-send-email-sourab.gupta@intel.com>
Cc: Insoo Woo, Peter Zijlstra, Jabin Wu, Sourab Gupta
Subject: [Intel-gfx] [RFC 6/7] drm/i915: Add routines for inserting commands
 in the ringbuf for capturing timestamps
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: Intel graphics driver community testing & development
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"
X-Spam-Status: No, score=-5.6 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED,
	RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Sourab Gupta

This patch adds the routines through which one can insert commands in the
ringbuf for capturing timestamps. The routines to insert these commands can
be called at appropriate places during workload execution.
The snapshots thus captured for each batchbuffer are then forwarded to
userspace using the perf event framework, through the Gen PMU interfaces.

Signed-off-by: Sourab Gupta
---
 drivers/gpu/drm/i915/i915_oa_perf.c | 88 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h     |  2 +
 2 files changed, 90 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_oa_perf.c b/drivers/gpu/drm/i915/i915_oa_perf.c
index 574b6d3..ed0bdc9 100644
--- a/drivers/gpu/drm/i915/i915_oa_perf.c
+++ b/drivers/gpu/drm/i915/i915_oa_perf.c
@@ -99,6 +99,79 @@ void i915_oa_insert_cmd(struct intel_ringbuffer *ringbuf, u32 ctx_id,
 		queue_hdr->wrap_count++;
 }
 
+/* Returns the ring's ID mask (i.e. I915_EXEC_) */
+#define ring_id_mask(ring) ((ring)->id + 1)
+
+void i915_gen_insert_cmd_ts(struct intel_ringbuffer *ringbuf, u32 ctx_id,
+				int perftag)
+{
+	struct intel_engine_cs *ring = ringbuf->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct drm_i915_ts_node_info *node_info = NULL;
+	struct drm_i915_ts_queue_header *queue_hdr =
+			(struct drm_i915_ts_queue_header *)
+			dev_priv->gen_pmu.buffer.addr;
+	void *data_ptr = (u8 *)queue_hdr + queue_hdr->data_offset;
+	int data_size = (queue_hdr->size_in_bytes - queue_hdr->data_offset);
+	u32 node_offset, timestamp_offset, addr = 0;
+	int ret;
+
+	struct drm_i915_ts_node *nodes = data_ptr;
+	int num_nodes = 0;
+	int index = 0;
+
+	num_nodes = data_size / sizeof(*nodes);
+	index = queue_hdr->node_count % num_nodes;
+
+	timestamp_offset = offsetof(struct drm_i915_ts_data, ts_low);
+
+	node_offset = i915_gem_obj_ggtt_offset(dev_priv->gen_pmu.buffer.obj) +
+			queue_hdr->data_offset +
+			index * sizeof(struct drm_i915_ts_node);
+	addr = node_offset +
+		offsetof(struct drm_i915_ts_node, timestamp) +
+		timestamp_offset;
+
+	if (ring->id == RCS) {
+		ret = intel_ring_begin(ring, 6);
+		if (ret)
+			return;
+
+		intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(5));
+		intel_ring_emit(ring,
+				PIPE_CONTROL_GLOBAL_GTT_IVB |
+				PIPE_CONTROL_TIMESTAMP_WRITE);
+		intel_ring_emit(ring, addr | PIPE_CONTROL_GLOBAL_GTT);
+		intel_ring_emit(ring, 0); /* imm low, must be zero */
+		intel_ring_emit(ring, 0); /* imm high, must be zero */
+		intel_ring_emit(ring, MI_NOOP);
+		intel_ring_advance(ring);
+	} else {
+		ret = intel_ring_begin(ring, 4);
+		if (ret)
+			return;
+
+		intel_ring_emit(ring,
+				MI_FLUSH_DW | MI_FLUSH_DW_OP_STAMP);
+		intel_ring_emit(ring, addr | MI_FLUSH_DW_USE_GTT);
+		intel_ring_emit(ring, 0); /* imm low, must be zero */
+		intel_ring_emit(ring, 0); /* imm high, must be zero */
+		intel_ring_advance(ring);
+	}
+	node_info = &nodes[index].node_info;
+	i915_gem_request_assign(&node_info->req,
+				ring->outstanding_lazy_request);
+
+	node_info = &nodes[index].node_info;
+	node_info->pid = current->pid;
+	node_info->ctx_id = ctx_id;
+	node_info->ring = ring_id_mask(ring);
+	node_info->perftag = perftag;
+	queue_hdr->node_count++;
+	if (queue_hdr->node_count > num_nodes)
+		queue_hdr->wrap_count++;
+}
+
 static void init_oa_async_buf_queue(struct drm_i915_private *dev_priv)
 {
 	struct drm_i915_oa_async_queue_header *hdr =
@@ -344,6 +417,7 @@ void i915_gen_pmu_stop_work_fn(struct work_struct *__work)
 		container_of(__work, typeof(*dev_priv),
 			gen_pmu.work_event_stop);
 	struct perf_event *event = dev_priv->gen_pmu.exclusive_event;
+	struct drm_i915_insert_cmd *entry, *next;
 	struct drm_i915_ts_queue_header *hdr =
 		(struct drm_i915_ts_queue_header *)
 		dev_priv->gen_pmu.buffer.addr;
@@ -361,6 +435,13 @@ void i915_gen_pmu_stop_work_fn(struct work_struct *__work)
 	if (ret)
 		return;
 
+	list_for_each_entry_safe(entry, next, &dev_priv->profile_cmd, list) {
+		if (entry->insert_cmd == i915_gen_insert_cmd_ts) {
+			list_del(&entry->list);
+			kfree(entry);
+		}
+	}
+
 	i915_gen_pmu_wait_gpu(dev_priv);
 
 	/* Ensure that all requests are completed*/
@@ -1430,10 +1511,17 @@ static void i915_gen_event_start(struct perf_event *event, int flags)
 	struct drm_i915_private *dev_priv =
 		container_of(event->pmu, typeof(*dev_priv), gen_pmu.pmu);
 	unsigned long lock_flags;
+	struct drm_i915_insert_cmd *entry;
+
+	entry = kzalloc(sizeof(*entry), GFP_ATOMIC);
+	if (!entry)
+		return;
+	entry->insert_cmd = i915_gen_insert_cmd_ts;
 
 	spin_lock_irqsave(&dev_priv->gen_pmu.lock, lock_flags);
 
 	dev_priv->gen_pmu.event_active = true;
+	list_add_tail(&entry->list, &dev_priv->profile_cmd);
 
 	spin_unlock_irqrestore(&dev_priv->gen_pmu.lock, lock_flags);
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index c9955968..22eee10 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -347,6 +347,7 @@
 #define   MI_FLUSH_DW_STORE_INDEX	(1<<21)
 #define   MI_INVALIDATE_TLB		(1<<18)
 #define   MI_FLUSH_DW_OP_STOREDW	(1<<14)
+#define   MI_FLUSH_DW_OP_STAMP		(3<<14)
 #define   MI_FLUSH_DW_OP_MASK		(3<<14)
 #define   MI_FLUSH_DW_NOTIFY		(1<<8)
 #define   MI_INVALIDATE_BSD		(1<<7)
@@ -422,6 +423,7 @@
 #define   PIPE_CONTROL_TLB_INVALIDATE		(1<<18)
 #define   PIPE_CONTROL_MEDIA_STATE_CLEAR	(1<<16)
 #define   PIPE_CONTROL_QW_WRITE			(1<<14)
+#define   PIPE_CONTROL_TIMESTAMP_WRITE		(3<<14)
 #define   PIPE_CONTROL_POST_SYNC_OP_MASK	(3<<14)
 #define   PIPE_CONTROL_DEPTH_STALL		(1<<13)
 #define   PIPE_CONTROL_WRITE_FLUSH		(1<<12)
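
Illustrative note on the slot and address arithmetic in i915_gen_insert_cmd_ts(): the userspace sketch below mirrors how the next node is picked (node_count modulo the number of nodes that fit behind the queue header) and how the GGTT address for the timestamp write is derived from it. The struct layouts (ts_data, ts_node_info, ts_node), the helper names, and the sample sizes are assumptions made purely for illustration. The real drm_i915_ts_* definitions live elsewhere in this series and may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the drm_i915_ts_* structures. The field
 * sets and sizes here are assumed, not taken from the series.
 */
struct ts_data {
	uint32_t ts_low;
	uint32_t ts_high;
};

struct ts_node_info {
	uint32_t pid;
	uint32_t ctx_id;
	uint32_t ring;
	uint32_t perftag;
};

struct ts_node {
	struct ts_data timestamp;
	struct ts_node_info node_info;
};

/* Number of whole nodes that fit in the data area behind the header. */
static uint32_t ts_num_nodes(uint32_t size_in_bytes, uint32_t data_offset)
{
	assert(size_in_bytes > data_offset);
	return (size_in_bytes - data_offset) / (uint32_t)sizeof(struct ts_node);
}

/* Slot for the next sample; the node array is used as a circular buffer. */
static uint32_t ts_slot_index(uint32_t node_count, uint32_t num_nodes)
{
	assert(num_nodes > 0);
	return node_count % num_nodes;
}

/*
 * Address the GPU timestamp write would target: buffer base, plus the
 * data area offset, plus the selected node, plus the ts_low field
 * inside that node.
 */
static uint32_t ts_timestamp_addr(uint32_t ggtt_base, uint32_t data_offset,
				  uint32_t index)
{
	return ggtt_base + data_offset
		+ index * (uint32_t)sizeof(struct ts_node)
		+ (uint32_t)offsetof(struct ts_node, timestamp)
		+ (uint32_t)offsetof(struct ts_data, ts_low);
}
```

With these assumed layouts a node is 24 bytes, so a 4096-byte buffer with a 64-byte header holds 168 nodes, and the 171st sample (node_count == 170) reuses slot 2. Because the slot comes from a modulo, older samples are overwritten once the buffer fills, which is why the patch keeps a reference to the request associated with each node.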