From patchwork Mon Jun 22 09:55:07 2015
X-Patchwork-Submitter: sourab.gupta@intel.com
X-Patchwork-Id: 6654541
From: sourab.gupta@intel.com
To: intel-gfx@lists.freedesktop.org
Date: Mon, 22 Jun 2015 15:25:07 +0530
Message-Id: <1434966909-4113-6-git-send-email-sourab.gupta@intel.com>
In-Reply-To: <1434966909-4113-1-git-send-email-sourab.gupta@intel.com>
References: <1434966909-4113-1-git-send-email-sourab.gupta@intel.com>
Cc: Insoo Woo, Peter Zijlstra, Jabin Wu, Sourab Gupta
Subject: [Intel-gfx] [RFC 5/7] drm/i915: Wait for GPU to finish before event stop in Gen Perf PMU
From: Sourab Gupta

To collect timestamps around any GPU workload, we insert commands into the ringbuffer to capture them. During the event stop call, we therefore need to wait for the GPU to finish processing the last request into which these commands were inserted, and we need to ensure this processing is done before the event_destroy callback deallocates the buffer that holds the data.
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h     |  2 ++
 drivers/gpu/drm/i915/i915_oa_perf.c | 54 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 25c0938..a0e1d17 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2022,6 +2022,8 @@ struct drm_i915_private {
 			u32 tail;
 		} buffer;
 		struct work_struct work_timer;
+		struct work_struct work_event_stop;
+		struct completion complete;
 	} gen_pmu;
 
 	struct list_head profile_cmd;
diff --git a/drivers/gpu/drm/i915/i915_oa_perf.c b/drivers/gpu/drm/i915/i915_oa_perf.c
index e3e867f..574b6d3 100644
--- a/drivers/gpu/drm/i915/i915_oa_perf.c
+++ b/drivers/gpu/drm/i915/i915_oa_perf.c
@@ -306,6 +306,9 @@ void forward_gen_pmu_snapshots_work(struct work_struct *__work)
 	int head, tail, num_nodes, ret;
 	struct drm_i915_gem_request *req;
 
+	if (dev_priv->gen_pmu.event_active == false)
+		return;
+
 	first_node = (struct drm_i915_ts_node *)
 			((char *)hdr + hdr->data_offset);
 	num_nodes = (hdr->size_in_bytes - hdr->data_offset) /
@@ -335,6 +338,50 @@ void forward_gen_pmu_snapshots_work(struct work_struct *__work)
 	mutex_unlock(&dev_priv->dev->struct_mutex);
 }
 
+void i915_gen_pmu_stop_work_fn(struct work_struct *__work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(__work, typeof(*dev_priv),
+			gen_pmu.work_event_stop);
+	struct perf_event *event = dev_priv->gen_pmu.exclusive_event;
+	struct drm_i915_ts_queue_header *hdr =
+		(struct drm_i915_ts_queue_header *)
+		dev_priv->gen_pmu.buffer.addr;
+	struct drm_i915_ts_node *first_node, *node;
+	int head, tail, num_nodes, ret;
+	struct drm_i915_gem_request *req;
+
+	first_node = (struct drm_i915_ts_node *)
+			((char *)hdr + hdr->data_offset);
+	num_nodes = (hdr->size_in_bytes - hdr->data_offset) /
+			sizeof(*node);
+
+	ret = i915_mutex_lock_interruptible(dev_priv->dev);
+	if (ret)
+		return;
+
+	i915_gen_pmu_wait_gpu(dev_priv);
+
+	/* Ensure that all requests are completed */
+	tail = hdr->node_count;
+	head = dev_priv->gen_pmu.buffer.head;
+	while ((head % num_nodes) != (tail % num_nodes)) {
+		node = &first_node[head % num_nodes];
+		req = node->node_info.req;
+		if (req && !i915_gem_request_completed(req, true))
+			WARN_ON(1);
+		head++;
+	}
+
+	event->hw.state = PERF_HES_STOPPED;
+	dev_priv->gen_pmu.buffer.tail = 0;
+	dev_priv->gen_pmu.buffer.head = 0;
+
+	mutex_unlock(&dev_priv->dev->struct_mutex);
+	complete(&dev_priv->gen_pmu.complete);
+}
+
 static void gen_pmu_flush_snapshots(struct drm_i915_private *dev_priv)
 {
 	WARN_ON(!dev_priv->gen_pmu.buffer.addr);
@@ -562,6 +609,7 @@ static void i915_oa_event_destroy(struct perf_event *event)
 
 static void gen_buffer_destroy(struct drm_i915_private *i915)
 {
+	wait_for_completion(&i915->gen_pmu.complete);
 	mutex_lock(&i915->dev->struct_mutex);
 
 	vunmap(i915->gen_pmu.buffer.addr);
@@ -1409,7 +1457,7 @@ static void i915_gen_event_stop(struct perf_event *event, int flags)
 	hrtimer_cancel(&dev_priv->gen_pmu.timer);
 	gen_pmu_flush_snapshots(dev_priv);
 
-	event->hw.state = PERF_HES_STOPPED;
+	schedule_work(&dev_priv->gen_pmu.work_event_stop);
 }
 
 static int i915_gen_event_add(struct perf_event *event, int flags)
@@ -1595,6 +1643,9 @@ void i915_gen_pmu_register(struct drm_device *dev)
 	i915->gen_pmu.timer.function = hrtimer_sample_gen;
 
 	INIT_WORK(&i915->gen_pmu.work_timer, forward_gen_pmu_snapshots_work);
+	INIT_WORK(&i915->gen_pmu.work_event_stop, i915_gen_pmu_stop_work_fn);
+	init_completion(&i915->gen_pmu.complete);
+
 	spin_lock_init(&i915->gen_pmu.lock);
 
 	i915->gen_pmu.pmu.capabilities = PERF_PMU_CAP_IS_DEVICE;
@@ -1625,6 +1676,7 @@ void i915_gen_pmu_unregister(struct drm_device *dev)
 		return;
 
 	cancel_work_sync(&i915->gen_pmu.work_timer);
+	cancel_work_sync(&i915->gen_pmu.work_event_stop);
 
 	perf_pmu_unregister(&i915->gen_pmu.pmu);
 	i915->gen_pmu.pmu.event_init = NULL;