From patchwork Mon Jun 22 09:50:16 2015
From: sourab.gupta@intel.com
To: intel-gfx@lists.freedesktop.org
Cc: Insoo Woo, Peter Zijlstra, Jabin Wu, Sourab Gupta
Date: Mon, 22 Jun 2015 15:20:16 +0530
Message-Id: <1434966619-3979-6-git-send-email-sourab.gupta@intel.com>
In-Reply-To: <1434966619-3979-1-git-send-email-sourab.gupta@intel.com>
References: <1434966619-3979-1-git-send-email-sourab.gupta@intel.com>
Subject: [Intel-gfx] [RFC 5/8] drm/i915: Wait for GPU to finish before event stop, in async OA counter mode

From: Sourab Gupta <sourab.gupta@intel.com>

The asynchronous OA counter snapshot collection mode requires inserting
MI_REPORT_PERF_COUNT commands into the ringbuffer. Therefore, during the
event stop call, we need to wait for the GPU to finish processing the last
request for which an MI_RPC command was inserted.
We need to ensure this processing is complete before the event_destroy
callback runs, since it deallocates the buffer.

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h     |  2 +
 drivers/gpu/drm/i915/i915_oa_perf.c | 95 ++++++++++++++++++++++++++++++-------
 2 files changed, 81 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d738f7a..5453842 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1979,6 +1979,8 @@ struct drm_i915_private {
 			u8 *snapshot;
 		} oa_async_buffer;
 		struct work_struct work_timer;
+		struct work_struct work_event_stop;
+		struct completion complete;
 	} oa_pmu;
 #endif
 
diff --git a/drivers/gpu/drm/i915/i915_oa_perf.c b/drivers/gpu/drm/i915/i915_oa_perf.c
index 3bf4c47..5d63dab 100644
--- a/drivers/gpu/drm/i915/i915_oa_perf.c
+++ b/drivers/gpu/drm/i915/i915_oa_perf.c
@@ -118,6 +118,9 @@ void forward_oa_async_snapshots_work(struct work_struct *__work)
 	int ret, head, tail, num_nodes;
 	struct drm_i915_gem_request *req;
 
+	if (dev_priv->oa_pmu.event_active == false)
+		return;
+
 	first_node = (struct drm_i915_oa_async_node *)
 			((char *)hdr + hdr->data_offset);
 	num_nodes = (hdr->size_in_bytes - hdr->data_offset) /
@@ -298,6 +301,7 @@ static void flush_oa_snapshots(struct drm_i915_private *dev_priv,
 
 static void oa_async_buffer_destroy(struct drm_i915_private *i915)
 {
+	wait_for_completion(&i915->oa_pmu.complete);
 	mutex_lock(&i915->dev->struct_mutex);
 
 	vunmap(i915->oa_pmu.oa_async_buffer.addr);
@@ -854,6 +858,63 @@ static void config_oa_regs(struct drm_i915_private *dev_priv,
 	}
 }
 
+
+void i915_oa_async_stop_work_fn(struct work_struct *__work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(__work, typeof(*dev_priv),
+			oa_pmu.work_event_stop);
+	struct perf_event *event = dev_priv->oa_pmu.exclusive_event;
+	struct drm_i915_oa_async_queue_header *hdr =
+		(struct drm_i915_oa_async_queue_header *)
+		dev_priv->oa_pmu.oa_async_buffer.addr;
+	struct drm_i915_oa_async_node *first_node, *node;
+	struct drm_i915_gem_request *req;
+	int ret, head, tail, num_nodes;
+
+	first_node = (struct drm_i915_oa_async_node *)
+			((char *)hdr + hdr->data_offset);
+	num_nodes = (hdr->size_in_bytes - hdr->data_offset) /
+			sizeof(*node);
+
+
+	ret = i915_mutex_lock_interruptible(dev_priv->dev);
+	if (ret)
+		return;
+
+	dev_priv->oa_pmu.event_active = false;
+
+	i915_oa_async_wait_gpu(dev_priv);
+
+	update_oacontrol(dev_priv);
+	mmiowb();
+
+	/* Ensure that all requests are completed*/
+	tail = hdr->node_count;
+	head = dev_priv->oa_pmu.oa_async_buffer.head;
+	while ((head % num_nodes) != (tail % num_nodes)) {
+		node = &first_node[head % num_nodes];
+		req = node->node_info.req;
+		if (req && !i915_gem_request_completed(req, true))
+			WARN_ON(1);
+		head++;
+	}
+
+	if (event->attr.sample_period) {
+		hrtimer_cancel(&dev_priv->oa_pmu.timer);
+		flush_oa_snapshots(dev_priv, false);
+	}
+	cancel_work_sync(&dev_priv->oa_pmu.work_timer);
+
+	dev_priv->oa_pmu.oa_async_buffer.tail = 0;
+	dev_priv->oa_pmu.oa_async_buffer.head = 0;
+
+	mutex_unlock(&dev_priv->dev->struct_mutex);
+
+	event->hw.state = PERF_HES_STOPPED;
+	complete(&dev_priv->oa_pmu.complete);
+}
+
 static void i915_oa_event_start(struct perf_event *event, int flags)
 {
 	struct drm_i915_private *dev_priv =
@@ -939,25 +1000,23 @@ static void i915_oa_event_stop(struct perf_event *event, int flags)
 		container_of(event->pmu, typeof(*dev_priv), oa_pmu.pmu);
 	unsigned long lock_flags;
 
-	spin_lock_irqsave(&dev_priv->oa_pmu.lock, lock_flags);
-
-	dev_priv->oa_pmu.event_active = false;
-	update_oacontrol(dev_priv);
-
-	mmiowb();
-	spin_unlock_irqrestore(&dev_priv->oa_pmu.lock, lock_flags);
+	if (dev_priv->oa_pmu.async_sample_mode)
+		schedule_work(&dev_priv->oa_pmu.work_event_stop);
+	else {
+		spin_lock_irqsave(&dev_priv->oa_pmu.lock, lock_flags);
+		dev_priv->oa_pmu.event_active = false;
+		update_oacontrol(dev_priv);
 
-	if (event->attr.sample_period) {
-		hrtimer_cancel(&dev_priv->oa_pmu.timer);
-		flush_oa_snapshots(dev_priv, false);
-	}
+		mmiowb();
+		spin_unlock_irqrestore(&dev_priv->oa_pmu.lock, lock_flags);
+
+		if (event->attr.sample_period) {
+			hrtimer_cancel(&dev_priv->oa_pmu.timer);
+			flush_oa_snapshots(dev_priv, false);
+		}
 
-	if (dev_priv->oa_pmu.async_sample_mode) {
-		dev_priv->oa_pmu.oa_async_buffer.tail = 0;
-		dev_priv->oa_pmu.oa_async_buffer.head = 0;
+		event->hw.state = PERF_HES_STOPPED;
 	}
-
-	event->hw.state = PERF_HES_STOPPED;
 }
 
 static int i915_oa_event_add(struct perf_event *event, int flags)
@@ -1092,6 +1151,8 @@ void i915_oa_pmu_register(struct drm_device *dev)
 	i915->oa_pmu.timer.function = hrtimer_sample;
 
 	INIT_WORK(&i915->oa_pmu.work_timer, forward_oa_async_snapshots_work);
+	INIT_WORK(&i915->oa_pmu.work_event_stop, i915_oa_async_stop_work_fn);
+	init_completion(&i915->oa_pmu.complete);
 
 	spin_lock_init(&i915->oa_pmu.lock);
 
@@ -1122,8 +1183,10 @@ void i915_oa_pmu_unregister(struct drm_device *dev)
 	if (i915->oa_pmu.pmu.event_init == NULL)
 		return;
 
-	if (i915->oa_pmu.async_sample_mode)
+	if (i915->oa_pmu.async_sample_mode) {
 		cancel_work_sync(&i915->oa_pmu.work_timer);
+		cancel_work_sync(&i915->oa_pmu.work_event_stop);
+	}
 
 	unregister_sysctl_table(i915->oa_pmu.sysctl_header);
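
Note: the ordering the commit message relies on (event_stop() only schedules a
worker because it cannot sleep, the worker waits for the GPU and signals a
completion, and event_destroy() blocks on that completion before freeing the
buffer) is the usual workqueue-plus-completion pattern. Below is a minimal
sketch of that pattern for reference only; the names are illustrative and not
the actual i915 symbols, and the GPU wait is elided.

#include <linux/kernel.h>
#include <linux/workqueue.h>
#include <linux/completion.h>

/* Illustrative structure mirroring the synchronization in this patch. */
struct oa_stop_sync {
	struct work_struct stop_work;	/* scheduled from the stop callback */
	struct completion done;		/* waited on by the destroy callback */
};

static void stop_work_fn(struct work_struct *work)
{
	struct oa_stop_sync *s = container_of(work, struct oa_stop_sync,
					      stop_work);

	/* ... wait here for the GPU to retire the last MI_RPC request ... */

	complete(&s->done);		/* allow destroy to proceed */
}

static void oa_stop(struct oa_stop_sync *s)
{
	/* The stop callback must not block, so defer the GPU wait. */
	schedule_work(&s->stop_work);
}

static void oa_destroy(struct oa_stop_sync *s)
{
	/* Block until the worker is done touching the buffer ... */
	wait_for_completion(&s->done);
	/* ... only then is it safe to vunmap()/free the async buffer. */
}

static void oa_init(struct oa_stop_sync *s)
{
	INIT_WORK(&s->stop_work, stop_work_fn);
	init_completion(&s->done);
}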