From patchwork Thu Dec 18 11:18:42 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Gordon
X-Patchwork-Id: 5513071
From: Dave Gordon
To: intel-gfx@lists.freedesktop.org
Date: Thu, 18 Dec 2014 11:18:42 +0000
Message-Id: <1418901522-14819-1-git-send-email-david.s.gordon@intel.com>
X-Mailer: git-send-email 1.7.9.5
Subject: [Intel-gfx] [PATCH] drm/i915: add irq_barrier operation for
 synchronising reads
List-Id: Intel graphics driver community testing & development

On some generations of chips, it is necessary to read an MMIO register
before getting the sequence number from the status page in main memory,
in order to ensure coherency; and on all generations this should be
either helpful or harmless.

In general, we want this operation to be as cheap as possible, since we
require only the side-effect of DMA completion; we don't interpret the
result of the read, and we don't require any coordination with other
threads, power domains, or anything else. However, finding a suitable
register may be problematic; on GEN6 chips the ACTHD register was used,
but on VLV et al access to this register requires FORCEWAKE and
therefore many complications involving spinlocks and polling.

So this commit introduces this synchronising operation as a distinct
vfunc in the engine structure, so that it can be GEN- or chip-specific
if needed. There are three implementations: a dummy one, for chips where
no synchronising read is needed; a gen6(+) version that issues a posting
read (to TAIL); and a VLV-specific one that issues a raw read instead,
avoiding FORCEWAKE, GTFIFO and other such complications.

We then change gen6_ring_get_seqno() to use this new irq_barrier rather
than a POSTING_READ of ACTHD.

Note that neither older (pre-GEN6) devices nor newer (GEN8+) devices
running in LRC mode currently include any posting read in their own
get_seqno() implementations, so this change only makes a difference on
VLV (and not CHV+).

Signed-off-by: Dave Gordon
Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 37 +++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 12a36f0..ed8034b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1209,6 +1209,28 @@ pc_render_add_request(struct intel_engine_cs *ring)
 	return 0;
 }
 
+static void
+dummy_irq_barrier(struct intel_engine_cs *ring)
+{
+}
+
+static void
+gen6_irq_barrier(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+	POSTING_READ(RING_TAIL(ring->mmio_base));
+}
+
+#define __raw_i915_read32(dev_priv__, reg__) readl((dev_priv__)->regs + (reg__))
+#define RAW_POSTING_READ(reg__) (void)__raw_i915_read32(dev_priv, reg__)
+
+static void
+vlv_irq_barrier(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+	RAW_POSTING_READ(RING_TAIL(ring->mmio_base));
+}
+
 static u32
 gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
 {
@@ -1216,8 +1238,7 @@ gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
 	 * ivb (and maybe also on snb) by reading from a CS register (like
 	 * ACTHD) before reading the status page. */
 	if (!lazy_coherency) {
-		struct drm_i915_private *dev_priv = ring->dev->dev_private;
-		POSTING_READ(RING_ACTHD(ring->mmio_base));
+		ring->irq_barrier(ring);
 	}
 
 	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
@@ -2375,6 +2396,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen8_ring_get_irq;
 		ring->irq_put = gen8_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
+		ring->irq_barrier = gen6_irq_barrier;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
@@ -2391,6 +2413,10 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen6_ring_get_irq;
 		ring->irq_put = gen6_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
+		if (IS_VALLEYVIEW(dev) && !IS_GEN8(dev))
+			ring->irq_barrier = vlv_irq_barrier;
+		else
+			ring->irq_barrier = gen6_irq_barrier;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
@@ -2417,6 +2443,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	} else if (IS_GEN5(dev)) {
 		ring->add_request = pc_render_add_request;
 		ring->flush = gen4_render_ring_flush;
+		ring->irq_barrier = dummy_irq_barrier;
 		ring->get_seqno = pc_render_get_seqno;
 		ring->set_seqno = pc_render_set_seqno;
 		ring->irq_get = gen5_ring_get_irq;
@@ -2429,6 +2456,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 			ring->flush = gen2_render_ring_flush;
 		else
 			ring->flush = gen4_render_ring_flush;
+		ring->irq_barrier = dummy_irq_barrier;
 		ring->get_seqno = ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (IS_GEN2(dev)) {
@@ -2505,6 +2533,7 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			ring->write_tail = gen6_bsd_ring_write_tail;
 		ring->flush = gen6_bsd_ring_flush;
 		ring->add_request = gen6_add_request;
+		ring->irq_barrier = gen6_irq_barrier;
 		ring->get_seqno = gen6_ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
@@ -2544,6 +2573,7 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 		ring->mmio_base = BSD_RING_BASE;
 		ring->flush = bsd_ring_flush;
 		ring->add_request = i9xx_add_request;
+		ring->irq_barrier = dummy_irq_barrier;
 		ring->get_seqno = ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (IS_GEN5(dev)) {
@@ -2583,6 +2613,7 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 	ring->mmio_base = GEN8_BSD2_RING_BASE;
 	ring->flush = gen6_bsd_ring_flush;
 	ring->add_request = gen6_add_request;
+	ring->irq_barrier = gen6_irq_barrier;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	ring->irq_enable_mask =
@@ -2613,6 +2644,7 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
+	ring->irq_barrier = gen6_irq_barrier;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
@@ -2670,6 +2702,7 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
+	ring->irq_barrier = gen6_irq_barrier;
 	ring->get_seqno = gen6_ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 6dbb6f4..f686929 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -163,6 +163,7 @@ struct intel_engine_cs {
 	 * seen value is good enough. Note that the seqno will always be
 	 * monotonic, even if not coherent.
 	 */
+	void		(*irq_barrier)(struct intel_engine_cs *ring);
 	u32		(*get_seqno)(struct intel_engine_cs *ring,
 				     bool lazy_coherency);
 	void		(*set_seqno)(struct intel_engine_cs *ring,