From patchwork Tue Feb 26 14:29:08 2019
X-Patchwork-Submitter: Lionel Landwerlin
X-Patchwork-Id: 10830375
From: Lionel Landwerlin
To: intel-gfx@lists.freedesktop.org
Date: Tue, 26 Feb 2019 14:29:08 +0000
Message-Id: <20190226142911.9789-7-lionel.g.landwerlin@intel.com>
In-Reply-To: <20190226142911.9789-1-lionel.g.landwerlin@intel.com>
References: <20190226142911.9789-1-lionel.g.landwerlin@intel.com>
Subject: [Intel-gfx] [PATCH v3 6/9] drm/i915: handle interrupts from the OA unit

The OA unit can notify that its circular buffer is half full through
an interrupt and we would like to give the application the ability to
make use of this interrupt to get rid of CPU checks on the OA buffer.

This change wires up the interrupt to the i915-perf stream and leaves
it ignored for now.

v2: Use spin_lock_irq() to access the IMR register on Haswell (Chris)

Signed-off-by: Lionel Landwerlin
---
 drivers/gpu/drm/i915/i915_drv.h         | 21 +++++++++++++
 drivers/gpu/drm/i915/i915_irq.c         | 39 ++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_perf.c        | 26 +++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |  7 +++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  2 ++
 5 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b54929cbf1f9..8faa9cb2b620 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1400,6 +1400,12 @@ struct i915_perf_stream {
 	 * buffer should be checked for available data.
 	 */
 	u64 poll_oa_period;
+
+	/**
+	 * @oa_interrupt_monitor: Whether the stream will be notified by OA
+	 * interrupts.
+	 */
+	bool oa_interrupt_monitor;
 };
 
 /**
@@ -1892,6 +1898,21 @@ struct drm_i915_private {
 			wait_queue_head_t poll_wq;
 			bool pollin;
 
+			/**
+			 * Atomic counter incremented by the interrupt
+			 * handling code for each OA half full interrupt
+			 * received.
+			 */
+			atomic64_t half_full_count;
+
+			/**
+			 * Copy of the atomic half_full_count that was last
+			 * processed in the i915-perf driver. If both counters
+			 * differ, there is data available to read in the OA
+			 * buffer.
+			 */
+			u64 half_full_count_last;
+
 			/**
 			 * For rate limiting any notifications of spurious
 			 * invalid OA reports
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7c7e84e86c6a..1028d0d5542d 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1171,6 +1171,12 @@ static void ironlake_rps_change_irq_handler(struct drm_i915_private *dev_priv)
 	return;
 }
 
+static void notify_perfmon_buffer_half_full(struct drm_i915_private *i915)
+{
+	atomic64_inc(&i915->perf.oa.half_full_count);
+	wake_up_all(&i915->perf.oa.poll_wq);
+}
+
 static void vlv_c0_read(struct drm_i915_private *dev_priv,
 			struct intel_rps_ei *ei)
 {
@@ -1447,6 +1453,9 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
 		      GT_RENDER_CS_MASTER_ERROR_INTERRUPT))
 		DRM_DEBUG("Command parser error, gt_iir 0x%08x\n", gt_iir);
 
+	if (gt_iir & GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT)
+		notify_perfmon_buffer_half_full(dev_priv);
+
 	if (gt_iir & GT_PARITY_ERROR(dev_priv))
 		ivybridge_parity_error_irq_handler(dev_priv, gt_iir);
 }
@@ -1468,6 +1477,12 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir)
 		tasklet_hi_schedule(&engine->execlists.tasklet);
 }
 
+static void gen8_perfmon_handler(struct drm_i915_private *i915, u32 iir)
+{
+	if (iir & GEN8_GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT)
+		notify_perfmon_buffer_half_full(i915);
+}
+
 static void gen8_gt_irq_ack(struct drm_i915_private *i915,
 			    u32 master_ctl, u32 gt_iir[4])
 {
@@ -1477,6 +1492,7 @@ static void gen8_gt_irq_ack(struct drm_i915_private *i915,
 		      GEN8_GT_BCS_IRQ | \
 		      GEN8_GT_VCS1_IRQ | \
 		      GEN8_GT_VCS2_IRQ | \
+		      GEN8_GT_WDBOX_OACS_IRQ | \
 		      GEN8_GT_VECS_IRQ | \
 		      GEN8_GT_PM_IRQ | \
 		      GEN8_GT_GUC_IRQ)
@@ -1499,7 +1515,7 @@ static void gen8_gt_irq_ack(struct drm_i915_private *i915,
 			raw_reg_write(regs, GEN8_GT_IIR(2), gt_iir[2]);
 	}
 
-	if (master_ctl & GEN8_GT_VECS_IRQ) {
+	if (master_ctl & (GEN8_GT_VECS_IRQ | GEN8_GT_WDBOX_OACS_IRQ)) {
 		gt_iir[3] = raw_reg_read(regs, GEN8_GT_IIR(3));
 		if (likely(gt_iir[3]))
 			raw_reg_write(regs, GEN8_GT_IIR(3), gt_iir[3]);
@@ -1523,9 +1539,11 @@ static void gen8_gt_irq_handler(struct drm_i915_private *i915,
 				    gt_iir[1] >> GEN8_VCS2_IRQ_SHIFT);
 	}
 
-	if (master_ctl & GEN8_GT_VECS_IRQ) {
+	if (master_ctl & (GEN8_GT_VECS_IRQ | GEN8_GT_WDBOX_OACS_IRQ)) {
 		gen8_cs_irq_handler(i915->engine[VECS],
 				    gt_iir[3] >> GEN8_VECS_IRQ_SHIFT);
+		gen8_perfmon_handler(i915,
+				     gt_iir[3] >> GEN8_WD_IRQ_SHIFT);
 	}
 
 	if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
@@ -2936,6 +2954,8 @@ gen11_other_irq_handler(struct drm_i915_private * const i915,
 {
 	if (instance == OTHER_GTPM_INSTANCE)
 		return gen6_rps_irq_handler(i915, iir);
+	if (instance == OTHER_WDOAPERF_INSTANCE)
+		return gen8_perfmon_handler(i915, iir);
 
 	WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n",
 		  instance, iir);
@@ -3769,6 +3789,10 @@ static void gen5_gt_irq_postinstall(struct drm_device *dev)
 		gt_irqs |= GT_BLT_USER_INTERRUPT | GT_BSD_USER_INTERRUPT;
 	}
 
+	/* We only expose the i915/perf interface on HSW+. */
+	if (IS_HASWELL(dev_priv))
+		gt_irqs |= GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;
+
 	GEN3_IRQ_INIT(GT, dev_priv->gt_irq_mask, gt_irqs);
 
 	if (INTEL_GEN(dev_priv) >= 6) {
@@ -3898,7 +3922,8 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT,
 		0,
 		GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT |
-			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT
+			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT |
+			GEN8_GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT << GEN8_WD_IRQ_SHIFT
 	};
 
 	dev_priv->pm_ier = 0x0;
@@ -4017,12 +4042,12 @@ static void gen11_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 
 	/*
 	 * RPS interrupts will get enabled/disabled on demand when RPS itself
-	 * is enabled/disabled.
+	 * is enabled/disabled, just enable the OA interrupt for now.
 	 */
-	dev_priv->pm_ier = 0x0;
+	dev_priv->pm_ier = GEN8_GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;
 	dev_priv->pm_imr = ~dev_priv->pm_ier;
-	I915_WRITE(GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0);
-	I915_WRITE(GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);
+	I915_WRITE(GEN11_GPM_WGBOXPERF_INTR_ENABLE, dev_priv->pm_ier);
+	I915_WRITE(GEN11_GPM_WGBOXPERF_INTR_MASK, dev_priv->pm_imr);
 }
 
 static void icp_irq_postinstall(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 5ef9164a22a0..3ab389edf1de 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -337,6 +337,7 @@ static const struct i915_oa_format gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = {
  * @oa_period_exponent: The OA unit sampling period is derived from this
  * @poll_oa_period: The period at which the CPU will check for OA data
  * availability
+ * @oa_interrupt_monitor: Whether we should monitor the OA interrupt.
  *
  * As read_properties_unlocked() enumerates and validates the properties given
  * to open a stream of metrics the configuration is built up in the structure
@@ -354,6 +355,7 @@ struct perf_open_properties {
 	bool oa_periodic;
 	int oa_period_exponent;
 	u64 poll_oa_period;
+	bool oa_interrupt_monitor;
 };
 
 static void free_oa_config(struct drm_i915_private *dev_priv,
@@ -1838,6 +1840,13 @@ static void gen7_oa_enable(struct i915_perf_stream *stream)
 	 */
 	gen7_init_oa_buffer(dev_priv);
 
+	if (stream->oa_interrupt_monitor) {
+		spin_lock_irq(&dev_priv->irq_lock);
+		gen5_enable_gt_irq(dev_priv,
+				   GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT);
+		spin_unlock_irq(&dev_priv->irq_lock);
+	}
+
 	I915_WRITE(GEN7_OACONTROL,
 		   (ctx_id & GEN7_OACONTROL_CTX_MASK) |
 		   (period_exponent <<
@@ -1864,6 +1873,9 @@ static void gen8_oa_enable(struct i915_perf_stream *stream)
 	 */
 	gen8_init_oa_buffer(dev_priv);
 
+	if (stream->oa_interrupt_monitor)
+		I915_WRITE(GEN8_OA_IMR, ~GEN8_OA_IMR_MASK_INTR);
+
 	/*
 	 * Note: we don't rely on the hardware to perform single context
 	 * filtering and instead filter on the cpu based on the context-id
@@ -1893,6 +1905,10 @@ static void i915_oa_stream_enable(struct i915_perf_stream *stream)
 	 */
 	dev_priv->perf.oa.pollin = false;
 
+	dev_priv->perf.oa.half_full_count_last = 0;
+	atomic64_set(&dev_priv->perf.oa.half_full_count,
+		     dev_priv->perf.oa.half_full_count_last);
+
 	dev_priv->perf.oa.ops.oa_enable(stream);
 
 	if (dev_priv->perf.oa.periodic && stream->poll_oa_period)
@@ -1905,6 +1921,13 @@ static void gen7_oa_disable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 
+	if (stream->oa_interrupt_monitor) {
+		spin_lock_irq(&dev_priv->irq_lock);
+		gen5_disable_gt_irq(dev_priv,
+				    GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT);
+		spin_unlock_irq(&dev_priv->irq_lock);
+	}
+
 	I915_WRITE(GEN7_OACONTROL, 0);
 	if (intel_wait_for_register(dev_priv,
 				    GEN7_OACONTROL, GEN7_OACONTROL_ENABLE, 0,
@@ -1916,6 +1939,8 @@ static void gen8_oa_disable(struct i915_perf_stream *stream)
 {
 	struct drm_i915_private *dev_priv = stream->dev_priv;
 
+	I915_WRITE(GEN8_OA_IMR, 0xffffffff);
+
 	I915_WRITE(GEN8_OACONTROL, 0);
 	if (intel_wait_for_register(dev_priv,
 				    GEN8_OACONTROL, GEN8_OA_COUNTER_ENABLE, 0,
@@ -2591,6 +2616,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	stream->dev_priv = dev_priv;
 	stream->ctx = specific_ctx;
 	stream->poll_oa_period = props->poll_oa_period;
+	stream->oa_interrupt_monitor = props->oa_interrupt_monitor;
 
 	ret = i915_oa_stream_init(stream, param, props);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 730bb1917fd1..62e93a492d25 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -229,6 +229,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define MAX_ENGINE_CLASS 4
 
 #define OTHER_GTPM_INSTANCE 1
+#define OTHER_WDOAPERF_INSTANCE 2
 #define MAX_ENGINE_INSTANCE 3
 
 /* PCI config space */
@@ -641,6 +642,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define OABUFFER_SIZE_8M (6 << 3)
 #define OABUFFER_SIZE_16M (7 << 3)
 
+#define GEN8_OA_IMR _MMIO(0x2b20)
+#define GEN8_OA_IMR_MASK_INTR (1 << 28)
+
 /*
  * Flexible, Aggregate EU Counter Registers.
  * Note: these aren't contiguous
@@ -2923,7 +2927,9 @@ enum i915_power_well_id {
 #define GT_BLT_USER_INTERRUPT (1 << 22)
 #define GT_BSD_CS_ERROR_INTERRUPT (1 << 15)
 #define GT_BSD_USER_INTERRUPT (1 << 12)
+#define GEN8_GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT (1 << 12) /* bdw+ */
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1 (1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
+#define GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT (1 << 9) /* ivb+ but only used on hsw+ */
 #define GT_CONTEXT_SWITCH_INTERRUPT (1 << 8)
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT (1 << 5) /* !snb */
 #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT (1 << 4)
@@ -7246,6 +7252,7 @@ enum {
 #define GEN8_DE_PIPE_B_IRQ (1 << 17)
 #define GEN8_DE_PIPE_A_IRQ (1 << 16)
 #define GEN8_DE_PIPE_IRQ(pipe) (1 << (16 + (pipe)))
+#define GEN8_GT_WDBOX_OACS_IRQ (1 << 7)
 #define GEN8_GT_VECS_IRQ (1 << 6)
 #define GEN8_GT_GUC_IRQ (1 << 5)
 #define GEN8_GT_PM_IRQ (1 << 4)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 1b96b0960adc..c9c460612a56 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2304,6 +2304,8 @@ int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
 
 	if (HAS_L3_DPF(dev_priv))
 		engine->irq_keep_mask = GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
+	if (IS_HASWELL(dev_priv))
+		engine->irq_keep_mask |= GT_PERFMON_BUFFER_HALF_FULL_INTERRUPT;
 
 	engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT;