From patchwork Mon Sep 17 11:30:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10602559 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EB77A17E1 for ; Mon, 17 Sep 2018 11:31:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D2817299E0 for ; Mon, 17 Sep 2018 11:31:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D02FC299E5; Mon, 17 Sep 2018 11:31:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5810B299D4 for ; Mon, 17 Sep 2018 11:31:12 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 665FA6E24E; Mon, 17 Sep 2018 11:31:11 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5B19189FA6 for ; Mon, 17 Sep 2018 11:31:06 +0000 (UTC) Received: by mail-wm1-x344.google.com with SMTP id 207-v6so9128810wme.5 for ; Mon, 17 Sep 2018 04:31:06 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=f5YpVzp0lDmBZl8tFfNG2pC56g5F58Gg7BPNv9s0WOk=; b=QG7WofLuDe5Shfd42zP4txx56esVx8MnFQquhgptunqX73soVceJxs/nuEf+5G2J/R 6L6Tnh49p35/u5I/0+mxsjEV6sAxWZwfBazxGq0+p312WBmkVEzBc9lxeCydsF7YmhRV PvjI0vB7aQH4E4vuNn0SA3ssmdEBuvq8YeM8Uae4f9blvViMhBHsdJ/7MwDKW2fGtuMj rx70zBd1pqAWMD1v/HptJ0qqUpurR1BdboZ9usQBKENMNTgWn/8PqXqKZVr0B0k/V/k9 i0RkDM5+AbVseOVUwWlwawad6AVQ21lUO5PjOYK2REUR6H46EWLdxMNJ25T8TpU/rBGv wr3A== X-Gm-Message-State: APzg51AyP3RjU/9mySaFVPnfrBPQzc8faghaJ0XbrozlinGca5ugtDFm ME/SQ8J2DPemt684IPnZza+dzn5I+30= X-Google-Smtp-Source: ANB0VdYxv3TgaZ8xjDz5BEwOBvdd43Y78VkiIPE6rADRt/T34MA4xIL15NAbdch1K2YaCBgbTpSS5A== X-Received: by 2002:a1c:65c5:: with SMTP id z188-v6mr10899602wmb.57.1537183864752; Mon, 17 Sep 2018 04:31:04 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id o3-v6sm10781971wrn.58.2018.09.17.04.31.03 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 17 Sep 2018 04:31:04 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Mon, 17 Sep 2018 12:30:53 +0100 Message-Id: <20180917113058.28994-2-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> References: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 1/6] drm/i915/execlists: Move RPCS setup to context pin X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Configuring RPCS in context image just before pin is sufficient and will come extra handy in one of the following patches. v2: * Split image setup a bit differently. (Chris Wilson) v3: * Update context image after reset as well - otherwise the application of pinned default state clears the RPCS. v4: * Use local variable throughout the function. (Chris Wilson) Signed-off-by: Tvrtko Ursulin Suggested-by: Chris Wilson Cc: Chris Wilson Reviewed-by: Chris Wilson # v2 Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/intel_lrc.c | 45 ++++++++++++++++++++------------ 1 file changed, 28 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index a51be16ddaac..54446b6aeb7a 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,6 +1305,24 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } +static u32 make_rpcs(struct drm_i915_private *dev_priv); + +static void +__execlists_update_reg_state(struct intel_engine_cs *engine, + struct intel_context *ce) +{ + u32 *regs = ce->lrc_reg_state; + struct intel_ring *ring = ce->ring; + + regs[CTX_RING_BUFFER_START + 1] = i915_ggtt_offset(ring->vma); + regs[CTX_RING_HEAD + 1] = ring->head; + regs[CTX_RING_TAIL + 1] = ring->tail; + + /* RPCS */ + if (engine->class == RENDER_CLASS) + regs[CTX_R_PWR_CLK_STATE + 1] = make_rpcs(engine->i915); +} + static struct intel_context * __execlists_context_pin(struct intel_engine_cs *engine, struct i915_gem_context *ctx, @@ -1343,10 +1361,8 @@ __execlists_context_pin(struct intel_engine_cs *engine, GEM_BUG_ON(!intel_ring_offset_valid(ce->ring, ce->ring->head)); ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE; - ce->lrc_reg_state[CTX_RING_BUFFER_START+1] = - i915_ggtt_offset(ce->ring->vma); - ce->lrc_reg_state[CTX_RING_HEAD + 1] = ce->ring->head; - ce->lrc_reg_state[CTX_RING_TAIL + 1] = ce->ring->tail; + + __execlists_update_reg_state(engine, ce); ce->state->obj->pin_global++; i915_gem_context_get(ctx); @@ -1955,14 +1971,14 @@ static void execlists_reset(struct intel_engine_cs *engine, engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE, engine->context_size - PAGE_SIZE); } - execlists_init_reg_state(regs, - request->gem_context, engine, request->ring); /* Move the RING_HEAD onto the breadcrumb, past the hanging batch */ - regs[CTX_RING_BUFFER_START + 1] = i915_ggtt_offset(request->ring->vma); - request->ring->head = intel_ring_wrap(request->ring, request->postfix); - regs[CTX_RING_HEAD + 1] = request->ring->head; + + execlists_init_reg_state(regs, request->gem_context, engine, + request->ring); + + __execlists_update_reg_state(engine, request->hw_context); intel_ring_update_space(request->ring); @@ -2710,8 +2726,7 @@ static void execlists_init_reg_state(u32 *regs, if (rcs) { regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1); - CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, - make_rpcs(dev_priv)); + CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, 0); i915_oa_init_reg_state(engine, ctx, regs); } @@ -2872,12 +2887,8 @@ void intel_lr_context_resume(struct drm_i915_private *i915) intel_ring_reset(ce->ring, 0); - if (ce->pin_count) { /* otherwise done in context_pin */ - u32 *regs = ce->lrc_reg_state; - - regs[CTX_RING_HEAD + 1] = ce->ring->head; - regs[CTX_RING_TAIL + 1] = ce->ring->tail; - } + if (ce->pin_count) /* otherwise done in context_pin */ + __execlists_update_reg_state(engine, ce); } } } From patchwork Mon Sep 17 11:30:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10602551 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9855E112B for ; Mon, 17 Sep 2018 11:31:10 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7FC87299AB for ; Mon, 17 Sep 2018 11:31:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7D29B299A7; Mon, 17 Sep 2018 11:31:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D89E9299B5 for ; Mon, 17 Sep 2018 11:31:09 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 44E576E156; Mon, 17 Sep 2018 11:31:09 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by gabe.freedesktop.org (Postfix) with ESMTPS id DAF886E147 for ; Mon, 17 Sep 2018 11:31:07 +0000 (UTC) Received: by mail-wm1-x344.google.com with SMTP id b19-v6so9164373wme.3 for ; Mon, 17 Sep 2018 04:31:07 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Oy8RWDUtvkF4IrpDS0Y37nUpb3ceqLIY+yy9Xj3mU+4=; b=Kv6UX9nkW24UusJeICeGhOjZF+Y5u2Ibo4x3gdTZRu09GHZjfOPxOrLwbrfgBEfco1 F5mvHpkuHkxGMRo4XKxbnSq7ji34p76YfTWzorska+3w6ospkKyppNy2oWF+y4xpQbEO bq/sEvUTUYQDbaNS7eKMd1yIVOaEzrNqphz2Vj6XjxxRosNpkHUcW0k1dn2JFDsg9EAg 19+1jjZWE/UrjZ3iNl78Z8aIxDtVqembdVI3dECmXB7LVt3KfIO0G2og0m4gViiLY8v8 FA2tGd/AjVKPl68hCbrfaatLPoaH0tC3Qks2Bo98/NOkHx5XAcnQM/c95bxpwvbVc8QM jboQ== X-Gm-Message-State: APzg51ASjvrMpQSOChrrgo7e20KfAsI58IpcsiGE/uJ67InPyOMPbRvT Levk9pFP1vVkQs6zqN/KA/5WP8gxcOw= X-Google-Smtp-Source: ANB0VdaU9BgRghUtPnXEXel7WdS1RuaaB3yPr4D5xif0oeqILITXgYNe2wA5RD3KzCQ2c4FYciTosA== X-Received: by 2002:a1c:2dc8:: with SMTP id t191-v6mr11060569wmt.94.1537183865900; Mon, 17 Sep 2018 04:31:05 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id o3-v6sm10781971wrn.58.2018.09.17.04.31.04 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 17 Sep 2018 04:31:05 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Mon, 17 Sep 2018 12:30:54 +0100 Message-Id: <20180917113058.28994-3-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> References: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 2/6] drm/i915: Record the sseu configuration per-context & engine X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Lionel Landwerlin We want to expose the ability to reconfigure the slices, subslice and eu per context and per engine. To facilitate that, store the current configuration on the context for each engine, which is initially set to the device default upon creation. v2: record sseu configuration per context & engine (Chris) v3: introduce the i915_gem_context_sseu to store powergating programming, sseu_dev_info has grown quite a bit (Lionel) v4: rename i915_gem_sseu into intel_sseu (Chris) use to_intel_context() (Chris) v5: More to_intel_context() (Tvrtko) Switch intel_sseu from union to struct (Tvrtko) Move context default sseu in existing loop (Chris) v6: s/intel_sseu_from_device_sseu/intel_device_default_sseu/ (Tvrtko) Tvrtko Ursulin: v7: * Pass intel_sseu by pointer instead of value to make_rpcs. * Rebase for make_rpcs changes. v8: * Rebase for RPCS edit on pin. v9: * Rebase for context image setup changes. v10: * Rename dev_priv to i915. (Chris Wilson) v11: * Rebase. Signed-off-by: Chris Wilson Signed-off-by: Lionel Landwerlin Signed-off-by: Tvrtko Ursulin Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 14 +++++++++++ drivers/gpu/drm/i915/i915_gem_context.c | 2 ++ drivers/gpu/drm/i915/i915_gem_context.h | 4 ++++ drivers/gpu/drm/i915/i915_request.h | 10 ++++++++ drivers/gpu/drm/i915/intel_lrc.c | 31 +++++++++++++------------ 5 files changed, 46 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7d4daa7412f1..6b7ae63e47c3 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -3479,6 +3479,20 @@ mkwrite_device_info(struct drm_i915_private *dev_priv) return (struct intel_device_info *)&dev_priv->info; } +static inline struct intel_sseu +intel_device_default_sseu(struct drm_i915_private *i915) +{ + const struct sseu_dev_info *sseu = &INTEL_INFO(i915)->sseu; + struct intel_sseu value = { + .slice_mask = sseu->slice_mask, + .subslice_mask = sseu->subslice_mask[0], + .min_eus_per_subslice = sseu->max_eus_per_subslice, + .max_eus_per_subslice = sseu->max_eus_per_subslice, + }; + + return value; +} + /* modesetting */ extern void intel_modeset_init_hw(struct drm_device *dev); extern int intel_modeset_init(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index f772593b99ab..0b8cc748648b 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -343,6 +343,8 @@ __create_hw_context(struct drm_i915_private *dev_priv, struct intel_context *ce = &ctx->__engine[n]; ce->gem_context = ctx; + /* Use the whole device by default */ + ce->sseu = intel_device_default_sseu(dev_priv); } INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL); diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h index 08165f6a0a84..7510de738b35 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.h +++ b/drivers/gpu/drm/i915/i915_gem_context.h @@ -31,6 +31,7 @@ #include "i915_gem.h" #include "i915_scheduler.h" +#include "intel_device_info.h" struct pid; @@ -170,6 +171,9 @@ struct i915_gem_context { int pin_count; const struct intel_context_ops *ops; + + /** sseu: Control eu/slice partitioning */ + struct intel_sseu sseu; } __engine[I915_NUM_ENGINES]; /** ring_size: size for allocating the per-engine ring buffer */ diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index 7fa94b024968..3a4be20ea74a 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -39,6 +39,16 @@ struct drm_i915_gem_object; struct i915_request; struct i915_timeline; +/* + * Powergating configuration for a particular (context,engine). + */ +struct intel_sseu { + u8 slice_mask; + u8 subslice_mask; + u8 min_eus_per_subslice; + u8 max_eus_per_subslice; +}; + struct intel_wait { struct rb_node node; struct task_struct *tsk; diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 54446b6aeb7a..1491032b675b 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,7 +1305,8 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } -static u32 make_rpcs(struct drm_i915_private *dev_priv); +static u32 +make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu); static void __execlists_update_reg_state(struct intel_engine_cs *engine, @@ -1320,7 +1321,8 @@ __execlists_update_reg_state(struct intel_engine_cs *engine, /* RPCS */ if (engine->class == RENDER_CLASS) - regs[CTX_R_PWR_CLK_STATE + 1] = make_rpcs(engine->i915); + regs[CTX_R_PWR_CLK_STATE + 1] = make_rpcs(engine->i915, + &ce->sseu); } static struct intel_context * @@ -2506,18 +2508,19 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine) } static u32 -make_rpcs(struct drm_i915_private *dev_priv) +make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu) { - bool subslice_pg = INTEL_INFO(dev_priv)->sseu.has_subslice_pg; - u8 slices = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask); - u8 subslices = hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]); + const struct sseu_dev_info *sseu = &INTEL_INFO(i915)->sseu; + bool subslice_pg = sseu->has_subslice_pg; + u8 slices = hweight8(ctx_sseu->slice_mask); + u8 subslices = hweight8(ctx_sseu->subslice_mask); u32 rpcs = 0; /* * No explicit RPCS request is needed to ensure full * slice/subslice/EU enablement prior to Gen9. */ - if (INTEL_GEN(dev_priv) < 9) + if (INTEL_GEN(i915) < 9) return 0; /* @@ -2545,7 +2548,7 @@ make_rpcs(struct drm_i915_private *dev_priv) * subslices are enabled, or a count between one and four on the first * slice. */ - if (IS_GEN11(dev_priv) && slices == 1 && subslices >= 4) { + if (IS_GEN11(i915) && slices == 1 && subslices >= 4) { GEM_BUG_ON(subslices & 1); subslice_pg = false; @@ -2558,10 +2561,10 @@ make_rpcs(struct drm_i915_private *dev_priv) * must make an explicit request through RPCS for full * enablement. */ - if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) { + if (sseu->has_slice_pg) { u32 mask, val = slices; - if (INTEL_GEN(dev_priv) >= 11) { + if (INTEL_GEN(i915) >= 11) { mask = GEN11_RPCS_S_CNT_MASK; val <<= GEN11_RPCS_S_CNT_SHIFT; } else { @@ -2586,18 +2589,16 @@ make_rpcs(struct drm_i915_private *dev_priv) rpcs |= GEN8_RPCS_ENABLE | GEN8_RPCS_SS_CNT_ENABLE | val; } - if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) { + if (sseu->has_eu_pg) { u32 val; - val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MIN_SHIFT; + val = ctx_sseu->min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK); val &= GEN8_RPCS_EU_MIN_MASK; rpcs |= val; - val = INTEL_INFO(dev_priv)->sseu.eu_per_subslice << - GEN8_RPCS_EU_MAX_SHIFT; + val = ctx_sseu->max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK); val &= GEN8_RPCS_EU_MAX_MASK; From patchwork Mon Sep 17 11:30:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10602569 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DFF9617E1 for ; Mon, 17 Sep 2018 11:31:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C5C01299EB for ; Mon, 17 Sep 2018 11:31:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C4056299B5; Mon, 17 Sep 2018 11:31:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 08575299D5 for ; Mon, 17 Sep 2018 11:31:13 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6B4A06E252; Mon, 17 Sep 2018 11:31:11 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by gabe.freedesktop.org (Postfix) with ESMTPS id 03FB96E147 for ; Mon, 17 Sep 2018 11:31:09 +0000 (UTC) Received: by mail-wm1-x342.google.com with SMTP id b19-v6so9164436wme.3 for ; Mon, 17 Sep 2018 04:31:08 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=2j6tGryllxGgwedVtUOIWtfzIpCqJ97m1YDyvtnACo4=; b=P4b8RI0WjyfF6wGX6XeKAisquKwYcML/1QdiC1KNdmydeVx0ll6m8YGj/9IP7pT1Nx fWWC9ky46nhtuk/OySTqkoR5tuiPk7l3MKkviQt5IzdxCUkXHVLPkgz674W7AcfJ1d8k X9p+WyIDMmldyCYb4z/QDoHMxxx6L6wD9i5Kh0OMr8wEfVoQKJ4bp81NV8yxOtSCa06P +e+vcCq/Lr68bLz9U8GLBsiOwstB86hy3kirHuAEn5IYX+X8skMffpIkOpG6yXMcr1rT nwI6qjqRXwRejz+n7tYctp0WaPJM4YpUY5Losw0b74W4+D8bqG+A7XHOTNvTJRtNsNmV WsGQ== X-Gm-Message-State: APzg51BXPsfa/s+D9XlWQf9DoEm0fz5+FpOuwRvvXcFNZNkUvMTK4NST GMlyv0UrsNSGw72e6z9fTjnnoyTtpLo= X-Google-Smtp-Source: ANB0VdYYnE62edsWoE9vIx9+g6Y4YTWgUHbHd38UIK4YRPS94Xtk5WFuWkC9BWi2+ixgyViCUYoTZg== X-Received: by 2002:a1c:a60e:: with SMTP id p14-v6mr11208658wme.64.1537183867058; Mon, 17 Sep 2018 04:31:07 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id o3-v6sm10781971wrn.58.2018.09.17.04.31.05 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 17 Sep 2018 04:31:06 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Mon, 17 Sep 2018 12:30:55 +0100 Message-Id: <20180917113058.28994-4-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> References: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 3/6] drm/i915/perf: lock powergating configuration to default when active X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Lionel Landwerlin If some of the contexts submitting workloads to the GPU have been configured to shutdown slices/subslices, we might loose the NOA configurations written in the NOA muxes. One possible solution to this problem is to reprogram the NOA muxes when we switch to a new context. We initially tried this in the workaround batchbuffer but some concerns where raised about the cost of reprogramming at every context switch. This solution is also not without consequences from the userspace point of view. Reprogramming of the muxes can only happen once the powergating configuration has changed (which happens after context switch). This means for a window of time during the recording, counters recorded by the OA unit might be invalid. This requires userspace dealing with OA reports to discard the invalid values. Minimizing the reprogramming could be implemented by tracking of the last programmed configuration somewhere in GGTT and use MI_PREDICATE to discard some of the programming commands, but the command streamer would still have to parse all the MI_LRI instructions in the workaround batchbuffer. Another solution, which this change implements, is to simply disregard the user requested configuration for the period of time when i915/perf is active. There is no known issue with this apart from a performance penality for some media workloads that benefit from running on a partially powergated GPU. We already prevent RC6 from affecting the programming so it doesn't sound completely unreasonable to hold on powergating for the same reason. v2: Leave RPCS programming in intel_lrc.c (Lionel) v3: Update for s/union intel_sseu/struct intel_sseu/ (Lionel) More to_intel_context() (Tvrtko) s/dev_priv/i915/ (Tvrtko) Tvrtko Ursulin: v4: * Rebase for make_rpcs changes. v5: * Apply OA restriction from make_rpcs directly. v6: * Rebase for context image setup changes. v7: * Move stream assignment before metric enable. v8: * Rebase. Signed-off-by: Lionel Landwerlin Signed-off-by: Tvrtko Ursulin Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_perf.c | 13 +++++++++---- drivers/gpu/drm/i915/intel_lrc.c | 31 ++++++++++++++++++++----------- drivers/gpu/drm/i915/intel_lrc.h | 2 ++ 3 files changed, 31 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 664b96bb65a3..46ddd1913fee 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1677,6 +1677,11 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx, CTX_REG(reg_state, state_offset, flex_regs[i], value); } + + CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE, + gen8_make_rpcs(dev_priv, + &to_intel_context(ctx, + dev_priv->engine[RCS])->sseu)); } /* @@ -2092,6 +2097,9 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, if (ret) goto err_lock; + stream->ops = &i915_oa_stream_ops; + dev_priv->perf.oa.exclusive_stream = stream; + ret = dev_priv->perf.oa.ops.enable_metric_set(dev_priv, stream->oa_config); if (ret) { @@ -2099,15 +2107,12 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, goto err_enable; } - stream->ops = &i915_oa_stream_ops; - - dev_priv->perf.oa.exclusive_stream = stream; - mutex_unlock(&dev_priv->drm.struct_mutex); return 0; err_enable: + dev_priv->perf.oa.exclusive_stream = NULL; dev_priv->perf.oa.ops.disable_metric_set(dev_priv); mutex_unlock(&dev_priv->drm.struct_mutex); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 1491032b675b..cf2e0848fa55 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1305,9 +1305,6 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma) return i915_vma_pin(vma, 0, 0, flags); } -static u32 -make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu); - static void __execlists_update_reg_state(struct intel_engine_cs *engine, struct intel_context *ce) @@ -1321,8 +1318,8 @@ __execlists_update_reg_state(struct intel_engine_cs *engine, /* RPCS */ if (engine->class == RENDER_CLASS) - regs[CTX_R_PWR_CLK_STATE + 1] = make_rpcs(engine->i915, - &ce->sseu); + regs[CTX_R_PWR_CLK_STATE + 1] = gen8_make_rpcs(engine->i915, + &ce->sseu); } static struct intel_context * @@ -2507,13 +2504,12 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine) return logical_ring_init(engine); } -static u32 -make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu) +u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu) { const struct sseu_dev_info *sseu = &INTEL_INFO(i915)->sseu; bool subslice_pg = sseu->has_subslice_pg; - u8 slices = hweight8(ctx_sseu->slice_mask); - u8 subslices = hweight8(ctx_sseu->subslice_mask); + struct intel_sseu ctx_sseu; + u8 slices, subslices; u32 rpcs = 0; /* @@ -2523,6 +2519,19 @@ make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu) if (INTEL_GEN(i915) < 9) return 0; + /* + * If i915/perf is active, we want a stable powergating configuration + * on the system. The most natural configuration to take in that case + * is the default (i.e maximum the hardware can do). + */ + if (unlikely(i915->perf.oa.exclusive_stream)) + ctx_sseu = intel_device_default_sseu(i915); + else + ctx_sseu = *req_sseu; + + slices = hweight8(ctx_sseu.slice_mask); + subslices = hweight8(ctx_sseu.subslice_mask); + /* * Since the SScount bitfield in GEN8_R_PWR_CLK_STATE is only three bits * wide and Icelake has up to eight subslices, specfial programming is @@ -2592,13 +2601,13 @@ make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu) if (sseu->has_eu_pg) { u32 val; - val = ctx_sseu->min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; + val = ctx_sseu.min_eus_per_subslice << GEN8_RPCS_EU_MIN_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MIN_MASK); val &= GEN8_RPCS_EU_MIN_MASK; rpcs |= val; - val = ctx_sseu->max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; + val = ctx_sseu.max_eus_per_subslice << GEN8_RPCS_EU_MAX_SHIFT; GEM_BUG_ON(val & ~GEN8_RPCS_EU_MAX_MASK); val &= GEN8_RPCS_EU_MAX_MASK; diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h index f5a5502ecf70..a4e28cc55fda 100644 --- a/drivers/gpu/drm/i915/intel_lrc.h +++ b/drivers/gpu/drm/i915/intel_lrc.h @@ -104,4 +104,6 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv); void intel_execlists_set_default_submission(struct intel_engine_cs *engine); +u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu); + #endif /* _INTEL_LRC_H_ */ From patchwork Mon Sep 17 11:30:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10602573 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CC00B17E1 for ; Mon, 17 Sep 2018 11:31:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B000F299D5 for ; Mon, 17 Sep 2018 11:31:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AE6AD299DF; Mon, 17 Sep 2018 11:31:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 3A2A9299CE for ; Mon, 17 Sep 2018 11:31:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 316DE6E264; Mon, 17 Sep 2018 11:31:12 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by gabe.freedesktop.org (Postfix) with ESMTPS id EC8BE6E17E for ; Mon, 17 Sep 2018 11:31:09 +0000 (UTC) Received: by mail-wm1-x342.google.com with SMTP id n11-v6so9136184wmc.2 for ; Mon, 17 Sep 2018 04:31:09 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=XLzPja5VgZN4zB9nkeMR91Vc7FGd0XItd/PIb3XYXkA=; b=ctPZ0ZnC1Ppet3/HcWLFX9+vdQo3cKd5FroPyDO/sWeW4voEIVcQv1csVElCHhq29m eoi+yFH5RZ0MBs6IMX+IcuGDriimgIZvhormzBijlEK+K/xA+scxeybnbHWK6oqEosrY drBWzS+V54t1CsGtXHoxXpD9fM6gs1zpm+6lqv6XPhnLkZhg9rAly3LiiO3DXKhuS8yp TprBPq/rTiVYG8DuLFvDowoZGcGcsVE1wzLZ0Yr1anyhNCg/x7SNEn3FCzjp6KtlCrsf mYo6DeB26P/k/9zkRopfG+RVvYm+PMsKyxnxKP/JhWzqtY7R6rWYhwro19MOQ3X6HVPT vUpQ== X-Gm-Message-State: APzg51A1aLcNrIozYQ8FK7QT8qn9geU/mVQmr63M1LuwNQsNK2IU4XTs VKXt9OemZMR5uaQCqxfQAZvY0M+LSAE= X-Google-Smtp-Source: ANB0Vda/mGke3xzA0i+pvngnRs1IJLi38HGyrJyfgsCdeX7oF0sNUts9xjPVgJlFkOwtlEA+6d8ZGw== X-Received: by 2002:a1c:4c0e:: with SMTP id z14-v6mr11090087wmf.89.1537183868266; Mon, 17 Sep 2018 04:31:08 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id o3-v6sm10781971wrn.58.2018.09.17.04.31.07 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 17 Sep 2018 04:31:07 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Mon, 17 Sep 2018 12:30:56 +0100 Message-Id: <20180917113058.28994-5-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> References: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 4/6] drm/i915: Add timeline barrier support X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin Timeline barrier allows serialization between different timelines. After calling i915_timeline_set_barrier with a request, all following submissions on this timeline will be set up as depending on this request, or barrier. Once the barrier has been completed it automatically gets cleared and things continue as normal. This facility will be used by the upcoming context SSEU code. v2: * Assert barrier has been retired on timeline_fini. (Chris Wilson) * Fix mock_timeline. Signed-off-by: Tvrtko Ursulin Suggested-by: Chris Wilson Cc: Chris Wilson Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_request.c | 13 +++++++++ drivers/gpu/drm/i915/i915_timeline.c | 3 +++ drivers/gpu/drm/i915/i915_timeline.h | 27 +++++++++++++++++++ .../gpu/drm/i915/selftests/mock_timeline.c | 2 ++ 4 files changed, 45 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index a492385b2089..76fc80330c85 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -644,6 +644,15 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) return NOTIFY_DONE; } +static int add_timeline_barrier(struct i915_request *rq) +{ + struct i915_request *barrier = + i915_gem_active_raw(&rq->timeline->barrier, + &rq->i915->drm.struct_mutex); + + return barrier ? i915_request_await_dma_fence(rq, &barrier->fence) : 0; +} + /** * i915_request_alloc - allocate a request structure * @@ -808,6 +817,10 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx) */ rq->head = rq->ring->emit; + ret = add_timeline_barrier(rq); + if (ret) + goto err_unwind; + /* Unconditionally invalidate GPU caches and TLBs. */ ret = engine->emit_flush(rq, EMIT_INVALIDATE); if (ret) diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c index 4667cc08c416..5a87c5bd5154 100644 --- a/drivers/gpu/drm/i915/i915_timeline.c +++ b/drivers/gpu/drm/i915/i915_timeline.c @@ -37,6 +37,8 @@ void i915_timeline_init(struct drm_i915_private *i915, INIT_LIST_HEAD(&timeline->requests); i915_syncmap_init(&timeline->sync); + + init_request_active(&timeline->barrier, NULL); } /** @@ -69,6 +71,7 @@ void i915_timelines_park(struct drm_i915_private *i915) void i915_timeline_fini(struct i915_timeline *timeline) { GEM_BUG_ON(!list_empty(&timeline->requests)); + GEM_BUG_ON(i915_gem_active_isset(&timeline->barrier)); i915_syncmap_free(&timeline->sync); diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h index a2c2c3ab5fb0..c8526ab44dbc 100644 --- a/drivers/gpu/drm/i915/i915_timeline.h +++ b/drivers/gpu/drm/i915/i915_timeline.h @@ -72,6 +72,16 @@ struct i915_timeline { */ u32 global_sync[I915_NUM_ENGINES]; + /** + * Barrier provides the ability to serialize ordering between different + * timelines. + * + * Users can call i915_timeline_set_barrier which will make all + * subsequent submissions be executed only after this barrier has been + * completed. + */ + struct i915_gem_active barrier; + struct list_head link; const char *name; @@ -125,4 +135,21 @@ static inline bool i915_timeline_sync_is_later(struct i915_timeline *tl, void i915_timelines_park(struct drm_i915_private *i915); +/** + * i915_timeline_set_barrier - orders submission between different timelines + * @timeline: timeline to set the barrier on + * @rq: request after which new submissions can proceed + * + * Sets the passed in request as the serialization point for all subsequent + * submissions on @timeline. Subsequent requests will not be submitted to GPU + * until the barrier has been completed. + */ +static inline void +i915_timeline_set_barrier(struct i915_timeline *timeline, + struct i915_request *rq) +{ + GEM_BUG_ON(timeline->fence_context == rq->timeline->fence_context); + i915_gem_active_set(&timeline->barrier, rq); +} + #endif diff --git a/drivers/gpu/drm/i915/selftests/mock_timeline.c b/drivers/gpu/drm/i915/selftests/mock_timeline.c index dcf3b16f5a07..a718b64c988e 100644 --- a/drivers/gpu/drm/i915/selftests/mock_timeline.c +++ b/drivers/gpu/drm/i915/selftests/mock_timeline.c @@ -19,6 +19,8 @@ void mock_timeline_init(struct i915_timeline *timeline, u64 context) i915_syncmap_init(&timeline->sync); + init_request_active(&timeline->barrier, NULL); + INIT_LIST_HEAD(&timeline->link); } From patchwork Mon Sep 17 11:30:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10602575 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8B7DA112B for ; Mon, 17 Sep 2018 11:31:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7006929979 for ; Mon, 17 Sep 2018 11:31:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6E0962998B; Mon, 17 Sep 2018 11:31:19 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4EC0F299E9 for ; Mon, 17 Sep 2018 11:31:18 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AA1C96E261; Mon, 17 Sep 2018 11:31:17 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by gabe.freedesktop.org (Postfix) with ESMTPS id 62F9C6E17E for ; Mon, 17 Sep 2018 11:31:11 +0000 (UTC) Received: by mail-wr1-x441.google.com with SMTP id 20-v6so16847030wrb.12 for ; Mon, 17 Sep 2018 04:31:11 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=iXgYZ11oh3MP0Nnb8EYb01UB5KBT8x3KdCVFHITPdM8=; b=jt3YgESCedktZlaNDfZsAwgAcDawUUnKd/7/nXAjbwPFpjBL+CHEilzJf6vU0+AuTQ NMPXNitudPF5mrjzMD+wIQAWBB1gvpoYMNHnJO10L4MzyTYGb/NQ4p1rZhZ0MsJad3iP lasHfEgB/ajG0AVNH7DLvSUrp12bhatNdSLlOi+JR+zhz/ppIqLEjFllmIsu//EY1mCd bxPnnc9+aHJDp9WF80e7qU+HVHLXb3BHtTGKeH3YtrcSSGFrTP7JvO84zzL24P2d4f0z UJP+NGR4+8MFh6sE/ULP7P72ok5mYonujFSVXLo6eNLOLsQiaW74aAAXwXMk3y46Mp+G 1A9w== X-Gm-Message-State: APzg51CFrtaqp1vJIvYeGjNmYT8MBRpqxGpFSfVUVLCUrv26+m6QOf59 JMv/FWbxAWa3cxYGYfWgrMmtwsFYReM= X-Google-Smtp-Source: ANB0VdZ5CzqYYdUUOIeWVA192MCzQWRiPc3mM4dphbMC//B2WB43cR22zpwR+8iS2b+grSqkKfNf+g== X-Received: by 2002:a5d:4648:: with SMTP id j8-v6mr17550988wrs.52.1537183869607; Mon, 17 Sep 2018 04:31:09 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id o3-v6sm10781971wrn.58.2018.09.17.04.31.08 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 17 Sep 2018 04:31:09 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Mon, 17 Sep 2018 12:30:57 +0100 Message-Id: <20180917113058.28994-6-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> References: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 5/6] drm/i915: Expose RPCS (SSEU) configuration to userspace X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin We want to allow userspace to reconfigure the subslice configuration for its own use case. To do so, we expose a context parameter to allow adjustment of the RPCS register stored within the context image (and currently not accessible via LRI). If the context is adjusted before first use, the adjustment is for "free"; otherwise if the context is active we flush the context off the GPU (stalling all users) and forcing the GPU to save the context to memory where we can modify it and so ensure that the register is reloaded on next execution. The overhead of managing additional EU subslices can be significant, especially in multi-context workloads. Non-GPGPU contexts should preferably disable the subslices it is not using, and others should fine-tune the number to match their workload. We expose complete control over the RPCS register, allowing configuration of slice/subslice, via masks packed into a u64 for simplicity. For example, struct drm_i915_gem_context_param arg; struct drm_i915_gem_context_param_sseu sseu = { .class = 0, .instance = 0, }; memset(&arg, 0, sizeof(arg)); arg.ctx_id = ctx; arg.param = I915_CONTEXT_PARAM_SSEU; arg.value = (uintptr_t) &sseu; if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &arg) == 0) { sseu.packed.subslice_mask = 0; drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg); } could be used to disable all subslices where supported. v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu() (Lionel) v3: Add ability to program this per engine (Chris) v4: Move most get_sseu() into i915_gem_context.c (Lionel) v5: Validate sseu configuration against the device's capabilities (Lionel) v6: Change context powergating settings through MI_SDM on kernel context (Chris) v7: Synchronize the requests following a powergating setting change using a global dependency (Chris) Iterate timelines through dev_priv.gt.active_rings (Tvrtko) Disable RPCS configuration setting for non capable users (Lionel/Tvrtko) v8: s/union intel_sseu/struct intel_sseu/ (Lionel) s/dev_priv/i915/ (Tvrtko) Change uapi class/instance fields to u16 (Tvrtko) Bump mask fields to 64bits (Lionel) Don't return EPERM when dynamic sseu is disabled (Tvrtko) v9: Import context image into kernel context's ppgtt only when reconfiguring powergated slice/subslices (Chris) Use aliasing ppgtt when needed (Michel) Tvrtko Ursulin: v10: * Update for upstream changes. * Request submit needs a RPM reference. * Reject on !FULL_PPGTT for simplicity. * Pull out get/set param to helpers for readability and less indent. * Use i915_request_await_dma_fence in add_global_barrier to skip waits on the same timeline and avoid GEM_BUG_ON. * No need to explicitly assign a NULL pointer to engine in legacy mode. * No need to move gen8_make_rpcs up. * Factored out global barrier as prep patch. * Allow to only CAP_SYS_ADMIN if !Gen11. v11: * Remove engine vfunc in favour of local helper. (Chris Wilson) * Stop retiring requests before updates since it is not needed (Chris Wilson) * Implement direct CPU update path for idle contexts. (Chris Wilson) * Left side dependency needs only be on the same context timeline. (Chris Wilson) * It is sufficient to order the timeline. (Chris Wilson) * Reject !RCS configuration attempts with -ENODEV for now. v12: * Rebase for make_rpcs. v13: * Centralize SSEU normalization to make_rpcs. * Type width checking (uAPI <-> implementation). * Gen11 restrictions uAPI checks. * Gen11 subslice count differences handling. Chris Wilson: * args->size handling fixes. * Update context image from GGTT. * Postpone context image update to pinning. * Use i915_gem_active_raw instead of last_request_on_engine. v14: * Add activity tracker on intel_context to fix the lifetime issues and simplify the code. (Chris Wilson) v15: * Fix context pin leak if no space in ring by simplifying the context pinning sequence. v16: * Rebase for context get/set param locking changes. * Just -ENODEV on !Gen11. (Joonas) v17: * Fix one Gen11 subslice enablement rule. * Handle error from i915_sw_fence_await_sw_fence_gfp. (Chris Wilson) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100899 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107634 Issue: https://github.com/intel/media-driver/issues/267 Signed-off-by: Chris Wilson Signed-off-by: Lionel Landwerlin Cc: Dmitry Rogozhkin Cc: Tvrtko Ursulin Cc: Zhipeng Gong Cc: Joonas Lahtinen Signed-off-by: Tvrtko Ursulin Reviewed-by: Chris Wilson # v15 Reviewed-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_context.c | 318 +++++++++++++++++++++++- drivers/gpu/drm/i915/i915_gem_context.h | 6 + drivers/gpu/drm/i915/intel_lrc.c | 4 +- include/uapi/drm/i915_drm.h | 43 ++++ 4 files changed, 368 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 0b8cc748648b..b38e8b8ec772 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -90,6 +90,7 @@ #include #include "i915_drv.h" #include "i915_trace.h" +#include "intel_lrc_reg.h" #include "intel_workarounds.h" #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1 @@ -322,6 +323,14 @@ static u32 default_desc_template(const struct drm_i915_private *i915, return desc; } +static void intel_context_retire(struct i915_gem_active *active, + struct i915_request *rq) +{ + struct intel_context *ce = container_of(active, typeof(*ce), active); + + intel_context_unpin(ce); +} + static struct i915_gem_context * __create_hw_context(struct drm_i915_private *dev_priv, struct drm_i915_file_private *file_priv) @@ -345,6 +354,8 @@ __create_hw_context(struct drm_i915_private *dev_priv, ce->gem_context = ctx; /* Use the whole device by default */ ce->sseu = intel_device_default_sseu(dev_priv); + + init_request_active(&ce->active, intel_context_retire); } INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL); @@ -846,6 +857,56 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data, return 0; } +static int get_sseu(struct i915_gem_context *ctx, + struct drm_i915_gem_context_param *args) +{ + struct drm_i915_gem_context_param_sseu user_sseu; + struct intel_engine_cs *engine; + struct intel_context *ce; + int ret; + + if (args->size == 0) + goto out; + else if (args->size < sizeof(user_sseu)) + return -EINVAL; + + if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value), + sizeof(user_sseu))) + return -EFAULT; + + if (user_sseu.rsvd1 || user_sseu.rsvd2) + return -EINVAL; + + engine = intel_engine_lookup_user(ctx->i915, + user_sseu.class, + user_sseu.instance); + if (!engine) + return -EINVAL; + + /* Only use for mutex here is to serialize get_param and set_param. */ + ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex); + if (ret) + return ret; + + ce = to_intel_context(ctx, engine); + + user_sseu.slice_mask = ce->sseu.slice_mask; + user_sseu.subslice_mask = ce->sseu.subslice_mask; + user_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice; + user_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice; + + mutex_unlock(&ctx->i915->drm.struct_mutex); + + if (copy_to_user(u64_to_user_ptr(args->value), &user_sseu, + sizeof(user_sseu))) + return -EFAULT; + +out: + args->size = sizeof(user_sseu); + + return 0; +} + int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { @@ -858,15 +919,17 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, if (!ctx) return -ENOENT; - args->size = 0; switch (args->param) { case I915_CONTEXT_PARAM_BAN_PERIOD: ret = -EINVAL; break; case I915_CONTEXT_PARAM_NO_ZEROMAP: + args->size = 0; args->value = test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags); break; case I915_CONTEXT_PARAM_GTT_SIZE: + args->size = 0; + if (ctx->ppgtt) args->value = ctx->ppgtt->vm.total; else if (to_i915(dev)->mm.aliasing_ppgtt) @@ -875,14 +938,20 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, args->value = to_i915(dev)->ggtt.vm.total; break; case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE: + args->size = 0; args->value = i915_gem_context_no_error_capture(ctx); break; case I915_CONTEXT_PARAM_BANNABLE: + args->size = 0; args->value = i915_gem_context_is_bannable(ctx); break; case I915_CONTEXT_PARAM_PRIORITY: + args->size = 0; args->value = ctx->sched.priority; break; + case I915_CONTEXT_PARAM_SSEU: + ret = get_sseu(ctx, args); + break; default: ret = -EINVAL; break; @@ -892,6 +961,249 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, return ret; } +static int gen8_emit_rpcs_config(struct i915_request *rq, + struct intel_context *ce, + struct intel_sseu sseu) +{ + u64 offset; + u32 *cs; + + cs = intel_ring_begin(rq, 4); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + offset = ce->state->node.start + + LRC_STATE_PN * PAGE_SIZE + + (CTX_R_PWR_CLK_STATE + 1) * 4; + + *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT; + *cs++ = lower_32_bits(offset); + *cs++ = upper_32_bits(offset); + *cs++ = gen8_make_rpcs(rq->i915, &sseu); + + intel_ring_advance(rq, cs); + + return 0; +} + +static int +gen8_modify_rpcs_gpu(struct intel_context *ce, + struct intel_engine_cs *engine, + struct intel_sseu sseu) +{ + struct drm_i915_private *i915 = engine->i915; + struct i915_request *rq, *prev; + int ret; + + GEM_BUG_ON(!ce->pin_count); + + lockdep_assert_held(&i915->drm.struct_mutex); + + /* Submitting requests etc needs the hw awake. */ + intel_runtime_pm_get(i915); + + rq = i915_request_alloc(engine, i915->kernel_context); + if (IS_ERR(rq)) { + ret = PTR_ERR(rq); + goto out_put; + } + + /* Queue this switch after all other activity by this context. */ + prev = i915_gem_active_raw(&ce->ring->timeline->last_request, + &i915->drm.struct_mutex); + if (prev && !i915_request_completed(prev)) { + ret = i915_sw_fence_await_sw_fence_gfp(&rq->submit, + &prev->submit, + I915_FENCE_GFP); + if (ret < 0) + goto out_add; + } + + ret = gen8_emit_rpcs_config(rq, ce, sseu); + if (ret) + goto out_add; + + /* Order all following requests to be after. */ + i915_timeline_set_barrier(ce->ring->timeline, rq); + + /* + * Guarantee context image and the timeline remains pinned until the + * modifying request is retired by setting the ce activity tracker. + * + * But we only need to take one pin on the account of it. Or in other + * words transfer the pinned ce object to tracked active request. + */ + if (!i915_gem_active_isset(&ce->active)) + __intel_context_pin(ce); + i915_gem_active_set(&ce->active, rq); + +out_add: + i915_request_add(rq); +out_put: + intel_runtime_pm_put(i915); + + return ret; +} + +static int +i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx, + struct intel_engine_cs *engine, + struct intel_sseu sseu) +{ + struct intel_context *ce = to_intel_context(ctx, engine); + int ret; + + GEM_BUG_ON(INTEL_GEN(ctx->i915) < 8); + GEM_BUG_ON(engine->id != RCS); + + ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex); + if (ret) + return ret; + + /* Nothing to do if unmodified. */ + if (!memcmp(&ce->sseu, &sseu, sizeof(sseu))) + goto out; + + /* + * If context is not idle we have to submit an ordered request to modify + * its context image via the kernel context. Pristine and idle contexts + * will be configured on pinning. + */ + if (ce->pin_count) + ret = gen8_modify_rpcs_gpu(ce, engine, sseu); + + if (!ret) + ce->sseu = sseu; + +out: + mutex_unlock(&ctx->i915->drm.struct_mutex); + + return ret; +} + +static int +user_to_context_sseu(struct drm_i915_private *i915, + const struct drm_i915_gem_context_param_sseu *user, + struct intel_sseu *context) +{ + const struct sseu_dev_info *device = &INTEL_INFO(i915)->sseu; + + /* No zeros in any field. */ + if (!user->slice_mask || !user->subslice_mask || + !user->min_eus_per_subslice || !user->max_eus_per_subslice) + return -EINVAL; + + /* Max > min. */ + if (user->max_eus_per_subslice < user->min_eus_per_subslice) + return -EINVAL; + + /* Check validity against hardware. */ + if (user->slice_mask & ~device->slice_mask) + return -EINVAL; + + if (user->subslice_mask & ~device->subslice_mask[0]) + return -EINVAL; + + if (user->max_eus_per_subslice > device->max_eus_per_subslice) + return -EINVAL; + + /* + * Some future proofing on the types since the uAPI is wider than the + * current internal implementation. + */ + if (WARN_ON((fls(user->slice_mask) > + sizeof(context->slice_mask) * BITS_PER_BYTE) || + (fls(user->subslice_mask) > + sizeof(context->subslice_mask) * BITS_PER_BYTE) || + overflows_type(user->min_eus_per_subslice, + context->min_eus_per_subslice) || + overflows_type(user->max_eus_per_subslice, + context->max_eus_per_subslice))) + return -EINVAL; + + context->slice_mask = user->slice_mask; + context->subslice_mask = user->subslice_mask; + context->min_eus_per_subslice = user->min_eus_per_subslice; + context->max_eus_per_subslice = user->max_eus_per_subslice; + + /* Part specific restrictions. */ + if (IS_GEN11(i915)) { + unsigned int hw_ss_per_s = hweight8(device->subslice_mask[0]); + unsigned int req_s = hweight8(context->slice_mask); + unsigned int req_ss = hweight8(context->subslice_mask); + + /* + * Only full subslice enablement is possible if more than one + * slice is turned on. + */ + if (req_s > 1 && req_ss != hw_ss_per_s) + return -EINVAL; + + /* + * If more than four (SScount bitfield limit) subslices are + * requested then the number has to be even. + */ + if (req_ss > 4 && (req_ss & 1)) + return -EINVAL; + + /* + * If only one slice is enabled and subslice count is below the + * device full enablement, it must be at most half of the all + * available subslices. + */ + if (req_s == 1 && req_ss < hw_ss_per_s && + req_ss > (hw_ss_per_s / 2)) + return -EINVAL; + } + + return 0; +} + +static int set_sseu(struct i915_gem_context *ctx, + struct drm_i915_gem_context_param *args) +{ + struct drm_i915_private *i915 = ctx->i915; + struct drm_i915_gem_context_param_sseu user_sseu; + struct intel_engine_cs *engine; + struct intel_sseu sseu; + int ret; + + if (args->size < sizeof(user_sseu)) + return -EINVAL; + + if (!IS_GEN11(i915)) + return -ENODEV; + + if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value), + sizeof(user_sseu))) + return -EFAULT; + + if (user_sseu.rsvd1 || user_sseu.rsvd2) + return -EINVAL; + + engine = intel_engine_lookup_user(i915, + user_sseu.class, + user_sseu.instance); + if (!engine) + return -EINVAL; + + /* Only render engine supports RPCS configuration. */ + if (engine->class != RENDER_CLASS) + return -ENODEV; + + ret = user_to_context_sseu(i915, &user_sseu, &sseu); + if (ret) + return ret; + + ret = i915_gem_context_reconfigure_sseu(ctx, engine, sseu); + if (ret) + return ret; + + args->size = sizeof(user_sseu); + + return 0; +} + int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { @@ -953,7 +1265,9 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, ctx->sched.priority = priority; } break; - + case I915_CONTEXT_PARAM_SSEU: + ret = set_sseu(ctx, args); + break; default: ret = -EINVAL; break; diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h index 7510de738b35..c4aebfe7d4d2 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.h +++ b/drivers/gpu/drm/i915/i915_gem_context.h @@ -170,6 +170,12 @@ struct i915_gem_context { u64 lrc_desc; int pin_count; + /** + * active: Active tracker for the external rq activity on this + * intel_context object. + */ + struct i915_gem_active active; + const struct intel_context_ops *ops; /** sseu: Control eu/slice partitioning */ diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index cf2e0848fa55..b5603e977a3f 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -2557,7 +2557,9 @@ u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu) * subslices are enabled, or a count between one and four on the first * slice. */ - if (IS_GEN11(i915) && slices == 1 && subslices >= 4) { + if (IS_GEN11(i915) && + slices == 1 && + subslices > min_t(u8, 4, hweight8(sseu->subslice_mask[0]) / 2)) { GEM_BUG_ON(subslices & 1); subslice_pg = false; diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index a4446f452040..e195c38b15a6 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1478,9 +1478,52 @@ struct drm_i915_gem_context_param { #define I915_CONTEXT_MAX_USER_PRIORITY 1023 /* inclusive */ #define I915_CONTEXT_DEFAULT_PRIORITY 0 #define I915_CONTEXT_MIN_USER_PRIORITY -1023 /* inclusive */ + /* + * When using the following param, value should be a pointer to + * drm_i915_gem_context_param_sseu. + */ +#define I915_CONTEXT_PARAM_SSEU 0x7 __u64 value; }; +struct drm_i915_gem_context_param_sseu { + /* + * Engine class & instance to be configured or queried. + */ + __u16 class; + __u16 instance; + + /* + * Unused for now. Must be cleared to zero. + */ + __u32 rsvd1; + + /* + * Mask of slices to enable for the context. Valid values are a subset + * of the bitmask value returned for I915_PARAM_SLICE_MASK. + */ + __u64 slice_mask; + + /* + * Mask of subslices to enable for the context. Valid values are a + * subset of the bitmask value return by I915_PARAM_SUBSLICE_MASK. + */ + __u64 subslice_mask; + + /* + * Minimum/Maximum number of EUs to enable per subslice for the + * context. min_eus_per_subslice must be inferior or equal to + * max_eus_per_subslice. + */ + __u16 min_eus_per_subslice; + __u16 max_eus_per_subslice; + + /* + * Unused for now. Must be cleared to zero. + */ + __u32 rsvd2; +}; + enum drm_i915_oa_format { I915_OA_FORMAT_A13 = 1, /* HSW only */ I915_OA_FORMAT_A29, /* HSW only */ From patchwork Mon Sep 17 11:30:58 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 10602577 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id ED1721508 for ; Mon, 17 Sep 2018 11:31:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D6A552992D for ; Mon, 17 Sep 2018 11:31:19 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D527A2998B; Mon, 17 Sep 2018 11:31:19 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8EF5829979 for ; Mon, 17 Sep 2018 11:31:19 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D86446E26C; Mon, 17 Sep 2018 11:31:17 +0000 (UTC) X-Original-To: Intel-gfx@lists.freedesktop.org Delivered-To: Intel-gfx@lists.freedesktop.org Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by gabe.freedesktop.org (Postfix) with ESMTPS id 311F06E261 for ; Mon, 17 Sep 2018 11:31:12 +0000 (UTC) Received: by mail-wm1-x341.google.com with SMTP id s12-v6so9168562wmc.0 for ; Mon, 17 Sep 2018 04:31:12 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ntQ28b0jePfbe6gAToX6Qj1zorRWhxbCP4PyytLAxmQ=; b=kbN8XyETzIjPktAM/qNhJUsSQJSpp6Qy0KWuaL43NAmQoinja+gA2BGTdpkvFsOjEF t4pEQQsjLe9rJLxRbJkdFKZafKVs9v1eMkw02e79L8UTA3xW/+ddDflUQWqmBnMRAhhw dACqpxjiC56FKq3E01zeo6AJSv2/yedc0AcZOuYyFrElaF//hdD8py+RRcH3hi5ASMyK hchqbu4aotfU9lHeA5aHmzOUY4XuLNgh59XtLBBP66gRDvvVF6uxL03jT4r9P7QTZzPR EHHP3Ci8jtZksnFEKkJASKn58AwFHroVLu4RrIXoeaq7qN1pnwJHn3NJLvYzQIvJjWhh hcWA== X-Gm-Message-State: APzg51Ay5hr5N3mwuyugjVdfckA/W1dtKkl55e+lBCmx+YyZDONhIlCD Km/LBM8j6S4bBKIaL7Dvy/U1JEe4Oqc= X-Google-Smtp-Source: ANB0VdbXpPTiUaBoWvmJ7r+ZxF3LE/4lV7Qc7el4EJNBGwfcxGJFhDygTQp5bfClZRL73x5yjUBJuw== X-Received: by 2002:a7b:c18a:: with SMTP id y10-v6mr11495031wmi.87.1537183870599; Mon, 17 Sep 2018 04:31:10 -0700 (PDT) Received: from localhost.localdomain ([95.144.165.37]) by smtp.gmail.com with ESMTPSA id o3-v6sm10781971wrn.58.2018.09.17.04.31.09 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 17 Sep 2018 04:31:09 -0700 (PDT) From: Tvrtko Ursulin X-Google-Original-From: Tvrtko Ursulin To: Intel-gfx@lists.freedesktop.org Date: Mon, 17 Sep 2018 12:30:58 +0100 Message-Id: <20180917113058.28994-7-tvrtko.ursulin@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> References: <20180917113058.28994-1-tvrtko.ursulin@linux.intel.com> Subject: [Intel-gfx] [PATCH 6/6] drm/i915/icl: Support co-existence between per-context SSEU and OA X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP From: Tvrtko Ursulin When OA is active we want to lock the powergating configuration, but on Icelake, users like the media stack will have issues if we lock to the full device configuration. Instead lock to a subset of (sub)slices which are currently a known working configuration for all users. v2: * Fix commit message spelling. Signed-off-by: Tvrtko Ursulin Cc: Lionel Landwerlin Reviewed-by: Lionel Landwerlin --- drivers/gpu/drm/i915/intel_lrc.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index b5603e977a3f..cded1f1d9ec2 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -2521,13 +2521,28 @@ u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *req_sseu) /* * If i915/perf is active, we want a stable powergating configuration - * on the system. The most natural configuration to take in that case - * is the default (i.e maximum the hardware can do). + * on the system. + * + * We could choose full enablement, but on ICL we know there are use + * cases which disable slices for functional, apart for performance + * reasons. So in this case we select a known stable subset. */ - if (unlikely(i915->perf.oa.exclusive_stream)) - ctx_sseu = intel_device_default_sseu(i915); - else + if (!i915->perf.oa.exclusive_stream) { ctx_sseu = *req_sseu; + } else { + ctx_sseu = intel_device_default_sseu(i915); + + if (IS_GEN11(i915)) { + /* + * We only need subslice count so it doesn't matter + * which ones we select - just turn of low bits in the + * amount of half of all available subslices per slice. + */ + ctx_sseu.subslice_mask = + ~(~0 << (hweight8(ctx_sseu.subslice_mask) / 2)); + ctx_sseu.slice_mask = 0x1; + } + } slices = hweight8(ctx_sseu.slice_mask); subslices = hweight8(ctx_sseu.subslice_mask);