From patchwork Mon Oct 7 16:52:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kumar Valsan, Prathap" X-Patchwork-Id: 11178081 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DDCF31575 for ; Mon, 7 Oct 2019 16:35:28 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C639F2067B for ; Mon, 7 Oct 2019 16:35:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C639F2067B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0500B89D7B; Mon, 7 Oct 2019 16:35:28 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0D8F789D5C for ; Mon, 7 Oct 2019 16:35:25 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Oct 2019 09:35:23 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.67,268,1566889200"; d="scan'208";a="344781342" Received: from pkumarva-server.ra.intel.com ([10.23.184.130]) by orsmga004.jf.intel.com with ESMTP; 07 Oct 2019 09:35:23 -0700 From: Prathap Kumar Valsan To: intel-gfx@lists.freedesktop.org Date: Mon, 7 Oct 2019 12:52:09 -0400 Message-Id: <20191007165209.31797-2-prathap.kumar.valsan@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20191007165209.31797-1-prathap.kumar.valsan@intel.com> References: <20191007165209.31797-1-prathap.kumar.valsan@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v3 1/1] drm/i915/ehl: Add sysfs interface to control class-of-service X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris P Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" To provide shared last-level-cache isolation to cpu workloads running concurrently with gpu workloads, the gpu allocation of cache lines needs to be restricted to certain ways. Currently GPU hardware supports four class-of-service(CLOS) levels and there is an associated way-mask for each CLOS. Each LLC MOCS register has a field to select the clos level. So in-order to globally set the gpu to a clos level, driver needs to program entire MOCS table. Hardware supports reading supported way-mask configuration for GPU using a bios pcode interface. This interface has two files--llc_clos_modes and llc_clos. The file llc_clos_modes is read only file and will list the available way masks. The file llc_clos is read/write and will show the currently active way mask and writing a new way mask will update the active way mask of the gpu. Note of Caution: Restricting cache ways using this mechanism presents a larger attack surface for side-channel attacks. Example usage: > cat /sys/class/drm/card0/llc_clos_modes 0xfff 0xfc0 0xc00 0x800 >cat /sys/class/drm/card0/llc_clos 0xfff Update to new clos echo "0x800" > /sys/class/drm/card0/llc_clos Signed-off-by: Prathap Kumar Valsan --- drivers/gpu/drm/i915/gt/intel_lrc.c | 7 + drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 1 + drivers/gpu/drm/i915/gt/intel_mocs.c | 216 +++++++++++++++++++++++- drivers/gpu/drm/i915/gt/intel_mocs.h | 6 +- drivers/gpu/drm/i915/i915_drv.h | 8 + drivers/gpu/drm/i915/i915_reg.h | 1 + drivers/gpu/drm/i915/i915_sysfs.c | 100 +++++++++++ 7 files changed, 337 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 468438fb47af..054051969d00 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -2137,6 +2137,13 @@ __execlists_update_reg_state(const struct intel_context *ce, intel_sseu_make_rpcs(engine->i915, &ce->sseu); i915_oa_init_reg_state(ce, engine); + /* + * Gen11+ wants to support update of LLC class-of-service via + * sysfs interface. CLOS is defined in MOCS registers and for + * Gen11, MOCS is part of context resgister state. + */ + if (IS_GEN(engine->i915, 11)) + intel_mocs_init_reg_state(ce); } } diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h index 06ab0276e10e..f07a6262217c 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h +++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h @@ -28,6 +28,7 @@ #define CTX_R_PWR_CLK_STATE (0x42 + 1) #define GEN9_CTX_RING_MI_MODE 0x54 +#define GEN11_CTX_GFX_MOCS_BASE 0x4F2 /* GEN12+ Reg State Context */ #define GEN12_CTX_BB_PER_CTX_PTR (0x12 + 1) diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index 728704bbbe18..a2669bee9b1b 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -26,6 +26,9 @@ #include "intel_gt.h" #include "intel_mocs.h" #include "intel_lrc.h" +#include "intel_lrc_reg.h" +#include "intel_sideband.h" +#include "gem/i915_gem_context.h" /* structures required */ struct drm_i915_mocs_entry { @@ -40,6 +43,7 @@ struct drm_i915_mocs_table { const struct drm_i915_mocs_entry *table; }; +#define ctx_mocsN(N) (GEN11_CTX_GFX_MOCS_BASE + 2 * (N) + 1) /* Defines for the tables (XXX_MOCS_0 - XXX_MOCS_63) */ #define _LE_CACHEABILITY(value) ((value) << 0) #define _LE_TGT_CACHE(value) ((value) << 2) @@ -51,6 +55,7 @@ struct drm_i915_mocs_table { #define LE_SCF(value) ((value) << 14) #define LE_COS(value) ((value) << 15) #define LE_SSE(value) ((value) << 17) +#define LE_COS_MASK GENMASK(16, 15) /* Defines for the tables (LNCFMOCS0 - LNCFMOCS31) - two entries per word */ #define L3_ESC(value) ((value) << 0) @@ -377,6 +382,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine) struct intel_gt *gt = engine->gt; struct intel_uncore *uncore = gt->uncore; struct drm_i915_mocs_table table; + unsigned int active_clos; unsigned int index; u32 unused_value; @@ -390,11 +396,16 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine) if (!get_mocs_settings(gt, &table)) return; + active_clos = engine->i915->clos.active_clos; /* Set unused values to PTE */ unused_value = table.table[I915_MOCS_PTE].control_value; + unused_value &= ~LE_COS_MASK; + unused_value |= FIELD_PREP(LE_COS_MASK, active_clos); for (index = 0; index < table.size; index++) { u32 value = get_entry_control(&table, index); + value &= ~LE_COS_MASK; + value |= FIELD_PREP(LE_COS_MASK, active_clos); intel_uncore_write_fw(uncore, mocs_register(engine->id, index), @@ -408,7 +419,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine) unused_value); } -static void intel_mocs_init_global(struct intel_gt *gt) +void intel_mocs_init_global(struct intel_gt *gt) { struct intel_uncore *uncore = gt->uncore; struct drm_i915_mocs_table table; @@ -442,6 +453,7 @@ static int emit_mocs_control_table(struct i915_request *rq, const struct drm_i915_mocs_table *table) { enum intel_engine_id engine = rq->engine->id; + unsigned int active_clos; unsigned int index; u32 unused_value; u32 *cs; @@ -449,8 +461,11 @@ static int emit_mocs_control_table(struct i915_request *rq, if (GEM_WARN_ON(table->size > table->n_entries)) return -ENODEV; + active_clos = rq->i915->clos.active_clos; /* Set unused values to PTE */ unused_value = table->table[I915_MOCS_PTE].control_value; + unused_value &= ~LE_COS_MASK; + unused_value |= FIELD_PREP(LE_COS_MASK, active_clos); cs = intel_ring_begin(rq, 2 + 2 * table->n_entries); if (IS_ERR(cs)) @@ -460,6 +475,8 @@ static int emit_mocs_control_table(struct i915_request *rq, for (index = 0; index < table->size; index++) { u32 value = get_entry_control(table, index); + value &= ~LE_COS_MASK; + value |= FIELD_PREP(LE_COS_MASK, active_clos); *cs++ = i915_mmio_reg_offset(mocs_register(engine, index)); *cs++ = value; @@ -625,10 +642,207 @@ int intel_mocs_emit(struct i915_request *rq) return 0; } +void intel_mocs_init_reg_state(const struct intel_context *ce) +{ + struct drm_i915_private *i915 = ce->engine->i915; + u32 *reg_state = ce->lrc_reg_state; + struct drm_i915_mocs_table t; + unsigned int active_clos; + u32 value; + int i; + + get_mocs_settings(ce->engine->gt, &t); + + active_clos = i915->clos.active_clos; + + if (active_clos == FIELD_GET(LE_COS_MASK, get_entry_control(&t, 0))) + return; + + for (i = 0; i < t.n_entries; i++) { + value = reg_state[ctx_mocsN(i)]; + value &= ~LE_COS_MASK; + value |= FIELD_PREP(LE_COS_MASK, active_clos); + reg_state[ctx_mocsN(i)] = value; + } +} + +static int +mocs_store_clos(struct i915_request *rq, + struct intel_context *ce) +{ + struct drm_i915_mocs_table t; + unsigned int count, active_clos, index; + u32 offset; + u32 value; + u32 *cs; + + if (!get_mocs_settings(rq->engine->gt, &t)) + return -ENODEV; + + count = t.n_entries; + active_clos = rq->i915->clos.active_clos; + cs = intel_ring_begin(rq, 4 * count); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + offset = i915_ggtt_offset(ce->state) + LRC_STATE_PN * PAGE_SIZE; + + for (index = 0; index < count; index++) { + value = ce->lrc_reg_state[ctx_mocsN(index)]; + value &= ~LE_COS_MASK; + value |= FIELD_PREP(LE_COS_MASK, active_clos); + + *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT; + *cs++ = offset + ctx_mocsN(index) * sizeof(uint32_t); + *cs++ = 0; + *cs++ = value; + } + + intel_ring_advance(rq, cs); + + return 0; +} + +static int modify_context_mocs(struct intel_context *ce) +{ + struct i915_request *rq; + int err; + + lockdep_assert_held(&ce->pin_mutex); + + rq = i915_request_create(ce->engine->kernel_context); + if (IS_ERR(rq)) + return PTR_ERR(rq); + + /* Serialise with the remote context */ + err = intel_context_prepare_remote_request(ce, rq); + if (err == 0) + err = mocs_store_clos(rq, ce); + + i915_request_add(rq); + return err; +} + +static int intel_mocs_configure_context(struct i915_gem_context *ctx) +{ + struct i915_gem_engines_iter it; + struct intel_context *ce; + int err = 0; + + for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) { + GEM_BUG_ON(ce == ce->engine->kernel_context); + + if (ce->engine->class != RENDER_CLASS) + continue; + + err = intel_context_lock_pinned(ce); + if (err) + break; + + if (intel_context_is_pinned(ce)) + err = modify_context_mocs(ce); + + intel_context_unlock_pinned(ce); + if (err) + break; + } + i915_gem_context_unlock_engines(ctx); + + return err; +} + +static int intel_mocs_configure_all_contexts(struct intel_gt *gt) +{ + struct drm_i915_private *i915 = gt->i915; + struct intel_engine_cs *engine; + struct i915_gem_context *ctx; + enum intel_engine_id id; + int err; + + lockdep_assert_held(&i915->drm.struct_mutex); + /* + * MOCS registers of render engine are context saved and restored to and + * from a context image. + * So for any MOCS update to reflect on the existing contexts requires + * updating the context image. + */ + list_for_each_entry(ctx, &i915->gem.contexts.list, link) { + if (ctx == i915->kernel_context) + continue; + + err = intel_mocs_configure_context(ctx); + if (err) + return err; + } + + /* + * After updating all other contexts, update render context image of + * kernel context. Also update the MOCS of non-render engines. + */ + + for_each_engine(engine, i915, id) { + struct i915_request *rq; + struct drm_i915_mocs_table t; + + rq = i915_request_create(engine->kernel_context); + if (IS_ERR(rq)) + return PTR_ERR(rq); + + get_mocs_settings(rq->engine->gt, &t); + err = emit_mocs_control_table(rq, &t); + if (err) { + i915_request_skip(rq, err); + i915_request_add(rq); + return err; + } + + i915_request_add(rq); + } + + return 0; +} + +int intel_mocs_update_clos(struct intel_gt *gt) +{ + return intel_mocs_configure_all_contexts(gt); +} + +static void intel_read_clos_way_mask(struct intel_gt *gt) +{ + struct drm_i915_private *i915 = gt->i915; + struct drm_i915_mocs_table table; + int ret, i; + u32 val; + + if (!get_mocs_settings(gt, &table)) + return; + + /* CLOS is same for all entries. So its enough to read one*/ + i915->clos.active_clos = FIELD_GET(LE_COS_MASK, + get_entry_control(&table, 0)); + for (i = 0; i < NUM_OF_CLOS; i++) { + val = i; + ret = sandybridge_pcode_read(i915, + ICL_PCODE_LLC_COS_WAY_MASK_INFO, + &val, NULL); + if (ret) { + DRM_ERROR("Mailbox read error = %d\n", ret); + return; + } + + i915->clos.way_mask[i] = val; + } + + i915->clos.support_way_mask_read = true; +} + void intel_mocs_init(struct intel_gt *gt) { intel_mocs_init_l3cc_table(gt); + if (IS_GEN(gt->i915, 11)) + intel_read_clos_way_mask(gt); + if (HAS_GLOBAL_MOCS_REGISTERS(gt->i915)) intel_mocs_init_global(gt); } diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.h b/drivers/gpu/drm/i915/gt/intel_mocs.h index 2ae816b7ca19..65566457408e 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.h +++ b/drivers/gpu/drm/i915/gt/intel_mocs.h @@ -49,13 +49,17 @@ * context handling keep the MOCS in step. */ -struct i915_request; struct intel_engine_cs; +struct intel_context; +struct i915_request; struct intel_gt; void intel_mocs_init(struct intel_gt *gt); void intel_mocs_init_engine(struct intel_engine_cs *engine); int intel_mocs_emit(struct i915_request *rq); +int intel_mocs_update_clos(struct intel_gt *gt); + +void intel_mocs_init_reg_state(const struct intel_context *ce); #endif diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index cde4c7fb5570..01c5f7fec634 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1604,6 +1604,14 @@ struct drm_i915_private { bool distrust_bios_wm; } wm; + /* Last Level Cache Class of Service */ + struct { + bool support_way_mask_read; + u8 active_clos; +#define NUM_OF_CLOS 4 + u16 way_mask[NUM_OF_CLOS]; + } clos; + struct dram_info { bool valid; bool is_16gb_dimm; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 6d67bd238cfe..0a7937234975 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -8859,6 +8859,7 @@ enum { #define ICL_PCODE_MEM_SUBSYSYSTEM_INFO 0xd #define ICL_PCODE_MEM_SS_READ_GLOBAL_INFO (0x0 << 8) #define ICL_PCODE_MEM_SS_READ_QGV_POINT_INFO(point) (((point) << 16) | (0x1 << 8)) +#define ICL_PCODE_LLC_COS_WAY_MASK_INFO 0x1d #define GEN6_PCODE_READ_D_COMP 0x10 #define GEN6_PCODE_WRITE_D_COMP 0x11 #define HSW_PCODE_DE_WRITE_FREQ_REQ 0x17 diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c index bf039b8ba593..8344ad2380cc 100644 --- a/drivers/gpu/drm/i915/i915_sysfs.c +++ b/drivers/gpu/drm/i915/i915_sysfs.c @@ -36,6 +36,7 @@ #include "i915_sysfs.h" #include "intel_pm.h" #include "intel_sideband.h" +#include "gt/intel_mocs.h" static inline struct drm_i915_private *kdev_minor_to_i915(struct device *kdev) { @@ -255,6 +256,88 @@ static const struct bin_attribute dpf_attrs_1 = { .private = (void *)1 }; +static ssize_t llc_clos_show(struct device *kdev, + struct device_attribute *attr, char *buf) +{ + struct drm_i915_private *dev_priv = kdev_minor_to_i915(kdev); + ssize_t len = 0; + int active_clos; + + active_clos = dev_priv->clos.active_clos; + len += snprintf(buf + len, PAGE_SIZE, "0x%x\n", + dev_priv->clos.way_mask[active_clos]); + + return len; +} + +static ssize_t llc_clos_store(struct device *kdev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct drm_i915_private *dev_priv = kdev_minor_to_i915(kdev); + struct drm_device *dev = &dev_priv->drm; + u8 active_clos, new_clos, clos_index; + bool valid_mask = false; + ssize_t ret; + u16 way_mask; + + ret = kstrtou16(buf, 0, &way_mask); + if (ret) + return ret; + + active_clos = dev_priv->clos.active_clos; + + if (dev_priv->clos.way_mask[active_clos] == way_mask) + return count; + + for (clos_index = 0; clos_index < NUM_OF_CLOS; clos_index++) { + if (dev_priv->clos.way_mask[clos_index] == way_mask) { + new_clos = clos_index; + valid_mask = true; + break; + } + } + + if (!valid_mask) + return -EINVAL; + + ret = i915_mutex_lock_interruptible(dev); + if (ret) + return ret; + + dev_priv->clos.active_clos = new_clos; + ret = intel_mocs_update_clos(&dev_priv->gt); + if (ret) { + DRM_ERROR("Failed to update Class of service\n"); + dev_priv->clos.active_clos = active_clos; + mutex_unlock(&dev->struct_mutex); + return ret; + } + + mutex_unlock(&dev->struct_mutex); + + return count; +} + +static ssize_t llc_clos_modes_show(struct device *kdev, + struct device_attribute *attr, char *buf) +{ + struct drm_i915_private *dev_priv = kdev_minor_to_i915(kdev); + ssize_t len = 0; + int i; + + for (i = 0; i < NUM_OF_CLOS; i++) + len += snprintf(buf + len, PAGE_SIZE, "0x%x ", + dev_priv->clos.way_mask[i]); + + len += snprintf(buf + len, PAGE_SIZE, "\n"); + + return len; +} + +static DEVICE_ATTR_RW(llc_clos); +static DEVICE_ATTR_RO(llc_clos_modes); + static ssize_t gt_act_freq_mhz_show(struct device *kdev, struct device_attribute *attr, char *buf) { @@ -574,6 +657,18 @@ void i915_setup_sysfs(struct drm_i915_private *dev_priv) struct device *kdev = dev_priv->drm.primary->kdev; int ret; + if (dev_priv->clos.support_way_mask_read) { + ret = sysfs_create_file(&kdev->kobj, + &dev_attr_llc_clos.attr); + if (ret) + DRM_ERROR("LLC COS sysfs setup failed\n"); + + ret = sysfs_create_file(&kdev->kobj, + &dev_attr_llc_clos_modes.attr); + if (ret) + DRM_ERROR("LLC COS sysfs setup failed\n"); + } + #ifdef CONFIG_PM if (HAS_RC6(dev_priv)) { ret = sysfs_merge_group(&kdev->kobj, @@ -624,6 +719,11 @@ void i915_teardown_sysfs(struct drm_i915_private *dev_priv) i915_teardown_error_capture(kdev); + if (dev_priv->clos.support_way_mask_read) { + sysfs_remove_file(&kdev->kobj, &dev_attr_llc_clos.attr); + sysfs_remove_file(&kdev->kobj, &dev_attr_llc_clos_modes.attr); + } + if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) sysfs_remove_files(&kdev->kobj, vlv_attrs); else