From patchwork Thu Jul 6 23:27:01 2017
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 9829221
From: Ben Widawsky
To: Intel GFX, mesa-dev
Cc: Ben Widawsky
Date: Thu, 6 Jul 2017 16:27:01 -0700
Message-Id: <20170706232703.14229-2-ben@bwidawsk.net>
In-Reply-To: <20170706232703.14229-1-ben@bwidawsk.net>
References: <20170706232703.14229-1-ben@bwidawsk.net>
Subject: [Intel-gfx] [PATCH 1/1] drm/i915: Version the MOCS settings

Starting with GEN9, Memory Object Control State (MOCS) becomes an index
into a table as opposed to the direct programming within the command. The
table has 62 usable entries (i.e. 6 bits can represent all settings), and
each buffer type may use one of these 62 entries to describe cacheability
type, and age (and some other less useful fields).

Because we hadn't dealt with MOCS settings like this, we didn't think
ahead too well and have ended up with a mess for GEN9 (and soon GEN10)
platforms.

The plan for future platforms is that the ideal MOCS settings will be
determined, defined, and written in the public PRMs. After this point,
the i915.ko will absorb these settings and sometime afterwards flip the
alpha switch. All driver releases without the final MOCS table must be
considered alpha. From here on, userspace can assume the MOCS table is
definitively done. There will be some reserved entries for 'oh shit'
scenarios. This avoids versioning the MOCS table which leaves somewhat of
a mess in userspace trying to handle arbitrarily many MOCS versions.

But we do have a mess on GEN9. In the beginning, the MOCS table entries
were pre-populated by the hardware based on estimations made prior to
tapeout and we could just use that.
Subsequently much performance tuning was done to determine optimal
settings that the i915 driver should load on top of the hardware
defaults. That was posted last as v6 of the original per-engine MOCS
settings: https://patchwork.freedesktop.org/patch/53237/. Since the MOCS
table is not context saved/restored, it isn't feasible to let userspace
upload its own MOCS table. After a good amount of debate, it was decided
that we'd utilize only the minimal set of entries in mesa anyway, and so
we took only those entries for our MOCS entries.

Now we've come to the realization that indeed there are other MOCS
entries which are more optimal for various buffer types and workloads.
The problem is that the meaning of the indices is ABI (we assume index 0
is the uncached entry, and that there are only 3 entries total).

What this patch [simply] aims to do is expose a parameter to inform
userspace which "version" of the table was loaded by i915. Upon
sufficient data, new entries can be added, and the version can be bumped.

For example, from my original mesa mocs branch:

commit c9b0481bce24af032386701de0266eb5bc24e988
Author: Ben Widawsky
Date:   Fri Apr 8 10:21:16 2016 -0700

    i965: Use PTE mocs

    Signed-off-by: Ben Widawsky

diff --git a/src/mesa/drivers/dri/i965/brw_mocs.c b/src/mesa/drivers/dri/i965/brw_mocs.c
index 5df154eb86..b7bfdab671 100644
--- a/src/mesa/drivers/dri/i965/brw_mocs.c
+++ b/src/mesa/drivers/dri/i965/brw_mocs.c
@@ -14,6 +14,9 @@
 /* Skylake: MOCS is now an index into an array of 62 different caching
  * configurations programmed by the kernel.
  */
+
+/* TC=PTE, LeCC=PTE, LRUM=3, L3CC=WB */
+#define SKL_MOCS_PTE_PTE (3 << 1)
 /* TC=LLC/eLLC, LeCC=WB, LRUM=3, L3CC=WB */
 #define SKL_MOCS_WB (2 << 1)
 /* TC=LLC/eLLC, LeCC=PTE, LRUM=3, L3CC=WB */
@@ -26,6 +29,9 @@ brw_mocs_get_control_state(const struct brw_context *brw,
    switch (brw->gen) {
    default:
    case 9:
+      if (brw->intelScreen->mocs_version > 1)
+         return SKL_MOCS_PTE_PTE;
+
       return type == INTEL_MOCS_PTE ? SKL_MOCS_PTE : SKL_MOCS_WB;
    case 8:
       return type == INTEL_MOCS_PTE ? BDW_MOCS_PTE : BDW_MOCS_WB;

tl;dr: A versioned MOCS table will allow userspace to be aware of new and
potentially interesting cacheability settings. Next GEN platforms will
not be considered production worthy until the MOCS table is finalized.

v2: Update 1.5 year old patch. Add comments. Update commit message.

Signed-off-by: Ben Widawsky
---
 drivers/gpu/drm/i915/i915_drv.c |  3 +++
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/i915_pci.c | 13 +++++++++----
 include/uapi/drm/i915_drm.h     |  8 ++++++++
 4 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9167a73f3c69..26c27b6ae814 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -401,6 +401,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 		if (!value)
 			return -ENODEV;
 		break;
+	case I915_PARAM_MOCS_TABLE_VERSION:
+		value = INTEL_INFO(dev_priv)->mocs_version;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index effbe4f72a64..9b30f6e6ef9b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -859,6 +859,8 @@ struct intel_device_info {
 		u16 degamma_lut_size;
 		u16 gamma_lut_size;
 	} color;
+
+	u8 mocs_version;
 };
 
 struct intel_display_error_state;
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 04aaf553e3fa..9697eb91d972 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -356,7 +356,8 @@ static const struct intel_device_info intel_cherryview_info = {
 	.platform = INTEL_SKYLAKE, \
 	.has_csr = 1, \
 	.has_guc = 1, \
-	.ddb_size = 896
+	.ddb_size = 896, \
+	.mocs_version = MOCS_TABLE_VERSION
 
 static const struct intel_device_info intel_skylake_info = {
 	SKL_PLATFORM,
@@ -390,6 +391,7 @@ static const struct intel_device_info intel_skylake_gt3_info = {
 	.has_full_ppgtt = 1, \
 	.has_full_48bit_ppgtt = 1, \
 	.has_reset_engine = 1, \
+	.mocs_version = MOCS_TABLE_VERSION, \
 	GEN_DEFAULT_PIPEOFFSETS, \
 	IVB_CURSOR_OFFSETS, \
 	BDW_COLORS
@@ -413,7 +415,8 @@ static const struct intel_device_info intel_geminilake_info = {
 	.platform = INTEL_KABYLAKE, \
 	.has_csr = 1, \
 	.has_guc = 1, \
-	.ddb_size = 896
+	.ddb_size = 896, \
+	.mocs_version = MOCS_TABLE_VERSION
 
 static const struct intel_device_info intel_kabylake_info = {
 	KBL_PLATFORM,
@@ -431,7 +434,8 @@ static const struct intel_device_info intel_kabylake_gt3_info = {
 	.platform = INTEL_COFFEELAKE, \
 	.has_csr = 1, \
 	.has_guc = 1, \
-	.ddb_size = 896
+	.ddb_size = 896, \
+	.mocs_version = MOCS_TABLE_VERSION
 
 static const struct intel_device_info intel_coffeelake_info = {
 	CFL_PLATFORM,
@@ -448,7 +452,8 @@ static const struct intel_device_info intel_cannonlake_info = {
 	.platform = INTEL_CANNONLAKE,
 	.gen = 10,
 	.ddb_size = 1024,
-	.has_csr = 1,
+	.has_csr = 1, \
+	.mocs_version = MOCS_TABLE_VERSION
 };
 
 /*
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7ccbd6a2bbe0..2e370c40e48d 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -431,6 +431,14 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_BATCH_FIRST	 48
 
+/* What version of the MOCS table we have. For GEN9 GPUs, the PRM defined
+ * non-optimal settings for the MOCS table. As a result, we were required to use a
+ * small subset, and later add new settings. This param allows userspace to
+ * determine which settings are there.
+ */
+#define MOCS_TABLE_VERSION		 1 /* Build time MOCS table version */
+#define I915_PARAM_MOCS_TABLE_VERSION	 49
+
 typedef struct drm_i915_getparam {
 	__s32 param;
 	/*