From patchwork Mon Apr 17 06:24:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yang, Fei" X-Patchwork-Id: 13213323 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 173D6C77B77 for ; Mon, 17 Apr 2023 06:24:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 05DDD10E35D; Mon, 17 Apr 2023 06:23:54 +0000 (UTC) Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8F0E710E26F; Mon, 17 Apr 2023 06:23:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1681712630; x=1713248630; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=OQPdZJ6uq86fDdelr+3CJppVufDNxUytLNp6b889BE4=; b=FCtgq0ubkC7bTrWy6l2uyhQ0Y5egDzpYTqL+SQWfkMPRHSvr0zFT5nvV We9GLED2Cun9A/4tEaFajaRoTWvWj0lLGyJvrfFJjPm3XrZ0de5l7yRHz BobcraZgIhXtBiySw0e4rRyv+IZuiVUKSRr6U7504tZRDhusjPmioGFZU 6uDlXeAJfJXU3S78YVysNKp4YytyGIj2XP2E5QYfbTivAMQm9sIKsR33s 3iVA8zwL5I0CCmND20UzCfz7sfDFGgvfF/AJ+kxbpMQ8R6OPp/HERx3Sk gJ7nhu/G20uRspvoStpXoR42jDXu9tL5KDr5e9qTu+9zmWyjY/1c7QmCZ w==; X-IronPort-AV: E=McAfee;i="6600,9927,10682"; a="347552378" X-IronPort-AV: E=Sophos;i="5.99,203,1677571200"; d="scan'208";a="347552378" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Apr 2023 23:23:48 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10682"; a="721007615" X-IronPort-AV: E=Sophos;i="5.99,203,1677571200"; d="scan'208";a="721007615" Received: from fyang16-desk.jf.intel.com ([10.24.96.243]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Apr 2023 23:23:48 -0700 From: fei.yang@intel.com To: intel-gfx@lists.freedesktop.org Subject: [PATCH 4/8] drm/i915/mtl: workaround coherency issue for Media Date: Sun, 16 Apr 2023 23:24:59 -0700 Message-Id: <20230417062503.1884465-5-fei.yang@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230417062503.1884465-1-fei.yang@intel.com> References: <20230417062503.1884465-1-fei.yang@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Fei Yang , dri-devel@lists.freedesktop.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Fei Yang This patch implements Wa_22016122933. In MTL, memory writes initiated by Media tile update the whole cache line even for partial writes. This creates a coherency problem for cacheable memory if both CPU and GPU are writing data to different locations within a single cache line. CTB communication is impacted by this issue because the head and tail pointers are adjacent words within a cache line (see struct guc_ct_buffer_desc), where one is written by GuC and the other by the host. This patch circumvents the issue by making CPU/GPU shared memory uncacheable (WC on CPU side, and PAT index 2 for GPU). Also for CTB which is being updated by both CPU and GuC, mfence instruction is added to make sure the CPU writes are visible to GPU right away (flush the write combining buffer). While fixing the CTB issue, we noticed some random GSC firmware loading failure because the share buffers are cacheable (WB) on CPU side but uncached on GPU side. To fix these issues we need to map such shared buffers as WC on CPU side. Since such allocations are not all done through GuC allocator, to avoid too many code changes, the i915_coherent_map_type() is now hard coded to return WC for MTL. BSpec: 45101 Signed-off-by: Fei Yang Reviewed-by: Andi Shyti Acked-by: Nirmoy Das --- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 5 ++++- drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 13 +++++++++++++ drivers/gpu/drm/i915/gt/uc/intel_guc.c | 7 +++++++ drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 18 ++++++++++++------ 4 files changed, 36 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index ecd86130b74f..89fc8ea6bcfc 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -469,7 +469,10 @@ enum i915_map_type i915_coherent_map_type(struct drm_i915_private *i915, struct drm_i915_gem_object *obj, bool always_coherent) { - if (i915_gem_object_is_lmem(obj)) + /* + * Wa_22016122933: always return I915_MAP_WC for MTL + */ + if (i915_gem_object_is_lmem(obj) || IS_METEORLAKE(i915)) return I915_MAP_WC; if (HAS_LLC(i915) || always_coherent) return I915_MAP_WB; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c index 1d9fdfb11268..236673c02f9a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c @@ -110,6 +110,13 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc) if (obj->base.size < gsc->fw.size) return -ENOSPC; + /* + * Wa_22016122933: For MTL the shared memory needs to be mapped + * as WC on CPU side and UC (PAT index 2) on GPU side + */ + if (IS_METEORLAKE(i915)) + i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); + dst = i915_gem_object_pin_map_unlocked(obj, i915_coherent_map_type(i915, obj, true)); if (IS_ERR(dst)) @@ -125,6 +132,12 @@ static int gsc_fw_load_prepare(struct intel_gsc_uc *gsc) memset(dst, 0, obj->base.size); memcpy(dst, src, gsc->fw.size); + /* + * Wa_22016122933: Making sure the data in dst is + * visible to GSC right away + */ + intel_guc_write_barrier(>->uc.guc); + i915_gem_object_unpin_map(gsc->fw.obj); i915_gem_object_unpin_map(obj); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c b/drivers/gpu/drm/i915/gt/uc/intel_guc.c index d76508fa3af7..f9bddaa876d9 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c @@ -743,6 +743,13 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc *guc, u32 size) if (IS_ERR(obj)) return ERR_CAST(obj); + /* + * Wa_22016122933: For MTL the shared memory needs to be mapped + * as WC on CPU side and UC (PAT index 2) on GPU side + */ + if (IS_METEORLAKE(gt->i915)) + i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE); + vma = i915_vma_instance(obj, >->ggtt->vm, NULL); if (IS_ERR(vma)) goto err; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index 1803a633ed64..98e682b7df07 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -415,12 +415,6 @@ static int ct_write(struct intel_guc_ct *ct, } GEM_BUG_ON(tail > size); - /* - * make sure H2G buffer update and LRC tail update (if this triggering a - * submission) are visible before updating the descriptor tail - */ - intel_guc_write_barrier(ct_to_guc(ct)); - /* update local copies */ ctb->tail = tail; GEM_BUG_ON(atomic_read(&ctb->space) < len + GUC_CTB_HDR_LEN); @@ -429,6 +423,12 @@ static int ct_write(struct intel_guc_ct *ct, /* now update descriptor */ WRITE_ONCE(desc->tail, tail); + /* + * make sure H2G buffer update and LRC tail update (if this triggering a + * submission) are visible before updating the descriptor tail + */ + intel_guc_write_barrier(ct_to_guc(ct)); + return 0; corrupted: @@ -902,6 +902,12 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg) /* now update descriptor */ WRITE_ONCE(desc->head, head); + /* + * Wa_22016122933: Making sure the head update is + * visible to GuC right away + */ + intel_guc_write_barrier(ct_to_guc(ct)); + return available - len; corrupted: