From patchwork Fri Aug 5 13:22:39 2022
From: Matthew Auld
To: intel-gfx@lists.freedesktop.org
Cc: Thomas Hellström, dri-devel@lists.freedesktop.org
Date: Fri, 5 Aug 2022 14:22:39 +0100
Message-Id: <20220805132240.442747-1-matthew.auld@intel.com>
Subject: [Intel-gfx] [PATCH 1/2] drm/i915/ttm: remove calc_ctrl_surf_instr_size

We only ever need to emit one ccs block copy command.
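For reference, the reason a fixed ring allocation now works: each copy
pass is bounded by CHUNK_SZ, so the CCS payload per pass always fits in
a single XY_CTRL_SURF_COPY_BLT. A minimal standalone sketch of the
arithmetic (the 1:256 CCS ratio and the 8M CHUNK_SZ are assumptions
drawn from the flat-CCS scheme and the follow-up patch, not from this
diff; only the block constants mirror the removed helper):

  /* ccs_blocks.c -- illustrative only, not driver code */
  #include <stdio.h>

  #define CHUNK_SZ                (8u << 20) /* assumed max bytes per pass */
  #define NUM_BYTES_PER_CCS_BYTE  256u       /* assumed flat-CCS ratio */
  #define NUM_CCS_BYTES_PER_BLOCK 256u       /* as in the removed helper */
  #define NUM_CCS_BLKS_PER_XFER   1024u      /* as in the removed helper */

  int main(void)
  {
          unsigned int ccs_bytes = CHUNK_SZ / NUM_BYTES_PER_CCS_BYTE;  /* 32 KiB */
          unsigned int num_blks = ccs_bytes / NUM_CCS_BYTES_PER_BLOCK; /* 128 */

          /* 128 <= 1024, so one XY_CTRL_SURF_COPY_BLT always covers a pass */
          printf("blocks per pass: %u (limit %u)\n",
                 num_blks, NUM_CCS_BLKS_PER_XFER);
          return 0;
  }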
Signed-off-by: Matthew Auld
Cc: Thomas Hellström
Cc: Ramalingam C
Reviewed-by: Ramalingam C
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 35 +++-----------------------
 1 file changed, 3 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c
index 9a0814422ba4..1bbed7aa436a 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -511,44 +511,16 @@ static inline u32 *i915_flush_dw(u32 *cmd, u32 flags)
 	return cmd;
 }
 
-static u32 calc_ctrl_surf_instr_size(struct drm_i915_private *i915, int size)
-{
-	u32 num_cmds, num_blks, total_size;
-
-	if (!GET_CCS_BYTES(i915, size))
-		return 0;
-
-	/*
-	 * XY_CTRL_SURF_COPY_BLT transfers CCS in 256 byte
-	 * blocks. one XY_CTRL_SURF_COPY_BLT command can
-	 * transfer upto 1024 blocks.
-	 */
-	num_blks = DIV_ROUND_UP(GET_CCS_BYTES(i915, size),
-				NUM_CCS_BYTES_PER_BLOCK);
-	num_cmds = DIV_ROUND_UP(num_blks, NUM_CCS_BLKS_PER_XFER);
-	total_size = XY_CTRL_SURF_INSTR_SIZE * num_cmds;
-
-	/*
-	 * Adding a flush before and after XY_CTRL_SURF_COPY_BLT
-	 */
-	total_size += 2 * MI_FLUSH_DW_SIZE;
-
-	return total_size;
-}
-
 static int emit_copy_ccs(struct i915_request *rq,
 			 u32 dst_offset, u8 dst_access,
 			 u32 src_offset, u8 src_access, int size)
 {
 	struct drm_i915_private *i915 = rq->engine->i915;
 	int mocs = rq->engine->gt->mocs.uc_index << 1;
-	u32 num_ccs_blks, ccs_ring_size;
+	u32 num_ccs_blks;
 	u32 *cs;
 
-	ccs_ring_size = calc_ctrl_surf_instr_size(i915, size);
-	WARN_ON(!ccs_ring_size);
-
-	cs = intel_ring_begin(rq, round_up(ccs_ring_size, 2));
+	cs = intel_ring_begin(rq, 12);
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
 
@@ -583,8 +555,7 @@ static int emit_copy_ccs(struct i915_request *rq,
 		 FIELD_PREP(XY_CTRL_SURF_MOCS_MASK, mocs);
 
 	cs = i915_flush_dw(cs, MI_FLUSH_DW_LLC | MI_FLUSH_DW_CCS);
-	if (ccs_ring_size & 1)
-		*cs++ = MI_NOOP;
+	*cs++ = MI_NOOP;
 
 	intel_ring_advance(rq, cs);

From patchwork Fri Aug 5 13:22:40 2022
From: Matthew Auld
To: intel-gfx@lists.freedesktop.org
Cc: Thomas Hellström, dri-devel@lists.freedesktop.org
Date: Fri, 5 Aug 2022 14:22:40 +0100
Message-Id: <20220805132240.442747-2-matthew.auld@intel.com>
In-Reply-To: <20220805132240.442747-1-matthew.auld@intel.com>
References: <20220805132240.442747-1-matthew.auld@intel.com>
Subject: [Intel-gfx] [PATCH 2/2] drm/i915/ttm: fix CCS handling

Crucible + recent Mesa seems to sometimes hit:

GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER)

And it looks like we can also trigger this with gem_lmem_swapping, if
we modify the test to use slightly larger object sizes.

Looking closer it looks like we have the following issues in
migrate_copy():

  - We are using plain integers in various places, which we can easily
    overflow with a large object.

  - We pass the entire object size (when the src is lmem) into emit_pte()
    and then try to copy it, which doesn't work, since we only have a few
    fixed-size windows in which to map the pages and perform the copy.
    With an object > 8M we therefore aren't properly copying the pages.
    And then with an object > 64M we trigger the
    GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER).
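To make the first bullet concrete, a small standalone sketch (the
object and sg-entry sizes are invented for illustration; this is not
driver code) of how a 32-bit length accumulator, as the old
scatter_list_length() effectively used, silently wraps for a big
enough object:

  /* overflow.c -- hypothetical illustration, not driver code */
  #include <stdio.h>
  #include <stdint.h>

  int main(void)
  {
          uint64_t obj_size = 8ull << 30; /* assume an 8 GiB lmem object */
          uint32_t sg_len = 2u << 20;     /* assume 2 MiB per sg entry */
          uint32_t len32 = 0;             /* like the old int/u32 lengths */
          uint64_t len64 = 0;             /* like the new u64 lengths */

          for (uint64_t done = 0; done < obj_size; done += sg_len) {
                  len32 += sg_len;        /* wraps modulo 2^32: ends up 0 */
                  len64 += sg_len;        /* stays correct */
          }

          printf("u32 total: %u, u64 total: %llu\n",
                 len32, (unsigned long long)len64);
          return 0;
  }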
E=Sophos;i="5.93,216,1654585200"; d="scan'208";a="316089237" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2022 06:23:50 -0700 X-IronPort-AV: E=Sophos;i="5.93,216,1654585200"; d="scan'208";a="579497677" Received: from namitaga-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.213.224.55]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2022 06:23:49 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Date: Fri, 5 Aug 2022 14:22:40 +0100 Message-Id: <20220805132240.442747-2-matthew.auld@intel.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220805132240.442747-1-matthew.auld@intel.com> References: <20220805132240.442747-1-matthew.auld@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 2/2] drm/i915/ttm: fix CCS handling X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= , dri-devel@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Crucible + recent Mesa seems to sometimes hit: GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER) And it looks like we can also trigger this with gem_lmem_swapping, if we modify the test to use slightly larger object sizes. Looking closer it looks like we have the following issues in migrate_copy(): - We are using plain integer in various places, which we can easily overflow with a large object. - We pass the entire object size (when the src is lmem) into emit_pte() and then try to copy it, which doesn't work, since we only have a few fixed sized windows in which to map the pages and perform the copy. With an object > 8M we therefore aren't properly copying the pages. And then with an object > 64M we trigger the GEM_BUG_ON(num_ccs_blks > NUM_CCS_BLKS_PER_XFER). So it looks like our copy handling for any object > 8M (which is our CHUNK_SZ) is currently broken on DG2. Fixes: da0595ae91da ("drm/i915/migrate: Evict and restore the flatccs capable lmem obj") Testcase: igt@gem_lmem_swapping@basic-big Testcase: igt@gem_lmem_swapping@verify-ccs-big Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Ramalingam C Reviewed-by: Ramalingam C --- drivers/gpu/drm/i915/gt/intel_migrate.c | 44 ++++++++++++------------- 1 file changed, 21 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c index 1bbed7aa436a..aaaf1906026c 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.c +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -609,9 +609,9 @@ static int emit_copy(struct i915_request *rq, return 0; } -static int scatter_list_length(struct scatterlist *sg) +static u64 scatter_list_length(struct scatterlist *sg) { - int len = 0; + u64 len = 0; while (sg && sg_dma_len(sg)) { len += sg_dma_len(sg); @@ -621,28 +621,26 @@ static int scatter_list_length(struct scatterlist *sg) return len; } -static void +static int calculate_chunk_sz(struct drm_i915_private *i915, bool src_is_lmem, - int *src_sz, u32 bytes_to_cpy, u32 ccs_bytes_to_cpy) + u64 bytes_to_cpy, u64 ccs_bytes_to_cpy) { - if (ccs_bytes_to_cpy) { - if (!src_is_lmem) - /* - * When CHUNK_SZ is passed all the pages upto CHUNK_SZ - * will be taken for the blt. 
@@ -621,28 +621,26 @@ static int scatter_list_length(struct scatterlist *sg)
 	return len;
 }
 
-static void
+static int
 calculate_chunk_sz(struct drm_i915_private *i915, bool src_is_lmem,
-		   int *src_sz, u32 bytes_to_cpy, u32 ccs_bytes_to_cpy)
+		   u64 bytes_to_cpy, u64 ccs_bytes_to_cpy)
 {
-	if (ccs_bytes_to_cpy) {
-		if (!src_is_lmem)
-			/*
-			 * When CHUNK_SZ is passed all the pages upto CHUNK_SZ
-			 * will be taken for the blt. in Flat-ccs supported
-			 * platform Smem obj will have more pages than required
-			 * for main meory hence limit it to the required size
-			 * for main memory
-			 */
-			*src_sz = min_t(int, bytes_to_cpy, CHUNK_SZ);
-	} else { /* ccs handling is not required */
-		*src_sz = CHUNK_SZ;
-	}
+	if (ccs_bytes_to_cpy && !src_is_lmem)
+		/*
+		 * When CHUNK_SZ is passed all the pages upto CHUNK_SZ
+		 * will be taken for the blt. in Flat-ccs supported
+		 * platform Smem obj will have more pages than required
+		 * for main meory hence limit it to the required size
+		 * for main memory
+		 */
+		return min_t(u64, bytes_to_cpy, CHUNK_SZ);
+	else
+		return CHUNK_SZ;
 }
 
-static void get_ccs_sg_sgt(struct sgt_dma *it, u32 bytes_to_cpy)
+static void get_ccs_sg_sgt(struct sgt_dma *it, u64 bytes_to_cpy)
 {
-	u32 len;
+	u64 len;
 
 	do {
 		GEM_BUG_ON(!it->sg || !sg_dma_len(it->sg));
@@ -673,12 +671,12 @@ intel_context_migrate_copy(struct intel_context *ce,
 {
 	struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst), it_ccs;
 	struct drm_i915_private *i915 = ce->engine->i915;
-	u32 ccs_bytes_to_cpy = 0, bytes_to_cpy;
+	u64 ccs_bytes_to_cpy = 0, bytes_to_cpy;
 	enum i915_cache_level ccs_cache_level;
 	u32 src_offset, dst_offset;
 	u8 src_access, dst_access;
 	struct i915_request *rq;
-	int src_sz, dst_sz;
+	u64 src_sz, dst_sz;
 	bool ccs_is_src, overwrite_ccs;
 	int err;
 
@@ -761,8 +759,8 @@ intel_context_migrate_copy(struct intel_context *ce,
 		if (err)
 			goto out_rq;
 
-		calculate_chunk_sz(i915, src_is_lmem, &src_sz,
-				   bytes_to_cpy, ccs_bytes_to_cpy);
+		src_sz = calculate_chunk_sz(i915, src_is_lmem,
+					    bytes_to_cpy, ccs_bytes_to_cpy);
 
 		len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem,
 			       src_offset, src_sz);
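For context on why the per-pass clamp above matters, a simplified
standalone model of the chunked copy loop (the loop shape and sizes are
illustrative assumptions; the real loop in intel_context_migrate_copy()
also advances the sg iterators and emits a request per pass):

  /* chunking.c -- simplified model, not the driver's copy loop */
  #include <stdio.h>
  #include <stdint.h>

  #define CHUNK_SZ (8u << 20) /* the fixed mapping-window size */

  static uint64_t chunk_sz(uint64_t bytes_to_cpy)
  {
          /* mirrors the new clamp: never exceed one window per pass */
          return bytes_to_cpy < CHUNK_SZ ? bytes_to_cpy : CHUNK_SZ;
  }

  int main(void)
  {
          uint64_t bytes_to_cpy = 24ull << 20; /* e.g. a 24 MiB object */
          unsigned int pass = 0;

          while (bytes_to_cpy) {
                  uint64_t src_sz = chunk_sz(bytes_to_cpy);

                  /* prints three passes of 8 MiB for the 24 MiB example */
                  printf("pass %u: copy %llu bytes\n",
                         ++pass, (unsigned long long)src_sz);
                  bytes_to_cpy -= src_sz;
          }
          return 0;
  }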