From patchwork Fri May 20 23:04:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matt Roper X-Patchwork-Id: 12857523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2A635C433F5 for ; Fri, 20 May 2022 23:04:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8819910EC26; Fri, 20 May 2022 23:04:21 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3D6D110E62E; Fri, 20 May 2022 23:04:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1653087860; x=1684623860; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=r6XPrPvos+yKEW+yrB4wzSkta5zzFat+i9rOpqOHLv0=; b=gywnpjSuZeJ+TXoy5c1J6/HN6LSOvyoOT4e0HFfT57EyJUi9o7nO8Cne uSSrvaS0085r/acq+q9T+z7GG1pSx2NvQLeBHHuyh4BXdLXBfRuVtk8wR NN9yTk1Pman+Y6kr3sYcjJcodK4vY2T48jbo21q86U3/Td+glFufDQ2JT cTXGbMZRSV2cnkZAJe+5+XSorUv6ll5aW6Rxgs+FilIG/nPqBPL++FHAm VDUPPFxIr49Iroei0h2MgyiwQxUsw04ca49CiAX+rLN5iR1CoEGQGFCvA irGZQCZr4aGimsbpM/IBK7ng0lpE8UtbcChB+5MeBNVUGZHyqkN8vSJ9P g==; X-IronPort-AV: E=McAfee;i="6400,9594,10353"; a="272472169" X-IronPort-AV: E=Sophos;i="5.91,240,1647327600"; d="scan'208";a="272472169" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 May 2022 16:04:18 -0700 X-IronPort-AV: E=Sophos;i="5.91,240,1647327600"; d="scan'208";a="599482025" Received: from mdroper-desk1.fm.intel.com ([10.1.27.134]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 20 May 2022 16:04:14 -0700 From: Matt Roper To: intel-gfx@lists.freedesktop.org Date: Fri, 20 May 2022 16:04:03 -0700 Message-Id: <20220520230408.3787166-2-matthew.d.roper@intel.com> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20220520230408.3787166-1-matthew.d.roper@intel.com> References: <20220520230408.3787166-1-matthew.d.roper@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v4 1/6] drm/i915/xehp: Use separate sseu init function X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: dri-devel@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Xe_HP has enough fundamental differences from previous platforms that it makes sense to use a separate SSEU init function to keep things straightforward and easy to understand. We'll also add a has_xehp_dss flag to the SSEU structure that will be used by other upcoming changes. v2: - Add has_xehp_dss flag Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_sseu.c | 86 ++++++++++++++++------------ drivers/gpu/drm/i915/gt/intel_sseu.h | 5 ++ 2 files changed, 54 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index fdd25691beda..b5fd479a7b85 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -169,13 +169,43 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en, sseu->eu_total = compute_eu_total(sseu); } -static void gen12_sseu_info_init(struct intel_gt *gt) +static void xehp_sseu_info_init(struct intel_gt *gt) { struct sseu_dev_info *sseu = >->info.sseu; struct intel_uncore *uncore = gt->uncore; u32 g_dss_en, c_dss_en = 0; u16 eu_en = 0; u8 eu_en_fuse; + int eu; + + /* + * The concept of slice has been removed in Xe_HP. To be compatible + * with prior generations, assume a single slice across the entire + * device. Then calculate out the DSS for each workload type within + * that software slice. + */ + intel_sseu_set_info(sseu, 1, 32, 16); + sseu->has_xehp_dss = 1; + + g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); + c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE); + + eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK; + + for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++) + if (eu_en_fuse & BIT(eu)) + eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1); + + gen11_compute_sseu_info(sseu, 0x1, g_dss_en, c_dss_en, eu_en); +} + +static void gen12_sseu_info_init(struct intel_gt *gt) +{ + struct sseu_dev_info *sseu = >->info.sseu; + struct intel_uncore *uncore = gt->uncore; + u32 g_dss_en; + u16 eu_en = 0; + u8 eu_en_fuse; u8 s_en; int eu; @@ -183,43 +213,23 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS. * Instead of splitting these, provide userspace with an array * of DSS to more closely represent the hardware resource. - * - * In addition, the concept of slice has been removed in Xe_HP. - * To be compatible with prior generations, assume a single slice - * across the entire device. Then calculate out the DSS for each - * workload type within that software slice. */ - if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) - intel_sseu_set_info(sseu, 1, 32, 16); - else - intel_sseu_set_info(sseu, 1, 6, 16); + intel_sseu_set_info(sseu, 1, 6, 16); - /* - * As mentioned above, Xe_HP does not have the concept of a slice. - * Enable one for software backwards compatibility. - */ - if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) - s_en = 0x1; - else - s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & - GEN11_GT_S_ENA_MASK; + s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & + GEN11_GT_S_ENA_MASK; g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); - if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) - c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE); /* one bit per pair of EUs */ - if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) - eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK; - else - eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & - GEN11_EU_DIS_MASK); + eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & + GEN11_EU_DIS_MASK); for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++) if (eu_en_fuse & BIT(eu)) eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1); - gen11_compute_sseu_info(sseu, s_en, g_dss_en, c_dss_en, eu_en); + gen11_compute_sseu_info(sseu, s_en, g_dss_en, 0, eu_en); /* TGL only supports slice-level power gating */ sseu->has_slice_pg = 1; @@ -574,18 +584,20 @@ void intel_sseu_info_init(struct intel_gt *gt) { struct drm_i915_private *i915 = gt->i915; - if (IS_HASWELL(i915)) - hsw_sseu_info_init(gt); - else if (IS_CHERRYVIEW(i915)) - cherryview_sseu_info_init(gt); - else if (IS_BROADWELL(i915)) - bdw_sseu_info_init(gt); - else if (GRAPHICS_VER(i915) == 9) - gen9_sseu_info_init(gt); - else if (GRAPHICS_VER(i915) == 11) - gen11_sseu_info_init(gt); + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + xehp_sseu_info_init(gt); else if (GRAPHICS_VER(i915) >= 12) gen12_sseu_info_init(gt); + else if (GRAPHICS_VER(i915) >= 11) + gen11_sseu_info_init(gt); + else if (GRAPHICS_VER(i915) >= 9) + gen9_sseu_info_init(gt); + else if (IS_BROADWELL(i915)) + bdw_sseu_info_init(gt); + else if (IS_CHERRYVIEW(i915)) + cherryview_sseu_info_init(gt); + else if (IS_HASWELL(i915)) + hsw_sseu_info_init(gt); } u32 intel_sseu_make_rpcs(struct intel_gt *gt, diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h index 5c078df4729c..4a041f9dc490 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.h +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h @@ -66,6 +66,11 @@ struct sseu_dev_info { u8 has_slice_pg:1; u8 has_subslice_pg:1; u8 has_eu_pg:1; + /* + * For Xe_HP and beyond, the hardware no longer has traditional slices + * so we just report the entire DSS pool under a fake "slice 0." + */ + u8 has_xehp_dss:1; /* Topology fields */ u8 max_slices;