From patchwork Wed Nov 20 16:41:20 2024
X-Patchwork-Submitter: Ville Syrjala
X-Patchwork-Id: 13881366
From: Ville Syrjala
To: intel-gfx@lists.freedesktop.org
Cc: stable@vger.kernel.org
Subject: [PATCH 1/4] drm/i915/dsb: Don't use indexed register writes needlessly
Date: Wed, 20 Nov 2024 18:41:20 +0200
Message-ID: <20241120164123.12706-2-ville.syrjala@linux.intel.com>
In-Reply-To: <20241120164123.12706-1-ville.syrjala@linux.intel.com>
References: <20241120164123.12706-1-ville.syrjala@linux.intel.com>

From: Ville Syrjälä

Turns out the DSB indexed register write command has rather significant
initial overhead compared to the normal MMIO write command. Based on
some quick experiments on TGL you have to write the register at least
~5 times for the indexed write command to come out ahead.
If you write the register fewer times than that, the MMIO write is
faster. So it seems my automagic indexed write logic was a bit
misguided. Go back to the original approach and only use indexed
writes for the cases we know will benefit from it (indexed LUT
register updates).

Currently we shouldn't have any cases where this truly matters (just
some rare double writes to the precision LUT index registers), but we
will need to switch the legacy LUT updates to write each LUT register
twice (to avoid some palette anti-collision logic troubles). This
would be close to the worst case for using indexed writes (two writes
per register, and 256 separate registers). Using the MMIO write
command should shave off around 30% of the execution time compared to
using the indexed write command.

Cc: stable@vger.kernel.org
Fixes: 34d8311f4a1c ("drm/i915/dsb: Re-instate DSB for LUT updates")
Fixes: 25ea3411bd23 ("drm/i915/dsb: Use non-posted register writes for legacy LUT")
Signed-off-by: Ville Syrjälä
---
drivers/gpu/drm/i915/display/intel_color.c | 51 +++++++++++++--------- drivers/gpu/drm/i915/display/intel_dsb.c | 19 ++++++-- drivers/gpu/drm/i915/display/intel_dsb.h | 2 + 3 files changed, 49 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c index 174753625bca..6ea3d5c58cb1 100644 --- a/drivers/gpu/drm/i915/display/intel_color.c +++ b/drivers/gpu/drm/i915/display/intel_color.c @@ -1343,6 +1343,17 @@ static void ilk_lut_write(const struct intel_crtc_state *crtc_state, intel_de_write_fw(display, reg, val); } +static void ilk_lut_write_indexed(const struct intel_crtc_state *crtc_state, + i915_reg_t reg, u32 val) +{ + struct intel_display *display = to_intel_display(crtc_state); + + if (crtc_state->dsb_color_vblank) + intel_dsb_reg_write_indexed(crtc_state->dsb_color_vblank, reg, val); + else + intel_de_write_fw(display, reg, val); +} + static void ilk_load_lut_8(const struct intel_crtc_state *crtc_state, const struct drm_property_blob *blob) { @@ -1458,8 +1469,8 @@ static void bdw_load_lut_10(const struct intel_crtc_state *crtc_state, prec_index); for (i = 0; i < lut_size; i++) - ilk_lut_write(crtc_state, PREC_PAL_DATA(pipe), - ilk_lut_10(&lut[i])); + ilk_lut_write_indexed(crtc_state, PREC_PAL_DATA(pipe), + ilk_lut_10(&lut[i])); /* * Reset the index, otherwise it prevents the legacy palette to be @@ -1612,16 +1623,16 @@ static void glk_load_degamma_lut(const struct intel_crtc_state *crtc_state, * ToDo: Extend to max 7.0. Enable 32 bit input value * as compared to just 16 to achieve this. */ - ilk_lut_write(crtc_state, PRE_CSC_GAMC_DATA(pipe), - DISPLAY_VER(display) >= 14 ? - mtl_degamma_lut(&lut[i]) : glk_degamma_lut(&lut[i])); + ilk_lut_write_indexed(crtc_state, PRE_CSC_GAMC_DATA(pipe), + DISPLAY_VER(display) >= 14 ? + mtl_degamma_lut(&lut[i]) : glk_degamma_lut(&lut[i])); } /* Clamp values > 1.0. */ while (i++ < glk_degamma_lut_size(display)) - ilk_lut_write(crtc_state, PRE_CSC_GAMC_DATA(pipe), - DISPLAY_VER(display) >= 14 ? - 1 << 24 : 1 << 16); + ilk_lut_write_indexed(crtc_state, PRE_CSC_GAMC_DATA(pipe), + DISPLAY_VER(display) >= 14 ?
+ 1 << 24 : 1 << 16); ilk_lut_write(crtc_state, PRE_CSC_GAMC_INDEX(pipe), 0); } @@ -1687,10 +1698,10 @@ icl_program_gamma_superfine_segment(const struct intel_crtc_state *crtc_state) for (i = 0; i < 9; i++) { const struct drm_color_lut *entry = &lut[i]; - ilk_lut_write(crtc_state, PREC_PAL_MULTI_SEG_DATA(pipe), - ilk_lut_12p4_ldw(entry)); - ilk_lut_write(crtc_state, PREC_PAL_MULTI_SEG_DATA(pipe), - ilk_lut_12p4_udw(entry)); + ilk_lut_write_indexed(crtc_state, PREC_PAL_MULTI_SEG_DATA(pipe), + ilk_lut_12p4_ldw(entry)); + ilk_lut_write_indexed(crtc_state, PREC_PAL_MULTI_SEG_DATA(pipe), + ilk_lut_12p4_udw(entry)); } ilk_lut_write(crtc_state, PREC_PAL_MULTI_SEG_INDEX(pipe), @@ -1726,10 +1737,10 @@ icl_program_gamma_multi_segment(const struct intel_crtc_state *crtc_state) for (i = 1; i < 257; i++) { entry = &lut[i * 8]; - ilk_lut_write(crtc_state, PREC_PAL_DATA(pipe), - ilk_lut_12p4_ldw(entry)); - ilk_lut_write(crtc_state, PREC_PAL_DATA(pipe), - ilk_lut_12p4_udw(entry)); + ilk_lut_write_indexed(crtc_state, PREC_PAL_DATA(pipe), + ilk_lut_12p4_ldw(entry)); + ilk_lut_write_indexed(crtc_state, PREC_PAL_DATA(pipe), + ilk_lut_12p4_udw(entry)); } /* @@ -1747,10 +1758,10 @@ icl_program_gamma_multi_segment(const struct intel_crtc_state *crtc_state) for (i = 0; i < 256; i++) { entry = &lut[i * 8 * 128]; - ilk_lut_write(crtc_state, PREC_PAL_DATA(pipe), - ilk_lut_12p4_ldw(entry)); - ilk_lut_write(crtc_state, PREC_PAL_DATA(pipe), - ilk_lut_12p4_udw(entry)); + ilk_lut_write_indexed(crtc_state, PREC_PAL_DATA(pipe), + ilk_lut_12p4_ldw(entry)); + ilk_lut_write_indexed(crtc_state, PREC_PAL_DATA(pipe), + ilk_lut_12p4_udw(entry)); } ilk_lut_write(crtc_state, PREC_PAL_INDEX(pipe), diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c index b7b44399adaa..4d3785f5cb52 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.c +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -273,16 +273,20 @@ static bool intel_dsb_prev_ins_is_indexed_write(struct intel_dsb *dsb, i915_reg_ } /** - * intel_dsb_reg_write() - Emit register wriite to the DSB context + * intel_dsb_reg_write_indexed() - Emit register wriite to the DSB context * @dsb: DSB context * @reg: register address. * @val: value. * * This function is used for writing register-value pair in command * buffer of DSB. + * + * Note that indexed writes are slower than normal MMIO writes + * for a small number (less than 5 or so) of writes to the same + * register. 
*/ -void intel_dsb_reg_write(struct intel_dsb *dsb, - i915_reg_t reg, u32 val) +void intel_dsb_reg_write_indexed(struct intel_dsb *dsb, + i915_reg_t reg, u32 val) { /* * For example the buffer will look like below for 3 dwords for auto @@ -340,6 +344,15 @@ void intel_dsb_reg_write(struct intel_dsb *dsb, } } +void intel_dsb_reg_write(struct intel_dsb *dsb, + i915_reg_t reg, u32 val) +{ + intel_dsb_emit(dsb, val, + (DSB_OPCODE_MMIO_WRITE << DSB_OPCODE_SHIFT) | + (DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) | + i915_mmio_reg_offset(reg)); +} + static u32 intel_dsb_mask_to_byte_en(u32 mask) { return (!!(mask & 0xff000000) << 3 | diff --git a/drivers/gpu/drm/i915/display/intel_dsb.h b/drivers/gpu/drm/i915/display/intel_dsb.h index 33e0fc2ab380..da6df07a3c83 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.h +++ b/drivers/gpu/drm/i915/display/intel_dsb.h @@ -34,6 +34,8 @@ void intel_dsb_finish(struct intel_dsb *dsb); void intel_dsb_cleanup(struct intel_dsb *dsb); void intel_dsb_reg_write(struct intel_dsb *dsb, i915_reg_t reg, u32 val); +void intel_dsb_reg_write_indexed(struct intel_dsb *dsb, + i915_reg_t reg, u32 val); void intel_dsb_reg_write_masked(struct intel_dsb *dsb, i915_reg_t reg, u32 mask, u32 val); void intel_dsb_noop(struct intel_dsb *dsb, int count);
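For reference, a minimal caller-side sketch of the rule of thumb this patch is built on: plain DSB MMIO writes for registers touched only a few times, the indexed variant only when the same register is written roughly five or more times back to back. The helper name, the threshold constant and the LUT loop below are illustrative assumptions; only the two intel_dsb_reg_write*() entry points come from the patch itself.

/* Illustrative only: rough break-even point observed on TGL. */
#define EXAMPLE_DSB_INDEXED_WRITE_MIN	5

static void example_write_pal_data(struct intel_dsb *dsb, enum pipe pipe,
				   const u32 *lut, int lut_size)
{
	int i;

	if (lut_size >= EXAMPLE_DSB_INDEXED_WRITE_MIN) {
		/* many back-to-back writes to one register: indexed wins */
		for (i = 0; i < lut_size; i++)
			intel_dsb_reg_write_indexed(dsb, PREC_PAL_DATA(pipe), lut[i]);
	} else {
		/* only a handful of writes: plain MMIO writes are faster */
		for (i = 0; i < lut_size; i++)
			intel_dsb_reg_write(dsb, PREC_PAL_DATA(pipe), lut[i]);
	}
}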
From patchwork Wed Nov 20 16:41:21 2024
X-Patchwork-Submitter: Ville Syrjala
X-Patchwork-Id: 13881367
From: Ville Syrjala
To: intel-gfx@lists.freedesktop.org
Cc: stable@vger.kernel.org
Subject: [PATCH 2/4] drm/i915/color: Stop using non-posted DSB writes for legacy LUT
Date: Wed, 20 Nov 2024 18:41:21 +0200
Message-ID: <20241120164123.12706-3-ville.syrjala@linux.intel.com>
In-Reply-To: <20241120164123.12706-1-ville.syrjala@linux.intel.com>
References: <20241120164123.12706-1-ville.syrjala@linux.intel.com>

From: Ville Syrjälä

DSB LUT register writes vs. palette anti-collision logic appear to
interact in interesting ways:
- posted DSB writes simply vanish into thin air while anti-collision
  is active
- non-posted DSB writes actually get blocked by the anti-collision
  logic, but unfortunately this ends up hogging the bus for long
  enough that unrelated parallel CPU MMIO accesses start to disappear
  instead

Even though we are updating the LUT during vblank we aren't immune to
the anti-collision logic because it kicks in briefly for pipe prefill
(initiated at frame start). The safe time window for performing the
LUT update is thus between the undelayed vblank and frame start.
Turns out that with low enough CDCLK frequency (DSB execution speed
depends on CDCLK) we can exceed that. Since we are currently using
non-posted writes for the legacy LUT updates, we can hit the far more
severe failure mode. The problem is exacerbated by the fact that
non-posted writes are much slower than posted writes (~4x it seems).

To mitigate the problem let's switch to using posted DSB writes for
legacy LUT updates (which will involve using the double write
approach to avoid other problems with DSB vs. legacy LUT writes).
Despite writing each register twice this will in fact make the legacy
LUT update faster when compared to the non-posted write approach,
making the problem less likely to appear. The failure mode is also
less severe.

This isn't the 100% solution we need though. That will involve
estimating how long the LUT update will take, and pushing frame start
and/or delayed vblank forward to guarantee that the update will have
finished by the time the pipe prefill starts...
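As a back-of-the-envelope illustration of the estimation idea above (not part of this patch): with the DSB running at CDCLK, the update time scales with the number of writes, while the safe window is only a few scanlines between the undelayed vblank and frame start. Every constant below is an assumed placeholder (in particular the CDCLK cycles per DSB write), chosen only to show how a low CDCLK can make 512 legacy LUT writes miss the window; the helper is hypothetical and not i915 code.

#include <stdio.h>

static int lut_update_fits(int num_writes, int cdclk_khz,
			   int htotal, int crtc_clock_khz, int window_lines)
{
	const int cycles_per_write = 8;	/* ASSUMPTION: CDCLK cycles per DSB MMIO write */
	/* DSB execution time in microseconds (the DSB runs at CDCLK) */
	int dsb_us = (num_writes * cycles_per_write * 1000 + cdclk_khz - 1) / cdclk_khz;
	/* undelayed vblank -> frame start window in microseconds */
	int window_us = (window_lines * htotal * 1000 + crtc_clock_khz - 1) / crtc_clock_khz;

	return dsb_us <= window_us;
}

int main(void)
{
	/* 256 legacy LUT entries written twice, 4k-ish timings, low CDCLK */
	printf("fits: %d\n", lut_update_fits(512, 172800, 4400, 594000, 2));
	return 0;
}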
Cc: stable@vger.kernel.org
Fixes: 34d8311f4a1c ("drm/i915/dsb: Re-instate DSB for LUT updates")
Fixes: 25ea3411bd23 ("drm/i915/dsb: Use non-posted register writes for legacy LUT")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12494
Signed-off-by: Ville Syrjälä
---
drivers/gpu/drm/i915/display/intel_color.c | 30 ++++++++++++++-------- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c index 6ea3d5c58cb1..7cd902bbd244 100644 --- a/drivers/gpu/drm/i915/display/intel_color.c +++ b/drivers/gpu/drm/i915/display/intel_color.c @@ -1368,19 +1368,29 @@ static void ilk_load_lut_8(const struct intel_crtc_state *crtc_state, lut = blob->data; /* - * DSB fails to correctly load the legacy LUT - * unless we either write each entry twice, - * or use non-posted writes + * DSB fails to correctly load the legacy LUT unless + * we either write each entry twice when using posted + * writes, or we use non-posted writes. + * + * If palette anti-collision is active during LUT + * register writes: + * - posted writes simply get dropped and thus the LUT + * contents may not be correctly updated + * - non-posted writes are blocked and thus the LUT + * contents are always correct, but simultaneous CPU + * MMIO access will start to fail + * + * Choose the lesser of two evils and use posted writes. + * Using posted writes is also faster, even when having + * to write each register twice. */ - if (crtc_state->dsb_color_vblank) - intel_dsb_nonpost_start(crtc_state->dsb_color_vblank); - - for (i = 0; i < 256; i++) + for (i = 0; i < 256; i++) { ilk_lut_write(crtc_state, LGC_PALETTE(pipe, i), i9xx_lut_8(&lut[i])); - - if (crtc_state->dsb_color_vblank) - intel_dsb_nonpost_end(crtc_state->dsb_color_vblank); + if (crtc_state->dsb_color_vblank) + ilk_lut_write(crtc_state, LGC_PALETTE(pipe, i), + i9xx_lut_8(&lut[i])); + } } static void ilk_load_lut_10(const struct intel_crtc_state *crtc_state,
From patchwork Wed Nov 20 16:41:22 2024
X-Patchwork-Submitter: Ville Syrjala
X-Patchwork-Id: 13881369
From: Ville Syrjala
To: intel-gfx@lists.freedesktop.org
Subject: [PATCH 3/4] drm/i915/dsb: Nuke the MMIO->indexed register write logic
Date: Wed, 20 Nov 2024 18:41:22 +0200
Message-ID: <20241120164123.12706-4-ville.syrjala@linux.intel.com>
In-Reply-To: <20241120164123.12706-1-ville.syrjala@linux.intel.com>
References: <20241120164123.12706-1-ville.syrjala@linux.intel.com>

From: Ville Syrjälä

We've determined that indexed DSB writes are only faster than MMIO
writes when writing the same register ~5 or more times. That seems
very unlikely to happen in any other case than when using indexed LUT
registers.

Simplify the code by removing the MMIO->indexed write conversion
logic and just emit the instruction as an indexed write from the
get-go.

Signed-off-by: Ville Syrjälä
---
drivers/gpu/drm/i915/display/intel_dsb.c | 58 ++++++------------------ 1 file changed, 14 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c index 4d3785f5cb52..e6f8fc743fb4 100644 --- a/drivers/gpu/drm/i915/display/intel_dsb.c +++ b/drivers/gpu/drm/i915/display/intel_dsb.c @@ -256,15 +256,6 @@ static bool intel_dsb_prev_ins_is_write(struct intel_dsb *dsb, return prev_opcode == opcode && prev_reg == i915_mmio_reg_offset(reg); } -static bool intel_dsb_prev_ins_is_mmio_write(struct intel_dsb *dsb, i915_reg_t reg) -{ - /* only full byte-enables can be converted to indexed writes */ - return intel_dsb_prev_ins_is_write(dsb, - DSB_OPCODE_MMIO_WRITE << DSB_OPCODE_SHIFT | - DSB_BYTE_EN << DSB_BYTE_EN_SHIFT, - reg); -} - static bool intel_dsb_prev_ins_is_indexed_write(struct intel_dsb *dsb, i915_reg_t reg) { return intel_dsb_prev_ins_is_write(dsb, @@ -273,7 +264,7 @@ static bool intel_dsb_prev_ins_is_indexed_write(struct intel_dsb *dsb, i915_reg_ } /** - * intel_dsb_reg_write_indexed() - Emit register wriite to the DSB context + * intel_dsb_reg_write_indexed() - Emit indexed register write to the DSB context * @dsb: DSB context * @reg: register address. * @val: value. @@ -304,44 +295,23 @@ void intel_dsb_reg_write_indexed(struct intel_dsb *dsb, * we are writing odd no of dwords, Zeros will be added in the end for * padding.
*/ - if (!intel_dsb_prev_ins_is_mmio_write(dsb, reg) && - !intel_dsb_prev_ins_is_indexed_write(dsb, reg)) { - intel_dsb_emit(dsb, val, - (DSB_OPCODE_MMIO_WRITE << DSB_OPCODE_SHIFT) | - (DSB_BYTE_EN << DSB_BYTE_EN_SHIFT) | + if (!intel_dsb_prev_ins_is_indexed_write(dsb, reg)) + intel_dsb_emit(dsb, 0, /* count */ + (DSB_OPCODE_INDEXED_WRITE << DSB_OPCODE_SHIFT) | i915_mmio_reg_offset(reg)); - } else { - if (!assert_dsb_has_room(dsb)) - return; - - /* convert to indexed write? */ - if (intel_dsb_prev_ins_is_mmio_write(dsb, reg)) { - u32 prev_val = dsb->ins[0]; - dsb->ins[0] = 1; /* count */ - dsb->ins[1] = (DSB_OPCODE_INDEXED_WRITE << DSB_OPCODE_SHIFT) | - i915_mmio_reg_offset(reg); - - intel_dsb_buffer_write(&dsb->dsb_buf, dsb->ins_start_offset + 0, - dsb->ins[0]); - intel_dsb_buffer_write(&dsb->dsb_buf, dsb->ins_start_offset + 1, - dsb->ins[1]); - intel_dsb_buffer_write(&dsb->dsb_buf, dsb->ins_start_offset + 2, - prev_val); - - dsb->free_pos++; - } + if (!assert_dsb_has_room(dsb)) + return; - intel_dsb_buffer_write(&dsb->dsb_buf, dsb->free_pos++, val); - /* Update the count */ - dsb->ins[0]++; - intel_dsb_buffer_write(&dsb->dsb_buf, dsb->ins_start_offset + 0, - dsb->ins[0]); + /* Update the count */ + dsb->ins[0]++; + intel_dsb_buffer_write(&dsb->dsb_buf, dsb->ins_start_offset + 0, + dsb->ins[0]); - /* if number of data words is odd, then the last dword should be 0.*/ - if (dsb->free_pos & 0x1) - intel_dsb_buffer_write(&dsb->dsb_buf, dsb->free_pos, 0); - } + intel_dsb_buffer_write(&dsb->dsb_buf, dsb->free_pos++, val); + /* if number of data words is odd, then the last dword should be 0.*/ + if (dsb->free_pos & 0x1) + intel_dsb_buffer_write(&dsb->dsb_buf, dsb->free_pos, 0); } void intel_dsb_reg_write(struct intel_dsb *dsb,
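For clarity, a standalone sketch (not driver code) of the dword layout the indexed-write path above maintains: a data-count dword, an opcode|offset dword, the data dwords, and a trailing zero so the instruction stays an even number of dwords long. The opcode value and shift below are assumptions for illustration, not values quoted from the source.

#include <stdint.h>
#include <stddef.h>

#define EX_DSB_OPCODE_INDEXED_WRITE	0x9	/* assumed, for illustration only */
#define EX_DSB_OPCODE_SHIFT		24	/* assumed, for illustration only */

/* Pack one indexed-write instruction; returns the number of dwords used. */
static size_t pack_indexed_write(uint32_t *buf, uint32_t reg_offset,
				 const uint32_t *vals, uint32_t count)
{
	size_t pos = 0;
	uint32_t i;

	buf[pos++] = count;	/* number of data dwords that follow the header */
	buf[pos++] = (EX_DSB_OPCODE_INDEXED_WRITE << EX_DSB_OPCODE_SHIFT) | reg_offset;
	for (i = 0; i < count; i++)
		buf[pos++] = vals[i];
	if (pos & 1)		/* pad to an even number of dwords */
		buf[pos++] = 0;
	return pos;
}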
From patchwork Wed Nov 20 16:41:23 2024
X-Patchwork-Submitter: Ville Syrjala
X-Patchwork-Id: 13881368
From: Ville Syrjala
To: intel-gfx@lists.freedesktop.org
Cc: Uma Shankar
Subject: [PATCH 4/4] drm/i915: Do state check for color management changes
Date: Wed, 20 Nov 2024 18:41:23 +0200
Message-ID: <20241120164123.12706-5-ville.syrjala@linux.intel.com>
In-Reply-To: <20241120164123.12706-1-ville.syrjala@linux.intel.com>
References: <20241120164123.12706-1-ville.syrjala@linux.intel.com>

From: Ville Syrjälä

In order to validate LUT programming more thoroughly let's do a state
check for all color management updates as well.

Not sure we really want this outside CI. It is rather heavy and color
management updates could become rather common with all the HDR/etc.
stuff happening. Maybe we should have an extra knob for this that we
could enable in CI?

v2: Skip for initial_commit to avoid FDI dotclock sanity checks/etc.
    tripping up

Reviewed-by: Uma Shankar
Signed-off-by: Ville Syrjälä
---
drivers/gpu/drm/i915/display/intel_modeset_verify.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_modeset_verify.c b/drivers/gpu/drm/i915/display/intel_modeset_verify.c index bc70e72ccc2e..f61641d27c4a 100644 --- a/drivers/gpu/drm/i915/display/intel_modeset_verify.c +++ b/drivers/gpu/drm/i915/display/intel_modeset_verify.c @@ -239,6 +239,8 @@ void intel_modeset_verify_crtc(struct intel_atomic_state *state, intel_atomic_get_new_crtc_state(state, crtc); if (!intel_crtc_needs_modeset(new_crtc_state) && + (!intel_crtc_needs_color_update(new_crtc_state) || + new_crtc_state->inherited) && !intel_crtc_needs_fastset(new_crtc_state)) return;
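One possible shape for the "extra knob" floated above, sketched as an ordinary module parameter guarding the heavier check. This is hypothetical and not part of the patch; a real version would presumably go through i915's usual parameter machinery (i915_params) rather than a bare module_param, which is used here only to keep the sketch short.

#include <linux/module.h>

static bool verify_color_mgmt;	/* hypothetical knob, off by default outside CI */
module_param(verify_color_mgmt, bool, 0600);
MODULE_PARM_DESC(verify_color_mgmt,
		 "Run the full crtc state check after color management updates");

	/* and the corresponding early-return in intel_modeset_verify_crtc() */
	if (!intel_crtc_needs_modeset(new_crtc_state) &&
	    (!verify_color_mgmt ||
	     !intel_crtc_needs_color_update(new_crtc_state) ||
	     new_crtc_state->inherited) &&
	    !intel_crtc_needs_fastset(new_crtc_state))
		return;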