[v8,0/3] Disable automatic load CCS load balancing

Message ID 20240328073409.674098-1-andi.shyti@linux.intel.com (mailing list archive)

Message

Andi Shyti March 28, 2024, 7:34 a.m. UTC
Hi,

I think we are at the end of it and hopefully this is the last
version. Thanks Matt for having followed this series this far.

This series does basically two things:

1. Disables automatic load balancing as advised by the hardware
   workaround.

2. Assigns all the CCS slices to one single user engine. The user
   will then be able to query only one CCS engine.
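
To illustrate point 2, here is a minimal standalone sketch of
keeping only the first available CCS bit in an engine mask. It is
not the code from these patches: the bit layout (four CCS bits
starting at bit 16) and every sketch_* name are assumptions made
up for the example.

```c
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only: four CCS instances
 * occupy contiguous bits starting at SKETCH_CCS0_BIT in a 32-bit
 * engine mask. The real driver layout may differ.
 */
#define SKETCH_CCS0_BIT  16u
#define SKETCH_MAX_CCS   4u
#define SKETCH_CCS_MASK  (((1u << SKETCH_MAX_CCS) - 1u) << SKETCH_CCS0_BIT)

/* Return engine_mask with every CCS bit cleared except the lowest one set. */
static uint32_t sketch_keep_first_ccs(uint32_t engine_mask)
{
	uint32_t ccs = engine_mask & SKETCH_CCS_MASK;
	/* Isolate the lowest set CCS bit; stays 0 when no CCS is present. */
	uint32_t first = ccs & (~ccs + 1u);

	return (engine_mask & ~SKETCH_CCS_MASK) | first;
}
```

With this layout, a mask where CCS0 and CCS1 are fused off still
ends up with exactly one CCS bit set (the one for CCS2), and the
non-CCS bits are left untouched.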

Since v5 there is a new file, gt/intel_gt_ccs_mode.c, where I
added intel_gt_apply_ccs_mode(). In upcoming patches this file
will also contain the implementation of dynamic CCS mode setting.
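
As a rough sketch of what "assign all the CCS slices to one
engine" could look like as a register value, the function below
points every available compute slice at the first available CCS
and marks fused-off slices as unused. The field width (3 bits per
slice), the all-ones "unused" encoding, the slice count and all
sketch_* names are assumptions for illustration, not taken from
the patches or the hardware documentation.

```c
#include <stdint.h>

/* Assumed for illustration: 4 compute slices, one 3-bit field each. */
#define SKETCH_MAX_CSLICES   4u
#define SKETCH_FIELD_WIDTH   3u
/* Assumed encoding: an all-ones field means no engine serves the slice. */
#define SKETCH_FIELD_UNUSED  ((1u << SKETCH_FIELD_WIDTH) - 1u)

/* Build a mode value from a bitmask of available CCS instances (bit n = CCSn). */
static uint32_t sketch_ccs_mode_value(uint32_t ccs_mask)
{
	uint32_t mode = 0;
	unsigned int first_ccs = 0;
	unsigned int slice;

	/* Index of the lowest set bit, i.e. the first available CCS. */
	while (first_ccs < SKETCH_MAX_CSLICES && !(ccs_mask & (1u << first_ccs)))
		first_ccs++;

	for (slice = 0; slice < SKETCH_MAX_CSLICES; slice++) {
		/* Available slice: route to first_ccs. Fused slice: mark unused. */
		uint32_t field = (ccs_mask & (1u << slice)) ?
				 first_ccs : SKETCH_FIELD_UNUSED;

		mode |= field << (slice * SKETCH_FIELD_WIDTH);
	}

	return mode;
}
```

Under these assumptions a part with only CCS2 and CCS3 available
gets both of those slices routed to engine 2, while the fields for
slices 0 and 1 carry the "unused" marker.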

Thanks Tvrtko, Matt, John and Joonas for your reviews!

Andi

Changelog
=========
v7 -> v8
 - Just used a different way for removing the first instance of
   the CCS from the info->engine_mask, as suggested by Matt.

v6 -> v7
 - Find a more appropriate place to remove the CCS engines:
   remove them in init_engine_mask() instead of
   intel_engines_init_mmio(). (Thanks, Matt)
 - Add Michal's ACK, thanks Michal!

v5 -> v6 (thanks Matt for the suggestions in v6)
 - Remove the refactoring and the for_each_available_engine()
   macro and instead do not create the intel_engine_cs structure
   at all.
 - In patch 1 just a trivial reordering of the bit definitions.

v4 -> v5
 - Use the workaround framework to do all the CCS balancing
   settings in order to always apply the modes also when the
   engine resets. Put everything in its own specific function to
   be executed for the first CCS engine encountered. (Thanks
   Matt)
 - Calculate the CCS ID for the CCS mode as the first available
   CCS among all the engines (Thanks Matt)
 - Create the intel_gt_ccs_mode.c file to host the CCS
   configuration. We will have it ready for the next series.
 - Fix a selftest that was failing because it could not set CCS2.
 - Add the for_each_available_engine() macro to exclude CCS1+ and
   start using it in the hangcheck selftest.

v3 -> v4
 - Reword correctly the comment in the workaround
 - Fix a buffer overflow (Thanks Joonas)
 - Handle properly the fused engines when setting the CCS mode.

v2 -> v3
 - Simplified the algorithm for creating the list of the exported
   uabi engines. (Patch 1) (Thanks, Tvrtko)
 - Consider the fused engines when creating the uabi engine list
   (Patch 2) (Thanks, Matt)
 - Patch 4 now uses the refactoring from patch 1, with a cleaner
   outcome.

v1 -> v2
 - In Patch 1 use the correct workaround number (thanks Matt).
 - In Patch 2 do not add the extra CCS engines to the exposed
   UABI engine list and adapt the engine counting accordingly
   (thanks Tvrtko).
 - Reword the commit of Patch 2 (thanks John).

Andi Shyti (3):
  drm/i915/gt: Disable HW load balancing for CCS
  drm/i915/gt: Do not generate the command streamer for all the CCS
  drm/i915/gt: Enable only one CCS for compute workload

 drivers/gpu/drm/i915/Makefile               |  1 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 17 +++++++++
 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c | 39 +++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h | 13 +++++++
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     |  6 ++++
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 30 ++++++++++++++--
 6 files changed, 104 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_ccs_mode.h

Comments

Andi Shyti March 30, 2024, 12:14 a.m. UTC | #1
Hi,

On Sat, Mar 30, 2024 at 12:03:08AM -0000, Patchwork wrote:
> Patch Details
> 
> Series:  Disable automatic load CCS load balancing (rev14)
> URL:     https://patchwork.freedesktop.org/series/129951/
> State:   failure
> Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_129951v14/index.html
> 
> CI Bug Log - changes from CI_DRM_14506_full -> Patchwork_129951v14_full
> 
> Summary
> 
> FAILURE
> 
> Serious unknown changes coming with Patchwork_129951v14_full absolutely need
> to be verified manually.
> 
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_129951v14_full, please notify your bug team
> (I915-ci-infra@lists.freedesktop.org) to allow them
> to document this new failure mode, which will reduce false positives in CI.
> 
> Participating hosts (10 -> 9)
> 
> Missing (1): shard-snb-0
> 
> Possible new issues
> 
> Here are the unknown changes that may have been introduced in
> Patchwork_129951v14_full:
> 
> IGT changes
> 
> Possible regressions
> 
>   • igt@sysfs_heartbeat_interval@nopreempt@vcs0:
>       □ shard-dg2: NOTRUN -> INCOMPLETE

This looks unrelated. I also see from the previous shards tests
that I get some random failures.

I'm going ahead and merging this series.

Andi