Message ID | 20220627125928.177845-1-lionel.g.landwerlin@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v2] drm/i915/dg2: Add performance workaround 18019455067 |

On Mon, Jun 27, 2022 at 03:59:28PM +0300, Lionel Landwerlin wrote:
> The recommended number of stackIDs for Ray Tracing subsystem is 512
> rather than 2048 (default HW programming).
>
> v2: Move the programming to dg2_ctx_gt_tuning_init() (Lucas)

I'm not sure this is actually the correct move. As far as I can see on
bspec 46261, RT_CTRL isn't part of the engine's context, so we need to
make sure it gets added to engine->wa_list instead of
engine->ctx_wa_list, otherwise it won't be properly re-applied after
engine resets and such. Most of our other tuning values are part of the
context image, so this one is a bit unusual.

To get it onto the engine->wa_list, the workaround needs to either be
defined via rcs_engine_wa_init() or general_render_compute_wa_init().
The latter is the new, preferred location for registers that are part of
the render/compute reset domain, but that don't live in the RCS engine's
0x2xxx MMIO range (since all RCS and CCS engines get reset together, the
items in general_render_compute_wa_init() will make sure it's dealt with
as part of the handling for the first RCS/CCS engine, so that we won't
miss out on applying it if the platform doesn't have an RCS).

At the moment we don't have too many "tuning" values that we need to set
that aren't part of an engine's context, so we don't yet have a
dedicated "tuning" function for engine-style workarounds like we do with
ctx-style workarounds.


Matt

>
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h     | 4 ++++
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +++++
>  2 files changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> index 07ef111947b8c..12fc87b957425 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
> @@ -1112,6 +1112,10 @@
>  #define   GEN12_PUSH_CONST_DEREF_HOLD_DIS    REG_BIT(8)
>
>  #define RT_CTRL                              _MMIO(0xe530)
> +#define   RT_CTRL_NUMBER_OF_STACKIDS_MASK    REG_GENMASK(6, 5)
> +#define   NUMBER_OF_STACKIDS_512             2
> +#define   NUMBER_OF_STACKIDS_1024            1
> +#define   NUMBER_OF_STACKIDS_2048            0
>  #define   DIS_NULL_QUERY                     REG_BIT(10)
>
>  #define EU_PERF_CNTL1                        _MMIO(0xe558)
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 3213c593a55f4..4d80716b957d4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -575,6 +575,11 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
>                       FF_MODE2_TDS_TIMER_MASK,
>                       FF_MODE2_TDS_TIMER_128,
>                       0, false);
> +        wa_write_clr_set(wal,
> +                         RT_CTRL,
> +                         RT_CTRL_NUMBER_OF_STACKIDS_MASK,
> +                         REG_FIELD_PREP(RT_CTRL_NUMBER_OF_STACKIDS_MASK,
> +                                        NUMBER_OF_STACKIDS_512));
>  }
>
>  /*
> --
> 2.34.1
>

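The distinction Matt draws between engine->wa_list and engine->ctx_wa_list can be illustrated with a small stand-alone model. This is only a sketch and not i915 code: every toy_* name is hypothetical, and it assumes a reset returns the register to field value 0 (2048 stackIDs) and that only the engine-level list is replayed onto a register that is not saved in the context image.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: none of these names exist in i915. */
#define TOY_STACKIDS_SHIFT 5
#define TOY_STACKIDS_MASK  (0x3u << TOY_STACKIDS_SHIFT) /* bits 6:5, as in the patch */
#define TOY_STACKIDS_512   (0x2u << TOY_STACKIDS_SHIFT) /* field value 2, as in the patch */

struct toy_wa {
	uint32_t clr; /* bits to clear */
	uint32_t set; /* bits to set */
};

struct toy_engine {
	uint32_t rt_ctrl;             /* register that is NOT part of the context image */
	struct toy_wa wa_list[4];     /* replayed after an engine reset in this model */
	int wa_count;
	struct toy_wa ctx_wa_list[4]; /* in this model, only reaches context-saved registers */
	int ctx_wa_count;
};

static void toy_engine_reset(struct toy_engine *e)
{
	int i;

	e->rt_ctrl = 0; /* assumed hardware default: field value 0, i.e. 2048 stackIDs */
	for (i = 0; i < e->wa_count; i++)
		e->rt_ctrl = (e->rt_ctrl & ~e->wa_list[i].clr) | e->wa_list[i].set;
	/* ctx_wa_list is intentionally not replayed onto rt_ctrl here */
}

/* Returns the stackID field value seen after a reset for either placement. */
static unsigned int toy_field_after_reset(int use_engine_list)
{
	struct toy_engine e = { 0 };
	struct toy_wa wa = { TOY_STACKIDS_MASK, TOY_STACKIDS_512 };

	if (use_engine_list)
		e.wa_list[e.wa_count++] = wa;
	else
		e.ctx_wa_list[e.ctx_wa_count++] = wa;

	toy_engine_reset(&e);
	return (e.rt_ctrl & TOY_STACKIDS_MASK) >> TOY_STACKIDS_SHIFT;
}
```

In this model, toy_field_after_reset(0) leaves the field at the default value 0, while toy_field_after_reset(1) restores the value 2, which is the behaviour the review asks for.
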
On Wed, Jun 29, 2022 at 03:16:09PM -0700, Matt Roper wrote:
>On Mon, Jun 27, 2022 at 03:59:28PM +0300, Lionel Landwerlin wrote:
>> The recommended number of stackIDs for Ray Tracing subsystem is 512
>> rather than 2048 (default HW programming).
>>
>> v2: Move the programming to dg2_ctx_gt_tuning_init() (Lucas)
>
>I'm not sure this is actually the correct move. As far as I can see on
>bspec 46261, RT_CTRL isn't part of the engine's context, so we need to
>make sure it gets added to engine->wa_list instead of
>engine->ctx_wa_list, otherwise it won't be properly re-applied after
>engine resets and such. Most of our other tuning values are part of the
>context image, so this one is a bit unusual.
>
>To get it onto the engine->wa_list, the workaround needs to either be
>defined via rcs_engine_wa_init() or general_render_compute_wa_init().
>The latter is the new, preferred location for registers that are part of
>the render/compute reset domain, but that don't live in the RCS engine's
>0x2xxx MMIO range (since all RCS and CCS engines get reset together, the
>items in general_render_compute_wa_init() will make sure it's dealt with
>as part of the handling for the first RCS/CCS engine, so that we won't
>miss out on applying it if the platform doesn't have an RCS).
>
>At the moment we don't have too many "tuning" values that we need to set
>that aren't part of an engine's context, so we don't yet have a
>dedicated "tuning" function for engine-style workarounds like we do with
>ctx-style workarounds.

what I meant on my review was not to move it to dg2_ctx_gt_tuning_init(),
but rather to follow the same logic: we need an equivalent tuning
version for engine wa.

Lucas De Marchi

On 30/06/2022 01:16, Matt Roper wrote:
> On Mon, Jun 27, 2022 at 03:59:28PM +0300, Lionel Landwerlin wrote:
>> The recommended number of stackIDs for Ray Tracing subsystem is 512
>> rather than 2048 (default HW programming).
>>
>> v2: Move the programming to dg2_ctx_gt_tuning_init() (Lucas)
> I'm not sure this is actually the correct move. As far as I can see on
> bspec 46261, RT_CTRL isn't part of the engine's context, so we need to
> make sure it gets added to engine->wa_list instead of
> engine->ctx_wa_list, otherwise it won't be properly re-applied after
> engine resets and such. Most of our other tuning values are part of the
> context image, so this one is a bit unusual.
>
> To get it onto the engine->wa_list, the workaround needs to either be
> defined via rcs_engine_wa_init() or general_render_compute_wa_init().
> The latter is the new, preferred location for registers that are part of
> the render/compute reset domain, but that don't live in the RCS engine's
> 0x2xxx MMIO range (since all RCS and CCS engines get reset together, the
> items in general_render_compute_wa_init() will make sure it's dealt with
> as part of the handling for the first RCS/CCS engine, so that we won't
> miss out on applying it if the platform doesn't have an RCS).
>
> At the moment we don't have too many "tuning" values that we need to set
> that aren't part of an engine's context, so we don't yet have a
> dedicated "tuning" function for engine-style workarounds like we do with
> ctx-style workarounds.
>
>
> Matt

Thanks Matt,

I didn't pay attention to the register offset and that it's not
context/engine specific.

Moving it to general_render_compute_wa_init()

-Lionel

>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>>  drivers/gpu/drm/i915/gt/intel_gt_regs.h     | 4 ++++
>>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +++++
>>  2 files changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
>> index 07ef111947b8c..12fc87b957425 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
>> @@ -1112,6 +1112,10 @@
>>  #define   GEN12_PUSH_CONST_DEREF_HOLD_DIS    REG_BIT(8)
>>
>>  #define RT_CTRL                              _MMIO(0xe530)
>> +#define   RT_CTRL_NUMBER_OF_STACKIDS_MASK    REG_GENMASK(6, 5)
>> +#define   NUMBER_OF_STACKIDS_512             2
>> +#define   NUMBER_OF_STACKIDS_1024            1
>> +#define   NUMBER_OF_STACKIDS_2048            0
>>  #define   DIS_NULL_QUERY                     REG_BIT(10)
>>
>>  #define EU_PERF_CNTL1                        _MMIO(0xe558)
>> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> index 3213c593a55f4..4d80716b957d4 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> @@ -575,6 +575,11 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
>>                       FF_MODE2_TDS_TIMER_MASK,
>>                       FF_MODE2_TDS_TIMER_128,
>>                       0, false);
>> +        wa_write_clr_set(wal,
>> +                         RT_CTRL,
>> +                         RT_CTRL_NUMBER_OF_STACKIDS_MASK,
>> +                         REG_FIELD_PREP(RT_CTRL_NUMBER_OF_STACKIDS_MASK,
>> +                                        NUMBER_OF_STACKIDS_512));
>>  }
>>
>>  /*
>> --
>> 2.34.1
>>

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 07ef111947b8c..12fc87b957425 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1112,6 +1112,10 @@
 #define   GEN12_PUSH_CONST_DEREF_HOLD_DIS    REG_BIT(8)

 #define RT_CTRL                              _MMIO(0xe530)
+#define   RT_CTRL_NUMBER_OF_STACKIDS_MASK    REG_GENMASK(6, 5)
+#define   NUMBER_OF_STACKIDS_512             2
+#define   NUMBER_OF_STACKIDS_1024            1
+#define   NUMBER_OF_STACKIDS_2048            0
 #define   DIS_NULL_QUERY                     REG_BIT(10)

 #define EU_PERF_CNTL1                        _MMIO(0xe558)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 3213c593a55f4..4d80716b957d4 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -575,6 +575,11 @@ static void dg2_ctx_gt_tuning_init(struct intel_engine_cs *engine,
                      FF_MODE2_TDS_TIMER_MASK,
                      FF_MODE2_TDS_TIMER_128,
                      0, false);
+        wa_write_clr_set(wal,
+                         RT_CTRL,
+                         RT_CTRL_NUMBER_OF_STACKIDS_MASK,
+                         REG_FIELD_PREP(RT_CTRL_NUMBER_OF_STACKIDS_MASK,
+                                        NUMBER_OF_STACKIDS_512));
 }

 /*

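For readers who want to check the arithmetic behind the hunk above, here is a minimal user-space sketch of what a clear-and-set with a 6:5 field mask and field value 2 amounts to. The SKETCH_GENMASK, sketch_field_prep and sketch_clr_set names are hypothetical stand-ins written for this note; they are not the kernel's REG_GENMASK(), REG_FIELD_PREP() or wa_write_clr_set(), and they only imitate the bit manipulation under the assumption of a 32-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: 32-bit mask with bits high..low set (0 <= low <= high <= 31). */
#define SKETCH_GENMASK(high, low) \
	((uint32_t)((~0u >> (31 - (high))) & (~0u << (low))))

/* Hypothetical stand-in: place val into the field described by a non-zero mask. */
static uint32_t sketch_field_prep(uint32_t mask, uint32_t val)
{
	unsigned int shift = 0;

	/* The shift is the index of the lowest set bit of the mask. */
	while (!((mask >> shift) & 1u))
		shift++;
	return (val << shift) & mask;
}

/* Read-modify-write: clear the field, then set the new value, keeping other bits. */
static uint32_t sketch_clr_set(uint32_t old, uint32_t clr, uint32_t set)
{
	return (old & ~clr) | set;
}
```

With these helpers, SKETCH_GENMASK(6, 5) is 0x60 and sketch_field_prep(0x60, 2) is 0x40. Applying sketch_clr_set() to a register value of 0x400 (only bit 10, the DIS_NULL_QUERY position, set) gives 0x440, so bits outside the field are left untouched.
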
The recommended number of stackIDs for Ray Tracing subsystem is 512
rather than 2048 (default HW programming).

v2: Move the programming to dg2_ctx_gt_tuning_init() (Lucas)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     | 4 ++++
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +++++
 2 files changed, 9 insertions(+)