Message ID | 20221128233014.4000136-6-matthew.d.roper@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | i915: dedicated MCR locking and hardware semaphore | expand |
On 28.11.2022 15:30, Matt Roper wrote: > PPAT setup involves a series of multicast writes. This can be optimized > slightly be acquiring forcewake and the steering lock just once for the > entire sequence. > > Suggested-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> > Signed-off-by: Matt Roper <matthew.d.roper@intel.com> > --- > drivers/gpu/drm/i915/gt/intel_gtt.c | 27 +++++++++++++++++++-------- > 1 file changed, 19 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c > index 2ba3983984b9..288d9f118ee9 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c > @@ -482,14 +482,25 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore) > > static void xehp_setup_private_ppat(struct intel_gt *gt) > { > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB); > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC); > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT); > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC); > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB); > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB); > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB); > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB); > + enum forcewake_domains fw; > + unsigned long flags; > + > + fw = intel_uncore_forcewake_for_reg(gt->uncore, _MMIO(XEHP_PAT_INDEX(0).reg), > + FW_REG_READ); I am not completely aware of forcewake implementation. I am wondering if the last parameter should be FW_REG_WRITE since it is register write which is happening later. Regards, Bala > + intel_uncore_forcewake_get(gt->uncore, fw); > + > + intel_gt_mcr_lock(gt, &flags); > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB); > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC); > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT); > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC); > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB); > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB); > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB); > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB); > + intel_gt_mcr_unlock(gt, flags); > + > + intel_uncore_forcewake_put(gt->uncore, fw); > } > > static void icl_setup_private_ppat(struct intel_uncore *uncore) > -- > 2.38.1 >
On Wed, Nov 30, 2022 at 09:21:07PM +0530, Balasubramani Vivekanandan wrote: > On 28.11.2022 15:30, Matt Roper wrote: > > PPAT setup involves a series of multicast writes. This can be optimized > > slightly be acquiring forcewake and the steering lock just once for the > > entire sequence. > > > > Suggested-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> > > Signed-off-by: Matt Roper <matthew.d.roper@intel.com> > > --- > > drivers/gpu/drm/i915/gt/intel_gtt.c | 27 +++++++++++++++++++-------- > > 1 file changed, 19 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c > > index 2ba3983984b9..288d9f118ee9 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c > > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c > > @@ -482,14 +482,25 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore) > > > > static void xehp_setup_private_ppat(struct intel_gt *gt) > > { > > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB); > > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC); > > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT); > > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC); > > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB); > > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB); > > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB); > > - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB); > > + enum forcewake_domains fw; > > + unsigned long flags; > > + > > + fw = intel_uncore_forcewake_for_reg(gt->uncore, _MMIO(XEHP_PAT_INDEX(0).reg), > > + FW_REG_READ); > > I am not completely aware of forcewake implementation. I am wondering if > the last parameter should be FW_REG_WRITE since it is register write > which is happening later. Yep, good catch. Using FW_REG_WRITE allows the driver to potentially skip obtaining forcewake and waking the hardware if the registers being written are "shadowed" so it's always good to use FW_REG_WRITE in places where we're only writing and not reading. In this case the specific registers in question don't appear to be part of the shadow table for any affected platforms (e.g., dg2_shadowed_regs[] and such in intel_uncore.c), so FW_REG_WRITE will still behave the same as FW_REG_READ here. But it's always possible that future platforms could decide to shadow these registers, so it's good to fix anyway; I just sent an updated copy of this patch making that change. Matt > > Regards, > Bala > > > + intel_uncore_forcewake_get(gt->uncore, fw); > > + > > + intel_gt_mcr_lock(gt, &flags); > > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB); > > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC); > > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT); > > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC); > > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB); > > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB); > > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB); > > + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB); > > + intel_gt_mcr_unlock(gt, flags); > > + > > + intel_uncore_forcewake_put(gt->uncore, fw); > > } > > > > static void icl_setup_private_ppat(struct intel_uncore *uncore) > > -- > > 2.38.1 > >
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 2ba3983984b9..288d9f118ee9 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -482,14 +482,25 @@ static void tgl_setup_private_ppat(struct intel_uncore *uncore) static void xehp_setup_private_ppat(struct intel_gt *gt) { - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB); - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC); - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT); - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC); - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB); - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB); - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB); - intel_gt_mcr_multicast_write(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB); + enum forcewake_domains fw; + unsigned long flags; + + fw = intel_uncore_forcewake_for_reg(gt->uncore, _MMIO(XEHP_PAT_INDEX(0).reg), + FW_REG_READ); + intel_uncore_forcewake_get(gt->uncore, fw); + + intel_gt_mcr_lock(gt, &flags); + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(0), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(1), GEN8_PPAT_WC); + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(2), GEN8_PPAT_WT); + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(3), GEN8_PPAT_UC); + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(4), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(5), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(6), GEN8_PPAT_WB); + intel_gt_mcr_multicast_write_fw(gt, XEHP_PAT_INDEX(7), GEN8_PPAT_WB); + intel_gt_mcr_unlock(gt, flags); + + intel_uncore_forcewake_put(gt->uncore, fw); } static void icl_setup_private_ppat(struct intel_uncore *uncore)
PPAT setup involves a series of multicast writes. This can be optimized slightly be acquiring forcewake and the steering lock just once for the entire sequence. Suggested-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> --- drivers/gpu/drm/i915/gt/intel_gtt.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-)