Message ID | 20230119194955.2426167-1-alan.previn.teres.alexis@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/1] drm/i915/gsc: Fix the Driver-FLR completion |
On Thu, Jan 19, 2023 at 11:49:55AM -0800, Alan Previn wrote:
> The Driver-FLR flow may inadvertently exit early before the full
> completion of the re-init of the internal HW state if we only poll
> GU_DEBUG Bit31 (polling for it to toggle from 0 -> 1). Instead
> we need a two-step completion wait-for-completion flow that also
> involves GU_CNTL. See the patch and new code comments for detail.
> This is new direction from HW architecture folks.

Do we have this documented anywhere?

but the patch looks good to me...

>
> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
> Fixes: 5a44fcd73498 ("drm/i915/gsc: Do a driver-FLR on unload if GSC was loaded")
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 8dee9e62a73e..959869e2ff05 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -2748,6 +2748,12 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
>  	/* Trigger the actual Driver-FLR */
>  	intel_uncore_rmw_fw(uncore, GU_CNTL, 0, DRIVERFLR);
>
> +	/* Completion Step 1 - poll for 'CNTL-BIT31 = 0' wait for hw teardown to complete */
> +	ret = intel_wait_for_register_fw(uncore, GU_CNTL,
> +					 DRIVERFLR_STATUS, 0,
> +					 flr_timeout_ms);
> +
> +	/* Completion: Step 2 - poll for 'DEBUG-BIT31 = 1' for hw/fw re-init to complete */
>  	ret = intel_wait_for_register_fw(uncore, GU_DEBUG,
>  					 DRIVERFLR_STATUS, DRIVERFLR_STATUS,
>  					 flr_timeout_ms);
> @@ -2756,6 +2762,7 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
>  		return;
>  	}
>
> +	/* Write 1 to clear GU_DEBUG's sticky completion status bit */
>  	intel_uncore_write_fw(uncore, GU_DEBUG, DRIVERFLR_STATUS);
>  }
>
>
> base-commit: 0a0ee61784df01ac098a92bd43673ee30c629f13
> --
> 2.39.0
>
Forwarded offline. Let's hold off on R-B or merging until I verify that the
HW spec update is finalized to match this patch exactly (probably a minor
delay).

On Thu, 2023-01-19 at 14:57 -0500, Vivi, Rodrigo wrote:
> On Thu, Jan 19, 2023 at 11:49:55AM -0800, Alan Previn wrote:
> > The Driver-FLR flow may inadvertently exit early before the full
> > completion of the re-init of the internal HW state if we only poll
> > GU_DEBUG Bit31 (polling for it to toggle from 0 -> 1). Instead
> > we need a two-step completion wait-for-completion flow that also
> > involves GU_CNTL. See the patch and new code comments for detail.
> > This is new direction from HW architecture folks.
>
> Do we have this documented anywhere?
>
> but the patch looks good to me...
>
> >
> > Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
> > Fixes: 5a44fcd73498 ("drm/i915/gsc: Do a driver-FLR on unload if GSC was loaded")
> > ---
> >  drivers/gpu/drm/i915/intel_uncore.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > index 8dee9e62a73e..959869e2ff05 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -2748,6 +2748,12 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
> >  	/* Trigger the actual Driver-FLR */
> >  	intel_uncore_rmw_fw(uncore, GU_CNTL, 0, DRIVERFLR);
> >
> > +	/* Completion Step 1 - poll for 'CNTL-BIT31 = 0' wait for hw teardown to complete */
> > +	ret = intel_wait_for_register_fw(uncore, GU_CNTL,
> > +					 DRIVERFLR_STATUS, 0,
> > +					 flr_timeout_ms);
> > +
> > +	/* Completion: Step 2 - poll for 'DEBUG-BIT31 = 1' for hw/fw re-init to complete */
> >  	ret = intel_wait_for_register_fw(uncore, GU_DEBUG,
> >  					 DRIVERFLR_STATUS, DRIVERFLR_STATUS,
> >  					 flr_timeout_ms);
> > @@ -2756,6 +2762,7 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
> >  		return;
> >  	}
> >
> > +	/* Write 1 to clear GU_DEBUG's sticky completion status bit */
> >  	intel_uncore_write_fw(uncore, GU_DEBUG, DRIVERFLR_STATUS);
> >  }
> >
> >
> > base-commit: 0a0ee61784df01ac098a92bd43673ee30c629f13
> > --
> > 2.39.0
> >
> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Alan Previn
> Sent: Friday, January 20, 2023 1:20 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Vivi@freedesktop.org; dri-devel@lists.freedesktop.org; Teres Alexis, Alan Previn <alan.previn.teres.alexis@intel.com>; Vivi, Rodrigo <rodrigo.vivi@intel.com>
> Subject: [Intel-gfx] [PATCH 1/1] drm/i915/gsc: Fix the Driver-FLR completion
>
> The Driver-FLR flow may inadvertently exit early before the full
> completion of the re-init of the internal HW state if we only poll
> GU_DEBUG Bit31 (polling for it to toggle from 0 -> 1). Instead
> we need a two-step completion wait-for-completion flow that also
> involves GU_CNTL. See the patch and new code comments for detail.
> This is new direction from HW architecture folks.
>
> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
> Fixes: 5a44fcd73498 ("drm/i915/gsc: Do a driver-FLR on unload if GSC was loaded")
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 8dee9e62a73e..959869e2ff05 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -2748,6 +2748,12 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
>  	/* Trigger the actual Driver-FLR */
>  	intel_uncore_rmw_fw(uncore, GU_CNTL, 0, DRIVERFLR);
>
> +	/* Completion Step 1 - poll for 'CNTL-BIT31 = 0' wait for hw teardown to complete */
> +	ret = intel_wait_for_register_fw(uncore, GU_CNTL,
> +					 DRIVERFLR_STATUS, 0,
> +					 flr_timeout_ms);

We need an error check here: if the above wait times out, the wait below is
essentially a NOP, and the driver may return before completion of the FLR.

Thanks,
Anshuman Gupta.

> +
> +	/* Completion: Step 2 - poll for 'DEBUG-BIT31 = 1' for hw/fw re-init to complete */
>  	ret = intel_wait_for_register_fw(uncore, GU_DEBUG,
>  					 DRIVERFLR_STATUS, DRIVERFLR_STATUS,
>  					 flr_timeout_ms);
> @@ -2756,6 +2762,7 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
>  		return;
>  	}
>
> +	/* Write 1 to clear GU_DEBUG's sticky completion status bit */
>  	intel_uncore_write_fw(uncore, GU_DEBUG, DRIVERFLR_STATUS);
>  }
>
>
> base-commit: 0a0ee61784df01ac098a92bd43673ee30c629f13
> --
> 2.39.0
On Thu, 19 Jan 2023, Alan Previn <alan.previn.teres.alexis@intel.com> wrote:
> The Driver-FLR flow may inadvertently exit early before the full
> completion of the re-init of the internal HW state if we only poll
> GU_DEBUG Bit31 (polling for it to toggle from 0 -> 1). Instead
> we need a two-step completion wait-for-completion flow that also
> involves GU_CNTL. See the patch and new code comments for detail.
> This is new direction from HW architecture folks.
>
> Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
> Fixes: 5a44fcd73498 ("drm/i915/gsc: Do a driver-FLR on unload if GSC was loaded")
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 8dee9e62a73e..959869e2ff05 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -2748,6 +2748,12 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
>  	/* Trigger the actual Driver-FLR */
>  	intel_uncore_rmw_fw(uncore, GU_CNTL, 0, DRIVERFLR);
>
> +	/* Completion Step 1 - poll for 'CNTL-BIT31 = 0' wait for hw teardown to complete */

Please don't use comments to repeat what the code already says.

Here, you could just say, "Wait for hardware teardown to complete",
which describes what the code does at a higher level, but does not
duplicate any of it.

> +	ret = intel_wait_for_register_fw(uncore, GU_CNTL,
> +					 DRIVERFLR_STATUS, 0,
> +					 flr_timeout_ms);
> +
> +	/* Completion: Step 2 - poll for 'DEBUG-BIT31 = 1' for hw/fw re-init to complete */

"Wait for hardware/firmware re-init to complete"

>  	ret = intel_wait_for_register_fw(uncore, GU_DEBUG,
>  					 DRIVERFLR_STATUS, DRIVERFLR_STATUS,
>  					 flr_timeout_ms);
> @@ -2756,6 +2762,7 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
>  		return;
>  	}
>
> +	/* Write 1 to clear GU_DEBUG's sticky completion status bit */

"Clear sticky completion status" maybe?

>  	intel_uncore_write_fw(uncore, GU_DEBUG, DRIVERFLR_STATUS);
>  }
>
>
> base-commit: 0a0ee61784df01ac098a92bd43673ee30c629f13
On Fri, 2023-01-20 at 08:27 +0000, Gupta, Anshuman wrote:
>
>
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of Alan Previn
> > Sent: Friday, January 20, 2023 1:20 AM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: Vivi@freedesktop.org; dri-devel@lists.freedesktop.org; Teres Alexis, Alan Previn <alan.previn.teres.alexis@intel.com>; Vivi, Rodrigo <rodrigo.vivi@intel.com>
> > Subject: [Intel-gfx] [PATCH 1/1] drm/i915/gsc: Fix the Driver-FLR completion
> >

alan:snip..

> > +	/* Completion Step 1 - poll for 'CNTL-BIT31 = 0' wait for hw teardown to complete */
> > +	ret = intel_wait_for_register_fw(uncore, GU_CNTL,
> > +					 DRIVERFLR_STATUS, 0,
> > +					 flr_timeout_ms);
> We need an error check here: if the above wait times out, the wait below is
> essentially a NOP, and the driver may return before completion of the FLR.
>
> Thanks,
> Anshuman Gupta.

alan: my bad - good catch - will fix.

alan:snip..
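To illustrate the review point above, here is a minimal, self-contained C sketch of why the step-1 result needs its own check. It does not touch real hardware or i915 code: `struct flr_sim`, `wait_teardown()`, `wait_reinit()`, `flr_wait_unchecked()` and `flr_wait_checked()` are hypothetical names that stand in for the two `intel_wait_for_register_fw()` polls and simply return canned results. How the eventual fix is written is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the two polls; each result is 0 or -ETIMEDOUT. */
struct flr_sim {
	int teardown_result;	/* canned result of the step-1 poll (GU_CNTL) */
	int reinit_result;	/* canned result of the step-2 poll (GU_DEBUG) */
	int reinit_polled;	/* set to 1 once step 2 has been attempted */
};

static int wait_teardown(struct flr_sim *sim)
{
	return sim->teardown_result;
}

static int wait_reinit(struct flr_sim *sim)
{
	sim->reinit_polled = 1;
	return sim->reinit_result;
}

/* Shape of the posted patch: the step-1 result is overwritten by step 2. */
static int flr_wait_unchecked(struct flr_sim *sim)
{
	int ret;

	assert(sim != NULL);
	ret = wait_teardown(sim);
	ret = wait_reinit(sim);	/* a step-1 timeout is lost here */
	return ret;
}

/* With a check: a step-1 timeout is reported and step 2 is skipped. */
static int flr_wait_checked(struct flr_sim *sim)
{
	int ret;

	assert(sim != NULL);
	ret = wait_teardown(sim);
	if (ret) {
		fprintf(stderr, "teardown wait failed: %d\n", ret);
		return ret;
	}
	return wait_reinit(sim);
}
```

With `teardown_result` set to `-ETIMEDOUT` and `reinit_result` set to 0, `flr_wait_unchecked()` returns 0 even though teardown never completed, while `flr_wait_checked()` returns `-ETIMEDOUT` without attempting step 2.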
Thanks for reviewing - sounds good - will fix those comments up as per your
recommendation.

On Fri, 2023-01-20 at 11:14 +0200, Jani Nikula wrote:
> On Thu, 19 Jan 2023, Alan Previn <alan.previn.teres.alexis@intel.com> wrote:
> > The Driver-FLR flow may inadvertently exit early before the full
> > completion of the re-init of the internal HW state if we only poll
> > GU_DEBUG Bit31 (polling for it to toggle from 0 -> 1). Instead
> > we need a two-step completion wait-for-completion flow that also
> > involves GU_CNTL. See the patch and new code comments for detail.
> > This is new direction from HW architecture folks.
> >
> > Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
> > Fixes: 5a44fcd73498 ("drm/i915/gsc: Do a driver-FLR on unload if GSC was loaded")
> > ---
> >  drivers/gpu/drm/i915/intel_uncore.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> > index 8dee9e62a73e..959869e2ff05 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -2748,6 +2748,12 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
> >  	/* Trigger the actual Driver-FLR */
> >  	intel_uncore_rmw_fw(uncore, GU_CNTL, 0, DRIVERFLR);
> >
> > +	/* Completion Step 1 - poll for 'CNTL-BIT31 = 0' wait for hw teardown to complete */
>
> Please don't use comments to repeat what the code already says.
>
> Here, you could just say, "Wait for hardware teardown to complete",
> which describes what the code does at a higher level, but does not
> duplicate any of it.
>
> > +	ret = intel_wait_for_register_fw(uncore, GU_CNTL,
> > +					 DRIVERFLR_STATUS, 0,
> > +					 flr_timeout_ms);
> > +
> > +	/* Completion: Step 2 - poll for 'DEBUG-BIT31 = 1' for hw/fw re-init to complete */
>
> "Wait for hardware/firmware re-init to complete"
>
> >  	ret = intel_wait_for_register_fw(uncore, GU_DEBUG,
> >  					 DRIVERFLR_STATUS, DRIVERFLR_STATUS,
> >  					 flr_timeout_ms);
> > @@ -2756,6 +2762,7 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
> >  		return;
> >  	}
> >
> > +	/* Write 1 to clear GU_DEBUG's sticky completion status bit */
>
> "Clear sticky completion status" maybe?
>
> >  	intel_uncore_write_fw(uncore, GU_DEBUG, DRIVERFLR_STATUS);
> >  }
> >
> >
> > base-commit: 0a0ee61784df01ac098a92bd43673ee30c629f13
>
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 8dee9e62a73e..959869e2ff05 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -2748,6 +2748,12 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
 	/* Trigger the actual Driver-FLR */
 	intel_uncore_rmw_fw(uncore, GU_CNTL, 0, DRIVERFLR);
 
+	/* Completion Step 1 - poll for 'CNTL-BIT31 = 0' wait for hw teardown to complete */
+	ret = intel_wait_for_register_fw(uncore, GU_CNTL,
+					 DRIVERFLR_STATUS, 0,
+					 flr_timeout_ms);
+
+	/* Completion: Step 2 - poll for 'DEBUG-BIT31 = 1' for hw/fw re-init to complete */
 	ret = intel_wait_for_register_fw(uncore, GU_DEBUG,
 					 DRIVERFLR_STATUS, DRIVERFLR_STATUS,
 					 flr_timeout_ms);
@@ -2756,6 +2762,7 @@ static void driver_initiated_flr(struct intel_uncore *uncore)
 		return;
 	}
 
+	/* Write 1 to clear GU_DEBUG's sticky completion status bit */
 	intel_uncore_write_fw(uncore, GU_DEBUG, DRIVERFLR_STATUS);
 }
 
The Driver-FLR flow may inadvertently exit early before the full
completion of the re-init of the internal HW state if we only poll
GU_DEBUG Bit31 (polling for it to toggle from 0 -> 1). Instead
we need a two-step completion wait-for-completion flow that also
involves GU_CNTL. See the patch and new code comments for detail.
This is new direction from HW architecture folks.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Fixes: 5a44fcd73498 ("drm/i915/gsc: Do a driver-FLR on unload if GSC was loaded")
---
 drivers/gpu/drm/i915/intel_uncore.c | 7 +++++++
 1 file changed, 7 insertions(+)


base-commit: 0a0ee61784df01ac098a92bd43673ee30c629f13
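
The two-step completion wait described in the commit message can be sketched against a toy register model. This is a simulation only, assuming (as the patch comments state) that GU_CNTL bit 31 reads 0 once teardown is done and GU_DEBUG bit 31 reads 1 once re-init is done. `struct toy_hw`, `toy_hw_tick()`, `toy_wait()` and `toy_driver_flr()` are hypothetical names, timeouts are counted in polls rather than milliseconds, and the early return on a step-1 timeout follows the review feedback in this thread rather than the patch as posted.

```c
#include <assert.h>
#include <stdint.h>

#define FLR_BIT (UINT32_C(1) << 31)	/* bit 31, per the patch comments */

/* Toy model of the two registers; not the real i915 register layout. */
struct toy_hw {
	uint32_t gu_cntl;
	uint32_t gu_debug;
	int teardown_polls;	/* polls until GU_CNTL bit 31 clears */
	int reinit_polls;	/* further polls until GU_DEBUG bit 31 sets */
};

/* Advance the toy hardware by one poll interval. */
static void toy_hw_tick(struct toy_hw *hw)
{
	if (hw->gu_cntl & FLR_BIT) {
		if (--hw->teardown_polls <= 0)
			hw->gu_cntl &= ~FLR_BIT;	/* teardown done */
	} else if (!(hw->gu_debug & FLR_BIT)) {
		if (--hw->reinit_polls <= 0)
			hw->gu_debug |= FLR_BIT;	/* re-init done */
	}
}

/* Poll until (*reg & mask) == value; returns 0 on success, -1 on timeout. */
static int toy_wait(struct toy_hw *hw, const uint32_t *reg,
		    uint32_t mask, uint32_t value, int max_polls)
{
	assert(max_polls > 0);
	for (int i = 0; i < max_polls; i++) {
		if ((*reg & mask) == value)
			return 0;
		toy_hw_tick(hw);
	}
	return ((*reg & mask) == value) ? 0 : -1;
}

/* Trigger, wait for teardown, wait for re-init, then clear the sticky bit. */
static int toy_driver_flr(struct toy_hw *hw, int max_polls)
{
	hw->gu_cntl |= FLR_BIT;	/* trigger (assumed to share bit 31) */

	/* Wait for hardware teardown to complete */
	if (toy_wait(hw, &hw->gu_cntl, FLR_BIT, 0, max_polls))
		return -1;

	/* Wait for hardware/firmware re-init to complete */
	if (toy_wait(hw, &hw->gu_debug, FLR_BIT, FLR_BIT, max_polls))
		return -1;

	hw->gu_debug &= ~FLR_BIT;	/* models the write-1-to-clear */
	return 0;
}
```

In this model, a `struct toy_hw` initialised with a short teardown and re-init delay completes with both bits clear, while one whose teardown outlasts `max_polls` returns -1 before the GU_DEBUG poll is ever started.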