diff mbox series

drm/i915/rkl: Remove require_force_probe protection

Message ID 20201130124855.319226-1-tejaskumarx.surendrakumar.upadhyay@intel.com (mailing list archive)
State New, archived
Series drm/i915/rkl: Remove require_force_probe protection

Commit Message

Tejas Upadhyay Nov. 30, 2020, 12:48 p.m. UTC
Removing force probe protection from RKL platform. Did
not observe warnings, errors, flickering or any visual
defects while doing ordinary tasks like browsing and
editing documents in a two monitor setup.

Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c | 1 -
 1 file changed, 1 deletion(-)
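
For context, the flag removed here gates driver binding at probe time: while require_force_probe is set, the driver declines to bind unless the device's PCI ID is named in the i915.force_probe module parameter. The following is a minimal, simplified sketch of that check in standalone C, not the kernel's actual code. The names struct device_info, force_probe_matches() and probe() are hypothetical stand-ins, and 0x4c8a in the usage note is only an example RKL PCI ID.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the driver's per-platform info. */
struct device_info {
	const char *name;
	bool require_force_probe;
};

/*
 * Return true if devid is listed in a comma-separated list of hex PCI IDs,
 * or if the list is "*" (match every device). This models the module
 * parameter only loosely; the real parser is an assumption here.
 */
static bool force_probe_matches(unsigned int devid, const char *list)
{
	char buf[128];
	char *tok;

	if (!list || !*list)
		return false;
	if (strcmp(list, "*") == 0)
		return true;

	/* strtok() modifies its input, so work on a bounded local copy. */
	snprintf(buf, sizeof(buf), "%s", list);
	for (tok = strtok(buf, ","); tok; tok = strtok(NULL, ","))
		if (strtoul(tok, NULL, 16) == devid)
			return true;

	return false;
}

/* Refuse to bind a gated platform unless the user explicitly opted in. */
static int probe(const struct device_info *info, unsigned int devid,
		 const char *force_probe)
{
	if (info->require_force_probe &&
	    !force_probe_matches(devid, force_probe))
		return -ENODEV;

	return 0;
}
```

With the flag set, probe(&info, 0x4c8a, NULL) returns -ENODEV, while passing "4c8a" or "*" lets the probe proceed. With the flag cleared, as this patch does for RKL, the parameter is no longer consulted and the device binds by default.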

Comments

Chris Wilson Nov. 30, 2020, 1:01 p.m. UTC | #1
Quoting Tejas Upadhyay (2020-11-30 12:48:55)
> Removing force probe protection from RKL platform. Did
> not observe warnings, errors, flickering or any visual
> defects while doing ordinary tasks like browsing and
> editing documents in a two monitor setup.

Really? CI says differently.
https://gitlab.freedesktop.org/drm/intel/-/issues/2743
is severe HW failure, something fishy in the world of forcewake.
-Chris
Tejas Upadhyay Nov. 30, 2020, 2:45 p.m. UTC | #2
Hi Chris,

The failing test was not part of the BAT run; it ran in the CI resume run and failed there. However, the same test passed on a manual run. Please find the results attached.

Thanks,
Tejas

Chris Wilson Nov. 30, 2020, 3:01 p.m. UTC | #3
Quoting Surendrakumar Upadhyay, TejaskumarX (2020-11-30 14:45:14)
> Hi Chris,
> 
> The failing test was not part of the BAT run; it ran in the CI resume run and failed there. However, the same test passed on a manual run. Please find the results attached.

One pass versus a major failure is not satisfactory.

We can not say we are happy with the hardware/driver until it is
reliable, and forcewake is of fundamental importance for mmio access,
as well as execution.
-Chris
Tejas Upadhyay Dec. 3, 2020, 4:13 a.m. UTC | #4
Hi Jaswant,

Can you please re-run the resume run on CI as well as on a local setup and share the results here? If it passes a full resume run in either setup, we are good to go.

Thanks,
Tejas

Tejas Upadhyay Dec. 3, 2020, 11:07 a.m. UTC | #5
+ Jaswant

Chris Wilson Dec. 3, 2020, 11:18 a.m. UTC | #6
Quoting Surendrakumar Upadhyay, TejaskumarX (2020-12-03 04:13:57)
> Hi Jaswant,
> 
> Can you please re-run the resume run on CI as well as on a local setup and share the results here? If it passes a full resume run in either setup, we are good to go.

Acknowledge the bug as a critical failure [it is, the gpu is no longer
responding via mmio]. Root cause the failure, and fix/prevent it. We
cannot claim that the driver is functioning correctly while failures such
as the GPU dying have been hit by CI and no action has been taken.
-Chris
Kattamanchi, JaswanthX Dec. 4, 2020, 9:41 a.m. UTC | #7
Hi Tejas,

As per your request, I triggered a resume run on the RKL CI machine. The test cases Chris mentioned passed in this run. Please find the logs below for your reference.

Git ID : https://gitlab.freedesktop.org/drm/intel/-/issues/2743

igt@gem_exec_schedule@pi-ringfull@vcs0 : https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9432/re-rkl-1/igt@gem_exec_schedule@pi-ringfull@vcs0.html

igt@gem_exec_schedule@pi-common@vcs0 : https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9432/re-rkl-1/igt@gem_exec_schedule@pi-common@vcs0.html

Regards,
Jaswanth Kattamanchi

Chris Wilson Dec. 4, 2020, 9:52 a.m. UTC | #8
Quoting Kattamanchi, JaswanthX (2020-12-04 09:41:17)
> Hi Tejas,
> 
> As per your request, I triggered a resume run on the RKL CI machine. The test cases Chris mentioned passed in this run. Please find the logs below for your reference.

It is not particular to a testcase. HW failure rarely is.
-Chris
Tejas Upadhyay Dec. 7, 2020, 10:02 a.m. UTC | #9
Hi Chris,

Are the results below satisfactory?

Thanks,
Tejas

> -----Original Message-----
> From: Kattamanchi, JaswanthX <jaswanthx.kattamanchi@intel.com>
> Sent: 04 December 2020 15:11
> To: Surendrakumar Upadhyay, TejaskumarX
> <tejaskumarx.surendrakumar.upadhyay@intel.com>; Chris Wilson
> <chris@chris-wilson.co.uk>; Pandey, Hariom <hariom.pandey@intel.com>;
> intel-gfx@lists.freedesktop.org
> Cc: Naramasetti, LaxminarayanaX <laxminarayanax.naramasetti@intel.com>
> Subject: RE: [Intel-gfx] [PATCH] drm/i915/rkl: Remove require_force_probe
> protection
> 
> Hi Tejas,
> 
> As per your request triggered resume run on RKL CI machine, the testcases
> which chris mentioned were passing with this run, Please find the below logs
> for your reference
> 
> Git ID : https://gitlab.freedesktop.org/drm/intel/-/issues/2743
> 
> igt@gem_exec_schedule@pi-ringfull@vcs0 : https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9432/re-rkl-1/igt@gem_exec_schedule@pi-ringfull@vcs0.html
> 
> igt@gem_exec_schedule@pi-common@vcs0 : https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9432/re-rkl-1/igt@gem_exec_schedule@pi-common@vcs0.html
> 
> Regards,
> Jaswanth Kattamanchi
Pandey, Hariom Jan. 27, 2021, 3:10 p.m. UTC | #10
Hi Chris,

(i) Regarding your concern about the GPU dying issue (gitlab#2743): this issue has been resolved and has not been observed in the last 3 runs. The GitLab issue has been updated with the passing results and closed.
(ii) The Rocket Lake platform has been set up in public CI with the name "fi-rkl-11500t". This link shows the last few passing runs: https://intel-gfx-ci.01.org/tree/drm-tip/bat-all.html?

With the above progress, please confirm whether you are fine with merging this patch, which removes the RKL force probe flag.

Thanks
Hariom Pandey

Chris Wilson Jan. 27, 2021, 3:18 p.m. UTC | #11
Quoting Pandey, Hariom (2021-01-27 15:10:53)
> Hi Chris,
> 
> (i) Regarding your concern about the GPU dying issue (gitlab#2743): this issue has been resolved and has not been observed in the last 3 runs. The GitLab issue has been updated with the passing results and closed.
> (ii) The Rocket Lake platform has been set up in public CI with the name "fi-rkl-11500t". This link shows the last few passing runs: https://intel-gfx-ci.01.org/tree/drm-tip/bat-all.html?
> 
> With the above progress, please confirm whether you are fine with merging this patch, which removes the RKL force probe flag.

Now that we have some visibility in CI, those of us without rkl (who
_just_ see the bug reports) can all build up some confidence. From the
CI, it's looking good, but you want to wait for a few idle [full] runs to
get a true feel of the overall health.

So if people are happy that the scary forcewake error was truly a one off
and doesn't need any follow up, then I see nothing stopping us from
declaring ourselves in good shape -- barring a disastrous idle run.
-Chris
Chris Wilson Feb. 3, 2021, 11:43 a.m. UTC | #12
Quoting Tejas Upadhyay (2020-11-30 12:48:55)
> Removing force probe protection from RKL platform. Did
> not observe warnings, errors, flickering or any visual
> defects while doing ordinary tasks like browsing and
> editing documents in a two monitor setup.
> 
> Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>

We now have a system in CI and that appears quite promising,
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
Rodrigo Vivi Feb. 8, 2021, 10:16 p.m. UTC | #13
On Wed, Feb 03, 2021 at 11:43:13AM +0000, Chris Wilson wrote:
> Quoting Tejas Upadhyay (2020-11-30 12:48:55)
> > Removing force probe protection from RKL platform. Did
> > not observe warnings, errors, flickering or any visual
> > defects while doing ordinary tasks like browsing and
> > editing documents in a two monitor setup.
> > 
> > Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
> 
> We now have a system in CI and that appears quite promising,
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>

Indeed. Pulled, thanks!

> -Chris
Patch

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 11fe790b1969..665626d2524f 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -898,7 +898,6 @@  static const struct intel_device_info rkl_info = {
 	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
 	.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
 		BIT(TRANSCODER_C),
-	.require_force_probe = 1,
 	.display.has_hti = 1,
 	.display.has_psr_hw_tracking = 0,
 	.platform_engine_mask =