diff mbox series

[v2] drm/i915/DG{1, 2}: FIXME Temporary hammer to disable rpm

Message ID 20220914161329.2604-1-anshuman.gupta@intel.com (mailing list archive)
State New, archived
Headers show
Series [v2] drm/i915/DG{1, 2}: FIXME Temporary hammer to disable rpm | expand

Commit Message

Gupta, Anshuman Sept. 14, 2022, 4:13 p.m. UTC
DG1 and DG2 has lmem, and cpu can access the lmem objects
via mmap and i915 internal i915_gem_object_pin_map() for
i915 own usages. Both of these methods has pre-requisite
requirement to keep GFX PCI endpoint in D0 for a supported
iomem transaction over PCI link. (Refer PCIe specs 5.3.1.4.1)

Both DG1/DG2 have a hardware bug that violates the PCIe specs
and support the iomem read write transaction over PCIe bus despite
endpoint is D3 state.
Due to above H/W bug, we had never observed any issue with i915 runtime
PM versus lmem access.
But this issue becomes visible when PCIe gfx endpoint's upstream
bridge enters to D3, at this point any lmem read/write access will be
returned as unsupported request. But again this issue is not observed
on every platform because it has been observed on few host machines
DG1/DG2 endpoint's upstream bridge does not bind with pcieport driver.
which really disables the PCIe poer power savings and leaves the bridge
at D0 state.

TODO:
With respect to i915_gem_object_pin_map(), every caller
has to grab a wakeref if gem object lies in lmem.

Till we fix all issues related to runtime PM, we need
to keep runtime PM disable on both DG1 and DG2.

V2:
- Keep a smaller FIXME code comment for both DG1/DG2.

Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_pci.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Joonas Lahtinen Sept. 15, 2022, 6:47 a.m. UTC | #1
On the patch subject, could we aim to be a bit more readable and
concise. Something like:

"drm/i915: Temporarily disable RPM on DG1/DG2"

The patch title definitely should not have a FIXME in it if we're going
to merge it in that form.

There's nothing to be fixed about the patch itself, we are applying a
workaround until we've fixed the rootcause, which is business as usual.

Regards, Joonas

Quoting Anshuman Gupta (2022-09-14 19:13:29)
> DG1 and DG2 has lmem, and cpu can access the lmem objects
> via mmap and i915 internal i915_gem_object_pin_map() for
> i915 own usages. Both of these methods has pre-requisite
> requirement to keep GFX PCI endpoint in D0 for a supported
> iomem transaction over PCI link. (Refer PCIe specs 5.3.1.4.1)
> 
> Both DG1/DG2 have a hardware bug that violates the PCIe specs
> and support the iomem read write transaction over PCIe bus despite
> endpoint is D3 state.
> Due to above H/W bug, we had never observed any issue with i915 runtime
> PM versus lmem access.
> But this issue becomes visible when PCIe gfx endpoint's upstream
> bridge enters to D3, at this point any lmem read/write access will be
> returned as unsupported request. But again this issue is not observed
> on every platform because it has been observed on few host machines
> DG1/DG2 endpoint's upstream bridge does not bind with pcieport driver.
> which really disables the PCIe poer power savings and leaves the bridge
> at D0 state.
> 
> TODO:
> With respect to i915_gem_object_pin_map(), every caller
> has to grab a wakeref if gem object lies in lmem.
> 
> Till we fix all issues related to runtime PM, we need
> to keep runtime PM disable on both DG1 and DG2.
> 
> V2:
> - Keep a smaller FIXME code comment for both DG1/DG2.
> 
> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_pci.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 77e7df21f539..4a7d226b074f 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -931,6 +931,14 @@ static const struct intel_device_info dg1_info = {
>                 BIT(VCS0) | BIT(VCS2),
>         /* Wa_16011227922 */
>         .__runtime.ppgtt_size = 47,
> +
> +       /*
> +        *  FIXME: Temporary hammer to disable rpm.
> +        *  As per PCIe specs 5.3.1.4.1, all iomem read write request over a PCIe
> +        *  function will be unsupported in case PCIe endpoint function is in D3.
> +        *  Let's disable i915 rpm till we fix all known issue with lmem access in D3.
> +        */
> +       .has_runtime_pm = 0,
>  };
>  
>  static const struct intel_device_info adl_s_info = {
> @@ -1076,6 +1084,13 @@ static const struct intel_device_info dg2_info = {
>         XE_LPD_FEATURES,
>         .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
>                                BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
> +       /*
> +        *  FIXME: Temporary hammer to disable rpm.
> +        *  As per PCIe specs 5.3.1.4.1, all iomem read write request over a PCIe
> +        *  function will be unsupported in case PCIe endpoint function is in D3.
> +        *  Let's disable i915 rpm till we fix all known issue with lmem access in D3.
> +        */
> +       .has_runtime_pm = 0,
>         .require_force_probe = 1,
>  };
>  
> -- 
> 2.26.2
>
Gupta, Anshuman Sept. 15, 2022, 7:19 a.m. UTC | #2
> -----Original Message-----
> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Sent: Thursday, September 15, 2022 12:17 PM
> To: Gupta, Anshuman <anshuman.gupta@intel.com>; intel-
> gfx@lists.freedesktop.org
> Cc: Vivi, Rodrigo <rodrigo.vivi@intel.com>; Nilawar, Badal
> <badal.nilawar@intel.com>; Ewins, Jon <jon.ewins@intel.com>;
> andi.shyti@linux.intel.com; Gupta, Anshuman <anshuman.gupta@intel.com>;
> Auld, Matthew <matthew.auld@intel.com>
> Subject: Re: [PATCH v2] drm/i915/DG{1,2}: FIXME Temporary hammer to disable
> rpm
> 
> On the patch subject, could we aim to be a bit more readable and concise.
> Something like:
> 
> "drm/i915: Temporarily disable RPM on DG1/DG2"
> 
> The patch title definitely should not have a FIXME in it if we're going to merge it
> in that form.
> 
> There's nothing to be fixed about the patch itself, we are applying a workaround
> until we've fixed the rootcause, which is business as usual.
Thanks Joonas for feedback, will remove the FIXME from the subject before pushing.
Br,
Anshuman.
> 
> Regards, Joonas
> 
> Quoting Anshuman Gupta (2022-09-14 19:13:29)
> > DG1 and DG2 has lmem, and cpu can access the lmem objects via mmap and
> > i915 internal i915_gem_object_pin_map() for
> > i915 own usages. Both of these methods has pre-requisite requirement
> > to keep GFX PCI endpoint in D0 for a supported iomem transaction over
> > PCI link. (Refer PCIe specs 5.3.1.4.1)
> >
> > Both DG1/DG2 have a hardware bug that violates the PCIe specs and
> > support the iomem read write transaction over PCIe bus despite
> > endpoint is D3 state.
> > Due to above H/W bug, we had never observed any issue with i915
> > runtime PM versus lmem access.
> > But this issue becomes visible when PCIe gfx endpoint's upstream
> > bridge enters to D3, at this point any lmem read/write access will be
> > returned as unsupported request. But again this issue is not observed
> > on every platform because it has been observed on few host machines
> > DG1/DG2 endpoint's upstream bridge does not bind with pcieport driver.
> > which really disables the PCIe poer power savings and leaves the
> > bridge at D0 state.
> >
> > TODO:
> > With respect to i915_gem_object_pin_map(), every caller has to grab a
> > wakeref if gem object lies in lmem.
> >
> > Till we fix all issues related to runtime PM, we need to keep runtime
> > PM disable on both DG1 and DG2.
> >
> > V2:
> > - Keep a smaller FIXME code comment for both DG1/DG2.
> >
> > Cc: Matthew Auld <matthew.auld@intel.com>
> > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> > Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_pci.c | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > b/drivers/gpu/drm/i915/i915_pci.c index 77e7df21f539..4a7d226b074f
> > 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -931,6 +931,14 @@ static const struct intel_device_info dg1_info = {
> >                 BIT(VCS0) | BIT(VCS2),
> >         /* Wa_16011227922 */
> >         .__runtime.ppgtt_size = 47,
> > +
> > +       /*
> > +        *  FIXME: Temporary hammer to disable rpm.
> > +        *  As per PCIe specs 5.3.1.4.1, all iomem read write request over a PCIe
> > +        *  function will be unsupported in case PCIe endpoint function is in D3.
> > +        *  Let's disable i915 rpm till we fix all known issue with lmem access in
> D3.
> > +        */
> > +       .has_runtime_pm = 0,
> >  };
> >
> >  static const struct intel_device_info adl_s_info = { @@ -1076,6
> > +1084,13 @@ static const struct intel_device_info dg2_info = {
> >         XE_LPD_FEATURES,
> >         .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) |
> BIT(TRANSCODER_B) |
> >                                BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
> > +       /*
> > +        *  FIXME: Temporary hammer to disable rpm.
> > +        *  As per PCIe specs 5.3.1.4.1, all iomem read write request over a PCIe
> > +        *  function will be unsupported in case PCIe endpoint function is in D3.
> > +        *  Let's disable i915 rpm till we fix all known issue with lmem access in
> D3.
> > +        */
> > +       .has_runtime_pm = 0,
> >         .require_force_probe = 1,
> >  };
> >
> > --
> > 2.26.2
> >
Joonas Lahtinen Sept. 15, 2022, 2:44 p.m. UTC | #3
Quoting Anshuman Gupta (2022-09-14 19:13:29)
> DG1 and DG2 has lmem, and cpu can access the lmem objects
> via mmap and i915 internal i915_gem_object_pin_map() for
> i915 own usages. Both of these methods has pre-requisite
> requirement to keep GFX PCI endpoint in D0 for a supported
> iomem transaction over PCI link. (Refer PCIe specs 5.3.1.4.1)
> 
> Both DG1/DG2 have a hardware bug that violates the PCIe specs
> and support the iomem read write transaction over PCIe bus despite
> endpoint is D3 state.
> Due to above H/W bug, we had never observed any issue with i915 runtime
> PM versus lmem access.
> But this issue becomes visible when PCIe gfx endpoint's upstream
> bridge enters to D3, at this point any lmem read/write access will be
> returned as unsupported request. But again this issue is not observed
> on every platform because it has been observed on few host machines
> DG1/DG2 endpoint's upstream bridge does not bind with pcieport driver.
> which really disables the PCIe poer power savings and leaves the bridge
> at D0 state.
> 
> TODO:
> With respect to i915_gem_object_pin_map(), every caller
> has to grab a wakeref if gem object lies in lmem.
> 
> Till we fix all issues related to runtime PM, we need
> to keep runtime PM disable on both DG1 and DG2.
> 
> V2:
> - Keep a smaller FIXME code comment for both DG1/DG2.

I think we should also have Fixes: tags to cover both platforms.

Regards, Joonas

> Cc: Matthew Auld <matthew.auld@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_pci.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 77e7df21f539..4a7d226b074f 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -931,6 +931,14 @@ static const struct intel_device_info dg1_info = {
>                 BIT(VCS0) | BIT(VCS2),
>         /* Wa_16011227922 */
>         .__runtime.ppgtt_size = 47,
> +
> +       /*
> +        *  FIXME: Temporary hammer to disable rpm.
> +        *  As per PCIe specs 5.3.1.4.1, all iomem read write request over a PCIe
> +        *  function will be unsupported in case PCIe endpoint function is in D3.
> +        *  Let's disable i915 rpm till we fix all known issue with lmem access in D3.
> +        */
> +       .has_runtime_pm = 0,
>  };
>  
>  static const struct intel_device_info adl_s_info = {
> @@ -1076,6 +1084,13 @@ static const struct intel_device_info dg2_info = {
>         XE_LPD_FEATURES,
>         .__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
>                                BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
> +       /*
> +        *  FIXME: Temporary hammer to disable rpm.
> +        *  As per PCIe specs 5.3.1.4.1, all iomem read write request over a PCIe
> +        *  function will be unsupported in case PCIe endpoint function is in D3.
> +        *  Let's disable i915 rpm till we fix all known issue with lmem access in D3.
> +        */
> +       .has_runtime_pm = 0,
>         .require_force_probe = 1,
>  };
>  
> -- 
> 2.26.2
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 77e7df21f539..4a7d226b074f 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -931,6 +931,14 @@  static const struct intel_device_info dg1_info = {
 		BIT(VCS0) | BIT(VCS2),
 	/* Wa_16011227922 */
 	.__runtime.ppgtt_size = 47,
+
+	/*
+	 *  FIXME: Temporary hammer to disable rpm.
+	 *  As per PCIe specs 5.3.1.4.1, all iomem read write request over a PCIe
+	 *  function will be unsupported in case PCIe endpoint function is in D3.
+	 *  Let's disable i915 rpm till we fix all known issue with lmem access in D3.
+	 */
+	.has_runtime_pm = 0,
 };
 
 static const struct intel_device_info adl_s_info = {
@@ -1076,6 +1084,13 @@  static const struct intel_device_info dg2_info = {
 	XE_LPD_FEATURES,
 	.__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
 			       BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
+	/*
+	 *  FIXME: Temporary hammer to disable rpm.
+	 *  As per PCIe specs 5.3.1.4.1, all iomem read write request over a PCIe
+	 *  function will be unsupported in case PCIe endpoint function is in D3.
+	 *  Let's disable i915 rpm till we fix all known issue with lmem access in D3.
+	 */
+	.has_runtime_pm = 0,
 	.require_force_probe = 1,
 };