drm/i915: fix vxd392 memory corruption on VLV and >4GB

Message ID	1394241231-17242-1-git-send-email-sean.v.kelley@intel.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <intel-gfx-bounces@lists.freedesktop.org> From: Sean V Kelley <sean.v.kelley@intel.com> To: intel-gfx@lists.freedesktop.org Date: Fri, 7 Mar 2014 17:13:51 -0800 Message-Id: <1394241231-17242-1-git-send-email-sean.v.kelley@intel.com> Cc: Sean V Kelley <sean.v.kelley@intel.com>, Fei Jiang <fei.jiang@intel.com>, Alan Cox <alan@linux.intel.com> Subject: [Intel-gfx] [PATCH] drm/i915: fix vxd392 memory corruption on VLV and >4GB Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: intel-gfx-bounces@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org

Kelley, Sean V March 8, 2014, 1:13 a.m. UTC

On VLV systems addressing 4GB of memory or greater, memory corruption was seen
when initializing and attempting to render VP8 hardware decode surfaces using
the VXD392 HW IP block.

The VXD MMU has a limitation to addressing only 32bits.  On 64bit kernel and
user space builds this can cause problems for use of that IP block.

When 2G memory is inserted, fw buffer pfn was at 0x5f62b, which is below 4GB.
While for 4GB of memory the fw buffer pfn was 0x162ea9 with a physical address
at 0x162ea9000, above 4GB.

So although the memory is 4GB in the test hardware (Bayleybay CRB), a large
physical region (for example 3G-4G) can be occupied by onboard system
resources.

Enabling ZONE_DMA32 and setting the correct mask DMA for this device
resolves the issue kernel side.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Fei Jiang <fei.jiang@intel.com>
Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Chris Wilson March 8, 2014, 9:25 a.m. UTC | #1

On Fri, Mar 07, 2014 at 05:13:51PM -0800, Sean V Kelley wrote:
> On VLV systems addressing 4GB of memory or greater, memory corruption was seen
> when initializing and attempting to render VP8 hardware decode surfaces using
> the VXD392 HW IP block.
> 
> The VXD MMU has a limitation to addressing only 32bits.  On 64bit kernel and
> user space builds this can cause problems for use of that IP block.
> 
> When 2G memory is inserted, fw buffer pfn was at 0x5f62b, which is below 4GB.
> While for 4GB of memory the fw buffer pfn was 0x162ea9 with a physical address
> at 0x162ea9000, above 4GB.
> 
> So although the memory is 4GB in the test hardware (Bayleybay CRB), a large
> physical region (for example 3G-4G) can be occupied by onboard system
> resources.
> 
> Enabling ZONE_DMA32 and setting the correct mask DMA for this device
> resolves the issue kernel side.

That's a shame. I guess this is restricted to a subset of byt?
 
> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> Cc: Alan Cox <alan@linux.intel.com>
> Cc: Fei Jiang <fei.jiang@intel.com>
> Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_dma.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index e4d2b9f..b8c6efc 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1636,7 +1636,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	 * behaviour if any general state is accessed within a page above 4GB,
>  	 * which also needs to be handled carefully.

Also add a sentence here giving a quick explanation as to why we need to
quirk IS_VLV as well.

>  	 */
> -	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev))
> +	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev) || IS_VALLEYVIEW(dev))
>  		dma_set_coherent_mask(&dev->pdev->dev, DMA_BIT_MASK(32));
>  
>  	aperture_size = dev_priv->gtt.mappable_end;

Daniel Vetter March 8, 2014, 10:07 a.m. UTC | #2

On Sat, Mar 08, 2014 at 09:25:54AM +0000, Chris Wilson wrote:
> On Fri, Mar 07, 2014 at 05:13:51PM -0800, Sean V Kelley wrote:
> > On VLV systems addressing 4GB of memory or greater, memory corruption was seen
> > when initializing and attempting to render VP8 hardware decode surfaces using
> > the VXD392 HW IP block.
> > 
> > The VXD MMU has a limitation to addressing only 32bits.  On 64bit kernel and
> > user space builds this can cause problems for use of that IP block.
> > 
> > When 2G memory is inserted, fw buffer pfn was at 0x5f62b, which is below 4GB.
> > While for 4GB of memory the fw buffer pfn was 0x162ea9 with a physical address
> > at 0x162ea9000, above 4GB.
> > 
> > So although the memory is 4GB in the test hardware (Bayleybay CRB), a large
> > physical region (for example 3G-4G) can be occupied by onboard system
> > resources.
> > 
> > Enabling ZONE_DMA32 and setting the correct mask DMA for this device
> > resolves the issue kernel side.
> 
> That's a shame. I guess this is restricted to a subset of byt?
>  
> > Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> > Cc: Alan Cox <alan@linux.intel.com>
> > Cc: Fei Jiang <fei.jiang@intel.com>
> > Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index e4d2b9f..b8c6efc 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1636,7 +1636,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> >  	 * behaviour if any general state is accessed within a page above 4GB,
> >  	 * which also needs to be handled carefully.
> 
> Also add a sentence here giving a quick explanation as to why we need to
> quirk IS_VLV as well.
> 
> >  	 */
> > -	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev))
> > +	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev) || IS_VALLEYVIEW(dev))
> >  		dma_set_coherent_mask(&dev->pdev->dev, DMA_BIT_MASK(32));

Nack because:
- the vxd392 isn't merged upstream
- we can allocate shared buffers from vxd392 and have the restriction just
  there
- for shared buffers allocated in i915 imo the right approach is to move
  offending pages around when vxd392 attaches - i.e. we need to check the
  attached device's dma masks and if there's something offending, migrate
  the buffer with a differen shmem allocation mask. Iirc Alan Cox had
  patches to do just that (but for swapoff, still the same idea though).

Cheers, Daniel
> >  
> >  	aperture_size = dev_priv->gtt.mappable_end;
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Kelley, Sean V March 8, 2014, 7:25 p.m. UTC | #3

On Saturday 08 Mar 2014 at 09:25:54 (+0000), Chris Wilson writes :
> On Fri, Mar 07, 2014 at 05:13:51PM -0800, Sean V Kelley wrote:
> > On VLV systems addressing 4GB of memory or greater, memory corruption was seen
> > when initializing and attempting to render VP8 hardware decode surfaces using
> > the VXD392 HW IP block.
> > 
> > The VXD MMU has a limitation to addressing only 32bits.  On 64bit kernel and
> > user space builds this can cause problems for use of that IP block.
> > 
> > When 2G memory is inserted, fw buffer pfn was at 0x5f62b, which is below 4GB.
> > While for 4GB of memory the fw buffer pfn was 0x162ea9 with a physical address
> > at 0x162ea9000, above 4GB.
> > 
> > So although the memory is 4GB in the test hardware (Bayleybay CRB), a large
> > physical region (for example 3G-4G) can be occupied by onboard system
> > resources.
> > 
> > Enabling ZONE_DMA32 and setting the correct mask DMA for this device
> > resolves the issue kernel side.
> 
> That's a shame. I guess this is restricted to a subset of byt?

It should affect all baytrail systems. To my knowledge there are no subsets that have it fused off.

>  
> > Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> > Cc: Alan Cox <alan@linux.intel.com>
> > Cc: Fei Jiang <fei.jiang@intel.com>
> > Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index e4d2b9f..b8c6efc 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1636,7 +1636,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> >  	 * behaviour if any general state is accessed within a page above 4GB,
> >  	 * which also needs to be handled carefully.
> 
> Also add a sentence here giving a quick explanation as to why we need to
> quirk IS_VLV as well.

Thanks for the feedback also going through Daniel's.

Sean

> 
> >  	 */
> > -	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev))
> > +	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev) || IS_VALLEYVIEW(dev))
> >  		dma_set_coherent_mask(&dev->pdev->dev, DMA_BIT_MASK(32));
> >  
> >  	aperture_size = dev_priv->gtt.mappable_end;
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre

Kelley, Sean V March 8, 2014, 7:42 p.m. UTC | #4

On Saturday 08 Mar 2014 at 11:07:50 (+0100), Daniel Vetter writes :
> On Sat, Mar 08, 2014 at 09:25:54AM +0000, Chris Wilson wrote:
> > On Fri, Mar 07, 2014 at 05:13:51PM -0800, Sean V Kelley wrote:
> > > On VLV systems addressing 4GB of memory or greater, memory corruption was seen
> > > when initializing and attempting to render VP8 hardware decode surfaces using
> > > the VXD392 HW IP block.
> > > 
> > > The VXD MMU has a limitation to addressing only 32bits.  On 64bit kernel and
> > > user space builds this can cause problems for use of that IP block.
> > > 
> > > When 2G memory is inserted, fw buffer pfn was at 0x5f62b, which is below 4GB.
> > > While for 4GB of memory the fw buffer pfn was 0x162ea9 with a physical address
> > > at 0x162ea9000, above 4GB.
> > > 
> > > So although the memory is 4GB in the test hardware (Bayleybay CRB), a large
> > > physical region (for example 3G-4G) can be occupied by onboard system
> > > resources.
> > > 
> > > Enabling ZONE_DMA32 and setting the correct mask DMA for this device
> > > resolves the issue kernel side.
> > 
> > That's a shame. I guess this is restricted to a subset of byt?
> >  
> > > Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> > > Cc: Alan Cox <alan@linux.intel.com>
> > > Cc: Fei Jiang <fei.jiang@intel.com>
> > > Signed-off-by: Sean V Kelley <sean.v.kelley@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_dma.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > index e4d2b9f..b8c6efc 100644
> > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > @@ -1636,7 +1636,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> > >  	 * behaviour if any general state is accessed within a page above 4GB,
> > >  	 * which also needs to be handled carefully.
> > 
> > Also add a sentence here giving a quick explanation as to why we need to
> > quirk IS_VLV as well.
> > 
> > >  	 */
> > > -	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev))
> > > +	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev) || IS_VALLEYVIEW(dev))
> > >  		dma_set_coherent_mask(&dev->pdev->dev, DMA_BIT_MASK(32));
> 
> Nack because:
> - the vxd392 isn't merged upstream

Sure, I just wanted to get some feedback.  Yeah, in it's current state my
driver port will not be acceptable upstream until I cleanup the  deprecated API from
the driver, remove my wrappers/hooks from the i915, and make similar API changes to the user
mode driver.  That is a work-in-progress for me now.

> - we can allocate shared buffers from vxd392 and have the restriction just
>   there
> - for shared buffers allocated in i915 imo the right approach is to move
>   offending pages around when vxd392 attaches - i.e. we need to check the
>   attached device's dma masks and if there's something offending, migrate
>   the buffer with a differen shmem allocation mask. Iirc Alan Cox had
>   patches to do just that (but for swapoff, still the same idea though).

Will keep that in mind.  I have been getting some feedback from Alan too.

Thanks,

Sean

> 
> Cheers, Daniel
> > >  
> > >  	aperture_size = dev_priv->gtt.mappable_end;
> > 
> > -- 
> > Chris Wilson, Intel Open Source Technology Centre
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Alan Cox March 9, 2014, 8:23 p.m. UTC | #5

>   offending pages around when vxd392 attaches - i.e. we need to check the
>   attached device's dma masks and if there's something offending, migrate
>   the buffer with a differen shmem allocation mask. Iirc Alan Cox had
>   patches to do just that (but for swapoff, still the same idea though).

The shmemfs code knows about this for GEM (and anything else it backs).
My only credit is finding the problem, I didn't do the fixing!

Alan

Alan Cox March 9, 2014, 8:25 p.m. UTC | #6

On Sat, 2014-03-08 at 11:25 -0800, Sean V Kelley wrote:
> On Saturday 08 Mar 2014 at 09:25:54 (+0000), Chris Wilson writes :
> > On Fri, Mar 07, 2014 at 05:13:51PM -0800, Sean V Kelley wrote:
> > > On VLV systems addressing 4GB of memory or greater, memory corruption was seen
> > > when initializing and attempting to render VP8 hardware decode surfaces using
> > > the VXD392 HW IP block.
> > > 
> > > The VXD MMU has a limitation to addressing only 32bits.  On 64bit kernel and
> > > user space builds this can cause problems for use of that IP block.
> > > 
> > > When 2G memory is inserted, fw buffer pfn was at 0x5f62b, which is below 4GB.
> > > While for 4GB of memory the fw buffer pfn was 0x162ea9 with a physical address
> > > at 0x162ea9000, above 4GB.
> > > 
> > > So although the memory is 4GB in the test hardware (Bayleybay CRB), a large
> > > physical region (for example 3G-4G) can be occupied by onboard system
> > > resources.
> > > 
> > > Enabling ZONE_DMA32 and setting the correct mask DMA for this device
> > > resolves the issue kernel side.
> > 
> > That's a shame. I guess this is restricted to a subset of byt?
> 
> It should affect all baytrail systems. To my knowledge there are no subsets that have it fused off.

This is in the gross hacks department but what happens if you set the
VXD to DMA to GTT translated addresses, does it see the memory directly
or via the GTT translations ?

Alan

drm/i915: fix vxd392 memory corruption on VLV and >4GB

Commit Message

Comments

Patch