[4/4] i915: fix remap_io_sg to verify the pgprot

Message ID	20210326055505.1424432-5-hch@lst.de (mailing list archive)
State	New, archived
From: Christoph Hellwig <hch@lst.de>
To: Andrew Morton <akpm@linux-foundation.org>, Jani Nikula <jani.nikula@linux.intel.com>, Joonas Lahtinen <joonas.lahtinen@linux.intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>, Daniel Vetter <daniel.vetter@ffwll.ch>, Peter Zijlstra <peterz@infradead.org>, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-mm@kvack.org
Subject: [PATCH 4/4] i915: fix remap_io_sg to verify the pgprot
Date: Fri, 26 Mar 2021 06:55:05 +0100
Message-Id: <20210326055505.1424432-5-hch@lst.de>
In-Reply-To: <20210326055505.1424432-1-hch@lst.de>
References: <20210326055505.1424432-1-hch@lst.de>
Series	[1/4] mm: add remap_pfn_range_notrack
	[2/4] mm: add a io_mapping_map_user helper
	[3/4] i915: use io_mapping_map_user
	[4/4] i915: fix remap_io_sg to verify the pgprot

Christoph Hellwig March 26, 2021, 5:55 a.m. UTC

remap_io_sg claims that the pgprot is pre-verified using an
io_mapping, but actually does not get passed an io_mapping and just
uses the pgprot in the VMA.  Remove the apply_to_page_range abuse
and just loop over remap_pfn_range for each segment.

Note: this could use io_mapping_map_user by passing an iomap to
remap_io_sg if the maintainers can verify that the pgprot in the
iomap in the only caller is indeed the desired one here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/i915/i915_mm.c | 73 +++++++++++-----------------------
 1 file changed, 23 insertions(+), 50 deletions(-)
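
Editor's sketch: the approach named in the commit message, one remap call per scatterlist segment, can be modelled in userspace C roughly as below. This is not the patch itself. `struct seg`, `struct table_vma`, `map_range()`, `zap_range()` and `remap_segments()` are hypothetical stand-ins for the scatterlist, the VMA page table, remap_pfn_range, PTE zapping and remap_io_sg, and undoing a partial mapping on error is an assumption about sensible behaviour rather than something stated in the message above.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12
#define MAX_PAGES  64

/* Hypothetical stand-in for one scatterlist segment. */
struct seg {
	unsigned long pfn;	/* first page frame, must be non-zero here */
	unsigned long len;	/* length in bytes, page aligned */
};

/* Hypothetical stand-in for a VMA's page table; 0 means "no PTE". */
struct table_vma {
	unsigned long pte[MAX_PAGES];
};

/* Stand-in for remap_pfn_range(): map len bytes at byte offset off. */
static int map_range(struct table_vma *vma, unsigned long off,
		     unsigned long pfn, unsigned long len)
{
	unsigned long first = off >> PAGE_SHIFT;
	unsigned long n = len >> PAGE_SHIFT;
	unsigned long i;

	if (first + n > MAX_PAGES)
		return -1;	/* range does not fit in the VMA */
	for (i = 0; i < n; i++) {
		/* models the kernel's check that the PTE is still empty */
		assert(vma->pte[first + i] == 0);
		vma->pte[first + i] = pfn + i;
	}
	return 0;
}

/* Clear the PTEs covering len bytes starting at byte offset off. */
static void zap_range(struct table_vma *vma, unsigned long off,
		      unsigned long len)
{
	unsigned long first = off >> PAGE_SHIFT;
	unsigned long n = len >> PAGE_SHIFT;
	unsigned long i;

	for (i = 0; i < n; i++)
		vma->pte[first + i] = 0;
}

/* Map each segment back to back; undo everything if one segment fails. */
static int remap_segments(struct table_vma *vma, unsigned long off,
			  const struct seg *segs, size_t nsegs)
{
	unsigned long mapped = 0;
	size_t i;
	int err = 0;

	for (i = 0; i < nsegs; i++) {
		err = map_range(vma, off + mapped, segs[i].pfn, segs[i].len);
		if (err)
			break;
		mapped += segs[i].len;
	}
	if (err)
		zap_range(vma, off, mapped);
	return err;
}
```

The point of the loop is that every segment goes through the same per-range entry point, so whatever checking that entry point performs applies to each segment.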

youling 257 May 8, 2021, 7:33 p.m. UTC | #1

This patch causes the problem "x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x064a2000-0x064a2fff], got write-back".
My 2GB RAM Bay Trail Z3735F tablet runs android-x86, and "i915: fix remap_io_sg to verify the pgprot" causes this problem there.

05-09 02:59:25.099     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x0640a000-0x0640dfff], got write-back
05-09 02:59:25.106  1440  1440 W hwc-gl-worker: EGL_ANDROID_native_fence_sync extension not supported
05-09 02:59:25.111     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x064a2000-0x064a2fff], got write-back
05-09 02:59:25.118     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06400000-0x06404fff], got write-back
05-09 02:59:25.125     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06405000-0x06408fff], got write-back
05-09 02:59:25.148  1440  1440 W hwc-gl-worker: EGL_ANDROID_native_fence_sync extension not supported
05-09 02:59:25.158     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06542000-0x06542fff], got write-back
05-09 02:59:25.165     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06499000-0x0649dfff], got write-back
05-09 02:59:25.171     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x0649e000-0x064a1fff], got write-back
05-09 02:59:25.177  1440  1440 W hwc-gl-worker: EGL_ANDROID_native_fence_sync extension not supported
05-09 02:59:25.183     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x065fa000-0x065fafff], got write-back
05-09 02:59:25.192     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06539000-0x0653dfff], got write-back
05-09 02:59:25.199     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x0653e000-0x06541fff], got write-back
05-09 02:59:25.204  1440  1440 W hwc-gl-worker: EGL_ANDROID_native_fence_sync extension not supported
05-09 02:59:25.212     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x066a2000-0x066a2fff], got write-back
05-09 02:59:25.218     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x065f1000-0x065f5fff], got write-back
05-09 02:59:25.226     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x065f6000-0x065f9fff], got write-back


05-09 02:59:27.101     0     0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x08a76000-0x08a76fff], got write-back
05-09 02:59:27.225     0     0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x08a77000-0x08a7afff], got write-back
05-09 02:59:27.242     0     0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x08bd0000-0x08bd0fff], got write-back
05-09 02:59:27.254     0     0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x08bd1000-0x08bf0fff], got write-back
05-09 02:59:27.310  1440  1440 E drm-fb  : Failed to get handle from prime fd: 25
05-09 02:59:27.322     0     0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x080d5000-0x080d9fff], got write-back
05-09 02:59:27.322     0     0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x080da000-0x080ddfff], got write-back
05-09 02:59:27.338     0     0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x1b830000-0x1b83ffff], got write-back
05-09 02:59:27.338     0     0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x1b76a000-0x1b76efff], got write-back
05-09 02:59:27.344     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x07e87000-0x07e8bfff], got write-back
05-09 02:59:27.349     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x07e8c000-0x07e90fff], got write-back
05-09 02:59:27.347  1440  1440 E drm-fb  : Failed to get handle from prime fd: 25
05-09 02:59:27.361     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c123000-0x1c126fff], got write-back
05-09 02:59:27.361     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c127000-0x1c12afff], got write-back
05-09 02:59:27.362     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c12f000-0x1c13efff], got write-back
05-09 02:59:27.362     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c12b000-0x1c12efff], got write-back
05-09 02:59:27.364  1440  1440 E drm-fb  : Failed to get handle from prime fd: 25
05-09 02:59:27.377     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c140000-0x1c144fff], got write-back
05-09 02:59:27.377     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c145000-0x1c148fff], got write-back
05-09 02:59:27.378     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c14b000-0x1c14ffff], got write-back
05-09 02:59:27.379     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c151000-0x1c155fff], got write-back
05-09 02:59:27.377  1440  1440 E drm-fb  : Failed to get handle from prime fd: 25
05-09 02:59:27.393     0     0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c157000-0x1c15bfff], got write-back

Christoph Hellwig May 10, 2021, 8:58 a.m. UTC | #2

On Sun, May 09, 2021 at 03:33:29AM +0800, youling257 wrote:
> This patch causes the problem "x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x064a2000-0x064a2fff], got write-back".
> My 2GB RAM Bay Trail Z3735F tablet runs android-x86, and "i915: fix remap_io_sg to verify the pgprot" causes this problem there.

So this is the memtype verification added by the patch, meaning that the
old code did in fact not call into track_pfn_remap with the right flags.

Can the i915 maintainers take a look at making sure the page permissions
here make sense?
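
Editor's sketch: the warning format in the report suggests a comparison between the cache mode the caller requested and the mode already tracked for that RAM range. A minimal userspace model of such a check follows. The names `enum cache_mode`, `cm_name()` and `check_ram_memtype()` are hypothetical, and the choice to hand back the tracked mode on a mismatch is an assumption about how the tracking behaves, not a quote of the kernel source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cache mode names for this model. */
enum cache_mode { CM_WB, CM_WC, CM_UC };

static const char *cm_name(enum cache_mode m)
{
	switch (m) {
	case CM_WB: return "write-back";
	case CM_WC: return "write-combining";
	default:    return "uncached";
	}
}

/*
 * Compare the requested mode with the tracked mode for a RAM range.
 * On a mismatch, format a message in the style of the log above into
 * msg and return the tracked mode; otherwise leave msg empty and
 * return the requested mode.
 */
static enum cache_mode check_ram_memtype(enum cache_mode want,
					 enum cache_mode tracked,
					 unsigned long start,
					 unsigned long end,
					 char *msg, size_t msglen)
{
	assert(msg != NULL && msglen > 0);
	msg[0] = '\0';
	if (want == tracked)
		return want;
	snprintf(msg, msglen,
		 "map pfn RAM range req %s for [mem %#010lx-%#010lx], got %s",
		 cm_name(want), start, end, cm_name(tracked));
	return tracked;
}
```

Under this model a write-combining request against pages tracked as write-back yields exactly the kind of line youling257 posted, and the mapping ends up with a different cache mode than the driver asked for.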

Serge Belyshev May 16, 2021, 4:06 p.m. UTC | #3

I have another problem with this patch since it landed in mainline. On
my m3-6Y30 skylake HD Graphics 515 (rev 07), it causes visual artifacts
that look like a bunch of one pixel high horizontal streaks, seen most
often in firefox while scrolling or in menu controls.

Reverting this patch on top of current mainline fixes the problem.

Christoph Hellwig May 17, 2021, 12:37 p.m. UTC | #4

As an ad-hoc experiment:  can you replace the call to remap_pfn_range
with remap_pfn_range_notrack (and export it if you build i915 modular)
in remap_io_sg and see if that makes any difference?

Serge Belyshev May 17, 2021, 1:09 p.m. UTC | #5

Christoph Hellwig <hch@lst.de> writes:

> As an ad-hoc experiment:  can you replace the call to remap_pfn_range
> with remap_pfn_range_notrack (and export it if you build i915 modular)
> in remap_io_sg and see if that makes any difference?

That worked, thanks -- no artifacts seen.

Christoph Hellwig May 17, 2021, 1:11 p.m. UTC | #6

On Mon, May 17, 2021 at 04:09:42PM +0300, Serge Belyshev wrote:
> Christoph Hellwig <hch@lst.de> writes:
> 
> > As an ad-hoc experiment:  can you replace the call to remap_pfn_range
> > with remap_pfn_range_notrack (and export it if you build i915 modular)
> > in remap_io_sg and see if that makes any difference?
> 
> That worked, thanks -- no artifacts seen.

Looks like it is caused by the validation failure then.  Which means the
existing code is doing something wrong in its choice of the page
protection bit.  I really need help from the i915 maintainers here..

Matthew Auld May 17, 2021, 5:06 p.m. UTC | #7

On Mon, 17 May 2021 at 14:11, Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, May 17, 2021 at 04:09:42PM +0300, Serge Belyshev wrote:
> > Christoph Hellwig <hch@lst.de> writes:
> >
> > > As an ad-hoc experiment:  can you replace the call to remap_pfn_range
> > > with remap_pfn_range_notrack (and export it if you build i915 modular)
> > > in remap_io_sg and see if that makes any difference?
> >
> > That worked, thanks -- no artifacts seen.
>
> Looks like it is caused by the validation failure then.  Which means the
> existing code is doing something wrong in its choice of the page
> protection bit.  I really need help from the i915 maintainers here..

AFAIK there are two users of remap_io_sg. The first is our shmem
objects (see i915_gem_shmem.c), and for these we support UC, WC, and WB
mmap modes for userspace. The other user is device local-memory
objects (VRAM), and for this one we have an actual io_mapping which is
allocated as WC, and IIRC this should only be mapped as WC for the
mmap mode, but normal userspace can't hit this path yet.

What do we need to do here? It sounds like shmem backed objects are
allocated as WB for the pages underneath, but i915 allows mapping them
as UC/WC which trips up this track_pfn thing?

> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Thomas Hellström May 17, 2021, 9:46 p.m. UTC | #8

On 5/17/21 3:11 PM, Christoph Hellwig wrote:
> On Mon, May 17, 2021 at 04:09:42PM +0300, Serge Belyshev wrote:
>> Christoph Hellwig <hch@lst.de> writes:
>>
>>> As an ad-hoc experiment:  can you replace the call to remap_pfn_range
>>> with remap_pfn_range_notrack (and export it if you build i915 modular)
>>> in remap_io_sg and see if that makes any difference?
>> That worked, thanks -- no artifacts seen.
> Looks like it is caused by the validation failure then.  Which means the
> existing code is doing something wrong in its choice of the page
> protection bit.  I really need help from the i915 maintainers here..

Hmm,

Apart from the caching aliasing Matthew brought up, doesn't the 
remap_pfn_range_xxx() family require the mmap_sem held in write mode 
since it modifies the vma structure? remap_io_sg() is called from the 
fault handler with the mmap_sem held in read mode only.

/Thomas

Thomas Hellström May 18, 2021, 6:46 a.m. UTC | #9

On 5/17/21 11:46 PM, Thomas Hellström wrote:
>
> On 5/17/21 3:11 PM, Christoph Hellwig wrote:
>> On Mon, May 17, 2021 at 04:09:42PM +0300, Serge Belyshev wrote:
>>> Christoph Hellwig <hch@lst.de> writes:
>>>
>>>> As an ad-hoc experiment:  can you replace the call to remap_pfn_range
>>>> with remap_pfn_range_notrack (and export it if you build i915 modular)
>>>> in remap_io_sg and see if that makes any difference?
>>> That worked, thanks -- no artifacts seen.
>> Looks like it is caused by the validation failure then.  Which means the
>> existing code is doing something wrong in its choice of the page
>> protection bit.  I really need help from the i915 maintainers here..
>
> Hmm,
>
> Apart from the caching aliasing Matthew brought up, doesn't the 
> remap_pfn_range_xxx() family require the mmap_sem held in write mode 
> since it modifies the vma structure? remap_io_sg() is called from the 
> fault handler with the mmap_sem held in read mode only.
>
> /Thomas

And worse, if we prefault a user-space buffer object map using 
remap_io_sg() and then zap some ptes using madvise(), the next time 
those ptes are accessed, we'd trigger a new call to remap_io_sg() which 
would now find already populated ptes. While the old code looks to just 
silently overwrite those, it looks like the new code would BUG in 
remap_pte_range()?

/Thomas





Christoph Hellwig May 18, 2021, 1:21 p.m. UTC | #10

On Mon, May 17, 2021 at 06:06:44PM +0100, Matthew Auld wrote:
> > Looks like it is caused by the validation failure then.  Which means the
> > existing code is doing something wrong in its choice of the page
> > protection bit.  I really need help from the i915 maintainers here..
> 
> AFAIK there are two users of remap_io_sg, the first is our shmem
> objects(see i915_gem_shmem.c), and for these we support UC, WC, and WB
> mmap modes for userspace. The other user is device local-memory
> objects(VRAM), and for this one we have an actual io_mapping which is
> allocated as WC, and IIRC this should only be mapped as WC for the
> mmap mode, but normal userspace can't hit this path yet.

The only caller in current mainline is vm_fault_cpu in i915_gem_mman.c.
Is that device local?

> What do we need to do here? It sounds like shmem backed objects are
> allocated as WB for the pages underneath, but i915 allows mapping them
> as UC/WC which trips up this track_pfn thing?

To me the warnings look like system memory is mapped with the wrong
permissions, yes.  If you want to map it as UC/WC the right set_memory_*
needs to be used on the kernel mapping as well to ensure that the
attributes don't conflict.

Christoph Hellwig May 18, 2021, 1:23 p.m. UTC | #11

On Mon, May 17, 2021 at 11:46:35PM +0200, Thomas Hellström wrote:
> Apart from the caching aliasing Matthew brought up, doesn't the 
> remap_pfn_range_xxx() family require the mmap_sem held in write mode since 
> it modifies the vma structure? remap_io_sg() is called from the fault 
> handler with the mmap_sem held in read mode only.

Only for vma->vm_flags, and remap_sg already asserts all the interesting
flags are set, although it does not assert VM_IO.

We could move the assignment out of remap_pfn_range_notrack and
into remap_pfn_range and just assert that the proper flags are set,
though.
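
Editor's sketch: the split suggested here, where the outer function assigns the VMA flags and the inner one only checks them, can be illustrated with a small userspace model. `struct flag_vma`, `model_remap()` and `model_remap_notrack()` are hypothetical names, the flag values are arbitrary, and the exact flag set is an assumption made for illustration.

```c
#include <assert.h>

/* Flag values are arbitrary in this model; only the names mirror the kernel. */
#define VM_IO         0x1u
#define VM_PFNMAP     0x2u
#define VM_DONTEXPAND 0x4u
#define VM_DONTDUMP   0x8u
#define REMAP_FLAGS   (VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP)

/* Hypothetical stand-in for the part of a VMA that matters here. */
struct flag_vma {
	unsigned int vm_flags;
};

/*
 * Inner helper: takes a const VMA, so it cannot modify the flags and
 * is safe to call with the lock held for read. It only verifies that
 * the caller set everything up beforehand.
 */
static int model_remap_notrack(const struct flag_vma *vma)
{
	assert(vma);
	if ((vma->vm_flags & REMAP_FLAGS) != REMAP_FLAGS)
		return -1;
	return 0;	/* page table population would happen here */
}

/*
 * Outer helper: intended for callers that hold the lock for write,
 * so it is allowed to assign the flags before delegating.
 */
static int model_remap(struct flag_vma *vma)
{
	assert(vma);
	vma->vm_flags |= REMAP_FLAGS;
	return model_remap_notrack(vma);
}
```

With this shape a fault handler would call the inner helper on a VMA whose flags were set at mmap time, and a VMA missing VM_IO would be rejected instead of being silently modified.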

Christoph Hellwig May 18, 2021, 1:24 p.m. UTC | #12

On Tue, May 18, 2021 at 08:46:44AM +0200, Thomas Hellström wrote:
> And worse, if we prefault a user-space buffer object map using 
> remap_io_sg() and then zap some ptes using madvise(), the next time those 
> ptes are accessed, we'd trigger a new call to remap_io_sg() which would now 
> find already populated ptes. While the old code looks to just silently 
> overwrite those, it looks like the new code would BUG in remap_pte_range()?

How can you zap the PTEs using madvise?

Thomas Hellström May 18, 2021, 1:33 p.m. UTC | #13

On 5/18/21 3:24 PM, Christoph Hellwig wrote:
> On Tue, May 18, 2021 at 08:46:44AM +0200, Thomas Hellström wrote:
>> And worse, if we prefault a user-space buffer object map using
>> remap_io_sg() and then zap some ptes using madvise(), the next time those
>> ptes are accessed, we'd trigger a new call to remap_io_sg() which would now
>> find already populated ptes. While the old code looks to just silently
>> overwrite those, it looks like the new code would BUG in remap_pte_range()?
> How can you zap the PTEs using madvise?

Hmm, that's not possible with VM_PFNMAP. My bad. Should be OK then.
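
Editor's sketch: the reason the madvise scenario cannot occur can be modelled as a flag check performed before any PTE is cleared. `model_madvise_dontneed()` is a hypothetical function, the flag values are arbitrary, and the set of refused flags is an assumption for illustration rather than a quote of mm/madvise.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flag values are arbitrary in this model; only the names mirror the kernel. */
#define VM_LOCKED  0x1u
#define VM_HUGETLB 0x2u
#define VM_PFNMAP  0x4u

/*
 * Model of a "drop these pages" request on one VMA: refuse special
 * mappings up front, otherwise clear every PTE slot (0 = not populated).
 */
static int model_madvise_dontneed(unsigned int vm_flags,
				  unsigned long *ptes, size_t nptes)
{
	size_t i;

	assert(ptes != NULL);
	if (vm_flags & (VM_LOCKED | VM_HUGETLB | VM_PFNMAP))
		return -EINVAL;
	for (i = 0; i < nptes; i++)
		ptes[i] = 0;
	return 0;
}
```

Because the request is refused before any slot is touched, a VM_PFNMAP mapping keeps its PTEs, and a later fault would not find a partially populated range.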

/Thomas

Matthew Auld May 18, 2021, 3 p.m. UTC | #14

On Tue, 18 May 2021 at 14:21, Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, May 17, 2021 at 06:06:44PM +0100, Matthew Auld wrote:
> > > Looks like it is caused by the validation failure then.  Which means the
> > > existing code is doing something wrong in its choice of the page
> > > protection bit.  I really need help from the i915 maintainers here..
> >
> > AFAIK there are two users of remap_io_sg, the first is our shmem
> > objects(see i915_gem_shmem.c), and for these we support UC, WC, and WB
> > mmap modes for userspace. The other user is device local-memory
> > objects(VRAM), and for this one we have an actual io_mapping which is
> > allocated as WC, and IIRC this should only be mapped as WC for the
> > mmap mode, but normal userspace can't hit this path yet.
>
> The only caller in current mainline is vm_fault_cpu in i915_gem_mman.c.
> Is that device local?

The vm_fault_cpu covers both device local and shmem objects.

>
> > What do we need to do here? It sounds like shmem backed objects are
> > allocated as WB for the pages underneath, but i915 allows mapping them
> > as UC/WC which trips up this track_pfn thing?
>
> To me the warnings look like system memory is mapped with the wrong
> permissions, yes.  If you want to map it as UC/WC the right set_memory_*
> needs to be used on the kernel mapping as well to ensure that the
> attributes don't conflict.

AFAIK mmap_offset also supports multiple active mmap modes for a given
object, so set_memory_* should still work here?

Thomas Hellström (Intel) May 19, 2021, 5:46 a.m. UTC | #15

On 5/18/21 5:00 PM, Matthew Auld wrote:
> On Tue, 18 May 2021 at 14:21, Christoph Hellwig <hch@lst.de> wrote:
>> On Mon, May 17, 2021 at 06:06:44PM +0100, Matthew Auld wrote:
>>>> Looks like it is caused by the validation failure then.  Which means the
>>>> existing code is doing something wrong in its choice of the page
>>>> protection bit.  I really need help from the i915 maintainers here..
>>> AFAIK there are two users of remap_io_sg, the first is our shmem
>>> objects(see i915_gem_shmem.c), and for these we support UC, WC, and WB
>>> mmap modes for userspace. The other user is device local-memory
>>> objects(VRAM), and for this one we have an actual io_mapping which is
>>> allocated as WC, and IIRC this should only be mapped as WC for the
>>> mmap mode, but normal userspace can't hit this path yet.
>> The only caller in current mainline is vm_fault_cpu in i915_gem_mman.c.
>> Is that device local?
> The vm_fault_cpu covers both device local and shmem objects.
>
>>> What do we need to do here? It sounds like shmem backed objects are
>>> allocated as WB for the pages underneath, but i915 allows mapping them
>>> as UC/WC which trips up this track_pfn thing?
>> To me the warnings look like system memory is mapped with the wrong
>> permissions, yes.  If you want to map it as UC/WC the right set_memory_*
>> needs to be used on the kernel mapping as well to ensure that the
>> attributes don't conflict.
> AFAIK mmap_offset also supports multiple active mmap modes for a given
> object, so set_memory_* should still work here?

No, that won't work because there are active maps with conflicting 
caching attributes. I think the history here is that that was assumed to 
be OK for integrated graphics that ran only on Intel processors that 
promise to never write back unmodified cache lines resulting from 
prefetching, like some AMD processors did way back at least.

These conflicting mappings can obviously not be supported for discrete 
graphics, but for integrated they are part of the uAPI.

/Thomas






Thomas Hellström May 19, 2021, 5:51 a.m. UTC | #16

On 5/18/21 3:23 PM, Christoph Hellwig wrote:
> On Mon, May 17, 2021 at 11:46:35PM +0200, Thomas Hellström wrote:
>> Apart from the caching aliasing Matthew brought up, doesn't the
>> remap_pfn_range_xxx() family require the mmap_sem held in write mode since
>> it modifies the vma structure? remap_io_sg() is called from the fault
>> handler with the mmap_sem held in read mode only.
> Only for vma->vm_flags, and remap_sg already asserts all the interesting
> flags are set, although it does not assert VM_IO.
>
> We could move the assignment out of remap_pfn_range_notrack and
> into remap_pfn_range and just assert that the proper flags are set,
> though.

That to me sounds like a way forward. It sounds like, in general, a GPU 
prefaulting helper that in the long run also supports faulting huge ptes 
is desired by TTM as well. Although it looks like that BUG_ON() I pointed 
out was hit anyway....

/Thomas
