Message ID | 20210326055505.1424432-5-hch@lst.de (mailing list archive) |
---|---|
State | New, archived |
Series | [1/4] mm: add remap_pfn_range_notrack |
This patch causes an "x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x064a2000-0x064a2fff], got write-back" problem.
My 2GB RAM Bay Trail Z3735F tablet is running android-x86; "i915: fix remap_io_sg to verify the pgprot" causes this problem.

05-09 02:59:25.099 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x0640a000-0x0640dfff], got write-back
05-09 02:59:25.106 1440 1440 W hwc-gl-worker: EGL_ANDROID_native_fence_sync extension not supported
05-09 02:59:25.111 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x064a2000-0x064a2fff], got write-back
05-09 02:59:25.118 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06400000-0x06404fff], got write-back
05-09 02:59:25.125 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06405000-0x06408fff], got write-back
05-09 02:59:25.148 1440 1440 W hwc-gl-worker: EGL_ANDROID_native_fence_sync extension not supported
05-09 02:59:25.158 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06542000-0x06542fff], got write-back
05-09 02:59:25.165 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06499000-0x0649dfff], got write-back
05-09 02:59:25.171 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x0649e000-0x064a1fff], got write-back
05-09 02:59:25.177 1440 1440 W hwc-gl-worker: EGL_ANDROID_native_fence_sync extension not supported
05-09 02:59:25.183 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x065fa000-0x065fafff], got write-back
05-09 02:59:25.192 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x06539000-0x0653dfff], got write-back
05-09 02:59:25.199 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x0653e000-0x06541fff], got write-back
05-09 02:59:25.204 1440 1440 W hwc-gl-worker: EGL_ANDROID_native_fence_sync extension not supported
05-09 02:59:25.212 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x066a2000-0x066a2fff], got write-back
05-09 02:59:25.218 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x065f1000-0x065f5fff], got write-back
05-09 02:59:25.226 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x065f6000-0x065f9fff], got write-back
05-09 02:59:27.101 0 0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x08a76000-0x08a76fff], got write-back
05-09 02:59:27.225 0 0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x08a77000-0x08a7afff], got write-back
05-09 02:59:27.242 0 0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x08bd0000-0x08bd0fff], got write-back
05-09 02:59:27.254 0 0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x08bd1000-0x08bf0fff], got write-back
05-09 02:59:27.310 1440 1440 E drm-fb : Failed to get handle from prime fd: 25
05-09 02:59:27.322 0 0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x080d5000-0x080d9fff], got write-back
05-09 02:59:27.322 0 0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x080da000-0x080ddfff], got write-back
05-09 02:59:27.338 0 0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x1b830000-0x1b83ffff], got write-back
05-09 02:59:27.338 0 0 W x86/PAT : BootAnimation:1665 map pfn RAM range req write-combining for [mem 0x1b76a000-0x1b76efff], got write-back
05-09 02:59:27.344 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x07e87000-0x07e8bfff], got write-back
05-09 02:59:27.349 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x07e8c000-0x07e90fff], got write-back
05-09 02:59:27.347 1440 1440 E drm-fb : Failed to get handle from prime fd: 25
05-09 02:59:27.361 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c123000-0x1c126fff], got write-back
05-09 02:59:27.361 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c127000-0x1c12afff], got write-back
05-09 02:59:27.362 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c12f000-0x1c13efff], got write-back
05-09 02:59:27.362 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c12b000-0x1c12efff], got write-back
05-09 02:59:27.364 1440 1440 E drm-fb : Failed to get handle from prime fd: 25
05-09 02:59:27.377 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c140000-0x1c144fff], got write-back
05-09 02:59:27.377 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c145000-0x1c148fff], got write-back
05-09 02:59:27.378 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c14b000-0x1c14ffff], got write-back
05-09 02:59:27.379 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c151000-0x1c155fff], got write-back
05-09 02:59:27.377 1440 1440 E drm-fb : Failed to get handle from prime fd: 25
05-09 02:59:27.393 0 0 W x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x1c157000-0x1c15bfff], got write-back
On Sun, May 09, 2021 at 03:33:29AM +0800, youling257 wrote:
> This patch causes an "x86/PAT : surfaceflinger:1440 map pfn RAM range req write-combining for [mem 0x064a2000-0x064a2fff], got write-back" problem.
> My 2GB RAM Bay Trail Z3735F tablet is running android-x86; "i915: fix remap_io_sg to verify the pgprot" causes this problem.

So this is the memtype verification added by the patch, meaning that the
old code did not in fact call into track_pfn_remap with the right flags.

Can the i915 maintainers take a look at making sure the page permissions
here make sense?
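To make the warning in the log concrete, here is a minimal userspace model of the check being discussed. It is a sketch under assumptions, not kernel code: the enum, `mode_name()` and `check_ram_memtype()` are hypothetical names, and the behaviour of falling back to the tracked mode after warning is an assumption about how the x86 PAT code treats RAM ranges, based on the "req write-combining ... got write-back" wording.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified cache modes; illustrative names, not the kernel's enum. */
enum cache_mode { MODE_WB, MODE_WC, MODE_UC };

static const char *mode_name(enum cache_mode m)
{
	switch (m) {
	case MODE_WB: return "write-back";
	case MODE_WC: return "write-combining";
	case MODE_UC: return "uncached";
	}
	return "unknown";
}

/*
 * Hypothetical model of the RAM-range check: compare the requested mode
 * with the tracked one, warn on a mismatch, and return the mode that the
 * mapping ends up with (assumed here to be the tracked mode).
 */
static enum cache_mode check_ram_memtype(enum cache_mode requested,
					 enum cache_mode tracked,
					 unsigned long start,
					 unsigned long end,
					 int *warned)
{
	assert(start <= end);
	*warned = 0;
	if (requested != tracked) {
		fprintf(stderr,
			"map pfn RAM range req %s for [mem %#010lx-%#010lx], got %s\n",
			mode_name(requested), start, end, mode_name(tracked));
		*warned = 1;
	}
	return tracked;
}
```

Calling `check_ram_memtype(MODE_WC, MODE_WB, 0x064a2000UL, 0x064a2fffUL, &warned)` models one of the reported ranges: the caller asked for write-combining, the range is tracked as write-back, so the model warns and hands back write-back.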
I have another problem with this patch since it landed in mainline. On my m3-6Y30 Skylake HD Graphics 515 (rev 07), it causes visual artifacts that look like a bunch of one-pixel-high horizontal streaks, seen most often in Firefox while scrolling or in menu controls. Reverting this patch on top of current mainline fixes the problem.
As an ad-hoc experiment: can you replace the call to remap_pfn_range with remap_pfn_range_notrack (and export it if you build i915 modular) in remap_io_sg and see if that makes any difference?
Christoph Hellwig <hch@lst.de> writes:

> As an ad-hoc experiment: can you replace the call to remap_pfn_range
> with remap_pfn_range_notrack (and export it if you build i915 modular)
> in remap_io_sg and see if that makes any difference?

That worked, thanks -- no artifacts seen.
On Mon, May 17, 2021 at 04:09:42PM +0300, Serge Belyshev wrote:
> Christoph Hellwig <hch@lst.de> writes:
>
> > As an ad-hoc experiment: can you replace the call to remap_pfn_range
> > with remap_pfn_range_notrack (and export it if you build i915 modular)
> > in remap_io_sg and see if that makes any difference?
>
> That worked, thanks -- no artifacts seen.

Looks like it is caused by the validation failure then. Which means the
existing code is doing something wrong in its choice of the page
protection bit. I really need help from the i915 maintainers here..
On Mon, 17 May 2021 at 14:11, Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, May 17, 2021 at 04:09:42PM +0300, Serge Belyshev wrote:
> > Christoph Hellwig <hch@lst.de> writes:
> >
> > > As an ad-hoc experiment: can you replace the call to remap_pfn_range
> > > with remap_pfn_range_notrack (and export it if you build i915 modular)
> > > in remap_io_sg and see if that makes any difference?
> >
> > That worked, thanks -- no artifacts seen.
>
> Looks like it is caused by the validation failure then. Which means the
> existing code is doing something wrong in its choice of the page
> protection bit. I really need help from the i915 maintainers here..

AFAIK there are two users of remap_io_sg, the first is our shmem
objects(see i915_gem_shmem.c), and for these we support UC, WC, and WB
mmap modes for userspace. The other user is device local-memory
objects(VRAM), and for this one we have an actual io_mapping which is
allocated as WC, and IIRC this should only be mapped as WC for the
mmap mode, but normal userspace can't hit this path yet.

What do we need to do here? It sounds like shmem backed objects are
allocated as WB for the pages underneath, but i915 allows mapping them
as UC/WC which trips up this track_pfn thing?

> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On 5/17/21 3:11 PM, Christoph Hellwig wrote:
> On Mon, May 17, 2021 at 04:09:42PM +0300, Serge Belyshev wrote:
>> Christoph Hellwig <hch@lst.de> writes:
>>
>>> As an ad-hoc experiment: can you replace the call to remap_pfn_range
>>> with remap_pfn_range_notrack (and export it if you build i915 modular)
>>> in remap_io_sg and see if that makes any difference?
>> That worked, thanks -- no artifacts seen.
> Looks like it is caused by the validation failure then. Which means the
> existing code is doing something wrong in its choice of the page
> protection bit. I really need help from the i915 maintainers here..

Hmm,

Apart from the caching aliasing Matthew brought up, doesn't the
remap_pfn_range_xxx() family require the mmap_sem held in write mode
since it modifies the vma structure? remap_io_sg() is called from the
fault handler with the mmap_sem held in read mode only.

/Thomas
On 5/17/21 11:46 PM, Thomas Hellström wrote:
>
> On 5/17/21 3:11 PM, Christoph Hellwig wrote:
>> On Mon, May 17, 2021 at 04:09:42PM +0300, Serge Belyshev wrote:
>>> Christoph Hellwig <hch@lst.de> writes:
>>>
>>>> As an ad-hoc experiment: can you replace the call to remap_pfn_range
>>>> with remap_pfn_range_notrack (and export it if you build i915 modular)
>>>> in remap_io_sg and see if that makes any difference?
>>> That worked, thanks -- no artifacts seen.
>> Looks like it is caused by the validation failure then. Which means the
>> existing code is doing something wrong in its choice of the page
>> protection bit. I really need help from the i915 maintainers here..
>
> Hmm,
>
> Apart from the caching aliasing Matthew brought up, doesn't the
> remap_pfn_range_xxx() family require the mmap_sem held in write mode
> since it modifies the vma structure? remap_io_sg() is called from the
> fault handler with the mmap_sem held in read mode only.
>
> /Thomas

And worse, if we prefault a user-space buffer object map using
remap_io_sg() and then zap some ptes using madvise(), the next time
those ptes are accessed, we'd trigger a new call to remap_io_sg() which
would now find already populated ptes. While the old code looks to just
silently overwrite those, it looks like the new code would BUG in
remap_pte_range()?

/Thomas
On Mon, May 17, 2021 at 06:06:44PM +0100, Matthew Auld wrote:
> > Looks like it is caused by the validation failure then. Which means the
> > existing code is doing something wrong in its choice of the page
> > protection bit. I really need help from the i915 maintainers here..
>
> AFAIK there are two users of remap_io_sg, the first is our shmem
> objects(see i915_gem_shmem.c), and for these we support UC, WC, and WB
> mmap modes for userspace. The other user is device local-memory
> objects(VRAM), and for this one we have an actual io_mapping which is
> allocated as WC, and IIRC this should only be mapped as WC for the
> mmap mode, but normal userspace can't hit this path yet.

The only caller in current mainline is vm_fault_cpu in i915_gem_mman.c.
Is that device local?

> What do we need to do here? It sounds like shmem backed objects are
> allocated as WB for the pages underneath, but i915 allows mapping them
> as UC/WC which trips up this track_pfn thing?

To me the warnings look like system memory is mapped with the wrong
permissions, yes. If you want to map it as UC/WC the right set_memory_*
needs to be used on the kernel mapping as well to ensure that the
attributes don't conflict.
On Mon, May 17, 2021 at 11:46:35PM +0200, Thomas Hellström wrote:
> Apart from the caching aliasing Matthew brought up, doesn't the
> remap_pfn_range_xxx() family require the mmap_sem held in write mode since
> it modifies the vma structure? remap_io_sg() is called from the fault
> handler with the mmap_sem held in read mode only.

Only for vma->vm_flags, and remap_sg already asserts all the interesting
flags are set, although it does not assert VM_IO.

We could move the assignment out of remap_pfn_range_notrack and
into remap_pfn_range and just assert that the proper flags are set,
though.
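The split proposed in this reply can be sketched as a small userspace model. Everything here is hypothetical: the flag values are made up, `struct model_vma`, `model_remap()` and `model_remap_notrack()` are illustrative names, and checking VM_IO alongside the other three flags is an assumption (the reply notes that the existing i915 code does not assert it).

```c
#include <assert.h>

/* Illustrative flag bits; the values are made up for this model. */
#define VM_IO         0x1u
#define VM_PFNMAP     0x2u
#define VM_DONTEXPAND 0x4u
#define VM_DONTDUMP   0x8u
#define REMAP_FLAGS   (VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP)

struct model_vma {
	unsigned int vm_flags;
};

/*
 * Variant that only reads vm_flags: it checks that the caller prepared
 * the vma and never writes to it, which is what a fault handler holding
 * the lock in read mode would need.
 */
static int model_remap_notrack(const struct model_vma *vma)
{
	assert(vma);
	if ((vma->vm_flags & REMAP_FLAGS) != REMAP_FLAGS)
		return -1;	/* flags were not set up front */
	return 0;		/* page-table population would go here */
}

/* Variant that writes vm_flags first, then defers to the checker. */
static int model_remap(struct model_vma *vma)
{
	assert(vma);
	vma->vm_flags |= REMAP_FLAGS;
	return model_remap_notrack(vma);
}
```

In this model only `model_remap()` modifies the vma, so it is the only entry point that would need write-side locking; `model_remap_notrack()` fails instead of fixing up a vma whose flags were not set at mmap time.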
On Tue, May 18, 2021 at 08:46:44AM +0200, Thomas Hellström wrote:
> And worse, if we prefault a user-space buffer object map using
> remap_io_sg() and then zap some ptes using madvise(), the next time those
> ptes are accessed, we'd trigger a new call to remap_io_sg() which would now
> find already populated ptes. While the old code looks to just silently
> overwrite those, it looks like the new code would BUG in remap_pte_range()?

How can you zap the PTEs using madvise?
On 5/18/21 3:24 PM, Christoph Hellwig wrote:
> On Tue, May 18, 2021 at 08:46:44AM +0200, Thomas Hellström wrote:
>> And worse, if we prefault a user-space buffer object map using
>> remap_io_sg() and then zap some ptes using madvise(), the next time those
>> ptes are accessed, we'd trigger a new call to remap_io_sg() which would now
>> find already populated ptes. While the old code looks to just silently
>> overwrite those, it looks like the new code would BUG in remap_pte_range()?
> How can you zap the PTEs using madvise?

Hmm, that's not possible with VM_PFNMAP. My bad. Should be OK then.

/Thomas
On Tue, 18 May 2021 at 14:21, Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, May 17, 2021 at 06:06:44PM +0100, Matthew Auld wrote:
> > > Looks like it is caused by the validation failure then. Which means the
> > > existing code is doing something wrong in its choice of the page
> > > protection bit. I really need help from the i915 maintainers here..
> >
> > AFAIK there are two users of remap_io_sg, the first is our shmem
> > objects(see i915_gem_shmem.c), and for these we support UC, WC, and WB
> > mmap modes for userspace. The other user is device local-memory
> > objects(VRAM), and for this one we have an actual io_mapping which is
> > allocated as WC, and IIRC this should only be mapped as WC for the
> > mmap mode, but normal userspace can't hit this path yet.
>
> The only caller in current mainline is vm_fault_cpu in i915_gem_mman.c.
> Is that device local?

The vm_fault_cpu covers both device local and shmem objects.

>
> > What do we need to do here? It sounds like shmem backed objects are
> > allocated as WB for the pages underneath, but i915 allows mapping them
> > as UC/WC which trips up this track_pfn thing?
>
> To me the warnings look like system memory is mapped with the wrong
> permissions, yes. If you want to map it as UC/WC the right set_memory_*
> needs to be used on the kernel mapping as well to ensure that the
> attributes don't conflict.

AFAIK mmap_offset also supports multiple active mmap modes for a given
object, so set_memory_* should still work here?
On 5/18/21 5:00 PM, Matthew Auld wrote:
> On Tue, 18 May 2021 at 14:21, Christoph Hellwig <hch@lst.de> wrote:
>> On Mon, May 17, 2021 at 06:06:44PM +0100, Matthew Auld wrote:
>>>> Looks like it is caused by the validation failure then. Which means the
>>>> existing code is doing something wrong in its choice of the page
>>>> protection bit. I really need help from the i915 maintainers here..
>>> AFAIK there are two users of remap_io_sg, the first is our shmem
>>> objects(see i915_gem_shmem.c), and for these we support UC, WC, and WB
>>> mmap modes for userspace. The other user is device local-memory
>>> objects(VRAM), and for this one we have an actual io_mapping which is
>>> allocated as WC, and IIRC this should only be mapped as WC for the
>>> mmap mode, but normal userspace can't hit this path yet.
>> The only caller in current mainline is vm_fault_cpu in i915_gem_mman.c.
>> Is that device local?
> The vm_fault_cpu covers both device local and shmem objects.
>
>>> What do we need to do here? It sounds like shmem backed objects are
>>> allocated as WB for the pages underneath, but i915 allows mapping them
>>> as UC/WC which trips up this track_pfn thing?
>> To me the warnings look like system memory is mapped with the wrong
>> permissions, yes. If you want to map it as UC/WC the right set_memory_*
>> needs to be used on the kernel mapping as well to ensure that the
>> attributes don't conflict.
> AFAIK mmap_offset also supports multiple active mmap modes for a given
> object, so set_memory_* should still work here?

No, that won't work because there are active maps with conflicting
caching attributes. I think the history here is that that was assumed to
be OK for integrated graphics that ran only on Intel processors that
promise to never write back unmodified cache lines resulting from
prefetching, like some AMD processors did way back at least.

These conflicting mappings can obviously not be supported for discrete
graphics, but for integrated they are part of the uAPI.

/Thomas
On 5/18/21 3:23 PM, Christoph Hellwig wrote:
> On Mon, May 17, 2021 at 11:46:35PM +0200, Thomas Hellström wrote:
>> Apart from the caching aliasing Matthew brought up, doesn't the
>> remap_pfn_range_xxx() family require the mmap_sem held in write mode since
>> it modifies the vma structure? remap_io_sg() is called from the fault
>> handler with the mmap_sem held in read mode only.
> Only for vma->vm_flags, and remap_sg already asserts all the interesting
> flags are set, although it does not assert VM_IO.
>
> We could move the assignment out of remap_pfn_range_notrack and
> into remap_pfn_range and just assert that the proper flags are set,
> though.

That to me sounds like a way forward. It sounds like, in general, a GPU
prefaulting helper that in the long run also supports faulting huge ptes
is also desired by TTM. Although it looks like that BUG_ON() I pointed
out was hit anyway....

/Thomas
diff --git a/drivers/gpu/drm/i915/i915_mm.c b/drivers/gpu/drm/i915/i915_mm.c
index 9a777b0ff59b05..4c8cd08c672d2d 100644
--- a/drivers/gpu/drm/i915/i915_mm.c
+++ b/drivers/gpu/drm/i915/i915_mm.c
@@ -28,46 +28,10 @@
 
 #include "i915_drv.h"
 
-struct remap_pfn {
-	struct mm_struct *mm;
-	unsigned long pfn;
-	pgprot_t prot;
-
-	struct sgt_iter sgt;
-	resource_size_t iobase;
-};
+#define EXPECTED_FLAGS (VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP)
 
 #define use_dma(io) ((io) != -1)
 
-static inline unsigned long sgt_pfn(const struct remap_pfn *r)
-{
-	if (use_dma(r->iobase))
-		return (r->sgt.dma + r->sgt.curr + r->iobase) >> PAGE_SHIFT;
-	else
-		return r->sgt.pfn + (r->sgt.curr >> PAGE_SHIFT);
-}
-
-static int remap_sg(pte_t *pte, unsigned long addr, void *data)
-{
-	struct remap_pfn *r = data;
-
-	if (GEM_WARN_ON(!r->sgt.sgp))
-		return -EINVAL;
-
-	/* Special PTE are not associated with any struct page */
-	set_pte_at(r->mm, addr, pte,
-		   pte_mkspecial(pfn_pte(sgt_pfn(r), r->prot)));
-	r->pfn++; /* track insertions in case we need to unwind later */
-
-	r->sgt.curr += PAGE_SIZE;
-	if (r->sgt.curr >= r->sgt.max)
-		r->sgt = __sgt_iter(__sg_next(r->sgt.sgp), use_dma(r->iobase));
-
-	return 0;
-}
-
-#define EXPECTED_FLAGS (VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP)
-
 /**
  * remap_io_sg - remap an IO mapping to userspace
  * @vma: user vma to map to
@@ -82,12 +46,7 @@ int remap_io_sg(struct vm_area_struct *vma,
 		unsigned long addr, unsigned long size,
 		struct scatterlist *sgl, resource_size_t iobase)
 {
-	struct remap_pfn r = {
-		.mm = vma->vm_mm,
-		.prot = vma->vm_page_prot,
-		.sgt = __sgt_iter(sgl, use_dma(iobase)),
-		.iobase = iobase,
-	};
+	unsigned long pfn, len, remapped = 0;
 	int err;
 
 	/* We rely on prevalidation of the io-mapping to skip track_pfn(). */
@@ -96,11 +55,25 @@ int remap_io_sg(struct vm_area_struct *vma,
 	if (!use_dma(iobase))
 		flush_cache_range(vma, addr, size);
 
-	err = apply_to_page_range(r.mm, addr, size, remap_sg, &r);
-	if (unlikely(err)) {
-		zap_vma_ptes(vma, addr, r.pfn << PAGE_SHIFT);
-		return err;
-	}
-
-	return 0;
+	do {
+		if (use_dma(iobase)) {
+			if (!sg_dma_len(sgl))
+				break;
+			pfn = (sg_dma_address(sgl) + iobase) >> PAGE_SHIFT;
+			len = sg_dma_len(sgl);
+		} else {
+			pfn = page_to_pfn(sg_page(sgl));
+			len = sgl->length;
+		}
+
+		err = remap_pfn_range(vma, addr + remapped, pfn, len,
+				      vma->vm_page_prot);
+		if (err)
+			break;
+		remapped += len;
+	} while ((sgl = __sg_next(sgl)));
+
+	if (err)
+		zap_vma_ptes(vma, addr, remapped);
+	return err;
 }
remap_io_sg claims that the pgprot is pre-verified using an io_mapping,
but actually does not get passed an io_mapping and just uses the pgprot in
the VMA. Remove the apply_to_page_range abuse and just loop over
remap_pfn_range for each segment.

Note: this could use io_mapping_map_user by passing an iomap to
remap_io_sg if the maintainers can verify that the pgprot in the iomap in
the only caller is indeed the desired one here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/gpu/drm/i915/i915_mm.c | 73 +++++++++++-----------------------
 1 file changed, 23 insertions(+), 50 deletions(-)
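The per-segment loop described in the commit message can be modelled in plain userspace C to show the unwind behaviour. This is a sketch, not the driver code: `struct model_seg`, `struct model_map`, `model_remap_range()` and `model_remap_sg()` are hypothetical stand-ins, the mapper is a mock that fails on request, and only the system-memory case is modelled. The model initialises `err` to 0, which the posted diff does not do explicitly.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12

/* Hypothetical stand-in for one scatterlist segment. */
struct model_seg {
	unsigned long phys;	/* physical start address of the segment */
	unsigned long len;	/* length in bytes; 0 ends the walk early */
};

/* Bookkeeping for the mock mapper, so the unwind can be observed. */
struct model_map {
	int calls;		/* segments handed to the mapper so far */
	int fail_on_call;	/* 1-based call that fails, 0 for never */
	unsigned long zapped;	/* bytes torn down after a failure */
};

/* Mock of remap_pfn_range(): succeeds unless told to fail on this call. */
static int model_remap_range(struct model_map *m, unsigned long addr,
			     unsigned long pfn, unsigned long len)
{
	(void)addr;
	(void)pfn;
	(void)len;
	m->calls++;
	return (m->calls == m->fail_on_call) ? -1 : 0;
}

/* Map each segment in turn; on failure, unwind what was mapped so far. */
static int model_remap_sg(struct model_map *m, unsigned long addr,
			  const struct model_seg *segs, size_t nsegs)
{
	unsigned long remapped = 0;
	int err = 0;
	size_t i;

	assert(m && segs);
	for (i = 0; i < nsegs && segs[i].len; i++) {
		unsigned long pfn = segs[i].phys >> MODEL_PAGE_SHIFT;

		err = model_remap_range(m, addr + remapped, pfn, segs[i].len);
		if (err)
			break;
		remapped += segs[i].len;
	}

	/* Stands in for zap_vma_ptes(vma, addr, remapped) in the diff. */
	if (err)
		m->zapped = remapped;
	return err;
}
```

With three segments of 0x1000, 0x2000 and 0x1000 bytes and a mapper that fails on the third call, the model returns the error and records 0x3000 bytes as zapped, which is the range covered by the two segments that were mapped before the failure.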