Message ID | 1391805427-4576-9-git-send-email-rodrigo.vivi@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Feb 07, 2014 at 06:37:06PM -0200, Rodrigo Vivi wrote: > From: Chris Wilson <chris@chris-wilson.co.uk> > > Arjan van de Ven reported that on his test machine that he was seeing > stalls of greater than 1 frame greatly impacting the user experience. He > tracked this down to being the locked flush during a pagefault as being > the culprit hogging the struct_mutex and so blocking any other user from > proceeding. Stalling on a pagefault is bad behaviour on userspace's > part, for one it means that they are ignoring the coherency rules on > pointer access through the GTT, but fortunately we can apply the same > trick as the set-to-domain ioctl to do a lightweight, nonblocking flush > of outstanding rendering first. > > "Prior to the patch it looks like this > (this one testrun does not show the 20ms+ I've seen occasionally) > > 4.99 ms 2.36 ms 31360 __wait_seqno i915_wait_seqno i915_gem_object_wait_rendering i915_gem_object_set_to_gtt_domain i915_gem_fault __do_fault handle_ > +pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault > 4.99 ms 2.75 ms 107751 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 4.99 ms 1.63 ms 1666 i915_mutex_lock_interruptible i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fa > +ult > 4.93 ms 2.45 ms 980 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ > +sysret > 4.89 ms 2.20 ms 3283 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 4.34 ms 1.66 ms 1715 i915_mutex_lock_interruptible i915_gem_pwrite_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 3.73 ms 3.73 ms 49 i915_mutex_lock_interruptible i915_gem_set_domain_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 3.17 ms 0.33 ms 931 i915_mutex_lock_interruptible i915_gem_madvise_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 2.97 ms 0.43 ms 1029 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 2.55 ms 0.51 ms 735 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > After the patch it looks like this: > > 4.99 ms 2.14 ms 22212 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 4.86 ms 0.99 ms 14170 __wait_seqno i915_gem_object_wait_rendering__nonblocking i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_ > +fault do_page_fault page_fault > 3.59 ms 1.31 ms 325 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 3.37 ms 3.37 ms 65 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 2.58 ms 2.58 ms 65 i915_mutex_lock_interruptible i915_gem_do_execbuffer.isra.23 i915_gem_execbuffer2 drm_ioctl i915_compat_ioctl compat_sys_ioctl > +ia32_sysret > 2.19 ms 2.19 ms 65 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ > +sysret > 2.18 ms 2.18 ms 65 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > 1.66 ms 1.66 ms 65 i915_gem_set_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > It may not look like it, but this is quite a large difference, and I've > been unable to reproduce > 5 msec delays at all, while before they do > happen (just not in the trace above)." > > gem_gtt_hog on an old Pineview (GMA3150), > before: 4969.119ms > after: 4122.749ms > > Reported-by: Arjan van de Ven <arjan.van.de.ven@intel.com> > Testcase: igt/gem_gtt_hog > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
On Mon, Feb 10, 2014 at 03:52:00PM +0000, Damien Lespiau wrote: > On Fri, Feb 07, 2014 at 06:37:06PM -0200, Rodrigo Vivi wrote: > > From: Chris Wilson <chris@chris-wilson.co.uk> > > > > Arjan van de Ven reported that on his test machine that he was seeing > > stalls of greater than 1 frame greatly impacting the user experience. He > > tracked this down to being the locked flush during a pagefault as being > > the culprit hogging the struct_mutex and so blocking any other user from > > proceeding. Stalling on a pagefault is bad behaviour on userspace's > > part, for one it means that they are ignoring the coherency rules on > > pointer access through the GTT, but fortunately we can apply the same > > trick as the set-to-domain ioctl to do a lightweight, nonblocking flush > > of outstanding rendering first. > > > > "Prior to the patch it looks like this > > (this one testrun does not show the 20ms+ I've seen occasionally) > > > > 4.99 ms 2.36 ms 31360 __wait_seqno i915_wait_seqno i915_gem_object_wait_rendering i915_gem_object_set_to_gtt_domain i915_gem_fault __do_fault handle_ > > +pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault > > 4.99 ms 2.75 ms 107751 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 4.99 ms 1.63 ms 1666 i915_mutex_lock_interruptible i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fa > > +ult > > 4.93 ms 2.45 ms 980 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ > > +sysret > > 4.89 ms 2.20 ms 3283 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 4.34 ms 1.66 ms 1715 i915_mutex_lock_interruptible i915_gem_pwrite_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 3.73 ms 3.73 ms 49 i915_mutex_lock_interruptible i915_gem_set_domain_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 3.17 ms 0.33 ms 931 i915_mutex_lock_interruptible i915_gem_madvise_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 2.97 ms 0.43 ms 1029 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 2.55 ms 0.51 ms 735 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > > > After the patch it looks like this: > > > > 4.99 ms 2.14 ms 22212 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 4.86 ms 0.99 ms 14170 __wait_seqno i915_gem_object_wait_rendering__nonblocking i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_ > > +fault do_page_fault page_fault > > 3.59 ms 1.31 ms 325 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 3.37 ms 3.37 ms 65 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 2.58 ms 2.58 ms 65 i915_mutex_lock_interruptible i915_gem_do_execbuffer.isra.23 i915_gem_execbuffer2 drm_ioctl i915_compat_ioctl compat_sys_ioctl > > +ia32_sysret > > 2.19 ms 2.19 ms 65 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ > > +sysret > > 2.18 ms 2.18 ms 65 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > 1.66 ms 1.66 ms 65 i915_gem_set_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret > > > > It may not look like it, but this is quite a large difference, and I've > > been unable to reproduce > 5 msec delays at all, while before they do > > happen (just not in the trace above)." > > > > gem_gtt_hog on an old Pineview (GMA3150), > > before: 4969.119ms > > after: 4122.749ms > > > > Reported-by: Arjan van de Ven <arjan.van.de.ven@intel.com> > > Testcase: igt/gem_gtt_hog > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> > > Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Queued for -next, thanks for the patch. -Daniel
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index a8a069f..6008d88 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1184,7 +1184,7 @@ i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj, */ static __must_check int i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj, - struct drm_file *file, + struct drm_i915_file_private *file_priv, bool readonly) { struct drm_device *dev = obj->base.dev; @@ -1211,7 +1211,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj, reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter); mutex_unlock(&dev->struct_mutex); - ret = __wait_seqno(ring, seqno, reset_counter, true, NULL, file->driver_priv); + ret = __wait_seqno(ring, seqno, reset_counter, true, NULL, file_priv); mutex_lock(&dev->struct_mutex); if (ret) return ret; @@ -1260,7 +1260,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, * We will repeat the flush holding the lock in the normal manner * to catch cases where we are gazumped. */ - ret = i915_gem_object_wait_rendering__nonblocking(obj, file, !write_domain); + ret = i915_gem_object_wait_rendering__nonblocking(obj, + file->driver_priv, + !write_domain); if (ret) goto unref; @@ -1392,6 +1394,15 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) trace_i915_gem_object_fault(obj, page_offset, true, write); + /* Try to flush the object off the GPU first without holding the lock. + * Upon reacquiring the lock, we will perform our sanity checks and then + * repeat the flush holding the lock in the normal manner to catch cases + * where we are gazumped. + */ + ret = i915_gem_object_wait_rendering__nonblocking(obj, NULL, !write); + if (ret) + goto unlock; + /* Access to snoopable pages through the GTT is incoherent. */ if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(dev)) { ret = -EINVAL;