From patchwork Fri Feb 7 20:37:06 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rodrigo Vivi X-Patchwork-Id: 3607711 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 476C69F344 for ; Fri, 7 Feb 2014 20:37:50 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 5843820131 for ; Fri, 7 Feb 2014 20:37:49 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 5C5062012F for ; Fri, 7 Feb 2014 20:37:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A25841057B1; Fri, 7 Feb 2014 12:37:46 -0800 (PST) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mail-yk0-f171.google.com (mail-yk0-f171.google.com [209.85.160.171]) by gabe.freedesktop.org (Postfix) with ESMTP id 7A7261057B0 for ; Fri, 7 Feb 2014 12:37:37 -0800 (PST) Received: by mail-yk0-f171.google.com with SMTP id q9so510199ykb.2 for ; Fri, 07 Feb 2014 12:37:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=B65GnhYdPBcJb+1fsOIncGE9OoEtKMhTiUU2HrzETzo=; b=cbmnjDhzka9+ThpXdqaY33m2jTF7sT16t0zxUoHbiTtvKTAPiO39imUTfGhMgp65f4 4kpOHDusUkQZYMln/LiGuklTW9oG92pIiX3DZsmwAkDu4KvNZSZc4FArusbh+K1PQ8li OsPk4vJFakaSqybTzqKOWK0/B67n/3VLTnhPLu00WcZi1fHZHinVOFz5kpymvQe15QQg w7CoNWQgSaT70PTK6tj/4Ft3UXYq8LWFXNatscFQsne9fb/Qjn0+01EbDSnPearvD3xI 1guEBx8Po6hSkHaejsIJRLPEQ7yGGU7Ng4c9STHMBMsQwa1K7HFmkItbGE2RWXGoAfl7 4SNQ== X-Received: by 10.236.46.18 with SMTP id q18mr12711893yhb.21.1391805455754; Fri, 07 Feb 2014 12:37:35 -0800 (PST) Received: from localhost.localdomain ([189.67.97.168]) by mx.google.com with ESMTPSA id q9sm13697981yhk.16.2014.02.07.12.37.33 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 07 Feb 2014 12:37:35 -0800 (PST) From: Rodrigo Vivi To: intel-gfx@lists.freedesktop.org Date: Fri, 7 Feb 2014 18:37:06 -0200 Message-Id: <1391805427-4576-9-git-send-email-rodrigo.vivi@gmail.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1391805427-4576-1-git-send-email-rodrigo.vivi@gmail.com> References: <1391805427-4576-1-git-send-email-rodrigo.vivi@gmail.com> Subject: [Intel-gfx] [PATCH 8/9] drm/i915: Flush GPU rendering with a lockless wait during a pagefault X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: intel-gfx-bounces@lists.freedesktop.org Errors-To: intel-gfx-bounces@lists.freedesktop.org X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, T_DKIM_INVALID, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Chris Wilson Arjan van de Ven reported that on his test machine that he was seeing stalls of greater than 1 frame greatly impacting the user experience. He tracked this down to being the locked flush during a pagefault as being the culprit hogging the struct_mutex and so blocking any other user from proceeding. Stalling on a pagefault is bad behaviour on userspace's part, for one it means that they are ignoring the coherency rules on pointer access through the GTT, but fortunately we can apply the same trick as the set-to-domain ioctl to do a lightweight, nonblocking flush of outstanding rendering first. "Prior to the patch it looks like this (this one testrun does not show the 20ms+ I've seen occasionally) 4.99 ms 2.36 ms 31360 __wait_seqno i915_wait_seqno i915_gem_object_wait_rendering i915_gem_object_set_to_gtt_domain i915_gem_fault __do_fault handle_ +pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault 4.99 ms 2.75 ms 107751 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 4.99 ms 1.63 ms 1666 i915_mutex_lock_interruptible i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fa +ult 4.93 ms 2.45 ms 980 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ +sysret 4.89 ms 2.20 ms 3283 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 4.34 ms 1.66 ms 1715 i915_mutex_lock_interruptible i915_gem_pwrite_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 3.73 ms 3.73 ms 49 i915_mutex_lock_interruptible i915_gem_set_domain_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 3.17 ms 0.33 ms 931 i915_mutex_lock_interruptible i915_gem_madvise_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 2.97 ms 0.43 ms 1029 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 2.55 ms 0.51 ms 735 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret After the patch it looks like this: 4.99 ms 2.14 ms 22212 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 4.86 ms 0.99 ms 14170 __wait_seqno i915_gem_object_wait_rendering__nonblocking i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_ +fault do_page_fault page_fault 3.59 ms 1.31 ms 325 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 3.37 ms 3.37 ms 65 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 2.58 ms 2.58 ms 65 i915_mutex_lock_interruptible i915_gem_do_execbuffer.isra.23 i915_gem_execbuffer2 drm_ioctl i915_compat_ioctl compat_sys_ioctl +ia32_sysret 2.19 ms 2.19 ms 65 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ +sysret 2.18 ms 2.18 ms 65 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 1.66 ms 1.66 ms 65 i915_gem_set_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret It may not look like it, but this is quite a large difference, and I've been unable to reproduce > 5 msec delays at all, while before they do happen (just not in the trace above)." gem_gtt_hog on an old Pineview (GMA3150), before: 4969.119ms after: 4122.749ms Reported-by: Arjan van de Ven Testcase: igt/gem_gtt_hog Signed-off-by: Chris Wilson Signed-off-by: Rodrigo Vivi Reviewed-by: Damien Lespiau --- drivers/gpu/drm/i915/i915_gem.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index a8a069f..6008d88 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1184,7 +1184,7 @@ i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj, */ static __must_check int i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj, - struct drm_file *file, + struct drm_i915_file_private *file_priv, bool readonly) { struct drm_device *dev = obj->base.dev; @@ -1211,7 +1211,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj, reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter); mutex_unlock(&dev->struct_mutex); - ret = __wait_seqno(ring, seqno, reset_counter, true, NULL, file->driver_priv); + ret = __wait_seqno(ring, seqno, reset_counter, true, NULL, file_priv); mutex_lock(&dev->struct_mutex); if (ret) return ret; @@ -1260,7 +1260,9 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data, * We will repeat the flush holding the lock in the normal manner * to catch cases where we are gazumped. */ - ret = i915_gem_object_wait_rendering__nonblocking(obj, file, !write_domain); + ret = i915_gem_object_wait_rendering__nonblocking(obj, + file->driver_priv, + !write_domain); if (ret) goto unref; @@ -1392,6 +1394,15 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf) trace_i915_gem_object_fault(obj, page_offset, true, write); + /* Try to flush the object off the GPU first without holding the lock. + * Upon reacquiring the lock, we will perform our sanity checks and then + * repeat the flush holding the lock in the normal manner to catch cases + * where we are gazumped. + */ + ret = i915_gem_object_wait_rendering__nonblocking(obj, NULL, !write); + if (ret) + goto unlock; + /* Access to snoopable pages through the GTT is incoherent. */ if (obj->cache_level != I915_CACHE_NONE && !HAS_LLC(dev)) { ret = -EINVAL;