From patchwork Wed Jun 9 06:34:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12309075 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D81AC47095 for ; Wed, 9 Jun 2021 06:34:54 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C51F861183 for ; Wed, 9 Jun 2021 06:34:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C51F861183 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 557876E0E8; Wed, 9 Jun 2021 06:34:50 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1FFD66E0F0; Wed, 9 Jun 2021 06:34:49 +0000 (UTC) IronPort-SDR: KRHKNDur1eqLllg6PV6dKZNslBSRuSyTORkC4QJXzwq781r0hfxrZ3a8v4wdMV0gL3crOld7Dl Iac/azKB5kiA== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="268868878" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="268868878" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:49 -0700 IronPort-SDR: 3XNMiaRNqGnqwhCFvWHSODuOYSBdd0UcYUT9aLj54Vnt2daGV7h6WFfkDv9kDqH+/vl4srAHqe pWh9RRLfYmHA== X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482265769" Received: from ekolpasx-mobl.ccr.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.109]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:47 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 9 Jun 2021 08:34:28 +0200 Message-Id: <20210609063436.284332-2-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> References: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 1/9] drm/i915: Reference objects on the ww object list X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Since the ww transaction endpoint easily end up far out-of-scope of the objects on the ww object list, particularly for contending lock objects, make sure we reference objects on the list so they don't disappear under us. This comes with a performance penalty so it's been debated whether this is really needed. But I think this is motivated by the fact that locking is typically difficult to get right, and whatever we can do to make it simpler for developers moving forward should be done, unless the performance impact is far too high. Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 8 ++++++-- drivers/gpu/drm/i915/i915_gem.c | 4 ++++ 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 7c0eb425cb3b..1fafcc89ecee 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -169,13 +169,17 @@ static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj, else ret = dma_resv_lock(obj->base.resv, ww ? &ww->ctx : NULL); - if (!ret && ww) + if (!ret && ww) { + i915_gem_object_get(obj); list_add_tail(&obj->obj_link, &ww->obj_list); + } if (ret == -EALREADY) ret = 0; - if (ret == -EDEADLK) + if (ret == -EDEADLK) { + i915_gem_object_get(obj); ww->contended = obj; + } return ret; } diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 589388dec48a..3f060ab58c5d 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1219,6 +1219,7 @@ static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww) while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) { list_del(&obj->obj_link); i915_gem_object_unlock(obj); + i915_gem_object_put(obj); } } @@ -1226,6 +1227,7 @@ void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj) { list_del(&obj->obj_link); i915_gem_object_unlock(obj); + i915_gem_object_put(obj); } void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww) @@ -1250,6 +1252,8 @@ int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww) if (!ret) list_add_tail(&ww->contended->obj_link, &ww->obj_list); + else + i915_gem_object_put(ww->contended); ww->contended = NULL; From patchwork Wed Jun 9 06:34:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12309077 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BF9E3C48BCF for ; Wed, 9 Jun 2021 06:34:57 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8329B61183 for ; Wed, 9 Jun 2021 06:34:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8329B61183 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D5CFE6E0F1; Wed, 9 Jun 2021 06:34:52 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1DA906E0F1; Wed, 9 Jun 2021 06:34:51 +0000 (UTC) IronPort-SDR: qLCP2QSPDUGHipZTmxq4ftzwH2vj4nzYJs9qipwc1i5R9e/Y7uQMqpjwp+d8pinY4WJZzRECKh KuiFF/xEt/Wg== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="268868885" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="268868885" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:51 -0700 IronPort-SDR: dQq7QX8zLlccjLafsYmxCDm9hCEvDpXY+gymRdlcg6fQ98qLA/90zvrFb+76VDhm17mu3fuJUj AdFRqR+79QbA== X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482265780" Received: from ekolpasx-mobl.ccr.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.109]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:49 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 9 Jun 2021 08:34:29 +0200 Message-Id: <20210609063436.284332-3-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> References: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 2/9] drm/i915: Break out dma_resv ww locking utilities to separate files X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" As we're about to add more ww-related functionality, break out the dma_resv ww locking utilities to their own files Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld --- v2: - Make sure filenames are sorted in include file lists and Makefile (Reported by Matthew Auld) --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gem/i915_gem_object.h | 1 + drivers/gpu/drm/i915/gt/intel_renderstate.h | 1 + drivers/gpu/drm/i915/i915_gem.c | 56 ------------------ drivers/gpu/drm/i915/i915_gem.h | 12 ---- drivers/gpu/drm/i915/i915_gem_ww.c | 63 +++++++++++++++++++++ drivers/gpu/drm/i915/i915_gem_ww.h | 21 +++++++ 7 files changed, 87 insertions(+), 68 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.c create mode 100644 drivers/gpu/drm/i915/i915_gem_ww.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 4f22cac1c49b..16a5a006cf7c 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -164,6 +164,7 @@ i915-y += \ i915_cmd_parser.o \ i915_gem_evict.o \ i915_gem_gtt.o \ + i915_gem_ww.o \ i915_gem.o \ i915_globals.o \ i915_query.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 1fafcc89ecee..bdab929e448b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -14,6 +14,7 @@ #include "display/intel_frontbuffer.h" #include "i915_gem_object_types.h" #include "i915_gem_gtt.h" +#include "i915_gem_ww.h" #include "i915_vma_types.h" /* diff --git a/drivers/gpu/drm/i915/gt/intel_renderstate.h b/drivers/gpu/drm/i915/gt/intel_renderstate.h index 48f009203917..4da4c5234ef0 100644 --- a/drivers/gpu/drm/i915/gt/intel_renderstate.h +++ b/drivers/gpu/drm/i915/gt/intel_renderstate.h @@ -8,6 +8,7 @@ #include #include "i915_gem.h" +#include "i915_gem_ww.h" struct i915_request; struct intel_context; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 3f060ab58c5d..ce64d3005cf2 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -1204,62 +1204,6 @@ int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file) return ret; } -void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ww, bool intr) -{ - ww_acquire_init(&ww->ctx, &reservation_ww_class); - INIT_LIST_HEAD(&ww->obj_list); - ww->intr = intr; - ww->contended = NULL; -} - -static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww) -{ - struct drm_i915_gem_object *obj; - - while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) { - list_del(&obj->obj_link); - i915_gem_object_unlock(obj); - i915_gem_object_put(obj); - } -} - -void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj) -{ - list_del(&obj->obj_link); - i915_gem_object_unlock(obj); - i915_gem_object_put(obj); -} - -void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww) -{ - i915_gem_ww_ctx_unlock_all(ww); - WARN_ON(ww->contended); - ww_acquire_fini(&ww->ctx); -} - -int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww) -{ - int ret = 0; - - if (WARN_ON(!ww->contended)) - return -EINVAL; - - i915_gem_ww_ctx_unlock_all(ww); - if (ww->intr) - ret = dma_resv_lock_slow_interruptible(ww->contended->base.resv, &ww->ctx); - else - dma_resv_lock_slow(ww->contended->base.resv, &ww->ctx); - - if (!ret) - list_add_tail(&ww->contended->obj_link, &ww->obj_list); - else - i915_gem_object_put(ww->contended); - - ww->contended = NULL; - - return ret; -} - #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftests/mock_gem_device.c" #include "selftests/i915_gem.c" diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h index 440c35f1abc9..d0752e5553db 100644 --- a/drivers/gpu/drm/i915/i915_gem.h +++ b/drivers/gpu/drm/i915/i915_gem.h @@ -123,16 +123,4 @@ static inline bool __tasklet_is_scheduled(struct tasklet_struct *t) return test_bit(TASKLET_STATE_SCHED, &t->state); } -struct i915_gem_ww_ctx { - struct ww_acquire_ctx ctx; - struct list_head obj_list; - bool intr; - struct drm_i915_gem_object *contended; -}; - -void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr); -void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx); -int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx); -void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj); - #endif /* __I915_GEM_H__ */ diff --git a/drivers/gpu/drm/i915/i915_gem_ww.c b/drivers/gpu/drm/i915/i915_gem_ww.c new file mode 100644 index 000000000000..3f6ff139478e --- /dev/null +++ b/drivers/gpu/drm/i915/i915_gem_ww.c @@ -0,0 +1,63 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2020 Intel Corporation + */ +#include +#include "i915_gem_ww.h" +#include "gem/i915_gem_object.h" + +void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ww, bool intr) +{ + ww_acquire_init(&ww->ctx, &reservation_ww_class); + INIT_LIST_HEAD(&ww->obj_list); + ww->intr = intr; + ww->contended = NULL; +} + +static void i915_gem_ww_ctx_unlock_all(struct i915_gem_ww_ctx *ww) +{ + struct drm_i915_gem_object *obj; + + while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) { + list_del(&obj->obj_link); + i915_gem_object_unlock(obj); + i915_gem_object_put(obj); + } +} + +void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj) +{ + list_del(&obj->obj_link); + i915_gem_object_unlock(obj); + i915_gem_object_put(obj); +} + +void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ww) +{ + i915_gem_ww_ctx_unlock_all(ww); + WARN_ON(ww->contended); + ww_acquire_fini(&ww->ctx); +} + +int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ww) +{ + int ret = 0; + + if (WARN_ON(!ww->contended)) + return -EINVAL; + + i915_gem_ww_ctx_unlock_all(ww); + if (ww->intr) + ret = dma_resv_lock_slow_interruptible(ww->contended->base.resv, &ww->ctx); + else + dma_resv_lock_slow(ww->contended->base.resv, &ww->ctx); + + if (!ret) + list_add_tail(&ww->contended->obj_link, &ww->obj_list); + else + i915_gem_object_put(ww->contended); + + ww->contended = NULL; + + return ret; +} diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h b/drivers/gpu/drm/i915/i915_gem_ww.h new file mode 100644 index 000000000000..f2d8769e4118 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_gem_ww.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2020 Intel Corporation + */ +#ifndef __I915_GEM_WW_H__ +#define __I915_GEM_WW_H__ + +#include + +struct i915_gem_ww_ctx { + struct ww_acquire_ctx ctx; + struct list_head obj_list; + struct drm_i915_gem_object *contended; + bool intr; +}; + +void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr); +void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx); +int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx); +void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj); +#endif From patchwork Wed Jun 9 06:34:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12309079 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88CD1C48BCF for ; Wed, 9 Jun 2021 06:35:00 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 56C9261181 for ; Wed, 9 Jun 2021 06:35:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 56C9261181 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C0FA96E158; Wed, 9 Jun 2021 06:34:55 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 533BE6E12D; Wed, 9 Jun 2021 06:34:53 +0000 (UTC) IronPort-SDR: 8RzCKhbXRcIbxMP2v9OScw+k3X7NuoyZP556I80sEcDeDv1XIayOgtnmX6EiHO1I/Msw1D/d0T vR1Vq4kb7vkQ== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="268868886" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="268868886" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:53 -0700 IronPort-SDR: GFc06J6TZPQXNiNvRBGYm2CLfo+gpWQb9g5A5fhSf2FZzNZIh/qXjpHShHbVOlnPXt0KmcLvey LSdn0U80SNOw== X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482265795" Received: from ekolpasx-mobl.ccr.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.109]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:51 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 9 Jun 2021 08:34:30 +0200 Message-Id: <20210609063436.284332-4-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> References: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 3/9] drm/i915: Introduce a ww transaction helper X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Introduce a for_i915_gem_ww(){} utility to help make the code around a ww transaction more readable. Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/i915_gem_ww.h | 31 +++++++++++++++++++++++++++++- 1 file changed, 30 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_ww.h b/drivers/gpu/drm/i915/i915_gem_ww.h index f2d8769e4118..f6b1a796667b 100644 --- a/drivers/gpu/drm/i915/i915_gem_ww.h +++ b/drivers/gpu/drm/i915/i915_gem_ww.h @@ -11,11 +11,40 @@ struct i915_gem_ww_ctx { struct ww_acquire_ctx ctx; struct list_head obj_list; struct drm_i915_gem_object *contended; - bool intr; + unsigned short intr; + unsigned short loop; }; void i915_gem_ww_ctx_init(struct i915_gem_ww_ctx *ctx, bool intr); void i915_gem_ww_ctx_fini(struct i915_gem_ww_ctx *ctx); int __must_check i915_gem_ww_ctx_backoff(struct i915_gem_ww_ctx *ctx); void i915_gem_ww_unlock_single(struct drm_i915_gem_object *obj); + +/* Internal functions used by the inlines! Don't use. */ +static inline int __i915_gem_ww_fini(struct i915_gem_ww_ctx *ww, int err) +{ + ww->loop = 0; + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(ww); + if (!err) + ww->loop = 1; + } + + if (!ww->loop) + i915_gem_ww_ctx_fini(ww); + + return err; +} + +static inline void +__i915_gem_ww_init(struct i915_gem_ww_ctx *ww, bool intr) +{ + i915_gem_ww_ctx_init(ww, intr); + ww->loop = 1; +} + +#define for_i915_gem_ww(_ww, _err, _intr) \ + for (__i915_gem_ww_init(_ww, _intr); (_ww)->loop; \ + _err = __i915_gem_ww_fini(_ww, _err)) + #endif From patchwork Wed Jun 9 06:34:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12309081 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0589C48BCD for ; Wed, 9 Jun 2021 06:35:03 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6DA3961181 for ; Wed, 9 Jun 2021 06:35:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6DA3961181 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AD9626E12D; Wed, 9 Jun 2021 06:34:56 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2F33C6E154; Wed, 9 Jun 2021 06:34:55 +0000 (UTC) IronPort-SDR: UVwbEOLSFDz1pROuB9lz7uphxwekJlKYDkgoCCokbvFIrEl4VQ54KBWq8ShZFDrvF6OA++AFHm QnS3FLFpfdxw== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="268868890" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="268868890" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:55 -0700 IronPort-SDR: S8VGdF1MU3T7HsfzvNSTktfGcMycqsULjrdiD+Eqamv1BILYzIOWblQkCQG5Do2de5t0th7kyN LNPe/9IJt83Q== X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482265804" Received: from ekolpasx-mobl.ccr.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.109]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:53 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 9 Jun 2021 08:34:31 +0200 Message-Id: <20210609063436.284332-5-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> References: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 4/9] drm/i915/gt: Add an insert_entry for gen8_ppgtt X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson In the next patch, we will want to write a PTE for an explicit dma address, outside of the usual vma. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index 21c8b7350b7a..1b676d7700bf 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -555,6 +555,24 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, } } +static void gen8_ppgtt_insert_entry(struct i915_address_space *vm, + dma_addr_t addr, + u64 offset, + enum i915_cache_level level, + u32 flags) +{ + u64 idx = offset >> GEN8_PTE_SHIFT; + struct i915_page_directory * const pdp = + gen8_pdp_for_page_index(vm, idx); + struct i915_page_directory *pd = + i915_pd_entry(pdp, gen8_pd_index(idx, 2)); + gen8_pte_t *vaddr; + + vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1))); + vaddr[gen8_pd_index(idx, 0)] = gen8_pte_encode(addr, level, flags); + clflush_cache_range(&vaddr[gen8_pd_index(idx, 0)], sizeof(*vaddr)); +} + static int gen8_init_scratch(struct i915_address_space *vm) { u32 pte_flags; @@ -734,6 +752,7 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt) ppgtt->vm.bind_async_flags = I915_VMA_LOCAL_BIND; ppgtt->vm.insert_entries = gen8_ppgtt_insert; + ppgtt->vm.insert_page = gen8_ppgtt_insert_entry; ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc; ppgtt->vm.clear_range = gen8_ppgtt_clear; From patchwork Wed Jun 9 06:34:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12309085 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 364DFC48BCD for ; Wed, 9 Jun 2021 06:35:09 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0794661183 for ; Wed, 9 Jun 2021 06:35:09 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0794661183 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 000656E1F3; Wed, 9 Jun 2021 06:35:00 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2D2336E1C0; Wed, 9 Jun 2021 06:34:57 +0000 (UTC) IronPort-SDR: +m5iekTsAkjL0qbpjDLz1iClDaB5+riER7L+DJ7qh0IcHWFO81bbp1Of1nUQcjv6YUaveu36Ix CcIaE/DDdBYQ== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="268868894" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="268868894" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:57 -0700 IronPort-SDR: 34nTATV2Zjyq3GYXFE8poDW88ZVR8myxlSG5pGMAXapg4av1ynQcBbldPFU6Vj/QQWYjUYBd+r PzLCNVCinprg== X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482265810" Received: from ekolpasx-mobl.ccr.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.109]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:55 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 9 Jun 2021 08:34:32 +0200 Message-Id: <20210609063436.284332-6-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> References: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 5/9] drm/i915/gt: Add a routine to iterate over the pagetables of a GTT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson In the next patch, we will want to look at the dma addresses of individual page tables, so add a routine to iterate over them. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 49 ++++++++++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_gtt.h | 7 ++++ 2 files changed, 56 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index 1b676d7700bf..3d02c726c746 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -361,6 +361,54 @@ static void gen8_ppgtt_alloc(struct i915_address_space *vm, &start, start + length, vm->top); } +static void __gen8_ppgtt_foreach(struct i915_address_space *vm, + struct i915_page_directory *pd, + u64 *start, u64 end, int lvl, + void (*fn)(struct i915_address_space *vm, + struct i915_page_table *pt, + void *data), + void *data) +{ + unsigned int idx, len; + + len = gen8_pd_range(*start, end, lvl--, &idx); + + spin_lock(&pd->lock); + do { + struct i915_page_table *pt = pd->entry[idx]; + + atomic_inc(&pt->used); + spin_unlock(&pd->lock); + + if (lvl) { + __gen8_ppgtt_foreach(vm, as_pd(pt), start, end, lvl, + fn, data); + } else { + fn(vm, pt, data); + *start += gen8_pt_count(*start, end); + } + + spin_lock(&pd->lock); + atomic_dec(&pt->used); + } while (idx++, --len); + spin_unlock(&pd->lock); +} + +static void gen8_ppgtt_foreach(struct i915_address_space *vm, + u64 start, u64 length, + void (*fn)(struct i915_address_space *vm, + struct i915_page_table *pt, + void *data), + void *data) +{ + start >>= GEN8_PTE_SHIFT; + length >>= GEN8_PTE_SHIFT; + + __gen8_ppgtt_foreach(vm, i915_vm_to_ppgtt(vm)->pd, + &start, start + length, vm->top, + fn, data); +} + static __always_inline u64 gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt, struct i915_page_directory *pdp, @@ -755,6 +803,7 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt) ppgtt->vm.insert_page = gen8_ppgtt_insert_entry; ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc; ppgtt->vm.clear_range = gen8_ppgtt_clear; + ppgtt->vm.foreach = gen8_ppgtt_foreach; ppgtt->vm.pte_encode = gen8_pte_encode; diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index edea95b97c36..9bd89f2a01ff 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -296,6 +296,13 @@ struct i915_address_space { u32 flags); void (*cleanup)(struct i915_address_space *vm); + void (*foreach)(struct i915_address_space *vm, + u64 start, u64 length, + void (*fn)(struct i915_address_space *vm, + struct i915_page_table *pt, + void *data), + void *data); + struct i915_vma_ops vma_ops; I915_SELFTEST_DECLARE(struct fault_attr fault_attr); From patchwork Wed Jun 9 06:34:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12309083 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6937BC47095 for ; Wed, 9 Jun 2021 06:35:08 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3AD5161183 for ; Wed, 9 Jun 2021 06:35:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3AD5161183 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 07E106E20E; Wed, 9 Jun 2021 06:35:01 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8A4F06E1F3; Wed, 9 Jun 2021 06:34:59 +0000 (UTC) IronPort-SDR: ueaC5nhoIL/tomZSHU4hpWxkQHF9oXxuzGQsUGo9MGBoUoTNIR+0oOeT6IsQZdNAgY26sd74sj lMnRn4AExwmg== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="268868903" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="268868903" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:59 -0700 IronPort-SDR: SU0r3vS0fy7T5KlRJUfhZGzbrai6tS9ABVuzUN1fVNv4ZBBU+Vi1kT4EAZqRZsIXBSxYBtu9ud SFVdMenyS4Tw== X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482265816" Received: from ekolpasx-mobl.ccr.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.109]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:57 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 9 Jun 2021 08:34:33 +0200 Message-Id: <20210609063436.284332-7-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> References: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 6/9] drm/i915/gt: Export the pinned context constructor and destructor X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: matthew.auld@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson Allow internal clients to create and destroy a pinned context. Signed-off-by: Chris Wilson Reviewed-by: Matthew Auld --- v2: - (Thomas) Export also the pinned context destructor --- drivers/gpu/drm/i915/gt/intel_engine.h | 11 +++++++++ drivers/gpu/drm/i915/gt/intel_engine_cs.c | 27 ++++++++++++++--------- 2 files changed, 28 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 8d9184920c51..36ea9eb52bb5 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -19,7 +19,9 @@ #include "intel_workarounds.h" struct drm_printer; +struct intel_context; struct intel_gt; +struct lock_class_key; /* Early gen2 devices have a cacheline of just 32 bytes, using 64 is overkill, * but keeps the logic simple. Indeed, the whole purpose of this macro is just @@ -256,6 +258,15 @@ struct i915_request * intel_engine_find_active_request(struct intel_engine_cs *engine); u32 intel_engine_context_size(struct intel_gt *gt, u8 class); +struct intel_context * +intel_engine_create_pinned_context(struct intel_engine_cs *engine, + struct i915_address_space *vm, + unsigned int ring_size, + unsigned int hwsp, + struct lock_class_key *key, + const char *name); + +void intel_engine_destroy_pinned_context(struct intel_context *ce); void intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 9ceddfbb1687..fcbaad18ac91 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -810,11 +810,13 @@ intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass) #endif } -static struct intel_context * -create_pinned_context(struct intel_engine_cs *engine, - unsigned int hwsp, - struct lock_class_key *key, - const char *name) +struct intel_context * +intel_engine_create_pinned_context(struct intel_engine_cs *engine, + struct i915_address_space *vm, + unsigned int ring_size, + unsigned int hwsp, + struct lock_class_key *key, + const char *name) { struct intel_context *ce; int err; @@ -825,6 +827,10 @@ create_pinned_context(struct intel_engine_cs *engine, __set_bit(CONTEXT_BARRIER_BIT, &ce->flags); ce->timeline = page_pack_bits(NULL, hwsp); + ce->ring = __intel_context_ring_size(ring_size); + + i915_vm_put(ce->vm); + ce->vm = i915_vm_get(vm); err = intel_context_pin(ce); /* perma-pin so it is always available */ if (err) { @@ -843,7 +849,7 @@ create_pinned_context(struct intel_engine_cs *engine, return ce; } -static void destroy_pinned_context(struct intel_context *ce) +void intel_engine_destroy_pinned_context(struct intel_context *ce) { struct intel_engine_cs *engine = ce->engine; struct i915_vma *hwsp = engine->status_page.vma; @@ -863,8 +869,9 @@ create_kernel_context(struct intel_engine_cs *engine) { static struct lock_class_key kernel; - return create_pinned_context(engine, I915_GEM_HWS_SEQNO_ADDR, - &kernel, "kernel_context"); + return intel_engine_create_pinned_context(engine, engine->gt->vm, SZ_4K, + I915_GEM_HWS_SEQNO_ADDR, + &kernel, "kernel_context"); } /** @@ -907,7 +914,7 @@ static int engine_init_common(struct intel_engine_cs *engine) return 0; err_context: - destroy_pinned_context(ce); + intel_engine_destroy_pinned_context(ce); return ret; } @@ -969,7 +976,7 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine) fput(engine->default_state); if (engine->kernel_context) - destroy_pinned_context(engine->kernel_context); + intel_engine_destroy_pinned_context(engine->kernel_context); GEM_BUG_ON(!llist_empty(&engine->barrier_tasks)); cleanup_status_page(engine); From patchwork Wed Jun 9 06:34:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12309087 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0DF3C47095 for ; Wed, 9 Jun 2021 06:35:11 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8589561181 for ; Wed, 9 Jun 2021 06:35:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8589561181 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DDD196E235; Wed, 9 Jun 2021 06:35:04 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 281606E212; Wed, 9 Jun 2021 06:35:02 +0000 (UTC) IronPort-SDR: T6UYHeithRGmwm75uuJhlCstpSbFvssdr37t8LVKuUzmD1ZUnXoZbIeaJKfhdR6eHD/QE4ud97 OhaXMkktgboQ== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="268868907" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="268868907" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:35:01 -0700 IronPort-SDR: 19dPAPIKm6JiCEhqV887FPrdtc9DgeWVTRQrEWjLCIEIEj/WfDoOo6qVtPwsbiavU2vGXKSK6L QAqszx6CVPIg== X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482265833" Received: from ekolpasx-mobl.ccr.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.109]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:34:59 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 9 Jun 2021 08:34:34 +0200 Message-Id: <20210609063436.284332-8-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> References: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 7/9] drm/i915/gt: Pipelined page migration X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= , matthew.auld@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson If we pipeline the PTE updates and then do the copy of those pages within a single unpreemptible command packet, we can submit the copies and leave them to be scheduled without having to synchronously wait under a global lock. In order to manage migration, we need to preallocate the page tables (and keep them pinned and available for use at any time), causing a bottleneck for migrations as all clients must contend on the limited resources. By inlining the ppGTT updates and performing the blit atomically, each client only owns the PTE while in use, and so we can reschedule individual operations however we see fit. And most importantly, we do not need to take a global lock on the shared vm, and wait until the operation is complete before releasing the lock for others to claim the PTE for themselves. Signed-off-by: Chris Wilson Co-developed-by: Thomas Hellström Signed-off-by: Thomas Hellström --- v2: - Add a TODO for huge LMEM ptes (Pointed out by Matthew Auld) - Use intel_engine_destroy_pinned_context() to properly take the pinned context timeline off the engine list. (CI warning). --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gt/intel_engine.h | 1 + drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 2 + drivers/gpu/drm/i915/gt/intel_migrate.c | 544 ++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_migrate.h | 45 ++ drivers/gpu/drm/i915/gt/intel_migrate_types.h | 15 + drivers/gpu/drm/i915/gt/intel_ring.h | 1 + drivers/gpu/drm/i915/gt/selftest_migrate.c | 291 ++++++++++ .../drm/i915/selftests/i915_live_selftests.h | 1 + 9 files changed, 901 insertions(+) create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.c create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate.h create mode 100644 drivers/gpu/drm/i915/gt/intel_migrate_types.h create mode 100644 drivers/gpu/drm/i915/gt/selftest_migrate.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 16a5a006cf7c..95bd38e84625 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -108,6 +108,7 @@ gt-y += \ gt/intel_gtt.o \ gt/intel_llc.o \ gt/intel_lrc.o \ + gt/intel_migrate.o \ gt/intel_mocs.o \ gt/intel_ppgtt.o \ gt/intel_rc6.o \ diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h index 36ea9eb52bb5..62f7440bc111 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine.h +++ b/drivers/gpu/drm/i915/gt/intel_engine.h @@ -188,6 +188,7 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value) #define I915_GEM_HWS_PREEMPT_ADDR (I915_GEM_HWS_PREEMPT * sizeof(u32)) #define I915_GEM_HWS_SEQNO 0x40 #define I915_GEM_HWS_SEQNO_ADDR (I915_GEM_HWS_SEQNO * sizeof(u32)) +#define I915_GEM_HWS_MIGRATE (0x42 * sizeof(u32)) #define I915_GEM_HWS_SCRATCH 0x80 #define I915_HWS_CSB_BUF0_INDEX 0x10 diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index 2694dbb9967e..1c3af0fc0456 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -123,8 +123,10 @@ #define MI_SEMAPHORE_SAD_NEQ_SDD (5 << 12) #define MI_SEMAPHORE_TOKEN_MASK REG_GENMASK(9, 5) #define MI_SEMAPHORE_TOKEN_SHIFT 5 +#define MI_STORE_DATA_IMM MI_INSTR(0x20, 0) #define MI_STORE_DWORD_IMM MI_INSTR(0x20, 1) #define MI_STORE_DWORD_IMM_GEN4 MI_INSTR(0x20, 2) +#define MI_STORE_QWORD_IMM_GEN8 (MI_INSTR(0x20, 3) | REG_BIT(21)) #define MI_MEM_VIRTUAL (1 << 22) /* 945,g33,965 */ #define MI_USE_GGTT (1 << 22) /* g4x+ */ #define MI_STORE_DWORD_INDEX MI_INSTR(0x21, 1) diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c new file mode 100644 index 000000000000..70776316863d --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -0,0 +1,544 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2020 Intel Corporation + */ + +#include "i915_drv.h" +#include "intel_context.h" +#include "intel_gpu_commands.h" +#include "intel_gt.h" +#include "intel_gtt.h" +#include "intel_migrate.h" +#include "intel_ring.h" + +struct insert_pte_data { + u64 offset; + bool is_lmem; +}; + +#define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */ + +static bool engine_supports_migration(struct intel_engine_cs *engine) +{ + if (!engine) + return false; + + /* + * We need the ability to prevent aribtration (MI_ARB_ON_OFF), + * the ability to write PTE using inline data (MI_STORE_DATA) + * and of course the ability to do the block transfer (blits). + */ + GEM_BUG_ON(engine->class != COPY_ENGINE_CLASS); + + return true; +} + +static void insert_pte(struct i915_address_space *vm, + struct i915_page_table *pt, + void *data) +{ + struct insert_pte_data *d = data; + + vm->insert_page(vm, px_dma(pt), d->offset, I915_CACHE_NONE, + d->is_lmem ? PTE_LM : 0); + d->offset += PAGE_SIZE; +} + +static struct i915_address_space *migrate_vm(struct intel_gt *gt) +{ + struct i915_vm_pt_stash stash = {}; + struct i915_ppgtt *vm; + int err; + int i; + + /* + * We construct a very special VM for use by all migration contexts, + * it is kept pinned so that it can be used at any time. As we need + * to pre-allocate the page directories for the migration VM, this + * limits us to only using a small number of prepared vma. + * + * To be able to pipeline and reschedule migration operations while + * avoiding unnecessary contention on the vm itself, the PTE updates + * are inline with the blits. All the blits use the same fixed + * addresses, with the backing store redirection being updated on the + * fly. Only 2 implicit vma are used for all migration operations. + * + * We lay the ppGTT out as: + * + * [0, CHUNK_SZ) -> first object + * [CHUNK_SZ, 2 * CHUNK_SZ) -> second object + * [2 * CHUNK_SZ, 2 * CHUNK_SZ + 2 * CHUNK_SZ >> 9] -> PTE + * + * By exposing the dma addresses of the page directories themselves + * within the ppGTT, we are then able to rewrite the PTE prior to use. + * But the PTE update and subsequent migration operation must be atomic, + * i.e. within the same non-preemptible window so that we do not switch + * to another migration context that overwrites the PTE. + * + * TODO: Add support for huge LMEM PTEs + */ + + vm = i915_ppgtt_create(gt); + if (IS_ERR(vm)) + return ERR_CAST(vm); + + if (!vm->vm.allocate_va_range || !vm->vm.foreach) { + err = -ENODEV; + goto err_vm; + } + + /* + * Each engine instance is assigned its own chunk in the VM, so + * that we can run multiple instances concurrently + */ + for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) { + struct intel_engine_cs *engine; + u64 base = (u64)i << 32; + struct insert_pte_data d = {}; + struct i915_gem_ww_ctx ww; + u64 sz; + + engine = gt->engine_class[COPY_ENGINE_CLASS][i]; + if (!engine_supports_migration(engine)) + continue; + + /* + * We copy in 8MiB chunks. Each PDE covers 2MiB, so we need + * 4x2 page directories for source/destination. + */ + sz = 2 * CHUNK_SZ; + d.offset = base + sz; + + /* + * We need another page directory setup so that we can write + * the 8x512 PTE in each chunk. + */ + sz += (sz >> 12) * sizeof(u64); + + err = i915_vm_alloc_pt_stash(&vm->vm, &stash, sz); + if (err) + goto err_vm; + + for_i915_gem_ww(&ww, err, true) { + err = i915_vm_lock_objects(&vm->vm, &ww); + if (err) + continue; + err = i915_vm_map_pt_stash(&vm->vm, &stash); + if (err) + continue; + + vm->vm.allocate_va_range(&vm->vm, &stash, base, base + sz); + } + i915_vm_free_pt_stash(&vm->vm, &stash); + if (err) + goto err_vm; + + /* Now allow the GPU to rewrite the PTE via its own ppGTT */ + d.is_lmem = i915_gem_object_is_lmem(vm->vm.scratch[0]); + vm->vm.foreach(&vm->vm, base, base + sz, insert_pte, &d); + } + + return &vm->vm; + +err_vm: + i915_vm_put(&vm->vm); + return ERR_PTR(err); +} + +static struct intel_engine_cs *first_copy_engine(struct intel_gt *gt) +{ + struct intel_engine_cs *engine; + int i; + + for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) { + engine = gt->engine_class[COPY_ENGINE_CLASS][i]; + if (engine_supports_migration(engine)) + return engine; + } + + return NULL; +} + +static struct intel_context *pinned_context(struct intel_gt *gt) +{ + static struct lock_class_key key; + struct intel_engine_cs *engine; + struct i915_address_space *vm; + struct intel_context *ce; + + engine = first_copy_engine(gt); + if (!engine) + return ERR_PTR(-ENODEV); + + vm = migrate_vm(gt); + if (IS_ERR(vm)) + return ERR_CAST(vm); + + ce = intel_engine_create_pinned_context(engine, vm, SZ_512K, + I915_GEM_HWS_MIGRATE, + &key, "migrate"); + i915_vm_put(ce->vm); + return ce; +} + +int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt) +{ + struct intel_context *ce; + + memset(m, 0, sizeof(*m)); + + ce = pinned_context(gt); + if (IS_ERR(ce)) + return PTR_ERR(ce); + + m->context = ce; + return 0; +} + +static int random_index(unsigned int max) +{ + return upper_32_bits(mul_u32_u32(get_random_u32(), max)); +} + +static struct intel_context *__migrate_engines(struct intel_gt *gt) +{ + struct intel_engine_cs *engines[MAX_ENGINE_INSTANCE]; + struct intel_engine_cs *engine; + unsigned int count, i; + + count = 0; + for (i = 0; i < ARRAY_SIZE(gt->engine_class[COPY_ENGINE_CLASS]); i++) { + engine = gt->engine_class[COPY_ENGINE_CLASS][i]; + if (engine_supports_migration(engine)) + engines[count++] = engine; + } + + return intel_context_create(engines[random_index(count)]); +} + +struct intel_context *intel_migrate_create_context(struct intel_migrate *m) +{ + struct intel_context *ce; + + /* + * We randomly distribute contexts across the engines upon constrction, + * as they all share the same pinned vm, and so in order to allow + * multiple blits to run in parallel, we must construct each blit + * to use a different range of the vm for its GTT. This has to be + * known at construction, so we can not use the late greedy load + * balancing of the virtual-engine. + */ + ce = __migrate_engines(m->context->engine->gt); + if (IS_ERR(ce)) + return ce; + + ce->ring = __intel_context_ring_size(SZ_256K); + + i915_vm_put(ce->vm); + ce->vm = i915_vm_get(m->context->vm); + + return ce; +} + +static inline struct sgt_dma sg_sgt(struct scatterlist *sg) +{ + dma_addr_t addr = sg_dma_address(sg); + + return (struct sgt_dma){ sg, addr, addr + sg_dma_len(sg) }; +} + +static int emit_no_arbitration(struct i915_request *rq) +{ + u32 *cs; + + cs = intel_ring_begin(rq, 2); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + /* Explicitly disable preemption for this request. */ + *cs++ = MI_ARB_ON_OFF; + *cs++ = MI_NOOP; + intel_ring_advance(rq, cs); + + return 0; +} + +static int emit_pte(struct i915_request *rq, + struct sgt_dma *it, + enum i915_cache_level cache_level, + bool is_lmem, + u64 offset, + int length) +{ + const u64 encode = rq->context->vm->pte_encode(0, cache_level, + is_lmem ? PTE_LM : 0); + struct intel_ring *ring = rq->ring; + int total = 0; + u32 *hdr, *cs; + int pkt; + + GEM_BUG_ON(INTEL_GEN(rq->engine->i915) < 8); + + /* Compute the page directory offset for the target address range */ + offset += (u64)rq->engine->instance << 32; + offset >>= 12; + offset *= sizeof(u64); + offset += 2 * CHUNK_SZ; + + cs = intel_ring_begin(rq, 6); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + /* Pack as many PTE updates as possible into a single MI command */ + pkt = min_t(int, 0x400, ring->space / sizeof(u32) + 5); + pkt = min_t(int, pkt, (ring->size - ring->emit) / sizeof(u32) + 5); + + hdr = cs; + *cs++ = MI_STORE_DATA_IMM | REG_BIT(21); /* as qword elements */ + *cs++ = lower_32_bits(offset); + *cs++ = upper_32_bits(offset); + + do { + if (cs - hdr >= pkt) { + *hdr += cs - hdr - 2; + *cs++ = MI_NOOP; + + ring->emit = (void *)cs - ring->vaddr; + intel_ring_advance(rq, cs); + intel_ring_update_space(ring); + + cs = intel_ring_begin(rq, 6); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + pkt = min_t(int, 0x400, ring->space / sizeof(u32) + 5); + pkt = min_t(int, pkt, (ring->size - ring->emit) / sizeof(u32) + 5); + + hdr = cs; + *cs++ = MI_STORE_DATA_IMM | REG_BIT(21); + *cs++ = lower_32_bits(offset); + *cs++ = upper_32_bits(offset); + } + + *cs++ = lower_32_bits(encode | it->dma); + *cs++ = upper_32_bits(encode | it->dma); + + offset += 8; + total += I915_GTT_PAGE_SIZE; + + it->dma += I915_GTT_PAGE_SIZE; + if (it->dma >= it->max) { + it->sg = __sg_next(it->sg); + if (!it->sg || sg_dma_len(it->sg) == 0) + break; + + it->dma = sg_dma_address(it->sg); + it->max = it->dma + sg_dma_len(it->sg); + } + } while (total < length); + + *hdr += cs - hdr - 2; + *cs++ = MI_NOOP; + + ring->emit = (void *)cs - ring->vaddr; + intel_ring_advance(rq, cs); + intel_ring_update_space(ring); + + return total; +} + +static bool wa_1209644611_applies(int gen, u32 size) +{ + u32 height = size >> PAGE_SHIFT; + + if (gen != 11) + return false; + + return height % 4 == 3 && height <= 8; +} + +static int emit_copy(struct i915_request *rq, int size) +{ + const int gen = INTEL_GEN(rq->engine->i915); + u32 instance = rq->engine->instance; + u32 *cs; + + cs = intel_ring_begin(rq, gen >= 8 ? 10 : 6); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + if (gen >= 9 && !wa_1209644611_applies(gen, size)) { + *cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2); + *cs++ = BLT_DEPTH_32 | PAGE_SIZE; + *cs++ = 0; + *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4; + *cs++ = CHUNK_SZ; /* dst offset */ + *cs++ = instance; + *cs++ = 0; + *cs++ = PAGE_SIZE; + *cs++ = 0; /* src offset */ + *cs++ = instance; + } else if (gen >= 8) { + *cs++ = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (10 - 2); + *cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE; + *cs++ = 0; + *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4; + *cs++ = CHUNK_SZ; /* dst offset */ + *cs++ = instance; + *cs++ = 0; + *cs++ = PAGE_SIZE; + *cs++ = 0; /* src offset */ + *cs++ = instance; + } else { + GEM_BUG_ON(instance); + *cs++ = SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (6 - 2); + *cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE; + *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE; + *cs++ = CHUNK_SZ; /* dst offset */ + *cs++ = PAGE_SIZE; + *cs++ = 0; /* src offset */ + } + + intel_ring_advance(rq, cs); + return 0; +} + +int +intel_context_migrate_copy(struct intel_context *ce, + struct dma_fence *await, + struct scatterlist *src, + enum i915_cache_level src_cache_level, + bool src_is_lmem, + struct scatterlist *dst, + enum i915_cache_level dst_cache_level, + bool dst_is_lmem, + struct i915_request **out) +{ + struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst); + struct i915_request *rq; + int err; + + *out = NULL; + + /* GEM_BUG_ON(ce->vm != migrate_vm); */ + + GEM_BUG_ON(ce->ring->size < SZ_64K); + + do { + int len; + + rq = i915_request_create(ce); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto out_ce; + } + + if (await) { + err = i915_request_await_dma_fence(rq, await); + if (err) + goto out_rq; + + if (rq->engine->emit_init_breadcrumb) { + err = rq->engine->emit_init_breadcrumb(rq); + if (err) + goto out_rq; + } + + await = NULL; + } + + /* The PTE updates + copy must not be interrupted. */ + err = emit_no_arbitration(rq); + if (err) + goto out_rq; + + len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem, 0, + CHUNK_SZ); + if (len <= 0) { + err = len; + goto out_rq; + } + + err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem, + CHUNK_SZ, len); + if (err < 0) + goto out_rq; + if (err < len) { + err = -EINVAL; + goto out_rq; + } + + err = rq->engine->emit_flush(rq, EMIT_INVALIDATE); + if (err) + goto out_rq; + + err = emit_copy(rq, len); + + /* Arbitration is re-enabled between requests. */ +out_rq: + if (*out) + i915_request_put(*out); + *out = i915_request_get(rq); + i915_request_add(rq); + if (err || !it_src.sg || !sg_dma_len(it_src.sg)) + break; + + cond_resched(); + } while (1); + +out_ce: + return err; +} + +int intel_migrate_copy(struct intel_migrate *m, + struct i915_gem_ww_ctx *ww, + struct dma_fence *await, + struct scatterlist *src, + enum i915_cache_level src_cache_level, + bool src_is_lmem, + struct scatterlist *dst, + enum i915_cache_level dst_cache_level, + bool dst_is_lmem, + struct i915_request **out) +{ + struct intel_context *ce; + int err; + + *out = NULL; + if (!m->context) + return -ENODEV; + + ce = intel_migrate_create_context(m); + if (IS_ERR(ce)) + ce = intel_context_get(m->context); + GEM_BUG_ON(IS_ERR(ce)); + + err = intel_context_pin_ww(ce, ww); + if (err) + goto out; + + err = intel_context_migrate_copy(ce, await, + src, src_cache_level, src_is_lmem, + dst, dst_cache_level, dst_is_lmem, + out); + + intel_context_unpin(ce); +out: + intel_context_put(ce); + return err; +} + +void intel_migrate_fini(struct intel_migrate *m) +{ + struct intel_context *ce; + + ce = fetch_and_zero(&m->context); + if (!ce) + return; + + intel_engine_destroy_pinned_context(ce); +} + +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) +#include "selftest_migrate.c" +#endif diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.h b/drivers/gpu/drm/i915/gt/intel_migrate.h new file mode 100644 index 000000000000..32c61190ed73 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_migrate.h @@ -0,0 +1,45 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2020 Intel Corporation + */ + +#ifndef __INTEL_MIGRATE__ +#define __INTEL_MIGRATE__ + +#include "intel_migrate_types.h" + +struct dma_fence; +struct i915_request; +struct i915_gem_ww_ctx; +struct intel_gt; +struct scatterlist; +enum i915_cache_level; + +int intel_migrate_init(struct intel_migrate *m, struct intel_gt *gt); + +struct intel_context *intel_migrate_create_context(struct intel_migrate *m); + +int intel_migrate_copy(struct intel_migrate *m, + struct i915_gem_ww_ctx *ww, + struct dma_fence *await, + struct scatterlist *src, + enum i915_cache_level src_cache_level, + bool src_is_lmem, + struct scatterlist *dst, + enum i915_cache_level dst_cache_level, + bool dst_is_lmem, + struct i915_request **out); + +int intel_context_migrate_copy(struct intel_context *ce, + struct dma_fence *await, + struct scatterlist *src, + enum i915_cache_level src_cache_level, + bool src_is_lmem, + struct scatterlist *dst, + enum i915_cache_level dst_cache_level, + bool dst_is_lmem, + struct i915_request **out); + +void intel_migrate_fini(struct intel_migrate *m); + +#endif /* __INTEL_MIGRATE__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_migrate_types.h b/drivers/gpu/drm/i915/gt/intel_migrate_types.h new file mode 100644 index 000000000000..d98230597f42 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_migrate_types.h @@ -0,0 +1,15 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2020 Intel Corporation + */ + +#ifndef __INTEL_MIGRATE_TYPES__ +#define __INTEL_MIGRATE_TYPES__ + +struct intel_context; + +struct intel_migrate { + struct intel_context *context; +}; + +#endif /* __INTEL_MIGRATE_TYPES__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_ring.h b/drivers/gpu/drm/i915/gt/intel_ring.h index dbf5f14a136f..1b32dadfb8c3 100644 --- a/drivers/gpu/drm/i915/gt/intel_ring.h +++ b/drivers/gpu/drm/i915/gt/intel_ring.h @@ -49,6 +49,7 @@ static inline void intel_ring_advance(struct i915_request *rq, u32 *cs) * intel_ring_begin()). */ GEM_BUG_ON((rq->ring->vaddr + rq->ring->emit) != cs); + GEM_BUG_ON(!IS_ALIGNED(rq->ring->emit, 8)); /* RING_TAIL qword align */ } static inline u32 intel_ring_wrap(const struct intel_ring *ring, u32 pos) diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c new file mode 100644 index 000000000000..9784d149ebf1 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c @@ -0,0 +1,291 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2020 Intel Corporation + */ + +#include "selftests/i915_random.h" + +static const unsigned int sizes[] = { + SZ_4K, + SZ_64K, + SZ_2M, + CHUNK_SZ - SZ_4K, + CHUNK_SZ, + CHUNK_SZ + SZ_4K, + SZ_64M, +}; + +static struct drm_i915_gem_object * +create_lmem_or_internal(struct drm_i915_private *i915, size_t size) +{ + if (HAS_LMEM(i915)) { + struct drm_i915_gem_object *obj; + + obj = i915_gem_object_create_lmem(i915, size, 0); + if (!IS_ERR(obj)) + return obj; + } + + return i915_gem_object_create_internal(i915, size); +} + +static int copy(struct intel_migrate *migrate, + int (*fn)(struct intel_migrate *migrate, + struct i915_gem_ww_ctx *ww, + struct drm_i915_gem_object *src, + struct drm_i915_gem_object *dst, + struct i915_request **out), + u32 sz, struct rnd_state *prng) +{ + struct drm_i915_private *i915 = migrate->context->engine->i915; + struct drm_i915_gem_object *src, *dst; + struct i915_request *rq; + struct i915_gem_ww_ctx ww; + u32 *vaddr; + int err = 0; + int i; + + src = create_lmem_or_internal(i915, sz); + if (IS_ERR(src)) + return 0; + + dst = i915_gem_object_create_internal(i915, sz); + if (IS_ERR(dst)) + goto err_free_src; + + for_i915_gem_ww(&ww, err, true) { + err = i915_gem_object_lock(src, &ww); + if (err) + continue; + + err = i915_gem_object_lock(dst, &ww); + if (err) + continue; + + vaddr = i915_gem_object_pin_map(src, I915_MAP_WC); + if (IS_ERR(vaddr)) { + err = PTR_ERR(vaddr); + continue; + } + + for (i = 0; i < sz / sizeof(u32); i++) + vaddr[i] = i; + i915_gem_object_flush_map(src); + + vaddr = i915_gem_object_pin_map(dst, I915_MAP_WC); + if (IS_ERR(vaddr)) { + err = PTR_ERR(vaddr); + goto unpin_src; + } + + for (i = 0; i < sz / sizeof(u32); i++) + vaddr[i] = ~i; + i915_gem_object_flush_map(dst); + + err = fn(migrate, &ww, src, dst, &rq); + if (!err) + continue; + + if (err != -EDEADLK && err != -EINTR && err != -ERESTARTSYS) + pr_err("%ps failed, size: %u\n", fn, sz); + if (rq) { + i915_request_wait(rq, 0, HZ); + i915_request_put(rq); + } + i915_gem_object_unpin_map(dst); +unpin_src: + i915_gem_object_unpin_map(src); + } + if (err) + goto err_out; + + if (rq) { + if (i915_request_wait(rq, 0, HZ) < 0) { + pr_err("%ps timed out, size: %u\n", fn, sz); + err = -ETIME; + } + i915_request_put(rq); + } + + for (i = 0; !err && i < sz / PAGE_SIZE; i++) { + int x = i * 1024 + i915_prandom_u32_max_state(1024, prng); + + if (vaddr[x] != x) { + pr_err("%ps failed, size: %u, offset: %zu\n", + fn, sz, x * sizeof(u32)); + igt_hexdump(vaddr + i * 1024, 4096); + err = -EINVAL; + } + } + + i915_gem_object_unpin_map(dst); + i915_gem_object_unpin_map(src); + +err_out: + i915_gem_object_put(dst); +err_free_src: + i915_gem_object_put(src); + + return err; +} + +static int __migrate_copy(struct intel_migrate *migrate, + struct i915_gem_ww_ctx *ww, + struct drm_i915_gem_object *src, + struct drm_i915_gem_object *dst, + struct i915_request **out) +{ + return intel_migrate_copy(migrate, ww, NULL, + src->mm.pages->sgl, src->cache_level, + i915_gem_object_is_lmem(src), + dst->mm.pages->sgl, dst->cache_level, + i915_gem_object_is_lmem(dst), + out); +} + +static int __global_copy(struct intel_migrate *migrate, + struct i915_gem_ww_ctx *ww, + struct drm_i915_gem_object *src, + struct drm_i915_gem_object *dst, + struct i915_request **out) +{ + return intel_context_migrate_copy(migrate->context, NULL, + src->mm.pages->sgl, src->cache_level, + i915_gem_object_is_lmem(src), + dst->mm.pages->sgl, dst->cache_level, + i915_gem_object_is_lmem(dst), + out); +} + +static int +migrate_copy(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng) +{ + return copy(migrate, __migrate_copy, sz, prng); +} + +static int +global_copy(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng) +{ + return copy(migrate, __global_copy, sz, prng); +} + +static int live_migrate_copy(void *arg) +{ + struct intel_migrate *migrate = arg; + struct drm_i915_private *i915 = migrate->context->engine->i915; + I915_RND_STATE(prng); + int i; + + for (i = 0; i < ARRAY_SIZE(sizes); i++) { + int err; + + err = migrate_copy(migrate, sizes[i], &prng); + if (err == 0) + err = global_copy(migrate, sizes[i], &prng); + i915_gem_drain_freed_objects(i915); + if (err) + return err; + } + + return 0; +} + +struct threaded_migrate { + struct intel_migrate *migrate; + struct task_struct *tsk; + struct rnd_state prng; +}; + +static int threaded_migrate(struct intel_migrate *migrate, + int (*fn)(void *arg), + unsigned int flags) +{ + const unsigned int n_cpus = num_online_cpus() + 1; + struct threaded_migrate *thread; + I915_RND_STATE(prng); + unsigned int i; + int err = 0; + + thread = kcalloc(n_cpus, sizeof(*thread), GFP_KERNEL); + if (!thread) + return 0; + + for (i = 0; i < n_cpus; ++i) { + struct task_struct *tsk; + + thread[i].migrate = migrate; + thread[i].prng = + I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng)); + + tsk = kthread_run(fn, &thread[i], "igt-%d", i); + if (IS_ERR(tsk)) { + err = PTR_ERR(tsk); + break; + } + + get_task_struct(tsk); + thread[i].tsk = tsk; + } + + msleep(10); /* start all threads before we kthread_stop() */ + + for (i = 0; i < n_cpus; ++i) { + struct task_struct *tsk = thread[i].tsk; + int status; + + if (IS_ERR_OR_NULL(tsk)) + continue; + + status = kthread_stop(tsk); + if (status && !err) + err = status; + + put_task_struct(tsk); + } + + kfree(thread); + return err; +} + +static int __thread_migrate_copy(void *arg) +{ + struct threaded_migrate *tm = arg; + + return migrate_copy(tm->migrate, 2 * CHUNK_SZ, &tm->prng); +} + +static int thread_migrate_copy(void *arg) +{ + return threaded_migrate(arg, __thread_migrate_copy, 0); +} + +static int __thread_global_copy(void *arg) +{ + struct threaded_migrate *tm = arg; + + return global_copy(tm->migrate, 2 * CHUNK_SZ, &tm->prng); +} + +static int thread_global_copy(void *arg) +{ + return threaded_migrate(arg, __thread_global_copy, 0); +} + +int intel_migrate_live_selftests(struct drm_i915_private *i915) +{ + static const struct i915_subtest tests[] = { + SUBTEST(live_migrate_copy), + SUBTEST(thread_migrate_copy), + SUBTEST(thread_global_copy), + }; + struct intel_migrate m; + int err; + + if (intel_migrate_init(&m, &i915->gt)) + return 0; + + err = i915_subtests(tests, &m); + intel_migrate_fini(&m); + + return err; +} diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h index a92c0e9b7e6b..be5e0191eaea 100644 --- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h +++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h @@ -26,6 +26,7 @@ selftest(gt_mocs, intel_mocs_live_selftests) selftest(gt_pm, intel_gt_pm_live_selftests) selftest(gt_heartbeat, intel_heartbeat_live_selftests) selftest(requests, i915_request_live_selftests) +selftest(migrate, intel_migrate_live_selftests) selftest(active, i915_active_live_selftests) selftest(objects, i915_gem_object_live_selftests) selftest(mman, i915_gem_mman_live_selftests) From patchwork Wed Jun 9 06:34:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12309089 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 869F1C48BCF for ; Wed, 9 Jun 2021 06:35:12 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 57F2861181 for ; Wed, 9 Jun 2021 06:35:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 57F2861181 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 16B866E34B; Wed, 9 Jun 2021 06:35:06 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9D1C96E212; Wed, 9 Jun 2021 06:35:04 +0000 (UTC) IronPort-SDR: By/+WBMjH9QUomKwdhCYehbRbWcHdwh9ZGZGDiRekr5B5zeABCL+oG5KbtzPnvzhNTjCTmOFHK P1rjGfxd8/KA== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="268868913" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="268868913" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:35:04 -0700 IronPort-SDR: Gv6hZhJxhx+ivPyr6OfWW/UhNTOa8ArchKn/aH/brAScF7HDZbhleRcoj01tw2ehA0oQgEJhvK r5U2IMCy4qGQ== X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482265855" Received: from ekolpasx-mobl.ccr.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.109]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:35:01 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 9 Jun 2021 08:34:35 +0200 Message-Id: <20210609063436.284332-9-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> References: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 8/9] drm/i915/gt: Pipelined clear X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= , matthew.auld@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson Update the PTE and emit a clear within a single unpreemptible packet such that we can schedule and pipeline clears. Signed-off-by: Chris Wilson Co-developed-by: Thomas Hellström Signed-off-by: Thomas Hellström --- drivers/gpu/drm/i915/gt/intel_migrate.c | 141 ++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_migrate.h | 20 +++ drivers/gpu/drm/i915/gt/selftest_migrate.c | 163 +++++++++++++++++++++ 3 files changed, 324 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c index 70776316863d..fda05ce3eb9c 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.c +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -490,6 +490,112 @@ intel_context_migrate_copy(struct intel_context *ce, return err; } +static int emit_clear(struct i915_request *rq, int size, u32 value) +{ + const int gen = INTEL_GEN(rq->engine->i915); + u32 *cs; + + GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX); + + cs = intel_ring_begin(rq, gen >= 8 ? 8 : 6); + if (IS_ERR(cs)) + return PTR_ERR(cs); + + if (gen >= 8) { + *cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7 - 2); + *cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE; + *cs++ = 0; + *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4; + *cs++ = 0; /* offset */ + *cs++ = 0; + *cs++ = value; + *cs++ = MI_NOOP; + } else { + *cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6 - 2); + *cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE; + *cs++ = 0; + *cs++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4; + *cs++ = 0; + *cs++ = value; + } + + intel_ring_advance(rq, cs); + return 0; +} + +int +intel_context_migrate_clear(struct intel_context *ce, + struct dma_fence *await, + struct scatterlist *sg, + enum i915_cache_level cache_level, + bool is_lmem, + u32 value, + struct i915_request **out) +{ + struct sgt_dma it = sg_sgt(sg); + struct i915_request *rq; + int err; + + *out = NULL; + + GEM_BUG_ON(ce->ring->size < SZ_64K); + + do { + int len; + + rq = i915_request_create(ce); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto out_ce; + } + + if (await) { + err = i915_request_await_dma_fence(rq, await); + if (err) + goto out_rq; + + if (rq->engine->emit_init_breadcrumb) { + err = rq->engine->emit_init_breadcrumb(rq); + if (err) + goto out_rq; + } + + await = NULL; + } + + /* The PTE updates + clear must not be interrupted. */ + err = emit_no_arbitration(rq); + if (err) + goto out_rq; + + len = emit_pte(rq, &it, cache_level, is_lmem, 0, CHUNK_SZ); + if (len <= 0) { + err = len; + goto out_rq; + } + + err = rq->engine->emit_flush(rq, EMIT_INVALIDATE); + if (err) + goto out_rq; + + err = emit_clear(rq, len, value); + + /* Arbitration is re-enabled between requests. */ +out_rq: + if (*out) + i915_request_put(*out); + *out = i915_request_get(rq); + i915_request_add(rq); + if (err || !it.sg || !sg_dma_len(it.sg)) + break; + + cond_resched(); + } while (1); + +out_ce: + return err; +} + int intel_migrate_copy(struct intel_migrate *m, struct i915_gem_ww_ctx *ww, struct dma_fence *await, @@ -528,6 +634,41 @@ int intel_migrate_copy(struct intel_migrate *m, return err; } +int +intel_migrate_clear(struct intel_migrate *m, + struct i915_gem_ww_ctx *ww, + struct dma_fence *await, + struct scatterlist *sg, + enum i915_cache_level cache_level, + bool is_lmem, + u32 value, + struct i915_request **out) +{ + struct intel_context *ce; + int err; + + *out = NULL; + if (!m->context) + return -ENODEV; + + ce = intel_migrate_create_context(m); + if (IS_ERR(ce)) + ce = intel_context_get(m->context); + GEM_BUG_ON(IS_ERR(ce)); + + err = intel_context_pin_ww(ce, ww); + if (err) + goto out; + + err = intel_context_migrate_clear(ce, await, sg, cache_level, + is_lmem, value, out); + + intel_context_unpin(ce); +out: + intel_context_put(ce); + return err; +} + void intel_migrate_fini(struct intel_migrate *m) { struct intel_context *ce; diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.h b/drivers/gpu/drm/i915/gt/intel_migrate.h index 32c61190ed73..4e18e755a00b 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.h +++ b/drivers/gpu/drm/i915/gt/intel_migrate.h @@ -6,6 +6,8 @@ #ifndef __INTEL_MIGRATE__ #define __INTEL_MIGRATE__ +#include + #include "intel_migrate_types.h" struct dma_fence; @@ -40,6 +42,24 @@ int intel_context_migrate_copy(struct intel_context *ce, bool dst_is_lmem, struct i915_request **out); +int +intel_migrate_clear(struct intel_migrate *m, + struct i915_gem_ww_ctx *ww, + struct dma_fence *await, + struct scatterlist *sg, + enum i915_cache_level cache_level, + bool is_lmem, + u32 value, + struct i915_request **out); +int +intel_context_migrate_clear(struct intel_context *ce, + struct dma_fence *await, + struct scatterlist *sg, + enum i915_cache_level cache_level, + bool is_lmem, + u32 value, + struct i915_request **out); + void intel_migrate_fini(struct intel_migrate *m); #endif /* __INTEL_MIGRATE__ */ diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c index 9784d149ebf1..159c8656e1b0 100644 --- a/drivers/gpu/drm/i915/gt/selftest_migrate.c +++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c @@ -129,6 +129,82 @@ static int copy(struct intel_migrate *migrate, return err; } +static int clear(struct intel_migrate *migrate, + int (*fn)(struct intel_migrate *migrate, + struct i915_gem_ww_ctx *ww, + struct drm_i915_gem_object *obj, + u32 value, + struct i915_request **out), + u32 sz, struct rnd_state *prng) +{ + struct drm_i915_private *i915 = migrate->context->engine->i915; + struct drm_i915_gem_object *obj; + struct i915_request *rq; + struct i915_gem_ww_ctx ww; + u32 *vaddr; + int err = 0; + int i; + + obj = create_lmem_or_internal(i915, sz); + if (IS_ERR(obj)) + return 0; + + for_i915_gem_ww(&ww, err, true) { + err = i915_gem_object_lock(obj, &ww); + if (err) + continue; + + vaddr = i915_gem_object_pin_map(obj, I915_MAP_WC); + if (IS_ERR(vaddr)) { + err = PTR_ERR(vaddr); + continue; + } + + for (i = 0; i < sz / sizeof(u32); i++) + vaddr[i] = ~i; + i915_gem_object_flush_map(obj); + + err = fn(migrate, &ww, obj, sz, &rq); + if (!err) + continue; + + if (err != -EDEADLK && err != -EINTR && err != -ERESTARTSYS) + pr_err("%ps failed, size: %u\n", fn, sz); + if (rq) { + i915_request_wait(rq, 0, HZ); + i915_request_put(rq); + } + i915_gem_object_unpin_map(obj); + } + if (err) + goto err_out; + + if (rq) { + if (i915_request_wait(rq, 0, HZ) < 0) { + pr_err("%ps timed out, size: %u\n", fn, sz); + err = -ETIME; + } + i915_request_put(rq); + } + + for (i = 0; !err && i < sz / PAGE_SIZE; i++) { + int x = i * 1024 + i915_prandom_u32_max_state(1024, prng); + + if (vaddr[x] != sz) { + pr_err("%ps failed, size: %u, offset: %zu\n", + fn, sz, x * sizeof(u32)); + igt_hexdump(vaddr + i * 1024, 4096); + err = -EINVAL; + } + } + + i915_gem_object_unpin_map(obj); +err_out: + i915_gem_object_put(obj); + + return err; +} + static int __migrate_copy(struct intel_migrate *migrate, struct i915_gem_ww_ctx *ww, struct drm_i915_gem_object *src, @@ -169,6 +245,44 @@ global_copy(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng) return copy(migrate, __global_copy, sz, prng); } +static int __migrate_clear(struct intel_migrate *migrate, + struct i915_gem_ww_ctx *ww, + struct drm_i915_gem_object *obj, + u32 value, + struct i915_request **out) +{ + return intel_migrate_clear(migrate, ww, NULL, + obj->mm.pages->sgl, + obj->cache_level, + i915_gem_object_is_lmem(obj), + value, out); +} + +static int __global_clear(struct intel_migrate *migrate, + struct i915_gem_ww_ctx *ww, + struct drm_i915_gem_object *obj, + u32 value, + struct i915_request **out) +{ + return intel_context_migrate_clear(migrate->context, NULL, + obj->mm.pages->sgl, + obj->cache_level, + i915_gem_object_is_lmem(obj), + value, out); +} + +static int +migrate_clear(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng) +{ + return clear(migrate, __migrate_clear, sz, prng); +} + +static int +global_clear(struct intel_migrate *migrate, u32 sz, struct rnd_state *prng) +{ + return clear(migrate, __global_clear, sz, prng); +} + static int live_migrate_copy(void *arg) { struct intel_migrate *migrate = arg; @@ -190,6 +304,28 @@ static int live_migrate_copy(void *arg) return 0; } +static int live_migrate_clear(void *arg) +{ + struct intel_migrate *migrate = arg; + struct drm_i915_private *i915 = migrate->context->engine->i915; + I915_RND_STATE(prng); + int i; + + for (i = 0; i < ARRAY_SIZE(sizes); i++) { + int err; + + err = migrate_clear(migrate, sizes[i], &prng); + if (err == 0) + err = global_clear(migrate, sizes[i], &prng); + + i915_gem_drain_freed_objects(i915); + if (err) + return err; + } + + return 0; +} + struct threaded_migrate { struct intel_migrate *migrate; struct task_struct *tsk; @@ -271,12 +407,39 @@ static int thread_global_copy(void *arg) return threaded_migrate(arg, __thread_global_copy, 0); } +static int __thread_migrate_clear(void *arg) +{ + struct threaded_migrate *tm = arg; + + return migrate_clear(tm->migrate, 2 * CHUNK_SZ, &tm->prng); +} + +static int __thread_global_clear(void *arg) +{ + struct threaded_migrate *tm = arg; + + return global_clear(tm->migrate, 2 * CHUNK_SZ, &tm->prng); +} + +static int thread_migrate_clear(void *arg) +{ + return threaded_migrate(arg, __thread_migrate_clear, 0); +} + +static int thread_global_clear(void *arg) +{ + return threaded_migrate(arg, __thread_global_clear, 0); +} + int intel_migrate_live_selftests(struct drm_i915_private *i915) { static const struct i915_subtest tests[] = { SUBTEST(live_migrate_copy), + SUBTEST(live_migrate_clear), SUBTEST(thread_migrate_copy), + SUBTEST(thread_migrate_clear), SUBTEST(thread_global_copy), + SUBTEST(thread_global_clear), }; struct intel_migrate m; int err; From patchwork Wed Jun 9 06:34:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Hellstrom X-Patchwork-Id: 12309091 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3CEE1C48BD1 for ; Wed, 9 Jun 2021 06:35:17 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 149F261181 for ; Wed, 9 Jun 2021 06:35:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 149F261181 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5469A6E3CE; Wed, 9 Jun 2021 06:35:11 +0000 (UTC) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6C3C96E20A; Wed, 9 Jun 2021 06:35:06 +0000 (UTC) IronPort-SDR: Ul+5cTeWwV2D7MPbLNKkJK8t/2r49hXRmlhRIBKpcX4yfQjuojEfov4asdalpQmPdl6SeRiO5/ ITvNGzpMMiTg== X-IronPort-AV: E=McAfee;i="6200,9189,10009"; a="268868941" X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="268868941" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:35:06 -0700 IronPort-SDR: sinJWIkdSEImn2u1YtVlHCsh4exKC/M318FtZ3cqNj/z+JWw7P5HUAgO+51OpwIw6XkbyXiHnM 00cHzeOfmDgg== X-IronPort-AV: E=Sophos;i="5.83,260,1616482800"; d="scan'208";a="482265865" Received: from ekolpasx-mobl.ccr.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.109]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Jun 2021 23:35:04 -0700 From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Wed, 9 Jun 2021 08:34:36 +0200 Message-Id: <20210609063436.284332-10-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> References: <20210609063436.284332-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v2 9/9] drm/i915/gt: Setup a default migration context on the GT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= , matthew.auld@intel.com, Chris Wilson Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Chris Wilson Set up a default migration context on the GT and use it from the selftests. Add a perf selftest and make sure we exercise LMEM if available. Signed-off-by: Chris Wilson Co-developed-by: Thomas Hellström Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gt/intel_gt.c | 4 + drivers/gpu/drm/i915/gt/intel_gt_types.h | 3 + drivers/gpu/drm/i915/gt/intel_migrate.c | 4 +- drivers/gpu/drm/i915/gt/selftest_migrate.c | 227 +++++++++++++++++- .../drm/i915/selftests/i915_perf_selftests.h | 1 + 5 files changed, 232 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 2161bf01ef8b..67ef057ae918 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -13,6 +13,7 @@ #include "intel_gt_clock_utils.h" #include "intel_gt_pm.h" #include "intel_gt_requests.h" +#include "intel_migrate.h" #include "intel_mocs.h" #include "intel_rc6.h" #include "intel_renderstate.h" @@ -626,6 +627,8 @@ int intel_gt_init(struct intel_gt *gt) if (err) goto err_gt; + intel_migrate_init(>->migrate, gt); + goto out_fw; err_gt: __intel_gt_disable(gt); @@ -649,6 +652,7 @@ void intel_gt_driver_remove(struct intel_gt *gt) { __intel_gt_disable(gt); + intel_migrate_fini(>->migrate); intel_uc_driver_remove(>->uc); intel_engines_release(gt); diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index fecfacf551d5..7450935f2ca8 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -24,6 +24,7 @@ #include "intel_reset_types.h" #include "intel_rc6_types.h" #include "intel_rps_types.h" +#include "intel_migrate_types.h" #include "intel_wakeref.h" struct drm_i915_private; @@ -145,6 +146,8 @@ struct intel_gt { struct i915_vma *scratch; + struct intel_migrate migrate; + struct intel_gt_info { intel_engine_mask_t engine_mask; u8 num_engines; diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c index fda05ce3eb9c..935b5f14ff21 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.c +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -418,10 +418,9 @@ intel_context_migrate_copy(struct intel_context *ce, struct i915_request *rq; int err; + GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm); *out = NULL; - /* GEM_BUG_ON(ce->vm != migrate_vm); */ - GEM_BUG_ON(ce->ring->size < SZ_64K); do { @@ -536,6 +535,7 @@ intel_context_migrate_clear(struct intel_context *ce, struct i915_request *rq; int err; + GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm); *out = NULL; GEM_BUG_ON(ce->ring->size < SZ_64K); diff --git a/drivers/gpu/drm/i915/gt/selftest_migrate.c b/drivers/gpu/drm/i915/gt/selftest_migrate.c index 159c8656e1b0..396c81364399 100644 --- a/drivers/gpu/drm/i915/gt/selftest_migrate.c +++ b/drivers/gpu/drm/i915/gt/selftest_migrate.c @@ -3,6 +3,8 @@ * Copyright © 2020 Intel Corporation */ +#include + #include "selftests/i915_random.h" static const unsigned int sizes[] = { @@ -441,14 +443,229 @@ int intel_migrate_live_selftests(struct drm_i915_private *i915) SUBTEST(thread_global_copy), SUBTEST(thread_global_clear), }; - struct intel_migrate m; + struct intel_gt *gt = &i915->gt; + + if (!gt->migrate.context) + return 0; + + return i915_subtests(tests, >->migrate); +} + +static struct drm_i915_gem_object * +create_init_lmem_internal(struct intel_gt *gt, size_t sz, bool try_lmem) +{ + struct drm_i915_gem_object *obj = NULL; int err; - if (intel_migrate_init(&m, &i915->gt)) + if (try_lmem && HAS_LMEM(gt->i915)) + obj = i915_gem_object_create_lmem(gt->i915, sz, 0); + + if (IS_ERR_OR_NULL(obj)) { + obj = i915_gem_object_create_internal(gt->i915, sz); + if (IS_ERR(obj)) + return obj; + } + + i915_gem_object_trylock(obj); + err = i915_gem_object_pin_pages(obj); + if (err) { + i915_gem_object_unlock(obj); + i915_gem_object_put(obj); + return ERR_PTR(err); + } + + return obj; +} + +static int wrap_ktime_compare(const void *A, const void *B) +{ + const ktime_t *a = A, *b = B; + + return ktime_compare(*a, *b); +} + +static int __perf_clear_blt(struct intel_context *ce, + struct scatterlist *sg, + enum i915_cache_level cache_level, + bool is_lmem, + size_t sz) +{ + ktime_t t[5]; + int pass; + int err = 0; + + for (pass = 0; pass < ARRAY_SIZE(t); pass++) { + struct i915_request *rq; + ktime_t t0, t1; + + t0 = ktime_get(); + + err = intel_context_migrate_clear(ce, NULL, sg, cache_level, + is_lmem, 0, &rq); + if (rq) { + if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0) + err = -EIO; + i915_request_put(rq); + } + if (err) + break; + + t1 = ktime_get(); + t[pass] = ktime_sub(t1, t0); + } + if (err) + return err; + + sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL); + pr_info("%s: %zd KiB fill: %lld MiB/s\n", + ce->engine->name, sz >> 10, + div64_u64(mul_u32_u32(4 * sz, + 1000 * 1000 * 1000), + t[1] + 2 * t[2] + t[3]) >> 20); + return 0; +} + +static int perf_clear_blt(void *arg) +{ + struct intel_gt *gt = arg; + static const unsigned long sizes[] = { + SZ_4K, + SZ_64K, + SZ_2M, + SZ_64M + }; + int i; + + for (i = 0; i < ARRAY_SIZE(sizes); i++) { + struct drm_i915_gem_object *dst; + int err; + + dst = create_init_lmem_internal(gt, sizes[i], true); + if (IS_ERR(dst)) + return PTR_ERR(dst); + + err = __perf_clear_blt(gt->migrate.context, + dst->mm.pages->sgl, + I915_CACHE_NONE, + i915_gem_object_is_lmem(dst), + sizes[i]); + + i915_gem_object_unlock(dst); + i915_gem_object_put(dst); + if (err) + return err; + } + + return 0; +} + +static int __perf_copy_blt(struct intel_context *ce, + struct scatterlist *src, + enum i915_cache_level src_cache_level, + bool src_is_lmem, + struct scatterlist *dst, + enum i915_cache_level dst_cache_level, + bool dst_is_lmem, + size_t sz) +{ + ktime_t t[5]; + int pass; + int err = 0; + + for (pass = 0; pass < ARRAY_SIZE(t); pass++) { + struct i915_request *rq; + ktime_t t0, t1; + + t0 = ktime_get(); + + err = intel_context_migrate_copy(ce, NULL, + src, src_cache_level, + src_is_lmem, + dst, dst_cache_level, + dst_is_lmem, + &rq); + if (rq) { + if (i915_request_wait(rq, 0, MAX_SCHEDULE_TIMEOUT) < 0) + err = -EIO; + i915_request_put(rq); + } + if (err) + break; + + t1 = ktime_get(); + t[pass] = ktime_sub(t1, t0); + } + if (err) + return err; + + sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL); + pr_info("%s: %zd KiB copy: %lld MiB/s\n", + ce->engine->name, sz >> 10, + div64_u64(mul_u32_u32(4 * sz, + 1000 * 1000 * 1000), + t[1] + 2 * t[2] + t[3]) >> 20); + return 0; +} + +static int perf_copy_blt(void *arg) +{ + struct intel_gt *gt = arg; + static const unsigned long sizes[] = { + SZ_4K, + SZ_64K, + SZ_2M, + SZ_64M + }; + int i; + + for (i = 0; i < ARRAY_SIZE(sizes); i++) { + struct drm_i915_gem_object *src, *dst; + int err; + + src = create_init_lmem_internal(gt, sizes[i], true); + if (IS_ERR(src)) + return PTR_ERR(src); + + dst = create_init_lmem_internal(gt, sizes[i], false); + if (IS_ERR(dst)) { + err = PTR_ERR(dst); + goto err_src; + } + + err = __perf_copy_blt(gt->migrate.context, + src->mm.pages->sgl, + I915_CACHE_NONE, + i915_gem_object_is_lmem(src), + dst->mm.pages->sgl, + I915_CACHE_NONE, + i915_gem_object_is_lmem(dst), + sizes[i]); + + i915_gem_object_unlock(dst); + i915_gem_object_put(dst); +err_src: + i915_gem_object_unlock(src); + i915_gem_object_put(src); + if (err) + return err; + } + + return 0; +} + +int intel_migrate_perf_selftests(struct drm_i915_private *i915) +{ + static const struct i915_subtest tests[] = { + SUBTEST(perf_clear_blt), + SUBTEST(perf_copy_blt), + }; + struct intel_gt *gt = &i915->gt; + + if (intel_gt_is_wedged(gt)) return 0; - err = i915_subtests(tests, &m); - intel_migrate_fini(&m); + if (!gt->migrate.context) + return 0; - return err; + return intel_gt_live_subtests(tests, gt); } diff --git a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h index c2389f8a257d..5077dc3c3b8c 100644 --- a/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h +++ b/drivers/gpu/drm/i915/selftests/i915_perf_selftests.h @@ -17,5 +17,6 @@ */ selftest(engine_cs, intel_engine_cs_perf_selftests) selftest(request, i915_request_perf_selftests) +selftest(migrate, intel_migrate_perf_selftests) selftest(blt, i915_gem_object_blt_perf_selftests) selftest(region, intel_memory_region_perf_selftests)