From patchwork Fri Jun 18 13:25:15 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Thomas Hellstrom
X-Patchwork-Id: 12331395
From: Thomas Hellström
To: intel-gfx@lists.freedesktop.org
Cc: Thomas Hellström, matthew.auld@intel.com
Date: Fri, 18 Jun 2021 15:25:15 +0200
Message-Id: <20210618132515.163277-1-thomas.hellstrom@linux.intel.com>
X-Mailer: git-send-email 2.31.1
Subject: [Intel-gfx] [PATCH v2] drm/i915/ttm: Fix incorrect assumptions about ttm_bo_validate() semantics
List-Id: Intel graphics driver community testing & development

We have assumed that if the current placement was not the requested
placement, but instead one of the busy placements, a TTM move would have
been triggered. That is not the case.

So when we initially place LMEM objects in "Limbo", (that is system
placement without any pages allocated), to be able to defer clearing
objects until first get_pages(), the first get_pages() would happily keep
objects in system memory if that is one of the allowed placements.
And since we don't yet support i915 GEM system memory from TTM, everything
breaks apart.

So make sure we try the requested placement first, if no eviction is
needed. If that fails, retry with all allowed placements also allowing
evictions. Also make sure we handle TTM failure codes correctly.

Also temporarily (until we support i915 GEM system on TTM), restrict
allowed placements to the requested placement to avoid things falling
apart should LMEM be full.

Fixes: 38f28c0695c0 ("drm/i915/ttm: Calculate the object placement at get_pages time")
Signed-off-by: Thomas Hellström
Reviewed-by: Matthew Auld
---
v2:
- Updated a comment and added a R-B.
---
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 64 +++++++++++++++++++++++--
 1 file changed, 61 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
index df46535cca47..c5deb8b7227c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
@@ -64,6 +64,33 @@ static struct ttm_placement i915_sys_placement = {
 	.busy_placement = &sys_placement_flags,
 };
 
+static int i915_ttm_err_to_gem(int err)
+{
+	/* Fastpath */
+	if (likely(!err))
+		return 0;
+
+	switch (err) {
+	case -EBUSY:
+		/*
+		 * TTM likes to convert -EDEADLK to -EBUSY, and wants us to
+		 * restart the operation, since we don't record the contending
+		 * lock. We use -EAGAIN to restart.
+		 */
+		return -EAGAIN;
+	case -ENOSPC:
+		/*
+		 * Memory type / region is full, and we can't evict.
+		 * Except possibly system, that returns -ENOMEM;
+		 */
+		return -ENXIO;
+	default:
+		break;
+	}
+
+	return err;
+}
+
 static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj);
 
 static enum ttm_caching
@@ -522,15 +549,46 @@ static int i915_ttm_get_pages(struct drm_i915_gem_object *obj)
 	struct sg_table *st;
 	struct ttm_place requested, busy[I915_TTM_MAX_PLACEMENTS];
 	struct ttm_placement placement;
+	int real_num_busy;
 	int ret;
 
 	GEM_BUG_ON(obj->mm.n_placements > I915_TTM_MAX_PLACEMENTS);
 
 	/* Move to the requested placement. */
 	i915_ttm_placement_from_obj(obj, &requested, busy, &placement);
+
+	/*
+	 * For now we support LMEM only with TTM.
+	 * TODO: Remove with system support
+	 */
+	GEM_BUG_ON(requested.mem_type < I915_PL_LMEM0 ||
+		   busy[0].mem_type < I915_PL_LMEM0);
+
+	/* First try only the requested placement. No eviction. */
+	real_num_busy = fetch_and_zero(&placement.num_busy_placement);
 	ret = ttm_bo_validate(bo, &placement, &ctx);
-	if (ret)
-		return ret == -ENOSPC ? -ENXIO : ret;
+	if (ret) {
+		ret = i915_ttm_err_to_gem(ret);
+		/*
+		 * Anything that wants to restart the operation gets to
+		 * do that.
+		 */
+		if (ret == -EDEADLK || ret == -EINTR || ret == -ERESTARTSYS ||
+		    ret == -EAGAIN)
+			return ret;
+
+		/* TODO: Remove this when we support system as TTM. */
+		real_num_busy = 1;
+
+		/*
+		 * If the initial attempt fails, allow all accepted placements,
+		 * evicting if necessary.
+		 */
+		placement.num_busy_placement = real_num_busy;
+		ret = ttm_bo_validate(bo, &placement, &ctx);
+		if (ret)
+			return i915_ttm_err_to_gem(ret);
+	}
 
 	/* Object either has a page vector or is an iomem object */
 	st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st;
@@ -741,5 +799,5 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
 	obj->ttm.created = true;
 
 	/* i915 wants -ENXIO when out of memory region space. */
-	return (ret == -ENOSPC) ? -ENXIO : ret;
+	return i915_ttm_err_to_gem(ret);
 }
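
For anyone who wants to poke at the control flow outside the kernel, the
two-pass logic in i915_ttm_get_pages() can be modelled in userspace. This
is only a sketch under stated assumptions, not driver code: struct
mock_placement, struct mock_state, mock_validate() and get_pages_model()
are hypothetical names invented here, and mock_validate() returns scripted
error codes rather than reproducing what ttm_bo_validate() really does.
Only err_to_gem() mirrors the translation table from the patch, and
ERESTARTSYS is defined locally because userspace <errno.h> does not
provide it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ERESTARTSYS is kernel-internal and absent from userspace <errno.h>. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Userspace copy of the translation done by i915_ttm_err_to_gem(). */
static int err_to_gem(int err)
{
	switch (err) {
	case -EBUSY:
		return -EAGAIN;	/* ask the caller to restart the operation */
	case -ENOSPC:
		return -ENXIO;	/* region full and eviction not possible */
	default:
		return err;	/* 0 and every other code pass through */
	}
}

/* Hypothetical stand-in for the two counters in struct ttm_placement. */
struct mock_placement {
	unsigned int num_placement;
	unsigned int num_busy_placement;
};

/* Hypothetical backend: scripted results, not real TTM behaviour. */
struct mock_state {
	int no_evict_err;	/* returned while num_busy_placement == 0 */
	int evict_err;		/* returned once busy placements are allowed */
	int calls;		/* number of validate attempts seen */
};

static int mock_validate(const struct mock_placement *p, struct mock_state *s)
{
	s->calls++;
	return p->num_busy_placement ? s->evict_err : s->no_evict_err;
}

/* Model of the patched get_pages() flow: try without eviction, then retry. */
static int get_pages_model(struct mock_placement *p, struct mock_state *s)
{
	int ret;

	assert(p != NULL && s != NULL);

	/* Pass 1: requested placement only, so nothing can be evicted. */
	p->num_busy_placement = 0;
	ret = mock_validate(p, s);
	if (!ret)
		return 0;

	/* Restart-type errors go straight back to the caller. */
	ret = err_to_gem(ret);
	if (ret == -EDEADLK || ret == -EINTR || ret == -ERESTARTSYS ||
	    ret == -EAGAIN)
		return ret;

	/* Pass 2: a single busy placement, as in the temporary restriction. */
	p->num_busy_placement = 1;
	return err_to_gem(mock_validate(p, s));
}
```

With a mock scripted to fail the first pass with -ENOSPC and succeed on
the second, get_pages_model() returns 0 after two validate calls. A first
pass failing with -EBUSY returns -EAGAIN after a single call, which is the
restart path the patch adds.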