From patchwork Wed May 8 12:05:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10935547 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E817F92A for ; Wed, 8 May 2019 12:06:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D7989286D3 for ; Wed, 8 May 2019 12:06:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CBB5C28715; Wed, 8 May 2019 12:06:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 41287286D3 for ; Wed, 8 May 2019 12:06:02 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7153689804; Wed, 8 May 2019 12:06:01 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3C7CC89804 for ; Wed, 8 May 2019 12:06:00 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16489689-1500050 for multiple; Wed, 08 May 2019 13:05:51 +0100 From: Chris Wilson To: dri-devel@lists.freedesktop.org Subject: [PATCH 1/2] drm/i915: Seal races between async GPU cancellation, retirement and signaling Date: Wed, 8 May 2019 13:05:41 +0100 Message-Id: <20190508120542.28377-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190508120542.28377-1-chris@chris-wilson.co.uk> References: <20190508120542.28377-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tvrtko Ursulin Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Currently there is an underlying assumption that i915_request_unsubmit() is synchronous wrt the GPU -- that is the request is no longer in flight as we remove it. In the near future that may change, and this may upset our signaling as we can process an interrupt for that request while it is no longer in flight. CPU0 CPU1 intel_engine_breadcrumbs_irq (queue request completion) i915_request_cancel_signaling ... ... i915_request_enable_signaling dma_fence_signal Hence in the time it took us to drop the lock to signal the request, a preemption event may have occurred and re-queued the request. In the process, that request would have seen I915_FENCE_FLAG_SIGNAL clear and so reused the rq->signal_link that was in use on CPU0, leading to bad pointer chasing in intel_engine_breadcrumbs_irq. A related issue was that if someone started listening for a signal on a completed but no longer in-flight request, we missed the opportunity to immediately signal that request. Furthermore, as intel_contexts may be immediately released during request retirement, in order to be entirely sure that intel_engine_breadcrumbs_irq may no longer dereference the intel_context (ce->signals and ce->signal_link), we must wait for irq spinlock. In order to prevent the race, we use a bit in the fence.flags to signal the transfer onto the signal list inside intel_engine_breadcrumbs_irq. For simplicity, we use the DMA_FENCE_FLAG_SIGNALED_BIT as it then quickly signals to any outside observer that the fence is indeed signaled. v2: Sketch out potential dma-fence API for manual signaling v3: And the test_and_set_bit() Fixes: 52c0fdb25c7c ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking") Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Reviewed-by: Tvrtko Ursulin --- drivers/dma-buf/dma-fence.c | 1 + drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 78 +++++++++++++++------ drivers/gpu/drm/i915/i915_request.c | 1 + drivers/gpu/drm/i915/intel_guc_submission.c | 1 - 4 files changed, 59 insertions(+), 22 deletions(-) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 3aa8733f832a..9bf06042619a 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -29,6 +29,7 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit); EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); +EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled); static DEFINE_SPINLOCK(dma_fence_stub_lock); static struct dma_fence dma_fence_stub; diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index fe455f01aa65..c092bdf5f0bf 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -23,6 +23,7 @@ */ #include +#include #include #include "i915_drv.h" @@ -96,9 +97,39 @@ check_signal_order(struct intel_context *ce, struct i915_request *rq) return true; } +static bool +__dma_fence_signal(struct dma_fence *fence) +{ + return !test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags); +} + +static void +__dma_fence_signal__timestamp(struct dma_fence *fence, ktime_t timestamp) +{ + fence->timestamp = timestamp; + set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags); + trace_dma_fence_signaled(fence); +} + +static void +__dma_fence_signal__notify(struct dma_fence *fence) +{ + struct dma_fence_cb *cur, *tmp; + + lockdep_assert_held(fence->lock); + lockdep_assert_irqs_disabled(); + + list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { + INIT_LIST_HEAD(&cur->node); + cur->func(fence, cur); + } + INIT_LIST_HEAD(&fence->cb_list); +} + void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine) { struct intel_breadcrumbs *b = &engine->breadcrumbs; + const ktime_t timestamp = ktime_get(); struct intel_context *ce, *cn; struct list_head *pos, *next; LIST_HEAD(signal); @@ -122,6 +153,10 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine) GEM_BUG_ON(!test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)); + clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); + + if (!__dma_fence_signal(&rq->fence)) + continue; /* * Queue for execution after dropping the signaling @@ -129,14 +164,6 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine) * more signalers to the same context or engine. */ i915_request_get(rq); - - /* - * We may race with direct invocation of - * dma_fence_signal(), e.g. i915_request_retire(), - * so we need to acquire our reference to the request - * before we cancel the breadcrumb. - */ - clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); list_add_tail(&rq->signal_link, &signal); } @@ -159,7 +186,12 @@ void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine) struct i915_request *rq = list_entry(pos, typeof(*rq), signal_link); - dma_fence_signal(&rq->fence); + __dma_fence_signal__timestamp(&rq->fence, timestamp); + + spin_lock(&rq->lock); + __dma_fence_signal__notify(&rq->fence); + spin_unlock(&rq->lock); + i915_request_put(rq); } } @@ -261,19 +293,17 @@ void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine) bool i915_request_enable_breadcrumb(struct i915_request *rq) { - struct intel_breadcrumbs *b = &rq->engine->breadcrumbs; - - GEM_BUG_ON(test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)); + lockdep_assert_held(&rq->lock); + lockdep_assert_irqs_disabled(); - if (!test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) - return true; - - spin_lock(&b->irq_lock); - if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags) && - !__request_completed(rq)) { + if (test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags)) { + struct intel_breadcrumbs *b = &rq->engine->breadcrumbs; struct intel_context *ce = rq->hw_context; struct list_head *pos; + spin_lock(&b->irq_lock); + GEM_BUG_ON(test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)); + __intel_breadcrumbs_arm_irq(b); /* @@ -303,8 +333,8 @@ bool i915_request_enable_breadcrumb(struct i915_request *rq) GEM_BUG_ON(!check_signal_order(ce, rq)); set_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags); + spin_unlock(&b->irq_lock); } - spin_unlock(&b->irq_lock); return !__request_completed(rq); } @@ -313,9 +343,15 @@ void i915_request_cancel_breadcrumb(struct i915_request *rq) { struct intel_breadcrumbs *b = &rq->engine->breadcrumbs; - if (!test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) - return; + lockdep_assert_held(&rq->lock); + lockdep_assert_irqs_disabled(); + /* + * We must wait for b->irq_lock so that we know the interrupt handler + * has released its reference to the intel_context and has completed + * the DMA_FENCE_FLAG_SIGNALED_BIT/I915_FENCE_FLAG_SIGNAL dance (if + * required). + */ spin_lock(&b->irq_lock); if (test_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags)) { struct intel_context *ce = rq->hw_context; diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index fa955b7b6def..bed213148cbb 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -437,6 +437,7 @@ void __i915_request_submit(struct i915_request *request) set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags); if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) && + !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &request->fence.flags) && !i915_request_enable_breadcrumb(request)) intel_engine_queue_breadcrumbs(engine); diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c index 380d83a2bfb6..ea0e3734d37c 100644 --- a/drivers/gpu/drm/i915/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/intel_guc_submission.c @@ -23,7 +23,6 @@ */ #include -#include #include "gt/intel_engine_pm.h" #include "gt/intel_lrc_reg.h" From patchwork Wed May 8 12:05:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 10935545 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E1AFF92A for ; Wed, 8 May 2019 12:06:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CD256286D3 for ; Wed, 8 May 2019 12:06:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C055E28715; Wed, 8 May 2019 12:06:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.2 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6014C286D3 for ; Wed, 8 May 2019 12:05:59 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2A884896B5; Wed, 8 May 2019 12:05:57 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by gabe.freedesktop.org (Postfix) with ESMTPS id 709F889622 for ; Wed, 8 May 2019 12:05:54 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from haswell.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 16489690-1500050 for multiple; Wed, 08 May 2019 13:05:51 +0100 From: Chris Wilson To: dri-devel@lists.freedesktop.org Subject: [PATCH 2/2] dma-fence: Refactor signaling for manual invocation Date: Wed, 8 May 2019 13:05:42 +0100 Message-Id: <20190508120542.28377-3-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190508120542.28377-1-chris@chris-wilson.co.uk> References: <20190508120542.28377-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tvrtko Ursulin Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Move the duplicated code within dma-fence.c into the header for wider reuse. In the process apply a small micro-optimisation to only prune the fence->cb_list once rather than use list_del on every entry. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin --- drivers/dma-buf/Makefile | 10 +- drivers/dma-buf/dma-fence-trace.c | 28 +++ drivers/dma-buf/dma-fence.c | 32 +-- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 30 --- include/linux/dma-fence-types.h | 248 +++++++++++++++++++ include/linux/dma-fence.h | 251 +++----------------- 6 files changed, 321 insertions(+), 278 deletions(-) create mode 100644 drivers/dma-buf/dma-fence-trace.c create mode 100644 include/linux/dma-fence-types.h diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index 1f006e083eb9..56e579878f26 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1,5 +1,11 @@ -obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \ - reservation.o seqno-fence.o +obj-y := \ + dma-buf.o \ + dma-fence.o \ + dma-fence-array.o \ + dma-fence-chain.o \ + dma-fence-trace.o \ + reservation.o \ + seqno-fence.o obj-$(CONFIG_SYNC_FILE) += sync_file.o obj-$(CONFIG_SW_SYNC) += sw_sync.o sync_debug.o obj-$(CONFIG_UDMABUF) += udmabuf.o diff --git a/drivers/dma-buf/dma-fence-trace.c b/drivers/dma-buf/dma-fence-trace.c new file mode 100644 index 000000000000..eb6f282be4c0 --- /dev/null +++ b/drivers/dma-buf/dma-fence-trace.c @@ -0,0 +1,28 @@ +/* + * Fence mechanism for dma-buf and to allow for asynchronous dma access + * + * Copyright (C) 2012 Canonical Ltd + * Copyright (C) 2012 Texas Instruments + * + * Authors: + * Rob Clark + * Maarten Lankhorst + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ + +#include + +#define CREATE_TRACE_POINTS +#include + +EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit); +EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); +EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled); diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 9bf06042619a..8196a179fdc2 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -24,13 +24,6 @@ #include #include -#define CREATE_TRACE_POINTS -#include - -EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit); -EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); -EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled); - static DEFINE_SPINLOCK(dma_fence_stub_lock); static struct dma_fence dma_fence_stub; @@ -136,7 +129,6 @@ EXPORT_SYMBOL(dma_fence_context_alloc); */ int dma_fence_signal_locked(struct dma_fence *fence) { - struct dma_fence_cb *cur, *tmp; int ret = 0; lockdep_assert_held(fence->lock); @@ -144,7 +136,7 @@ int dma_fence_signal_locked(struct dma_fence *fence) if (WARN_ON(!fence)) return -EINVAL; - if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) { + if (!__dma_fence_signal(fence)) { ret = -EINVAL; /* @@ -152,15 +144,10 @@ int dma_fence_signal_locked(struct dma_fence *fence) * still run through all callbacks */ } else { - fence->timestamp = ktime_get(); - set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags); - trace_dma_fence_signaled(fence); + __dma_fence_signal__timestamp(fence, ktime_get()); } - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { - list_del_init(&cur->node); - cur->func(fence, cur); - } + __dma_fence_signal__notify(fence); return ret; } EXPORT_SYMBOL(dma_fence_signal_locked); @@ -185,21 +172,14 @@ int dma_fence_signal(struct dma_fence *fence) if (!fence) return -EINVAL; - if (test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) + if (!__dma_fence_signal(fence)) return -EINVAL; - fence->timestamp = ktime_get(); - set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags); - trace_dma_fence_signaled(fence); + __dma_fence_signal__timestamp(fence, ktime_get()); if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) { - struct dma_fence_cb *cur, *tmp; - spin_lock_irqsave(fence->lock, flags); - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { - list_del_init(&cur->node); - cur->func(fence, cur); - } + __dma_fence_signal__notify(fence); spin_unlock_irqrestore(fence->lock, flags); } return 0; diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index c092bdf5f0bf..d1f8572100c3 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -23,7 +23,6 @@ */ #include -#include #include #include "i915_drv.h" @@ -97,35 +96,6 @@ check_signal_order(struct intel_context *ce, struct i915_request *rq) return true; } -static bool -__dma_fence_signal(struct dma_fence *fence) -{ - return !test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags); -} - -static void -__dma_fence_signal__timestamp(struct dma_fence *fence, ktime_t timestamp) -{ - fence->timestamp = timestamp; - set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags); - trace_dma_fence_signaled(fence); -} - -static void -__dma_fence_signal__notify(struct dma_fence *fence) -{ - struct dma_fence_cb *cur, *tmp; - - lockdep_assert_held(fence->lock); - lockdep_assert_irqs_disabled(); - - list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { - INIT_LIST_HEAD(&cur->node); - cur->func(fence, cur); - } - INIT_LIST_HEAD(&fence->cb_list); -} - void intel_engine_breadcrumbs_irq(struct intel_engine_cs *engine) { struct intel_breadcrumbs *b = &engine->breadcrumbs; diff --git a/include/linux/dma-fence-types.h b/include/linux/dma-fence-types.h new file mode 100644 index 000000000000..18e7511c0eed --- /dev/null +++ b/include/linux/dma-fence-types.h @@ -0,0 +1,248 @@ +/* + * Fence mechanism for dma-buf to allow for asynchronous dma access + * + * Copyright (C) 2012 Canonical Ltd + * Copyright (C) 2012 Texas Instruments + * + * Authors: + * Rob Clark + * Maarten Lankhorst + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ + +#ifndef __LINUX_DMA_FENCE_TYPES_H +#define __LINUX_DMA_FENCE_TYPES_H + +#include +#include + +struct dma_fence; +struct dma_fence_ops; +struct dma_fence_cb; + +/** + * struct dma_fence - software synchronization primitive + * @refcount: refcount for this fence + * @ops: dma_fence_ops associated with this fence + * @rcu: used for releasing fence with kfree_rcu + * @cb_list: list of all callbacks to call + * @lock: spin_lock_irqsave used for locking + * @context: execution context this fence belongs to, returned by + * dma_fence_context_alloc() + * @seqno: the sequence number of this fence inside the execution context, + * can be compared to decide which fence would be signaled later. + * @flags: A mask of DMA_FENCE_FLAG_* defined below + * @timestamp: Timestamp when the fence was signaled. + * @error: Optional, only valid if < 0, must be set before calling + * dma_fence_signal, indicates that the fence has completed with an error. + * + * the flags member must be manipulated and read using the appropriate + * atomic ops (bit_*), so taking the spinlock will not be needed most + * of the time. + * + * DMA_FENCE_FLAG_SIGNALED_BIT - fence is already signaled + * DMA_FENCE_FLAG_TIMESTAMP_BIT - timestamp recorded for fence signaling + * DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called + * DMA_FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the + * implementer of the fence for its own purposes. Can be used in different + * ways by different fence implementers, so do not rely on this. + * + * Since atomic bitops are used, this is not guaranteed to be the case. + * Particularly, if the bit was set, but dma_fence_signal was called right + * before this bit was set, it would have been able to set the + * DMA_FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called. + * Adding a check for DMA_FENCE_FLAG_SIGNALED_BIT after setting + * DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that + * after dma_fence_signal was called, any enable_signaling call will have either + * been completed, or never called at all. + */ +struct dma_fence { + struct kref refcount; + const struct dma_fence_ops *ops; + struct rcu_head rcu; + struct list_head cb_list; + spinlock_t *lock; + u64 context; + u64 seqno; + unsigned long flags; + ktime_t timestamp; + int error; +}; + +enum dma_fence_flag_bits { + DMA_FENCE_FLAG_SIGNALED_BIT, + DMA_FENCE_FLAG_TIMESTAMP_BIT, + DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, + DMA_FENCE_FLAG_USER_BITS, /* must always be last member */ +}; + +typedef void (*dma_fence_func_t)(struct dma_fence *fence, + struct dma_fence_cb *cb); + +/** + * struct dma_fence_cb - callback for dma_fence_add_callback() + * @node: used by dma_fence_add_callback() to append this struct to fence::cb_list + * @func: dma_fence_func_t to call + * + * This struct will be initialized by dma_fence_add_callback(), additional + * data can be passed along by embedding dma_fence_cb in another struct. + */ +struct dma_fence_cb { + struct list_head node; + dma_fence_func_t func; +}; + +/** + * struct dma_fence_ops - operations implemented for fence + * + */ +struct dma_fence_ops { + /** + * @use_64bit_seqno: + * + * True if this dma_fence implementation uses 64bit seqno, false + * otherwise. + */ + bool use_64bit_seqno; + + /** + * @get_driver_name: + * + * Returns the driver name. This is a callback to allow drivers to + * compute the name at runtime, without having it to store permanently + * for each fence, or build a cache of some sort. + * + * This callback is mandatory. + */ + const char * (*get_driver_name)(struct dma_fence *fence); + + /** + * @get_timeline_name: + * + * Return the name of the context this fence belongs to. This is a + * callback to allow drivers to compute the name at runtime, without + * having it to store permanently for each fence, or build a cache of + * some sort. + * + * This callback is mandatory. + */ + const char * (*get_timeline_name)(struct dma_fence *fence); + + /** + * @enable_signaling: + * + * Enable software signaling of fence. + * + * For fence implementations that have the capability for hw->hw + * signaling, they can implement this op to enable the necessary + * interrupts, or insert commands into cmdstream, etc, to avoid these + * costly operations for the common case where only hw->hw + * synchronization is required. This is called in the first + * dma_fence_wait() or dma_fence_add_callback() path to let the fence + * implementation know that there is another driver waiting on the + * signal (ie. hw->sw case). + * + * This function can be called from atomic context, but not + * from irq context, so normal spinlocks can be used. + * + * A return value of false indicates the fence already passed, + * or some failure occurred that made it impossible to enable + * signaling. True indicates successful enabling. + * + * &dma_fence.error may be set in enable_signaling, but only when false + * is returned. + * + * Since many implementations can call dma_fence_signal() even when before + * @enable_signaling has been called there's a race window, where the + * dma_fence_signal() might result in the final fence reference being + * released and its memory freed. To avoid this, implementations of this + * callback should grab their own reference using dma_fence_get(), to be + * released when the fence is signalled (through e.g. the interrupt + * handler). + * + * This callback is optional. If this callback is not present, then the + * driver must always have signaling enabled. + */ + bool (*enable_signaling)(struct dma_fence *fence); + + /** + * @signaled: + * + * Peek whether the fence is signaled, as a fastpath optimization for + * e.g. dma_fence_wait() or dma_fence_add_callback(). Note that this + * callback does not need to make any guarantees beyond that a fence + * once indicates as signalled must always return true from this + * callback. This callback may return false even if the fence has + * completed already, in this case information hasn't propogated throug + * the system yet. See also dma_fence_is_signaled(). + * + * May set &dma_fence.error if returning true. + * + * This callback is optional. + */ + bool (*signaled)(struct dma_fence *fence); + + /** + * @wait: + * + * Custom wait implementation, defaults to dma_fence_default_wait() if + * not set. + * + * The dma_fence_default_wait implementation should work for any fence, as long + * as @enable_signaling works correctly. This hook allows drivers to + * have an optimized version for the case where a process context is + * already available, e.g. if @enable_signaling for the general case + * needs to set up a worker thread. + * + * Must return -ERESTARTSYS if the wait is intr = true and the wait was + * interrupted, and remaining jiffies if fence has signaled, or 0 if wait + * timed out. Can also return other error values on custom implementations, + * which should be treated as if the fence is signaled. For example a hardware + * lockup could be reported like that. + * + * This callback is optional. + */ + signed long (*wait)(struct dma_fence *fence, + bool intr, signed long timeout); + + /** + * @release: + * + * Called on destruction of fence to release additional resources. + * Can be called from irq context. This callback is optional. If it is + * NULL, then dma_fence_free() is instead called as the default + * implementation. + */ + void (*release)(struct dma_fence *fence); + + /** + * @fence_value_str: + * + * Callback to fill in free-form debug info specific to this fence, like + * the sequence number. + * + * This callback is optional. + */ + void (*fence_value_str)(struct dma_fence *fence, char *str, int size); + + /** + * @timeline_value_str: + * + * Fills in the current value of the timeline as a string, like the + * sequence number. Note that the specific fence passed to this function + * should not matter, drivers should only use it to look up the + * corresponding timeline structures. + */ + void (*timeline_value_str)(struct dma_fence *fence, + char *str, int size); +}; + +#endif /* __LINUX_DMA_FENCE_TYPES_H */ diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 974717d6ac0c..142eb67e695f 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -21,6 +21,7 @@ #ifndef __LINUX_DMA_FENCE_H #define __LINUX_DMA_FENCE_H +#include #include #include #include @@ -30,226 +31,7 @@ #include #include -struct dma_fence; -struct dma_fence_ops; -struct dma_fence_cb; - -/** - * struct dma_fence - software synchronization primitive - * @refcount: refcount for this fence - * @ops: dma_fence_ops associated with this fence - * @rcu: used for releasing fence with kfree_rcu - * @cb_list: list of all callbacks to call - * @lock: spin_lock_irqsave used for locking - * @context: execution context this fence belongs to, returned by - * dma_fence_context_alloc() - * @seqno: the sequence number of this fence inside the execution context, - * can be compared to decide which fence would be signaled later. - * @flags: A mask of DMA_FENCE_FLAG_* defined below - * @timestamp: Timestamp when the fence was signaled. - * @error: Optional, only valid if < 0, must be set before calling - * dma_fence_signal, indicates that the fence has completed with an error. - * - * the flags member must be manipulated and read using the appropriate - * atomic ops (bit_*), so taking the spinlock will not be needed most - * of the time. - * - * DMA_FENCE_FLAG_SIGNALED_BIT - fence is already signaled - * DMA_FENCE_FLAG_TIMESTAMP_BIT - timestamp recorded for fence signaling - * DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called - * DMA_FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the - * implementer of the fence for its own purposes. Can be used in different - * ways by different fence implementers, so do not rely on this. - * - * Since atomic bitops are used, this is not guaranteed to be the case. - * Particularly, if the bit was set, but dma_fence_signal was called right - * before this bit was set, it would have been able to set the - * DMA_FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called. - * Adding a check for DMA_FENCE_FLAG_SIGNALED_BIT after setting - * DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that - * after dma_fence_signal was called, any enable_signaling call will have either - * been completed, or never called at all. - */ -struct dma_fence { - struct kref refcount; - const struct dma_fence_ops *ops; - struct rcu_head rcu; - struct list_head cb_list; - spinlock_t *lock; - u64 context; - u64 seqno; - unsigned long flags; - ktime_t timestamp; - int error; -}; - -enum dma_fence_flag_bits { - DMA_FENCE_FLAG_SIGNALED_BIT, - DMA_FENCE_FLAG_TIMESTAMP_BIT, - DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, - DMA_FENCE_FLAG_USER_BITS, /* must always be last member */ -}; - -typedef void (*dma_fence_func_t)(struct dma_fence *fence, - struct dma_fence_cb *cb); - -/** - * struct dma_fence_cb - callback for dma_fence_add_callback() - * @node: used by dma_fence_add_callback() to append this struct to fence::cb_list - * @func: dma_fence_func_t to call - * - * This struct will be initialized by dma_fence_add_callback(), additional - * data can be passed along by embedding dma_fence_cb in another struct. - */ -struct dma_fence_cb { - struct list_head node; - dma_fence_func_t func; -}; - -/** - * struct dma_fence_ops - operations implemented for fence - * - */ -struct dma_fence_ops { - /** - * @use_64bit_seqno: - * - * True if this dma_fence implementation uses 64bit seqno, false - * otherwise. - */ - bool use_64bit_seqno; - - /** - * @get_driver_name: - * - * Returns the driver name. This is a callback to allow drivers to - * compute the name at runtime, without having it to store permanently - * for each fence, or build a cache of some sort. - * - * This callback is mandatory. - */ - const char * (*get_driver_name)(struct dma_fence *fence); - - /** - * @get_timeline_name: - * - * Return the name of the context this fence belongs to. This is a - * callback to allow drivers to compute the name at runtime, without - * having it to store permanently for each fence, or build a cache of - * some sort. - * - * This callback is mandatory. - */ - const char * (*get_timeline_name)(struct dma_fence *fence); - - /** - * @enable_signaling: - * - * Enable software signaling of fence. - * - * For fence implementations that have the capability for hw->hw - * signaling, they can implement this op to enable the necessary - * interrupts, or insert commands into cmdstream, etc, to avoid these - * costly operations for the common case where only hw->hw - * synchronization is required. This is called in the first - * dma_fence_wait() or dma_fence_add_callback() path to let the fence - * implementation know that there is another driver waiting on the - * signal (ie. hw->sw case). - * - * This function can be called from atomic context, but not - * from irq context, so normal spinlocks can be used. - * - * A return value of false indicates the fence already passed, - * or some failure occurred that made it impossible to enable - * signaling. True indicates successful enabling. - * - * &dma_fence.error may be set in enable_signaling, but only when false - * is returned. - * - * Since many implementations can call dma_fence_signal() even when before - * @enable_signaling has been called there's a race window, where the - * dma_fence_signal() might result in the final fence reference being - * released and its memory freed. To avoid this, implementations of this - * callback should grab their own reference using dma_fence_get(), to be - * released when the fence is signalled (through e.g. the interrupt - * handler). - * - * This callback is optional. If this callback is not present, then the - * driver must always have signaling enabled. - */ - bool (*enable_signaling)(struct dma_fence *fence); - - /** - * @signaled: - * - * Peek whether the fence is signaled, as a fastpath optimization for - * e.g. dma_fence_wait() or dma_fence_add_callback(). Note that this - * callback does not need to make any guarantees beyond that a fence - * once indicates as signalled must always return true from this - * callback. This callback may return false even if the fence has - * completed already, in this case information hasn't propogated throug - * the system yet. See also dma_fence_is_signaled(). - * - * May set &dma_fence.error if returning true. - * - * This callback is optional. - */ - bool (*signaled)(struct dma_fence *fence); - - /** - * @wait: - * - * Custom wait implementation, defaults to dma_fence_default_wait() if - * not set. - * - * The dma_fence_default_wait implementation should work for any fence, as long - * as @enable_signaling works correctly. This hook allows drivers to - * have an optimized version for the case where a process context is - * already available, e.g. if @enable_signaling for the general case - * needs to set up a worker thread. - * - * Must return -ERESTARTSYS if the wait is intr = true and the wait was - * interrupted, and remaining jiffies if fence has signaled, or 0 if wait - * timed out. Can also return other error values on custom implementations, - * which should be treated as if the fence is signaled. For example a hardware - * lockup could be reported like that. - * - * This callback is optional. - */ - signed long (*wait)(struct dma_fence *fence, - bool intr, signed long timeout); - - /** - * @release: - * - * Called on destruction of fence to release additional resources. - * Can be called from irq context. This callback is optional. If it is - * NULL, then dma_fence_free() is instead called as the default - * implementation. - */ - void (*release)(struct dma_fence *fence); - - /** - * @fence_value_str: - * - * Callback to fill in free-form debug info specific to this fence, like - * the sequence number. - * - * This callback is optional. - */ - void (*fence_value_str)(struct dma_fence *fence, char *str, int size); - - /** - * @timeline_value_str: - * - * Fills in the current value of the timeline as a string, like the - * sequence number. Note that the specific fence passed to this function - * should not matter, drivers should only use it to look up the - * corresponding timeline structures. - */ - void (*timeline_value_str)(struct dma_fence *fence, - char *str, int size); -}; +#include void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, spinlock_t *lock, u64 context, u64 seqno); @@ -561,6 +343,35 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr) struct dma_fence *dma_fence_get_stub(void); u64 dma_fence_context_alloc(unsigned num); +static inline bool +__dma_fence_signal(struct dma_fence *fence) +{ + return !test_and_set_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags); +} + +static inline void +__dma_fence_signal__timestamp(struct dma_fence *fence, ktime_t timestamp) +{ + fence->timestamp = timestamp; + set_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &fence->flags); + trace_dma_fence_signaled(fence); +} + +static inline void +__dma_fence_signal__notify(struct dma_fence *fence) +{ + struct dma_fence_cb *cur, *tmp; + + lockdep_assert_held(fence->lock); + lockdep_assert_irqs_disabled(); + + list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) { + INIT_LIST_HEAD(&cur->node); + cur->func(fence, cur); + } + INIT_LIST_HEAD(&fence->cb_list); +} + #define DMA_FENCE_TRACE(f, fmt, args...) \ do { \ struct dma_fence *__ff = (f); \