From patchwork Tue Jul 14 21:23:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chris Wilson X-Patchwork-Id: 11663833 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C2495722 for ; Tue, 14 Jul 2020 21:24:23 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AB2D5206D9 for ; Tue, 14 Jul 2020 21:24:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AB2D5206D9 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E8AC76E9E4; Tue, 14 Jul 2020 21:24:22 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from fireflyinternet.com (unknown [77.68.26.236]) by gabe.freedesktop.org (Postfix) with ESMTPS id 56B796E9E4; Tue, 14 Jul 2020 21:24:21 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from build.alporthouse.com (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP id 21820608-1500050 for multiple; Tue, 14 Jul 2020 22:24:03 +0100 From: Chris Wilson To: dri-devel@lists.freedesktop.org Subject: [PATCH v2 1/3] dma-buf/sw_sync: Avoid recursive lock during fence signal. Date: Tue, 14 Jul 2020 22:23:59 +0100 Message-Id: <20200714212401.15825-2-chris@chris-wilson.co.uk> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200714212401.15825-1-chris@chris-wilson.co.uk> References: <20200714212401.15825-1-chris@chris-wilson.co.uk> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Gustavo Padovan , intel-gfx@lists.freedesktop.org, stable@vger.kernel.org, Chris Wilson , =?utf-8?q?Christian_K=C3=B6nig?= Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Bas Nieuwenhuizen Calltree: timeline_fence_release drm_sched_entity_wakeup dma_fence_signal_locked sync_timeline_signal sw_sync_ioctl Releasing the reference to the fence in the fence signal callback seems reasonable to me, so this patch avoids the locking issue in sw_sync. d3862e44daa7 ("dma-buf/sw-sync: Fix locking around sync_timeline lists") fixed the recursive locking issue but caused an use-after-free. Later d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free") fixed the use-after-free but reintroduced the recursive locking issue. In this attempt we avoid the use-after-free still because the release function still always locks, and outside of the locking region in the signal function we have properly refcounted references. We furthermore also avoid the recurive lock by making sure that either: 1) We have a properly refcounted reference, preventing the signal from triggering the release function inside the locked region. 2) The refcount was already zero, and hence nobody will be able to trigger the release function from the signal function. v2: Move dma_fence_signal() into second loop in preparation to moving the callback out of the timeline obj->lock. Fixes: d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free") Cc: Sumit Semwal Cc: Chris Wilson Cc: Gustavo Padovan Cc: Christian König Cc: Signed-off-by: Bas Nieuwenhuizen Signed-off-by: Chris Wilson --- drivers/dma-buf/sw_sync.c | 32 ++++++++++++++++++++++---------- 1 file changed, 22 insertions(+), 10 deletions(-) diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c index 348b3a9170fa..807c82148062 100644 --- a/drivers/dma-buf/sw_sync.c +++ b/drivers/dma-buf/sw_sync.c @@ -192,6 +192,7 @@ static const struct dma_fence_ops timeline_fence_ops = { static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc) { struct sync_pt *pt, *next; + LIST_HEAD(signal); trace_sync_timeline(obj); @@ -203,21 +204,32 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc) if (!timeline_fence_signaled(&pt->base)) break; - list_del_init(&pt->link); - rb_erase(&pt->node, &obj->pt_tree); - /* - * A signal callback may release the last reference to this - * fence, causing it to be freed. That operation has to be - * last to avoid a use after free inside this loop, and must - * be after we remove the fence from the timeline in order to - * prevent deadlocking on timeline->lock inside - * timeline_fence_release(). + * We need to take a reference to avoid a release during + * signalling (which can cause a recursive lock of obj->lock). + * If refcount was already zero, another thread is already + * taking care of destroying the fence. */ - dma_fence_signal_locked(&pt->base); + if (!dma_fence_get_rcu(&pt->base)) + continue; + + list_move_tail(&pt->link, &signal); + rb_erase(&pt->node, &obj->pt_tree); } spin_unlock_irq(&obj->lock); + + list_for_each_entry_safe(pt, next, &signal, link) { + /* + * This needs to be cleared before release, otherwise the + * timeline_fence_release function gets confused about also + * removing the fence from the pt_tree. + */ + list_del_init(&pt->link); + + dma_fence_signal(&pt->base); + dma_fence_put(&pt->base); + } } /**