From patchwork Tue Oct 3 15:27:01 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jordan Crouse
X-Patchwork-Id: 9983093
From: Jordan Crouse
To: freedreno@lists.freedesktop.org
Subject: [PATCH 1/6] drm/msm: Fix race condition in the submit path
Date: Tue, 3 Oct 2017 09:27:01 -0600
Message-Id: <1507044426-4042-2-git-send-email-jcrouse@codeaurora.org>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1507044426-4042-1-git-send-email-jcrouse@codeaurora.org>
References: <1507044426-4042-1-git-send-email-jcrouse@codeaurora.org>
Cc: linux-arm-msm@vger.kernel.org, Sharat Masetty,
 dri-devel@lists.freedesktop.org

From: Sharat Masetty

There is a race condition between the IRQ context trying to trigger
preemption and the user context trying to submit commands to the GPU.

The check in a5xx_flush() only updates the wptr if the GPU is not in
preemption. When moving from PREEMPT_START to PREEMPT_NONE there is a
small window in which the preempt state is still START but the CPU
switches to a user thread that is in a5xx_flush(). That thread fails
to write the wptr to the GPU because the preempt state is not
PREEMPT_NONE, which leads to a GPU stall.

Introduce a new intermediate state PREEMPT_ABORT and change
preempt_trigger() to use the GPU's current ring instead of the ring
retrieved from get_next_ring() while in this state.

Signed-off-by: Sharat Masetty
Signed-off-by: Jordan Crouse
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.h     |  8 +++++++-
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 17 ++++++++++++++---
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
index f062a90..6fb8c2f 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
@@ -56,6 +56,8 @@ struct a5xx_gpu {
  * PREEMPT_NONE - no preemption in progress.  Next state START.
  * PREEMPT_START - The trigger is evaulating if preemption is possible. Next
  * states: TRIGGERED, NONE
+ * PREEMPT_ABORT - An intermediate state before moving back to NONE. Next
+ * state: NONE.
  * PREEMPT_TRIGGERED: A preemption has been executed on the hardware. Next
  * states: FAULTED, PENDING
  * PREEMPT_FAULTED: A preemption timed out (never completed). This will trigger
@@ -67,6 +69,7 @@ struct a5xx_gpu {
 enum preempt_state {
 	PREEMPT_NONE = 0,
 	PREEMPT_START,
+	PREEMPT_ABORT,
 	PREEMPT_TRIGGERED,
 	PREEMPT_FAULTED,
 	PREEMPT_PENDING,
@@ -154,7 +157,10 @@ static inline int spin_usecs(struct msm_gpu *gpu, uint32_t usecs,
 /* Return true if we are in a preempt state */
 static inline bool a5xx_in_preempt(struct a5xx_gpu *a5xx_gpu)
 {
-	return !(atomic_read(&a5xx_gpu->preempt_state) == PREEMPT_NONE);
+	int preempt_state = atomic_read(&a5xx_gpu->preempt_state);
+
+	return !(preempt_state == PREEMPT_NONE ||
+		preempt_state == PREEMPT_ABORT);
 }
 
 #endif /* __A5XX_GPU_H__ */
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
index 6a3767d..40f4840 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_preempt.c
@@ -122,9 +122,20 @@ void a5xx_preempt_trigger(struct msm_gpu *gpu)
 	 * one do nothing except to update the wptr to the latest and greatest
 	 */
 	if (!ring || (a5xx_gpu->cur_ring == ring)) {
-		update_wptr(gpu, ring);
-
-		/* Set the state back to NONE */
+		/*
+		 * It's possible that while a preemption request is in progress
+		 * from an irq context, a user context trying to submit might
+		 * fail to update the write pointer, because it determines
+		 * that the preempt state is not PREEMPT_NONE.
+		 *
+		 * Close the race by introducing an intermediate
+		 * state PREEMPT_ABORT to let the submit path
+		 * know that the ringbuffer is not going to change
+		 * and can safely update the write pointer.
+		 */
+
+		set_preempt_state(a5xx_gpu, PREEMPT_ABORT);
+		update_wptr(gpu, a5xx_gpu->cur_ring);
 		set_preempt_state(a5xx_gpu, PREEMPT_NONE);
 		return;
 	}
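
The window described in the commit message can be replayed outside the
kernel. What follows is a minimal single-threaded sketch, not driver
code: it assumes C11 atomics in place of the kernel's atomic_t, and every
sketch_* name and the sketch_gpu structure are hypothetical stand-ins
invented for illustration. Only the enum names and the shape of the
in-preempt predicate come from the patch. The replay runs the submit path
at the point where the trigger path has already written the wptr but has
not yet returned to PREEMPT_NONE.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* State names taken from the patch; this sketch only uses a subset. */
enum preempt_state {
	PREEMPT_NONE = 0,
	PREEMPT_START,
	PREEMPT_ABORT,
	PREEMPT_TRIGGERED,
};

/* Hypothetical stand-in for the driver's GPU structure. */
struct sketch_gpu {
	atomic_int preempt_state;
	uint32_t sw_wptr;	/* write pointer tracked by software */
	uint32_t hw_wptr;	/* last value published to the "hardware" */
};

static void sketch_init(struct sketch_gpu *gpu)
{
	atomic_init(&gpu->preempt_state, PREEMPT_NONE);
	gpu->sw_wptr = 0;
	gpu->hw_wptr = 0;
}

/* Same predicate as the patched a5xx_in_preempt(): NONE and ABORT both
 * mean the ring is not going to change. */
static bool sketch_in_preempt(struct sketch_gpu *gpu)
{
	int state = atomic_load(&gpu->preempt_state);

	return !(state == PREEMPT_NONE || state == PREEMPT_ABORT);
}

/* Submit path: queue some dwords, then publish the wptr only if no
 * preemption is in flight. Returns true if the hardware was updated. */
static bool sketch_flush(struct sketch_gpu *gpu, uint32_t dwords)
{
	gpu->sw_wptr += dwords;
	if (sketch_in_preempt(gpu))
		return false;
	gpu->hw_wptr = gpu->sw_wptr;
	return true;
}

/* Replay one interleaving. use_abort selects the patched ordering.
 * Returns true if the submission reached the hardware. */
static bool sketch_replay(bool use_abort)
{
	struct sketch_gpu gpu;

	sketch_init(&gpu);

	/* IRQ context: the trigger starts evaluating preemption. */
	atomic_store(&gpu.preempt_state, PREEMPT_START);

	/* IRQ context: no ring switch is needed. */
	if (use_abort)
		atomic_store(&gpu.preempt_state, PREEMPT_ABORT);
	gpu.hw_wptr = gpu.sw_wptr;	/* stands in for update_wptr() */

	/* User context: a submit lands inside the window. */
	sketch_flush(&gpu, 4);

	/* IRQ context: return to NONE. */
	atomic_store(&gpu.preempt_state, PREEMPT_NONE);

	return gpu.hw_wptr == gpu.sw_wptr;
}
```

With the old ordering, sketch_replay(false), the submit path sees
PREEMPT_START and skips the write, so the hardware wptr stays behind the
software one. With the patched ordering, sketch_replay(true), it sees
PREEMPT_ABORT and publishes the wptr itself. A submit that lands before
the state reaches ABORT is still covered, because the trigger path writes
the wptr after setting ABORT.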