From patchwork Thu Nov 8 13:42:34 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sharat Masetty
X-Patchwork-Id: 10674221
X-Patchwork-Delegate: agross@codeaurora.org
From: Sharat Masetty
To: Christian.Koenig@amd.com, freedreno@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org,
 jcrouse@codeaurora.org, Sharat Masetty
Subject: [PATCH] drm/scheduler: Add drm_sched_suspend/resume timeout functions
Date: Thu, 8 Nov 2018 19:12:34 +0530
Message-Id: <1541684554-17115-1-git-send-email-smasetty@codeaurora.org>
X-Mailer: git-send-email 1.9.1

Hi Christian,

Can you please review this patch? It is a continuation of the discussion
at [1]. At first I was thinking of using a cancel for suspend instead of
a mod (to an arbitrarily large value), but I couldn't get it to fit in,
as I have an additional constraint of being able to call these functions
from an IRQ context.
These new functions race with other contexts, primarily finish_job(),
timedout_job() and recovery(), but I did go through the possible races
between these (I think). Please let me know what you think. I have not
tested this yet; if this is going in the right direction, I will put it
through my testing drill and polish it.

IMO, I prefer the callback approach, as it appears to be simpler and less
error-prone for both the scheduler and the drivers.

[1] https://patchwork.freedesktop.org/patch/259914/

Signed-off-by: Sharat Masetty
---
 drivers/gpu/drm/scheduler/sched_main.c | 81 +++++++++++++++++++++++++++++++++-
 include/drm/gpu_scheduler.h            |  5 +++
 2 files changed, 85 insertions(+), 1 deletion(-)

--
1.9.1

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index c993d10..9645789 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -191,11 +191,84 @@ bool drm_sched_dependency_optimized(struct dma_fence* fence,
  */
 static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
 {
+	unsigned long flags;
+
+	spin_lock_irqsave(&sched->tdr_suspend_lock, flags);
+
+	sched->timeout_remaining = sched->timeout;
+
 	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-	    !list_empty(&sched->ring_mirror_list))
+	    !list_empty(&sched->ring_mirror_list) && !sched->work_tdr_suspended)
 		schedule_delayed_work(&sched->work_tdr, sched->timeout);
+
+	spin_unlock_irqrestore(&sched->tdr_suspend_lock, flags);
 }
 
+/**
+ * drm_sched_suspend_timeout - suspend timeout for reset worker
+ *
+ * @sched: scheduler instance for which to suspend the timeout
+ *
+ * Suspend the delayed work timeout for the scheduler. Note that
+ * this function can be called from an IRQ context.
+ */
+void drm_sched_suspend_timeout(struct drm_gpu_scheduler *sched)
+{
+	unsigned long flags, timeout, now;
+
+	spin_lock_irqsave(&sched->tdr_suspend_lock, flags);
+
+	if (sched->work_tdr_suspended ||
+	    sched->timeout == MAX_SCHEDULE_TIMEOUT ||
+	    list_empty(&sched->ring_mirror_list))
+		goto done;
+
+	timeout = sched->work_tdr.timer.expires;
+
+	/*
+	 * Reset timeout to an arbitrarily large value
+	 */
+	mod_delayed_work(system_wq, &sched->work_tdr, sched->timeout * 10);
+
+	now = jiffies;
+
+	/* The timer may already have expired; clamp the remaining time to 0 */
+	sched->timeout_remaining = time_after(now, timeout) ? 0 : timeout - now;
+	sched->work_tdr_suspended = true;
+
+done:
+	spin_unlock_irqrestore(&sched->tdr_suspend_lock, flags);
+}
+EXPORT_SYMBOL(drm_sched_suspend_timeout);
+
+/**
+ * drm_sched_resume_timeout - resume timeout for reset worker
+ *
+ * @sched: scheduler instance for which to resume the timeout
+ *
+ * Resume the delayed work timeout for the scheduler. Note that
+ * this function can be called from an IRQ context.
+ */
+void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sched->tdr_suspend_lock, flags);
+
+	if (!sched->work_tdr_suspended ||
+	    sched->timeout == MAX_SCHEDULE_TIMEOUT) {
+		spin_unlock_irqrestore(&sched->tdr_suspend_lock, flags);
+		return;
+	}
+
+	mod_delayed_work(system_wq, &sched->work_tdr, sched->timeout_remaining);
+
+	sched->work_tdr_suspended = false;
+
+	spin_unlock_irqrestore(&sched->tdr_suspend_lock, flags);
+}
+EXPORT_SYMBOL(drm_sched_resume_timeout);
+
 /* job_finish is called after hw fence signaled
  */
 static void drm_sched_job_finish(struct work_struct *work)
@@ -348,6 +421,11 @@ void drm_sched_job_recovery(struct drm_gpu_scheduler *sched)
 	struct drm_sched_job *s_job, *tmp;
 	bool found_guilty = false;
 	int r;
+	unsigned long flags;
+
+	spin_lock_irqsave(&sched->tdr_suspend_lock, flags);
+	sched->work_tdr_suspended = false;
+	spin_unlock_irqrestore(&sched->tdr_suspend_lock, flags);
 
 	spin_lock(&sched->job_list_lock);
 	list_for_each_entry_safe(s_job, tmp, &sched->ring_mirror_list, node) {
@@ -607,6 +685,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 	init_waitqueue_head(&sched->job_scheduled);
 	INIT_LIST_HEAD(&sched->ring_mirror_list);
 	spin_lock_init(&sched->job_list_lock);
+	spin_lock_init(&sched->tdr_suspend_lock);
 	atomic_set(&sched->hw_rq_count, 0);
 	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
 	atomic_set(&sched->num_jobs, 0);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index d87b268..5d39572 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -278,6 +278,9 @@ struct drm_gpu_scheduler {
 	atomic_t			hw_rq_count;
 	atomic64_t			job_id_count;
 	struct delayed_work		work_tdr;
+	unsigned long			timeout_remaining;
+	bool				work_tdr_suspended;
+	spinlock_t			tdr_suspend_lock;
 	struct task_struct		*thread;
 	struct list_head		ring_mirror_list;
 	spinlock_t			job_list_lock;
@@ -300,6 +303,8 @@ void drm_sched_hw_job_reset(struct drm_gpu_scheduler *sched,
 bool drm_sched_dependency_optimized(struct dma_fence* fence,
 				    struct drm_sched_entity *entity);
 void drm_sched_job_kickout(struct drm_sched_job *s_job);
+void drm_sched_suspend_timeout(struct drm_gpu_scheduler *sched);
+void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched);
 
 void drm_sched_rq_add_entity(struct drm_sched_rq *rq,
 			     struct drm_sched_entity *entity);
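
For reference, the remaining-time bookkeeping that
drm_sched_suspend_timeout() and drm_sched_resume_timeout() rely on can
be modelled in user space. This is only a sketch under assumptions:
struct tdr_model, model_suspend(), model_resume() and model_selfcheck()
are hypothetical names that do not exist in the scheduler, time_after()
is redefined locally to mirror the wrap-safe comparison from
<linux/jiffies.h>, and "now" stands in for jiffies. Locking and the
delayed work itself are left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Local stand-in for the kernel macro: true if a is after b, even when
 * the tick counter has wrapped between the two values. */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/* Hypothetical model of the timeout state, not the kernel struct. */
struct tdr_model {
	unsigned long expires;   /* absolute tick at which the timer fires */
	unsigned long remaining; /* ticks left, saved at suspend time */
	bool suspended;
};

/* Save how many ticks were left; an already expired timer saves 0. */
static void model_suspend(struct tdr_model *m, unsigned long now)
{
	if (m->suspended)
		return;
	m->remaining = time_after(now, m->expires) ? 0 : m->expires - now;
	m->suspended = true;
}

/* Re-arm the timer so that it fires after the saved remainder. */
static void model_resume(struct tdr_model *m, unsigned long now)
{
	if (!m->suspended)
		return;
	m->expires = now + m->remaining;
	m->suspended = false;
}

/* Small self-check covering the normal, expired and wrapped cases. */
static void model_selfcheck(void)
{
	struct tdr_model m = { .expires = 1500, .remaining = 0, .suspended = false };

	model_suspend(&m, 1000);          /* 500 ticks were left */
	assert(m.remaining == 500);
	model_resume(&m, 4000);           /* fires 500 ticks after resume */
	assert(m.expires == 4500 && !m.suspended);

	model_suspend(&m, 5000);          /* timer had already expired */
	assert(m.remaining == 0);
	model_resume(&m, 5000);

	m.expires = 50;                   /* expiry lies just past the wrap */
	model_suspend(&m, ULONG_MAX - 49);
	assert(m.remaining == 100);
}
```

Calling model_selfcheck() from a test main() exercises all three cases.
Comparing the absolute expiry against the current tick before
subtracting is what keeps the wrapped case correct.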