From patchwork Tue Oct 31 03:24:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13440992 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 32BE1C4332F for ; Tue, 31 Oct 2023 03:24:57 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8606F10E3E8; Tue, 31 Oct 2023 03:24:56 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id D068F10E3E7; Tue, 31 Oct 2023 03:24:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698722694; x=1730258694; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=wN6C6PniUKR3nsyz7b6O70v63elRiQeCyV72F00DcKI=; b=FxxUGl00M/GmHtv1I7UAFFqOC3riFzUO2fn5UbMBvqk8DDcTfsC0x51F GyRdfXUHyFk6BOmlMQGE0wAgX5XcaQX5w3WIvm6ojncqd0YAp3mEtcm3c OP/lSQP7TjGL++2//+lwjObHdyTqdkrIDmXsG0Olon9Ja5kiEBHma893j GfPxBJpZDvvTIfWYeyIGCqz4cwHz2yxJRh3S+N9MrGCQE1utnqr3CC1Tt T9ErWXG/9LbQhiilJkjyHKs+kVd1rcXBvFa0E1PKK+58iw3K9q6nZTlTC kOWdINIKnlvc04Pe4RolqRpk5FZY3CJFCYfH985VE5EmDhQfjuJBtlaFB g==; X-IronPort-AV: E=McAfee;i="6600,9927,10879"; a="392069026" X-IronPort-AV: E=Sophos;i="6.03,264,1694761200"; d="scan'208";a="392069026" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2023 20:24:09 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,264,1694761200"; d="scan'208";a="1660981" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by 
orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2023 20:24:09 -0700 From: Matthew Brost To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH v8 1/5] drm/sched: Add drm_sched_wqueue_* helpers Date: Mon, 30 Oct 2023 20:24:35 -0700 Message-Id: <20231031032439.1558703-2-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231031032439.1558703-1-matthew.brost@intel.com> References: <20231031032439.1558703-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: robdclark@chromium.org, thomas.hellstrom@linux.intel.com, Matthew Brost , sarah.walker@imgtec.com, Luben Tuikov , ltuikov@yahoo.com, ketil.johnsen@arm.com, Liviu.Dudau@arm.com, mcanal@igalia.com, boris.brezillon@collabora.com, dakr@redhat.com, donald.robson@imgtec.com, lina@asahilina.net, christian.koenig@amd.com, faith.ekstrand@collabora.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add scheduler wqueue ready, stop, and start helpers to hide the implementation details of the scheduler from the drivers. 
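To make the intent concrete, here is a rough userspace sketch of the call pattern the helpers enable. It is not kernel code: struct mock_sched and every mock_* function are stand-ins invented for this example, and they only mimic the shape of drm_sched_wqueue_ready/stop/start. The point is that the driver loop no longer looks at how the scheduler runs (kthread or otherwise), only at the three helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct drm_gpu_scheduler; the fields are assumptions. */
struct mock_sched {
	bool ready;   /* scheduler has been initialised */
	bool running; /* submission is currently allowed */
};

/* Mock counterparts of the ready/stop/start helpers (hypothetical names). */
static bool mock_wqueue_ready(const struct mock_sched *s) { return s->ready; }
static void mock_wqueue_stop(struct mock_sched *s) { s->running = false; }
static void mock_wqueue_start(struct mock_sched *s) { s->running = true; }

/* Stop submission on every initialised ring; returns how many were stopped. */
static size_t mock_suspend_all(struct mock_sched **rings, size_t n)
{
	size_t stopped = 0;

	assert(rings != NULL);
	for (size_t i = 0; i < n; i++) {
		/* Skip missing rings and rings whose scheduler was never set up. */
		if (!rings[i] || !mock_wqueue_ready(rings[i]))
			continue;
		mock_wqueue_stop(rings[i]);
		stopped++;
	}
	return stopped;
}

/* Restart submission on every initialised ring; returns how many were started. */
static size_t mock_resume_all(struct mock_sched **rings, size_t n)
{
	size_t started = 0;

	assert(rings != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!rings[i] || !mock_wqueue_ready(rings[i]))
			continue;
		mock_wqueue_start(rings[i]);
		started++;
	}
	return started;
}
```

Because the loops only go through the three helpers, the scheduler internals can change without touching this kind of driver code.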
v2: - s/sched_wqueue/sched_wqueue (Luben) - Remove the extra white line after the return-statement (Luben) - update drm_sched_wqueue_ready comment (Luben) Cc: Luben Tuikov Signed-off-by: Matthew Brost Reviewed-by: Luben Tuikov --- .../drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 15 +++---- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +++--- drivers/gpu/drm/msm/adreno/adreno_device.c | 6 ++- drivers/gpu/drm/scheduler/sched_main.c | 39 ++++++++++++++++++- include/drm/gpu_scheduler.h | 3 ++ 6 files changed, 59 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c index 625db444df1c..10d56979fe3b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c @@ -290,7 +290,7 @@ static int suspend_resume_compute_scheduler(struct amdgpu_device *adev, bool sus for (i = 0; i < adev->gfx.num_compute_rings; i++) { struct amdgpu_ring *ring = &adev->gfx.compute_ring[i]; - if (!(ring && ring->sched.thread)) + if (!(ring && drm_sched_wqueue_ready(&ring->sched))) continue; /* stop secheduler and drain ring. 
*/ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index 3136a0774dd9..e20fd9e6c5bf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -1659,9 +1659,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused) for (i = 0; i < AMDGPU_MAX_RINGS; i++) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !ring->sched.thread) + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) continue; - kthread_park(ring->sched.thread); + drm_sched_wqueue_stop(&ring->sched); } seq_puts(m, "run ib test:\n"); @@ -1675,9 +1675,9 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused) for (i = 0; i < AMDGPU_MAX_RINGS; i++) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !ring->sched.thread) + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) continue; - kthread_unpark(ring->sched.thread); + drm_sched_wqueue_start(&ring->sched); } up_write(&adev->reset_domain->sem); @@ -1897,7 +1897,8 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val) ring = adev->rings[val]; - if (!ring || !ring->funcs->preempt_ib || !ring->sched.thread) + if (!ring || !ring->funcs->preempt_ib || + !drm_sched_wqueue_ready(&ring->sched)) return -EINVAL; /* the last preemption failed */ @@ -1915,7 +1916,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val) goto pro_end; /* stop the scheduler */ - kthread_park(ring->sched.thread); + drm_sched_wqueue_stop(&ring->sched); /* preempt the IB */ r = amdgpu_ring_preempt_ib(ring); @@ -1949,7 +1950,7 @@ static int amdgpu_debugfs_ib_preempt(void *data, u64 val) failure: /* restart the scheduler */ - kthread_unpark(ring->sched.thread); + drm_sched_wqueue_start(&ring->sched); up_read(&adev->reset_domain->sem); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 186c06756a2c..d20c12aae66b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4861,7 +4861,7 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev) for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !ring->sched.thread) + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) continue; spin_lock(&ring->sched.job_list_lock); @@ -5000,7 +5000,7 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev, for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !ring->sched.thread) + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) continue; /* Clear job fence from fence drv to avoid force_completion @@ -5489,7 +5489,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { struct amdgpu_ring *ring = tmp_adev->rings[i]; - if (!ring || !ring->sched.thread) + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) continue; drm_sched_stop(&ring->sched, job ? &job->base : NULL); @@ -5565,7 +5565,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { struct amdgpu_ring *ring = tmp_adev->rings[i]; - if (!ring || !ring->sched.thread) + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) continue; drm_sched_start(&ring->sched, true); @@ -5892,7 +5892,7 @@ pci_ers_result_t amdgpu_pci_error_detected(struct pci_dev *pdev, pci_channel_sta for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !ring->sched.thread) + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) continue; drm_sched_stop(&ring->sched, NULL); @@ -6020,7 +6020,7 @@ void amdgpu_pci_resume(struct pci_dev *pdev) for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { struct amdgpu_ring *ring = adev->rings[i]; - if (!ring || !ring->sched.thread) + if (!ring || !drm_sched_wqueue_ready(&ring->sched)) continue; drm_sched_start(&ring->sched, true); diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
b/drivers/gpu/drm/msm/adreno/adreno_device.c index 41b13dec9bef..f62ab5257e66 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -841,7 +841,8 @@ static void suspend_scheduler(struct msm_gpu *gpu) */ for (i = 0; i < gpu->nr_rings; i++) { struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; - kthread_park(sched->thread); + + drm_sched_wqueue_stop(sched); } } @@ -851,7 +852,8 @@ static void resume_scheduler(struct msm_gpu *gpu) for (i = 0; i < gpu->nr_rings; i++) { struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; - kthread_unpark(sched->thread); + + drm_sched_wqueue_start(sched); } } diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 99797a8c836a..54c1c5fe01ba 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -439,7 +439,7 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) { struct drm_sched_job *s_job, *tmp; - kthread_park(sched->thread); + drm_sched_wqueue_stop(sched); /* * Reinsert back the bad job here - now it's safe as @@ -552,7 +552,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery) spin_unlock(&sched->job_list_lock); } - kthread_unpark(sched->thread); + drm_sched_wqueue_start(sched); } EXPORT_SYMBOL(drm_sched_start); @@ -1252,3 +1252,38 @@ void drm_sched_increase_karma(struct drm_sched_job *bad) } } EXPORT_SYMBOL(drm_sched_increase_karma); + +/** + * drm_sched_wqueue_ready - Is the scheduler ready for submission + * + * @sched: scheduler instance + * + * Returns true if submission is ready + */ +bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched) +{ + return !!sched->thread; +} +EXPORT_SYMBOL(drm_sched_wqueue_ready); + +/** + * drm_sched_wqueue_stop - stop scheduler submission + * + * @sched: scheduler instance + */ +void drm_sched_wqueue_stop(struct drm_gpu_scheduler *sched) +{ + kthread_park(sched->thread); +} 
+EXPORT_SYMBOL(drm_sched_wqueue_stop); + +/** + * drm_sched_wqueue_start - start scheduler submission + * + * @sched: scheduler instance + */ +void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched) +{ + kthread_unpark(sched->thread); +} +EXPORT_SYMBOL(drm_sched_wqueue_start); diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index d2fb81e34174..1d5a20af4a06 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -552,6 +552,9 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity, void drm_sched_job_cleanup(struct drm_sched_job *job); void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched); +bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched); +void drm_sched_wqueue_stop(struct drm_gpu_scheduler *sched); +void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched); void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad); void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery); void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched); From patchwork Tue Oct 31 03:24:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matthew Brost X-Patchwork-Id: 13440997 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B73F4C4167D for ; Tue, 31 Oct 2023 03:25:12 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3F21910E3F2; Tue, 31 Oct 2023 03:25:01 +0000 (UTC) Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.65]) by gabe.freedesktop.org (Postfix) with ESMTPS id 091DF10E3E8; Tue, 31 Oct 2023 03:24:55 
+0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698722695; x=1730258695; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dKoZ0kOEZ2iqvNbACKeGFDgyXPuOICxAqTQoBIX1faM=; b=Z7CrNNKu4NmHm0JIOFh0KCwGg6mZZnvor5S9z68rU3F+dVfZBSzRXkVv DQQiIwceUB7FPu3eKja5q8puyxtBJNm+ZNJsS8V3tXTPsioripGDb5suk sscSKrqrPdstuaFNZWPe1yD6yDIxap6yY9Hw5VqPgFi3dNKozpqZz7fum CUKD6Ak+bRCRsA7HyK/pC3cHdRkcyFSKERSas9smTcM4htiDT/TduPLm9 9NqVLiaR0tXMCYTN9y9O80qO+G2UhD87vDMBVHUXDwE5Ejl7GSu1wH8K3 tun7uHPHzryXYCJpZ6DQrgL+LFmWHKiNa+wtCoFL9PLAPasDehEJ/cDOn w==; X-IronPort-AV: E=McAfee;i="6600,9927,10879"; a="392069040" X-IronPort-AV: E=Sophos;i="6.03,264,1694761200"; d="scan'208";a="392069040" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2023 20:24:09 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,264,1694761200"; d="scan'208";a="1660987" Received: from lstrano-desk.jf.intel.com ([10.54.39.91]) by orviesa002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2023 20:24:09 -0700 From: Matthew Brost To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org Subject: [PATCH v8 2/5] drm/sched: Convert drm scheduler to use a work queue rather than kthread Date: Mon, 30 Oct 2023 20:24:36 -0700 Message-Id: <20231031032439.1558703-3-matthew.brost@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231031032439.1558703-1-matthew.brost@intel.com> References: <20231031032439.1558703-1-matthew.brost@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: robdclark@chromium.org, thomas.hellstrom@linux.intel.com, Matthew Brost , sarah.walker@imgtec.com, Luben Tuikov , ltuikov@yahoo.com, 
ketil.johnsen@arm.com, Liviu.Dudau@arm.com, mcanal@igalia.com, boris.brezillon@collabora.com, dakr@redhat.com, donald.robson@imgtec.com, lina@asahilina.net, christian.koenig@amd.com, faith.ekstrand@collabora.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel"

In Xe, the new Intel GPU driver, a choice has been made to have a 1 to 1 mapping between a drm_gpu_scheduler and a drm_sched_entity. At first this seems a bit odd, but let us explain the reasoning below.

1. In Xe the submission order from multiple drm_sched_entity is not guaranteed to match the completion order, even if they target the same hardware engine. This is because in Xe we have a firmware scheduler, the GuC, which is allowed to reorder, timeslice, and preempt submissions. If a drm_gpu_scheduler is shared across multiple drm_sched_entity, the TDR falls apart, as the TDR expects submission order == completion order. Using a dedicated drm_gpu_scheduler per drm_sched_entity solves this problem.

2. In Xe submissions are done by programming a ring buffer (circular buffer). A drm_gpu_scheduler provides a limit on the number of jobs; if that limit is set to RING_SIZE / MAX_SIZE_PER_JOB, we get flow control on the ring for free.

A problem with this design is that a drm_gpu_scheduler currently uses a kthread for submission / job cleanup. This doesn't scale if a large number of drm_gpu_scheduler are used. To work around the scaling issue, use a worker rather than a kthread for submission / job cleanup.

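The flow-control argument in point 2 comes down to one division. The following userspace sketch spells it out. It is only an illustration: the function names and any sizes passed to them are assumptions for this example and do not come from the Xe driver.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative userspace sketch, not kernel code. Both function names are
 * hypothetical and exist only for this example.
 */

/* Job limit that guarantees the in-flight jobs always fit in the ring. */
static size_t ring_job_limit(size_t ring_size, size_t max_size_per_job)
{
	assert(max_size_per_job > 0);
	return ring_size / max_size_per_job;
}

/* Worst-case ring usage when every allowed job has the maximum size. */
static size_t ring_worst_case_bytes(size_t job_limit, size_t max_size_per_job)
{
	return job_limit * max_size_per_job;
}
```

Since integer division rounds down, job_limit * MAX_SIZE_PER_JOB can never exceed RING_SIZE. If the scheduler never has more than that many jobs in flight, the ring cannot overflow and the driver needs no separate check for free ring space.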
v2: - (Rob Clark) Fix msm build - Pass in run work queue v3: - (Boris) don't have loop in worker v4: - (Tvrtko) break out submit ready, stop, start helpers into own patch v5: - (Boris) default to ordered work queue v6: - (Luben / checkpatch) fix alignment in msm_ringbuffer.c - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue - (Luben) Update comment for drm_sched_wqueue_enqueue - (Luben) Positive check for submit_wq in drm_sched_init - (Luben) s/alloc_submit_wq/own_submit_wq v7: - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue v8: - (Luben) Adjust var names / comments Signed-off-by: Matthew Brost Reviewed-by: Luben Tuikov --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +- drivers/gpu/drm/lima/lima_sched.c | 2 +- drivers/gpu/drm/msm/msm_ringbuffer.c | 2 +- drivers/gpu/drm/nouveau/nouveau_sched.c | 2 +- drivers/gpu/drm/panfrost/panfrost_job.c | 2 +- drivers/gpu/drm/scheduler/sched_main.c | 131 +++++++++++---------- drivers/gpu/drm/v3d/v3d_sched.c | 10 +- include/drm/gpu_scheduler.h | 14 ++- 9 files changed, 86 insertions(+), 81 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index d20c12aae66b..f493ffa1feec 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2491,7 +2491,7 @@ static int amdgpu_device_init_schedulers(struct amdgpu_device *adev) break; } - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, ring->num_hw_submission, 0, timeout, adev->reset_domain->wq, diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index 9b79f218e21a..c4b04b0dee16 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu) { int ret; - ret = 
drm_sched_init(&gpu->sched, &etnaviv_sched_ops, + ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, etnaviv_hw_jobs_limit, etnaviv_job_hang_limit, msecs_to_jiffies(500), NULL, NULL, diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index 295f0353a02e..aa030e1f7cda 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name) INIT_WORK(&pipe->recover_work, lima_sched_recover_work); - return drm_sched_init(&pipe->base, &lima_sched_ops, + return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, 1, lima_job_hang_limit, diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c index 95257ab0185d..4968568e3b54 100644 --- a/drivers/gpu/drm/msm/msm_ringbuffer.c +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c @@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int id, /* currently managing hangcheck ourselves: */ sched_timeout = MAX_SCHEDULE_TIMEOUT; - ret = drm_sched_init(&ring->sched, &msm_sched_ops, + ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, num_hw_submissions, 0, sched_timeout, NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev); diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c index 7c376c4ccdcf..c4ba56b1a6dd 100644 --- a/drivers/gpu/drm/nouveau/nouveau_sched.c +++ b/drivers/gpu/drm/nouveau/nouveau_sched.c @@ -435,7 +435,7 @@ int nouveau_sched_init(struct nouveau_drm *drm) if (!drm->sched_wq) return -ENOMEM; - return drm_sched_init(sched, &nouveau_sched_ops, + return drm_sched_init(sched, &nouveau_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, NOUVEAU_SCHED_HW_SUBMISSIONS, 0, job_hang_limit, NULL, NULL, "nouveau_sched", drm->dev->dev); diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c index ecd2e035147f..6d89e24322db 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -852,7 +852,7 @@ int panfrost_job_init(struct panfrost_device *pfdev) js->queue[j].fence_context = dma_fence_context_alloc(1); ret = drm_sched_init(&js->queue[j].sched, - &panfrost_sched_ops, + &panfrost_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, nentries, 0, msecs_to_jiffies(JOB_TIMEOUT_MS), diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 54c1c5fe01ba..d1ae05bded15 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -48,7 +48,6 @@ * through the jobs entity pointer. */ -#include #include #include #include @@ -256,6 +255,16 @@ drm_sched_rq_select_entity_fifo(struct drm_sched_rq *rq) return rb ? rb_entry(rb, struct drm_sched_entity, rb_tree_node) : NULL; } +/** + * drm_sched_run_job_queue - enqueue run-job work + * @sched: scheduler instance + */ +static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched) +{ + if (!READ_ONCE(sched->pause_submit)) + queue_work(sched->submit_wq, &sched->work_run_job); +} + /** * drm_sched_job_done - complete a job * @s_job: pointer to the job which is done @@ -275,7 +284,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result) dma_fence_get(&s_fence->finished); drm_sched_fence_finished(s_fence, result); dma_fence_put(&s_fence->finished); - wake_up_interruptible(&sched->wake_up_worker); + drm_sched_run_job_queue(sched); } /** @@ -874,7 +883,7 @@ static bool drm_sched_can_queue(struct drm_gpu_scheduler *sched) void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched) { if (drm_sched_can_queue(sched)) - wake_up_interruptible(&sched->wake_up_worker); + drm_sched_run_job_queue(sched); } /** @@ -985,60 +994,41 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list, EXPORT_SYMBOL(drm_sched_pick_best); /** - * drm_sched_blocked 
- check if the scheduler is blocked + * drm_sched_run_job_work - main scheduler thread * - * @sched: scheduler instance - * - * Returns true if blocked, otherwise false. + * @w: run job work */ -static bool drm_sched_blocked(struct drm_gpu_scheduler *sched) +static void drm_sched_run_job_work(struct work_struct *w) { - if (kthread_should_park()) { - kthread_parkme(); - return true; - } - - return false; -} - -/** - * drm_sched_main - main scheduler thread - * - * @param: scheduler instance - * - * Returns 0. - */ -static int drm_sched_main(void *param) -{ - struct drm_gpu_scheduler *sched = (struct drm_gpu_scheduler *)param; + struct drm_gpu_scheduler *sched = + container_of(w, struct drm_gpu_scheduler, work_run_job); + struct drm_sched_entity *entity; + struct drm_sched_job *cleanup_job; int r; - sched_set_fifo_low(current); + if (READ_ONCE(sched->pause_submit)) + return; - while (!kthread_should_stop()) { - struct drm_sched_entity *entity = NULL; - struct drm_sched_fence *s_fence; - struct drm_sched_job *sched_job; - struct dma_fence *fence; - struct drm_sched_job *cleanup_job = NULL; + cleanup_job = drm_sched_get_cleanup_job(sched); + entity = drm_sched_select_entity(sched); - wait_event_interruptible(sched->wake_up_worker, - (cleanup_job = drm_sched_get_cleanup_job(sched)) || - (!drm_sched_blocked(sched) && - (entity = drm_sched_select_entity(sched))) || - kthread_should_stop()); + if (!entity && !cleanup_job) + return; /* No more work */ - if (cleanup_job) - sched->ops->free_job(cleanup_job); + if (cleanup_job) + sched->ops->free_job(cleanup_job); - if (!entity) - continue; + if (entity) { + struct dma_fence *fence; + struct drm_sched_fence *s_fence; + struct drm_sched_job *sched_job; sched_job = drm_sched_entity_pop_job(entity); - if (!sched_job) { complete_all(&entity->entity_idle); - continue; + if (!cleanup_job) + return; /* No more work */ + goto again; } s_fence = sched_job->s_fence; @@ -1069,7 +1059,9 @@ static int drm_sched_main(void *param) 
wake_up(&sched->job_scheduled); } - return 0; + +again: + drm_sched_run_job_queue(sched); } /** @@ -1077,6 +1069,8 @@ static int drm_sched_main(void *param) * * @sched: scheduler instance * @ops: backend operations for this scheduler + * @submit_wq: workqueue to use for submission. If NULL, an ordered wq is + * allocated and used * @num_rqs: number of runqueues, one for each priority, up to DRM_SCHED_PRIORITY_COUNT * @hw_submission: number of hw submissions that can be in flight * @hang_limit: number of times to allow a job to hang before dropping it @@ -1091,6 +1085,7 @@ static int drm_sched_main(void *param) */ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_backend_ops *ops, + struct workqueue_struct *submit_wq, u32 num_rqs, uint32_t hw_submission, unsigned int hang_limit, long timeout, struct workqueue_struct *timeout_wq, atomic_t *score, const char *name, struct device *dev) @@ -1121,14 +1116,22 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, return 0; } + if (submit_wq) { + sched->submit_wq = submit_wq; + sched->own_submit_wq = false; + } else { + sched->submit_wq = alloc_ordered_workqueue(name, 0); + if (!sched->submit_wq) + return -ENOMEM; + + sched->own_submit_wq = true; + } + ret = -ENOMEM; sched->sched_rq = kmalloc_array(num_rqs, sizeof(*sched->sched_rq), GFP_KERNEL | __GFP_ZERO); - if (!sched->sched_rq) { - drm_err(sched, "%s: out of memory for sched_rq\n", __func__); - return -ENOMEM; - } + if (!sched->sched_rq) + goto Out_free; sched->num_rqs = num_rqs; - ret = -ENOMEM; for (i = DRM_SCHED_PRIORITY_MIN; i < sched->num_rqs; i++) { sched->sched_rq[i] = kzalloc(sizeof(*sched->sched_rq[i]), GFP_KERNEL); if (!sched->sched_rq[i]) @@ -1136,31 +1139,26 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, drm_sched_rq_init(sched, sched->sched_rq[i]); } - init_waitqueue_head(&sched->wake_up_worker); init_waitqueue_head(&sched->job_scheduled); INIT_LIST_HEAD(&sched->pending_list); spin_lock_init(&sched->job_list_lock); 
atomic_set(&sched->hw_rq_count, 0); INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout); + INIT_WORK(&sched->work_run_job, drm_sched_run_job_work); atomic_set(&sched->_score, 0); atomic64_set(&sched->job_id_count, 0); - - /* Each scheduler will run on a seperate kernel thread */ - sched->thread = kthread_run(drm_sched_main, sched, sched->name); - if (IS_ERR(sched->thread)) { - ret = PTR_ERR(sched->thread); - sched->thread = NULL; - DRM_DEV_ERROR(sched->dev, "Failed to create scheduler for %s.\n", name); - goto Out_unroll; - } + sched->pause_submit = false; sched->ready = true; return 0; Out_unroll: for (--i ; i >= DRM_SCHED_PRIORITY_MIN; i--) kfree(sched->sched_rq[i]); +Out_free: kfree(sched->sched_rq); sched->sched_rq = NULL; + if (sched->own_submit_wq) + destroy_workqueue(sched->submit_wq); drm_err(sched, "%s: Failed to setup GPU scheduler--out of memory\n", __func__); return ret; } @@ -1178,8 +1176,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched) struct drm_sched_entity *s_entity; int i; - if (sched->thread) - kthread_stop(sched->thread); + drm_sched_wqueue_stop(sched); for (i = sched->num_rqs - 1; i >= DRM_SCHED_PRIORITY_MIN; i--) { struct drm_sched_rq *rq = sched->sched_rq[i]; @@ -1202,6 +1199,8 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched) /* Confirm no work left behind accessing device structures */ cancel_delayed_work_sync(&sched->work_tdr); + if (sched->own_submit_wq) + destroy_workqueue(sched->submit_wq); sched->ready = false; kfree(sched->sched_rq); sched->sched_rq = NULL; @@ -1262,7 +1261,7 @@ EXPORT_SYMBOL(drm_sched_increase_karma); */ bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched) { - return !!sched->thread; + return sched->ready; } EXPORT_SYMBOL(drm_sched_wqueue_ready); @@ -1273,7 +1272,8 @@ EXPORT_SYMBOL(drm_sched_wqueue_ready); */ void drm_sched_wqueue_stop(struct drm_gpu_scheduler *sched) { - kthread_park(sched->thread); + WRITE_ONCE(sched->pause_submit, true); + cancel_work_sync(&sched->work_run_job); } 
EXPORT_SYMBOL(drm_sched_wqueue_stop); @@ -1284,6 +1284,7 @@ EXPORT_SYMBOL(drm_sched_wqueue_stop); */ void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched) { - kthread_unpark(sched->thread); + WRITE_ONCE(sched->pause_submit, false); + queue_work(sched->submit_wq, &sched->work_run_job); } EXPORT_SYMBOL(drm_sched_wqueue_start); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 038e1ae589c7..0b6696b0d882 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -388,7 +388,7 @@ v3d_sched_init(struct v3d_dev *v3d) int ret; ret = drm_sched_init(&v3d->queue[V3D_BIN].sched, - &v3d_bin_sched_ops, + &v3d_bin_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, hw_jobs_limit, job_hang_limit, msecs_to_jiffies(hang_limit_ms), NULL, @@ -397,7 +397,7 @@ v3d_sched_init(struct v3d_dev *v3d) return ret; ret = drm_sched_init(&v3d->queue[V3D_RENDER].sched, - &v3d_render_sched_ops, + &v3d_render_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, hw_jobs_limit, job_hang_limit, msecs_to_jiffies(hang_limit_ms), NULL, @@ -406,7 +406,7 @@ v3d_sched_init(struct v3d_dev *v3d) goto fail; ret = drm_sched_init(&v3d->queue[V3D_TFU].sched, - &v3d_tfu_sched_ops, + &v3d_tfu_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, hw_jobs_limit, job_hang_limit, msecs_to_jiffies(hang_limit_ms), NULL, @@ -416,7 +416,7 @@ v3d_sched_init(struct v3d_dev *v3d) if (v3d_has_csd(v3d)) { ret = drm_sched_init(&v3d->queue[V3D_CSD].sched, - &v3d_csd_sched_ops, + &v3d_csd_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, hw_jobs_limit, job_hang_limit, msecs_to_jiffies(hang_limit_ms), NULL, @@ -425,7 +425,7 @@ v3d_sched_init(struct v3d_dev *v3d) goto fail; ret = drm_sched_init(&v3d->queue[V3D_CACHE_CLEAN].sched, - &v3d_cache_clean_sched_ops, + &v3d_cache_clean_sched_ops, NULL, DRM_SCHED_PRIORITY_COUNT, hw_jobs_limit, job_hang_limit, msecs_to_jiffies(hang_limit_ms), NULL, diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 1d5a20af4a06..e0e7c4eb57d9 100644 --- 
a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -474,17 +474,16 @@ struct drm_sched_backend_ops { * @num_rqs: Number of run-queues. This is at most DRM_SCHED_PRIORITY_COUNT, * as there's usually one run-queue per priority, but could be less. * @sched_rq: An allocated array of run-queues of size @num_rqs; - * @wake_up_worker: the wait queue on which the scheduler sleeps until a job - * is ready to be scheduled. * @job_scheduled: once @drm_sched_entity_do_release is called the scheduler * waits on this wait queue until all the scheduled jobs are * finished. * @hw_rq_count: the number of jobs currently in the hardware queue. * @job_id_count: used to assign unique id to the each job. + * @submit_wq: workqueue used to queue @work_run_job * @timeout_wq: workqueue used to queue @work_tdr + * @work_run_job: work which calls run_job op of each scheduler. * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the * timeout interval is over. - * @thread: the kthread on which the scheduler which run. * @pending_list: the list of jobs which are currently in the job queue. * @job_list_lock: lock to protect the pending_list. * @hang_limit: once the hangs by a job crosses this limit then it is marked @@ -493,6 +492,8 @@ struct drm_sched_backend_ops { * @_score: score used when the driver doesn't provide one * @ready: marks if the underlying HW is ready to work * @free_guilty: A hit to time out handler to free the guilty job. + * @pause_submit: pause queuing of @work_run_job on @submit_wq + * @own_submit_wq: scheduler owns allocation of @submit_wq * @dev: system &struct device * * One scheduler is implemented for each hardware ring. 
@@ -504,13 +505,13 @@ struct drm_gpu_scheduler {
 	const char *name;
 	u32 num_rqs;
 	struct drm_sched_rq **sched_rq;
-	wait_queue_head_t wake_up_worker;
 	wait_queue_head_t job_scheduled;
 	atomic_t hw_rq_count;
 	atomic64_t job_id_count;
+	struct workqueue_struct *submit_wq;
 	struct workqueue_struct *timeout_wq;
+	struct work_struct work_run_job;
 	struct delayed_work work_tdr;
-	struct task_struct *thread;
 	struct list_head pending_list;
 	spinlock_t job_list_lock;
 	int hang_limit;
@@ -518,11 +519,14 @@ struct drm_gpu_scheduler {
 	atomic_t _score;
 	bool ready;
 	bool free_guilty;
+	bool pause_submit;
+	bool own_submit_wq;
 	struct device *dev;
 };
 
 int drm_sched_init(struct drm_gpu_scheduler *sched,
 		   const struct drm_sched_backend_ops *ops,
+		   struct workqueue_struct *submit_wq,
 		   u32 num_rqs, uint32_t hw_submission, unsigned int hang_limit,
 		   long timeout, struct workqueue_struct *timeout_wq,
 		   atomic_t *score, const char *name, struct device *dev);

From patchwork Tue Oct 31 03:24:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13440993
From: Matthew Brost
To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org
Subject: [PATCH v8 3/5] drm/sched: Split free_job into own work item
Date: Mon, 30 Oct 2023 20:24:37 -0700
Message-Id: <20231031032439.1558703-4-matthew.brost@intel.com>
In-Reply-To: <20231031032439.1558703-1-matthew.brost@intel.com>
References: <20231031032439.1558703-1-matthew.brost@intel.com>
Cc: robdclark@chromium.org, thomas.hellstrom@linux.intel.com, Matthew Brost,
 sarah.walker@imgtec.com, ltuikov@yahoo.com, ketil.johnsen@arm.com,
 Liviu.Dudau@arm.com, mcanal@igalia.com, boris.brezillon@collabora.com,
 dakr@redhat.com, donald.robson@imgtec.com, lina@asahilina.net,
 christian.koenig@amd.com,
 faith.ekstrand@collabora.com

Rather than calling free_job and run_job in the same work item, have a
dedicated work item for each. This aligns with the design and intended
use of work queues.

v2:
  - Test for DMA_FENCE_FLAG_TIMESTAMP_BIT before setting
    timestamp in free_job() work item (Danilo)
v3:
  - Drop forward dec of drm_sched_select_entity (Boris)
  - Return in drm_sched_run_job_work if entity NULL (Boris)
v4:
  - Replace dequeue with peek and invert logic (Luben)
  - Wrap to 100 lines (Luben)
  - Update comments for *_queue / *_queue_if_ready functions (Luben)
v5:
  - Drop peek argument, blindly reinit idle (Luben)
  - s/drm_sched_free_job_queue_if_ready/drm_sched_free_job_queue_if_done (Luben)
  - Update work_run_job & work_free_job kernel doc (Luben)
v6:
  - Do not move drm_sched_select_entity in file (Luben)

Signed-off-by: Matthew Brost
Reviewed-by: Luben Tuikov
---
 drivers/gpu/drm/scheduler/sched_main.c | 146 +++++++++++++++++--------
 include/drm/gpu_scheduler.h            |   4 +-
 2 files changed, 101 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index d1ae05bded15..3b1b2f8eafe8 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -265,6 +265,32 @@ static void drm_sched_run_job_queue(struct drm_gpu_scheduler *sched)
 		queue_work(sched->submit_wq, &sched->work_run_job);
 }
 
+/**
+ * drm_sched_free_job_queue - enqueue free-job work
+ * @sched: scheduler instance
+ */
+static void drm_sched_free_job_queue(struct drm_gpu_scheduler *sched)
+{
+	if (!READ_ONCE(sched->pause_submit))
+		queue_work(sched->submit_wq, &sched->work_free_job);
+}
+
+/**
+ * drm_sched_free_job_queue_if_done - enqueue free-job work if ready
+ * @sched: scheduler instance
+ */
+static void drm_sched_free_job_queue_if_done(struct drm_gpu_scheduler *sched)
+{
+	struct drm_sched_job *job;
+
+	spin_lock(&sched->job_list_lock);
+	job = list_first_entry_or_null(&sched->pending_list,
+				       struct drm_sched_job, list);
+	if (job && dma_fence_is_signaled(&job->s_fence->finished))
+		drm_sched_free_job_queue(sched);
+	spin_unlock(&sched->job_list_lock);
+}
+
 /**
  * drm_sched_job_done - complete a job
  * @s_job: pointer to the job which is done
@@ -284,7 +310,7 @@ static void drm_sched_job_done(struct drm_sched_job *s_job, int result)
 	dma_fence_get(&s_fence->finished);
 	drm_sched_fence_finished(s_fence, result);
 	dma_fence_put(&s_fence->finished);
-	drm_sched_run_job_queue(sched);
+	drm_sched_free_job_queue(sched);
 }
 
 /**
@@ -943,8 +969,10 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 						typeof(*next), list);
 
 		if (next) {
-			next->s_fence->scheduled.timestamp =
-				dma_fence_timestamp(&job->s_fence->finished);
+			if (test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT,
+				     &next->s_fence->scheduled.flags))
+				next->s_fence->scheduled.timestamp =
+					dma_fence_timestamp(&job->s_fence->finished);
 			/* start TO timer for next job */
 			drm_sched_start_timeout(sched);
 		}
@@ -994,7 +1022,40 @@ drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
 EXPORT_SYMBOL(drm_sched_pick_best);
 
 /**
- * drm_sched_run_job_work - main scheduler thread
+ * drm_sched_run_job_queue_if_ready - enqueue run-job work if ready
+ * @sched: scheduler instance
+ */
+static void drm_sched_run_job_queue_if_ready(struct drm_gpu_scheduler *sched)
+{
+	if (drm_sched_select_entity(sched))
+		drm_sched_run_job_queue(sched);
+}
+
+/**
+ * drm_sched_free_job_work - worker to call free_job
+ *
+ * @w: free job work
+ */
+static void drm_sched_free_job_work(struct work_struct *w)
+{
+	struct drm_gpu_scheduler *sched =
+		container_of(w, struct drm_gpu_scheduler, work_free_job);
+	struct drm_sched_job *cleanup_job;
+
+	if (READ_ONCE(sched->pause_submit))
+		return;
+
+	cleanup_job = drm_sched_get_cleanup_job(sched);
+	if (cleanup_job) {
+		sched->ops->free_job(cleanup_job);
+
+		drm_sched_free_job_queue_if_done(sched);
+		drm_sched_run_job_queue_if_ready(sched);
+	}
+}
+
+/**
+ * drm_sched_run_job_work - worker to call run_job
  *
  * @w: run job work
  */
@@ -1003,65 +1064,51 @@ static void drm_sched_run_job_work(struct work_struct *w)
 	struct drm_gpu_scheduler *sched =
 		container_of(w, struct drm_gpu_scheduler, work_run_job);
 	struct drm_sched_entity *entity;
-	struct drm_sched_job *cleanup_job;
+	struct dma_fence *fence;
+	struct drm_sched_fence *s_fence;
+	struct drm_sched_job *sched_job;
 	int r;
 
 	if (READ_ONCE(sched->pause_submit))
 		return;
 
-	cleanup_job = drm_sched_get_cleanup_job(sched);
 	entity = drm_sched_select_entity(sched);
+	if (!entity)
+		return;
 
-	if (!entity && !cleanup_job)
+	sched_job = drm_sched_entity_pop_job(entity);
+	if (!sched_job) {
+		complete_all(&entity->entity_idle);
 		return;	/* No more work */
+	}
 
-	if (cleanup_job)
-		sched->ops->free_job(cleanup_job);
-
-	if (entity) {
-		struct dma_fence *fence;
-		struct drm_sched_fence *s_fence;
-		struct drm_sched_job *sched_job;
-
-		sched_job = drm_sched_entity_pop_job(entity);
-		if (!sched_job) {
-			complete_all(&entity->entity_idle);
-			if (!cleanup_job)
-				return;	/* No more work */
-			goto again;
-		}
-
-		s_fence = sched_job->s_fence;
-
-		atomic_inc(&sched->hw_rq_count);
-		drm_sched_job_begin(sched_job);
+	s_fence = sched_job->s_fence;
 
-		trace_drm_run_job(sched_job, entity);
-		fence = sched->ops->run_job(sched_job);
-		complete_all(&entity->entity_idle);
-		drm_sched_fence_scheduled(s_fence, fence);
+	atomic_inc(&sched->hw_rq_count);
+	drm_sched_job_begin(sched_job);
 
-		if (!IS_ERR_OR_NULL(fence)) {
-			/* Drop for original kref_init of the fence */
-			dma_fence_put(fence);
+	trace_drm_run_job(sched_job, entity);
+	fence = sched->ops->run_job(sched_job);
+	complete_all(&entity->entity_idle);
+	drm_sched_fence_scheduled(s_fence, fence);
 
-			r = dma_fence_add_callback(fence, &sched_job->cb,
-						   drm_sched_job_done_cb);
-			if (r == -ENOENT)
-				drm_sched_job_done(sched_job, fence->error);
-			else if (r)
-				DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
-					  r);
-		} else {
-			drm_sched_job_done(sched_job, IS_ERR(fence) ?
-					   PTR_ERR(fence) : 0);
-		}
+	if (!IS_ERR_OR_NULL(fence)) {
+		/* Drop for original kref_init of the fence */
+		dma_fence_put(fence);
 
-		wake_up(&sched->job_scheduled);
+		r = dma_fence_add_callback(fence, &sched_job->cb,
+					   drm_sched_job_done_cb);
+		if (r == -ENOENT)
+			drm_sched_job_done(sched_job, fence->error);
+		else if (r)
+			DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r);
+	} else {
+		drm_sched_job_done(sched_job, IS_ERR(fence) ?
+				   PTR_ERR(fence) : 0);
 	}
 
-again:
-	drm_sched_run_job_queue(sched);
+	wake_up(&sched->job_scheduled);
+	drm_sched_run_job_queue_if_ready(sched);
 }
 
 /**
@@ -1145,6 +1192,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched,
 	atomic_set(&sched->hw_rq_count, 0);
 	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
 	INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
+	INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
 	atomic_set(&sched->_score, 0);
 	atomic64_set(&sched->job_id_count, 0);
 	sched->pause_submit = false;
@@ -1274,6 +1322,7 @@ void drm_sched_wqueue_stop(struct drm_gpu_scheduler *sched)
 {
 	WRITE_ONCE(sched->pause_submit, true);
 	cancel_work_sync(&sched->work_run_job);
+	cancel_work_sync(&sched->work_free_job);
 }
 EXPORT_SYMBOL(drm_sched_wqueue_stop);
 
@@ -1286,5 +1335,6 @@ void drm_sched_wqueue_start(struct drm_gpu_scheduler *sched)
 {
 	WRITE_ONCE(sched->pause_submit, false);
 	queue_work(sched->submit_wq, &sched->work_run_job);
+	queue_work(sched->submit_wq, &sched->work_free_job);
 }
 EXPORT_SYMBOL(drm_sched_wqueue_start);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index e0e7c4eb57d9..677ba96759ab 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -479,9 +479,10 @@ struct drm_sched_backend_ops {
  *                 finished.
  * @hw_rq_count: the number of jobs currently in the hardware queue.
  * @job_id_count: used to assign unique id to the each job.
- * @submit_wq: workqueue used to queue @work_run_job
+ * @submit_wq: workqueue used to queue @work_run_job and @work_free_job
  * @timeout_wq: workqueue used to queue @work_tdr
  * @work_run_job: work which calls run_job op of each scheduler.
+ * @work_free_job: work which calls free_job op of each scheduler.
  * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the
  *            timeout interval is over.
  * @pending_list: the list of jobs which are currently in the job queue.
@@ -511,6 +512,7 @@ struct drm_gpu_scheduler {
 	struct workqueue_struct *submit_wq;
 	struct workqueue_struct *timeout_wq;
 	struct work_struct work_run_job;
+	struct work_struct work_free_job;
 	struct delayed_work work_tdr;
 	struct list_head pending_list;
 	spinlock_t job_list_lock;

From patchwork Tue Oct 31 03:24:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13440995
From: Matthew Brost
To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org
Subject: [PATCH v8 4/5] drm/sched: Add drm_sched_start_timeout_unlocked helper
Date: Mon, 30 Oct 2023 20:24:38 -0700
Message-Id: <20231031032439.1558703-5-matthew.brost@intel.com>
In-Reply-To: <20231031032439.1558703-1-matthew.brost@intel.com>
References: <20231031032439.1558703-1-matthew.brost@intel.com>
Cc: robdclark@chromium.org, thomas.hellstrom@linux.intel.com, Matthew Brost,
 sarah.walker@imgtec.com, Luben Tuikov, ltuikov@yahoo.com,
 ketil.johnsen@arm.com, Liviu.Dudau@arm.com, mcanal@igalia.com,
 boris.brezillon@collabora.com, dakr@redhat.com, donald.robson@imgtec.com,
 lina@asahilina.net, christian.koenig@amd.com, faith.ekstrand@collabora.com

Also add a lockdep assert
to drm_sched_start_timeout.

Signed-off-by: Matthew Brost
Reviewed-by: Luben Tuikov
---
 drivers/gpu/drm/scheduler/sched_main.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 3b1b2f8eafe8..fc387de5a0c7 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -334,11 +334,20 @@ static void drm_sched_job_done_cb(struct dma_fence *f, struct dma_fence_cb *cb)
  */
 static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
 {
+	lockdep_assert_held(&sched->job_list_lock);
+
 	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
 	    !list_empty(&sched->pending_list))
 		queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
 }
 
+static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
+{
+	spin_lock(&sched->job_list_lock);
+	drm_sched_start_timeout(sched);
+	spin_unlock(&sched->job_list_lock);
+}
+
 /**
  * drm_sched_fault - immediately start timeout handler
  *
@@ -451,11 +460,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
 		spin_unlock(&sched->job_list_lock);
 	}
 
-	if (status != DRM_GPU_SCHED_STAT_ENODEV) {
-		spin_lock(&sched->job_list_lock);
-		drm_sched_start_timeout(sched);
-		spin_unlock(&sched->job_list_lock);
-	}
+	if (status != DRM_GPU_SCHED_STAT_ENODEV)
+		drm_sched_start_timeout_unlocked(sched);
 }
 
 /**
@@ -581,11 +587,8 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery)
 			drm_sched_job_done(s_job, -ECANCELED);
 	}
 
-	if (full_recovery) {
-		spin_lock(&sched->job_list_lock);
-		drm_sched_start_timeout(sched);
-		spin_unlock(&sched->job_list_lock);
-	}
+	if (full_recovery)
+		drm_sched_start_timeout_unlocked(sched);
 
 	drm_sched_wqueue_start(sched);
 }

From patchwork Tue Oct 31 03:24:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 13440996
From: Matthew Brost
To: dri-devel@lists.freedesktop.org, intel-xe@lists.freedesktop.org
Subject: [PATCH v8 5/5] drm/sched: Add a helper to queue TDR immediately
Date: Mon, 30 Oct 2023 20:24:39 -0700
Message-Id: <20231031032439.1558703-6-matthew.brost@intel.com>
In-Reply-To: <20231031032439.1558703-1-matthew.brost@intel.com>
References: <20231031032439.1558703-1-matthew.brost@intel.com>
Cc: robdclark@chromium.org, thomas.hellstrom@linux.intel.com, Matthew Brost,
 sarah.walker@imgtec.com, Luben Tuikov, ltuikov@yahoo.com,
 ketil.johnsen@arm.com, Liviu.Dudau@arm.com, mcanal@igalia.com,
 boris.brezillon@collabora.com, dakr@redhat.com, donald.robson@imgtec.com,
 lina@asahilina.net, christian.koenig@amd.com, faith.ekstrand@collabora.com

Add a helper whereby a driver can invoke TDR immediately.

v2:
 - Drop timeout args, rename function, use mod delayed work (Luben)
v3:
 - s/XE/Xe (Luben)
 - present tense in commit message (Luben)
 - Adjust comment for drm_sched_tdr_queue_imm (Luben)
v4:
 - Adjust commit message (Luben)

Cc: Luben Tuikov
Signed-off-by: Matthew Brost
Reviewed-by: Luben Tuikov
---
 drivers/gpu/drm/scheduler/sched_main.c | 18 +++++++++++++++++-
 include/drm/gpu_scheduler.h            |  1 +
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index fc387de5a0c7..98b2ad54fc70 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -338,7 +338,7 @@ static void drm_sched_start_timeout(struct drm_gpu_scheduler *sched)
 
 	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
 	    !list_empty(&sched->pending_list))
-		queue_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
+		mod_delayed_work(sched->timeout_wq, &sched->work_tdr, sched->timeout);
 }
 
 static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
@@ -348,6 +348,22 @@ static void drm_sched_start_timeout_unlocked(struct drm_gpu_scheduler *sched)
 	spin_unlock(&sched->job_list_lock);
 }
 
+/**
+ * drm_sched_tdr_queue_imm - immediately start job timeout handler
+ *
+ * @sched: scheduler for which the timeout handling should be started.
+ *
+ * Start timeout handling immediately for the named scheduler.
+ */
+void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched)
+{
+	spin_lock(&sched->job_list_lock);
+	sched->timeout = 0;
+	drm_sched_start_timeout(sched);
+	spin_unlock(&sched->job_list_lock);
+}
+EXPORT_SYMBOL(drm_sched_tdr_queue_imm);
+
 /**
  * drm_sched_fault - immediately start timeout handler
  *
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 677ba96759ab..c1565694c0e9 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -556,6 +556,7 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
 				    struct drm_gpu_scheduler **sched_list,
 				    unsigned int num_sched_list);
 
+void drm_sched_tdr_queue_imm(struct drm_gpu_scheduler *sched);
 void drm_sched_job_cleanup(struct drm_sched_job *job);
 void drm_sched_wakeup_if_can_queue(struct drm_gpu_scheduler *sched);
 bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched);
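
Note (illustration only, not part of the patch): the interaction between
drm_sched_tdr_queue_imm() and the switch to mod_delayed_work() in
drm_sched_start_timeout() may be easier to follow with a small userspace
model. The sketch below is not kernel code. Every model_* name is a
hypothetical stand-in, locking is left out, and the delayed work is reduced
to two fields. It assumes that re-arming replaces the previously programmed
delay, which is what mod_delayed_work() is used for here.

```c
/*
 * Userspace model only, NOT kernel code. All model_* names are
 * hypothetical stand-ins for the scheduler fields and helpers touched
 * by patch 5/5; locking (job_list_lock) is intentionally omitted.
 */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_MAX_TIMEOUT LONG_MAX /* stand-in for MAX_SCHEDULE_TIMEOUT */

struct model_sched {
	long timeout;     /* models sched->timeout */
	int pending_jobs; /* models !list_empty(&sched->pending_list) */
	bool tdr_armed;   /* true once the TDR work has been queued */
	long tdr_delay;   /* delay the TDR work was last armed with */
};

/* Models drm_sched_start_timeout(): (re)arm only if jobs are pending. */
static void model_start_timeout(struct model_sched *s)
{
	if (s->timeout != MODEL_MAX_TIMEOUT && s->pending_jobs > 0) {
		s->tdr_armed = true;
		/* Re-arming overwrites the old delay, like mod_delayed_work(). */
		s->tdr_delay = s->timeout;
	}
}

/* Models drm_sched_tdr_queue_imm(): zero the timeout, then re-arm. */
static void model_tdr_queue_imm(struct model_sched *s)
{
	s->timeout = 0;
	model_start_timeout(s);
}

/* Small self-check of the behavior modeled above. */
static void model_selfcheck(void)
{
	struct model_sched busy = { .timeout = 500, .pending_jobs = 1 };
	struct model_sched idle = { .timeout = 500, .pending_jobs = 0 };

	model_start_timeout(&busy);
	assert(busy.tdr_armed && busy.tdr_delay == 500);

	model_tdr_queue_imm(&busy);
	assert(busy.tdr_delay == 0); /* already-armed timer is moved to "now" */
	assert(busy.timeout == 0);   /* the stored timeout stays at 0 */

	model_tdr_queue_imm(&idle);
	assert(!idle.tdr_armed);     /* empty pending list: nothing is queued */
}
```

Calling model_selfcheck() from a test main() exercises all three cases. Two
points stand out in the model: the stored timeout remains 0 after the call
(the patch as posted does not restore sched->timeout either), and nothing is
queued when the pending list is empty.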