From patchwork Mon Jan 18 21:01:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrey Grodzovsky X-Patchwork-Id: 12028257 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.6 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MSGID_FROM_MTA_HEADER,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 398E6C43381 for ; Mon, 18 Jan 2021 21:02:50 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F36AD22DD3 for ; Mon, 18 Jan 2021 21:02:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F36AD22DD3 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=amd.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1376C6E5CC; Mon, 18 Jan 2021 21:02:48 +0000 (UTC) Received: from NAM11-CO1-obe.outbound.protection.outlook.com (mail-co1nam11on2079.outbound.protection.outlook.com [40.107.220.79]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3E05F6E5C3; Mon, 18 Jan 2021 21:02:45 +0000 (UTC) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=BgBAh6l1zjPuFS91l4PNKTXhW19TuVfAYs0588zwX/62iK5VOQclazoFmurTlY4y1VJUleElupb7GOixPGmXgCBF/JPs8348MDW9zRNRXo62o6fmLYym/s2WgOqqy2P8GuYR59SJDj+4rzIBTYlSModlQLH0e8ntu1G1P7XuKBrJ7J0gwFF2NKzKQDIhRx712lXjVYeXqgrjpXDioNbJeNfy4zVGve0bVYS5OP23O+QnE7gnuk30Vyz0JIRVSZCea/l2OVWJlsA5lTzK7m+WceKn5EO4GPhUQLbbNQlto0Cr+6lqzJnF0yMxEu9iNJbIhRt4o5VQKTMZv4dlwh4iag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=YGLl3snTccK41uXYtQBDr+JZ8i16Ma3viKtMFJCf13A=; b=XoKCzel5PyDyEOcnBk4JT9AuPYM5vHwq2Eov9Ebay7t+qtbGHNedeZStxd1fy2r3cnDEC7etYkz1Amy0YYQ9RWq/ebeqyByZWNeyb+cPy1xyZiz6I6RLfFcThGSaPR/13AoybhanJ36wH0p27d9fdqJ8xu61sNIDMAwDj8ejTQl6IeIJtJS/9llOURUVQSvEyEK8PO+BaPKxcr46Xx/4YlYqKN47weSRzXVOkX/VuoLz6O3GJN/SlFN08gQir+ps2Oir7aEsM1LdAPnIBUOOVWKeKhhKXUcQeX2m7Iq0DuTCdMchj2NsdD/n8CFDKjCyb18FbfHxPtcxnfiL+0YaOw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=YGLl3snTccK41uXYtQBDr+JZ8i16Ma3viKtMFJCf13A=; b=UZeKQMJc0dk4XilsSuUmxo+Eu6Wbhg9lDz6e0WKmZJTLHs418J2+sLkHQKyH1J2bd/m8QIujxx0cDVQHqyO1dcDTfzqs9pOMNnUz/vOK3OAPwAYH8GF5ydl/36KTn0GTvntUlKSgBqRNragsFt1v7NyQyl9r/NzL/E1wSGk0B80= Authentication-Results: lists.freedesktop.org; dkim=none (message not signed) header.d=none; lists.freedesktop.org; dmarc=none action=none header.from=amd.com; Received: from SN6PR12MB4623.namprd12.prod.outlook.com (2603:10b6:805:e9::17) by SN6PR12MB4767.namprd12.prod.outlook.com (2603:10b6:805:e5::32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3763.10; Mon, 18 Jan 2021 21:02:44 +0000 Received: from SN6PR12MB4623.namprd12.prod.outlook.com ([fe80::5d30:b29d:5f5b:6921]) by SN6PR12MB4623.namprd12.prod.outlook.com ([fe80::5d30:b29d:5f5b:6921%5]) with mapi id 15.20.3763.014; Mon, 18 Jan 2021 21:02:44 +0000 From: Andrey Grodzovsky To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, ckoenig.leichtzumerken@gmail.com, daniel.vetter@ffwll.ch, robh@kernel.org, l.stach@pengutronix.de, yuq825@gmail.com, eric@anholt.net Subject: [PATCH v4 12/14] drm/scheduler: Job timeout handler returns status Date: Mon, 18 Jan 2021 16:01:21 -0500 Message-Id: <1611003683-3534-13-git-send-email-andrey.grodzovsky@amd.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1611003683-3534-1-git-send-email-andrey.grodzovsky@amd.com> References: <1611003683-3534-1-git-send-email-andrey.grodzovsky@amd.com> X-Originating-IP: [2607:fea8:3edf:49b0:84d3:21cc:478c:efa7] X-ClientProxiedBy: YTXPR0101CA0020.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:b00::33) To SN6PR12MB4623.namprd12.prod.outlook.com (2603:10b6:805:e9::17) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 Received: from ubuntu-1604-test.hitronhub.home (2607:fea8:3edf:49b0:84d3:21cc:478c:efa7) by YTXPR0101CA0020.CANPRD01.PROD.OUTLOOK.COM (2603:10b6:b00::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.20.3763.9 via Frontend Transport; Mon, 18 Jan 2021 21:02:42 +0000 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: 73d4d327-d672-4c4b-884e-08d8bbf46674 X-MS-TrafficTypeDiagnostic: SN6PR12MB4767: X-MS-Exchange-Transport-Forked: True X-Microsoft-Antispam-PRVS: X-MS-Oob-TLC-OOBClassifiers: OLM:1284; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 3mwGnHm1Jce5dXoPgtdQhAP2EIYkYtucAYBXSixIkUZdhjM/OVkk8O2jZdr5JDiHF+YqWj4yLeRxq+AH35hE4vGiw39kyGpKlZX647kAcbfsVQe8Qrss/312l4L+1n6XW7tvr2zKvuNa+/CrFdTbePkVw8GgWErGje+UzoMj1uN/CiohDDrprNDUt2QDOAGtnhi4ikSduXhVNj3FvARIIgCyYJIjkWeGJxWtwOvU7yd7x+j9oJV6zTjK19Trz5BZf77s7DluOt4Ib2S9+r6Hb4nLmlMc8RMth7U3To3a4PXbqh7DgLfRxQa+nS0APEEDVYuEZjPRs0nLBSPoDO3l7X9PplsuGL8aO1cxHVbFIGE/2Fo2XcZyIWMXLVpvaGyLYdEdPjyLvxsS2MMmYEvQMw== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:SN6PR12MB4623.namprd12.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(396003)(376002)(346002)(39860400002)(366004)(136003)(86362001)(66946007)(66574015)(30864003)(6666004)(83380400001)(44832011)(8936002)(66556008)(316002)(6512007)(6486002)(36756003)(2616005)(66476007)(6506007)(4326008)(478600001)(5660300002)(7416002)(2906002)(52116002)(16526019)(54906003)(8676002)(186003); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData: =?utf-8?q?8i1Xqvu1zvH/wc+qXiiG0P/3dpKoC7?= =?utf-8?q?DQsBBXRbSKsgbsGAWHl19/kIff9Qd9nKtpVoPmzxc4zuwgHCdopF1CeuM3f40yaxd?= =?utf-8?q?YMzPUbOcnkecQT0fVd8rnryqB1p/mDoxVw0ZXsNOitaUtyAx4WAbeAOjv0x/B6eoP?= =?utf-8?q?jXlopXUM6TaQvxD52YIXW+uv3x+OXITUGn7KDuxP+VzyxM+SWGhBeznS97Sg62BjH?= =?utf-8?q?o9jLECUq6UIzzmopoAQQVEoB7+ICf21dQ8GrjsWZVXx1SP1eyYuQqYVpY7SnTgHno?= =?utf-8?q?PxCjVZFhdWXjtcqiS7hXbP8XlPHxiVIRtZHl39S5gZmrhmVwPqVAh+Na2TYw7tOzl?= =?utf-8?q?0uYivHzW6wL49WHvx9FlLNlwj8waxA9WP9SD5QPDOpn4oaTTV3WVdiWtVo7DNF39t?= =?utf-8?q?HawBBz9EBS5fyUXWKvrn0GZ6KUNve9oaBMW2rZhTUJ8MmkJA78AR2flfpLBPmitjI?= =?utf-8?q?/L8eiNF507lhx9Yei+FjQEp4YdnqSYHO4uY7BviyC60Gu4FtpIfxfSGOK4mv/EIMA?= =?utf-8?q?hV/a4eJsU13wOHOSx/B1lKVO/HWCmE76CgjKR3cnt7GG7a0KPuNfIOFGf2dHzzZTb?= =?utf-8?q?QR9ImZvAz2sfsea2huc9Z0yXe0Lw4XQkLGfmvKSWibOvgHX3TH+cYFvYKKNri7RCl?= =?utf-8?q?VoIXHwRALQfxck7egwlIZaAvYuhC6I+COFTPwF+ibwrTd5dKtB2MisGfKT7YR2J2O?= =?utf-8?q?mmUPqcVbfk0Uw3NwDa+Ufl57/PB4abDeDxWV5mDVPlcQuc1P+DW7rUCEQpxG61X2w?= =?utf-8?q?lb9WNxvrapSG94uhQx8ScNqIUFLG05c6WvvZBv69AJ9iLQgMc1cVFq2Z3wfUfliHH?= =?utf-8?q?ZBA+G2iE+Djvbs2LaGEtviVcsm++EO5xP2aVSPiCVNoUeaMQf80b36QuPk+AH9ot0?= =?utf-8?q?E4oMYqij5Lu+LuYlZ+K/MNgjvyCGV3AcZJVUJs7UJL0boUbP7Z9/gybvH8cCgi5rl?= =?utf-8?q?OPQo8nL1BxU80CY1tUOhkkvW+v5/4WGDU0Vx1k1FKRGBmQmnX+WcVOSLyjjX8wJCa?= =?utf-8?q?WmjeV4TMuljcorhBsvXakfv4nSujk+tIPsOu2sNivQyxHJehUYk+D14qv3nIIx4cO?= =?utf-8?q?MUq1H/0BH4gO9jakY?= X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-Network-Message-Id: 73d4d327-d672-4c4b-884e-08d8bbf46674 X-MS-Exchange-CrossTenant-AuthSource: SN6PR12MB4623.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 18 Jan 2021 21:02:43.9871 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 9y2P1qJLp3ie69WeasdJCWUYJxuGZgsOcvPlEXXqt6wdB5+m0kdZQF9kzu1cF2xYA+gWWTkaFAUrAA5RfP/J+w== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN6PR12MB4767 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tomeu Vizoso , gregkh@linuxfoundation.org, Steven Price , Luben Tuikov , Alyssa Rosenzweig , Russell King , Alexander.Deucher@amd.com, =?utf-8?q?Christian_K=C3=B6nig?= Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Luben Tuikov This patch does not change current behaviour. The driver's job timeout handler now returns status indicating back to the DRM layer whether the task (job) was successfully aborted or whether more time should be given to the task to complete. Default behaviour as of this patch, is preserved, except in obvious-by-comment case in the Panfrost driver, as documented below. All drivers which make use of the drm_sched_backend_ops' .timedout_job() callback have been accordingly renamed and return the would've-been default value of DRM_TASK_STATUS_ALIVE to restart the task's timeout timer--this is the old behaviour, and is preserved by this patch. In the case of the Panfrost driver, its timedout callback correctly first checks if the job had completed in due time and if so, it now returns DRM_TASK_STATUS_COMPLETE to notify the DRM layer that the task can be moved to the done list, to be freed later. In the other two subsequent checks, the value of DRM_TASK_STATUS_ALIVE is returned, as per the default behaviour. A more involved driver's solutions can be had in subequent patches. v2: Use enum as the status of a driver's job timeout callback method. v4: (By Andrey Grodzovsky) Replace DRM_TASK_STATUS_COMPLETE with DRM_TASK_STATUS_ENODEV to enable a hint to the schduler for when NOT to rearm the timeout timer. Cc: Alexander Deucher Cc: Andrey Grodzovsky Cc: Christian König Cc: Daniel Vetter Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Qiang Yu Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: Eric Anholt Reported-by: kernel test robot Signed-off-by: Luben Tuikov Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 6 ++++-- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 +++++++++- drivers/gpu/drm/lima/lima_sched.c | 4 +++- drivers/gpu/drm/panfrost/panfrost_job.c | 9 ++++++--- drivers/gpu/drm/scheduler/sched_main.c | 4 +--- drivers/gpu/drm/v3d/v3d_sched.c | 32 +++++++++++++++++--------------- include/drm/gpu_scheduler.h | 17 ++++++++++++++--- 7 files changed, 54 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index ff48101..a111326 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -28,7 +28,7 @@ #include "amdgpu.h" #include "amdgpu_trace.h" -static void amdgpu_job_timedout(struct drm_sched_job *s_job) +static enum drm_task_status amdgpu_job_timedout(struct drm_sched_job *s_job) { struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); struct amdgpu_job *job = to_amdgpu_job(s_job); @@ -41,7 +41,7 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) amdgpu_ring_soft_recovery(ring, job->vmid, s_job->s_fence->parent)) { DRM_ERROR("ring %s timeout, but soft recovered\n", s_job->sched->name); - return; + return DRM_TASK_STATUS_ALIVE; } amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti); @@ -53,10 +53,12 @@ static void amdgpu_job_timedout(struct drm_sched_job *s_job) if (amdgpu_device_should_recover_gpu(ring->adev)) { amdgpu_device_gpu_recover(ring->adev, job); + return DRM_TASK_STATUS_ALIVE; } else { drm_sched_suspend_timeout(&ring->sched); if (amdgpu_sriov_vf(adev)) adev->virt.tdr_debug = true; + return DRM_TASK_STATUS_ALIVE; } } diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index cd46c88..c495169 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -82,7 +82,8 @@ static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job) return fence; } -static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) +static enum drm_task_status etnaviv_sched_timedout_job(struct drm_sched_job + *sched_job) { struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job); struct etnaviv_gpu *gpu = submit->gpu; @@ -120,9 +121,16 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job) drm_sched_resubmit_jobs(&gpu->sched); + /* Tell the DRM scheduler that this task needs + * more time. + */ + drm_sched_start(&gpu->sched, true); + return DRM_TASK_STATUS_ALIVE; + out_no_timeout: /* restart scheduler after GPU is usable again */ drm_sched_start(&gpu->sched, true); + return DRM_TASK_STATUS_ALIVE; } static void etnaviv_sched_free_job(struct drm_sched_job *sched_job) diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index 63b4c56..66d9236 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -415,7 +415,7 @@ static void lima_sched_build_error_task_list(struct lima_sched_task *task) mutex_unlock(&dev->error_task_list_lock); } -static void lima_sched_timedout_job(struct drm_sched_job *job) +static enum drm_task_status lima_sched_timedout_job(struct drm_sched_job *job) { struct lima_sched_pipe *pipe = to_lima_pipe(job->sched); struct lima_sched_task *task = to_lima_task(job); @@ -449,6 +449,8 @@ static void lima_sched_timedout_job(struct drm_sched_job *job) drm_sched_resubmit_jobs(&pipe->base); drm_sched_start(&pipe->base, true); + + return DRM_TASK_STATUS_ALIVE; } static void lima_sched_free_job(struct drm_sched_job *job) diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 04e6f6f..10d41ac 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -432,7 +432,8 @@ static void panfrost_scheduler_start(struct panfrost_queue_state *queue) mutex_unlock(&queue->lock); } -static void panfrost_job_timedout(struct drm_sched_job *sched_job) +static enum drm_task_status panfrost_job_timedout(struct drm_sched_job + *sched_job) { struct panfrost_job *job = to_panfrost_job(sched_job); struct panfrost_device *pfdev = job->pfdev; @@ -443,7 +444,7 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job) * spurious. Bail out. */ if (dma_fence_is_signaled(job->done_fence)) - return; + return DRM_TASK_STATUS_ALIVE; dev_err(pfdev->dev, "gpu sched timeout, js=%d, config=0x%x, status=0x%x, head=0x%x, tail=0x%x, sched_job=%p", js, @@ -455,11 +456,13 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job) /* Scheduler is already stopped, nothing to do. */ if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job)) - return; + return DRM_TASK_STATUS_ALIVE; /* Schedule a reset if there's no reset in progress. */ if (!atomic_xchg(&pfdev->reset.pending, 1)) schedule_work(&pfdev->reset.work); + + return DRM_TASK_STATUS_ALIVE; } static const struct drm_sched_backend_ops panfrost_sched_ops = { diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 92637b7..73fccc5 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -527,7 +527,7 @@ void drm_sched_start(struct drm_gpu_scheduler *sched, bool full_recovery) EXPORT_SYMBOL(drm_sched_start); /** - * drm_sched_resubmit_jobs - helper to relunch job from pending ring list + * drm_sched_resubmit_jobs - helper to relaunch jobs from the pending list * * @sched: scheduler instance * @@ -561,8 +561,6 @@ void drm_sched_resubmit_jobs(struct drm_gpu_scheduler *sched) } else { s_job->s_fence->parent = fence; } - - } } EXPORT_SYMBOL(drm_sched_resubmit_jobs); diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 452682e..3740665e 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -259,7 +259,7 @@ v3d_cache_clean_job_run(struct drm_sched_job *sched_job) return NULL; } -static void +static enum drm_task_status v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job) { enum v3d_queue q; @@ -285,6 +285,8 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job) } mutex_unlock(&v3d->reset_lock); + + return DRM_TASK_STATUS_ALIVE; } /* If the current address or return address have changed, then the GPU @@ -292,7 +294,7 @@ v3d_gpu_reset_for_timeout(struct v3d_dev *v3d, struct drm_sched_job *sched_job) * could fail if the GPU got in an infinite loop in the CL, but that * is pretty unlikely outside of an i-g-t testcase. */ -static void +static enum drm_task_status v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q, u32 *timedout_ctca, u32 *timedout_ctra) { @@ -304,39 +306,39 @@ v3d_cl_job_timedout(struct drm_sched_job *sched_job, enum v3d_queue q, if (*timedout_ctca != ctca || *timedout_ctra != ctra) { *timedout_ctca = ctca; *timedout_ctra = ctra; - return; + return DRM_TASK_STATUS_ALIVE; } - v3d_gpu_reset_for_timeout(v3d, sched_job); + return v3d_gpu_reset_for_timeout(v3d, sched_job); } -static void +static enum drm_task_status v3d_bin_job_timedout(struct drm_sched_job *sched_job) { struct v3d_bin_job *job = to_bin_job(sched_job); - v3d_cl_job_timedout(sched_job, V3D_BIN, - &job->timedout_ctca, &job->timedout_ctra); + return v3d_cl_job_timedout(sched_job, V3D_BIN, + &job->timedout_ctca, &job->timedout_ctra); } -static void +static enum drm_task_status v3d_render_job_timedout(struct drm_sched_job *sched_job) { struct v3d_render_job *job = to_render_job(sched_job); - v3d_cl_job_timedout(sched_job, V3D_RENDER, - &job->timedout_ctca, &job->timedout_ctra); + return v3d_cl_job_timedout(sched_job, V3D_RENDER, + &job->timedout_ctca, &job->timedout_ctra); } -static void +static enum drm_task_status v3d_generic_job_timedout(struct drm_sched_job *sched_job) { struct v3d_job *job = to_v3d_job(sched_job); - v3d_gpu_reset_for_timeout(job->v3d, sched_job); + return v3d_gpu_reset_for_timeout(job->v3d, sched_job); } -static void +static enum drm_task_status v3d_csd_job_timedout(struct drm_sched_job *sched_job) { struct v3d_csd_job *job = to_csd_job(sched_job); @@ -348,10 +350,10 @@ v3d_csd_job_timedout(struct drm_sched_job *sched_job) */ if (job->timedout_batches != batches) { job->timedout_batches = batches; - return; + return DRM_TASK_STATUS_ALIVE; } - v3d_gpu_reset_for_timeout(v3d, sched_job); + return v3d_gpu_reset_for_timeout(v3d, sched_job); } static const struct drm_sched_backend_ops v3d_bin_sched_ops = { diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 975e8a6..3ba36bc 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -206,6 +206,11 @@ static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job, return s_job && atomic_inc_return(&s_job->karma) > threshold; } +enum drm_task_status { + DRM_TASK_STATUS_ENODEV, + DRM_TASK_STATUS_ALIVE +}; + /** * struct drm_sched_backend_ops * @@ -230,10 +235,16 @@ struct drm_sched_backend_ops { struct dma_fence *(*run_job)(struct drm_sched_job *sched_job); /** - * @timedout_job: Called when a job has taken too long to execute, - * to trigger GPU recovery. + * @timedout_job: Called when a job has taken too long to execute, + * to trigger GPU recovery. + * + * Return DRM_TASK_STATUS_ALIVE, if the task (job) is healthy + * and executing in the hardware, i.e. it needs more time. + * + * Return DRM_TASK_STATUS_ENODEV, if the task (job) has + * been aborted. */ - void (*timedout_job)(struct drm_sched_job *sched_job); + enum drm_task_status (*timedout_job)(struct drm_sched_job *sched_job); /** * @free_job: Called once the job's finished fence has been signaled