From patchwork Mon Nov 25 14:10:40 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Andrey Grodzovsky
X-Patchwork-Id: 11260463
From: Andrey Grodzovsky
To:
Subject: [PATCH v3 1/2] drm/sched: Avoid job cleanup if sched thread is parked.
Date: Mon, 25 Nov 2019 09:10:40 -0500
Message-ID: <1574691041-5499-1-git-send-email-andrey.grodzovsky@amd.com>
X-Mailer: git-send-email 2.7.4
List-Id: Direct Rendering Infrastructure - Development
Cc: Emily.Deng@amd.com, amd-gfx@lists.freedesktop.org,
 dri-devel@lists.freedesktop.org, Christian.Koenig@amd.com

When the sched thread is parked we assume ring_mirror_list is not
accessed from here.
Signed-off-by: Andrey Grodzovsky
Reviewed-by: Christian König
---
 drivers/gpu/drm/scheduler/sched_main.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index d4cc728..6774955 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -635,9 +635,13 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 	struct drm_sched_job *job;
 	unsigned long flags;
 
-	/* Don't destroy jobs while the timeout worker is running */
-	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-	    !cancel_delayed_work(&sched->work_tdr))
+	/*
+	 * Don't destroy jobs while the timeout worker is running OR thread
+	 * is being parked and hence assumed to not touch ring_mirror_list
+	 */
+	if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
+	    !cancel_delayed_work(&sched->work_tdr)) ||
+	    __kthread_should_park(sched->thread))
 		return NULL;
 
 	spin_lock_irqsave(&sched->job_list_lock, flags);

From patchwork Mon Nov 25 14:10:41 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Andrey Grodzovsky
X-Patchwork-Id: 11260465
From: Andrey Grodzovsky
To:
Subject: [PATCH v3 2/2] drm/scheduler: Avoid accessing freed bad job.
Date: Mon, 25 Nov 2019 09:10:41 -0500
Message-ID: <1574691041-5499-2-git-send-email-andrey.grodzovsky@amd.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1574691041-5499-1-git-send-email-andrey.grodzovsky@amd.com>
References: <1574691041-5499-1-git-send-email-andrey.grodzovsky@amd.com>
List-Id: Direct Rendering Infrastructure - Development
Cc: Emily.Deng@amd.com, amd-gfx@lists.freedesktop.org,
 dri-devel@lists.freedesktop.org, Christian.Koenig@amd.com

Problem:
Due to a race between drm_sched_cleanup_jobs in the sched thread and
drm_sched_job_timedout in the timeout work, there is a possibility that
the bad job was already freed while still being accessed from the
timeout thread.

Fix:
Instead of just peeking at the bad job in the mirror list, remove it
from the list under lock and then put it back later, when we are
guaranteed that no race with the main sched thread is possible, which
is after the thread is parked.
v2: Lock around processing ring_mirror_list in drm_sched_cleanup_jobs.

v3: Rebase on top of drm-misc-next. v2 is not needed anymore as
drm_sched_cleanup_jobs already has a lock there.

Signed-off-by: Andrey Grodzovsky
Reviewed-by: Christian König
Tested-by: Emily Deng
---
 drivers/gpu/drm/scheduler/sched_main.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 6774955..a604dfa 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -284,10 +284,24 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	unsigned long flags;
 
 	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
+
+	/*
+	 * Protects against concurrent deletion in drm_sched_cleanup_jobs that
+	 * is already in progress.
+	 */
+	spin_lock_irqsave(&sched->job_list_lock, flags);
 	job = list_first_entry_or_null(&sched->ring_mirror_list,
 				       struct drm_sched_job, node);
 
 	if (job) {
+		/*
+		 * Remove the bad job so it cannot be freed by already in progress
+		 * drm_sched_cleanup_jobs. It will be reinserted back after sched->thread
+		 * is parked at which point it's safe.
+		 */
+		list_del_init(&job->node);
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
+
 		job->sched->ops->timedout_job(job);
 
 		/*
@@ -298,6 +312,8 @@ static void drm_sched_job_timedout(struct work_struct *work)
 			job->sched->ops->free_job(job);
 			sched->free_guilty = false;
 		}
+	} else {
+		spin_unlock_irqrestore(&sched->job_list_lock, flags);
 	}
 
 	spin_lock_irqsave(&sched->job_list_lock, flags);
@@ -370,6 +386,19 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 	kthread_park(sched->thread);
 
 	/*
+	 * Reinsert back the bad job here - now it's safe as drm_sched_cleanup_jobs
+	 * cannot race against us and release the bad job at this point - we parked
+	 * (waited for) any in progress (earlier) cleanups and any later ones will
+	 * bail out due to sched->thread being parked.
+	 */
+	if (bad && bad->sched == sched)
+		/*
+		 * Add at the head of the queue to reflect it was the earliest
+		 * job extracted.
+		 */
+		list_add(&bad->node, &sched->ring_mirror_list);
+
+	/*
 	 * Iterate the job list from later to earlier one and either deactive
 	 * their HW callbacks or remove them from mirror list if they already
 	 * signaled.
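
Note on the approach in this series: the three steps (detach the bad job
under the list lock, let cleanup bail out while the thread is parked, and
reinsert the job at the head once parked) can be illustrated outside the
kernel. The following is only a user-space sketch under stated assumptions,
not the scheduler code: every name in it (struct sched, detach_first_job,
get_cleanup_job, reinsert_bad_job, the parked flag) is hypothetical, a
pthread mutex stands in for job_list_lock, and an atomic flag stands in for
the parked state of sched->thread.

```c
/* User-space sketch only: hypothetical names, not the kernel implementation. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static bool list_is_empty(const struct node *head) { return head->next == head; }

/* Insert n right after head, i.e. at the front of the list. */
static void list_add_front(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n and leave it self-linked so that a later re-add is safe. */
static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

struct job { struct node node; int id; };	/* node must stay the first member */

struct sched {
	pthread_mutex_t lock;	/* stands in for job_list_lock */
	struct node jobs;	/* stands in for ring_mirror_list */
	atomic_bool parked;	/* stands in for the parked scheduler thread */
};

static void sched_init(struct sched *s)
{
	pthread_mutex_init(&s->lock, NULL);
	node_init(&s->jobs);
	atomic_init(&s->parked, false);
}

/* Timeout path: detach the oldest job under the lock so cleanup cannot reach it. */
static struct job *detach_first_job(struct sched *s)
{
	struct job *job = NULL;

	pthread_mutex_lock(&s->lock);
	if (!list_is_empty(&s->jobs)) {
		job = (struct job *)s->jobs.next;	/* valid: node is the first member */
		node_del_init(&job->node);
	}
	pthread_mutex_unlock(&s->lock);
	return job;
}

/* Cleanup path: bail out while parked, otherwise pop the oldest job. */
static struct job *get_cleanup_job(struct sched *s)
{
	if (atomic_load(&s->parked))
		return NULL;
	return detach_first_job(s);
}

/* Stop path: only once parked is it safe to put the bad job back at the head. */
static void reinsert_bad_job(struct sched *s, struct job *bad)
{
	assert(atomic_load(&s->parked));
	pthread_mutex_lock(&s->lock);
	list_add_front(&bad->node, &s->jobs);
	pthread_mutex_unlock(&s->lock);
}
```

In this sketch a job returned by detach_first_job() is invisible to
get_cleanup_job() until reinsert_bad_job() runs, and that function asserts
that the parked flag is already set, mirroring the ordering the commit
message relies on.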