From patchwork Mon Jun 20 22:02:58 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrey Grodzovsky
X-Patchwork-Id: 12888313
From: Andrey Grodzovsky
Subject: [PATCH 1/5] drm/amdgpu: Fix possible refcount leak for release of external_hw_fence
Date: Mon, 20 Jun 2022 18:02:58 -0400
Message-ID: <20220620220302.86389-2-andrey.grodzovsky@amd.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220620220302.86389-1-andrey.grodzovsky@amd.com>
References: <20220620220302.86389-1-andrey.grodzovsky@amd.com>
X-BeenThere: dri-devel@lists.freedesktop.org
List-Id: Direct Rendering Infrastructure - Development
Cc: jingwen.chen2@amd.com, Christian.Koenig@amd.com, monk.liu@amd.com, yiqing.yao@amd.com

Problem:
In amdgpu_job_submit_direct the refcount should drop by 2, but it
drops only by 1:

amdgpu_ib_sched->emit -> refcount 1 from first fence init
dma_fence_get -> refcount 2
dma_fence_put -> refcount 1

Fix:
Add put for external_hw_fence in amdgpu_job_free/free_cb

Signed-off-by: Andrey Grodzovsky
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 10aa073600d4..58568fdde2d0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -152,8 +152,10 @@ static void amdgpu_job_free_cb(struct drm_sched_job *s_job)
 	/* only put the hw fence if has embedded fence */
 	if (job->hw_fence.ops != NULL)
 		dma_fence_put(&job->hw_fence);
-	else
+	else {
+		dma_fence_put(job->external_hw_fence);
 		kfree(job);
+	}
 }

 void amdgpu_job_free(struct amdgpu_job *job)
@@ -165,8 +167,10 @@ void amdgpu_job_free(struct amdgpu_job *job)
 	/* only put the hw fence if has embedded fence */
 	if (job->hw_fence.ops != NULL)
 		dma_fence_put(&job->hw_fence);
-	else
+	else {
+		dma_fence_put(job->external_hw_fence);
 		kfree(job);
+	}
 }

 int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,

From patchwork Mon Jun 20 22:02:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Andrey Grodzovsky
X-Patchwork-Id: 12888314
From: Andrey Grodzovsky
Subject: [PATCH 2/5] drm/amdgpu: Add put fence in amdgpu_fence_driver_clear_job_fences
Date: Mon, 20 Jun 2022 18:02:59 -0400
Message-ID: <20220620220302.86389-3-andrey.grodzovsky@amd.com>
In-Reply-To: <20220620220302.86389-1-andrey.grodzovsky@amd.com>
References: <20220620220302.86389-1-andrey.grodzovsky@amd.com>
Cc: jingwen.chen2@amd.com, Christian.Koenig@amd.com, monk.liu@amd.com, yiqing.yao@amd.com

This function should drop the fence refcount when it extracts the
fence from the fence array, just as it's done in amdgpu_fence_process.

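The rule in the message above can be sketched in user space. This is only a toy model under stated assumptions: toy_fence, toy_fence_get, toy_fence_put, toy_clear_slots and toy_leftover_refs are hypothetical names made up for illustration and are not the dma_fence or amdgpu API. The array slot is assumed to own one reference, so clearing the slot without a put leaves that reference behind.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted fence; not struct dma_fence. */
struct toy_fence {
	int refcount;
	int released;	/* set once the last reference is dropped */
};

static void toy_fence_init(struct toy_fence *f)
{
	f->refcount = 1;	/* creator's reference */
	f->released = 0;
}

static struct toy_fence *toy_fence_get(struct toy_fence *f)
{
	f->refcount++;
	return f;
}

static void toy_fence_put(struct toy_fence *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->released = 1;
}

#define TOY_NUM_SLOTS 4

/* Clear every occupied slot; drop the slot's reference only if asked to. */
static void toy_clear_slots(struct toy_fence *slots[TOY_NUM_SLOTS], int put)
{
	for (size_t i = 0; i < TOY_NUM_SLOTS; i++) {
		struct toy_fence *old = slots[i];

		if (!old)
			continue;
		slots[i] = NULL;
		if (put)
			toy_fence_put(old);
	}
}

/* Returns the refcount left after emit, clear and the creator's put. */
static int toy_leftover_refs(int put_on_clear)
{
	struct toy_fence f;
	struct toy_fence *slots[TOY_NUM_SLOTS] = { NULL };

	toy_fence_init(&f);
	slots[0] = toy_fence_get(&f);	/* the array takes its own reference */
	toy_clear_slots(slots, put_on_clear);
	toy_fence_put(&f);		/* creator drops its reference */
	return f.refcount;
}
```

In this model toy_leftover_refs(0) leaves one reference outstanding (the leak), while toy_leftover_refs(1) reaches zero.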
Signed-off-by: Andrey Grodzovsky
Reviewed-by: Christian König
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 957437a5558c..a9ae3beaa1d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -595,8 +595,10 @@ void amdgpu_fence_driver_clear_job_fences(struct amdgpu_ring *ring)
 	for (i = 0; i <= ring->fence_drv.num_fences_mask; i++) {
 		ptr = &ring->fence_drv.fences[i];
 		old = rcu_dereference_protected(*ptr, 1);
-		if (old && old->ops == &amdgpu_job_fence_ops)
+		if (old && old->ops == &amdgpu_job_fence_ops) {
 			RCU_INIT_POINTER(*ptr, NULL);
+			dma_fence_put(old);
+		}
 	}
 }

From patchwork Mon Jun 20 22:03:00 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrey Grodzovsky
X-Patchwork-Id: 12888316
From: Andrey Grodzovsky
Subject: [PATCH 3/5] drm/amdgpu: Prevent race between late signaled fences and GPU reset.
Date: Mon, 20 Jun 2022 18:03:00 -0400
Message-ID: <20220620220302.86389-4-andrey.grodzovsky@amd.com>
In-Reply-To: <20220620220302.86389-1-andrey.grodzovsky@amd.com>
References: <20220620220302.86389-1-andrey.grodzovsky@amd.com>
Cc: jingwen.chen2@amd.com, Christian.Koenig@amd.com, monk.liu@amd.com, yiqing.yao@amd.com

Problem:
After we start handling timed out jobs we assume their fences won't be
signaled, but we cannot be sure, and sometimes they fire late. We need
to prevent concurrent accesses to the fence array from
amdgpu_fence_driver_clear_job_fences during GPU reset and from
amdgpu_fence_process in a late EOP interrupt.

Fix:
Before accessing the fence array during GPU reset, disable the EOP
interrupt and flush all pending interrupt handlers for the amdgpu
device's interrupt line.

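The ordering described above (stop the interrupt source, touch the fence array, start the source again) can be sketched as a single-threaded toy model. Everything here is hypothetical and made up for illustration: toy_ring, toy_deliver_irq, toy_isr_toggle and toy_reset are not amdgpu code, and the interrupt source is modelled as a plain use count, which is an assumption about how the real enable and disable calls behave. A single-threaded model also has no running handlers to wait for, so the flush step appears only as a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_SLOTS 4

/* Hypothetical stand-in for a ring's fence bookkeeping; not amdgpu code. */
struct toy_ring {
	int slots[TOY_NUM_SLOTS];	/* non-zero means "fence pending" */
	int irq_users;			/* source delivers only while > 0 */
	int handled;			/* number of handler runs */
};

/* Stand-in for the interrupt handler: retires every pending slot. */
static bool toy_deliver_irq(struct toy_ring *ring)
{
	if (ring->irq_users <= 0)
		return false;	/* source disabled: the late interrupt is dropped */

	for (size_t i = 0; i < TOY_NUM_SLOTS; i++)
		ring->slots[i] = 0;
	ring->handled++;
	return true;
}

/* Mirrors the stop/start shape of the toggle helper described above. */
static void toy_isr_toggle(struct toy_ring *ring, bool stop)
{
	if (stop)
		ring->irq_users--;
	else
		ring->irq_users++;
	/*
	 * With stop == true the real code also waits for handlers that are
	 * already running; a single-threaded model has none to wait for.
	 */
}

/* Reset path: returns how many handler runs happened inside the window. */
static int toy_reset(struct toy_ring *ring)
{
	int before = ring->handled;

	toy_isr_toggle(ring, true);
	toy_deliver_irq(ring);		/* late interrupt arrives mid-reset */
	for (size_t i = 0; i < TOY_NUM_SLOTS; i++)
		ring->slots[i] = 0;	/* reset owns the array here */
	toy_isr_toggle(ring, false);

	return ring->handled - before;
}
```

In this model the late interrupt delivered inside toy_reset never runs the handler, and delivery works again once the source is toggled back on.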
Signed-off-by: Andrey Grodzovsky
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c  | 26 ++++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h   |  1 +
 3 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 2b92281dd0c1..c99541685804 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4605,6 +4605,8 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
 		amdgpu_virt_fini_data_exchange(adev);
 	}

+	amdgpu_fence_driver_isr_toggle(adev, true);
+
 	/* block all schedulers and reset given job's ring */
 	for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
 		struct amdgpu_ring *ring = adev->rings[i];
@@ -4620,6 +4622,8 @@ int amdgpu_device_pre_asic_reset(struct amdgpu_device *adev,
 		amdgpu_fence_driver_force_completion(ring);
 	}

+	amdgpu_fence_driver_isr_toggle(adev, false);
+
 	if (job && job->vm)
 		drm_sched_increase_karma(&job->base);

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index a9ae3beaa1d3..d6d54ba4c185 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -532,6 +532,32 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device *adev)
 	}
 }

+void amdgpu_fence_driver_isr_toggle(struct amdgpu_device *adev, bool stop)
+{
+	int i;
+
+	for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
+		struct amdgpu_ring *ring = adev->rings[i];
+
+		if (!ring || !ring->fence_drv.initialized || !ring->fence_drv.irq_src)
+			continue;
+
+		if (stop)
+			amdgpu_irq_put(adev, ring->fence_drv.irq_src,
+				       ring->fence_drv.irq_type);
+		else
+			amdgpu_irq_get(adev, ring->fence_drv.irq_src,
+				       ring->fence_drv.irq_type);
+	}
+
+	/* TODO Only waits for irq handlers on other CPUs, maybe local_irq_save
+	 * local_irq_local_irq_restore are needed here for local interrupts ?
+	 *
+	 */
+	if (stop)
+		synchronize_irq(adev->irq.irq);
+}
+
 void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev)
 {
 	unsigned int i, j;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index 7d89a52091c0..82c178a9033a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -143,6 +143,7 @@ signed long amdgpu_fence_wait_polling(struct amdgpu_ring *ring,
 				      uint32_t wait_seq,
 				      signed long timeout);
 unsigned amdgpu_fence_count_emitted(struct amdgpu_ring *ring);
+void amdgpu_fence_driver_isr_toggle(struct amdgpu_device *adev, bool stop);

 /*
  * Rings.

From patchwork Mon Jun 20 22:03:01 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrey Grodzovsky
X-Patchwork-Id: 12888315
From: Andrey Grodzovsky
Subject: [PATCH 4/5] drm/sched: Partial revert of 'drm/sched: Keep s_fence->parent pointer'
Date: Mon, 20 Jun 2022 18:03:01 -0400
Message-ID: <20220620220302.86389-5-andrey.grodzovsky@amd.com>
In-Reply-To: <20220620220302.86389-1-andrey.grodzovsky@amd.com>
References: <20220620220302.86389-1-andrey.grodzovsky@amd.com>
X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: YyWrG9tb6otxwR/j2PmghR8a2u2OVDn24pPP5DOhvD4MGe+2vAcTrViqccU0BMvRbm9ZLlkqYcNi6q16NuHINzaJATMQbo1RYlYhScC4teTCPTtCoNlJ/8bISSSKhWYxXKkG+AVGlLOs/uW9XcMdgXuJXDGXrH+9MFbuigBg3UK2a/3jieRZ4psCC9wJ1/PLMBSOXmm3xaWZmG8ZhcPg5uzefi2OLqWWiLdOksmZOZLFS1L3MdwWfiI7+Ji7caViZdELMZdg5hz4RhtWBfjirN0H5lAnMVK8Y5cY5V+0YXXFiCm26ZvoH/M0juj4NdHdUcH19j5IWPCyGG+IN58nCJy/EgiyQzJLxan7Gv+d2wd/sTIzdqRIF7IE33sbrjGr/bi8xwEjN3zd6lemO8Xs2Ja+gCy4RRk9V5PQA48F7mEB2we1HNvqn2WXdnjSr7W7JFo2ER6URj7rDW9uaxbB/V0KqabuhIQnVNNusjv9a7dbEmXLN+21xYBbcQIbIR09EsPNSsEDnLfVI+wsBEuSxLrMA8nw10bre4trBd4QNq31ZVLi/4ELWthUbprXjsM2TBhhBtqMTw5zfhXdrOeiPDzWPZZLUy+MnlF18fSwzxE93GVaBgGKkQoZay3Wvhh6SgtnKp8yPCfSQW+E5l/SHlK1P+g1GR0QPP/exjhuLyChYm3TPTaoi3JgiO5+EjvuheRzs1ymwHV/JlWr4aHdg3CeJzJoSpk9Xi9vfcpZ+FWR8gn/vsuQ47u9HfAxoJKUMWM2vRLgkU+p4wu+noLZXJx0/ZNPafknJS0k9Nqp1ig7sTfw91do427QjiuzQWlR5L5mTVSxMtirELqE1ARbyA== X-Forefront-Antispam-Report: CIP:165.204.84.17; CTRY:US; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:SATLEXMB04.amd.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230016)(4636009)(346002)(396003)(376002)(39860400002)(136003)(36840700001)(46966006)(40470700004)(16526019)(426003)(186003)(82740400003)(70206006)(47076005)(450100002)(1076003)(83380400001)(86362001)(336012)(40460700003)(4326008)(70586007)(356005)(5660300002)(81166007)(44832011)(2616005)(2906002)(36860700001)(40480700001)(478600001)(8936002)(41300700001)(7696005)(26005)(8676002)(36756003)(6666004)(54906003)(316002)(966005)(82310400005)(110136005)(36900700001); DIR:OUT; SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Jun 2022 22:03:42.2418 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 5e77c0b9-a4b8-4843-b7b7-08da5308bd21 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d; Ip=[165.204.84.17]; Helo=[SATLEXMB04.amd.com] 
X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT054.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: MWHPR12MB1151 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: jingwen.chen2@amd.com, Christian.Koenig@amd.com, monk.liu@amd.com, yiqing.yao@amd.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel"

Problem: The patch being reverted ('drm/sched: Keep s_fence->parent pointer') caused a negative refcount as described in [1]. In that case the parent fence had not signaled by the time of drm_sched_stop, so the job was kept in the pending list. The assumption was that such fences will not signal, and so the fence was put to account for the s_fence->parent refcount. But for amdgpu, which has an embedded HW fence (always the same parent fence), drm_sched_fence_release_scheduled was always called and would drop the count for the parent fence once more. For jobs that never signaled, this imbalance was masked by a refcount bug in amdgpu_fence_driver_clear_job_fences, which would not drop the refcount on fences that were removed from the fence driver's fences array (to balance the get taken on insertion into the array in amdgpu_fence_emit).

Fix: Revert this patch and, by setting s_job->s_fence->parent to NULL as before, prevent the extra refcount drop in amdgpu when drm_sched_fence_release_scheduled is called on job release. Also align the behaviour of drm_sched_resubmit_jobs_ext with that of drm_sched_main when submitting jobs: take a refcount for the new parent fence pointer and drop the refcount from the original kref_init at new HW fence creation (or fake new HW fence in amdgpu - see next patch).
[1] - https://lore.kernel.org/all/731b7ff1-3cc9-e314-df2a-7c51b76d4db0@amd.com/t/#r00c728fcc069b1276642c325bfa9d82bf8fa21a3 Signed-off-by: Andrey Grodzovsky Tested-by: Yiqing Yao --- drivers/gpu/drm/scheduler/sched_main.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index b81fceb0b8a2..b38394f5694f 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -419,6 +419,11 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) if (s_job->s_fence->parent && dma_fence_remove_callback(s_job->s_fence->parent, &s_job->cb)) { + /* Revert drm/sched: Keep s_fence->parent pointer, no + * need anymore for amdgpu and creates only troubles + */ + dma_fence_put(s_job->s_fence->parent); + s_job->s_fence->parent = NULL; atomic_dec(&sched->hw_rq_count); } else { /* @@ -548,7 +553,6 @@ void drm_sched_resubmit_jobs_ext(struct drm_gpu_scheduler *sched, int max) if (found_guilty && s_job->s_fence->scheduled.context == guilty_context) dma_fence_set_error(&s_fence->finished, -ECANCELED); - dma_fence_put(s_job->s_fence->parent); fence = sched->ops->run_job(s_job); i++; @@ -558,7 +562,11 @@ void drm_sched_resubmit_jobs_ext(struct drm_gpu_scheduler *sched, int max) s_job->s_fence->parent = NULL; } else { - s_job->s_fence->parent = fence; + + s_job->s_fence->parent = dma_fence_get(fence); + + /* Drop for orignal kref_init */ + dma_fence_put(fence); } } } @@ -952,6 +960,9 @@ static int drm_sched_main(void *param) if (!IS_ERR_OR_NULL(fence)) { s_fence->parent = dma_fence_get(fence); + /* Drop for original kref_init of the fence */ + dma_fence_put(fence); + r = dma_fence_add_callback(fence, &sched_job->cb, drm_sched_job_done_cb); if (r == -ENOENT) @@ -959,7 +970,6 @@ static int drm_sched_main(void *param) else if (r) DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r); - dma_fence_put(fence); } 
else { if (IS_ERR(fence)) dma_fence_set_error(&s_fence->finished, PTR_ERR(fence)); From patchwork Mon Jun 20 22:03:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Andrey Grodzovsky X-Patchwork-Id: 12888317 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2F261C433EF for ; Mon, 20 Jun 2022 22:04:00 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1ADEB10F742; Mon, 20 Jun 2022 22:03:48 +0000 (UTC) Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2083.outbound.protection.outlook.com [40.107.243.83]) by gabe.freedesktop.org (Postfix) with ESMTPS id 44AF210F726; Mon, 20 Jun 2022 22:03:45 +0000 (UTC) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=dEwIzRcj02Yu/piScyk71hp/DLOBTQJhj4YE8HMpVSMXo5ZLxNYvsag7OQAQS9zHwA8EIEHjdZGzt528sdBvFZUVQFFvEtwOP98pX8noMNPNbiAn4dCmMV3mzqiYpv5D174yZ9dyg9mXBKAGHHiwgvSopVN3Hbufs4ayFKeMktmKGHfga7DItR7215iRJCCrJ+hrrYc5TUPBhkKi/YGHaJe1/N+zkSpjO1wQqDtb5iONUN5YywYRGdNrr60JBQQexZ5ik764PGZAcEAflYve2yfO1W6OkVTJfa4HmSboSOFzBvd8cgbptiTRfrv1xmEEuzgeuPhz8oN5n7hDHyscQg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=tzdt4BEPk+gnsSjeFHPzJP6h+2XTibMnN3lym82vie8=; 
b=gNdqJ7ofqTwrXwLQdK4IBvbUwwwOxnfyWuAsAklCQiRLKyhh3NzgV46698tnAIpsmjpONm2p3Whz9qH2KFzDBlxzZ1O5qX9lyw23XBtgcgbflQNfVZqtKB8cHPCQKyuFuw9bDlu8Nttv1t5Ap52LndyyXzB9csThlGbI0jXlZp3hu/d25ycTQsX9uOKsU4tkMCy9SOh1pP37BK+HcuElew3L3I4pAlrBqLzduEhwFhQPcl85wYR0jWLoKwzamjIZcaICLT3pEUPG6s6TXDZUfHS6a9d0uID7PngJiwcdtQJvtRRW4Ft283CVjLAZF2xsrNePUwYtjkf+LpKh9tk0Eg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=lists.freedesktop.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=tzdt4BEPk+gnsSjeFHPzJP6h+2XTibMnN3lym82vie8=; b=PFOnFfQgABE60jtZ+Y7K95+4f3OXSIDJztzQo9unMBANZf9GqN2O21QDKmI5TdulsRJ962Snm/ho7pNR+qtaHcHsYbSzKG/ncfWw7XkZ9gtacue7/oI39e0aZasFbxKoGKsRdct0ceEiBBhIyvwvxuxSczh0+bzaQTrzKTX+16o= Received: from MWH0EPF00056D12.namprd21.prod.outlook.com (2603:10b6:30f:fff2:0:1:0:15) by CH2PR12MB3991.namprd12.prod.outlook.com (2603:10b6:610:2f::26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5353.18; Mon, 20 Jun 2022 22:03:43 +0000 Received: from CO1NAM11FT054.eop-nam11.prod.protection.outlook.com (2a01:111:f400:7eab::206) by MWH0EPF00056D12.outlook.office365.com (2603:1036:d20::b) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5395.3 via Frontend Transport; Mon, 20 Jun 2022 22:03:43 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; 
pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1NAM11FT054.mail.protection.outlook.com (10.13.174.70) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.5353.14 via Frontend Transport; Mon, 20 Jun 2022 22:03:42 +0000 Received: from agrodzovsky-All-Series.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.28; Mon, 20 Jun 2022 17:03:41 -0500 From: Andrey Grodzovsky To: , Subject: [PATCH 5/5] drm/amdgpu: Follow up change to previous drm scheduler change. Date: Mon, 20 Jun 2022 18:03:02 -0400 Message-ID: <20220620220302.86389-6-andrey.grodzovsky@amd.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220620220302.86389-1-andrey.grodzovsky@amd.com> References: <20220620220302.86389-1-andrey.grodzovsky@amd.com> MIME-Version: 1.0 X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: 07d3f879-6229-46ad-c612-08da5308bd72 X-MS-TrafficTypeDiagnostic: CH2PR12MB3991:EE_ X-Microsoft-Antispam-PRVS: X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
KFeF6tnmgRZAb8GIjUnMfdkUZykC2/m9evN5zG3SNjVj0zComcz8wth/4PXGdP+1doCv778vzWZtsqNA7b23jiBk676ojFxX7YsiE/nMZ2ZpKNPQVlRKxEuYmHVZDbvsrDe8wtq8F/y35TldzuohtYvrq+G2ZKS383kEzMkPUPyRVCMjyWajNfLpIiUNGKDU0Pq9AS9aHD2AqG53xdmbNkwE9L5mnfuHt7aF6T3mQKtvnv2kC7WzAdAfsvI0WBo4YXMfQbRC9oKui5QvgcoxdKduHVOH/MFJ4grH3v/EgUhMyoNENox1KX9CMo1Gndhn+pf4+QgInZ1xeYrje2UYXZHlJWpdyxgUMzJ7NoWP0+7ImnSZtTL+39Mh2jWsouR9flVtm834b8s2kflM5m2xODhHwAwN8idVQ0IKzBY6k5RNYVwSQZBgwyhLCXvhLbJXu3yDsvS6oFbgMF1UpBoIeq+1eVOK3DSvnqOV82lNPcsY6FZZGYNZqF8iNNcwxkaFyXvAfeQe3OT13uEOBHS5vYyDWw5vO2mILudGSMkEh37zfDYlTH7icB4kCfQkK2uz9bQxsLTuk3WU1ZX3tks+f/JWBsO6UBDVTbC7gzdu2f9emCEgAhzhScp/KyZFkE+2SaULBhPQkmPiBYchQtuAAsmgED+MuI3yegmZN0SJqbl2blmaxdDkqF8fg/G7JbR5WIvWCjgJhI7LbGBRvjnVdzpcWmQRN3nVBCfQRTagOeobP/Ie8TYJVhshPSH5HfTm X-Forefront-Antispam-Report: CIP:165.204.84.17; CTRY:US; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:SATLEXMB04.amd.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230016)(4636009)(136003)(39860400002)(376002)(396003)(346002)(40470700004)(46966006)(36840700001)(450100002)(7696005)(8676002)(44832011)(4326008)(54906003)(86362001)(110136005)(82740400003)(70586007)(426003)(356005)(70206006)(2906002)(26005)(47076005)(8936002)(5660300002)(186003)(316002)(81166007)(478600001)(6666004)(36860700001)(16526019)(83380400001)(336012)(41300700001)(2616005)(1076003)(36756003)(40460700003)(40480700001)(82310400005)(36900700001); DIR:OUT; SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Jun 2022 22:03:42.7731 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 07d3f879-6229-46ad-c612-08da5308bd72 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d; Ip=[165.204.84.17]; Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1NAM11FT054.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: 
HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH2PR12MB3991 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: jingwen.chen2@amd.com, Christian.Koenig@amd.com, monk.liu@amd.com, yiqing.yao@amd.com Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel"

Align refcount behaviour for the amdgpu_job embedded HW fence with classic pointer-style HW fences by increasing the refcount each time emit is called, so that amdgpu code doesn't need workarounds using amdgpu_job.job_run_counter to keep the HW fence refcount balanced.

Also, since the previous patch resumed setting s_fence->parent to NULL in drm_sched_stop, switch to directly checking whether job->hw_fence is signaled in order to short-circuit the reset if the job has already signaled.

Signed-off-by: Andrey Grodzovsky Tested-by: Yiqing Yao --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 23 ++++++++++++++++------ drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 7 ++++++- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 4 ---- 4 files changed, 25 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 513c57f839d8..447bd92c4856 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -684,6 +684,8 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev, goto err_ib_sched; } + /* Drop the initial kref_init count (see drm_sched_main as example) */ + dma_fence_put(f); ret = dma_fence_wait(f, false); err_ib_sched: diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index c99541685804..f9718119834f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -5009,16 +5009,28 @@ static void
amdgpu_device_recheck_guilty_jobs( /* clear job's guilty and depend the folowing step to decide the real one */ drm_sched_reset_karma(s_job); - /* for the real bad job, it will be resubmitted twice, adding a dma_fence_get - * to make sure fence is balanced */ - dma_fence_get(s_job->s_fence->parent); drm_sched_resubmit_jobs_ext(&ring->sched, 1); + if (!s_job->s_fence->parent) { + DRM_WARN("Failed to get a HW fence for job!"); + continue; + } + ret = dma_fence_wait_timeout(s_job->s_fence->parent, false, ring->sched.timeout); if (ret == 0) { /* timeout */ DRM_ERROR("Found the real bad job! ring:%s, job_id:%llx\n", ring->sched.name, s_job->id); + + /* Clear this failed job from fence array */ + amdgpu_fence_driver_clear_job_fences(ring); + + /* Since the job won't signal and we go for + * another resubmit drop this parent pointer + */ + dma_fence_put(s_job->s_fence->parent); + s_job->s_fence->parent = NULL; + /* set guilty */ drm_sched_increase_karma(s_job); retry: @@ -5047,7 +5059,6 @@ static void amdgpu_device_recheck_guilty_jobs( /* got the hw fence, signal finished fence */ atomic_dec(ring->sched.score); - dma_fence_put(s_job->s_fence->parent); dma_fence_get(&s_job->s_fence->finished); dma_fence_signal(&s_job->s_fence->finished); dma_fence_put(&s_job->s_fence->finished); @@ -5220,8 +5231,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, * * job->base holds a reference to parent fence */ - if (job && job->base.s_fence->parent && - dma_fence_is_signaled(job->base.s_fence->parent)) { + if (job && (job->hw_fence.ops != NULL) && + dma_fence_is_signaled(&job->hw_fence)) { job_signaled = true; dev_info(adev->dev, "Guilty job already signaled, skipping HW reset"); goto skip_hw_reset; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index d6d54ba4c185..9bd4e18212fc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -164,11 +164,16 @@ int 
amdgpu_fence_emit(struct amdgpu_ring *ring, struct dma_fence **f, struct amd if (job && job->job_run_counter) { /* reinit seq for resubmitted jobs */ fence->seqno = seq; + /* TO be inline with external fence creation and other drivers */ + dma_fence_get(fence); } else { - if (job) + if (job) { dma_fence_init(fence, &amdgpu_job_fence_ops, &ring->fence_drv.lock, adev->fence_context + ring->idx, seq); + /* Against remove in amdgpu_job_{free, free_cb} */ + dma_fence_get(fence); + } else dma_fence_init(fence, &amdgpu_fence_ops, &ring->fence_drv.lock, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 58568fdde2d0..638e1d600258 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -267,10 +267,6 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job) DRM_ERROR("Error scheduling IBs (%d)\n", r); } - if (!job->job_run_counter) - dma_fence_get(fence); - else if (finished->error < 0) - dma_fence_put(&job->hw_fence); job->job_run_counter++; amdgpu_job_free_resources(job);