From patchwork Wed May 30 19:54:17 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrey Grodzovsky
X-Patchwork-Id: 10439943
From: Andrey Grodzovsky
X-Original-To: dri-devel@lists.freedesktop.org
Cc: Christian.Koenig@amd.com
Subject: [PATCH 1/2] drm/scheduler: Avoid using wait_event_killable for dying process.
Date: Wed, 30 May 2018 15:54:17 -0400
Message-ID: <1527710058-11896-1-git-send-email-andrey.grodzovsky@amd.com>
X-Mailer: git-send-email 2.7.4
List-Id: Direct Rendering Infrastructure - Development

A dying process might be blocked from receiving any more signals, so
avoid using wait_event_killable for it.

Also retire entity->fini_status and just check the SW queue; if it is
not empty, do the fallback cleanup.

Also handle the entity->last_scheduled == NULL case, which happens when
the HW ring is already hung when a new entity tries to enqueue jobs.

Signed-off-by: Andrey Grodzovsky
---
 drivers/gpu/drm/scheduler/gpu_scheduler.c | 47 ++++++++++++++++++++++---------
 include/drm/gpu_scheduler.h               |  1 -
 2 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index 44d4807..4d038f9 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -135,7 +135,6 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
 	entity->rq = rq;
 	entity->sched = sched;
 	entity->guilty = guilty;
-	entity->fini_status = 0;
 	entity->last_scheduled = NULL;
 
 	spin_lock_init(&entity->rq_lock);
@@ -173,7 +172,8 @@ static bool drm_sched_entity_is_initialized(struct drm_gpu_scheduler *sched,
 static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 {
 	rmb();
-	if (spsc_queue_peek(&entity->job_queue) == NULL)
+
+	if (entity->rq == NULL || spsc_queue_peek(&entity->job_queue) == NULL)
 		return true;
 
 	return false;
@@ -227,12 +227,16 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
 	 * The client will not queue more IBs during this fini, consume existing
 	 * queued IBs or discard them on SIGKILL
 	 */
-	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
-		entity->fini_status = -ERESTARTSYS;
+	if ((current->flags & PF_EXITING))
+		wait_event_timeout(sched->job_scheduled,
+				drm_sched_entity_is_idle(entity), msecs_to_jiffies(1000));
 	else
-		entity->fini_status = wait_event_killable(sched->job_scheduled,
-					drm_sched_entity_is_idle(entity));
-	drm_sched_entity_set_rq(entity, NULL);
+		wait_event_killable(sched->job_scheduled, drm_sched_entity_is_idle(entity));
+
+
+	/* For killed process disable any more IBs enqueue right now */
+	if ((current->flags & PF_EXITING) && (current->exit_code == SIGKILL))
+		drm_sched_entity_set_rq(entity, NULL);
 }
 EXPORT_SYMBOL(drm_sched_entity_do_release);
 
@@ -247,7 +251,13 @@ EXPORT_SYMBOL(drm_sched_entity_do_release);
 void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
 			   struct drm_sched_entity *entity)
 {
-	if (entity->fini_status) {
+
+	drm_sched_entity_set_rq(entity, NULL);
+
+	/* Consumption of existing IBs wasn't completed. Forcefully
+	 * remove them here.
+	 */
+	if (spsc_queue_peek(&entity->job_queue)) {
 		struct drm_sched_job *job;
 		int r;
 
@@ -267,12 +277,22 @@ void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
 			struct drm_sched_fence *s_fence = job->s_fence;
 			drm_sched_fence_scheduled(s_fence);
 			dma_fence_set_error(&s_fence->finished, -ESRCH);
-			r = dma_fence_add_callback(entity->last_scheduled, &job->finish_cb,
-							drm_sched_entity_kill_jobs_cb);
-			if (r == -ENOENT)
+
+			/*
+			 * When pipe is hanged by older entity, new entity might
+			 * not even have chance to submit it's first job to HW
+			 * and so entity->last_scheduled will remain NULL
+			 */
+			if (!entity->last_scheduled)
 				drm_sched_entity_kill_jobs_cb(NULL, &job->finish_cb);
-			else if (r)
-				DRM_ERROR("fence add callback failed (%d)\n", r);
+			else {
+				r = dma_fence_add_callback(entity->last_scheduled, &job->finish_cb,
+								drm_sched_entity_kill_jobs_cb);
+				if (r == -ENOENT)
+					drm_sched_entity_kill_jobs_cb(NULL, &job->finish_cb);
+				else if (r)
+					DRM_ERROR("fence add callback failed (%d)\n", r);
+			}
 		}
 	}
 
@@ -713,6 +733,7 @@ static int drm_sched_main(void *param)
 			continue;
 
 		sched_job = drm_sched_entity_pop_job(entity);
+
 		if (!sched_job)
 			continue;
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index dec6558..d220ac9 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -64,7 +64,6 @@ struct drm_sched_entity {
 	struct dma_fence *dependency;
 	struct dma_fence_cb cb;
 	atomic_t *guilty; /* points to ctx's guilty */
-	int fini_status;
 	struct dma_fence *last_scheduled;
 };
 
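
Illustrative note, not part of the patch: the control flow described in the
commit message can be modelled in plain userspace C. This is only a sketch
under stated assumptions. Every name below (model_task, model_entity,
model_pick_wait, and so on) is hypothetical and is not a kernel API. The
actual waiting is left out, and only the decisions the patch makes are
modelled: which kind of wait is chosen, when the entity is detached from its
run queue, and when the fallback cleanup has leftover jobs to remove.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum wait_kind { WAIT_KILLABLE, WAIT_TIMEOUT };

struct model_task {
	bool exiting;   /* stands in for current->flags & PF_EXITING */
	bool sigkilled; /* stands in for current->exit_code == SIGKILL */
};

struct model_entity {
	bool has_rq;        /* stands in for entity->rq != NULL */
	size_t queued_jobs; /* stands in for the length of the SW job queue */
};

/* Mirrors the patched idle test: no run queue, or an empty job queue. */
bool model_entity_is_idle(const struct model_entity *e)
{
	assert(e != NULL);
	return !e->has_rq || e->queued_jobs == 0;
}

/* An exiting task gets a bounded wait; any other task a killable wait. */
enum wait_kind model_pick_wait(const struct model_task *t)
{
	assert(t != NULL);
	return t->exiting ? WAIT_TIMEOUT : WAIT_KILLABLE;
}

/* Release step: only a killed task is detached from its run queue here. */
void model_do_release(const struct model_task *t, struct model_entity *e)
{
	assert(t != NULL && e != NULL);
	if (t->exiting && t->sigkilled)
		e->has_rq = false;
}

/*
 * Cleanup step: always detach, then report how many queued jobs were left
 * over and had to be removed forcefully (0 means the queue had drained).
 */
size_t model_cleanup(struct model_entity *e)
{
	size_t leftover;

	assert(e != NULL);
	e->has_rq = false;
	leftover = e->queued_jobs;
	e->queued_jobs = 0;
	return leftover;
}
```

In this model, a killed task with three queued jobs picks the bounded wait,
is detached during release (so the entity reads as idle), and then has three
jobs removed by the cleanup step. A task that exits normally keeps its run
queue during release, and the cleanup step only has work to do if the queue
did not drain in time.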