From patchwork Wed May 20 21:29:08 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Oded Gabbay X-Patchwork-Id: 6450201 Return-Path: X-Original-To: patchwork-dri-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id D347E9F318 for ; Wed, 20 May 2015 21:29:41 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id CC8EC20425 for ; Wed, 20 May 2015 21:29:40 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id BB30D2041E for ; Wed, 20 May 2015 21:29:39 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CC01B6E7CD; Wed, 20 May 2015 14:29:38 -0700 (PDT) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mail-wg0-f46.google.com (mail-wg0-f46.google.com [74.125.82.46]) by gabe.freedesktop.org (Postfix) with ESMTP id 3BF076E7CD for ; Wed, 20 May 2015 14:29:36 -0700 (PDT) Received: by wgjc11 with SMTP id c11so66236381wgj.0 for ; Wed, 20 May 2015 14:29:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=S3XQbH16+bPOTGaeo467ufQd37DOvTMV8T5PfsvWO2I=; b=Y4cMRys9KkiAdDbYnp35erqJ/uT10re3kHGnnDVIqUpCZtSyyV3hiZpD9QFttOapPd QF3fiHyLEiS2mbpV76iU/wEOSepC/FMS2TXIHKrxdiOQTJyRtc+vL1MTHwz1S7UA+1KG Eky01SH3hlqwJnEniXGfpHHM4MZ5j/u0eIm9hFxwQBuu+dz/xxOlkcJV4nMtaZ/yFHkW sXfayYzbbFZV49eOZfkk5yvPyrq0nIfzs63xzjH7fQbGgPy4bRmtb8W8XcpdavzM4tX9 x5h1f+oWvGFYrBf+/dOPo/O3XThRwJ2cFkEoJXq50qZD8UFyt64Vmd/x3Zw0f421kd0c ms+g== X-Received: by 10.194.134.40 with SMTP id ph8mr69228768wjb.147.1432157375610; Wed, 20 May 2015 14:29:35 -0700 (PDT) Received: from odedg-home.localdomain (87.68.31.122.cable.012.net.il. [87.68.31.122]) by mx.google.com with ESMTPSA id k9sm5427491wia.6.2015.05.20.14.29.34 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 May 2015 14:29:34 -0700 (PDT) From: Oded Gabbay To: dri-devel@lists.freedesktop.org, alexdeucher@gmail.com Subject: [PATCH 11/11] drm/amdkfd: Enforce kill all waves on process termination Date: Thu, 21 May 2015 00:29:08 +0300 Message-Id: <1432157348-26686-12-git-send-email-oded.gabbay@gmail.com> X-Mailer: git-send-email 2.1.0 In-Reply-To: <1432157348-26686-1-git-send-email-oded.gabbay@gmail.com> References: <1432157348-26686-1-git-send-email-oded.gabbay@gmail.com> X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_MED, T_DKIM_INVALID, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Ben Goz This commit makes sure that on process termination, after we're destroying all the active queues, we're killing all the existing wave front of the current process. By doing this we're making sure that if any of the CUs were blocked by infinite loop we're enforcing it to end the shader explicitly. Signed-off-by: Ben Goz Signed-off-by: Oded Gabbay --- drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 65 ++++++++++++++++++++++ .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 8 ++- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 7 +++ drivers/gpu/drm/amd/amdkfd/kfd_process.c | 8 +++ 4 files changed, 87 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c index 00d8fcf..96153f2 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c @@ -792,6 +792,71 @@ static int dbgdev_wave_control_nodiq(struct kfd_dbgdev *dbgdev, reg_sq_cmd.u32All); } +int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p) +{ + int status = 0; + unsigned int vmid; + union SQ_CMD_BITS reg_sq_cmd; + union GRBM_GFX_INDEX_BITS reg_gfx_index; + struct kfd_process_device *pdd; + struct dbg_wave_control_info wac_info; + int temp; + int first_vmid_to_scan = 8; + int last_vmid_to_scan = 15; + + first_vmid_to_scan = ffs(dev->shared_resources.compute_vmid_bitmap) - 1; + temp = dev->shared_resources.compute_vmid_bitmap >> first_vmid_to_scan; + last_vmid_to_scan = first_vmid_to_scan + ffz(temp); + + reg_sq_cmd.u32All = 0; + status = 0; + + wac_info.mode = HSA_DBG_WAVEMODE_BROADCAST_PROCESS; + wac_info.operand = HSA_DBG_WAVEOP_KILL; + + pr_debug("Killing all process wavefronts\n"); + + /* Scan all registers in the range ATC_VMID8_PASID_MAPPING .. + * ATC_VMID15_PASID_MAPPING + * to check which VMID the current process is mapped to. */ + + for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) { + if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid + (dev->kgd, vmid)) { + if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid + (dev->kgd, vmid) == p->pasid) { + pr_debug("Killing wave fronts of vmid %d and pasid %d\n", + vmid, p->pasid); + break; + } + } + } + + if (vmid > last_vmid_to_scan) { + pr_err("amdkfd: didn't found vmid for pasid (%d)\n", p->pasid); + return -EFAULT; + } + + /* taking the VMID for that process on the safe way using PDD */ + pdd = kfd_get_process_device_data(dev, p); + if (!pdd) + return -EFAULT; + + status = dbgdev_wave_control_set_registers(&wac_info, ®_sq_cmd, + ®_gfx_index); + if (status != 0) + return -EINVAL; + + /* for non DIQ we need to patch the VMID: */ + reg_sq_cmd.bits.vm_id = vmid; + + dev->kfd2kgd->wave_control_execute(dev->kgd, + reg_gfx_index.u32All, + reg_sq_cmd.u32All); + + return 0; +} + void kfd_dbgdev_init(struct kfd_dbgdev *pdbgdev, struct kfd_dev *pdev, enum DBGDEV_TYPE type) { diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index b08ec05..547b0a5 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -946,6 +946,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm, { int retval; enum kfd_preempt_type_filter preempt_type; + struct kfd_process *p; BUG_ON(!dqm); @@ -977,8 +978,13 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm, pm_send_query_status(&dqm->packets, dqm->fence_gpu_addr, KFD_FENCE_COMPLETED); /* should be timed out */ - amdkfd_fence_wait_timeout(dqm->fence_addr, KFD_FENCE_COMPLETED, + retval = amdkfd_fence_wait_timeout(dqm->fence_addr, KFD_FENCE_COMPLETED, QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS); + if (retval != 0) { + p = kfd_get_process(current); + p->reset_wavefronts = true; + goto out; + } pm_release_ib(&dqm->packets); dqm->active_runlist = false; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 39a2cf0..2aa0e49 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -518,6 +518,11 @@ struct kfd_process { event_pages */ u32 next_nonsignal_event_id; size_t signal_event_count; + /* + * This flag tells if we should reset all wavefronts on + * process termination + */ + bool reset_wavefronts; }; /** @@ -725,4 +730,6 @@ int kfd_event_create(struct file *devkfd, struct kfd_process *p, uint64_t *event_page_offset, uint32_t *event_slot_index); int kfd_event_destroy(struct kfd_process *p, uint32_t event_id); +int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p); + #endif diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index dc910af..4826ba5 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -31,6 +31,7 @@ struct mm_struct; #include "kfd_priv.h" +#include "kfd_dbgmgr.h" /* * Initial size for the array of queues. @@ -301,6 +302,8 @@ static struct kfd_process *create_process(const struct task_struct *thread) if (kfd_init_apertures(process) != 0) goto err_init_apretures; + process->reset_wavefronts = false; + return process; err_init_apretures: @@ -399,7 +402,12 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid) mutex_lock(&p->mutex); + if ((dev->dbgmgr) && (dev->dbgmgr->pasid == p->pasid)) + kfd_dbgmgr_destroy(dev->dbgmgr); + pqm_uninit(&p->pqm); + if (p->reset_wavefronts) + dbgdev_wave_reset_wavefronts(dev, p); pdd = kfd_get_process_device_data(dev, p);