From patchwork Tue Jul 11 11:12:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oded Gabbay
X-Patchwork-Id: 13308416
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.lore.kernel.org (Postfix) with ESMTPS id 3E63FEB64DC
 for ; Tue, 11 Jul 2023 11:12:37 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
 by gabe.freedesktop.org (Postfix) with ESMTP id 338F910E370;
 Tue, 11 Jul 2023 11:12:36 +0000 (UTC)
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1])
 by gabe.freedesktop.org (Postfix) with ESMTPS id E707510E371
 for ; Tue, 11 Jul 2023 11:12:34 +0000 (UTC)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by dfw.source.kernel.org (Postfix) with ESMTPS id A48866142E;
 Tue, 11 Jul 2023 11:12:32 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3D450C433C8;
 Tue, 11 Jul 2023 11:12:31 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
 s=k20201202; t=1689073952;
 bh=GbAVVXenpzEwyGaNST7Zo4SNR1X2b2Sx/2R8j6HsGtU=;
 h=From:To:Cc:Subject:Date:From;
 b=ZgSyAZxyaZ+Wg8DlhMfUWSBvl8vm667bsCaZ2g0dc5Fxq2idl+UoabzEgawIXaUF6
 atZzM5Bc+gU07w1IFg6hm6aRNK0n4BI9jJZJ64bRO5GmCPPzlioDb3CsWg1UXmrW3S
 qNV0W2gTkITqO2l19aqXkdnbJA+g40vWowuWCR/Nf26ibvJsYuc/4Qh0rjZB41yOM3
 enKbZzPZH+mylKkDgZMZSq85CDf0OqwFsqGP925scPAOgN1I03xas5Wh0Z1JfKUCse
 2COyHik7NuR+TF5fnH8smKGVOpW1YnB2NCAsybamTlRpDU5tmsH0D+F1tpiomXgN60
 pgYye4pVEQk5Q==
From: Oded Gabbay
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 01/12]
 accel/habanalabs/gaudi2: un-secure register for engine cores interrupt
Date: Tue, 11 Jul 2023 14:12:15 +0300
Message-Id: <20230711111226.163670-1-ogabbay@kernel.org>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Tomer Tayar
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"

From: Tomer Tayar

The F/W dynamically allocates one of the PSOC scratchpad registers for
the engine cores, so they can raise events towards the F/W.
To allow the engine cores to access this register, this register must be
non-secured.

Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 .../accel/habanalabs/gaudi2/gaudi2_security.c | 20 ++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2_security.c b/drivers/accel/habanalabs/gaudi2/gaudi2_security.c
index 2742b1f801eb..d08267e59303 100644
--- a/drivers/accel/habanalabs/gaudi2/gaudi2_security.c
+++ b/drivers/accel/habanalabs/gaudi2/gaudi2_security.c
@@ -2907,7 +2907,7 @@ static void gaudi2_init_lbw_range_registers_secure(struct hl_device *hdev)
 	 * - range 11: NIC11_CFG + *_DBG (not including TPC_DBG)
 	 *
 	 * If F/W security is not enabled:
-	 * - ranges 12,13: PSOC_CFG (excluding PSOC_TIMESTAMP)
+	 * - ranges 12,13: PSOC_CFG (excluding PSOC_TIMESTAMP, PSOC_EFUSE and PSOC_GLOBAL_CONF)
 	 */
 	u64 lbw_range_min_short[] = {
 		mmNIC0_TX_AXUSER_BASE,
@@ -2923,7 +2923,7 @@ static void gaudi2_init_lbw_range_registers_secure(struct hl_device *hdev)
 		mmNIC10_TX_AXUSER_BASE,
 		mmNIC11_TX_AXUSER_BASE,
 		mmPSOC_I2C_M0_BASE,
-		mmPSOC_EFUSE_BASE
+		mmPSOC_GPIO0_BASE
 	};
 	u64 lbw_range_max_short[] = {
 		mmNIC0_MAC_CH3_MAC_PCS_BASE + HL_BLOCK_SIZE,
@@ -3219,6 +3219,7 @@ static void gaudi2_init_range_registers(struct hl_device *hdev)
  */
 static int gaudi2_init_protection_bits(struct hl_device *hdev)
 {
+	u32 *user_regs_array = NULL, user_regs_array_size = 0, engine_core_intr_reg;
 	struct asic_fixed_properties *prop = &hdev->asic_prop;
 	u32 instance_offset;
 	int rc = 0;
@@ -3389,11 +3390,24 @@ static int gaudi2_init_protection_bits(struct hl_device *hdev)
 	/* PSOC.
 	 * Except for PSOC_GLOBAL_CONF, skip when security is enabled in F/W, because the blocks are
 	 * protected by privileged RR.
+	 * For PSOC_GLOBAL_CONF, need to un-secure the scratchpad register which is used for engine
+	 * cores to raise events towards F/W.
 	 */
+	engine_core_intr_reg = (u32) (hdev->asic_prop.engine_core_interrupt_reg_addr - CFG_BASE);
+	if (engine_core_intr_reg >= mmPSOC_GLOBAL_CONF_SCRATCHPAD_0 &&
+			engine_core_intr_reg <= mmPSOC_GLOBAL_CONF_SCRATCHPAD_31) {
+		user_regs_array = &engine_core_intr_reg;
+		user_regs_array_size = 1;
+	} else {
+		dev_err(hdev->dev,
+			"Engine cores register for interrupts (%#x) is not a PSOC scratchpad register\n",
+			engine_core_intr_reg);
+	}
+
 	rc |= hl_init_pb(hdev, HL_PB_SHARED, HL_PB_NA,
 			HL_PB_SINGLE_INSTANCE, HL_PB_NA,
 			gaudi2_pb_psoc_global_conf, ARRAY_SIZE(gaudi2_pb_psoc_global_conf),
-			NULL, HL_PB_NA);
+			user_regs_array, user_regs_array_size);

 	if (!hdev->asic_prop.fw_security_enabled)
 		rc |= hl_init_pb(hdev, HL_PB_SHARED, HL_PB_NA,

From patchwork Tue Jul 11 11:12:16 2023
X-Patchwork-Submitter: Oded Gabbay
X-Patchwork-Id: 13308417
From: Oded Gabbay
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 02/12] accel/habanalabs/gaudi2: unsecure tpc count registers
Date: Tue, 11 Jul 2023 14:12:16 +0300
Message-Id: <20230711111226.163670-2-ogabbay@kernel.org>
In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org>
References: <20230711111226.163670-1-ogabbay@kernel.org>
Cc: Ofir Bitton

From: Ofir Bitton

As TPC kernels now must use those registers we
unsecure them.

Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 drivers/accel/habanalabs/gaudi2/gaudi2_security.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2_security.c b/drivers/accel/habanalabs/gaudi2/gaudi2_security.c
index d08267e59303..34bf80c5a44b 100644
--- a/drivers/accel/habanalabs/gaudi2/gaudi2_security.c
+++ b/drivers/accel/habanalabs/gaudi2/gaudi2_security.c
@@ -1601,6 +1601,7 @@ static const u32 gaudi2_pb_dcr0_tpc0_unsecured_regs[] = {
 	mmDCORE0_TPC0_CFG_KERNEL_SRF_30,
 	mmDCORE0_TPC0_CFG_KERNEL_SRF_31,
 	mmDCORE0_TPC0_CFG_TPC_SB_L0CD,
+	mmDCORE0_TPC0_CFG_TPC_COUNT,
 	mmDCORE0_TPC0_CFG_TPC_ID,
 	mmDCORE0_TPC0_CFG_QM_KERNEL_ID_INC,
 	mmDCORE0_TPC0_CFG_QM_TID_BASE_SIZE_HIGH_DIM_0,

From patchwork Tue Jul 11 11:12:17 2023
X-Patchwork-Submitter: Oded Gabbay
X-Patchwork-Id: 13308419
From: Oded Gabbay
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 03/12] accel/habanalabs/gaudi2: prepare to remove soft_rst_irq
Date: Tue, 11 Jul 2023 14:12:17 +0300
Message-Id: <20230711111226.163670-3-ogabbay@kernel.org>
In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org>
References: <20230711111226.163670-1-ogabbay@kernel.org>
Cc: Igor Grinberg

From: Igor Grinberg

The soft reset has transitioned to CPUCP packet instead of plain
register write and is about to be removed from the struct cpu_dyn_regs.
As a preparation for removing the gic_host_soft_rst_irq field from
struct cpu_dyn_regs, switch to use the plain macro - this keeps the
backward compatibility.

Signed-off-by: Igor Grinberg
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 drivers/accel/habanalabs/gaudi2/gaudi2.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c
index 1e22c7a47358..0f9e9522233f 100644
--- a/drivers/accel/habanalabs/gaudi2/gaudi2.c
+++ b/drivers/accel/habanalabs/gaudi2/gaudi2.c
@@ -6263,7 +6263,8 @@ static int gaudi2_execute_soft_reset(struct hl_device *hdev, bool driver_perform
 		WREG32(le32_to_cpu(dyn_regs->cpu_rst_status), CPU_RST_STATUS_NA);
 	else
 		WREG32(mmCPU_RST_STATUS_TO_HOST, CPU_RST_STATUS_NA);
-	WREG32(le32_to_cpu(dyn_regs->gic_host_soft_rst_irq),
+
+	WREG32(mmGIC_HOST_SOFT_RST_IRQ_POLL_REG,
 		gaudi2_irq_map_table[GAUDI2_EVENT_CPU_SOFT_RESET].cpu_id);

 	/* wait for f/w response */

From patchwork Tue Jul 11 11:12:18 2023
X-Patchwork-Submitter: Oded Gabbay
X-Patchwork-Id: 13308425
From: Oded Gabbay
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 04/12] accel/habanalabs/gaudi2: fix missing check of kernel ctx
Date: Tue, 11 Jul 2023 14:12:18 +0300
Message-Id: <20230711111226.163670-4-ogabbay@kernel.org>
In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org>
References: <20230711111226.163670-1-ogabbay@kernel.org>

If we are initializing the kernel context when we have a Gaudi2 device,
we don't need to do any late initializing of that context with specific
Gaudi2 code.

Signed-off-by: Oded Gabbay
Reviewed-by: Ofir Bitton
---
 drivers/accel/habanalabs/gaudi2/gaudi2.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c
index 0f9e9522233f..70b8f744cd73 100644
--- a/drivers/accel/habanalabs/gaudi2/gaudi2.c
+++ b/drivers/accel/habanalabs/gaudi2/gaudi2.c
@@ -10650,6 +10650,9 @@ static int gaudi2_ctx_init(struct hl_ctx *ctx)
 {
 	int rc;

+	if (ctx->asid == HL_KERNEL_ASID_ID)
+		return 0;
+
 	rc = gaudi2_mmu_prepare(ctx->hdev, ctx->asid);
 	if (rc)
 		return rc;

From patchwork Tue Jul 11 11:12:19 2023
X-Patchwork-Submitter: Oded Gabbay
X-Patchwork-Id: 13308418
From: Oded Gabbay
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 05/12] accel/habanalabs: handle f/w reserved dram space request
Date: Tue, 11 Jul 2023 14:12:19 +0300
Message-Id: <20230711111226.163670-5-ogabbay@kernel.org>
In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org>
References: <20230711111226.163670-1-ogabbay@kernel.org>
Cc: Dani Liberman

From: Dani Liberman

It is possible for FW to request reserved space in dram.
If the device supports this option, it will retrieve the size from the
f/w and will reserve it.
Currently we add the common code infrastructure to support it.

Signed-off-by: Dani Liberman
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 drivers/accel/habanalabs/common/firmware_if.c        | 5 +++++
 drivers/accel/habanalabs/common/habanalabs.h         | 4 ++++
 drivers/accel/habanalabs/include/common/hl_boot_if.h | 5 +++++
 3 files changed, 14 insertions(+)

diff --git a/drivers/accel/habanalabs/common/firmware_if.c b/drivers/accel/habanalabs/common/firmware_if.c
index c7da69dbfa0a..2bc775d29854 100644
--- a/drivers/accel/habanalabs/common/firmware_if.c
+++ b/drivers/accel/habanalabs/common/firmware_if.c
@@ -2783,6 +2783,11 @@ static int hl_fw_dynamic_init_cpu(struct hl_device *hdev,
 			hdev->decoder_binning, hdev->rotator_binning);
 	}

+	if (hdev->asic_prop.support_dynamic_resereved_fw_size) {
+		hdev->asic_prop.reserved_fw_mem_size =
+			le32_to_cpu(fw_loader->dynamic_loader.comm_desc.rsvd_mem_size_mb);
+	}
+
 	return 0;
 }

diff --git a/drivers/accel/habanalabs/common/habanalabs.h b/drivers/accel/habanalabs/common/habanalabs.h
index 16bea0a3f3a4..4fecd300b8dd 100644
--- a/drivers/accel/habanalabs/common/habanalabs.h
+++ b/drivers/accel/habanalabs/common/habanalabs.h
@@ -641,6 +641,7 @@ struct hl_hints_range {
  * @glbl_err_cause_num: global err cause number.
  * @hbw_flush_reg: register to read to generate HBW flush. value of 0 means HBW flush is
  *                 not supported.
+ * @reserved_fw_mem_size: size in MB of dram memory reserved for FW.
  * @collective_first_sob: first sync object available for collective use
  * @collective_first_mon: first monitor available for collective use
  * @sync_stream_first_sob: first sync object available for sync stream use
@@ -689,6 +690,7 @@ struct hl_hints_range {
  * @dma_mask: the dma mask to be set for this device
  * @supports_advanced_cpucp_rc: true if new cpucp opcodes are supported.
  * @supports_engine_modes: true if changing engines/engine_cores modes is supported.
+ * @support_dynamic_resereved_fw_size: true if we support dynamic reserved size for fw.
  */
 struct asic_fixed_properties {
 	struct hw_queue_properties	*hw_queues_props;
@@ -772,6 +774,7 @@ struct asic_fixed_properties {
 	u32				num_of_special_blocks;
 	u32				glbl_err_cause_num;
 	u32				hbw_flush_reg;
+	u32				reserved_fw_mem_size;
 	u16				collective_first_sob;
 	u16				collective_first_mon;
 	u16				sync_stream_first_sob;
@@ -808,6 +811,7 @@ struct asic_fixed_properties {
 	u8				dma_mask;
 	u8				supports_advanced_cpucp_rc;
 	u8				supports_engine_modes;
+	u8				support_dynamic_resereved_fw_size;
 };

 /**
diff --git a/drivers/accel/habanalabs/include/common/hl_boot_if.h b/drivers/accel/habanalabs/include/common/hl_boot_if.h
index cff79f7f9f75..7de8a5786a36 100644
--- a/drivers/accel/habanalabs/include/common/hl_boot_if.h
+++ b/drivers/accel/habanalabs/include/common/hl_boot_if.h
@@ -570,6 +570,8 @@ struct lkd_fw_comms_desc {
 	__le64 img_addr;	/* address for next FW component load */
 	struct lkd_fw_binning_info binning_info;
 	struct lkd_fw_ascii_msg ascii_msg[LKD_FW_ASCII_MSG_MAX];
+	__le32 rsvd_mem_size_mb; /* reserved memory size [MB] for FW/SVE */
+	char reserved1[4];
 };

 enum comms_reset_cause {
@@ -596,6 +598,9 @@ struct lkd_fw_comms_msg {
 			__le64 img_addr;
 			struct lkd_fw_binning_info binning_info;
 			struct lkd_fw_ascii_msg ascii_msg[LKD_FW_ASCII_MSG_MAX];
+			/* reserved memory size [MB] for FW/SVE */
+			__le32 rsvd_mem_size_mb;
+			char reserved1[4];
 		};
 		struct {
 			__u8 reset_cause;

From patchwork Tue Jul 11 11:12:20 2023
X-Patchwork-Submitter: Oded Gabbay
X-Patchwork-Id: 13308420
From: Oded Gabbay
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 06/12] accel/habanalabs: set default device release watchdog T/O as 30 sec
Date: Tue, 11 Jul 2023 14:12:20 +0300
Message-Id: <20230711111226.163670-6-ogabbay@kernel.org>
In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org>
References: <20230711111226.163670-1-ogabbay@kernel.org>
Cc: Tomer Tayar

From: Tomer Tayar

After being notified about certain errors, the user is expected to
finish post-error actions and to release the device within some
timeout, after which the device is reset.
The default timeout value is 5 sec, which in some cases is not enough
for a user application to collect debug data.
Increase the default value to 30 sec.

Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 drivers/accel/habanalabs/common/device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c
index d7d9198b2103..28be0fc325ea 100644
--- a/drivers/accel/habanalabs/common/device.c
+++ b/drivers/accel/habanalabs/common/device.c
@@ -18,7 +18,7 @@

 #define HL_RESET_DELAY_USEC			10000	/* 10ms */

-#define HL_DEVICE_RELEASE_WATCHDOG_TIMEOUT_SEC	5
+#define HL_DEVICE_RELEASE_WATCHDOG_TIMEOUT_SEC	30

 enum dma_alloc_type {
 	DMA_ALLOC_COHERENT,

From patchwork Tue Jul 11 11:12:21 2023
X-Patchwork-Submitter: Oded Gabbay
X-Patchwork-Id: 13308423
From: Oded Gabbay
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 07/12] accel/habanalabs: add info ioctl for engine error reports
Date: Tue, 11 Jul 2023 14:12:21 +0300
Message-Id: <20230711111226.163670-7-ogabbay@kernel.org>
In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org>
References: <20230711111226.163670-1-ogabbay@kernel.org>
Cc: Ofir Bitton

From: Ofir Bitton

The user gets a notification for every engine error report, but still
lacks the exact engine information. Hence, we allow the user to query
for the exact engine that reported an error.

Signed-off-by: Ofir Bitton
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 drivers/accel/habanalabs/common/device.c      |  14 ++
 drivers/accel/habanalabs/common/habanalabs.h  |  17 ++
 .../habanalabs/common/habanalabs_ioctl.c      |  25 +++
 drivers/accel/habanalabs/gaudi2/gaudi2.c      | 168 ++++++++++++++++++
 include/uapi/drm/habanalabs_accel.h           |  16 ++
 5 files changed, 240 insertions(+)

diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c
index 28be0fc325ea..80cce6b74d05 100644
--- a/drivers/accel/habanalabs/common/device.c
+++ b/drivers/accel/habanalabs/common/device.c
@@ -2701,6 +2701,20 @@ void hl_handle_fw_err(struct hl_device *hdev, struct hl_info_fw_err_info *info)
 		*info->event_mask |= HL_NOTIFIER_EVENT_CRITICL_FW_ERR;
 }

+void hl_capture_engine_err(struct hl_device *hdev, u16 engine_id, u16 error_count)
+{
+	struct engine_err_info *info = &hdev->captured_err_info.engine_err;
+
+	/* Capture only the first engine error */
+	if (atomic_cmpxchg(&info->event_detected, 0, 1))
+		return;
+
+	info->event.timestamp = ktime_to_ns(ktime_get());
+	info->event.engine_id = engine_id;
+	info->event.error_count = error_count;
+	info->event_info_available = true;
+}
+
 void hl_enable_err_info_capture(struct hl_error_info *captured_err_info)
 {
 	vfree(captured_err_info->page_fault_info.user_mappings);
diff --git a/drivers/accel/habanalabs/common/habanalabs.h b/drivers/accel/habanalabs/common/habanalabs.h
index 4fecd300b8dd..201d826b0fb7 100644
--- a/drivers/accel/habanalabs/common/habanalabs.h
+++ b/drivers/accel/habanalabs/common/habanalabs.h
@@ -3062,6 +3062,20 @@ struct fw_err_info {
 	bool event_info_available;
 };

+/**
+ * struct engine_err_info - engine error information.
+ * @event: holds information on the event.
+ * @event_detected: if set as 1, then an engine event was discovered for the
+ *                  first time after the driver has finished booting-up.
+ * @event_info_available: indicates that an engine event info is now available.
+ */
+struct engine_err_info {
+	struct hl_info_engine_err_event event;
+	atomic_t event_detected;
+	bool event_info_available;
+};
+
+
 /**
  * struct hl_error_info - holds information collected during an error.
  * @cs_timeout: CS timeout error information.
@@ -3070,6 +3084,7 @@ struct fw_err_info {
  * @page_fault_info: page fault information.
  * @hw_err: (fatal) hardware error information.
  * @fw_err: firmware error information.
+ * @engine_err: engine error information.
  */
 struct hl_error_info {
 	struct cs_timeout_info cs_timeout;
@@ -3078,6 +3093,7 @@ struct hl_error_info {
 	struct page_fault_info page_fault_info;
 	struct hw_err_info hw_err;
 	struct fw_err_info fw_err;
+	struct engine_err_info engine_err;
 };

 /**
@@ -3951,6 +3967,7 @@ void hl_handle_page_fault(struct hl_device *hdev, u64 addr, u16 eng_id, bool is_
 				u64 *event_mask);
 void hl_handle_critical_hw_err(struct hl_device *hdev, u16 event_id, u64 *event_mask);
 void hl_handle_fw_err(struct hl_device *hdev, struct hl_info_fw_err_info *info);
+void hl_capture_engine_err(struct hl_device *hdev, u16 engine_id, u16 error_count);
 void hl_enable_err_info_capture(struct hl_error_info *captured_err_info);

 #ifdef CONFIG_DEBUG_FS
diff --git a/drivers/accel/habanalabs/common/habanalabs_ioctl.c b/drivers/accel/habanalabs/common/habanalabs_ioctl.c
index 549b2518fae0..097d65e493c8 100644
--- a/drivers/accel/habanalabs/common/habanalabs_ioctl.c
+++ b/drivers/accel/habanalabs/common/habanalabs_ioctl.c
@@ -875,6 +875,28 @@ static int fw_err_info(struct hl_fpriv *hpriv, struct hl_info_args *args)
 	return rc ? -EFAULT : 0;
 }

+static int engine_err_info(struct hl_fpriv *hpriv, struct hl_info_args *args)
+{
+	void __user *user_buf = (void __user *) (uintptr_t) args->return_pointer;
+	struct hl_device *hdev = hpriv->hdev;
+	u32 user_buf_size = args->return_size;
+	struct engine_err_info *info;
+	int rc;
+
+	if (!user_buf)
+		return -EINVAL;
+
+	info = &hdev->captured_err_info.engine_err;
+	if (!info->event_info_available)
+		return 0;
+
+	if (user_buf_size < sizeof(struct hl_info_engine_err_event))
+		return -ENOMEM;
+
+	rc = copy_to_user(user_buf, &info->event, sizeof(struct hl_info_engine_err_event));
+	return rc ? -EFAULT : 0;
+}
+
 static int send_fw_generic_request(struct hl_device *hdev, struct hl_info_args *info_args)
 {
 	void __user *buff = (void __user *) (uintptr_t) info_args->return_pointer;
@@ -1001,6 +1023,9 @@ static int _hl_info_ioctl(struct hl_fpriv *hpriv, void *data,
 	case HL_INFO_FW_ERR_EVENT:
 		return fw_err_info(hpriv, args);

+	case HL_INFO_USER_ENGINE_ERR_EVENT:
+		return engine_err_info(hpriv, args);
+
 	case HL_INFO_DRAM_USAGE:
 		return dram_usage_info(hpriv, args);
 	default:
diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c
index 70b8f744cd73..222310bf1098 100644
--- a/drivers/accel/habanalabs/gaudi2/gaudi2.c
+++ b/drivers/accel/habanalabs/gaudi2/gaudi2.c
@@ -9588,6 +9588,171 @@ static int hl_arc_event_handle(struct hl_device *hdev, u16 event_type,
 	}
 }

+static u16 event_id_to_engine_id(struct hl_device *hdev, u16 event_type)
+{
+	enum gaudi2_block_types type = GAUDI2_BLOCK_TYPE_MAX;
+	u16 index;
+
+	switch (event_type) {
+	case GAUDI2_EVENT_TPC0_AXI_ERR_RSP ... GAUDI2_EVENT_TPC24_AXI_ERR_RSP:
+		index = event_type - GAUDI2_EVENT_TPC0_AXI_ERR_RSP;
+		type = GAUDI2_BLOCK_TYPE_TPC;
+		break;
+	case GAUDI2_EVENT_TPC0_QM ... GAUDI2_EVENT_TPC24_QM:
+		index = event_type - GAUDI2_EVENT_TPC0_QM;
+		type = GAUDI2_BLOCK_TYPE_TPC;
+		break;
+	case GAUDI2_EVENT_MME0_SBTE0_AXI_ERR_RSP ... GAUDI2_EVENT_MME0_CTRL_AXI_ERROR_RESPONSE:
+	case GAUDI2_EVENT_MME0_SPI_BASE ... GAUDI2_EVENT_MME0_WAP_SOURCE_RESULT_INVALID:
+	case GAUDI2_EVENT_MME0_QM:
+		index = 0;
+		type = GAUDI2_BLOCK_TYPE_MME;
+		break;
+	case GAUDI2_EVENT_MME1_SBTE0_AXI_ERR_RSP ... GAUDI2_EVENT_MME1_CTRL_AXI_ERROR_RESPONSE:
+	case GAUDI2_EVENT_MME1_SPI_BASE ... GAUDI2_EVENT_MME1_WAP_SOURCE_RESULT_INVALID:
+	case GAUDI2_EVENT_MME1_QM:
+		index = 1;
+		type = GAUDI2_BLOCK_TYPE_MME;
+		break;
+	case GAUDI2_EVENT_MME2_SBTE0_AXI_ERR_RSP ... GAUDI2_EVENT_MME2_CTRL_AXI_ERROR_RESPONSE:
+	case GAUDI2_EVENT_MME2_SPI_BASE ... GAUDI2_EVENT_MME2_WAP_SOURCE_RESULT_INVALID:
+	case GAUDI2_EVENT_MME2_QM:
+		index = 2;
+		type = GAUDI2_BLOCK_TYPE_MME;
+		break;
+	case GAUDI2_EVENT_MME3_SBTE0_AXI_ERR_RSP ... GAUDI2_EVENT_MME3_CTRL_AXI_ERROR_RESPONSE:
+	case GAUDI2_EVENT_MME3_SPI_BASE ... GAUDI2_EVENT_MME3_WAP_SOURCE_RESULT_INVALID:
+	case GAUDI2_EVENT_MME3_QM:
+		index = 3;
+		type = GAUDI2_BLOCK_TYPE_MME;
+		break;
+	case GAUDI2_EVENT_KDMA_CH0_AXI_ERR_RSP:
+	case GAUDI2_EVENT_KDMA_BM_SPMU:
+	case GAUDI2_EVENT_KDMA0_CORE:
+		return GAUDI2_ENGINE_ID_KDMA;
+	case GAUDI2_EVENT_PDMA_CH0_AXI_ERR_RSP:
+	case GAUDI2_EVENT_PDMA0_CORE:
+	case GAUDI2_EVENT_PDMA0_BM_SPMU:
+	case GAUDI2_EVENT_PDMA0_QM:
+		return GAUDI2_ENGINE_ID_PDMA_0;
+	case GAUDI2_EVENT_PDMA_CH1_AXI_ERR_RSP:
+	case GAUDI2_EVENT_PDMA1_CORE:
+	case GAUDI2_EVENT_PDMA1_BM_SPMU:
+	case GAUDI2_EVENT_PDMA1_QM:
+		return GAUDI2_ENGINE_ID_PDMA_1;
+	case GAUDI2_EVENT_DEC0_AXI_ERR_RSPONSE ... GAUDI2_EVENT_DEC9_AXI_ERR_RSPONSE:
+		index = event_type - GAUDI2_EVENT_DEC0_AXI_ERR_RSPONSE;
+		type = GAUDI2_BLOCK_TYPE_DEC;
+		break;
+	case GAUDI2_EVENT_DEC0_SPI ... GAUDI2_EVENT_DEC9_BMON_SPMU:
+		index = (event_type - GAUDI2_EVENT_DEC0_SPI) >> 1;
+		type = GAUDI2_BLOCK_TYPE_DEC;
+		break;
+	case GAUDI2_EVENT_NIC0_AXI_ERROR_RESPONSE ... GAUDI2_EVENT_NIC11_AXI_ERROR_RESPONSE:
+		index = event_type - GAUDI2_EVENT_NIC0_AXI_ERROR_RESPONSE;
+		return GAUDI2_ENGINE_ID_NIC0_0 + (index * 2);
+	case GAUDI2_EVENT_NIC0_QM0 ... GAUDI2_EVENT_NIC11_QM1:
+		index = event_type - GAUDI2_EVENT_NIC0_QM0;
+		return GAUDI2_ENGINE_ID_NIC0_0 + index;
+	case GAUDI2_EVENT_NIC0_BMON_SPMU ... GAUDI2_EVENT_NIC11_SW_ERROR:
+		index = event_type - GAUDI2_EVENT_NIC0_BMON_SPMU;
+		return GAUDI2_ENGINE_ID_NIC0_0 + (index * 2);
+	case GAUDI2_EVENT_TPC0_BMON_SPMU ... GAUDI2_EVENT_TPC24_KERNEL_ERR:
+		index = (event_type - GAUDI2_EVENT_TPC0_BMON_SPMU) >> 1;
+		type = GAUDI2_BLOCK_TYPE_TPC;
+		break;
+	case GAUDI2_EVENT_ROTATOR0_AXI_ERROR_RESPONSE:
+	case GAUDI2_EVENT_ROTATOR0_BMON_SPMU:
+	case GAUDI2_EVENT_ROTATOR0_ROT0_QM:
+		return GAUDI2_ENGINE_ID_ROT_0;
+	case GAUDI2_EVENT_ROTATOR1_AXI_ERROR_RESPONSE:
+	case GAUDI2_EVENT_ROTATOR1_BMON_SPMU:
+	case GAUDI2_EVENT_ROTATOR1_ROT1_QM:
+		return GAUDI2_ENGINE_ID_ROT_1;
+	case GAUDI2_EVENT_HDMA0_BM_SPMU:
+	case GAUDI2_EVENT_HDMA0_QM:
+	case GAUDI2_EVENT_HDMA0_CORE:
+		return GAUDI2_DCORE0_ENGINE_ID_EDMA_0;
+	case GAUDI2_EVENT_HDMA1_BM_SPMU:
+	case GAUDI2_EVENT_HDMA1_QM:
+	case GAUDI2_EVENT_HDMA1_CORE:
+		return GAUDI2_DCORE0_ENGINE_ID_EDMA_1;
+	case GAUDI2_EVENT_HDMA2_BM_SPMU:
+	case GAUDI2_EVENT_HDMA2_QM:
+	case GAUDI2_EVENT_HDMA2_CORE:
+		return GAUDI2_DCORE1_ENGINE_ID_EDMA_0;
+	case GAUDI2_EVENT_HDMA3_BM_SPMU:
+	case GAUDI2_EVENT_HDMA3_QM:
+	case GAUDI2_EVENT_HDMA3_CORE:
+		return GAUDI2_DCORE1_ENGINE_ID_EDMA_1;
+	case GAUDI2_EVENT_HDMA4_BM_SPMU:
+	case GAUDI2_EVENT_HDMA4_QM:
+	case GAUDI2_EVENT_HDMA4_CORE:
+		return GAUDI2_DCORE2_ENGINE_ID_EDMA_0;
+	case GAUDI2_EVENT_HDMA5_BM_SPMU:
+	case GAUDI2_EVENT_HDMA5_QM:
+	case GAUDI2_EVENT_HDMA5_CORE:
+		return GAUDI2_DCORE2_ENGINE_ID_EDMA_1;
+	case GAUDI2_EVENT_HDMA6_BM_SPMU:
+	case GAUDI2_EVENT_HDMA6_QM:
+	case GAUDI2_EVENT_HDMA6_CORE:
+		return GAUDI2_DCORE3_ENGINE_ID_EDMA_0;
+	case GAUDI2_EVENT_HDMA7_BM_SPMU:
+	case GAUDI2_EVENT_HDMA7_QM:
+	case
GAUDI2_EVENT_HDMA7_CORE: + return GAUDI2_DCORE3_ENGINE_ID_EDMA_1; + default: + break; + } + + switch (type) { + case GAUDI2_BLOCK_TYPE_TPC: + switch (index) { + case TPC_ID_DCORE0_TPC0 ... TPC_ID_DCORE0_TPC5: + return GAUDI2_DCORE0_ENGINE_ID_TPC_0 + index; + case TPC_ID_DCORE1_TPC0 ... TPC_ID_DCORE1_TPC5: + return GAUDI2_DCORE1_ENGINE_ID_TPC_0 + index - TPC_ID_DCORE1_TPC0; + case TPC_ID_DCORE2_TPC0 ... TPC_ID_DCORE2_TPC5: + return GAUDI2_DCORE2_ENGINE_ID_TPC_0 + index - TPC_ID_DCORE2_TPC0; + case TPC_ID_DCORE3_TPC0 ... TPC_ID_DCORE3_TPC5: + return GAUDI2_DCORE3_ENGINE_ID_TPC_0 + index - TPC_ID_DCORE3_TPC0; + default: + break; + } + break; + case GAUDI2_BLOCK_TYPE_MME: + switch (index) { + case MME_ID_DCORE0: return GAUDI2_DCORE0_ENGINE_ID_MME; + case MME_ID_DCORE1: return GAUDI2_DCORE1_ENGINE_ID_MME; + case MME_ID_DCORE2: return GAUDI2_DCORE2_ENGINE_ID_MME; + case MME_ID_DCORE3: return GAUDI2_DCORE3_ENGINE_ID_MME; + default: + break; + } + break; + case GAUDI2_BLOCK_TYPE_DEC: + switch (index) { + case DEC_ID_DCORE0_DEC0: return GAUDI2_DCORE0_ENGINE_ID_DEC_0; + case DEC_ID_DCORE0_DEC1: return GAUDI2_DCORE0_ENGINE_ID_DEC_1; + case DEC_ID_DCORE1_DEC0: return GAUDI2_DCORE1_ENGINE_ID_DEC_0; + case DEC_ID_DCORE1_DEC1: return GAUDI2_DCORE1_ENGINE_ID_DEC_1; + case DEC_ID_DCORE2_DEC0: return GAUDI2_DCORE2_ENGINE_ID_DEC_0; + case DEC_ID_DCORE2_DEC1: return GAUDI2_DCORE2_ENGINE_ID_DEC_1; + case DEC_ID_DCORE3_DEC0: return GAUDI2_DCORE3_ENGINE_ID_DEC_0; + case DEC_ID_DCORE3_DEC1: return GAUDI2_DCORE3_ENGINE_ID_DEC_1; + case DEC_ID_PCIE_VDEC0: return GAUDI2_PCIE_ENGINE_ID_DEC_0; + case DEC_ID_PCIE_VDEC1: return GAUDI2_PCIE_ENGINE_ID_DEC_1; + default: + break; + } + break; + default: + break; + } + + return U16_MAX; +} + static void gaudi2_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_entry) { struct gaudi2_device *gaudi2 = hdev->asic_specific; @@ -10010,6 +10175,9 @@ static void gaudi2_handle_eqe(struct hl_device *hdev, struct hl_eq_entry *eq_ent } } + if (event_mask 
& HL_NOTIFIER_EVENT_USER_ENGINE_ERR) + hl_capture_engine_err(hdev, event_id_to_engine_id(hdev, event_type), error_count); + /* Make sure to dump an error in case no error cause was printed so far. * Note that although we have counted the errors, we use this number as * a boolean. diff --git a/include/uapi/drm/habanalabs_accel.h b/include/uapi/drm/habanalabs_accel.h index e6436f3e8ea6..f912869b151e 100644 --- a/include/uapi/drm/habanalabs_accel.h +++ b/include/uapi/drm/habanalabs_accel.h @@ -809,6 +809,7 @@ enum hl_server_type { * HL_INFO_FW_ERR_EVENT - Retrieve information on the reported FW error. * May return 0 even though no new data is available, in that case * timestamp will be 0. + * HL_INFO_USER_ENGINE_ERR_EVENT - Retrieve the last engine id that reported an error. */ #define HL_INFO_HW_IP_INFO 0 #define HL_INFO_HW_EVENTS 1 @@ -845,6 +846,7 @@ enum hl_server_type { #define HL_INFO_FW_GENERIC_REQ 35 #define HL_INFO_HW_ERR_EVENT 36 #define HL_INFO_FW_ERR_EVENT 37 +#define HL_INFO_USER_ENGINE_ERR_EVENT 38 #define HL_INFO_VERSION_MAX_LEN 128 #define HL_INFO_CARD_NAME_MAX_LEN 16 @@ -1226,6 +1228,20 @@ struct hl_info_fw_err_event { __u32 pad; }; +/** + * struct hl_info_engine_err_event - engine error info + * @timestamp: time-stamp of error occurrence + * @engine_id: ID of the engine that reported the error. + * @error_count: Number of errors reported. + * @pad: size padding for u64 granularity. + */ +struct hl_info_engine_err_event { + __s64 timestamp; + __u16 engine_id; + __u16 error_count; + __u32 pad; +}; + /** * struct hl_info_dev_memalloc_page_sizes - valid page sizes in device mem alloc information. 
* @page_order_bitmask: bitmap in which a set bit represents the order of the supported page size From patchwork Tue Jul 11 11:12:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Oded Gabbay X-Patchwork-Id: 13308422 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0088AEB64DC for ; Tue, 11 Jul 2023 11:12:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4A1CA10E37A; Tue, 11 Jul 2023 11:12:48 +0000 (UTC) Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2355310E372 for ; Tue, 11 Jul 2023 11:12:43 +0000 (UTC) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9042A61449; Tue, 11 Jul 2023 11:12:42 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DD6AAC433C8; Tue, 11 Jul 2023 11:12:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1689073962; bh=ty6Mv606cD3N/DazACN8sW3ZtDjr1kepG9LHg4ukQWc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DJykXqix5M9mFgfK3jt5eCBxKGsQowdQan0DZP/ByVSld/n/F5bcWN0Fy+wnznuhn T3gnqdLD/iOdtyTI5Xo5j22MUE5QIjNrJBCtms/gWMrQhhPhHyHJbk1k3ZFFza+t1D gIZIZDgH58nTHrlNZzYpDraIVF6Q942+ieOCOD/D7mGYE9WLu0J8SHJgjulAgBfYYa DIOK9o4n9anLnUUuytBqsf0axCDzcnjqt7msnzvJ2TMGTWYsuHXFbH4dKShlx/ucPu 47s7WJghr3LlGZt+i9EJXtAjV4NIDfWDYXslQskq0nZYQ35jgS503tiiCaQgqQ07Zf 
TmYFK5qvRdO8g==
From: Oded Gabbay
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 08/12] accel/habanalabs: register compute device as an accel device
Date: Tue, 11 Jul 2023 14:12:22 +0300
Message-Id: <20230711111226.163670-8-ogabbay@kernel.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org>
References: <20230711111226.163670-1-ogabbay@kernel.org>
MIME-Version: 1.0
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
Cc: Tomer Tayar
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"

From: Tomer Tayar

Register the compute device as an accel device, and remove the creation
of the habanalabs compute char device.
The IOCTLs in this patch are still handled by the current driver
handler. Moving to DRM IOCTL handling requires moving the IOCTL numbers
to a specific range, so it will be handled in subsequent patches.

Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 drivers/accel/habanalabs/common/debugfs.c     |  22 +--
 drivers/accel/habanalabs/common/device.c      | 163 +++++++-----------
 drivers/accel/habanalabs/common/habanalabs.h  |  44 ++---
 .../accel/habanalabs/common/habanalabs_drv.c  | 157 ++++++++---------
 .../habanalabs/common/habanalabs_ioctl.c      |  12 +-
 drivers/accel/habanalabs/common/memory.c      |   4 +-
 6 files changed, 161 insertions(+), 241 deletions(-)

diff --git a/drivers/accel/habanalabs/common/debugfs.c b/drivers/accel/habanalabs/common/debugfs.c
index 9e84a47a21dc..01f071d52570 100644
--- a/drivers/accel/habanalabs/common/debugfs.c
+++ b/drivers/accel/habanalabs/common/debugfs.c
@@ -18,8 +18,6 @@
 #define MMU_KBUF_SIZE	(MMU_ADDR_BUF_SIZE + MMU_ASID_BUF_SIZE)
 #define I2C_MAX_TRANSACTION_LEN	8
 
-static struct dentry *hl_debug_root;
-
 static int hl_debugfs_i2c_read(struct hl_device *hdev, u8 i2c_bus, u8 i2c_addr,
 				u8 i2c_reg, u8 i2c_len, u64 *val)
 {
@@ -1788,20 +1786,14 @@ void hl_debugfs_add_device(struct hl_device *hdev)
 {
 	struct hl_dbg_device_entry *dev_entry = &hdev->hl_debugfs;
 
-	dev_entry->root = debugfs_create_dir(dev_name(hdev->dev), hl_debug_root);
+	dev_entry->root = hdev->drm.accel->debugfs_root;
 
 	add_files_to_device(hdev, dev_entry, dev_entry->root);
+
 	if (!hdev->asic_prop.fw_security_enabled)
 		add_secured_nodes(dev_entry, dev_entry->root);
 }
 
-void hl_debugfs_remove_device(struct hl_device *hdev)
-{
-	struct hl_dbg_device_entry *entry = &hdev->hl_debugfs;
-
-	debugfs_remove_recursive(entry->root);
-}
-
 void hl_debugfs_add_file(struct hl_fpriv *hpriv)
 {
 	struct hl_dbg_device_entry *dev_entry = &hpriv->hdev->hl_debugfs;
@@ -1932,13 +1924,3 @@ void hl_debugfs_set_state_dump(struct hl_device *hdev, char *data,
 
 	up_write(&dev_entry->state_dump_sem);
 }
-
-void __init hl_debugfs_init(void)
-{
-	hl_debug_root = debugfs_create_dir("habanalabs", NULL);
-}
-
-void hl_debugfs_fini(void)
-{
-	debugfs_remove_recursive(hl_debug_root);
-}
diff --git
a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c index 80cce6b74d05..c0c9e9504672 100644 --- a/drivers/accel/habanalabs/common/device.c +++ b/drivers/accel/habanalabs/common/device.c @@ -14,6 +14,9 @@ #include #include +#include +#include + #include #define HL_RESET_DELAY_USEC 10000 /* 10ms */ @@ -520,24 +523,20 @@ static void print_device_in_use_info(struct hl_device *hdev, const char *message } /* - * hl_device_release - release function for habanalabs device - * - * @inode: pointer to inode structure - * @filp: pointer to file structure + * hl_device_release() - release function for habanalabs device. + * @ddev: pointer to DRM device structure. + * @file_priv: pointer to DRM file private data structure. * * Called when process closes an habanalabs device */ -static int hl_device_release(struct inode *inode, struct file *filp) +void hl_device_release(struct drm_device *ddev, struct drm_file *file_priv) { - struct hl_fpriv *hpriv = filp->private_data; - struct hl_device *hdev = hpriv->hdev; - - filp->private_data = NULL; + struct hl_fpriv *hpriv = file_priv->driver_priv; + struct hl_device *hdev = to_hl_device(ddev); if (!hdev) { pr_crit("Closing FD after device was removed. Memory leak will occur and it is advised to reboot.\n"); put_pid(hpriv->taskpid); - return 0; } hl_ctx_mgr_fini(hdev, &hpriv->ctx_mgr); @@ -555,8 +554,6 @@ static int hl_device_release(struct inode *inode, struct file *filp) } hdev->last_open_session_duration_jif = jiffies - hdev->last_successful_open_jif; - - return 0; } static int hl_device_release_ctrl(struct inode *inode, struct file *filp) @@ -587,18 +584,8 @@ static int hl_device_release_ctrl(struct inode *inode, struct file *filp) return 0; } -/* - * hl_mmap - mmap function for habanalabs device - * - * @*filp: pointer to file structure - * @*vma: pointer to vm_area_struct of the process - * - * Called when process does an mmap on habanalabs device. 
Call the relevant mmap - * function at the end of the common code. - */ -static int hl_mmap(struct file *filp, struct vm_area_struct *vma) +static int __hl_mmap(struct hl_fpriv *hpriv, struct vm_area_struct *vma) { - struct hl_fpriv *hpriv = filp->private_data; struct hl_device *hdev = hpriv->hdev; unsigned long vm_pgoff; @@ -621,14 +608,22 @@ static int hl_mmap(struct file *filp, struct vm_area_struct *vma) return -EINVAL; } -static const struct file_operations hl_ops = { - .owner = THIS_MODULE, - .open = hl_device_open, - .release = hl_device_release, - .mmap = hl_mmap, - .unlocked_ioctl = hl_ioctl, - .compat_ioctl = hl_ioctl -}; +/* + * hl_mmap - mmap function for habanalabs device + * + * @*filp: pointer to file structure + * @*vma: pointer to vm_area_struct of the process + * + * Called when process does an mmap on habanalabs device. Call the relevant mmap + * function at the end of the common code. + */ +int hl_mmap(struct file *filp, struct vm_area_struct *vma) +{ + struct drm_file *file_priv = filp->private_data; + struct hl_fpriv *hpriv = file_priv->driver_priv; + + return __hl_mmap(hpriv, vma); +} static const struct file_operations hl_ctrl_ops = { .owner = THIS_MODULE, @@ -656,7 +651,7 @@ static void device_release_func(struct device *dev) * * Initialize a cdev and a Linux device for habanalabs's device. 
*/ -static int device_init_cdev(struct hl_device *hdev, struct class *class, +static int device_init_cdev(struct hl_device *hdev, const struct class *class, int minor, const struct file_operations *fops, char *name, struct cdev *cdev, struct device **dev) @@ -680,23 +675,26 @@ static int device_init_cdev(struct hl_device *hdev, struct class *class, static int cdev_sysfs_debugfs_add(struct hl_device *hdev) { + const struct class *accel_class = hdev->drm.accel->kdev->class; + char name[32]; int rc; - rc = cdev_device_add(&hdev->cdev, hdev->dev); - if (rc) { - dev_err(hdev->dev, - "failed to add a char device to the system\n"); + hdev->cdev_idx = hdev->drm.accel->index; + + /* Initialize cdev and device structures for the control device */ + snprintf(name, sizeof(name), "accel_controlD%d", hdev->cdev_idx); + rc = device_init_cdev(hdev, accel_class, hdev->cdev_idx, &hl_ctrl_ops, name, + &hdev->cdev_ctrl, &hdev->dev_ctrl); + if (rc) return rc; - } rc = cdev_device_add(&hdev->cdev_ctrl, hdev->dev_ctrl); if (rc) { - dev_err(hdev->dev, - "failed to add a control char device to the system\n"); - goto delete_cdev_device; + dev_err(hdev->dev_ctrl, + "failed to add an accel control char device to the system\n"); + goto free_ctrl_device; } - /* hl_sysfs_init() must be done after adding the device to the system */ rc = hl_sysfs_init(hdev); if (rc) { dev_err(hdev->dev, "failed to initialize sysfs\n"); @@ -711,23 +709,19 @@ static int cdev_sysfs_debugfs_add(struct hl_device *hdev) delete_ctrl_cdev_device: cdev_device_del(&hdev->cdev_ctrl, hdev->dev_ctrl); -delete_cdev_device: - cdev_device_del(&hdev->cdev, hdev->dev); +free_ctrl_device: + put_device(hdev->dev_ctrl); return rc; } static void cdev_sysfs_debugfs_remove(struct hl_device *hdev) { if (!hdev->cdev_sysfs_debugfs_created) - goto put_devices; + return; - hl_debugfs_remove_device(hdev); hl_sysfs_fini(hdev); - cdev_device_del(&hdev->cdev_ctrl, hdev->dev_ctrl); - cdev_device_del(&hdev->cdev, hdev->dev); -put_devices: - 
put_device(hdev->dev); + cdev_device_del(&hdev->cdev_ctrl, hdev->dev_ctrl); put_device(hdev->dev_ctrl); } @@ -2011,51 +2005,6 @@ void hl_notifier_event_send_all(struct hl_device *hdev, u64 event_mask) mutex_unlock(&hdev->fpriv_ctrl_list_lock); } -static int create_cdev(struct hl_device *hdev) -{ - char *name; - int rc; - - hdev->cdev_idx = hdev->id / 2; - - name = kasprintf(GFP_KERNEL, "hl%d", hdev->cdev_idx); - if (!name) { - rc = -ENOMEM; - goto out_err; - } - - /* Initialize cdev and device structures */ - rc = device_init_cdev(hdev, hdev->hclass, hdev->id, &hl_ops, name, - &hdev->cdev, &hdev->dev); - - kfree(name); - - if (rc) - goto out_err; - - name = kasprintf(GFP_KERNEL, "hl_controlD%d", hdev->cdev_idx); - if (!name) { - rc = -ENOMEM; - goto free_dev; - } - - /* Initialize cdev and device structures for control device */ - rc = device_init_cdev(hdev, hdev->hclass, hdev->id_control, &hl_ctrl_ops, - name, &hdev->cdev_ctrl, &hdev->dev_ctrl); - - kfree(name); - - if (rc) - goto free_dev; - - return 0; - -free_dev: - put_device(hdev->dev); -out_err: - return rc; -} - /* * hl_device_init - main initialization function for habanalabs device * @@ -2070,14 +2019,10 @@ int hl_device_init(struct hl_device *hdev) int i, rc, cq_cnt, user_interrupt_cnt, cq_ready_cnt; bool expose_interfaces_on_err = false; - rc = create_cdev(hdev); - if (rc) - goto out_disabled; - /* Initialize ASIC function pointers and perform early init */ rc = device_early_init(hdev); if (rc) - goto free_dev; + goto out_disabled; user_interrupt_cnt = hdev->asic_prop.user_dec_intr_count + hdev->asic_prop.user_interrupt_count; @@ -2264,6 +2209,14 @@ int hl_device_init(struct hl_device *hdev) * From here there is no need to expose them in case of an error. 
*/ expose_interfaces_on_err = false; + + rc = drm_dev_register(&hdev->drm, 0); + if (rc) { + dev_err(hdev->dev, "Failed to register DRM device, rc %d\n", rc); + rc = 0; + goto out_disabled; + } + rc = cdev_sysfs_debugfs_add(hdev); if (rc) { dev_err(hdev->dev, "Failed to add char devices and sysfs/debugfs files\n"); @@ -2332,15 +2285,14 @@ int hl_device_init(struct hl_device *hdev) kfree(hdev->user_interrupt); early_fini: device_early_fini(hdev); -free_dev: - put_device(hdev->dev_ctrl); - put_device(hdev->dev); out_disabled: hdev->disabled = true; - if (expose_interfaces_on_err) + if (expose_interfaces_on_err) { + drm_dev_register(&hdev->drm, 0); cdev_sysfs_debugfs_add(hdev); - dev_err(&hdev->pdev->dev, - "Failed to initialize hl%d. Device %s is NOT usable !\n", + } + + pr_err("Failed to initialize accel%d. Device %s is NOT usable!\n", hdev->cdev_idx, dev_name(&hdev->pdev->dev)); return rc; @@ -2486,6 +2438,7 @@ void hl_device_fini(struct hl_device *hdev) /* Hide devices and sysfs/debugfs files from user */ cdev_sysfs_debugfs_remove(hdev); + drm_dev_unregister(&hdev->drm); hl_debugfs_device_fini(hdev); diff --git a/drivers/accel/habanalabs/common/habanalabs.h b/drivers/accel/habanalabs/common/habanalabs.h index 201d826b0fb7..58948044ad16 100644 --- a/drivers/accel/habanalabs/common/habanalabs.h +++ b/drivers/accel/habanalabs/common/habanalabs.h @@ -29,6 +29,9 @@ #include #include +#include +#include + #include "security.h" #define HL_NAME "habanalabs" @@ -2258,7 +2261,7 @@ struct hl_notifier_event { /** * struct hl_fpriv - process information stored in FD private data. * @hdev: habanalabs device structure. - * @filp: pointer to the given file structure. + * @file_priv: pointer to the DRM file private data structure. * @taskpid: current process ID. * @ctx: current executing context. TODO: remove for multiple ctx per process * @ctx_mgr: context manager to handle multiple context for this FD. 
@@ -2273,7 +2276,7 @@ struct hl_notifier_event { */ struct hl_fpriv { struct hl_device *hdev; - struct file *filp; + struct drm_file *file_priv; struct pid *taskpid; struct hl_ctx *ctx; struct hl_ctx_mgr ctx_mgr; @@ -3141,8 +3144,7 @@ struct hl_reset_info { * (required only for PCI address match mode) * @pcie_bar: array of available PCIe bars virtual addresses. * @rmmio: configuration area address on SRAM. - * @hclass: pointer to the habanalabs class. - * @cdev: related char device. + * @drm: related DRM device. * @cdev_ctrl: char device for control operations only (INFO IOCTL) * @dev: related kernel basic device structure. * @dev_ctrl: related kernel device structure for the control device @@ -3269,8 +3271,7 @@ struct hl_reset_info { * @rotator_binning: contains mask of rotators engines that is received from the f/w * which indicates which rotator engines are binned-out(Gaudi3 and above). * @id: device minor. - * @id_control: minor of the control device. - * @cdev_idx: char device index. Used for setting its name. + * @cdev_idx: char device index. * @cpu_pci_msb_addr: 50-bit extension bits for the device CPU's 40-bit * addresses. * @is_in_dram_scrub: true if dram scrub operation is on going. @@ -3332,8 +3333,7 @@ struct hl_device { u64 pcie_bar_phys[HL_PCI_NUM_BARS]; void __iomem *pcie_bar[HL_PCI_NUM_BARS]; void __iomem *rmmio; - struct class *hclass; - struct cdev cdev; + struct drm_device drm; struct cdev cdev_ctrl; struct device *dev; struct device *dev_ctrl; @@ -3442,7 +3442,6 @@ struct hl_device { u32 device_release_watchdog_timeout_sec; u32 rotator_binning; u16 id; - u16 id_control; u16 cdev_idx; u16 cpu_pci_msb_addr; u8 is_in_dram_scrub; @@ -3606,6 +3605,11 @@ static inline bool hl_mem_area_inside_range(u64 address, u64 size, return false; } +static inline struct hl_device *to_hl_device(struct drm_device *ddev) +{ + return container_of(ddev, struct hl_device, drm); +} + /** * hl_mem_area_crosses_range() - Checks whether address+size crossing a range. 
* @address: The start address of the area we want to validate. @@ -3644,7 +3648,12 @@ int hl_access_cfg_region(struct hl_device *hdev, u64 addr, u64 *val, enum debugfs_access_type acc_type); int hl_access_dev_mem(struct hl_device *hdev, enum pci_region region_type, u64 addr, u64 *val, enum debugfs_access_type acc_type); -int hl_device_open(struct inode *inode, struct file *filp); + +int hl_mmap(struct file *filp, struct vm_area_struct *vma); + +int hl_device_open(struct drm_device *drm, struct drm_file *file_priv); +void hl_device_release(struct drm_device *ddev, struct drm_file *file_priv); + int hl_device_open_ctrl(struct inode *inode, struct file *filp); bool hl_device_operational(struct hl_device *hdev, enum hl_device_status *status); @@ -3972,12 +3981,9 @@ void hl_enable_err_info_capture(struct hl_error_info *captured_err_info); #ifdef CONFIG_DEBUG_FS -void hl_debugfs_init(void); -void hl_debugfs_fini(void); int hl_debugfs_device_init(struct hl_device *hdev); void hl_debugfs_device_fini(struct hl_device *hdev); void hl_debugfs_add_device(struct hl_device *hdev); -void hl_debugfs_remove_device(struct hl_device *hdev); void hl_debugfs_add_file(struct hl_fpriv *hpriv); void hl_debugfs_remove_file(struct hl_fpriv *hpriv); void hl_debugfs_add_cb(struct hl_cb *cb); @@ -3996,22 +4002,10 @@ void hl_debugfs_set_state_dump(struct hl_device *hdev, char *data, #else -static inline void __init hl_debugfs_init(void) -{ -} - -static inline void hl_debugfs_fini(void) -{ -} - static inline void hl_debugfs_add_device(struct hl_device *hdev) { } -static inline void hl_debugfs_remove_device(struct hl_device *hdev) -{ -} - static inline void hl_debugfs_add_file(struct hl_fpriv *hpriv) { } diff --git a/drivers/accel/habanalabs/common/habanalabs_drv.c b/drivers/accel/habanalabs/common/habanalabs_drv.c index 7263e84c1a4d..6341b8362b3e 100644 --- a/drivers/accel/habanalabs/common/habanalabs_drv.c +++ b/drivers/accel/habanalabs/common/habanalabs_drv.c @@ -14,6 +14,10 @@ #include 
#include #include +#include + +#include +#include #define CREATE_TRACE_POINTS #include @@ -27,7 +31,6 @@ MODULE_DESCRIPTION(HL_DRIVER_DESC); MODULE_LICENSE("GPL v2"); static int hl_major; -static struct class *hl_class; static DEFINE_IDR(hl_devs_idr); static DEFINE_MUTEX(hl_devs_idr_lock); @@ -70,6 +73,31 @@ static const struct pci_device_id ids[] = { }; MODULE_DEVICE_TABLE(pci, ids); +static const struct file_operations hl_fops = { + .owner = THIS_MODULE, + .open = accel_open, + .release = drm_release, + .unlocked_ioctl = hl_ioctl, + .compat_ioctl = hl_ioctl, + .llseek = noop_llseek, + .mmap = hl_mmap +}; + +static const struct drm_driver hl_driver = { + .driver_features = DRIVER_COMPUTE_ACCEL, + + .name = HL_NAME, + .desc = HL_DRIVER_DESC, + .major = LINUX_VERSION_MAJOR, + .minor = LINUX_VERSION_PATCHLEVEL, + .patchlevel = LINUX_VERSION_SUBLEVEL, + .date = "20190505", + + .fops = &hl_fops, + .open = hl_device_open, + .postclose = hl_device_release +}; + /* * get_asic_type - translate device id to asic type * @@ -123,43 +151,28 @@ static bool is_asic_secured(enum hl_asic_type asic_type) } /* - * hl_device_open - open function for habanalabs device - * - * @inode: pointer to inode structure - * @filp: pointer to file structure + * hl_device_open() - open function for habanalabs device. + * @ddev: pointer to DRM device structure. + * @file_priv: pointer to DRM file private data structure. * * Called when process opens an habanalabs device. 
*/ -int hl_device_open(struct inode *inode, struct file *filp) +int hl_device_open(struct drm_device *ddev, struct drm_file *file_priv) { + struct hl_device *hdev = to_hl_device(ddev); enum hl_device_status status; - struct hl_device *hdev; struct hl_fpriv *hpriv; int rc; - mutex_lock(&hl_devs_idr_lock); - hdev = idr_find(&hl_devs_idr, iminor(inode)); - mutex_unlock(&hl_devs_idr_lock); - - if (!hdev) { - pr_err("Couldn't find device %d:%d\n", - imajor(inode), iminor(inode)); - return -ENXIO; - } - hpriv = kzalloc(sizeof(*hpriv), GFP_KERNEL); if (!hpriv) return -ENOMEM; hpriv->hdev = hdev; - filp->private_data = hpriv; - hpriv->filp = filp; - mutex_init(&hpriv->notifier_event.lock); mutex_init(&hpriv->restore_phase_mutex); mutex_init(&hpriv->ctx_lock); kref_init(&hpriv->refcount); - nonseekable_open(inode, filp); hl_ctx_mgr_init(&hpriv->ctx_mgr); hl_mem_mgr_init(hpriv->hdev->dev, &hpriv->mem_mgr); @@ -225,6 +238,9 @@ int hl_device_open(struct inode *inode, struct file *filp) hdev->last_successful_open_jif = jiffies; hdev->last_successful_open_ktime = ktime_get(); + file_priv->driver_priv = hpriv; + hpriv->file_priv = file_priv; + return 0; out_err: @@ -232,7 +248,6 @@ int hl_device_open(struct inode *inode, struct file *filp) hl_mem_mgr_fini(&hpriv->mem_mgr); hl_mem_mgr_idr_destroy(&hpriv->mem_mgr); hl_ctx_mgr_fini(hpriv->hdev, &hpriv->ctx_mgr); - filp->private_data = NULL; mutex_destroy(&hpriv->ctx_lock); mutex_destroy(&hpriv->restore_phase_mutex); mutex_destroy(&hpriv->notifier_event.lock); @@ -268,7 +283,6 @@ int hl_device_open_ctrl(struct inode *inode, struct file *filp) */ hpriv->hdev = hdev; filp->private_data = hpriv; - hpriv->filp = filp; mutex_init(&hpriv->notifier_event.lock); nonseekable_open(inode, filp); @@ -317,7 +331,6 @@ static void copy_kernel_module_params_to_device(struct hl_device *hdev) hdev->asic_prop.fw_security_enabled = is_asic_secured(hdev->asic_type); hdev->major = hl_major; - hdev->hclass = hl_class; hdev->memory_scrub = memory_scrub; 
hdev->reset_on_lockup = reset_on_lockup; hdev->boot_error_status_mask = boot_error_status_mask; @@ -383,6 +396,31 @@ static int fixup_device_params(struct hl_device *hdev) return 0; } +static int allocate_device_id(struct hl_device *hdev) +{ + int id; + + mutex_lock(&hl_devs_idr_lock); + id = idr_alloc(&hl_devs_idr, hdev, 0, HL_MAX_MINORS, GFP_KERNEL); + mutex_unlock(&hl_devs_idr_lock); + + if (id < 0) { + if (id == -ENOSPC) + pr_err("too many devices in the system\n"); + return -EBUSY; + } + + hdev->id = id; + + /* + * Firstly initialized with the internal device ID. + * Will be updated later after the DRM device registration to hold the minor ID. + */ + hdev->cdev_idx = hdev->id; + + return 0; +} + /** * create_hdev - create habanalabs device instance * @@ -395,14 +433,16 @@ static int fixup_device_params(struct hl_device *hdev) */ static int create_hdev(struct hl_device **dev, struct pci_dev *pdev) { - int main_id, ctrl_id = 0, rc = 0; struct hl_device *hdev; + int rc; *dev = NULL; - hdev = kzalloc(sizeof(*hdev), GFP_KERNEL); - if (!hdev) - return -ENOMEM; + hdev = devm_drm_dev_alloc(&pdev->dev, &hl_driver, struct hl_device, drm); + if (IS_ERR(hdev)) + return PTR_ERR(hdev); + + hdev->dev = hdev->drm.dev; /* Will be NULL in case of simulator device */ hdev->pdev = pdev; @@ -425,7 +465,7 @@ static int create_hdev(struct hl_device **dev, struct pci_dev *pdev) if (hdev->asic_type == ASIC_INVALID) { dev_err(&pdev->dev, "Unsupported ASIC\n"); rc = -ENODEV; - goto free_hdev; + goto out_err; } copy_kernel_module_params_to_device(hdev); @@ -434,42 +474,15 @@ static int create_hdev(struct hl_device **dev, struct pci_dev *pdev) fixup_device_params(hdev); - mutex_lock(&hl_devs_idr_lock); - - /* Always save 2 numbers, 1 for main device and 1 for control. 
- * They must be consecutive - */ - main_id = idr_alloc(&hl_devs_idr, hdev, 0, HL_MAX_MINORS, GFP_KERNEL); - - if (main_id >= 0) - ctrl_id = idr_alloc(&hl_devs_idr, hdev, main_id + 1, - main_id + 2, GFP_KERNEL); - - mutex_unlock(&hl_devs_idr_lock); - - if ((main_id < 0) || (ctrl_id < 0)) { - if ((main_id == -ENOSPC) || (ctrl_id == -ENOSPC)) - pr_err("too many devices in the system\n"); - - if (main_id >= 0) { - mutex_lock(&hl_devs_idr_lock); - idr_remove(&hl_devs_idr, main_id); - mutex_unlock(&hl_devs_idr_lock); - } - - rc = -EBUSY; - goto free_hdev; - } - - hdev->id = main_id; - hdev->id_control = ctrl_id; + rc = allocate_device_id(hdev); + if (rc) + goto out_err; *dev = hdev; return 0; -free_hdev: - kfree(hdev); +out_err: return rc; } @@ -484,10 +497,8 @@ static void destroy_hdev(struct hl_device *hdev) /* Remove device from the device list */ mutex_lock(&hl_devs_idr_lock); idr_remove(&hl_devs_idr, hdev->id); - idr_remove(&hl_devs_idr, hdev->id_control); mutex_unlock(&hl_devs_idr_lock); - kfree(hdev); } static int hl_pmops_suspend(struct device *dev) @@ -691,28 +702,16 @@ static int __init hl_init(void) hl_major = MAJOR(dev); - hl_class = class_create(HL_NAME); - if (IS_ERR(hl_class)) { - pr_err("failed to allocate class\n"); - rc = PTR_ERR(hl_class); - goto remove_major; - } - - hl_debugfs_init(); - rc = pci_register_driver(&hl_pci_driver); if (rc) { pr_err("failed to register pci device\n"); - goto remove_debugfs; + goto remove_major; } pr_debug("driver loaded\n"); return 0; -remove_debugfs: - hl_debugfs_fini(); - class_destroy(hl_class); remove_major: unregister_chrdev_region(MKDEV(hl_major, 0), HL_MAX_MINORS); return rc; @@ -725,14 +724,6 @@ static void __exit hl_exit(void) { pci_unregister_driver(&hl_pci_driver); - /* - * Removing debugfs must be after all devices or simulator devices - * have been removed because otherwise we get a bug in the - * debugfs module for referencing NULL objects - */ - hl_debugfs_fini(); - - class_destroy(hl_class); 
unregister_chrdev_region(MKDEV(hl_major, 0), HL_MAX_MINORS); idr_destroy(&hl_devs_idr); diff --git a/drivers/accel/habanalabs/common/habanalabs_ioctl.c b/drivers/accel/habanalabs/common/habanalabs_ioctl.c index 097d65e493c8..28c3793e802f 100644 --- a/drivers/accel/habanalabs/common/habanalabs_ioctl.c +++ b/drivers/accel/habanalabs/common/habanalabs_ioctl.c @@ -1166,10 +1166,9 @@ static const struct hl_ioctl_desc hl_ioctls_control[] = { HL_IOCTL_DEF(HL_IOCTL_INFO, hl_info_ioctl_control) }; -static long _hl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg, - const struct hl_ioctl_desc *ioctl, struct device *dev) +static long _hl_ioctl(struct hl_fpriv *hpriv, unsigned int cmd, unsigned long arg, + const struct hl_ioctl_desc *ioctl, struct device *dev) { - struct hl_fpriv *hpriv = filep->private_data; unsigned int nr = _IOC_NR(cmd); char stack_kdata[128] = {0}; char *kdata = NULL; @@ -1235,7 +1234,8 @@ static long _hl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg, long hl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { - struct hl_fpriv *hpriv = filep->private_data; + struct drm_file *file_priv = filep->private_data; + struct hl_fpriv *hpriv = file_priv->driver_priv; struct hl_device *hdev = hpriv->hdev; const struct hl_ioctl_desc *ioctl = NULL; unsigned int nr = _IOC_NR(cmd); @@ -1256,7 +1256,7 @@ long hl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) return -ENOTTY; } - return _hl_ioctl(filep, cmd, arg, ioctl, hdev->dev); + return _hl_ioctl(hpriv, cmd, arg, ioctl, hdev->dev); } long hl_ioctl_control(struct file *filep, unsigned int cmd, unsigned long arg) @@ -1282,5 +1282,5 @@ long hl_ioctl_control(struct file *filep, unsigned int cmd, unsigned long arg) return -ENOTTY; } - return _hl_ioctl(filep, cmd, arg, ioctl, hdev->dev_ctrl); + return _hl_ioctl(hpriv, cmd, arg, ioctl, hdev->dev_ctrl); } diff --git a/drivers/accel/habanalabs/common/memory.c b/drivers/accel/habanalabs/common/memory.c index 
4fc72a07d2f5..45fdf39bfc8c 100644 --- a/drivers/accel/habanalabs/common/memory.c +++ b/drivers/accel/habanalabs/common/memory.c @@ -1818,7 +1818,7 @@ static void hl_release_dmabuf(struct dma_buf *dmabuf) hl_ctx_put(ctx); /* Paired with get_file() in export_dmabuf() */ - fput(ctx->hpriv->filp); + fput(ctx->hpriv->file_priv->filp); kfree(hl_dmabuf); } @@ -1864,7 +1864,7 @@ static int export_dmabuf(struct hl_ctx *ctx, * released first and only then the compute device. * Paired with fput() in hl_release_dmabuf(). */ - get_file(ctx->hpriv->filp); + get_file(ctx->hpriv->file_priv->filp); *dmabuf_fd = fd; From patchwork Tue Jul 11 11:12:23 2023 X-Patchwork-Submitter: Oded Gabbay X-Patchwork-Id: 13308421
From: Oded Gabbay To: dri-devel@lists.freedesktop.org Subject: [PATCH 09/12] accel/habanalabs: update sysfs-driver-habanalabs with the accel path Date: Tue, 11 Jul 2023 14:12:23 +0300 Message-Id: <20230711111226.163670-9-ogabbay@kernel.org> In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org> References: <20230711111226.163670-1-ogabbay@kernel.org> Cc: Tomer Tayar From: Tomer Tayar Replace "/sys/class/habanalabs/hl/..." with "/sys/class/accel/accel/device/...".
Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- .../ABI/testing/sysfs-driver-habanalabs | 64 +++++++++---------- 1 file changed, 32 insertions(+), 32 deletions(-) diff --git a/Documentation/ABI/testing/sysfs-driver-habanalabs b/Documentation/ABI/testing/sysfs-driver-habanalabs index 1b98b6503b23..c63ca1ad500d 100644 --- a/Documentation/ABI/testing/sysfs-driver-habanalabs +++ b/Documentation/ABI/testing/sysfs-driver-habanalabs @@ -1,4 +1,4 @@ -What: /sys/class/habanalabs/hl/armcp_kernel_ver +What: /sys/class/accel/accel/device/armcp_kernel_ver Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -6,7 +6,7 @@ Description: Version of the Linux kernel running on the device's CPU. Will be DEPRECATED in Linux kernel version 5.10, and be replaced with cpucp_kernel_ver -What: /sys/class/habanalabs/hl/armcp_ver +What: /sys/class/accel/accel/device/armcp_ver Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -14,7 +14,7 @@ Description: Version of the application running on the device's CPU Will be DEPRECATED in Linux kernel version 5.10, and be replaced with cpucp_ver -What: /sys/class/habanalabs/hl/clk_max_freq_mhz +What: /sys/class/accel/accel/device/clk_max_freq_mhz Date: Jun 2019 KernelVersion: 5.7 Contact: ogabbay@kernel.org @@ -24,58 +24,58 @@ Description: Allows the user to set the maximum clock frequency, in MHz. frequency value of the device clock. This property is valid only for the Gaudi ASIC family -What: /sys/class/habanalabs/hl/clk_cur_freq_mhz +What: /sys/class/accel/accel/device/clk_cur_freq_mhz Date: Jun 2019 KernelVersion: 5.7 Contact: ogabbay@kernel.org Description: Displays the current frequency, in MHz, of the device clock. 
This property is valid only for the Gaudi ASIC family -What: /sys/class/habanalabs/hl/cpld_ver +What: /sys/class/accel/accel/device/cpld_ver Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Version of the Device's CPLD F/W -What: /sys/class/habanalabs/hl/cpucp_kernel_ver +What: /sys/class/accel/accel/device/cpucp_kernel_ver Date: Oct 2020 KernelVersion: 5.10 Contact: ogabbay@kernel.org Description: Version of the Linux kernel running on the device's CPU -What: /sys/class/habanalabs/hl/cpucp_ver +What: /sys/class/accel/accel/device/cpucp_ver Date: Oct 2020 KernelVersion: 5.10 Contact: ogabbay@kernel.org Description: Version of the application running on the device's CPU -What: /sys/class/habanalabs/hl/device_type +What: /sys/class/accel/accel/device/device_type Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays the code name of the device according to its type. The supported values are: "GOYA" -What: /sys/class/habanalabs/hl/eeprom +What: /sys/class/accel/accel/device/eeprom Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: A binary file attribute that contains the contents of the on-board EEPROM -What: /sys/class/habanalabs/hl/fuse_ver +What: /sys/class/accel/accel/device/fuse_ver Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays the device's version from the eFuse -What: /sys/class/habanalabs/hl/fw_os_ver +What: /sys/class/accel/accel/device/fw_os_ver Date: Dec 2021 KernelVersion: 5.18 Contact: ogabbay@kernel.org Description: Version of the firmware OS running on the device's CPU -What: /sys/class/habanalabs/hl/hard_reset +What: /sys/class/accel/accel/device/hard_reset Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -83,14 +83,14 @@ Description: Interface to trigger a hard-reset operation for the device. 
Hard-reset will reset ALL internal components of the device except for the PCI interface and the internal PLLs -What: /sys/class/habanalabs/hl/hard_reset_cnt +What: /sys/class/accel/accel/device/hard_reset_cnt Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays how many times the device have undergone a hard-reset operation since the driver was loaded -What: /sys/class/habanalabs/hl/high_pll +What: /sys/class/accel/accel/device/high_pll Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -98,7 +98,7 @@ Description: Allows the user to set the maximum clock frequency for MME, TPC and IC when the power management profile is set to "automatic". This property is valid only for the Goya ASIC family -What: /sys/class/habanalabs/hl/ic_clk +What: /sys/class/accel/accel/device/ic_clk Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -110,27 +110,27 @@ Description: Allows the user to set the maximum clock frequency, in Hz, of frequency value of the IC. This property is valid only for the Goya ASIC family -What: /sys/class/habanalabs/hl/ic_clk_curr +What: /sys/class/accel/accel/device/ic_clk_curr Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays the current clock frequency, in Hz, of the Interconnect fabric. This property is valid only for the Goya ASIC family -What: /sys/class/habanalabs/hl/infineon_ver +What: /sys/class/accel/accel/device/infineon_ver Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Version of the Device's power supply F/W code. Relevant only to GOYA and GAUDI -What: /sys/class/habanalabs/hl/max_power +What: /sys/class/accel/accel/device/max_power Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Allows the user to set the maximum power consumption of the device in milliwatts. 
-What: /sys/class/habanalabs/hl/mme_clk +What: /sys/class/accel/accel/device/mme_clk Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -142,21 +142,21 @@ Description: Allows the user to set the maximum clock frequency, in Hz, of frequency value of the MME. This property is valid only for the Goya ASIC family -What: /sys/class/habanalabs/hl/mme_clk_curr +What: /sys/class/accel/accel/device/mme_clk_curr Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays the current clock frequency, in Hz, of the MME compute engine. This property is valid only for the Goya ASIC family -What: /sys/class/habanalabs/hl/pci_addr +What: /sys/class/accel/accel/device/pci_addr Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays the PCI address of the device. This is needed so the user would be able to open a device based on its PCI address -What: /sys/class/habanalabs/hl/pm_mng_profile +What: /sys/class/accel/accel/device/pm_mng_profile Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -170,19 +170,19 @@ Description: Power management profile. Values are "auto", "manual". In "auto" ic_clk, mme_clk and tpc_clk. This property is valid only for the Goya ASIC family -What: /sys/class/habanalabs/hl/preboot_btl_ver +What: /sys/class/accel/accel/device/preboot_btl_ver Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Version of the device's preboot F/W code -What: /sys/class/habanalabs/hl/security_enabled +What: /sys/class/accel/accel/device/security_enabled Date: Oct 2022 KernelVersion: 6.1 Contact: obitton@habana.ai Description: Displays the device's security status -What: /sys/class/habanalabs/hl/soft_reset +What: /sys/class/accel/accel/device/soft_reset Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -190,14 +190,14 @@ Description: Interface to trigger a soft-reset operation for the device. 
Soft-reset will reset only the compute and DMA engines of the device -What: /sys/class/habanalabs/hl/soft_reset_cnt +What: /sys/class/accel/accel/device/soft_reset_cnt Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays how many times the device have undergone a soft-reset operation since the driver was loaded -What: /sys/class/habanalabs/hl/status +What: /sys/class/accel/accel/device/status Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -215,13 +215,13 @@ Description: Status of the card: a compute-reset which is executed after a device release (relevant for Gaudi2 only). -What: /sys/class/habanalabs/hl/thermal_ver +What: /sys/class/accel/accel/device/thermal_ver Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Version of the Device's thermal daemon -What: /sys/class/habanalabs/hl/tpc_clk +What: /sys/class/accel/accel/device/tpc_clk Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -233,20 +233,20 @@ Description: Allows the user to set the maximum clock frequency, in Hz, of frequency value of the TPC. This property is valid only for Goya ASIC family -What: /sys/class/habanalabs/hl/tpc_clk_curr +What: /sys/class/accel/accel/device/tpc_clk_curr Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays the current clock frequency, in Hz, of the TPC compute engines. 
This property is valid only for the Goya ASIC family -What: /sys/class/habanalabs/hl/uboot_ver +What: /sys/class/accel/accel/device/uboot_ver Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Version of the u-boot running on the device's CPU -What: /sys/class/habanalabs/hl/vrm_ver +What: /sys/class/accel/accel/device/vrm_ver Date: Jan 2022 KernelVersion: 5.17 Contact: ogabbay@kernel.org From patchwork Tue Jul 11 11:12:24 2023 X-Patchwork-Submitter: Oded Gabbay X-Patchwork-Id: 13308424
From: Oded Gabbay To: dri-devel@lists.freedesktop.org Subject: [PATCH 10/12] accel/habanalabs: update debugfs-driver-habanalabs with the accel path Date: Tue, 11 Jul 2023 14:12:24 +0300 Message-Id: <20230711111226.163670-10-ogabbay@kernel.org> In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org> References: <20230711111226.163670-1-ogabbay@kernel.org> Cc: Tomer Tayar From: Tomer Tayar Replace "/sys/kernel/debug/habanalabs/hl/..." with "/sys/kernel/debug/accel//...". Signed-off-by: Tomer Tayar Reviewed-by: Oded Gabbay Signed-off-by: Oded Gabbay --- .../ABI/testing/debugfs-driver-habanalabs | 84 +++++++++---------- 1 file changed, 42 insertions(+), 42 deletions(-) diff --git a/Documentation/ABI/testing/debugfs-driver-habanalabs b/Documentation/ABI/testing/debugfs-driver-habanalabs index 85f6d04f528b..042fd125fbc9 100644 --- a/Documentation/ABI/testing/debugfs-driver-habanalabs +++ b/Documentation/ABI/testing/debugfs-driver-habanalabs @@ -1,4 +1,4 @@ -What: /sys/kernel/debug/habanalabs/hl/addr +What: /sys/kernel/debug/accel//addr Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -8,34 +8,34 @@ Description: Sets the device address to be used for read or write through only when the IOMMU is disabled.
The acceptable value is a string that starts with "0x" -What: /sys/kernel/debug/habanalabs/hl/clk_gate +What: /sys/kernel/debug/accel//clk_gate Date: May 2020 KernelVersion: 5.8 Contact: ogabbay@kernel.org Description: This setting is now deprecated as clock gating is handled solely by the f/w -What: /sys/kernel/debug/habanalabs/hl/command_buffers +What: /sys/kernel/debug/accel//command_buffers Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays a list with information about the currently allocated command buffers -What: /sys/kernel/debug/habanalabs/hl/command_submission +What: /sys/kernel/debug/accel//command_submission Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays a list with information about the currently active command submissions -What: /sys/kernel/debug/habanalabs/hl/command_submission_jobs +What: /sys/kernel/debug/accel//command_submission_jobs Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Displays a list with detailed information about each JOB (CB) of each active command submission -What: /sys/kernel/debug/habanalabs/hl/data32 +What: /sys/kernel/debug/accel//data32 Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -50,7 +50,7 @@ Description: Allows the root user to read or write directly through the If the IOMMU is disabled, it also allows the root user to read or write from the host a device VA of a host mapped memory -What: /sys/kernel/debug/habanalabs/hl/data64 +What: /sys/kernel/debug/accel//data64 Date: Jan 2020 KernelVersion: 5.6 Contact: ogabbay@kernel.org @@ -65,7 +65,7 @@ Description: Allows the root user to read or write 64 bit data directly If the IOMMU is disabled, it also allows the root user to read or write from the host a device VA of a host mapped memory -What: /sys/kernel/debug/habanalabs/hl/data_dma +What: /sys/kernel/debug/accel//data_dma Date: Apr 2021 KernelVersion: 5.13 Contact: ogabbay@kernel.org @@ -79,11 +79,11 @@ 
Description: Allows the root user to read from the device's internal a very long time. This interface doesn't support concurrency in the same device. In GAUDI and GOYA, this action can cause undefined behavior - in case the it is done while the device is executing user + in case it is done while the device is executing user workloads. Only supported on GAUDI at this stage. -What: /sys/kernel/debug/habanalabs/hl/device +What: /sys/kernel/debug/accel//device Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -91,14 +91,14 @@ Description: Enables the root user to set the device to specific state. Valid values are "disable", "enable", "suspend", "resume". User can read this property to see the valid values -What: /sys/kernel/debug/habanalabs/hl/device_release_watchdog_timeout +What: /sys/kernel/debug/accel//device_release_watchdog_timeout Date: Oct 2022 KernelVersion: 6.2 Contact: ttayar@habana.ai -Description: The watchdog timeout value in seconds for a device relese upon +Description: The watchdog timeout value in seconds for a device release upon certain error cases, after which the device is reset. -What: /sys/kernel/debug/habanalabs/hl/dma_size +What: /sys/kernel/debug/accel//dma_size Date: Apr 2021 KernelVersion: 5.13 Contact: ogabbay@kernel.org @@ -108,7 +108,7 @@ Description: Specify the size of the DMA transaction when using DMA to read When the write is finished, the user can read the "data_dma" blob -What: /sys/kernel/debug/habanalabs/hl/dump_razwi_events +What: /sys/kernel/debug/accel//dump_razwi_events Date: Aug 2022 KernelVersion: 5.20 Contact: fkassabri@habana.ai @@ -117,7 +117,7 @@ Description: Dumps all razwi events to dmesg if exist. the routine will clear the status register. 
Usage: cat dump_razwi_events -What: /sys/kernel/debug/habanalabs/hl/dump_security_violations +What: /sys/kernel/debug/accel//dump_security_violations Date: Jan 2021 KernelVersion: 5.12 Contact: ogabbay@kernel.org @@ -125,14 +125,14 @@ Description: Dumps all security violations to dmesg. This will also ack all security violations meanings those violations will not be dumped next time user calls this API -What: /sys/kernel/debug/habanalabs/hl/engines +What: /sys/kernel/debug/accel//engines Date: Jul 2019 KernelVersion: 5.3 Contact: ogabbay@kernel.org Description: Displays the status registers values of the device engines and their derived idle status -What: /sys/kernel/debug/habanalabs/hl/i2c_addr +What: /sys/kernel/debug/accel//i2c_addr Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -140,7 +140,7 @@ Description: Sets I2C device address for I2C transaction that is generated by the device's CPU, Not available when device is loaded with secured firmware -What: /sys/kernel/debug/habanalabs/hl/i2c_bus +What: /sys/kernel/debug/accel//i2c_bus Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -148,7 +148,7 @@ Description: Sets I2C bus address for I2C transaction that is generated by the device's CPU, Not available when device is loaded with secured firmware -What: /sys/kernel/debug/habanalabs/hl/i2c_data +What: /sys/kernel/debug/accel//i2c_data Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -157,7 +157,7 @@ Description: Triggers an I2C transaction that is generated by the device's reading from the file generates a read transaction, Not available when device is loaded with secured firmware -What: /sys/kernel/debug/habanalabs/hl/i2c_len +What: /sys/kernel/debug/accel//i2c_len Date: Dec 2021 KernelVersion: 5.17 Contact: obitton@habana.ai @@ -165,7 +165,7 @@ Description: Sets I2C length in bytes for I2C transaction that is generated b the device's CPU, Not available when device is loaded with secured firmware -What: 
/sys/kernel/debug/habanalabs/hl/i2c_reg +What: /sys/kernel/debug/accel//i2c_reg Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -173,35 +173,35 @@ Description: Sets I2C register id for I2C transaction that is generated by the device's CPU, Not available when device is loaded with secured firmware -What: /sys/kernel/debug/habanalabs/hl/led0 +What: /sys/kernel/debug/accel//led0 Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Sets the state of the first S/W led on the device, Not available when device is loaded with secured firmware -What: /sys/kernel/debug/habanalabs/hl/led1 +What: /sys/kernel/debug/accel//led1 Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Sets the state of the second S/W led on the device, Not available when device is loaded with secured firmware -What: /sys/kernel/debug/habanalabs/hl/led2 +What: /sys/kernel/debug/accel//led2 Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Sets the state of the third S/W led on the device, Not available when device is loaded with secured firmware -What: /sys/kernel/debug/habanalabs/hl/memory_scrub +What: /sys/kernel/debug/accel//memory_scrub Date: May 2022 KernelVersion: 5.19 Contact: dhirschfeld@habana.ai Description: Allows the root user to scrub the dram memory. The scrubbing value can be set using the debugfs file memory_scrub_val. 
-What: /sys/kernel/debug/habanalabs/hl/memory_scrub_val +What: /sys/kernel/debug/accel//memory_scrub_val Date: May 2022 KernelVersion: 5.19 Contact: dhirschfeld@habana.ai @@ -209,7 +209,7 @@ Description: The value to which the dram will be set to when the user scrubs the dram using 'memory_scrub' debugfs file and the scrubbing value when using module param 'memory_scrub' -What: /sys/kernel/debug/habanalabs/hl/mmu +What: /sys/kernel/debug/accel//mmu Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org @@ -217,19 +217,19 @@ Description: Displays the hop values and physical address for a given ASID and virtual address. The user should write the ASID and VA into the file and then read the file to get the result. e.g. to display info about VA 0x1000 for ASID 1 you need to do: - echo "1 0x1000" > /sys/kernel/debug/habanalabs/hl0/mmu + echo "1 0x1000" > /sys/kernel/debug/accel/0/mmu -What: /sys/kernel/debug/habanalabs/hl/mmu_error +What: /sys/kernel/debug/accel//mmu_error Date: Mar 2021 KernelVersion: 5.12 Contact: fkassabri@habana.ai Description: Check and display page fault or access violation mmu errors for all MMUs specified in mmu_cap_mask. e.g. to display error info for MMU hw cap bit 9, you need to do: - echo "0x200" > /sys/kernel/debug/habanalabs/hl0/mmu_error - cat /sys/kernel/debug/habanalabs/hl0/mmu_error + echo "0x200" > /sys/kernel/debug/accel/0/mmu_error + cat /sys/kernel/debug/accel/0/mmu_error -What: /sys/kernel/debug/habanalabs/hl/monitor_dump +What: /sys/kernel/debug/accel//monitor_dump Date: Mar 2022 KernelVersion: 5.19 Contact: osharabi@habana.ai @@ -243,7 +243,7 @@ Description: Allows the root user to dump monitors status from the device's This interface doesn't support concurrency in the same device. Only supported on GAUDI. 
-What: /sys/kernel/debug/habanalabs/hl/monitor_dump_trig +What: /sys/kernel/debug/accel//monitor_dump_trig Date: Mar 2022 KernelVersion: 5.19 Contact: osharabi@habana.ai @@ -253,14 +253,14 @@ Description: Triggers dump of monitor data. The value to trigger the operatio When the write is finished, the user can read the "monitor_dump" blob -What: /sys/kernel/debug/habanalabs/hl/set_power_state +What: /sys/kernel/debug/accel//set_power_state Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org Description: Sets the PCI power state. Valid values are "1" for D0 and "2" for D3Hot -What: /sys/kernel/debug/habanalabs/hl/skip_reset_on_timeout +What: /sys/kernel/debug/accel//skip_reset_on_timeout Date: Jun 2021 KernelVersion: 5.13 Contact: ynudelman@habana.ai @@ -268,7 +268,7 @@ Description: Sets the skip reset on timeout option for the device. Value of "0" means device will be reset in case some CS has timed out, otherwise it will not be reset. -What: /sys/kernel/debug/habanalabs/hl/state_dump +What: /sys/kernel/debug/accel//state_dump Date: Oct 2021 KernelVersion: 5.15 Contact: ynudelman@habana.ai @@ -279,7 +279,7 @@ Description: Gets the state dump occurring on a CS timeout or failure. Writing an integer X discards X state dumps, so that the next read would return X+1-st newest state dump. -What: /sys/kernel/debug/habanalabs/hl/stop_on_err +What: /sys/kernel/debug/accel//stop_on_err Date: Mar 2020 KernelVersion: 5.6 Contact: ogabbay@kernel.org @@ -287,21 +287,21 @@ Description: Sets the stop-on_error option for the device engines. Value of "0" is for disable, otherwise enable. Relevant only for GOYA and GAUDI. -What: /sys/kernel/debug/habanalabs/hl/timeout_locked +What: /sys/kernel/debug/accel//timeout_locked Date: Sep 2021 KernelVersion: 5.16 Contact: obitton@habana.ai Description: Sets the command submission timeout value in seconds. 
-What: /sys/kernel/debug/habanalabs/hl/userptr +What: /sys/kernel/debug/accel//userptr Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org -Description: Displays a list with information about the currently user +Description: Displays a list with information about the current user pointers (user virtual addresses) that are pinned and mapped to DMA addresses -What: /sys/kernel/debug/habanalabs/hl/userptr_lookup +What: /sys/kernel/debug/accel//userptr_lookup Date: Oct 2021 KernelVersion: 5.15 Contact: ogabbay@kernel.org @@ -309,7 +309,7 @@ Description: Allows to search for specific user pointers (user virtual addresses) that are pinned and mapped to DMA addresses, and see their resolution to the specific dma address. -What: /sys/kernel/debug/habanalabs/hl/vm +What: /sys/kernel/debug/accel//vm Date: Jan 2019 KernelVersion: 5.1 Contact: ogabbay@kernel.org From patchwork Tue Jul 11 11:12:25 2023 X-Patchwork-Submitter: Oded Gabbay X-Patchwork-Id: 13308426
From: Oded Gabbay To: dri-devel@lists.freedesktop.org Subject: [PATCH 11/12] accel/habanalabs: Move ioctls to the device specific ioctls range Date: Tue, 11 Jul 2023 14:12:25 +0300 Message-Id: <20230711111226.163670-11-ogabbay@kernel.org> In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org> References: <20230711111226.163670-1-ogabbay@kernel.org> Cc: Tomer Tayar From: Tomer Tayar To use drm_ioctl(), move the ioctls to the device specific ioctls range at [DRM_COMMAND_BASE, DRM_COMMAND_END).
Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 .../accel/habanalabs/common/command_buffer.c  |  5 +-
 .../habanalabs/common/command_submission.c    |  6 ++-
 drivers/accel/habanalabs/common/habanalabs.h  | 11 ++--
 .../accel/habanalabs/common/habanalabs_drv.c  | 18 +++++--
 .../habanalabs/common/habanalabs_ioctl.c      | 53 ++++---------------
 drivers/accel/habanalabs/common/memory.c      |  3 +-
 include/uapi/drm/habanalabs_accel.h           | 39 +++++++-------
 7 files changed, 59 insertions(+), 76 deletions(-)

diff --git a/drivers/accel/habanalabs/common/command_buffer.c b/drivers/accel/habanalabs/common/command_buffer.c
index 08f7aee42624..0f0d295116e7 100644
--- a/drivers/accel/habanalabs/common/command_buffer.c
+++ b/drivers/accel/habanalabs/common/command_buffer.c
@@ -361,10 +361,11 @@ static int hl_cb_info(struct hl_mem_mgr *mmg,
 	return rc;
 }
 
-int hl_cb_ioctl(struct hl_fpriv *hpriv, void *data)
+int hl_cb_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv)
 {
-	union hl_cb_args *args = data;
+	struct hl_fpriv *hpriv = file_priv->driver_priv;
 	struct hl_device *hdev = hpriv->hdev;
+	union hl_cb_args *args = data;
 	u64 handle = 0, device_va = 0;
 	enum hl_device_status status;
 	u32 usage_cnt = 0;
diff --git a/drivers/accel/habanalabs/common/command_submission.c b/drivers/accel/habanalabs/common/command_submission.c
index cfbf5fe72bb1..0291a79c06ab 100644
--- a/drivers/accel/habanalabs/common/command_submission.c
+++ b/drivers/accel/habanalabs/common/command_submission.c
@@ -2557,8 +2557,9 @@ static int cs_ioctl_flush_pci_hbw_writes(struct hl_fpriv *hpriv)
 	return 0;
 }
 
-int hl_cs_ioctl(struct hl_fpriv *hpriv, void *data)
+int hl_cs_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv)
 {
+	struct hl_fpriv *hpriv = file_priv->driver_priv;
 	union hl_cs_args *args = data;
 	enum hl_cs_type cs_type = 0;
 	u64 cs_seq = ULONG_MAX;
@@ -3718,8 +3719,9 @@ static int hl_interrupt_wait_ioctl(struct hl_fpriv *hpriv, void *data)
 	return 0;
 }
 
-int hl_wait_ioctl(struct hl_fpriv *hpriv, void *data)
+int hl_wait_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv)
 {
+	struct hl_fpriv *hpriv = file_priv->driver_priv;
 	struct hl_device *hdev = hpriv->hdev;
 	union hl_wait_cs_args *args = data;
 	u32 flags = args->in.flags;
diff --git a/drivers/accel/habanalabs/common/habanalabs.h b/drivers/accel/habanalabs/common/habanalabs.h
index 58948044ad16..834f8cbf080a 100644
--- a/drivers/accel/habanalabs/common/habanalabs.h
+++ b/drivers/accel/habanalabs/common/habanalabs.h
@@ -4117,11 +4117,12 @@ void hl_ack_pb_single_dcore(struct hl_device *hdev, u32 dcore_offset,
 		const u32 pb_blocks[], u32 blocks_array_size);
 
 /* IOCTLs */
-long hl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg);
 long hl_ioctl_control(struct file *filep, unsigned int cmd, unsigned long arg);
-int hl_cb_ioctl(struct hl_fpriv *hpriv, void *data);
-int hl_cs_ioctl(struct hl_fpriv *hpriv, void *data);
-int hl_wait_ioctl(struct hl_fpriv *hpriv, void *data);
-int hl_mem_ioctl(struct hl_fpriv *hpriv, void *data);
+int hl_info_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv);
+int hl_cb_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv);
+int hl_cs_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv);
+int hl_wait_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv);
+int hl_mem_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv);
+int hl_debug_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv);
 
 #endif /* HABANALABSP_H_ */
diff --git a/drivers/accel/habanalabs/common/habanalabs_drv.c b/drivers/accel/habanalabs/common/habanalabs_drv.c
index 6341b8362b3e..7e66f623f350 100644
--- a/drivers/accel/habanalabs/common/habanalabs_drv.c
+++ b/drivers/accel/habanalabs/common/habanalabs_drv.c
@@ -18,6 +18,7 @@
 
 #include
 #include
+#include
 
 #define CREATE_TRACE_POINTS
 #include
@@ -73,12 +74,21 @@ static const struct pci_device_id ids[] = {
 };
 MODULE_DEVICE_TABLE(pci, ids);
 
+static const struct drm_ioctl_desc hl_drm_ioctls[] = {
+	DRM_IOCTL_DEF_DRV(HL_INFO, hl_info_ioctl, 0),
+	DRM_IOCTL_DEF_DRV(HL_CB, hl_cb_ioctl, 0),
+	DRM_IOCTL_DEF_DRV(HL_CS, hl_cs_ioctl, 0),
+	DRM_IOCTL_DEF_DRV(HL_WAIT_CS, hl_wait_ioctl, 0),
+	DRM_IOCTL_DEF_DRV(HL_MEMORY, hl_mem_ioctl, 0),
+	DRM_IOCTL_DEF_DRV(HL_DEBUG, hl_debug_ioctl, 0),
+};
+
 static const struct file_operations hl_fops = {
 	.owner = THIS_MODULE,
 	.open = accel_open,
 	.release = drm_release,
-	.unlocked_ioctl = hl_ioctl,
-	.compat_ioctl = hl_ioctl,
+	.unlocked_ioctl = drm_ioctl,
+	.compat_ioctl = drm_compat_ioctl,
 	.llseek = noop_llseek,
 	.mmap = hl_mmap
 };
@@ -95,7 +105,9 @@ static const struct drm_driver hl_driver = {
 
 	.fops = &hl_fops,
 	.open = hl_device_open,
-	.postclose = hl_device_release
+	.postclose = hl_device_release,
+	.ioctls = hl_drm_ioctls,
+	.num_ioctls = ARRAY_SIZE(hl_drm_ioctls)
 };
 
 /*
diff --git a/drivers/accel/habanalabs/common/habanalabs_ioctl.c b/drivers/accel/habanalabs/common/habanalabs_ioctl.c
index 28c3793e802f..87a6a0c0c48a 100644
--- a/drivers/accel/habanalabs/common/habanalabs_ioctl.c
+++ b/drivers/accel/habanalabs/common/habanalabs_ioctl.c
@@ -1095,8 +1095,10 @@ static int _hl_info_ioctl(struct hl_fpriv *hpriv, void *data,
 	return rc;
 }
 
-static int hl_info_ioctl(struct hl_fpriv *hpriv, void *data)
+int hl_info_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv)
 {
+	struct hl_fpriv *hpriv = file_priv->driver_priv;
+
 	return _hl_info_ioctl(hpriv, data, hpriv->hdev->dev);
 }
 
@@ -1105,10 +1107,11 @@ static int hl_info_ioctl_control(struct hl_fpriv *hpriv, void *data)
 	return _hl_info_ioctl(hpriv, data, hpriv->hdev->dev_ctrl);
 }
 
-static int hl_debug_ioctl(struct hl_fpriv *hpriv, void *data)
+int hl_debug_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv)
 {
-	struct hl_debug_args *args = data;
+	struct hl_fpriv *hpriv = file_priv->driver_priv;
 	struct hl_device *hdev = hpriv->hdev;
+	struct hl_debug_args *args = data;
 	enum hl_device_status status;
 	int rc = 0;
 
@@ -1151,19 +1154,10 @@ static int hl_debug_ioctl(struct hl_fpriv *hpriv, void *data)
 }
 
 #define HL_IOCTL_DEF(ioctl, _func) \
-	[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func}
-
-static const struct hl_ioctl_desc hl_ioctls[] = {
-	HL_IOCTL_DEF(HL_IOCTL_INFO, hl_info_ioctl),
-	HL_IOCTL_DEF(HL_IOCTL_CB, hl_cb_ioctl),
-	HL_IOCTL_DEF(HL_IOCTL_CS, hl_cs_ioctl),
-	HL_IOCTL_DEF(HL_IOCTL_WAIT_CS, hl_wait_ioctl),
-	HL_IOCTL_DEF(HL_IOCTL_MEMORY, hl_mem_ioctl),
-	HL_IOCTL_DEF(HL_IOCTL_DEBUG, hl_debug_ioctl)
-};
+	[_IOC_NR(ioctl) - HL_COMMAND_START] = {.cmd = ioctl, .func = _func}
 
 static const struct hl_ioctl_desc hl_ioctls_control[] = {
-	HL_IOCTL_DEF(HL_IOCTL_INFO, hl_info_ioctl_control)
+	HL_IOCTL_DEF(DRM_IOCTL_HL_INFO, hl_info_ioctl_control)
 };
 
 static long _hl_ioctl(struct hl_fpriv *hpriv, unsigned int cmd, unsigned long arg,
@@ -1232,33 +1226,6 @@ static long _hl_ioctl(struct hl_fpriv *hpriv, unsigned int cmd, unsigned long ar
 	return retcode;
 }
 
-long hl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
-{
-	struct drm_file *file_priv = filep->private_data;
-	struct hl_fpriv *hpriv = file_priv->driver_priv;
-	struct hl_device *hdev = hpriv->hdev;
-	const struct hl_ioctl_desc *ioctl = NULL;
-	unsigned int nr = _IOC_NR(cmd);
-
-	if (!hdev) {
-		pr_err_ratelimited("Sending ioctl after device was removed! Please close FD\n");
-		return -ENODEV;
-	}
-
-	if ((nr >= HL_COMMAND_START) && (nr < HL_COMMAND_END)) {
-		ioctl = &hl_ioctls[nr];
-	} else {
-		char task_comm[TASK_COMM_LEN];
-
-		dev_dbg_ratelimited(hdev->dev,
-			"invalid ioctl: pid=%d, comm=\"%s\", cmd=%#010x, nr=%#04x\n",
-			task_pid_nr(current), get_task_comm(task_comm, current), cmd, nr);
-		return -ENOTTY;
-	}
-
-	return _hl_ioctl(hpriv, cmd, arg, ioctl, hdev->dev);
-}
-
 long hl_ioctl_control(struct file *filep, unsigned int cmd, unsigned long arg)
 {
 	struct hl_fpriv *hpriv = filep->private_data;
@@ -1271,8 +1238,8 @@ long hl_ioctl_control(struct file *filep, unsigned int cmd, unsigned long arg)
 		return -ENODEV;
 	}
 
-	if (nr == _IOC_NR(HL_IOCTL_INFO)) {
-		ioctl = &hl_ioctls_control[nr];
+	if (nr == _IOC_NR(DRM_IOCTL_HL_INFO)) {
+		ioctl = &hl_ioctls_control[nr - HL_COMMAND_START];
 	} else {
 		char task_comm[TASK_COMM_LEN];
 
diff --git a/drivers/accel/habanalabs/common/memory.c b/drivers/accel/habanalabs/common/memory.c
index 45fdf39bfc8c..1b1b4256b011 100644
--- a/drivers/accel/habanalabs/common/memory.c
+++ b/drivers/accel/habanalabs/common/memory.c
@@ -2171,8 +2171,9 @@ static int allocate_timestamps_buffers(struct hl_fpriv *hpriv, struct hl_mem_in
 	return 0;
 }
 
-int hl_mem_ioctl(struct hl_fpriv *hpriv, void *data)
+int hl_mem_ioctl(struct drm_device *ddev, void *data, struct drm_file *file_priv)
 {
+	struct hl_fpriv *hpriv = file_priv->driver_priv;
 	enum hl_device_status status;
 	union hl_mem_args *args = data;
 	struct hl_device *hdev = hpriv->hdev;
diff --git a/include/uapi/drm/habanalabs_accel.h b/include/uapi/drm/habanalabs_accel.h
index f912869b151e..e7893b082bf8 100644
--- a/include/uapi/drm/habanalabs_accel.h
+++ b/include/uapi/drm/habanalabs_accel.h
@@ -8,8 +8,7 @@
 #ifndef HABANALABS_H_
 #define HABANALABS_H_
 
-#include
-#include
+#include
 
 /*
  * Defines that are asic-specific but constitutes as ABI between kernel driver
@@ -607,9 +606,9 @@ enum gaudi2_engine_id {
 /*
  * ASIC specific PLL index
  *
- * Used to retrieve in frequency info of different IPs via
- * HL_INFO_PLL_FREQUENCY under HL_IOCTL_INFO IOCTL. The enums need to be
- * used as an index in struct hl_pll_frequency_info
+ * Used to retrieve in frequency info of different IPs via HL_INFO_PLL_FREQUENCY under
+ * DRM_IOCTL_HL_INFO IOCTL.
+ * The enums need to be used as an index in struct hl_pll_frequency_info.
  */
 
 enum hl_goya_pll_index {
@@ -2163,6 +2162,13 @@ struct hl_debug_args {
 	__u32 ctx_id;
 };
 
+#define HL_IOCTL_INFO	0x00
+#define HL_IOCTL_CB	0x01
+#define HL_IOCTL_CS	0x02
+#define HL_IOCTL_WAIT_CS	0x03
+#define HL_IOCTL_MEMORY	0x04
+#define HL_IOCTL_DEBUG	0x05
+
 /*
  * Various information operations such as:
  * - H/W IP information
@@ -2177,8 +2183,7 @@ struct hl_debug_args {
  * definitions of structures in kernel and userspace, e.g. in case of old
  * userspace and new kernel driver
  */
-#define HL_IOCTL_INFO	\
-		_IOWR('H', 0x01, struct hl_info_args)
+#define DRM_IOCTL_HL_INFO	DRM_IOWR(DRM_COMMAND_BASE + HL_IOCTL_INFO, struct hl_info_args)
 
 /*
  * Command Buffer
@@ -2199,8 +2204,7 @@ struct hl_debug_args {
  * and won't be returned to user.
  *
  */
-#define HL_IOCTL_CB	\
-		_IOWR('H', 0x02, union hl_cb_args)
+#define DRM_IOCTL_HL_CB	DRM_IOWR(DRM_COMMAND_BASE + HL_IOCTL_CB, union hl_cb_args)
 
 /*
  * Command Submission
@@ -2252,8 +2256,7 @@ struct hl_debug_args {
  * and only if CS N and CS N-1 are exactly the same (same CBs for the same
  * queues).
  */
-#define HL_IOCTL_CS	\
-		_IOWR('H', 0x03, union hl_cs_args)
+#define DRM_IOCTL_HL_CS	DRM_IOWR(DRM_COMMAND_BASE + HL_IOCTL_CS, union hl_cs_args)
 
 /*
  * Wait for Command Submission
@@ -2285,9 +2288,7 @@ struct hl_debug_args {
  * HL_WAIT_CS_STATUS_ABORTED - The CS was aborted, usually because the
  *                             device was reset (EIO)
  */
-
-#define HL_IOCTL_WAIT_CS	\
-		_IOWR('H', 0x04, union hl_wait_cs_args)
+#define DRM_IOCTL_HL_WAIT_CS	DRM_IOWR(DRM_COMMAND_BASE + HL_IOCTL_WAIT_CS, union hl_wait_cs_args)
 
 /*
  * Memory
@@ -2304,8 +2305,7 @@ struct hl_debug_args {
  * There is an option for the user to specify the requested virtual address.
  *
  */
-#define HL_IOCTL_MEMORY	\
-		_IOWR('H', 0x05, union hl_mem_args)
+#define DRM_IOCTL_HL_MEMORY	DRM_IOWR(DRM_COMMAND_BASE + HL_IOCTL_MEMORY, union hl_mem_args)
 
 /*
  * Debug
@@ -2331,10 +2331,9 @@ struct hl_debug_args {
  * The driver can decide to "kick out" the user if he abuses this interface.
  *
  */
-#define HL_IOCTL_DEBUG	\
-		_IOWR('H', 0x06, struct hl_debug_args)
+#define DRM_IOCTL_HL_DEBUG	DRM_IOWR(DRM_COMMAND_BASE + HL_IOCTL_DEBUG, struct hl_debug_args)
 
-#define HL_COMMAND_START	0x01
-#define HL_COMMAND_END		0x07
+#define HL_COMMAND_START	(DRM_COMMAND_BASE + HL_IOCTL_INFO)
+#define HL_COMMAND_END		(DRM_COMMAND_BASE + HL_IOCTL_DEBUG + 1)
 
 #endif /* HABANALABS_H_ */

From patchwork Tue Jul 11 11:12:26 2023
X-Patchwork-Submitter: Oded Gabbay
X-Patchwork-Id: 13308427
From: Oded Gabbay
To: dri-devel@lists.freedesktop.org
Subject: [PATCH 12/12] accel/habanalabs: release user interfaces earlier in device fini
Date: Tue, 11 Jul 2023 14:12:26 +0300
Message-Id: <20230711111226.163670-12-ogabbay@kernel.org>
In-Reply-To: <20230711111226.163670-1-ogabbay@kernel.org>
References: <20230711111226.163670-1-ogabbay@kernel.org>
Cc: Tomer Tayar

From: Tomer Tayar

Currently the sysfs/debugfs interfaces and device un-registration are
done as the last thing in hl_device_fini(), after several finalizations
and releases are done.
While a disabled flag is set at the beginning of hl_device_fini(), and
it is being checked when handling user accesses to these interfaces,
this check is not hermetic and it is better to just reverse the order of
the code in hl_device_fini().

Signed-off-by: Tomer Tayar
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
---
 drivers/accel/habanalabs/common/device.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c
index c0c9e9504672..5293ac3c7988 100644
--- a/drivers/accel/habanalabs/common/device.c
+++ b/drivers/accel/habanalabs/common/device.c
@@ -2408,6 +2408,12 @@ void hl_device_fini(struct hl_device *hdev)
 
 	hdev->fw_loader.fw_comp_loaded = FW_TYPE_NONE;
 
+	/* Hide devices and sysfs/debugfs files from user */
+	cdev_sysfs_debugfs_remove(hdev);
+	drm_dev_unregister(&hdev->drm);
+
+	hl_debugfs_device_fini(hdev);
+
 	/* Release kernel context */
 	if ((hdev->kernel_ctx) && (hl_ctx_put(hdev->kernel_ctx) != 1))
 		dev_err(hdev->dev, "kernel ctx is still alive\n");
@@ -2436,12 +2442,6 @@ void hl_device_fini(struct hl_device *hdev)
 
 	device_early_fini(hdev);
 
-	/* Hide devices and sysfs/debugfs files from user */
-	cdev_sysfs_debugfs_remove(hdev);
-	drm_dev_unregister(&hdev->drm);
-
-	hl_debugfs_device_fini(hdev);
-
 	pr_info("removed device successfully\n");
 }
 
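
The ordering argument in the commit message of patch 12 can be shown with a toy model. This is only a sketch under stated assumptions: `struct toy_device` and every `toy_` function are hypothetical and do not exist in the driver. They stand in for "interfaces visible to the user" and "state released during fini".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy device; none of these names exist in the driver. */
struct toy_device {
	bool interfaces_visible;	/* sysfs/debugfs/device nodes still reachable */
	int *resource;			/* stands in for state released during fini */
};

/*
 * Models a user access through one of the interfaces.
 * Returns -1 if the interface is already hidden, -2 if it is still
 * reachable but its backing state was released, else the stored value.
 */
static int toy_user_access(const struct toy_device *dev)
{
	if (!dev->interfaces_visible)
		return -1;
	if (dev->resource == NULL)
		return -2;
	return *dev->resource;
}

static void toy_hide_interfaces(struct toy_device *dev)
{
	dev->interfaces_visible = false;
}

static void toy_release_resources(struct toy_device *dev)
{
	dev->resource = NULL;
}

/* Order after the patch: hide the interfaces first, release state afterwards. */
static void toy_fini(struct toy_device *dev)
{
	toy_hide_interfaces(dev);
	/* From here on toy_user_access() can only report "hidden". */
	toy_release_resources(dev);
}
```

With the previous order (release first, hide last) there is a window in which `toy_user_access()` returns -2, that is, an interface is reachable while its backing state is already gone. Hiding first closes that window without relying on a separate disabled flag.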