From patchwork Thu Sep 10 12:44:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mansur Alisha Shaik X-Patchwork-Id: 11767769 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3A556698 for ; Thu, 10 Sep 2020 12:45:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 243F52078B for ; Thu, 10 Sep 2020 12:45:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730777AbgIJMpm (ORCPT ); Thu, 10 Sep 2020 08:45:42 -0400 Received: from alexa-out.qualcomm.com ([129.46.98.28]:45909 "EHLO alexa-out.qualcomm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730474AbgIJMo7 (ORCPT ); Thu, 10 Sep 2020 08:44:59 -0400 Received: from ironmsg09-lv.qualcomm.com ([10.47.202.153]) by alexa-out.qualcomm.com with ESMTP; 10 Sep 2020 05:44:35 -0700 Received: from ironmsg02-blr.qualcomm.com ([10.86.208.131]) by ironmsg09-lv.qualcomm.com with ESMTP/TLS/AES256-SHA; 10 Sep 2020 05:44:34 -0700 Received: from c-mansur-linux.qualcomm.com ([10.204.90.208]) by ironmsg02-blr.qualcomm.com with ESMTP; 10 Sep 2020 18:14:23 +0530 Received: by c-mansur-linux.qualcomm.com (Postfix, from userid 461723) id 5305E21D23; Thu, 10 Sep 2020 18:14:22 +0530 (IST) From: Mansur Alisha Shaik To: linux-media@vger.kernel.org, stanimir.varbanov@linaro.org Cc: linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, vgarodia@codeaurora.org, Mansur Alisha Shaik Subject: [PATCH v2 1/3] venus: core: handle race condititon for core ops Date: Thu, 10 Sep 2020 18:14:14 +0530 Message-Id: <1599741856-16239-2-git-send-email-mansur@codeaurora.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1599741856-16239-1-git-send-email-mansur@codeaurora.org> References: <1599741856-16239-1-git-send-email-mansur@codeaurora.org> Sender: linux-arm-msm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org For core ops we are having only write protect but there is no read protect, because of this in multthreading and concurrency, one CPU core is reading without wait which is causing the NULL pointer dereferece crash. one such scenario is as show below, where in one CPU core, core->ops becoming NULL and in another CPU core calling core->ops->session_init(). CPU: core-7: Call trace: hfi_session_init+0x180/0x1dc [venus_core] vdec_queue_setup+0x9c/0x364 [venus_dec] vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common] vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2] v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem] v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem] v4l_reqbufs+0x4c/0x5c __video_do_ioctl+0x2b0/0x39c CPU: core-0: Call trace: venus_shutdown+0x98/0xfc [venus_core] venus_sys_error_handler+0x64/0x148 [venus_core] process_one_work+0x210/0x3d0 worker_thread+0x248/0x3f4 kthread+0x11c/0x12c Signed-off-by: Mansur Alisha Shaik Acked-by: Stanimir Varbanov --- Changes in V2: - Addressed review comments by stan by validating on top - of https://lore.kernel.org/patchwork/project/lkml/list/?series=455962 drivers/media/platform/qcom/venus/hfi.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c index a59022a..3137071 100644 --- a/drivers/media/platform/qcom/venus/hfi.c +++ b/drivers/media/platform/qcom/venus/hfi.c @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create); int hfi_session_init(struct venus_inst *inst, u32 pixfmt) { struct venus_core *core = inst->core; - const struct hfi_ops *ops = core->ops; + const struct hfi_ops *ops; int ret; if (inst->state != INST_UNINIT) @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt) inst->hfi_codec = to_codec_type(pixfmt); reinit_completion(&inst->done); + mutex_lock(&core->lock); + ops = core->ops; ret = ops->session_init(inst, inst->session_type, inst->hfi_codec); if (ret) return ret; + mutex_unlock(&core->lock); ret = wait_session_msg(inst); if (ret) return ret; From patchwork Thu Sep 10 12:44:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mansur Alisha Shaik X-Patchwork-Id: 11767779 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 60AC2698 for ; Thu, 10 Sep 2020 12:48:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4C3C120882 for ; Thu, 10 Sep 2020 12:48:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730832AbgIJMqY (ORCPT ); Thu, 10 Sep 2020 08:46:24 -0400 Received: from alexa-out.qualcomm.com ([129.46.98.28]:11026 "EHLO alexa-out.qualcomm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730696AbgIJMpg (ORCPT ); Thu, 10 Sep 2020 08:45:36 -0400 Received: from ironmsg07-lv.qualcomm.com (HELO ironmsg07-lv.qulacomm.com) ([10.47.202.151]) by alexa-out.qualcomm.com with ESMTP; 10 Sep 2020 05:44:39 -0700 Received: from ironmsg02-blr.qualcomm.com ([10.86.208.131]) by ironmsg07-lv.qulacomm.com with ESMTP/TLS/AES256-SHA; 10 Sep 2020 05:44:37 -0700 Received: from c-mansur-linux.qualcomm.com ([10.204.90.208]) by ironmsg02-blr.qualcomm.com with ESMTP; 10 Sep 2020 18:14:25 +0530 Received: by c-mansur-linux.qualcomm.com (Postfix, from userid 461723) id DD94921D23; Thu, 10 Sep 2020 18:14:23 +0530 (IST) From: Mansur Alisha Shaik To: linux-media@vger.kernel.org, stanimir.varbanov@linaro.org Cc: linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, vgarodia@codeaurora.org, Mansur Alisha Shaik Subject: [PATCH v2 2/3] venus: core: cancel pending work items in workqueue Date: Thu, 10 Sep 2020 18:14:15 +0530 Message-Id: <1599741856-16239-3-git-send-email-mansur@codeaurora.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1599741856-16239-1-git-send-email-mansur@codeaurora.org> References: <1599741856-16239-1-git-send-email-mansur@codeaurora.org> Sender: linux-arm-msm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org In concurrency usecase and reboot scenario we are observing race condition and seeing NULL pointer dereference crash. In shutdown path and system recovery path we are destroying the same mutex hence seeing crash. This case is handled by mutex protection and cancel delayed work items in work queue. Below is the call trace for the crash Call trace: venus_remove+0xdc/0xec [venus_core] venus_core_shutdown+0x1c/0x34 [venus_core] platform_drv_shutdown+0x28/0x34 device_shutdown+0x154/0x1fc kernel_restart_prepare+0x40/0x4c kernel_restart+0x1c/0x64 Call trace: mutex_lock+0x34/0x60 venus_hfi_destroy+0x28/0x98 [venus_core] hfi_destroy+0x1c/0x28 [venus_core] venus_sys_error_handler+0x60/0x14c [venus_core] process_one_work+0x210/0x3d0 worker_thread+0x248/0x3f4 kthread+0x11c/0x12c ret_from_fork+0x10/0x18 Signed-off-by: Mansur Alisha Shaik --- drivers/media/platform/qcom/venus/core.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c index c5af428..69aa199 100644 --- a/drivers/media/platform/qcom/venus/core.c +++ b/drivers/media/platform/qcom/venus/core.c @@ -323,6 +323,8 @@ static int venus_remove(struct platform_device *pdev) struct device *dev = core->dev; int ret; + cancel_delayed_work_sync(&core->work); + ret = pm_runtime_get_sync(dev); WARN_ON(ret < 0); @@ -340,7 +342,9 @@ static int venus_remove(struct platform_device *pdev) if (pm_ops->core_put) pm_ops->core_put(dev); + mutex_lock(&core->lock); hfi_destroy(core); + mutex_unlock(&core->lock); icc_put(core->video_path); icc_put(core->cpucfg_path); From patchwork Thu Sep 10 12:44:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mansur Alisha Shaik X-Patchwork-Id: 11767777 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D6653698 for ; Thu, 10 Sep 2020 12:47:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C13FE20882 for ; Thu, 10 Sep 2020 12:47:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730813AbgIJMqn (ORCPT ); Thu, 10 Sep 2020 08:46:43 -0400 Received: from alexa-out.qualcomm.com ([129.46.98.28]:45050 "EHLO alexa-out.qualcomm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730684AbgIJMo7 (ORCPT ); Thu, 10 Sep 2020 08:44:59 -0400 Received: from ironmsg09-lv.qualcomm.com ([10.47.202.153]) by alexa-out.qualcomm.com with ESMTP; 10 Sep 2020 05:44:39 -0700 Received: from ironmsg02-blr.qualcomm.com ([10.86.208.131]) by ironmsg09-lv.qualcomm.com with ESMTP/TLS/AES256-SHA; 10 Sep 2020 05:44:37 -0700 Received: from c-mansur-linux.qualcomm.com ([10.204.90.208]) by ironmsg02-blr.qualcomm.com with ESMTP; 10 Sep 2020 18:14:26 +0530 Received: by c-mansur-linux.qualcomm.com (Postfix, from userid 461723) id 4B51221D23; Thu, 10 Sep 2020 18:14:25 +0530 (IST) From: Mansur Alisha Shaik To: linux-media@vger.kernel.org, stanimir.varbanov@linaro.org Cc: linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, vgarodia@codeaurora.org, Mansur Alisha Shaik Subject: [PATCH v2 3/3] venus: handle use after free for iommu_map/iommu_unmap Date: Thu, 10 Sep 2020 18:14:16 +0530 Message-Id: <1599741856-16239-4-git-send-email-mansur@codeaurora.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1599741856-16239-1-git-send-email-mansur@codeaurora.org> References: <1599741856-16239-1-git-send-email-mansur@codeaurora.org> Sender: linux-arm-msm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org In concurrency usecase and reboot scenario we are seeing muliple crashes related to iommu_map/iommu_unamp of core->fw.iommu_domain. In one case we are seeing "Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008" crash, this is because of core->fw.iommu_domain in venus_firmware_deinit() and trying to map in venus_boot() during venus_sys_error_handler() Call trace: __iommu_map+0x4c/0x348 iommu_map+0x5c/0x70 venus_boot+0x184/0x230 [venus_core] venus_sys_error_handler+0xa0/0x14c [venus_core] process_one_work+0x210/0x3d0 worker_thread+0x248/0x3f4 kthread+0x11c/0x12c ret_from_fork+0x10/0x18 In second case we are seeing "Unable to handle kernel paging request at virtual address 006b6b6b6b6b6b9b" crash, this is because of unmappin iommu domain which is already unmapped. Call trace: venus_remove+0xf8/0x108 [venus_core] venus_core_shutdown+0x1c/0x34 [venus_core] platform_drv_shutdown+0x28/0x34 device_shutdown+0x154/0x1fc kernel_restart_prepare+0x40/0x4c kernel_restart+0x1c/0x64 __arm64_sys_reboot+0x190/0x238 el0_svc_common+0xa4/0x154 el0_svc_compat_handler+0x2c/0x38 el0_svc_compat+0x8/0x10 Signed-off-by: Mansur Alisha Shaik --- Changes in V2: - Addressed review comments by stan - Elaborated commit message drivers/media/platform/qcom/venus/firmware.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c index 8801a6a..c427e88 100644 --- a/drivers/media/platform/qcom/venus/firmware.c +++ b/drivers/media/platform/qcom/venus/firmware.c @@ -171,9 +171,14 @@ static int venus_shutdown_no_tz(struct venus_core *core) iommu = core->fw.iommu_domain; - unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped); - if (unmapped != mapped) - dev_err(dev, "failed to unmap firmware\n"); + if (core->fw.mapped_mem_size && iommu) { + unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped); + + if (unmapped != mapped) + dev_err(dev, "failed to unmap firmware\n"); + else + core->fw.mapped_mem_size = 0; + } return 0; } @@ -288,7 +293,11 @@ void venus_firmware_deinit(struct venus_core *core) iommu = core->fw.iommu_domain; iommu_detach_device(iommu, core->fw.dev); - iommu_domain_free(iommu); + + if (core->fw.iommu_domain) { + iommu_domain_free(iommu); + core->fw.iommu_domain = NULL; + } platform_device_unregister(to_platform_device(core->fw.dev)); }