From patchwork Fri Apr 15 06:12:44 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alice Chao
X-Patchwork-Id: 12814603
From: Alice Chao
Subject: [PATCH v4 1/1] scsi: Fix racing between dev init and dev reset
Date: Fri, 15 Apr 2022 14:12:44 +0800
Message-ID: <20220415061243.30229-2-alice.chao@mediatek.com>
X-Mailer: git-send-email 2.18.0
X-Mailing-List: linux-scsi@vger.kernel.org

The device reset thread uses kobject_uevent_env() to get kobj.parent, and
it races with the device init thread, which calls device_add() to add
kobj.parent before kobject_uevent_env().

Device init call:           Device reset call:
 scsi_probe_and_add_lun()    scsi_evt_thread()
  scsi_add_lun()              scsi_evt_emit()
   scsi_sysfs_add_sdev()       kobject_uevent_env() //get kobj.parent
    scsi_target_add()           kobject_get_path()
                                 len = get_kobj_path_length () // len=1 because parent hasn't created yet
     device_add() // add kobj.parent
      kobject_uevent_env()
       kobject_get_path()        path = kzalloc()
        fill_kobj_path()         fill_kobj_path() // --length; length -= cur is a negative value
                                  memcpy(path + length, kobject_name(parent), cur); // slab OOB!

The call flow above describes the problem: the device reset thread gets a
wrong kobj.parent when the device init thread has not added kobj.parent
yet. When this race occurs, it triggers a KASAN dump on the final
iteration:

BUG: KASAN: slab-out-of-bounds in kobject_get_path+0xf8/0x1b8
Write of size 11 at addr ffffff80d6bb94f5 by task kworker/3:1/58

Call trace:
 __kasan_report+0x124/0x1c8
 kasan_report+0x54/0x84
 kasan_check_range+0x200/0x208
 memcpy+0xb8/0xf0
 kobject_get_path+0xf8/0x1b8
 kobject_uevent_env+0x228/0xa88
 scsi_evt_thread+0x2d0/0x5b0
 process_one_work+0x570/0xf94
 worker_thread+0x7cc/0xf80
 kthread+0x2c4/0x388

These two jobs are scheduled asynchronously, so we cannot guarantee that
kobj.parent is created in the device init thread before the device reset
thread calls kobject_get_path().

To resolve the race between the device init thread and the device reset
thread, use wait_event() in scsi_evt_emit() to wait for device_add() to
complete the creation of kobj.parent.
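
The length mismatch in the first call flow can be modelled in user space.
The following is a minimal sketch, not kernel code: struct node,
path_length() and final_offset() are hypothetical stand-ins for struct
kobject, get_kobj_path_length() and fill_kobj_path(), and the object names
and resulting numbers are illustrative assumptions, not values taken from
the kernel.

```c
#include <string.h>

/* Hypothetical stand-in for struct kobject: a name and a parent link. */
struct node {
	const char *name;
	struct node *parent;
};

/*
 * Models the idea of get_kobj_path_length(): one '/' plus the name for
 * every level that is currently linked, plus the trailing NUL.
 */
int path_length(const struct node *n)
{
	int length = 1;

	for (; n; n = n->parent)
		length += (int)strlen(n->name) + 1;
	return length;
}

/*
 * Models the offset arithmetic of fill_kobj_path() without writing any
 * memory, so it is safe to run. Returns the lowest buffer offset that
 * would be written for the given (possibly stale) length.
 */
int final_offset(const struct node *n, int length)
{
	--length;			/* skip the trailing NUL */
	for (; n; n = n->parent) {
		length -= (int)strlen(n->name);	/* room for the name */
		--length;			/* room for the '/' */
	}
	return length;
}
```

With a child named "0:0:0:0" and a parent named "target0:0:0",
path_length() on the still unparented child returns 9. If the parent is
linked afterwards and final_offset() is called with that stale length, it
returns -12, i.e. an offset in front of the buffer. Recomputing the length
after the parent is linked brings the final offset back to 0.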

Device init call:                Device reset call:
ufshcd_async_scan()              scsi_evt_thread()
 scsi_scan_host()                 scsi_evt_emit() <- add wait_event()
  do_scsi_scan_host() <- add wake_up()
   scsi_scan_host_selected()
    scsi_scan_channel()
     scsi_probe_and_add_lun()
      scsi_target_add()
       device_add() // add kobj.parent
        kobject_uevent_env()
         kobject_get_path()
          fill_kobj_path()
  // call wake_up() after scsi_scan_host_selected is done
                                   kobject_uevent_env() // add kobj.parent
                                    kobject_get_path() // get valid kobj.parent
                                     fill_kobj_path()

After we add wake_up() to do_scsi_scan_host() in the device init thread,
we can ensure that the device reset thread gets the kobject only after the
device init thread has finished adding the parent.

Signed-off-by: Alice Chao
---
Change in v4
-Change commit: Change call stack description.

Change in v3
-Change commit: Describe the problem first and then the solution.
-Add commit: Add KASAN error log.

Change in v2
-Remove Change-Id.
---
 drivers/scsi/scsi_lib.c  | 1 +
 drivers/scsi/scsi_scan.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0a70aa763a96..abf9a71ed77c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2461,6 +2461,7 @@ static void scsi_evt_emit(struct scsi_device *sdev, struct scsi_event *evt)
 		break;
 	case SDEV_EVT_POWER_ON_RESET_OCCURRED:
 		envp[idx++] = "SDEV_UA=POWER_ON_RESET_OCCURRED";
+		wait_event(sdev->host->host_wait, sdev->sdev_gendev.kobj.parent != NULL);
 		break;
 	default:
 		/* do nothing */
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f4e6c68ac99e..431f229ac435 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1904,6 +1904,7 @@ static void do_scsi_scan_host(struct Scsi_Host *shost)
 	} else {
 		scsi_scan_host_selected(shost, SCAN_WILD_CARD, SCAN_WILD_CARD,
 				SCAN_WILD_CARD, 0);
+		wake_up(&shost->host_wait);
 	}
 }
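
The wait_event()/wake_up() pairing described above can be modelled in user
space with a POSIX condition variable. This is only a rough analogue under
stated assumptions, not the kernel API: struct host_model and every
function below are hypothetical names, the condition variable plays the
role of host_wait, and the parent field plays the role of kobj.parent.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space model of the host wait queue and the parent link. */
struct host_model {
	pthread_mutex_t lock;
	pthread_cond_t wait;	/* plays the role of shost->host_wait */
	const char *parent;	/* plays the role of kobj.parent */
};

/* "Init" side: publish the parent, then wake any waiter (like wake_up()). */
void publish_parent(struct host_model *h, const char *name)
{
	pthread_mutex_lock(&h->lock);
	h->parent = name;
	pthread_cond_broadcast(&h->wait);
	pthread_mutex_unlock(&h->lock);
}

/* "Reset" side: sleep until the condition holds (like wait_event()). */
const char *wait_for_parent(struct host_model *h)
{
	const char *p;

	pthread_mutex_lock(&h->lock);
	while (h->parent == NULL)	/* re-check after every wake-up */
		pthread_cond_wait(&h->wait, &h->lock);
	p = h->parent;
	pthread_mutex_unlock(&h->lock);
	return p;
}

/* Thread body for the "init" side; the parent name is illustrative. */
void *init_thread(void *arg)
{
	publish_parent(arg, "target0:0:0");
	return NULL;
}

/* Runs both sides once; returns 1 if the waiter saw a non-NULL parent. */
int run_handshake(void)
{
	struct host_model h;
	pthread_t t;
	const char *seen;

	pthread_mutex_init(&h.lock, NULL);
	pthread_cond_init(&h.wait, NULL);
	h.parent = NULL;

	if (pthread_create(&t, NULL, init_thread, &h) != 0)
		return 0;
	seen = wait_for_parent(&h);
	pthread_join(t, NULL);

	pthread_cond_destroy(&h.wait);
	pthread_mutex_destroy(&h.lock);
	return seen != NULL;
}
```

Whichever thread runs first, wait_for_parent() returns only once the
parent has been published, because the condition is re-checked after each
wake-up. In the same way, wait_event() re-evaluates its condition each
time the wait queue is woken.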