From patchwork Tue May 2 10:43:57 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Zhang X-Patchwork-Id: 9707819 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id AC3DC60245 for ; Tue, 2 May 2017 10:44:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9ED1C2843B for ; Tue, 2 May 2017 10:44:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9351528449; Tue, 2 May 2017 10:44:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1FAE02843B for ; Tue, 2 May 2017 10:44:00 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 6867121951CB5; Tue, 2 May 2017 03:44:00 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id D865D21951CB5 for ; Tue, 2 May 2017 03:43:58 -0700 (PDT) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4A013C154204; Tue, 2 May 2017 10:43:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 4A013C154204 Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx07.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=yizhan@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 4A013C154204 Received: from colo-mx.corp.redhat.com (colo-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.20]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 3F91D173A8; Tue, 2 May 2017 10:43:58 +0000 (UTC) Received: from zmail25.collab.prod.int.phx2.redhat.com (zmail25.collab.prod.int.phx2.redhat.com [10.5.83.31]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id 2C2FD18523CD; Tue, 2 May 2017 10:43:58 +0000 (UTC) Date: Tue, 2 May 2017 06:43:57 -0400 (EDT) From: Yi Zhang To: Dan Williams Message-ID: <49995374.2635027.1493721837868.JavaMail.zimbra@redhat.com> In-Reply-To: <149356211395.12803.4163197195995720592.stgit@dwillia2-desk3.amr.corp.intel.com> References: <0193aa84-a6ad-8472-43de-a49ea8617fc4@redhat.com> <149356211395.12803.4163197195995720592.stgit@dwillia2-desk3.amr.corp.intel.com> Subject: Re: [PATCH] device-dax: fix sysfs attribute deadlock MIME-Version: 1.0 X-Originating-IP: [10.68.5.41, 10.4.195.2] Thread-Topic: device-dax: fix sysfs attribute deadlock Thread-Index: dousbLXtU1Q75bvYxAJF+XPOoqsleQ== X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Tue, 02 May 2017 10:43:58 +0000 (UTC) X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org, linux-nvdimm@lists.01.org Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP Verified this patch on 4.11. Tested-by: Yi Zhang Best Regards, Yi Zhang ----- Original Message ----- From: "Dan Williams" To: linux-nvdimm@lists.01.org Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org, "Yi Zhang" Sent: Sunday, April 30, 2017 10:21:54 PM Subject: [PATCH] device-dax: fix sysfs attribute deadlock Usage of device_lock() for dax_region attributes is unnecessary and deadlock prone. It's unnecessary because the order of registration / un-registration guarantees that drvdata is always valid. It's deadlock prone because it sets up this situation: ndctl D 0 2170 2082 0x00000000 Call Trace: __schedule+0x31f/0x980 schedule+0x3d/0x90 schedule_preempt_disabled+0x15/0x20 __mutex_lock+0x402/0x980 ? __mutex_lock+0x158/0x980 ? align_show+0x2b/0x80 [dax] ? kernfs_seq_start+0x2f/0x90 mutex_lock_nested+0x1b/0x20 align_show+0x2b/0x80 [dax] dev_attr_show+0x20/0x50 ndctl D 0 2186 2079 0x00000000 Call Trace: __schedule+0x31f/0x980 schedule+0x3d/0x90 __kernfs_remove+0x1f6/0x340 ? kernfs_remove_by_name_ns+0x45/0xa0 ? remove_wait_queue+0x70/0x70 kernfs_remove_by_name_ns+0x45/0xa0 remove_files.isra.1+0x35/0x70 sysfs_remove_group+0x44/0x90 sysfs_remove_groups+0x2e/0x50 dax_region_unregister+0x25/0x40 [dax] devm_action_release+0xf/0x20 release_nodes+0x16d/0x2b0 devres_release_all+0x3c/0x60 device_release_driver_internal+0x17d/0x220 device_release_driver+0x12/0x20 unbind_store+0x112/0x160 ndctl/2170 is trying to acquire the device_lock() to read an attribute, and ndctl/2186 is holding the device_lock() while trying to drain all active attribute readers. Thanks to Yi Zhang for the reproduction script. Fixes: d7fe1a67f658 ("dax: add region 'id', 'size', and 'align' attributes") Cc: Reported-by: Yi Zhang Signed-off-by: Dan Williams --- drivers/dax/dax.c | 40 ++++++++++++---------------------------- 1 file changed, 12 insertions(+), 28 deletions(-) diff --git a/drivers/dax/dax.c b/drivers/dax/dax.c index ef93aa84622b..5e8302d3a89c 100644 --- a/drivers/dax/dax.c +++ b/drivers/dax/dax.c @@ -36,36 +36,27 @@ static struct kmem_cache *dax_cache __read_mostly; static struct super_block *dax_superblock __read_mostly; MODULE_PARM_DESC(nr_dax, "max number of device-dax instances"); +/* + * Rely on the fact that drvdata is set before the attributes are + * registered, and that the attributes are unregistered before drvdata + * is cleared to assume that drvdata is always valid. + */ static ssize_t id_show(struct device *dev, struct device_attribute *attr, char *buf) { - struct dax_region *dax_region; - ssize_t rc = -ENXIO; + struct dax_region *dax_region = dev_get_drvdata(dev); - device_lock(dev); - dax_region = dev_get_drvdata(dev); - if (dax_region) - rc = sprintf(buf, "%d\n", dax_region->id); - device_unlock(dev); - - return rc; + return sprintf(buf, "%d\n", dax_region->id); } static DEVICE_ATTR_RO(id); static ssize_t region_size_show(struct device *dev, struct device_attribute *attr, char *buf) { - struct dax_region *dax_region; - ssize_t rc = -ENXIO; + struct dax_region *dax_region = dev_get_drvdata(dev); - device_lock(dev); - dax_region = dev_get_drvdata(dev); - if (dax_region) - rc = sprintf(buf, "%llu\n", (unsigned long long) - resource_size(&dax_region->res)); - device_unlock(dev); - - return rc; + return sprintf(buf, "%llu\n", (unsigned long long) + resource_size(&dax_region->res)); } static struct device_attribute dev_attr_region_size = __ATTR(size, 0444, region_size_show, NULL); @@ -73,16 +64,9 @@ static struct device_attribute dev_attr_region_size = __ATTR(size, 0444, static ssize_t align_show(struct device *dev, struct device_attribute *attr, char *buf) { - struct dax_region *dax_region; - ssize_t rc = -ENXIO; + struct dax_region *dax_region = dev_get_drvdata(dev); - device_lock(dev); - dax_region = dev_get_drvdata(dev); - if (dax_region) - rc = sprintf(buf, "%u\n", dax_region->align); - device_unlock(dev); - - return rc; + return sprintf(buf, "%u\n", dax_region->align); } static DEVICE_ATTR_RO(align);