From patchwork Fri Jun 16 15:41:22 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Douglas Miller
X-Patchwork-Id: 9792093
Subject: Re: enclosure: fix sysfs symlinks creation when using multipath
To: James Bottomley, "Martin K. Petersen", Maurizio Lombardi
Cc: linux-scsi@vger.kernel.org
References: <1489690155.11068.10.camel@linux.vnet.ibm.com>
From: Douglas Miller
Date: Fri, 16 Jun 2017 10:41:22 -0500
In-Reply-To: <1489690155.11068.10.camel@linux.vnet.ibm.com>
Message-Id: <1998341d-7f47-f73a-c5c5-4095b7fa9ef1@linux.vnet.ibm.com>
X-Mailing-List: linux-scsi@vger.kernel.org

On 03/16/2017 01:49 PM, James Bottomley wrote:
> On Wed, 2017-03-15 at 19:39 -0400, Martin K. Petersen wrote:
>> Maurizio Lombardi writes:
>>
>>> With multipath, it may happen that the same device is passed to
>>> enclosure_add_device() multiple times and that the
>>> enclosure_add_links() function fails to create the symlinks because
>>> the device's sysfs directory entry is still NULL.  In this case, the
>>> links will never be created because all the subsequent calls to
>>> enclosure_add_device() will immediately fail with EEXIST.
>> James?
> Well I don't think the patch is the correct way to do this.  The
> problem is that if we encounter an error creating the links, we
> shouldn't add the device to the enclosure.  There's no need of a
> links_created variable (see below).
>
> However, more interesting is why the link creation failed in the first
> place.  The device clearly seems to exist because it was added to sysfs
> at time index 19.2 and the enclosure didn't try to use it until 60.0.
> Can you debug this a bit more, please?  I can't see anything specific
> to multipath in the trace, so whatever this is looks like it could
> happen in the single path case as well.
>
> James
>
> diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
> index 65fed71..ae89082 100644
> --- a/drivers/misc/enclosure.c
> +++ b/drivers/misc/enclosure.c
> @@ -375,6 +375,7 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
>  			 struct device *dev)
>  {
>  	struct enclosure_component *cdev;
> +	int err;
>  
>  	if (!edev || component >= edev->components)
>  		return -EINVAL;
> @@ -384,12 +385,15 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
>  	if (cdev->dev == dev)
>  		return -EEXIST;
>  
> -	if (cdev->dev)
> +	if (cdev->dev) {
>  		enclosure_remove_links(cdev);
> -
> -	put_device(cdev->dev);
> -	cdev->dev = get_device(dev);
> -	return enclosure_add_links(cdev);
> +		put_device(cdev->dev);
> +		cdev->dev = NULL;
> +	}
> +	err = enclosure_add_links(cdev);
> +	if (!err)
> +		cdev->dev = get_device(dev);
> +	return err;
>  }
>  EXPORT_SYMBOL_GPL(enclosure_add_device);
>

After stumbling across the NULL pointer panic, I was able to use
Maurizio's second patch below:

diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
index 65fed71..6ac07ea 100644
--- a/drivers/misc/enclosure.c
+++ b/drivers/misc/enclosure.c
@@ -375,6 +375,7 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
 			 struct device *dev)
 {
 	struct enclosure_component *cdev;
+	int err;
 
 	if (!edev || component >= edev->components)
 		return -EINVAL;
@@ -384,12 +385,17 @@ int enclosure_add_device(struct enclosure_device *edev, int component,
 	if (cdev->dev == dev)
 		return -EEXIST;
 
-	if (cdev->dev)
+	if (cdev->dev) {
 		enclosure_remove_links(cdev);
-
-	put_device(cdev->dev);
+		put_device(cdev->dev);
+	}
 	cdev->dev = get_device(dev);
-	return enclosure_add_links(cdev);
+	err = enclosure_add_links(cdev);
+	if (err) {
+		cdev->dev = NULL;
+		put_device(cdev->dev);
+	}
+	return err;
 }
 EXPORT_SYMBOL_GPL(enclosure_add_device);
 
I am able to pass my testing with this patch. I don't see an official
submit of this patch, but will respond to it when I see one.

Again, I am seeing the problem even without multipath.
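
For anyone following the thread, here is a minimal userspace sketch of the
failure mode Maurizio describes: the first call records the device even
though link creation failed, so every retry returns EEXIST. It is not kernel
code. The model_* types and helpers, the sysfs_ready flag, and the -ENOENT
error value are all hypothetical stand-ins chosen for illustration. The
fixed variant follows the shape of the patch quoted above, except that its
error path drops the reference before clearing the pointer, since clearing
first would hand NULL to the put call and the reference would never be
released.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct device and the component. */
struct model_device {
	int refcount;     /* models the device reference count */
	int sysfs_ready;  /* 0 models "sysfs directory entry is still NULL" */
};

struct model_component {
	struct model_device *dev;
};

static struct model_device *model_get(struct model_device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

static void model_put(struct model_device *dev)
{
	if (!dev)
		return;  /* a NULL argument is a no-op, as with put_device() */
	assert(dev->refcount > 0);
	dev->refcount--;
}

/* Stand-in for enclosure_add_links(); the error value is an assumption. */
static int model_add_links(struct model_component *cdev)
{
	return cdev->dev->sysfs_ready ? 0 : -ENOENT;
}

/* Pre-patch shape: the device stays recorded even when linking fails. */
static int add_device_old(struct model_component *cdev,
			  struct model_device *dev)
{
	if (cdev->dev == dev)
		return -EEXIST;

	model_put(cdev->dev);  /* link removal omitted in this model */
	cdev->dev = model_get(dev);
	return model_add_links(cdev);
}

/* Patched shape: on failure, forget the device so a retry can succeed. */
static int add_device_fixed(struct model_component *cdev,
			    struct model_device *dev)
{
	int err;

	if (cdev->dev == dev)
		return -EEXIST;

	model_put(cdev->dev);  /* link removal omitted in this model */
	cdev->dev = model_get(dev);
	err = model_add_links(cdev);
	if (err) {
		model_put(cdev->dev);  /* release the reference first */
		cdev->dev = NULL;      /* then clear it to allow a retry */
	}
	return err;
}
```

With a device whose sysfs_ready is 0, calling add_device_old() twice yields
the link error and then -EEXIST even after the flag is set, while
add_device_fixed() yields the link error, leaves the component empty with the
reference count back at zero, and succeeds on the retry.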