From patchwork Thu Oct 15 07:38:35 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhengping Zhou
X-Patchwork-Id: 7403701
From: Zhengping Zhou
To: James Bottomley
Cc: linux-scsi@vger.kernel.org, Zhengping Zhou
Subject: [PATCH 1/1] scsi subsystem : fix function __scsi_device_lookup
Date: Thu, 15 Oct 2015 15:38:35 +0800
Message-Id: <1444894715-4906-1-git-send-email-johnzzpcrystal@gmail.com>
X-Mailer: git-send-email 1.8.3.1
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-scsi@vger.kernel.org

When a scsi_device is unplugged from a SCSI controller while it is still
in use by the application layer, it is not released until its users
release it. In this case scsi_remove_device only sets the scsi_device's
state to SDEV_DEL. If the disk is plugged in again before the old
scsi_device is released, there will be two scsi_device structures for it
in scsi_host->__devices.

When the next unplug event happens, some low-level drivers (for example,
the megaraid sas series controllers) check whether the scsi_device has
been added to the host by calling scsi_device_lookup, which calls
__scsi_device_lookup. __scsi_device_lookup returns the first matching
scsi_device. Because its state is SDEV_DEL, scsi_device_lookup finally
returns NULL, so the low-level driver assumes that the scsi_device has
already been removed and does not call scsi_remove_device, which makes
hot swap fail.

Signed-off-by: Zhengping Zhou
---
Hi all,

I'm sorry to bother you again; this is the second time I have sent this
patch.

I found a bug that causes hot swap to fail when using a megaraid sas
series controller.
I finally found that when the controller receives a hot swap event, it
first checks whether the device has been added to the system/host by
calling scsi_device_lookup. The logic in function megasas_aen_polling is
as follows:

	case MR_EVT_PD_REMOVED:
		if (megasas_get_pd_list(instance) == 0) {
			for (i = 0; i < MEGASAS_MAX_PD_CHANNELS; i++) {
				for (j = 0; j < MEGASAS_MAX_DEV_PER_CHANNEL; j++) {
					pd_index = (i * MEGASAS_MAX_DEV_PER_CHANNEL) + j;
					sdev1 = scsi_device_lookup(host, i, j, 0);
					if (instance->pd_list[pd_index].driveState ==
					    MR_PD_STATE_SYSTEM) {
						if (sdev1)
							scsi_device_put(sdev1);
					} else {
						if (sdev1) {
							scsi_remove_device(sdev1);
							scsi_device_put(sdev1);
						}
					}
				}
			}
		}

If the previous scsi_device has not been released, two scsi_devices end
up corresponding to the same disk. When the disk is unplugged afterwards,
the controller driver assumes that this disk was never added to the
system/host, so it does not call scsi_remove_device.

With this modification the problem is fixed. So far I have successfully
tested PCI_DEVICE_ID_LSI_SAS0073SKINNY and PCI_DEVICE_ID_LSI_FURY.

Thanks
Zhengping
---
 drivers/scsi/scsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 207d6a7..5251d6d 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -1118,6 +1118,8 @@ struct scsi_device *__scsi_device_lookup(struct Scsi_Host *shost,
 	struct scsi_device *sdev;
 
 	list_for_each_entry(sdev, &shost->__devices, siblings) {
+		if (sdev->sdev_state == SDEV_DEL)
+			continue;
 		if (sdev->channel == channel && sdev->id == id &&
 				sdev->lun ==lun)
 			return sdev;