From patchwork Thu Oct 1 15:58:08 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Matthew R. Ochs" X-Patchwork-Id: 7309841 Return-Path: X-Original-To: patchwork-linux-scsi@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id D2822BF90C for ; Thu, 1 Oct 2015 15:59:08 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id E426120699 for ; Thu, 1 Oct 2015 15:59:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CDA3F205D6 for ; Thu, 1 Oct 2015 15:59:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753274AbbJAP7F (ORCPT ); Thu, 1 Oct 2015 11:59:05 -0400 Received: from e39.co.us.ibm.com ([32.97.110.160]:48719 "EHLO e39.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751194AbbJAP7D (ORCPT ); Thu, 1 Oct 2015 11:59:03 -0400 Received: from localhost by e39.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 1 Oct 2015 09:59:02 -0600 Received: from d01dlp02.pok.ibm.com (9.56.250.167) by e39.co.us.ibm.com (192.168.1.139) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Thu, 1 Oct 2015 09:59:00 -0600 X-IBM-Helo: d01dlp02.pok.ibm.com X-IBM-MailFrom: mrochs@linux.vnet.ibm.com X-IBM-RcptTo: linux-scsi@vger.kernel.org Received: from b01cxnp22033.gho.pok.ibm.com (b01cxnp22033.gho.pok.ibm.com [9.57.198.23]) by d01dlp02.pok.ibm.com (Postfix) with ESMTP id 734586E8045 for ; Thu, 1 Oct 2015 11:47:12 -0400 (EDT) Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by b01cxnp22033.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t91Fx0fo49152254 for ; Thu, 1 Oct 2015 15:59:00 GMT Received: from d01av02.pok.ibm.com (localhost [127.0.0.1]) by d01av02.pok.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t91FwuFq024461 for ; Thu, 1 Oct 2015 11:58:59 -0400 Received: from p8tul1-build.aus.stglabs.ibm.com (als141206.austin.ibm.com [9.3.141.206]) by d01av02.pok.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id t91FwrJA024181; Thu, 1 Oct 2015 11:58:53 -0400 From: "Matthew R. Ochs" To: linux-scsi@vger.kernel.org, James Bottomley , "Nicholas A. Bellinger" , Brian King , Ian Munsie , Daniel Axtens , Andrew Donnellan , Tomas Henzl , David Laight Cc: Michael Neuling , linuxppc-dev@lists.ozlabs.org, "Manoj N. Kumar" Subject: [PATCH v5 25/34] cxlflash: Fix to prevent EEH recovery failure Date: Thu, 1 Oct 2015 10:58:08 -0500 Message-Id: <1443715088-21200-1-git-send-email-mrochs@linux.vnet.ibm.com> X-Mailer: git-send-email 2.1.0 In-Reply-To: <1443714773-9176-1-git-send-email-mrochs@linux.vnet.ibm.com> References: <1443714773-9176-1-git-send-email-mrochs@linux.vnet.ibm.com> X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 15100115-0033-0000-0000-0000062B36E8 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The process_sense() routine can perform a read capacity which can take some time to complete. If an EEH occurs while waiting on the read capacity, the EEH handler will wait to obtain the context's mutex in order to put the context in an error state. The EEH handler will sit and wait until the context is free, but this wait can potentially last forever (deadlock) if the scsi_execute() that performs the read capacity experiences a timeout and calls into the reset callback. When that occurs, the reset callback sees that the device is already being reset and waits for the reset to complete. This leaves two threads waiting on the other. To address this issue, make the context unavailable to new, non-system owned threads and release the context while calling into process_sense(). After returning from process_sense() the context mutex is reacquired and the context is made available again. The context can be safely moved to the error state if needed during the unavailable window as no other threads will hold its reference. Signed-off-by: Matthew R. Ochs Signed-off-by: Manoj N. Kumar Reviewed-by: Brian King Reviewed-by: Daniel Axtens --- drivers/scsi/cxlflash/superpipe.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c index a6316f5..7283e83 100644 --- a/drivers/scsi/cxlflash/superpipe.c +++ b/drivers/scsi/cxlflash/superpipe.c @@ -1787,12 +1787,21 @@ static int cxlflash_disk_verify(struct scsi_device *sdev, * inquiry (i.e. the Unit attention is due to the WWN changing). */ if (verify->hint & DK_CXLFLASH_VERIFY_HINT_SENSE) { + /* Can't hold mutex across process_sense/read_cap16, + * since we could have an intervening EEH event. + */ + ctxi->unavail = true; + mutex_unlock(&ctxi->mutex); rc = process_sense(sdev, verify); if (unlikely(rc)) { dev_err(dev, "%s: Failed to validate sense data (%d)\n", __func__, rc); + mutex_lock(&ctxi->mutex); + ctxi->unavail = false; goto out; } + mutex_lock(&ctxi->mutex); + ctxi->unavail = false; } switch (gli->mode) {