From patchwork Tue Sep 26 17:22:35 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lee Duncan
X-Patchwork-Id: 9972393
From: Lee Duncan
To: linux-scsi@vger.kernel.org
Cc: "James E . J . Bottomley", "Martin K . Petersen",
 Hannes Reinecke, Lee Duncan
Subject: [PATCH] scsi: ioctl reset should wait for IOs to complete
Date: Tue, 26 Sep 2017 10:22:35 -0700
Message-Id: <20170926172235.29530-1-lduncan@suse.com>
X-Mailer: git-send-email 2.12.3
X-Mailing-List: linux-scsi@vger.kernel.org

The SCSI ioctl reset path is smart enough to set the flag
tmf_in_progress when a user-requested reset is processed, but it does
not wait for IO that is in flight. This can result in lost IOs and
hung processes. Wait a bounded amount of time (up to 5 seconds, polling
every 500 ms) for the in-flight IOs to complete, and fail the reset
request if they do not.

Signed-off-by: Lee Duncan
---
 drivers/scsi/scsi_error.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 38942050b265..b964152611c3 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -57,6 +57,14 @@
 #define BUS_RESET_SETTLE_TIME	(10)
 #define HOST_RESET_SETTLE_TIME	(10)
 
+/*
+ * Time to wait for outstanding IOs when about to send
+ * a device reset, e.g. sg_reset. The msecs to wait must
+ * be a multiple of the msecs to wait per try.
+ */
+#define MSECS_PER_TRY_FOR_IO_ON_RESET	500
+#define MSECS_TO_WAIT_FOR_IO_ON_RESET	(MSECS_PER_TRY_FOR_IO_ON_RESET * 10)
+
 static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
 static int scsi_try_to_abort_cmd(struct scsi_host_template *,
 				 struct scsi_cmnd *);
@@ -2269,6 +2277,7 @@ void scsi_report_device_reset(struct Scsi_Host *shost, int channel, int target)
 	struct request *rq;
 	unsigned long flags;
 	int error = 0, rtn, val;
+	unsigned int msecs_to_wait = MSECS_TO_WAIT_FOR_IO_ON_RESET;
 
 	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
 		return -EACCES;
@@ -2301,6 +2310,22 @@ void scsi_report_device_reset(struct Scsi_Host *shost, int channel, int target)
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	shost->tmf_in_progress = 1;
+
+	/* if any IOs in progress wait for them a while */
+	while ((atomic_read(&shost->host_busy) > 0) && (msecs_to_wait > 0)) {
+		spin_unlock_irqrestore(shost->host_lock, flags);
+		msleep(MSECS_PER_TRY_FOR_IO_ON_RESET);
+		msecs_to_wait -= MSECS_PER_TRY_FOR_IO_ON_RESET;
+		spin_lock_irqsave(shost->host_lock, flags);
+	}
+	if (atomic_read(&shost->host_busy)) {
+		shost->tmf_in_progress = 0;
+		spin_unlock_irqrestore(shost->host_lock, flags);
+		SCSI_LOG_ERROR_RECOVERY(3,
+			printk("%s: device reset failed: outstanding IO\n", __func__));
+		goto out_put_scmd_and_free;
+	}
+
 	spin_unlock_irqrestore(shost->host_lock, flags);
 
 	switch (val & ~SG_SCSI_RESET_NO_ESCALATE) {
@@ -2349,6 +2374,7 @@ void scsi_report_device_reset(struct Scsi_Host *shost, int channel, int target)
 	wake_up(&shost->host_wait);
 	scsi_run_host_queues(shost);
 
+out_put_scmd_and_free:
 	scsi_put_command(scmd);
 	kfree(rq);