From patchwork Tue Jan 24 14:59:57 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 9535347
Subject: Re: [PATCH 2/2] qla2xxx: Avoid that issuing a LIP triggers a kernel crash
To: Bart Van Assche, "Martin K. Petersen"
References: <20170123163446.9227-1-bart.vanassche@sandisk.com>
 <20170123163446.9227-3-bart.vanassche@sandisk.com>
Cc: linux-scsi@vger.kernel.org, Naresh Bannoth, Himanshu Madhani,
 stable@vger.kernel.org
From: Mauricio Faria de Oliveira
Date: Tue, 24 Jan 2017 12:59:57 -0200
In-Reply-To: <20170123163446.9227-3-bart.vanassche@sandisk.com>
Message-Id: <91a8f843-c969-72c4-1add-8e44ea2d9a8a@linux.vnet.ibm.com>
Sender: linux-scsi-owner@vger.kernel.org
X-Mailing-List: linux-scsi@vger.kernel.org

Hi Bart,

First of all, sorry for the new bug; I didn't realize the pointer could
be NULL in this scenario.

On 01/23/2017 02:34 PM, Bart Van Assche wrote:
> @@ -1624,7 +1627,8 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
>  					 */
>  					sp_get(sp);
>  					spin_unlock_irqrestore(&ha->hardware_lock, flags);
> -					qla2xxx_eh_abort(GET_CMD_SP(sp));
> +					if (scmd)
> +						qla2xxx_eh_abort(scmd);
>  					spin_lock_irqsave(&ha->hardware_lock, flags);
>  				}

Now, this chunk has a problem with reference counting (and unnecessary
spin-locking), which we can avoid by simply moving up this NULL check.

The call to sp_get() increments sp->ref_count, but if you skip the call
to qla2xxx_eh_abort() you don't get the decrement from the call to
sp->done() in the abort handling from the ISR, e.g.,
qla24xx_abort_iocb_entry().

[or, if the command completed successfully between issuing and
completing the abort, in the completion from the ISR, e.g.,
qla2x00_process_completed_request().]

The sp->done() call just below this chunk was supposed to drop the
initial reference [set at qla2xxx_queuecommand()] at a time when we did
not yet call qla2xxx_eh_abort()... but now that we __may__ call it (and
get that sp->done() call from the ISR abort handling), we must only
increment it if we're going to drop it.

That should be resolved with this slight change to your patch (which
also helps with the spin-locking).

What do you/others think?

 					/* Get a reference to the sp and drop the lock.
 					 * The reference ensures this sp->done() call
 					 * - and not the call in qla2xxx_eh_abort() -
@@ -1624,7 +1627,7 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha)
 					 */
 					sp_get(sp);
 					spin_unlock_irqrestore(&ha->hardware_lock, flags);
-					qla2xxx_eh_abort(GET_CMD_SP(sp));
+					qla2xxx_eh_abort(scmd);
 					spin_lock_irqsave(&ha->hardware_lock, flags);
 				}
 				req->outstanding_cmds[cnt] = NULL;

Signed-off-by: Mauricio Faria de Oliveira

diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index 0a000ecf0881..a17cb63b3fd5 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -1600,6 +1600,7 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha)
 	srb_t *sp;
 	struct qla_hw_data *ha = vha->hw;
 	struct req_que *req;
+	struct scsi_cmnd *scmd;
 
 	qlt_host_reset_handler(ha);
 
@@ -1613,10 +1614,12 @@ uint32_t qla2x00_isp_reg_stat(struct qla_hw_data *ha)
 		for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
 			sp = req->outstanding_cmds[cnt];
 			if (sp) {
+				scmd = GET_CMD_SP(sp);
+
 				/* Don't abort commands in adapter during EEH
 				 * recovery as it's not accessible/responding.
 				 */
-				if (!ha->flags.eeh_busy) {
+				if (scmd && !ha->flags.eeh_busy) {