From patchwork Tue Nov 7 20:59:02 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Smart
X-Patchwork-Id: 10047243
From: James Smart
To: linux-scsi@vger.kernel.org
Cc: Dick Kennedy, James Smart
Subject: [PATCH] lpfc: Fix hard lock up NMI in els timeout handling.
Date: Tue, 7 Nov 2017 12:59:02 -0800
Message-Id: <20171107205902.17352-1-jsmart2021@gmail.com>
X-Mailer: git-send-email 2.13.1
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-scsi@vger.kernel.org

From: Dick Kennedy

System crashed due to a hard lockup at lpfc_els_timeout_handler+0x128.
The els ring's txcmplq list is corrupted: the last element in the list
does not point back to the head, causing a loop.

The issue is that the els processing path for sli4 hbas is using the
hbalock instead of the ring_lock for removing elements from the
txcmplq list.

Use the adapter SLI_REV to determine which lock should be used for
removing iocbqs from the els ring's txcmplq.

Note: future refactoring will address this so that we don't have this
ugly type-based lock code.

Signed-off-by: Dick Kennedy
Signed-off-by: James Smart
Reviewed-by: Ewan D. Milne
---
 drivers/scsi/lpfc/lpfc_sli.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c
index 1229f58bdd09..c1c7df607604 100644
--- a/drivers/scsi/lpfc/lpfc_sli.c
+++ b/drivers/scsi/lpfc/lpfc_sli.c
@@ -2732,7 +2732,8 @@ lpfc_sli_process_unsol_iocb(struct lpfc_hba *phba, struct lpfc_sli_ring *pring,
  *
  * This function looks up the iocb_lookup table to get the command iocb
  * corresponding to the given response iocb using the iotag of the
- * response iocb. This function is called with the hbalock held.
+ * response iocb. This function is called with the hbalock held
+ * for sli3 devices or the ring_lock for sli4 devices.
  * This function returns the command iocb object if it finds the command
  * iocb else returns NULL.
  **/
@@ -2828,9 +2829,15 @@ lpfc_sli_process_sol_iocb(struct lpfc_hba *phba, struct lpfc_sli_ring *pring,
 	unsigned long iflag;
 
 	/* Based on the iotag field, get the cmd IOCB from the txcmplq */
-	spin_lock_irqsave(&phba->hbalock, iflag);
+	if (phba->sli_rev == LPFC_SLI_REV4)
+		spin_lock_irqsave(&pring->ring_lock, iflag);
+	else
+		spin_lock_irqsave(&phba->hbalock, iflag);
 	cmdiocbp = lpfc_sli_iocbq_lookup(phba, pring, saveq);
-	spin_unlock_irqrestore(&phba->hbalock, iflag);
+	if (phba->sli_rev == LPFC_SLI_REV4)
+		spin_unlock_irqrestore(&pring->ring_lock, iflag);
+	else
+		spin_unlock_irqrestore(&phba->hbalock, iflag);
 
 	if (cmdiocbp) {
 		if (cmdiocbp->iocb_cmpl) {
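
For readers outside the driver, the lock selection in the second hunk can be
modelled in userspace. This is only a minimal sketch under stated
assumptions, not driver code: pthread mutexes stand in for the kernel
spinlocks, and every model_* / MODEL_* name is hypothetical. It shows the
rule the patch enforces: the lookup that removes an entry from the ring's
completion table must take the same lock as every other path touching that
table, which is the ring_lock on SLI4 and the hbalock otherwise.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in lpfc. */
#define MODEL_SLI_REV3  3
#define MODEL_SLI_REV4  4
#define MODEL_MAX_IOTAG 8

struct model_ring {
	pthread_mutex_t ring_lock;        /* stands in for pring->ring_lock */
	int *lookup[MODEL_MAX_IOTAG];     /* iotag -> pending command */
};

struct model_hba {
	int sli_rev;
	pthread_mutex_t hbalock;          /* stands in for phba->hbalock */
};

/* Pick the lock that guards the ring's pending-command table. */
static pthread_mutex_t *model_txcmplq_lock(struct model_hba *hba,
					   struct model_ring *ring)
{
	if (hba->sli_rev == MODEL_SLI_REV4)
		return &ring->ring_lock;
	return &hba->hbalock;
}

/*
 * Look up the pending command for an iotag and remove it from the table,
 * holding whichever lock model_txcmplq_lock() selected for the whole
 * read-and-clear step. Returns NULL if the iotag is out of range or has
 * no pending command.
 */
static int *model_iocbq_lookup(struct model_hba *hba,
			       struct model_ring *ring, unsigned int iotag)
{
	pthread_mutex_t *lock = model_txcmplq_lock(hba, ring);
	int *cmd = NULL;

	pthread_mutex_lock(lock);
	if (iotag < MODEL_MAX_IOTAG) {
		cmd = ring->lookup[iotag];
		ring->lookup[iotag] = NULL;
	}
	pthread_mutex_unlock(lock);
	return cmd;
}
```

Putting the choice in one helper keeps the lock and unlock sides from
drifting apart. That is one possible shape for the cleanup the note in the
commit message alludes to, and it is not a claim about what the later
refactoring actually did.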