From patchwork Thu May 17 17:14:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steffen Maier X-Patchwork-Id: 10407321 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id A715F602C2 for ; Thu, 17 May 2018 17:16:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9056528574 for ; Thu, 17 May 2018 17:16:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 85264285E2; Thu, 17 May 2018 17:16:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00, MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 19CB028574 for ; Thu, 17 May 2018 17:16:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752326AbeEQRQW (ORCPT ); Thu, 17 May 2018 13:16:22 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:35092 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751907AbeEQRPo (ORCPT ); Thu, 17 May 2018 13:15:44 -0400 Received: from pps.filterd (m0098421.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w4HHELYd051081 for ; Thu, 17 May 2018 13:15:43 -0400 Received: from e06smtp14.uk.ibm.com (e06smtp14.uk.ibm.com [195.75.94.110]) by mx0a-001b2d01.pphosted.com with ESMTP id 2j1cjs4akv-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 17 May 2018 13:15:43 -0400 Received: from localhost by e06smtp14.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 17 May 2018 18:15:41 +0100 Received: from b06cxnps4076.portsmouth.uk.ibm.com (9.149.109.198) by e06smtp14.uk.ibm.com (192.168.101.144) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Thu, 17 May 2018 18:15:39 +0100 Received: from d06av22.portsmouth.uk.ibm.com (d06av22.portsmouth.uk.ibm.com [9.149.105.58]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w4HHFbdD36700188 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Thu, 17 May 2018 17:15:37 GMT Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3A1314C04E; Thu, 17 May 2018 18:07:26 +0100 (BST) Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DD7764C044; Thu, 17 May 2018 18:07:25 +0100 (BST) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av22.portsmouth.uk.ibm.com (Postfix) with ESMTPS; Thu, 17 May 2018 18:07:25 +0100 (BST) From: Steffen Maier To: "James E . J . Bottomley" , "Martin K . Petersen" Cc: linux-scsi@vger.kernel.org, linux-s390@vger.kernel.org, Benjamin Block , Martin Schwidefsky , Heiko Carstens , Steffen Maier , stable@vger.kernel.org Subject: [PATCH 05/25] zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED Date: Thu, 17 May 2018 19:14:47 +0200 X-Mailer: git-send-email 2.16.3 In-Reply-To: <20180517171507.18237-1-maier@linux.ibm.com> References: <20180517171507.18237-1-maier@linux.ibm.com> X-TM-AS-GCONF: 00 x-cbid: 18051717-0044-0000-0000-0000055397CB X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18051717-0045-0000-0000-000028950DB7 Message-Id: <20180517171507.18237-6-maier@linux.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-05-17_09:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1709140000 definitions=main-1805170161 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP For problem determination we always want to see when we were invoked on the terminate_rport_io callback whether we perform something or not. Temporal event sequence of interest with a long fast_io_fail_tmo of 27 sec: loose remote port t workqueue [s] zfcp_q_ IRQ zfcperp === ================== =================== ============================ 0 recv RSCN q p.test_link_work block rport start fast_io_fail_tmo send ADISC ELS 4 recv ADISC fail block zfcp_port port forced reopen send open port 12 recv open port fail q p.gid_pn_work zfcp_erp_wakeup (zfcp_erp_wait would return) GID_PN fail Before this point, we got a SCSI trace with tag "sctrpi1" on fast_io_fail, e.g. with the typical 5 sec setting. port.status |= ERP_FAILED If fast_io_fail_tmo triggers after this point, we missed a SCSI trace. workqueue fc_dl_ ================== 27 fc_timeout_fail_rport_io fc_terminate_rport_io zfcp_scsi_terminate_rport_io zfcp_erp_port_forced_reopen _zfcp_erp_port_forced_reopen if (port.status & ERP_FAILED) return; Therefore, write a trace before above early return. Example trace record formatted with zfcpdbf from s390-tools: Timestamp : ... Area : REC Subarea : 00 Level : 1 Exception : - CPU ID : .. Caller : 0x... Record ID : 1 ZFCP_DBF_REC_TRIG Tag : sctrpi1 SCSI terminate rport I/O LUN : 0xffffffffffffffff none (invalid) WWPN : 0x D_ID : 0x Adapter status : 0x... Port status : 0x... LUN status : 0x00000000 none (invalid) Ready count : 0x... Running count : 0x... ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED Signed-off-by: Steffen Maier Cc: #2.6.38+ Reviewed-by: Benjamin Block --- drivers/s390/scsi/zfcp_erp.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c index 3489b1bc9121..5c368cdfc455 100644 --- a/drivers/s390/scsi/zfcp_erp.c +++ b/drivers/s390/scsi/zfcp_erp.c @@ -42,9 +42,13 @@ enum zfcp_erp_steps { * @ZFCP_ERP_ACTION_REOPEN_PORT_FORCED: Forced port recovery. * @ZFCP_ERP_ACTION_REOPEN_ADAPTER: Adapter recovery. * @ZFCP_ERP_ACTION_NONE: Eyecatcher pseudo flag to bitwise or-combine with - * either of the other enum values. + * either of the first four enum values. * Used to indicate that an ERP action could not be * set up despite a detected need for some recovery. + * @ZFCP_ERP_ACTION_FAILED: Eyecatcher pseudo flag to bitwise or-combine with + * either of the first four enum values. + * Used to indicate that ERP not needed because + * the object has ZFCP_STATUS_COMMON_ERP_FAILED. */ enum zfcp_erp_act_type { ZFCP_ERP_ACTION_REOPEN_LUN = 1, @@ -52,6 +56,7 @@ enum zfcp_erp_act_type { ZFCP_ERP_ACTION_REOPEN_PORT_FORCED = 3, ZFCP_ERP_ACTION_REOPEN_ADAPTER = 4, ZFCP_ERP_ACTION_NONE = 0xc0, + ZFCP_ERP_ACTION_FAILED = 0xe0, }; enum zfcp_erp_act_state { @@ -379,8 +384,12 @@ static void _zfcp_erp_port_forced_reopen(struct zfcp_port *port, int clear, zfcp_erp_port_block(port, clear); zfcp_scsi_schedule_rport_block(port); - if (atomic_read(&port->status) & ZFCP_STATUS_COMMON_ERP_FAILED) + if (atomic_read(&port->status) & ZFCP_STATUS_COMMON_ERP_FAILED) { + zfcp_dbf_rec_trig(id, port->adapter, port, NULL, + ZFCP_ERP_ACTION_REOPEN_PORT_FORCED, + ZFCP_ERP_ACTION_FAILED); return; + } zfcp_erp_action_enqueue(ZFCP_ERP_ACTION_REOPEN_PORT_FORCED, port->adapter, port, NULL, id, 0);