From patchwork Fri Apr 16 22:06:34 2021
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 12209051
From: Mike Snitzer
To: Christoph Hellwig, Jens Axboe
Cc: dm-devel@redhat.com, linux-block@vger.kernel.org,
 linux-nvme@lists.infradead.org
Subject: [PATCH v3 1/4] nvme: return BLK_STS_DO_NOT_RETRY if the DNR bit is set
Date: Fri, 16 Apr 2021 18:06:34 -0400
Message-Id: <20210416220637.41111-2-snitzer@redhat.com>
In-Reply-To: <20210416220637.41111-1-snitzer@redhat.com>
References: <20210416220637.41111-1-snitzer@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

If the DNR bit is set we should not retry the command.
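
As a rough illustration of the rule named in the subject line, the
user-space sketch below models a status translation that checks the DNR
bit before decoding the status code. It is not the kernel code: the
SKETCH_* names are hypothetical, and the 0x4000 bit position is an
assumption that should be checked against the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; the kernel headers are authoritative. */
#define SKETCH_SC_MASK 0x7ffu   /* status-code bits, as masked in the diff */
#define SKETCH_SC_DNR  0x4000u  /* "Do Not Retry" bit (assumed position) */

/* The DNR bit must not overlap the status-code field it qualifies. */
static_assert((SKETCH_SC_DNR & SKETCH_SC_MASK) == 0, "DNR overlaps status code");

/* Hypothetical stand-in for blk_status_t. */
enum sketch_blk_status {
	SKETCH_STS_OK,
	SKETCH_STS_IOERR,
	SKETCH_STS_DO_NOT_RETRY,
};

/* Model of the translation: DNR is checked before the code is decoded. */
static enum sketch_blk_status sketch_error_status(uint16_t status)
{
	if (status & SKETCH_SC_DNR)
		return SKETCH_STS_DO_NOT_RETRY;
	if ((status & SKETCH_SC_MASK) == 0)
		return SKETCH_STS_OK;
	return SKETCH_STS_IOERR;	/* every other code collapsed for brevity */
}

/* Model of a path-error check: false means "retrying will not help". */
static bool sketch_path_error(enum sketch_blk_status sts)
{
	return sts != SKETCH_STS_DO_NOT_RETRY;
}
```

In this model a failed command with DNR set is never reported as a
retryable (path) error, whatever its status code is.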

We care about the retryable vs not retryable distinction at the block
layer so propagate the equivalent of the DNR bit by introducing
BLK_STS_DO_NOT_RETRY. Update blk_path_error() to _not_ retry if it
is set.

This change runs with the suggestion made here:
https://lore.kernel.org/linux-nvme/20190813170144.GA10269@lst.de/

Suggested-by: Christoph Hellwig
Signed-off-by: Mike Snitzer
Reviewed-by: Chaitanya Kulkarni
Reviewed-by: Hannes Reinecke
---
 drivers/nvme/host/core.c  | 3 +++
 include/linux/blk_types.h | 8 ++++++++
 2 files changed, 11 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0896e21642be..540d6fd8ffef 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -237,6 +237,9 @@ static void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
 
 static blk_status_t nvme_error_status(u16 status)
 {
+	if (unlikely(status & NVME_SC_DNR))
+		return BLK_STS_DO_NOT_RETRY;
+
 	switch (status & 0x7ff) {
 	case NVME_SC_SUCCESS:
 		return BLK_STS_OK;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index db026b6ec15a..1ca724948c56 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -142,6 +142,13 @@ typedef u8 __bitwise blk_status_t;
  */
 #define BLK_STS_ZONE_ACTIVE_RESOURCE	((__force blk_status_t)16)
 
+/*
+ * BLK_STS_DO_NOT_RETRY is returned from the driver in the completion path
+ * if the device returns a status indicating that if the same command is
+ * re-submitted it is expected to fail.
+ */
+#define BLK_STS_DO_NOT_RETRY	((__force blk_status_t)17)
+
 /**
  * blk_path_error - returns true if error may be path related
  * @error: status the request was completed with
@@ -157,6 +164,7 @@ typedef u8 __bitwise blk_status_t;
 static inline bool blk_path_error(blk_status_t error)
 {
 	switch (error) {
+	case BLK_STS_DO_NOT_RETRY:
 	case BLK_STS_NOTSUPP:
 	case BLK_STS_NOSPC:
 	case BLK_STS_TARGET:

From patchwork Fri Apr 16 22:06:35 2021
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 12209055
From: Mike Snitzer
To: Christoph Hellwig, Jens Axboe
Cc: dm-devel@redhat.com, linux-block@vger.kernel.org,
 linux-nvme@lists.infradead.org, Chao Leng
Subject: [PATCH v3 2/4] nvme: allow local retry for requests with REQ_FAILFAST_TRANSPORT set
Date: Fri, 16 Apr 2021 18:06:35 -0400
Message-Id: <20210416220637.41111-3-snitzer@redhat.com>
In-Reply-To: <20210416220637.41111-1-snitzer@redhat.com>
References: <20210416220637.41111-1-snitzer@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

From: Chao Leng

REQ_FAILFAST_TRANSPORT was designed for SCSI, because the SCSI protocol
does not define a local retry mechanism. SCSI implements a fuzzy local
retry mechanism, so REQ_FAILFAST_TRANSPORT is needed to allow
higher-level multipathing software to perform failover/retry.

NVMe differs from SCSI in this respect. It defines a local retry
mechanism and path error codes, so NVMe should retry locally for
non-path errors. For path-related errors, whether and how to retry is
still determined by the higher-level multipathing's failover.

Unlike SCSI, NVMe shouldn't prevent retry if REQ_FAILFAST_TRANSPORT is
set, because NVMe's local retry is needed -- as is NVMe-specific logic
to categorize whether an error is path related.

In this way, the mechanisms of NVMe multipath and other multipath
implementations are now equivalent: non-path-related errors are retried
locally, and path-related errors are handled by multipath.
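
The effect on the "never retry locally" test can be sketched in user
space as below. This is an illustration, not the kernel code: the SK_*
flag values are hypothetical, and the "before" helper assumes that the
old blk_noretry_request() check treated all three fail-fast flags
alike.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; the kernel's REQ_FAILFAST_* values differ. */
enum {
	SK_FAILFAST_DEV       = 1 << 0,
	SK_FAILFAST_TRANSPORT = 1 << 1,
	SK_FAILFAST_DRIVER    = 1 << 2,
};

#define SK_MASK_BEFORE \
	(SK_FAILFAST_DEV | SK_FAILFAST_TRANSPORT | SK_FAILFAST_DRIVER)
#define SK_MASK_AFTER  (SK_FAILFAST_DEV | SK_FAILFAST_DRIVER)

/* The new mask only drops a flag; it never adds one. */
static_assert((SK_MASK_AFTER & ~SK_MASK_BEFORE) == 0, "mask grew");

/* Old behaviour (assumed): any fail-fast flag suppresses local retry. */
static bool sk_noretry_before(unsigned int flags)
{
	return (flags & SK_MASK_BEFORE) != 0;
}

/* New behaviour: the transport flag alone no longer suppresses it. */
static bool sk_noretry_after(unsigned int flags)
{
	return (flags & SK_MASK_AFTER) != 0;
}
```

A request carrying only the transport flag is the one case whose
outcome changes: it was excluded from local retry before and is
eligible afterwards.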

Signed-off-by: Chao Leng
[snitzer: edited header for grammar and clarity, also added code comment]
Signed-off-by: Mike Snitzer
---
 drivers/nvme/host/core.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 540d6fd8ffef..4134cf3c7e48 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -306,7 +306,14 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	if (likely(nvme_req(req)->status == 0))
 		return COMPLETE;
 
-	if (blk_noretry_request(req) ||
+	/*
+	 * REQ_FAILFAST_TRANSPORT is set by upper layer software that
+	 * handles multipathing. Unlike SCSI, NVMe's error handling was
+	 * specifically designed to handle local retry for non-path errors.
+	 * As such, allow NVMe's local retry mechanism to be used for
+	 * requests marked with REQ_FAILFAST_TRANSPORT.
+	 */
+	if ((req->cmd_flags & (REQ_FAILFAST_DEV | REQ_FAILFAST_DRIVER)) ||
 	    (nvme_req(req)->status & NVME_SC_DNR) ||
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;

From patchwork Fri Apr 16 22:06:36 2021
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 12209053
From: Mike Snitzer
To: Christoph Hellwig, Jens Axboe
Cc: dm-devel@redhat.com, linux-block@vger.kernel.org,
 linux-nvme@lists.infradead.org
Subject: [PATCH v3 3/4] nvme: introduce FAILUP handling for REQ_FAILFAST_TRANSPORT
Date: Fri, 16 Apr 2021 18:06:36 -0400
Message-Id: <20210416220637.41111-4-snitzer@redhat.com>
In-Reply-To: <20210416220637.41111-1-snitzer@redhat.com>
References: <20210416220637.41111-1-snitzer@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

If REQ_FAILFAST_TRANSPORT is set it means the driver should not retry
IO that completed with transport errors. REQ_FAILFAST_TRANSPORT is
set by multipathing software (e.g. dm-multipath) before it issues IO.

Update NVMe to allow failover of requests marked with either
REQ_NVME_MPATH or REQ_FAILFAST_TRANSPORT. This allows such requests
to be given a disposition of either FAILOVER or FAILUP respectively.
FAILUP handling ensures a retryable error is returned up from NVMe.

Introduce nvme_failup_req() for use in nvme_complete_rq() if
nvme_decide_disposition() returns FAILUP. nvme_failup_req() ensures
the request is completed with a retryable IO error when appropriate.
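
The resulting decision logic can be modelled in user space roughly as
follows. All sk_* names are hypothetical, the kernel's request state is
reduced to a handful of booleans, and the final "retry locally"
fall-through is an assumption based on the description in this series
rather than something shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sk_disposition { SK_COMPLETE, SK_RETRY, SK_FAILOVER, SK_FAILUP };

/* Hypothetical, flattened view of the state the decision depends on. */
struct sk_req {
	bool ok;                 /* command completed without error */
	bool no_retry;           /* DNR set, retries exhausted, or dev/driver fail-fast */
	bool native_mpath;       /* models REQ_NVME_MPATH */
	bool failfast_transport; /* models REQ_FAILFAST_TRANSPORT */
	bool path_error;         /* models nvme_is_path_error() */
	bool queue_dying;        /* models blk_queue_dying() */
};

static enum sk_disposition sk_decide(const struct sk_req *r)
{
	assert(r != NULL);

	if (r->ok || r->no_retry)
		return SK_COMPLETE;

	if (r->native_mpath || r->failfast_transport) {
		/* Path problems go to whichever layer does the multipathing. */
		if (r->path_error || r->queue_dying)
			return r->native_mpath ? SK_FAILOVER : SK_FAILUP;
	} else if (r->queue_dying) {
		return SK_COMPLETE;
	}

	/* Assumed fall-through: everything else is retried locally. */
	return SK_RETRY;
}
```

Note that a request with both flags set is treated as native
multipath, mirroring the conditional expression in the patch.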

Signed-off-by: Mike Snitzer
---
 drivers/nvme/host/core.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4134cf3c7e48..605ffba6835f 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -299,6 +299,7 @@ enum nvme_disposition {
 	COMPLETE,
 	RETRY,
 	FAILOVER,
+	FAILUP,
 };
 
 static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
@@ -318,10 +319,11 @@ static inline enum nvme_disposition nvme_decide_disposition(struct request *req)
 	    nvme_req(req)->retries >= nvme_max_retries)
 		return COMPLETE;
 
-	if (req->cmd_flags & REQ_NVME_MPATH) {
+	if (req->cmd_flags & (REQ_NVME_MPATH | REQ_FAILFAST_TRANSPORT)) {
 		if (nvme_is_path_error(nvme_req(req)->status) ||
 		    blk_queue_dying(req->q))
-			return FAILOVER;
+			return (req->cmd_flags & REQ_NVME_MPATH) ?
+				FAILOVER : FAILUP;
 	} else {
 		if (blk_queue_dying(req->q))
 			return COMPLETE;
@@ -343,6 +345,25 @@ static inline void nvme_end_req(struct request *req)
 	blk_mq_end_request(req, status);
 }
 
+static void nvme_failup_req(struct request *req)
+{
+	blk_status_t status = nvme_error_status(nvme_req(req)->status);
+
+	/* Ensure a retryable path error is returned */
+	if (WARN_ON_ONCE(!blk_path_error(status))) {
+		/*
+		 * If here, nvme_is_path_error() returned true.
+		 * So nvme_error_status() translation needs updating
+		 * relative to blk_path_error(), or vice versa.
+		 */
+		pr_debug("Request meant for failover but blk_status_t (errno=%d) was not retryable.\n",
+			 blk_status_to_errno(status));
+		nvme_req(req)->status = NVME_SC_HOST_PATH_ERROR;
+	}
+
+	nvme_end_req(req);
+}
+
 void nvme_complete_rq(struct request *req)
 {
 	trace_nvme_complete_rq(req);
@@ -361,6 +382,9 @@ void nvme_complete_rq(struct request *req)
 	case FAILOVER:
 		nvme_failover_req(req);
 		return;
+	case FAILUP:
+		nvme_failup_req(req);
+		return;
 	}
 }
 EXPORT_SYMBOL_GPL(nvme_complete_rq);

From patchwork Fri Apr 16 22:06:37 2021
X-Patchwork-Submitter: Mike Snitzer
X-Patchwork-Id: 12209057
From: Mike Snitzer
To: Christoph Hellwig, Jens Axboe
Cc: dm-devel@redhat.com, linux-block@vger.kernel.org,
 linux-nvme@lists.infradead.org
Subject: [PATCH v3 4/4] nvme: decouple basic ANA log page re-read support from native multipathing
Date: Fri, 16 Apr 2021 18:06:37 -0400
Message-Id: <20210416220637.41111-5-snitzer@redhat.com>
In-Reply-To: <20210416220637.41111-1-snitzer@redhat.com>
References: <20210416220637.41111-1-snitzer@redhat.com>
X-Mailing-List: linux-block@vger.kernel.org

Whether or not ANA is present is a choice of the target implementation;
the host (and whether it supports multipathing) has _zero_ influence on
this. If the target declares a path as 'inaccessible' the path _is_
inaccessible to the host. As such, ANA support should be functional
even if native multipathing is not.

Introduce the ability to always re-read the ANA log page as required
due to an ANA error and make the current ANA state available via sysfs
-- even if native multipathing is disabled on the host (e.g.
nvme_core.multipath=N). This is achieved by factoring out
nvme_update_ana() and calling it from both nvme_failover_req() and
nvme_failup_req(), i.e. for all requests that nvme_complete_rq()
handles as FAILOVER or FAILUP.

This affords userspace access to the current ANA state independent of
which layer might be doing multipathing. This makes 'nvme list-subsys'
show ANA state for all NVMe subsystems with multiple controllers. It
also allows userspace multipath-tools to rely on the NVMe driver for
ANA support while dm-multipath takes care of multipathing.

And as always, if embedded NVMe users do not want any performance
overhead associated with ANA or native NVMe multipathing they can
disable CONFIG_NVME_MULTIPATH.

Signed-off-by: Mike Snitzer
---
 drivers/nvme/host/core.c      |  2 ++
 drivers/nvme/host/multipath.c | 16 +++++++++++-----
 drivers/nvme/host/nvme.h      |  4 ++++
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 605ffba6835f..9a878a599897 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -349,6 +349,8 @@ static void nvme_failup_req(struct request *req)
 {
 	blk_status_t status = nvme_error_status(nvme_req(req)->status);
 
+	nvme_update_ana(req);
+
 	/* Ensure a retryable path error is returned */
 	if (WARN_ON_ONCE(!blk_path_error(status))) {
 		/*
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index a1d476e1ac02..7d94250264aa 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -65,23 +65,29 @@ void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 	}
 }
 
-void nvme_failover_req(struct request *req)
+void nvme_update_ana(struct request *req)
 {
 	struct nvme_ns *ns = req->q->queuedata;
 	u16 status = nvme_req(req)->status & 0x7ff;
-	unsigned long flags;
-
-	nvme_mpath_clear_current_path(ns);
 
 	/*
 	 * If we got back an ANA error, we know the controller is alive but not
-	 * ready to serve this namespace.  Kick of a re-read of the ANA
+	 * ready to serve this namespace.  Kick off a re-read of the ANA
 	 * information page, and just try any other available path for now.
 	 */
 	if (nvme_is_ana_error(status) && ns->ctrl->ana_log_buf) {
 		set_bit(NVME_NS_ANA_PENDING, &ns->flags);
 		queue_work(nvme_wq, &ns->ctrl->ana_work);
 	}
+}
+
+void nvme_failover_req(struct request *req)
+{
+	struct nvme_ns *ns = req->q->queuedata;
+	unsigned long flags;
+
+	nvme_mpath_clear_current_path(ns);
+	nvme_update_ana(req);
 
 	spin_lock_irqsave(&ns->head->requeue_lock, flags);
 	blk_steal_bios(&ns->head->requeue_list, req);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 07b34175c6ce..4eed8536625c 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -664,6 +664,7 @@ void nvme_mpath_start_freeze(struct nvme_subsystem *subsys);
 void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 			struct nvme_ctrl *ctrl, int *flags);
 void nvme_failover_req(struct request *req);
+void nvme_update_ana(struct request *req);
 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl);
 int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl,struct nvme_ns_head *head);
 void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id);
@@ -714,6 +715,9 @@ static inline void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 static inline void nvme_failover_req(struct request *req)
 {
 }
+static inline void nvme_update_ana(struct request *req)
+{
+}
 static inline void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
 {
 }