From patchwork Tue May 17 22:08:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tanjore Suresh X-Patchwork-Id: 12852981 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E31F5C433EF for ; Tue, 17 May 2022 22:08:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230124AbiEQWIb (ORCPT ); Tue, 17 May 2022 18:08:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38388 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230122AbiEQWIa (ORCPT ); Tue, 17 May 2022 18:08:30 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE396377FF for ; Tue, 17 May 2022 15:08:27 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id m136-20020a25268e000000b0064b233e03d1so300160ybm.14 for ; Tue, 17 May 2022 15:08:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=tf+qNQoI5qGFZ9RW7jx4HEPwOLh/TGpPIUH/6a1l9U4=; b=Vsd0rtAyuCX1ep9X8JPz0kVgE4qriTwFWX2119dzWimM6EjZ30AHxEw53p3qxrWdi5 POmyjeHImwG4MqnpLGozKFxoNPL5EGmsOIPrfxbeifQQoCw61r9cF6yLgj9qaJudrBCn TjT2tZwfvaODmd0MssVVbrKjGeMSc6AW81dJLt5WkCPgCAdTmPdy1mskr6PlDa56AKta 25rvMKsojCYOuGZa4hqfLIuJZnoUa0GnEMoN2rRTBJeMWGKQFe9SJr22yWJXSPsaCeoQ awSYMJYsMF7kP+naNxCLPkKZAodpvp94xIYrYzmJDzVjbz7xUvrsI31pwdAGhXQwJLv7 LH7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=tf+qNQoI5qGFZ9RW7jx4HEPwOLh/TGpPIUH/6a1l9U4=; 
b=NNKuDmeA1H7jcIgBG7DEjQ8SfABU296VzvJwNjgGZzZxST6WjcxgHmVkxZeQl3kL1L 2nOIz2/+Kw2FVqHK/HO5K3EuvLNCygvbuEwN3qGukgjn6RB2kmG/aGdOJMs1Gn0iLaXP qpXnvDN3F68HP89DY9V2cqJiOEjJkhcp4mWlWBxcnrA0SEF3fLsbUeoxi55DrTwkKN0p xCtwP/siHQSapal87dWtgyJn+E61tijOoUyrLeU4qLJoBgs4cilM49wsCDxs4adCZUVC 4UwZAnVuUqo0jWZk71vRQGhsOk//qy/pt6aNr2QnlyCNVS5xwCqaF2WmbkrM+QIDAiol ZiKw== X-Gm-Message-State: AOAM533+7xE2yV1qSKJ0EYrjOvOdXTy9HLpf3oS25WMpb1gsjccATmEx 26cbBQPU1w4OxGoSSKwC+/Ay0bijwMwXTpU= X-Google-Smtp-Source: ABdhPJwcmd7U+J0BbWux8Dql+fC0+EhIAotkoA14iXEPCqqLPyINCsjksq6MYJzDnCiD6Pwc0H4rOmOTjArLiFM= X-Received: from tansuresh.svl.corp.google.com ([2620:15c:2c5:13:3c9b:5345:708:1378]) (user=tansuresh job=sendgmr) by 2002:a25:9c08:0:b0:64b:c9f8:de84 with SMTP id c8-20020a259c08000000b0064bc9f8de84mr20468621ybo.391.1652825307082; Tue, 17 May 2022 15:08:27 -0700 (PDT) Date: Tue, 17 May 2022 15:08:14 -0700 In-Reply-To: <20220517220816.1635044-1-tansuresh@google.com> Message-Id: <20220517220816.1635044-2-tansuresh@google.com> Mime-Version: 1.0 References: <20220517220816.1635044-1-tansuresh@google.com> X-Mailer: git-send-email 2.36.0.550.gb090851708-goog Subject: [PATCH v3 1/3] driver core: Support asynchronous driver shutdown From: Tanjore Suresh To: Greg Kroah-Hartman , "Rafael J . Wysocki" , Christoph Hellwig , Sagi Grimberg , Bjorn Helgaas Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, Tanjore Suresh Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org This changes the bus driver interface with additional entry points to enable devices to implement asynchronous shutdown. The existing synchronous interface to shutdown is unmodified and retained for backward compatibility. This changes the common device shutdown code to enable devices to participate in asynchronous shutdown implementation. 
Signed-off-by: Tanjore Suresh --- drivers/base/core.c | 38 +++++++++++++++++++++++++++++++++++++- include/linux/device/bus.h | 12 ++++++++++++ 2 files changed, 49 insertions(+), 1 deletion(-) diff --git a/drivers/base/core.c b/drivers/base/core.c index 3d6430eb0c6a..ba267ae70a22 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -4479,6 +4479,7 @@ EXPORT_SYMBOL_GPL(device_change_owner); void device_shutdown(void) { struct device *dev, *parent; + LIST_HEAD(async_shutdown_list); wait_for_device_probe(); device_block_probing(); @@ -4523,7 +4524,13 @@ void device_shutdown(void) dev_info(dev, "shutdown_pre\n"); dev->class->shutdown_pre(dev); } - if (dev->bus && dev->bus->shutdown) { + if (dev->bus && dev->bus->async_shutdown_start) { + if (initcall_debug) + dev_info(dev, "async_shutdown_start\n"); + dev->bus->async_shutdown_start(dev); + list_add_tail(&dev->kobj.entry, + &async_shutdown_list); + } else if (dev->bus && dev->bus->shutdown) { if (initcall_debug) dev_info(dev, "shutdown\n"); dev->bus->shutdown(dev); @@ -4543,6 +4550,35 @@ void device_shutdown(void) spin_lock(&devices_kset->list_lock); } spin_unlock(&devices_kset->list_lock); + + /* + * Second pass: complete the shutdown only for devices + * that started an asynchronous shutdown. 
+ */ + while (!list_empty(&async_shutdown_list)) { + dev = list_entry(async_shutdown_list.next, struct device, + kobj.entry); + parent = get_device(dev->parent); + get_device(dev); + /* + * Make sure the device is off the list + */ + list_del_init(&dev->kobj.entry); + if (parent) + device_lock(parent); + device_lock(dev); + if (dev->bus && dev->bus->async_shutdown_end) { + if (initcall_debug) + dev_info(dev, + "async_shutdown_end called\n"); + dev->bus->async_shutdown_end(dev); + } + device_unlock(dev); + if (parent) + device_unlock(parent); + put_device(dev); + put_device(parent); + } } /* diff --git a/include/linux/device/bus.h b/include/linux/device/bus.h index a039ab809753..f582c9d21515 100644 --- a/include/linux/device/bus.h +++ b/include/linux/device/bus.h @@ -49,6 +49,16 @@ struct fwnode_handle; * will never get called until they do. * @remove: Called when a device removed from this bus. * @shutdown: Called at shut-down time to quiesce the device. + * @async_shutdown_start: Called at shutdown time to start + * the shutdown process on the device. + * This entry point will be called only + * when the bus driver has indicated it would + * like to participate in asynchronous shutdown + * completion. + * @async_shutdown_end: Called at shutdown time to complete the shutdown + * process of the device. This entry point will be called + * only when the bus driver has indicated it would like to + * participate in asynchronous shutdown completion. * * @online: Called to put the device back online (after offlining it). * @offline: Called to put the device offline for hot-removal. May fail. 
@@ -93,6 +103,8 @@ struct bus_type { void (*sync_state)(struct device *dev); void (*remove)(struct device *dev); void (*shutdown)(struct device *dev); + void (*async_shutdown_start)(struct device *dev); + void (*async_shutdown_end)(struct device *dev); int (*online)(struct device *dev); int (*offline)(struct device *dev); From patchwork Tue May 17 22:08:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tanjore Suresh X-Patchwork-Id: 12852982 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7D02C433EF for ; Tue, 17 May 2022 22:08:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230146AbiEQWIh (ORCPT ); Tue, 17 May 2022 18:08:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38486 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230134AbiEQWIf (ORCPT ); Tue, 17 May 2022 18:08:35 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2FBE5377FA for ; Tue, 17 May 2022 15:08:32 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id p5-20020a254205000000b0064da2110759so308129yba.12 for ; Tue, 17 May 2022 15:08:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=zm0DqW+BoHKP/acD7zIvmyyRPadAhJ70kej4mteuoVA=; b=AsNbs7wnDoomcBDBtfoCOElE6jNniS5QhqV6pIza4jl4lpHIHoIYTtS9hAoHBuc1P5 Lai81UoqAapTGV+EhVeBkb9ifvYs4rS3/UM04qTmqwbuktXpET6Zr8lur6zjXKSJaDYw JrAu/bjtVVQCKzKgvkPerJoZcyc/wAI3pKA7B3S2FvC6tHkHMFB33Y8FGIKJVRhU9Fvq VXoLxEyN6Zuu+gbZk8aUKRbh2CbfJd5zJjM4WNL+YDervkJkXdgRDNW82FG9uJVWj1US 
bI3ndM7Dlh7Tc/WbUDlpdK80LuSSQuPZmwdVDBcPS5Xq1OQBpuUWpp81khbnmmAAbegu xmaw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=zm0DqW+BoHKP/acD7zIvmyyRPadAhJ70kej4mteuoVA=; b=t9DnZdtlYlb567373TTx824CO0jRvhtbZBgs+AT2YVhqumrfUKaSmfmdkpPc8SR5Hu yVrO0uNPoPzwXHrhEZSeyPww8Qwe+HOisnmMQf5wKyfgiongkdEKQHG6UFAetK6rtOJS 697jdAx33f4M8RkTjGnvw/gVFjTDzzDt+WwnKHBZv7Yy07OsWF3kwNU4HjtQZxwQnYg8 JmKXhQG46S/IXbKoR4ZB6618MmvlBI1gUJ+qDJ0+6/PsIaICAkblcT6UOszTXBrdg1yl 62Quo+8cxfZof3FrsdOOuI0r8dFsTaGj1FVNXiPqJGYDQFHS81bYdqzb+uH0mxqKlxGu xtqw== X-Gm-Message-State: AOAM530j9hbh/uvl+/RvB+hs9kaqX/cujnklspL9BO9FrjaKU++8uOpK DvIxx0bfsymUOKxLm8vWs0uTKWRIZ1h3sPk= X-Google-Smtp-Source: ABdhPJzdJAzeV7gm8AS0Ny+YavQ8JOhwxUevF3poLXIh2oE3WhHG+lRupAn6BASD11L0wQDazBgiZye60KwAhw4= X-Received: from tansuresh.svl.corp.google.com ([2620:15c:2c5:13:3c9b:5345:708:1378]) (user=tansuresh job=sendgmr) by 2002:a5b:481:0:b0:649:d872:d521 with SMTP id n1-20020a5b0481000000b00649d872d521mr24676650ybp.73.1652825311407; Tue, 17 May 2022 15:08:31 -0700 (PDT) Date: Tue, 17 May 2022 15:08:15 -0700 In-Reply-To: <20220517220816.1635044-2-tansuresh@google.com> Message-Id: <20220517220816.1635044-3-tansuresh@google.com> Mime-Version: 1.0 References: <20220517220816.1635044-1-tansuresh@google.com> <20220517220816.1635044-2-tansuresh@google.com> X-Mailer: git-send-email 2.36.0.550.gb090851708-goog Subject: [PATCH v3 2/3] PCI: Support asynchronous shutdown From: Tanjore Suresh To: Greg Kroah-Hartman , "Rafael J . Wysocki" , Christoph Hellwig , Sagi Grimberg , Bjorn Helgaas Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, Tanjore Suresh Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Enhances the base PCI driver to add support for asynchronous shutdown. Assume a device takes n secs to shutdown. 
If a machine is populated with M such devices and they are shut down synchronously, the total time spent shutting down all the devices is M * n secs. For example, if each NVMe PCI controller takes 5 secs to shut down and there are 16 such controllers in a system, the system spends a total of 80 secs shutting down all of its NVMe devices. To reduce the shutdown time, an asynchronous shutdown interface has been implemented: shutdown is started on every device before waiting for any of them to complete, so the per-device waits overlap. This reduces the machine reboot time. Signed-off-by: Tanjore Suresh --- drivers/pci/pci-driver.c | 20 ++++++++++++++++---- include/linux/pci.h | 4 ++++ 2 files changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index 4ceeb75fc899..63f49a8dff8e 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -501,16 +501,28 @@ static void pci_device_remove(struct device *dev) pci_dev_put(pci_dev); } -static void pci_device_shutdown(struct device *dev) +static void pci_device_async_shutdown_start(struct device *dev) { struct pci_dev *pci_dev = to_pci_dev(dev); struct pci_driver *drv = pci_dev->driver; pm_runtime_resume(dev); - if (drv && drv->shutdown) + if (drv && drv->async_shutdown_start) + drv->async_shutdown_start(pci_dev); + else if (drv && drv->shutdown) drv->shutdown(pci_dev); +} + +static void pci_device_async_shutdown_end(struct device *dev) +{ + struct pci_dev *pci_dev = to_pci_dev(dev); + struct pci_driver *drv = pci_dev->driver; + + if (drv && drv->async_shutdown_end) + drv->async_shutdown_end(pci_dev); + /* * If this is a kexec reboot, turn off Bus Master bit on the * device to tell it to not continue to do DMA. Don't touch @@ -521,7 +533,6 @@ static void pci_device_shutdown(struct device *dev) if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot)) pci_clear_master(pci_dev); } - #ifdef CONFIG_PM /* Auxiliary functions used for system resume and run-time resume. 
*/ @@ -1625,7 +1636,8 @@ struct bus_type pci_bus_type = { .uevent = pci_uevent, .probe = pci_device_probe, .remove = pci_device_remove, - .shutdown = pci_device_shutdown, + .async_shutdown_start = pci_device_async_shutdown_start, + .async_shutdown_end = pci_device_async_shutdown_end, .dev_groups = pci_dev_groups, .bus_groups = pci_bus_groups, .drv_groups = pci_drv_groups, diff --git a/include/linux/pci.h b/include/linux/pci.h index 60adf42460ab..ef5500c18fed 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -881,6 +881,8 @@ struct module; * Useful for enabling wake-on-lan (NIC) or changing * the power state of a device before reboot. * e.g. drivers/net/e100.c. + * @async_shutdown_start: This starts the asynchronous shutdown + * @async_shutdown_end: This completes the started asynchronous shutdown * @sriov_configure: Optional driver callback to allow configuration of * number of VFs to enable via sysfs "sriov_numvfs" file. * @sriov_set_msix_vec_count: PF Driver callback to change number of MSI-X @@ -905,6 +907,8 @@ struct pci_driver { int (*suspend)(struct pci_dev *dev, pm_message_t state); /* Device suspended */ int (*resume)(struct pci_dev *dev); /* Device woken up */ void (*shutdown)(struct pci_dev *dev); + void (*async_shutdown_start)(struct pci_dev *dev); + void (*async_shutdown_end)(struct pci_dev *dev); int (*sriov_configure)(struct pci_dev *dev, int num_vfs); /* On PF */ int (*sriov_set_msix_vec_count)(struct pci_dev *vf, int msix_vec_count); /* On PF */ u32 (*sriov_get_vf_total_msix)(struct pci_dev *pf); From patchwork Tue May 17 22:08:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tanjore Suresh X-Patchwork-Id: 12852983 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 08083C433EF for ; Tue, 17 
May 2022 22:08:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230171AbiEQWIp (ORCPT ); Tue, 17 May 2022 18:08:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38594 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230164AbiEQWIi (ORCPT ); Tue, 17 May 2022 18:08:38 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3FC8637AB7 for ; Tue, 17 May 2022 15:08:36 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id x3-20020a25b3c3000000b0064e03a85ccbso328759ybf.5 for ; Tue, 17 May 2022 15:08:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=sUFcKVd5mQylxNWANG3jpkv2FlMsca8kAE4xpeCEJf0=; b=aIxa/I42w535claG/lW2XsGiWQff1K+xoXMPx6bOH8I5yrWsu7Rkm5FMzZ9IsFqUXK ebSMCh201NcRhTsuw7xr/mv3k1qIZtvcyO8Tzw65vsbN1MiIgbhrPjG4X2hY0/CJ6cLG YQ0m/XK2CRM1//uhFXgE488iEYVwEZfW8rGOsC3ppkTwwDgj/Jt3m9Oj/pR3BPNVmR7A rFsfEBwGrPbLWFR5uIzosFaX4d32oyMEwlkgSXhaeidG5TTAM6br1bNyZBj9YfETNMo6 TzWNyP06Ljgvr16iwwTHion0unZdgQmlYhuGCvqq4Fn0vOV9Gkdp4wYFpmSech6MeON/ 332A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=sUFcKVd5mQylxNWANG3jpkv2FlMsca8kAE4xpeCEJf0=; b=8Ne0DOywbFMrn05BfzFm9J7sklip/sUmIgMgdw+bLCt9X/smh92D7RRO7ppBNIiwit +u4tMdamxGxSOlf+/36qZUPAeKYzB2iA+mK4LkTIroVzlSzWkHrS+UGlkrpWfNPjw5hN kg3gCmpC8EdRcr8AN7vmVmfpwLfytU0EPZ6bP6pUUSpfyMLD3E8eYpDLceAczoGSfm7+ L++DixXo392FuZ4SmatxJg1gvu2UGDPFX2YwMfnEGLWT4aeHyDlRVKReyGhHdFjGzuHe ddkOP0ibtDfP0JnlG219IDe2rwDOTcq7xVt9fnmYXXn2fGu+4xmmHClstvBzWC6q39jt sCMA== X-Gm-Message-State: AOAM532f8qX8Q1uUpiuQWEYs1fd3nCVm8F+7RYluAqrCCUfenesVvWkP eedTJezvP6awZyp502g65Rj1KXz+RkgDQgU= X-Google-Smtp-Source: 
ABdhPJwJndPxiZpo8HoH7GmaDggY9zxdEyZMeeT+SjZw0CJObzVfZY0QnKKLPFR4XXhzP3ArpKkIWOQOsdzQlTQ= X-Received: from tansuresh.svl.corp.google.com ([2620:15c:2c5:13:3c9b:5345:708:1378]) (user=tansuresh job=sendgmr) by 2002:a25:8a8a:0:b0:648:4d85:1331 with SMTP id h10-20020a258a8a000000b006484d851331mr24748418ybl.643.1652825315404; Tue, 17 May 2022 15:08:35 -0700 (PDT) Date: Tue, 17 May 2022 15:08:16 -0700 In-Reply-To: <20220517220816.1635044-3-tansuresh@google.com> Message-Id: <20220517220816.1635044-4-tansuresh@google.com> Mime-Version: 1.0 References: <20220517220816.1635044-1-tansuresh@google.com> <20220517220816.1635044-2-tansuresh@google.com> <20220517220816.1635044-3-tansuresh@google.com> X-Mailer: git-send-email 2.36.0.550.gb090851708-goog Subject: [PATCH v3 3/3] nvme: Add async shutdown support From: Tanjore Suresh To: Greg Kroah-Hartman , "Rafael J . Wysocki" , Christoph Hellwig , Sagi Grimberg , Bjorn Helgaas Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, Tanjore Suresh Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org This uses the asynchronous shutdown mechanism set up for PCI drivers by providing both the async_shutdown_start and async_shutdown_end routines at the pci_driver structure level. The async_shutdown_start routine starts the shutdown and does not wait for the shutdown to complete. The async_shutdown_end routine waits for the shutdown to complete on the individual controllers that this driver instance controls. This mechanism speeds up the shutdown in a system that hosts many controllers. 
Signed-off-by: Tanjore Suresh --- drivers/nvme/host/core.c | 28 ++++++++++---- drivers/nvme/host/nvme.h | 8 ++++ drivers/nvme/host/pci.c | 80 +++++++++++++++++++++++++--------------- 3 files changed, 80 insertions(+), 36 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index e1846d04817f..f2fc62e1176e 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2188,16 +2188,30 @@ EXPORT_SYMBOL_GPL(nvme_enable_ctrl); int nvme_shutdown_ctrl(struct nvme_ctrl *ctrl) { - unsigned long timeout = jiffies + (ctrl->shutdown_timeout * HZ); - u32 csts; int ret; + ret = nvme_shutdown_ctrl_start(ctrl); + if (ret) + return ret; + return nvme_wait_for_shutdown_cmpl(ctrl); +} +EXPORT_SYMBOL_GPL(nvme_shutdown_ctrl); + +int nvme_shutdown_ctrl_start(struct nvme_ctrl *ctrl) +{ + ctrl->ctrl_config &= ~NVME_CC_SHN_MASK; ctrl->ctrl_config |= NVME_CC_SHN_NORMAL; - ret = ctrl->ops->reg_write32(ctrl, NVME_REG_CC, ctrl->ctrl_config); - if (ret) - return ret; + return ctrl->ops->reg_write32(ctrl, NVME_REG_CC, ctrl->ctrl_config); +} +EXPORT_SYMBOL_GPL(nvme_shutdown_ctrl_start); + +int nvme_wait_for_shutdown_cmpl(struct nvme_ctrl *ctrl) +{ + unsigned long deadline = jiffies + (ctrl->shutdown_timeout * HZ); + u32 csts; + int ret; while ((ret = ctrl->ops->reg_read32(ctrl, NVME_REG_CSTS, &csts)) == 0) { if ((csts & NVME_CSTS_SHST_MASK) == NVME_CSTS_SHST_CMPLT) @@ -2206,7 +2220,7 @@ int nvme_shutdown_ctrl(struct nvme_ctrl *ctrl) msleep(100); if (fatal_signal_pending(current)) return -EINTR; - if (time_after(jiffies, timeout)) { + if (time_after(jiffies, deadline)) { dev_err(ctrl->device, "Device shutdown incomplete; abort shutdown\n"); return -ENODEV; @@ -2215,7 +2229,7 @@ int nvme_shutdown_ctrl(struct nvme_ctrl *ctrl) return ret; } -EXPORT_SYMBOL_GPL(nvme_shutdown_ctrl); +EXPORT_SYMBOL_GPL(nvme_wait_for_shutdown_cmpl); static int nvme_configure_timestamp(struct nvme_ctrl *ctrl) { diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 
a2b53ca63335..a8706a0f32f2 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -175,6 +175,12 @@ enum { NVME_REQ_USERCMD = (1 << 1), }; +enum shutdown_type { + DO_NOT_SHUTDOWN = 0, + SHUTDOWN_TYPE_SYNC = 1, + SHUTDOWN_TYPE_ASYNC = 2, +}; + static inline struct nvme_request *nvme_req(struct request *req) { return blk_mq_rq_to_pdu(req); @@ -677,6 +683,8 @@ bool nvme_wait_reset(struct nvme_ctrl *ctrl); int nvme_disable_ctrl(struct nvme_ctrl *ctrl); int nvme_enable_ctrl(struct nvme_ctrl *ctrl); int nvme_shutdown_ctrl(struct nvme_ctrl *ctrl); +int nvme_shutdown_ctrl_start(struct nvme_ctrl *ctrl); +int nvme_wait_for_shutdown_cmpl(struct nvme_ctrl *ctrl); int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev, const struct nvme_ctrl_ops *ops, unsigned long quirks); void nvme_uninit_ctrl(struct nvme_ctrl *ctrl); diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 3aacf1c0d5a5..a0ab2e777d44 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -107,7 +107,7 @@ MODULE_PARM_DESC(noacpi, "disable acpi bios quirks"); struct nvme_dev; struct nvme_queue; -static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown); +static void nvme_dev_disable(struct nvme_dev *dev, int shutdown_type); static bool __nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode); /* @@ -1357,7 +1357,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) */ if (nvme_should_reset(dev, csts)) { nvme_warn_reset(dev, csts); - nvme_dev_disable(dev, false); + nvme_dev_disable(dev, DO_NOT_SHUTDOWN); nvme_reset_ctrl(&dev->ctrl); return BLK_EH_DONE; } @@ -1392,7 +1392,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) "I/O %d QID %d timeout, disable controller\n", req->tag, nvmeq->qid); nvme_req(req)->flags |= NVME_REQ_CANCELLED; - nvme_dev_disable(dev, true); + nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC); return BLK_EH_DONE; case NVME_CTRL_RESETTING: return BLK_EH_RESET_TIMER; @@ 
-1410,7 +1410,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) "I/O %d QID %d timeout, reset controller\n", req->tag, nvmeq->qid); nvme_req(req)->flags |= NVME_REQ_CANCELLED; - nvme_dev_disable(dev, false); + nvme_dev_disable(dev, DO_NOT_SHUTDOWN); nvme_reset_ctrl(&dev->ctrl); return BLK_EH_DONE; @@ -1503,11 +1503,13 @@ static void nvme_suspend_io_queues(struct nvme_dev *dev) nvme_suspend_queue(&dev->queues[i]); } -static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown) +static void nvme_disable_admin_queue(struct nvme_dev *dev, int shutdown_type) { struct nvme_queue *nvmeq = &dev->queues[0]; - if (shutdown) + if (shutdown_type == SHUTDOWN_TYPE_ASYNC) + nvme_shutdown_ctrl_start(&dev->ctrl); + else if (shutdown_type == SHUTDOWN_TYPE_SYNC) nvme_shutdown_ctrl(&dev->ctrl); else nvme_disable_ctrl(&dev->ctrl); @@ -2669,7 +2671,7 @@ static void nvme_pci_disable(struct nvme_dev *dev) } } -static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown) +static void nvme_dev_disable(struct nvme_dev *dev, int shutdown_type) { bool dead = true, freeze = false; struct pci_dev *pdev = to_pci_dev(dev->dev); @@ -2691,14 +2693,14 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown) * Give the controller a chance to complete all entered requests if * doing a safe shutdown. 
*/ - if (!dead && shutdown && freeze) + if (!dead && (shutdown_type != DO_NOT_SHUTDOWN) && freeze) nvme_wait_freeze_timeout(&dev->ctrl, NVME_IO_TIMEOUT); nvme_stop_queues(&dev->ctrl); if (!dead && dev->ctrl.queue_count > 0) { nvme_disable_io_queues(dev); - nvme_disable_admin_queue(dev, shutdown); + nvme_disable_admin_queue(dev, shutdown_type); } nvme_suspend_io_queues(dev); nvme_suspend_queue(&dev->queues[0]); @@ -2710,12 +2712,12 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown) blk_mq_tagset_wait_completed_request(&dev->tagset); blk_mq_tagset_wait_completed_request(&dev->admin_tagset); - /* - * The driver will not be starting up queues again if shutting down so - * must flush all entered requests to their failed completion to avoid - * deadlocking blk-mq hot-cpu notifier. - */ - if (shutdown) { + if (shutdown_type == SHUTDOWN_TYPE_SYNC) { + /* + * The driver will not be starting up queues again if shutting down so + * must flush all entered requests to their failed completion to avoid + * deadlocking blk-mq hot-cpu notifier. 
+ */ nvme_start_queues(&dev->ctrl); if (dev->ctrl.admin_q && !blk_queue_dying(dev->ctrl.admin_q)) nvme_start_admin_queue(&dev->ctrl); @@ -2723,11 +2725,11 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown) mutex_unlock(&dev->shutdown_lock); } -static int nvme_disable_prepare_reset(struct nvme_dev *dev, bool shutdown) +static int nvme_disable_prepare_reset(struct nvme_dev *dev, int type) { if (!nvme_wait_reset(&dev->ctrl)) return -EBUSY; - nvme_dev_disable(dev, shutdown); + nvme_dev_disable(dev, type); return 0; } @@ -2785,7 +2787,7 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev) */ nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING); nvme_get_ctrl(&dev->ctrl); - nvme_dev_disable(dev, false); + nvme_dev_disable(dev, DO_NOT_SHUTDOWN); nvme_kill_queues(&dev->ctrl); if (!queue_work(nvme_wq, &dev->remove_work)) nvme_put_ctrl(&dev->ctrl); @@ -2810,7 +2812,7 @@ static void nvme_reset_work(struct work_struct *work) * moving on. */ if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) - nvme_dev_disable(dev, false); + nvme_dev_disable(dev, DO_NOT_SHUTDOWN); nvme_sync_queues(&dev->ctrl); mutex_lock(&dev->shutdown_lock); @@ -3151,7 +3153,7 @@ static void nvme_reset_prepare(struct pci_dev *pdev) * state as pci_dev device lock is held, making it impossible to race * with ->remove(). 
*/ - nvme_disable_prepare_reset(dev, false); + nvme_disable_prepare_reset(dev, DO_NOT_SHUTDOWN); nvme_sync_queues(&dev->ctrl); } @@ -3163,13 +3165,32 @@ static void nvme_reset_done(struct pci_dev *pdev) flush_work(&dev->ctrl.reset_work); } -static void nvme_shutdown(struct pci_dev *pdev) +static void nvme_async_shutdown_start(struct pci_dev *pdev) { struct nvme_dev *dev = pci_get_drvdata(pdev); - nvme_disable_prepare_reset(dev, true); + nvme_disable_prepare_reset(dev, SHUTDOWN_TYPE_ASYNC); } +static void nvme_async_shutdown_end(struct pci_dev *pdev) +{ + struct nvme_dev *dev = pci_get_drvdata(pdev); + + mutex_lock(&dev->shutdown_lock); + nvme_wait_for_shutdown_cmpl(&dev->ctrl); + + /* + * The driver will not be starting up queues again if shutting down so + * must flush all entered requests to their failed completion to avoid + * deadlocking blk-mq hot-cpu notifier. + */ + nvme_start_queues(&dev->ctrl); + if (dev->ctrl.admin_q && !blk_queue_dying(dev->ctrl.admin_q)) + nvme_start_admin_queue(&dev->ctrl); + + mutex_unlock(&dev->shutdown_lock); + +} static void nvme_remove_attrs(struct nvme_dev *dev) { if (dev->attrs_added) @@ -3191,13 +3212,13 @@ static void nvme_remove(struct pci_dev *pdev) if (!pci_device_is_present(pdev)) { nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DEAD); - nvme_dev_disable(dev, true); + nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC); } flush_work(&dev->ctrl.reset_work); nvme_stop_ctrl(&dev->ctrl); nvme_remove_namespaces(&dev->ctrl); - nvme_dev_disable(dev, true); + nvme_dev_disable(dev, SHUTDOWN_TYPE_SYNC); nvme_remove_attrs(dev); nvme_free_host_mem(dev); nvme_dev_remove_admin(dev); @@ -3259,7 +3280,7 @@ static int nvme_suspend(struct device *dev) if (pm_suspend_via_firmware() || !ctrl->npss || !pcie_aspm_enabled(pdev) || (ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND)) - return nvme_disable_prepare_reset(ndev, true); + return nvme_disable_prepare_reset(ndev, SHUTDOWN_TYPE_SYNC); nvme_start_freeze(ctrl); nvme_wait_freeze(ctrl); @@ -3302,7 +3323,7 
@@ static int nvme_suspend(struct device *dev) * Clearing npss forces a controller reset on resume. The * correct value will be rediscovered then. */ - ret = nvme_disable_prepare_reset(ndev, true); + ret = nvme_disable_prepare_reset(ndev, SHUTDOWN_TYPE_SYNC); ctrl->npss = 0; } unfreeze: @@ -3314,7 +3335,7 @@ static int nvme_simple_suspend(struct device *dev) { struct nvme_dev *ndev = pci_get_drvdata(to_pci_dev(dev)); - return nvme_disable_prepare_reset(ndev, true); + return nvme_disable_prepare_reset(ndev, SHUTDOWN_TYPE_SYNC); } static int nvme_simple_resume(struct device *dev) @@ -3351,7 +3372,7 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev, case pci_channel_io_frozen: dev_warn(dev->ctrl.device, "frozen state error detected, reset controller\n"); - nvme_dev_disable(dev, false); + nvme_dev_disable(dev, DO_NOT_SHUTDOWN); return PCI_ERS_RESULT_NEED_RESET; case pci_channel_io_perm_failure: dev_warn(dev->ctrl.device, @@ -3488,7 +3509,8 @@ static struct pci_driver nvme_driver = { .id_table = nvme_id_table, .probe = nvme_probe, .remove = nvme_remove, - .shutdown = nvme_shutdown, + .async_shutdown_start = nvme_async_shutdown_start, + .async_shutdown_end = nvme_async_shutdown_end, #ifdef CONFIG_PM_SLEEP .driver = { .pm = &nvme_dev_pm_ops,
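Note on the series as a whole: the timing argument in patch 2/3 (M * n secs when devices are shut down one after another) and the two-pass structure in patch 1/3 can be checked with a small user-space model. The following is only a sketch under stated assumptions, not kernel code: struct sim_dev, sim_shutdown_start(), sim_shutdown_end(), shutdown_sync() and shutdown_async() are hypothetical names invented for illustration, time is a simulated counter in seconds, and every device is assumed to make progress independently once its shutdown has been started.

```c
#include <assert.h>

/* Hypothetical stand-in for a device; this is not a kernel structure. */
struct sim_dev {
	unsigned int latency;	/* seconds the device needs to shut down */
	unsigned int done_at;	/* simulated time at which shutdown completes */
};

static unsigned int now;	/* simulated clock, in seconds */

/* Start the shutdown and return immediately (cf. async_shutdown_start). */
static void sim_shutdown_start(struct sim_dev *dev)
{
	dev->done_at = now + dev->latency;
}

/* Wait until the started shutdown has completed (cf. async_shutdown_end). */
static void sim_shutdown_end(struct sim_dev *dev)
{
	if (now < dev->done_at)
		now = dev->done_at;
	assert(now >= dev->done_at);
}

/* Synchronous model: each device is started and waited for in turn. */
static unsigned int shutdown_sync(struct sim_dev *devs, unsigned int count)
{
	unsigned int i;

	now = 0;
	for (i = 0; i < count; i++) {
		sim_shutdown_start(&devs[i]);
		sim_shutdown_end(&devs[i]);
	}
	return now;
}

/* Two-pass model: start every device first, then wait for each one. */
static unsigned int shutdown_async(struct sim_dev *devs, unsigned int count)
{
	unsigned int i;

	now = 0;
	for (i = 0; i < count; i++)
		sim_shutdown_start(&devs[i]);
	for (i = 0; i < count; i++)
		sim_shutdown_end(&devs[i]);
	return now;
}
```

With 16 devices of 5 secs each, as in the NVMe example from the commit message, shutdown_sync() returns 80 and shutdown_async() returns 5 in this model. Real hardware may not overlap this cleanly, so the model gives a lower bound on the two-pass time, not a measurement.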