From patchwork Thu May 3 14:55:09 2018
X-Patchwork-Submitter: Mikulas Patocka
X-Patchwork-Id: 10378615
X-Patchwork-Delegate: bhelgaas@google.com
Date: Thu, 3 May 2018 10:55:09 -0400 (EDT)
From: Mikulas Patocka <mpatocka@redhat.com>
To: Keith Busch
cc: Keith Busch, Ming Lei, Christoph Hellwig, linux-nvme, Sagi Grimberg,
    linux-pci@vger.kernel.org, Bjorn Helgaas
Subject: Re: [PATCH] nvme/pci: Use async_schedule for initial reset work
In-Reply-To: <20180502152953.GH5938@localhost.localdomain>
References: <20180427211708.5604-1-keith.busch@intel.com>
 <20180430194533.GC5938@localhost.localdomain>
 <20180502152953.GH5938@localhost.localdomain>
User-Agent: Alpine 2.02 (LRH 1266 2009-07-14)
On Wed, 2 May 2018, Keith Busch wrote:

> On Tue, May 01, 2018 at 07:33:10PM -0400, Mikulas Patocka wrote:
> > On Mon, 30 Apr 2018, Keith Busch wrote:
> >
> > > On Sat, Apr 28, 2018 at 05:11:18PM +0800, Ming Lei wrote:
> > > > Looks fine,
> > > >
> > > > Reviewed-by: Ming Lei
> > >
> > > Thanks, Ming.
> > >
> > > Mikulas, would you be able to test this and confirm it works for you?
> > > This appears successful in my testing, but I want to hear from the
> > > source if possible.
> > >
> > > Thanks,
> > > Keith
> >
> > The patch is not correct - scan_work is still called from a workqueue,
> > and if it is too slow, the nvme device is not found when mounting root.
> >
> > You can add msleep(10000) at the beginning of nvme_scan_work to test for
> > the race condition on your system.
> >
> > Here I submit the corrected patch - I added
> > flush_work(&dev->ctrl.scan_work) to nvme_async_probe.
>
> Roger that. Will incorporate your adjustment and add your
> Tested-by. Thanks for the confirmation.

I think there is still one more bug: nvme_probe schedules the asynchronous
work using async_schedule. Now suppose that the PCI subsystem calls the
"remove", "shutdown" or "suspend" method - this method will race with
nvme_async_probe running in the async domain, and that will cause
misbehavior.

Or does the PCI subsystem flush the async queues before calling these
methods? I'm not sure, but it doesn't seem so.

I think you need to save the cookie returned by async_schedule and wait
for this cookie with async_synchronize_cookie in the other methods.

Mikulas


This patch schedules the initial controller reset in an async_domain so
that it can be synchronized from wait_for_device_probe(). This way the
kernel waits for the first nvme controller initialization to complete for
all devices before proceeding with the boot sequence, which may have nvme
dependencies.
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
---
 drivers/nvme/host/pci.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

Index: linux-4.16.6/drivers/nvme/host/pci.c
===================================================================
--- linux-4.16.6.orig/drivers/nvme/host/pci.c	2018-05-03 16:43:01.000000000 +0200
+++ linux-4.16.6/drivers/nvme/host/pci.c	2018-05-03 16:46:35.000000000 +0200
@@ -13,6 +13,7 @@
  */
 
 #include <linux/aer.h>
+#include <linux/async.h>
 #include <linux/blkdev.h>
 #include <linux/blk-mq.h>
 #include <linux/blk-mq-pci.h>
@@ -111,6 +112,8 @@ struct nvme_dev {
 	dma_addr_t host_mem_descs_dma;
 	struct nvme_host_mem_buf_desc *host_mem_descs;
 	void **host_mem_desc_bufs;
+
+	async_cookie_t probe_cookie;
 };
 
 static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
@@ -2471,6 +2474,13 @@ static unsigned long check_vendor_combin
 	return 0;
 }
 
+static void nvme_async_probe(void *data, async_cookie_t cookie)
+{
+	struct nvme_dev *dev = data;
+	nvme_reset_ctrl_sync(&dev->ctrl);
+	flush_work(&dev->ctrl.scan_work);
+}
+
 static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	int node, result = -ENOMEM;
@@ -2515,7 +2525,7 @@ static int nvme_probe(struct pci_dev *pd
 
 	dev_info(dev->ctrl.device, "pci function %s\n", dev_name(&pdev->dev));
 
-	nvme_reset_ctrl(&dev->ctrl);
+	dev->probe_cookie = async_schedule(nvme_async_probe, dev);
 
 	return 0;
 
@@ -2546,6 +2556,9 @@ static void nvme_reset_done(struct pci_d
 static void nvme_shutdown(struct pci_dev *pdev)
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+	async_synchronize_cookie(dev->probe_cookie);
+
 	nvme_dev_disable(dev, true);
 }
 
@@ -2558,6 +2571,8 @@ static void nvme_remove(struct pci_dev *
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
 
+	async_synchronize_cookie(dev->probe_cookie);
+
 	nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
 
 	cancel_work_sync(&dev->ctrl.reset_work);
@@ -2605,6 +2620,8 @@ static int nvme_suspend(struct device *d
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct nvme_dev *ndev = pci_get_drvdata(pdev);
 
+	async_synchronize_cookie(ndev->probe_cookie);
+
 	nvme_dev_disable(ndev, true);
 	return 0;
 }
@@ -2614,6 +2631,8 @@ static int nvme_resume(struct device *de
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct nvme_dev *ndev = pci_get_drvdata(pdev);
 
+	async_synchronize_cookie(ndev->probe_cookie);
+
 	nvme_reset_ctrl(&ndev->ctrl);
 	return 0;
 }
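
For readers not familiar with the async machinery, below is a minimal,
illustrative sketch of the save-the-cookie pattern described above, outside
the nvme code. The my_* names are invented for illustration and are not part
of the patch. One subtlety documented in kernel/async.c: async_synchronize_cookie(cookie)
waits for outstanding async calls *prior to* the given cookie, so a driver
that wants to wait for its own entry passes its saved cookie plus one.

/* Minimal sketch of the save-the-cookie pattern (illustrative my_* names). */
#include <linux/async.h>

struct my_dev {
	async_cookie_t probe_cookie;	/* cookie of the async probe call */
};

static void my_slow_hw_init(struct my_dev *mydev) { /* slow device init */ }
static void my_hw_teardown(struct my_dev *mydev) { /* device teardown */ }

/* Runs in the async domain; wait_for_device_probe() waits for it, too. */
static void my_async_probe(void *data, async_cookie_t cookie)
{
	struct my_dev *mydev = data;

	my_slow_hw_init(mydev);
}

static int my_probe(struct my_dev *mydev)
{
	/* Return from probe quickly; defer the slow work to the async domain. */
	mydev->probe_cookie = async_schedule(my_async_probe, mydev);
	return 0;
}

static void my_remove(struct my_dev *mydev)
{
	/*
	 * Wait for our own async probe before tearing the device down.  The
	 * +1 is needed because async_synchronize_cookie() only waits for
	 * entries with cookies lower than its argument.
	 */
	async_synchronize_cookie(mydev->probe_cookie + 1);
	my_hw_teardown(mydev);
}

The same wait belongs in the shutdown and suspend paths, as in the patch
above, so that none of the teardown methods can observe a half-initialized
device.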