Message ID | 20170508161524.GE5696@ming.t460p (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, May 09, 2017 at 12:15:25AM +0800, Ming Lei wrote: > This patch looks working, but seems any 'goto out' in this function > may have rick to cause the same race too. The goto was really intended for handling totally broken contronllers, which isn't the case if someone requested to remove the pci device while we're initializing it. Point taken, though, let me run a few tests and see if there's a better way to handle this condition. > Another solution I thought of is to kill queues earlier, what do you > think about the following patch? That should get it unstuck, but it will error all the IO that fsync_bdev would probably rather complete successfully. Question though, why doesn't the remove_work's nvme_kill_queues in its current place allow forward progress already? > --- > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > index c8541c3dcd19..16740e8c4225 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -1892,6 +1892,7 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev, int status) > > kref_get(&dev->ctrl.kref); > nvme_dev_disable(dev, false); > + nvme_kill_queues(&dev->ctrl); > if (!schedule_work(&dev->remove_work)) > nvme_put_ctrl(&dev->ctrl); > } > @@ -1998,7 +1999,6 @@ static void nvme_remove_dead_ctrl_work(struct work_struct *work) > struct nvme_dev *dev = container_of(work, struct nvme_dev, remove_work); > struct pci_dev *pdev = to_pci_dev(dev->dev); > > - nvme_kill_queues(&dev->ctrl); > if (pci_get_drvdata(pdev)) > device_release_driver(&pdev->dev); > nvme_put_ctrl(&dev->ctrl);
Hi Keith, Thanks for looking at this issue! On Mon, May 08, 2017 at 01:25:12PM -0400, Keith Busch wrote: > On Tue, May 09, 2017 at 12:15:25AM +0800, Ming Lei wrote: > > This patch looks working, but seems any 'goto out' in this function > > may have rick to cause the same race too. > > The goto was really intended for handling totally broken contronllers, > which isn't the case if someone requested to remove the pci device while > we're initializing it. Point taken, though, let me run a few tests and > see if there's a better way to handle this condition. The thing is that remove can happen any time, either from hotplug or unbinding driver or 'echo 1 > $PCI_PATH/remove'. At the same time, the reset can be ongoing. Also looks the hang in del_gendisk() is fixed by this change, but I just found a new issue which is triggered after the NVMe PCI device is rescaned again after last remove. [ 504.135554] VFS: Dirty inode writeback failed for block device nvme0n1p1 (err=-5). > > > Another solution I thought of is to kill queues earlier, what do you > > think about the following patch? > > That should get it unstuck, but it will error all the IO that fsync_bdev > would probably rather complete successfully. nvme_dev_disable(false) has been completed already before killing queues in nvme_remove_dead_ctrl(), so both hw queue is stopped and nvmeq->cq_vector is set as -1 in nvme_suspend_queue(). That means no new I/O(include IO in fsync_bdev) can be submitted successfully any more, so looks it is reasonable to kill queue in nvme_remove_dead_ctrl(). > > Question though, why doesn't the remove_work's nvme_kill_queues in > its current place allow forward progress already? That is because .remove_work may not be run before del_gendisk() is started even though the .reset_work is flushed, and we can't flush .remove_work simply here. Thanks, Ming
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index c8541c3dcd19..16740e8c4225 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1892,6 +1892,7 @@ static void nvme_remove_dead_ctrl(struct nvme_dev *dev, int status) kref_get(&dev->ctrl.kref); nvme_dev_disable(dev, false); + nvme_kill_queues(&dev->ctrl); if (!schedule_work(&dev->remove_work)) nvme_put_ctrl(&dev->ctrl); } @@ -1998,7 +1999,6 @@ static void nvme_remove_dead_ctrl_work(struct work_struct *work) struct nvme_dev *dev = container_of(work, struct nvme_dev, remove_work); struct pci_dev *pdev = to_pci_dev(dev->dev); - nvme_kill_queues(&dev->ctrl); if (pci_get_drvdata(pdev)) device_release_driver(&pdev->dev); nvme_put_ctrl(&dev->ctrl);