Message ID | 1469822242-3477-4-git-send-email-sagi@grimberg.me (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On Fri, Jul 29, 2016 at 10:57:20PM +0300, Sagi Grimberg wrote: > Relying on ctrl state in nvme_rdma_shutdown_ctrl is wrong because > it will never be NVME_CTRL_LIVE (delete_ctrl or reset_ctrl invoked it). > > Instead, check that the admin queue is connected. Note that it is safe > because we can never see a copmeting thread trying to destroy the admin > queue (reset or delete controller). > > Signed-off-by: Sagi Grimberg <sagi@grimberg.me> > --- > drivers/nvme/host/rdma.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c > index a70eb3cbf656..641ab7f91899 100644 > --- a/drivers/nvme/host/rdma.c > +++ b/drivers/nvme/host/rdma.c > @@ -1644,7 +1644,7 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl) > nvme_rdma_free_io_queues(ctrl); > } > > - if (ctrl->ctrl.state == NVME_CTRL_LIVE) > + if (test_bit(NVME_RDMA_Q_CONNECTED, &ctrl->queues[0].flags)) > nvme_shutdown_ctrl(&ctrl->ctrl); > > blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); Maybe the right way to handle this is to unconditionally call nvme_shutdown_ctrl and make sure we return an early error on the register write? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c >> index a70eb3cbf656..641ab7f91899 100644 >> --- a/drivers/nvme/host/rdma.c >> +++ b/drivers/nvme/host/rdma.c >> @@ -1644,7 +1644,7 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl) >> nvme_rdma_free_io_queues(ctrl); >> } >> >> - if (ctrl->ctrl.state == NVME_CTRL_LIVE) >> + if (test_bit(NVME_RDMA_Q_CONNECTED, &ctrl->queues[0].flags)) >> nvme_shutdown_ctrl(&ctrl->ctrl); >> >> blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); > > Maybe the right way to handle this is to unconditionally call > nvme_shutdown_ctrl and make sure we return an early error > on the register write? As I wrote on patch 2/5 reply, I'd like to avoid depending on queue_rq to fail early peeking at the ctrl state. This dependency can grow in the future and I think we should at least try not to go there... -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 01, 2016 at 02:17:37PM +0300, Sagi Grimberg wrote: >>> - if (ctrl->ctrl.state == NVME_CTRL_LIVE) >>> + if (test_bit(NVME_RDMA_Q_CONNECTED, &ctrl->queues[0].flags)) >>> nvme_shutdown_ctrl(&ctrl->ctrl); >>> >>> blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); >> >> Maybe the right way to handle this is to unconditionally call >> nvme_shutdown_ctrl and make sure we return an early error >> on the register write? > > As I wrote on patch 2/5 reply, I'd like to avoid depending on > queue_rq to fail early peeking at the ctrl state. This dependency > can grow in the future and I think we should at least try not to > go there... I don't want ->queue_rq to peek at controller state. What I had in mind was copying the PCIe behavior of failing early in the timeout handler if we are in the reset handler, which will mean blk_execute_rq will return ASAP with an error. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>>>> - if (ctrl->ctrl.state == NVME_CTRL_LIVE) >>>> + if (test_bit(NVME_RDMA_Q_CONNECTED, &ctrl->queues[0].flags)) >>>> nvme_shutdown_ctrl(&ctrl->ctrl); >>>> >>>> blk_mq_stop_hw_queues(ctrl->ctrl.admin_q); >>> >>> Maybe the right way to handle this is to unconditionally call >>> nvme_shutdown_ctrl and make sure we return an early error >>> on the register write? >> >> As I wrote on patch 2/5 reply, I'd like to avoid depending on >> queue_rq to fail early peeking at the ctrl state. This dependency >> can grow in the future and I think we should at least try not to >> go there... > > I don't want ->queue_rq to peek at controller state. What I had > in mind was copying the PCIe behavior of failing early in the > timeout handler if we are in the reset handler, which will > mean blk_execute_rq will return ASAP with an error. Do we want the unnecessary ADMIN_TIMEOUT for sending a shutdown on a queue we know is not connected? I'm worried it will have scale issues... -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Aug 02, 2016 at 09:19:56AM +0300, Sagi Grimberg wrote: >> I don't want ->queue_rq to peek at controller state. What I had >> in mind was copying the PCIe behavior of failing early in the >> timeout handler if we are in the reset handler, which will >> mean blk_execute_rq will return ASAP with an error. > > Do we want the unnecessary ADMIN_TIMEOUT for sending a shutdown > on a queue we know is not connected? I'm worried it will have scale > issues... Allright, let's go with your version. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>>> I don't want ->queue_rq to peek at controller state. What I had >>> in mind was copying the PCIe behavior of failing early in the >>> timeout handler if we are in the reset handler, which will >>> mean blk_execute_rq will return ASAP with an error. >> >> Do we want the unnecessary ADMIN_TIMEOUT for sending a shutdown >> on a queue we know is not connected? I'm worried it will have scale >> issues... > > Allright, let's go with your version. Is this an implicit reviewed-by? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c index a70eb3cbf656..641ab7f91899 100644 --- a/drivers/nvme/host/rdma.c +++ b/drivers/nvme/host/rdma.c @@ -1644,7 +1644,7 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl) nvme_rdma_free_io_queues(ctrl); } - if (ctrl->ctrl.state == NVME_CTRL_LIVE) + if (test_bit(NVME_RDMA_Q_CONNECTED, &ctrl->queues[0].flags)) nvme_shutdown_ctrl(&ctrl->ctrl); blk_mq_stop_hw_queues(ctrl->ctrl.admin_q);
Relying on ctrl state in nvme_rdma_shutdown_ctrl is wrong because it will never be NVME_CTRL_LIVE (delete_ctrl or reset_ctrl invoked it). Instead, check that the admin queue is connected. Note that it is safe because we can never see a copmeting thread trying to destroy the admin queue (reset or delete controller). Signed-off-by: Sagi Grimberg <sagi@grimberg.me> --- drivers/nvme/host/rdma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)