Message ID | 20170921212255.12788-5-bart.vanassche@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Sep 21, 2017 at 02:22:55PM -0700, Bart Van Assche wrote: > Avoid that it can take 200 ms too long to wait for ongoing requests > to finish. Note: blk_mq_freeze_queue() uses a wait queue to wait > for ongoing requests to finish. > > Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> > Cc: Martin K. Petersen <martin.petersen@oracle.com> > Cc: Ming Lei <ming.lei@redhat.com> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Hannes Reinecke <hare@suse.com> > Cc: Johannes Thumshirn <jthumshirn@suse.de> > --- > drivers/scsi/scsi_lib.c | 11 ++++++++--- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index e76fd6e89a81..34e5f0f95d01 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -2901,10 +2901,15 @@ scsi_device_quiesce(struct scsi_device *sdev) > if (err) > return err; > > - scsi_run_queue(q); > - while (atomic_read(&sdev->device_busy)) { > - msleep_interruptible(200); > + if (q->mq_ops) { > + blk_mq_freeze_queue(q); > + blk_mq_unfreeze_queue(q); As I commented in another patch, you don't check the 'preempt only' flag in normal I/O path, then from now on, any I/O can come. > + } else { > scsi_run_queue(q); > + while (atomic_read(&sdev->device_busy)) { > + msleep_interruptible(200); > + scsi_run_queue(q); > + } Are you sure only blk-mq need to drain queue? We need to do that for block legacy too.
On Fri, 2017-09-22 at 06:06 +0800, Ming Lei wrote: > On Thu, Sep 21, 2017 at 02:22:55PM -0700, Bart Van Assche wrote: > > + } else { > > scsi_run_queue(q); > > + while (atomic_read(&sdev->device_busy)) { > > + msleep_interruptible(200); > > + scsi_run_queue(q); > > + } > > Are you sure only blk-mq need to drain queue? We need > to do that for block legacy too. The code above your comment drains the queue for the legacy block layer. Bart.
On Thu, Sep 21, 2017 at 10:43:26PM +0000, Bart Van Assche wrote: > On Fri, 2017-09-22 at 06:06 +0800, Ming Lei wrote: > > On Thu, Sep 21, 2017 at 02:22:55PM -0700, Bart Van Assche wrote: > > > + } else { > > > scsi_run_queue(q); > > > + while (atomic_read(&sdev->device_busy)) { > > > + msleep_interruptible(200); > > > + scsi_run_queue(q); > > > + } > > > > Are you sure only blk-mq need to drain queue? We need > > to do that for block legacy too. > > The code above your comment drains the queue for the legacy block layer. That is just draining the requests dispatched to SCSI layer, and there might be lots of requests in block I/O scheduler queue or requeue or whatever.
On Fri, 2017-09-22 at 07:25 +0800, Ming Lei wrote: > On Thu, Sep 21, 2017 at 10:43:26PM +0000, Bart Van Assche wrote: > > On Fri, 2017-09-22 at 06:06 +0800, Ming Lei wrote: > > > On Thu, Sep 21, 2017 at 02:22:55PM -0700, Bart Van Assche wrote: > > > > + } else { > > > > scsi_run_queue(q); > > > > + while (atomic_read(&sdev->device_busy)) { > > > > + msleep_interruptible(200); > > > > + scsi_run_queue(q); > > > > + } > > > > > > Are you sure only blk-mq need to drain queue? We need > > > to do that for block legacy too. > > > > The code above your comment drains the queue for the legacy block layer. > > That is just draining the requests dispatched to SCSI layer, and there > might be lots of requests in block I/O scheduler queue or requeue or > whatever. There is no requeue list for the legacy block layer. There is only a requeue list for blk-mq. Waiting for I/O requests that are in scheduler queues is not the purpose of scsi_quiesce_device(). The purpose of that function is to wait for requests that have already been started. The sdev->device_busy counter represents the number of started requests so waiting until that counter has reached zero is sufficient. Bart.
On Thu, Sep 21, 2017 at 11:32:33PM +0000, Bart Van Assche wrote: > On Fri, 2017-09-22 at 07:25 +0800, Ming Lei wrote: > > On Thu, Sep 21, 2017 at 10:43:26PM +0000, Bart Van Assche wrote: > > > On Fri, 2017-09-22 at 06:06 +0800, Ming Lei wrote: > > > > On Thu, Sep 21, 2017 at 02:22:55PM -0700, Bart Van Assche wrote: > > > > > + } else { > > > > > scsi_run_queue(q); > > > > > + while (atomic_read(&sdev->device_busy)) { > > > > > + msleep_interruptible(200); > > > > > + scsi_run_queue(q); > > > > > + } > > > > > > > > Are you sure only blk-mq need to drain queue? We need > > > > to do that for block legacy too. > > > > > > The code above your comment drains the queue for the legacy block layer. > > > > That is just draining the requests dispatched to SCSI layer, and there > > might be lots of requests in block I/O scheduler queue or requeue or > > whatever. > > There is no requeue list for the legacy block layer. There is only a requeue > list for blk-mq. > > Waiting for I/O requests that are in scheduler queues is not the purpose of > scsi_quiesce_device(). The purpose of that function is to wait for requests > that have already been started. The sdev->device_busy counter represents the > number of started requests so waiting until that counter has reached zero is > sufficient. Then that is simply not enough since the issue is that request pool can be used up easily when SCSI device is quiesced, then no request is left for RQF_PREEMPT(PM), and I/O hang is triggered. It has been discussed for long time, I am sorry you still don't understand it.
On Fri, 2017-09-22 at 07:53 +0800, Ming Lei wrote: > Then that is simply not enough since the issue is that request pool can > be used up easily when SCSI device is quiesced, then no request is left > for RQF_PREEMPT(PM), and I/O hang is triggered. > > It has been discussed for long time, I am sorry you still don't > understand it. I'm really sorry that you don't understand the intention of this patch series. Anyway, let's close the discussion here and continue once the next version of this patch series has been posted. Bart.
On Thu, Sep 21, 2017 at 11:56:33PM +0000, Bart Van Assche wrote: > On Fri, 2017-09-22 at 07:53 +0800, Ming Lei wrote: > > Then that is simply not enough since the issue is that request pool can > > be used up easily when SCSI device is quiesced, then no request is left > > for RQF_PREEMPT(PM), and I/O hang is triggered. > > > > It has been discussed for long time, I am sorry you still don't > > understand it. > > I'm really sorry that you don't understand the intention of this patch series. > Anyway, let's close the discussion here and continue once the next version of > this patch series has been posted. So you mean you don't want to fix Cathy's issue? which was reported from one production system of our customer. And definitely belongs to same kind of SCSI quiesce issue.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index e76fd6e89a81..34e5f0f95d01 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -2901,10 +2901,15 @@ scsi_device_quiesce(struct scsi_device *sdev) if (err) return err; - scsi_run_queue(q); - while (atomic_read(&sdev->device_busy)) { - msleep_interruptible(200); + if (q->mq_ops) { + blk_mq_freeze_queue(q); + blk_mq_unfreeze_queue(q); + } else { scsi_run_queue(q); + while (atomic_read(&sdev->device_busy)) { + msleep_interruptible(200); + scsi_run_queue(q); + } } return 0; }
Avoid that it can take 200 ms too long to wait for ongoing requests to finish. Note: blk_mq_freeze_queue() uses a wait queue to wait for ongoing requests to finish. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> --- drivers/scsi/scsi_lib.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)