Message ID | 20220122111054.1126146-10-ming.lei@redhat.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | block: don't drain file system I/O on del_gendisk | expand |
On Sat, Jan 22, 2022 at 07:10:50PM +0800, Ming Lei wrote: > In scsi_disk_release() request queue is frozen for clearing > disk->private_data, and there can't be any FS IO issued to > this queue, and only private passthrough request will be handled, so > force unfreezing queue into atomic mode. > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > --- > drivers/scsi/sd.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index 0e73c3f2f381..27f04c860f00 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -3670,7 +3670,7 @@ static void scsi_disk_release(struct device *dev) > * in case multiple processes open a /dev/sd... node concurrently. > */ > blk_mq_freeze_queue(q); > - blk_mq_unfreeze_queue(q); > + __blk_mq_unfreeze_queue(q, true); I think the right thing here is to drop the freeze/unfreeze pair. Now that del_gendisk properly freezes the queue, we don't need this protection as the issue that Bart fixed with it can't happen any more.
On Mon, Jan 24, 2022 at 02:15:16PM +0100, Christoph Hellwig wrote: > On Sat, Jan 22, 2022 at 07:10:50PM +0800, Ming Lei wrote: > > In scsi_disk_release() request queue is frozen for clearing > > disk->private_data, and there can't be any FS IO issued to > > this queue, and only private passthrough request will be handled, so > > force unfreezing queue into atomic mode. > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com> > > --- > > drivers/scsi/sd.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > > index 0e73c3f2f381..27f04c860f00 100644 > > --- a/drivers/scsi/sd.c > > +++ b/drivers/scsi/sd.c > > @@ -3670,7 +3670,7 @@ static void scsi_disk_release(struct device *dev) > > * in case multiple processes open a /dev/sd... node concurrently. > > */ > > blk_mq_freeze_queue(q); > > - blk_mq_unfreeze_queue(q); > > + __blk_mq_unfreeze_queue(q, true); > > I think the right thing here is to drop the freeze/unfreeze pair. > Now that del_gendisk properly freezes the queue, we don't need this > protection as the issue that Bart fixed with it can't happen any more. As you see, the last patch removes freeze/unfreeze pair in del_gendisk(), which looks not very useful: it can't drain IO on bio based driver, and del_gendisk() is supposed to provide consistent behavior for both request and bio based driver. Thanks, Ming
On Tue, Jan 25, 2022 at 07:21:55AM +0800, Ming Lei wrote: > > > @@ -3670,7 +3670,7 @@ static void scsi_disk_release(struct device *dev) > > > * in case multiple processes open a /dev/sd... node concurrently. > > > */ > > > blk_mq_freeze_queue(q); > > > - blk_mq_unfreeze_queue(q); > > > + __blk_mq_unfreeze_queue(q, true); > > > > I think the right thing here is to drop the freeze/unfreeze pair. > > Now that del_gendisk properly freezes the queue, we don't need this > > protection as the issue that Bart fixed with it can't happen any more. > > As you see, the last patch removes freeze/unfreeze pair in del_gendisk(), > which looks not very useful: it can't drain IO on bio based driver, and > del_gendisk() is supposed to provide consistent behavior for both request > and bio based driver. So what is the advantage of trying to remove the freeze from where it belongs (common unregister code) while keeping it where it is a bandaid (driver specific unregister code)?
On Tue, Jan 25, 2022 at 08:27:39AM +0100, Christoph Hellwig wrote: > On Tue, Jan 25, 2022 at 07:21:55AM +0800, Ming Lei wrote: > > > > @@ -3670,7 +3670,7 @@ static void scsi_disk_release(struct device *dev) > > > > * in case multiple processes open a /dev/sd... node concurrently. > > > > */ > > > > blk_mq_freeze_queue(q); > > > > - blk_mq_unfreeze_queue(q); > > > > + __blk_mq_unfreeze_queue(q, true); > > > > > > I think the right thing here is to drop the freeze/unfreeze pair. > > > Now that del_gendisk properly freezes the queue, we don't need this > > > protection as the issue that Bart fixed with it can't happen any more. > > > > As you see, the last patch removes freeze/unfreeze pair in del_gendisk(), > > which looks not very useful: it can't drain IO on bio based driver, and > > del_gendisk() is supposed to provide consistent behavior for both request > > and bio based driver. > > So what is the advantage of trying to remove the freeze from where > it belongs (common unregister code) while keeping it where it is a bandaid > (driver specific unregister code)? freeze in common unregister code is actually not good, because it provide nothing for bio based driver, so we can't move blk-cgroup shutdown into del_gendisk. Also we can't move elevator shutdown to del_gendisk for similar reason. Secondly freeze is pretty slow in percpu mode, so why slow down removing every disk just for scsi's bandaid? Thanks, Ming
On Tue, Jan 25, 2022 at 04:54:43PM +0800, Ming Lei wrote: > > So what is the advantage of trying to remove the freeze from where > > it belongs (common unregister code) while keeping it where it is a bandaid > > (driver specific unregister code)? > > freeze in common unregister code is actually not good, because it provide > nothing for bio based driver, so we can't move blk-cgroup shutdown into > del_gendisk. Also we can't move elevator shutdown to del_gendisk for > similar reason. > > Secondly freeze is pretty slow in percpu mode, so why slow down removing every > disk just for scsi's bandaid? I'd frame this differently: - del_gendisk is the right place to freeze the queue, as that is where the gendisk is unregistered and all fs I/O needs to stop. If we don't get all aspects right we need to fix. As mentioned I'm already looking into keeping a reference for the bio life time for bio based drivers. - SCSI is the only driver that ever submits passthrough I/O without the gendisk. So doing an additional freeze during request_queue teardown is only needed for SCSI and we can eventually remove that for everyone else. And most of your series already does really good work towards that goal! > > > Thanks, > Ming ---end quoted text---
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 0e73c3f2f381..27f04c860f00 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3670,7 +3670,7 @@ static void scsi_disk_release(struct device *dev) * in case multiple processes open a /dev/sd... node concurrently. */ blk_mq_freeze_queue(q); - blk_mq_unfreeze_queue(q); + __blk_mq_unfreeze_queue(q, true); disk->private_data = NULL; put_disk(disk);
In scsi_disk_release() request queue is frozen for clearing disk->private_data, and there can't be any FS IO issued to this queue, and only private passthrough request will be handled, so force unfreezing queue into atomic mode. Signed-off-by: Ming Lei <ming.lei@redhat.com> --- drivers/scsi/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)