Message ID | 20241009113831.557606-2-hch@lst.de (mailing list archive)
---|---
State | New, archived
Series | [1/2] block: also mark disk-owned queues as dying in __blk_mark_disk_dead
On (24/10/09 13:38), Christoph Hellwig wrote:
[..]
> @@ -589,8 +589,16 @@ static void __blk_mark_disk_dead(struct gendisk *disk)
>          if (test_and_set_bit(GD_DEAD, &disk->state))
>                  return;
>
> -        if (test_bit(GD_OWNS_QUEUE, &disk->state))
> -                blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue);
> +        /*
> +         * Also mark the disk dead if it is not owned by the gendisk. This
> +         * means we can't allow /dev/sg passthrough or SCSI internal commands
> +         * while unbinding a ULP. That is more than just a bit ugly, but until
> +         * we untangle q_usage_counter into one owned by the disk and one owned
> +         * by the queue this is as good as it gets. The flag will be cleared
> +         * at the end of del_gendisk if it wasn't set before.
> +         */
> +        if (!test_and_set_bit(QUEUE_FLAG_DYING, &disk->queue->queue_flags))
> +                set_bit(QUEUE_FLAG_RESURRECT, &disk->queue->queue_flags);
>
>          /*
>           * Stop buffered writers from dirtying pages that can't be written out.
> @@ -719,6 +727,10 @@ void del_gendisk(struct gendisk *disk)
>           * again. Else leave the queue frozen to fail all I/O.
>           */
>          if (!test_bit(GD_OWNS_QUEUE, &disk->state)) {
> +                if (test_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags)) {
> +                        clear_bit(QUEUE_FLAG_DYING, &q->queue_flags);
> +                        clear_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags);
> +                }

Christoph, shouldn't the QUEUE_FLAG_RESURRECT handling be outside of the
GD_OWNS_QUEUE if-block? Because __blk_mark_disk_dead() sets
QUEUE_FLAG_DYING/QUEUE_FLAG_RESURRECT regardless of GD_OWNS_QUEUE.

// A silly nit: it seems the code uses blk_queue_flag_set() and
// blk_queue_flag_clear() helpers, but there is no queue_flag_test(),
// I don't know what the preference is here - stick to queue_flag
// helpers, or is it ok to mix them.

>                  blk_queue_flag_clear(QUEUE_FLAG_INIT_DONE, q);
>                  __blk_mq_unfreeze_queue(q, true);
On Wed, Oct 09, 2024 at 09:31:23PM +0900, Sergey Senozhatsky wrote:
> >          if (!test_bit(GD_OWNS_QUEUE, &disk->state)) {
> > +                if (test_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags)) {
> > +                        clear_bit(QUEUE_FLAG_DYING, &q->queue_flags);
> > +                        clear_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags);
> > +                }
>
> Christoph, shouldn't the QUEUE_FLAG_RESURRECT handling be outside of the
> GD_OWNS_QUEUE if-block? Because __blk_mark_disk_dead() sets
> QUEUE_FLAG_DYING/QUEUE_FLAG_RESURRECT regardless of GD_OWNS_QUEUE.

For !GD_OWNS_QUEUE the queue is freed right below, so there isn't much
of a point.

> // A silly nit: it seems the code uses blk_queue_flag_set() and
> // blk_queue_flag_clear() helpers, but there is no queue_flag_test(),
> // I don't know what the preference is here - stick to queue_flag
> // helpers, or is it ok to mix them.

Yeah. I looked into a test_and_set wrapper, but then saw how pointless
the existing wrappers are. So for now this just open codes it, and once
we're done with the fixes I plan to just send a patch to remove the
wrappers entirely.
On (24/10/09 14:41), Christoph Hellwig wrote:
> On Wed, Oct 09, 2024 at 09:31:23PM +0900, Sergey Senozhatsky wrote:
> > >          if (!test_bit(GD_OWNS_QUEUE, &disk->state)) {
> > > +                if (test_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags)) {
> > > +                        clear_bit(QUEUE_FLAG_DYING, &q->queue_flags);
> > > +                        clear_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags);
> > > +                }
> >
> > Christoph, shouldn't the QUEUE_FLAG_RESURRECT handling be outside of the
> > GD_OWNS_QUEUE if-block? Because __blk_mark_disk_dead() sets
> > QUEUE_FLAG_DYING/QUEUE_FLAG_RESURRECT regardless of GD_OWNS_QUEUE.
>
> For !GD_OWNS_QUEUE the queue is freed right below, so there isn't much
> of a point.

Oh, right.

> > // A silly nit: it seems the code uses blk_queue_flag_set() and
> > // blk_queue_flag_clear() helpers, but there is no queue_flag_test(),
> > // I don't know what the preference is here - stick to queue_flag
> > // helpers, or is it ok to mix them.
>
> Yeah. I looked into a test_and_set wrapper, but then saw how pointless
> the existing wrappers are.

Likewise.

> So for now this just open codes it, and once we're done with the fixes
> I plan to just send a patch to remove the wrappers entirely.

Ack.
On 10/9/24 6:41 AM, Christoph Hellwig wrote:
>> // A silly nit: it seems the code uses blk_queue_flag_set() and
>> // blk_queue_flag_clear() helpers, but there is no queue_flag_test(),
>> // I don't know what the preference is here - stick to queue_flag
>> // helpers, or is it ok to mix them.
>
> Yeah. I looked into a test_and_set wrapper, but then saw how pointless
> the existing wrappers are. So for now this just open codes it, and
> once we're done with the fixes I plan to just send a patch to remove
> the wrappers entirely.

Agree, but that's because you didn't do it back when you changed them to
be just set/clear bit operations ;-). They should definitely just go
away now.
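For readers following the nit: the helpers in question are one-line
wrappers around the plain bit operations, which is why adding yet
another wrapper just for test_and_set looked pointless. A simplified
sketch of what the existing helpers amount to (meant to match
block/blk-core.c in spirit, not quoted verbatim):

#include <linux/bitops.h>
#include <linux/blkdev.h>

/* Simplified sketch of the existing helpers; they are plain bit ops. */
void blk_queue_flag_set(unsigned int flag, struct request_queue *q)
{
        set_bit(flag, &q->queue_flags);
}

void blk_queue_flag_clear(unsigned int flag, struct request_queue *q)
{
        clear_bit(flag, &q->queue_flags);
}

An equivalent test_and_set wrapper would be just as thin, which is why
the patch open-codes test_and_set_bit() directly on
disk->queue->queue_flags.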
On 2024/10/9 19:38, Christoph Hellwig wrote:
> When del_gendisk shuts down access to a gendisk, it could lead to a
> deadlock with sd or sr, which try to submit passthrough SCSI commands
> from their ->release method under open_mutex. The submission can be
> blocked in blk_enter_queue while del_gendisk can't get to actually
> telling them to stop and waking them up.
>
> As the disk is going away there is no real point in sending these
> commands, but we have no really good way to distinguish between the
> cases. For now mark even standalone (aka SCSI) queues as dying in
> del_gendisk to avoid this deadlock, but the real fix will be to split
> freeing a disk from freezing a queue for requests not associated with
> a disk.
>
> Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> ---
>  block/genhd.c          | 16 ++++++++++++++--
>  include/linux/blkdev.h |  1 +
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 1c05dd4c6980b5..7026569fa8a0be 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -589,8 +589,16 @@ static void __blk_mark_disk_dead(struct gendisk *disk)
>          if (test_and_set_bit(GD_DEAD, &disk->state))
>                  return;
>
> -        if (test_bit(GD_OWNS_QUEUE, &disk->state))
> -                blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue);
> +        /*
> +         * Also mark the disk dead if it is not owned by the gendisk. This
> +         * means we can't allow /dev/sg passthrough or SCSI internal commands
> +         * while unbinding a ULP. That is more than just a bit ugly, but until
> +         * we untangle q_usage_counter into one owned by the disk and one owned
> +         * by the queue this is as good as it gets. The flag will be cleared
> +         * at the end of del_gendisk if it wasn't set before.
> +         */
> +        if (!test_and_set_bit(QUEUE_FLAG_DYING, &disk->queue->queue_flags))
> +                set_bit(QUEUE_FLAG_RESURRECT, &disk->queue->queue_flags);
>
>          /*
>           * Stop buffered writers from dirtying pages that can't be written out.
> @@ -719,6 +727,10 @@ void del_gendisk(struct gendisk *disk)
>           * again. Else leave the queue frozen to fail all I/O.
>           */
>          if (!test_bit(GD_OWNS_QUEUE, &disk->state)) {
> +                if (test_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags)) {
> +                        clear_bit(QUEUE_FLAG_DYING, &q->queue_flags);
> +                        clear_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags);
> +                }
>                  blk_queue_flag_clear(QUEUE_FLAG_INIT_DONE, q);
>                  __blk_mq_unfreeze_queue(q, true);
>          } else {
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 50c3b959da2816..391e3eb3bb5e61 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -590,6 +590,7 @@ struct request_queue {
>  /* Keep blk_queue_flag_name[] in sync with the definitions below */
>  enum {
>          QUEUE_FLAG_DYING,               /* queue being torn down */
> +        QUEUE_FLAG_RESURRECT,           /* temporarily dying */
>          QUEUE_FLAG_NOMERGES,            /* disable merge attempts */
>          QUEUE_FLAG_SAME_COMP,           /* complete on same CPU-group */
>          QUEUE_FLAG_FAIL_IO,             /* fake timeout */

Looks good. Feel free to add:
Reviewed-by: Yang Yang <yang.yang@vivo.com>

Thanks.
On Wed, Oct 09, 2024 at 01:38:20PM +0200, Christoph Hellwig wrote:
> When del_gendisk shuts down access to a gendisk, it could lead to a
> deadlock with sd or sr, which try to submit passthrough SCSI commands
> from their ->release method under open_mutex. The submission can be
> blocked in blk_enter_queue while del_gendisk can't get to actually
> telling them to stop and waking them up.
>
> As the disk is going away there is no real point in sending these
> commands, but we have no really good way to distinguish between the
> cases. For now mark even standalone (aka SCSI) queues as dying in
> del_gendisk to avoid this deadlock, but the real fix will be to split
> freeing a disk from freezing a queue for requests not associated with
> a disk.
>
> Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> ---
>  block/genhd.c          | 16 ++++++++++++++--
>  include/linux/blkdev.h |  1 +
>  2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 1c05dd4c6980b5..7026569fa8a0be 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -589,8 +589,16 @@ static void __blk_mark_disk_dead(struct gendisk *disk)
>          if (test_and_set_bit(GD_DEAD, &disk->state))
>                  return;
>
> -        if (test_bit(GD_OWNS_QUEUE, &disk->state))
> -                blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue);
> +        /*
> +         * Also mark the disk dead if it is not owned by the gendisk. This
> +         * means we can't allow /dev/sg passthrough or SCSI internal commands
> +         * while unbinding a ULP. That is more than just a bit ugly, but until
> +         * we untangle q_usage_counter into one owned by the disk and one owned
> +         * by the queue this is as good as it gets. The flag will be cleared
> +         * at the end of del_gendisk if it wasn't set before.
> +         */
> +        if (!test_and_set_bit(QUEUE_FLAG_DYING, &disk->queue->queue_flags))
> +                set_bit(QUEUE_FLAG_RESURRECT, &disk->queue->queue_flags);

Setting QUEUE_FLAG_DYING may fail passthrough requests for
!GD_OWNS_QUEUE, I guess this may cause a SCSI regression.

blk_queue_enter() needs to wait until RESURRECT & DYING are cleared
instead of returning failure.

Thanks,
Ming
On Wed, Oct 16, 2024 at 07:09:48PM +0800, Ming Lei wrote:
> Setting QUEUE_FLAG_DYING may fail passthrough requests for
> !GD_OWNS_QUEUE, I guess this may cause a SCSI regression.

Yes, as clearly documented in the commit log.

> blk_queue_enter() needs to wait until RESURRECT & DYING are cleared
> instead of returning failure.

What we really need to do is to split the enter conditions between the
disk and the standalone queue. But until then I think the current
version is reasonable enough.
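To make the exchange above concrete, here is a simplified sketch of the
queue-enter slow path that Ming would like to wait on RESURRECT/DYING
and that Christoph wants to split between disk and standalone queue.
It is illustrative pseudo-kernel code, not a quote of blk_queue_enter():

#include <linux/blkdev.h>
#include <linux/percpu-refcount.h>
#include <linux/wait.h>

/*
 * Simplified sketch: once QUEUE_FLAG_DYING is set, waiters give up
 * instead of sleeping until the queue is unfrozen again, which is what
 * would fail passthrough submitters on a !GD_OWNS_QUEUE (SCSI-owned)
 * queue.
 */
static int queue_enter_sketch(struct request_queue *q)
{
        while (!percpu_ref_tryget_live(&q->q_usage_counter)) {
                /* Frozen: sleep until the queue is unfrozen or marked dying. */
                wait_event(q->mq_freeze_wq,
                           !q->mq_freeze_depth || blk_queue_dying(q));
                if (blk_queue_dying(q))
                        return -ENODEV; /* the failure Ming is worried about */
        }
        return 0;
}

In Ming's proposal the error return for the temporarily-dying case would
become another round of waiting; Christoph's longer-term plan is to give
the disk and the queue separate enter conditions instead.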
On Wed, Oct 16, 2024 at 02:32:40PM +0200, Christoph Hellwig wrote:
> On Wed, Oct 16, 2024 at 07:09:48PM +0800, Ming Lei wrote:
> > Setting QUEUE_FLAG_DYING may fail passthrough requests for
> > !GD_OWNS_QUEUE, I guess this may cause a SCSI regression.
>
> Yes, as clearly documented in the commit log.

The change needs to Cc linux-scsi.

> As the disk is going away there is no real point in sending these
> commands, but we have no really good way to distinguish between the
> cases.

The SCSI request queue has a very different lifetime from the gendisk,
so I am not sure the above comment is correct.

Thanks,
Ming
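The lifetime difference Ming points out is what GD_OWNS_QUEUE encodes: a
disk-owned queue is allocated with the gendisk and dies with it, while a
SCSI queue belongs to the scsi_device and outlives the disk that sd/sr
attach to it. A tiny illustrative helper (not from the patch) that
spells out the test the patch branches on:

#include <linux/blkdev.h>

/*
 * Illustrative helper, not from the patch: GD_OWNS_QUEUE means the
 * request_queue was allocated together with the gendisk and dies with
 * it; for SCSI upper-level drivers the queue belongs to the scsi_device
 * and normally keeps serving /dev/sg passthrough and internal commands
 * after del_gendisk().
 */
static inline bool disk_owns_queue(struct gendisk *disk)
{
        return test_bit(GD_OWNS_QUEUE, &disk->state);
}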
On Wed, Oct 09, 2024 at 01:38:20PM +0200, Christoph Hellwig wrote:
> When del_gendisk shuts down access to a gendisk, it could lead to a
> deadlock with sd or sr, which try to submit passthrough SCSI commands
> from their ->release method under open_mutex. The submission can be
> blocked in blk_enter_queue while del_gendisk can't get to actually
> telling them to stop and waking them up.

When ->release() waits in blk_enter_queue(), the following code block

        mutex_lock(&disk->open_mutex);
        __blk_mark_disk_dead(disk);
        xa_for_each_start(&disk->part_tbl, idx, part, 1)
                drop_partition(part);
        mutex_unlock(&disk->open_mutex);

in del_gendisk() should have been done already. Then del_gendisk()
should move on and finally unfreeze the queue, so I still don't see how
the above deadlock is triggered.

Thanks,
Ming
On (24/10/16 21:35), Ming Lei wrote:
> On Wed, Oct 09, 2024 at 01:38:20PM +0200, Christoph Hellwig wrote:
> > When del_gendisk shuts down access to a gendisk, it could lead to a
> > deadlock with sd or sr, which try to submit passthrough SCSI commands
> > from their ->release method under open_mutex. The submission can be
> > blocked in blk_enter_queue while del_gendisk can't get to actually
> > telling them to stop and waking them up.
>
> When ->release() waits in blk_enter_queue(), the following code block
>
>         mutex_lock(&disk->open_mutex);
>         __blk_mark_disk_dead(disk);
>         xa_for_each_start(&disk->part_tbl, idx, part, 1)
>                 drop_partition(part);
>         mutex_unlock(&disk->open_mutex);

blk_enter_queue()->schedule() holds ->open_mutex, so that block
of code sleeps on ->open_mutex. We can't drain under ->open_mutex.
On Sat, Oct 19, 2024 at 10:25:41AM +0900, Sergey Senozhatsky wrote:
> On (24/10/16 21:35), Ming Lei wrote:
> > On Wed, Oct 09, 2024 at 01:38:20PM +0200, Christoph Hellwig wrote:
> > > When del_gendisk shuts down access to a gendisk, it could lead to a
> > > deadlock with sd or sr, which try to submit passthrough SCSI commands
> > > from their ->release method under open_mutex. The submission can be
> > > blocked in blk_enter_queue while del_gendisk can't get to actually
> > > telling them to stop and waking them up.
> >
> > When ->release() waits in blk_enter_queue(), the following code block
> >
> >         mutex_lock(&disk->open_mutex);
> >         __blk_mark_disk_dead(disk);
> >         xa_for_each_start(&disk->part_tbl, idx, part, 1)
> >                 drop_partition(part);
> >         mutex_unlock(&disk->open_mutex);
>
> blk_enter_queue()->schedule() holds ->open_mutex, so that block
> of code sleeps on ->open_mutex. We can't drain under ->open_mutex.

We haven't started to drain yet, so why does blk_enter_queue() sleep,
and what does it wait for?

Thanks,
Ming
On (24/10/19 20:32), Ming Lei wrote:
[..]
> > > When ->release() waits in blk_enter_queue(), the following code block
> > >
> > >         mutex_lock(&disk->open_mutex);
> > >         __blk_mark_disk_dead(disk);
> > >         xa_for_each_start(&disk->part_tbl, idx, part, 1)
> > >                 drop_partition(part);
> > >         mutex_unlock(&disk->open_mutex);
> >
> > blk_enter_queue()->schedule() holds ->open_mutex, so that block
> > of code sleeps on ->open_mutex. We can't drain under ->open_mutex.
>
> We haven't started to drain yet, so why does blk_enter_queue() sleep,
> and what does it wait for?

Unfortunately I don't have a device to repro this, but it happens to a
number of our customers (using different peripheral devices, but, as far
as I'm concerned, all running the 6.6 kernel).
On Sat, Oct 19, 2024 at 09:37:27PM +0900, Sergey Senozhatsky wrote:
> On (24/10/19 20:32), Ming Lei wrote:
> [..]
> > > > When ->release() waits in blk_enter_queue(), the following code block
> > > >
> > > >         mutex_lock(&disk->open_mutex);
> > > >         __blk_mark_disk_dead(disk);
> > > >         xa_for_each_start(&disk->part_tbl, idx, part, 1)
> > > >                 drop_partition(part);
> > > >         mutex_unlock(&disk->open_mutex);
> > >
> > > blk_enter_queue()->schedule() holds ->open_mutex, so that block
> > > of code sleeps on ->open_mutex. We can't drain under ->open_mutex.
> >
> > We haven't started to drain yet, so why does blk_enter_queue() sleep,
> > and what does it wait for?
>
> Unfortunately I don't have a device to repro this, but it happens to a
> number of our customers (using different peripheral devices, but, as far
> as I'm concerned, all running the 6.6 kernel).

I can understand the issue on v6.6 because it doesn't have commit
7e04da2dc701 ("block: fix deadlock between sd_remove & sd_release").
But for the latest upstream, I don't see how it can happen.

Thanks,
Ming
On (24/10/19 20:50), Ming Lei wrote:
> On Sat, Oct 19, 2024 at 09:37:27PM +0900, Sergey Senozhatsky wrote:
> > On (24/10/19 20:32), Ming Lei wrote:
> > [..]
> > Unfortunately I don't have a device to repro this, but it happens to a
> > number of our customers (using different peripheral devices, but, as far
> > as I'm concerned, all running the 6.6 kernel).
>
> I can understand the issue on v6.6 because it doesn't have commit
> 7e04da2dc701 ("block: fix deadlock between sd_remove & sd_release").

We have that one in 6.6, as far as I can tell

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/block/genhd.c?h=v6.6.57#n663
On Sat, Oct 19, 2024 at 09:58:04PM +0900, Sergey Senozhatsky wrote:
> On (24/10/19 20:50), Ming Lei wrote:
> > On Sat, Oct 19, 2024 at 09:37:27PM +0900, Sergey Senozhatsky wrote:
> > > On (24/10/19 20:32), Ming Lei wrote:
> > > [..]
> > > Unfortunately I don't have a device to repro this, but it happens to a
> > > number of our customers (using different peripheral devices, but, as far
> > > as I'm concerned, all running the 6.6 kernel).
> >
> > I can understand the issue on v6.6 because it doesn't have commit
> > 7e04da2dc701 ("block: fix deadlock between sd_remove & sd_release").
>
> We have that one in 6.6, as far as I can tell
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/block/genhd.c?h=v6.6.57#n663

Then we need to root-cause it first.

If you can reproduce it, please provide the dmesg log and the stack
traces of the deadlocked processes, collected via sysrq.

thanks,
Ming
On (24/10/19 21:09), Ming Lei wrote:
> On Sat, Oct 19, 2024 at 09:58:04PM +0900, Sergey Senozhatsky wrote:
> > On (24/10/19 20:50), Ming Lei wrote:
> > > On Sat, Oct 19, 2024 at 09:37:27PM +0900, Sergey Senozhatsky wrote:
[..]
> Then we need to root-cause it first.
>
> If you can reproduce it

I cannot.

All I have are backtraces from various crash reports; I posted some of
them earlier [1] (and in that entire thread). This looks like a
close()->bio_queue_enter() vs usb_disconnect()->del_gendisk() deadlock,
and del_gendisk() cannot drain. Draining under the same lock that the
things we want to drain currently hold looks troublesome in general.

[1] https://lore.kernel.org/linux-block/20241008051948.GB10794@google.com
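Sergey's description, spelled out as a sketch of the two tasks involved;
this is pieced together from this mail and the backtraces quoted later
in the thread, and is not taken from any kernel source file:

#include <linux/blkdev.h>
#include <linux/mutex.h>

/* Task A: close() of the block device; ->release runs under open_mutex. */
static void release_side_sketch(struct gendisk *disk)
{
        mutex_lock(&disk->open_mutex);
        /*
         * sd_release() -> scsi_set_medium_removal() -> scsi_execute_cmd()
         * -> blk_mq_alloc_request() -> blk_queue_enter(): sleeps here
         * until the queue is unfrozen or marked dying, still holding
         * open_mutex.
         */
        mutex_unlock(&disk->open_mutex);
}

/* Task B: USB disconnect -> sd_remove() -> del_gendisk(). */
static void removal_side_sketch(struct gendisk *disk)
{
        /*
         * del_gendisk() needs open_mutex before it can mark the disk dead
         * and eventually unfreeze or fail pending enters, so it sleeps
         * behind Task A and neither side makes progress.
         */
        mutex_lock(&disk->open_mutex);
        mutex_unlock(&disk->open_mutex);
}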
On Sat, Oct 19, 2024 at 10:50:10PM +0900, Sergey Senozhatsky wrote:
> On (24/10/19 21:09), Ming Lei wrote:
> > On Sat, Oct 19, 2024 at 09:58:04PM +0900, Sergey Senozhatsky wrote:
> > > On (24/10/19 20:50), Ming Lei wrote:
> > > > On Sat, Oct 19, 2024 at 09:37:27PM +0900, Sergey Senozhatsky wrote:
> [..]
> > Then we need to root-cause it first.
> > If you can reproduce it
>
> I cannot.
>
> All I have are backtraces from various crash reports; I posted some of
> them earlier [1] (and in that entire thread). This looks like a
> close()->bio_queue_enter() vs usb_disconnect()->del_gendisk() deadlock,
> and del_gendisk() cannot drain. Draining under the same lock that the
> things we want to drain currently hold looks troublesome in general.
>
> [1] https://lore.kernel.org/linux-block/20241008051948.GB10794@google.com

Probably bio_queue_enter() waits for runtime PM, and the queue is in
->pm_only state, and BLK_MQ_REQ_PM isn't passed actually from
ioctl_internal_command() <- scsi_set_medium_removal().

And if you have a vmcore collected, it shouldn't be hard to root cause.

Also I'd suggest collecting the complete related dmesg log in the
future, instead of providing a selective log; for example, there isn't
even a kernel version...

Thanks,
Ming
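Ming's hypothesis, sketched as code. The pm_only gate and the flagless
request allocation below are simplified assumptions meant to illustrate
the theory, not quotes of blk_try_enter_queue() or scsi_execute_cmd():

#include <linux/blk-mq.h>
#include <linux/blkdev.h>

/*
 * Simplified sketch of the pm_only gate: a runtime-suspended queue only
 * admits BLK_MQ_REQ_PM requests; everything else ends up waiting in
 * blk_queue_enter() for a resume that, per Ming's theory, never comes.
 */
static bool pm_gate_sketch(struct request_queue *q, blk_mq_req_flags_t flags)
{
        if (blk_queue_pm_only(q) && !(flags & BLK_MQ_REQ_PM))
                return false;   /* caller sleeps */
        return true;
}

/*
 * The allocation in the release path: per Ming, it is issued without
 * BLK_MQ_REQ_PM, so it cannot get past the gate above. (REQ_OP_DRV_IN
 * is used here only to make the sketch complete.)
 */
static struct request *allow_medium_removal_rq_sketch(struct request_queue *q)
{
        return blk_mq_alloc_request(q, REQ_OP_DRV_IN, 0);
}

Passing BLK_MQ_REQ_PM from the scsi_set_medium_removal() path would be
the corresponding fix if this theory holds, which the thread does not
confirm.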
On (24/10/19 23:03), Ming Lei wrote:
> Probably bio_queue_enter() waits for runtime PM, and the queue is in
> ->pm_only state, and BLK_MQ_REQ_PM isn't passed actually from
> ioctl_internal_command() <- scsi_set_medium_removal().
>
> And if you have a vmcore collected, it shouldn't be hard to root cause.

We don't collect those.

> Also I'd suggest collecting the complete related dmesg log in the
> future, instead of providing a selective log; for example, there isn't
> even a kernel version...

These "selected" backtraces are the only backtraces in the dmesg.
I literally have reports that have just two backtraces of tasks blocked
for over 120 seconds: one close()->bio_queue_enter()->schedule() (under
->open_mutex) and the other del_gendisk()->mutex_lock()->schedule().
On (24/10/19 23:03), Ming Lei wrote:
> there isn't even a kernel version...

Well, that's on me, yes, I admit it. I completely missed that, but it
was never a secret [1]. I missed it, probably, because I would not have
reached out to upstream with a 5.4 bug report; and 6.6, in that part of
the code, looked quite close to upstream. But well, I forgot to add the
kernel version, yes.

[1] https://lore.kernel.org/linux-block/20241003135504.GL11458@google.com
On (24/10/19 23:03), Ming Lei wrote: > On Sat, Oct 19, 2024 at 10:50:10PM +0900, Sergey Senozhatsky wrote: > > On (24/10/19 21:09), Ming Lei wrote: > > > On Sat, Oct 19, 2024 at 09:58:04PM +0900, Sergey Senozhatsky wrote: > > > > On (24/10/19 20:50), Ming Lei wrote: > > > > > On Sat, Oct 19, 2024 at 09:37:27PM +0900, Sergey Senozhatsky wrote: > > [..] > > Probably bio_queue_enter() waits for runtime PM, and the queue is in > ->pm_only state, and BLK_MQ_REQ_PM isn't passed actually from > ioctl_internal_command() <- scsi_set_medium_removal(). Sorry for the delay. Another report. I see lots of buffer I/O errors <6>[ 364.268167] usb-storage 3-3:1.0: USB Mass Storage device detected <6>[ 364.268551] scsi host3: usb-storage 3-3:1.0 <3>[ 364.274806] Buffer I/O error on dev sdc1, logical block 0, lost async page write <5>[ 365.318424] scsi 3:0:0:0: Direct-Access VendorCo ProductCode 2.00 PQ: 0 ANSI: 4 <5>[ 365.319898] sd 3:0:0:0: [sdc] 122880000 512-byte logical blocks: (62.9 GB/58.6 GiB) <5>[ 365.320077] sd 3:0:0:0: [sdc] Write Protect is off <7>[ 365.320085] sd 3:0:0:0: [sdc] Mode Sense: 03 00 00 00 <4>[ 365.320255] sd 3:0:0:0: [sdc] No Caching mode page found <4>[ 365.320262] sd 3:0:0:0: [sdc] Assuming drive cache: write through <6>[ 365.322483] sdc: sdc1 <5>[ 365.323130] sd 3:0:0:0: [sdc] Attached SCSI removable disk <6>[ 369.083225] usb 3-3: USB disconnect, device number 49 Then PM suspend/resume. After resume <7>[ 1338.847937] PM: resume of devices complete after 291.422 msecs <6>[ 1338.854215] OOM killer enabled. <6>[ 1338.854235] Restarting tasks ... <6>[ 1338.854797] mei_hdcp 0000:00:16.0-(UUID: 7): bound 0000:00:02.0 (ops 0xffffffffb8f03e50) <6>[ 1338.857745] mei_pxp 0000:00:16.0-(UUID: 2): bound 0000:00:02.0 (ops 0xffffffffb8f16a80) <4>[ 1338.859663] done. <5>[ 1338.859683] random: crng reseeded on system resumption <12>[ 1338.868200] init: cupsd main process ended, respawning <6>[ 1338.868541] Resume caused by IRQ 9, acpi <6>[ 1338.868549] Resume caused by IRQ 98, chromeos-ec <6>[ 1338.868555] PM: suspend exit lots of buffer I/O errors again and eventually a deadlock. The deadlock happens much later than 120 seconds after resume, so I cannot directly connect those events. [..] 
<6>[ 1859.660882] usb-storage 3-3:1.0: USB Mass Storage device detected <6>[ 1859.661457] scsi host4: usb-storage 3-3:1.0 <3>[ 1859.668180] Buffer I/O error on dev sdd1, logical block 0, lost async page write <5>[ 1860.697826] scsi 4:0:0:0: Direct-Access VendorCo ProductCode 2.00 PQ: 0 ANSI: 4 <5>[ 1860.699222] sd 4:0:0:0: [sdd] 122880000 512-byte logical blocks: (62.9 GB/58.6 GiB) <5>[ 1860.699373] sd 4:0:0:0: [sdd] Write Protect is off <7>[ 1860.699380] sd 4:0:0:0: [sdd] Mode Sense: 03 00 00 00 <4>[ 1860.699522] sd 4:0:0:0: [sdd] No Caching mode page found <4>[ 1860.699526] sd 4:0:0:0: [sdd] Assuming drive cache: write through <6>[ 1860.701393] sdd: sdd1 <5>[ 1860.701886] sd 4:0:0:0: [sdd] Attached SCSI removable disk <6>[ 1862.077109] usb 3-3: USB disconnect, device number 110 <6>[ 1862.338159] usb 3-3: new high-speed USB device number 111 using xhci_hcd <6>[ 1862.468090] usb 3-3: New USB device found, idVendor=346d, idProduct=5678, bcdDevice= 2.00 <6>[ 1862.468105] usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=(Serial: 8) <6>[ 1862.468111] usb 3-3: Product: Disk 2.0 <6>[ 1862.468115] usb 3-3: Manufacturer: USB <6>[ 1862.468119] usb 3-3: SerialNumber: (Serial: 9) <6>[ 1862.469962] usb-storage 3-3:1.0: USB Mass Storage device detected <6>[ 1862.470642] scsi host3: usb-storage 3-3:1.0 <3>[ 1862.476447] Buffer I/O error on dev sdd1, logical block 0, lost async page write <5>[ 1863.514018] scsi 3:0:0:0: Direct-Access VendorCo ProductCode 2.00 PQ: 0 ANSI: 4 <5>[ 1863.515489] sd 3:0:0:0: [sdd] 122880000 512-byte logical blocks: (62.9 GB/58.6 GiB) <5>[ 1863.515640] sd 3:0:0:0: [sdd] Write Protect is off <7>[ 1863.515646] sd 3:0:0:0: [sdd] Mode Sense: 03 00 00 00 <4>[ 1863.515797] sd 3:0:0:0: [sdd] No Caching mode page found <4>[ 1863.515802] sd 3:0:0:0: [sdd] Assuming drive cache: write through <6>[ 1863.518227] sdd: sdd1 <5>[ 1863.518551] sd 3:0:0:0: [sdd] Attached SCSI removable disk <6>[ 1865.018356] usb 3-3: USB disconnect, device number 111 <6>[ 1865.285091] usb 3-3: new high-speed USB device number 112 using xhci_hcd <3>[ 1865.605088] usb 3-3: device descriptor read/64, error -71 <6>[ 1865.844873] usb 3-3: New USB device found, idVendor=346d, idProduct=5678, bcdDevice= 2.00 <6>[ 1865.844892] usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=(Serial: 8) <6>[ 1865.844898] usb 3-3: Product: Disk 2.0 <6>[ 1865.844903] usb 3-3: Manufacturer: USB <6>[ 1865.844906] usb 3-3: SerialNumber: (Serial: 9) <6>[ 1865.847205] usb-storage 3-3:1.0: USB Mass Storage device detected <6>[ 1865.847806] scsi host4: usb-storage 3-3:1.0 <3>[ 1865.853941] Buffer I/O error on dev sdd1, logical block 0, lost async page write <6>[ 1866.436729] usb 3-3: USB disconnect, device number 112 <6>[ 1866.700998] usb 3-3: new high-speed USB device number 113 using xhci_hcd <6>[ 1866.829449] usb 3-3: New USB device found, idVendor=346d, idProduct=5678, bcdDevice= 2.00 <6>[ 1866.829466] usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=(Serial: 8) <6>[ 1866.829473] usb 3-3: Product: Disk 2.0 <6>[ 1866.829478] usb 3-3: Manufacturer: USB <6>[ 1866.829482] usb 3-3: SerialNumber: (Serial: 9) <6>[ 1866.831605] usb-storage 3-3:1.0: USB Mass Storage device detected <6>[ 1866.832173] scsi host3: usb-storage 3-3:1.0 <5>[ 1867.866118] scsi 3:0:0:0: Direct-Access VendorCo ProductCode 2.00 PQ: 0 ANSI: 4 <5>[ 1867.868213] sd 3:0:0:0: [sdd] 122880000 512-byte logical blocks: (62.9 GB/58.6 GiB) <5>[ 1867.868604] sd 3:0:0:0: [sdd] Write Protect is off <7>[ 1867.868616] sd 3:0:0:0: [sdd] Mode Sense: 
03 00 00 00 <4>[ 1867.869071] sd 3:0:0:0: [sdd] No Caching mode page found <4>[ 1867.869081] sd 3:0:0:0: [sdd] Assuming drive cache: write through <6>[ 1867.871429] sdd: sdd1 <5>[ 1867.871857] sd 3:0:0:0: [sdd] Attached SCSI removable disk <6>[ 1868.423593] usb 3-3: USB disconnect, device number 113 <6>[ 1868.431172] sdd: detected capacity change from 122880000 to 0 <28>[ 1928.670962] udevd[203]: sdd: Worker [9839] processing SEQNUM=6508 is taking a long time <3>[ 2004.633104] INFO: task kworker/0:3:187 blocked for more than 122 seconds. <3>[ 2004.633125] Tainted: G U 6.6.41-03520-gd3d77f15f842 #1 <3>[ 2004.633131] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. <6>[ 2004.633149] task:kworker/0:3 state:D stack:0 pid:187 ppid:2 flags:0x00004000 <6>[ 2004.633149] Workqueue: usb_hub_wq hub_event <6>[ 2004.633166] Call Trace: <6>[ 2004.633172] <TASK> <6>[ 2004.633179] schedule+0x4f4/0x1540 <6>[ 2004.633190] ? default_wake_function+0x388/0xcd0 <6>[ 2004.633200] schedule_preempt_disabled+0x15/0x30 <6>[ 2004.633206] __mutex_lock_slowpath+0x2b5/0x4d0 <6>[ 2004.633212] del_gendisk+0x136/0x370 <6>[ 2004.633222] sd_remove+0x30/0x60 <6>[ 2004.633230] device_release_driver_internal+0x1a2/0x2a0 <6>[ 2004.633239] bus_remove_device+0x154/0x180 <6>[ 2004.633248] device_del+0x207/0x370 <6>[ 2004.633256] ? __pfx_transport_remove_classdev+0x10/0x10 <6>[ 2004.633264] ? attribute_container_device_trigger+0xe3/0x110 <6>[ 2004.633272] __scsi_remove_device+0xc0/0x170 <6>[ 2004.633279] scsi_forget_host+0x45/0x60 <6>[ 2004.633287] scsi_remove_host+0x87/0x170 <6>[ 2004.633295] usb_stor_disconnect+0x63/0xb0 <6>[ 2004.633302] usb_unbind_interface+0xbe/0x250 <6>[ 2004.633309] device_release_driver_internal+0x1a2/0x2a0 <6>[ 2004.633315] bus_remove_device+0x154/0x180 <6>[ 2004.633322] device_del+0x207/0x370 <6>[ 2004.633328] ? kobject_release+0x56/0xb0 <6>[ 2004.633336] usb_disable_device+0x72/0x170 <6>[ 2004.633342] usb_disconnect+0xeb/0x280 <6>[ 2004.633350] hub_event+0xac7/0x1760 <6>[ 2004.633359] worker_thread+0x355/0x900 <6>[ 2004.633367] kthread+0xed/0x110 <6>[ 2004.633374] ? __pfx_worker_thread+0x10/0x10 <6>[ 2004.633381] ? __pfx_kthread+0x10/0x10 <6>[ 2004.633387] ret_from_fork+0x38/0x50 <6>[ 2004.633393] ? __pfx_kthread+0x10/0x10 <6>[ 2004.633399] ret_from_fork_asm+0x1b/0x30 <6>[ 2004.633407] </TASK> <3>[ 2004.633496] INFO: task cros-disks:1614 blocked for more than 122 seconds. <3>[ 2004.633502] Tainted: G U 6.6.41-03520-gd3d77f15f842 #1 <3>[ 2004.633506] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. <6>[ 2004.633519] task:cros-disks state:D stack:0 pid:1614 ppid:1 flags:0x00004002 <6>[ 2004.633519] Call Trace: <6>[ 2004.633523] <TASK> <6>[ 2004.633527] schedule+0x4f4/0x1540 <6>[ 2004.633533] ? xas_store+0xc57/0xcc0 <6>[ 2004.633539] ? lru_add_drain+0x4d8/0x6e0 <6>[ 2004.633548] blk_queue_enter+0x172/0x250 <6>[ 2004.633557] ? __pfx_autoremove_wake_function+0x10/0x10 <6>[ 2004.633565] blk_mq_alloc_request+0x167/0x210 <6>[ 2004.633573] scsi_execute_cmd+0x65/0x240 <6>[ 2004.633580] ioctl_internal_command+0x6c/0x150 <6>[ 2004.633590] scsi_set_medium_removal+0x63/0xc0 <6>[ 2004.633598] sd_release+0x42/0x50 <6>[ 2004.633606] blkdev_put+0x13b/0x1f0 <6>[ 2004.633615] blkdev_release+0x2b/0x40 <6>[ 2004.633623] __fput_sync+0x9b/0x2c0 <6>[ 2004.633632] __se_sys_close+0x69/0xc0 <6>[ 2004.633639] do_syscall_64+0x60/0x90 <6>[ 2004.633649] ? exit_to_user_mode_prepare+0x49/0x130 <6>[ 2004.633657] ? do_syscall_64+0x6f/0x90 <6>[ 2004.633665] ? 
do_syscall_64+0x6f/0x90 <6>[ 2004.633672] ? do_syscall_64+0x6f/0x90 <6>[ 2004.633680] ? irq_exit_rcu+0x38/0x90 <6>[ 2004.633687] ? exit_to_user_mode_prepare+0x49/0x130 <6>[ 2004.633694] entry_SYSCALL_64_after_hwframe+0x73/0xdd <6>[ 2004.633703] RIP: 0033:0x786d55239960 <6>[ 2004.633711] RSP: 002b:00007ffd1c6d8c28 EFLAGS: 00000202 ORIG_RAX: 0000000000000003 <6>[ 2004.633719] RAX: ffffffffffffffda RBX: 00005a5ffe743fd0 RCX: 0000786d55239960 <6>[ 2004.633725] RDX: 0000786d55307b00 RSI: 0000000000000000 RDI: 000000000000000c <6>[ 2004.633730] RBP: 00007ffd1c6d8d30 R08: 0000000000000007 R09: 00005a5ffe78a9f0 <6>[ 2004.633735] R10: 8a1ecef621fff8a0 R11: 0000000000000202 R12: 0000000000000831 <6>[ 2004.633741] R13: 00005a5ffe743f60 R14: 00005a5ffe743f80 R15: 000000000000000c <6>[ 2004.633746] </TASK>
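The second backtrace above is the release-side path; written out as a
simplified, partly assumed sketch it reads as follows, with the sleep
point marked. The SCSI_REMOVAL_ALLOW constant and the midlayer call
chain are taken from the trace and common sd behaviour, not from the
patch:

#include <scsi/scsi_device.h>
#include <scsi/scsi_ioctl.h>

/* Sketch of the cros-disks path above, not a quote of drivers/scsi/sd.c. */
static void release_path_from_trace(struct scsi_device *sdev)
{
        /*
         * blkdev_release() has taken disk->open_mutex before calling
         * sd_release(), which for a removable device re-allows medium
         * removal on last close:
         */
        scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
        /*
         * ... -> ioctl_internal_command() -> scsi_execute_cmd() ->
         * blk_mq_alloc_request() -> blk_queue_enter(), which is where
         * the task in the trace sleeps while still holding open_mutex.
         */
}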
diff --git a/block/genhd.c b/block/genhd.c
index 1c05dd4c6980b5..7026569fa8a0be 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -589,8 +589,16 @@ static void __blk_mark_disk_dead(struct gendisk *disk)
         if (test_and_set_bit(GD_DEAD, &disk->state))
                 return;
 
-        if (test_bit(GD_OWNS_QUEUE, &disk->state))
-                blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue);
+        /*
+         * Also mark the disk dead if it is not owned by the gendisk. This
+         * means we can't allow /dev/sg passthrough or SCSI internal commands
+         * while unbinding a ULP. That is more than just a bit ugly, but until
+         * we untangle q_usage_counter into one owned by the disk and one owned
+         * by the queue this is as good as it gets. The flag will be cleared
+         * at the end of del_gendisk if it wasn't set before.
+         */
+        if (!test_and_set_bit(QUEUE_FLAG_DYING, &disk->queue->queue_flags))
+                set_bit(QUEUE_FLAG_RESURRECT, &disk->queue->queue_flags);
 
         /*
          * Stop buffered writers from dirtying pages that can't be written out.
@@ -719,6 +727,10 @@ void del_gendisk(struct gendisk *disk)
          * again. Else leave the queue frozen to fail all I/O.
          */
         if (!test_bit(GD_OWNS_QUEUE, &disk->state)) {
+                if (test_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags)) {
+                        clear_bit(QUEUE_FLAG_DYING, &q->queue_flags);
+                        clear_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags);
+                }
                 blk_queue_flag_clear(QUEUE_FLAG_INIT_DONE, q);
                 __blk_mq_unfreeze_queue(q, true);
         } else {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 50c3b959da2816..391e3eb3bb5e61 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -590,6 +590,7 @@ struct request_queue {
 /* Keep blk_queue_flag_name[] in sync with the definitions below */
 enum {
         QUEUE_FLAG_DYING,               /* queue being torn down */
+        QUEUE_FLAG_RESURRECT,           /* temporarily dying */
         QUEUE_FLAG_NOMERGES,            /* disable merge attempts */
         QUEUE_FLAG_SAME_COMP,           /* complete on same CPU-group */
         QUEUE_FLAG_FAIL_IO,             /* fake timeout */
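Reading the two hunks together, the new flag records that
QUEUE_FLAG_DYING was only set temporarily by __blk_mark_disk_dead() on a
queue the gendisk does not own. A hypothetical helper on top of it (not
part of the patch) would be a one-line test:

#include <linux/blkdev.h>

/*
 * Hypothetical helper, not part of the patch: true if QUEUE_FLAG_DYING
 * was only set temporarily by __blk_mark_disk_dead() on a queue that the
 * gendisk does not own, and will therefore be cleared again at the end
 * of del_gendisk().
 */
static inline bool blk_queue_temporarily_dying(struct request_queue *q)
{
        return test_bit(QUEUE_FLAG_RESURRECT, &q->queue_flags);
}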