Message ID | 20230602084956.1299001-1-kai.heng.feng@canonical.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | [v6] scsi: core: Wait until device is fully resumed before doing rescan | expand |
On 6/2/23 01:49, Kai-Heng Feng wrote: > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c > index d217be323cc6..092f37464101 100644 > --- a/drivers/scsi/scsi_scan.c > +++ b/drivers/scsi/scsi_scan.c > @@ -1621,6 +1621,11 @@ void scsi_rescan_device(struct device *dev) > { > struct scsi_device *sdev = to_scsi_device(dev); > > +#ifdef CONFIG_PM_SLEEP > + if (dev->power.is_suspended) > + wait_for_completion(&dev->power.completion); > +#endif > + > device_lock(dev); > > scsi_attach_vpd(sdev); Directly accessing dev->power.completion from the SCSI core seems like a layering violation to me. Isn't this an object that should only be accessed directly by the device driver power management core? Additionally, what guarantees that the desired power state has been reached after wait_for_completion(&dev->power.completion) has finished? I think we need another solution. The device_lock() and device_unlock() calls have been introduced by commit e27829dc92e5 ("scsi: serialize ->rescan against ->remove"). I think there are other ways to serialize scsi_rescan_device() against scsi_remove_device(), e.g. via host->scan_mutex. Is this something that has been considered? Thanks, Bart.
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index d217be323cc6..092f37464101 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1621,6 +1621,11 @@ void scsi_rescan_device(struct device *dev) { struct scsi_device *sdev = to_scsi_device(dev); +#ifdef CONFIG_PM_SLEEP + if (dev->power.is_suspended) + wait_for_completion(&dev->power.completion); +#endif + device_lock(dev); scsi_attach_vpd(sdev);
During system resuming process, the resuming order is from top to down. Namely, the ATA host is resumed before disks connected to it. When an EH is scheduled while ATA host is resumed and disk device is still suspended, the device_lock hold by scsi_rescan_device() is never released so the dpm_resume() of the disk is blocked forerver, therefore the system can never be resumed back. That's because scsi_attach_vpd() is expecting the disk device is in operational state, as it doesn't work on suspended device. To avoid such deadlock, wait until the scsi device is fully resumed, before continuing the rescan process. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> --- v6: - Fix !CONFIG_PM_SLEEP compilation error. v5: - Use a different approach. Wait until the disk device is resumed. v4: - No change. v3: - New patch to resolve undefined pm_suspend_target_state. v2: - Schedule rescan task at the end of system resume phase. - Wording. drivers/scsi/scsi_scan.c | 5 +++++ 1 file changed, 5 insertions(+)