Message ID | 20200623111018.31954-1-martin.kepplinger@puri.sm (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | scsi: sd: add runtime pm to open / release | expand |
On 2020-06-23 04:10, Martin Kepplinger wrote: > This add a very conservative but simple implementation for runtime PM > to the sd scsi driver: > Resume when opened (mounted) and suspend when released (unmounted). > > Improvements that allow suspending while a device is "open" can > be added later, but now we save power when no filesystem is mounted > and runtime PM is enabled. > > Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm> > --- > drivers/scsi/sd.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index d90fefffe31b..fe4cb7c50ec1 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -1372,6 +1372,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode) > SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n")); > > sdev = sdkp->device; > + scsi_autopm_get_device(sdev); > > /* > * If the device is in error recovery, wait until it is done. > @@ -1418,6 +1419,9 @@ static int sd_open(struct block_device *bdev, fmode_t mode) > > error_out: > scsi_disk_put(sdkp); > + > + scsi_autopm_put_device(sdev); > + > return retval; > } > > @@ -1441,6 +1445,8 @@ static void sd_release(struct gendisk *disk, fmode_t mode) > > SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_release\n")); > > + scsi_autopm_put_device(sdev); > + > if (atomic_dec_return(&sdkp->openers) == 0 && sdev->removable) { > if (scsi_block_when_processing_errors(sdev)) > scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW); My understanding of the above patch is that it introduces a regression, namely by disabling runtime suspend as long as an sd device is held open. Bart.
On 24.06.20 15:33, Bart Van Assche wrote: > On 2020-06-23 04:10, Martin Kepplinger wrote: >> This add a very conservative but simple implementation for runtime PM >> to the sd scsi driver: >> Resume when opened (mounted) and suspend when released (unmounted). >> >> Improvements that allow suspending while a device is "open" can >> be added later, but now we save power when no filesystem is mounted >> and runtime PM is enabled. >> >> Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm> >> --- >> drivers/scsi/sd.c | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >> index d90fefffe31b..fe4cb7c50ec1 100644 >> --- a/drivers/scsi/sd.c >> +++ b/drivers/scsi/sd.c >> @@ -1372,6 +1372,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode) >> SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n")); >> >> sdev = sdkp->device; >> + scsi_autopm_get_device(sdev); >> >> /* >> * If the device is in error recovery, wait until it is done. >> @@ -1418,6 +1419,9 @@ static int sd_open(struct block_device *bdev, fmode_t mode) >> >> error_out: >> scsi_disk_put(sdkp); >> + >> + scsi_autopm_put_device(sdev); >> + >> return retval; >> } >> >> @@ -1441,6 +1445,8 @@ static void sd_release(struct gendisk *disk, fmode_t mode) >> >> SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_release\n")); >> >> + scsi_autopm_put_device(sdev); >> + >> if (atomic_dec_return(&sdkp->openers) == 0 && sdev->removable) { >> if (scsi_block_when_processing_errors(sdev)) >> scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW); > > My understanding of the above patch is that it introduces a regression, > namely by disabling runtime suspend as long as an sd device is held open. > > Bart. > > hi Bart, Alan says the same (on block request, the block layer should initiate a runtime resume), so merging with the thread from https://lore.kernel.org/linux-usb/8738e4d3-62b1-0144-107d-ff42000ed6c6@puri.sm/T/ now and answer to both Bart and Alan here:] I see scsi-pm.c using the blk-pm.c API but I'm not sure how the block layer would itself resume the scsi device (I use it via usb_storage, so that usb_stor_resume() follows in my case but I guess that doesn't matter here): my understanding of "sd" is: enable runtime pm in probe(), so *allow* the device to be suspended (if enabled by the user), but never resume(?). Also, why isn't "autopm" used in its ioctl() implementation (as opposed to in "sr")? here's roughly what happens when enabling runtime PM in sysfs (again, because sd_probe() calls autopm_put() and thus allows it: [ 27.384446] sd 0:0:0:0: scsi_runtime_suspend [ 27.432282] blk_pre_runtime_suspend [ 27.435783] sd_suspend_common [ 27.438782] blk_post_runtime_suspend [ 27.442427] scsi target0:0:0: scsi_runtime_suspend [ 27.447303] scsi host0: scsi_runtime_suspend then I "mount /dev/sda1 /mnt" and none of the resume() functions get called. To me it looks like the sd driver should initiate resuming, and that's not implemented. what am I doing wrong or overlooking? how exactly does (or should) the block layer initiate resume here? thanks again for your time, martin
On Thu, Jun 25, 2020 at 10:16:06AM +0200, Martin Kepplinger wrote: > On 24.06.20 15:33, Bart Van Assche wrote: > > On 2020-06-23 04:10, Martin Kepplinger wrote: > >> This add a very conservative but simple implementation for runtime PM > >> to the sd scsi driver: > >> Resume when opened (mounted) and suspend when released (unmounted). > >> > >> Improvements that allow suspending while a device is "open" can > >> be added later, but now we save power when no filesystem is mounted > >> and runtime PM is enabled. > >> > >> Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm> > >> --- > >> drivers/scsi/sd.c | 6 ++++++ > >> 1 file changed, 6 insertions(+) > >> > >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > >> index d90fefffe31b..fe4cb7c50ec1 100644 > >> --- a/drivers/scsi/sd.c > >> +++ b/drivers/scsi/sd.c > >> @@ -1372,6 +1372,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode) > >> SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n")); > >> > >> sdev = sdkp->device; > >> + scsi_autopm_get_device(sdev); > >> > >> /* > >> * If the device is in error recovery, wait until it is done. > >> @@ -1418,6 +1419,9 @@ static int sd_open(struct block_device *bdev, fmode_t mode) > >> > >> error_out: > >> scsi_disk_put(sdkp); > >> + > >> + scsi_autopm_put_device(sdev); > >> + > >> return retval; > >> } > >> > >> @@ -1441,6 +1445,8 @@ static void sd_release(struct gendisk *disk, fmode_t mode) > >> > >> SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_release\n")); > >> > >> + scsi_autopm_put_device(sdev); > >> + > >> if (atomic_dec_return(&sdkp->openers) == 0 && sdev->removable) { > >> if (scsi_block_when_processing_errors(sdev)) > >> scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW); > > > > My understanding of the above patch is that it introduces a regression, > > namely by disabling runtime suspend as long as an sd device is held open. > > > > Bart. > > > > > > hi Bart, > > Alan says the same (on block request, the block layer should initiate a > runtime resume), so merging with the thread from > https://lore.kernel.org/linux-usb/8738e4d3-62b1-0144-107d-ff42000ed6c6@puri.sm/T/ > now and answer to both Bart and Alan here:] > > I see scsi-pm.c using the blk-pm.c API but I'm not sure how the block > layer would itself resume the scsi device (I use it via usb_storage, so > that usb_stor_resume() follows in my case but I guess that doesn't > matter here): The block layer does this in block/blk-core.c:blk_queue_enter(), as part of the condition check in the call to wait_event() near the end of the function. The blk_pm_request_resume() inline routine calls pm_request_resume(). At least, that's what is _supposed_ to happen. See commit 0d25bd072b49 ("block: Schedule runtime resume earlier"). > my understanding of "sd" is: enable runtime pm in probe(), so *allow* > the device to be suspended (if enabled by the user), but never > resume(?). Also, why isn't "autopm" used in its ioctl() implementation > (as opposed to in "sr")? I don't remember the reason. It may be that the code in sr.c isn't needed. > here's roughly what happens when enabling runtime PM in sysfs (again, > because sd_probe() calls autopm_put() and thus allows it: > > [ 27.384446] sd 0:0:0:0: scsi_runtime_suspend > [ 27.432282] blk_pre_runtime_suspend > [ 27.435783] sd_suspend_common > [ 27.438782] blk_post_runtime_suspend > [ 27.442427] scsi target0:0:0: scsi_runtime_suspend > [ 27.447303] scsi host0: scsi_runtime_suspend > > then I "mount /dev/sda1 /mnt" and none of the resume() functions get > called. To me it looks like the sd driver should initiate resuming, and > that's not implemented. > > what am I doing wrong or overlooking? how exactly does (or should) the > block layer initiate resume here? I don't know what's going wrong. Bart, can you look into it? As far as I can tell, you're the last person to touch the block-layer's runtime PM code. Alan Stern
On 2020-06-25 01:16, Martin Kepplinger wrote: > Also, why isn't "autopm" used in its ioctl() implementation > (as opposed to in "sr")? Some of the scsi_autopm_{get,put}_device() calls in the sr driver have been introduced by me before I fully understood runtime pm. I will have a another look to see whether these are really necessary and if not, post a patch to remove these again. Bart.
On 2020-06-25 01:16, Martin Kepplinger wrote: > here's roughly what happens when enabling runtime PM in sysfs (again, > because sd_probe() calls autopm_put() and thus allows it: > > [ 27.384446] sd 0:0:0:0: scsi_runtime_suspend > [ 27.432282] blk_pre_runtime_suspend > [ 27.435783] sd_suspend_common > [ 27.438782] blk_post_runtime_suspend > [ 27.442427] scsi target0:0:0: scsi_runtime_suspend > [ 27.447303] scsi host0: scsi_runtime_suspend > > then I "mount /dev/sda1 /mnt" and none of the resume() functions get > called. To me it looks like the sd driver should initiate resuming, and > that's not implemented. > > what am I doing wrong or overlooking? how exactly does (or should) the > block layer initiate resume here? As far as I know runtime power management support in the sd driver is working fine and is being used intensively by the UFS driver. The following commit was submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi: pm: Balance pm_only counter of request queue during system resume") # v5.7. I'm not sure which bug is causing trouble on your setup but I think it's likely that the root cause is somewhere else than in the block layer, the SCSI core or the SCSI sd driver. Bart.
On Fri, Jun 26, 2020 at 08:07:51AM -0700, Bart Van Assche wrote: > On 2020-06-25 01:16, Martin Kepplinger wrote: > > here's roughly what happens when enabling runtime PM in sysfs (again, > > because sd_probe() calls autopm_put() and thus allows it: > > > > [ 27.384446] sd 0:0:0:0: scsi_runtime_suspend > > [ 27.432282] blk_pre_runtime_suspend > > [ 27.435783] sd_suspend_common > > [ 27.438782] blk_post_runtime_suspend > > [ 27.442427] scsi target0:0:0: scsi_runtime_suspend > > [ 27.447303] scsi host0: scsi_runtime_suspend > > > > then I "mount /dev/sda1 /mnt" and none of the resume() functions get > > called. To me it looks like the sd driver should initiate resuming, and > > that's not implemented. > > > > what am I doing wrong or overlooking? how exactly does (or should) the > > block layer initiate resume here? > > As far as I know runtime power management support in the sd driver is working > fine and is being used intensively by the UFS driver. The following commit was > submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi: > pm: Balance pm_only counter of request queue during system resume") # v5.7. I just looked at that commit for the first time. Instead of making the SCSI driver do the work of deciding what routine to call, why not redefine blk_set_runtime_active(q) to simply call blk_post_runtime_resume(q, 0)? Or vice versa: if err == 0 have blk_post_runtime_resume call blk_set_runtime_active? After all, the two routines do almost the same thing -- and the bug addressed by this commit was caused by the difference in their behaviors. If the device was already runtime-active during the system suspend, doing an extra clear of the pm_only counter won't hurt anything. > I'm not sure which bug is causing trouble on your setup but I think it's likely > that the root cause is somewhere else than in the block layer, the SCSI core > or the SCSI sd driver. > > Bart. Martin's best approach would be to add some debugging code to find out why blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call doesn't lead to pm_request_resume(). Alan Stern
On 2020-06-26 08:44, Alan Stern wrote: > On Fri, Jun 26, 2020 at 08:07:51AM -0700, Bart Van Assche wrote: >> As far as I know runtime power management support in the sd driver is working >> fine and is being used intensively by the UFS driver. The following commit was >> submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi: >> pm: Balance pm_only counter of request queue during system resume") # v5.7. > > I just looked at that commit for the first time. > > Instead of making the SCSI driver do the work of deciding what routine to > call, why not redefine blk_set_runtime_active(q) to simply call > blk_post_runtime_resume(q, 0)? Or vice versa: if err == 0 have > blk_post_runtime_resume call blk_set_runtime_active? > > After all, the two routines do almost the same thing -- and the bug > addressed by this commit was caused by the difference in their behaviors. > > If the device was already runtime-active during the system suspend, doing > an extra clear of the pm_only counter won't hurt anything. Hi Alan, Do you want to submit a patch that implements this change or do you perhaps expect me to do that? Thanks, Bart.
On Sat, Jun 27, 2020 at 07:37:54PM -0700, Bart Van Assche wrote: > On 2020-06-26 08:44, Alan Stern wrote: > > On Fri, Jun 26, 2020 at 08:07:51AM -0700, Bart Van Assche wrote: > >> As far as I know runtime power management support in the sd driver is working > >> fine and is being used intensively by the UFS driver. The following commit was > >> submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi: > >> pm: Balance pm_only counter of request queue during system resume") # v5.7. > > > > I just looked at that commit for the first time. > > > > Instead of making the SCSI driver do the work of deciding what routine to > > call, why not redefine blk_set_runtime_active(q) to simply call > > blk_post_runtime_resume(q, 0)? Or vice versa: if err == 0 have > > blk_post_runtime_resume call blk_set_runtime_active? > > > > After all, the two routines do almost the same thing -- and the bug > > addressed by this commit was caused by the difference in their behaviors. > > > > If the device was already runtime-active during the system suspend, doing > > an extra clear of the pm_only counter won't hurt anything. > > Hi Alan, > > Do you want to submit a patch that implements this change or do you > perhaps expect me to do that? I'll submit a patch in a few days. I just wanted to check first that the idea was sound. Alan Stern
On 26.06.20 17:44, Alan Stern wrote: > On Fri, Jun 26, 2020 at 08:07:51AM -0700, Bart Van Assche wrote: >> On 2020-06-25 01:16, Martin Kepplinger wrote: >>> here's roughly what happens when enabling runtime PM in sysfs (again, >>> because sd_probe() calls autopm_put() and thus allows it: >>> >>> [ 27.384446] sd 0:0:0:0: scsi_runtime_suspend >>> [ 27.432282] blk_pre_runtime_suspend >>> [ 27.435783] sd_suspend_common >>> [ 27.438782] blk_post_runtime_suspend >>> [ 27.442427] scsi target0:0:0: scsi_runtime_suspend >>> [ 27.447303] scsi host0: scsi_runtime_suspend >>> >>> then I "mount /dev/sda1 /mnt" and none of the resume() functions get >>> called. To me it looks like the sd driver should initiate resuming, and >>> that's not implemented. >>> >>> what am I doing wrong or overlooking? how exactly does (or should) the >>> block layer initiate resume here? >> >> As far as I know runtime power management support in the sd driver is working >> fine and is being used intensively by the UFS driver. The following commit was >> submitted to fix a bug encountered by an UFS developer: 05d18ae1cc8a ("scsi: >> pm: Balance pm_only counter of request queue during system resume") # v5.7. > > I just looked at that commit for the first time. > > Instead of making the SCSI driver do the work of deciding what routine to > call, why not redefine blk_set_runtime_active(q) to simply call > blk_post_runtime_resume(q, 0)? Or vice versa: if err == 0 have > blk_post_runtime_resume call blk_set_runtime_active? > > After all, the two routines do almost the same thing -- and the bug > addressed by this commit was caused by the difference in their behaviors. > > If the device was already runtime-active during the system suspend, doing > an extra clear of the pm_only counter won't hurt anything. > >> I'm not sure which bug is causing trouble on your setup but I think it's likely >> that the root cause is somewhere else than in the block layer, the SCSI core >> or the SCSI sd driver. >> >> Bart. > > Martin's best approach would be to add some debugging code to find out why > blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call > doesn't lead to pm_request_resume(). > > Alan Stern > Hi Alan, blk_queue_enter() always - especially when sd is runtime suspended and I try to mount as above - sets success to be true for me, so never continues down to bkl_pm_request_resume(). All I see is "PM: Removing info for No Bus:sda1". thanks, martin
On Mon, Jun 29, 2020 at 11:42:59AM +0200, Martin Kepplinger wrote: > > > On 26.06.20 17:44, Alan Stern wrote: > > Martin's best approach would be to add some debugging code to find out why > > blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call > > doesn't lead to pm_request_resume(). > > > > Alan Stern > > > > Hi Alan, > > blk_queue_enter() always - especially when sd is runtime suspended and I > try to mount as above - sets success to be true for me, so never > continues down to bkl_pm_request_resume(). All I see is "PM: Removing > info for No Bus:sda1". Aha. Looking at this more closely, it's apparent that the code in blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT flag is set then the request can be issued regardless of the queue's runtime status. That is not correct when the queue is suspended. Below is my attempt to fix this up. I'm not sure that the patch is entirely correct, but it should fix this logic bug. I would appreciate a critical review. Martin, does this fix the problem? Alan Stern Index: usb-devel/block/blk-core.c =================================================================== --- usb-devel.orig/block/blk-core.c +++ usb-devel/block/blk-core.c @@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue * responsible for ensuring that that counter is * globally visible before the queue is unfrozen. */ - if (pm || !blk_queue_pm_only(q)) { + if ((pm && q->rpm_status != RPM_SUSPENDED) || + !blk_queue_pm_only(q)) { success = true; } else { percpu_ref_put(&q->q_usage_counter); @@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue wait_event(q->mq_freeze_wq, (!q->mq_freeze_depth && - (pm || (blk_pm_request_resume(q), - !blk_queue_pm_only(q)))) || + blk_pm_resume_queue(pm, q)) || blk_queue_dying(q)); if (blk_queue_dying(q)) return -ENODEV; Index: usb-devel/block/blk-pm.h =================================================================== --- usb-devel.orig/block/blk-pm.h +++ usb-devel/block/blk-pm.h @@ -6,11 +6,14 @@ #include <linux/pm_runtime.h> #ifdef CONFIG_PM -static inline void blk_pm_request_resume(struct request_queue *q) +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q) { - if (q->dev && (q->rpm_status == RPM_SUSPENDED || - q->rpm_status == RPM_SUSPENDING)) - pm_request_resume(q->dev); + if (!q->dev || !blk_queue_pm_only(q)) + return 1; /* Nothing to do */ + if (pm && q->rpm_status != RPM_SUSPENDED) + return 1; /* Request allowed */ + pm_request_resume(q->dev); + return 0; } static inline void blk_pm_mark_last_busy(struct request *rq) @@ -44,8 +47,9 @@ static inline void blk_pm_put_request(st --rq->q->nr_pending; } #else -static inline void blk_pm_request_resume(struct request_queue *q) +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q) { + return 1; } static inline void blk_pm_mark_last_busy(struct request *rq)
On 2020-06-29 09:15, Alan Stern wrote: > Aha. Looking at this more closely, it's apparent that the code in > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT > flag is set then the request can be issued regardless of the queue's > runtime status. That is not correct when the queue is suspended. Please clarify why this is not correct. > Index: usb-devel/block/blk-core.c > =================================================================== > --- usb-devel.orig/block/blk-core.c > +++ usb-devel/block/blk-core.c > @@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue > * responsible for ensuring that that counter is > * globally visible before the queue is unfrozen. > */ > - if (pm || !blk_queue_pm_only(q)) { > + if ((pm && q->rpm_status != RPM_SUSPENDED) || > + !blk_queue_pm_only(q)) { > success = true; > } else { > percpu_ref_put(&q->q_usage_counter); Does the above change make it impossible to bring a suspended device back to the RPM_ACTIVE state if the BLK_MQ_REQ_NOWAIT flag is set? > @@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue > > wait_event(q->mq_freeze_wq, > (!q->mq_freeze_depth && > - (pm || (blk_pm_request_resume(q), > - !blk_queue_pm_only(q)))) || > + blk_pm_resume_queue(pm, q)) || > blk_queue_dying(q)); > if (blk_queue_dying(q)) > return -ENODEV; > Index: usb-devel/block/blk-pm.h > =================================================================== > --- usb-devel.orig/block/blk-pm.h > +++ usb-devel/block/blk-pm.h > @@ -6,11 +6,14 @@ > #include <linux/pm_runtime.h> > > #ifdef CONFIG_PM > -static inline void blk_pm_request_resume(struct request_queue *q) > +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q) > { > - if (q->dev && (q->rpm_status == RPM_SUSPENDED || > - q->rpm_status == RPM_SUSPENDING)) > - pm_request_resume(q->dev); > + if (!q->dev || !blk_queue_pm_only(q)) > + return 1; /* Nothing to do */ > + if (pm && q->rpm_status != RPM_SUSPENDED) > + return 1; /* Request allowed */ > + pm_request_resume(q->dev); > + return 0; > } Does the above change, especially the " && q->rpm_status != RPM_SUSPENDED" part, make it impossible to bring a suspended device back to the RPM_ACTIVE state? Thanks, Bart.
On Mon, Jun 29, 2020 at 09:56:49AM -0700, Bart Van Assche wrote: > On 2020-06-29 09:15, Alan Stern wrote: > > Aha. Looking at this more closely, it's apparent that the code in > > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT > > flag is set then the request can be issued regardless of the queue's > > runtime status. That is not correct when the queue is suspended. > > Please clarify why this is not correct. As I understand it, BLK_MQ_REQ_PREEMPT is supposed to mean (among other things) that this request may be issued as part of the procedure for putting a device into a low-power state or returning it to a high-power state. Consequently, requests with that flag set must be allowed while the queue is in the RPM_SUSPENDING or RPM_RESUMING runtime states -- as opposed to ordinary requests, which are allowed only in the RPM_ACTIVE state. In the RPM_SUSPENDED state, however, the queue is entirely inactive. Even if a request were to be issued somehow, it would fail because the system would not be able to transmit it to the device. In other words, when the queue is in the RPM_SUSPENDED state, a resume must be requested before _any_ request can be issued. > > Index: usb-devel/block/blk-core.c > > =================================================================== > > --- usb-devel.orig/block/blk-core.c > > +++ usb-devel/block/blk-core.c > > @@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue > > * responsible for ensuring that that counter is > > * globally visible before the queue is unfrozen. > > */ > > - if (pm || !blk_queue_pm_only(q)) { > > + if ((pm && q->rpm_status != RPM_SUSPENDED) || > > + !blk_queue_pm_only(q)) { > > success = true; > > } else { > > percpu_ref_put(&q->q_usage_counter); > > Does the above change make it impossible to bring a suspended device > back to the RPM_ACTIVE state if the BLK_MQ_REQ_NOWAIT flag is set? The only case affected by this change is when BLK_MQ_REQ_PREEMPT is set and the queue is in the RPM_SUSPENDED state. If BLK_MQ_REQ_NOWAIT was also set, the original code would set "success" to true, allowing the request to proceed even though it could not be carried out immediately -- a bug. With the patch, such a request will fail without resuming the device. I don't know whether that is the desired behavior or not, but at least it's not obviously a bug. It does seem odd that blk_queue_enter() tests the queue's pm_only status and the request flag in two different spots (here and below). Why does it do this? It seems like an invitation for bugs. > > @@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue > > > > wait_event(q->mq_freeze_wq, > > (!q->mq_freeze_depth && > > - (pm || (blk_pm_request_resume(q), > > - !blk_queue_pm_only(q)))) || > > + blk_pm_resume_queue(pm, q)) || > > blk_queue_dying(q)); > > if (blk_queue_dying(q)) > > return -ENODEV; > > Index: usb-devel/block/blk-pm.h > > =================================================================== > > --- usb-devel.orig/block/blk-pm.h > > +++ usb-devel/block/blk-pm.h > > @@ -6,11 +6,14 @@ > > #include <linux/pm_runtime.h> > > > > #ifdef CONFIG_PM > > -static inline void blk_pm_request_resume(struct request_queue *q) > > +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q) > > { > > - if (q->dev && (q->rpm_status == RPM_SUSPENDED || > > - q->rpm_status == RPM_SUSPENDING)) > > - pm_request_resume(q->dev); > > + if (!q->dev || !blk_queue_pm_only(q)) > > + return 1; /* Nothing to do */ > > + if (pm && q->rpm_status != RPM_SUSPENDED) > > + return 1; /* Request allowed */ > > + pm_request_resume(q->dev); > > + return 0; > > } > > Does the above change, especially the " && q->rpm_status != > RPM_SUSPENDED" part, make it impossible to bring a suspended device back > to the RPM_ACTIVE state? Just the opposite -- the change makes it _possible_ for a BLK_MQ_REQ_PREEMPT request to bring a suspended device back to the RPM_ACTIVE state. Look at the existing code: If pm is true then blk_pm_request_resume() will be skipped, so the device won't be resumed. With this patch -- in particular with the "&& q->rpm_status != RPM_SUSPENDED" part added -- the call won't be skipped and so the resume will take place. The rather complicated syntax of the wait_event() call in the existing code contributes to this confusion. One of the things my patch tries to do is make the code more straightforward and easier to grasp. I admit that there are parts to this thing I don't understand. The wait_event() call in blk_queue_enter(), for example: If we are waiting for the device to leave the RPM_SUSPENDED state (or enter the RPM_ACTIVE state), where does q->mq_freeze_wq get woken up? There's no obvious spot in blk_{pre|post}_runtime_resume(). Alan Stern
On 29.06.20 18:15, Alan Stern wrote: > On Mon, Jun 29, 2020 at 11:42:59AM +0200, Martin Kepplinger wrote: >> >> >> On 26.06.20 17:44, Alan Stern wrote: >>> Martin's best approach would be to add some debugging code to find out why >>> blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call >>> doesn't lead to pm_request_resume(). >>> >>> Alan Stern >>> >> >> Hi Alan, >> >> blk_queue_enter() always - especially when sd is runtime suspended and I >> try to mount as above - sets success to be true for me, so never >> continues down to bkl_pm_request_resume(). All I see is "PM: Removing >> info for No Bus:sda1". > > Aha. Looking at this more closely, it's apparent that the code in > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT > flag is set then the request can be issued regardless of the queue's > runtime status. That is not correct when the queue is suspended. > > Below is my attempt to fix this up. I'm not sure that the patch is > entirely correct, but it should fix this logic bug. I would appreciate a > critical review. > > Martin, does this fix the problem? > not quite: mounting works and resuming itself indeed happens now when copying a file, but the I/O itself doesn't, but says "device offline or changed": [ 167.167615] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s [ 167.167630] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] [ 167.167638] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 [ 167.167648] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 24 c2 00 00 01 00 [ 167.167658] blk_update_request: I/O error, dev sda, sector 9410 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 [ 167.178327] FAT-fs (sda1): FAT read failed (blocknr 1218) [ 167.183895] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 167.189695] blk_update_request: I/O error, dev sda, sector 5101888 op 0x0:(READ) flags 0x80700 phys_seg 8 prio class 0 [ 167.200510] sd 0:0:0:0: [sda] tag#0 device offline or changed and a later try to copy a file only yields (mostly my own debug prints): [ 371.110798] blk_queue_enter: wait_event: pm=0 [ 371.300666] scsi_runtime_resume [ 371.303834] scsi_runtime_resume [ 371.307007] scsi_runtime_resume [ 371.310213] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 371.316011] blk_update_request: I/O error, dev sda, sector 5101888 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 [ 372.560690] scsi_runtime_suspend [ 372.563968] scsi_runtime_suspend [ 372.567237] scsi_runtime_suspend thanks Alan for taking the time and trying to fix this! you're close. what is missing? martin
On Tue, Jun 30, 2020 at 05:33:25AM +0200, Martin Kepplinger wrote: > > Martin, does this fix the problem? > > > > not quite: mounting works and resuming itself indeed happens now when > copying a file, but the I/O itself doesn't, but says "device offline or > changed": > > [ 167.167615] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: > hostbyte=0x00 driverbyte=0x08 cmd_age=0s > [ 167.167630] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] > [ 167.167638] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 That code stands for "Not-ready to ready transition". It isn't really an error, just a notification. The command should have been retried. > [ 167.167648] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 24 > c2 00 00 01 00 > [ 167.167658] blk_update_request: I/O error, dev sda, sector 9410 op > 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 > [ 167.178327] FAT-fs (sda1): FAT read failed (blocknr 1218) And it should not have failed. Martin or James, any ideas about this? > [ 167.183895] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 167.189695] blk_update_request: I/O error, dev sda, sector 5101888 op > 0x0:(READ) flags 0x80700 phys_seg 8 prio class 0 > [ 167.200510] sd 0:0:0:0: [sda] tag#0 device offline or changed > > > and a later try to copy a file only yields (mostly my own debug prints): > > > [ 371.110798] blk_queue_enter: wait_event: pm=0 > [ 371.300666] scsi_runtime_resume > [ 371.303834] scsi_runtime_resume > [ 371.307007] scsi_runtime_resume > [ 371.310213] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 371.316011] blk_update_request: I/O error, dev sda, sector 5101888 op > 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 No way to tell from the log what caused this error. > [ 372.560690] scsi_runtime_suspend > [ 372.563968] scsi_runtime_suspend > [ 372.567237] scsi_runtime_suspend > > thanks Alan for taking the time and trying to fix this! you're close. > what is missing? At this point I don't know. Alan Stern
On 2020-06-29 09:15, Alan Stern wrote: > Aha. Looking at this more closely, it's apparent that the code in > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT > flag is set then the request can be issued regardless of the queue's > runtime status. That is not correct when the queue is suspended. Are you sure of this? In the past (legacy block layer) no requests were processed for queues in state RPM_SUSPENDED. However, that function and its successor blk_pm_allow_request() are gone. The following code was removed by commit 7cedffec8e75 ("block: Make blk_get_request() block for non-PM requests while suspended"). static struct request *blk_pm_peek_request(struct request_queue *q, struct request *rq) { if (q->dev && (q->rpm_status == RPM_SUSPENDED || (q->rpm_status != RPM_ACTIVE && !(rq->cmd_flags & REQ_PM)))) return NULL; else return rq; } Bart.
On Tue, Jun 30, 2020 at 08:59:00AM -0700, Bart Van Assche wrote: > On 2020-06-29 09:15, Alan Stern wrote: > > Aha. Looking at this more closely, it's apparent that the code in > > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT > > flag is set then the request can be issued regardless of the queue's > > runtime status. That is not correct when the queue is suspended. > > Are you sure of this? In the past (legacy block layer) no requests were > processed for queues in state RPM_SUSPENDED. However, that function and > its successor blk_pm_allow_request() are gone. The following code was > removed by commit 7cedffec8e75 ("block: Make blk_get_request() block for > non-PM requests while suspended"). > > static struct request *blk_pm_peek_request(struct request_queue *q, > struct request *rq) > { > if (q->dev && (q->rpm_status == RPM_SUSPENDED || > (q->rpm_status != RPM_ACTIVE && !(rq->cmd_flags & REQ_PM)))) > return NULL; > else > return rq; > } No, it wasn't. Another routine, blk_pm_allow_request(), was removed by that commit, but almost no other code was deleted. Maybe you're thinking of a different commit? In any case, I don't understand the point you're trying to make. Here's what I _do_ understand: While the queue is in the RPM_SUSPENDED state, it can't carry out any requests at all. If a request is added to the queue, the queue must be resumed before the request can be issued to the lower-layer drivers -- no matter what flags are set in the request. Right now there doesn't seem to be any mechanism for resuming the queue if an REQ_PREEMPT request is added while the queue is suspended. Alan Stern
On 2020-06-30 11:02, Alan Stern wrote: > Right now there doesn't seem to be any mechanism for resuming the queue > if an REQ_PREEMPT request is added while the queue is suspended. I do not agree with the above statement. My understanding is that resuming happens as follows if a request is submitted against a runtime suspended queue owned by a SCSI LLD: blk_queue_enter() -> blk_pm_request_resume() -> pm_request_resume(dev) -> __pm_runtime_resume(dev, RPM_ASYNC) -> rpm_resume(dev, RPM_ASYNC) -> dev->power.request = RPM_REQ_RESUME; -> queue_work(pm_wq, &dev->power.work) -> pm_runtime_work() -> rpm_resume(dev, RPM_NOWAIT) -> callback = scsi_runtime_resume; -> rpm_callback(callback, dev); -> scsi_runtime_resume(dev); -> sdev_runtime_resume(dev); -> blk_pre_runtime_resume(sdev->request_queue); -> q->rpm_status = RPM_RESUMING; -> sd_resume(dev); -> sd_start_stop_device(sdkp); -> sd_pm_ops.runtime_resume == scsi_execute(sdp, START); -> blk_get_request(..., ..., BLK_MQ_REQ_PREEMPT) -> blk_mq_alloc_request() -> blk_queue_enter() -> __blk_mq_alloc_request() -> blk_execute_rq() -> blk_put_request() -> blk_post_runtime_resume(sdev->request_queue); -> q->rpm_status = RPM_ACTIVE; -> pm_runtime_mark_last_busy(q->dev); -> pm_request_autosuspend(q->dev); Bart.
On Tue, Jun 30, 2020 at 12:23:12PM -0700, Bart Van Assche wrote: > On 2020-06-30 11:02, Alan Stern wrote: > > Right now there doesn't seem to be any mechanism for resuming the queue > > if an REQ_PREEMPT request is added while the queue is suspended. > > I do not agree with the above statement. My understanding is that resuming > happens as follows if a request is submitted against a runtime suspended > queue owned by a SCSI LLD: > > blk_queue_enter() > -> blk_pm_request_resume() Assume that BLK_MQ_REQ_PREEMPT is set in flags. Then where exactly does blk_queue_enter(q, flags) call blk_pm_request_resume(q)? Nowhere, as far as I can see. Alan Stern > -> pm_request_resume(dev) > -> __pm_runtime_resume(dev, RPM_ASYNC) > -> rpm_resume(dev, RPM_ASYNC) > -> dev->power.request = RPM_REQ_RESUME; > -> queue_work(pm_wq, &dev->power.work) > -> pm_runtime_work() > -> rpm_resume(dev, RPM_NOWAIT) > -> callback = scsi_runtime_resume; > -> rpm_callback(callback, dev); > -> scsi_runtime_resume(dev); > -> sdev_runtime_resume(dev); > -> blk_pre_runtime_resume(sdev->request_queue); > -> q->rpm_status = RPM_RESUMING; > -> sd_resume(dev); > -> sd_start_stop_device(sdkp); > -> sd_pm_ops.runtime_resume == scsi_execute(sdp, START); > -> blk_get_request(..., ..., BLK_MQ_REQ_PREEMPT) > -> blk_mq_alloc_request() > -> blk_queue_enter() > -> __blk_mq_alloc_request() > -> blk_execute_rq() > -> blk_put_request() > -> blk_post_runtime_resume(sdev->request_queue); > -> q->rpm_status = RPM_ACTIVE; > -> pm_runtime_mark_last_busy(q->dev); > -> pm_request_autosuspend(q->dev); > > Bart.
On 2020-06-30 12:38, Alan Stern wrote: > Assume that BLK_MQ_REQ_PREEMPT is set in flags. Then where exactly > does blk_queue_enter(q, flags) call blk_pm_request_resume(q)? Please take a look at how the *current* implementation of runtime power management works. Your question is relevant for the old implementation of runtime power management but not for the current implementation. Bart.
On Tue, Jun 30, 2020 at 04:31:58PM -0700, Bart Van Assche wrote: > On 2020-06-30 12:38, Alan Stern wrote: > > Assume that BLK_MQ_REQ_PREEMPT is set in flags. Then where exactly > > does blk_queue_enter(q, flags) call blk_pm_request_resume(q)? > > Please take a look at how the *current* implementation of runtime power > management works. Your question is relevant for the old implementation > of runtime power management but not for the current implementation. What do you mean by "current"? I have been looking at the implementation in 5.8-rc3 from Linus's tree. Should I look somewhere else? Alan Stern
On Tue, Jun 30, 2020 at 20:49:58PM -0400, Alan Stern wrote: > On Tue, Jun 30, 2020 at 04:31:58PM -0700, Bart Van Assche wrote: > > On 2020-06-30 12:38, Alan Stern wrote: > > > Assume that BLK_MQ_REQ_PREEMPT is set in flags. Then where exactly > > > does blk_queue_enter(q, flags) call blk_pm_request_resume(q)? > > > > Please take a look at how the *current* implementation of runtime power > > management works. Your question is relevant for the old implementation > > of runtime power management but not for the current implementation. > > What do you mean by "current"? I have been looking at the implementation > in 5.8-rc3 from Linus's tree. Should I look somewhere else? Any reply to this, or further concerns about the proposed patch? I'd like to fix up this API, and it appears that you are the only person to have worked on it significantly in the last two years. Alan Stern
On 06.07.20 18:41, Alan Stern wrote: > On Tue, Jun 30, 2020 at 20:49:58PM -0400, Alan Stern wrote: >> On Tue, Jun 30, 2020 at 04:31:58PM -0700, Bart Van Assche wrote: >>> On 2020-06-30 12:38, Alan Stern wrote: >>>> Assume that BLK_MQ_REQ_PREEMPT is set in flags. Then where exactly >>>> does blk_queue_enter(q, flags) call blk_pm_request_resume(q)? >>> >>> Please take a look at how the *current* implementation of runtime power >>> management works. Your question is relevant for the old implementation >>> of runtime power management but not for the current implementation. >> >> What do you mean by "current"? I have been looking at the implementation >> in 5.8-rc3 from Linus's tree. Should I look somewhere else? > > Any reply to this, or further concerns about the proposed patch? > > I'd like to fix up this API, and it appears that you are the only > person to have worked on it significantly in the last two years. > > Alan Stern > Hi Alan, Any API cleanup is of course welcome. I just wanted to remind you that the underlying problem: broken block device runtime pm. Your initial proposed fix "almost" did it and mounting works but during file access, it still just looks like a runtime_resume is missing somewhere. As we need to have that working at some point, I might look into it, but someone who has experience in the block layer can surely do it more efficiently. to test, turn off polling: echo 0 > /sys/module/block/parameters/events_dfl_poll_msecs and enable runtime pm in your (scsi) device: power/autosuspend_delay_ms and of course power/control (set to auto). thanks, martin
On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger wrote: > Hi Alan, > > Any API cleanup is of course welcome. I just wanted to remind you that > the underlying problem: broken block device runtime pm. Your initial > proposed fix "almost" did it and mounting works but during file access, > it still just looks like a runtime_resume is missing somewhere. Well, I have tested that proposed fix several times, and on my system it's working perfectly. When I stop accessing a drive it autosuspends, and when I access it again it gets resumed and works -- as you would expect. > As we need to have that working at some point, I might look into it, but > someone who has experience in the block layer can surely do it more > efficiently. I suspect that any problems you still face are caused by something else. Alan Stern
On 28.07.20 22:02, Alan Stern wrote: > On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger wrote: >> Hi Alan, >> >> Any API cleanup is of course welcome. I just wanted to remind you that >> the underlying problem: broken block device runtime pm. Your initial >> proposed fix "almost" did it and mounting works but during file access, >> it still just looks like a runtime_resume is missing somewhere. > > Well, I have tested that proposed fix several times, and on my system > it's working perfectly. When I stop accessing a drive it autosuspends, > and when I access it again it gets resumed and works -- as you would > expect. that's weird. when I mount, everything looks good, "sda1". But as soon as I cd to the mountpoint and do "ls" (on another SD card "ls" works but actual file reading leads to the exact same errors), I get: [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60 40 00 00 01 00 [ 77.474678] blk_update_request: I/O error, dev sda, sector 24640 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [ 77.485836] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 77.491628] blk_update_request: I/O error, dev sda, sector 24641 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [ 77.502275] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 77.508051] blk_update_request: I/O error, dev sda, sector 24642 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [ 77.518651] sd 0:0:0:0: [sda] tag#0 device offline or changed (...) [ 77.947653] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 77.953434] FAT-fs (sda1): Directory bread(block 16448) failed [ 77.959333] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 77.965118] FAT-fs (sda1): Directory bread(block 16449) failed [ 77.971014] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 77.976802] FAT-fs (sda1): Directory bread(block 16450) failed [ 77.982698] sd 0:0:0:0: [sda] tag#0 device offline or changed (...) [ 78.384929] FAT-fs (sda1): Filesystem has been set read-only [ 103.070973] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 103.076751] print_req_error: 118 callbacks suppressed [ 103.076760] blk_update_request: I/O error, dev sda, sector 9748 op 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0 [ 103.087428] Buffer I/O error on dev sda1, logical block 1556, lost async page write [ 103.095309] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 103.101123] blk_update_request: I/O error, dev sda, sector 17162 op 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0 [ 103.111883] Buffer I/O error on dev sda1, logical block 8970, lost async page write > >> As we need to have that working at some point, I might look into it, but >> someone who has experience in the block layer can surely do it more >> efficiently. > > I suspect that any problems you still face are caused by something else. > I then formatted sda1 to ext2 (on the runtime suspend system testing your patch) and that seems to have worked! Again accessing the mountpoint then yield the very same "device offline or changed" errors. What kind of device are you testing? You should be easily able to reproduce this using an "sd" device. The problems must lie in the different other drivers we use I guess. martin > Alan Stern >
On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote: > On 28.07.20 22:02, Alan Stern wrote: > > On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger wrote: > >> Hi Alan, > >> > >> Any API cleanup is of course welcome. I just wanted to remind you that > >> the underlying problem: broken block device runtime pm. Your initial > >> proposed fix "almost" did it and mounting works but during file access, > >> it still just looks like a runtime_resume is missing somewhere. > > > > Well, I have tested that proposed fix several times, and on my system > > it's working perfectly. When I stop accessing a drive it autosuspends, > > and when I access it again it gets resumed and works -- as you would > > expect. > > that's weird. when I mount, everything looks good, "sda1". But as soon > as I cd to the mountpoint and do "ls" (on another SD card "ls" works but > actual file reading leads to the exact same errors), I get: > > [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: > hostbyte=0x00 driverbyte=0x08 cmd_age=0s > [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] > [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 > [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60 > 40 00 00 01 00 This error report comes from the SCSI layer, not the block layer. > [ 77.474678] blk_update_request: I/O error, dev sda, sector 24640 op > 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 > [ 77.485836] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 77.491628] blk_update_request: I/O error, dev sda, sector 24641 op > 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 > [ 77.502275] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 77.508051] blk_update_request: I/O error, dev sda, sector 24642 op > 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 > [ 77.518651] sd 0:0:0:0: [sda] tag#0 device offline or changed > (...) > [ 77.947653] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 77.953434] FAT-fs (sda1): Directory bread(block 16448) failed > [ 77.959333] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 77.965118] FAT-fs (sda1): Directory bread(block 16449) failed > [ 77.971014] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 77.976802] FAT-fs (sda1): Directory bread(block 16450) failed > [ 77.982698] sd 0:0:0:0: [sda] tag#0 device offline or changed > (...) > [ 78.384929] FAT-fs (sda1): Filesystem has been set read-only > [ 103.070973] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 103.076751] print_req_error: 118 callbacks suppressed > [ 103.076760] blk_update_request: I/O error, dev sda, sector 9748 op > 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0 > [ 103.087428] Buffer I/O error on dev sda1, logical block 1556, lost > async page write > [ 103.095309] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 103.101123] blk_update_request: I/O error, dev sda, sector 17162 op > 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0 > [ 103.111883] Buffer I/O error on dev sda1, logical block 8970, lost > async page write I can't tell why you're getting that error. In one of my tests the device returned the same kind of error status (Sense Key = 6, ASC = 0x28) but the operation was then retried successfully. Perhaps the problem lies in the device you are testing. > >> As we need to have that working at some point, I might look into it, but > >> someone who has experience in the block layer can surely do it more > >> efficiently. > > > > I suspect that any problems you still face are caused by something else. > > > > I then formatted sda1 to ext2 (on the runtime suspend system testing > your patch) and that seems to have worked! > > Again accessing the mountpoint then yield the very same "device offline > or changed" errors. > > What kind of device are you testing? You should be easily able to > reproduce this using an "sd" device. I tested two devices: a SanDisk Cruzer USB flash drive and a g-mass-storage gadget running under dummy-hcd. They each showed up as /dev/sdb on my system. I haven't tried testing with an SD card. If you have any specific sequence of commands you would like me to run, let me know. > The problems must lie in the different other drivers we use I guess. Or the devices. Have you tried testing with a USB flash drive? Alan Stern
Alan, >> [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: >> hostbyte=0x00 driverbyte=0x08 cmd_age=0s >> [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] >> [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 >> [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60 >> 40 00 00 01 00 > > This error report comes from the SCSI layer, not the block layer. This the device telling us that the media (SD card?) has changed.
On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote: > On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote: > > On 28.07.20 22:02, Alan Stern wrote: > > > On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger > > > wrote: > > > > Hi Alan, > > > > > > > > Any API cleanup is of course welcome. I just wanted to remind > > > > you that the underlying problem: broken block device runtime > > > > pm. Your initial proposed fix "almost" did it and mounting > > > > works but during file access, it still just looks like a > > > > runtime_resume is missing somewhere. > > > > > > Well, I have tested that proposed fix several times, and on my > > > system it's working perfectly. When I stop accessing a drive it > > > autosuspends, and when I access it again it gets resumed and > > > works -- as you would expect. > > > > that's weird. when I mount, everything looks good, "sda1". But as > > soon as I cd to the mountpoint and do "ls" (on another SD card "ls" > > works but actual file reading leads to the exact same errors), I > > get: > > > > [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: > > hostbyte=0x00 driverbyte=0x08 cmd_age=0s > > [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] > > [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 > > [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 > > 60 40 00 00 01 00 > > This error report comes from the SCSI layer, not the block layer. That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE CHANGED" so it sounds like it something we should be ignoring. Usually this signals a problem, like you changed the medium manually (ejected the CD). But in this case you can tell us to expect this by setting sdev->expecting_cc_ua And we'll retry. I think you need to set this on all resumed devices. James
On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote: > On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote: > > On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote: > > > On 28.07.20 22:02, Alan Stern wrote: > > > > On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger > > > > wrote: > > > > > Hi Alan, > > > > > > > > > > Any API cleanup is of course welcome. I just wanted to remind > > > > > you that the underlying problem: broken block device runtime > > > > > pm. Your initial proposed fix "almost" did it and mounting > > > > > works but during file access, it still just looks like a > > > > > runtime_resume is missing somewhere. > > > > > > > > Well, I have tested that proposed fix several times, and on my > > > > system it's working perfectly. When I stop accessing a drive > > > > it autosuspends, and when I access it again it gets resumed and > > > > works -- as you would expect. > > > > > > that's weird. when I mount, everything looks good, "sda1". But as > > > soon as I cd to the mountpoint and do "ls" (on another SD card > > > "ls" works but actual file reading leads to the exact same > > > errors), I get: > > > > > > [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: > > > hostbyte=0x00 driverbyte=0x08 cmd_age=0s > > > [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] > > > [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 > > > [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 > > > 00 60 40 00 00 01 00 > > > > This error report comes from the SCSI layer, not the block layer. > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE > CHANGED" so it sounds like it something we should be > ignoring. Usually this signals a problem, like you changed the > medium manually (ejected the CD). But in this case you can tell us > to expect this by setting > > sdev->expecting_cc_ua > > And we'll retry. I think you need to set this on all resumed > devices. Actually, it's not quite that easy, we filter out this ASC/ASCQ combination from the check because we should never ignore medium might have changed events on running devices. We could ignore it if we had a flag to say the power has been yanked (perhaps an additional sdev flag you set on resume) but we would still miss the case where you really had powered off the drive and then changed the media ... if you can regard this as the user's problem, then we might have a solution. James
On Wed, Jul 29, 2020 at 10:44:26AM -0400, Martin K. Petersen wrote: > > Alan, > > >> [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: > >> hostbyte=0x00 driverbyte=0x08 cmd_age=0s > >> [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] > >> [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 > >> [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60 > >> 40 00 00 01 00 > > > > This error report comes from the SCSI layer, not the block layer. > > This the device telling us that the media (SD card?) has changed. Ah yes, thank you. I knew that SK=6 ASC=0x28 meant "Not Ready to Ready Transition", but I had forgotten the "(Media May Have Changed)" part. This makes sense and is a reasonable thing to see, since many SD card readers lose track of whether or not the card has been changed when they go into suspend. Alan Stern
On 29.07.20 16:53, James Bottomley wrote: > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote: >> On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote: >>> On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote: >>>> On 28.07.20 22:02, Alan Stern wrote: >>>>> On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger >>>>> wrote: >>>>>> Hi Alan, >>>>>> >>>>>> Any API cleanup is of course welcome. I just wanted to remind >>>>>> you that the underlying problem: broken block device runtime >>>>>> pm. Your initial proposed fix "almost" did it and mounting >>>>>> works but during file access, it still just looks like a >>>>>> runtime_resume is missing somewhere. >>>>> >>>>> Well, I have tested that proposed fix several times, and on my >>>>> system it's working perfectly. When I stop accessing a drive >>>>> it autosuspends, and when I access it again it gets resumed and >>>>> works -- as you would expect. >>>> >>>> that's weird. when I mount, everything looks good, "sda1". But as >>>> soon as I cd to the mountpoint and do "ls" (on another SD card >>>> "ls" works but actual file reading leads to the exact same >>>> errors), I get: >>>> >>>> [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: >>>> hostbyte=0x00 driverbyte=0x08 cmd_age=0s >>>> [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] >>>> [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 >>>> [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 >>>> 00 60 40 00 00 01 00 >>> >>> This error report comes from the SCSI layer, not the block layer. >> >> That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE >> CHANGED" so it sounds like it something we should be >> ignoring. Usually this signals a problem, like you changed the >> medium manually (ejected the CD). But in this case you can tell us >> to expect this by setting >> >> sdev->expecting_cc_ua >> >> And we'll retry. I think you need to set this on all resumed >> devices. > > Actually, it's not quite that easy, we filter out this ASC/ASCQ > combination from the check because we should never ignore medium might > have changed events on running devices. We could ignore it if we had a > flag to say the power has been yanked (perhaps an additional sdev flag > you set on resume) but we would still miss the case where you really > had powered off the drive and then changed the media ... if you can > regard this as the user's problem, then we might have a solution. > > James > oh I see what you mean now, thanks for the ellaboration. if I do the following change, things all look normal and runtime pm works. I'm not 100% sure if just setting expecting_cc_ua in resume() is "correct" but that looks like it is what you're talking about: (note that this is of course with the one block layer diff applied that Alan posted a few emails back) --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd) * so that we can deal with it there. */ if (scmd->device->expecting_cc_ua) { - /* - * Because some device does not queue unit - * attentions correctly, we carefully check - * additional sense code and qualifier so as - * not to squash media change unit attention. - */ - if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) { - scmd->device->expecting_cc_ua = 0; - return NEEDS_RETRY; - } + scmd->device->expecting_cc_ua = 0; + return NEEDS_RETRY; } /* * we might also expect a cc/ua if another LUN on the target diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index d90fefffe31b..5ad847fed8b9 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev) if (!sdkp) /* E.g.: runtime resume at the start of sd_probe() */ return 0; + sdkp->device->expecting_cc_ua = 1; + if (!sdkp->device->manage_start_stop) return 0;
On Wed, Jul 29, 2020 at 07:53:52AM -0700, James Bottomley wrote: > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote: > > On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote: > > > On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote: > > > > On 28.07.20 22:02, Alan Stern wrote: > > > > > On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger > > > > > wrote: > > > > > > Hi Alan, > > > > > > > > > > > > Any API cleanup is of course welcome. I just wanted to remind > > > > > > you that the underlying problem: broken block device runtime > > > > > > pm. Your initial proposed fix "almost" did it and mounting > > > > > > works but during file access, it still just looks like a > > > > > > runtime_resume is missing somewhere. > > > > > > > > > > Well, I have tested that proposed fix several times, and on my > > > > > system it's working perfectly. When I stop accessing a drive > > > > > it autosuspends, and when I access it again it gets resumed and > > > > > works -- as you would expect. > > > > > > > > that's weird. when I mount, everything looks good, "sda1". But as > > > > soon as I cd to the mountpoint and do "ls" (on another SD card > > > > "ls" works but actual file reading leads to the exact same > > > > errors), I get: > > > > > > > > [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: > > > > hostbyte=0x00 driverbyte=0x08 cmd_age=0s > > > > [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] > > > > [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 > > > > [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 > > > > 00 60 40 00 00 01 00 > > > > > > This error report comes from the SCSI layer, not the block layer. > > > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE > > CHANGED" so it sounds like it something we should be > > ignoring. Usually this signals a problem, like you changed the > > medium manually (ejected the CD). But in this case you can tell us > > to expect this by setting > > > > sdev->expecting_cc_ua > > > > And we'll retry. I think you need to set this on all resumed > > devices. > > Actually, it's not quite that easy, we filter out this ASC/ASCQ > combination from the check because we should never ignore medium might > have changed events on running devices. We could ignore it if we had a > flag to say the power has been yanked (perhaps an additional sdev flag > you set on resume) but we would still miss the case where you really > had powered off the drive and then changed the media ... if you can > regard this as the user's problem, then we might have a solution. Indeed, I was going to make the same point. The only reasonable conclusion is that suspending these SD card readers isn't safe unless they don't contain a card -- or maybe just if the device file isn't open or mounted. Although support for this sort of thing could be added to the kernel, for now it's best to rely on userspace doing the right thing. The kernel doesn't even know which devices suffer from this problem. Alan Stern
On Wed, 2020-07-29 at 17:40 +0200, Martin Kepplinger wrote: > On 29.07.20 16:53, James Bottomley wrote: > > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote: > > > On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote: [...] > > > > This error report comes from the SCSI layer, not the block > > > > layer. > > > > > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE > > > CHANGED" so it sounds like it something we should be > > > ignoring. Usually this signals a problem, like you changed the > > > medium manually (ejected the CD). But in this case you can tell > > > us to expect this by setting > > > > > > sdev->expecting_cc_ua > > > > > > And we'll retry. I think you need to set this on all resumed > > > devices. > > > > Actually, it's not quite that easy, we filter out this ASC/ASCQ > > combination from the check because we should never ignore medium > > might have changed events on running devices. We could ignore it > > if we had a flag to say the power has been yanked (perhaps an > > additional sdev flag you set on resume) but we would still miss the > > case where you really had powered off the drive and then changed > > the media ... if you can regard this as the user's problem, then we > > might have a solution. > > > > James > > > > oh I see what you mean now, thanks for the ellaboration. > > if I do the following change, things all look normal and runtime pm > works. I'm not 100% sure if just setting expecting_cc_ua in resume() > is "correct" but that looks like it is what you're talking about: > > (note that this is of course with the one block layer diff applied > that Alan posted a few emails back) > > > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd) > * so that we can deal with it there. > */ > if (scmd->device->expecting_cc_ua) { > - /* > - * Because some device does not queue unit > - * attentions correctly, we carefully check > - * additional sense code and qualifier so as > - * not to squash media change unit attention. > - */ > - if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) > { > - scmd->device->expecting_cc_ua = 0; > - return NEEDS_RETRY; > - } > + scmd->device->expecting_cc_ua = 0; > + return NEEDS_RETRY; Well, yes, but you can't do this because it would lose us media change events in the non-suspend/resume case which we really don't want. That's why I was suggesting a new flag. James
On Wed, 2020-07-29 at 11:40 -0400, Alan Stern wrote: > On Wed, Jul 29, 2020 at 07:53:52AM -0700, James Bottomley wrote: > > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote: [...] > > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE > > > CHANGED" so it sounds like it something we should be > > > ignoring. Usually this signals a problem, like you changed the > > > medium manually (ejected the CD). But in this case you can tell > > > us to expect this by setting > > > > > > sdev->expecting_cc_ua > > > > > > And we'll retry. I think you need to set this on all resumed > > > devices. > > > > Actually, it's not quite that easy, we filter out this ASC/ASCQ > > combination from the check because we should never ignore medium > > might have changed events on running devices. We could ignore it > > if we had a flag to say the power has been yanked (perhaps an > > additional sdev flag you set on resume) but we would still miss the > > case where you really had powered off the drive and then changed > > the media ... if you can regard this as the user's problem, then we > > might have a solution. > > Indeed, I was going to make the same point. > > The only reasonable conclusion is that suspending these SD card > readers isn't safe unless they don't contain a card -- or maybe just > if the device file isn't open or mounted. For standard removable media, I could see issuing an eject before suspend to emphasize the point. However, sd cards don't work like that ... they're not ejectable except manually and mostly used as the HDD of embedded, so the 99% use case is that the user will suspend and resume the device with the same sdd insert. It does sound like a use case we should support. > Although support for this sort of thing could be added to the > kernel, for now it's best to rely on userspace doing the right > thing. The kernel doesn't even know which devices suffer from this > problem. Can we solve from userspace the case where the user hasn't changed the media and the resume fails because we don't know? James
On 29.07.20 17:40, Alan Stern wrote: > On Wed, Jul 29, 2020 at 07:53:52AM -0700, James Bottomley wrote: >> On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote: >>> On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote: >>>> On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote: >>>>> On 28.07.20 22:02, Alan Stern wrote: >>>>>> On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger >>>>>> wrote: >>>>>>> Hi Alan, >>>>>>> >>>>>>> Any API cleanup is of course welcome. I just wanted to remind >>>>>>> you that the underlying problem: broken block device runtime >>>>>>> pm. Your initial proposed fix "almost" did it and mounting >>>>>>> works but during file access, it still just looks like a >>>>>>> runtime_resume is missing somewhere. >>>>>> >>>>>> Well, I have tested that proposed fix several times, and on my >>>>>> system it's working perfectly. When I stop accessing a drive >>>>>> it autosuspends, and when I access it again it gets resumed and >>>>>> works -- as you would expect. >>>>> >>>>> that's weird. when I mount, everything looks good, "sda1". But as >>>>> soon as I cd to the mountpoint and do "ls" (on another SD card >>>>> "ls" works but actual file reading leads to the exact same >>>>> errors), I get: >>>>> >>>>> [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: >>>>> hostbyte=0x00 driverbyte=0x08 cmd_age=0s >>>>> [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] >>>>> [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 >>>>> [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 >>>>> 00 60 40 00 00 01 00 >>>> >>>> This error report comes from the SCSI layer, not the block layer. >>> >>> That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE >>> CHANGED" so it sounds like it something we should be >>> ignoring. Usually this signals a problem, like you changed the >>> medium manually (ejected the CD). But in this case you can tell us >>> to expect this by setting >>> >>> sdev->expecting_cc_ua >>> >>> And we'll retry. I think you need to set this on all resumed >>> devices. >> >> Actually, it's not quite that easy, we filter out this ASC/ASCQ >> combination from the check because we should never ignore medium might >> have changed events on running devices. We could ignore it if we had a >> flag to say the power has been yanked (perhaps an additional sdev flag >> you set on resume) but we would still miss the case where you really >> had powered off the drive and then changed the media ... if you can >> regard this as the user's problem, then we might have a solution. > > Indeed, I was going to make the same point. > > The only reasonable conclusion is that suspending these SD card readers > isn't safe unless they don't contain a card -- or maybe just if the > device file isn't open or mounted. > > Although support for this sort of thing could be added to the kernel, > for now it's best to rely on userspace doing the right thing. The > kernel doesn't even know which devices suffer from this problem. > > well, userspace can do something like "automatically unmount if not used" but that is *way* more inefficient than if the kernel would support this is some way or another, for SD cards.
On Wed, Jul 29, 2020 at 08:49:34AM -0700, James Bottomley wrote: > On Wed, 2020-07-29 at 11:40 -0400, Alan Stern wrote: > > On Wed, Jul 29, 2020 at 07:53:52AM -0700, James Bottomley wrote: > > > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote: > [...] > > > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE > > > > CHANGED" so it sounds like it something we should be > > > > ignoring. Usually this signals a problem, like you changed the > > > > medium manually (ejected the CD). But in this case you can tell > > > > us to expect this by setting > > > > > > > > sdev->expecting_cc_ua > > > > > > > > And we'll retry. I think you need to set this on all resumed > > > > devices. > > > > > > Actually, it's not quite that easy, we filter out this ASC/ASCQ > > > combination from the check because we should never ignore medium > > > might have changed events on running devices. We could ignore it > > > if we had a flag to say the power has been yanked (perhaps an > > > additional sdev flag you set on resume) but we would still miss the > > > case where you really had powered off the drive and then changed > > > the media ... if you can regard this as the user's problem, then we > > > might have a solution. > > > > Indeed, I was going to make the same point. > > > > The only reasonable conclusion is that suspending these SD card > > readers isn't safe unless they don't contain a card -- or maybe just > > if the device file isn't open or mounted. > > For standard removable media, I could see issuing an eject before > suspend to emphasize the point. That's not a workable approach in general. For example, you wouldn't want to eject a CD just because it hadn't been used for a couple of minutes and the drive was therefore about to be suspended. > However, sd cards don't work like that > ... they're not ejectable except manually and mostly used as the HDD of > embedded, so the 99% use case is that the user will suspend and resume > the device with the same sdd insert. It does sound like a use case we > should support. Agreed. > > Although support for this sort of thing could be added to the > > kernel, for now it's best to rely on userspace doing the right > > thing. The kernel doesn't even know which devices suffer from this > > problem. > > Can we solve from userspace the case where the user hasn't changed the > media and the resume fails because we don't know? The difficulty is recognizing situations where the media really was changed while the device was suspended. Most devices, AFAIK, don't have any problem with this, but card readers often do. I suppose the kernel could just rely on users not to change media in drives while they are mounted. Then any problems that arise would be the users' fault. Yes, this is nothing more than passing the buck, but it's hard to come up with any better approach. Doesn't Windows do something like this? Alan Stern
Am 29. Juli 2020 17:44:42 MESZ schrieb James Bottomley <James.Bottomley@HansenPartnership.com>: >On Wed, 2020-07-29 at 17:40 +0200, Martin Kepplinger wrote: >> On 29.07.20 16:53, James Bottomley wrote: >> > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote: >> > > On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote: >[...] >> > > > This error report comes from the SCSI layer, not the block >> > > > layer. >> > > >> > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE >> > > CHANGED" so it sounds like it something we should be >> > > ignoring. Usually this signals a problem, like you changed the >> > > medium manually (ejected the CD). But in this case you can tell >> > > us to expect this by setting >> > > >> > > sdev->expecting_cc_ua >> > > >> > > And we'll retry. I think you need to set this on all resumed >> > > devices. >> > >> > Actually, it's not quite that easy, we filter out this ASC/ASCQ >> > combination from the check because we should never ignore medium >> > might have changed events on running devices. We could ignore it >> > if we had a flag to say the power has been yanked (perhaps an >> > additional sdev flag you set on resume) but we would still miss the >> > case where you really had powered off the drive and then changed >> > the media ... if you can regard this as the user's problem, then we >> > might have a solution. >> > >> > James >> > >> >> oh I see what you mean now, thanks for the ellaboration. >> >> if I do the following change, things all look normal and runtime pm >> works. I'm not 100% sure if just setting expecting_cc_ua in resume() >> is "correct" but that looks like it is what you're talking about: >> >> (note that this is of course with the one block layer diff applied >> that Alan posted a few emails back) >> >> >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >> * so that we can deal with it there. >> */ >> if (scmd->device->expecting_cc_ua) { >> - /* >> - * Because some device does not queue unit >> - * attentions correctly, we carefully check >> - * additional sense code and qualifier so as >> - * not to squash media change unit attention. >> - */ >> - if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) >> { >> - scmd->device->expecting_cc_ua = 0; >> - return NEEDS_RETRY; >> - } >> + scmd->device->expecting_cc_ua = 0; >> + return NEEDS_RETRY; > >Well, yes, but you can't do this because it would lose us media change >events in the non-suspend/resume case which we really don't want. >That's why I was suggesting a new flag. > >James also if I set expecting_cc_ua in resume() only, like I did?
On 2020-07-29 10:32 a.m., Alan Stern wrote: > On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote: >> On 28.07.20 22:02, Alan Stern wrote: >>> On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger wrote: >>>> Hi Alan, >>>> >>>> Any API cleanup is of course welcome. I just wanted to remind you that >>>> the underlying problem: broken block device runtime pm. Your initial >>>> proposed fix "almost" did it and mounting works but during file access, >>>> it still just looks like a runtime_resume is missing somewhere. >>> >>> Well, I have tested that proposed fix several times, and on my system >>> it's working perfectly. When I stop accessing a drive it autosuspends, >>> and when I access it again it gets resumed and works -- as you would >>> expect. >> >> that's weird. when I mount, everything looks good, "sda1". But as soon >> as I cd to the mountpoint and do "ls" (on another SD card "ls" works but >> actual file reading leads to the exact same errors), I get: >> >> [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: >> hostbyte=0x00 driverbyte=0x08 cmd_age=0s >> [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] >> [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 >> [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 60 >> 40 00 00 01 00 > > This error report comes from the SCSI layer, not the block layer. SCSI's first 11 byte command! I'm guessing the first byte is being repeated and it's actually: 28 00 00 00 60 40 00 00 01 00 [READ(10)] That should be fixed. It should be something like: "...CDB in hex: 28 00 ...". Doug Gilbert >> [ 77.474678] blk_update_request: I/O error, dev sda, sector 24640 op >> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 >> [ 77.485836] sd 0:0:0:0: [sda] tag#0 device offline or changed >> [ 77.491628] blk_update_request: I/O error, dev sda, sector 24641 op >> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 >> [ 77.502275] sd 0:0:0:0: [sda] tag#0 device offline or changed >> [ 77.508051] blk_update_request: I/O error, dev sda, sector 24642 op >> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 >> [ 77.518651] sd 0:0:0:0: [sda] tag#0 device offline or changed >> (...) >> [ 77.947653] sd 0:0:0:0: [sda] tag#0 device offline or changed >> [ 77.953434] FAT-fs (sda1): Directory bread(block 16448) failed >> [ 77.959333] sd 0:0:0:0: [sda] tag#0 device offline or changed >> [ 77.965118] FAT-fs (sda1): Directory bread(block 16449) failed >> [ 77.971014] sd 0:0:0:0: [sda] tag#0 device offline or changed >> [ 77.976802] FAT-fs (sda1): Directory bread(block 16450) failed >> [ 77.982698] sd 0:0:0:0: [sda] tag#0 device offline or changed >> (...) >> [ 78.384929] FAT-fs (sda1): Filesystem has been set read-only >> [ 103.070973] sd 0:0:0:0: [sda] tag#0 device offline or changed >> [ 103.076751] print_req_error: 118 callbacks suppressed >> [ 103.076760] blk_update_request: I/O error, dev sda, sector 9748 op >> 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0 >> [ 103.087428] Buffer I/O error on dev sda1, logical block 1556, lost >> async page write >> [ 103.095309] sd 0:0:0:0: [sda] tag#0 device offline or changed >> [ 103.101123] blk_update_request: I/O error, dev sda, sector 17162 op >> 0x1:(WRITE) flags 0x100000 phys_seg 1 prio class 0 >> [ 103.111883] Buffer I/O error on dev sda1, logical block 8970, lost >> async page write > > I can't tell why you're getting that error. In one of my tests the > device returned the same kind of error status (Sense Key = 6, ASC = > 0x28) but the operation was then retried successfully. Perhaps the > problem lies in the device you are testing. > >>>> As we need to have that working at some point, I might look into it, but >>>> someone who has experience in the block layer can surely do it more >>>> efficiently. >>> >>> I suspect that any problems you still face are caused by something else. >>> >> >> I then formatted sda1 to ext2 (on the runtime suspend system testing >> your patch) and that seems to have worked! >> >> Again accessing the mountpoint then yield the very same "device offline >> or changed" errors. >> >> What kind of device are you testing? You should be easily able to >> reproduce this using an "sd" device. > > I tested two devices: a SanDisk Cruzer USB flash drive and a > g-mass-storage gadget running under dummy-hcd. They each showed up as > /dev/sdb on my system. > > I haven't tried testing with an SD card. If you have any specific > sequence of commands you would like me to run, let me know. > >> The problems must lie in the different other drivers we use I guess. > > Or the devices. Have you tried testing with a USB flash drive? > > Alan Stern >
On Wed, Jul 29, 2020 at 06:43:48PM +0200, Martin Kepplinger wrote: > > > Am 29. Juli 2020 17:44:42 MESZ schrieb James Bottomley <James.Bottomley@HansenPartnership.com>: > >On Wed, 2020-07-29 at 17:40 +0200, Martin Kepplinger wrote: > >> On 29.07.20 16:53, James Bottomley wrote: > >> > On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote: > >> > > On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote: > >[...] > >> > > > This error report comes from the SCSI layer, not the block > >> > > > layer. > >> > > > >> > > That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE > >> > > CHANGED" so it sounds like it something we should be > >> > > ignoring. Usually this signals a problem, like you changed the > >> > > medium manually (ejected the CD). But in this case you can tell > >> > > us to expect this by setting > >> > > > >> > > sdev->expecting_cc_ua > >> > > > >> > > And we'll retry. I think you need to set this on all resumed > >> > > devices. > >> > > >> > Actually, it's not quite that easy, we filter out this ASC/ASCQ > >> > combination from the check because we should never ignore medium > >> > might have changed events on running devices. We could ignore it > >> > if we had a flag to say the power has been yanked (perhaps an > >> > additional sdev flag you set on resume) but we would still miss the > >> > case where you really had powered off the drive and then changed > >> > the media ... if you can regard this as the user's problem, then we > >> > might have a solution. > >> > > >> > James > >> > > >> > >> oh I see what you mean now, thanks for the ellaboration. > >> > >> if I do the following change, things all look normal and runtime pm > >> works. I'm not 100% sure if just setting expecting_cc_ua in resume() > >> is "correct" but that looks like it is what you're talking about: > >> > >> (note that this is of course with the one block layer diff applied > >> that Alan posted a few emails back) > >> > >> > >> --- a/drivers/scsi/scsi_error.c > >> +++ b/drivers/scsi/scsi_error.c > >> @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd) > >> * so that we can deal with it there. > >> */ > >> if (scmd->device->expecting_cc_ua) { > >> - /* > >> - * Because some device does not queue unit > >> - * attentions correctly, we carefully check > >> - * additional sense code and qualifier so as > >> - * not to squash media change unit attention. > >> - */ > >> - if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) > >> { > >> - scmd->device->expecting_cc_ua = 0; > >> - return NEEDS_RETRY; > >> - } > >> + scmd->device->expecting_cc_ua = 0; > >> + return NEEDS_RETRY; > > > >Well, yes, but you can't do this because it would lose us media change > >events in the non-suspend/resume case which we really don't want. > >That's why I was suggesting a new flag. > > > >James > > also if I set expecting_cc_ua in resume() only, like I did? That wouldn't make any difference. The information sent by your card reader has sshdr.asc == 0x28 and sshdr.ascq == 0x00 (you can see it in the log). So because of the code here in scsi_check_sense(), which you can't change, the Unit Attention sent by the card reader would not be retried even if you do set the flag in resume(). Alan Stern
On Wed, 2020-07-29 at 14:25 -0400, Alan Stern wrote: > On Wed, Jul 29, 2020 at 06:43:48PM +0200, Martin Kepplinger wrote: [...] > > > > --- a/drivers/scsi/scsi_error.c > > > > +++ b/drivers/scsi/scsi_error.c > > > > @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd > > > > *scmd) > > > > * so that we can deal with it there. > > > > */ > > > > if (scmd->device->expecting_cc_ua) { > > > > - /* > > > > - * Because some device does not queue > > > > unit > > > > - * attentions correctly, we carefully > > > > check > > > > - * additional sense code and qualifier > > > > so as > > > > - * not to squash media change unit > > > > attention. > > > > - */ > > > > - if (sshdr.asc != 0x28 || sshdr.ascq != > > > > 0x00) > > > > { > > > > - scmd->device->expecting_cc_ua = > > > > 0; > > > > - return NEEDS_RETRY; > > > > - } > > > > + scmd->device->expecting_cc_ua = 0; > > > > + return NEEDS_RETRY; > > > > > > Well, yes, but you can't do this because it would lose us media > > > change events in the non-suspend/resume case which we really > > > don't want. That's why I was suggesting a new flag. > > > > > > James > > > > also if I set expecting_cc_ua in resume() only, like I did? > > That wouldn't make any difference. The information sent by your > card reader has sshdr.asc == 0x28 and sshdr.ascq == 0x00 (you can see > it in the log). So because of the code here in scsi_check_sense(), > which you can't change, the Unit Attention sent by the card reader > would not be retried even if you do set the flag in resume(). But if we had a new flag, like expecting_media_change, you could set that in resume and we could condition the above test in the code on it and reset it and do a retry if it gets set. I'm not saying we have to do it this way, but it sounds like we have to do something in the kernel, so I think the flag will become necessary but there might be a bit of policy based dance around how it gets set in the kernel (to avoid losing accurate media change events). James
On 29.06.20 18:15, Alan Stern wrote: > On Mon, Jun 29, 2020 at 11:42:59AM +0200, Martin Kepplinger wrote: >> >> >> On 26.06.20 17:44, Alan Stern wrote: >>> Martin's best approach would be to add some debugging code to find out why >>> blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call >>> doesn't lead to pm_request_resume(). >>> >>> Alan Stern >>> >> >> Hi Alan, >> >> blk_queue_enter() always - especially when sd is runtime suspended and I >> try to mount as above - sets success to be true for me, so never >> continues down to bkl_pm_request_resume(). All I see is "PM: Removing >> info for No Bus:sda1". > > Aha. Looking at this more closely, it's apparent that the code in > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT > flag is set then the request can be issued regardless of the queue's > runtime status. That is not correct when the queue is suspended. > > Below is my attempt to fix this up. I'm not sure that the patch is > entirely correct, but it should fix this logic bug. I would appreciate a > critical review. > > Martin, does this fix the problem? > > Alan Stern Hi Alan, So in the block layer your change below indeed fixes the problem and if you want to submit that 1:1 feel free to add Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm> thanks for your help in this! martin > > > > Index: usb-devel/block/blk-core.c > =================================================================== > --- usb-devel.orig/block/blk-core.c > +++ usb-devel/block/blk-core.c > @@ -423,7 +423,8 @@ int blk_queue_enter(struct request_queue > * responsible for ensuring that that counter is > * globally visible before the queue is unfrozen. > */ > - if (pm || !blk_queue_pm_only(q)) { > + if ((pm && q->rpm_status != RPM_SUSPENDED) || > + !blk_queue_pm_only(q)) { > success = true; > } else { > percpu_ref_put(&q->q_usage_counter); > @@ -448,8 +449,7 @@ int blk_queue_enter(struct request_queue > > wait_event(q->mq_freeze_wq, > (!q->mq_freeze_depth && > - (pm || (blk_pm_request_resume(q), > - !blk_queue_pm_only(q)))) || > + blk_pm_resume_queue(pm, q)) || > blk_queue_dying(q)); > if (blk_queue_dying(q)) > return -ENODEV; > Index: usb-devel/block/blk-pm.h > =================================================================== > --- usb-devel.orig/block/blk-pm.h > +++ usb-devel/block/blk-pm.h > @@ -6,11 +6,14 @@ > #include <linux/pm_runtime.h> > > #ifdef CONFIG_PM > -static inline void blk_pm_request_resume(struct request_queue *q) > +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q) > { > - if (q->dev && (q->rpm_status == RPM_SUSPENDED || > - q->rpm_status == RPM_SUSPENDING)) > - pm_request_resume(q->dev); > + if (!q->dev || !blk_queue_pm_only(q)) > + return 1; /* Nothing to do */ > + if (pm && q->rpm_status != RPM_SUSPENDED) > + return 1; /* Request allowed */ > + pm_request_resume(q->dev); > + return 0; > } > > static inline void blk_pm_mark_last_busy(struct request *rq) > @@ -44,8 +47,9 @@ static inline void blk_pm_put_request(st > --rq->q->nr_pending; > } > #else > -static inline void blk_pm_request_resume(struct request_queue *q) > +static inline int blk_pm_resume_queue(const bool pm, struct request_queue *q) > { > + return 1; > } > > static inline void blk_pm_mark_last_busy(struct request *rq) >
On 29.07.20 20:29, James Bottomley wrote: > On Wed, 2020-07-29 at 14:25 -0400, Alan Stern wrote: >> On Wed, Jul 29, 2020 at 06:43:48PM +0200, Martin Kepplinger wrote: > [...] >>>>> --- a/drivers/scsi/scsi_error.c >>>>> +++ b/drivers/scsi/scsi_error.c >>>>> @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd >>>>> *scmd) >>>>> * so that we can deal with it there. >>>>> */ >>>>> if (scmd->device->expecting_cc_ua) { >>>>> - /* >>>>> - * Because some device does not queue >>>>> unit >>>>> - * attentions correctly, we carefully >>>>> check >>>>> - * additional sense code and qualifier >>>>> so as >>>>> - * not to squash media change unit >>>>> attention. >>>>> - */ >>>>> - if (sshdr.asc != 0x28 || sshdr.ascq != >>>>> 0x00) >>>>> { >>>>> - scmd->device->expecting_cc_ua = >>>>> 0; >>>>> - return NEEDS_RETRY; >>>>> - } >>>>> + scmd->device->expecting_cc_ua = 0; >>>>> + return NEEDS_RETRY; >>>> >>>> Well, yes, but you can't do this because it would lose us media >>>> change events in the non-suspend/resume case which we really >>>> don't want. That's why I was suggesting a new flag. >>>> >>>> James >>> >>> also if I set expecting_cc_ua in resume() only, like I did? >> >> That wouldn't make any difference. The information sent by your >> card reader has sshdr.asc == 0x28 and sshdr.ascq == 0x00 (you can see >> it in the log). So because of the code here in scsi_check_sense(), >> which you can't change, the Unit Attention sent by the card reader >> would not be retried even if you do set the flag in resume(). > > But if we had a new flag, like expecting_media_change, you could set > that in resume and we could condition the above test in the code on it > and reset it and do a retry if it gets set. I'm not saying we have to > do it this way, but it sounds like we have to do something in the > kernel, so I think the flag will become necessary but there might be a > bit of policy based dance around how it gets set in the kernel (to > avoid losing accurate media change events). > > James > Maybe I should just start a new discussion with a patch, but the below is what makes sense to me (when I understand you correctly) and seems to work. I basically add a new flag, so that the old flags behave unchanged and only call it during *runtime* resume for SD cards: --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd) * information that we should pass up to the upper-level driver * so that we can deal with it there. */ - if (scmd->device->expecting_cc_ua) { + if (scmd->device->expecting_cc_ua || + scmd->device->expecting_media_change) { /* * Because some device does not queue unit * attentions correctly, we carefully check * additional sense code and qualifier so as - * not to squash media change unit attention. + * not to squash media change unit attention; + * unless expecting_media_change is set, indicating + * that the media (most likely) didn't change + * but a device only believes so (for example + * because of suspend/resume). */ - if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) { - scmd->device->expecting_cc_ua = 0; + if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) || + scmd->device->expecting_media_change) { + scmd->device->expecting_media_change = 0; return NEEDS_RETRY; } } diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index d90fefffe31b..b647fab2b663 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -114,6 +114,7 @@ static void sd_shutdown(struct device *); static int sd_suspend_system(struct device *); static int sd_suspend_runtime(struct device *); static int sd_resume(struct device *); +static int sd_resume_runtime(struct device *); static void sd_rescan(struct device *); static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt); static void sd_uninit_command(struct scsi_cmnd *SCpnt); @@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = { .poweroff = sd_suspend_system, .restore = sd_resume, .runtime_suspend = sd_suspend_runtime, - .runtime_resume = sd_resume, + .runtime_resume = sd_resume_runtime, }; static struct scsi_driver sd_template = { @@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev) return ret; } +static int sd_resume_runtime(struct device *dev) +{ + struct scsi_disk *sdkp = dev_get_drvdata(dev); + + /* Some SD cardreaders report media change when resuming from suspend + * because they can't keep track during suspend. */ + + /* XXX This is not unproblematic though: We won't notice when a card + * was really changed during runtime suspend! We basically rely on users + * to unmount or suspend before doing so. */ + sdkp->device->expecting_media_change = 1; + + return sd_resume(dev); +} + /** * init_sd - entry point for this driver (both when built in or when * a module). diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index bc5909033d13..8c8f053f71c8 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -169,6 +169,8 @@ struct scsi_device { * this device */ unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN * because we did a bus reset. */ + unsigned expecting_media_change:1; /* Expecting media change ASC/ASCQ + when it actually doesn't change */ unsigned use_10_for_rw:1; /* first try 10-byte read / write */ unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
On 30.07.20 10:52, Martin Kepplinger wrote: > On 29.07.20 20:29, James Bottomley wrote: >> On Wed, 2020-07-29 at 14:25 -0400, Alan Stern wrote: >>> On Wed, Jul 29, 2020 at 06:43:48PM +0200, Martin Kepplinger wrote: >> [...] >>>>>> --- a/drivers/scsi/scsi_error.c >>>>>> +++ b/drivers/scsi/scsi_error.c >>>>>> @@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd >>>>>> *scmd) >>>>>> * so that we can deal with it there. >>>>>> */ >>>>>> if (scmd->device->expecting_cc_ua) { >>>>>> - /* >>>>>> - * Because some device does not queue >>>>>> unit >>>>>> - * attentions correctly, we carefully >>>>>> check >>>>>> - * additional sense code and qualifier >>>>>> so as >>>>>> - * not to squash media change unit >>>>>> attention. >>>>>> - */ >>>>>> - if (sshdr.asc != 0x28 || sshdr.ascq != >>>>>> 0x00) >>>>>> { >>>>>> - scmd->device->expecting_cc_ua = >>>>>> 0; >>>>>> - return NEEDS_RETRY; >>>>>> - } >>>>>> + scmd->device->expecting_cc_ua = 0; >>>>>> + return NEEDS_RETRY; >>>>> >>>>> Well, yes, but you can't do this because it would lose us media >>>>> change events in the non-suspend/resume case which we really >>>>> don't want. That's why I was suggesting a new flag. >>>>> >>>>> James >>>> >>>> also if I set expecting_cc_ua in resume() only, like I did? >>> >>> That wouldn't make any difference. The information sent by your >>> card reader has sshdr.asc == 0x28 and sshdr.ascq == 0x00 (you can see >>> it in the log). So because of the code here in scsi_check_sense(), >>> which you can't change, the Unit Attention sent by the card reader >>> would not be retried even if you do set the flag in resume(). >> >> But if we had a new flag, like expecting_media_change, you could set >> that in resume and we could condition the above test in the code on it >> and reset it and do a retry if it gets set. I'm not saying we have to >> do it this way, but it sounds like we have to do something in the >> kernel, so I think the flag will become necessary but there might be a >> bit of policy based dance around how it gets set in the kernel (to >> avoid losing accurate media change events). >> >> James >> > > Maybe I should just start a new discussion with a patch, but the below > is what makes sense to me (when I understand you correctly) and seems to > work. I basically add a new flag, so that the old flags behave unchanged > and only call it during *runtime* resume for SD cards: > > > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd) > * information that we should pass up to the upper-level > driver > * so that we can deal with it there. > */ > - if (scmd->device->expecting_cc_ua) { > + if (scmd->device->expecting_cc_ua || > + scmd->device->expecting_media_change) { > /* > * Because some device does not queue unit > * attentions correctly, we carefully check > * additional sense code and qualifier so as > - * not to squash media change unit attention. > + * not to squash media change unit attention; > + * unless expecting_media_change is set, indicating > + * that the media (most likely) didn't change > + * but a device only believes so (for example > + * because of suspend/resume). > */ > - if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) { > - scmd->device->expecting_cc_ua = 0; > + if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) || > + scmd->device->expecting_media_change) { > + scmd->device->expecting_media_change = 0; oops, I missed expecting_cc_ua here, but you get the idea. > return NEEDS_RETRY; > } > } > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index d90fefffe31b..b647fab2b663 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -114,6 +114,7 @@ static void sd_shutdown(struct device *); > static int sd_suspend_system(struct device *); > static int sd_suspend_runtime(struct device *); > static int sd_resume(struct device *); > +static int sd_resume_runtime(struct device *); > static void sd_rescan(struct device *); > static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt); > static void sd_uninit_command(struct scsi_cmnd *SCpnt); > @@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = { > .poweroff = sd_suspend_system, > .restore = sd_resume, > .runtime_suspend = sd_suspend_runtime, > - .runtime_resume = sd_resume, > + .runtime_resume = sd_resume_runtime, > }; > > static struct scsi_driver sd_template = { > @@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev) > return ret; > } > > +static int sd_resume_runtime(struct device *dev) > +{ > + struct scsi_disk *sdkp = dev_get_drvdata(dev); > + > + /* Some SD cardreaders report media change when resuming from > suspend > + * because they can't keep track during suspend. */ > + > + /* XXX This is not unproblematic though: We won't notice when a card > + * was really changed during runtime suspend! We basically rely > on users > + * to unmount or suspend before doing so. */ > + sdkp->device->expecting_media_change = 1; > + > + return sd_resume(dev); > +} > + > /** > * init_sd - entry point for this driver (both when built in or when > * a module). > diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h > index bc5909033d13..8c8f053f71c8 100644 > --- a/include/scsi/scsi_device.h > +++ b/include/scsi/scsi_device.h > @@ -169,6 +169,8 @@ struct scsi_device { > * this device */ > unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN > * because we did a bus reset. */ > + unsigned expecting_media_change:1; /* Expecting media change > ASC/ASCQ > + when it actually doesn't > change */ > unsigned use_10_for_rw:1; /* first try 10-byte read / write */ > unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ > unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */ >
On Thu, Jul 30, 2020 at 10:52:14AM +0200, Martin Kepplinger wrote: > Maybe I should just start a new discussion with a patch, but the below > is what makes sense to me (when I understand you correctly) and seems to > work. I basically add a new flag, so that the old flags behave unchanged > and only call it during *runtime* resume for SD cards: > > > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd) > * information that we should pass up to the upper-level > driver > * so that we can deal with it there. > */ > - if (scmd->device->expecting_cc_ua) { > + if (scmd->device->expecting_cc_ua || > + scmd->device->expecting_media_change) { > /* > * Because some device does not queue unit > * attentions correctly, we carefully check > * additional sense code and qualifier so as > - * not to squash media change unit attention. > + * not to squash media change unit attention; > + * unless expecting_media_change is set, indicating > + * that the media (most likely) didn't change > + * but a device only believes so (for example > + * because of suspend/resume). > */ > - if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) { > - scmd->device->expecting_cc_ua = 0; > + if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) || > + scmd->device->expecting_media_change) { > + scmd->device->expecting_media_change = 0; > return NEEDS_RETRY; > } > } > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index d90fefffe31b..b647fab2b663 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -114,6 +114,7 @@ static void sd_shutdown(struct device *); > static int sd_suspend_system(struct device *); > static int sd_suspend_runtime(struct device *); > static int sd_resume(struct device *); > +static int sd_resume_runtime(struct device *); > static void sd_rescan(struct device *); > static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt); > static void sd_uninit_command(struct scsi_cmnd *SCpnt); > @@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = { > .poweroff = sd_suspend_system, > .restore = sd_resume, > .runtime_suspend = sd_suspend_runtime, > - .runtime_resume = sd_resume, > + .runtime_resume = sd_resume_runtime, > }; > > static struct scsi_driver sd_template = { > @@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev) > return ret; > } > > +static int sd_resume_runtime(struct device *dev) > +{ > + struct scsi_disk *sdkp = dev_get_drvdata(dev); > + > + /* Some SD cardreaders report media change when resuming from > suspend > + * because they can't keep track during suspend. */ > + > + /* XXX This is not unproblematic though: We won't notice when a card > + * was really changed during runtime suspend! We basically rely > on users > + * to unmount or suspend before doing so. */ > + sdkp->device->expecting_media_change = 1; > + > + return sd_resume(dev); > +} > + > /** > * init_sd - entry point for this driver (both when built in or when > * a module). > diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h > index bc5909033d13..8c8f053f71c8 100644 > --- a/include/scsi/scsi_device.h > +++ b/include/scsi/scsi_device.h > @@ -169,6 +169,8 @@ struct scsi_device { > * this device */ > unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN > * because we did a bus reset. */ > + unsigned expecting_media_change:1; /* Expecting media change > ASC/ASCQ > + when it actually doesn't > change */ > unsigned use_10_for_rw:1; /* first try 10-byte read / write */ > unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ > unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */ That's pretty much what James was suggesting, except for one thing: You must not set sdkp->device->expecting_media_change to 1 for all devices in sd_runtime_resume(). Only for devices which may generate a spurious Unit Attention following runtime resume -- and maybe not even for all of them, depending on what the user wants. Alan Stern
On Thu, Jul 30, 2020 at 10:05:50AM +0200, Martin Kepplinger wrote: > On 29.06.20 18:15, Alan Stern wrote: > > On Mon, Jun 29, 2020 at 11:42:59AM +0200, Martin Kepplinger wrote: > >> > >> > >> On 26.06.20 17:44, Alan Stern wrote: > >>> Martin's best approach would be to add some debugging code to find out why > >>> blk_queue_enter() isn't calling bkl_pm_request_resume(), or why that call > >>> doesn't lead to pm_request_resume(). > >>> > >>> Alan Stern > >>> > >> > >> Hi Alan, > >> > >> blk_queue_enter() always - especially when sd is runtime suspended and I > >> try to mount as above - sets success to be true for me, so never > >> continues down to bkl_pm_request_resume(). All I see is "PM: Removing > >> info for No Bus:sda1". > > > > Aha. Looking at this more closely, it's apparent that the code in > > blk-core.c contains a logic bug: It assumes that if the BLK_MQ_REQ_PREEMPT > > flag is set then the request can be issued regardless of the queue's > > runtime status. That is not correct when the queue is suspended. > > > > Below is my attempt to fix this up. I'm not sure that the patch is > > entirely correct, but it should fix this logic bug. I would appreciate a > > critical review. > > > > Martin, does this fix the problem? > > > > Alan Stern > > Hi Alan, > > So in the block layer your change below indeed fixes the problem and if > you want to submit that 1:1 feel free to add > > Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm> > > thanks for your help in this! Thank you for _your_ help! The next merge window is coming up soon. I think I'll wait until it is over before submitting the patch (maintainers tend to be too busy to consider new patches during a merge window). But I am still open to comments or criticism of the patch in the meantime. There hasn't been any feedback since Bart's initial set of questions. Alan Stern
On 30.07.20 17:10, Alan Stern wrote: > On Thu, Jul 30, 2020 at 10:52:14AM +0200, Martin Kepplinger wrote: >> Maybe I should just start a new discussion with a patch, but the below >> is what makes sense to me (when I understand you correctly) and seems to >> work. I basically add a new flag, so that the old flags behave unchanged >> and only call it during *runtime* resume for SD cards: >> >> >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >> * information that we should pass up to the upper-level >> driver >> * so that we can deal with it there. >> */ >> - if (scmd->device->expecting_cc_ua) { >> + if (scmd->device->expecting_cc_ua || >> + scmd->device->expecting_media_change) { >> /* >> * Because some device does not queue unit >> * attentions correctly, we carefully check >> * additional sense code and qualifier so as >> - * not to squash media change unit attention. >> + * not to squash media change unit attention; >> + * unless expecting_media_change is set, indicating >> + * that the media (most likely) didn't change >> + * but a device only believes so (for example >> + * because of suspend/resume). >> */ >> - if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) { >> - scmd->device->expecting_cc_ua = 0; >> + if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) || >> + scmd->device->expecting_media_change) { >> + scmd->device->expecting_media_change = 0; >> return NEEDS_RETRY; >> } >> } >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >> index d90fefffe31b..b647fab2b663 100644 >> --- a/drivers/scsi/sd.c >> +++ b/drivers/scsi/sd.c >> @@ -114,6 +114,7 @@ static void sd_shutdown(struct device *); >> static int sd_suspend_system(struct device *); >> static int sd_suspend_runtime(struct device *); >> static int sd_resume(struct device *); >> +static int sd_resume_runtime(struct device *); >> static void sd_rescan(struct device *); >> static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt); >> static void sd_uninit_command(struct scsi_cmnd *SCpnt); >> @@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = { >> .poweroff = sd_suspend_system, >> .restore = sd_resume, >> .runtime_suspend = sd_suspend_runtime, >> - .runtime_resume = sd_resume, >> + .runtime_resume = sd_resume_runtime, >> }; >> >> static struct scsi_driver sd_template = { >> @@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev) >> return ret; >> } >> >> +static int sd_resume_runtime(struct device *dev) >> +{ >> + struct scsi_disk *sdkp = dev_get_drvdata(dev); >> + >> + /* Some SD cardreaders report media change when resuming from >> suspend >> + * because they can't keep track during suspend. */ >> + >> + /* XXX This is not unproblematic though: We won't notice when a card >> + * was really changed during runtime suspend! We basically rely >> on users >> + * to unmount or suspend before doing so. */ >> + sdkp->device->expecting_media_change = 1; >> + >> + return sd_resume(dev); >> +} >> + >> /** >> * init_sd - entry point for this driver (both when built in or when >> * a module). >> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h >> index bc5909033d13..8c8f053f71c8 100644 >> --- a/include/scsi/scsi_device.h >> +++ b/include/scsi/scsi_device.h >> @@ -169,6 +169,8 @@ struct scsi_device { >> * this device */ >> unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN >> * because we did a bus reset. */ >> + unsigned expecting_media_change:1; /* Expecting media change >> ASC/ASCQ >> + when it actually doesn't >> change */ >> unsigned use_10_for_rw:1; /* first try 10-byte read / write */ >> unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ >> unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */ > > That's pretty much what James was suggesting, except for one thing: You > must not set sdkp->device->expecting_media_change to 1 for all devices > in sd_runtime_resume(). Only for devices which may generate a spurious > Unit Attention following runtime resume -- and maybe not even for all of > them, depending on what the user wants. > > Alan Stern > when I mount the SD card myself or via Nautilus, things work. When I put sth like: /dev/sda1 /mnt/sda1 auto auto,nofail 0 2 into fstab, I *still* get (constantly) when accessing the files: [ 50.838061] sd 0:0:0:0: [sda] tag#0 device offline or changed why could that be? is there another place we would add such a new flag (not only resume())? martin
On 04.08.20 11:39, Martin Kepplinger wrote: > On 30.07.20 17:10, Alan Stern wrote: >> On Thu, Jul 30, 2020 at 10:52:14AM +0200, Martin Kepplinger wrote: >>> Maybe I should just start a new discussion with a patch, but the below >>> is what makes sense to me (when I understand you correctly) and seems to >>> work. I basically add a new flag, so that the old flags behave unchanged >>> and only call it during *runtime* resume for SD cards: >>> >>> >>> --- a/drivers/scsi/scsi_error.c >>> +++ b/drivers/scsi/scsi_error.c >>> @@ -553,15 +553,21 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >>> * information that we should pass up to the upper-level >>> driver >>> * so that we can deal with it there. >>> */ >>> - if (scmd->device->expecting_cc_ua) { >>> + if (scmd->device->expecting_cc_ua || >>> + scmd->device->expecting_media_change) { >>> /* >>> * Because some device does not queue unit >>> * attentions correctly, we carefully check >>> * additional sense code and qualifier so as >>> - * not to squash media change unit attention. >>> + * not to squash media change unit attention; >>> + * unless expecting_media_change is set, indicating >>> + * that the media (most likely) didn't change >>> + * but a device only believes so (for example >>> + * because of suspend/resume). >>> */ >>> - if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) { >>> - scmd->device->expecting_cc_ua = 0; >>> + if ((sshdr.asc != 0x28 || sshdr.ascq != 0x00) || >>> + scmd->device->expecting_media_change) { >>> + scmd->device->expecting_media_change = 0; >>> return NEEDS_RETRY; >>> } >>> } >>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >>> index d90fefffe31b..b647fab2b663 100644 >>> --- a/drivers/scsi/sd.c >>> +++ b/drivers/scsi/sd.c >>> @@ -114,6 +114,7 @@ static void sd_shutdown(struct device *); >>> static int sd_suspend_system(struct device *); >>> static int sd_suspend_runtime(struct device *); >>> static int sd_resume(struct device *); >>> +static int sd_resume_runtime(struct device *); >>> static void sd_rescan(struct device *); >>> static blk_status_t sd_init_command(struct scsi_cmnd *SCpnt); >>> static void sd_uninit_command(struct scsi_cmnd *SCpnt); >>> @@ -574,7 +575,7 @@ static const struct dev_pm_ops sd_pm_ops = { >>> .poweroff = sd_suspend_system, >>> .restore = sd_resume, >>> .runtime_suspend = sd_suspend_runtime, >>> - .runtime_resume = sd_resume, >>> + .runtime_resume = sd_resume_runtime, >>> }; >>> >>> static struct scsi_driver sd_template = { >>> @@ -3652,6 +3653,21 @@ static int sd_resume(struct device *dev) >>> return ret; >>> } >>> >>> +static int sd_resume_runtime(struct device *dev) >>> +{ >>> + struct scsi_disk *sdkp = dev_get_drvdata(dev); >>> + >>> + /* Some SD cardreaders report media change when resuming from >>> suspend >>> + * because they can't keep track during suspend. */ >>> + >>> + /* XXX This is not unproblematic though: We won't notice when a card >>> + * was really changed during runtime suspend! We basically rely >>> on users >>> + * to unmount or suspend before doing so. */ >>> + sdkp->device->expecting_media_change = 1; >>> + >>> + return sd_resume(dev); >>> +} >>> + >>> /** >>> * init_sd - entry point for this driver (both when built in or when >>> * a module). >>> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h >>> index bc5909033d13..8c8f053f71c8 100644 >>> --- a/include/scsi/scsi_device.h >>> +++ b/include/scsi/scsi_device.h >>> @@ -169,6 +169,8 @@ struct scsi_device { >>> * this device */ >>> unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN >>> * because we did a bus reset. */ >>> + unsigned expecting_media_change:1; /* Expecting media change >>> ASC/ASCQ >>> + when it actually doesn't >>> change */ >>> unsigned use_10_for_rw:1; /* first try 10-byte read / write */ >>> unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ >>> unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */ >> >> That's pretty much what James was suggesting, except for one thing: You >> must not set sdkp->device->expecting_media_change to 1 for all devices >> in sd_runtime_resume(). Only for devices which may generate a spurious >> Unit Attention following runtime resume -- and maybe not even for all of >> them, depending on what the user wants. >> >> Alan Stern >> > > when I mount the SD card myself or via Nautilus, things work. When I put > sth like: > > /dev/sda1 /mnt/sda1 auto auto,nofail 0 2 > > into fstab, I *still* get (constantly) when accessing the files: > > [ 50.838061] sd 0:0:0:0: [sda] tag#0 device offline or changed > > why could that be? is there another place we would add such a new flag > (not only resume())? > > martin > it's really strange: below is the change I'm trying. Of course that's only for testing the functionality, nothing how a patch could look like. While I remember it had worked, now (weirdly since I tried that mounting via fstab) it doesn't anymore! What I understand (not much): I handle the error with "retry" via the new flag, but scsi_decide_disposition() returns SUCCESS because of "no more retries"; but it's the first and only time it's called. How can this be? What am I missing? --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -565,6 +565,13 @@ int scsi_check_sense(struct scsi_cmnd *scmd) return NEEDS_RETRY; } } + if (scmd->device->expecting_media_change) { + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { + scmd->device->expecting_media_change = 0; + return NEEDS_RETRY; + } + } + /* * we might also expect a cc/ua if another LUN on the target * reported a UA with an ASC/ASCQ of 3F 0E - diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index d90fefffe31b..bb583e403b81 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev) if (!sdkp) /* E.g.: runtime resume at the start of sd_probe() */ return 0; + sdkp->device->expecting_media_change = 1; + if (!sdkp->device->manage_start_stop) return 0; diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index bc5909033d13..f5fc1af68e00 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -169,6 +169,7 @@ struct scsi_device { * this device */ unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN * because we did a bus reset. */ + unsigned expecting_media_change:1; unsigned use_10_for_rw:1; /* first try 10-byte read / write */ unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote: > it's really strange: below is the change I'm trying. Of course that's > only for testing the functionality, nothing how a patch could look like. > > While I remember it had worked, now (weirdly since I tried that mounting > via fstab) it doesn't anymore! > > What I understand (not much): I handle the error with "retry" via the > new flag, but scsi_decide_disposition() returns SUCCESS because of "no > more retries"; but it's the first and only time it's called. Are you saying that scmd->allowed is set to 0? Or is scsi_notry_cmd() returning a nonzero value? Whichever is true, why does it happen that way? What is the failing command? Is it a READ(10)? > How can this be? What am I missing? It's kind of hard to tell without seeing the error messages or system log or any debugging information. Alan Stern > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -565,6 +565,13 @@ int scsi_check_sense(struct scsi_cmnd *scmd) > return NEEDS_RETRY; > } > } > + if (scmd->device->expecting_media_change) { > + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { > + scmd->device->expecting_media_change = 0; > + return NEEDS_RETRY; > + } > + } > + > /* > * we might also expect a cc/ua if another LUN on the target > * reported a UA with an ASC/ASCQ of 3F 0E - > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index d90fefffe31b..bb583e403b81 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev) > if (!sdkp) /* E.g.: runtime resume at the start of sd_probe() */ > return 0; > > + sdkp->device->expecting_media_change = 1; > + > if (!sdkp->device->manage_start_stop) > return 0; > > diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h > index bc5909033d13..f5fc1af68e00 100644 > --- a/include/scsi/scsi_device.h > +++ b/include/scsi/scsi_device.h > @@ -169,6 +169,7 @@ struct scsi_device { > * this device */ > unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN > * because we did a bus reset. */ > + unsigned expecting_media_change:1; > unsigned use_10_for_rw:1; /* first try 10-byte read / write */ > unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ > unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
On 07.08.20 16:30, Alan Stern wrote: > On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote: >> it's really strange: below is the change I'm trying. Of course that's >> only for testing the functionality, nothing how a patch could look like. >> >> While I remember it had worked, now (weirdly since I tried that mounting >> via fstab) it doesn't anymore! >> >> What I understand (not much): I handle the error with "retry" via the >> new flag, but scsi_decide_disposition() returns SUCCESS because of "no >> more retries"; but it's the first and only time it's called. > > Are you saying that scmd->allowed is set to 0? Or is scsi_notry_cmd() > returning a nonzero value? Whichever is true, why does it happen that > way? scsi_notry_cmd() is returning 1. (it's retry 1 of 5 allowed). why is it returning 1? REQ_FAILFAST_DEV is set. It's DID_OK, then "if (status_byte(scmd->result) != CHECK_CONDITION)" appearently is not true, then at the end it returns 1 because of REQ_FAILFAST_DEV. that seems to come from the block layer. why and when? could I change that so that the scsi error handling stays in control? > > What is the failing command? Is it a READ(10)? Not sure how I'd answer that, but here's the test to trigger the error: mount /dev/sda1 /mnt cd /mnt ls cp file ~/ (if ls "works" and doesn't yet trigger the error) and that's the (familiar looking) logs when doing so. again: despite the mentioned workaround in scsi_error and the new expected_media_change flag *is* set and gets cleared, as it should be. REQ_FAILFAST_DEV seems to override what I want to do is scsi_error: [ 55.557629] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 cmd_age=0s [ 55.557639] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] [ 55.557646] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 [ 55.557657] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 08 fc e0 00 00 01 00 [ 55.557666] blk_update_request: I/O error, dev sda, sector 589024 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [ 55.568899] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 55.574691] blk_update_request: I/O error, dev sda, sector 589025 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [ 55.585756] sd 0:0:0:0: [sda] tag#0 device offline or changed [ 55.591562] blk_update_request: I/O error, dev sda, sector 589026 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 [ 55.602274] sd 0:0:0:0: [sda] tag#0 device offline or changed (... goes on with the same) > >> How can this be? What am I missing? > > It's kind of hard to tell without seeing the error messages or system > log or any debugging information. thanks a lot for getting back to me, martin > > Alan Stern > >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -565,6 +565,13 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >> return NEEDS_RETRY; >> } >> } >> + if (scmd->device->expecting_media_change) { >> + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { >> + scmd->device->expecting_media_change = 0; >> + return NEEDS_RETRY; >> + } >> + } >> + >> /* >> * we might also expect a cc/ua if another LUN on the target >> * reported a UA with an ASC/ASCQ of 3F 0E - >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >> index d90fefffe31b..bb583e403b81 100644 >> --- a/drivers/scsi/sd.c >> +++ b/drivers/scsi/sd.c >> @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev) >> if (!sdkp) /* E.g.: runtime resume at the start of sd_probe() */ >> return 0; >> >> + sdkp->device->expecting_media_change = 1; >> + >> if (!sdkp->device->manage_start_stop) >> return 0; >> >> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h >> index bc5909033d13..f5fc1af68e00 100644 >> --- a/include/scsi/scsi_device.h >> +++ b/include/scsi/scsi_device.h >> @@ -169,6 +169,7 @@ struct scsi_device { >> * this device */ >> unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN >> * because we did a bus reset. */ >> + unsigned expecting_media_change:1; >> unsigned use_10_for_rw:1; /* first try 10-byte read / write */ >> unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ >> unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */
On Sat, Aug 08, 2020 at 08:59:09AM +0200, Martin Kepplinger wrote: > On 07.08.20 16:30, Alan Stern wrote: > > On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote: > >> it's really strange: below is the change I'm trying. Of course that's > >> only for testing the functionality, nothing how a patch could look like. > >> > >> While I remember it had worked, now (weirdly since I tried that mounting > >> via fstab) it doesn't anymore! > >> > >> What I understand (not much): I handle the error with "retry" via the > >> new flag, but scsi_decide_disposition() returns SUCCESS because of "no > >> more retries"; but it's the first and only time it's called. > > > > Are you saying that scmd->allowed is set to 0? Or is scsi_notry_cmd() > > returning a nonzero value? Whichever is true, why does it happen that > > way? > > scsi_notry_cmd() is returning 1. (it's retry 1 of 5 allowed). > > why is it returning 1? REQ_FAILFAST_DEV is set. It's DID_OK, then "if > (status_byte(scmd->result) != CHECK_CONDITION)" appearently is not true, > then at the end it returns 1 because of REQ_FAILFAST_DEV. > > that seems to come from the block layer. why and when? could I change > that so that the scsi error handling stays in control? The only place I see where that flag might get set is in blk_mq_bio_to_request() in block/blk-mq.c, which does: if (bio->bi_opf & REQ_RAHEAD) rq->cmd_flags |= REQ_FAILFAST_MASK; So apparently read-ahead reads are supposed to fail fast (i.e., without retries), presumably because they are optional after all. > > What is the failing command? Is it a READ(10)? > > Not sure how I'd answer that, but here's the test to trigger the error: > > mount /dev/sda1 /mnt > cd /mnt > ls > cp file ~/ (if ls "works" and doesn't yet trigger the error) > > and that's the (familiar looking) logs when doing so. again: despite the > mentioned workaround in scsi_error and the new expected_media_change > flag *is* set and gets cleared, as it should be. REQ_FAILFAST_DEV seems > to override what I want to do is scsi_error: > > [ 55.557629] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: > hostbyte=0x00 driverbyte=0x08 cmd_age=0s > [ 55.557639] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] > [ 55.557646] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 > [ 55.557657] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 08 fc > e0 00 00 01 00 Yes, 0x28 is READ(10). Likely this is a read-ahead request, although I don't know how we can tell for sure. > [ 55.557666] blk_update_request: I/O error, dev sda, sector 589024 op > 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 > [ 55.568899] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 55.574691] blk_update_request: I/O error, dev sda, sector 589025 op > 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 > [ 55.585756] sd 0:0:0:0: [sda] tag#0 device offline or changed > [ 55.591562] blk_update_request: I/O error, dev sda, sector 589026 op > 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 > [ 55.602274] sd 0:0:0:0: [sda] tag#0 device offline or changed > (... goes on with the same) Is such a drastic response really appropriate for the failure of a read-ahead request? It seems like a more logical response would be to let the request fail but keep the device online. Of course, that would only solve part of your problem -- your log would still get filled with those "I/O error" messages even though they wouldn't be fatal. Probably a better approach would be to make the new expecting_media_change flag override scsi_no_retry_cmd(). But this is not my area of expertise. Maybe someone else will have more to say. Alan Stern
On 08.08.20 17:05, Alan Stern wrote: > On Sat, Aug 08, 2020 at 08:59:09AM +0200, Martin Kepplinger wrote: >> On 07.08.20 16:30, Alan Stern wrote: >>> On Fri, Aug 07, 2020 at 11:51:21AM +0200, Martin Kepplinger wrote: >>>> it's really strange: below is the change I'm trying. Of course that's >>>> only for testing the functionality, nothing how a patch could look like. >>>> >>>> While I remember it had worked, now (weirdly since I tried that mounting >>>> via fstab) it doesn't anymore! >>>> >>>> What I understand (not much): I handle the error with "retry" via the >>>> new flag, but scsi_decide_disposition() returns SUCCESS because of "no >>>> more retries"; but it's the first and only time it's called. >>> >>> Are you saying that scmd->allowed is set to 0? Or is scsi_notry_cmd() >>> returning a nonzero value? Whichever is true, why does it happen that >>> way? >> >> scsi_notry_cmd() is returning 1. (it's retry 1 of 5 allowed). >> >> why is it returning 1? REQ_FAILFAST_DEV is set. It's DID_OK, then "if >> (status_byte(scmd->result) != CHECK_CONDITION)" appearently is not true, >> then at the end it returns 1 because of REQ_FAILFAST_DEV. >> >> that seems to come from the block layer. why and when? could I change >> that so that the scsi error handling stays in control? > > The only place I see where that flag might get set is in > blk_mq_bio_to_request() in block/blk-mq.c, which does: > > if (bio->bi_opf & REQ_RAHEAD) > rq->cmd_flags |= REQ_FAILFAST_MASK; > > So apparently read-ahead reads are supposed to fail fast (i.e., without > retries), presumably because they are optional after all. > >>> What is the failing command? Is it a READ(10)? >> >> Not sure how I'd answer that, but here's the test to trigger the error: >> >> mount /dev/sda1 /mnt >> cd /mnt >> ls >> cp file ~/ (if ls "works" and doesn't yet trigger the error) >> >> and that's the (familiar looking) logs when doing so. again: despite the >> mentioned workaround in scsi_error and the new expected_media_change >> flag *is* set and gets cleared, as it should be. REQ_FAILFAST_DEV seems >> to override what I want to do is scsi_error: >> >> [ 55.557629] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: >> hostbyte=0x00 driverbyte=0x08 cmd_age=0s >> [ 55.557639] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current] >> [ 55.557646] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0 >> [ 55.557657] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 08 fc >> e0 00 00 01 00 > > Yes, 0x28 is READ(10). Likely this is a read-ahead request, although I > don't know how we can tell for sure. > >> [ 55.557666] blk_update_request: I/O error, dev sda, sector 589024 op >> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 >> [ 55.568899] sd 0:0:0:0: [sda] tag#0 device offline or changed >> [ 55.574691] blk_update_request: I/O error, dev sda, sector 589025 op >> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 >> [ 55.585756] sd 0:0:0:0: [sda] tag#0 device offline or changed >> [ 55.591562] blk_update_request: I/O error, dev sda, sector 589026 op >> 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 >> [ 55.602274] sd 0:0:0:0: [sda] tag#0 device offline or changed >> (... goes on with the same) > > Is such a drastic response really appropriate for the failure of a > read-ahead request? It seems like a more logical response would be to > let the request fail but keep the device online. > > Of course, that would only solve part of your problem -- your log would > still get filled with those "I/O error" messages even though they > wouldn't be fatal. Probably a better approach would be to make the new > expecting_media_change flag override scsi_no_retry_cmd(). > > But this is not my area of expertise. Maybe someone else will have more > to say. > > Alan Stern > Hey Alan, I'm really glad for that, I suspected some of this but I have little experience in scsi/block layers, so that is super helpful. I'd appreciate an opinion on the below workaround that *seems* to work now (let's see, I had thought so before :) Whether or not this helps to find a real solution, let's see. But integration of such a flag in the error handling paths is what's interesting for now: --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -565,6 +565,17 @@ int scsi_check_sense(struct scsi_cmnd *scmd) return NEEDS_RETRY; } } + if (scmd->device->expecting_media_change) { + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { + /* clear expecting_media_change in + * scsi_noretry_cmd() because we need + * to override possible "failfast" overrides + * that block readahead can cause. + */ + return NEEDS_RETRY; + } + } + /* * we might also expect a cc/ua if another LUN on the target * reported a UA with an ASC/ASCQ of 3F 0E - @@ -1744,6 +1755,15 @@ int scsi_noretry_cmd(struct scsi_cmnd *scmd) return (scmd->request->cmd_flags & REQ_FAILFAST_DRIVER); } + /* + * We need to have retries when expecting_media_change is set. + * So we need to return success and clear the flag here. + */ + if (scmd->device->expecting_media_change) { + scmd->device->expecting_media_change = 0; + return 0; + } + if (status_byte(scmd->result) != CHECK_CONDITION) return 0; diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index d90fefffe31b..bb583e403b81 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev) if (!sdkp) /* E.g.: runtime resume at the start of sd_probe() */ return 0; + sdkp->device->expecting_media_change = 1; + if (!sdkp->device->manage_start_stop) return 0; diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index bc5909033d13..f5fc1af68e00 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -169,6 +169,7 @@ struct scsi_device { * this device */ unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN * because we did a bus reset. */ + unsigned expecting_media_change:1; unsigned use_10_for_rw:1; /* first try 10-byte read / write */ unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */ --
On Sun, Aug 09, 2020 at 11:20:22AM +0200, Martin Kepplinger wrote: > Hey Alan, I'm really glad for that, I suspected some of this but I have > little experience in scsi/block layers, so that is super helpful. > > I'd appreciate an opinion on the below workaround that *seems* to work > now (let's see, I had thought so before :) > > Whether or not this helps to find a real solution, let's see. But > integration of such a flag in the error handling paths is what's > interesting for now: > > > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -565,6 +565,17 @@ int scsi_check_sense(struct scsi_cmnd *scmd) > return NEEDS_RETRY; > } > } > + if (scmd->device->expecting_media_change) { > + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { > + /* clear expecting_media_change in > + * scsi_noretry_cmd() because we need > + * to override possible "failfast" overrides > + * that block readahead can cause. > + */ > + return NEEDS_RETRY; This is a somewhat fragile approach. You don't know for certain that scsi_noretry_cmd will be called. Also, scsi_noretry_cmd can be called from other places. It would be better to clear the expecting_media_change flag just before returning from scsi_decide_disposition. That way its use is localized to one routine, not spread out between two. Alan Stern
On 09.08.20 17:26, Alan Stern wrote: > On Sun, Aug 09, 2020 at 11:20:22AM +0200, Martin Kepplinger wrote: >> Hey Alan, I'm really glad for that, I suspected some of this but I have >> little experience in scsi/block layers, so that is super helpful. >> >> I'd appreciate an opinion on the below workaround that *seems* to work >> now (let's see, I had thought so before :) >> >> Whether or not this helps to find a real solution, let's see. But >> integration of such a flag in the error handling paths is what's >> interesting for now: >> >> >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -565,6 +565,17 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >> return NEEDS_RETRY; >> } >> } >> + if (scmd->device->expecting_media_change) { >> + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { >> + /* clear expecting_media_change in >> + * scsi_noretry_cmd() because we need >> + * to override possible "failfast" overrides >> + * that block readahead can cause. >> + */ >> + return NEEDS_RETRY; > > This is a somewhat fragile approach. You don't know for certain that > scsi_noretry_cmd will be called. Also, scsi_noretry_cmd can be called > from other places. > > It would be better to clear the expecting_media_change flag just before > returning from scsi_decide_disposition. That way its use is localized > to one routine, not spread out between two. > > Alan Stern > Hi Alan, maybe you're right. I initially just thought that I'd allow for specific error codes in scsi_noretry_cmd() to return non-NULL (BUS_BUSY, PARITY, ERROR) despite having the flag set. The below version works equally fine for me but I'm not sure if it's actually more safe. James, when exposing a new writable sysfs option like "suspend_no_media_change"(?) that drivers can check before setting the new "expecting_media_change" flag (during resume), would this addition make sense to you? thanks, martin --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -565,6 +565,18 @@ int scsi_check_sense(struct scsi_cmnd *scmd) return NEEDS_RETRY; } } + if (scmd->device->expecting_media_change) { + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { + /* + * clear the expecting_media_change in + * scsi_decide_disposition() because we + * need to catch possible "fail fast" overrides + * that block readahead can cause. + */ + return NEEDS_RETRY; + } + } + /* * we might also expect a cc/ua if another LUN on the target * reported a UA with an ASC/ASCQ of 3F 0E - @@ -1944,9 +1956,19 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd) * the request was not marked fast fail. Note that above, * even if the request is marked fast fail, we still requeue * for queue congestion conditions (QUEUE_FULL or BUSY) */ - if ((++scmd->retries) <= scmd->allowed - && !scsi_noretry_cmd(scmd)) { - return NEEDS_RETRY; + if ((++scmd->retries) <= scmd->allowed) { + /* + * but scsi_noretry_cmd() cannot override the + * expecting_media_change flag. + */ + if (!scsi_noretry_cmd(scmd) || + scmd->device->expecting_media_change) { + scmd->device->expecting_media_change = 0; + return NEEDS_RETRY; + } else { + /* marked fast fail and not expected. */ + return SUCCESS; + } } else { /* * no more retries - report this one back to upper level. diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index d90fefffe31b..bb583e403b81 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev) if (!sdkp) /* E.g.: runtime resume at the start of sd_probe() */ return 0; + sdkp->device->expecting_media_change = 1; + if (!sdkp->device->manage_start_stop) return 0; diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index bc5909033d13..f5fc1af68e00 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -169,6 +169,7 @@ struct scsi_device { * this device */ unsigned expecting_cc_ua:1; /* Expecting a CHECK_CONDITION/UNIT_ATTN * because we did a bus reset. */ + unsigned expecting_media_change:1; unsigned use_10_for_rw:1; /* first try 10-byte read / write */ unsigned use_10_for_ms:1; /* first try 10-byte mode sense/select */ unsigned set_dbd_for_ms:1; /* Set "DBD" field in mode sense */ --
On Mon, Aug 10, 2020 at 02:03:17PM +0200, Martin Kepplinger wrote: > On 09.08.20 17:26, Alan Stern wrote: > > This is a somewhat fragile approach. You don't know for certain that > > scsi_noretry_cmd will be called. Also, scsi_noretry_cmd can be called > > from other places. > > > > It would be better to clear the expecting_media_change flag just before > > returning from scsi_decide_disposition. That way its use is localized > > to one routine, not spread out between two. > > > > Alan Stern > > > > Hi Alan, > > maybe you're right. I initially just thought that I'd allow for specific > error codes in scsi_noretry_cmd() to return non-NULL (BUS_BUSY, PARITY, > ERROR) despite having the flag set. > > The below version works equally fine for me but I'm not sure if it's > actually more safe. > > James, when exposing a new writable sysfs option like > "suspend_no_media_change"(?) that drivers can check before setting the > new "expecting_media_change" flag (during resume), would this addition > make sense to you? > > thanks, > > martin > > > > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -565,6 +565,18 @@ int scsi_check_sense(struct scsi_cmnd *scmd) > return NEEDS_RETRY; > } > } > + if (scmd->device->expecting_media_change) { > + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { > + /* > + * clear the expecting_media_change in > + * scsi_decide_disposition() because we > + * need to catch possible "fail fast" overrides > + * that block readahead can cause. > + */ > + return NEEDS_RETRY; > + } > + } > + > /* > * we might also expect a cc/ua if another LUN on the target > * reported a UA with an ASC/ASCQ of 3F 0E - > @@ -1944,9 +1956,19 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd) > * the request was not marked fast fail. Note that above, > * even if the request is marked fast fail, we still requeue > * for queue congestion conditions (QUEUE_FULL or BUSY) */ > - if ((++scmd->retries) <= scmd->allowed > - && !scsi_noretry_cmd(scmd)) { > - return NEEDS_RETRY; > + if ((++scmd->retries) <= scmd->allowed) { > + /* > + * but scsi_noretry_cmd() cannot override the > + * expecting_media_change flag. > + */ > + if (!scsi_noretry_cmd(scmd) || > + scmd->device->expecting_media_change) { > + scmd->device->expecting_media_change = 0; > + return NEEDS_RETRY; > + } else { > + /* marked fast fail and not expected. */ > + return SUCCESS; > + } > } else { This may not matter... but it's worth pointing out that expecting_media_change doesn't get cleared if ++scmd->retries > scmd->allowed. It also doesn't get cleared in cases where the device _doesn't_ report a Unit Attention. Alan Stern
On 10.08.20 16:13, Alan Stern wrote: > On Mon, Aug 10, 2020 at 02:03:17PM +0200, Martin Kepplinger wrote: >> On 09.08.20 17:26, Alan Stern wrote: >>> This is a somewhat fragile approach. You don't know for certain that >>> scsi_noretry_cmd will be called. Also, scsi_noretry_cmd can be called >>> from other places. >>> >>> It would be better to clear the expecting_media_change flag just before >>> returning from scsi_decide_disposition. That way its use is localized >>> to one routine, not spread out between two. >>> >>> Alan Stern >>> >> >> Hi Alan, >> >> maybe you're right. I initially just thought that I'd allow for specific >> error codes in scsi_noretry_cmd() to return non-NULL (BUS_BUSY, PARITY, >> ERROR) despite having the flag set. >> >> The below version works equally fine for me but I'm not sure if it's >> actually more safe. >> >> James, when exposing a new writable sysfs option like >> "suspend_no_media_change"(?) that drivers can check before setting the >> new "expecting_media_change" flag (during resume), would this addition >> make sense to you? >> >> thanks, >> >> martin >> >> >> >> --- a/drivers/scsi/scsi_error.c >> +++ b/drivers/scsi/scsi_error.c >> @@ -565,6 +565,18 @@ int scsi_check_sense(struct scsi_cmnd *scmd) >> return NEEDS_RETRY; >> } >> } >> + if (scmd->device->expecting_media_change) { >> + if (sshdr.asc == 0x28 && sshdr.ascq == 0x00) { >> + /* >> + * clear the expecting_media_change in >> + * scsi_decide_disposition() because we >> + * need to catch possible "fail fast" overrides >> + * that block readahead can cause. >> + */ >> + return NEEDS_RETRY; >> + } >> + } >> + >> /* >> * we might also expect a cc/ua if another LUN on the target >> * reported a UA with an ASC/ASCQ of 3F 0E - >> @@ -1944,9 +1956,19 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd) >> * the request was not marked fast fail. Note that above, >> * even if the request is marked fast fail, we still requeue >> * for queue congestion conditions (QUEUE_FULL or BUSY) */ >> - if ((++scmd->retries) <= scmd->allowed >> - && !scsi_noretry_cmd(scmd)) { >> - return NEEDS_RETRY; >> + if ((++scmd->retries) <= scmd->allowed) { >> + /* >> + * but scsi_noretry_cmd() cannot override the >> + * expecting_media_change flag. >> + */ >> + if (!scsi_noretry_cmd(scmd) || >> + scmd->device->expecting_media_change) { >> + scmd->device->expecting_media_change = 0; >> + return NEEDS_RETRY; >> + } else { >> + /* marked fast fail and not expected. */ >> + return SUCCESS; >> + } >> } else { > > This may not matter... but it's worth pointing out that > expecting_media_change doesn't get cleared if ++scmd->retries > > scmd->allowed. absolutely worth pointing out and I'm not yet sure about that one. > > It also doesn't get cleared in cases where the device _doesn't_ > report a Unit Attention. true. but don't we set the flag for a future UA we don't yet know of? If we would want to clear it outside of a UA, I think we'd need to keep track of a suspend/resume cycle and if we see that we *had* successfully "done requests" after resuming, we could clear it... > > Alan Stern >
On Tue, Aug 11, 2020 at 09:55:54AM +0200, Martin Kepplinger wrote: > On 10.08.20 16:13, Alan Stern wrote: > > This may not matter... but it's worth pointing out that > > expecting_media_change doesn't get cleared if ++scmd->retries > > > scmd->allowed. > > absolutely worth pointing out and I'm not yet sure about that one. > > > > > It also doesn't get cleared in cases where the device _doesn't_ > > report a Unit Attention. > > true. but don't we set the flag for a future UA we don't yet know of? If > we would want to clear it outside of a UA, I think we'd need to keep > track of a suspend/resume cycle and if we see that we *had* successfully > "done requests" after resuming, we could clear it... The point is that expecting_media_change should apply only to the _next_ command. It should be cleared after _every_ command. You do not want to leave it hanging around -- if you do then you will miss real media changes. There's a potential issue when a bunch of commands get sent and completed all at once, but hopefully it won't matter here. Alan Stern
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index d90fefffe31b..fe4cb7c50ec1 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -1372,6 +1372,7 @@ static int sd_open(struct block_device *bdev, fmode_t mode) SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_open\n")); sdev = sdkp->device; + scsi_autopm_get_device(sdev); /* * If the device is in error recovery, wait until it is done. @@ -1418,6 +1419,9 @@ static int sd_open(struct block_device *bdev, fmode_t mode) error_out: scsi_disk_put(sdkp); + + scsi_autopm_put_device(sdev); + return retval; } @@ -1441,6 +1445,8 @@ static void sd_release(struct gendisk *disk, fmode_t mode) SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_release\n")); + scsi_autopm_put_device(sdev); + if (atomic_dec_return(&sdkp->openers) == 0 && sdev->removable) { if (scsi_block_when_processing_errors(sdev)) scsi_set_medium_removal(sdev, SCSI_REMOVAL_ALLOW);
This add a very conservative but simple implementation for runtime PM to the sd scsi driver: Resume when opened (mounted) and suspend when released (unmounted). Improvements that allow suspending while a device is "open" can be added later, but now we save power when no filesystem is mounted and runtime PM is enabled. Signed-off-by: Martin Kepplinger <martin.kepplinger@puri.sm> --- drivers/scsi/sd.c | 6 ++++++ 1 file changed, 6 insertions(+)