Message ID | 20230330164943.11607-1-thenzl@redhat.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | scsi: sd: mark the scsi device in shutdown as deleted | expand |
On 3/30/23 09:49, Tomas Henzl wrote: > Set the state to deleted in sd_shutdown so that the attached LLD > doesn't receive new I/O (can happen when in kexec) later after > LLD's shutdown function has been called. > > Signed-off-by: Tomas Henzl <thenzl@redhat.com> > --- > drivers/scsi/sd.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index 4f28dd617eca..8095f0302e66 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -3694,10 +3694,13 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start) > static void sd_shutdown(struct device *dev) > { > struct scsi_disk *sdkp = dev_get_drvdata(dev); > + struct scsi_device *sdp; > > if (!sdkp) > return; /* this can happen */ > > + sdp = sdkp->device; > + > if (pm_runtime_suspended(dev)) > return; > > @@ -3710,6 +3713,10 @@ static void sd_shutdown(struct device *dev) > sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); > sd_start_stop_device(sdkp, 0); > } > + > + mutex_lock(&sdp->state_mutex); > + scsi_device_set_state(sdp, SDEV_DEL); > + mutex_unlock(&sdp->state_mutex); > } Shouldn't new I/O to the SCSI disk be prevented whether or not the SCSI disk has been runtime suspended? Thanks, Bart.
On 3/30/23 11:49 AM, Tomas Henzl wrote: > Set the state to deleted in sd_shutdown so that the attached LLD > doesn't receive new I/O (can happen when in kexec) later after > LLD's shutdown function has been called. > > Signed-off-by: Tomas Henzl <thenzl@redhat.com> > --- > drivers/scsi/sd.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index 4f28dd617eca..8095f0302e66 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -3694,10 +3694,13 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start) > static void sd_shutdown(struct device *dev) > { > struct scsi_disk *sdkp = dev_get_drvdata(dev); > + struct scsi_device *sdp; > > if (!sdkp) > return; /* this can happen */ > > + sdp = sdkp->device; > + > if (pm_runtime_suspended(dev)) > return; > > @@ -3710,6 +3713,10 @@ static void sd_shutdown(struct device *dev) > sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); > sd_start_stop_device(sdkp, 0); > } > + > + mutex_lock(&sdp->state_mutex); > + scsi_device_set_state(sdp, SDEV_DEL); > + mutex_unlock(&sdp->state_mutex); > } If this is run for device removal what state will be in here? Are we going to do: 1. __scsi_remove_device sets the state to SDEV_CANCEL at the beginning of the function 2. device_unregister causes sd_remove to be called and sd_shutdown sets the state to SDEV_DEL 3. then __scsi_remove_device sets the state to SDEV_DEL at the bottom, so we get "Illegal state transition" errors printed.
On 3/30/23 19:08, Bart Van Assche wrote: > On 3/30/23 09:49, Tomas Henzl wrote: >> Set the state to deleted in sd_shutdown so that the attached LLD >> doesn't receive new I/O (can happen when in kexec) later after >> LLD's shutdown function has been called. >> >> Signed-off-by: Tomas Henzl <thenzl@redhat.com> >> --- >> drivers/scsi/sd.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >> index 4f28dd617eca..8095f0302e66 100644 >> --- a/drivers/scsi/sd.c >> +++ b/drivers/scsi/sd.c >> @@ -3694,10 +3694,13 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start) >> static void sd_shutdown(struct device *dev) >> { >> struct scsi_disk *sdkp = dev_get_drvdata(dev); >> + struct scsi_device *sdp; >> >> if (!sdkp) >> return; /* this can happen */ >> >> + sdp = sdkp->device; >> + >> if (pm_runtime_suspended(dev)) >> return; >> >> @@ -3710,6 +3713,10 @@ static void sd_shutdown(struct device *dev) >> sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); >> sd_start_stop_device(sdkp, 0); >> } >> + >> + mutex_lock(&sdp->state_mutex); >> + scsi_device_set_state(sdp, SDEV_DEL); >> + mutex_unlock(&sdp->state_mutex); >> } > > Shouldn't new I/O to the SCSI disk be prevented whether or not the SCSI > disk has been runtime suspended? Thanks, I'll fix that. Tomas > > Thanks, > > Bart. >
On 3/30/23 19:12, Mike Christie wrote: > On 3/30/23 11:49 AM, Tomas Henzl wrote: >> Set the state to deleted in sd_shutdown so that the attached LLD >> doesn't receive new I/O (can happen when in kexec) later after >> LLD's shutdown function has been called. >> >> Signed-off-by: Tomas Henzl <thenzl@redhat.com> >> --- >> drivers/scsi/sd.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >> index 4f28dd617eca..8095f0302e66 100644 >> --- a/drivers/scsi/sd.c >> +++ b/drivers/scsi/sd.c >> @@ -3694,10 +3694,13 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start) >> static void sd_shutdown(struct device *dev) >> { >> struct scsi_disk *sdkp = dev_get_drvdata(dev); >> + struct scsi_device *sdp; >> >> if (!sdkp) >> return; /* this can happen */ >> >> + sdp = sdkp->device; >> + >> if (pm_runtime_suspended(dev)) >> return; >> >> @@ -3710,6 +3713,10 @@ static void sd_shutdown(struct device *dev) >> sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); >> sd_start_stop_device(sdkp, 0); >> } >> + >> + mutex_lock(&sdp->state_mutex); >> + scsi_device_set_state(sdp, SDEV_DEL); >> + mutex_unlock(&sdp->state_mutex); >> } > > If this is run for device removal what state will be in here? > > Are we going to do: > 1. __scsi_remove_device sets the state to SDEV_CANCEL at the beginning > of the function > 2. device_unregister causes sd_remove to be called and sd_shutdown sets > the state to SDEV_DEL > 3. then ide sets the state to SDEV_DEL at the bottom, > so we get "Illegal state transition" errors printed. > Thanks for looking. A state change from SDEV_DEL to SDEV_DEL isn't illegal (state == oldstate) or am I wrong?
On Thu, 2023-03-30 at 12:12 -0500, Mike Christie wrote: > On 3/30/23 11:49 AM, Tomas Henzl wrote: > > Set the state to deleted in sd_shutdown so that the attached LLD > > doesn't receive new I/O (can happen when in kexec) later after > > LLD's shutdown function has been called. > > > > Signed-off-by: Tomas Henzl <thenzl@redhat.com> > > --- > > drivers/scsi/sd.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > > index 4f28dd617eca..8095f0302e66 100644 > > --- a/drivers/scsi/sd.c > > +++ b/drivers/scsi/sd.c > > @@ -3694,10 +3694,13 @@ static int sd_start_stop_device(struct > > scsi_disk *sdkp, int start) > > static void sd_shutdown(struct device *dev) > > { > > struct scsi_disk *sdkp = dev_get_drvdata(dev); > > + struct scsi_device *sdp; > > > > if (!sdkp) > > return; /* this can happen */ > > > > + sdp = sdkp->device; > > + > > if (pm_runtime_suspended(dev)) > > return; > > > > @@ -3710,6 +3713,10 @@ static void sd_shutdown(struct device *dev) > > sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); > > sd_start_stop_device(sdkp, 0); > > } > > + > > + mutex_lock(&sdp->state_mutex); > > + scsi_device_set_state(sdp, SDEV_DEL); > > + mutex_unlock(&sdp->state_mutex); > > } > > If this is run for device removal what state will be in here? > > Are we going to do: > 1. __scsi_remove_device sets the state to SDEV_CANCEL at the > beginning of the function It will also interfere with target and host device removal. They traverse their own lists and assume that anything in DEL is already being removed, which won't be the case here. So basically, after this happens it's impossible to clean the device trees. It also means any I/O to the root device wouldn't be allowed. I assume the contention is that if we get here, we're either going for immediate shutdown or all the root device remounting to read only has already been done? If so, could you say that? James
On 3/31/23 13:48, James Bottomley wrote: > On Thu, 2023-03-30 at 12:12 -0500, Mike Christie wrote: >> On 3/30/23 11:49 AM, Tomas Henzl wrote: >>> Set the state to deleted in sd_shutdown so that the attached LLD >>> doesn't receive new I/O (can happen when in kexec) later after >>> LLD's shutdown function has been called. >>> >>> Signed-off-by: Tomas Henzl <thenzl@redhat.com> >>> --- >>> drivers/scsi/sd.c | 7 +++++++ >>> 1 file changed, 7 insertions(+) >>> >>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c >>> index 4f28dd617eca..8095f0302e66 100644 >>> --- a/drivers/scsi/sd.c >>> +++ b/drivers/scsi/sd.c >>> @@ -3694,10 +3694,13 @@ static int sd_start_stop_device(struct >>> scsi_disk *sdkp, int start) >>> static void sd_shutdown(struct device *dev) >>> { >>> struct scsi_disk *sdkp = dev_get_drvdata(dev); >>> + struct scsi_device *sdp; >>> >>> if (!sdkp) >>> return; /* this can happen */ >>> >>> + sdp = sdkp->device; >>> + >>> if (pm_runtime_suspended(dev)) >>> return; >>> >>> @@ -3710,6 +3713,10 @@ static void sd_shutdown(struct device *dev) >>> sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); >>> sd_start_stop_device(sdkp, 0); >>> } >>> + >>> + mutex_lock(&sdp->state_mutex); >>> + scsi_device_set_state(sdp, SDEV_DEL); >>> + mutex_unlock(&sdp->state_mutex); >>> } >> >> If this is run for device removal what state will be in here? >> >> Are we going to do: >> 1. __scsi_remove_device sets the state to SDEV_CANCEL at the >> beginning of the function > > It will also interfere with target and host device removal. They > traverse their own lists and assume that anything in DEL is already > being removed, which won't be the case here. So basically, after this > happens it's impossible to clean the device trees. It also means any > I/O to the root device wouldn't be allowed. How will it interfere? After a return from sd_remove or via device_unregister->__scsi_remove_device the device state is SDEV_DEL regardless whether this patch has been added or not. Or is sd_shutdown called directly? > I assume the contention is that if we get here, we're either going for > immediate shutdown or all the root device remounting to read only has > already been done? If so, could you say that? I can't say that, quite the opposite (see body of the mail). When the system goes shutdown the individual device's .shutdown is called. Just moments after sd shutdown the LLD shutdown is entered and the driver stops any I/O immediately anyway. With this patch the I/O is stopped before reaching LLD with a reasonable message and without error correction mechanism in place. I also assume that no I/O after sd_shutdown was projected when it was written as there is a cache sync followed by a device power down. Tomas > > James >
On Fri, 2023-03-31 at 16:11 +0200, Tomas Henzl wrote: > On 3/31/23 13:48, James Bottomley wrote: > > On Thu, 2023-03-30 at 12:12 -0500, Mike Christie wrote: [...] > > > Are we going to do: > > > 1. __scsi_remove_device sets the state to SDEV_CANCEL at the > > > beginning of the function > > > > It will also interfere with target and host device removal. They > > traverse their own lists and assume that anything in DEL is already > > being removed, which won't be the case here. So basically, after > > this happens it's impossible to clean the device trees. It also > > means any I/O to the root device wouldn't be allowed. > > How will it interfere? After a return from sd_remove or via > device_unregister->__scsi_remove_device the device state is SDEV_DEL > regardless whether this patch has been added or not. Or is > sd_shutdown called directly? I thought it was called directly from the restart logic via the device_shutdown() call (see kernel/reboot.c:kernel_restart_prepare()) and was completely independent of any other state transitions within the device model ... however, I'd really like *you* to confirm this. I think by the time it's called, we're already in the system death throes, so if there were going to be an orderly shut down it's already happened, but if not, once you set a devices state to DEL, it will defeat any later attempt to do orderly remove of the host or target. > > I assume the contention is that if we get here, we're either going > > for immediate shutdown or all the root device remounting to read > > only has already been done? If so, could you say that? > > I can't say that, quite the opposite (see body of the mail). When the > system goes shutdown the individual device's .shutdown is called. > Just moments after sd shutdown the LLD shutdown is entered and the > driver stops any I/O immediately anyway. With this patch the I/O is > stopped before reaching LLD with a reasonable message and without > error correction mechanism in place. > > I also assume that no I/O after sd_shutdown was projected when it was > written as there is a cache sync followed by a device power down. It seems reasonable, but can you validate that? Shutdown is called both from reboot and kexec and if we stop IO to a quiescing root device before it completes, so that all filesystems come back dirty, we'll have a lot of unhappy users ... James
On 3/31/23 17:06, James Bottomley wrote: > On Fri, 2023-03-31 at 16:11 +0200, Tomas Henzl wrote: >> On 3/31/23 13:48, James Bottomley wrote: >>> On Thu, 2023-03-30 at 12:12 -0500, Mike Christie wrote: > [...] >>>> Are we going to do: >>>> 1. __scsi_remove_device sets the state to SDEV_CANCEL at the >>>> beginning of the function >>> >>> It will also interfere with target and host device removal. They >>> traverse their own lists and assume that anything in DEL is already >>> being removed, which won't be the case here. So basically, after >>> this happens it's impossible to clean the device trees. It also >>> means any I/O to the root device wouldn't be allowed. >> >> How will it interfere? After a return from sd_remove or via >> device_unregister->__scsi_remove_device the device state is SDEV_DEL >> regardless whether this patch has been added or not. Or is >> sd_shutdown called directly? > > I thought it was called directly from the restart logic via the > device_shutdown() call (see kernel/reboot.c:kernel_restart_prepare()) > and was completely independent of any other state transitions within > the device model ... however, I'd really like *you* to confirm this. > > I think by the time it's called, we're already in the system death > throes, so if there were going to be an orderly shut down it's already > happened, but if not, once you set a devices state to DEL, it will > defeat any later attempt to do orderly remove of the host or target. After the first device's shutdown method in device_shutdown has been called there is now way back (we don't have an opposite to shutdown like for suspend resume is) so it can't be undone and also it doesn't matter if some structures remain in memory. To me it seems to be the only logical consequence but I can't prove it. >>> I assume the contention is that if we get here, we're either going >>> for immediate shutdown or all the root device remounting to read >>> only has already been done? If so, could you say that? >> >> I can't say that, quite the opposite (see body of the mail). When the >> system goes shutdown the individual device's .shutdown is called. >> Just moments after sd shutdown the LLD shutdown is entered and the >> driver stops any I/O immediately anyway. With this patch the I/O is >> stopped before reaching LLD with a reasonable message and without >> error correction mechanism in place. >> >> I also assume that no I/O after sd_shutdown was projected when it was >> written as there is a cache sync followed by a device power down. > > It seems reasonable, but can you validate that? Shutdown is called > both from reboot and kexec and if we stop IO to a quiescing root device > before it completes, so that all filesystems come back dirty, we'll > have a lot of unhappy users ... Validate, I don't know how or what you mean. Making sure that everything is in a state when the kernel shutdown procedure may be entered is for example on my system done by userspace systemd-shutdown service like so :"systemd-shutdown[1]: Syncing filesystems and block devices ..." When the kernel shutdown procedure is entered the I/O will be cut off abruptly by a LLD.shutdown. That means it depends on scripts like systemd and that works well on all systems I checked for reboot but not for kexec on everywhere. The patch may stop the I/O few msec sooner that is true. But the clearer messages may bring a script writer to fix it. The patch doesn't fix a real bug so it isn't urgent nor important, seeing the congestion it creates please just drop it. Tomas > > James >
On 3/31/23 11:32, Tomas Henzl wrote: > The patch doesn't fix a real bug so it isn't urgent nor important, > seeing the congestion it creates please just drop it. Hi Tomas, I'm interested in failing future I/O from inside sd_shutdown() because it would allow me to remove the I/O quiescing code from the UFS driver shutdown callback. Thanks, Bart.
On 3/31/23 20:48, Bart Van Assche wrote: > On 3/31/23 11:32, Tomas Henzl wrote: >> The patch doesn't fix a real bug so it isn't urgent nor important, >> seeing the congestion it creates please just drop it. > > Hi Tomas, > > I'm interested in failing future I/O from inside sd_shutdown() because > it would allow me to remove the I/O quiescing code from the UFS driver > shutdown callback. I'm aware of this, other drivers do have similar code and so it would help elsewhere as well. The patch as it is doesn't however ensure that there isn't for example an I/O started before sd.shutdown which may arrive in a driver after his shutdown has been called. Because of that I haven't used this as an argument in the discussion here. Regards, Tomas > > Thanks, > > Bart. >
On 4/2/23 14:49, Tomas Henzl wrote: > On 3/31/23 20:48, Bart Van Assche wrote: >> I'm interested in failing future I/O from inside sd_shutdown() because >> it would allow me to remove the I/O quiescing code from the UFS driver >> shutdown callback. > I'm aware of this, other drivers do have similar code and so it would > help elsewhere as well. The patch as it is doesn't however ensure that > there isn't for example an I/O started before sd.shutdown which may > arrive in a driver after his shutdown has been called. Because of that I > haven't used this as an argument in the discussion here. Has it been considered to call blk_mq_freeze_queue() and blk_mq_unfreeze_queue() to wait for I/O that started earlier? Has it been considered to set QUEUE_FLAG_DYING to make future SCSI commands fail? See also blk_mq_destroy_queue(). Thanks, Bart.
On 4/3/23 21:56, Bart Van Assche wrote: > On 4/2/23 14:49, Tomas Henzl wrote: >> On 3/31/23 20:48, Bart Van Assche wrote: >>> I'm interested in failing future I/O from inside sd_shutdown() because >>> it would allow me to remove the I/O quiescing code from the UFS driver >>> shutdown callback. >> I'm aware of this, other drivers do have similar code and so it would >> help elsewhere as well. The patch as it is doesn't however ensure that >> there isn't for example an I/O started before sd.shutdown which may >> arrive in a driver after his shutdown has been called. Because of that I >> haven't used this as an argument in the discussion here. > > Has it been considered to call blk_mq_freeze_queue() and > blk_mq_unfreeze_queue() to wait for I/O that started earlier? > > Has it been considered to set QUEUE_FLAG_DYING to make future SCSI > commands fail? See also blk_mq_destroy_queue(). I haven't been considering anything from above. I just had noticed on a system timeouts when in kexec and wanted to have that fixed and for this case it is sufficient when the blocking is only close to 100% (and the patch is simple). I, like you, see the benefit of moving a bit of functionality from LLD to scsi layer but my patch wasn't accepted and I doubt that something more complex could be. > > Thanks, > > Bart. >
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 4f28dd617eca..8095f0302e66 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3694,10 +3694,13 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start) static void sd_shutdown(struct device *dev) { struct scsi_disk *sdkp = dev_get_drvdata(dev); + struct scsi_device *sdp; if (!sdkp) return; /* this can happen */ + sdp = sdkp->device; + if (pm_runtime_suspended(dev)) return; @@ -3710,6 +3713,10 @@ static void sd_shutdown(struct device *dev) sd_printk(KERN_NOTICE, sdkp, "Stopping disk\n"); sd_start_stop_device(sdkp, 0); } + + mutex_lock(&sdp->state_mutex); + scsi_device_set_state(sdp, SDEV_DEL); + mutex_unlock(&sdp->state_mutex); } static int sd_suspend_common(struct device *dev, bool ignore_stop_errors)
Set the state to deleted in sd_shutdown so that the attached LLD doesn't receive new I/O (can happen when in kexec) later after LLD's shutdown function has been called. Signed-off-by: Tomas Henzl <thenzl@redhat.com> --- drivers/scsi/sd.c | 7 +++++++ 1 file changed, 7 insertions(+)