Message ID | 20240605091731.3111195-2-haowenchao22@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | SCSI: Fix issues between removing device and error handle | expand |
On 6/5/24 11:17, Wenchao Hao wrote: > shost_for_each_device() would skip devices which is in SDEV_CANCEL or > SDEV_DEL state, for some scenarios, we donot want to skip these devices, > so add a new macro shost_for_each_device_include_deleted() to handle it. > > Following changes are introduced: > > 1. Rework scsi_device_get(), add new helper __scsi_device_get() which > determine if skip deleted scsi_device by parameter "skip_deleted". > 2. Add new parameter "skip_deleted" to __scsi_iterate_devices() which > is used when calling __scsi_device_get() > 3. Update shost_for_each_device() to call __scsi_iterate_devices() with > "skip_deleted" true > 4. Add new macro shost_for_each_device_include_deleted() which call > __scsi_iterate_devices() with "skip_deleted" false > > Signed-off-by: Wenchao Hao <haowenchao22@gmail.com> > --- > drivers/scsi/scsi.c | 46 ++++++++++++++++++++++++++------------ > include/scsi/scsi_device.h | 25 ++++++++++++++++++--- > 2 files changed, 54 insertions(+), 17 deletions(-) > > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c > index 3e0c0381277a..5913de543d93 100644 > --- a/drivers/scsi/scsi.c > +++ b/drivers/scsi/scsi.c > @@ -735,20 +735,18 @@ int scsi_cdl_enable(struct scsi_device *sdev, bool enable) > return 0; > } > > -/** > - * scsi_device_get - get an additional reference to a scsi_device > +/* > + * __scsi_device_get - get an additional reference to a scsi_device > * @sdev: device to get a reference to > - * > - * Description: Gets a reference to the scsi_device and increments the use count > - * of the underlying LLDD module. You must hold host_lock of the > - * parent Scsi_Host or already have a reference when calling this. > - * > - * This will fail if a device is deleted or cancelled, or when the LLD module > - * is in the process of being unloaded. > + * @skip_deleted: when true, would return failed if device is deleted > */ > -int scsi_device_get(struct scsi_device *sdev) > +static int __scsi_device_get(struct scsi_device *sdev, bool skip_deleted) > { > - if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL) > + /* > + * if skip_deleted is true and device is in removing, return failed > + */ > + if (skip_deleted && > + (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)) > goto fail; Nack. SDEV_DEL means the device is about to be deleted, so we _must not_ access it at all. Cheers, Hannes
On 6/12/24 4:33 PM, Hannes Reinecke wrote: > On 6/5/24 11:17, Wenchao Hao wrote: >> shost_for_each_device() would skip devices which is in SDEV_CANCEL or >> SDEV_DEL state, for some scenarios, we donot want to skip these devices, >> so add a new macro shost_for_each_device_include_deleted() to handle it. >> >> Following changes are introduced: >> >> 1. Rework scsi_device_get(), add new helper __scsi_device_get() which >> determine if skip deleted scsi_device by parameter "skip_deleted". >> 2. Add new parameter "skip_deleted" to __scsi_iterate_devices() which >> is used when calling __scsi_device_get() >> 3. Update shost_for_each_device() to call __scsi_iterate_devices() with >> "skip_deleted" true >> 4. Add new macro shost_for_each_device_include_deleted() which call >> __scsi_iterate_devices() with "skip_deleted" false >> >> Signed-off-by: Wenchao Hao <haowenchao22@gmail.com> >> --- >> drivers/scsi/scsi.c | 46 ++++++++++++++++++++++++++------------ >> include/scsi/scsi_device.h | 25 ++++++++++++++++++--- >> 2 files changed, 54 insertions(+), 17 deletions(-) >> >> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c >> index 3e0c0381277a..5913de543d93 100644 >> --- a/drivers/scsi/scsi.c >> +++ b/drivers/scsi/scsi.c >> @@ -735,20 +735,18 @@ int scsi_cdl_enable(struct scsi_device *sdev, bool enable) >> return 0; >> } >> -/** >> - * scsi_device_get - get an additional reference to a scsi_device >> +/* >> + * __scsi_device_get - get an additional reference to a scsi_device >> * @sdev: device to get a reference to >> - * >> - * Description: Gets a reference to the scsi_device and increments the use count >> - * of the underlying LLDD module. You must hold host_lock of the >> - * parent Scsi_Host or already have a reference when calling this. >> - * >> - * This will fail if a device is deleted or cancelled, or when the LLD module >> - * is in the process of being unloaded. >> + * @skip_deleted: when true, would return failed if device is deleted >> */ >> -int scsi_device_get(struct scsi_device *sdev) >> +static int __scsi_device_get(struct scsi_device *sdev, bool skip_deleted) >> { >> - if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL) >> + /* >> + * if skip_deleted is true and device is in removing, return failed >> + */ >> + if (skip_deleted && >> + (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)) >> goto fail; > > Nack. > SDEV_DEL means the device is about to be deleted, so we _must not_ access it at all. > Sorry I added SDEV_DEL here at hand without understanding what it means. Actually, just include scsi_device which is in SDEV_CANCEL would fix the issues I described. The issues are because device removing concurrent with error handle. Normally, error handle would not be triggered when scsi_device is in SDEV_DEL. Below is my analysis, if it is wrong, please correct me. If there are scsi_cmnd remain unfinished when removing scsi_device, the removing process would waiting for all commands to be finished. If commands error happened and trigger error handle, the removing process would be blocked until error handle finished, because __scsi_remove_device called del_gendisk() which would wait all requests to be finished. So now scsi_device is in SDEV_CANCEL. If the scsi_device is already in SDEV_DEL, then no scsi_cmnd has been dispatched to this scsi_device, then error handle would never triggered. I want to change the new function __scsi_device_get() as following, please help to review. /* * __scsi_device_get - get an additional reference to a scsi_device * @sdev: device to get a reference to * @skip_canceled: when true, would return failed if device is deleted */ static int __scsi_device_get(struct scsi_device *sdev, bool skip_canceled) { /* * if skip_canceled is true and device is in removing, return failed */ if (sdev->sdev_state == SDEV_DEL || (sdev->sdev_state == SDEV_CANCEL && skip_canceled)) goto fail; if (!try_module_get(sdev->host->hostt->module)) goto fail; if (!get_device(&sdev->sdev_gendev)) goto fail_put_module; return 0; fail_put_module: module_put(sdev->host->hostt->module); fail: return -ENXIO; } > Cheers, > > Hannes
On 6/12/24 17:06, Wenchao Hao wrote: > On 6/12/24 4:33 PM, Hannes Reinecke wrote: >> On 6/5/24 11:17, Wenchao Hao wrote: >>> shost_for_each_device() would skip devices which is in SDEV_CANCEL or >>> SDEV_DEL state, for some scenarios, we donot want to skip these devices, >>> so add a new macro shost_for_each_device_include_deleted() to handle it. >>> >>> Following changes are introduced: >>> >>> 1. Rework scsi_device_get(), add new helper __scsi_device_get() which >>> determine if skip deleted scsi_device by parameter "skip_deleted". >>> 2. Add new parameter "skip_deleted" to __scsi_iterate_devices() which >>> is used when calling __scsi_device_get() >>> 3. Update shost_for_each_device() to call __scsi_iterate_devices() with >>> "skip_deleted" true >>> 4. Add new macro shost_for_each_device_include_deleted() which call >>> __scsi_iterate_devices() with "skip_deleted" false >>> >>> Signed-off-by: Wenchao Hao <haowenchao22@gmail.com> >>> --- >>> drivers/scsi/scsi.c | 46 ++++++++++++++++++++++++++------------ >>> include/scsi/scsi_device.h | 25 ++++++++++++++++++--- >>> 2 files changed, 54 insertions(+), 17 deletions(-) >>> >>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c >>> index 3e0c0381277a..5913de543d93 100644 >>> --- a/drivers/scsi/scsi.c >>> +++ b/drivers/scsi/scsi.c >>> @@ -735,20 +735,18 @@ int scsi_cdl_enable(struct scsi_device *sdev, bool enable) >>> return 0; >>> } >>> -/** >>> - * scsi_device_get - get an additional reference to a scsi_device >>> +/* >>> + * __scsi_device_get - get an additional reference to a scsi_device >>> * @sdev: device to get a reference to >>> - * >>> - * Description: Gets a reference to the scsi_device and increments the use count >>> - * of the underlying LLDD module. You must hold host_lock of the >>> - * parent Scsi_Host or already have a reference when calling this. >>> - * >>> - * This will fail if a device is deleted or cancelled, or when the LLD module >>> - * is in the process of being unloaded. >>> + * @skip_deleted: when true, would return failed if device is deleted >>> */ >>> -int scsi_device_get(struct scsi_device *sdev) >>> +static int __scsi_device_get(struct scsi_device *sdev, bool skip_deleted) >>> { >>> - if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL) >>> + /* >>> + * if skip_deleted is true and device is in removing, return failed >>> + */ >>> + if (skip_deleted && >>> + (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)) >>> goto fail; >> >> Nack. >> SDEV_DEL means the device is about to be deleted, so we _must not_ access it at all. >> > > Sorry I added SDEV_DEL here at hand without understanding what it means. > Actually, just include scsi_device which is in SDEV_CANCEL would fix the > issues I described. > > The issues are because device removing concurrent with error handle. > Normally, error handle would not be triggered when scsi_device is in > SDEV_DEL. Below is my analysis, if it is wrong, please correct me. > > If there are scsi_cmnd remain unfinished when removing scsi_device, > the removing process would waiting for all commands to be finished. > If commands error happened and trigger error handle, the removing > process would be blocked until error handle finished, because > __scsi_remove_device called del_gendisk() which would wait all > requests to be finished. So now scsi_device is in SDEV_CANCEL. > > If the scsi_device is already in SDEV_DEL, then no scsi_cmnd has been > dispatched to this scsi_device, then error handle would never triggered. > > I want to change the new function __scsi_device_get() as following, > please help to review. > > /* > * __scsi_device_get - get an additional reference to a scsi_device > * @sdev: device to get a reference to > * @skip_canceled: when true, would return failed if device is deleted > */ > static int __scsi_device_get(struct scsi_device *sdev, bool skip_canceled) > { > /* > * if skip_canceled is true and device is in removing, return failed > */ > if (sdev->sdev_state == SDEV_DEL || > (sdev->sdev_state == SDEV_CANCEL && skip_canceled)) > goto fail; > if (!try_module_get(sdev->host->hostt->module)) > goto fail; > if (!get_device(&sdev->sdev_gendev)) > goto fail_put_module; > return 0; > > fail_put_module: > module_put(sdev->host->hostt->module); > fail: > return -ENXIO; > } > I don't think that's required. With your above analysis, wouldn't the problem be solved with: diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 775df00021e4..911fcfa80d69 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -1470,6 +1470,8 @@ void __scsi_remove_device(struct scsi_device *sdev) if (sdev->sdev_state == SDEV_DEL) return; + scsi_block_when_processing_errors(sdev); + if (sdev->is_visible) { /* * If scsi_internal_target_block() is running concurrently, Hmm? Cheers, Hannes
On 2024/6/13 14:27, Hannes Reinecke wrote: > On 6/12/24 17:06, Wenchao Hao wrote: >> On 6/12/24 4:33 PM, Hannes Reinecke wrote: >>> On 6/5/24 11:17, Wenchao Hao wrote: >>>> shost_for_each_device() would skip devices which is in SDEV_CANCEL or >>>> SDEV_DEL state, for some scenarios, we donot want to skip these >>>> devices, >>>> so add a new macro shost_for_each_device_include_deleted() to handle >>>> it. >>>> >>>> Following changes are introduced: >>>> >>>> 1. Rework scsi_device_get(), add new helper __scsi_device_get() which >>>> determine if skip deleted scsi_device by parameter "skip_deleted". >>>> 2. Add new parameter "skip_deleted" to __scsi_iterate_devices() which >>>> is used when calling __scsi_device_get() >>>> 3. Update shost_for_each_device() to call __scsi_iterate_devices() with >>>> "skip_deleted" true >>>> 4. Add new macro shost_for_each_device_include_deleted() which call >>>> __scsi_iterate_devices() with "skip_deleted" false >>>> >>>> Signed-off-by: Wenchao Hao <haowenchao22@gmail.com> >>>> --- >>>> drivers/scsi/scsi.c | 46 >>>> ++++++++++++++++++++++++++------------ >>>> include/scsi/scsi_device.h | 25 ++++++++++++++++++--- >>>> 2 files changed, 54 insertions(+), 17 deletions(-) >>>> >>>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c >>>> index 3e0c0381277a..5913de543d93 100644 >>>> --- a/drivers/scsi/scsi.c >>>> +++ b/drivers/scsi/scsi.c >>>> @@ -735,20 +735,18 @@ int scsi_cdl_enable(struct scsi_device *sdev, >>>> bool enable) >>>> return 0; >>>> } >>>> -/** >>>> - * scsi_device_get - get an additional reference to a scsi_device >>>> +/* >>>> + * __scsi_device_get - get an additional reference to a scsi_device >>>> * @sdev: device to get a reference to >>>> - * >>>> - * Description: Gets a reference to the scsi_device and increments >>>> the use count >>>> - * of the underlying LLDD module. You must hold host_lock of the >>>> - * parent Scsi_Host or already have a reference when calling this. >>>> - * >>>> - * This will fail if a device is deleted or cancelled, or when the >>>> LLD module >>>> - * is in the process of being unloaded. >>>> + * @skip_deleted: when true, would return failed if device is deleted >>>> */ >>>> -int scsi_device_get(struct scsi_device *sdev) >>>> +static int __scsi_device_get(struct scsi_device *sdev, bool >>>> skip_deleted) >>>> { >>>> - if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == >>>> SDEV_CANCEL) >>>> + /* >>>> + * if skip_deleted is true and device is in removing, return >>>> failed >>>> + */ >>>> + if (skip_deleted && >>>> + (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == >>>> SDEV_CANCEL)) >>>> goto fail; >>> >>> Nack. >>> SDEV_DEL means the device is about to be deleted, so we _must not_ >>> access it at all. >>> >> >> Sorry I added SDEV_DEL here at hand without understanding what it means. >> Actually, just include scsi_device which is in SDEV_CANCEL would fix the >> issues I described. >> >> The issues are because device removing concurrent with error handle. >> Normally, error handle would not be triggered when scsi_device is in >> SDEV_DEL. Below is my analysis, if it is wrong, please correct me. >> >> If there are scsi_cmnd remain unfinished when removing scsi_device, >> the removing process would waiting for all commands to be finished. >> If commands error happened and trigger error handle, the removing >> process would be blocked until error handle finished, because >> __scsi_remove_device called del_gendisk() which would wait all >> requests to be finished. So now scsi_device is in SDEV_CANCEL. >> >> If the scsi_device is already in SDEV_DEL, then no scsi_cmnd has been >> dispatched to this scsi_device, then error handle would never triggered. >> >> I want to change the new function __scsi_device_get() as following, >> please help to review. >> >> /* >> * __scsi_device_get - get an additional reference to a scsi_device >> * @sdev: device to get a reference to >> * @skip_canceled: when true, would return failed if device is deleted >> */ >> static int __scsi_device_get(struct scsi_device *sdev, bool >> skip_canceled) >> { >> /* >> * if skip_canceled is true and device is in removing, return failed >> */ >> if (sdev->sdev_state == SDEV_DEL || >> (sdev->sdev_state == SDEV_CANCEL && skip_canceled)) >> goto fail; >> if (!try_module_get(sdev->host->hostt->module)) >> goto fail; >> if (!get_device(&sdev->sdev_gendev)) >> goto fail_put_module; >> return 0; >> >> fail_put_module: >> module_put(sdev->host->hostt->module); >> fail: >> return -ENXIO; >> } >> > I don't think that's required. > With your above analysis, wouldn't the problem be solved with: > > diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c > index 775df00021e4..911fcfa80d69 100644 > --- a/drivers/scsi/scsi_sysfs.c > +++ b/drivers/scsi/scsi_sysfs.c > @@ -1470,6 +1470,8 @@ void __scsi_remove_device(struct scsi_device *sdev) > if (sdev->sdev_state == SDEV_DEL) > return; > > + scsi_block_when_processing_errors(sdev); > + > if (sdev->is_visible) { > /* > * If scsi_internal_target_block() is running > concurrently, > > Hmm? > We can not make sure no scsi_cmnd remain unfinished after scsi_block_when_processing_errors(). For example, there is a command has been dispatched but it's not timeouted when removing device, no error happened. After scsi_device is set to SDEV_CANCEL, the removing process would be blocked by del_gendisk() because there is still a request. Then the request timeout and abort failed, error handle would be triggered, now scsi_device is SDEV_CANCEL. The error handle would skip this scsi_device when doing device reset. > Cheers, > > Hannes
On 2024/6/13 15:10, Wenchao Hao wrote: > On 2024/6/13 14:27, Hannes Reinecke wrote: >> On 6/12/24 17:06, Wenchao Hao wrote: >>> On 6/12/24 4:33 PM, Hannes Reinecke wrote: >>>> On 6/5/24 11:17, Wenchao Hao wrote: >>>>> shost_for_each_device() would skip devices which is in SDEV_CANCEL or >>>>> SDEV_DEL state, for some scenarios, we donot want to skip these devices, >>>>> so add a new macro shost_for_each_device_include_deleted() to handle it. >>>>> >>>>> Following changes are introduced: >>>>> >>>>> 1. Rework scsi_device_get(), add new helper __scsi_device_get() which >>>>> determine if skip deleted scsi_device by parameter "skip_deleted". >>>>> 2. Add new parameter "skip_deleted" to __scsi_iterate_devices() which >>>>> is used when calling __scsi_device_get() >>>>> 3. Update shost_for_each_device() to call __scsi_iterate_devices() with >>>>> "skip_deleted" true >>>>> 4. Add new macro shost_for_each_device_include_deleted() which call >>>>> __scsi_iterate_devices() with "skip_deleted" false >>>>> >>>>> Signed-off-by: Wenchao Hao <haowenchao22@gmail.com> >>>>> --- >>>>> drivers/scsi/scsi.c | 46 ++++++++++++++++++++++++++------------ >>>>> include/scsi/scsi_device.h | 25 ++++++++++++++++++--- >>>>> 2 files changed, 54 insertions(+), 17 deletions(-) >>>>> >>>>> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c >>>>> index 3e0c0381277a..5913de543d93 100644 >>>>> --- a/drivers/scsi/scsi.c >>>>> +++ b/drivers/scsi/scsi.c >>>>> @@ -735,20 +735,18 @@ int scsi_cdl_enable(struct scsi_device *sdev, bool enable) >>>>> return 0; >>>>> } >>>>> -/** >>>>> - * scsi_device_get - get an additional reference to a scsi_device >>>>> +/* >>>>> + * __scsi_device_get - get an additional reference to a scsi_device >>>>> * @sdev: device to get a reference to >>>>> - * >>>>> - * Description: Gets a reference to the scsi_device and increments the use count >>>>> - * of the underlying LLDD module. You must hold host_lock of the >>>>> - * parent Scsi_Host or already have a reference when calling this. >>>>> - * >>>>> - * This will fail if a device is deleted or cancelled, or when the LLD module >>>>> - * is in the process of being unloaded. >>>>> + * @skip_deleted: when true, would return failed if device is deleted >>>>> */ >>>>> -int scsi_device_get(struct scsi_device *sdev) >>>>> +static int __scsi_device_get(struct scsi_device *sdev, bool skip_deleted) >>>>> { >>>>> - if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL) >>>>> + /* >>>>> + * if skip_deleted is true and device is in removing, return failed >>>>> + */ >>>>> + if (skip_deleted && >>>>> + (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)) >>>>> goto fail; >>>> >>>> Nack. >>>> SDEV_DEL means the device is about to be deleted, so we _must not_ access it at all. >>>> >>> >>> Sorry I added SDEV_DEL here at hand without understanding what it means. >>> Actually, just include scsi_device which is in SDEV_CANCEL would fix the >>> issues I described. >>> >>> The issues are because device removing concurrent with error handle. >>> Normally, error handle would not be triggered when scsi_device is in >>> SDEV_DEL. Below is my analysis, if it is wrong, please correct me. >>> >>> If there are scsi_cmnd remain unfinished when removing scsi_device, >>> the removing process would waiting for all commands to be finished. >>> If commands error happened and trigger error handle, the removing >>> process would be blocked until error handle finished, because >>> __scsi_remove_device called del_gendisk() which would wait all >>> requests to be finished. So now scsi_device is in SDEV_CANCEL. >>> >>> If the scsi_device is already in SDEV_DEL, then no scsi_cmnd has been >>> dispatched to this scsi_device, then error handle would never triggered. >>> >>> I want to change the new function __scsi_device_get() as following, >>> please help to review. >>> >>> /* >>> * __scsi_device_get - get an additional reference to a scsi_device >>> * @sdev: device to get a reference to >>> * @skip_canceled: when true, would return failed if device is deleted >>> */ >>> static int __scsi_device_get(struct scsi_device *sdev, bool skip_canceled) >>> { >>> /* >>> * if skip_canceled is true and device is in removing, return failed >>> */ >>> if (sdev->sdev_state == SDEV_DEL || >>> (sdev->sdev_state == SDEV_CANCEL && skip_canceled)) >>> goto fail; >>> if (!try_module_get(sdev->host->hostt->module)) >>> goto fail; >>> if (!get_device(&sdev->sdev_gendev)) >>> goto fail_put_module; >>> return 0; >>> >>> fail_put_module: >>> module_put(sdev->host->hostt->module); >>> fail: >>> return -ENXIO; >>> } >>> >> I don't think that's required. >> With your above analysis, wouldn't the problem be solved with: >> >> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c >> index 775df00021e4..911fcfa80d69 100644 >> --- a/drivers/scsi/scsi_sysfs.c >> +++ b/drivers/scsi/scsi_sysfs.c >> @@ -1470,6 +1470,8 @@ void __scsi_remove_device(struct scsi_device *sdev) >> if (sdev->sdev_state == SDEV_DEL) >> return; >> >> + scsi_block_when_processing_errors(sdev); >> + >> if (sdev->is_visible) { >> /* >> * If scsi_internal_target_block() is running concurrently, >> >> Hmm? >> > > We can not make sure no scsi_cmnd remain unfinished after scsi_block_when_processing_errors(). For example, there is a > command has been dispatched but it's not timeouted when removing > device, no error happened. After scsi_device is set to SDEV_CANCEL, > the removing process would be blocked by del_gendisk() because there > is still a request. > > Then the request timeout and abort failed, error handle would be triggered, now scsi_device is SDEV_CANCEL. > > The error handle would skip this scsi_device when doing device reset. > >> Cheers, >> >> Hannes > Hi Hannes, Would you review these patches? Or do how do you suggest to fix the issues? thanks.
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index 3e0c0381277a..5913de543d93 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -735,20 +735,18 @@ int scsi_cdl_enable(struct scsi_device *sdev, bool enable) return 0; } -/** - * scsi_device_get - get an additional reference to a scsi_device +/* + * __scsi_device_get - get an additional reference to a scsi_device * @sdev: device to get a reference to - * - * Description: Gets a reference to the scsi_device and increments the use count - * of the underlying LLDD module. You must hold host_lock of the - * parent Scsi_Host or already have a reference when calling this. - * - * This will fail if a device is deleted or cancelled, or when the LLD module - * is in the process of being unloaded. + * @skip_deleted: when true, would return failed if device is deleted */ -int scsi_device_get(struct scsi_device *sdev) +static int __scsi_device_get(struct scsi_device *sdev, bool skip_deleted) { - if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL) + /* + * if skip_deleted is true and device is in removing, return failed + */ + if (skip_deleted && + (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)) goto fail; if (!try_module_get(sdev->host->hostt->module)) goto fail; @@ -761,6 +759,22 @@ int scsi_device_get(struct scsi_device *sdev) fail: return -ENXIO; } + +/** + * scsi_device_get - get an additional reference to a scsi_device + * @sdev: device to get a reference to + * + * Description: Gets a reference to the scsi_device and increments the use count + * of the underlying LLDD module. You must hold host_lock of the + * parent Scsi_Host or already have a reference when calling this. + * + * This will fail if a device is deleted or cancelled, or when the LLD module + * is in the process of being unloaded. + */ +int scsi_device_get(struct scsi_device *sdev) +{ + return __scsi_device_get(sdev, 0); +} EXPORT_SYMBOL(scsi_device_get); /** @@ -780,9 +794,13 @@ void scsi_device_put(struct scsi_device *sdev) } EXPORT_SYMBOL(scsi_device_put); -/* helper for shost_for_each_device, see that for documentation */ +/** + * helper for shost_for_each_device, see that for documentation + * @skip_deleted: if true, sdev in progress of removing would be skipped + */ struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost, - struct scsi_device *prev) + struct scsi_device *prev, + bool skip_deleted) { struct list_head *list = (prev ? &prev->siblings : &shost->__devices); struct scsi_device *next = NULL; @@ -792,7 +810,7 @@ struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost, while (list->next != &shost->__devices) { next = list_entry(list->next, struct scsi_device, siblings); /* skip devices that we can't get a reference to */ - if (!scsi_device_get(next)) + if (!__scsi_device_get(next, skip_deleted)) break; next = NULL; list = list->next; diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index 9c540f5468eb..5cb294ff0a41 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -412,7 +412,8 @@ extern void __starget_for_each_device(struct scsi_target *, void *, /* only exposed to implement shost_for_each_device */ extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *, - struct scsi_device *); + struct scsi_device *, + bool); /** * shost_for_each_device - iterate over all devices of a host @@ -422,11 +423,29 @@ extern struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *, * Iterator that returns each device attached to @shost. This loop * takes a reference on each device and releases it at the end. If * you break out of the loop, you must call scsi_device_put(sdev). + * + * Note: this macro would skip sdev which is in progress of removing */ #define shost_for_each_device(sdev, shost) \ - for ((sdev) = __scsi_iterate_devices((shost), NULL); \ + for ((sdev) = __scsi_iterate_devices((shost), NULL, 1); \ + (sdev); \ + (sdev) = __scsi_iterate_devices((shost), (sdev), 1)) + +/* + * shost_for_each_device_include_deleted- iterate over all devices of a host + * @sdev: the &struct scsi_device to use as a cursor + * @shost: the &struct scsi_host to iterate over + * + * Iterator that returns each device attached to @shost. This loop + * takes a reference on each device and releases it at the end. If + * you break out of the loop, you must call scsi_device_put(sdev). + * + * Note: this macro would include sdev which is in progress of removing + */ +#define shost_for_each_device_include_deleted(sdev, shost) \ + for ((sdev) = __scsi_iterate_devices((shost), NULL, 0); \ (sdev); \ - (sdev) = __scsi_iterate_devices((shost), (sdev))) + (sdev) = __scsi_iterate_devices((shost), (sdev), 0)) /** * __shost_for_each_device - iterate over all devices of a host (UNLOCKED)
shost_for_each_device() would skip devices which is in SDEV_CANCEL or SDEV_DEL state, for some scenarios, we donot want to skip these devices, so add a new macro shost_for_each_device_include_deleted() to handle it. Following changes are introduced: 1. Rework scsi_device_get(), add new helper __scsi_device_get() which determine if skip deleted scsi_device by parameter "skip_deleted". 2. Add new parameter "skip_deleted" to __scsi_iterate_devices() which is used when calling __scsi_device_get() 3. Update shost_for_each_device() to call __scsi_iterate_devices() with "skip_deleted" true 4. Add new macro shost_for_each_device_include_deleted() which call __scsi_iterate_devices() with "skip_deleted" false Signed-off-by: Wenchao Hao <haowenchao22@gmail.com> --- drivers/scsi/scsi.c | 46 ++++++++++++++++++++++++++------------ include/scsi/scsi_device.h | 25 ++++++++++++++++++--- 2 files changed, 54 insertions(+), 17 deletions(-)