Message ID | 1624433711-9339-8-git-send-email-cang@codeaurora.org (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | Complementary changes for error handling | expand |
On 23/06/21 10:35 am, Can Guo wrote: > To protect system suspend/resume from being disturbed by error handling, > instead of using host_sem, let error handler call lock_system_sleep() and > unlock_system_sleep() which achieve the same purpose. Remove the host_sem > used in suspend/resume paths to make the code more readable. > > Suggested-by: Bart Van Assche <bvanassche@acm.org> > Signed-off-by: Can Guo <cang@codeaurora.org> > --- > drivers/scsi/ufs/ufshcd.c | 12 +++++++----- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index 3695dd2..a09e4a2 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -5907,6 +5907,11 @@ static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend) > > static void ufshcd_err_handling_prepare(struct ufs_hba *hba) > { > + /* > + * It is not safe to perform error handling while suspend or resume is > + * in progress. Hence the lock_system_sleep() call. > + */ > + lock_system_sleep(); It looks to me like the system takes this lock quite early, even before freezing tasks, so if anything needs the error handler to run it will deadlock. > ufshcd_rpm_get_sync(hba); > if (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) || > hba->is_wlu_sys_suspended) { > @@ -5951,6 +5956,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba) > ufshcd_clk_scaling_suspend(hba, false); > ufshcd_clear_ua_wluns(hba); > ufshcd_rpm_put(hba); > + unlock_system_sleep(); > } > > static inline bool ufshcd_err_handling_should_stop(struct ufs_hba *hba) > @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct device *dev) > ktime_t start = ktime_get(); > > hba = shost_priv(sdev->host); > - down(&hba->host_sem); > > if (pm_runtime_suspended(dev)) > goto out; > > ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); > - if (ret) { > + if (ret) > dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__, ret); > - up(&hba->host_sem); > - } > > out: > if (!ret) > @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device *dev) > hba->curr_dev_pwr_mode, hba->uic_link_state); > if (!ret) > hba->is_wlu_sys_suspended = false; > - up(&hba->host_sem); > return ret; > } > #endif >
On 2021-06-23 22:30, Adrian Hunter wrote: > On 23/06/21 10:35 am, Can Guo wrote: >> To protect system suspend/resume from being disturbed by error >> handling, >> instead of using host_sem, let error handler call lock_system_sleep() >> and >> unlock_system_sleep() which achieve the same purpose. Remove the >> host_sem >> used in suspend/resume paths to make the code more readable. >> >> Suggested-by: Bart Van Assche <bvanassche@acm.org> >> Signed-off-by: Can Guo <cang@codeaurora.org> >> --- >> drivers/scsi/ufs/ufshcd.c | 12 +++++++----- >> 1 file changed, 7 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >> index 3695dd2..a09e4a2 100644 >> --- a/drivers/scsi/ufs/ufshcd.c >> +++ b/drivers/scsi/ufs/ufshcd.c >> @@ -5907,6 +5907,11 @@ static void ufshcd_clk_scaling_suspend(struct >> ufs_hba *hba, bool suspend) >> >> static void ufshcd_err_handling_prepare(struct ufs_hba *hba) >> { >> + /* >> + * It is not safe to perform error handling while suspend or resume >> is >> + * in progress. Hence the lock_system_sleep() call. >> + */ >> + lock_system_sleep(); > > It looks to me like the system takes this lock quite early, even before > freezing tasks, so if anything needs the error handler to run it will > deadlock. Hi Adrian, UFS/hba system suspend/resume does not invoke or call error handling in a synchronous way. So, whatever UFS errors (which schedules the error handler) happens during suspend/resume, error handler will just wait here till system suspend/resume release the lock. Hence no worries of deadlock here. Thanks, Can Guo. > >> ufshcd_rpm_get_sync(hba); >> if (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) >> || >> hba->is_wlu_sys_suspended) { >> @@ -5951,6 +5956,7 @@ static void ufshcd_err_handling_unprepare(struct >> ufs_hba *hba) >> ufshcd_clk_scaling_suspend(hba, false); >> ufshcd_clear_ua_wluns(hba); >> ufshcd_rpm_put(hba); >> + unlock_system_sleep(); >> } >> >> static inline bool ufshcd_err_handling_should_stop(struct ufs_hba >> *hba) >> @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct device >> *dev) >> ktime_t start = ktime_get(); >> >> hba = shost_priv(sdev->host); >> - down(&hba->host_sem); >> >> if (pm_runtime_suspended(dev)) >> goto out; >> >> ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); >> - if (ret) { >> + if (ret) >> dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__, ret); >> - up(&hba->host_sem); >> - } >> >> out: >> if (!ret) >> @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device *dev) >> hba->curr_dev_pwr_mode, hba->uic_link_state); >> if (!ret) >> hba->is_wlu_sys_suspended = false; >> - up(&hba->host_sem); >> return ret; >> } >> #endif >>
On 24/06/21 5:16 am, Can Guo wrote: > On 2021-06-23 22:30, Adrian Hunter wrote: >> On 23/06/21 10:35 am, Can Guo wrote: >>> To protect system suspend/resume from being disturbed by error handling, >>> instead of using host_sem, let error handler call lock_system_sleep() and >>> unlock_system_sleep() which achieve the same purpose. Remove the host_sem >>> used in suspend/resume paths to make the code more readable. >>> >>> Suggested-by: Bart Van Assche <bvanassche@acm.org> >>> Signed-off-by: Can Guo <cang@codeaurora.org> >>> --- >>> drivers/scsi/ufs/ufshcd.c | 12 +++++++----- >>> 1 file changed, 7 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >>> index 3695dd2..a09e4a2 100644 >>> --- a/drivers/scsi/ufs/ufshcd.c >>> +++ b/drivers/scsi/ufs/ufshcd.c >>> @@ -5907,6 +5907,11 @@ static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend) >>> >>> static void ufshcd_err_handling_prepare(struct ufs_hba *hba) >>> { >>> + /* >>> + * It is not safe to perform error handling while suspend or resume is >>> + * in progress. Hence the lock_system_sleep() call. >>> + */ >>> + lock_system_sleep(); >> >> It looks to me like the system takes this lock quite early, even before >> freezing tasks, so if anything needs the error handler to run it will >> deadlock. > > Hi Adrian, > > UFS/hba system suspend/resume does not invoke or call error handling in a > synchronous way. So, whatever UFS errors (which schedules the error handler) > happens during suspend/resume, error handler will just wait here till system > suspend/resume release the lock. Hence no worries of deadlock here. It looks to me like the state can change to UFSHCD_STATE_EH_SCHEDULED_FATAL and since user processes are not frozen, nor file systems sync'ed, everything is going to deadlock. i.e. I/O is blocked waiting on error handling error handling is blocked waiting on lock_system_sleep() suspend is blocked waiting on I/O > > Thanks, > > Can Guo. > >> >>> ufshcd_rpm_get_sync(hba); >>> if (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) || >>> hba->is_wlu_sys_suspended) { >>> @@ -5951,6 +5956,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba) >>> ufshcd_clk_scaling_suspend(hba, false); >>> ufshcd_clear_ua_wluns(hba); >>> ufshcd_rpm_put(hba); >>> + unlock_system_sleep(); >>> } >>> >>> static inline bool ufshcd_err_handling_should_stop(struct ufs_hba *hba) >>> @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct device *dev) >>> ktime_t start = ktime_get(); >>> >>> hba = shost_priv(sdev->host); >>> - down(&hba->host_sem); >>> >>> if (pm_runtime_suspended(dev)) >>> goto out; >>> >>> ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); >>> - if (ret) { >>> + if (ret) >>> dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__, ret); >>> - up(&hba->host_sem); >>> - } >>> >>> out: >>> if (!ret) >>> @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device *dev) >>> hba->curr_dev_pwr_mode, hba->uic_link_state); >>> if (!ret) >>> hba->is_wlu_sys_suspended = false; >>> - up(&hba->host_sem); >>> return ret; >>> } >>> #endif >>>
On 2021-06-24 13:52, Adrian Hunter wrote: > On 24/06/21 5:16 am, Can Guo wrote: >> On 2021-06-23 22:30, Adrian Hunter wrote: >>> On 23/06/21 10:35 am, Can Guo wrote: >>>> To protect system suspend/resume from being disturbed by error >>>> handling, >>>> instead of using host_sem, let error handler call >>>> lock_system_sleep() and >>>> unlock_system_sleep() which achieve the same purpose. Remove the >>>> host_sem >>>> used in suspend/resume paths to make the code more readable. >>>> >>>> Suggested-by: Bart Van Assche <bvanassche@acm.org> >>>> Signed-off-by: Can Guo <cang@codeaurora.org> >>>> --- >>>> drivers/scsi/ufs/ufshcd.c | 12 +++++++----- >>>> 1 file changed, 7 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >>>> index 3695dd2..a09e4a2 100644 >>>> --- a/drivers/scsi/ufs/ufshcd.c >>>> +++ b/drivers/scsi/ufs/ufshcd.c >>>> @@ -5907,6 +5907,11 @@ static void ufshcd_clk_scaling_suspend(struct >>>> ufs_hba *hba, bool suspend) >>>> >>>> static void ufshcd_err_handling_prepare(struct ufs_hba *hba) >>>> { >>>> + /* >>>> + * It is not safe to perform error handling while suspend or >>>> resume is >>>> + * in progress. Hence the lock_system_sleep() call. >>>> + */ >>>> + lock_system_sleep(); >>> >>> It looks to me like the system takes this lock quite early, even >>> before >>> freezing tasks, so if anything needs the error handler to run it will >>> deadlock. >> >> Hi Adrian, >> >> UFS/hba system suspend/resume does not invoke or call error handling >> in a >> synchronous way. So, whatever UFS errors (which schedules the error >> handler) >> happens during suspend/resume, error handler will just wait here till >> system >> suspend/resume release the lock. Hence no worries of deadlock here. > > It looks to me like the state can change to > UFSHCD_STATE_EH_SCHEDULED_FATAL > and since user processes are not frozen, nor file systems sync'ed, > everything > is going to deadlock. > i.e. > I/O is blocked waiting on error handling > error handling is blocked waiting on lock_system_sleep() > suspend is blocked waiting on I/O > Hi Adrian, First of all, enter_state(suspend_state_t state) uses mutex_trylock(&system_transition_mutex). Second, even that happens, in ufshcd_queuecommand(), below logic will break the cycle, by fast failing the PM request (below codes are from the code tip with this whole series applied). case UFSHCD_STATE_EH_SCHEDULED_FATAL: /* * ufshcd_rpm_get_sync() is used at error handling preparation * stage. If a scsi cmd, e.g., the SSU cmd, is sent from the * PM ops, it can never be finished if we let SCSI layer keep * retrying it, which gets err handler stuck forever. Neither * can we let the scsi cmd pass through, because UFS is in bad * state, the scsi cmd may eventually time out, which will get * err handler blocked for too long. So, just fail the scsi cmd * sent from PM ops, err handler can recover PM error anyways. */ if (cmd->request->rq_flags & RQF_PM) { hba->force_reset = true; set_host_byte(cmd, DID_BAD_TARGET); cmd->scsi_done(cmd); goto out; } fallthrough; case UFSHCD_STATE_RESET: Thanks, Can Guo. >> >> Thanks, >> >> Can Guo. >> >>> >>>> ufshcd_rpm_get_sync(hba); >>>> if >>>> (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) || >>>> hba->is_wlu_sys_suspended) { >>>> @@ -5951,6 +5956,7 @@ static void >>>> ufshcd_err_handling_unprepare(struct ufs_hba *hba) >>>> ufshcd_clk_scaling_suspend(hba, false); >>>> ufshcd_clear_ua_wluns(hba); >>>> ufshcd_rpm_put(hba); >>>> + unlock_system_sleep(); >>>> } >>>> >>>> static inline bool ufshcd_err_handling_should_stop(struct ufs_hba >>>> *hba) >>>> @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct device >>>> *dev) >>>> ktime_t start = ktime_get(); >>>> >>>> hba = shost_priv(sdev->host); >>>> - down(&hba->host_sem); >>>> >>>> if (pm_runtime_suspended(dev)) >>>> goto out; >>>> >>>> ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); >>>> - if (ret) { >>>> + if (ret) >>>> dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__, >>>> ret); >>>> - up(&hba->host_sem); >>>> - } >>>> >>>> out: >>>> if (!ret) >>>> @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device >>>> *dev) >>>> hba->curr_dev_pwr_mode, hba->uic_link_state); >>>> if (!ret) >>>> hba->is_wlu_sys_suspended = false; >>>> - up(&hba->host_sem); >>>> return ret; >>>> } >>>> #endif >>>>
On 24/06/21 9:12 am, Can Guo wrote: > On 2021-06-24 13:52, Adrian Hunter wrote: >> On 24/06/21 5:16 am, Can Guo wrote: >>> On 2021-06-23 22:30, Adrian Hunter wrote: >>>> On 23/06/21 10:35 am, Can Guo wrote: >>>>> To protect system suspend/resume from being disturbed by error handling, >>>>> instead of using host_sem, let error handler call lock_system_sleep() and >>>>> unlock_system_sleep() which achieve the same purpose. Remove the host_sem >>>>> used in suspend/resume paths to make the code more readable. >>>>> >>>>> Suggested-by: Bart Van Assche <bvanassche@acm.org> >>>>> Signed-off-by: Can Guo <cang@codeaurora.org> >>>>> --- >>>>> drivers/scsi/ufs/ufshcd.c | 12 +++++++----- >>>>> 1 file changed, 7 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >>>>> index 3695dd2..a09e4a2 100644 >>>>> --- a/drivers/scsi/ufs/ufshcd.c >>>>> +++ b/drivers/scsi/ufs/ufshcd.c >>>>> @@ -5907,6 +5907,11 @@ static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend) >>>>> >>>>> static void ufshcd_err_handling_prepare(struct ufs_hba *hba) >>>>> { >>>>> + /* >>>>> + * It is not safe to perform error handling while suspend or resume is >>>>> + * in progress. Hence the lock_system_sleep() call. >>>>> + */ >>>>> + lock_system_sleep(); >>>> >>>> It looks to me like the system takes this lock quite early, even before >>>> freezing tasks, so if anything needs the error handler to run it will >>>> deadlock. >>> >>> Hi Adrian, >>> >>> UFS/hba system suspend/resume does not invoke or call error handling in a >>> synchronous way. So, whatever UFS errors (which schedules the error handler) >>> happens during suspend/resume, error handler will just wait here till system >>> suspend/resume release the lock. Hence no worries of deadlock here. >> >> It looks to me like the state can change to UFSHCD_STATE_EH_SCHEDULED_FATAL >> and since user processes are not frozen, nor file systems sync'ed, everything >> is going to deadlock. >> i.e. >> I/O is blocked waiting on error handling >> error handling is blocked waiting on lock_system_sleep() >> suspend is blocked waiting on I/O >> > > Hi Adrian, > > First of all, enter_state(suspend_state_t state) uses mutex_trylock(&system_transition_mutex). Yes, in the case I am outlining it gets the mutex. > Second, even that happens, in ufshcd_queuecommand(), below logic will break the cycle, by > fast failing the PM request (below codes are from the code tip with this whole series applied). It won't get that far because the suspend will be waiting to sync filesystems. Filesystems will be waiting on I/O. I/O will be waiting on the error handler. The error handler will be waiting on system_transition_mutex. But system_transition_mutex is already held by PM core. > > case UFSHCD_STATE_EH_SCHEDULED_FATAL: > /* > * ufshcd_rpm_get_sync() is used at error handling preparation > * stage. If a scsi cmd, e.g., the SSU cmd, is sent from the > * PM ops, it can never be finished if we let SCSI layer keep > * retrying it, which gets err handler stuck forever. Neither > * can we let the scsi cmd pass through, because UFS is in bad > * state, the scsi cmd may eventually time out, which will get > * err handler blocked for too long. So, just fail the scsi cmd > * sent from PM ops, err handler can recover PM error anyways. > */ > if (cmd->request->rq_flags & RQF_PM) { > hba->force_reset = true; > set_host_byte(cmd, DID_BAD_TARGET); > cmd->scsi_done(cmd); > goto out; > } > fallthrough; > case UFSHCD_STATE_RESET: > > Thanks, > > Can Guo. > >>> >>> Thanks, >>> >>> Can Guo. >>> >>>> >>>>> ufshcd_rpm_get_sync(hba); >>>>> if (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) || >>>>> hba->is_wlu_sys_suspended) { >>>>> @@ -5951,6 +5956,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba) >>>>> ufshcd_clk_scaling_suspend(hba, false); >>>>> ufshcd_clear_ua_wluns(hba); >>>>> ufshcd_rpm_put(hba); >>>>> + unlock_system_sleep(); >>>>> } >>>>> >>>>> static inline bool ufshcd_err_handling_should_stop(struct ufs_hba *hba) >>>>> @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct device *dev) >>>>> ktime_t start = ktime_get(); >>>>> >>>>> hba = shost_priv(sdev->host); >>>>> - down(&hba->host_sem); >>>>> >>>>> if (pm_runtime_suspended(dev)) >>>>> goto out; >>>>> >>>>> ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); >>>>> - if (ret) { >>>>> + if (ret) >>>>> dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__, ret); >>>>> - up(&hba->host_sem); >>>>> - } >>>>> >>>>> out: >>>>> if (!ret) >>>>> @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device *dev) >>>>> hba->curr_dev_pwr_mode, hba->uic_link_state); >>>>> if (!ret) >>>>> hba->is_wlu_sys_suspended = false; >>>>> - up(&hba->host_sem); >>>>> return ret; >>>>> } >>>>> #endif >>>>>
On 2021-06-24 14:23, Adrian Hunter wrote: > On 24/06/21 9:12 am, Can Guo wrote: >> On 2021-06-24 13:52, Adrian Hunter wrote: >>> On 24/06/21 5:16 am, Can Guo wrote: >>>> On 2021-06-23 22:30, Adrian Hunter wrote: >>>>> On 23/06/21 10:35 am, Can Guo wrote: >>>>>> To protect system suspend/resume from being disturbed by error >>>>>> handling, >>>>>> instead of using host_sem, let error handler call >>>>>> lock_system_sleep() and >>>>>> unlock_system_sleep() which achieve the same purpose. Remove the >>>>>> host_sem >>>>>> used in suspend/resume paths to make the code more readable. >>>>>> >>>>>> Suggested-by: Bart Van Assche <bvanassche@acm.org> >>>>>> Signed-off-by: Can Guo <cang@codeaurora.org> >>>>>> --- >>>>>> drivers/scsi/ufs/ufshcd.c | 12 +++++++----- >>>>>> 1 file changed, 7 insertions(+), 5 deletions(-) >>>>>> >>>>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >>>>>> index 3695dd2..a09e4a2 100644 >>>>>> --- a/drivers/scsi/ufs/ufshcd.c >>>>>> +++ b/drivers/scsi/ufs/ufshcd.c >>>>>> @@ -5907,6 +5907,11 @@ static void >>>>>> ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend) >>>>>> >>>>>> static void ufshcd_err_handling_prepare(struct ufs_hba *hba) >>>>>> { >>>>>> + /* >>>>>> + * It is not safe to perform error handling while suspend or >>>>>> resume is >>>>>> + * in progress. Hence the lock_system_sleep() call. >>>>>> + */ >>>>>> + lock_system_sleep(); >>>>> >>>>> It looks to me like the system takes this lock quite early, even >>>>> before >>>>> freezing tasks, so if anything needs the error handler to run it >>>>> will >>>>> deadlock. >>>> >>>> Hi Adrian, >>>> >>>> UFS/hba system suspend/resume does not invoke or call error handling >>>> in a >>>> synchronous way. So, whatever UFS errors (which schedules the error >>>> handler) >>>> happens during suspend/resume, error handler will just wait here >>>> till system >>>> suspend/resume release the lock. Hence no worries of deadlock here. >>> >>> It looks to me like the state can change to >>> UFSHCD_STATE_EH_SCHEDULED_FATAL >>> and since user processes are not frozen, nor file systems sync'ed, >>> everything >>> is going to deadlock. >>> i.e. >>> I/O is blocked waiting on error handling >>> error handling is blocked waiting on lock_system_sleep() >>> suspend is blocked waiting on I/O >>> >> >> Hi Adrian, >> >> First of all, enter_state(suspend_state_t state) uses >> mutex_trylock(&system_transition_mutex). > > Yes, in the case I am outlining it gets the mutex. > >> Second, even that happens, in ufshcd_queuecommand(), below logic will >> break the cycle, by >> fast failing the PM request (below codes are from the code tip with >> this whole series applied). > > It won't get that far because the suspend will be waiting to sync > filesystems. > Filesystems will be waiting on I/O. > I/O will be waiting on the error handler. > The error handler will be waiting on system_transition_mutex. > But system_transition_mutex is already held by PM core. Hi Adrian, You are right.... I missed the action of syncing filesystems... Using back host_sem in suspend_prepare()/resume_complete() won't have this problem of deadlock, right? Thanks, Can Guo. > >> >> case UFSHCD_STATE_EH_SCHEDULED_FATAL: >> /* >> * ufshcd_rpm_get_sync() is used at error handling >> preparation >> * stage. If a scsi cmd, e.g., the SSU cmd, is sent >> from the >> * PM ops, it can never be finished if we let SCSI >> layer keep >> * retrying it, which gets err handler stuck forever. >> Neither >> * can we let the scsi cmd pass through, because UFS >> is in bad >> * state, the scsi cmd may eventually time out, which >> will get >> * err handler blocked for too long. So, just fail the >> scsi cmd >> * sent from PM ops, err handler can recover PM error >> anyways. >> */ >> if (cmd->request->rq_flags & RQF_PM) { >> hba->force_reset = true; >> set_host_byte(cmd, DID_BAD_TARGET); >> cmd->scsi_done(cmd); >> goto out; >> } >> fallthrough; >> case UFSHCD_STATE_RESET: >> >> Thanks, >> >> Can Guo. >> >>>> >>>> Thanks, >>>> >>>> Can Guo. >>>> >>>>> >>>>>> ufshcd_rpm_get_sync(hba); >>>>>> if >>>>>> (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) >>>>>> || >>>>>> hba->is_wlu_sys_suspended) { >>>>>> @@ -5951,6 +5956,7 @@ static void >>>>>> ufshcd_err_handling_unprepare(struct ufs_hba *hba) >>>>>> ufshcd_clk_scaling_suspend(hba, false); >>>>>> ufshcd_clear_ua_wluns(hba); >>>>>> ufshcd_rpm_put(hba); >>>>>> + unlock_system_sleep(); >>>>>> } >>>>>> >>>>>> static inline bool ufshcd_err_handling_should_stop(struct ufs_hba >>>>>> *hba) >>>>>> @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct device >>>>>> *dev) >>>>>> ktime_t start = ktime_get(); >>>>>> >>>>>> hba = shost_priv(sdev->host); >>>>>> - down(&hba->host_sem); >>>>>> >>>>>> if (pm_runtime_suspended(dev)) >>>>>> goto out; >>>>>> >>>>>> ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); >>>>>> - if (ret) { >>>>>> + if (ret) >>>>>> dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__, >>>>>> ret); >>>>>> - up(&hba->host_sem); >>>>>> - } >>>>>> >>>>>> out: >>>>>> if (!ret) >>>>>> @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device >>>>>> *dev) >>>>>> hba->curr_dev_pwr_mode, hba->uic_link_state); >>>>>> if (!ret) >>>>>> hba->is_wlu_sys_suspended = false; >>>>>> - up(&hba->host_sem); >>>>>> return ret; >>>>>> } >>>>>> #endif >>>>>>
On 24/06/21 9:31 am, Can Guo wrote: > On 2021-06-24 14:23, Adrian Hunter wrote: >> On 24/06/21 9:12 am, Can Guo wrote: >>> On 2021-06-24 13:52, Adrian Hunter wrote: >>>> On 24/06/21 5:16 am, Can Guo wrote: >>>>> On 2021-06-23 22:30, Adrian Hunter wrote: >>>>>> On 23/06/21 10:35 am, Can Guo wrote: >>>>>>> To protect system suspend/resume from being disturbed by error handling, >>>>>>> instead of using host_sem, let error handler call lock_system_sleep() and >>>>>>> unlock_system_sleep() which achieve the same purpose. Remove the host_sem >>>>>>> used in suspend/resume paths to make the code more readable. >>>>>>> >>>>>>> Suggested-by: Bart Van Assche <bvanassche@acm.org> >>>>>>> Signed-off-by: Can Guo <cang@codeaurora.org> >>>>>>> --- >>>>>>> drivers/scsi/ufs/ufshcd.c | 12 +++++++----- >>>>>>> 1 file changed, 7 insertions(+), 5 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >>>>>>> index 3695dd2..a09e4a2 100644 >>>>>>> --- a/drivers/scsi/ufs/ufshcd.c >>>>>>> +++ b/drivers/scsi/ufs/ufshcd.c >>>>>>> @@ -5907,6 +5907,11 @@ static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend) >>>>>>> >>>>>>> static void ufshcd_err_handling_prepare(struct ufs_hba *hba) >>>>>>> { >>>>>>> + /* >>>>>>> + * It is not safe to perform error handling while suspend or resume is >>>>>>> + * in progress. Hence the lock_system_sleep() call. >>>>>>> + */ >>>>>>> + lock_system_sleep(); >>>>>> >>>>>> It looks to me like the system takes this lock quite early, even before >>>>>> freezing tasks, so if anything needs the error handler to run it will >>>>>> deadlock. >>>>> >>>>> Hi Adrian, >>>>> >>>>> UFS/hba system suspend/resume does not invoke or call error handling in a >>>>> synchronous way. So, whatever UFS errors (which schedules the error handler) >>>>> happens during suspend/resume, error handler will just wait here till system >>>>> suspend/resume release the lock. Hence no worries of deadlock here. >>>> >>>> It looks to me like the state can change to UFSHCD_STATE_EH_SCHEDULED_FATAL >>>> and since user processes are not frozen, nor file systems sync'ed, everything >>>> is going to deadlock. >>>> i.e. >>>> I/O is blocked waiting on error handling >>>> error handling is blocked waiting on lock_system_sleep() >>>> suspend is blocked waiting on I/O >>>> >>> >>> Hi Adrian, >>> >>> First of all, enter_state(suspend_state_t state) uses mutex_trylock(&system_transition_mutex). >> >> Yes, in the case I am outlining it gets the mutex. >> >>> Second, even that happens, in ufshcd_queuecommand(), below logic will break the cycle, by >>> fast failing the PM request (below codes are from the code tip with this whole series applied). >> >> It won't get that far because the suspend will be waiting to sync filesystems. >> Filesystems will be waiting on I/O. >> I/O will be waiting on the error handler. >> The error handler will be waiting on system_transition_mutex. >> But system_transition_mutex is already held by PM core. > > Hi Adrian, > > You are right.... I missed the action of syncing filesystems... > > Using back host_sem in suspend_prepare()/resume_complete() won't have this > problem of deadlock, right? I am not sure, but what was problem that the V3 patch was fixing? Can you give an example? > > Thanks, > > Can Guo. > >> >>> >>> case UFSHCD_STATE_EH_SCHEDULED_FATAL: >>> /* >>> * ufshcd_rpm_get_sync() is used at error handling preparation >>> * stage. If a scsi cmd, e.g., the SSU cmd, is sent from the >>> * PM ops, it can never be finished if we let SCSI layer keep >>> * retrying it, which gets err handler stuck forever. Neither >>> * can we let the scsi cmd pass through, because UFS is in bad >>> * state, the scsi cmd may eventually time out, which will get >>> * err handler blocked for too long. So, just fail the scsi cmd >>> * sent from PM ops, err handler can recover PM error anyways. >>> */ >>> if (cmd->request->rq_flags & RQF_PM) { >>> hba->force_reset = true; >>> set_host_byte(cmd, DID_BAD_TARGET); >>> cmd->scsi_done(cmd); >>> goto out; >>> } >>> fallthrough; >>> case UFSHCD_STATE_RESET: >>> >>> Thanks, >>> >>> Can Guo. >>> >>>>> >>>>> Thanks, >>>>> >>>>> Can Guo. >>>>> >>>>>> >>>>>>> ufshcd_rpm_get_sync(hba); >>>>>>> if (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) || >>>>>>> hba->is_wlu_sys_suspended) { >>>>>>> @@ -5951,6 +5956,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba) >>>>>>> ufshcd_clk_scaling_suspend(hba, false); >>>>>>> ufshcd_clear_ua_wluns(hba); >>>>>>> ufshcd_rpm_put(hba); >>>>>>> + unlock_system_sleep(); >>>>>>> } >>>>>>> >>>>>>> static inline bool ufshcd_err_handling_should_stop(struct ufs_hba *hba) >>>>>>> @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct device *dev) >>>>>>> ktime_t start = ktime_get(); >>>>>>> >>>>>>> hba = shost_priv(sdev->host); >>>>>>> - down(&hba->host_sem); >>>>>>> >>>>>>> if (pm_runtime_suspended(dev)) >>>>>>> goto out; >>>>>>> >>>>>>> ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); >>>>>>> - if (ret) { >>>>>>> + if (ret) >>>>>>> dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__, ret); >>>>>>> - up(&hba->host_sem); >>>>>>> - } >>>>>>> >>>>>>> out: >>>>>>> if (!ret) >>>>>>> @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device *dev) >>>>>>> hba->curr_dev_pwr_mode, hba->uic_link_state); >>>>>>> if (!ret) >>>>>>> hba->is_wlu_sys_suspended = false; >>>>>>> - up(&hba->host_sem); >>>>>>> return ret; >>>>>>> } >>>>>>> #endif >>>>>>>
On 6/23/21 11:31 PM, Can Guo wrote: > Using back host_sem in suspend_prepare()/resume_complete() won't have this > problem of deadlock, right? Although that would solve the deadlock discussed in this email thread, it wouldn't solve the issue of potential adverse interactions of the UFS error handler and the SCSI error handler running concurrently. How about using the standard approach for invoking the UFS error handler instead of using a custom mechanism, e.g. by using something like the (untested) patch below? This approach guarantees that the UFS error handler is only activated after all pending SCSI commands have failed or timed out and also guarantees that no new SCSI commands will be queued while the UFS error handler is in progress (see also scsi_host_queue_ready()). Thanks, Bart. diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 0ac1e15ac914..c6303ea9e5db 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -17,6 +17,8 @@ #include <linux/blk-pm.h> #include <linux/blkdev.h> #include <scsi/scsi_driver.h> +#include <scsi/scsi_transport.h> +#include "../scsi_transport_api.h" #include "ufshcd.h" #include "ufs_quirks.h" #include "unipro.h" @@ -233,7 +235,6 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up); static irqreturn_t ufshcd_intr(int irq, void *__hba); static int ufshcd_change_power_mode(struct ufs_hba *hba, struct ufs_pa_layer_attr *pwr_mode); -static void ufshcd_schedule_eh_work(struct ufs_hba *hba); static int ufshcd_setup_hba_vreg(struct ufs_hba *hba, bool on); static int ufshcd_setup_vreg(struct ufs_hba *hba, bool on); static inline int ufshcd_config_vreg_hpm(struct ufs_hba *hba, @@ -2697,26 +2698,7 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) switch (hba->ufshcd_state) { case UFSHCD_STATE_OPERATIONAL: - case UFSHCD_STATE_EH_SCHEDULED_NON_FATAL: break; - case UFSHCD_STATE_EH_SCHEDULED_FATAL: - /* - * pm_runtime_get_sync() is used at error handling preparation - * stage. If a scsi cmd, e.g. the SSU cmd, is sent from hba's - * PM ops, it can never be finished if we let SCSI layer keep - * retrying it, which gets err handler stuck forever. Neither - * can we let the scsi cmd pass through, because UFS is in bad - * state, the scsi cmd may eventually time out, which will get - * err handler blocked for too long. So, just fail the scsi cmd - * sent from PM ops, err handler can recover PM error anyways. - */ - if (hba->pm_op_in_progress) { - hba->force_reset = true; - set_host_byte(cmd, DID_BAD_TARGET); - cmd->scsi_done(cmd); - goto out; - } - fallthrough; case UFSHCD_STATE_RESET: err = SCSI_MLQUEUE_HOST_BUSY; goto out; @@ -3954,6 +3936,7 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) u8 status; int ret; bool reenable_intr = false; + bool schedule_eh = false; mutex_lock(&hba->uic_cmd_mutex); init_completion(&uic_async_done); @@ -4021,10 +4004,12 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba *hba, struct uic_command *cmd) ufshcd_enable_intr(hba, UIC_COMMAND_COMPL); if (ret) { ufshcd_set_link_broken(hba); - ufshcd_schedule_eh_work(hba); + schedule_eh = true; } out_unlock: spin_unlock_irqrestore(hba->host->host_lock, flags); + if (schedule_eh) + scsi_schedule_eh(hba->host); mutex_unlock(&hba->uic_cmd_mutex); return ret; @@ -4085,8 +4070,6 @@ static bool ufshcd_set_state(struct ufs_hba *hba, enum ufshcd_state new_state) } break; case UFSHCD_STATE_RESET: - case UFSHCD_STATE_EH_SCHEDULED_NON_FATAL: - case UFSHCD_STATE_EH_SCHEDULED_FATAL: allowed = hba->ufshcd_state != UFSHCD_STATE_ERROR; break; case UFSHCD_STATE_ERROR: @@ -5886,22 +5869,6 @@ static inline bool ufshcd_is_saved_err_fatal(struct ufs_hba *hba) (hba->saved_err & (INT_FATAL_ERRORS | UFSHCD_UIC_HIBERN8_MASK)); } -/* host lock must be held before calling this func */ -static inline void ufshcd_schedule_eh_work(struct ufs_hba *hba) -{ - bool proceed; - - if (hba->force_reset || ufshcd_is_link_broken(hba) || - ufshcd_is_saved_err_fatal(hba)) - proceed = ufshcd_set_state(hba, - UFSHCD_STATE_EH_SCHEDULED_FATAL); - else - proceed = ufshcd_set_state(hba, - UFSHCD_STATE_EH_SCHEDULED_NON_FATAL); - if (proceed) - queue_work(hba->eh_wq, &hba->eh_work); -} - static void ufshcd_clk_scaling_allow(struct ufs_hba *hba, bool allow) { down_write(&hba->clk_scaling_lock); @@ -5924,7 +5891,7 @@ static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend) static void ufshcd_err_handling_prepare(struct ufs_hba *hba) { - ufshcd_rpm_get_sync(hba); + pm_runtime_get_noresume(&hba->sdev_ufs_device->sdev_gendev); if (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) || hba->is_sys_suspended) { enum ufs_pm_op pm_op; @@ -6035,11 +6002,12 @@ static bool ufshcd_is_pwr_mode_restore_needed(struct ufs_hba *hba) /** * ufshcd_err_handler - handle UFS errors that require s/w attention - * @work: pointer to work structure + * @host: SCSI host pointer */ -static void ufshcd_err_handler(struct work_struct *work) +static void ufshcd_err_handler(struct Scsi_Host *host) { - struct ufs_hba *hba; + struct ufs_hba *hba = shost_priv(host); + struct completion *eh_compl = NULL; unsigned long flags; bool err_xfer = false; bool err_tm = false; @@ -6047,10 +6015,10 @@ static void ufshcd_err_handler(struct work_struct *work) int tag; bool needs_reset = false, needs_restore = false; - hba = container_of(work, struct ufs_hba, eh_work); - down(&hba->host_sem); spin_lock_irqsave(hba->host->host_lock, flags); + /* Clear host_eh_scheduled which has been set by scsi_schedule_eh(). */ + hba->host->host_eh_scheduled = 0; if (ufshcd_err_handling_should_stop(hba)) { ufshcd_set_state(hba, UFSHCD_STATE_OPERATIONAL); spin_unlock_irqrestore(hba->host->host_lock, flags); @@ -6202,9 +6170,12 @@ static void ufshcd_err_handler(struct work_struct *work) __func__, hba->saved_err, hba->saved_uic_err); } ufshcd_clear_eh_in_progress(hba); + swap(hba->eh_compl, eh_compl); spin_unlock_irqrestore(hba->host->host_lock, flags); ufshcd_err_handling_unprepare(hba); up(&hba->host_sem); + if (eh_compl) + complete(eh_compl); } /** @@ -6361,7 +6332,6 @@ static irqreturn_t ufshcd_check_errors(struct ufs_hba *hba, u32 intr_status) "host_regs: "); ufshcd_print_pwr_info(hba); } - ufshcd_schedule_eh_work(hba); retval |= IRQ_HANDLED; } /* @@ -6373,6 +6343,8 @@ static irqreturn_t ufshcd_check_errors(struct ufs_hba *hba, u32 intr_status) hba->errors = 0; hba->uic_error = 0; spin_unlock(hba->host->host_lock); + if (queue_eh_work) + scsi_schedule_eh(hba->host); return retval; } @@ -7045,8 +7017,8 @@ static int ufshcd_abort(struct scsi_cmnd *cmd) set_bit(tag, &hba->outstanding_reqs); spin_lock_irqsave(host->host_lock, flags); hba->force_reset = true; - ufshcd_schedule_eh_work(hba); spin_unlock_irqrestore(host->host_lock, flags); + scsi_schedule_eh(hba->host); goto out; } @@ -7172,7 +7144,8 @@ static int ufshcd_reset_and_restore(struct ufs_hba *hba) */ static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd) { - int err = SUCCESS; + DECLARE_COMPLETION_ONSTACK(eh_compl); + int err = SUCCESS, res; unsigned long flags; struct ufs_hba *hba; @@ -7180,11 +7153,15 @@ static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd) spin_lock_irqsave(hba->host->host_lock, flags); hba->force_reset = true; - ufshcd_schedule_eh_work(hba); + hba->eh_compl = &eh_compl; dev_err(hba->dev, "%s: reset in progress - 1\n", __func__); spin_unlock_irqrestore(hba->host->host_lock, flags); - flush_work(&hba->eh_work); + scsi_schedule_eh(hba->host); + res = wait_for_completion_interruptible_timeout(&eh_compl, 180 * HZ); + WARN_ONCE(res <= 0, + "wait_for_completion_interruptible_timeout() returned %d", + res); spin_lock_irqsave(hba->host->host_lock, flags); if (hba->ufshcd_state == UFSHCD_STATE_ERROR) @@ -8511,8 +8488,6 @@ static void ufshcd_hba_exit(struct ufs_hba *hba) if (hba->is_powered) { ufshcd_exit_clk_scaling(hba); ufshcd_exit_clk_gating(hba); - if (hba->eh_wq) - destroy_workqueue(hba->eh_wq); ufs_debugfs_hba_exit(hba); ufshcd_variant_hba_exit(hba); ufshcd_setup_vreg(hba, false); @@ -9371,6 +9346,10 @@ static int ufshcd_set_dma_mask(struct ufs_hba *hba) return dma_set_mask_and_coherent(hba->dev, DMA_BIT_MASK(32)); } +static struct scsi_transport_template ufshcd_transport_template = { + .eh_strategy_handler = ufshcd_err_handler, +}; + /** * ufshcd_alloc_host - allocate Host Bus Adapter (HBA) * @dev: pointer to device handle @@ -9397,6 +9376,7 @@ int ufshcd_alloc_host(struct device *dev, struct ufs_hba **hba_handle) err = -ENOMEM; goto out_error; } + host->transportt = &ufshcd_transport_template; hba = shost_priv(host); hba->host = host; hba->dev = dev; @@ -9434,7 +9414,6 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq) int err; struct Scsi_Host *host = hba->host; struct device *dev = hba->dev; - char eh_wq_name[sizeof("ufs_eh_wq_00")]; if (!mmio_base) { dev_err(hba->dev, @@ -9488,17 +9467,6 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq) hba->max_pwr_info.is_valid = false; - /* Initialize work queues */ - snprintf(eh_wq_name, sizeof(eh_wq_name), "ufs_eh_wq_%d", - hba->host->host_no); - hba->eh_wq = create_singlethread_workqueue(eh_wq_name); - if (!hba->eh_wq) { - dev_err(hba->dev, "%s: failed to create eh workqueue\n", - __func__); - err = -ENOMEM; - goto out_disable; - } - INIT_WORK(&hba->eh_work, ufshcd_err_handler); INIT_WORK(&hba->eeh_work, ufshcd_exception_event_handler); sema_init(&hba->host_sem, 1); diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h index f2796ea25598..d4f7ab0171c6 100644 --- a/drivers/scsi/ufs/ufshcd.h +++ b/drivers/scsi/ufs/ufshcd.h @@ -482,18 +482,12 @@ struct ufs_stats { * processing. * @UFSHCD_STATE_OPERATIONAL: The host controller is operational and can process * SCSI commands. - * @UFSHCD_STATE_EH_SCHEDULED_NON_FATAL: The error handler has been scheduled. - * SCSI commands may be submitted to the controller. - * @UFSHCD_STATE_EH_SCHEDULED_FATAL: The error handler has been scheduled. Fail - * newly submitted SCSI commands with error code DID_BAD_TARGET. * @UFSHCD_STATE_ERROR: An unrecoverable error occurred, e.g. link recovery * failed. Fail all SCSI commands with error code DID_ERROR. */ enum ufshcd_state { UFSHCD_STATE_RESET, UFSHCD_STATE_OPERATIONAL, - UFSHCD_STATE_EH_SCHEDULED_NON_FATAL, - UFSHCD_STATE_EH_SCHEDULED_FATAL, UFSHCD_STATE_ERROR, }; @@ -715,8 +709,7 @@ struct ufs_hba_monitor { * @is_powered: flag to check if HBA is powered * @shutting_down: flag to check if shutdown has been invoked * @host_sem: semaphore used to serialize concurrent contexts - * @eh_wq: Workqueue that eh_work works on - * @eh_work: Worker to handle UFS errors that require s/w attention + * @eh_compl: Signaled by the UFS error handler after error handling finished. * @eeh_work: Worker to handle exception events * @errors: HBA errors * @uic_error: UFS interconnect layer error status @@ -817,9 +810,7 @@ struct ufs_hba { bool shutting_down; struct semaphore host_sem; - /* Work Queues */ - struct workqueue_struct *eh_wq; - struct work_struct eh_work; + struct completion *eh_compl; struct work_struct eeh_work; /* HBA Errors */
On 2021-06-24 18:04, Adrian Hunter wrote: > On 24/06/21 9:31 am, Can Guo wrote: >> On 2021-06-24 14:23, Adrian Hunter wrote: >>> On 24/06/21 9:12 am, Can Guo wrote: >>>> On 2021-06-24 13:52, Adrian Hunter wrote: >>>>> On 24/06/21 5:16 am, Can Guo wrote: >>>>>> On 2021-06-23 22:30, Adrian Hunter wrote: >>>>>>> On 23/06/21 10:35 am, Can Guo wrote: >>>>>>>> To protect system suspend/resume from being disturbed by error >>>>>>>> handling, >>>>>>>> instead of using host_sem, let error handler call >>>>>>>> lock_system_sleep() and >>>>>>>> unlock_system_sleep() which achieve the same purpose. Remove the >>>>>>>> host_sem >>>>>>>> used in suspend/resume paths to make the code more readable. >>>>>>>> >>>>>>>> Suggested-by: Bart Van Assche <bvanassche@acm.org> >>>>>>>> Signed-off-by: Can Guo <cang@codeaurora.org> >>>>>>>> --- >>>>>>>> drivers/scsi/ufs/ufshcd.c | 12 +++++++----- >>>>>>>> 1 file changed, 7 insertions(+), 5 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/scsi/ufs/ufshcd.c >>>>>>>> b/drivers/scsi/ufs/ufshcd.c >>>>>>>> index 3695dd2..a09e4a2 100644 >>>>>>>> --- a/drivers/scsi/ufs/ufshcd.c >>>>>>>> +++ b/drivers/scsi/ufs/ufshcd.c >>>>>>>> @@ -5907,6 +5907,11 @@ static void >>>>>>>> ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend) >>>>>>>> >>>>>>>> static void ufshcd_err_handling_prepare(struct ufs_hba *hba) >>>>>>>> { >>>>>>>> + /* >>>>>>>> + * It is not safe to perform error handling while suspend >>>>>>>> or resume is >>>>>>>> + * in progress. Hence the lock_system_sleep() call. >>>>>>>> + */ >>>>>>>> + lock_system_sleep(); >>>>>>> >>>>>>> It looks to me like the system takes this lock quite early, even >>>>>>> before >>>>>>> freezing tasks, so if anything needs the error handler to run it >>>>>>> will >>>>>>> deadlock. >>>>>> >>>>>> Hi Adrian, >>>>>> >>>>>> UFS/hba system suspend/resume does not invoke or call error >>>>>> handling in a >>>>>> synchronous way. So, whatever UFS errors (which schedules the >>>>>> error handler) >>>>>> happens during suspend/resume, error handler will just wait here >>>>>> till system >>>>>> suspend/resume release the lock. Hence no worries of deadlock >>>>>> here. >>>>> >>>>> It looks to me like the state can change to >>>>> UFSHCD_STATE_EH_SCHEDULED_FATAL >>>>> and since user processes are not frozen, nor file systems sync'ed, >>>>> everything >>>>> is going to deadlock. >>>>> i.e. >>>>> I/O is blocked waiting on error handling >>>>> error handling is blocked waiting on lock_system_sleep() >>>>> suspend is blocked waiting on I/O >>>>> >>>> >>>> Hi Adrian, >>>> >>>> First of all, enter_state(suspend_state_t state) uses >>>> mutex_trylock(&system_transition_mutex). >>> >>> Yes, in the case I am outlining it gets the mutex. >>> >>>> Second, even that happens, in ufshcd_queuecommand(), below logic >>>> will break the cycle, by >>>> fast failing the PM request (below codes are from the code tip with >>>> this whole series applied). >>> >>> It won't get that far because the suspend will be waiting to sync >>> filesystems. >>> Filesystems will be waiting on I/O. >>> I/O will be waiting on the error handler. >>> The error handler will be waiting on system_transition_mutex. >>> But system_transition_mutex is already held by PM core. >> >> Hi Adrian, >> >> You are right.... I missed the action of syncing filesystems... >> >> Using back host_sem in suspend_prepare()/resume_complete() won't have >> this >> problem of deadlock, right? > > I am not sure, but what was problem that the V3 patch was fixing? > Can you give an example? V3 was moving host_sem from wl_system_suspend/resume() to ufshcd_suspend_prepare()/ufshcd_resume_complete(). It is to make sure error handling does not run concurrenly with system PM, since error handling is recovering/clearing runtime PM errors of all the scsi devices under hba (in patch #8). Having the error handling doing so (in patch 8) is because runtime PM framework may save the runtime errors of the supplier to one or more consumers ( unlike the children - parent relationship), for example if wlu resume fails, sda and/or other scsi devices may save the resume error, then they will be left runtime suspended permanently. Thanks, Can Guo. > >> >> Thanks, >> >> Can Guo. >> >>> >>>> >>>> case UFSHCD_STATE_EH_SCHEDULED_FATAL: >>>> /* >>>> * ufshcd_rpm_get_sync() is used at error handling >>>> preparation >>>> * stage. If a scsi cmd, e.g., the SSU cmd, is sent >>>> from the >>>> * PM ops, it can never be finished if we let SCSI >>>> layer keep >>>> * retrying it, which gets err handler stuck >>>> forever. Neither >>>> * can we let the scsi cmd pass through, because UFS >>>> is in bad >>>> * state, the scsi cmd may eventually time out, >>>> which will get >>>> * err handler blocked for too long. So, just fail >>>> the scsi cmd >>>> * sent from PM ops, err handler can recover PM >>>> error anyways. >>>> */ >>>> if (cmd->request->rq_flags & RQF_PM) { >>>> hba->force_reset = true; >>>> set_host_byte(cmd, DID_BAD_TARGET); >>>> cmd->scsi_done(cmd); >>>> goto out; >>>> } >>>> fallthrough; >>>> case UFSHCD_STATE_RESET: >>>> >>>> Thanks, >>>> >>>> Can Guo. >>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Can Guo. >>>>>> >>>>>>> >>>>>>>> ufshcd_rpm_get_sync(hba); >>>>>>>> if >>>>>>>> (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) >>>>>>>> || >>>>>>>> hba->is_wlu_sys_suspended) { >>>>>>>> @@ -5951,6 +5956,7 @@ static void >>>>>>>> ufshcd_err_handling_unprepare(struct ufs_hba *hba) >>>>>>>> ufshcd_clk_scaling_suspend(hba, false); >>>>>>>> ufshcd_clear_ua_wluns(hba); >>>>>>>> ufshcd_rpm_put(hba); >>>>>>>> + unlock_system_sleep(); >>>>>>>> } >>>>>>>> >>>>>>>> static inline bool ufshcd_err_handling_should_stop(struct >>>>>>>> ufs_hba *hba) >>>>>>>> @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct >>>>>>>> device *dev) >>>>>>>> ktime_t start = ktime_get(); >>>>>>>> >>>>>>>> hba = shost_priv(sdev->host); >>>>>>>> - down(&hba->host_sem); >>>>>>>> >>>>>>>> if (pm_runtime_suspended(dev)) >>>>>>>> goto out; >>>>>>>> >>>>>>>> ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); >>>>>>>> - if (ret) { >>>>>>>> + if (ret) >>>>>>>> dev_err(&sdev->sdev_gendev, "%s failed: %d\n", >>>>>>>> __func__, ret); >>>>>>>> - up(&hba->host_sem); >>>>>>>> - } >>>>>>>> >>>>>>>> out: >>>>>>>> if (!ret) >>>>>>>> @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device >>>>>>>> *dev) >>>>>>>> hba->curr_dev_pwr_mode, hba->uic_link_state); >>>>>>>> if (!ret) >>>>>>>> hba->is_wlu_sys_suspended = false; >>>>>>>> - up(&hba->host_sem); >>>>>>>> return ret; >>>>>>>> } >>>>>>>> #endif >>>>>>>>
Hi Bart, On 2021-06-25 01:11, Bart Van Assche wrote: > On 6/23/21 11:31 PM, Can Guo wrote: >> Using back host_sem in suspend_prepare()/resume_complete() won't have >> this >> problem of deadlock, right? > > Although that would solve the deadlock discussed in this email thread, > it > wouldn't solve the issue of potential adverse interactions of the UFS > error > handler and the SCSI error handler running concurrently. I think I've explained it before, paste it here - ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and flushes it, so SCSI error handler and UFS error handler can safely run together. > How about using the > standard approach for invoking the UFS error handler instead of using a > custom > mechanism, e.g. by using something like the (untested) patch below? > This > approach guarantees that the UFS error handler is only activated after > all > pending SCSI commands have failed or timed out and also guarantees that > no new > SCSI commands will be queued while the UFS error handler is in progress > (see > also scsi_host_queue_ready()). Per my understanding, SCSI error handling is scsi cmd based, meaning it only works when certain SCSI cmds failed and are added to shost->eh_cmd_q (by calling scsi_eh_scmd_add(struct scsi_cmnd *scmd)). scsi_schedule_eh() -> scsi_error_handler() -> scsi_unjam_host() where scsi_unjam_host() may or may NOT invoke scsi_eh_ready_devs(), and scsi_eh_ready_devs() invokes ufshcd_eh_host_reset_handler(). 2140 static void scsi_unjam_host(struct Scsi_Host *shost) 2141 { 2142 unsigned long flags; 2143 LIST_HEAD(eh_work_q); 2144 LIST_HEAD(eh_done_q); 2145 2146 spin_lock_irqsave(shost->host_lock, flags); 2147 list_splice_init(&shost->eh_cmd_q, &eh_work_q); 2148 spin_unlock_irqrestore(shost->host_lock, flags); ... 2152 if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q)) 2153 scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q); ... However, most UFS (UIC) errors happens during gear scaling, clk gating and suspend/resume (due to power mode changes and/or hibern8 enter/exit), during which there is NO scsi cmds in UFS driver at all (because these contexts start only when there is no ongoing data transactions). Thus, scsi_unjam_host() won't even call scsi_eh_ready_devs() because scsi_eh_get_sense() always returns TRUE in these cases (eh_work_q is empty). Thanks, Can Guo. > > Thanks, > > Bart. > > diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c > index 0ac1e15ac914..c6303ea9e5db 100644 > --- a/drivers/scsi/ufs/ufshcd.c > +++ b/drivers/scsi/ufs/ufshcd.c > @@ -17,6 +17,8 @@ > #include <linux/blk-pm.h> > #include <linux/blkdev.h> > #include <scsi/scsi_driver.h> > +#include <scsi/scsi_transport.h> > +#include "../scsi_transport_api.h" > #include "ufshcd.h" > #include "ufs_quirks.h" > #include "unipro.h" > @@ -233,7 +235,6 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, > bool scale_up); > static irqreturn_t ufshcd_intr(int irq, void *__hba); > static int ufshcd_change_power_mode(struct ufs_hba *hba, > struct ufs_pa_layer_attr *pwr_mode); > -static void ufshcd_schedule_eh_work(struct ufs_hba *hba); > static int ufshcd_setup_hba_vreg(struct ufs_hba *hba, bool on); > static int ufshcd_setup_vreg(struct ufs_hba *hba, bool on); > static inline int ufshcd_config_vreg_hpm(struct ufs_hba *hba, > @@ -2697,26 +2698,7 @@ static int ufshcd_queuecommand(struct Scsi_Host > *host, struct scsi_cmnd *cmd) > > switch (hba->ufshcd_state) { > case UFSHCD_STATE_OPERATIONAL: > - case UFSHCD_STATE_EH_SCHEDULED_NON_FATAL: > break; > - case UFSHCD_STATE_EH_SCHEDULED_FATAL: > - /* > - * pm_runtime_get_sync() is used at error handling preparation > - * stage. If a scsi cmd, e.g. the SSU cmd, is sent from hba's > - * PM ops, it can never be finished if we let SCSI layer keep > - * retrying it, which gets err handler stuck forever. Neither > - * can we let the scsi cmd pass through, because UFS is in bad > - * state, the scsi cmd may eventually time out, which will get > - * err handler blocked for too long. So, just fail the scsi cmd > - * sent from PM ops, err handler can recover PM error anyways. > - */ > - if (hba->pm_op_in_progress) { > - hba->force_reset = true; > - set_host_byte(cmd, DID_BAD_TARGET); > - cmd->scsi_done(cmd); > - goto out; > - } > - fallthrough; > case UFSHCD_STATE_RESET: > err = SCSI_MLQUEUE_HOST_BUSY; > goto out; > @@ -3954,6 +3936,7 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba > *hba, struct uic_command *cmd) > u8 status; > int ret; > bool reenable_intr = false; > + bool schedule_eh = false; > > mutex_lock(&hba->uic_cmd_mutex); > init_completion(&uic_async_done); > @@ -4021,10 +4004,12 @@ static int ufshcd_uic_pwr_ctrl(struct ufs_hba > *hba, struct uic_command *cmd) > ufshcd_enable_intr(hba, UIC_COMMAND_COMPL); > if (ret) { > ufshcd_set_link_broken(hba); > - ufshcd_schedule_eh_work(hba); > + schedule_eh = true; > } > out_unlock: > spin_unlock_irqrestore(hba->host->host_lock, flags); > + if (schedule_eh) > + scsi_schedule_eh(hba->host); > mutex_unlock(&hba->uic_cmd_mutex); > > return ret; > @@ -4085,8 +4070,6 @@ static bool ufshcd_set_state(struct ufs_hba > *hba, enum ufshcd_state new_state) > } > break; > case UFSHCD_STATE_RESET: > - case UFSHCD_STATE_EH_SCHEDULED_NON_FATAL: > - case UFSHCD_STATE_EH_SCHEDULED_FATAL: > allowed = hba->ufshcd_state != UFSHCD_STATE_ERROR; > break; > case UFSHCD_STATE_ERROR: > @@ -5886,22 +5869,6 @@ static inline bool > ufshcd_is_saved_err_fatal(struct ufs_hba *hba) > (hba->saved_err & (INT_FATAL_ERRORS | > UFSHCD_UIC_HIBERN8_MASK)); > } > > -/* host lock must be held before calling this func */ > -static inline void ufshcd_schedule_eh_work(struct ufs_hba *hba) > -{ > - bool proceed; > - > - if (hba->force_reset || ufshcd_is_link_broken(hba) || > - ufshcd_is_saved_err_fatal(hba)) > - proceed = ufshcd_set_state(hba, > - UFSHCD_STATE_EH_SCHEDULED_FATAL); > - else > - proceed = ufshcd_set_state(hba, > - UFSHCD_STATE_EH_SCHEDULED_NON_FATAL); > - if (proceed) > - queue_work(hba->eh_wq, &hba->eh_work); > -} > - > static void ufshcd_clk_scaling_allow(struct ufs_hba *hba, bool allow) > { > down_write(&hba->clk_scaling_lock); > @@ -5924,7 +5891,7 @@ static void ufshcd_clk_scaling_suspend(struct > ufs_hba *hba, bool suspend) > > static void ufshcd_err_handling_prepare(struct ufs_hba *hba) > { > - ufshcd_rpm_get_sync(hba); > + pm_runtime_get_noresume(&hba->sdev_ufs_device->sdev_gendev); > if (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) > || > hba->is_sys_suspended) { > enum ufs_pm_op pm_op; > @@ -6035,11 +6002,12 @@ static bool > ufshcd_is_pwr_mode_restore_needed(struct ufs_hba *hba) > > /** > * ufshcd_err_handler - handle UFS errors that require s/w attention > - * @work: pointer to work structure > + * @host: SCSI host pointer > */ > -static void ufshcd_err_handler(struct work_struct *work) > +static void ufshcd_err_handler(struct Scsi_Host *host) > { > - struct ufs_hba *hba; > + struct ufs_hba *hba = shost_priv(host); > + struct completion *eh_compl = NULL; > unsigned long flags; > bool err_xfer = false; > bool err_tm = false; > @@ -6047,10 +6015,10 @@ static void ufshcd_err_handler(struct > work_struct *work) > int tag; > bool needs_reset = false, needs_restore = false; > > - hba = container_of(work, struct ufs_hba, eh_work); > - > down(&hba->host_sem); > spin_lock_irqsave(hba->host->host_lock, flags); > + /* Clear host_eh_scheduled which has been set by scsi_schedule_eh(). > */ > + hba->host->host_eh_scheduled = 0; > if (ufshcd_err_handling_should_stop(hba)) { > ufshcd_set_state(hba, UFSHCD_STATE_OPERATIONAL); > spin_unlock_irqrestore(hba->host->host_lock, flags); > @@ -6202,9 +6170,12 @@ static void ufshcd_err_handler(struct > work_struct *work) > __func__, hba->saved_err, hba->saved_uic_err); > } > ufshcd_clear_eh_in_progress(hba); > + swap(hba->eh_compl, eh_compl); > spin_unlock_irqrestore(hba->host->host_lock, flags); > ufshcd_err_handling_unprepare(hba); > up(&hba->host_sem); > + if (eh_compl) > + complete(eh_compl); > } > > /** > @@ -6361,7 +6332,6 @@ static irqreturn_t ufshcd_check_errors(struct > ufs_hba *hba, u32 intr_status) > "host_regs: "); > ufshcd_print_pwr_info(hba); > } > - ufshcd_schedule_eh_work(hba); > retval |= IRQ_HANDLED; > } > /* > @@ -6373,6 +6343,8 @@ static irqreturn_t ufshcd_check_errors(struct > ufs_hba *hba, u32 intr_status) > hba->errors = 0; > hba->uic_error = 0; > spin_unlock(hba->host->host_lock); > + if (queue_eh_work) > + scsi_schedule_eh(hba->host); > return retval; > } > > @@ -7045,8 +7017,8 @@ static int ufshcd_abort(struct scsi_cmnd *cmd) > set_bit(tag, &hba->outstanding_reqs); > spin_lock_irqsave(host->host_lock, flags); > hba->force_reset = true; > - ufshcd_schedule_eh_work(hba); > spin_unlock_irqrestore(host->host_lock, flags); > + scsi_schedule_eh(hba->host); > goto out; > } > > @@ -7172,7 +7144,8 @@ static int ufshcd_reset_and_restore(struct > ufs_hba *hba) > */ > static int ufshcd_eh_host_reset_handler(struct scsi_cmnd *cmd) > { > - int err = SUCCESS; > + DECLARE_COMPLETION_ONSTACK(eh_compl); > + int err = SUCCESS, res; > unsigned long flags; > struct ufs_hba *hba; > > @@ -7180,11 +7153,15 @@ static int ufshcd_eh_host_reset_handler(struct > scsi_cmnd *cmd) > > spin_lock_irqsave(hba->host->host_lock, flags); > hba->force_reset = true; > - ufshcd_schedule_eh_work(hba); > + hba->eh_compl = &eh_compl; > dev_err(hba->dev, "%s: reset in progress - 1\n", __func__); > spin_unlock_irqrestore(hba->host->host_lock, flags); > > - flush_work(&hba->eh_work); > + scsi_schedule_eh(hba->host); > + res = wait_for_completion_interruptible_timeout(&eh_compl, 180 * HZ); > + WARN_ONCE(res <= 0, > + "wait_for_completion_interruptible_timeout() returned %d", > + res); > > spin_lock_irqsave(hba->host->host_lock, flags); > if (hba->ufshcd_state == UFSHCD_STATE_ERROR) > @@ -8511,8 +8488,6 @@ static void ufshcd_hba_exit(struct ufs_hba *hba) > if (hba->is_powered) { > ufshcd_exit_clk_scaling(hba); > ufshcd_exit_clk_gating(hba); > - if (hba->eh_wq) > - destroy_workqueue(hba->eh_wq); > ufs_debugfs_hba_exit(hba); > ufshcd_variant_hba_exit(hba); > ufshcd_setup_vreg(hba, false); > @@ -9371,6 +9346,10 @@ static int ufshcd_set_dma_mask(struct ufs_hba > *hba) > return dma_set_mask_and_coherent(hba->dev, DMA_BIT_MASK(32)); > } > > +static struct scsi_transport_template ufshcd_transport_template = { > + .eh_strategy_handler = ufshcd_err_handler, > +}; > + > /** > * ufshcd_alloc_host - allocate Host Bus Adapter (HBA) > * @dev: pointer to device handle > @@ -9397,6 +9376,7 @@ int ufshcd_alloc_host(struct device *dev, struct > ufs_hba **hba_handle) > err = -ENOMEM; > goto out_error; > } > + host->transportt = &ufshcd_transport_template; > hba = shost_priv(host); > hba->host = host; > hba->dev = dev; > @@ -9434,7 +9414,6 @@ int ufshcd_init(struct ufs_hba *hba, void > __iomem *mmio_base, unsigned int irq) > int err; > struct Scsi_Host *host = hba->host; > struct device *dev = hba->dev; > - char eh_wq_name[sizeof("ufs_eh_wq_00")]; > > if (!mmio_base) { > dev_err(hba->dev, > @@ -9488,17 +9467,6 @@ int ufshcd_init(struct ufs_hba *hba, void > __iomem *mmio_base, unsigned int irq) > > hba->max_pwr_info.is_valid = false; > > - /* Initialize work queues */ > - snprintf(eh_wq_name, sizeof(eh_wq_name), "ufs_eh_wq_%d", > - hba->host->host_no); > - hba->eh_wq = create_singlethread_workqueue(eh_wq_name); > - if (!hba->eh_wq) { > - dev_err(hba->dev, "%s: failed to create eh workqueue\n", > - __func__); > - err = -ENOMEM; > - goto out_disable; > - } > - INIT_WORK(&hba->eh_work, ufshcd_err_handler); > INIT_WORK(&hba->eeh_work, ufshcd_exception_event_handler); > > sema_init(&hba->host_sem, 1); > diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h > index f2796ea25598..d4f7ab0171c6 100644 > --- a/drivers/scsi/ufs/ufshcd.h > +++ b/drivers/scsi/ufs/ufshcd.h > @@ -482,18 +482,12 @@ struct ufs_stats { > * processing. > * @UFSHCD_STATE_OPERATIONAL: The host controller is operational and > can process > * SCSI commands. > - * @UFSHCD_STATE_EH_SCHEDULED_NON_FATAL: The error handler has been > scheduled. > - * SCSI commands may be submitted to the controller. > - * @UFSHCD_STATE_EH_SCHEDULED_FATAL: The error handler has been > scheduled. Fail > - * newly submitted SCSI commands with error code DID_BAD_TARGET. > * @UFSHCD_STATE_ERROR: An unrecoverable error occurred, e.g. link > recovery > * failed. Fail all SCSI commands with error code DID_ERROR. > */ > enum ufshcd_state { > UFSHCD_STATE_RESET, > UFSHCD_STATE_OPERATIONAL, > - UFSHCD_STATE_EH_SCHEDULED_NON_FATAL, > - UFSHCD_STATE_EH_SCHEDULED_FATAL, > UFSHCD_STATE_ERROR, > }; > > @@ -715,8 +709,7 @@ struct ufs_hba_monitor { > * @is_powered: flag to check if HBA is powered > * @shutting_down: flag to check if shutdown has been invoked > * @host_sem: semaphore used to serialize concurrent contexts > - * @eh_wq: Workqueue that eh_work works on > - * @eh_work: Worker to handle UFS errors that require s/w attention > + * @eh_compl: Signaled by the UFS error handler after error handling > finished. > * @eeh_work: Worker to handle exception events > * @errors: HBA errors > * @uic_error: UFS interconnect layer error status > @@ -817,9 +810,7 @@ struct ufs_hba { > bool shutting_down; > struct semaphore host_sem; > > - /* Work Queues */ > - struct workqueue_struct *eh_wq; > - struct work_struct eh_work; > + struct completion *eh_compl; > struct work_struct eeh_work; > > /* HBA Errors */
On 6/28/21 1:17 AM, Can Guo wrote: > On 2021-06-25 01:11, Bart Van Assche wrote: >> On 6/23/21 11:31 PM, Can Guo wrote: >>> Using back host_sem in suspend_prepare()/resume_complete() won't have >>> this problem of deadlock, right? >> >> Although that would solve the deadlock discussed in this email thread, it >> wouldn't solve the issue of potential adverse interactions of the UFS >> error handler and the SCSI error handler running concurrently. > > I think I've explained it before, paste it here - > > ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and flushes it, > so SCSI error handler and UFS error handler can safely run together. That code path is the exception. Do you agree that the following three functions all invoke the ufshcd_err_handler() function asynchronously? * ufshcd_uic_pwr_ctrl() * ufshcd_check_errors() * ufshcd_abort() >> How about using the >> standard approach for invoking the UFS error handler instead of using >> a custom >> mechanism, e.g. by using something like the (untested) patch below? This >> approach guarantees that the UFS error handler is only activated after >> all >> pending SCSI commands have failed or timed out and also guarantees >> that no new >> SCSI commands will be queued while the UFS error handler is in >> progress (see >> also scsi_host_queue_ready()). > > Per my understanding, SCSI error handling is scsi cmd based, meaning it > only works when certain SCSI cmds failed [ ... ] That is not completely correct. The SCSI error handler is activated if either all pending commands have failed or if it is scheduled explicitly. Please take a look at the host_eh_scheduled member variable, how it is used and also at scsi_schedule_eh(). The scsi_schedule_eh() function was introduced in 2006 and that the ATA code uses it since then to activate the SCSI error handler even if no commands are pending. See also the patch "SCSI: make scsi_implement_eh() generic API for SCSI transports". > However, most UFS (UIC) errors happens during gear scaling, clk gating > and suspend/resume (due to power mode changes and/or hibern8 > enter/exit), during which there is NO scsi cmds in UFS driver at all > (because these contexts start only when there is no ongoing data > transactions). Activating the SCSI error handler if no SCSI commands are in progress is supported by scsi_schedule_eh(). > Thus, scsi_unjam_host() won't even call scsi_eh_ready_devs() because > scsi_eh_get_sense() always returns TRUE in these cases (eh_work_q is > empty). Please take another look at the patch in my previous message. There is a scsi_transport_template instance in that patch. The eh_strategy_handler defined in a SCSI transport template is called *instead* of scsi_unjam_host(). In other words, scsi_unjam_host() won't be called if my patch would be applied to the UFS driver. Please let me know if you need more information. Bart.
On 2021-06-29 01:31, Bart Van Assche wrote: > On 6/28/21 1:17 AM, Can Guo wrote: >> On 2021-06-25 01:11, Bart Van Assche wrote: >>> On 6/23/21 11:31 PM, Can Guo wrote: >>>> Using back host_sem in suspend_prepare()/resume_complete() won't >>>> have >>>> this problem of deadlock, right? >>> >>> Although that would solve the deadlock discussed in this email >>> thread, it >>> wouldn't solve the issue of potential adverse interactions of the UFS >>> error handler and the SCSI error handler running concurrently. >> >> I think I've explained it before, paste it here - >> >> ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and >> flushes it, >> so SCSI error handler and UFS error handler can safely run together. > > That code path is the exception. Do you agree that the following three > functions all invoke the ufshcd_err_handler() function asynchronously? > * ufshcd_uic_pwr_ctrl() > * ufshcd_check_errors() > * ufshcd_abort() > I agree, but I don't see what's wrong with that. Any context can invoke ufs error handler asynchronously and ufs error handler prepare makes sure error handler can work safely, i.e., stopping PM ops/gating/scaling in error handler prepare makes sure no one shall call ufshcd_uic_pwr_ctrl() ever again. And ufshcd_check_errors() and ufshcd_abort() are OK to run concurrently with UFS error handler. >>> How about using the >>> standard approach for invoking the UFS error handler instead of using >>> a custom >>> mechanism, e.g. by using something like the (untested) patch below? >>> This >>> approach guarantees that the UFS error handler is only activated >>> after >>> all >>> pending SCSI commands have failed or timed out and also guarantees >>> that no new >>> SCSI commands will be queued while the UFS error handler is in >>> progress (see >>> also scsi_host_queue_ready()). >> >> Per my understanding, SCSI error handling is scsi cmd based, meaning >> it >> only works when certain SCSI cmds failed [ ... ] > That is not completely correct. The SCSI error handler is activated if > either all pending commands have failed or if it is scheduled > explicitly. Please take a look at the host_eh_scheduled member > variable, > how it is used and also at scsi_schedule_eh(). The scsi_schedule_eh() > function was introduced in 2006 and that the ATA code uses it since > then > to activate the SCSI error handler even if no commands are pending. See > also the patch "SCSI: make scsi_implement_eh() generic API for SCSI > transports". > >> However, most UFS (UIC) errors happens during gear scaling, clk gating >> and suspend/resume (due to power mode changes and/or hibern8 >> enter/exit), during which there is NO scsi cmds in UFS driver at all >> (because these contexts start only when there is no ongoing data >> transactions). > > Activating the SCSI error handler if no SCSI commands are in progress > is > supported by scsi_schedule_eh(). > >> Thus, scsi_unjam_host() won't even call scsi_eh_ready_devs() because >> scsi_eh_get_sense() always returns TRUE in these cases (eh_work_q is >> empty). > > Please take another look at the patch in my previous message. There is > a > scsi_transport_template instance in that patch. The eh_strategy_handler > defined in a SCSI transport template is called *instead* of > scsi_unjam_host(). In other words, scsi_unjam_host() won't be called if > my patch would be applied to the UFS driver. > > Please let me know if you need more information. Sorry that I missed the change of scsi_transport_template() in your previous message. I can understand that you want to invoke UFS error hander by invoking SCSI error handler, but I didn't go that far because I saw you changed pm_runtime_get_sync() to pm_runtime_get_noresume() in ufs error handler prepare. How can that change make sure that the device is not suspending or resuming while error handler is running? Thanks, Can Guo. > > Bart.
On 6/28/21 11:23 PM, Can Guo wrote: > On 2021-06-29 01:31, Bart Van Assche wrote: >> On 6/28/21 1:17 AM, Can Guo wrote: >>> On 2021-06-25 01:11, Bart Van Assche wrote: >>>> On 6/23/21 11:31 PM, Can Guo wrote: >>>>> Using back host_sem in suspend_prepare()/resume_complete() >>>>> won't have this problem of deadlock, right? >>>> >>>> Although that would solve the deadlock discussed in this email >>>> thread, it wouldn't solve the issue of potential adverse >>>> interactions of the UFS error handler and the SCSI error >>>> handler running concurrently. >>> >>> I think I've explained it before, paste it here - >>> >>> ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and >>> flushes it, so SCSI error handler and UFS error handler can >>> safely run together. >> >> That code path is the exception. Do you agree that the following >> three functions all invoke the ufshcd_err_handler() function >> asynchronously? * ufshcd_uic_pwr_ctrl() * ufshcd_check_errors() * >> ufshcd_abort() > > I agree, but I don't see what's wrong with that. Any context can > invoke ufs error handler asynchronously and ufs error handler prepare > makes sure error handler can work safely, i.e., stopping PM > ops/gating/scaling in error handler prepare makes sure no one shall > call ufshcd_uic_pwr_ctrl() ever again. And ufshcd_check_errors() and > ufshcd_abort() are OK to run concurrently with UFS error handler. The current UFS error handling approach requires the following code in ufshcd_queuecommand(): if (hba->pm_op_in_progress) { hba->force_reset = true; set_host_byte(cmd, DID_BAD_TARGET); cmd->scsi_done(cmd); goto out; } Removing that code is not possible with the current error handling approach. My patch makes it possible to remove that code. > Sorry that I missed the change of scsi_transport_template() in your > previous message. I can understand that you want to invoke UFS error > hander by invoking SCSI error handler, but I didn't go that far > because I saw you changed pm_runtime_get_sync() to > pm_runtime_get_noresume() in ufs error handler prepare. How can that > change make sure that the device is not suspending or resuming while > error handler is running? UFS power state transitions happen by submitting a SCSI command to a WLUN. The SCSI error handler is only activated after all outstanding SCSI commands for a SCSI host have failed or completed. I think this guarantees for the UFS driver that eh_strategy_handler is not invoked while a command submitted to a WLUN is changing the power state of the UFS device. The following code from scsi_error.c only wakes up the error handler if (shost->host_failed || shost->host_eh_scheduled) && shost->host_failed == scsi_host_busy(shost): if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) || shost->host_failed != scsi_host_busy(shost)) { schedule(); continue; } /* Handle SCSI errors */ Thanks, Bart.
On 2021-06-30 02:01, Bart Van Assche wrote: > On 6/28/21 11:23 PM, Can Guo wrote: >> On 2021-06-29 01:31, Bart Van Assche wrote: >>> On 6/28/21 1:17 AM, Can Guo wrote: >>>> On 2021-06-25 01:11, Bart Van Assche wrote: >>>>> On 6/23/21 11:31 PM, Can Guo wrote: >>>>>> Using back host_sem in suspend_prepare()/resume_complete() >>>>>> won't have this problem of deadlock, right? >>>>> >>>>> Although that would solve the deadlock discussed in this email >>>>> thread, it wouldn't solve the issue of potential adverse >>>>> interactions of the UFS error handler and the SCSI error >>>>> handler running concurrently. >>>> >>>> I think I've explained it before, paste it here - >>>> >>>> ufshcd_eh_host_reset_handler() invokes ufshcd_err_handler() and >>>> flushes it, so SCSI error handler and UFS error handler can >>>> safely run together. >>> >>> That code path is the exception. Do you agree that the following >>> three functions all invoke the ufshcd_err_handler() function >>> asynchronously? * ufshcd_uic_pwr_ctrl() * ufshcd_check_errors() * >>> ufshcd_abort() >> >> I agree, but I don't see what's wrong with that. Any context can >> invoke ufs error handler asynchronously and ufs error handler prepare >> makes sure error handler can work safely, i.e., stopping PM >> ops/gating/scaling in error handler prepare makes sure no one shall >> call ufshcd_uic_pwr_ctrl() ever again. And ufshcd_check_errors() and >> ufshcd_abort() are OK to run concurrently with UFS error handler. > > The current UFS error handling approach requires the following code in > ufshcd_queuecommand(): > > if (hba->pm_op_in_progress) { > hba->force_reset = true; > set_host_byte(cmd, DID_BAD_TARGET); > cmd->scsi_done(cmd); > goto out; > } > > Removing that code is not possible with the current error handling > approach. My patch makes it possible to remove that code. > >> Sorry that I missed the change of scsi_transport_template() in your >> previous message. I can understand that you want to invoke UFS error >> hander by invoking SCSI error handler, but I didn't go that far >> because I saw you changed pm_runtime_get_sync() to >> pm_runtime_get_noresume() in ufs error handler prepare. How can that >> change make sure that the device is not suspending or resuming while >> error handler is running? > > UFS power state transitions happen by submitting a SCSI command to a > WLUN. The SCSI error handler is only activated after all outstanding > SCSI commands for a SCSI host have failed or completed. I think this > guarantees for the UFS driver that eh_strategy_handler is not invoked > while a command submitted to a WLUN is changing the power state of the > UFS device. The following code from scsi_error.c only wakes up the > error > handler if (shost->host_failed || shost->host_eh_scheduled) && > shost->host_failed == scsi_host_busy(shost): > > if ((shost->host_failed == 0 && shost->host_eh_scheduled == 0) > || shost->host_failed != scsi_host_busy(shost)) { > schedule(); > continue; > } > /* Handle SCSI errors */ > It is not completely right - wl_suspend/resume() are much more twisted. wl_suspend() may or may NOT send a SCSI cmd to WLUN, i.e., SSU cmd may be skipped if spm/rpm_lvl is 0/1 and/or if bkops/wb is on-going (even when rpm_lvl is not 0/1), while link can still be put to hibern8/off, then power/clks can still be shutdown to save power. wl_resume(), in case of rpm/spm_lvl == 5, does a full reset to UFS device, without sending a SSU cmd to WLU to complete the power state transition. So above checks (in scsi_error_handler()) cannot gaurantee that actual power state transistions in UFS driver has ceased before start UFS error handling. Thanks, Can Guo. > Thanks, > > Bart.
On 28/06/21 10:26 am, Can Guo wrote: > On 2021-06-24 18:04, Adrian Hunter wrote: >> On 24/06/21 9:31 am, Can Guo wrote: >>> On 2021-06-24 14:23, Adrian Hunter wrote: >>>> On 24/06/21 9:12 am, Can Guo wrote: >>>>> On 2021-06-24 13:52, Adrian Hunter wrote: >>>>>> On 24/06/21 5:16 am, Can Guo wrote: >>>>>>> On 2021-06-23 22:30, Adrian Hunter wrote: >>>>>>>> On 23/06/21 10:35 am, Can Guo wrote: >>>>>>>>> To protect system suspend/resume from being disturbed by error handling, >>>>>>>>> instead of using host_sem, let error handler call lock_system_sleep() and >>>>>>>>> unlock_system_sleep() which achieve the same purpose. Remove the host_sem >>>>>>>>> used in suspend/resume paths to make the code more readable. >>>>>>>>> >>>>>>>>> Suggested-by: Bart Van Assche <bvanassche@acm.org> >>>>>>>>> Signed-off-by: Can Guo <cang@codeaurora.org> >>>>>>>>> --- >>>>>>>>> drivers/scsi/ufs/ufshcd.c | 12 +++++++----- >>>>>>>>> 1 file changed, 7 insertions(+), 5 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c >>>>>>>>> index 3695dd2..a09e4a2 100644 >>>>>>>>> --- a/drivers/scsi/ufs/ufshcd.c >>>>>>>>> +++ b/drivers/scsi/ufs/ufshcd.c >>>>>>>>> @@ -5907,6 +5907,11 @@ static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend) >>>>>>>>> >>>>>>>>> static void ufshcd_err_handling_prepare(struct ufs_hba *hba) >>>>>>>>> { >>>>>>>>> + /* >>>>>>>>> + * It is not safe to perform error handling while suspend or resume is >>>>>>>>> + * in progress. Hence the lock_system_sleep() call. >>>>>>>>> + */ >>>>>>>>> + lock_system_sleep(); >>>>>>>> >>>>>>>> It looks to me like the system takes this lock quite early, even before >>>>>>>> freezing tasks, so if anything needs the error handler to run it will >>>>>>>> deadlock. >>>>>>> >>>>>>> Hi Adrian, >>>>>>> >>>>>>> UFS/hba system suspend/resume does not invoke or call error handling in a >>>>>>> synchronous way. So, whatever UFS errors (which schedules the error handler) >>>>>>> happens during suspend/resume, error handler will just wait here till system >>>>>>> suspend/resume release the lock. Hence no worries of deadlock here. >>>>>> >>>>>> It looks to me like the state can change to UFSHCD_STATE_EH_SCHEDULED_FATAL >>>>>> and since user processes are not frozen, nor file systems sync'ed, everything >>>>>> is going to deadlock. >>>>>> i.e. >>>>>> I/O is blocked waiting on error handling >>>>>> error handling is blocked waiting on lock_system_sleep() >>>>>> suspend is blocked waiting on I/O >>>>>> >>>>> >>>>> Hi Adrian, >>>>> >>>>> First of all, enter_state(suspend_state_t state) uses mutex_trylock(&system_transition_mutex). >>>> >>>> Yes, in the case I am outlining it gets the mutex. >>>> >>>>> Second, even that happens, in ufshcd_queuecommand(), below logic will break the cycle, by >>>>> fast failing the PM request (below codes are from the code tip with this whole series applied). >>>> >>>> It won't get that far because the suspend will be waiting to sync filesystems. >>>> Filesystems will be waiting on I/O. >>>> I/O will be waiting on the error handler. >>>> The error handler will be waiting on system_transition_mutex. >>>> But system_transition_mutex is already held by PM core. >>> >>> Hi Adrian, >>> >>> You are right.... I missed the action of syncing filesystems... >>> >>> Using back host_sem in suspend_prepare()/resume_complete() won't have this >>> problem of deadlock, right? >> >> I am not sure, but what was problem that the V3 patch was fixing? >> Can you give an example? > > V3 was moving host_sem from wl_system_suspend/resume() to > ufshcd_suspend_prepare()/ufshcd_resume_complete(). It is to > make sure error handling does not run concurrenly with system > PM, since error handling is recovering/clearing runtime PM > errors of all the scsi devices under hba (in patch #8). Having the > error handling doing so (in patch 8) is because runtime PM framework > may save the runtime errors of the supplier to one or more consumers ( > unlike the children - parent relationship), for example if wlu resume > fails, sda and/or other scsi devices may save the resume error, then > they will be left runtime suspended permanently. Sorry for the slow reply. I was going to do some more investigation but never found time. I was wondering if it would be simpler to do the error recovery for wl_system_suspend/resume() before exiting wl_system_suspend/resume(). Then it would be possible to do something along the lines: - prevent runtime suspend while the error handler is outstanding - at suspend, block queuing of the error handler work and flush it - at resume, allow queuing of the error handler work
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c index 3695dd2..a09e4a2 100644 --- a/drivers/scsi/ufs/ufshcd.c +++ b/drivers/scsi/ufs/ufshcd.c @@ -5907,6 +5907,11 @@ static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend) static void ufshcd_err_handling_prepare(struct ufs_hba *hba) { + /* + * It is not safe to perform error handling while suspend or resume is + * in progress. Hence the lock_system_sleep() call. + */ + lock_system_sleep(); ufshcd_rpm_get_sync(hba); if (pm_runtime_status_suspended(&hba->sdev_ufs_device->sdev_gendev) || hba->is_wlu_sys_suspended) { @@ -5951,6 +5956,7 @@ static void ufshcd_err_handling_unprepare(struct ufs_hba *hba) ufshcd_clk_scaling_suspend(hba, false); ufshcd_clear_ua_wluns(hba); ufshcd_rpm_put(hba); + unlock_system_sleep(); } static inline bool ufshcd_err_handling_should_stop(struct ufs_hba *hba) @@ -9053,16 +9059,13 @@ static int ufshcd_wl_suspend(struct device *dev) ktime_t start = ktime_get(); hba = shost_priv(sdev->host); - down(&hba->host_sem); if (pm_runtime_suspended(dev)) goto out; ret = __ufshcd_wl_suspend(hba, UFS_SYSTEM_PM); - if (ret) { + if (ret) dev_err(&sdev->sdev_gendev, "%s failed: %d\n", __func__, ret); - up(&hba->host_sem); - } out: if (!ret) @@ -9095,7 +9098,6 @@ static int ufshcd_wl_resume(struct device *dev) hba->curr_dev_pwr_mode, hba->uic_link_state); if (!ret) hba->is_wlu_sys_suspended = false; - up(&hba->host_sem); return ret; } #endif
To protect system suspend/resume from being disturbed by error handling, instead of using host_sem, let error handler call lock_system_sleep() and unlock_system_sleep() which achieve the same purpose. Remove the host_sem used in suspend/resume paths to make the code more readable. Suggested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Can Guo <cang@codeaurora.org> --- drivers/scsi/ufs/ufshcd.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)