
[RFC,2/2] scsi: ufshcd: Fix device links when BOOT WLUN fails to probe

Message ID 20210707172948.1025-3-adrian.hunter@intel.com (mailing list archive)
State Superseded
Series driver core: Add ability to delete device links of unregistered devices

Commit Message

Adrian Hunter July 7, 2021, 5:29 p.m. UTC
If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
been registered but can still have a device link holding a reference to the
device. The unwanted device link will prevent runtime suspend indefinitely,
and cause some warnings if the supplier is ever deleted (e.g. by unbinding
the UFS host controller). Fix by explicitly deleting the device link when
SCSI destroys the SCSI device.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 drivers/scsi/ufs/ufshcd.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Greg KH July 7, 2021, 5:39 p.m. UTC | #1
On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
> been registered but can still have a device link holding a reference to the
> device. The unwanted device link will prevent runtime suspend indefinitely,
> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
> the UFS host controller). Fix by explicitly deleting the device link when
> SCSI destroys the SCSI device.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index 708b3b62fc4d..483aa74fe2c8 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
>  		spin_lock_irqsave(hba->host->host_lock, flags);
>  		hba->sdev_ufs_device = NULL;
>  		spin_unlock_irqrestore(hba->host->host_lock, flags);
> +	} else {
> +		/*
> +		 * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
> +		 * will not have been registered but can still have a device
> +		 * link holding a reference to the device.
> +		 */
> +		device_links_scrap(&sdev->sdev_gendev);

What created that link?  And why did it do that before probe happened
successfully?

thanks,

greg k-h
Adrian Hunter July 7, 2021, 5:49 p.m. UTC | #2
On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
>> been registered but can still have a device link holding a reference to the
>> device. The unwanted device link will prevent runtime suspend indefinitely,
>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
>> the UFS host controller). Fix by explicitly deleting the device link when
>> SCSI destroys the SCSI device.
>>
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>> index 708b3b62fc4d..483aa74fe2c8 100644
>> --- a/drivers/scsi/ufs/ufshcd.c
>> +++ b/drivers/scsi/ufs/ufshcd.c
>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
>>  		spin_lock_irqsave(hba->host->host_lock, flags);
>>  		hba->sdev_ufs_device = NULL;
>>  		spin_unlock_irqrestore(hba->host->host_lock, flags);
>> +	} else {
>> +		/*
>> +		 * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
>> +		 * will not have been registered but can still have a device
>> +		 * link holding a reference to the device.
>> +		 */
>> +		device_links_scrap(&sdev->sdev_gendev);
> 
> What created that link?  And why did it do that before probe happened
> successfully?

The same driver created the link.

The documentation seems to say it is allowed to, if it is the consumer.
From Documentation/driver-api/device_link.rst

  Usage
  =====

  The earliest point in time when device links can be added is after
  :c:func:`device_add()` has been called for the supplier and
  :c:func:`device_initialize()` has been called for the consumer.
Rafael J. Wysocki July 8, 2021, 12:31 p.m. UTC | #3
On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
> > On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
> >> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
> >> been registered but can still have a device link holding a reference to the
> >> device. The unwanted device link will prevent runtime suspend indefinitely,
> >> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
> >> the UFS host controller). Fix by explicitly deleting the device link when
> >> SCSI destroys the SCSI device.
> >>
> >> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> >> ---
> >>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> >> index 708b3b62fc4d..483aa74fe2c8 100644
> >> --- a/drivers/scsi/ufs/ufshcd.c
> >> +++ b/drivers/scsi/ufs/ufshcd.c
> >> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
> >>              spin_lock_irqsave(hba->host->host_lock, flags);
> >>              hba->sdev_ufs_device = NULL;
> >>              spin_unlock_irqrestore(hba->host->host_lock, flags);
> >> +    } else {
> >> +            /*
> >> +             * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
> >> +             * will not have been registered but can still have a device
> >> +             * link holding a reference to the device.
> >> +             */
> >> +            device_links_scrap(&sdev->sdev_gendev);
> >
> > What created that link?  And why did it do that before probe happened
> > successfully?
>
> The same driver created the link.
>
> The documentation seems to say it is allowed to, if it is the consumer.
> From Documentation/driver-api/device_link.rst
>
>   Usage
>   =====
>
>   The earliest point in time when device links can be added is after
>   :c:func:`device_add()` has been called for the supplier and
>   :c:func:`device_initialize()` has been called for the consumer.

Yes, this is allowed, but if you've added device links to a device
object that is not going to be registered after all, you are
responsible for doing the cleanup.

Why can't you call device_link_del() directly on those links?

Or device_link_remove() if you don't want to deal with link pointers?
Adrian Hunter July 8, 2021, 2:17 p.m. UTC | #4
On 8/07/21 3:31 pm, Rafael J. Wysocki wrote:
> On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
>>> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
>>>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
>>>> been registered but can still have a device link holding a reference to the
>>>> device. The unwanted device link will prevent runtime suspend indefinitely,
>>>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
>>>> the UFS host controller). Fix by explicitly deleting the device link when
>>>> SCSI destroys the SCSI device.
>>>>
>>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>>>> ---
>>>>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
>>>>  1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>>>> index 708b3b62fc4d..483aa74fe2c8 100644
>>>> --- a/drivers/scsi/ufs/ufshcd.c
>>>> +++ b/drivers/scsi/ufs/ufshcd.c
>>>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
>>>>              spin_lock_irqsave(hba->host->host_lock, flags);
>>>>              hba->sdev_ufs_device = NULL;
>>>>              spin_unlock_irqrestore(hba->host->host_lock, flags);
>>>> +    } else {
>>>> +            /*
>>>> +             * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
>>>> +             * will not have been registered but can still have a device
>>>> +             * link holding a reference to the device.
>>>> +             */
>>>> +            device_links_scrap(&sdev->sdev_gendev);
>>>
>>> What created that link?  And why did it do that before probe happened
>>> successfully?
>>
>> The same driver created the link.
>>
>> The documentation seems to say it is allowed to, if it is the consumer.
>> From Documentation/driver-api/device_link.rst
>>
>>   Usage
>>   =====
>>
>>   The earliest point in time when device links can be added is after
>>   :c:func:`device_add()` has been called for the supplier and
>>   :c:func:`device_initialize()` has been called for the consumer.
> 
> Yes, this is allowed, but if you've added device links to a device
> object that is not going to be registered after all, you are
> responsible for doing the cleanup.
> 
> Why can't you call device_link_del() directly on those links?
> 
> Or device_link_remove() if you don't want to deal with link pointers?
> 

Those only work for DL_FLAG_STATELESS device links, but we use only
DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE flags.
Rafael J. Wysocki July 8, 2021, 3:03 p.m. UTC | #5
On Thu, Jul 8, 2021 at 4:17 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 8/07/21 3:31 pm, Rafael J. Wysocki wrote:
> > On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>
> >> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
> >>> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
> >>>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
> >>>> been registered but can still have a device link holding a reference to the
> >>>> device. The unwanted device link will prevent runtime suspend indefinitely,
> >>>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
> >>>> the UFS host controller). Fix by explicitly deleting the device link when
> >>>> SCSI destroys the SCSI device.
> >>>>
> >>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> >>>> ---
> >>>>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
> >>>>  1 file changed, 7 insertions(+)
> >>>>
> >>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> >>>> index 708b3b62fc4d..483aa74fe2c8 100644
> >>>> --- a/drivers/scsi/ufs/ufshcd.c
> >>>> +++ b/drivers/scsi/ufs/ufshcd.c
> >>>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
> >>>>              spin_lock_irqsave(hba->host->host_lock, flags);
> >>>>              hba->sdev_ufs_device = NULL;
> >>>>              spin_unlock_irqrestore(hba->host->host_lock, flags);
> >>>> +    } else {
> >>>> +            /*
> >>>> +             * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
> >>>> +             * will not have been registered but can still have a device
> >>>> +             * link holding a reference to the device.
> >>>> +             */
> >>>> +            device_links_scrap(&sdev->sdev_gendev);
> >>>
> >>> What created that link?  And why did it do that before probe happened
> >>> successfully?
> >>
> >> The same driver created the link.
> >>
> >> The documentation seems to say it is allowed to, if it is the consumer.
> >> From Documentation/driver-api/device_link.rst
> >>
> >>   Usage
> >>   =====
> >>
> >>   The earliest point in time when device links can be added is after
> >>   :c:func:`device_add()` has been called for the supplier and
> >>   :c:func:`device_initialize()` has been called for the consumer.
> >
> > Yes, this is allowed, but if you've added device links to a device
> > object that is not going to be registered after all, you are
> > responsible for doing the cleanup.
> >
> > Why can't you call device_link_del() directly on those links?
> >
> > Or device_link_remove() if you don't want to deal with link pointers?
> >
>
> Those only work for DL_FLAG_STATELESS device links, but we use only
> DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE flags.

So I'd probably modify device_link_remove() to check if the consumer
device has been registered and run __device_link_del() directly
instead of device_link_put_kref() if it hasn't.

Or add an argument to it to force the removal.
Rafael J. Wysocki July 8, 2021, 3:12 p.m. UTC | #6
On Thu, Jul 8, 2021 at 5:03 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Jul 8, 2021 at 4:17 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >
> > On 8/07/21 3:31 pm, Rafael J. Wysocki wrote:
> > > On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > >>
> > >> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
> > >>> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
> > >>>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
> > >>>> been registered but can still have a device link holding a reference to the
> > >>>> device. The unwanted device link will prevent runtime suspend indefinitely,
> > >>>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
> > >>>> the UFS host controller). Fix by explicitly deleting the device link when
> > >>>> SCSI destroys the SCSI device.
> > >>>>
> > >>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> > >>>> ---
> > >>>>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
> > >>>>  1 file changed, 7 insertions(+)
> > >>>>
> > >>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > >>>> index 708b3b62fc4d..483aa74fe2c8 100644
> > >>>> --- a/drivers/scsi/ufs/ufshcd.c
> > >>>> +++ b/drivers/scsi/ufs/ufshcd.c
> > >>>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
> > >>>>              spin_lock_irqsave(hba->host->host_lock, flags);
> > >>>>              hba->sdev_ufs_device = NULL;
> > >>>>              spin_unlock_irqrestore(hba->host->host_lock, flags);
> > >>>> +    } else {
> > >>>> +            /*
> > >>>> +             * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
> > >>>> +             * will not have been registered but can still have a device
> > >>>> +             * link holding a reference to the device.
> > >>>> +             */
> > >>>> +            device_links_scrap(&sdev->sdev_gendev);
> > >>>
> > >>> What created that link?  And why did it do that before probe happened
> > >>> successfully?
> > >>
> > >> The same driver created the link.
> > >>
> > >> The documentation seems to say it is allowed to, if it is the consumer.
> > >> From Documentation/driver-api/device_link.rst
> > >>
> > >>   Usage
> > >>   =====
> > >>
> > >>   The earliest point in time when device links can be added is after
> > >>   :c:func:`device_add()` has been called for the supplier and
> > >>   :c:func:`device_initialize()` has been called for the consumer.
> > >
> > > Yes, this is allowed, but if you've added device links to a device
> > > object that is not going to be registered after all, you are
> > > responsible for doing the cleanup.
> > >
> > > Why can't you call device_link_del() directly on those links?
> > >
> > > Or device_link_remove() if you don't want to deal with link pointers?
> > >
> >
> > Those only work for DL_FLAG_STATELESS device links, but we use only
> > DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE flags.
>
> So I'd probably modify device_link_remove() to check if the consumer
> device has been registered and run __device_link_del() directly
> instead of device_link_put_kref() if it hasn't.
>
> Or add an argument to it to force the removal.

Or even modify device_link_put_kref() like this:

 static void device_link_put_kref(struct device_link *link)
 {
          if (link->flags & DL_FLAG_STATELESS)
                  kref_put(&link->kref, __device_link_del);
+        else if (!device_is_registered(link->consumer))
+                __device_link_del(link);
          else
                 WARN(1, "Unable to drop a managed device link reference\n");
 }
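
Editorial sketch, not part of the original message: the three branches of
the device_link_put_kref() variant proposed above can be modelled in plain
userspace C. Everything prefixed model_ or MODEL_ is hypothetical, the flag
bit value is arbitrary, and a plain counter stands in for struct kref. It
only illustrates the decision order; it is not the kernel implementation.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag bit for this model only; not the kernel's value. */
#define MODEL_DL_FLAG_STATELESS (1u << 0)

/* Minimal stand-ins for struct device and struct device_link. */
struct model_device {
	bool registered;              /* models device_is_registered() */
};

struct model_link {
	unsigned int flags;
	unsigned int kref;            /* plain counter instead of struct kref */
	struct model_device *consumer;
	bool deleted;                 /* set where __device_link_del() would run */
};

enum model_put_result {
	MODEL_PUT_KREF_DROPPED,       /* stateless link: one reference dropped */
	MODEL_PUT_DELETED,            /* managed link, consumer never registered */
	MODEL_PUT_WARNED              /* managed link, consumer registered */
};

static void model_link_del(struct model_link *link)
{
	link->deleted = true;
}

static enum model_put_result model_link_put_kref(struct model_link *link)
{
	if (link->flags & MODEL_DL_FLAG_STATELESS) {
		/* Stateless links are reference counted; delete on last put. */
		if (--link->kref == 0)
			model_link_del(link);
		return MODEL_PUT_KREF_DROPPED;
	}
	if (!link->consumer->registered) {
		/* Proposed new branch: drop the managed link right away. */
		model_link_del(link);
		return MODEL_PUT_DELETED;
	}
	/* Existing behaviour: managed links cannot be dropped this way. */
	fprintf(stderr, "Unable to drop a managed device link reference\n");
	return MODEL_PUT_WARNED;
}
```

In this model a managed link whose consumer was never registered is deleted
on the first put, while a managed link with a registered consumer still only
produces the warning.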
Adrian Hunter July 8, 2021, 4:02 p.m. UTC | #7
On 8/07/21 6:12 pm, Rafael J. Wysocki wrote:
> On Thu, Jul 8, 2021 at 5:03 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>
>> On Thu, Jul 8, 2021 at 4:17 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>
>>> On 8/07/21 3:31 pm, Rafael J. Wysocki wrote:
>>>> On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>>>>
>>>>> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
>>>>>> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
>>>>>>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
>>>>>>> been registered but can still have a device link holding a reference to the
>>>>>>> device. The unwanted device link will prevent runtime suspend indefinitely,
>>>>>>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
>>>>>>> the UFS host controller). Fix by explicitly deleting the device link when
>>>>>>> SCSI destroys the SCSI device.
>>>>>>>
>>>>>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>>>>>>> ---
>>>>>>>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
>>>>>>>  1 file changed, 7 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
>>>>>>> index 708b3b62fc4d..483aa74fe2c8 100644
>>>>>>> --- a/drivers/scsi/ufs/ufshcd.c
>>>>>>> +++ b/drivers/scsi/ufs/ufshcd.c
>>>>>>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
>>>>>>>              spin_lock_irqsave(hba->host->host_lock, flags);
>>>>>>>              hba->sdev_ufs_device = NULL;
>>>>>>>              spin_unlock_irqrestore(hba->host->host_lock, flags);
>>>>>>> +    } else {
>>>>>>> +            /*
>>>>>>> +             * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
>>>>>>> +             * will not have been registered but can still have a device
>>>>>>> +             * link holding a reference to the device.
>>>>>>> +             */
>>>>>>> +            device_links_scrap(&sdev->sdev_gendev);
>>>>>>
>>>>>> What created that link?  And why did it do that before probe happened
>>>>>> successfully?
>>>>>
>>>>> The same driver created the link.
>>>>>
>>>>> The documentation seems to say it is allowed to, if it is the consumer.
>>>>> From Documentation/driver-api/device_link.rst
>>>>>
>>>>>   Usage
>>>>>   =====
>>>>>
>>>>>   The earliest point in time when device links can be added is after
>>>>>   :c:func:`device_add()` has been called for the supplier and
>>>>>   :c:func:`device_initialize()` has been called for the consumer.
>>>>
>>>> Yes, this is allowed, but if you've added device links to a device
>>>> object that is not going to be registered after all, you are
>>>> responsible for doing the cleanup.
>>>>
>>>> Why can't you call device_link_del() directly on those links?
>>>>
>>>> Or device_link_remove() if you don't want to deal with link pointers?
>>>>
>>>
>>> Those only work for DL_FLAG_STATELESS device links, but we use only
>>> DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE flags.
>>
>> So I'd probably modify device_link_remove() to check if the consumer
>> device has been registered and run __device_link_del() directly
>> instead of device_link_put_kref() if it hasn't.
>>
>> Or add an argument to it to force the removal.
> 
> Or even modify device_link_put_kref() like this:
> 
>  static void device_link_put_kref(struct device_link *link)
>  {
>           if (link->flags & DL_FLAG_STATELESS)
>                   kref_put(&link->kref, __device_link_del);
> +        else if (!device_is_registered(link->consumer))
> +                __device_link_del(link);
>           else
>                  WARN(1, "Unable to drop a managed device link reference\n");
>  }
> 

Thanks! :-) I will do that.
Saravana Kannan July 8, 2021, 4:45 p.m. UTC | #8
On Thu, Jul 8, 2021 at 7:17 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 8/07/21 3:31 pm, Rafael J. Wysocki wrote:
> > On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>
> >> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
> >>> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
> >>>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
> >>>> been registered but can still have a device link holding a reference to the
> >>>> device. The unwanted device link will prevent runtime suspend indefinitely,
> >>>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
> >>>> the UFS host controller). Fix by explicitly deleting the device link when
> >>>> SCSI destroys the SCSI device.
> >>>>
> >>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> >>>> ---
> >>>>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
> >>>>  1 file changed, 7 insertions(+)
> >>>>
> >>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> >>>> index 708b3b62fc4d..483aa74fe2c8 100644
> >>>> --- a/drivers/scsi/ufs/ufshcd.c
> >>>> +++ b/drivers/scsi/ufs/ufshcd.c
> >>>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
> >>>>              spin_lock_irqsave(hba->host->host_lock, flags);
> >>>>              hba->sdev_ufs_device = NULL;
> >>>>              spin_unlock_irqrestore(hba->host->host_lock, flags);
> >>>> +    } else {
> >>>> +            /*
> >>>> +             * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
> >>>> +             * will not have been registered but can still have a device
> >>>> +             * link holding a reference to the device.
> >>>> +             */
> >>>> +            device_links_scrap(&sdev->sdev_gendev);
> >>>
> >>> What created that link?  And why did it do that before probe happened
> >>> successfully?
> >>
> >> The same driver created the link.
> >>
> >> The documentation seems to say it is allowed to, if it is the consumer.
> >> From Documentation/driver-api/device_link.rst
> >>
> >>   Usage
> >>   =====
> >>
> >>   The earliest point in time when device links can be added is after
> >>   :c:func:`device_add()` has been called for the supplier and
> >>   :c:func:`device_initialize()` has been called for the consumer.
> >
> > Yes, this is allowed, but if you've added device links to a device
> > object that is not going to be registered after all, you are
> > responsible for doing the cleanup.
> >
> > Why can't you call device_link_del() directly on those links?
> >
> > Or device_link_remove() if you don't want to deal with link pointers?
> >
>
> Those only work for DL_FLAG_STATELESS device links, but we use only
> DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE flags.

Is there a reason you can't use DL_FLAG_STATELESS? It doesn't preclude
you from using RPM_ACTIVE as far as I can tell.

-Saravana
Rafael J. Wysocki July 8, 2021, 4:48 p.m. UTC | #9
On Thu, Jul 8, 2021 at 6:46 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Thu, Jul 8, 2021 at 7:17 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >
> > On 8/07/21 3:31 pm, Rafael J. Wysocki wrote:
> > > On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > >>
> > >> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
> > >>> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
> > >>>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
> > >>>> been registered but can still have a device link holding a reference to the
> > >>>> device. The unwanted device link will prevent runtime suspend indefinitely,
> > >>>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
> > >>>> the UFS host controller). Fix by explicitly deleting the device link when
> > >>>> SCSI destroys the SCSI device.
> > >>>>
> > >>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> > >>>> ---
> > >>>>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
> > >>>>  1 file changed, 7 insertions(+)
> > >>>>
> > >>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > >>>> index 708b3b62fc4d..483aa74fe2c8 100644
> > >>>> --- a/drivers/scsi/ufs/ufshcd.c
> > >>>> +++ b/drivers/scsi/ufs/ufshcd.c
> > >>>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
> > >>>>              spin_lock_irqsave(hba->host->host_lock, flags);
> > >>>>              hba->sdev_ufs_device = NULL;
> > >>>>              spin_unlock_irqrestore(hba->host->host_lock, flags);
> > >>>> +    } else {
> > >>>> +            /*
> > >>>> +             * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
> > >>>> +             * will not have been registered but can still have a device
> > >>>> +             * link holding a reference to the device.
> > >>>> +             */
> > >>>> +            device_links_scrap(&sdev->sdev_gendev);
> > >>>
> > >>> What created that link?  And why did it do that before probe happened
> > >>> successfully?
> > >>
> > >> The same driver created the link.
> > >>
> > >> The documentation seems to say it is allowed to, if it is the consumer.
> > >> From Documentation/driver-api/device_link.rst
> > >>
> > >>   Usage
> > >>   =====
> > >>
> > >>   The earliest point in time when device links can be added is after
> > >>   :c:func:`device_add()` has been called for the supplier and
> > >>   :c:func:`device_initialize()` has been called for the consumer.
> > >
> > > Yes, this is allowed, but if you've added device links to a device
> > > object that is not going to be registered after all, you are
> > > responsible for doing the cleanup.
> > >
> > > Why can't you call device_link_del() directly on those links?
> > >
> > > Or device_link_remove() if you don't want to deal with link pointers?
> > >
> >
> > Those only work for DL_FLAG_STATELESS device links, but we use only
> > DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE flags.
>
> Is there a reason you can't use DL_FLAG_STATELESS? It doesn't preclude
> you from using RPM_ACTIVE as far as I can tell.

Perhaps he wants the links to be managed if they are used after all.

Anyway, this is a valid use case that is not covered right now.
Saravana Kannan July 8, 2021, 4:57 p.m. UTC | #10
On Thu, Jul 8, 2021 at 9:49 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Thu, Jul 8, 2021 at 6:46 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Thu, Jul 8, 2021 at 7:17 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > >
> > > On 8/07/21 3:31 pm, Rafael J. Wysocki wrote:
> > > > On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > > >>
> > > >> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
> > > >>> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
> > > >>>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
> > > >>>> been registered but can still have a device link holding a reference to the
> > > >>>> device. The unwanted device link will prevent runtime suspend indefinitely,
> > > >>>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
> > > >>>> the UFS host controller). Fix by explicitly deleting the device link when
> > > >>>> SCSI destroys the SCSI device.
> > > >>>>
> > > >>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> > > >>>> ---
> > > >>>>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
> > > >>>>  1 file changed, 7 insertions(+)
> > > >>>>
> > > >>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > > >>>> index 708b3b62fc4d..483aa74fe2c8 100644
> > > >>>> --- a/drivers/scsi/ufs/ufshcd.c
> > > >>>> +++ b/drivers/scsi/ufs/ufshcd.c
> > > >>>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
> > > >>>>              spin_lock_irqsave(hba->host->host_lock, flags);
> > > >>>>              hba->sdev_ufs_device = NULL;
> > > >>>>              spin_unlock_irqrestore(hba->host->host_lock, flags);
> > > >>>> +    } else {
> > > >>>> +            /*
> > > >>>> +             * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
> > > >>>> +             * will not have been registered but can still have a device
> > > >>>> +             * link holding a reference to the device.
> > > >>>> +             */
> > > >>>> +            device_links_scrap(&sdev->sdev_gendev);
> > > >>>
> > > >>> What created that link?  And why did it do that before probe happened
> > > >>> successfully?
> > > >>
> > > >> The same driver created the link.
> > > >>
> > > >> The documentation seems to say it is allowed to, if it is the consumer.
> > > >> From Documentation/driver-api/device_link.rst
> > > >>
> > > >>   Usage
> > > >>   =====
> > > >>
> > > >>   The earliest point in time when device links can be added is after
> > > >>   :c:func:`device_add()` has been called for the supplier and
> > > >>   :c:func:`device_initialize()` has been called for the consumer.
> > > >
> > > > Yes, this is allowed, but if you've added device links to a device
> > > > object that is not going to be registered after all, you are
> > > > responsible for doing the cleanup.
> > > >
> > > > Why can't you call device_link_del() directly on those links?
> > > >
> > > > Or device_link_remove() if you don't want to deal with link pointers?
> > > >
> > >
> > > Those only work for DL_FLAG_STATELESS device links, but we use only
> > > DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE flags.
> >
> > Is there a reason you can't use DL_FLAG_STATELESS? It doesn't preclude
> > you from using RPM_ACTIVE as far as I can tell.
>
> Perhaps he wants the links to be managed if they are used after all.
>
> Anyway, this is a valid use case that is not covered right now.

Maybe. But the suggested patch is certainly risky.

There is no requirement the consumer is registered before the links
are added though. So, randomly deleting a managed link when
device_link_put_kref() is called on a stateless refcount (they are
still the same link) isn't right. The entity that created the
managed device link might still want it there. Also, if two entities
create a managed link and one of them calls device_link_put_kref()
before the device is registered, we have a UAF problem because managed
links aren't refcounted (more than once).

-Saravana
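
Editorial sketch, not part of the original message: the use-after-free
concern described above can be pictured with a small userspace model. It
takes the message's premise as given, namely that a second request for an
existing managed link returns the same object without taking another
reference. All model_ names are hypothetical, and an "alive" flag stands in
for actually freeing memory so that the sketch stays well defined.

```c
#include <stdbool.h>

/* Minimal stand-in for a managed device link. */
struct model_link {
	unsigned int kref;   /* stays at 1 for managed links in this model */
	bool alive;          /* false models "memory already freed" */
};

/* One slot models the single link between a given consumer and supplier. */
static struct model_link model_slot;

/* Models two entities asking for a managed link on the same device pair. */
static struct model_link *model_link_add_managed(void)
{
	if (model_slot.alive)
		return &model_slot;   /* reused, no extra reference taken */
	model_slot.kref = 1;
	model_slot.alive = true;
	return &model_slot;
}

/* Models deleting the link directly because the consumer is unregistered. */
static void model_link_force_del(struct model_link *link)
{
	link->kref = 0;
	link->alive = false;
}
```

If two callers each obtain the link and one of them forces the deletion, the
other caller is left with a pointer to a link that no longer exists, which
is the situation the message warns about.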
Rafael J. Wysocki July 8, 2021, 5:39 p.m. UTC | #11
On Thu, Jul 8, 2021 at 6:57 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Thu, Jul 8, 2021 at 9:49 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Thu, Jul 8, 2021 at 6:46 PM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > On Thu, Jul 8, 2021 at 7:17 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > > >
> > > > On 8/07/21 3:31 pm, Rafael J. Wysocki wrote:
> > > > > On Wed, Jul 7, 2021 at 7:49 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
> > > > >>
> > > > >> On 7/07/21 8:39 pm, Greg Kroah-Hartman wrote:
> > > > >>> On Wed, Jul 07, 2021 at 08:29:48PM +0300, Adrian Hunter wrote:
> > > > >>>> If a LUN fails to probe (e.g. absent BOOT WLUN), the device will not have
> > > > >>>> been registered but can still have a device link holding a reference to the
> > > > >>>> device. The unwanted device link will prevent runtime suspend indefinitely,
> > > > >>>> and cause some warnings if the supplier is ever deleted (e.g. by unbinding
> > > > >>>> the UFS host controller). Fix by explicitly deleting the device link when
> > > > >>>> SCSI destroys the SCSI device.
> > > > >>>>
> > > > >>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> > > > >>>> ---
> > > > >>>>  drivers/scsi/ufs/ufshcd.c | 7 +++++++
> > > > >>>>  1 file changed, 7 insertions(+)
> > > > >>>>
> > > > >>>> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> > > > >>>> index 708b3b62fc4d..483aa74fe2c8 100644
> > > > >>>> --- a/drivers/scsi/ufs/ufshcd.c
> > > > >>>> +++ b/drivers/scsi/ufs/ufshcd.c
> > > > >>>> @@ -5029,6 +5029,13 @@ static void ufshcd_slave_destroy(struct scsi_device *sdev)
> > > > >>>>              spin_lock_irqsave(hba->host->host_lock, flags);
> > > > >>>>              hba->sdev_ufs_device = NULL;
> > > > >>>>              spin_unlock_irqrestore(hba->host->host_lock, flags);
> > > > >>>> +    } else {
> > > > >>>> +            /*
> > > > >>>> +             * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
> > > > >>>> +             * will not have been registered but can still have a device
> > > > >>>> +             * link holding a reference to the device.
> > > > >>>> +             */
> > > > >>>> +            device_links_scrap(&sdev->sdev_gendev);
> > > > >>>
> > > > >>> What created that link?  And why did it do that before probe happened
> > > > >>> successfully?
> > > > >>
> > > > >> The same driver created the link.
> > > > >>
> > > > >> The documentation seems to say it is allowed to, if it is the consumer.
> > > > >> From Documentation/driver-api/device_link.rst
> > > > >>
> > > > >>   Usage
> > > > >>   =====
> > > > >>
> > > > >>   The earliest point in time when device links can be added is after
> > > > >>   :c:func:`device_add()` has been called for the supplier and
> > > > >>   :c:func:`device_initialize()` has been called for the consumer.
> > > > >
> > > > > Yes, this is allowed, but if you've added device links to a device
> > > > > object that is not going to be registered after all, you are
> > > > > responsible for doing the cleanup.
> > > > >
> > > > > Why can't you call device_link_del() directly on those links?
> > > > >
> > > > > Or device_link_remove() if you don't want to deal with link pointers?
> > > > >
> > > >
> > > > Those only work for DL_FLAG_STATELESS device links, but we use only
> > > > DL_FLAG_PM_RUNTIME | DL_FLAG_RPM_ACTIVE flags.
> > >
> > > Is there a reason you can't use DL_FLAG_STATELESS? It doesn't preclude
> > > you from using RPM_ACTIVE as far as I can tell.
> >
> > Perhaps he wants the links to be managed if they are used after all.
> >
> > Anyway, this is a valid use case that is not covered right now.
>
> Maybe. But the suggested patch is certainly risky.
>
> There is no requirement the consumer is registered before the links
> are added though. So, randomly deleting a managed link when
> device_link_put_kref() is called on a stateless refcount (they are
> still the same link) isn't right.

Device pointers are needed in order to create a device link and it is
quite unlikely that a pointer to an unregistered device will be shared
between two different pieces of code.

> The entity that created the
> managed device link might still want it there.

So the stateless kref is going to be put first.

> Also, if two entities
> create a managed link and one of them calls device_link_put_kref()
> before the device is registered, we have a UAF problem because managed
> links aren't refcounted (more than once).

IMO until a device object is registered, its creator should be allowed
to do the cleanup in the case when it gets released without
registration, including the removal of any device links to it that
have been added so far.

Messing with a device object created by someone else that may still
go away without registration is a risky business regardless.

Patch

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 708b3b62fc4d..483aa74fe2c8 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5029,6 +5029,13 @@  static void ufshcd_slave_destroy(struct scsi_device *sdev)
 		spin_lock_irqsave(hba->host->host_lock, flags);
 		hba->sdev_ufs_device = NULL;
 		spin_unlock_irqrestore(hba->host->host_lock, flags);
+	} else {
+		/*
+		 * If a LUN fails to probe (e.g. absent BOOT WLUN), the device
+		 * will not have been registered but can still have a device
+		 * link holding a reference to the device.
+		 */
+		device_links_scrap(&sdev->sdev_gendev);
 	}
 }