diff mbox

scsi: fix race condition when removing target

Message ID 1511977145.2671.13.camel@wdc.com (mailing list archive)
State Superseded
Headers show

Commit Message

Bart Van Assche Nov. 29, 2017, 5:39 p.m. UTC
On Wed, 2017-11-29 at 17:20 +0100, hch@lst.de wrote:
> On Wed, Nov 29, 2017 at 04:18:30PM +0000, Bart Van Assche wrote:

> > As the above patch description shows it can happen that the SCSI core calls

> > get_device() after the device reference count has reached zero and before

> > the memory for struct device is freed. Although the above patch looks fine

> > to me, would you consider it acceptable to modify get_device() such that it

> > uses kobject_get_unless_zero() instead of kobject_get()? I'm asking this

> > because that change would help to reduce the complexity of the already too

> > complicated SCSI core.

> 

> I don't think we can just modify get_device, but we can add a new

> get_device_unless_zero.  In fact I have an open coded variant of that

> in nvme, and was planning to submit one for the current merge window..


Sorry but I don't see why we can't modify get_device()? Can you explain why
you think that something like the patch below is wrong?

Thanks,

Bart.


[PATCH] Make it safe to use get_device() if the reference count is zero

---
 drivers/base/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jason Yan Nov. 30, 2017, 1:18 a.m. UTC | #1
On 2017/11/30 1:39, Bart Van Assche wrote:
> On Wed, 2017-11-29 at 17:20 +0100, hch@lst.de wrote:
>> On Wed, Nov 29, 2017 at 04:18:30PM +0000, Bart Van Assche wrote:
>>> As the above patch description shows it can happen that the SCSI core calls
>>> get_device() after the device reference count has reached zero and before
>>> the memory for struct device is freed. Although the above patch looks fine
>>> to me, would you consider it acceptable to modify get_device() such that it
>>> uses kobject_get_unless_zero() instead of kobject_get()? I'm asking this
>>> because that change would help to reduce the complexity of the already too
>>> complicated SCSI core.
>>
>> I don't think we can just modify get_device, but we can add a new
>> get_device_unless_zero.  In fact I have an open coded variant of that
>> in nvme, and was planning to submit one for the current merge window..
>
> Sorry but I don't see why we can't modify get_device()? Can you explain why
> you think that something like the patch below is wrong?
>
> Thanks,
>
> Bart.
>

Hi Bart, I chose the approach in my patch because it has been used in
scsi_device_get() for years and been proved safe. I think using
kobject_get_unless_zero() is safe here and can fix this issue too. And
this approach is beneficial to all users.

>
> [PATCH] Make it safe to use get_device() if the reference count is zero
>
> ---
>   drivers/base/core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 110230d86527..049a5d9dba8a 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1916,7 +1916,7 @@ EXPORT_SYMBOL_GPL(device_register);
>    */
>   struct device *get_device(struct device *dev)
>   {
> -	return dev ? kobj_to_dev(kobject_get(&dev->kobj)) : NULL;
> +	return dev && kobject_get_unless_zero(&dev->kobj) ? dev : NULL;
>   }
>   EXPORT_SYMBOL_GPL(get_device);
>
Bart Van Assche Nov. 30, 2017, 4:08 p.m. UTC | #2
On Thu, 2017-11-30 at 09:18 +0800, Jason Yan wrote:
> Hi Bart, I chose the approach in my patch because it has been used in

> scsi_device_get() for years and been proved safe. I think using

> kobject_get_unless_zero() is safe here and can fix this issue too. And

> this approach is beneficial to all users.


Hello Jason,

A possible approach is that we start with your patch and defer any get_device()
changes until after your patch has been applied.

Bart.
Greg KH Nov. 30, 2017, 4:40 p.m. UTC | #3
On Thu, Nov 30, 2017 at 04:08:38PM +0000, Bart Van Assche wrote:
> On Thu, 2017-11-30 at 09:18 +0800, Jason Yan wrote:
> > Hi Bart, I chose the approach in my patch because it has been used in
> > scsi_device_get() for years and been proved safe. I think using
> > kobject_get_unless_zero() is safe here and can fix this issue too. And
> > this approach is beneficial to all users.
> 
> Hello Jason,
> 
> A possible approach is that we start with your patch and defer any get_device()
> changes until after your patch has been applied.

That might be good, I don't have the chance to look at any driver core
changes until Monday as I'm on the road until then, sorry...

greg k-h
diff mbox

Patch

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 110230d86527..049a5d9dba8a 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1916,7 +1916,7 @@  EXPORT_SYMBOL_GPL(device_register);
  */
 struct device *get_device(struct device *dev)
 {
-	return dev ? kobj_to_dev(kobject_get(&dev->kobj)) : NULL;
+	return dev && kobject_get_unless_zero(&dev->kobj) ? dev : NULL;
 }
 EXPORT_SYMBOL_GPL(get_device);