diff mbox series

[V2,for-6.10/block,1/2] loop: Fix a race between loop detach and loop open

Message ID 20240521224249.7389-1-gulam.mohamed@oracle.com (mailing list archive)
State New, archived
Headers show
Series [V2,for-6.10/block,1/2] loop: Fix a race between loop detach and loop open | expand

Commit Message

Gulam Mohamed May 21, 2024, 10:42 p.m. UTC
Description
===========

1. Userspace sends the command "losetup -d" which uses the open() call
   to open the device
2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
   function loop_clr_fd()
3. If LOOP_CLR_FD is the first command received at the time, then the
   AUTOCLEAR flag is not set and deletion of the
   loop device proceeds ahead and scans the partitions (drop/add
   partitions)

	if (disk_openers(lo->lo_disk) > 1) {
		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
		loop_global_unlock(lo, true);
		return 0;
	}

 4. Before scanning partitions, it will check to see if any partition of
    the loop device is currently opened
 5. If any partition is opened, then it will return EBUSY:

    if (disk->open_partitions)
		return -EBUSY;
 6. So, after receiving the "LOOP_CLR_FD" command and just before the above
    check for open_partitions, if any other command
    (like blkid) opens any partition of the loop device, then the partition
    scan will not proceed and EBUSY is returned as shown in above code
 7. But in "__loop_clr_fd()", this EBUSY error is not propagated
 8. We have noticed that this is causing the partitions of the loop to
    remain stale even after the loop device is detached resulting in the
    IO errors on the partitions

Fix
---
Re-introduce the lo_open() call to restrict any process to open the loop
device when its being detached

Test case
=========
Test case involves the following two scripts:

script1.sh
----------
while [ 1 ];
do
	losetup -P -f /home/opt/looptest/test10.img
	blkid /dev/loop0p1
done

script2.sh
----------
while [ 1 ];
do
	losetup -d /dev/loop0
done

Without fix, the following IO errors have been observed:

kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
        phys_seg 1 prio class 0
kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
        phys_seg 1 prio class 0
kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
        read

V1->V2:
	Added a test case, 010, in blktests in tests/loop/
Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
---
 drivers/block/loop.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Jens Axboe May 22, 2024, 1:39 a.m. UTC | #1
On 5/21/24 4:42 PM, Gulam Mohamed wrote:
> Description
> ===========
> 
> 1. Userspace sends the command "losetup -d" which uses the open() call
>    to open the device
> 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
>    function loop_clr_fd()
> 3. If LOOP_CLR_FD is the first command received at the time, then the
>    AUTOCLEAR flag is not set and deletion of the
>    loop device proceeds ahead and scans the partitions (drop/add
>    partitions)
> 
> 	if (disk_openers(lo->lo_disk) > 1) {
> 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
> 		loop_global_unlock(lo, true);
> 		return 0;
> 	}
> 
>  4. Before scanning partitions, it will check to see if any partition of
>     the loop device is currently opened
>  5. If any partition is opened, then it will return EBUSY:
> 
>     if (disk->open_partitions)
> 		return -EBUSY;
>  6. So, after receiving the "LOOP_CLR_FD" command and just before the above
>     check for open_partitions, if any other command
>     (like blkid) opens any partition of the loop device, then the partition
>     scan will not proceed and EBUSY is returned as shown in above code
>  7. But in "__loop_clr_fd()", this EBUSY error is not propagated
>  8. We have noticed that this is causing the partitions of the loop to
>     remain stale even after the loop device is detached resulting in the
>     IO errors on the partitions
> 
> Fix
> ---
> Re-introduce the lo_open() call to restrict any process to open the loop
> device when its being detached
> 
> Test case
> =========
> Test case involves the following two scripts:
> 
> script1.sh
> ----------
> while [ 1 ];
> do
> 	losetup -P -f /home/opt/looptest/test10.img
> 	blkid /dev/loop0p1
> done
> 
> script2.sh
> ----------
> while [ 1 ];
> do
> 	losetup -d /dev/loop0
> done
> 
> Without fix, the following IO errors have been observed:
> 
> kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
> kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
>         phys_seg 1 prio class 0
> kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
>         phys_seg 1 prio class 0
> kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
>         read
> 
> V1->V2:
> 	Added a test case, 010, in blktests in tests/loop/
> Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
> ---
>  drivers/block/loop.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 28a95fd366fe..9a235d8c062d 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode,
>  }
>  #endif
>  
> +static int lo_open(struct gendisk *disk, blk_mode_t mode)
> +{
> +        struct loop_device *lo = disk->private_data;
> +        int err;
> +
> +        if (!lo)
> +                return -ENXIO;
> +
> +        err = mutex_lock_killable(&lo->lo_mutex);
> +        if (err)
> +                return err;
> +
> +        if (lo->lo_state == Lo_rundown)
> +                err = -ENXIO;
> +        mutex_unlock(&lo->lo_mutex);
> +	return err;
> +}

Most of this function uses spaces rather than tabs.
Yu Kuai May 22, 2024, 2:31 a.m. UTC | #2
Hi,

在 2024/05/22 9:39, Jens Axboe 写道:
> On 5/21/24 4:42 PM, Gulam Mohamed wrote:
>> Description
>> ===========
>>
>> 1. Userspace sends the command "losetup -d" which uses the open() call
>>     to open the device
>> 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
>>     function loop_clr_fd()
>> 3. If LOOP_CLR_FD is the first command received at the time, then the
>>     AUTOCLEAR flag is not set and deletion of the
>>     loop device proceeds ahead and scans the partitions (drop/add
>>     partitions)
>>
>> 	if (disk_openers(lo->lo_disk) > 1) {
>> 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
>> 		loop_global_unlock(lo, true);
>> 		return 0;
>> 	}
>>
>>   4. Before scanning partitions, it will check to see if any partition of
>>      the loop device is currently opened
>>   5. If any partition is opened, then it will return EBUSY:
>>
>>      if (disk->open_partitions)
>> 		return -EBUSY;
>>   6. So, after receiving the "LOOP_CLR_FD" command and just before the above
>>      check for open_partitions, if any other command
>>      (like blkid) opens any partition of the loop device, then the partition
>>      scan will not proceed and EBUSY is returned as shown in above code
>>   7. But in "__loop_clr_fd()", this EBUSY error is not propagated
>>   8. We have noticed that this is causing the partitions of the loop to
>>      remain stale even after the loop device is detached resulting in the
>>      IO errors on the partitions
>>
>> Fix
>> ---
>> Re-introduce the lo_open() call to restrict any process to open the loop
>> device when its being detached
>>
>> Test case
>> =========
>> Test case involves the following two scripts:
>>
>> script1.sh
>> ----------
>> while [ 1 ];
>> do
>> 	losetup -P -f /home/opt/looptest/test10.img
>> 	blkid /dev/loop0p1
>> done
>>
>> script2.sh
>> ----------
>> while [ 1 ];
>> do
>> 	losetup -d /dev/loop0
>> done
>>
>> Without fix, the following IO errors have been observed:
>>
>> kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
>> kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
>>          phys_seg 1 prio class 0
>> kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
>>          phys_seg 1 prio class 0
>> kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
>>          read
>>
>> V1->V2:
>> 	Added a test case, 010, in blktests in tests/loop/
>> Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
>> ---
>>   drivers/block/loop.c | 19 +++++++++++++++++++
>>   1 file changed, 19 insertions(+)
>>
>> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
>> index 28a95fd366fe..9a235d8c062d 100644
>> --- a/drivers/block/loop.c
>> +++ b/drivers/block/loop.c
>> @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode,
>>   }
>>   #endif
>>   
>> +static int lo_open(struct gendisk *disk, blk_mode_t mode)
>> +{
>> +        struct loop_device *lo = disk->private_data;
>> +        int err;
>> +
>> +        if (!lo)
>> +                return -ENXIO;
>> +
>> +        err = mutex_lock_killable(&lo->lo_mutex);
>> +        if (err)
>> +                return err;
>> +
>> +        if (lo->lo_state == Lo_rundown)
>> +                err = -ENXIO;
>> +        mutex_unlock(&lo->lo_mutex);

This doesn't fix the problem completely, there is still a race window.

lo_release
  if (disk_openers(disk) > 0)
   reutrn
   -> no openers now
		lo_open
		 if (lo->lo_state == Lo_rundown)
		 -> no set yet
		 open succeed
  mutex_lock(lo_mutex)
  lo->lo_state = Lo_rundown
  mutex_unlock(lo_mutex)
  __loop_clr_fd

And with the respect, loop_clr_fd() has the same problem.

I think probably loop need a open counter for itself.

Thanks,
Kuai

>> +	return err;
>> +}
> 
> Most of this function uses spaces rather than tabs.
>
Gulam Mohamed May 22, 2024, 7:11 p.m. UTC | #3
Hi Jens,

> -----Original Message-----
> From: Jens Axboe <axboe@kernel.dk>
> Sent: Wednesday, May 22, 2024 7:10 AM
> To: Gulam Mohamed <gulam.mohamed@oracle.com>; linux-
> block@vger.kernel.org; linux-kernel@vger.kernel.org
> Cc: shinichiro.kawasaki@wdc.com; chaitanyak@nvidia.com; hch@lst.de
> Subject: Re: [PATCH V2 for-6.10/block 1/2] loop: Fix a race between loop
> detach and loop open
> 
> On 5/21/24 4:42 PM, Gulam Mohamed wrote:
> > Description
> > ===========
> >
> > 1. Userspace sends the command "losetup -d" which uses the open() call
> >    to open the device
> > 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
> >    function loop_clr_fd()
> > 3. If LOOP_CLR_FD is the first command received at the time, then the
> >    AUTOCLEAR flag is not set and deletion of the
> >    loop device proceeds ahead and scans the partitions (drop/add
> >    partitions)
> >
> > 	if (disk_openers(lo->lo_disk) > 1) {
> > 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
> > 		loop_global_unlock(lo, true);
> > 		return 0;
> > 	}
> >
> >  4. Before scanning partitions, it will check to see if any partition of
> >     the loop device is currently opened  5. If any partition is
> > opened, then it will return EBUSY:
> >
> >     if (disk->open_partitions)
> > 		return -EBUSY;
> >  6. So, after receiving the "LOOP_CLR_FD" command and just before the
> above
> >     check for open_partitions, if any other command
> >     (like blkid) opens any partition of the loop device, then the partition
> >     scan will not proceed and EBUSY is returned as shown in above code
> > 7. But in "__loop_clr_fd()", this EBUSY error is not propagated  8. We
> > have noticed that this is causing the partitions of the loop to
> >     remain stale even after the loop device is detached resulting in the
> >     IO errors on the partitions
> >
> > Fix
> > ---
> > Re-introduce the lo_open() call to restrict any process to open the
> > loop device when its being detached
> >
> > Test case
> > =========
> > Test case involves the following two scripts:
> >
> > script1.sh
> > ----------
> > while [ 1 ];
> > do
> > 	losetup -P -f /home/opt/looptest/test10.img
> > 	blkid /dev/loop0p1
> > done
> >
> > script2.sh
> > ----------
> > while [ 1 ];
> > do
> > 	losetup -d /dev/loop0
> > done
> >
> > Without fix, the following IO errors have been observed:
> >
> > kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
> > kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
> >         phys_seg 1 prio class 0
> > kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
> >         phys_seg 1 prio class 0
> > kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
> >         read
> >
> > V1->V2:
> > 	Added a test case, 010, in blktests in tests/loop/
> > Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
> > ---
> >  drivers/block/loop.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c index
> > 28a95fd366fe..9a235d8c062d 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device
> > *bdev, blk_mode_t mode,  }  #endif
> >
> > +static int lo_open(struct gendisk *disk, blk_mode_t mode) {
> > +        struct loop_device *lo = disk->private_data;
> > +        int err;
> > +
> > +        if (!lo)
> > +                return -ENXIO;
> > +
> > +        err = mutex_lock_killable(&lo->lo_mutex);
> > +        if (err)
> > +                return err;
> > +
> > +        if (lo->lo_state == Lo_rundown)
> > +                err = -ENXIO;
> > +        mutex_unlock(&lo->lo_mutex);
> > +	return err;
> > +}
> 
> Most of this function uses spaces rather than tabs.
I am sorry I should have checked that. Will address the issue in next version
> 
> --
> Jens Axboe
Gulam Mohamed May 22, 2024, 7:12 p.m. UTC | #4
Hi Kuai,

> -----Original Message-----
> From: Yu Kuai <yukuai1@huaweicloud.com>
> Sent: Wednesday, May 22, 2024 8:01 AM
> To: Jens Axboe <axboe@kernel.dk>; Gulam Mohamed
> <gulam.mohamed@oracle.com>; linux-block@vger.kernel.org; linux-
> kernel@vger.kernel.org
> Cc: shinichiro.kawasaki@wdc.com; chaitanyak@nvidia.com; hch@lst.de;
> yukuai (C) <yukuai3@huawei.com>
> Subject: Re: [PATCH V2 for-6.10/block 1/2] loop: Fix a race between loop
> detach and loop open
> 
> Hi,
> 
> 在 2024/05/22 9:39, Jens Axboe 写道:
> > On 5/21/24 4:42 PM, Gulam Mohamed wrote:
> >> Description
> >> ===========
> >>
> >> 1. Userspace sends the command "losetup -d" which uses the open() call
> >>     to open the device
> >> 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
> >>     function loop_clr_fd()
> >> 3. If LOOP_CLR_FD is the first command received at the time, then the
> >>     AUTOCLEAR flag is not set and deletion of the
> >>     loop device proceeds ahead and scans the partitions (drop/add
> >>     partitions)
> >>
> >> 	if (disk_openers(lo->lo_disk) > 1) {
> >> 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
> >> 		loop_global_unlock(lo, true);
> >> 		return 0;
> >> 	}
> >>
> >>   4. Before scanning partitions, it will check to see if any partition of
> >>      the loop device is currently opened
> >>   5. If any partition is opened, then it will return EBUSY:
> >>
> >>      if (disk->open_partitions)
> >> 		return -EBUSY;
> >>   6. So, after receiving the "LOOP_CLR_FD" command and just before the
> above
> >>      check for open_partitions, if any other command
> >>      (like blkid) opens any partition of the loop device, then the partition
> >>      scan will not proceed and EBUSY is returned as shown in above code
> >>   7. But in "__loop_clr_fd()", this EBUSY error is not propagated
> >>   8. We have noticed that this is causing the partitions of the loop to
> >>      remain stale even after the loop device is detached resulting in the
> >>      IO errors on the partitions
> >>
> >> Fix
> >> ---
> >> Re-introduce the lo_open() call to restrict any process to open the
> >> loop device when its being detached
> >>
> >> Test case
> >> =========
> >> Test case involves the following two scripts:
> >>
> >> script1.sh
> >> ----------
> >> while [ 1 ];
> >> do
> >> 	losetup -P -f /home/opt/looptest/test10.img
> >> 	blkid /dev/loop0p1
> >> done
> >>
> >> script2.sh
> >> ----------
> >> while [ 1 ];
> >> do
> >> 	losetup -d /dev/loop0
> >> done
> >>
> >> Without fix, the following IO errors have been observed:
> >>
> >> kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
> >> kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
> >>          phys_seg 1 prio class 0
> >> kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
> >>          phys_seg 1 prio class 0
> >> kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
> >>          read
> >>
> >> V1->V2:
> >> 	Added a test case, 010, in blktests in tests/loop/
> >> Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
> >> ---
> >>   drivers/block/loop.c | 19 +++++++++++++++++++
> >>   1 file changed, 19 insertions(+)
> >>
> >> diff --git a/drivers/block/loop.c b/drivers/block/loop.c index
> >> 28a95fd366fe..9a235d8c062d 100644
> >> --- a/drivers/block/loop.c
> >> +++ b/drivers/block/loop.c
> >> @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device
> *bdev, blk_mode_t mode,
> >>   }
> >>   #endif
> >>
> >> +static int lo_open(struct gendisk *disk, blk_mode_t mode) {
> >> +        struct loop_device *lo = disk->private_data;
> >> +        int err;
> >> +
> >> +        if (!lo)
> >> +                return -ENXIO;
> >> +
> >> +        err = mutex_lock_killable(&lo->lo_mutex);
> >> +        if (err)
> >> +                return err;
> >> +
> >> +        if (lo->lo_state == Lo_rundown)
> >> +                err = -ENXIO;
> >> +        mutex_unlock(&lo->lo_mutex);
> 
> This doesn't fix the problem completely, there is still a race window.
> 
> lo_release
>   if (disk_openers(disk) > 0)
>    reutrn
>    -> no openers now
> 		lo_open
> 		 if (lo->lo_state == Lo_rundown)
> 		 -> no set yet
> 		 open succeed
>   mutex_lock(lo_mutex)
>   lo->lo_state = Lo_rundown
>   mutex_unlock(lo_mutex)
>   __loop_clr_fd
We have noticed that, at block layer, both open() and release() are protected by gendisk->open_mutex.
So, this race may not happen. Can you please let me know if I understand correctly?
> 
> And with the respect, loop_clr_fd() has the same problem.
> 
> I think probably loop need a open counter for itself.
We are looking to see how to handle this case
> 
> Thanks,
> Kuai
> 
> >> +	return err;
> >> +}
> >
> > Most of this function uses spaces rather than tabs.
> >
Yu Kuai May 23, 2024, 1:13 a.m. UTC | #5
Hi,

在 2024/05/23 3:12, Gulam Mohamed 写道:
> Hi Kuai,
> 
>> -----Original Message-----
>> From: Yu Kuai <yukuai1@huaweicloud.com>
>> Sent: Wednesday, May 22, 2024 8:01 AM
>> To: Jens Axboe <axboe@kernel.dk>; Gulam Mohamed
>> <gulam.mohamed@oracle.com>; linux-block@vger.kernel.org; linux-
>> kernel@vger.kernel.org
>> Cc: shinichiro.kawasaki@wdc.com; chaitanyak@nvidia.com; hch@lst.de;
>> yukuai (C) <yukuai3@huawei.com>
>> Subject: Re: [PATCH V2 for-6.10/block 1/2] loop: Fix a race between loop
>> detach and loop open
>>
>> Hi,
>>
>> 在 2024/05/22 9:39, Jens Axboe 写道:
>>> On 5/21/24 4:42 PM, Gulam Mohamed wrote:
>>>> Description
>>>> ===========
>>>>
>>>> 1. Userspace sends the command "losetup -d" which uses the open() call
>>>>      to open the device
>>>> 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
>>>>      function loop_clr_fd()
>>>> 3. If LOOP_CLR_FD is the first command received at the time, then the
>>>>      AUTOCLEAR flag is not set and deletion of the
>>>>      loop device proceeds ahead and scans the partitions (drop/add
>>>>      partitions)
>>>>
>>>> 	if (disk_openers(lo->lo_disk) > 1) {
>>>> 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
>>>> 		loop_global_unlock(lo, true);
>>>> 		return 0;
>>>> 	}
>>>>
>>>>    4. Before scanning partitions, it will check to see if any partition of
>>>>       the loop device is currently opened
>>>>    5. If any partition is opened, then it will return EBUSY:
>>>>
>>>>       if (disk->open_partitions)
>>>> 		return -EBUSY;
>>>>    6. So, after receiving the "LOOP_CLR_FD" command and just before the
>> above
>>>>       check for open_partitions, if any other command
>>>>       (like blkid) opens any partition of the loop device, then the partition
>>>>       scan will not proceed and EBUSY is returned as shown in above code
>>>>    7. But in "__loop_clr_fd()", this EBUSY error is not propagated
>>>>    8. We have noticed that this is causing the partitions of the loop to
>>>>       remain stale even after the loop device is detached resulting in the
>>>>       IO errors on the partitions
>>>>
>>>> Fix
>>>> ---
>>>> Re-introduce the lo_open() call to restrict any process to open the
>>>> loop device when its being detached
>>>>
>>>> Test case
>>>> =========
>>>> Test case involves the following two scripts:
>>>>
>>>> script1.sh
>>>> ----------
>>>> while [ 1 ];
>>>> do
>>>> 	losetup -P -f /home/opt/looptest/test10.img
>>>> 	blkid /dev/loop0p1
>>>> done
>>>>
>>>> script2.sh
>>>> ----------
>>>> while [ 1 ];
>>>> do
>>>> 	losetup -d /dev/loop0
>>>> done
>>>>
>>>> Without fix, the following IO errors have been observed:
>>>>
>>>> kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
>>>> kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
>>>>           phys_seg 1 prio class 0
>>>> kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
>>>>           phys_seg 1 prio class 0
>>>> kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
>>>>           read
>>>>
>>>> V1->V2:
>>>> 	Added a test case, 010, in blktests in tests/loop/
>>>> Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
>>>> ---
>>>>    drivers/block/loop.c | 19 +++++++++++++++++++
>>>>    1 file changed, 19 insertions(+)
>>>>
>>>> diff --git a/drivers/block/loop.c b/drivers/block/loop.c index
>>>> 28a95fd366fe..9a235d8c062d 100644
>>>> --- a/drivers/block/loop.c
>>>> +++ b/drivers/block/loop.c
>>>> @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device
>> *bdev, blk_mode_t mode,
>>>>    }
>>>>    #endif
>>>>
>>>> +static int lo_open(struct gendisk *disk, blk_mode_t mode) {
>>>> +        struct loop_device *lo = disk->private_data;
>>>> +        int err;
>>>> +
>>>> +        if (!lo)
>>>> +                return -ENXIO;
>>>> +
>>>> +        err = mutex_lock_killable(&lo->lo_mutex);
>>>> +        if (err)
>>>> +                return err;
>>>> +
>>>> +        if (lo->lo_state == Lo_rundown)
>>>> +                err = -ENXIO;
>>>> +        mutex_unlock(&lo->lo_mutex);
>>
>> This doesn't fix the problem completely, there is still a race window.
>>
>> lo_release
>>    if (disk_openers(disk) > 0)
>>     reutrn
>>     -> no openers now
>> 		lo_open
>> 		 if (lo->lo_state == Lo_rundown)
>> 		 -> no set yet
>> 		 open succeed
>>    mutex_lock(lo_mutex)
>>    lo->lo_state = Lo_rundown
>>    mutex_unlock(lo_mutex)
>>    __loop_clr_fd
> We have noticed that, at block layer, both open() and release() are protected by gendisk->open_mutex.
> So, this race may not happen. Can you please let me know if I understand correctly?

Yes, __loop_clr_fd from lo_release can't concurrent with lo_open.
>>
>> And with the respect, loop_clr_fd() has the same problem.

Did you check __loop_clr_fd from lo_ioctl?

Thanks,
Kuai

>>
>> I think probably loop need a open counter for itself.
> We are looking to see how to handle this case
>>
>> Thanks,
>> Kuai
>>
>>>> +	return err;
>>>> +}
>>>
>>> Most of this function uses spaces rather than tabs.
>>>
>
Christoph Hellwig May 23, 2024, 8:12 a.m. UTC | #6
On Tue, May 21, 2024 at 10:42:48PM +0000, Gulam Mohamed wrote:
> Description
> ===========

That's a weird way to format a patch.  Between this and the odd subject
not matching patch 2 I was tricked into thinking this was just a cover
letter and patch 1 was missing for a while.  Please take a look at other
patches/commit and try to word it similarly.

> V1->V2:
> 	Added a test case, 010, in blktests in tests/loop/

These kind of patch revision changelogs belong after the --- so that they
don't go into git history.  Or even better into the cover letter, which
is missing here.

> Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
> ---
>  drivers/block/loop.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 28a95fd366fe..9a235d8c062d 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode,
>  }
>  #endif
>  
> +static int lo_open(struct gendisk *disk, blk_mode_t mode)
> +{
> +        struct loop_device *lo = disk->private_data;
> +        int err;
> +
> +        if (!lo)
> +                return -ENXIO;

->private_data is never cleared, so the NULL check here doesn't
make sense.

> +        err = mutex_lock_killable(&lo->lo_mutex);
> +        if (err)
> +                return err;
> +
> +        if (lo->lo_state == Lo_rundown)
> +                err = -ENXIO;
> +        mutex_unlock(&lo->lo_mutex);

What if we race with setting Lo_rundown and that gets set right
after we unlock here?
Gulam Mohamed May 23, 2024, 6:39 p.m. UTC | #7
Hi,

> -----Original Message-----
> From: Christoph Hellwig <hch@lst.de>
> Sent: Thursday, May 23, 2024 1:42 PM
> To: Gulam Mohamed <gulam.mohamed@oracle.com>
> Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org;
> axboe@kernel.dk; shinichiro.kawasaki@wdc.com; chaitanyak@nvidia.com;
> hch@lst.de
> Subject: Re: [PATCH V2 for-6.10/block 1/2] loop: Fix a race between loop
> detach and loop open
> 
> On Tue, May 21, 2024 at 10:42:48PM +0000, Gulam Mohamed wrote:
> > Description
> > ===========
> 
> That's a weird way to format a patch.  Between this and the odd subject not
> matching patch 2 I was tricked into thinking this was just a cover letter and
> patch 1 was missing for a while.  Please take a look at other patches/commit
> and try to word it similarly.
I will take care of this in the next version.
> 
> > V1->V2:
> > 	Added a test case, 010, in blktests in tests/loop/
> 
> These kind of patch revision changelogs belong after the --- so that they don't
> go into git history.  Or even better into the cover letter, which is missing here.
> 
Sure. I will take care of this in the next version.
> > Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
> > ---
> >  drivers/block/loop.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c index
> > 28a95fd366fe..9a235d8c062d 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device
> > *bdev, blk_mode_t mode,  }  #endif
> >
> > +static int lo_open(struct gendisk *disk, blk_mode_t mode) {
> > +        struct loop_device *lo = disk->private_data;
> > +        int err;
> > +
> > +        if (!lo)
> > +                return -ENXIO;
> 
> ->private_data is never cleared, so the NULL check here doesn't
> make sense.
> 
> > +        err = mutex_lock_killable(&lo->lo_mutex);
> > +        if (err)
> > +                return err;
> > +
> > +        if (lo->lo_state == Lo_rundown)
> > +                err = -ENXIO;
> > +        mutex_unlock(&lo->lo_mutex);
> 
> What if we race with setting Lo_rundown and that gets set right after we
> unlock here?
Similar race was mentioned by Kuai in his comments. We think these race conditions can be resolved by bringing back the "lo->refcnt" ,
by reverting the commit a0e286b6a5b61d4da01bdf865071c4da417046d6 plus the above Lo_rundown check in lo_open.
diff mbox series

Patch

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 28a95fd366fe..9a235d8c062d 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1717,6 +1717,24 @@  static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode,
 }
 #endif
 
+static int lo_open(struct gendisk *disk, blk_mode_t mode)
+{
+        struct loop_device *lo = disk->private_data;
+        int err;
+
+        if (!lo)
+                return -ENXIO;
+
+        err = mutex_lock_killable(&lo->lo_mutex);
+        if (err)
+                return err;
+
+        if (lo->lo_state == Lo_rundown)
+                err = -ENXIO;
+        mutex_unlock(&lo->lo_mutex);
+	return err;
+}
+
 static void lo_release(struct gendisk *disk)
 {
 	struct loop_device *lo = disk->private_data;
@@ -1752,6 +1770,7 @@  static void lo_free_disk(struct gendisk *disk)
 
 static const struct block_device_operations lo_fops = {
 	.owner =	THIS_MODULE,
+	.open = 	lo_open,
 	.release =	lo_release,
 	.ioctl =	lo_ioctl,
 #ifdef CONFIG_COMPAT