diff mbox series

[1/2] block: RCU protect disk->conv_zones_bitmap

Message ID 20241106231323.8008-2-dlemoal@kernel.org (mailing list archive)
State New
Headers show
Series Introduce bdev_zone_is_seq() | expand

Commit Message

Damien Le Moal Nov. 6, 2024, 11:13 p.m. UTC
Ensure that a disk revalidation changing the conventional zones bitmap
of a disk does not cause invalid memory references when using the
disk_zone_is_conv() helper by RCU protecting the disk->conv_zones_bitmap
pointer.

disk_zone_is_conv() is modified to operate under the RCU read lock and
the function disk_set_conv_zones_bitmap() is added to update a disk
conv_zones_bitmap pointer using rcu_replace_pointer() with the disk
zone_wplugs_lock spinlock held.

disk_free_zone_resources() is modified to call
disk_update_zone_resources() with a NULL bitmap pointer to free the disk
conv_zones_bitmap. disk_set_conv_zones_bitmap() is also used in
disk_update_zone_resources() to set the new (revalidated) bitmap and
free the old one.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
---
 block/blk-zoned.c      | 43 ++++++++++++++++++++++++++++++------------
 include/linux/blkdev.h |  2 +-
 2 files changed, 32 insertions(+), 13 deletions(-)

Comments

Bart Van Assche Nov. 6, 2024, 11:20 p.m. UTC | #1
On 11/6/24 3:13 PM, Damien Le Moal wrote:
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index a287577d1ad6..7a7855555d6d 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -350,9 +350,14 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
>   
>   static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector)
>   {
> -	if (!disk->conv_zones_bitmap)
> -		return false;
> -	return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
> +	bool is_conv;
> +
> +	rcu_read_lock();
> +	is_conv = disk->conv_zones_bitmap &&
> +		test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
> +	rcu_read_unlock();
> +
> +	return is_conv;
>   }

The above code can be simplified significantly by using guard(rcu).

Otherwise the two patches in this series look good to me.

Thanks,

Bart.
Damien Le Moal Nov. 6, 2024, 11:44 p.m. UTC | #2
On 11/7/24 08:20, Bart Van Assche wrote:
> On 11/6/24 3:13 PM, Damien Le Moal wrote:
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index a287577d1ad6..7a7855555d6d 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -350,9 +350,14 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
>>   
>>   static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector)
>>   {
>> -	if (!disk->conv_zones_bitmap)
>> -		return false;
>> -	return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
>> +	bool is_conv;
>> +
>> +	rcu_read_lock();
>> +	is_conv = disk->conv_zones_bitmap &&
>> +		test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
>> +	rcu_read_unlock();
>> +
>> +	return is_conv;
>>   }
> 
> The above code can be simplified significantly by using guard(rcu).

I personally dislike very much annotations that hide code.
So unless Jens prefers using guard(rcu), I would prefer leaving the code as it is.

> Otherwise the two patches in this series look good to me.
> 
> Thanks,
> 
> Bart.
Jens Axboe Nov. 7, 2024, 12:02 a.m. UTC | #3
On 11/6/24 4:44 PM, Damien Le Moal wrote:
> On 11/7/24 08:20, Bart Van Assche wrote:
>> On 11/6/24 3:13 PM, Damien Le Moal wrote:
>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>> index a287577d1ad6..7a7855555d6d 100644
>>> --- a/block/blk-zoned.c
>>> +++ b/block/blk-zoned.c
>>> @@ -350,9 +350,14 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
>>>   
>>>   static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector)
>>>   {
>>> -	if (!disk->conv_zones_bitmap)
>>> -		return false;
>>> -	return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
>>> +	bool is_conv;
>>> +
>>> +	rcu_read_lock();
>>> +	is_conv = disk->conv_zones_bitmap &&
>>> +		test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
>>> +	rcu_read_unlock();
>>> +
>>> +	return is_conv;
>>>   }
>>
>> The above code can be simplified significantly by using guard(rcu).
> 
> I personally dislike very much annotations that hide code. So unless
> Jens prefers using guard(rcu), I would prefer leaving the code as it
> is.

I don't mind it, and I do use it myself when it makes sense - but imho
it's up to the person writing the code, particularly when it's their
code in the first place.
Damien Le Moal Nov. 7, 2024, 12:14 a.m. UTC | #4
On 11/7/24 09:02, Jens Axboe wrote:
> On 11/6/24 4:44 PM, Damien Le Moal wrote:
>> On 11/7/24 08:20, Bart Van Assche wrote:
>>> On 11/6/24 3:13 PM, Damien Le Moal wrote:
>>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>>>> index a287577d1ad6..7a7855555d6d 100644
>>>> --- a/block/blk-zoned.c
>>>> +++ b/block/blk-zoned.c
>>>> @@ -350,9 +350,14 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
>>>>   
>>>>   static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector)
>>>>   {
>>>> -	if (!disk->conv_zones_bitmap)
>>>> -		return false;
>>>> -	return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
>>>> +	bool is_conv;
>>>> +
>>>> +	rcu_read_lock();
>>>> +	is_conv = disk->conv_zones_bitmap &&
>>>> +		test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
>>>> +	rcu_read_unlock();
>>>> +
>>>> +	return is_conv;
>>>>   }
>>>
>>> The above code can be simplified significantly by using guard(rcu).
>>
>> I personally dislike very much annotations that hide code. So unless
>> Jens prefers using guard(rcu), I would prefer leaving the code as it
>> is.
> 
> I don't mind it, and I do use it myself when it makes sense - but imho
> it's up to the person writing the code, particularly when it's their
> code in the first place.

OK, then I would prefer leaving the code as is :)
Christoph Hellwig Nov. 7, 2024, 5:40 a.m. UTC | #5
As sparse rcu warnings got in my radar for something else, I did
run sparse over this and it complains.  It will need the little
gem below to fix (and a rebase of the second patch).  Otherwise the
series looks great and way better than my initial hack, thanks!

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 7a7855555d6d..bf4458b11720 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -350,11 +350,12 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
 
 static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector)
 {
+	unsigned long *bitmap;
 	bool is_conv;
 
 	rcu_read_lock();
-	is_conv = disk->conv_zones_bitmap &&
-		test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
+	bitmap = rcu_dereference(disk->conv_zones_bitmap);
+	is_conv = bitmap && test_bit(disk_zone_no(disk, sector), bitmap);
 	rcu_read_unlock();
 
 	return is_conv;
@@ -1467,11 +1468,10 @@ static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk,
 	unsigned long flags;
 
 	spin_lock_irqsave(&disk->zone_wplugs_lock, flags);
+	if (bitmap)
+		nr_conv_zones = bitmap_weight(bitmap, disk->nr_zones);
 	bitmap = rcu_replace_pointer(disk->conv_zones_bitmap, bitmap,
 				     lockdep_is_held(&disk->zone_wplugs_lock));
-	if (disk->conv_zones_bitmap)
-		nr_conv_zones = bitmap_weight(disk->conv_zones_bitmap,
-					      disk->nr_zones);
 	spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
 
 	kfree_rcu_mightsleep(bitmap);
Damien Le Moal Nov. 7, 2024, 5:44 a.m. UTC | #6
On 11/7/24 14:40, Christoph Hellwig wrote:
> As sparse rcu warnings got in my radar for something else, I did
> run sparse over this and it complains.  It will need the little
> gem below to fix (and a rebase of the second patch).  Otherwise the
> series looks great and way better than my initial hack, thanks!

OK. I will integrate this and send v2.

> 
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 7a7855555d6d..bf4458b11720 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -350,11 +350,12 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
>  
>  static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector)
>  {
> +	unsigned long *bitmap;
>  	bool is_conv;
>  
>  	rcu_read_lock();
> -	is_conv = disk->conv_zones_bitmap &&
> -		test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
> +	bitmap = rcu_dereference(disk->conv_zones_bitmap);
> +	is_conv = bitmap && test_bit(disk_zone_no(disk, sector), bitmap);
>  	rcu_read_unlock();
>  
>  	return is_conv;
> @@ -1467,11 +1468,10 @@ static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk,
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&disk->zone_wplugs_lock, flags);
> +	if (bitmap)
> +		nr_conv_zones = bitmap_weight(bitmap, disk->nr_zones);
>  	bitmap = rcu_replace_pointer(disk->conv_zones_bitmap, bitmap,
>  				     lockdep_is_held(&disk->zone_wplugs_lock));
> -	if (disk->conv_zones_bitmap)
> -		nr_conv_zones = bitmap_weight(disk->conv_zones_bitmap,
> -					      disk->nr_zones);
>  	spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
>  
>  	kfree_rcu_mightsleep(bitmap);
diff mbox series

Patch

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index a287577d1ad6..7a7855555d6d 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -350,9 +350,14 @@  int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
 
 static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector)
 {
-	if (!disk->conv_zones_bitmap)
-		return false;
-	return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
+	bool is_conv;
+
+	rcu_read_lock();
+	is_conv = disk->conv_zones_bitmap &&
+		test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap);
+	rcu_read_unlock();
+
+	return is_conv;
 }
 
 static bool disk_zone_is_last(struct gendisk *disk, struct blk_zone *zone)
@@ -1455,6 +1460,25 @@  static void disk_destroy_zone_wplugs_hash_table(struct gendisk *disk)
 	disk->zone_wplugs_hash_bits = 0;
 }
 
+static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk,
+					       unsigned long *bitmap)
+{
+	unsigned int nr_conv_zones = 0;
+	unsigned long flags;
+
+	spin_lock_irqsave(&disk->zone_wplugs_lock, flags);
+	bitmap = rcu_replace_pointer(disk->conv_zones_bitmap, bitmap,
+				     lockdep_is_held(&disk->zone_wplugs_lock));
+	if (disk->conv_zones_bitmap)
+		nr_conv_zones = bitmap_weight(disk->conv_zones_bitmap,
+					      disk->nr_zones);
+	spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
+
+	kfree_rcu_mightsleep(bitmap);
+
+	return nr_conv_zones;
+}
+
 void disk_free_zone_resources(struct gendisk *disk)
 {
 	if (!disk->zone_wplugs_pool)
@@ -1478,8 +1502,7 @@  void disk_free_zone_resources(struct gendisk *disk)
 	mempool_destroy(disk->zone_wplugs_pool);
 	disk->zone_wplugs_pool = NULL;
 
-	bitmap_free(disk->conv_zones_bitmap);
-	disk->conv_zones_bitmap = NULL;
+	disk_set_conv_zones_bitmap(disk, NULL);
 	disk->zone_capacity = 0;
 	disk->last_zone_capacity = 0;
 	disk->nr_zones = 0;
@@ -1538,17 +1561,15 @@  static int disk_update_zone_resources(struct gendisk *disk,
 				      struct blk_revalidate_zone_args *args)
 {
 	struct request_queue *q = disk->queue;
-	unsigned int nr_seq_zones, nr_conv_zones = 0;
+	unsigned int nr_seq_zones, nr_conv_zones;
 	unsigned int pool_size;
 	struct queue_limits lim;
 
 	disk->nr_zones = args->nr_zones;
 	disk->zone_capacity = args->zone_capacity;
 	disk->last_zone_capacity = args->last_zone_capacity;
-	swap(disk->conv_zones_bitmap, args->conv_zones_bitmap);
-	if (disk->conv_zones_bitmap)
-		nr_conv_zones = bitmap_weight(disk->conv_zones_bitmap,
-					      disk->nr_zones);
+	nr_conv_zones =
+		disk_set_conv_zones_bitmap(disk, args->conv_zones_bitmap);
 	if (nr_conv_zones >= disk->nr_zones) {
 		pr_warn("%s: Invalid number of conventional zones %u / %u\n",
 			disk->disk_name, nr_conv_zones, disk->nr_zones);
@@ -1817,8 +1838,6 @@  int blk_revalidate_disk_zones(struct gendisk *disk)
 		disk_free_zone_resources(disk);
 	blk_mq_unfreeze_queue(q);
 
-	kfree(args.conv_zones_bitmap);
-
 	return ret;
 }
 EXPORT_SYMBOL_GPL(blk_revalidate_disk_zones);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 6d1413bd69a5..5687eb2a019c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -195,7 +195,7 @@  struct gendisk {
 	unsigned int		nr_zones;
 	unsigned int		zone_capacity;
 	unsigned int		last_zone_capacity;
-	unsigned long		*conv_zones_bitmap;
+	unsigned long __rcu	*conv_zones_bitmap;
 	unsigned int            zone_wplugs_hash_bits;
 	spinlock_t              zone_wplugs_lock;
 	struct mempool_s	*zone_wplugs_pool;