diff mbox series

btrfs: fix the sleep inside RCU lock bug of is_same_device()

Message ID 0acb0e85c483ea5ee6d761583fcaa6efa3e92f01.1728037316.git.wqu@suse.com (mailing list archive)
State New, archived
Headers show
Series btrfs: fix the sleep inside RCU lock bug of is_same_device() | expand

Commit Message

Qu Wenruo Oct. 4, 2024, 10:22 a.m. UTC
Previous fix of the RCU lock is causing another bug, that kern_path()
can sleep and cause the sleep inside RCU bug.

This time fix the bug by pre-allocating a string buffer, and copy the
rcu string into that buffer to solve the problem.

Unfortunately this means every time a device scan will trigger a page
allocation and free.

This will be folded into the offending patch "btrfs: avoid unnecessary
device path update for the same device" again.

Reported-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/volumes.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Comments

Wang Yugui Oct. 4, 2024, 11:17 a.m. UTC | #1
Hi,

> Previous fix of the RCU lock is causing another bug, that kern_path()
> can sleep and cause the sleep inside RCU bug.
> 
> This time fix the bug by pre-allocating a string buffer, and copy the
> rcu string into that buffer to solve the problem.
> 
> Unfortunately this means every time a device scan will trigger a page
> allocation and free.
> 
> This will be folded into the offending patch "btrfs: avoid unnecessary
> device path update for the same device" again.
> 
> Reported-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/volumes.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 93704e11e611..5f5748ab6f9d 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -800,17 +800,24 @@ static bool is_same_device(struct btrfs_device *device, const char *new_path)
>  {
>  	struct path old = { .mnt = NULL, .dentry = NULL };
>  	struct path new = { .mnt = NULL, .dentry = NULL };
> -	char *old_path;
> +	char *old_path = NULL;
>  	bool is_same = false;
>  	int ret;
>  
>  	if (!device->name)
>  		goto out;
>  
> +	old_path = kzalloc(PATH_MAX, GFP_NOFS);
> +	if (!old_path)
> +		goto out;
> +
>  	rcu_read_lock();
> -	old_path = rcu_str_deref(device->name);
> -	ret = kern_path(old_path, LOOKUP_FOLLOW, &old);
> +	ret = strscpy(old_path, rcu_str_deref(device->name), PATH_MAX);

It seems a hardcode of rcu_string_strdup()?

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2024/10/04
Johannes Thumshirn Oct. 4, 2024, 11:28 a.m. UTC | #2
On 04.10.24 13:17, Wang Yugui wrote:
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 93704e11e611..5f5748ab6f9d 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -800,17 +800,24 @@ static bool is_same_device(struct btrfs_device *device, const char *new_path)
>>   {
>>   	struct path old = { .mnt = NULL, .dentry = NULL };
>>   	struct path new = { .mnt = NULL, .dentry = NULL };
>> -	char *old_path;
>> +	char *old_path = NULL;
>>   	bool is_same = false;
>>   	int ret;
>>   
>>   	if (!device->name)
>>   		goto out;
>>   
>> +	old_path = kzalloc(PATH_MAX, GFP_NOFS);
>> +	if (!old_path)
>> +		goto out;
>> +
>>   	rcu_read_lock();
>> -	old_path = rcu_str_deref(device->name);
>> -	ret = kern_path(old_path, LOOKUP_FOLLOW, &old);
>> +	ret = strscpy(old_path, rcu_str_deref(device->name), PATH_MAX);
> 
> It seems a hardcode of rcu_string_strdup()?

Not really, rcu_string_strdup() creates another rcu_string. This creates 
a new C string from a rcu_string.

It could be solved with rcu_string_strdup() but that would just 
complicate things IMHO.
Johannes Thumshirn Oct. 4, 2024, 11:29 a.m. UTC | #3
Thanks a lot,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
David Sterba Oct. 8, 2024, 5:23 p.m. UTC | #4
On Fri, Oct 04, 2024 at 07:52:00PM +0930, Qu Wenruo wrote:
> Previous fix of the RCU lock is causing another bug, that kern_path()
> can sleep and cause the sleep inside RCU bug.
> 
> This time fix the bug by pre-allocating a string buffer, and copy the
> rcu string into that buffer to solve the problem.
> 
> Unfortunately this means every time a device scan will trigger a page
> allocation and free.

I don't think this is a big issue, the memory is allocated for a very
short period of time. A sequence of device scan requests can even be
handed the same page from allocator (assuming that it's a device scan
that does not need to add new devices).
diff mbox series

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 93704e11e611..5f5748ab6f9d 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -800,17 +800,24 @@  static bool is_same_device(struct btrfs_device *device, const char *new_path)
 {
 	struct path old = { .mnt = NULL, .dentry = NULL };
 	struct path new = { .mnt = NULL, .dentry = NULL };
-	char *old_path;
+	char *old_path = NULL;
 	bool is_same = false;
 	int ret;
 
 	if (!device->name)
 		goto out;
 
+	old_path = kzalloc(PATH_MAX, GFP_NOFS);
+	if (!old_path)
+		goto out;
+
 	rcu_read_lock();
-	old_path = rcu_str_deref(device->name);
-	ret = kern_path(old_path, LOOKUP_FOLLOW, &old);
+	ret = strscpy(old_path, rcu_str_deref(device->name), PATH_MAX);
 	rcu_read_unlock();
+	if (ret < 0)
+		goto out;
+
+	ret = kern_path(old_path, LOOKUP_FOLLOW, &old);
 	if (ret)
 		goto out;
 	ret = kern_path(new_path, LOOKUP_FOLLOW, &new);
@@ -819,6 +826,7 @@  static bool is_same_device(struct btrfs_device *device, const char *new_path)
 	if (path_equal(&old, &new))
 		is_same = true;
 out:
+	kfree(old_path);
 	path_put(&old);
 	path_put(&new);
 	return is_same;