diff mbox series

[v2] i3c: Use i3cdev->desc->info instead of calling i3c_device_get_info() to avoid deadlock

Message ID 20241107132549.25439-1-defa.li@mediatek.com (mailing list archive)
State New
Headers show
Series [v2] i3c: Use i3cdev->desc->info instead of calling i3c_device_get_info() to avoid deadlock | expand

Commit Message

mtk25126 Nov. 7, 2024, 1:25 p.m. UTC
From: Defa Li <defa.li@mediatek.com>

A deadlock may happen since the i3c_master_register() acquires
&i3cbus->lock twice. See the log below.
Use i3cdev->desc->info instead of calling i3c_device_info() to
avoid acquiring the lock twice.

v2:
  - Modified the title and commit message

============================================
WARNING: possible recursive locking detected
6.11.0-mainline
--------------------------------------------
init/1 is trying to acquire lock:
f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_bus_normaluse_lock

but task is already holding lock:
f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_master_register

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&i3cbus->lock);
  lock(&i3cbus->lock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

2 locks held by init/1:
 #0: fcffff809b6798f8 (&dev->mutex){....}-{3:3}, at: __driver_attach
 #1: f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_master_register

stack backtrace:
CPU: 6 UID: 0 PID: 1 Comm: init
Call trace:
 dump_backtrace+0xfc/0x17c
 show_stack+0x18/0x28
 dump_stack_lvl+0x40/0xc0
 dump_stack+0x18/0x24
 print_deadlock_bug+0x388/0x390
 __lock_acquire+0x18bc/0x32ec
 lock_acquire+0x134/0x2b0
 down_read+0x50/0x19c
 i3c_bus_normaluse_lock+0x14/0x24
 i3c_device_get_info+0x24/0x58
 i3c_device_uevent+0x34/0xa4
 dev_uevent+0x310/0x384
 kobject_uevent_env+0x244/0x414
 kobject_uevent+0x14/0x20
 device_add+0x278/0x460
 device_register+0x20/0x34
 i3c_master_register_new_i3c_devs+0x78/0x154
 i3c_master_register+0x6a0/0x6d4
 mtk_i3c_master_probe+0x3b8/0x4d8
 platform_probe+0xa0/0xe0
 really_probe+0x114/0x454
 __driver_probe_device+0xa0/0x15c
 driver_probe_device+0x3c/0x1ac
 __driver_attach+0xc4/0x1f0
 bus_for_each_dev+0x104/0x160
 driver_attach+0x24/0x34
 bus_add_driver+0x14c/0x294
 driver_register+0x68/0x104
 __platform_driver_register+0x20/0x30
 init_module+0x20/0xfe4
 do_one_initcall+0x184/0x464
 do_init_module+0x58/0x1ec
 load_module+0xefc/0x10c8
 __arm64_sys_finit_module+0x238/0x33c
 invoke_syscall+0x58/0x10c
 el0_svc_common+0xa8/0xdc
 do_el0_svc+0x1c/0x28
 el0_svc+0x50/0xac
 el0t_64_sync_handler+0x70/0xbc
 el0t_64_sync+0x1a8/0x1ac

Signed-off-by: Defa Li <defa.li@mediatek.com>
---
 drivers/i3c/master.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Alexandre Belloni Nov. 18, 2024, 1:30 p.m. UTC | #1
On Thu, 07 Nov 2024 21:25:39 +0800, mtk25126 wrote:
> A deadlock may happen since the i3c_master_register() acquires
> &i3cbus->lock twice. See the log below.
> Use i3cdev->desc->info instead of calling i3c_device_info() to
> avoid acquiring the lock twice.
> 
> v2:
>   - Modified the title and commit message
> 
> [...]

Applied, thanks!

[1/1] i3c: Use i3cdev->desc->info instead of calling i3c_device_get_info() to avoid deadlock
      https://git.kernel.org/abelloni/c/6cf7b65f7029

Best regards,
Kees Bakker Nov. 19, 2024, 8:51 p.m. UTC | #2
Op 07-11-2024 om 14:25 schreef mtk25126:
> From: Defa Li <defa.li@mediatek.com>
>
> A deadlock may happen since the i3c_master_register() acquires
> &i3cbus->lock twice. See the log below.
> Use i3cdev->desc->info instead of calling i3c_device_info() to
> avoid acquiring the lock twice.
>
> v2:
>    - Modified the title and commit message
>
> ============================================
> WARNING: possible recursive locking detected
> 6.11.0-mainline
> --------------------------------------------
> init/1 is trying to acquire lock:
> f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_bus_normaluse_lock
>
> but task is already holding lock:
> f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_master_register
>
> other info that might help us debug this:
>   Possible unsafe locking scenario:
>
>         CPU0
>         ----
>    lock(&i3cbus->lock);
>    lock(&i3cbus->lock);
>
>   *** DEADLOCK ***
>
>   May be due to missing lock nesting notation
>
> 2 locks held by init/1:
>   #0: fcffff809b6798f8 (&dev->mutex){....}-{3:3}, at: __driver_attach
>   #1: f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_master_register
>
> stack backtrace:
> CPU: 6 UID: 0 PID: 1 Comm: init
> Call trace:
>   dump_backtrace+0xfc/0x17c
>   show_stack+0x18/0x28
>   dump_stack_lvl+0x40/0xc0
>   dump_stack+0x18/0x24
>   print_deadlock_bug+0x388/0x390
>   __lock_acquire+0x18bc/0x32ec
>   lock_acquire+0x134/0x2b0
>   down_read+0x50/0x19c
>   i3c_bus_normaluse_lock+0x14/0x24
>   i3c_device_get_info+0x24/0x58
>   i3c_device_uevent+0x34/0xa4
>   dev_uevent+0x310/0x384
>   kobject_uevent_env+0x244/0x414
>   kobject_uevent+0x14/0x20
>   device_add+0x278/0x460
>   device_register+0x20/0x34
>   i3c_master_register_new_i3c_devs+0x78/0x154
>   i3c_master_register+0x6a0/0x6d4
>   mtk_i3c_master_probe+0x3b8/0x4d8
>   platform_probe+0xa0/0xe0
>   really_probe+0x114/0x454
>   __driver_probe_device+0xa0/0x15c
>   driver_probe_device+0x3c/0x1ac
>   __driver_attach+0xc4/0x1f0
>   bus_for_each_dev+0x104/0x160
>   driver_attach+0x24/0x34
>   bus_add_driver+0x14c/0x294
>   driver_register+0x68/0x104
>   __platform_driver_register+0x20/0x30
>   init_module+0x20/0xfe4
>   do_one_initcall+0x184/0x464
>   do_init_module+0x58/0x1ec
>   load_module+0xefc/0x10c8
>   __arm64_sys_finit_module+0x238/0x33c
>   invoke_syscall+0x58/0x10c
>   el0_svc_common+0xa8/0xdc
>   do_el0_svc+0x1c/0x28
>   el0_svc+0x50/0xac
>   el0t_64_sync_handler+0x70/0xbc
>   el0t_64_sync+0x1a8/0x1ac
>
> Signed-off-by: Defa Li <defa.li@mediatek.com>
> ---
>   drivers/i3c/master.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/i3c/master.c b/drivers/i3c/master.c
> index 6f3eb710a75d..bb8a8bf0c4c7 100644
> --- a/drivers/i3c/master.c
> +++ b/drivers/i3c/master.c
> @@ -282,7 +282,8 @@ static int i3c_device_uevent(const struct device *dev, struct kobj_uevent_env *e
>   	struct i3c_device_info devinfo;
>   	u16 manuf, part, ext;
>   
> -	i3c_device_get_info(i3cdev, &devinfo);
> +	if (i3cdev->desc)
> +		devinfo = i3cdev->desc->info;
Now you have left devinfo unitialized if i3cdev->desc == NULL
You have to figure out what to do in that case.
>   	manuf = I3C_PID_MANUF_ID(devinfo.pid);
>   	part = I3C_PID_PART_ID(devinfo.pid);
>   	ext = I3C_PID_EXTRA_INFO(devinfo.pid);
Kees Bakker Nov. 19, 2024, 9:19 p.m. UTC | #3
Op 19-11-2024 om 21:51 schreef Kees Bakker:
> Op 07-11-2024 om 14:25 schreef mtk25126:
>> From: Defa Li <defa.li@mediatek.com>
>>
>> A deadlock may happen since the i3c_master_register() acquires
>> &i3cbus->lock twice. See the log below.
>> Use i3cdev->desc->info instead of calling i3c_device_info() to
>> avoid acquiring the lock twice.
>>
>> v2:
>>    - Modified the title and commit message
>>
>> ============================================
>> WARNING: possible recursive locking detected
>> 6.11.0-mainline
>> --------------------------------------------
>> init/1 is trying to acquire lock:
>> f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_bus_normaluse_lock
>>
>> but task is already holding lock:
>> f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: i3c_master_register
>>
>> other info that might help us debug this:
>>   Possible unsafe locking scenario:
>>
>>         CPU0
>>         ----
>>    lock(&i3cbus->lock);
>>    lock(&i3cbus->lock);
>>
>>   *** DEADLOCK ***
>>
>>   May be due to missing lock nesting notation
>>
>> 2 locks held by init/1:
>>   #0: fcffff809b6798f8 (&dev->mutex){....}-{3:3}, at: __driver_attach
>>   #1: f1ffff80a6a40dc0 (&i3cbus->lock){++++}-{3:3}, at: 
>> i3c_master_register
>>
>> stack backtrace:
>> CPU: 6 UID: 0 PID: 1 Comm: init
>> Call trace:
>>   dump_backtrace+0xfc/0x17c
>>   show_stack+0x18/0x28
>>   dump_stack_lvl+0x40/0xc0
>>   dump_stack+0x18/0x24
>>   print_deadlock_bug+0x388/0x390
>>   __lock_acquire+0x18bc/0x32ec
>>   lock_acquire+0x134/0x2b0
>>   down_read+0x50/0x19c
>>   i3c_bus_normaluse_lock+0x14/0x24
>>   i3c_device_get_info+0x24/0x58
>>   i3c_device_uevent+0x34/0xa4
>>   dev_uevent+0x310/0x384
>>   kobject_uevent_env+0x244/0x414
>>   kobject_uevent+0x14/0x20
>>   device_add+0x278/0x460
>>   device_register+0x20/0x34
>>   i3c_master_register_new_i3c_devs+0x78/0x154
>>   i3c_master_register+0x6a0/0x6d4
>>   mtk_i3c_master_probe+0x3b8/0x4d8
>>   platform_probe+0xa0/0xe0
>>   really_probe+0x114/0x454
>>   __driver_probe_device+0xa0/0x15c
>>   driver_probe_device+0x3c/0x1ac
>>   __driver_attach+0xc4/0x1f0
>>   bus_for_each_dev+0x104/0x160
>>   driver_attach+0x24/0x34
>>   bus_add_driver+0x14c/0x294
>>   driver_register+0x68/0x104
>>   __platform_driver_register+0x20/0x30
>>   init_module+0x20/0xfe4
>>   do_one_initcall+0x184/0x464
>>   do_init_module+0x58/0x1ec
>>   load_module+0xefc/0x10c8
>>   __arm64_sys_finit_module+0x238/0x33c
>>   invoke_syscall+0x58/0x10c
>>   el0_svc_common+0xa8/0xdc
>>   do_el0_svc+0x1c/0x28
>>   el0_svc+0x50/0xac
>>   el0t_64_sync_handler+0x70/0xbc
>>   el0t_64_sync+0x1a8/0x1ac
>>
>> Signed-off-by: Defa Li <defa.li@mediatek.com>
>> ---
>>   drivers/i3c/master.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/i3c/master.c b/drivers/i3c/master.c
>> index 6f3eb710a75d..bb8a8bf0c4c7 100644
>> --- a/drivers/i3c/master.c
>> +++ b/drivers/i3c/master.c
>> @@ -282,7 +282,8 @@ static int i3c_device_uevent(const struct device 
>> *dev, struct kobj_uevent_env *e
>>       struct i3c_device_info devinfo;
>>       u16 manuf, part, ext;
>>   -    i3c_device_get_info(i3cdev, &devinfo);
>> +    if (i3cdev->desc)
>> +        devinfo = i3cdev->desc->info;
> Now you have left devinfo unitialized if i3cdev->desc == NULL
> You have to figure out what to do in that case.
It looks like it is old prolem, already introduced in 2017, commit 
3a379bbcea.
There is a similar problem in modalias_show() and mctp_i3c_setup()

To be honest, I don't know the driver well enough. Maybe i3cdev->desc should
never be NULL. Perhaps we need a WARN_ON or WARN_ON_ONCE instead
of a simple NULL check. Still we need to exit the function if the unexpected
happens.
>>       manuf = I3C_PID_MANUF_ID(devinfo.pid);
>>       part = I3C_PID_PART_ID(devinfo.pid);
>>       ext = I3C_PID_EXTRA_INFO(devinfo.pid);
>
diff mbox series

Patch

diff --git a/drivers/i3c/master.c b/drivers/i3c/master.c
index 6f3eb710a75d..bb8a8bf0c4c7 100644
--- a/drivers/i3c/master.c
+++ b/drivers/i3c/master.c
@@ -282,7 +282,8 @@  static int i3c_device_uevent(const struct device *dev, struct kobj_uevent_env *e
 	struct i3c_device_info devinfo;
 	u16 manuf, part, ext;
 
-	i3c_device_get_info(i3cdev, &devinfo);
+	if (i3cdev->desc)
+		devinfo = i3cdev->desc->info;
 	manuf = I3C_PID_MANUF_ID(devinfo.pid);
 	part = I3C_PID_PART_ID(devinfo.pid);
 	ext = I3C_PID_EXTRA_INFO(devinfo.pid);