Message ID | 20241004123738.2964524-1-cascardo@igalia.com (mailing list archive) |
---|---|
State | Accepted |
Commit | befab3a278c59db0cc88c8799638064f6d3fd6f8 |
Headers | show |
Series | usb: typec: altmode should keep reference to parent | expand |
On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote: > The altmode device release refers to its parent device, but without keeping > a reference to it. > > When registering the altmode, get a reference to the parent and put it in > the release function. > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues > like this: Let me study what's going on in the drivers code. The children should _not_ be cleaned first before the parent. I'll have to come back to this on Monday. This really should not be necessary. > [ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000) > [ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000) > [ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000) > [ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000) > [ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000) > [ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000) > [ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000) > [ 46.612867] ================================================================== > [ 46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129 > [ 46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48 > [ 46.614538] > [ 46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535 > [ 46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 > [ 46.616042] Workqueue: events kobject_delayed_cleanup > [ 46.616446] Call Trace: > [ 46.616648] <TASK> > [ 46.616820] dump_stack_lvl+0x5b/0x7c > [ 46.617112] ? typec_altmode_release+0x38/0x129 > [ 46.617470] print_report+0x14c/0x49e > [ 46.617769] ? rcu_read_unlock_sched+0x56/0x69 > [ 46.618117] ? __virt_addr_valid+0x19a/0x1ab > [ 46.618456] ? kmem_cache_debug_flags+0xc/0x1d > [ 46.618807] ? typec_altmode_release+0x38/0x129 > [ 46.619161] kasan_report+0x8d/0xb4 > [ 46.619447] ? typec_altmode_release+0x38/0x129 > [ 46.619809] ? process_scheduled_works+0x3cb/0x85f > [ 46.620185] typec_altmode_release+0x38/0x129 > [ 46.620537] ? process_scheduled_works+0x3cb/0x85f > [ 46.620907] device_release+0xaf/0xf2 > [ 46.621206] kobject_delayed_cleanup+0x13b/0x17a > [ 46.621584] process_scheduled_works+0x4f6/0x85f > [ 46.621955] ? __pfx_process_scheduled_works+0x10/0x10 > [ 46.622353] ? hlock_class+0x31/0x9a > [ 46.622647] ? lock_acquired+0x361/0x3c3 > [ 46.622956] ? move_linked_works+0x46/0x7d > [ 46.623277] worker_thread+0x1ce/0x291 > [ 46.623582] ? __kthread_parkme+0xc8/0xdf > [ 46.623900] ? __pfx_worker_thread+0x10/0x10 > [ 46.624236] kthread+0x17e/0x190 > [ 46.624501] ? kthread+0xfb/0x190 > [ 46.624756] ? __pfx_kthread+0x10/0x10 > [ 46.625015] ret_from_fork+0x20/0x40 > [ 46.625268] ? __pfx_kthread+0x10/0x10 > [ 46.625532] ret_from_fork_asm+0x1a/0x30 > [ 46.625805] </TASK> > [ 46.625953] > [ 46.626056] Allocated by task 678: > [ 46.626287] kasan_save_stack+0x24/0x44 > [ 46.626555] kasan_save_track+0x14/0x2d > [ 46.626811] __kasan_kmalloc+0x3f/0x4d > [ 46.627049] __kmalloc_noprof+0x1bf/0x1f0 > [ 46.627362] typec_register_port+0x23/0x491 > [ 46.627698] cros_typec_probe+0x634/0xbb6 > [ 46.628026] platform_probe+0x47/0x8c > [ 46.628311] really_probe+0x20a/0x47d > [ 46.628605] device_driver_attach+0x39/0x72 > [ 46.628940] bind_store+0x87/0xd7 > [ 46.629213] kernfs_fop_write_iter+0x1aa/0x218 > [ 46.629574] vfs_write+0x1d6/0x29b > [ 46.629856] ksys_write+0xcd/0x13b > [ 46.630128] do_syscall_64+0xd4/0x139 > [ 46.630420] entry_SYSCALL_64_after_hwframe+0x76/0x7e > [ 46.630820] > [ 46.630946] Freed by task 48: > [ 46.631182] kasan_save_stack+0x24/0x44 > [ 46.631493] kasan_save_track+0x14/0x2d > [ 46.631799] kasan_save_free_info+0x3f/0x4d > [ 46.632144] __kasan_slab_free+0x37/0x45 > [ 46.632474] kfree+0x1d4/0x252 > [ 46.632725] device_release+0xaf/0xf2 > [ 46.633017] kobject_delayed_cleanup+0x13b/0x17a > [ 46.633388] process_scheduled_works+0x4f6/0x85f > [ 46.633764] worker_thread+0x1ce/0x291 > [ 46.634065] kthread+0x17e/0x190 > [ 46.634324] ret_from_fork+0x20/0x40 > [ 46.634621] ret_from_fork_asm+0x1a/0x30 > > Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> > --- > drivers/usb/typec/class.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c > index 9262fcd4144f..d61b4c74648d 100644 > --- a/drivers/usb/typec/class.c > +++ b/drivers/usb/typec/class.c > @@ -519,6 +519,7 @@ static void typec_altmode_release(struct device *dev) > typec_altmode_put_partner(alt); > > altmode_id_remove(alt->adev.dev.parent, alt->id); > + put_device(alt->adev.dev.parent); > kfree(alt); > } > > @@ -568,6 +569,8 @@ typec_register_altmode(struct device *parent, > alt->adev.dev.type = &typec_altmode_dev_type; > dev_set_name(&alt->adev.dev, "%s.%u", dev_name(parent), id); > > + get_device(alt->adev.dev.parent); > + > /* Link partners and plugs with the ports */ > if (!is_port) > typec_altmode_set_partner(alt); > -- > 2.34.1
On Fri, Oct 04, 2024 at 05:08:28PM +0300, Heikki Krogerus wrote: > On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote: > > The altmode device release refers to its parent device, but without keeping > > a reference to it. > > > > When registering the altmode, get a reference to the parent and put it in > > the release function. > > > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues > > like this: > > Let me study what's going on in the drivers code. The children should > _not_ be cleaned first before the parent. I'll have to come back to > this on Monday. > > This really should not be necessary. > Well, they are likely not. But driver core API states that either way, you should keep such references. And one way to test it is using CONFIG_DEBUG_KOBJECT_RELEASE. That delays the actual release/cleanup of the struct device, so: > > [ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000) > > [ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000) > > [ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000) > > [ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000) > > [ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000) 1) children (port1.0 and port1.1) last reference are put, but their actual release is delayed 4s. > > [ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000) > > [ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000) 2) parent (port1) is put, but release is delayed 3s. Just in the order you would expect, but because of the delays: 3) 3s later, port1 release is called and it is freed. 4) 4s later, port1.0 release is called and it refers to the freed parent, port1. Having the references, what happens now is: 1) port1 is put, but this is not its last reference. 2) port1.0 and port1.1 are put, cleanup delayed 4s. 3) 4s later, port1.0 and port1.1 releases are called, but now they put the last reference to port1. 4) port1 last reference has now been called, cleanup delayed 3s. 5) 3s later, port1 release is called, then freed. No UAF in such case. Thanks. Cascardo. > > [ 46.612867] ================================================================== > > [ 46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129 > > [ 46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48 > > [ 46.614538] > > [ 46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535 > > [ 46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 > > [ 46.616042] Workqueue: events kobject_delayed_cleanup > > [ 46.616446] Call Trace: > > [ 46.616648] <TASK> > > [ 46.616820] dump_stack_lvl+0x5b/0x7c > > [ 46.617112] ? typec_altmode_release+0x38/0x129 > > [ 46.617470] print_report+0x14c/0x49e > > [ 46.617769] ? rcu_read_unlock_sched+0x56/0x69 > > [ 46.618117] ? __virt_addr_valid+0x19a/0x1ab > > [ 46.618456] ? kmem_cache_debug_flags+0xc/0x1d > > [ 46.618807] ? typec_altmode_release+0x38/0x129 > > [ 46.619161] kasan_report+0x8d/0xb4 > > [ 46.619447] ? typec_altmode_release+0x38/0x129 > > [ 46.619809] ? process_scheduled_works+0x3cb/0x85f > > [ 46.620185] typec_altmode_release+0x38/0x129 > > [ 46.620537] ? process_scheduled_works+0x3cb/0x85f > > [ 46.620907] device_release+0xaf/0xf2 > > [ 46.621206] kobject_delayed_cleanup+0x13b/0x17a > > [ 46.621584] process_scheduled_works+0x4f6/0x85f > > [ 46.621955] ? __pfx_process_scheduled_works+0x10/0x10 > > [ 46.622353] ? hlock_class+0x31/0x9a > > [ 46.622647] ? lock_acquired+0x361/0x3c3 > > [ 46.622956] ? move_linked_works+0x46/0x7d > > [ 46.623277] worker_thread+0x1ce/0x291 > > [ 46.623582] ? __kthread_parkme+0xc8/0xdf > > [ 46.623900] ? __pfx_worker_thread+0x10/0x10 > > [ 46.624236] kthread+0x17e/0x190 > > [ 46.624501] ? kthread+0xfb/0x190 > > [ 46.624756] ? __pfx_kthread+0x10/0x10 > > [ 46.625015] ret_from_fork+0x20/0x40 > > [ 46.625268] ? __pfx_kthread+0x10/0x10 > > [ 46.625532] ret_from_fork_asm+0x1a/0x30 > > [ 46.625805] </TASK> > > [ 46.625953] > > [ 46.626056] Allocated by task 678: > > [ 46.626287] kasan_save_stack+0x24/0x44 > > [ 46.626555] kasan_save_track+0x14/0x2d > > [ 46.626811] __kasan_kmalloc+0x3f/0x4d > > [ 46.627049] __kmalloc_noprof+0x1bf/0x1f0 > > [ 46.627362] typec_register_port+0x23/0x491 > > [ 46.627698] cros_typec_probe+0x634/0xbb6 > > [ 46.628026] platform_probe+0x47/0x8c > > [ 46.628311] really_probe+0x20a/0x47d > > [ 46.628605] device_driver_attach+0x39/0x72 > > [ 46.628940] bind_store+0x87/0xd7 > > [ 46.629213] kernfs_fop_write_iter+0x1aa/0x218 > > [ 46.629574] vfs_write+0x1d6/0x29b > > [ 46.629856] ksys_write+0xcd/0x13b > > [ 46.630128] do_syscall_64+0xd4/0x139 > > [ 46.630420] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > [ 46.630820] > > [ 46.630946] Freed by task 48: > > [ 46.631182] kasan_save_stack+0x24/0x44 > > [ 46.631493] kasan_save_track+0x14/0x2d > > [ 46.631799] kasan_save_free_info+0x3f/0x4d > > [ 46.632144] __kasan_slab_free+0x37/0x45 > > [ 46.632474] kfree+0x1d4/0x252 > > [ 46.632725] device_release+0xaf/0xf2 > > [ 46.633017] kobject_delayed_cleanup+0x13b/0x17a > > [ 46.633388] process_scheduled_works+0x4f6/0x85f > > [ 46.633764] worker_thread+0x1ce/0x291 > > [ 46.634065] kthread+0x17e/0x190 > > [ 46.634324] ret_from_fork+0x20/0x40 > > [ 46.634621] ret_from_fork_asm+0x1a/0x30 > > > > Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") > > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> > > --- > > drivers/usb/typec/class.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c > > index 9262fcd4144f..d61b4c74648d 100644 > > --- a/drivers/usb/typec/class.c > > +++ b/drivers/usb/typec/class.c > > @@ -519,6 +519,7 @@ static void typec_altmode_release(struct device *dev) > > typec_altmode_put_partner(alt); > > > > altmode_id_remove(alt->adev.dev.parent, alt->id); > > + put_device(alt->adev.dev.parent); > > kfree(alt); > > } > > > > @@ -568,6 +569,8 @@ typec_register_altmode(struct device *parent, > > alt->adev.dev.type = &typec_altmode_dev_type; > > dev_set_name(&alt->adev.dev, "%s.%u", dev_name(parent), id); > > > > + get_device(alt->adev.dev.parent); > > + > > /* Link partners and plugs with the ports */ > > if (!is_port) > > typec_altmode_set_partner(alt); > > -- > > 2.34.1 > > -- > heikki >
On Fri, Oct 04, 2024 at 11:17:01AM GMT, Thadeu Lima de Souza Cascardo wrote: > On Fri, Oct 04, 2024 at 05:08:28PM +0300, Heikki Krogerus wrote: > > On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote: > > > The altmode device release refers to its parent device, but without keeping > > > a reference to it. > > > > > > When registering the altmode, get a reference to the parent and put it in > > > the release function. > > > > > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues > > > like this: > > > > Let me study what's going on in the drivers code. The children should > > _not_ be cleaned first before the parent. I'll have to come back to > > this on Monday. > > > > This really should not be necessary. > > > > Well, they are likely not. But driver core API states that either way, you > should keep such references. And one way to test it is using > CONFIG_DEBUG_KOBJECT_RELEASE. That delays the actual release/cleanup of the > struct device, so: > > > > > [ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000) > > > [ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000) > > > [ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000) > > > [ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000) > > > [ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000) > > 1) children (port1.0 and port1.1) last reference are put, but their actual > release is delayed 4s. > > > > [ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000) > > > [ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000) > > 2) parent (port1) is put, but release is delayed 3s. > > Just in the order you would expect, but because of the delays: > > 3) 3s later, port1 release is called and it is freed. > 4) 4s later, port1.0 release is called and it refers to the freed parent, > port1. > > Having the references, what happens now is: > > 1) port1 is put, but this is not its last reference. > 2) port1.0 and port1.1 are put, cleanup delayed 4s. > 3) 4s later, port1.0 and port1.1 releases are called, but now they put the > last reference to port1. > 4) port1 last reference has now been called, cleanup delayed 3s. > 5) 3s later, port1 release is called, then freed. > > No UAF in such case. Usually we don't use this config option, maybe I should also start using it for some of my tests. Nevertheless the description is pretty clear (although it might be better to add it to the commit message). Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Hi, On Fri, Oct 04, 2024 at 11:17:01AM -0300, Thadeu Lima de Souza Cascardo wrote: > On Fri, Oct 04, 2024 at 05:08:28PM +0300, Heikki Krogerus wrote: > > On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote: > > > The altmode device release refers to its parent device, but without keeping > > > a reference to it. > > > > > > When registering the altmode, get a reference to the parent and put it in > > > the release function. > > > > > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues > > > like this: > > > > Let me study what's going on in the drivers code. The children should > > _not_ be cleaned first before the parent. I'll have to come back to > > this on Monday. > > > > This really should not be necessary. > > > > Well, they are likely not. But driver core API states that either way, you > should keep such references. And one way to test it is using > CONFIG_DEBUG_KOBJECT_RELEASE. That delays the actual release/cleanup of the > struct device, so: What I want to understand is, what is the rationale for this behaviour in the core. If this is something that the drivers can do, then why is the core not doing it for everything? Why is the core decrementing the parent reference count already in device_del() instead of device_release()? I'm probable missing something, but I really want to understand this. There are other drivers also need resources from their parent, so if there is nothing preventing this from being handled in the core, then that's where it really should be handled. thanks,
On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote: > The altmode device release refers to its parent device, but without keeping > a reference to it. > > When registering the altmode, get a reference to the parent and put it in > the release function. > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues > like this: > > [ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000) > [ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000) > [ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000) > [ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000) > [ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000) > [ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000) > [ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000) > [ 46.612867] ================================================================== > [ 46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129 > [ 46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48 > [ 46.614538] > [ 46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535 > [ 46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 > [ 46.616042] Workqueue: events kobject_delayed_cleanup > [ 46.616446] Call Trace: > [ 46.616648] <TASK> > [ 46.616820] dump_stack_lvl+0x5b/0x7c > [ 46.617112] ? typec_altmode_release+0x38/0x129 > [ 46.617470] print_report+0x14c/0x49e > [ 46.617769] ? rcu_read_unlock_sched+0x56/0x69 > [ 46.618117] ? __virt_addr_valid+0x19a/0x1ab > [ 46.618456] ? kmem_cache_debug_flags+0xc/0x1d > [ 46.618807] ? typec_altmode_release+0x38/0x129 > [ 46.619161] kasan_report+0x8d/0xb4 > [ 46.619447] ? typec_altmode_release+0x38/0x129 > [ 46.619809] ? process_scheduled_works+0x3cb/0x85f > [ 46.620185] typec_altmode_release+0x38/0x129 > [ 46.620537] ? process_scheduled_works+0x3cb/0x85f > [ 46.620907] device_release+0xaf/0xf2 > [ 46.621206] kobject_delayed_cleanup+0x13b/0x17a > [ 46.621584] process_scheduled_works+0x4f6/0x85f > [ 46.621955] ? __pfx_process_scheduled_works+0x10/0x10 > [ 46.622353] ? hlock_class+0x31/0x9a > [ 46.622647] ? lock_acquired+0x361/0x3c3 > [ 46.622956] ? move_linked_works+0x46/0x7d > [ 46.623277] worker_thread+0x1ce/0x291 > [ 46.623582] ? __kthread_parkme+0xc8/0xdf > [ 46.623900] ? __pfx_worker_thread+0x10/0x10 > [ 46.624236] kthread+0x17e/0x190 > [ 46.624501] ? kthread+0xfb/0x190 > [ 46.624756] ? __pfx_kthread+0x10/0x10 > [ 46.625015] ret_from_fork+0x20/0x40 > [ 46.625268] ? __pfx_kthread+0x10/0x10 > [ 46.625532] ret_from_fork_asm+0x1a/0x30 > [ 46.625805] </TASK> > [ 46.625953] > [ 46.626056] Allocated by task 678: > [ 46.626287] kasan_save_stack+0x24/0x44 > [ 46.626555] kasan_save_track+0x14/0x2d > [ 46.626811] __kasan_kmalloc+0x3f/0x4d > [ 46.627049] __kmalloc_noprof+0x1bf/0x1f0 > [ 46.627362] typec_register_port+0x23/0x491 > [ 46.627698] cros_typec_probe+0x634/0xbb6 > [ 46.628026] platform_probe+0x47/0x8c > [ 46.628311] really_probe+0x20a/0x47d > [ 46.628605] device_driver_attach+0x39/0x72 > [ 46.628940] bind_store+0x87/0xd7 > [ 46.629213] kernfs_fop_write_iter+0x1aa/0x218 > [ 46.629574] vfs_write+0x1d6/0x29b > [ 46.629856] ksys_write+0xcd/0x13b > [ 46.630128] do_syscall_64+0xd4/0x139 > [ 46.630420] entry_SYSCALL_64_after_hwframe+0x76/0x7e > [ 46.630820] > [ 46.630946] Freed by task 48: > [ 46.631182] kasan_save_stack+0x24/0x44 > [ 46.631493] kasan_save_track+0x14/0x2d > [ 46.631799] kasan_save_free_info+0x3f/0x4d > [ 46.632144] __kasan_slab_free+0x37/0x45 > [ 46.632474] kfree+0x1d4/0x252 > [ 46.632725] device_release+0xaf/0xf2 > [ 46.633017] kobject_delayed_cleanup+0x13b/0x17a > [ 46.633388] process_scheduled_works+0x4f6/0x85f > [ 46.633764] worker_thread+0x1ce/0x291 > [ 46.634065] kthread+0x17e/0x190 > [ 46.634324] ret_from_fork+0x20/0x40 > [ 46.634621] ret_from_fork_asm+0x1a/0x30 > > Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> I tried to go through the code and the logs, but I did not really find any explanation why we couldn't keep the device release order in sync for all devices that need it. I'll see if we can improve that somehow in the core, but we can fix this one in the meantime. But this is a case of use-after-free, so shouldn't this go to the stable trees? Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> thanks,
On Mon, Oct 07, 2024 at 02:07:43PM +0300, Heikki Krogerus wrote: > On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote: > > The altmode device release refers to its parent device, but without keeping > > a reference to it. > > > > When registering the altmode, get a reference to the parent and put it in > > the release function. > > > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues > > like this: > > > > [ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000) > > [ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000) > > [ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000) > > [ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000) > > [ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000) > > [ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000) > > [ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000) > > [ 46.612867] ================================================================== > > [ 46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129 > > [ 46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48 > > [ 46.614538] > > [ 46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535 > > [ 46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 > > [ 46.616042] Workqueue: events kobject_delayed_cleanup > > [ 46.616446] Call Trace: > > [ 46.616648] <TASK> > > [ 46.616820] dump_stack_lvl+0x5b/0x7c > > [ 46.617112] ? typec_altmode_release+0x38/0x129 > > [ 46.617470] print_report+0x14c/0x49e > > [ 46.617769] ? rcu_read_unlock_sched+0x56/0x69 > > [ 46.618117] ? __virt_addr_valid+0x19a/0x1ab > > [ 46.618456] ? kmem_cache_debug_flags+0xc/0x1d > > [ 46.618807] ? typec_altmode_release+0x38/0x129 > > [ 46.619161] kasan_report+0x8d/0xb4 > > [ 46.619447] ? typec_altmode_release+0x38/0x129 > > [ 46.619809] ? process_scheduled_works+0x3cb/0x85f > > [ 46.620185] typec_altmode_release+0x38/0x129 > > [ 46.620537] ? process_scheduled_works+0x3cb/0x85f > > [ 46.620907] device_release+0xaf/0xf2 > > [ 46.621206] kobject_delayed_cleanup+0x13b/0x17a > > [ 46.621584] process_scheduled_works+0x4f6/0x85f > > [ 46.621955] ? __pfx_process_scheduled_works+0x10/0x10 > > [ 46.622353] ? hlock_class+0x31/0x9a > > [ 46.622647] ? lock_acquired+0x361/0x3c3 > > [ 46.622956] ? move_linked_works+0x46/0x7d > > [ 46.623277] worker_thread+0x1ce/0x291 > > [ 46.623582] ? __kthread_parkme+0xc8/0xdf > > [ 46.623900] ? __pfx_worker_thread+0x10/0x10 > > [ 46.624236] kthread+0x17e/0x190 > > [ 46.624501] ? kthread+0xfb/0x190 > > [ 46.624756] ? __pfx_kthread+0x10/0x10 > > [ 46.625015] ret_from_fork+0x20/0x40 > > [ 46.625268] ? __pfx_kthread+0x10/0x10 > > [ 46.625532] ret_from_fork_asm+0x1a/0x30 > > [ 46.625805] </TASK> > > [ 46.625953] > > [ 46.626056] Allocated by task 678: > > [ 46.626287] kasan_save_stack+0x24/0x44 > > [ 46.626555] kasan_save_track+0x14/0x2d > > [ 46.626811] __kasan_kmalloc+0x3f/0x4d > > [ 46.627049] __kmalloc_noprof+0x1bf/0x1f0 > > [ 46.627362] typec_register_port+0x23/0x491 > > [ 46.627698] cros_typec_probe+0x634/0xbb6 > > [ 46.628026] platform_probe+0x47/0x8c > > [ 46.628311] really_probe+0x20a/0x47d > > [ 46.628605] device_driver_attach+0x39/0x72 > > [ 46.628940] bind_store+0x87/0xd7 > > [ 46.629213] kernfs_fop_write_iter+0x1aa/0x218 > > [ 46.629574] vfs_write+0x1d6/0x29b > > [ 46.629856] ksys_write+0xcd/0x13b > > [ 46.630128] do_syscall_64+0xd4/0x139 > > [ 46.630420] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > [ 46.630820] > > [ 46.630946] Freed by task 48: > > [ 46.631182] kasan_save_stack+0x24/0x44 > > [ 46.631493] kasan_save_track+0x14/0x2d > > [ 46.631799] kasan_save_free_info+0x3f/0x4d > > [ 46.632144] __kasan_slab_free+0x37/0x45 > > [ 46.632474] kfree+0x1d4/0x252 > > [ 46.632725] device_release+0xaf/0xf2 > > [ 46.633017] kobject_delayed_cleanup+0x13b/0x17a > > [ 46.633388] process_scheduled_works+0x4f6/0x85f > > [ 46.633764] worker_thread+0x1ce/0x291 > > [ 46.634065] kthread+0x17e/0x190 > > [ 46.634324] ret_from_fork+0x20/0x40 > > [ 46.634621] ret_from_fork_asm+0x1a/0x30 > > > > Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") > > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> > > I tried to go through the code and the logs, but I did not really find > any explanation why we couldn't keep the device release order in sync > for all devices that need it. I'll see if we can improve that somehow > in the core, but we can fix this one in the meantime. > > But this is a case of use-after-free, so shouldn't this go to the > stable trees? Agreed. I am fine if Cc: stable@vger.kernel.org get added when this is applied. But otherwise, I can keep track in case the Fixes line does not trigger its inclusion. Thanks. Cascardo. > > Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> > > thanks, > > -- > heikki >
On Mon, Oct 07, 2024 at 11:33:09AM +0300, Heikki Krogerus wrote: > Hi, > > On Fri, Oct 04, 2024 at 11:17:01AM -0300, Thadeu Lima de Souza Cascardo wrote: > > On Fri, Oct 04, 2024 at 05:08:28PM +0300, Heikki Krogerus wrote: > > > On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote: > > > > The altmode device release refers to its parent device, but without keeping > > > > a reference to it. > > > > > > > > When registering the altmode, get a reference to the parent and put it in > > > > the release function. > > > > > > > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues > > > > like this: > > > > > > Let me study what's going on in the drivers code. The children should > > > _not_ be cleaned first before the parent. I'll have to come back to > > > this on Monday. > > > > > > This really should not be necessary. > > > > > > > Well, they are likely not. But driver core API states that either way, you > > should keep such references. And one way to test it is using > > CONFIG_DEBUG_KOBJECT_RELEASE. That delays the actual release/cleanup of the > > struct device, so: > > What I want to understand is, what is the rationale for this behaviour > in the core. If this is something that the drivers can do, then why is > the core not doing it for everything? Why is the core decrementing the > parent reference count already in device_del() instead of > device_release()? > > I'm probable missing something, but I really want to understand this. > There are other drivers also need resources from their parent, so if > there is nothing preventing this from being handled in the core, then > that's where it really should be handled. > > thanks, > > -- > heikki > It has been like that for 20+ years, it seems. I noticed that you have changed the behavior in the case of kobject parent lifetime relatively recently (back in 2020). One potential challenge of changing this is that dev->parent is usually set even before device_initialize gets called. At least, the reference must be get there at device_initialize. A quick glance here shows that the parent may be set between device_initialize and device_add, including in device_create_groups_vargs. Around 315 calls would need to be inspected, I would guess more than 100 callsites would need to be "fixed" for this change in the API. And, then, how many really need to keep the reference to the parent, and wouldn't keep such a reference cause any problems in any of these cases? Cascardo.
diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c index 9262fcd4144f..d61b4c74648d 100644 --- a/drivers/usb/typec/class.c +++ b/drivers/usb/typec/class.c @@ -519,6 +519,7 @@ static void typec_altmode_release(struct device *dev) typec_altmode_put_partner(alt); altmode_id_remove(alt->adev.dev.parent, alt->id); + put_device(alt->adev.dev.parent); kfree(alt); } @@ -568,6 +569,8 @@ typec_register_altmode(struct device *parent, alt->adev.dev.type = &typec_altmode_dev_type; dev_set_name(&alt->adev.dev, "%s.%u", dev_name(parent), id); + get_device(alt->adev.dev.parent); + /* Link partners and plugs with the ports */ if (!is_port) typec_altmode_set_partner(alt);
The altmode device release refers to its parent device, but without keeping a reference to it. When registering the altmode, get a reference to the parent and put it in the release function. Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues like this: [ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000) [ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000) [ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000) [ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000) [ 46.612867] ================================================================== [ 46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129 [ 46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48 [ 46.614538] [ 46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535 [ 46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 [ 46.616042] Workqueue: events kobject_delayed_cleanup [ 46.616446] Call Trace: [ 46.616648] <TASK> [ 46.616820] dump_stack_lvl+0x5b/0x7c [ 46.617112] ? typec_altmode_release+0x38/0x129 [ 46.617470] print_report+0x14c/0x49e [ 46.617769] ? rcu_read_unlock_sched+0x56/0x69 [ 46.618117] ? __virt_addr_valid+0x19a/0x1ab [ 46.618456] ? kmem_cache_debug_flags+0xc/0x1d [ 46.618807] ? typec_altmode_release+0x38/0x129 [ 46.619161] kasan_report+0x8d/0xb4 [ 46.619447] ? typec_altmode_release+0x38/0x129 [ 46.619809] ? process_scheduled_works+0x3cb/0x85f [ 46.620185] typec_altmode_release+0x38/0x129 [ 46.620537] ? process_scheduled_works+0x3cb/0x85f [ 46.620907] device_release+0xaf/0xf2 [ 46.621206] kobject_delayed_cleanup+0x13b/0x17a [ 46.621584] process_scheduled_works+0x4f6/0x85f [ 46.621955] ? __pfx_process_scheduled_works+0x10/0x10 [ 46.622353] ? hlock_class+0x31/0x9a [ 46.622647] ? lock_acquired+0x361/0x3c3 [ 46.622956] ? move_linked_works+0x46/0x7d [ 46.623277] worker_thread+0x1ce/0x291 [ 46.623582] ? __kthread_parkme+0xc8/0xdf [ 46.623900] ? __pfx_worker_thread+0x10/0x10 [ 46.624236] kthread+0x17e/0x190 [ 46.624501] ? kthread+0xfb/0x190 [ 46.624756] ? __pfx_kthread+0x10/0x10 [ 46.625015] ret_from_fork+0x20/0x40 [ 46.625268] ? __pfx_kthread+0x10/0x10 [ 46.625532] ret_from_fork_asm+0x1a/0x30 [ 46.625805] </TASK> [ 46.625953] [ 46.626056] Allocated by task 678: [ 46.626287] kasan_save_stack+0x24/0x44 [ 46.626555] kasan_save_track+0x14/0x2d [ 46.626811] __kasan_kmalloc+0x3f/0x4d [ 46.627049] __kmalloc_noprof+0x1bf/0x1f0 [ 46.627362] typec_register_port+0x23/0x491 [ 46.627698] cros_typec_probe+0x634/0xbb6 [ 46.628026] platform_probe+0x47/0x8c [ 46.628311] really_probe+0x20a/0x47d [ 46.628605] device_driver_attach+0x39/0x72 [ 46.628940] bind_store+0x87/0xd7 [ 46.629213] kernfs_fop_write_iter+0x1aa/0x218 [ 46.629574] vfs_write+0x1d6/0x29b [ 46.629856] ksys_write+0xcd/0x13b [ 46.630128] do_syscall_64+0xd4/0x139 [ 46.630420] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 46.630820] [ 46.630946] Freed by task 48: [ 46.631182] kasan_save_stack+0x24/0x44 [ 46.631493] kasan_save_track+0x14/0x2d [ 46.631799] kasan_save_free_info+0x3f/0x4d [ 46.632144] __kasan_slab_free+0x37/0x45 [ 46.632474] kfree+0x1d4/0x252 [ 46.632725] device_release+0xaf/0xf2 [ 46.633017] kobject_delayed_cleanup+0x13b/0x17a [ 46.633388] process_scheduled_works+0x4f6/0x85f [ 46.633764] worker_thread+0x1ce/0x291 [ 46.634065] kthread+0x17e/0x190 [ 46.634324] ret_from_fork+0x20/0x40 [ 46.634621] ret_from_fork_asm+0x1a/0x30 Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> --- drivers/usb/typec/class.c | 3 +++ 1 file changed, 3 insertions(+)