usb: typec: altmode should keep reference to parent

Message ID 20241004123738.2964524-1-cascardo@igalia.com (mailing list archive)
State Accepted
Commit befab3a278c59db0cc88c8799638064f6d3fd6f8

Commit Message

Thadeu Lima de Souza Cascardo Oct. 4, 2024, 12:37 p.m. UTC
The altmode device release refers to its parent device, but without keeping
a reference to it.

When registering the altmode, get a reference to the parent and put it in
the release function.

Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues
like this:

[   43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000)
[   43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000)
[   43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000)
[   43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000)
[   43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000)
[   43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000)
[   43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000)
[   46.612867] ==================================================================
[   46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129
[   46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48
[   46.614538]
[   46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535
[   46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[   46.616042] Workqueue: events kobject_delayed_cleanup
[   46.616446] Call Trace:
[   46.616648]  <TASK>
[   46.616820]  dump_stack_lvl+0x5b/0x7c
[   46.617112]  ? typec_altmode_release+0x38/0x129
[   46.617470]  print_report+0x14c/0x49e
[   46.617769]  ? rcu_read_unlock_sched+0x56/0x69
[   46.618117]  ? __virt_addr_valid+0x19a/0x1ab
[   46.618456]  ? kmem_cache_debug_flags+0xc/0x1d
[   46.618807]  ? typec_altmode_release+0x38/0x129
[   46.619161]  kasan_report+0x8d/0xb4
[   46.619447]  ? typec_altmode_release+0x38/0x129
[   46.619809]  ? process_scheduled_works+0x3cb/0x85f
[   46.620185]  typec_altmode_release+0x38/0x129
[   46.620537]  ? process_scheduled_works+0x3cb/0x85f
[   46.620907]  device_release+0xaf/0xf2
[   46.621206]  kobject_delayed_cleanup+0x13b/0x17a
[   46.621584]  process_scheduled_works+0x4f6/0x85f
[   46.621955]  ? __pfx_process_scheduled_works+0x10/0x10
[   46.622353]  ? hlock_class+0x31/0x9a
[   46.622647]  ? lock_acquired+0x361/0x3c3
[   46.622956]  ? move_linked_works+0x46/0x7d
[   46.623277]  worker_thread+0x1ce/0x291
[   46.623582]  ? __kthread_parkme+0xc8/0xdf
[   46.623900]  ? __pfx_worker_thread+0x10/0x10
[   46.624236]  kthread+0x17e/0x190
[   46.624501]  ? kthread+0xfb/0x190
[   46.624756]  ? __pfx_kthread+0x10/0x10
[   46.625015]  ret_from_fork+0x20/0x40
[   46.625268]  ? __pfx_kthread+0x10/0x10
[   46.625532]  ret_from_fork_asm+0x1a/0x30
[   46.625805]  </TASK>
[   46.625953]
[   46.626056] Allocated by task 678:
[   46.626287]  kasan_save_stack+0x24/0x44
[   46.626555]  kasan_save_track+0x14/0x2d
[   46.626811]  __kasan_kmalloc+0x3f/0x4d
[   46.627049]  __kmalloc_noprof+0x1bf/0x1f0
[   46.627362]  typec_register_port+0x23/0x491
[   46.627698]  cros_typec_probe+0x634/0xbb6
[   46.628026]  platform_probe+0x47/0x8c
[   46.628311]  really_probe+0x20a/0x47d
[   46.628605]  device_driver_attach+0x39/0x72
[   46.628940]  bind_store+0x87/0xd7
[   46.629213]  kernfs_fop_write_iter+0x1aa/0x218
[   46.629574]  vfs_write+0x1d6/0x29b
[   46.629856]  ksys_write+0xcd/0x13b
[   46.630128]  do_syscall_64+0xd4/0x139
[   46.630420]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   46.630820]
[   46.630946] Freed by task 48:
[   46.631182]  kasan_save_stack+0x24/0x44
[   46.631493]  kasan_save_track+0x14/0x2d
[   46.631799]  kasan_save_free_info+0x3f/0x4d
[   46.632144]  __kasan_slab_free+0x37/0x45
[   46.632474]  kfree+0x1d4/0x252
[   46.632725]  device_release+0xaf/0xf2
[   46.633017]  kobject_delayed_cleanup+0x13b/0x17a
[   46.633388]  process_scheduled_works+0x4f6/0x85f
[   46.633764]  worker_thread+0x1ce/0x291
[   46.634065]  kthread+0x17e/0x190
[   46.634324]  ret_from_fork+0x20/0x40
[   46.634621]  ret_from_fork_asm+0x1a/0x30

Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
---
 drivers/usb/typec/class.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Heikki Krogerus Oct. 4, 2024, 2:08 p.m. UTC | #1
On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote:
> The altmode device release refers to its parent device, but without keeping
> a reference to it.
> 
> When registering the altmode, get a reference to the parent and put it in
> the release function.
> 
> Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues
> like this:

Let me study what's going on in the drivers code. The children should
_not_ be cleaned first before the parent. I'll have to come back to
this on Monday.

This really should not be necessary.

> [   43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000)
> [   43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000)
> [   43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000)
> [   43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000)
> [   43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000)
> [   43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000)
> [   43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000)
> [   46.612867] ==================================================================
> [   46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129
> [   46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48
> [   46.614538]
> [   46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535
> [   46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> [   46.616042] Workqueue: events kobject_delayed_cleanup
> [   46.616446] Call Trace:
> [   46.616648]  <TASK>
> [   46.616820]  dump_stack_lvl+0x5b/0x7c
> [   46.617112]  ? typec_altmode_release+0x38/0x129
> [   46.617470]  print_report+0x14c/0x49e
> [   46.617769]  ? rcu_read_unlock_sched+0x56/0x69
> [   46.618117]  ? __virt_addr_valid+0x19a/0x1ab
> [   46.618456]  ? kmem_cache_debug_flags+0xc/0x1d
> [   46.618807]  ? typec_altmode_release+0x38/0x129
> [   46.619161]  kasan_report+0x8d/0xb4
> [   46.619447]  ? typec_altmode_release+0x38/0x129
> [   46.619809]  ? process_scheduled_works+0x3cb/0x85f
> [   46.620185]  typec_altmode_release+0x38/0x129
> [   46.620537]  ? process_scheduled_works+0x3cb/0x85f
> [   46.620907]  device_release+0xaf/0xf2
> [   46.621206]  kobject_delayed_cleanup+0x13b/0x17a
> [   46.621584]  process_scheduled_works+0x4f6/0x85f
> [   46.621955]  ? __pfx_process_scheduled_works+0x10/0x10
> [   46.622353]  ? hlock_class+0x31/0x9a
> [   46.622647]  ? lock_acquired+0x361/0x3c3
> [   46.622956]  ? move_linked_works+0x46/0x7d
> [   46.623277]  worker_thread+0x1ce/0x291
> [   46.623582]  ? __kthread_parkme+0xc8/0xdf
> [   46.623900]  ? __pfx_worker_thread+0x10/0x10
> [   46.624236]  kthread+0x17e/0x190
> [   46.624501]  ? kthread+0xfb/0x190
> [   46.624756]  ? __pfx_kthread+0x10/0x10
> [   46.625015]  ret_from_fork+0x20/0x40
> [   46.625268]  ? __pfx_kthread+0x10/0x10
> [   46.625532]  ret_from_fork_asm+0x1a/0x30
> [   46.625805]  </TASK>
> [   46.625953]
> [   46.626056] Allocated by task 678:
> [   46.626287]  kasan_save_stack+0x24/0x44
> [   46.626555]  kasan_save_track+0x14/0x2d
> [   46.626811]  __kasan_kmalloc+0x3f/0x4d
> [   46.627049]  __kmalloc_noprof+0x1bf/0x1f0
> [   46.627362]  typec_register_port+0x23/0x491
> [   46.627698]  cros_typec_probe+0x634/0xbb6
> [   46.628026]  platform_probe+0x47/0x8c
> [   46.628311]  really_probe+0x20a/0x47d
> [   46.628605]  device_driver_attach+0x39/0x72
> [   46.628940]  bind_store+0x87/0xd7
> [   46.629213]  kernfs_fop_write_iter+0x1aa/0x218
> [   46.629574]  vfs_write+0x1d6/0x29b
> [   46.629856]  ksys_write+0xcd/0x13b
> [   46.630128]  do_syscall_64+0xd4/0x139
> [   46.630420]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   46.630820]
> [   46.630946] Freed by task 48:
> [   46.631182]  kasan_save_stack+0x24/0x44
> [   46.631493]  kasan_save_track+0x14/0x2d
> [   46.631799]  kasan_save_free_info+0x3f/0x4d
> [   46.632144]  __kasan_slab_free+0x37/0x45
> [   46.632474]  kfree+0x1d4/0x252
> [   46.632725]  device_release+0xaf/0xf2
> [   46.633017]  kobject_delayed_cleanup+0x13b/0x17a
> [   46.633388]  process_scheduled_works+0x4f6/0x85f
> [   46.633764]  worker_thread+0x1ce/0x291
> [   46.634065]  kthread+0x17e/0x190
> [   46.634324]  ret_from_fork+0x20/0x40
> [   46.634621]  ret_from_fork_asm+0x1a/0x30
> 
> Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes")
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> ---
>  drivers/usb/typec/class.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
> index 9262fcd4144f..d61b4c74648d 100644
> --- a/drivers/usb/typec/class.c
> +++ b/drivers/usb/typec/class.c
> @@ -519,6 +519,7 @@ static void typec_altmode_release(struct device *dev)
>  		typec_altmode_put_partner(alt);
>  
>  	altmode_id_remove(alt->adev.dev.parent, alt->id);
> +	put_device(alt->adev.dev.parent);
>  	kfree(alt);
>  }
>  
> @@ -568,6 +569,8 @@ typec_register_altmode(struct device *parent,
>  	alt->adev.dev.type = &typec_altmode_dev_type;
>  	dev_set_name(&alt->adev.dev, "%s.%u", dev_name(parent), id);
>  
> +	get_device(alt->adev.dev.parent);
> +
>  	/* Link partners and plugs with the ports */
>  	if (!is_port)
>  		typec_altmode_set_partner(alt);
> -- 
> 2.34.1
Thadeu Lima de Souza Cascardo Oct. 4, 2024, 2:17 p.m. UTC | #2
On Fri, Oct 04, 2024 at 05:08:28PM +0300, Heikki Krogerus wrote:
> On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote:
> > The altmode device release refers to its parent device, but without keeping
> > a reference to it.
> > 
> > When registering the altmode, get a reference to the parent and put it in
> > the release function.
> > 
> > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues
> > like this:
> 
> Let me study what's going on in the drivers code. The children should
> _not_ be cleaned first before the parent. I'll have to come back to
> this on Monday.
> 
> This really should not be necessary.
> 

Well, they are likely not. But driver core API states that either way, you
should keep such references. And one way to test it is using
CONFIG_DEBUG_KOBJECT_RELEASE. That delays the actual release/cleanup of the
struct device, so:


> > [   43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000)
> > [   43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000)
> > [   43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000)
> > [   43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000)
> > [   43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000)

1) The last references to the children (port1.0 and port1.1) are put, but
their actual release is delayed 4s.

> > [   43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000)
> > [   43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000)

2) parent (port1) is put, but release is delayed 3s.

Just in the order you would expect, but because of the delays:

3) 3s later, port1 release is called and it is freed.
4) 4s later, port1.0 release is called and it refers to the freed parent,
port1.

Having the references, what happens now is:

1) port1 is put, but this is not its last reference.
2) port1.0 and port1.1 are put, cleanup delayed 4s.
3) 4s later, port1.0 and port1.1 releases are called, but now they put the
last reference to port1.
4) port1's last reference has now been put, cleanup delayed 3s.
5) 3s later, port1 release is called, then freed.

No UAF in such case.
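
The two timelines above can be replayed in a small user-space model. This is
only a sketch: struct toy_dev, toy_get(), toy_put() and the simulated clock
are hypothetical stand-ins, not the kernel's struct device or kobject code.
The only assumption carried over from the log is that dropping the last
reference schedules the release after a per-object delay (3s for the port,
4s for the altmode), the way CONFIG_DEBUG_KOBJECT_RELEASE does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a refcounted device; not the kernel's struct device. */
struct toy_dev {
	int refs;               /* reference count */
	int delay;              /* simulated release delay, in seconds */
	int release_at;         /* time the release runs, -1 if not scheduled */
	bool freed;             /* set once the release has run */
	struct toy_dev *parent;
	bool pins_parent;       /* true if we hold our own parent reference */
};

static int now;                 /* simulated clock, in seconds */
static int uaf_count;           /* releases that touched a freed parent */

static void toy_get(struct toy_dev *d)
{
	d->refs++;
}

/* Dropping the last reference only schedules the release. */
static void toy_put(struct toy_dev *d)
{
	if (--d->refs == 0)
		d->release_at = now + d->delay;
}

/* The release looks at the parent, as typec_altmode_release() does. */
static void toy_release(struct toy_dev *d)
{
	if (d->parent) {
		if (d->parent->freed)
			uaf_count++;    /* this is where KASAN would complain */
		if (d->pins_parent)
			toy_put(d->parent);
	}
	d->freed = true;
}

/* Advance the clock one second at a time, running releases as they come due. */
static void toy_run(struct toy_dev **devs, size_t n, int until)
{
	for (; now <= until; now++)
		for (size_t i = 0; i < n; i++)
			if (!devs[i]->freed && devs[i]->release_at == now)
				toy_release(devs[i]);
}

/* Replays the teardown; returns the number of simulated use-after-frees. */
static int toy_scenario(bool pin_parent)
{
	struct toy_dev port = { .refs = 1, .delay = 3, .release_at = -1 };
	struct toy_dev alt = { .refs = 1, .delay = 4, .release_at = -1,
			       .parent = &port, .pins_parent = pin_parent };
	struct toy_dev *devs[] = { &alt, &port };

	now = 0;
	uaf_count = 0;
	if (pin_parent)
		toy_get(&port); /* the reference taken at registration */
	toy_put(&alt);          /* child first, release delayed 4s */
	toy_put(&port);         /* then parent, release delayed 3s */
	toy_run(devs, 2, 10);
	return uaf_count;
}
```

In this model toy_scenario(false) reproduces the first timeline: the port is
released at 3s and the altmode release at 4s finds its parent already freed.
toy_scenario(true) follows the second timeline: the port's last reference is
only dropped from the altmode release, so the port is released after it.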

Thanks.
Cascardo.

> > [   46.612867] ==================================================================
> > [   46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129
> > [   46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48
> > [   46.614538]
> > [   46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535
> > [   46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > [   46.616042] Workqueue: events kobject_delayed_cleanup
> > [   46.616446] Call Trace:
> > [   46.616648]  <TASK>
> > [   46.616820]  dump_stack_lvl+0x5b/0x7c
> > [   46.617112]  ? typec_altmode_release+0x38/0x129
> > [   46.617470]  print_report+0x14c/0x49e
> > [   46.617769]  ? rcu_read_unlock_sched+0x56/0x69
> > [   46.618117]  ? __virt_addr_valid+0x19a/0x1ab
> > [   46.618456]  ? kmem_cache_debug_flags+0xc/0x1d
> > [   46.618807]  ? typec_altmode_release+0x38/0x129
> > [   46.619161]  kasan_report+0x8d/0xb4
> > [   46.619447]  ? typec_altmode_release+0x38/0x129
> > [   46.619809]  ? process_scheduled_works+0x3cb/0x85f
> > [   46.620185]  typec_altmode_release+0x38/0x129
> > [   46.620537]  ? process_scheduled_works+0x3cb/0x85f
> > [   46.620907]  device_release+0xaf/0xf2
> > [   46.621206]  kobject_delayed_cleanup+0x13b/0x17a
> > [   46.621584]  process_scheduled_works+0x4f6/0x85f
> > [   46.621955]  ? __pfx_process_scheduled_works+0x10/0x10
> > [   46.622353]  ? hlock_class+0x31/0x9a
> > [   46.622647]  ? lock_acquired+0x361/0x3c3
> > [   46.622956]  ? move_linked_works+0x46/0x7d
> > [   46.623277]  worker_thread+0x1ce/0x291
> > [   46.623582]  ? __kthread_parkme+0xc8/0xdf
> > [   46.623900]  ? __pfx_worker_thread+0x10/0x10
> > [   46.624236]  kthread+0x17e/0x190
> > [   46.624501]  ? kthread+0xfb/0x190
> > [   46.624756]  ? __pfx_kthread+0x10/0x10
> > [   46.625015]  ret_from_fork+0x20/0x40
> > [   46.625268]  ? __pfx_kthread+0x10/0x10
> > [   46.625532]  ret_from_fork_asm+0x1a/0x30
> > [   46.625805]  </TASK>
> > [   46.625953]
> > [   46.626056] Allocated by task 678:
> > [   46.626287]  kasan_save_stack+0x24/0x44
> > [   46.626555]  kasan_save_track+0x14/0x2d
> > [   46.626811]  __kasan_kmalloc+0x3f/0x4d
> > [   46.627049]  __kmalloc_noprof+0x1bf/0x1f0
> > [   46.627362]  typec_register_port+0x23/0x491
> > [   46.627698]  cros_typec_probe+0x634/0xbb6
> > [   46.628026]  platform_probe+0x47/0x8c
> > [   46.628311]  really_probe+0x20a/0x47d
> > [   46.628605]  device_driver_attach+0x39/0x72
> > [   46.628940]  bind_store+0x87/0xd7
> > [   46.629213]  kernfs_fop_write_iter+0x1aa/0x218
> > [   46.629574]  vfs_write+0x1d6/0x29b
> > [   46.629856]  ksys_write+0xcd/0x13b
> > [   46.630128]  do_syscall_64+0xd4/0x139
> > [   46.630420]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [   46.630820]
> > [   46.630946] Freed by task 48:
> > [   46.631182]  kasan_save_stack+0x24/0x44
> > [   46.631493]  kasan_save_track+0x14/0x2d
> > [   46.631799]  kasan_save_free_info+0x3f/0x4d
> > [   46.632144]  __kasan_slab_free+0x37/0x45
> > [   46.632474]  kfree+0x1d4/0x252
> > [   46.632725]  device_release+0xaf/0xf2
> > [   46.633017]  kobject_delayed_cleanup+0x13b/0x17a
> > [   46.633388]  process_scheduled_works+0x4f6/0x85f
> > [   46.633764]  worker_thread+0x1ce/0x291
> > [   46.634065]  kthread+0x17e/0x190
> > [   46.634324]  ret_from_fork+0x20/0x40
> > [   46.634621]  ret_from_fork_asm+0x1a/0x30
> > 
> > Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes")
> > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> > ---
> >  drivers/usb/typec/class.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
> > index 9262fcd4144f..d61b4c74648d 100644
> > --- a/drivers/usb/typec/class.c
> > +++ b/drivers/usb/typec/class.c
> > @@ -519,6 +519,7 @@ static void typec_altmode_release(struct device *dev)
> >  		typec_altmode_put_partner(alt);
> >  
> >  	altmode_id_remove(alt->adev.dev.parent, alt->id);
> > +	put_device(alt->adev.dev.parent);
> >  	kfree(alt);
> >  }
> >  
> > @@ -568,6 +569,8 @@ typec_register_altmode(struct device *parent,
> >  	alt->adev.dev.type = &typec_altmode_dev_type;
> >  	dev_set_name(&alt->adev.dev, "%s.%u", dev_name(parent), id);
> >  
> > +	get_device(alt->adev.dev.parent);
> > +
> >  	/* Link partners and plugs with the ports */
> >  	if (!is_port)
> >  		typec_altmode_set_partner(alt);
> > -- 
> > 2.34.1
> 
> -- 
> heikki
>
Dmitry Baryshkov Oct. 6, 2024, 3:24 p.m. UTC | #3
On Fri, Oct 04, 2024 at 11:17:01AM GMT, Thadeu Lima de Souza Cascardo wrote:
> On Fri, Oct 04, 2024 at 05:08:28PM +0300, Heikki Krogerus wrote:
> > On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote:
> > > The altmode device release refers to its parent device, but without keeping
> > > a reference to it.
> > > 
> > > When registering the altmode, get a reference to the parent and put it in
> > > the release function.
> > > 
> > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues
> > > like this:
> > 
> > Let me study what's going on in the drivers code. The children should
> > _not_ be cleaned first before the parent. I'll have to come back to
> > this on Monday.
> > 
> > This really should not be necessary.
> > 
> 
> Well, they are likely not. But driver core API states that either way, you
> should keep such references. And one way to test it is using
> CONFIG_DEBUG_KOBJECT_RELEASE. That delays the actual release/cleanup of the
> struct device, so:
> 
> 
> > > [   43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000)
> > > [   43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000)
> > > [   43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000)
> > > [   43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000)
> > > [   43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000)
> 
> 1) The last references to the children (port1.0 and port1.1) are put, but
> their actual release is delayed 4s.
> 
> > > [   43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000)
> > > [   43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000)
> 
> 2) parent (port1) is put, but release is delayed 3s.
> 
> Just in the order you would expect, but because of the delays:
> 
> 3) 3s later, port1 release is called and it is freed.
> 4) 4s later, port1.0 release is called and it refers to the freed parent,
> port1.
> 
> Having the references, what happens now is:
> 
> 1) port1 is put, but this is not its last reference.
> 2) port1.0 and port1.1 are put, cleanup delayed 4s.
> 3) 4s later, port1.0 and port1.1 releases are called, but now they put the
> last reference to port1.
> 4) port1's last reference has now been put, cleanup delayed 3s.
> 5) 3s later, port1 release is called, then freed.
> 
> No UAF in such case.

Usually we don't use this config option, maybe I should also start using
it for some of my tests. Nevertheless the description is pretty clear
(although it might be better to add it to the commit message).

Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Heikki Krogerus Oct. 7, 2024, 8:33 a.m. UTC | #4
Hi,

On Fri, Oct 04, 2024 at 11:17:01AM -0300, Thadeu Lima de Souza Cascardo wrote:
> On Fri, Oct 04, 2024 at 05:08:28PM +0300, Heikki Krogerus wrote:
> > On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote:
> > > The altmode device release refers to its parent device, but without keeping
> > > a reference to it.
> > > 
> > > When registering the altmode, get a reference to the parent and put it in
> > > the release function.
> > > 
> > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues
> > > like this:
> > 
> > Let me study what's going on in the drivers code. The children should
> > _not_ be cleaned first before the parent. I'll have to come back to
> > this on Monday.
> > 
> > This really should not be necessary.
> > 
> 
> Well, they are likely not. But driver core API states that either way, you
> should keep such references. And one way to test it is using
> CONFIG_DEBUG_KOBJECT_RELEASE. That delays the actual release/cleanup of the
> struct device, so:

What I want to understand is, what is the rationale for this behaviour
in the core. If this is something that the drivers can do, then why is
the core not doing it for everything? Why is the core decrementing the
parent reference count already in device_del() instead of
device_release()?

I'm probably missing something, but I really want to understand this.
There are other drivers that also need resources from their parent, so if
there is nothing preventing this from being handled in the core, then
that's where it really should be handled.
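
To make the question concrete, here is a rough user-space sketch of the
behaviour being described. All names (struct mdev, mdev_add(), mdev_del() and
so on) are made up for illustration. It assumes, as stated above, that the
core drops its parent reference at unregistration time, and additionally
assumes that this reference was taken at registration time; it does not
model anything else the driver core does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model; names are hypothetical and only loosely mirror the core. */
struct mdev {
	int refs;                       /* reference count */
	bool released;                  /* set once the last reference is gone */
	struct mdev *parent;
	void (*release)(struct mdev *);
};

static bool parent_alive_in_release;    /* recorded by the child's release */

static void mdev_init(struct mdev *d, struct mdev *parent,
		      void (*release)(struct mdev *))
{
	d->refs = 1;
	d->released = false;
	d->parent = parent;
	d->release = release;
}

static void mdev_get(struct mdev *d)
{
	if (d)
		d->refs++;
}

static void mdev_put(struct mdev *d)
{
	if (d && --d->refs == 0) {
		if (d->release)
			d->release(d);
		d->released = true;
	}
}

/* Registration: the core pins the parent while the child is registered. */
static void mdev_add(struct mdev *d)
{
	mdev_get(d->parent);
}

/* Unregistration: the core's parent reference is dropped here already. */
static void mdev_del(struct mdev *d)
{
	mdev_put(d->parent);
}

/* Release that reads the parent without owning a reference to it. */
static void mdev_release_plain(struct mdev *d)
{
	parent_alive_in_release = !d->parent->released;
}

/* Release that owns a parent reference and drops it when done. */
static void mdev_release_pinned(struct mdev *d)
{
	parent_alive_in_release = !d->parent->released;
	mdev_put(d->parent);
}

/* Returns true if the parent was still alive when the child was released. */
static bool mdev_scenario(bool pin)
{
	struct mdev port, alt;

	mdev_init(&port, NULL, NULL);
	mdev_init(&alt, &port, pin ? mdev_release_pinned : mdev_release_plain);
	if (pin)
		mdev_get(alt.parent);   /* driver's own reference to the parent */
	mdev_add(&alt);
	mdev_del(&alt);                 /* core drops its parent reference */
	mdev_put(&port);                /* owner drops the port */
	mdev_put(&alt);                 /* last child reference, release runs */
	return parent_alive_in_release;
}
```

In this model mdev_scenario(false) returns false: once the child has been
unregistered nothing keeps the parent alive, so a release that runs later
finds the parent gone. mdev_scenario(true) returns true because the child
itself holds a reference until its release has finished.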

thanks,
Heikki Krogerus Oct. 7, 2024, 11:07 a.m. UTC | #5
On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote:
> The altmode device release refers to its parent device, but without keeping
> a reference to it.
> 
> When registering the altmode, get a reference to the parent and put it in
> the release function.
> 
> Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues
> like this:
> 
> [   43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000)
> [   43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000)
> [   43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000)
> [   43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000)
> [   43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000)
> [   43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000)
> [   43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000)
> [   46.612867] ==================================================================
> [   46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129
> [   46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48
> [   46.614538]
> [   46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535
> [   46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> [   46.616042] Workqueue: events kobject_delayed_cleanup
> [   46.616446] Call Trace:
> [   46.616648]  <TASK>
> [   46.616820]  dump_stack_lvl+0x5b/0x7c
> [   46.617112]  ? typec_altmode_release+0x38/0x129
> [   46.617470]  print_report+0x14c/0x49e
> [   46.617769]  ? rcu_read_unlock_sched+0x56/0x69
> [   46.618117]  ? __virt_addr_valid+0x19a/0x1ab
> [   46.618456]  ? kmem_cache_debug_flags+0xc/0x1d
> [   46.618807]  ? typec_altmode_release+0x38/0x129
> [   46.619161]  kasan_report+0x8d/0xb4
> [   46.619447]  ? typec_altmode_release+0x38/0x129
> [   46.619809]  ? process_scheduled_works+0x3cb/0x85f
> [   46.620185]  typec_altmode_release+0x38/0x129
> [   46.620537]  ? process_scheduled_works+0x3cb/0x85f
> [   46.620907]  device_release+0xaf/0xf2
> [   46.621206]  kobject_delayed_cleanup+0x13b/0x17a
> [   46.621584]  process_scheduled_works+0x4f6/0x85f
> [   46.621955]  ? __pfx_process_scheduled_works+0x10/0x10
> [   46.622353]  ? hlock_class+0x31/0x9a
> [   46.622647]  ? lock_acquired+0x361/0x3c3
> [   46.622956]  ? move_linked_works+0x46/0x7d
> [   46.623277]  worker_thread+0x1ce/0x291
> [   46.623582]  ? __kthread_parkme+0xc8/0xdf
> [   46.623900]  ? __pfx_worker_thread+0x10/0x10
> [   46.624236]  kthread+0x17e/0x190
> [   46.624501]  ? kthread+0xfb/0x190
> [   46.624756]  ? __pfx_kthread+0x10/0x10
> [   46.625015]  ret_from_fork+0x20/0x40
> [   46.625268]  ? __pfx_kthread+0x10/0x10
> [   46.625532]  ret_from_fork_asm+0x1a/0x30
> [   46.625805]  </TASK>
> [   46.625953]
> [   46.626056] Allocated by task 678:
> [   46.626287]  kasan_save_stack+0x24/0x44
> [   46.626555]  kasan_save_track+0x14/0x2d
> [   46.626811]  __kasan_kmalloc+0x3f/0x4d
> [   46.627049]  __kmalloc_noprof+0x1bf/0x1f0
> [   46.627362]  typec_register_port+0x23/0x491
> [   46.627698]  cros_typec_probe+0x634/0xbb6
> [   46.628026]  platform_probe+0x47/0x8c
> [   46.628311]  really_probe+0x20a/0x47d
> [   46.628605]  device_driver_attach+0x39/0x72
> [   46.628940]  bind_store+0x87/0xd7
> [   46.629213]  kernfs_fop_write_iter+0x1aa/0x218
> [   46.629574]  vfs_write+0x1d6/0x29b
> [   46.629856]  ksys_write+0xcd/0x13b
> [   46.630128]  do_syscall_64+0xd4/0x139
> [   46.630420]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   46.630820]
> [   46.630946] Freed by task 48:
> [   46.631182]  kasan_save_stack+0x24/0x44
> [   46.631493]  kasan_save_track+0x14/0x2d
> [   46.631799]  kasan_save_free_info+0x3f/0x4d
> [   46.632144]  __kasan_slab_free+0x37/0x45
> [   46.632474]  kfree+0x1d4/0x252
> [   46.632725]  device_release+0xaf/0xf2
> [   46.633017]  kobject_delayed_cleanup+0x13b/0x17a
> [   46.633388]  process_scheduled_works+0x4f6/0x85f
> [   46.633764]  worker_thread+0x1ce/0x291
> [   46.634065]  kthread+0x17e/0x190
> [   46.634324]  ret_from_fork+0x20/0x40
> [   46.634621]  ret_from_fork_asm+0x1a/0x30
> 
> Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes")
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>

I tried to go through the code and the logs, but I did not really find
any explanation why we couldn't keep the device release order in sync
for all devices that need it. I'll see if we can improve that somehow
in the core, but we can fix this one in the meantime.

But this is a case of use-after-free, so shouldn't this go to the
stable trees?

Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>

thanks,
Thadeu Lima de Souza Cascardo Oct. 7, 2024, 1:42 p.m. UTC | #6
On Mon, Oct 07, 2024 at 02:07:43PM +0300, Heikki Krogerus wrote:
> On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote:
> > The altmode device release refers to its parent device, but without keeping
> > a reference to it.
> > 
> > When registering the altmode, get a reference to the parent and put it in
> > the release function.
> > 
> > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues
> > like this:
> > 
> > [   43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000)
> > [   43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000)
> > [   43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000)
> > [   43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000)
> > [   43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000)
> > [   43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000)
> > [   43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000)
> > [   46.612867] ==================================================================
> > [   46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129
> > [   46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48
> > [   46.614538]
> > [   46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535
> > [   46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> > [   46.616042] Workqueue: events kobject_delayed_cleanup
> > [   46.616446] Call Trace:
> > [   46.616648]  <TASK>
> > [   46.616820]  dump_stack_lvl+0x5b/0x7c
> > [   46.617112]  ? typec_altmode_release+0x38/0x129
> > [   46.617470]  print_report+0x14c/0x49e
> > [   46.617769]  ? rcu_read_unlock_sched+0x56/0x69
> > [   46.618117]  ? __virt_addr_valid+0x19a/0x1ab
> > [   46.618456]  ? kmem_cache_debug_flags+0xc/0x1d
> > [   46.618807]  ? typec_altmode_release+0x38/0x129
> > [   46.619161]  kasan_report+0x8d/0xb4
> > [   46.619447]  ? typec_altmode_release+0x38/0x129
> > [   46.619809]  ? process_scheduled_works+0x3cb/0x85f
> > [   46.620185]  typec_altmode_release+0x38/0x129
> > [   46.620537]  ? process_scheduled_works+0x3cb/0x85f
> > [   46.620907]  device_release+0xaf/0xf2
> > [   46.621206]  kobject_delayed_cleanup+0x13b/0x17a
> > [   46.621584]  process_scheduled_works+0x4f6/0x85f
> > [   46.621955]  ? __pfx_process_scheduled_works+0x10/0x10
> > [   46.622353]  ? hlock_class+0x31/0x9a
> > [   46.622647]  ? lock_acquired+0x361/0x3c3
> > [   46.622956]  ? move_linked_works+0x46/0x7d
> > [   46.623277]  worker_thread+0x1ce/0x291
> > [   46.623582]  ? __kthread_parkme+0xc8/0xdf
> > [   46.623900]  ? __pfx_worker_thread+0x10/0x10
> > [   46.624236]  kthread+0x17e/0x190
> > [   46.624501]  ? kthread+0xfb/0x190
> > [   46.624756]  ? __pfx_kthread+0x10/0x10
> > [   46.625015]  ret_from_fork+0x20/0x40
> > [   46.625268]  ? __pfx_kthread+0x10/0x10
> > [   46.625532]  ret_from_fork_asm+0x1a/0x30
> > [   46.625805]  </TASK>
> > [   46.625953]
> > [   46.626056] Allocated by task 678:
> > [   46.626287]  kasan_save_stack+0x24/0x44
> > [   46.626555]  kasan_save_track+0x14/0x2d
> > [   46.626811]  __kasan_kmalloc+0x3f/0x4d
> > [   46.627049]  __kmalloc_noprof+0x1bf/0x1f0
> > [   46.627362]  typec_register_port+0x23/0x491
> > [   46.627698]  cros_typec_probe+0x634/0xbb6
> > [   46.628026]  platform_probe+0x47/0x8c
> > [   46.628311]  really_probe+0x20a/0x47d
> > [   46.628605]  device_driver_attach+0x39/0x72
> > [   46.628940]  bind_store+0x87/0xd7
> > [   46.629213]  kernfs_fop_write_iter+0x1aa/0x218
> > [   46.629574]  vfs_write+0x1d6/0x29b
> > [   46.629856]  ksys_write+0xcd/0x13b
> > [   46.630128]  do_syscall_64+0xd4/0x139
> > [   46.630420]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [   46.630820]
> > [   46.630946] Freed by task 48:
> > [   46.631182]  kasan_save_stack+0x24/0x44
> > [   46.631493]  kasan_save_track+0x14/0x2d
> > [   46.631799]  kasan_save_free_info+0x3f/0x4d
> > [   46.632144]  __kasan_slab_free+0x37/0x45
> > [   46.632474]  kfree+0x1d4/0x252
> > [   46.632725]  device_release+0xaf/0xf2
> > [   46.633017]  kobject_delayed_cleanup+0x13b/0x17a
> > [   46.633388]  process_scheduled_works+0x4f6/0x85f
> > [   46.633764]  worker_thread+0x1ce/0x291
> > [   46.634065]  kthread+0x17e/0x190
> > [   46.634324]  ret_from_fork+0x20/0x40
> > [   46.634621]  ret_from_fork_asm+0x1a/0x30
> > 
> > Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes")
> > Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> 
> I tried to go through the code and the logs, but I did not really find
> any explanation why we couldn't keep the device release order in sync
> for all devices that need it. I'll see if we can improve that somehow
> in the core, but we can fix this one in the meantime.
> 
> But this is a case of use-after-free, so shouldn't this go to the
> stable trees?

Agreed. I am fine if Cc: stable@vger.kernel.org gets added when this is
applied. But otherwise, I can keep track in case the Fixes line does not
trigger its inclusion.

Thanks.
Cascardo.

> 
> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> 
> thanks,
> 
> -- 
> heikki
>
Thadeu Lima de Souza Cascardo Oct. 7, 2024, 1:59 p.m. UTC | #7
On Mon, Oct 07, 2024 at 11:33:09AM +0300, Heikki Krogerus wrote:
> Hi,
> 
> On Fri, Oct 04, 2024 at 11:17:01AM -0300, Thadeu Lima de Souza Cascardo wrote:
> > On Fri, Oct 04, 2024 at 05:08:28PM +0300, Heikki Krogerus wrote:
> > > On Fri, Oct 04, 2024 at 09:37:38AM -0300, Thadeu Lima de Souza Cascardo wrote:
> > > > The altmode device release refers to its parent device, but without keeping
> > > > a reference to it.
> > > > 
> > > > When registering the altmode, get a reference to the parent and put it in
> > > > the release function.
> > > > 
> > > > Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues
> > > > like this:
> > > 
> > > Let me study what's going on in the drivers code. The children should
> > > _not_ be cleaned first before the parent. I'll have to come back to
> > > this on Monday.
> > > 
> > > This really should not be necessary.
> > > 
> > 
> > Well, they are likely not. But driver core API states that either way, you
> > should keep such references. And one way to test it is using
> > CONFIG_DEBUG_KOBJECT_RELEASE. That delays the actual release/cleanup of the
> > struct device, so:
> 
> What I want to understand is, what is the rationale for this behaviour
> in the core. If this is something that the drivers can do, then why is
> the core not doing it for everything? Why is the core decrementing the
> parent reference count already in device_del() instead of
> device_release()?
> 
> I'm probably missing something, but I really want to understand this.
> There are other drivers that also need resources from their parent, so if
> there is nothing preventing this from being handled in the core, then
> that's where it really should be handled.
> 
> thanks,
> 
> -- 
> heikki
> 

It has been like that for 20+ years, it seems. I noticed that you have
changed the behavior in the case of kobject parent lifetime relatively
recently (back in 2020).

One potential challenge of changing this is that dev->parent is usually set
even before device_initialize gets called. At the very least, the reference
would have to be taken there, in device_initialize. A quick glance here shows
that the parent may also be set between device_initialize and device_add,
including in device_create_groups_vargs.

Around 315 calls would need to be inspected, I would guess more than 100
callsites would need to be "fixed" for this change in the API.

And, then, how many really need to keep the reference to the parent, and
wouldn't keeping such a reference cause problems in any of these cases?

Cascardo.
Patch

diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
index 9262fcd4144f..d61b4c74648d 100644
--- a/drivers/usb/typec/class.c
+++ b/drivers/usb/typec/class.c
@@ -519,6 +519,7 @@  static void typec_altmode_release(struct device *dev)
 		typec_altmode_put_partner(alt);
 
 	altmode_id_remove(alt->adev.dev.parent, alt->id);
+	put_device(alt->adev.dev.parent);
 	kfree(alt);
 }
 
@@ -568,6 +569,8 @@  typec_register_altmode(struct device *parent,
 	alt->adev.dev.type = &typec_altmode_dev_type;
 	dev_set_name(&alt->adev.dev, "%s.%u", dev_name(parent), id);
 
+	get_device(alt->adev.dev.parent);
+
 	/* Link partners and plugs with the ports */
 	if (!is_port)
 		typec_altmode_set_partner(alt);