Message ID | 20201105084108.3432509-1-jamie@nuviainc.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RESEND] bonding: wait for sysfs kobject destruction before freeing struct slave | expand |
On 11/5/20 9:41 AM, Jamie Iles wrote: > syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a > struct slave device could result in the following splat: > > > This is a potential use-after-free if the sysfs nodes are being accessed > whilst removing the struct slave, so wait for the object destruction to > complete before freeing the struct slave itself. > > Fixes: 07699f9a7c8d ("bonding: add sysfs /slave dir for bond slave devices.") > Fixes: a068aab42258 ("bonding: Fix reference count leak in bond_sysfs_slave_add.") > Cc: Qiushi Wu <wu000273@umn.edu> > Cc: Jay Vosburgh <j.vosburgh@gmail.com> > Cc: Veaceslav Falico <vfalico@gmail.com> > Cc: Andy Gospodarek <andy@greyhouse.net> > Signed-off-by: Jamie Iles <jamie@nuviainc.com> > --- > drivers/net/bonding/bond_sysfs_slave.c | 12 ++++++++++++ > include/net/bonding.h | 2 ++ > 2 files changed, 14 insertions(+) > > diff --git a/drivers/net/bonding/bond_sysfs_slave.c b/drivers/net/bonding/bond_sysfs_slave.c > index 9b8346638f69..2fdbcf9692c5 100644 > --- a/drivers/net/bonding/bond_sysfs_slave.c > +++ b/drivers/net/bonding/bond_sysfs_slave.c > @@ -136,7 +136,15 @@ static const struct sysfs_ops slave_sysfs_ops = { > .show = slave_show, > }; > > +static void slave_release(struct kobject *kobj) > +{ > + struct slave *slave = to_slave(kobj); > + > + complete(&slave->kobj_unregister_done); > +} > + > static struct kobj_type slave_ktype = { > + .release = slave_release, > #ifdef CONFIG_SYSFS > .sysfs_ops = &slave_sysfs_ops, > #endif > @@ -147,10 +155,12 @@ int bond_sysfs_slave_add(struct slave *slave) > const struct slave_attribute **a; > int err; > > + init_completion(&slave->kobj_unregister_done); > err = kobject_init_and_add(&slave->kobj, &slave_ktype, > &(slave->dev->dev.kobj), "bonding_slave"); > if (err) { > kobject_put(&slave->kobj); > + wait_for_completion(&slave->kobj_unregister_done); > return err; > } > > @@ -158,6 +168,7 @@ int bond_sysfs_slave_add(struct slave *slave) > err = sysfs_create_file(&slave->kobj, &((*a)->attr)); > if (err) { > kobject_put(&slave->kobj); > + wait_for_completion(&slave->kobj_unregister_done); > return err; > } > } > @@ -173,4 +184,5 @@ void bond_sysfs_slave_del(struct slave *slave) > sysfs_remove_file(&slave->kobj, &((*a)->attr)); > > kobject_put(&slave->kobj); > + wait_for_completion(&slave->kobj_unregister_done); > } > diff --git a/include/net/bonding.h b/include/net/bonding.h > index 7d132cc1e584..78d771d2ffd3 100644 > --- a/include/net/bonding.h > +++ b/include/net/bonding.h > @@ -25,6 +25,7 @@ > #include <linux/etherdevice.h> > #include <linux/reciprocal_div.h> > #include <linux/if_link.h> > +#include <linux/completion.h> > > #include <net/bond_3ad.h> > #include <net/bond_alb.h> > @@ -182,6 +183,7 @@ struct slave { > #endif > struct delayed_work notify_work; > struct kobject kobj; > + struct completion kobj_unregister_done; > struct rtnl_link_stats64 slave_stats; > }; This seems weird, are we going to wait for a completion while RTNL is held ? I am pretty sure this could be exploited by malicious user/syzbot. The .release() handler could instead perform a refcounted bond_free_slave() action.
Hi Eric, On Thu, Nov 05, 2020 at 01:49:03PM +0100, Eric Dumazet wrote: > On 11/5/20 9:41 AM, Jamie Iles wrote: > > syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a > > struct slave device could result in the following splat: > > > > > > > This is a potential use-after-free if the sysfs nodes are being accessed > > whilst removing the struct slave, so wait for the object destruction to > > complete before freeing the struct slave itself. > > > > Fixes: 07699f9a7c8d ("bonding: add sysfs /slave dir for bond slave devices.") > > Fixes: a068aab42258 ("bonding: Fix reference count leak in bond_sysfs_slave_add.") > > Cc: Qiushi Wu <wu000273@umn.edu> > > Cc: Jay Vosburgh <j.vosburgh@gmail.com> > > Cc: Veaceslav Falico <vfalico@gmail.com> > > Cc: Andy Gospodarek <andy@greyhouse.net> > > Signed-off-by: Jamie Iles <jamie@nuviainc.com> > > --- ... > This seems weird, are we going to wait for a completion while RTNL is held ? > I am pretty sure this could be exploited by malicious user/syzbot. > > The .release() handler could instead perform a refcounted > bond_free_slave() action. Okay, so were you thinking along the lines of this moving the lifetime of the slave to the kobject? Thanks, Jamie diff --git i/drivers/net/bonding/bond_main.c w/drivers/net/bonding/bond_main.c index 84ecbc6fa0ff..ea8ecc6e87c2 100644 --- i/drivers/net/bonding/bond_main.c +++ w/drivers/net/bonding/bond_main.c @@ -1481,7 +1481,7 @@ static struct slave *bond_alloc_slave(struct bonding *bond) return slave; } -static void bond_free_slave(struct slave *slave) +void bond_free_slave(struct slave *slave) { struct bonding *bond = bond_get_bond_by_slave(slave); @@ -1691,6 +1691,10 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, */ new_slave->queue_id = 0; + res = bond_slave_kobj_init(new_slave); + if (res) + goto err_free; + /* Save slave's original mtu and then set it to match the bond */ new_slave->original_mtu = slave_dev->mtu; res = dev_set_mtu(slave_dev, bond->dev->mtu); @@ -1912,7 +1916,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, if (bond_dev->flags & IFF_PROMISC) { res = dev_set_promiscuity(slave_dev, 1); if (res) - goto err_sysfs_del; + goto err_upper_unlink; } /* set allmulti level to new slave */ @@ -1921,7 +1925,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, if (res) { if (bond_dev->flags & IFF_PROMISC) dev_set_promiscuity(slave_dev, -1); - goto err_sysfs_del; + goto err_upper_unlink; } } @@ -1961,9 +1965,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, return 0; /* Undo stages on error */ -err_sysfs_del: - bond_sysfs_slave_del(new_slave); - err_upper_unlink: bond_upper_dev_unlink(bond, new_slave); @@ -2007,7 +2008,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, dev_set_mtu(slave_dev, new_slave->original_mtu); err_free: - bond_free_slave(new_slave); + bond_slave_kobj_put(new_slave); err_undo_flags: /* Enslave of first slave has failed and we need to fix master's mac */ @@ -2066,8 +2067,6 @@ static int __bond_release_one(struct net_device *bond_dev, bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW); - bond_sysfs_slave_del(slave); - /* recompute stats just before removing the slave */ bond_get_stats(bond->dev, &bond->bond_stats); @@ -2187,7 +2186,7 @@ static int __bond_release_one(struct net_device *bond_dev, if (!netif_is_bond_master(slave_dev)) slave_dev->priv_flags &= ~IFF_BONDING; - bond_free_slave(slave); + bond_slave_kobj_put(slave); return 0; } diff --git i/drivers/net/bonding/bond_sysfs_slave.c w/drivers/net/bonding/bond_sysfs_slave.c index 9b8346638f69..67732078ef26 100644 --- i/drivers/net/bonding/bond_sysfs_slave.c +++ w/drivers/net/bonding/bond_sysfs_slave.c @@ -136,24 +136,35 @@ static const struct sysfs_ops slave_sysfs_ops = { .show = slave_show, }; +static void slave_release(struct kobject *kobj) +{ + struct slave *slave = to_slave(kobj); + + bond_free_slave(slave); +} + static struct kobj_type slave_ktype = { + .release = slave_release, #ifdef CONFIG_SYSFS .sysfs_ops = &slave_sysfs_ops, #endif }; +int bond_slave_kobj_init(struct slave *slave) +{ + int err = kobject_init_and_add(&slave->kobj, &slave_ktype, + &(slave->dev->dev.kobj), "bonding_slave"); + if (err) + kobject_put(&slave->kobj); + + return err; +} + int bond_sysfs_slave_add(struct slave *slave) { const struct slave_attribute **a; int err; - err = kobject_init_and_add(&slave->kobj, &slave_ktype, - &(slave->dev->dev.kobj), "bonding_slave"); - if (err) { - kobject_put(&slave->kobj); - return err; - } - for (a = slave_attrs; *a; ++a) { err = sysfs_create_file(&slave->kobj, &((*a)->attr)); if (err) { @@ -165,7 +176,7 @@ int bond_sysfs_slave_add(struct slave *slave) return 0; } -void bond_sysfs_slave_del(struct slave *slave) +void bond_slave_kobj_put(struct slave *slave) { const struct slave_attribute **a; diff --git i/include/net/bonding.h w/include/net/bonding.h index 7d132cc1e584..ccb07e3e495e 100644 --- i/include/net/bonding.h +++ w/include/net/bonding.h @@ -622,10 +622,12 @@ int bond_create(struct net *net, const char *name); int bond_create_sysfs(struct bond_net *net); void bond_destroy_sysfs(struct bond_net *net); void bond_prepare_sysfs_group(struct bonding *bond); +int bond_slave_kobj_init(struct slave *slave); int bond_sysfs_slave_add(struct slave *slave); -void bond_sysfs_slave_del(struct slave *slave); +void bond_slave_kobj_put(struct slave *slave); int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, struct netlink_ext_ack *extack); +void bond_free_slave(struct slave *slave); int bond_release(struct net_device *bond_dev, struct net_device *slave_dev); u32 bond_xmit_hash(struct bonding *bond, struct sk_buff *skb); int bond_set_carrier(struct bonding *bond);
On 11/5/20 7:11 PM, Jamie Iles wrote: > Hi Eric, > > On Thu, Nov 05, 2020 at 01:49:03PM +0100, Eric Dumazet wrote: >> On 11/5/20 9:41 AM, Jamie Iles wrote: >>> syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a >>> struct slave device could result in the following splat: >>> >>> >> >>> This is a potential use-after-free if the sysfs nodes are being accessed >>> whilst removing the struct slave, so wait for the object destruction to >>> complete before freeing the struct slave itself. >>> >>> Fixes: 07699f9a7c8d ("bonding: add sysfs /slave dir for bond slave devices.") >>> Fixes: a068aab42258 ("bonding: Fix reference count leak in bond_sysfs_slave_add.") >>> Cc: Qiushi Wu <wu000273@umn.edu> >>> Cc: Jay Vosburgh <j.vosburgh@gmail.com> >>> Cc: Veaceslav Falico <vfalico@gmail.com> >>> Cc: Andy Gospodarek <andy@greyhouse.net> >>> Signed-off-by: Jamie Iles <jamie@nuviainc.com> >>> --- > ... >> This seems weird, are we going to wait for a completion while RTNL is held ? >> I am pretty sure this could be exploited by malicious user/syzbot. >> >> The .release() handler could instead perform a refcounted >> bond_free_slave() action. > > Okay, so were you thinking along the lines of this moving the lifetime > of the slave to the kobject? > This seems a bit too complex for a bug fix. Instead of adding a completion, you could add a refcount_t, and make bond_free_slave() a wrapper like static inline void bond_free_slave(struct slave *slave) { if (refcount_dec_and_test(&slave->refcnt)) __bond_free_slave(slave); } Once the kobj is successfully activated, you would need a refcount_inc(&slave->refcount); Total patch would be smaller and easier to review. The kobj .release handler would simply call bond_free_slave(slave);
diff --git a/drivers/net/bonding/bond_sysfs_slave.c b/drivers/net/bonding/bond_sysfs_slave.c index 9b8346638f69..2fdbcf9692c5 100644 --- a/drivers/net/bonding/bond_sysfs_slave.c +++ b/drivers/net/bonding/bond_sysfs_slave.c @@ -136,7 +136,15 @@ static const struct sysfs_ops slave_sysfs_ops = { .show = slave_show, }; +static void slave_release(struct kobject *kobj) +{ + struct slave *slave = to_slave(kobj); + + complete(&slave->kobj_unregister_done); +} + static struct kobj_type slave_ktype = { + .release = slave_release, #ifdef CONFIG_SYSFS .sysfs_ops = &slave_sysfs_ops, #endif @@ -147,10 +155,12 @@ int bond_sysfs_slave_add(struct slave *slave) const struct slave_attribute **a; int err; + init_completion(&slave->kobj_unregister_done); err = kobject_init_and_add(&slave->kobj, &slave_ktype, &(slave->dev->dev.kobj), "bonding_slave"); if (err) { kobject_put(&slave->kobj); + wait_for_completion(&slave->kobj_unregister_done); return err; } @@ -158,6 +168,7 @@ int bond_sysfs_slave_add(struct slave *slave) err = sysfs_create_file(&slave->kobj, &((*a)->attr)); if (err) { kobject_put(&slave->kobj); + wait_for_completion(&slave->kobj_unregister_done); return err; } } @@ -173,4 +184,5 @@ void bond_sysfs_slave_del(struct slave *slave) sysfs_remove_file(&slave->kobj, &((*a)->attr)); kobject_put(&slave->kobj); + wait_for_completion(&slave->kobj_unregister_done); } diff --git a/include/net/bonding.h b/include/net/bonding.h index 7d132cc1e584..78d771d2ffd3 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -25,6 +25,7 @@ #include <linux/etherdevice.h> #include <linux/reciprocal_div.h> #include <linux/if_link.h> +#include <linux/completion.h> #include <net/bond_3ad.h> #include <net/bond_alb.h> @@ -182,6 +183,7 @@ struct slave { #endif struct delayed_work notify_work; struct kobject kobj; + struct completion kobj_unregister_done; struct rtnl_link_stats64 slave_stats; };
syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a struct slave device could result in the following splat: kobject: 'bonding_slave' (00000000cecdd4fe): kobject_release, parent 0000000074ceb2b2 (delayed 1000) bond0 (unregistering): (slave bond_slave_1): Releasing backup interface ------------[ cut here ]------------ ODEBUG: free active (active state 0) object type: timer_list hint: workqueue_select_cpu_near kernel/workqueue.c:1549 [inline] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98 kernel/workqueue.c:1600 WARNING: CPU: 1 PID: 842 at lib/debugobjects.c:485 debug_print_object+0x180/0x240 lib/debugobjects.c:485 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 842 Comm: kworker/u4:4 Tainted: G S 5.9.0-rc8+ #96 Hardware name: linux,dummy-virt (DT) Workqueue: netns cleanup_net Call trace: dump_backtrace+0x0/0x4d8 include/linux/bitmap.h:239 show_stack+0x34/0x48 arch/arm64/kernel/traps.c:142 __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x174/0x1f8 lib/dump_stack.c:118 panic+0x360/0x7a0 kernel/panic.c:231 __warn+0x244/0x2ec kernel/panic.c:600 report_bug+0x240/0x398 lib/bug.c:198 bug_handler+0x50/0xc0 arch/arm64/kernel/traps.c:974 call_break_hook+0x160/0x1d8 arch/arm64/kernel/debug-monitors.c:322 brk_handler+0x30/0xc0 arch/arm64/kernel/debug-monitors.c:329 do_debug_exception+0x184/0x340 arch/arm64/mm/fault.c:864 el1_dbg+0x48/0xb0 arch/arm64/kernel/entry-common.c:65 el1_sync_handler+0x170/0x1c8 arch/arm64/kernel/entry-common.c:93 el1_sync+0x80/0x100 arch/arm64/kernel/entry.S:594 debug_print_object+0x180/0x240 lib/debugobjects.c:485 __debug_check_no_obj_freed lib/debugobjects.c:967 [inline] debug_check_no_obj_freed+0x200/0x430 lib/debugobjects.c:998 slab_free_hook mm/slub.c:1536 [inline] slab_free_freelist_hook+0x190/0x210 mm/slub.c:1577 slab_free mm/slub.c:3138 [inline] kfree+0x13c/0x460 mm/slub.c:4119 bond_free_slave+0x8c/0xf8 drivers/net/bonding/bond_main.c:1492 __bond_release_one+0xe0c/0xec8 drivers/net/bonding/bond_main.c:2190 bond_slave_netdev_event drivers/net/bonding/bond_main.c:3309 [inline] bond_netdev_event+0x8f0/0xa70 drivers/net/bonding/bond_main.c:3420 notifier_call_chain+0xf0/0x200 kernel/notifier.c:83 __raw_notifier_call_chain kernel/notifier.c:361 [inline] raw_notifier_call_chain+0x44/0x58 kernel/notifier.c:368 call_netdevice_notifiers_info+0xbc/0x150 net/core/dev.c:2033 call_netdevice_notifiers_extack net/core/dev.c:2045 [inline] call_netdevice_notifiers net/core/dev.c:2059 [inline] rollback_registered_many+0x6a4/0xec0 net/core/dev.c:9347 unregister_netdevice_many.part.0+0x2c/0x1c0 net/core/dev.c:10509 unregister_netdevice_many net/core/dev.c:10508 [inline] default_device_exit_batch+0x294/0x338 net/core/dev.c:10992 ops_exit_list.isra.0+0xec/0x150 net/core/net_namespace.c:189 cleanup_net+0x44c/0x888 net/core/net_namespace.c:603 process_one_work+0x96c/0x18c0 kernel/workqueue.c:2269 worker_thread+0x3f0/0xc30 kernel/workqueue.c:2415 kthread+0x390/0x498 kernel/kthread.c:292 ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:925 This is a potential use-after-free if the sysfs nodes are being accessed whilst removing the struct slave, so wait for the object destruction to complete before freeing the struct slave itself. Fixes: 07699f9a7c8d ("bonding: add sysfs /slave dir for bond slave devices.") Fixes: a068aab42258 ("bonding: Fix reference count leak in bond_sysfs_slave_add.") Cc: Qiushi Wu <wu000273@umn.edu> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jamie Iles <jamie@nuviainc.com> --- drivers/net/bonding/bond_sysfs_slave.c | 12 ++++++++++++ include/net/bonding.h | 2 ++ 2 files changed, 14 insertions(+)