Message ID: 20211223171425.3551-1-jack@suse.cz (mailing list archive)
Series: bfq: Avoid use-after-free when moving processes between cgroups
On 2021/12/24 1:31, Jan Kara wrote:
> Hello,
>
> these three patches fix use-after-free issues in BFQ when processes with merged
> queues get moved to different cgroups. The patches have survived some beating
> in my test VM but so far I fail to reproduce the original KASAN reports so
> testing from people who can reproduce them is most welcome. Thanks!

Hi,

Unfortunately, this patchset can't fix the UAF; just marking split_coop
in patch 3 doesn't seem to be enough. Here is the result:

[ 548.440184] ==================================================================
[ 548.441680] BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x21/0x290
[ 548.443155] Read of size 1 at addr ffff8881723e00b0 by task rmmod/13984
[ 548.444109]
[ 548.444321] CPU: 30 PID: 13984 Comm: rmmod Tainted: G W 5.16.0-rc5-next-2026
[ 548.445549] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
[ 548.447348] Call Trace:
[ 548.447682]  <TASK>
[ 548.447967]  dump_stack_lvl+0x34/0x44
[ 548.448470]  print_address_description.constprop.0.cold+0xab/0x36b
[ 548.449303]  ? __bfq_deactivate_entity+0x21/0x290
[ 548.449929]  ? __bfq_deactivate_entity+0x21/0x290
[ 548.450565]  kasan_report.cold+0x83/0xdf
[ 548.451114]  ? _raw_read_lock_bh+0x20/0x40
[ 548.451658]  ? __bfq_deactivate_entity+0x21/0x290
[ 548.452296]  __bfq_deactivate_entity+0x21/0x290
[ 548.452917]  bfq_pd_offline+0xc1/0x110
[ 548.453436]  blkcg_deactivate_policy+0x14b/0x210
[ 548.454058]  bfq_exit_queue+0xe5/0x100
[ 548.454573]  blk_mq_exit_sched+0x113/0x140
[ 548.455162]  elevator_exit+0x30/0x50
[ 548.455645]  blk_release_queue+0xa8/0x160
[ 548.456191]  kobject_put+0xd4/0x270
[ 548.456657]  disk_release+0xc5/0xf0
[ 548.457140]  device_release+0x56/0xe0
[ 548.457634]  kobject_put+0xd4/0x270
[ 548.458113]  null_del_dev.part.0+0xe6/0x220 [null_blk]
[ 548.458810]  null_exit+0x62/0xa6 [null_blk]
[ 548.459404]  __x64_sys_delete_module+0x20a/0x2f0
[ 548.460046]  ? __ia32_sys_delete_module+0x2f0/0x2f0
[ 548.460716]  ? fpregs_assert_state_consistent+0x55/0x60
[ 548.461436]  ? exit_to_user_mode_prepare+0x39/0x1e0
[ 548.462116]  do_syscall_64+0x35/0x80
[ 548.462603]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 548.463292] RIP: 0033:0x7fdabfb0efc7
[ 548.463777] Code: 73 01 c3 48 8b 0d c1 de 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 8
[ 548.466251] RSP: 002b:00007ffd678e61d8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 548.467286] RAX: ffffffffffffffda RBX: 00007ffd678e6238 RCX: 00007fdabfb0efc7
[ 548.468237] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055b176bb1258
[ 548.469187] RBP: 000055b176bb11f0 R08: 00007ffd678e5151 R09: 0000000000000000
[ 548.470130] R10: 00007fdabfb81260 R11: 0000000000000206 R12: 00007ffd678e6400
[ 548.471083] R13: 00007ffd678e845b R14: 000055b176bb0010 R15: 000055b176bb11f0
[ 548.472051]  </TASK>
[ 548.472351]
[ 548.472569] Allocated by task 13874:
[ 548.473066]  kasan_save_stack+0x1e/0x40
[ 548.473579]  __kasan_kmalloc+0x81/0xa0
[ 548.474082]  bfq_pd_alloc+0x72/0x110
[ 548.474551]  blkg_alloc+0x252/0x2c0
[ 548.475034]  blkg_create+0x38e/0x560
[ 548.475508]  bio_associate_blkg_from_css+0x3bc/0x490
[ 548.476169]  bio_associate_blkg+0x3b/0x90
[ 548.476701]  submit_bh_wbc+0x18c/0x320
[ 548.477207]  ll_rw_block+0x126/0x130
[ 548.477681]  __block_write_begin_int+0x6fc/0xee0
[ 548.478299]  block_write_begin+0x44/0x190
[ 548.478830]  generic_perform_write+0x159/0x300
[ 548.479441]  __generic_file_write_iter+0x162/0x250
[ 548.480086]  blkdev_write_iter+0x21a/0x300
[ 548.480628]  aio_write+0x218/0x400
[ 548.481084]  io_submit_one+0x855/0x11c0
[ 548.481587]  __x64_sys_io_submit+0xfa/0x250
[ 548.482140]  do_syscall_64+0x35/0x80
[ 548.482615]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 548.483298]
[ 548.483502] Freed by task 13855:
[ 548.483929]  kasan_save_stack+0x1e/0x40
[ 548.484439]  kasan_set_track+0x21/0x30
[ 548.484940]  kasan_set_free_info+0x20/0x30
[ 548.485486]  __kasan_slab_free+0x103/0x180
[ 548.486032]  kfree+0x9a/0x4b0
[ 548.486425]  blkg_free.part.0+0x4a/0x90
[ 548.486931]  rcu_do_batch+0x2e1/0x760
[ 548.487440]  rcu_core+0x367/0x530
[ 548.487879]  __do_softirq+0x119/0x3ba

What do you think about this:

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 1ce1a99a7160..14c1d1c3811e 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2626,6 +2626,11 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
 	while ((__bfqq = new_bfqq->new_bfqq)) {
 		if (__bfqq == bfqq)
 			return NULL;
+		if (__bfqq->entity.parent != bfqq->entity.parent) {
+			if (bfq_bfqq_coop(__bfqq))
+				bfq_mark_bfqq_split_coop(__bfqq);
+			return NULL;
+		}
 		new_bfqq = __bfqq;
 	}

@@ -2825,8 +2830,16 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	if (bfq_too_late_for_merging(bfqq))
 		return NULL;

-	if (bfqq->new_bfqq)
-		return bfqq->new_bfqq;
+	if (bfqq->new_bfqq) {
+		struct bfq_queue *new_bfqq = bfqq->new_bfqq;
+
+		if (bfqq->entity.parent == new_bfqq->entity.parent)
+			return new_bfqq;
+
+		if (bfq_bfqq_coop(new_bfqq))
+			bfq_mark_bfqq_split_coop(new_bfqq);
+		return NULL;
+	}

Thanks,
Kuai

>
> 								Honza
> .
>
On Fri 24-12-21 09:30:04, yukuai (C) wrote:
> On 2021/12/24 1:31, Jan Kara wrote:
> > Hello,
> >
> > these three patches fix use-after-free issues in BFQ when processes with merged
> > queues get moved to different cgroups. The patches have survived some beating
> > in my test VM but so far I fail to reproduce the original KASAN reports so
> > testing from people who can reproduce them is most welcome. Thanks!
>
> Hi,
>
> Unfortunately, this patchset can't fix the UAF; just marking
> split_coop in patch 3 doesn't seem to be enough.

Thanks for testing!

> Here is the result:
>
> [ 548.440184] ==================================================================
> [ 548.441680] BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x21/0x290
> [ 548.443155] Read of size 1 at addr ffff8881723e00b0 by task rmmod/13984
> [ 548.444109]
> [ 548.444321] CPU: 30 PID: 13984 Comm: rmmod Tainted: G W 5.16.0-rc5-next-2026
> [ 548.445549] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
> [ 548.447348] Call Trace:
> [ 548.447682]  <TASK>
> [ 548.447967]  dump_stack_lvl+0x34/0x44
> [ 548.448470]  print_address_description.constprop.0.cold+0xab/0x36b
> [ 548.449303]  ? __bfq_deactivate_entity+0x21/0x290
> [ 548.449929]  ? __bfq_deactivate_entity+0x21/0x290
> [ 548.450565]  kasan_report.cold+0x83/0xdf
> [ 548.451114]  ? _raw_read_lock_bh+0x20/0x40
> [ 548.451658]  ? __bfq_deactivate_entity+0x21/0x290
> [ 548.452296]  __bfq_deactivate_entity+0x21/0x290
> [ 548.452917]  bfq_pd_offline+0xc1/0x110

Can you pass the trace through addr2line please? I'm curious whether this
is a call in bfq_flush_idle_tree() or directly from bfq_pd_offline(). Also
whether the crash in __bfq_deactivate_entity() is indeed inside
bfq_entity_service_tree() as I expect.

> [ 548.453436]  blkcg_deactivate_policy+0x14b/0x210
> [ 548.454058]  bfq_exit_queue+0xe5/0x100
> [ 548.454573]  blk_mq_exit_sched+0x113/0x140
> [ 548.455162]  elevator_exit+0x30/0x50
> [ 548.455645]  blk_release_queue+0xa8/0x160

So I'm not convinced this is the same problem as before. I'll be able to
tell more when I see the addr2line output.

> What do you think about this:
>
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 1ce1a99a7160..14c1d1c3811e 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -2626,6 +2626,11 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
>  	while ((__bfqq = new_bfqq->new_bfqq)) {
>  		if (__bfqq == bfqq)
>  			return NULL;
> +		if (__bfqq->entity.parent != bfqq->entity.parent) {
> +			if (bfq_bfqq_coop(__bfqq))
> +				bfq_mark_bfqq_split_coop(__bfqq);
> +			return NULL;
> +		}

So why is this needed? In my patches we do check in bfq_setup_merge() that
the bfqq we merge to is in the same cgroup. If some intermediate bfqq
happens to move to a different cgroup between the detection and merge,
well, that's bad luck, but practically I don't think that is a real problem
and it should not cause the use-after-free issues. Or am I missing
something?

>  		new_bfqq = __bfqq;
>  	}
>
> @@ -2825,8 +2830,16 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
>  	if (bfq_too_late_for_merging(bfqq))
>  		return NULL;
>
> -	if (bfqq->new_bfqq)
> -		return bfqq->new_bfqq;
> +	if (bfqq->new_bfqq) {
> +		struct bfq_queue *new_bfqq = bfqq->new_bfqq;
> +
> +		if (bfqq->entity.parent == new_bfqq->entity.parent)
> +			return new_bfqq;
> +
> +		if (bfq_bfqq_coop(new_bfqq))
> +			bfq_mark_bfqq_split_coop(new_bfqq);
> +		return NULL;
> +	}

So I wanted to say that this should be already handled by the change to
__bfq_bic_change_cgroup() in my series. But bfq_setup_cooperator() can also
be called from bfq_allow_bio_merge() and on that path we don't call
bfq_bic_update_cgroup(), so indeed we can merge bfqq into a bfqq in a
different (or dead) cgroup through that path. I actually think we should
call bfq_bic_update_cgroup() in bfq_allow_bio_merge() so that we reparent /
split bfqq when new IO arrives for a different cgroup.

Also I have realized that my change to __bfq_bic_change_cgroup() may not be
enough and we should also clear sync_bfqq->new_bfqq if it is set.

								Honza
On 2022/01/04 4:37, Jan Kara wrote:
> On Fri 24-12-21 09:30:04, yukuai (C) wrote:
>> On 2021/12/24 1:31, Jan Kara wrote:
>>> Hello,
>>>
>>> these three patches fix use-after-free issues in BFQ when processes with merged
>>> queues get moved to different cgroups. The patches have survived some beating
>>> in my test VM but so far I fail to reproduce the original KASAN reports so
>>> testing from people who can reproduce them is most welcome. Thanks!
>>
>> Hi,
>>
>> Unfortunately, this patchset can't fix the UAF; just marking
>> split_coop in patch 3 doesn't seem to be enough.
>
> Thanks for testing!
>
>> Here is the result:
>>
>> [ 548.440184] ==================================================================
>> [ 548.441680] BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x21/0x290
>> [ 548.443155] Read of size 1 at addr ffff8881723e00b0 by task rmmod/13984
>> [ 548.444109]
>> [ 548.444321] CPU: 30 PID: 13984 Comm: rmmod Tainted: G W 5.16.0-rc5-next-2026
>> [ 548.445549] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
>> [ 548.447348] Call Trace:
>> [ 548.447682]  <TASK>
>> [ 548.447967]  dump_stack_lvl+0x34/0x44
>> [ 548.448470]  print_address_description.constprop.0.cold+0xab/0x36b
>> [ 548.449303]  ? __bfq_deactivate_entity+0x21/0x290
>> [ 548.449929]  ? __bfq_deactivate_entity+0x21/0x290
>> [ 548.450565]  kasan_report.cold+0x83/0xdf
>> [ 548.451114]  ? _raw_read_lock_bh+0x20/0x40
>> [ 548.451658]  ? __bfq_deactivate_entity+0x21/0x290
>> [ 548.452296]  __bfq_deactivate_entity+0x21/0x290
>> [ 548.452917]  bfq_pd_offline+0xc1/0x110
>
> Can you pass the trace through addr2line please? I'm curious whether this
> is a call in bfq_flush_idle_tree() or directly from bfq_pd_offline(). Also
> whether the crash in __bfq_deactivate_entity() is indeed inside
> bfq_entity_service_tree() as I expect.
>
>> [ 548.453436]  blkcg_deactivate_policy+0x14b/0x210
>> [ 548.454058]  bfq_exit_queue+0xe5/0x100
>> [ 548.454573]  blk_mq_exit_sched+0x113/0x140
>> [ 548.455162]  elevator_exit+0x30/0x50
>> [ 548.455645]  blk_release_queue+0xa8/0x160
>
> So I'm not convinced this is the same problem as before. I'll be able to
> tell more when I see the addr2line output.
>
>> What do you think about this:
>>
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index 1ce1a99a7160..14c1d1c3811e 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -2626,6 +2626,11 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
>>  	while ((__bfqq = new_bfqq->new_bfqq)) {
>>  		if (__bfqq == bfqq)
>>  			return NULL;
>> +		if (__bfqq->entity.parent != bfqq->entity.parent) {
>> +			if (bfq_bfqq_coop(__bfqq))
>> +				bfq_mark_bfqq_split_coop(__bfqq);
>> +			return NULL;
>> +		}
>
> So why is this needed? In my patches we do check in bfq_setup_merge() that
> the bfqq we merge to is in the same cgroup. If some intermediate bfqq
> happens to move to a different cgroup between the detection and merge,
> well, that's bad luck, but practically I don't think that is a real
> problem and it should not cause the use-after-free issues. Or am I missing
> something?

Hi,

I'm sure that in my reproducer bfq_setup_merge() never merges two bfqqs
from different groups (I added some debuginfo before), and somehow later
bfqq and bfqq->new_bfqq still end up in different bfqgs.

I haven't thought about merging bfqqs from different groups yet; my
reproducer probably can't cover this case...

This modification is just lazy detection of the problem. You're right that
fixing the problem where it first arises is better.

Thanks,
Kuai

>
>>  		new_bfqq = __bfqq;
>>  	}
>>
>> @@ -2825,8 +2830,16 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
>>  	if (bfq_too_late_for_merging(bfqq))
>>  		return NULL;
>>
>> -	if (bfqq->new_bfqq)
>> -		return bfqq->new_bfqq;
>> +	if (bfqq->new_bfqq) {
>> +		struct bfq_queue *new_bfqq = bfqq->new_bfqq;
>> +
>> +		if (bfqq->entity.parent == new_bfqq->entity.parent)
>> +			return new_bfqq;
>> +
>> +		if (bfq_bfqq_coop(new_bfqq))
>> +			bfq_mark_bfqq_split_coop(new_bfqq);
>> +		return NULL;
>> +	}
>
> So I wanted to say that this should be already handled by the change to
> __bfq_bic_change_cgroup() in my series. But bfq_setup_cooperator() can
> also be called from bfq_allow_bio_merge() and on that path we don't call
> bfq_bic_update_cgroup(), so indeed we can merge bfqq into a bfqq in a
> different (or dead) cgroup through that path. I actually think we should
> call bfq_bic_update_cgroup() in bfq_allow_bio_merge() so that we reparent
> / split bfqq when new IO arrives for a different cgroup.
>
> Also I have realized that my change to __bfq_bic_change_cgroup() may not
> be enough and we should also clear sync_bfqq->new_bfqq if it is set.
>
> 								Honza
>