Message ID | 20180619172640.15246-1-bart.vanassche@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jun 19 2018 at 1:26pm -0400, Bart Van Assche <bart.vanassche@wdc.com> wrote: > Commit 0ba99ca4838b ("block: Add warning for bi_next not NULL in > bio_endio()") breaks the dm driver. end_clone_bio() detects whether > or not a bio is the last bio associated with a request by checking > the .bi_next field. Commit 0ba99ca4838b clears that field before > end_clone_bio() has had a chance to inspect that field. Hence revert > commit 0ba99ca4838b. > > This patch avoids that KASAN reports the following complaint when > running the srp-test software (srp-test/run_tests -c -d -r 10 -t 02-mq): > > ================================================================== > BUG: KASAN: use-after-free in bio_advance+0x11b/0x1d0 > Read of size 4 at addr ffff8801300e06d0 by task ksoftirqd/0/9 > > CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.18.0-rc1-dbg+ #1 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 > Call Trace: > dump_stack+0xa4/0xf5 > print_address_description+0x6f/0x270 > kasan_report+0x241/0x360 > __asan_load4+0x78/0x80 > bio_advance+0x11b/0x1d0 > blk_update_request+0xa7/0x5b0 > scsi_end_request+0x56/0x320 [scsi_mod] > scsi_io_completion+0x7d6/0xb20 [scsi_mod] > scsi_finish_command+0x1c0/0x280 [scsi_mod] > scsi_softirq_done+0x19a/0x230 [scsi_mod] > blk_mq_complete_request+0x160/0x240 > scsi_mq_done+0x50/0x1a0 [scsi_mod] > srp_recv_done+0x515/0x1330 [ib_srp] > __ib_process_cq+0xa0/0xf0 [ib_core] > ib_poll_handler+0x38/0xa0 [ib_core] > irq_poll_softirq+0xe8/0x1f0 > __do_softirq+0x128/0x60d > run_ksoftirqd+0x3f/0x60 > smpboot_thread_fn+0x352/0x460 > kthread+0x1c1/0x1e0 > ret_from_fork+0x24/0x30 > > Allocated by task 1918: > save_stack+0x43/0xd0 > kasan_kmalloc+0xad/0xe0 > kasan_slab_alloc+0x11/0x20 > kmem_cache_alloc+0xfe/0x350 > mempool_alloc_slab+0x15/0x20 > mempool_alloc+0xfb/0x270 > bio_alloc_bioset+0x244/0x350 > submit_bh_wbc+0x9c/0x2f0 > __block_write_full_page+0x299/0x5a0 > block_write_full_page+0x16b/0x180 > blkdev_writepage+0x18/0x20 > __writepage+0x42/0x80 > write_cache_pages+0x376/0x8a0 > generic_writepages+0xbe/0x110 > blkdev_writepages+0xe/0x10 > do_writepages+0x9b/0x180 > __filemap_fdatawrite_range+0x178/0x1c0 > file_write_and_wait_range+0x59/0xc0 > blkdev_fsync+0x46/0x80 > vfs_fsync_range+0x66/0x100 > do_fsync+0x3d/0x70 > __x64_sys_fsync+0x21/0x30 > do_syscall_64+0x77/0x230 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > Freed by task 9: > save_stack+0x43/0xd0 > __kasan_slab_free+0x137/0x190 > kasan_slab_free+0xe/0x10 > kmem_cache_free+0xd3/0x380 > mempool_free_slab+0x17/0x20 > mempool_free+0x63/0x160 > bio_free+0x81/0xa0 > bio_put+0x59/0x60 > end_bio_bh_io_sync+0x5d/0x70 > bio_endio+0x1a7/0x360 > blk_update_request+0xd0/0x5b0 > end_clone_bio+0xa3/0xd0 [dm_mod] > bio_endio+0x1a7/0x360 > blk_update_request+0xd0/0x5b0 > scsi_end_request+0x56/0x320 [scsi_mod] > scsi_io_completion+0x7d6/0xb20 [scsi_mod] > scsi_finish_command+0x1c0/0x280 [scsi_mod] > scsi_softirq_done+0x19a/0x230 [scsi_mod] > blk_mq_complete_request+0x160/0x240 > scsi_mq_done+0x50/0x1a0 [scsi_mod] > srp_recv_done+0x515/0x1330 [ib_srp] > __ib_process_cq+0xa0/0xf0 [ib_core] > ib_poll_handler+0x38/0xa0 [ib_core] > irq_poll_softirq+0xe8/0x1f0 > __do_softirq+0x128/0x60d > > The buggy address belongs to the object at ffff8801300e0640 > which belongs to the cache bio-0 of size 200 > The buggy address is located 144 bytes inside of > 200-byte region [ffff8801300e0640, ffff8801300e0708) > The buggy address belongs to the page: > page:ffffea0004c03800 count:1 mapcount:0 mapping:ffff88015a563a00 index:0x0 compound_mapcount: 0 > flags: 0x8000000000008100(slab|head) > raw: 8000000000008100 dead000000000100 dead000000000200 ffff88015a563a00 > raw: 0000000000000000 0000000000330033 00000001ffffffff 0000000000000000 > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > ffff8801300e0580: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc > ffff8801300e0600: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb > >ffff8801300e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ^ > ffff8801300e0700: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > ffff8801300e0780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > ================================================================== > > Fixes: 0ba99ca4838b ("block: Add warning for bi_next not NULL in bio_endio()") > Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> > Cc: Kent Overstreet <kent.overstreet@gmail.com> > Cc: Mike Snitzer <snitzer@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Thanks Bart.
On 6/19/18 11:26 AM, Bart Van Assche wrote: > Commit 0ba99ca4838b ("block: Add warning for bi_next not NULL in > bio_endio()") breaks the dm driver. end_clone_bio() detects whether > or not a bio is the last bio associated with a request by checking > the .bi_next field. Commit 0ba99ca4838b clears that field before > end_clone_bio() has had a chance to inspect that field. Hence revert > commit 0ba99ca4838b. Applied, thanks Bart.
On Tue, Jun 19, 2018 at 10:26:40AM -0700, Bart Van Assche wrote: > Commit 0ba99ca4838b ("block: Add warning for bi_next not NULL in > bio_endio()") breaks the dm driver. end_clone_bio() detects whether > or not a bio is the last bio associated with a request by checking > the .bi_next field. Commit 0ba99ca4838b clears that field before > end_clone_bio() has had a chance to inspect that field. Hence revert > commit 0ba99ca4838b. > > This patch avoids that KASAN reports the following complaint when > running the srp-test software (srp-test/run_tests -c -d -r 10 -t 02-mq): Thanks for debugging this I take it if we had a test for request based dm in blktests or somewhere that probably would have caught this much easier :/
On Tue, 2018-06-19 at 14:16 -0400, Kent Overstreet wrote: > I take it if we had a test for request based dm in blktests or somewhere that > probably would have caught this much easier :/ I'm working on porting the srp-test software to the blktests framework. Bart.
On 6/19/18 12:17 PM, Bart Van Assche wrote: > On Tue, 2018-06-19 at 14:16 -0400, Kent Overstreet wrote: >> I take it if we had a test for request based dm in blktests or somewhere that >> probably would have caught this much easier :/ > > I'm working on porting the srp-test software to the blktests framework. That's awesome, would be a great addition!
================================================================== BUG: KASAN: use-after-free in bio_advance+0x11b/0x1d0 Read of size 4 at addr ffff8801300e06d0 by task ksoftirqd/0/9 CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.18.0-rc1-dbg+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xa4/0xf5 print_address_description+0x6f/0x270 kasan_report+0x241/0x360 __asan_load4+0x78/0x80 bio_advance+0x11b/0x1d0 blk_update_request+0xa7/0x5b0 scsi_end_request+0x56/0x320 [scsi_mod] scsi_io_completion+0x7d6/0xb20 [scsi_mod] scsi_finish_command+0x1c0/0x280 [scsi_mod] scsi_softirq_done+0x19a/0x230 [scsi_mod] blk_mq_complete_request+0x160/0x240 scsi_mq_done+0x50/0x1a0 [scsi_mod] srp_recv_done+0x515/0x1330 [ib_srp] __ib_process_cq+0xa0/0xf0 [ib_core] ib_poll_handler+0x38/0xa0 [ib_core] irq_poll_softirq+0xe8/0x1f0 __do_softirq+0x128/0x60d run_ksoftirqd+0x3f/0x60 smpboot_thread_fn+0x352/0x460 kthread+0x1c1/0x1e0 ret_from_fork+0x24/0x30 Allocated by task 1918: save_stack+0x43/0xd0 kasan_kmalloc+0xad/0xe0 kasan_slab_alloc+0x11/0x20 kmem_cache_alloc+0xfe/0x350 mempool_alloc_slab+0x15/0x20 mempool_alloc+0xfb/0x270 bio_alloc_bioset+0x244/0x350 submit_bh_wbc+0x9c/0x2f0 __block_write_full_page+0x299/0x5a0 block_write_full_page+0x16b/0x180 blkdev_writepage+0x18/0x20 __writepage+0x42/0x80 write_cache_pages+0x376/0x8a0 generic_writepages+0xbe/0x110 blkdev_writepages+0xe/0x10 do_writepages+0x9b/0x180 __filemap_fdatawrite_range+0x178/0x1c0 file_write_and_wait_range+0x59/0xc0 blkdev_fsync+0x46/0x80 vfs_fsync_range+0x66/0x100 do_fsync+0x3d/0x70 __x64_sys_fsync+0x21/0x30 do_syscall_64+0x77/0x230 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 9: save_stack+0x43/0xd0 __kasan_slab_free+0x137/0x190 kasan_slab_free+0xe/0x10 kmem_cache_free+0xd3/0x380 mempool_free_slab+0x17/0x20 mempool_free+0x63/0x160 bio_free+0x81/0xa0 bio_put+0x59/0x60 end_bio_bh_io_sync+0x5d/0x70 bio_endio+0x1a7/0x360 blk_update_request+0xd0/0x5b0 end_clone_bio+0xa3/0xd0 [dm_mod] bio_endio+0x1a7/0x360 blk_update_request+0xd0/0x5b0 scsi_end_request+0x56/0x320 [scsi_mod] scsi_io_completion+0x7d6/0xb20 [scsi_mod] scsi_finish_command+0x1c0/0x280 [scsi_mod] scsi_softirq_done+0x19a/0x230 [scsi_mod] blk_mq_complete_request+0x160/0x240 scsi_mq_done+0x50/0x1a0 [scsi_mod] srp_recv_done+0x515/0x1330 [ib_srp] __ib_process_cq+0xa0/0xf0 [ib_core] ib_poll_handler+0x38/0xa0 [ib_core] irq_poll_softirq+0xe8/0x1f0 __do_softirq+0x128/0x60d The buggy address belongs to the object at ffff8801300e0640 which belongs to the cache bio-0 of size 200 The buggy address is located 144 bytes inside of 200-byte region [ffff8801300e0640, ffff8801300e0708) The buggy address belongs to the page: page:ffffea0004c03800 count:1 mapcount:0 mapping:ffff88015a563a00 index:0x0 compound_mapcount: 0 flags: 0x8000000000008100(slab|head) raw: 8000000000008100 dead000000000100 dead000000000200 ffff88015a563a00 raw: 0000000000000000 0000000000330033 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8801300e0580: fb fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc ffff8801300e0600: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb >ffff8801300e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8801300e0700: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8801300e0780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Fixes: 0ba99ca4838b ("block: Add warning for bi_next not NULL in bio_endio()") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Mike Snitzer <snitzer@redhat.com> --- Changes in v2 compared to v1: improved patch description. block/bio.c | 3 --- block/blk-core.c | 8 +------- 2 files changed, 1 insertion(+), 10 deletions(-) diff --git a/block/bio.c b/block/bio.c index 595663e0281a..accf68a5aa1e 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1777,9 +1777,6 @@ void bio_endio(struct bio *bio) if (!bio_integrity_endio(bio)) return; - if (WARN_ONCE(bio->bi_next, "driver left bi_next not NULL")) - bio->bi_next = NULL; - /* * Need to have a real endio function for chained bios, otherwise * various corner cases will break (like stacking block devices that diff --git a/block/blk-core.c b/block/blk-core.c index 7b7f237b13d5..49475e008e38 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -274,10 +274,6 @@ static void req_bio_endio(struct request *rq, struct bio *bio, bio_advance(bio, nbytes); /* don't actually finish bio if it's part of flush sequence */ - /* - * XXX this code looks suspicious - it's not consistent with advancing - * req->bio in caller - */ if (bio->bi_iter.bi_size == 0 && !(rq->rq_flags & RQF_FLUSH_SEQ)) bio_endio(bio); } @@ -3112,10 +3108,8 @@ bool blk_update_request(struct request *req, blk_status_t error, struct bio *bio = req->bio; unsigned bio_bytes = min(bio->bi_iter.bi_size, nr_bytes); - if (bio_bytes == bio->bi_iter.bi_size) { + if (bio_bytes == bio->bi_iter.bi_size) req->bio = bio->bi_next; - bio->bi_next = NULL; - } /* Completion has already been traced */ bio_clear_flag(bio, BIO_TRACE_COMPLETION);