Message ID | 20171005163328.GA3562@infradead.org (mailing list archive) |
---|---|
State | Deferred, archived |
Headers | show |
On Thu, Oct 5, 2017 at 6:33 PM, Christoph Hellwig <hch@infradead.org> wrote: > Does the patch below fix the warning for you? > > -- > From 28aae7104425433d39e6142adcd5b88dc5b0ad5f Mon Sep 17 00:00:00 2001 > From: Christoph Hellwig <hch@lst.de> > Date: Thu, 5 Oct 2017 18:31:02 +0200 > Subject: block: use DECLARE_COMPLETION_ONSTACK in submit_bio_wait > > This way we get our lockdep annotations right in case multiple layers > in the stack use submit_bio_wait. > > It also happens to simplify the code by getting rid of the submit_bio_ret > structure. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > block/bio.c | 19 +++++-------------- > 1 file changed, 5 insertions(+), 14 deletions(-) > > diff --git a/block/bio.c b/block/bio.c > index 8338304ea256..4e18e959fc0a 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -917,17 +917,9 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) > } > EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages); > > -struct submit_bio_ret { > - struct completion event; > - int error; > -}; > - > static void submit_bio_wait_endio(struct bio *bio) > { > - struct submit_bio_ret *ret = bio->bi_private; > - > - ret->error = blk_status_to_errno(bio->bi_status); > - complete(&ret->event); > + complete(bio->bi_private); > } > > /** > @@ -943,16 +935,15 @@ static void submit_bio_wait_endio(struct bio *bio) > */ > int submit_bio_wait(struct bio *bio) > { > - struct submit_bio_ret ret; > + DECLARE_COMPLETION_ONSTACK(done); > > - init_completion(&ret.event); > - bio->bi_private = &ret; > + bio->bi_private = &done; > bio->bi_end_io = submit_bio_wait_endio; > bio->bi_opf |= REQ_SYNC; > submit_bio(bio); > - wait_for_completion_io(&ret.event); > + wait_for_completion_io(&done); > > - return ret.error; > + return blk_status_to_errno(bio->bi_status); > } > EXPORT_SYMBOL(submit_bio_wait); No, it doesn't -- the splat is a little bit more complicated, but fundamentally the same thing. Thanks, Ilya run fstests shared/298 at 2017-10-09 16:54:45 ====================================================== WARNING: possible circular locking dependency detected 4.14.0-rc4-ceph-g0721a67fa48f #1 Tainted: G W ------------------------------------------------------ loop0/23166 is trying to acquire lock: (&ei->i_data_sem){++++}, at: [<ffffffffa6340226>] ext4_punch_hole+0x3b6/0x570 but now in release context of a crosslock acquired at the following: ((complete)&ret.event){+.+.}, at: [<ffffffffa645c7ab>] submit_bio_wait+0x8b/0xb0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 ((complete)&ret.event){+.+.}: __lock_acquire+0x111f/0x1150 lock_acquire+0xed/0x1e0 wait_for_completion_io+0x53/0x1b0 submit_bio_wait+0x8b/0xb0 blkdev_issue_zeroout+0xcf/0x1b0 ext4_issue_zeroout+0x44/0x60 ext4_ext_zeroout+0x2f/0x40 ext4_ext_convert_to_initialized+0x6ee/0x7a0 ext4_ext_map_blocks+0x151d/0x1620 ext4_map_blocks+0x14f/0x5b0 ext4_writepages+0xa4a/0x11e0 do_writepages+0x1c/0x80 __filemap_fdatawrite_range+0xaa/0xe0 file_write_and_wait_range+0x39/0xa0 ext4_sync_file+0xae/0x580 vfs_fsync_range+0x4b/0xb0 do_fsync+0x3d/0x70 SyS_fsync+0x10/0x20 entry_SYSCALL_64_fastpath+0x23/0xc2 -> #0 (&ei->i_data_sem){++++}: down_write+0x40/0x70 ext4_punch_hole+0x3b6/0x570 ext4_fallocate+0x19d/0xb20 loop_queue_work+0xd7/0x9b0 kthread_worker_fn+0xbf/0x200 other info that might help us debug this: Possible unsafe locking scenario by crosslock: CPU0 CPU1 ---- ---- lock(&ei->i_data_sem); lock((complete)&ret.event); lock(&ei->i_data_sem); unlock((complete)&ret.event); *** DEADLOCK *** 1 lock held by loop0/23166: #0: (&x->wait#17){..-.}, at: [<ffffffffa60d678d>] complete+0x1d/0x60 stack backtrace: CPU: 0 PID: 23166 Comm: loop0 Tainted: G W 4.14.0-rc4-ceph-g0721a67fa48f #1 Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 Call Trace: dump_stack+0x85/0xc7 print_circular_bug+0x1eb/0x2e0 ? print_bfs_bug+0x40/0x40 check_prev_add+0x396/0x7e0 ? jbd2_journal_stop+0xd0/0x4d0 lock_commit_crosslock+0x3db/0x5e0 ? lock_commit_crosslock+0x3db/0x5e0 complete+0x29/0x60 submit_bio_wait_endio+0x25/0x30 bio_endio+0x9f/0x1e0 blk_update_request+0xc4/0x3c0 blk_mq_end_request+0x1e/0x70 lo_complete_rq+0x30/0x80 __blk_mq_complete_request+0xa3/0x160 blk_mq_complete_request+0x16/0x20 loop_queue_work+0x5f/0x9b0 ? find_held_lock+0x39/0xb0 ? kthread_worker_fn+0x9c/0x200 ? trace_hardirqs_on_caller+0x11f/0x190 kthread_worker_fn+0xbf/0x200 loop_kthread_worker_fn+0x1e/0x20 kthread+0x152/0x190 ? loop_get_status64+0x90/0x90 ? kthread_stop+0x2a0/0x2a0 ret_from_fork+0x2a/0x40 run fstests shared/298 at 2017-10-09 20:38:50 ====================================================== WARNING: possible circular locking dependency detected 4.14.0-rc4-ceph-gc9ed9bcb0121 #1 Not tainted ------------------------------------------------------ loop0/15423 is trying to acquire lock: (jbd2_handle){++++}, at: [<ffffffffbd37b694>] start_this_handle+0x104/0x450 but now in release context of a crosslock acquired at the following: ((complete)&done#3){+.+.}, at: [<ffffffffbd45c79b>] submit_bio_wait+0x8b/0xc0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 ((complete)&done#3){+.+.}: __lock_acquire+0x111f/0x1150 lock_acquire+0xed/0x1e0 wait_for_completion_io+0x53/0x1b0 submit_bio_wait+0x8b/0xc0 blkdev_issue_zeroout+0xcf/0x1b0 ext4_init_inode_table+0x153/0x339 ext4_lazyinit_thread+0x2bc/0x3d0 kthread+0x152/0x190 ret_from_fork+0x2a/0x40 -> #1 (&meta_group_info[i]->alloc_sem){++++}: __lock_acquire+0x111f/0x1150 lock_acquire+0xed/0x1e0 down_read+0x43/0x70 __ext4_new_inode+0xc4a/0x1470 ext4_create+0x103/0x190 lookup_open+0x78a/0x880 path_openat+0x3c0/0xad0 do_filp_open+0x8a/0xf0 do_sys_open+0x11b/0x1f0 SyS_open+0x1e/0x20 entry_SYSCALL_64_fastpath+0x23/0xc2 -> #0 (jbd2_handle){++++}: start_this_handle+0x167/0x450 jbd2__journal_start+0xdb/0x2b0 __ext4_journal_start_sb+0x89/0x1e0 ext4_punch_hole+0x229/0x570 ext4_fallocate+0x19d/0xb20 other info that might help us debug this: Chain exists of: jbd2_handle --> &meta_group_info[i]->alloc_sem --> (complete)&done#3 Possible unsafe locking scenario by crosslock: CPU0 CPU1 ---- ---- lock(&meta_group_info[i]->alloc_sem); lock((complete)&done#3); lock(jbd2_handle); unlock((complete)&done#3); *** DEADLOCK *** 1 lock held by loop0/15423: #0: (&x->wait#17){..-.}, at: [<ffffffffbd0d678d>] complete+0x1d/0x60 stack backtrace: CPU: 1 PID: 15423 Comm: loop0 Not tainted 4.14.0-rc4-ceph-gc9ed9bcb0121 #1 Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 Call Trace: dump_stack+0x85/0xc7 print_circular_bug+0x1eb/0x2e0 ? print_bfs_bug+0x40/0x40 check_prev_add+0x396/0x7e0 ? __lock_acquire+0x705/0x1150 lock_commit_crosslock+0x3db/0x5e0 ? lock_commit_crosslock+0x3db/0x5e0 complete+0x29/0x60 submit_bio_wait_endio+0x12/0x20 bio_endio+0x9f/0x1e0 blk_update_request+0xc4/0x3c0 blk_mq_end_request+0x1e/0x70 lo_complete_rq+0x30/0x80 __blk_mq_complete_request+0xa3/0x160 blk_mq_complete_request+0x16/0x20 loop_queue_work+0x5f/0x9b0 ? find_held_lock+0x39/0xb0 ? kthread_worker_fn+0x9c/0x200 ? trace_hardirqs_on_caller+0x11f/0x190 kthread_worker_fn+0xbf/0x200 loop_kthread_worker_fn+0x1e/0x20 kthread+0x152/0x190 ? loop_get_status64+0x90/0x90 ? kthread_stop+0x2a0/0x2a0 ret_from_fork+0x2a/0x40
On Tue, Oct 10, 2017 at 11:16 AM, Ilya Dryomov <idryomov@gmail.com> wrote: > On Thu, Oct 5, 2017 at 6:33 PM, Christoph Hellwig <hch@infradead.org> wrote: >> Does the patch below fix the warning for you? >> >> -- >> From 28aae7104425433d39e6142adcd5b88dc5b0ad5f Mon Sep 17 00:00:00 2001 >> From: Christoph Hellwig <hch@lst.de> >> Date: Thu, 5 Oct 2017 18:31:02 +0200 >> Subject: block: use DECLARE_COMPLETION_ONSTACK in submit_bio_wait >> >> This way we get our lockdep annotations right in case multiple layers >> in the stack use submit_bio_wait. >> >> It also happens to simplify the code by getting rid of the submit_bio_ret >> structure. >> >> Signed-off-by: Christoph Hellwig <hch@lst.de> >> --- >> block/bio.c | 19 +++++-------------- >> 1 file changed, 5 insertions(+), 14 deletions(-) >> >> diff --git a/block/bio.c b/block/bio.c >> index 8338304ea256..4e18e959fc0a 100644 >> --- a/block/bio.c >> +++ b/block/bio.c >> @@ -917,17 +917,9 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) >> } >> EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages); >> >> -struct submit_bio_ret { >> - struct completion event; >> - int error; >> -}; >> - >> static void submit_bio_wait_endio(struct bio *bio) >> { >> - struct submit_bio_ret *ret = bio->bi_private; >> - >> - ret->error = blk_status_to_errno(bio->bi_status); >> - complete(&ret->event); >> + complete(bio->bi_private); >> } >> >> /** >> @@ -943,16 +935,15 @@ static void submit_bio_wait_endio(struct bio *bio) >> */ >> int submit_bio_wait(struct bio *bio) >> { >> - struct submit_bio_ret ret; >> + DECLARE_COMPLETION_ONSTACK(done); >> >> - init_completion(&ret.event); >> - bio->bi_private = &ret; >> + bio->bi_private = &done; >> bio->bi_end_io = submit_bio_wait_endio; >> bio->bi_opf |= REQ_SYNC; >> submit_bio(bio); >> - wait_for_completion_io(&ret.event); >> + wait_for_completion_io(&done); >> >> - return ret.error; >> + return blk_status_to_errno(bio->bi_status); >> } >> EXPORT_SYMBOL(submit_bio_wait); > > No, it doesn't -- the splat is a little bit more complicated, but > fundamentally the same thing. Easily triggered with generic/361 too, BTW. Thanks, Ilya -- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/block/bio.c b/block/bio.c index 8338304ea256..4e18e959fc0a 100644 --- a/block/bio.c +++ b/block/bio.c @@ -917,17 +917,9 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) } EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages); -struct submit_bio_ret { - struct completion event; - int error; -}; - static void submit_bio_wait_endio(struct bio *bio) { - struct submit_bio_ret *ret = bio->bi_private; - - ret->error = blk_status_to_errno(bio->bi_status); - complete(&ret->event); + complete(bio->bi_private); } /** @@ -943,16 +935,15 @@ static void submit_bio_wait_endio(struct bio *bio) */ int submit_bio_wait(struct bio *bio) { - struct submit_bio_ret ret; + DECLARE_COMPLETION_ONSTACK(done); - init_completion(&ret.event); - bio->bi_private = &ret; + bio->bi_private = &done; bio->bi_end_io = submit_bio_wait_endio; bio->bi_opf |= REQ_SYNC; submit_bio(bio); - wait_for_completion_io(&ret.event); + wait_for_completion_io(&done); - return ret.error; + return blk_status_to_errno(bio->bi_status); } EXPORT_SYMBOL(submit_bio_wait);