[stable,5.10,0/3] dm: fix nullptr crash

Message ID 20220729062356.1663513-1-yukuai1@huaweicloud.com (mailing list archive)

Yu Kuai July 29, 2022, 6:23 a.m. UTC
From: Yu Kuai <yukuai3@huawei.com>

This patchset backports three patches to fix a crash found by our testing:

BUG: kernel NULL pointer dereference, address: 00000000000001a0
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 1317 Comm: mount Not tainted 5.10.0-16691-gf6076432827d-dirty #169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
RIP: 0010:__blk_mq_sched_bio_merge+0x9d/0x1a0
Code: 87 1e 9d 89 d0 25 00 00 00 01 0f 85 ad 00 00 00 48 83 05 25 a1 37 0c 01 3
RSP: 0018:ffffc90000473b50 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90000473b98
RDX: 0000000000001000 RSI: ffff8881080c7500 RDI: ffff888103a9cc18
RBP: ffff88813bc80000 R08: 0000000000000001 R09: 0000000000000000
R10: ffff88810710be30 R11: 0000000000000000 R12: ffff888103a9cc18
R13: ffff8881080c7500 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f51bcdbb040(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000001a0 CR3: 000000010d715000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Call Trace:
 blk_mq_submit_bio+0x115/0xd80
 submit_bio_noacct+0x4ff/0x610
 submit_bio+0xaa/0x1a0
 submit_bh_wbc+0x1cb/0x2f0
 submit_bh+0x17/0x20
 ext4_read_bh+0x63/0x170
 ext4_read_bh_lock+0x2c/0xd0
 __ext4_sb_bread_gfp.isra.0+0xa0/0xf0
 ext4_fill_super+0x21f/0x5610
 ? pointer+0x31b/0x5a0
 ? vsnprintf+0x131/0x7d0
 mount_bdev+0x233/0x280
 ? ext4_calculate_overhead+0x660/0x660
 ext4_mount+0x19/0x30
 legacy_get_tree+0x35/0x90
 vfs_get_tree+0x29/0x100
 ? capable+0x1d/0x30
 path_mount+0x8a7/0x1150
 do_mount+0x8d/0xc0
 __se_sys_mount+0x14a/0x220
 __x64_sys_mount+0x29/0x40
 do_syscall_64+0x45/0x70
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f51bbe1623a
Code: 48 8b 0d 51 dc 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 8
RSP: 002b:00007fff173ae898 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 000056169a120030 RCX: 00007f51bbe1623a
RDX: 000056169a120210 RSI: 000056169a120250 RDI: 000056169a120230
RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fff173ad798
R10: 00000000c0ed0000 R11: 0000000000000246 R12: 000056169a120230
R13: 000056169a120210 R14: 0000000000000000 R15: 00007f51bcbac184
Modules linked in: dm_service_time dm_multipath
CR2: 00000000000001a0
---[ end trace ac5d86e09fdc7c98 ]---
RIP: 0010:__blk_mq_sched_bio_merge+0x9d/0x1a0
Code: 87 1e 9d 89 d0 25 00 00 00 01 0f 85 ad 00 00 00 48 83 05 25 a1 37 0c 01 3
RSP: 0018:ffffc90000473b50 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90000473b98
RDX: 0000000000001000 RSI: ffff8881080c7500 RDI: ffff888103a9cc18
RBP: ffff88813bc80000 R08: 0000000000000001 R09: 0000000000000000
R10: ffff88810710be30 R11: 0000000000000000 R12: ffff888103a9cc18
R13: ffff8881080c7500 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f51bcdbb040(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f10e97a5000 CR3: 000000010d715000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Fatal exception ]---

root cause:
t1 dm-mpath                             t2 mount

alloc_dev
 md->queue = blk_alloc_queue
 add_disk_no_queue_reg

dm_setup_md_queue
 case DM_TYPE_REQUEST_BASED -> multipath
  md->disk->fops = &dm_rq_blk_dops;
                                        ext4_fill_super
                                         __ext4_sb_bread_gfp
                                          ext4_read_bh
                                           submit_bio -> queue is not initialized yet
                                            __blk_mq_sched_bio_merge
                                             ctx = blk_mq_get_ctx(q); -> ctx is NULL
  dm_mq_init_request_queue
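
For illustration only, the interleaving above can be modelled in plain userspace C. This is a rough sketch and not kernel code: every model_* name is hypothetical, the "queue" is reduced to a single pointer, and the race window is replayed sequentially instead of with real threads. It only shows why making the disk visible before the queue is initialized is a problem, and why registering the gendisk later closes the window.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are kernel types. */
struct model_queue {
	int *ctx;		/* stands in for the software queue context */
};

struct model_disk {
	struct model_queue queue;
	int visible;		/* nonzero once other tasks can open the disk */
};

static int model_ctx_storage;

/* Stands in for dm_mq_init_request_queue(): makes the queue usable. */
static void model_init_queue(struct model_disk *d)
{
	d->queue.ctx = &model_ctx_storage;
}

/* Stands in for registering the disk so that mount can find it. */
static void model_register(struct model_disk *d)
{
	d->visible = 1;
}

/*
 * Stands in for the submit path. Returns 0 on success, -1 if the disk
 * is not visible yet, and -2 where the real kernel would hit the NULL
 * pointer dereference.
 */
static int model_submit(const struct model_disk *d)
{
	if (!d->visible)
		return -1;
	if (d->queue.ctx == NULL)
		return -2;
	return 0;
}

/* Ordering before the fix: register first, initialize the queue later. */
static int model_racy_order(void)
{
	struct model_disk d = { { NULL }, 0 };
	int ret;

	model_register(&d);
	ret = model_submit(&d);		/* t2 runs inside the window */
	model_init_queue(&d);
	return ret;
}

/* Ordering after the fix: initialize the queue, then register. */
static int model_fixed_order(void)
{
	struct model_disk d = { { NULL }, 0 };
	int early, late;

	early = model_submit(&d);	/* t2 cannot find the disk yet */
	model_init_queue(&d);
	model_register(&d);
	late = model_submit(&d);
	return (early == -1 && late == 0) ? 0 : -2;
}

/* Checks both orderings of the model. */
static void model_check_orderings(void)
{
	assert(model_racy_order() == -2);
	assert(model_fixed_order() == 0);
}
```

Calling model_check_orderings() from a small main() exercises both orderings. In the racy order the submit lands while ctx is still NULL; in the fixed order an early submit simply cannot reach the disk.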

Patch 3 is the actual fix; patches 1 and 2 are prerequisites needed to backport patch 3.

Please note that there are lots of conflicts between 5.10 and mainline,
and I made plenty of adaptations in these patches.

I have already tested this patchset with the dmtest create/remove tests:

dmtest run --suite thin-provisioning -t /Creation\Deletion/

Christoph Hellwig (3):
  block: look up holders by bdev
  block: support delayed holder registration
  dm: delay registering the gendisk

 block/genhd.c             |  13 +++++
 drivers/md/dm.c           |  24 +++++----
 fs/block_dev.c            | 105 +++++++++++++++++++++++++++-----------
 include/linux/blk_types.h |   3 --
 include/linux/genhd.h     |   9 +++-
 5 files changed, 110 insertions(+), 44 deletions(-)