Message ID | 20170413100110.GB5964@ming.t460p (mailing list archive)
---|---
State | New, archived
On Thu, Apr 13, 2017 at 06:02:21PM +0800, Ming Lei wrote:
> Could you try the following patch to see if it fixes your issue?
Sure, just have a short lunch break and then I'll report back.
On Thu, Apr 13, 2017 at 06:02:21PM +0800, Ming Lei wrote:
> On Thu, Apr 13, 2017 at 10:06:29AM +0200, Johannes Thumshirn wrote:
> > Doing a mkfs.btrfs on a (qemu emulated) PCIe NVMe causes a kernel panic
> > in nvme_setup_prps() because the dma_len will drop below zero but the
> > length not.
>
> Looks I can't reproduce the issue in QEMU(32G nvme, either partitioned
> or not, just use 'mkfs.btrfs /dev/nvme0n1p1'), could you share the exact
> mkfs command line and size of your emulated NVMe?

The exact cmdline is mkfs.btrfs -f /dev/nvme0n1p1 (-f because there was an
existing btrfs on the image). The image is 17179869184 (a.k.a. 16G) bytes.

[...]

> Could you try the following patch to see if it fixes your issue?

It's back to the old, erratic behaviour, see log below.

>
> ---
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 7548f332121a..65d1510681c6 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -1659,16 +1659,28 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
>   * and the 1st bvec in the 2nd bio can be handled in one segment.
>   */
>  static inline bool bios_segs_mergeable(struct request_queue *q,
> -		struct bio *prev, struct bio_vec *prev_last_bv,
> +		struct bio *prev, struct bio *next,
> +		struct bio_vec *prev_last_bv,
>  		struct bio_vec *next_first_bv)
>  {
>  	if (!BIOVEC_PHYS_MERGEABLE(prev_last_bv, next_first_bv))
>  		return false;
>  	if (!BIOVEC_SEG_BOUNDARY(q, prev_last_bv, next_first_bv))
>  		return false;
> -	if (prev->bi_seg_back_size + next_first_bv->bv_len >
> +	if (prev->bi_seg_back_size + next->bi_seg_front_size >
>  			queue_max_segment_size(q))
>  		return false;
> +
> +	/*
> +	 * if 'next' has multiple segments, we need to make
> +	 * sure the merged segment from 'pb' and the 1st segment
> +	 * of 'next' ends at aligned virt boundary.
> +	 */
> +	if ((next->bi_seg_front_size < next->bi_iter.bi_size) &&
> +	    ((prev_last_bv->bv_offset + prev_last_bv->bv_len +
> +	      next->bi_seg_front_size) & queue_virt_boundary(q)))
> +		return false;
> +
>  	return true;
>  }
>
> @@ -1681,7 +1693,7 @@ static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
>  		bio_get_last_bvec(prev, &pb);
>  		bio_get_first_bvec(next, &nb);
>
> -		if (!bios_segs_mergeable(q, prev, &pb, &nb))
> +		if (!bios_segs_mergeable(q, prev, next, &pb, &nb))
>  			return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
>  	}

dracut:/# [ 1.211567] tsc: Refined TSC clocksource calibration: 2297.338 MHz
[ 1.212601] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x211d6274d86, max_idle_ns: 440795243673 ns
dracut:/# modprobe btrfs
[ 8.139179] raid6_pq: module verification failed: signature and/or required key missing - tainting kernel
[ 8.207509] raid6: sse2x1 gen() 6827 MB/s
[ 8.275512] raid6: sse2x1 xor() 5654 MB/s
[ 8.343507] raid6: sse2x2 gen() 11573 MB/s
[ 8.411503] raid6: sse2x2 xor() 8826 MB/s
[ 8.479504] raid6: sse2x4 gen() 14794 MB/s
[ 8.547504] raid6: sse2x4 xor() 10618 MB/s
[ 8.547830] raid6: using algorithm sse2x4 gen() 14794 MB/s
[ 8.548218] raid6: .... xor() 10618 MB/s, rmw enabled
[ 8.548558] raid6: using intx1 recovery algorithm
[ 8.549341] xor: measuring software checksum speed
[ 8.587533] prefetch64-sse: 15090.000 MB/sec
[ 8.627553] generic_sse: 13530.000 MB/sec
[ 8.627945] xor: using function: prefetch64-sse (15090.000 MB/sec)
[ 8.633795] Btrfs loaded, crc32c=crc32c-generic, assert=on
dracut:/# modprobe nvme
[ 12.348762] nvme nvme0: pci function 0000:00:04.0
[ 12.386300] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
dracut:/# [ 12.391707] nvme0n1: p1
dracut:/# mkfs.b[ 36.553376] random: fast init done
tr
dracut:/# mkfs.btrfs -f /dev/nvme0n1p1
btrfs-progs v4.5.3+20160729
See http://btrfs.wiki.kernel.org for more information.

Detected a SSD, turning off metadata duplication. Mkfs with -m dup if you want to force metadata duplication.
[ 46.696671] ------------[ cut here ]------------
[ 46.697338] kernel BUG at drivers/nvme/host/pci.c:494!
[ 46.697806] invalid opcode: 0000 [#1] SMP
[ 46.698175] Modules linked in: nvme(E) nvme_core(E) btrfs(E) xor(E) raid6_pq(E)
[ 46.698879] CPU: 1 PID: 18 Comm: kworker/1:0H Tainted: G E 4.11.0-rc6-default+ #43
[ 46.699686] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[ 46.700737] Workqueue: kblockd blk_mq_run_work_fn
[ 46.701169] task: ffff88007bd24540 task.stack: ffffc900003bc000
[ 46.701709] RIP: 0010:nvme_queue_rq+0x85d/0x886 [nvme]
[ 46.702185] RSP: 0018:ffffc900003bfc78 EFLAGS: 00010286
[ 46.702670] RAX: 0000000000000078 RBX: 0000000000001000 RCX: 000000007f625000
[ 46.703318] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
[ 46.703968] RBP: ffffc900003bfd50 R08: 000000000013ee00 R09: 0000000000001000
[ 46.704624] R10: ffff88007f1ed000 R11: ffff88007f220000 R12: ffff88007f1ed000
[ 46.705276] R13: 00000000fffffe00 R14: 0000000000000010 R15: 000000000012fe00
[ 46.705927] FS: 0000000000000000(0000) GS:ffff88007ea80000(0000) knlGS:0000000000000000
[ 46.706673] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 46.707199] CR2: 00007ffdd1294000 CR3: 000000007b742000 CR4: 00000000000006e0
[ 46.707846] Call Trace:
[ 46.708082]  blk_mq_dispatch_rq_list+0x2a0/0x3d0
[ 46.708510]  blk_mq_sched_dispatch_requests+0x138/0x160
[ 46.708991]  __blk_mq_run_hw_queue+0x8c/0xa0
[ 46.709407]  blk_mq_run_work_fn+0x12/0x20
[ 46.709781]  process_one_work+0x153/0x400
[ 46.710152]  worker_thread+0x12b/0x4b0
[ 46.711698]  kthread+0x109/0x140
[ 46.712013]  ? rescuer_thread+0x340/0x340
[ 46.712391]  ? kthread_park+0x90/0x90
[ 46.712741]  ret_from_fork+0x2c/0x40
[ 46.713081] Code: 01 00 48 8b 40 10 48 89 45 a8 49 8b 87 70 01 00 00 48 89 45 b0 0f 84 3e fa ff ff 49 8b 87 88 01 00 00 48 89 45 a0 e9 2e fa ff ff <0f> 0b 4c 8b 0d e2 35 8f e1 eb 80 0f 0b 4c 89 ef c6 07 00 0f 1f
[ 46.714861] RIP: nvme_queue_rq+0x85d/0x886 [nvme] RSP: ffffc900003bfc78
[ 46.715810] ---[ end trace 280a594163a124fb ]---
[ 46.796265] ------------[ cut here ]------------

Thanks,
	Johannes
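
As a quick illustration of the arithmetic behind the new check in the patch quoted above (which only fires when 'next' spans more than one segment, i.e. bi_seg_front_size < bi_iter.bi_size), here is a small user-space sketch. The struct, the 0xfff mask (modelling a 4k virt boundary such as NVMe's PRP alignment) and the offsets/lengths are made-up example values, not the real block-layer types or data from this report.

/*
 * Sketch of the virt-boundary condition added to bios_segs_mergeable():
 * the merged front segment (prev's last bvec plus next's front segment)
 * must end on the queue's virt boundary when more of 'next' follows.
 * ex_bvec is a simplified stand-in for struct bio_vec.
 */
#include <stdbool.h>
#include <stdio.h>

struct ex_bvec {
	unsigned int bv_offset;
	unsigned int bv_len;
};

static bool front_seg_ends_aligned(const struct ex_bvec *prev_last,
				   unsigned int next_front_size,
				   unsigned long virt_boundary_mask)
{
	unsigned long end = prev_last->bv_offset + prev_last->bv_len +
			    next_front_size;

	/* same shape as the patch: a non-zero masked end means "don't merge" */
	return (end & virt_boundary_mask) == 0;
}

int main(void)
{
	const unsigned long mask = 0xfff;	/* assumed 4k virt boundary */
	const struct ex_bvec ends_aligned = { .bv_offset = 0x200, .bv_len = 0xe00 };
	const struct ex_bvec ends_midpage = { .bv_offset = 0x200, .bv_len = 0xc00 };

	/* 0x200 + 0xe00 + 0x1000 = 0x2000: ends on the boundary */
	printf("0xe00 bvec at offset 0x200 + 0x1000 front: %s\n",
	       front_seg_ends_aligned(&ends_aligned, 0x1000, mask) ? "mergeable" : "reject");

	/* 0x200 + 0xc00 + 0x1000 = 0x1e00: ends mid-page */
	printf("0xc00 bvec at offset 0x200 + 0x1000 front: %s\n",
	       front_seg_ends_aligned(&ends_midpage, 0x1000, mask) ? "mergeable" : "reject");

	return 0;
}

With these numbers, the first bvec lets the merged front segment end exactly on the 4k boundary, while the second would leave it ending mid-page, which is the situation the added check rejects.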
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 7548f332121a..65d1510681c6 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1659,16 +1659,28 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
  * and the 1st bvec in the 2nd bio can be handled in one segment.
  */
 static inline bool bios_segs_mergeable(struct request_queue *q,
-		struct bio *prev, struct bio_vec *prev_last_bv,
+		struct bio *prev, struct bio *next,
+		struct bio_vec *prev_last_bv,
 		struct bio_vec *next_first_bv)
 {
 	if (!BIOVEC_PHYS_MERGEABLE(prev_last_bv, next_first_bv))
 		return false;
 	if (!BIOVEC_SEG_BOUNDARY(q, prev_last_bv, next_first_bv))
 		return false;
-	if (prev->bi_seg_back_size + next_first_bv->bv_len >
+	if (prev->bi_seg_back_size + next->bi_seg_front_size >
 			queue_max_segment_size(q))
 		return false;
+
+	/*
+	 * if 'next' has multiple segments, we need to make
+	 * sure the merged segment from 'pb' and the 1st segment
+	 * of 'next' ends at aligned virt boundary.
+	 */
+	if ((next->bi_seg_front_size < next->bi_iter.bi_size) &&
+	    ((prev_last_bv->bv_offset + prev_last_bv->bv_len +
+	      next->bi_seg_front_size) & queue_virt_boundary(q)))
+		return false;
+
 	return true;
 }

@@ -1681,7 +1693,7 @@ static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 		bio_get_last_bvec(prev, &pb);
 		bio_get_first_bvec(next, &nb);

-		if (!bios_segs_mergeable(q, prev, &pb, &nb))
+		if (!bios_segs_mergeable(q, prev, next, &pb, &nb))
 			return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
 	}
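
For context on the BUG at drivers/nvme/host/pci.c:494 seen in the log, the following is a much-simplified user-space model of a PRP-style page walk; it is not the actual nvme_setup_prps() code, and all addresses and lengths are made-up. It only illustrates the symptom described at the top of the thread: a segment that ends mid-page while more data follows drives the per-segment counter dma_len below zero before the total length runs out.

/*
 * Simplified model of a PRP-style walk: after the (possibly partial)
 * first page, every further PRP entry is assumed to cover a whole 4k
 * page of the current segment.  A segment that ends mid-page while
 * more data follows makes dma_len go negative, the case the real
 * driver catches with a BUG_ON.  ex_seg is a stand-in for a
 * scatterlist entry; all values are invented for illustration.
 */
#include <stdio.h>

#define EX_PAGE_SIZE 0x1000LL

struct ex_seg {
	long long addr;
	long long len;
};

static const char *prp_walk(const struct ex_seg *sg, int nsegs)
{
	int i = 0;
	long long dma_addr = sg[i].addr;
	long long dma_len  = sg[i].len;
	long long offset   = dma_addr & (EX_PAGE_SIZE - 1);
	long long length   = 0;

	for (int k = 0; k < nsegs; k++)
		length += sg[k].len;

	/* first PRP entry: the remainder of the first page */
	length -= EX_PAGE_SIZE - offset;
	if (length <= 0)
		return "ok (fits in the first PRP entry)";
	dma_len -= EX_PAGE_SIZE - offset;
	if (dma_len) {
		dma_addr += EX_PAGE_SIZE - offset;
	} else {
		i++;
		dma_addr = sg[i].addr;
		dma_len  = sg[i].len;
	}

	/* every further PRP entry is expected to be a whole page */
	for (;;) {
		dma_len  -= EX_PAGE_SIZE;
		dma_addr += EX_PAGE_SIZE;
		length   -= EX_PAGE_SIZE;
		if (length <= 0)
			return "ok";
		if (dma_len > 0)
			continue;
		if (dma_len < 0)
			return "dma_len < 0 while length > 0 (the BUG_ON case)";
		i++;
		dma_addr = sg[i].addr;
		dma_len  = sg[i].len;
	}
}

int main(void)
{
	/* segments that respect a 4k virt boundary */
	const struct ex_seg good[] = {
		{ 0x100200, 0x0e00 },	/* ends exactly on a page boundary */
		{ 0x200000, 0x2000 },	/* two full pages */
		{ 0x300000, 0x0600 },	/* last segment may end short */
	};
	/* a merged first segment that ends mid-page with more data behind it */
	const struct ex_seg bad[] = {
		{ 0x100200, 0x1600 },	/* ends at offset 0x1800, mid-page */
		{ 0x300000, 0x1000 },
	};

	printf("aligned segments:   %s\n", prp_walk(good, 3));
	printf("unaligned segments: %s\n", prp_walk(bad, 2));
	return 0;
}

Built with a plain C compiler, the first segment list walks to completion while the second hits the dma_len < 0 case, which corresponds to the BUG_ON the trace above lands on.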