Message ID | 71e22257-0592-fdd3-25e5-a78ceced2ab9@sandisk.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | Mike Snitzer |
On 01/26/2017 01:47 PM, Bart Van Assche wrote:
> On 01/26/2017 11:01 AM, Jens Axboe wrote:
>> On 01/26/2017 11:59 AM, hch@lst.de wrote:
>>> On Thu, Jan 26, 2017 at 11:57:36AM -0700, Jens Axboe wrote:
>>>> It's against my for-4.11/block, which you were running under Christoph's
>>>> patches. Maybe he's using an older version? In any case, should be
>>>> pretty trivial for you to hand apply. Just ensure that .flags is set to
>>>> 0 for the common cases, and inherit 'flags' when it is passed in.
>>>
>>> No, the flush op cleanups you asked for last round create a conflict
>>> with your patch. They should be trivial to fix, though.
>>
>> Ah, makes sense. And yes, as I said, should be trivial to hand apply the
>> hunk that does fail.
>
> Hello Jens and Christoph,
>
> With the below patch applied the test got a little further but did not
> pass unfortunately. I tried to analyze the new call stack but it's not yet
> clear to me what is going on.
>
> The patch I had applied on Christoph's tree:
>
> ---
>  block/blk-mq-sched.c | 2 +-
>  block/blk-mq.c       | 6 +++---
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
> index 3bd66e50ec84..7c9318755fab 100644
> --- a/block/blk-mq-sched.c
> +++ b/block/blk-mq-sched.c
> @@ -116,7 +116,7 @@ struct request *blk_mq_sched_get_request(struct request_queue *q,
>  	ctx = blk_mq_get_ctx(q);
>  	hctx = blk_mq_map_queue(q, ctx->cpu);
>
> -	blk_mq_set_alloc_data(data, q, 0, ctx, hctx);
> +	blk_mq_set_alloc_data(data, q, data->flags, ctx, hctx);
>
>  	if (e) {
>  		data->flags |= BLK_MQ_REQ_INTERNAL;
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 83640869d9e4..6697626e5d32 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -248,7 +248,7 @@ EXPORT_SYMBOL_GPL(__blk_mq_alloc_request);
>  struct request *blk_mq_alloc_request(struct request_queue *q, int rw,
>  		unsigned int flags)
>  {
> -	struct blk_mq_alloc_data alloc_data;
> +	struct blk_mq_alloc_data alloc_data = { .flags = flags };
>  	struct request *rq;
>  	int ret;
>
> @@ -1369,7 +1369,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  {
>  	const int is_sync = op_is_sync(bio->bi_opf);
>  	const int is_flush_fua = op_is_flush(bio->bi_opf);
> -	struct blk_mq_alloc_data data;
> +	struct blk_mq_alloc_data data = { };
>  	struct request *rq;
>  	unsigned int request_count = 0, srcu_idx;
>  	struct blk_plug *plug;
> @@ -1491,7 +1491,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
>  	const int is_flush_fua = op_is_flush(bio->bi_opf);
>  	struct blk_plug *plug;
>  	unsigned int request_count = 0;
> -	struct blk_mq_alloc_data data;
> +	struct blk_mq_alloc_data data = { };
>  	struct request *rq;
>  	blk_qc_t cookie;
>  	unsigned int wb_acct;

Looks correct to me. Your call path has blk_get_request() in it, I don't have
that in my tree. Is it passing in the right mask?
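[Archive note] The patch relies on a C initialization rule: when a structure is initialized with a designated initializer, every member that is not named is zero-initialized. The following user-space sketch illustrates that rule only. `struct demo_alloc_data` and both helper functions are hypothetical stand-ins and do not match the layout of the kernel's `struct blk_mq_alloc_data`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct blk_mq_alloc_data; the real kernel
 * structure has different members. */
struct demo_alloc_data {
	void *q;
	unsigned int flags;
	void *ctx;
	void *hctx;
};

/* Mirrors the blk_mq_alloc_request() hunk: only .flags is named, so the
 * language zero-initializes every other member. */
static struct demo_alloc_data demo_init_with_flags(unsigned int flags)
{
	struct demo_alloc_data data = { .flags = flags };

	/* Documents the guarantee the patch depends on. */
	assert(data.q == NULL && data.ctx == NULL && data.hctx == NULL);
	return data;
}

/* Mirrors the two make_request hunks. The patch spells this "{ }", which
 * is a GNU C extension (standard only since C23); "{ 0 }" is the portable
 * spelling used here. */
static struct demo_alloc_data demo_init_empty(void)
{
	struct demo_alloc_data data = { 0 };

	return data;
}
```

With either form, `flags` starts from a known value, so code that later reads or ORs into `data->flags` no longer sees stack garbage.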
On Thu, 2017-01-26 at 13:54 -0700, Jens Axboe wrote:
> Your call path has blk_get_request() in it, I don't have
> that in my tree. Is it passing in the right mask?

Hello Jens,

There is only one blk_get_request() call in drivers/md/dm-mpath.c
and it looks as follows:

	clone = blk_get_request(bdev_get_queue(bdev),
				rq->cmd_flags | REQ_NOMERGE,
				GFP_ATOMIC);

Bart.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
On 01/26/2017 02:01 PM, Bart Van Assche wrote:
> On Thu, 2017-01-26 at 13:54 -0700, Jens Axboe wrote:
>> Your call path has blk_get_request() in it, I don't have
>> that in my tree. Is it passing in the right mask?
>
> Hello Jens,
>
> There is only one blk_get_request() call in drivers/md/dm-mpath.c
> and it looks as follows:
>
> clone = blk_get_request(bdev_get_queue(bdev),
> 			rq->cmd_flags | REQ_NOMERGE,
> 			GFP_ATOMIC);

Yeah, I found it in the dm patch. Looks fine to me, since
blk_mq_alloc_request() checks for __GFP_DIRECT_RECLAIM. Weird, it all
looks fine to me. Are you sure you tested with the patch? Either that,
or I'm smoking crack.
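[Archive note] The remark about `__GFP_DIRECT_RECLAIM` can be illustrated with a small user-space sketch. It assumes the check turns an allocation mask that may not sleep into a no-wait request flag such as `BLK_MQ_REQ_NOWAIT`; that mapping is an assumption here, not something quoted in this thread. All `DEMO_` names and bit values are hypothetical and simplified; the kernel defines the real masks in its own headers. The one property carried over from the kernel is that `GFP_ATOMIC` does not include the direct-reclaim bit.

```c
#include <assert.h>

/* Hypothetical, simplified bit values for illustration only. */
#define DEMO_GFP_HIGH            0x1u
#define DEMO_GFP_KSWAPD_RECLAIM  0x2u
#define DEMO_GFP_DIRECT_RECLAIM  0x4u

/* An atomic mask never allows direct reclaim (i.e. sleeping). */
#define DEMO_GFP_ATOMIC  (DEMO_GFP_HIGH | DEMO_GFP_KSWAPD_RECLAIM)
/* A sleeping mask does allow it. */
#define DEMO_GFP_KERNEL  (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)

/* Hypothetical stand-in for a "do not block" request allocation flag. */
#define DEMO_REQ_NOWAIT  0x1u

/* Translate an allocation mask into request allocation flags: callers
 * that cannot enter direct reclaim must not wait for a free tag. */
static unsigned int demo_alloc_flags(unsigned int gfp_mask)
{
	return (gfp_mask & DEMO_GFP_DIRECT_RECLAIM) ? 0u : DEMO_REQ_NOWAIT;
}
```

Under this assumption the `GFP_ATOMIC` call in dm-mpath would yield a no-wait allocation, which is why the mask "looks fine" and also why the allocation is allowed to fail instead of sleeping.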
On Thu, 2017-01-26 at 14:12 -0700, Jens Axboe wrote:
> On 01/26/2017 02:01 PM, Bart Van Assche wrote:
> > On Thu, 2017-01-26 at 13:54 -0700, Jens Axboe wrote:
> > > Your call path has blk_get_request() in it, I don't have
> > > that in my tree. Is it passing in the right mask?
> >
> > Hello Jens,
> >
> > There is only one blk_get_request() call in drivers/md/dm-mpath.c
> > and it looks as follows:
> >
> > clone = blk_get_request(bdev_get_queue(bdev),
> > 			rq->cmd_flags | REQ_NOMERGE,
> > 			GFP_ATOMIC);
>
> Yeah, I found it in the dm patch. Looks fine to me, since
> blk_mq_alloc_request() checks for __GFP_DIRECT_RECLAIM. Weird, it all
> looks fine to me. Are you sure you tested with the patch? Either that,
> or I'm smoking crack.

Hello Jens,

After I received your e-mail I noticed that there was a local modification
on the test system that was responsible for the schedule-while-atomic
complaint. Sorry for that. Anyway, I undid the merge with the v4.10-rc5
code and repeated my test. This time the following call stack appeared:

BUG: unable to handle kernel NULL pointer dereference at 000000000000005c
IP: blk_mq_sched_get_request+0x310/0x350
PGD 34bd9c067 PUD 346b37067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: dm_service_time ib_srp scsi_transport_srp
 target_core_user uio target_core_pscsi target_core_file ib_srpt
 target_core_iblock target_core_mod brd netconsole xt_CHECKSUM
 iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
 nf_nat_ipv4 nf_nat libcrc32c nf_conntrack_ipv4 nf_defrag_ipv4
 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge
 stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
 ip_tables x_tables af_packet ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
 rdma_cm configfs ib_cm iw_cm msr mlx4_ib ib_core sb_edac edac_core
 x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm_intel kvm
 irqbypass crct10dif_pclmul crc32_pclmul mlx4_core crc32c_intel
 ghash_clmulni_intel pcbc aesni_intel aes_x86_64 tg3 iTCO_wdt crypto_simd
 dcdbas iTCO_vendor_support ptp glue_helper ipmi_si cryptd ipmi_devintf
 pps_core fjes devlink ipmi_msghandler pcspkr libphy tpm_tis tpm_tis_core
 tpm button mei_me lpc_ich wmi mei mfd_core shpchp hid_generic usbhid
 mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
 fb_sys_fops ttm sr_mod drm cdrom ehci_pci ehci_hcd usbcore usb_common sg
 dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4
CPU: 0 PID: 9231 Comm: fio Not tainted 4.10.0-rc4-dbg+ #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
task: ffff88034c8c3140 task.stack: ffffc90005698000
RIP: 0010:blk_mq_sched_get_request+0x310/0x350
RSP: 0018:ffffc9000569bac8 EFLAGS: 00010246
RAX: ffff88034f430958 RBX: ffff88045ed2cef0 RCX: 0000000000000000
RDX: 000000000000001f RSI: ffff8803507bdcf8 RDI: 000000000000001f
RBP: ffffc9000569bb00 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffffc9000569bb18
R13: 000000000000c801 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f65ca054700(0000) GS:ffff88046f200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000005c CR3: 000000034b0ed000 CR4: 00000000001406f0
Call Trace:
 blk_mq_alloc_request+0x5e/0xb0
 blk_get_request+0x2f/0x110
 multipath_clone_and_map+0xcd/0x140 [dm_multipath]
 map_request+0x3c/0x290 [dm_mod]
 dm_mq_queue_rq+0x77/0x100 [dm_mod]
 blk_mq_dispatch_rq_list+0x1ff/0x320
 blk_mq_sched_dispatch_requests+0xa9/0xe0
 __blk_mq_run_hw_queue+0x122/0x1c0
 blk_mq_run_hw_queue+0x84/0x90
 blk_mq_flush_plug_list+0x39f/0x480
 blk_flush_plug_list+0xee/0x270
 blk_finish_plug+0x27/0x40
 do_io_submit+0x475/0x900
 SyS_io_submit+0xb/0x10
 entry_SYSCALL_64_fastpath+0x18/0xad
RIP: 0033:0x7f65e4d05787
RSP: 002b:00007f65ca051948 EFLAGS: 00000202 ORIG_RAX: 00000000000000d1
RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007f65e4d05787
RDX: 00007f65a404f158 RSI: 0000000000000001 RDI: 00007f65f6bfd000
RBP: 0000000000000815 R08: 0000000000000001 R09: 00007f65a404e3e0
R10: 00007f65a4040000 R11: 0000000000000202 R12: 00000000000006d0
R13: 00007f65a404e930 R14: 0000000000001000 R15: 0000000000000830
Code: 67 ff ff ff e9 80 fe ff ff 48 89 df e8 ba c4 fe ff 31 c9 e9 60 ff
 ff ff 44 89 ee 4c 89 e7 e8 c8 6d ff ff 48 89 c1 49 8b 44 24 18 <48> 63
 51 5c 48 8b 80 20 01 00 00 48 8b 80 80 00 00 00 48 89 0c
RIP: blk_mq_sched_get_request+0x310/0x350 RSP: ffffc9000569bac8
CR2: 000000000000005c

(gdb) list *(blk_mq_sched_get_request+0x310)
0xffffffff8132dcf0 is in blk_mq_sched_get_request (block/blk-mq-sched.c:136).
131				rq->rq_flags |= RQF_QUEUED;
132			} else
133				rq = __blk_mq_alloc_request(data, op);
134		} else {
135			rq = __blk_mq_alloc_request(data, op);
136			data->hctx->tags->rqs[rq->tag] = rq;
137		}
138
139		if (rq) {
140			if (!op_is_flush(op)) {

(gdb) disas blk_mq_sched_get_request
[ ... ]
   0xffffffff8132dce3 <+771>:	callq  0xffffffff81324ab0 <__blk_mq_alloc_request>
   0xffffffff8132dce8 <+776>:	mov    %rax,%rcx
   0xffffffff8132dceb <+779>:	mov    0x18(%r12),%rax
   0xffffffff8132dcf0 <+784>:	movslq 0x5c(%rcx),%rdx
[ ... ]
(gdb) print &((struct request *)0)->tag
$1 = (int *) 0x5c <irq_stack_union+92>

I think this means that rq == NULL and that a test for rq is missing after
the __blk_mq_alloc_request() call?

Bart.
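[Archive note] The reasoning in the message above has two steps: the faulting address (CR2 = 0x5c) equals the offset of `tag` inside the request structure, so the base pointer must have been NULL; and the store into `rqs[]` therefore needs a NULL test first. The user-space sketch below illustrates both steps. `struct demo_request`, `struct demo_tags`, and both functions are hypothetical; the padding is chosen only so that `tag` lands at offset 0x5c as in the gdb output, and the guard shown is one possible shape of such a test, not necessarily the change that was eventually merged.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_TAGS 4

/* Hypothetical layout: 0x5c bytes of padding put 'tag' at the offset
 * that gdb reported for the real struct request. */
struct demo_request {
	char pad[0x5c];
	int tag;
};

/* Hypothetical stand-in for the per-queue tag-to-request table. */
struct demo_tags {
	struct demo_request *rqs[DEMO_NR_TAGS];
};

/* Address the CPU reads for rq->tag: base pointer plus member offset.
 * With a NULL base this is exactly the faulting address in CR2. */
static size_t demo_tag_address(size_t rq_addr)
{
	return rq_addr + offsetof(struct demo_request, tag);
}

/* Guarded version of "tags->rqs[rq->tag] = rq": a failed allocation is
 * reported to the caller instead of being dereferenced. */
static int demo_store_rq(struct demo_tags *tags, struct demo_request *rq)
{
	if (!rq)
		return -1;

	assert(rq->tag >= 0 && rq->tag < DEMO_NR_TAGS);
	tags->rqs[rq->tag] = rq;
	return 0;
}
```

Calling `demo_tag_address(0)` gives 0x5c, matching the oops, and `demo_store_rq()` returns -1 for a NULL request rather than faulting.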
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 3bd66e50ec84..7c9318755fab 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -116,7 +116,7 @@ struct request *blk_mq_sched_get_request(struct request_queue *q,
 	ctx = blk_mq_get_ctx(q);
 	hctx = blk_mq_map_queue(q, ctx->cpu);
 
-	blk_mq_set_alloc_data(data, q, 0, ctx, hctx);
+	blk_mq_set_alloc_data(data, q, data->flags, ctx, hctx);
 
 	if (e) {
 		data->flags |= BLK_MQ_REQ_INTERNAL;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 83640869d9e4..6697626e5d32 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -248,7 +248,7 @@ EXPORT_SYMBOL_GPL(__blk_mq_alloc_request);
 struct request *blk_mq_alloc_request(struct request_queue *q, int rw,
 		unsigned int flags)
 {
-	struct blk_mq_alloc_data alloc_data;
+	struct blk_mq_alloc_data alloc_data = { .flags = flags };
 	struct request *rq;
 	int ret;
 
@@ -1369,7 +1369,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
 	const int is_sync = op_is_sync(bio->bi_opf);
 	const int is_flush_fua = op_is_flush(bio->bi_opf);
-	struct blk_mq_alloc_data data;
+	struct blk_mq_alloc_data data = { };
 	struct request *rq;
 	unsigned int request_count = 0, srcu_idx;
 	struct blk_plug *plug;
@@ -1491,7 +1491,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 	const int is_flush_fua = op_is_flush(bio->bi_opf);
 	struct blk_plug *plug;
 	unsigned int request_count = 0;
-	struct blk_mq_alloc_data data;
+	struct blk_mq_alloc_data data = { };
 	struct request *rq;
 	blk_qc_t cookie;
 	unsigned int wb_acct;