Message ID | 81bc399e-df90-099d-1b25-bdb0fda3f27c@sandisk.com (mailing list archive) |
---|---|
State | Accepted, archived |
Delegated to | Mike Snitzer |
On Tue, Nov 15 2016 at 6:35pm -0500,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> If a single-queue dm device is stacked on top of multi-queue block
> devices and map_tio_request() is called while there are no paths then
> the request will be prepared for a single-queue path. If a path is
> added after a request was prepared and before __multipath_map() is
> called, return DM_MAPIO_REQUEUE such that it gets unprepared and
> reprepared as a blk-mq request.

This patch makes little sense to me.  There isn't a scenario that I'm
aware of that would allow the request_queue to transition between old
.request_fn and new blk-mq.

The dm-table code should prevent this.

Mike

> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 7559537..6b20349 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -541,6 +541,7 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
>  	size_t nr_bytes = clone ? blk_rq_bytes(clone) : blk_rq_bytes(rq);
>  	struct pgpath *pgpath;
>  	struct block_device *bdev;
> +	struct request_queue *q;
>  	struct dm_mpath_io *mpio;
>  
>  	/* Do we need to select a new pgpath? */
> @@ -558,6 +559,18 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
>  		return r;
>  	}
>  
> +	bdev = pgpath->path.dev->bdev;
> +	q = bdev_get_queue(bdev);
> +
> +	/*
> +	 * When a request is prepared if there are no paths it may happen that
> +	 * the request was prepared for a single-queue path and that a
> +	 * multiqueue path is added before __multipath_map() is called. If
> +	 * that happens requeue to trigger unprepare and reprepare.
> +	 */
> +	if ((clone && q->mq_ops) || (!clone && !q->mq_ops))
> +		return r;
> +
>  	mpio = set_mpio(m, map_context);
>  	if (!mpio)
>  		/* ENOMEM, requeue */
> @@ -566,22 +579,20 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
>  	mpio->pgpath = pgpath;
>  	mpio->nr_bytes = nr_bytes;
>  
> -	bdev = pgpath->path.dev->bdev;
> -
>  	if (clone) {
>  		/*
>  		 * Old request-based interface: allocated clone is passed in.
>  		 * Used by: .request_fn stacked on .request_fn path(s).
>  		 */
> -		clone->q = bdev_get_queue(bdev);
> +		clone->q = q;
>  	} else {
>  		/*
>  		 * blk-mq request-based interface; used by both:
>  		 * .request_fn stacked on blk-mq path(s) and
>  		 * blk-mq stacked on blk-mq path(s).
>  		 */
> -		clone = blk_mq_alloc_request(bdev_get_queue(bdev),
> -					     rq_data_dir(rq), BLK_MQ_REQ_NOWAIT);
> +		clone = blk_mq_alloc_request(q, rq_data_dir(rq),
> +					     BLK_MQ_REQ_NOWAIT);
>  		if (IS_ERR(clone)) {
>  			/* EBUSY, ENODEV or EWOULDBLOCK; requeue */
>  			clear_request_fn_mpio(m, map_context);
> --
> 2.10.1
On 11/15/2016 04:37 PM, Mike Snitzer wrote:
> On Tue, Nov 15 2016 at 6:35pm -0500,
> Bart Van Assche <bart.vanassche@sandisk.com> wrote:
>
>> If a single-queue dm device is stacked on top of multi-queue block
>> devices and map_tio_request() is called while there are no paths then
>> the request will be prepared for a single-queue path. If a path is
>> added after a request was prepared and before __multipath_map() is
>> called, return DM_MAPIO_REQUEUE such that it gets unprepared and
>> reprepared as a blk-mq request.
>
> This patch makes little sense to me.  There isn't a scenario that I'm
> aware of that would allow the request_queue to transition between old
> .request_fn and new blk-mq.
>
> The dm-table code should prevent this.

Hello Mike,

Are you aware that dm_table_determine_type() sets "all_blk_mq" to false
if there are no paths, even if the dm device is in blk-mq mode?

Bart.
On Tue, Nov 15 2016 at 7:40pm -0500,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 11/15/2016 04:37 PM, Mike Snitzer wrote:
>> On Tue, Nov 15 2016 at 6:35pm -0500,
>> Bart Van Assche <bart.vanassche@sandisk.com> wrote:
>>
>>> If a single-queue dm device is stacked on top of multi-queue block
>>> devices and map_tio_request() is called while there are no paths then
>>> the request will be prepared for a single-queue path. If a path is
>>> added after a request was prepared and before __multipath_map() is
>>> called, return DM_MAPIO_REQUEUE such that it gets unprepared and
>>> reprepared as a blk-mq request.
>>
>> This patch makes little sense to me.  There isn't a scenario that I'm
>> aware of that would allow the request_queue to transition between old
>> .request_fn and new blk-mq.
>>
>> The dm-table code should prevent this.
>
> Hello Mike,
>
> Are you aware that dm_table_determine_type() sets "all_blk_mq" to
> false if there are no paths, even if the dm device is in blk-mq
> mode?

That shouldn't matter.  Once the type is established, it is used to
initialize the DM device's request_queue; the type cannot change across
different table loads.
On 11/15/2016 05:01 PM, Mike Snitzer wrote:
> On Tue, Nov 15 2016 at 7:40pm -0500,
> Bart Van Assche <bart.vanassche@sandisk.com> wrote:
>> Are you aware that dm_table_determine_type() sets "all_blk_mq" to
>> false if there are no paths, even if the dm device is in blk-mq
>> mode?
>
> That shouldn't matter.  Once the type is established, it is used to
> initialize the DM device's request_queue; the type cannot change across
> different table loads.

For a single-queue dm device, what prevents a user from removing all
single-queue paths and adding one or more blk-mq paths? I think this
will cause dm_table_determine_type() to change the table type.

Bart.
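[Ed. note: a concrete example of the reload sequence Bart has in mind,
expressed as raw dmsetup commands rather than via multipathd. The device
name, table length and path device are illustrative only, and the
multipath table line is a sketch of the usual format, not copied from a
real system:]

	# Suspend the mapped device, load a replacement table whose only
	# path is a blk-mq device (e.g. an NVMe namespace), then resume.
	dmsetup suspend mpatha
	dmsetup load mpatha --table \
	    "0 2097152 multipath 0 0 1 1 round-robin 0 1 1 /dev/nvme0n1 1000"
	dmsetup resume mpatha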
On Tue, Nov 15 2016 at 8:08pm -0500,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 11/15/2016 05:01 PM, Mike Snitzer wrote:
>> On Tue, Nov 15 2016 at 7:40pm -0500,
>> Bart Van Assche <bart.vanassche@sandisk.com> wrote:
>>> Are you aware that dm_table_determine_type() sets "all_blk_mq" to
>>> false if there are no paths, even if the dm device is in blk-mq
>>> mode?
>>
>> That shouldn't matter.  Once the type is established, it is used to
>> initialize the DM device's request_queue; the type cannot change across
>> different table loads.
>
> For a single-queue dm device, what prevents a user from removing all
> single-queue paths and adding one or more blk-mq paths? I think this
> will cause dm_table_determine_type() to change the table type.

A new table is created for every table load (any time a multipath
device is loaded into the kernel).  The DM core disallows loading a
table whose type differs from the established one; such a load is
rejected with an error.  See dm-ioctl.c:table_load().
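[Ed. note: the check Mike points at looks roughly like the following in
4.9-era drivers/md/dm-ioctl.c:table_load(). This is a paraphrased sketch
with error handling trimmed, not the verbatim source:]

	if (dm_get_md_type(md) == DM_TYPE_NONE) {
		/* initial table load: the device inherits the table's type */
		dm_set_md_type(md, dm_table_get_type(t));
	} else if (dm_get_md_type(md) != dm_table_get_type(t)) {
		/* any later load with a different type is rejected */
		DMWARN("can't change device type after initial table load.");
		r = -EINVAL;
		goto err_unlock_md_type;
	}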
On 11/15/2016 04:37 PM, Mike Snitzer wrote:
> On Tue, Nov 15 2016 at 6:35pm -0500,
> Bart Van Assche <bart.vanassche@sandisk.com> wrote:
>
>> If a single-queue dm device is stacked on top of multi-queue block
>> devices and map_tio_request() is called while there are no paths then
>> the request will be prepared for a single-queue path. If a path is
>> added after a request was prepared and before __multipath_map() is
>> called, return DM_MAPIO_REQUEUE such that it gets unprepared and
>> reprepared as a blk-mq request.
>
> This patch makes little sense to me.  There isn't a scenario that I'm
> aware of that would allow the request_queue to transition between old
> .request_fn and new blk-mq.
>
> The dm-table code should prevent this.

Hello Mike,

After having added the following code in __multipath_map() just before
the set_mpio() call:

	bdev = pgpath->path.dev->bdev;
	q = bdev_get_queue(bdev);

	if (WARN_ON_ONCE(clone && q->mq_ops) ||
	    WARN_ON_ONCE(!clone && !q->mq_ops)) {
		pr_debug("q->queue_flags = %#lx\n", q->queue_flags);
		return r;
	}

I see the following warning appear (line 544 contains
WARN_ON_ONCE(clone && q->mq_ops)):

------------[ cut here ]------------
WARNING: CPU: 2 PID: 25384 at drivers/md/dm-mpath.c:544 __multipath_map.isra.17+0x325/0x360 [dm_multipath]
Modules linked in: ib_srp scsi_transport_srp ib_srpt(O) scst_vdisk(O) scst(O) dlm libcrc32c brd dm_service_time netconsole xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet ib_ipoib rdma_ucm ib_ucm msr ib_uverbs ib_umad rdma_cm configfs ib_cm iw_cm mlx4_ib ib_core sb_edac edac_core x86_pkg_temp_thermal coretemp kvm_intel hid_generic ipmi_ssif usbhid ipmi_devintf kvm irqbypass mlx4_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel tg3 aes_x86_64 iTCO_wdt lrw gf128mul ptp dcdbas iTCO_vendor_support glue_helper pps_core ablk_helper libphy cryptd ipmi_si pcspkr mei_me fjes ipmi_msghandler mei shpchp tpm_tis tpm_tis_core lpc_ich tpm mfd_core wmi button mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm sr_mod cdrom crc32c_intel ehci_pci ehci_hcd usbcore usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua [last unloaded: brd]
CPU: 2 PID: 25384 Comm: kdmwork-254:0 Tainted: G O 4.9.0-rc6-dbg+ #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
 ffffc90002cd7d00 ffffffff81329bb5 0000000000000000 0000000000000000
 ffffc90002cd7d40 ffffffff810650e6 0000022000001000 ffff8804433a0008
 ffff88039134fc28 ffff88037e804008 ffff88039bacce98 0000000000001000
Call Trace:
 [<ffffffff81329bb5>] dump_stack+0x68/0x93
 [<ffffffff810650e6>] __warn+0xc6/0xe0
 [<ffffffff810651b8>] warn_slowpath_null+0x18/0x20
 [<ffffffffa0046125>] __multipath_map.isra.17+0x325/0x360 [dm_multipath]
 [<ffffffffa0046192>] multipath_map+0x12/0x20 [dm_multipath]
 [<ffffffffa002a356>] map_request+0x46/0x300 [dm_mod]
 [<ffffffffa002a621>] map_tio_request+0x11/0x30 [dm_mod]
 [<ffffffff8108a065>] kthread_worker_fn+0x105/0x1e0
 [<ffffffff81089f60>] ? __kthread_init_worker+0x70/0x70
 [<ffffffff81089ecb>] kthread+0xeb/0x110
 [<ffffffff81089de0>] ? kthread_park+0x60/0x60
 [<ffffffff8163fcc7>] ret_from_fork+0x27/0x40
---[ end trace b181de88e3eff2a0 ]---
dm_multipath:__multipath_map: q->queue_flags = 0x1d06a00

As one can see, neither QUEUE_FLAG_DYING nor QUEUE_FLAG_DEAD was set:

$ grep -E 'define QUEUE_FLAG[^[:blank:]]*[[:blank:]](9|11|13|14|20|22|23|24)[[:blank:]]' include/linux/blkdev.h
#define QUEUE_FLAG_SAME_COMP	 9	/* complete on same CPU-group */
#define QUEUE_FLAG_STACKABLE	11	/* supports request stacking */
#define QUEUE_FLAG_IO_STAT	13	/* do IO stats */
#define QUEUE_FLAG_DISCARD	14	/* supports DISCARD */
#define QUEUE_FLAG_INIT_DONE	20	/* queue is initialized */
#define QUEUE_FLAG_POLL		22	/* IO polling enabled if set */
#define QUEUE_FLAG_WC		23	/* Write back caching */
#define QUEUE_FLAG_FUA		24	/* device supports FUA writes */

Do you want to comment on this?

Thanks,

Bart.
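[Ed. note: for readers who want to check the decoding, the sketch below
expands 0x1d06a00 bit by bit against the QUEUE_FLAG_* values quoted
above. It is plain userspace C, independent of the kernel headers, and
prints exactly the eight flags matched by Bart's grep, confirming that
neither QUEUE_FLAG_DYING nor QUEUE_FLAG_DEAD is set:]

	#include <stdio.h>

	/* Flag bits copied from the 4.9-era blkdev.h excerpt above. */
	static const struct { int bit; const char *name; } flags[] = {
		{  9, "QUEUE_FLAG_SAME_COMP" },
		{ 11, "QUEUE_FLAG_STACKABLE" },
		{ 13, "QUEUE_FLAG_IO_STAT" },
		{ 14, "QUEUE_FLAG_DISCARD" },
		{ 20, "QUEUE_FLAG_INIT_DONE" },
		{ 22, "QUEUE_FLAG_POLL" },
		{ 23, "QUEUE_FLAG_WC" },
		{ 24, "QUEUE_FLAG_FUA" },
	};

	int main(void)
	{
		unsigned long queue_flags = 0x1d06a00; /* value from the log above */
		size_t i;

		for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++)
			if (queue_flags & (1UL << flags[i].bit))
				printf("bit %2d: %s\n", flags[i].bit, flags[i].name);
		return 0;
	}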
On Mon, Nov 21 2016 at 4:44pm -0500,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 11/15/2016 04:37 PM, Mike Snitzer wrote:
>> On Tue, Nov 15 2016 at 6:35pm -0500,
>> Bart Van Assche <bart.vanassche@sandisk.com> wrote:
>>
>>> If a single-queue dm device is stacked on top of multi-queue block
>>> devices and map_tio_request() is called while there are no paths then
>>> the request will be prepared for a single-queue path. If a path is
>>> added after a request was prepared and before __multipath_map() is
>>> called, return DM_MAPIO_REQUEUE such that it gets unprepared and
>>> reprepared as a blk-mq request.
>>
>> This patch makes little sense to me.  There isn't a scenario that I'm
>> aware of that would allow the request_queue to transition between old
>> .request_fn and new blk-mq.
>>
>> The dm-table code should prevent this.
>
> Hello Mike,
>
> After having added the following code in __multipath_map() just before
> the set_mpio() call:
>
> 	bdev = pgpath->path.dev->bdev;
> 	q = bdev_get_queue(bdev);
>
> 	if (WARN_ON_ONCE(clone && q->mq_ops) ||
> 	    WARN_ON_ONCE(!clone && !q->mq_ops)) {
> 		pr_debug("q->queue_flags = %#lx\n", q->queue_flags);
> 		return r;
> 	}
>
> I see the following warning appear (line 544 contains
> WARN_ON_ONCE(clone && q->mq_ops)):
>
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 25384 at drivers/md/dm-mpath.c:544 __multipath_map.isra.17+0x325/0x360 [dm_multipath]
> [... module list, register dump and call trace identical to the message quoted above ...]
> ---[ end trace b181de88e3eff2a0 ]---
> dm_multipath:__multipath_map: q->queue_flags = 0x1d06a00
>
> As one can see, neither QUEUE_FLAG_DYING nor QUEUE_FLAG_DEAD was set:
>
> $ grep -E 'define QUEUE_FLAG[^[:blank:]]*[[:blank:]](9|11|13|14|20|22|23|24)[[:blank:]]' include/linux/blkdev.h
> #define QUEUE_FLAG_SAME_COMP	 9	/* complete on same CPU-group */
> #define QUEUE_FLAG_STACKABLE	11	/* supports request stacking */
> #define QUEUE_FLAG_IO_STAT	13	/* do IO stats */
> #define QUEUE_FLAG_DISCARD	14	/* supports DISCARD */
> #define QUEUE_FLAG_INIT_DONE	20	/* queue is initialized */
> #define QUEUE_FLAG_POLL		22	/* IO polling enabled if set */
> #define QUEUE_FLAG_WC		23	/* Write back caching */
> #define QUEUE_FLAG_FUA		24	/* device supports FUA writes */
>
> Do you want to comment on this?

Shouldn't be possible.  The previous stacktrace you shared clearly
showed that the DM mpath request_queue was using blk-mq (dm_mq_queue_rq
was in the stack).

Whereas the stacktrace above is clearly the old .request_fn interface.

I'm unaware of how the existing code can allow this.  As I said in my
earlier mails on this: the request_queue shouldn't be able to change
from blk-mq back to .request_fn or vice-versa.

So if you think you're only testing blk-mq DM mpath on blk-mq paths,
then you need to determine how dm_old_init_request_queue() is getting
called to even set up .request_fn (dm_old_request_fn) to be used.

If the opposite is true (old .request_fn DM mpath stacked on blk-mq
paths) then determine how dm_mq_init_request_queue() is getting called.

Basically dm_setup_md_queue() should only ever be called the first time
the "multipath" target is loaded.  If that isn't the case, then you've
exposed some seriously weird bug/regression.

Mike
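[Ed. note: for context, the dispatch Mike describes looks roughly like
the following in 4.9-era drivers/md/dm.c. This is a paraphrased sketch
with error handling trimmed, not the verbatim source; once the md type
is set, exactly one of these init paths runs, and only once:]

	/* Paraphrased from 4.9-era drivers/md/dm.c:dm_setup_md_queue() */
	int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
	{
		int r = -EINVAL;

		switch (dm_get_md_type(md)) {
		case DM_TYPE_REQUEST_BASED:
			/* old .request_fn interface (dm_old_request_fn) */
			r = dm_old_init_request_queue(md);
			break;
		case DM_TYPE_MQ_REQUEST_BASED:
			/* blk-mq interface (dm_mq_queue_rq) */
			r = dm_mq_init_request_queue(md, t);
			break;
		case DM_TYPE_BIO_BASED:
			/* bio-based device */
			dm_init_normal_md_queue(md);
			r = 0;
			break;
		}

		return r;
	}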
On 11/21/2016 03:43 PM, Mike Snitzer wrote:
> Shouldn't be possible.  The previous stacktrace you shared clearly
> showed that the DM mpath request_queue was using blk-mq (dm_mq_queue_rq
> was in the stack).
>
> Whereas the stacktrace above is clearly the old .request_fn interface.
>
> I'm unaware of how the existing code can allow this.  As I said in my
> earlier mails on this: the request_queue shouldn't be able to change
> from blk-mq back to .request_fn or vice-versa.
>
> So if you think you're only testing blk-mq DM mpath on blk-mq paths,
> then you need to determine how dm_old_init_request_queue() is getting
> called to even set up .request_fn (dm_old_request_fn) to be used.
>
> If the opposite is true (old .request_fn DM mpath stacked on blk-mq
> paths) then determine how dm_mq_init_request_queue() is getting called.
>
> Basically dm_setup_md_queue() should only ever be called the first time
> the "multipath" target is loaded.  If that isn't the case, then you've
> exposed some seriously weird bug/regression.

Hello Mike,

Sorry that I had not yet mentioned this, but the test I'm running is as
follows:

	# while true; do for t in mq sq sq-on-mq; do echo ==== $t; \
	      srp-test/run_tests -s -t 02-$t; done; done

In other words, I'm alternating the mq-on-mq, sq-on-sq and sq-on-mq
tests. The above command does not log the time at which each test
started, so I'm not sure whether it is possible to determine which test
triggered the call stack I posted in my previous e-mail.

Bart.
On Mon, Nov 21 2016 at 6:57pm -0500,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 11/21/2016 03:43 PM, Mike Snitzer wrote:
>> Shouldn't be possible.  The previous stacktrace you shared clearly
>> showed that the DM mpath request_queue was using blk-mq (dm_mq_queue_rq
>> was in the stack).
>>
>> Whereas the stacktrace above is clearly the old .request_fn interface.
>>
>> I'm unaware of how the existing code can allow this.  As I said in my
>> earlier mails on this: the request_queue shouldn't be able to change
>> from blk-mq back to .request_fn or vice-versa.
>>
>> So if you think you're only testing blk-mq DM mpath on blk-mq paths,
>> then you need to determine how dm_old_init_request_queue() is getting
>> called to even set up .request_fn (dm_old_request_fn) to be used.
>>
>> If the opposite is true (old .request_fn DM mpath stacked on blk-mq
>> paths) then determine how dm_mq_init_request_queue() is getting called.
>>
>> Basically dm_setup_md_queue() should only ever be called the first time
>> the "multipath" target is loaded.  If that isn't the case, then you've
>> exposed some seriously weird bug/regression.
>
> Hello Mike,
>
> Sorry that I had not yet mentioned this, but the test I'm running is
> as follows:
>
> 	# while true; do for t in mq sq sq-on-mq; do echo ==== $t; \
> 	      srp-test/run_tests -s -t 02-$t; done; done

But your WARN_ON_ONCE(clone && q->mq_ops) will trigger with sq-on-mq.
So this would seem to be a false positive.
On 11/21/2016 04:34 PM, Mike Snitzer wrote:
> But your WARN_ON_ONCE(clone && q->mq_ops) will trigger with sq-on-mq.
Hello Mike,
This behavior is indeed triggered by the sq-on-mq test. After having
added the following code in __bind():
	if (old_map &&
	    dm_table_all_blk_mq_devices(old_map) !=
	    dm_table_all_blk_mq_devices(t))
		pr_debug("%s: old all_blk_mq %d <> new all_blk_mq %d\n",
			 dm_device_name(md),
			 dm_table_all_blk_mq_devices(old_map),
			 dm_table_all_blk_mq_devices(t));
I see the following output appear frequently in the kernel log:
dm_mod:__bind: 254:0: old all_blk_mq 1 <> new all_blk_mq 0
Could these all_blk_mq state changes explain why WARN_ON_ONCE(clone &&
q->mq_ops) is triggered in __multipath_map()? Does this mean that the
comment in patch http://marc.info/?l=dm-devel&m=147925314306752 is correct?
Thanks,
Bart.
On Tue, Nov 22 2016 at 6:47pm -0500,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 11/21/2016 04:34 PM, Mike Snitzer wrote:
>> But your WARN_ON_ONCE(clone && q->mq_ops) will trigger with sq-on-mq.
>
> Hello Mike,
>
> This behavior is indeed triggered by the sq-on-mq test. After having
> added the following code in __bind():
>
> 	if (old_map &&
> 	    dm_table_all_blk_mq_devices(old_map) !=
> 	    dm_table_all_blk_mq_devices(t))
> 		pr_debug("%s: old all_blk_mq %d <> new all_blk_mq %d\n",
> 			 dm_device_name(md),
> 			 dm_table_all_blk_mq_devices(old_map),
> 			 dm_table_all_blk_mq_devices(t));
>
> I see the following output appear frequently in the kernel log:
>
> 	dm_mod:__bind: 254:0: old all_blk_mq 1 <> new all_blk_mq 0
>
> Could these all_blk_mq state changes explain why WARN_ON_ONCE(clone
> && q->mq_ops) is triggered in __multipath_map()? Does this mean that
> the comment in patch http://marc.info/?l=dm-devel&m=147925314306752
> is correct?

Yes, looks like you're on to something.  dm_old_prep_tio() has:

	/*
	 * Must clone a request if this .request_fn DM device
	 * is stacked on .request_fn device(s).
	 */
	if (!dm_table_all_blk_mq_devices(table)) {
		...

So if your table->all_blk_mq is false (likely due to a temporary
no-paths multipath table scenario), a clone will be passed to
__multipath_map().

But what isn't clear is how __multipath_map() would then go on to have
any underlying paths available to issue IO to (given the table would
have been empty -- or so I would think).  Would be great if you could
verify that the table is in fact empty.

It should be noted that dm_table_determine_type() has:

	if (list_empty(devices) && __table_type_request_based(live_md_type)) {
		/* inherit live MD type */
		t->type = live_md_type;
		return 0;
	}

So this explains how/why an empty table will inherit
DM_TYPE_MQ_REQUEST_BASED, and it also explains why the new (empty)
table would not have ->all_blk_mq set to true.

But again, no IO is able to be issued when there are no underlying
paths.  And looking closer, __multipath_map() _should_ return early
with either DM_MAPIO_DELAY_REQUEUE or DM_MAPIO_REQUEUE when no paths
are available.  I'm not seeing how you could have this scenario where
you prepared a clone (as if the clone request were to be issued to a
.request_fn, aka "sq", device) and then, by the time you get into
__multipath_map(), there is an underlying path available with
q->mq_ops.

But regardless, what certainly needs fixing is the inconsistency of
inheriting DM_TYPE_MQ_REQUEST_BASED but not setting all_blk_mq to true
(all_blk_mq == true is implied by DM_TYPE_MQ_REQUEST_BASED).

I'm now questioning why we even need the 'all_blk_mq' state within the
table.  I'll revisit the "why?" on all that in my previous commits.
I'll likely get back with you on this point tomorrow, and will
hopefully have a fix for you.

FYI: given all this, I do think something like your 7/7 patch (which
you referenced with the above URL) would be a possible added safety net
to guard against future inconsistencies/bugs/regressions.
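[Ed. note: one plausible shape for the fix Mike alludes to, purely
illustrative and not the patch that was eventually committed, would be
to keep all_blk_mq consistent with the inherited type inside
dm_table_determine_type():]

	if (list_empty(devices) && __table_type_request_based(live_md_type)) {
		/* inherit live MD type */
		t->type = live_md_type;
		/*
		 * Illustrative addition: DM_TYPE_MQ_REQUEST_BASED implies
		 * all_blk_mq == true, so keep the flag in sync with the
		 * inherited type instead of leaving it false.
		 */
		t->all_blk_mq = (live_md_type == DM_TYPE_MQ_REQUEST_BASED);
		return 0;
	}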
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 7559537..6b20349 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -541,6 +541,7 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
 	size_t nr_bytes = clone ? blk_rq_bytes(clone) : blk_rq_bytes(rq);
 	struct pgpath *pgpath;
 	struct block_device *bdev;
+	struct request_queue *q;
 	struct dm_mpath_io *mpio;
 
 	/* Do we need to select a new pgpath? */
@@ -558,6 +559,18 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
 		return r;
 	}
 
+	bdev = pgpath->path.dev->bdev;
+	q = bdev_get_queue(bdev);
+
+	/*
+	 * When a request is prepared if there are no paths it may happen that
+	 * the request was prepared for a single-queue path and that a
+	 * multiqueue path is added before __multipath_map() is called. If
+	 * that happens requeue to trigger unprepare and reprepare.
+	 */
+	if ((clone && q->mq_ops) || (!clone && !q->mq_ops))
+		return r;
+
 	mpio = set_mpio(m, map_context);
 	if (!mpio)
 		/* ENOMEM, requeue */
@@ -566,22 +579,20 @@ static int __multipath_map(struct dm_target *ti, struct request *clone,
 	mpio->pgpath = pgpath;
 	mpio->nr_bytes = nr_bytes;
 
-	bdev = pgpath->path.dev->bdev;
-
 	if (clone) {
 		/*
 		 * Old request-based interface: allocated clone is passed in.
 		 * Used by: .request_fn stacked on .request_fn path(s).
 		 */
-		clone->q = bdev_get_queue(bdev);
+		clone->q = q;
 	} else {
 		/*
 		 * blk-mq request-based interface; used by both:
 		 * .request_fn stacked on blk-mq path(s) and
 		 * blk-mq stacked on blk-mq path(s).
 		 */
-		clone = blk_mq_alloc_request(bdev_get_queue(bdev),
-					     rq_data_dir(rq), BLK_MQ_REQ_NOWAIT);
+		clone = blk_mq_alloc_request(q, rq_data_dir(rq),
+					     BLK_MQ_REQ_NOWAIT);
 		if (IS_ERR(clone)) {
 			/* EBUSY, ENODEV or EWOULDBLOCK; requeue */
 			clear_request_fn_mpio(m, map_context);
If a single-queue dm device is stacked on top of multi-queue block
devices and map_tio_request() is called while there are no paths then
the request will be prepared for a single-queue path. If a path is
added after a request was prepared and before __multipath_map() is
called, return DM_MAPIO_REQUEUE such that it gets unprepared and
reprepared as a blk-mq request.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
---
 drivers/md/dm-mpath.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)