Message ID | 20180522235505.20937-1-bart.vanassche@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, May 22, 2018 at 04:55:05PM -0700, Bart Van Assche wrote: > This patch avoids that KASAN reports the following complaint when > running the srp-test software: I made a real attempt to get your test suite working. It would be nice if you could make an attempt at tracking it down instead of just reverting, and holding up improvements to the whole block layer, since you seem to have an easy repro for it. This is a real looming issue - we have other code which assumes things about bi_next and this isn't a practical thing to grep/audit for.
On Tue, 2018-05-22 at 21:30 -0400, Kent Overstreet wrote: > On Tue, May 22, 2018 at 04:55:05PM -0700, Bart Van Assche wrote: > > This patch avoids that KASAN reports the following complaint when > > running the srp-test software: > > I made a real attempt to get your test suite working. Please try a little harder. Everyone I asked to run that test suite has been able to run it. Bart.
On Wed, May 23, 2018 at 01:36:02AM +0000, Bart Van Assche wrote: > On Tue, 2018-05-22 at 21:30 -0400, Kent Overstreet wrote: > > On Tue, May 22, 2018 at 04:55:05PM -0700, Bart Van Assche wrote: > > > This patch avoids that KASAN reports the following complaint when > > > running the srp-test software: > > > > I made a real attempt to get your test suite working. > > Please try a little harder. Everyone I asked to run that test suite > has been able to run it. > > Bart. > > > ========= TEST srp systemd-journald[143]: Received SIGTERM from PID 1 (systemd). Removed /etc/systemd/system/multipathd.service. Starting multipath-tools (via systemctl): multipath-tools.service. multipathd> reconfigure ok multipathd> make -C discontiguous-io discontiguous-io make[1]: Entering directory '/host/home/kent/ktest/tests/srp-test/discontiguous-io' make[1]: 'discontiguous-io' is up to date. make[1]: Leaving directory '/host/home/kent/ktest/tests/srp-test/discontiguous-io' Unloaded the ib_srpt kernel module Unloaded the rdma_rxe kernel module ../run_tests: line 65: cd: /lib/modules/4.17.0-rc4+/kernel/block: No such file or directory Zero-initializing /dev/ram0 ... done Zero-initializing /dev/ram1 ... done Configured SRP target driver Running test ../tests/01 ... Unloaded the ib_srp kernel module SRP login failed Test ../tests/01 failed Running test ../tests/02-mq ... Test file I/O on top of multipath concurrently with logout and login (10 min; mq) Unloaded the ib_srp kernel module SRP login failed Test ../tests/02-mq failed Running test ../tests/02-sq ... Test file I/O on top of multipath concurrently with logout and login (10 min; sq) Unloaded the ib_srp kernel module SRP login failed Test ../tests/02-sq failed Running test ../tests/02-sq-on-mq ... Test file I/O on top of multipath concurrently with logout and login (10 min; sq-on-mq) Unloaded the ib_srp kernel module SRP login failed Test ../tests/02-sq-on-mq failed Running test ../tests/03-4M ... Test direct I/O with large transfer sizes and cmd_sg_entries=255 Unloaded the ib_srp kernel module SRP login failed Test ../tests/03-4M failed Running test ../tests/03-8M ... Test direct I/O with large transfer sizes and cmd_sg_entries=255 Unloaded the ib_srp kernel module SRP login failed Test ../tests/03-8M failed Running test ../tests/04-4M ... Test direct I/O with large transfer sizes and cmd_sg_entries=1 Unloaded the ib_srp kernel module SRP login failed Test ../tests/04-4M failed Running test ../tests/04-8M ... Test direct I/O with large transfer sizes and cmd_sg_entries=1 Unloaded the ib_srp kernel module SRP login failed Test ../tests/04-8M failed Running test ../tests/05-4M ... Test buffered I/O with large transfer sizes and cmd_sg_entries=255 Unloaded the ib_srp kernel module SRP login failed Test ../tests/05-4M failed Running test ../tests/05-8M ... Test buffered I/O with large transfer sizes and cmd_sg_entries=255 Unloaded the ib_srp kernel module SRP login failed Test ../tests/05-8M failed Running test ../tests/06 ... Test block I/O on top of multipath concurrently with logout and login (10 min) Unloaded the ib_srp kernel module SRP login failed Test ../tests/06 failed 0 tests succeeded and 11 tests failed Unloaded the ib_srpt kernel module Unloaded the rdma_rxe kernel module ========= PASSED srp in 14s May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: --------start up-------- May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: read /etc/multipath.conf May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: path checkers start up May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: fail to get serial May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: using deprecated getuid callout May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: /.../srp-test/bin/getuid_callout exited with 255 May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: failed to get callout uid: Input/output error May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdb: fail to get serial May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdb: using deprecated getuid callout May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: /.../srp-test/bin/getuid_callout exited with 255 May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdb: failed to get callout uid: Input/output error May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdc: using deprecated getuid callout May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: /.../srp-test/bin/getuid_callout exited with 255 May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdc: failed to get callout uid: Input/output error May 22 22:13:46 moria-kvm-kvm-kvm-kvm systemd[1]: Started Device-Mapper Multipath Device Controller. May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: reconfigure (operator) May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: fail to get serial May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: using deprecated getuid callout May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: /.../srp-test/bin/getuid_callout exited with 255 May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: failed to get callout uid: Input/output error May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdb: fail to get serial May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdb: using deprecated getuid callout May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: /.../srp-test/bin/getuid_callout exited with 255 May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdb: failed to get callout uid: Input/output error May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdc: using deprecated getuid callout May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: /.../srp-test/bin/getuid_callout exited with 255 May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sdc: failed to get callout uid: Input/output error
On Tue, 2018-05-22 at 22:16 -0400, Kent Overstreet wrote: > May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: fail to get serial > May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: using deprecated > getuid callout > May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: > /.../srp-test/bin/getuid_callout exited with 255 (back from traveling) Hello Kent, Please comment out the getuid_callout line from /etc/multipath.conf and try again. Changing the "/..." part into the path of the srp-test software should also help. That line is a left-over from the time when the NVMe initiator drivers supported dm-multipath. I will update the srp-test README.md to make this more clear. Thanks, Bart.
On Mon, Jun 04, 2018 at 08:59:39AM +0000, Bart Van Assche wrote: > On Tue, 2018-05-22 at 22:16 -0400, Kent Overstreet wrote: > > May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: fail to get serial > > May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: sda: using deprecated > > getuid callout > > May 22 22:13:46 moria-kvm-kvm-kvm-kvm multipathd[387]: > > /.../srp-test/bin/getuid_callout exited with 255 > > (back from traveling) > > Hello Kent, > > Please comment out the getuid_callout line from /etc/multipath.conf and try again. > Changing the "/..." part into the path of the srp-test software should also help. > That line is a left-over from the time when the NVMe initiator drivers supported > dm-multipath. I will update the srp-test README.md to make this more clear. Deleted that line and it still doesn't work (I also tried fixing that line so the path was correct, but got the same output as what I sent you earlier). With that line deleted, the output is different. test output still says SRP login failed, syslog says: Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: --------start up-------- Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: read /etc/multipath.conf Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: path checkers start up Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: sda: fail to get serial Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: sdb: fail to get serial Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd0: HDIO_GETGEO failed with 25 Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd0: failed to get udev uid: Invalid argument Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd1: HDIO_GETGEO failed with 25 Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd1: failed to get udev uid: Invalid argument Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd10: HDIO_GETGEO failed with 25 etc.
On Mon, 2018-06-04 at 18:41 -0400, Kent Overstreet wrote: > On Mon, Jun 04, 2018 at 08:59:39AM +0000, Bart Van Assche wrote: > > Please comment out the getuid_callout line from /etc/multipath.conf and try again. > > Changing the "/..." part into the path of the srp-test software should also help. > > That line is a left-over from the time when the NVMe initiator drivers supported > > dm-multipath. I will update the srp-test README.md to make this more clear. > > Deleted that line and it still doesn't work (I also tried fixing that line so > the path was correct, but got the same output as what I sent you earlier). With > that line deleted, the output is different. test output still says SRP login > failed, syslog says: > > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: --------start up-------- > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: read /etc/multipath.conf > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: path checkers start up > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: sda: fail to get serial > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: sdb: fail to get serial > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd0: HDIO_GETGEO failed with 25 > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd0: failed to get udev uid: Invalid argument > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd1: HDIO_GETGEO failed with 25 > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd1: failed to get udev uid: Invalid argument > Jun 04 18:35:31 moria-kvm-kvm-kvm-kvm-kvm multipathd[453]: nbd10: HDIO_GETGEO failed with 25 Thanks for having shared this output. In the multipathd source code I found that the error message "fail to get serial" is produced by the following code: if (get_vpd_sysfs(parent, 0x80, pp->serial, SERIAL_SIZE) <= 0) { if (get_serial(pp->serial, SERIAL_SIZE, pp->fd)) { condlog(2, "%s: fail to get serial", pp->dev); return 0; } } The message "fail to get serial" is only logged if reading /sys/class/scsi_device/*/*/*/*/vpd_pg80 fails *and* submitting a SCSI INQUIRY command through the SG_IO interface fails. This is something I haven't encountered before. Is there something special about the VM? Was e.g. multipathd configured to run as another user than root? Have some security settings been configured that prevent multipathd from accessing sysfs attributes? Was SCSI generic support disabled (CHR_DEV_SG)? Thanks, Bart.
diff --git a/block/bio.c b/block/bio.c index 0a4df92cd689..e22ebab450f8 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1777,9 +1777,6 @@ void bio_endio(struct bio *bio) if (!bio_integrity_endio(bio)) return; - if (WARN_ONCE(bio->bi_next, "driver left bi_next not NULL")) - bio->bi_next = NULL; - /* * Need to have a real endio function for chained bios, otherwise * various corner cases will break (like stacking block devices that diff --git a/block/blk-core.c b/block/blk-core.c index a216b8b137f4..4d69f91a6431 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -277,10 +277,6 @@ static void req_bio_endio(struct request *rq, struct bio *bio, bio_advance(bio, nbytes); /* don't actually finish bio if it's part of flush sequence */ - /* - * XXX this code looks suspicious - it's not consistent with advancing - * req->bio in caller - */ if (bio->bi_iter.bi_size == 0 && !(rq->rq_flags & RQF_FLUSH_SEQ)) bio_endio(bio); } @@ -3115,10 +3111,8 @@ bool blk_update_request(struct request *req, blk_status_t error, struct bio *bio = req->bio; unsigned bio_bytes = min(bio->bi_iter.bi_size, nr_bytes); - if (bio_bytes == bio->bi_iter.bi_size) { + if (bio_bytes == bio->bi_iter.bi_size) req->bio = bio->bi_next; - bio->bi_next = NULL; - } /* Completion has already been traced */ bio_clear_flag(bio, BIO_TRACE_COMPLETION);
This patch avoids that KASAN reports the following complaint when running the srp-test software: BUG: KASAN: use-after-free in bio_advance+0x110/0x1b0 Read of size 4 at addr ffff88014c7aa950 by task ksoftirqd/10/72 Call Trace: dump_stack+0x9a/0xeb print_address_description+0x65/0x270 kasan_report+0x232/0x350 bio_advance+0x110/0x1b0 blk_update_request+0x9d/0x5a0 scsi_end_request+0x4c/0x300 [scsi_mod] scsi_io_completion+0x71e/0xa40 [scsi_mod] __blk_mq_complete_request+0x13e/0x220 srp_recv_done+0x454/0x1100 [ib_srp] __ib_process_cq+0x9a/0xf0 [ib_core] ib_poll_handler+0x2d/0x90 [ib_core] irq_poll_softirq+0xe5/0x1e0 __do_softirq+0x112/0x5f0 run_ksoftirqd+0x29/0x50 smpboot_thread_fn+0x30f/0x410 kthread+0x1b2/0x1d0 ret_from_fork+0x24/0x30 This reverts commit 0ba99ca4838bc75481a4bf0e70bad20b0a5457c7. Fixes: commit 45db54d58de0 ("block: Split out bio_list_copy_data()") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> --- block/bio.c | 3 --- block/blk-core.c | 8 +------- 2 files changed, 1 insertion(+), 10 deletions(-)