Message ID | 20180815203728.19521-1-bart.vanassche@wdc.com (mailing list archive)
---|---
Series | Add NVMeOF multipath tests
On Wed, Aug 15, 2018 at 01:37:25PM -0700, Bart Van Assche wrote:
> Hello Omar,
>
> The three patches in this series add several NVMeOF multipath tests. These
> tests are similar in nature to the SRP tests. Please consider these patches
> for the upstream blktests repository.

Hi Bart,

is there a special reason this isn't in the nvme category? I don't
really see a blocker here.

Thanks,
Johannes
On Mon, 2018-08-20 at 09:31 +0200, Johannes Thumshirn wrote:
> On Wed, Aug 15, 2018 at 01:37:25PM -0700, Bart Van Assche wrote:
> > Hello Omar,
> >
> > The three patches in this series add several NVMeOF multipath tests. These
> > tests are similar in nature to the SRP tests. Please consider these patches
> > for the upstream blktests repository.
>
> Hi Bart,
>
> is there a special reason this isn't in the nvme category? I don't
> really see a blocker here.

Hello Johannes,

Moving these tests into the nvme directory is possible but will make it
harder to run the NVMeOF multipath tests separately. Are you fine with this?

Bart.
On Mon, Aug 20, 2018 at 03:46:45PM +0000, Bart Van Assche wrote:
> Moving these tests into the nvme directory is possible but will make it
> harder to run the NVMeOF multipath tests separately. Are you fine with this?

Both ways have their ups and downsides, I agree.

Having two distinct groups requires running './check nvme nvmeof-mp' to
get full coverage together with nvme.

Having it all in one group would require running './check nvme 18 19 20
21 22 23 24 ...' to get only the dm-mpath ones.

Honestly I hate both, but yours (the two distinct groups) is probably
easier to handle in the end, I have to admit.

Thanks,
Johannes
On Tue, 2018-08-21 at 08:46 +0200, Johannes Thumshirn wrote:
> On Mon, Aug 20, 2018 at 03:46:45PM +0000, Bart Van Assche wrote:
> > Moving these tests into the nvme directory is possible but will make it
> > harder to run the NVMeOF multipath tests separately. Are you fine with this?
>
> Both ways have their ups and downsides, I agree.
>
> Having two distinct groups requires running './check nvme nvmeof-mp' to
> get full coverage together with nvme.
>
> Having it all in one group would require running './check nvme 18 19 20
> 21 22 23 24 ...' to get only the dm-mpath ones.
>
> Honestly I hate both, but yours (the two distinct groups) is probably
> easier to handle in the end, I have to admit.

Omar, do you have a preference for one of the two aforementioned approaches?

Thanks,

Bart.
On Thu, Aug 23, 2018 at 01:53:33AM +0000, Bart Van Assche wrote:
> On Tue, 2018-08-21 at 08:46 +0200, Johannes Thumshirn wrote:
> > On Mon, Aug 20, 2018 at 03:46:45PM +0000, Bart Van Assche wrote:
> > > Moving these tests into the nvme directory is possible but will make it
> > > harder to run the NVMeOF multipath tests separately. Are you fine with this?
> >
> > Both ways have their ups and downsides, I agree.
> >
> > Having two distinct groups requires running './check nvme nvmeof-mp' to
> > get full coverage together with nvme.
> >
> > Having it all in one group would require running './check nvme 18 19 20
> > 21 22 23 24 ...' to get only the dm-mpath ones.
> >
> > Honestly I hate both, but yours (the two distinct groups) is probably
> > easier to handle in the end, I have to admit.
>
> Omar, do you have a preference for one of the two aforementioned approaches?
>
> Thanks,
>
> Bart.

Let's keep it in a separate category, since lots of people running nvme
tests probably aren't interested in testing multipath.

A bunch of the tests failed with

modprobe: FATAL: Module nvme is in use.

Maybe related to my test VM having an nvme device?
On Thu, Aug 23, 2018 at 05:21:33PM -0700, Omar Sandoval wrote:
> On Thu, Aug 23, 2018 at 01:53:33AM +0000, Bart Van Assche wrote:
> > On Tue, 2018-08-21 at 08:46 +0200, Johannes Thumshirn wrote:
> > > On Mon, Aug 20, 2018 at 03:46:45PM +0000, Bart Van Assche wrote:
> > > > Moving these tests into the nvme directory is possible but will make it
> > > > harder to run the NVMeOF multipath tests separately. Are you fine with this?
> > >
> > > Both ways have their ups and downsides, I agree.
> > >
> > > Having two distinct groups requires running './check nvme nvmeof-mp' to
> > > get full coverage together with nvme.
> > >
> > > Having it all in one group would require running './check nvme 18 19 20
> > > 21 22 23 24 ...' to get only the dm-mpath ones.
> > >
> > > Honestly I hate both, but yours (the two distinct groups) is probably
> > > easier to handle in the end, I have to admit.
> >
> > Omar, do you have a preference for one of the two aforementioned approaches?
> >
> > Thanks,
> >
> > Bart.
>
> Let's keep it in a separate category, since lots of people running nvme
> tests probably aren't interested in testing multipath.
>
> A bunch of the tests failed with
>
> modprobe: FATAL: Module nvme is in use.
>
> Maybe related to my test VM having an nvme device?

Ping, Bart, can you look into this? It'd be nice to get this in.
On 09/12/18 18:06, Omar Sandoval wrote:
> Ping, Bart, can you look into this? It'd be nice to get this in.
Hello Omar,
I will repost this patch series as soon as I have time. I have been
busier than usual during the past four weeks.
Best regards,
Bart.
On 8/23/18 5:21 PM, Omar Sandoval wrote:
> On Thu, Aug 23, 2018 at 01:53:33AM +0000, Bart Van Assche wrote:
>> On Tue, 2018-08-21 at 08:46 +0200, Johannes Thumshirn wrote:
>>> On Mon, Aug 20, 2018 at 03:46:45PM +0000, Bart Van Assche wrote:
>>>> Moving these tests into the nvme directory is possible but will make it
>>>> harder to run the NVMeOF multipath tests separately. Are you fine with this?
>>>
>>> Both ways have their ups and downsides, I agree.
>>>
>>> Having two distinct groups requires running './check nvme nvmeof-mp' to
>>> get full coverage together with nvme.
>>>
>>> Having it all in one group would require running './check nvme 18 19 20
>>> 21 22 23 24 ...' to get only the dm-mpath ones.
>>>
>>> Honestly I hate both, but yours (the two distinct groups) is probably
>>> easier to handle in the end, I have to admit.
>>
>> Omar, do you have a preference for one of the two aforementioned approaches?
>
> Let's keep it in a separate category, since lots of people running nvme
> tests probably aren't interested in testing multipath.
>
> A bunch of the tests failed with
>
> modprobe: FATAL: Module nvme is in use.
>
> Maybe related to my test VM having an nvme device?

Hello Omar,

Can you have a look at the updated master branch of
https://github.com/bvanassche/blktests? That code should no longer fail if
unloading the nvme kernel module fails. Please note that you will need
kernel v4.18 to test these scripts - a KASAN complaint appears if I run
these tests against kernel v4.19-rc4.

Thanks,

Bart.
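The "should no longer fail if unloading the nvme kernel module fails" behaviour described above amounts to treating a failed modprobe -r as non-fatal during test cleanup. A minimal bash sketch of that idea follows; the helper name unload_module_if_possible is hypothetical and not taken from Bart's branch:

# Hypothetical helper: try to unload a kernel module, but do not fail the
# test if the module is still in use (for example because the test VM has
# a real NVMe device keeping the nvme driver loaded).
unload_module_if_possible() {
	local mod=$1

	# Nothing to do if the module is not loaded at all.
	lsmod | grep -q "^${mod} " || return 0

	# "modprobe -r" prints "FATAL: Module ... is in use" and fails when
	# something still holds a reference; report it instead of aborting.
	if ! modprobe -r "$mod" 2>/dev/null; then
		echo "note: leaving module $mod loaded (still in use)" >&2
	fi
	return 0
}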
On Tue, Sep 18, 2018 at 02:20:59PM -0700, Bart Van Assche wrote:
> On 8/23/18 5:21 PM, Omar Sandoval wrote:
> > On Thu, Aug 23, 2018 at 01:53:33AM +0000, Bart Van Assche wrote:
> > > On Tue, 2018-08-21 at 08:46 +0200, Johannes Thumshirn wrote:
> > > > On Mon, Aug 20, 2018 at 03:46:45PM +0000, Bart Van Assche wrote:
> > > > > Moving these tests into the nvme directory is possible but will make it
> > > > > harder to run the NVMeOF multipath tests separately. Are you fine with this?
> > > >
> > > > Both ways have their ups and downsides, I agree.
> > > >
> > > > Having two distinct groups requires running './check nvme nvmeof-mp' to
> > > > get full coverage together with nvme.
> > > >
> > > > Having it all in one group would require running './check nvme 18 19 20
> > > > 21 22 23 24 ...' to get only the dm-mpath ones.
> > > >
> > > > Honestly I hate both, but yours (the two distinct groups) is probably
> > > > easier to handle in the end, I have to admit.
> > >
> > > Omar, do you have a preference for one of the two aforementioned approaches?
> > >
> > > Thanks,
> > >
> > > Bart.
> >
> > Let's keep it in a separate category, since lots of people running nvme
> > tests probably aren't interested in testing multipath.
> >
> > A bunch of the tests failed with
> >
> > modprobe: FATAL: Module nvme is in use.
> >
> > Maybe related to my test VM having an nvme device?
>
> Hello Omar,
>
> Can you have a look at the updated master branch of
> https://github.com/bvanassche/blktests? That code should no longer fail if
> unloading the nvme kernel module fails. Please note that you will need
> kernel v4.18 to test these scripts - a KASAN complaint appears if I run
> these tests against kernel v4.19-rc4.

Thanks, these pass now. Is it expected that my nvme device gets a
multipath device configured after running these tests?

$ lsblk
NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
vda      254:0    0  16G  0 disk
└─vda1   254:1    0  16G  0 part  /
vdb      254:16   0   8G  0 disk
vdc      254:32   0   8G  0 disk
vdd      254:48   0   8G  0 disk
nvme0n1  259:0    0   8G  0 disk
└─mpatha 253:0    0   8G  0 mpath

Also, can you please fix:

_have_kernel_option NVME_MULTIPATH && exit 1

to not exit on failure? It should use SKIP_REASON and return 1. You
might need to add something like _dont_have_kernel_option to properly
handle the case where the config is not found.

Side note which isn't a blocker for merging is that there's a lot of
duplicated code between these helpers and the srp helpers. How hard
would it be to refactor that?
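A helper along the lines Omar asks for could look as follows. This is a sketch based on the SKIP_REASON/return-1 convention he describes, not the code that eventually landed in blktests, and the kernel config file locations it probes are assumptions about the test host:

# Skip (rather than abort) when a kernel config option is enabled; also
# skip when the kernel configuration cannot be inspected at all.
_dont_have_kernel_option() {
	local opt=$1 config

	for config in /proc/config.gz "/boot/config-$(uname -r)"; do
		[ -e "$config" ] || continue
		if zgrep -q "^CONFIG_${opt}=[ym]$" "$config"; then
			SKIP_REASON="kernel option CONFIG_${opt} must be disabled"
			return 1
		fi
		return 0
	done

	SKIP_REASON="kernel config not found; cannot check CONFIG_${opt}"
	return 1
}

A test's requires() hook would then call '_dont_have_kernel_option NVME_MULTIPATH' instead of the '_have_kernel_option NVME_MULTIPATH && exit 1' pattern.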
On 9/18/18 4:24 PM, Omar Sandoval wrote:
> On Tue, Sep 18, 2018 at 02:20:59PM -0700, Bart Van Assche wrote:
>> Can you have a look at the updated master branch of
>> https://github.com/bvanassche/blktests? That code should no longer fail if
>> unloading the nvme kernel module fails. Please note that you will need
>> kernel v4.18 to test these scripts - a KASAN complaint appears if I run
>> these tests against kernel v4.19-rc4.
>
> Thanks, these pass now. Is it expected that my nvme device gets a
> multipath device configured after running these tests?
>
> $ lsblk
> NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
> vda      254:0    0  16G  0 disk
> └─vda1   254:1    0  16G  0 part  /
> vdb      254:16   0   8G  0 disk
> vdc      254:32   0   8G  0 disk
> vdd      254:48   0   8G  0 disk
> nvme0n1  259:0    0   8G  0 disk
> └─mpatha 253:0    0   8G  0 mpath

No, all multipath devices that were created during a test should be removed
before that test finishes. I will look into this.

> Also, can you please fix:
>
> _have_kernel_option NVME_MULTIPATH && exit 1
>
> to not exit on failure? It should use SKIP_REASON and return 1. You
> might need to add something like _dont_have_kernel_option to properly
> handle the case where the config is not found.

OK, I will change this.

> Side note which isn't a blocker for merging is that there's a lot of
> duplicated code between these helpers and the srp helpers. How hard
> would it be to refactor that?

Are you perhaps referring to the code that is shared between the
tests/srp/rc and tests/nvmeof-mp/rc shell scripts? The hardest part is
probably to choose a location where to store these functions. Should I
create a file with common code under common/, under tests/srp/, under
tests/nvmeof-mp/ or perhaps somewhere else?

Thanks,

Bart.
On Tue, Sep 18, 2018 at 05:02:47PM -0700, Bart Van Assche wrote:
> On 9/18/18 4:24 PM, Omar Sandoval wrote:
> > On Tue, Sep 18, 2018 at 02:20:59PM -0700, Bart Van Assche wrote:
> > > Can you have a look at the updated master branch of
> > > https://github.com/bvanassche/blktests? That code should no longer fail if
> > > unloading the nvme kernel module fails. Please note that you will need
> > > kernel v4.18 to test these scripts - a KASAN complaint appears if I run
> > > these tests against kernel v4.19-rc4.
> >
> > Thanks, these pass now. Is it expected that my nvme device gets a
> > multipath device configured after running these tests?
> >
> > $ lsblk
> > NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
> > vda      254:0    0  16G  0 disk
> > └─vda1   254:1    0  16G  0 part  /
> > vdb      254:16   0   8G  0 disk
> > vdc      254:32   0   8G  0 disk
> > vdd      254:48   0   8G  0 disk
> > nvme0n1  259:0    0   8G  0 disk
> > └─mpatha 253:0    0   8G  0 mpath
>
> No, all multipath devices that were created during a test should be removed
> before that test finishes. I will look into this.
>
> > Also, can you please fix:
> >
> > _have_kernel_option NVME_MULTIPATH && exit 1
> >
> > to not exit on failure? It should use SKIP_REASON and return 1. You
> > might need to add something like _dont_have_kernel_option to properly
> > handle the case where the config is not found.
>
> OK, I will change this.
>
> > Side note which isn't a blocker for merging is that there's a lot of
> > duplicated code between these helpers and the srp helpers. How hard
> > would it be to refactor that?
>
> Are you perhaps referring to the code that is shared between the
> tests/srp/rc and tests/nvmeof-mp/rc shell scripts?

Yes, those.

> The hardest part is probably to choose a location where to store these
> functions. Should I create a file with common code under common/, under
> tests/srp/, under tests/nvmeof-mp/ or perhaps somewhere else?

Just put it under common. Thanks!
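The consolidation agreed on here amounts to moving the duplicated shell functions into one file under common/ and sourcing it from both tests/srp/rc and tests/nvmeof-mp/rc. A sketch of the shape such a file could take; the file name common/multipath and the helper below are placeholders rather than the names used in the eventual patches:

# common/multipath (placeholder name): helpers shared by the srp and
# nvmeof-mp test groups.

# Example shared helper: wait until a dm device with the given map name
# shows up under /dev/mapper, since multipathd creates it asynchronously.
wait_for_mpath_dev() {
	local name=$1 i

	for ((i = 0; i < 60; i++)); do
		[ -e "/dev/mapper/$name" ] && return 0
		sleep 1
	done
	return 1
}

Each group's rc file would then source it with a line such as '. common/multipath'.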
On Tue, 2018-09-18 at 17:18 -0700, Omar Sandoval wrote:
> On Tue, Sep 18, 2018 at 05:02:47PM -0700, Bart Van Assche wrote:
> > On 9/18/18 4:24 PM, Omar Sandoval wrote:
> > > On Tue, Sep 18, 2018 at 02:20:59PM -0700, Bart Van Assche wrote:
> > > > Can you have a look at the updated master branch of
> > > > https://github.com/bvanassche/blktests? That code should no longer fail if
> > > > unloading the nvme kernel module fails. Please note that you will need
> > > > kernel v4.18 to test these scripts - a KASAN complaint appears if I run
> > > > these tests against kernel v4.19-rc4.
> > >
> > > Thanks, these pass now. Is it expected that my nvme device gets a
> > > multipath device configured after running these tests?
> > >
> > > $ lsblk
> > > NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
> > > vda      254:0    0  16G  0 disk
> > > └─vda1   254:1    0  16G  0 part  /
> > > vdb      254:16   0   8G  0 disk
> > > vdc      254:32   0   8G  0 disk
> > > vdd      254:48   0   8G  0 disk
> > > nvme0n1  259:0    0   8G  0 disk
> > > └─mpatha 253:0    0   8G  0 mpath
> >
> > No, all multipath devices that were created during a test should be removed
> > before that test finishes. I will look into this.
> >
> > > Also, can you please fix:
> > >
> > > _have_kernel_option NVME_MULTIPATH && exit 1
> > >
> > > to not exit on failure? It should use SKIP_REASON and return 1. You
> > > might need to add something like _dont_have_kernel_option to properly
> > > handle the case where the config is not found.
> >
> > OK, I will change this.
> >
> > > Side note which isn't a blocker for merging is that there's a lot of
> > > duplicated code between these helpers and the srp helpers. How hard
> > > would it be to refactor that?
> >
> > Are you perhaps referring to the code that is shared between the
> > tests/srp/rc and tests/nvmeof-mp/rc shell scripts?
>
> Yes, those.
>
> > The hardest part is probably to choose a location where to store these
> > functions. Should I create a file with common code under common/, under
> > tests/srp/, under tests/nvmeof-mp/ or perhaps somewhere else?
>
> Just put it under common.

Hi Omar,

All feedback mentioned above has been addressed. The following pull request has
been updated: https://github.com/osandov/blktests/pull/33. Please let me know
if you want me to post these patches on the linux-block mailing list.

Note: neither the upstream kernel v4.18 nor v4.19-rc4 is stable enough to pass
all nvmeof-mp tests if kernel debugging options like KASAN are enabled.
Additionally, the NVMe device_add_disk() race condition often causes multipathd
to refuse to consider /dev/nvme... devices. The output on my test setup is as
follows (all tests pass):

# ./check -q nvmeof-mp
nvmeof-mp/001 (Log in and log out) [passed]
    runtime 1.528s ... 1.909s
nvmeof-mp/002 (File I/O on top of multipath concurrently with logout and login (mq)) [passed]
    runtime 38.968s ... 38.571s
nvmeof-mp/004 (File I/O on top of multipath concurrently with logout and login (sq-on-mq)) [passed]
    runtime 38.632s ... 37.529s
nvmeof-mp/005 (Direct I/O with large transfer sizes and bs=4M) [passed]
    runtime 13.382s ... 13.684s
nvmeof-mp/006 (Direct I/O with large transfer sizes and bs=8M) [passed]
    runtime 13.511s ... 13.480s
nvmeof-mp/009 (Buffered I/O with large transfer sizes and bs=4M) [passed]
    runtime 13.665s ... 13.763s
nvmeof-mp/010 (Buffered I/O with large transfer sizes and bs=8M) [passed]
    runtime 13.442s ... 13.900s
nvmeof-mp/011 (Block I/O on top of multipath concurrently with logout and login) [passed]
    runtime 37.988s ... 37.945s
nvmeof-mp/012 (dm-mpath on top of multiple I/O schedulers) [passed]
    runtime 21.659s ... 21.733s

Thanks,

Bart.
On Thu, Sep 27, 2018 at 04:26:42PM -0700, Bart Van Assche wrote:
> On Tue, 2018-09-18 at 17:18 -0700, Omar Sandoval wrote:
> > On Tue, Sep 18, 2018 at 05:02:47PM -0700, Bart Van Assche wrote:
> > > On 9/18/18 4:24 PM, Omar Sandoval wrote:
> > > > On Tue, Sep 18, 2018 at 02:20:59PM -0700, Bart Van Assche wrote:
> > > > > Can you have a look at the updated master branch of
> > > > > https://github.com/bvanassche/blktests? That code should no longer fail if
> > > > > unloading the nvme kernel module fails. Please note that you will need
> > > > > kernel v4.18 to test these scripts - a KASAN complaint appears if I run
> > > > > these tests against kernel v4.19-rc4.
> > > >
> > > > Thanks, these pass now. Is it expected that my nvme device gets a
> > > > multipath device configured after running these tests?
> > > >
> > > > $ lsblk
> > > > NAME     MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
> > > > vda      254:0    0  16G  0 disk
> > > > └─vda1   254:1    0  16G  0 part  /
> > > > vdb      254:16   0   8G  0 disk
> > > > vdc      254:32   0   8G  0 disk
> > > > vdd      254:48   0   8G  0 disk
> > > > nvme0n1  259:0    0   8G  0 disk
> > > > └─mpatha 253:0    0   8G  0 mpath
> > >
> > > No, all multipath devices that were created during a test should be removed
> > > before that test finishes. I will look into this.
> > >
> > > > Also, can you please fix:
> > > >
> > > > _have_kernel_option NVME_MULTIPATH && exit 1
> > > >
> > > > to not exit on failure? It should use SKIP_REASON and return 1. You
> > > > might need to add something like _dont_have_kernel_option to properly
> > > > handle the case where the config is not found.
> > >
> > > OK, I will change this.
> > >
> > > > Side note which isn't a blocker for merging is that there's a lot of
> > > > duplicated code between these helpers and the srp helpers. How hard
> > > > would it be to refactor that?
> > >
> > > Are you perhaps referring to the code that is shared between the
> > > tests/srp/rc and tests/nvmeof-mp/rc shell scripts?
> >
> > Yes, those.
> >
> > > The hardest part is probably to choose a location where to store these
> > > functions. Should I create a file with common code under common/, under
> > > tests/srp/, under tests/nvmeof-mp/ or perhaps somewhere else?
> >
> > Just put it under common.
>
> Hi Omar,
>
> All feedback mentioned above has been addressed. The following pull request has
> been updated: https://github.com/osandov/blktests/pull/33. Please let me know
> if you want me to post these patches on the linux-block mailing list.
>
> Note: neither the upstream kernel v4.18 nor v4.19-rc4 is stable enough to pass
> all nvmeof-mp tests if kernel debugging options like KASAN are enabled.
> Additionally, the NVMe device_add_disk() race condition often causes multipathd
> to refuse to consider /dev/nvme... devices. The output on my test setup is as
> follows (all tests pass):
>
> # ./check -q nvmeof-mp
> nvmeof-mp/001 (Log in and log out) [passed]
>     runtime 1.528s ... 1.909s
> nvmeof-mp/002 (File I/O on top of multipath concurrently with logout and login (mq)) [passed]
>     runtime 38.968s ... 38.571s
> nvmeof-mp/004 (File I/O on top of multipath concurrently with logout and login (sq-on-mq)) [passed]
>     runtime 38.632s ... 37.529s
> nvmeof-mp/005 (Direct I/O with large transfer sizes and bs=4M) [passed]
>     runtime 13.382s ... 13.684s
> nvmeof-mp/006 (Direct I/O with large transfer sizes and bs=8M) [passed]
>     runtime 13.511s ... 13.480s
> nvmeof-mp/009 (Buffered I/O with large transfer sizes and bs=4M) [passed]
>     runtime 13.665s ... 13.763s
> nvmeof-mp/010 (Buffered I/O with large transfer sizes and bs=8M) [passed]
>     runtime 13.442s ... 13.900s
> nvmeof-mp/011 (Block I/O on top of multipath concurrently with logout and login) [passed]
>     runtime 37.988s ... 37.945s
> nvmeof-mp/012 (dm-mpath on top of multiple I/O schedulers) [passed]
>     runtime 21.659s ... 21.733s

Thanks, Bart, merged.