Message ID | 149067383236.19079.12786890634892425193.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive)
---|---
State | Accepted
Commit | 780733009b85
Dan Williams <dan.j.williams@intel.com> writes:

> +check_min_kver()
> +{
> +	local ver="$1"
> +	: "${KVER:=$(uname -r)}"
> +
> +	[ -n "$ver" ] || return 1
> +	[[ "$ver" == "$(echo -e "$ver\n$KVER" | sort -V | head -1)" ]]
> +}
> +
> +check_min_kver "4.11" || { echo "kernel $KVER may lack latest device-dax fixes"; exit $rc; }

Can we stop with this kernel version checking, please?  Test to see if
you can create a device dax instance.  If not, skip the test.  If so,
and if you have a kernel that isn't fixed, so be it, you'll get
failures.

[..]
> +	if [ "$align" = "1g" ]; then
> +		bs="1g"
> +	else
> +		bs="2m"
> +	fi

I'm not sure the blocksize even matters.

Cheers,
Jeff
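For illustration, the probe Jeff describes might look like the sketch
below. This is hypothetical code reusing the patch's own $NDCTL and
$dev variables, not something from the patch or the thread:

    # Probe: skip the test (automake treats exit code 77 as SKIP) if
    # the kernel cannot create a device-dax namespace at all.
    if ! $NDCTL create-namespace -m dax -f -e "$dev" > /dev/null 2>&1; then
            echo "cannot create device-dax namespace, skipping"
            exit 77
    fi
    # revert to raw mode so the test proper starts from a clean slate
    $NDCTL create-namespace -m raw -f -e "$dev" > /dev/null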
On Wed, Mar 29, 2017 at 1:02 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Dan Williams <dan.j.williams@intel.com> writes:
>
>> +check_min_kver "4.11" || { echo "kernel $KVER may lack latest device-dax fixes"; exit $rc; }
>
> Can we stop with this kernel version checking, please?  Test to see if
> you can create a device dax instance.  If not, skip the test.  If so,
> and if you have a kernel that isn't fixed, so be it, you'll get
> failures.

I'd rather not. It helps me keep track of what went in where. If you
want to run all the tests on a random kernel just do:

    KVER="4.11.0" make check

[..]
>> +	if [ "$align" = "1g" ]; then
>> +		bs="1g"
>> +	else
>> +		bs="2m"
>> +	fi
>
> I'm not sure the blocksize even matters.

iirc it affects the alignment of the mmap() request. So for example
bs=4k should fail when the alignment is larger.
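To make that concrete, a negative test for this behavior might look
like the sketch below, written as a heredoc in the patch's own style.
The device path is illustrative, and the expected failure rests on
Dan's recollection of how the dev-dax engine sizes its mmap(), which
is not verified here:

    # hypothetical negative test: drive 4k accesses at a namespace
    # created with -a 2m; if the dev-dax engine really mmaps at bs
    # granularity, fio should error out on this job
    cat > misaligned.job <<- EOF
    	[global]
    	ioengine=dev-dax
    	filename=/dev/dax0.0
    	bs=4k

    	[misaligned]
    	rw=read
    	runtime=5
    EOF
    fio misaligned.job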
Dan Williams <dan.j.williams@intel.com> writes:

> I'd rather not. It helps me keep track of what went in where. If you
> want to run all the tests on a random kernel just do:
>
>     KVER="4.11.0" make check

This, of course, breaks completely with distro kernels.  You don't see
this kind of checking in xfstests, for example.  git helps you keep
track of what changes went in where (see git describe --contains).  It's
weird to track that via your test harness.  So, I would definitely
prefer to move to a model where we check for features instead of kernel
versions.

[..]
>> I'm not sure the blocksize even matters.
>
> iirc it affects the alignment of the mmap() request. So for example
> bs=4k should fail when the alignment is larger.

I think iomem_align is a better hint for that, no?

-Jeff
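For reference, the git workflow Jeff alludes to, shown here against one
of the fix commits cited in this patch's changelog (the output line is
hypothetical; the real tag depends on the tree you run it in):

    # map a commit to the first tag that contains it
    $ git describe --contains 0134ed4fb9e7
    v4.11-rc4~<n>    # hypothetical output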
On Wed, Mar 29, 2017 at 1:19 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Dan Williams <dan.j.williams@intel.com> writes:
[..]
>> I'd rather not. It helps me keep track of what went in where. If you
>> want to run all the tests on a random kernel just do:
>>
>>     KVER="4.11.0" make check
>
> This, of course, breaks completely with distro kernels.

Why does this break distro kernels? The KVER variable overrides "uname -r"

> You don't see this kind of checking in xfstests, for example.  git
> helps you keep track of what changes went in where (see git describe
> --contains).  It's weird to track that via your test harness.  So, I
> would definitely prefer to move to a model where we check for features
> instead of kernel versions.

I see this as a deficiency of xfstests. We have had to go through and
qualify and track each xfstest and why it may fail with random
combinations of kernel, xfsprogs, or e2fsprogs versions. I'd much
prefer that upstream xfstests track the minimum versions of components
to make a given test pass so we can stop doing it out of tree.

[..]
> I think iomem_align is a better hint for that, no?

Certainly sounds better, I'll check it out.
Dan Williams <dan.j.williams@intel.com> writes:
[..]
>>> KVER="4.11.0" make check
>>
>> This, of course, breaks completely with distro kernels.
>
> Why does this break distro kernels? The KVER variable overrides "uname -r"

Because some features may not exist in the distro kernels.  It's the
same problem you outline with xfstests, below.

>> You don't see this kind of checking in xfstests, for example. [..] So, I
>> would definitely prefer to move to a model where we check for features
>> instead of kernel versions.
>
> I see this as a deficiency of xfstests. We have had to go through and
> qualify and track each xfstest and why it may fail with random
> combinations of kernel, xfsprogs, or e2fsprogs versions. I'd much
> prefer that upstream xfstests track the minimum versions of components
> to make a given test pass so we can stop doing it out of tree.

Yes, that's a good point.  I can't think of a good compromise, either.

-Jeff
On Wed, Mar 29, 2017 at 1:49 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Dan Williams <dan.j.williams@intel.com> writes:
[..]
>> Why does this break distro kernels? The KVER variable overrides "uname -r"
>
> Because some features may not exist in the distro kernels.  It's the
> same problem you outline with xfstests, below.

Right, but you started off with suggesting that just running the test
and failing was ok, and that's the behavior you get with KVER=.

[..]
>> I see this as a deficiency of xfstests. We have had to go through and
>> qualify and track each xfstest and why it may fail with random
>> combinations of kernel, xfsprogs, or e2fsprogs versions. I'd much
>> prefer that upstream xfstests track the minimum versions of components
>> to make a given test pass so we can stop doing it out of tree.
>
> Yes, that's a good point.  I can't think of a good compromise, either.

Maybe we can at least get our annotated blacklist upstream so other
people can start contributing to it?
Dan Williams <dan.j.williams@intel.com> writes:
[..]
> Right, but you started off with suggesting that just running the test
> and failing was ok, and that's the behavior you get with KVER=.

Well, the goal was to be somewhat smart about it, by not even trying to
test things that aren't implemented or configured into the current
kernel (such as device dax, for example).  Upon thinking about it
further, I don't think that gets us very far.  However, that does raise
a use case that is not distro-specific.  If you don't enable device dax,
your test suite will still try to run those tests. ;-)

> Maybe we can at least get our annotated blacklist upstream so other
> people can start contributing to it?

Are you referring to xfstests?  Yeah, that's a good idea.  Right now I
just add tests to the dangerous group as I encounter known issues. ;-)
So, my list probably isn't very helpful in its current form.

-Jeff
On Wed, Mar 29, 2017 at 2:04 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Dan Williams <dan.j.williams@intel.com> writes:
[..]
> Well, the goal was to be somewhat smart about it, by not even trying to
> test things that aren't implemented or configured into the current
> kernel (such as device dax, for example).  Upon thinking about it
> further, I don't think that gets us very far.  However, that does raise
> a use case that is not distro-specific.  If you don't enable device dax,
> your test suite will still try to run those tests. ;-)

The other part of the equation is that I'm lazy and don't want to do
the extra work of validating the environment for each test. So just do
a quick version check and if the test fails you get to figure out what
configuration you failed to enable. The most common case being that
you failed to install the nfit_test modules in which case we do have a
capability check for that.

[..]
>> Maybe we can at least get our annotated blacklist upstream so other
>> people can start contributing to it?
>
> Are you referring to xfstests?  Yeah, that's a good idea.  Right now I
> just add tests to the dangerous group as I encounter known issues. ;-)
> So, my list probably isn't very helpful in its current form.

Yes, xfstests. We have entries in our blacklist like this:

# needs xfsprogs fix
# c8dc42356142 xfs_db: fix the 'source' command when passed as a -c option
# Last checked:
# - xfsprogs-4.9.0-1.fc25.x86_64
xfs/138

# needs xfsprogs fix
# 3297e0caa25a xfs: replace xfs_mode_to_ftype table with switch statement
# Last checked:
# - xfsprogs-4.9.0-1.fc25.x86_64
xfs/348

# see "[BUG] generic/232 hangs on XFS DAX mount" thread on xfs mailing
# list
generic/232

# failed on emulated pmem without dax, may be impacted by the same fix
# as the one for generic/270. The generic/270 failure is tracked in this
# thread on the xfs mailing list: "XFS kernel BUG during generic/270
# with v4.10". Re-test on v4.11 with fa7f138 ("xfs: clear delalloc and
# cache on buffered write failure")
generic/269
generic/270
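The nfit_test capability check Dan mentions amounts to something like
the sketch below (the module name comes from the kernel's
tools/testing/nvdimm tree; the exact check ndctl ships may differ):

    # skip unless the emulated-NVDIMM test modules are installed;
    # modinfo succeeds only if the module is available to this kernel
    if ! modinfo nfit_test > /dev/null 2>&1; then
            echo "nfit_test modules not available, skipping"
            exit 77
    fi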
Ccing Eryu

On Wed, Mar 29, 2017 at 02:12:25PM -0700, Dan Williams wrote:
[..]
> > Are you referring to xfstests?  Yeah, that's a good idea.  Right now I
> > just add tests to the dangerous group as I encounter known issues. ;-)
> > So, my list probably isn't very helpful in its current form.
>
> Yes, xfstests. We have entries in our blacklist like this:

Just a thought.

How about:
0, Adding infrastructure to teach xfstests querying known issues
   before reporting pass or fail.
1, Formatted known issue files. xfstests may maintain some in tree,
   while people can have their own in hand.

Questions:
0, Tracking components, like kernel, xfstests, e2fsprogs, xfsprogs,
   fio, etc
1, Tracking versions of components, different distributions,
   upstream versions.
2, Versions of xfstests itself. We don't have many tags now,
   however if we start to do this, it will not be an issue.
3, Tracking architectures/platforms, x86_64, ppc64, etc
4, Is this matrix too big ? :)
5, Is this failure now the same failure that I've got from
   the known issues?

Thanks,
Xiong
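One possible shape for such a formatted known-issue entry, extending the
blacklist format Dan posted with explicit fields; this is purely a
strawman, no such format exists in xfstests today:

    # Component: xfsprogs
    # Fixed-by: c8dc42356142 ("xfs_db: fix the 'source' command when
    #           passed as a -c option")
    # Arch: any
    # Last-checked: xfsprogs-4.9.0-1.fc25.x86_64
    xfs/138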
On Thu, Mar 30, 2017 at 02:16:00PM +0800, Xiong Zhou wrote:
> Ccing Eryu
[..]
> Just a thought.
>
> How about:
> 0, Adding infrastructure to teach xfstests querying known issues
>    before reporting pass or fail.
> 1, Formatted known issue files. xfstests may maintain some in tree,
>    while people can have their own in hand.

I've posted similar RFC patch to fstests before.

[RFC PATCH] fstests: add known issue support
https://www.spinics.net/lists/fstests/msg02344.html

And Dave rejected it:
https://www.spinics.net/lists/fstests/msg02345.html

So the short answer is: xfstests should only control whether a test
should be run or not, the known issues information should be maintained
outside the test harness.

And I tend to agree with Dave now :)

> Questions:
> 0, Tracking components, like kernel, xfstests, e2fsprogs, xfsprogs,
>    fio, etc
> 1, Tracking versions of components, different distributions,
>    upstream versions.
> 2, Versions of xfstests itself. We don't have many tags now,
>    however if we start to do this, it will not be an issue.
> 3, Tracking architectures/platforms, x86_64, ppc64, etc
> 4, Is this matrix too big ? :)
> 5, Is this failure now the same failure that I've got from
>    the known issues?

I know the results handling of xfstests needs more work, and the
progress is slow, but we do have some processes, e.g. configurable
$RESULTDIR, recent tools from Eric to compare failures across runs
(tools/compare-failures). I've been thinking about Dave's suggestions
for a long time, but due to the complexity of known issues (see above
questions :) and other tasks/jobs, I'm not able to work out a solution
yet. I'll look into it again, thanks for the reminder :)

Thanks,
Eryu
On Wed, Mar 29, 2017 at 05:04:31PM -0400, Jeff Moyer wrote:
> Well, the goal was to be somewhat smart about it, by not even trying to
> test things that aren't implemented or configured into the current
> kernel (such as device dax, for example).  Upon thinking about it
> further, I don't think that gets us very far.  However, that does raise
> a use case that is not distro-specific.  If you don't enable device dax,
> your test suite will still try to run those tests. ;-)

I think I already suggested this some time ago, but again, can't we make
a sysfs/debugfs/ioctl/whatever query to see what "features" are
available and check this instead of the kernel version in the test
suite? IIRC drm does similar things. Quoting from [1]:

"Have a clear way for userspace to figure out whether your new ioctl or
ioctl extension is supported on a given kernel. If you can't rely on old
kernels rejecting the new flags/modes or ioctls (since doing that was
botched in the past) then you need a driver feature flag or revision
number somewhere."

[1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html

Byte,
	Johannes
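A minimal sysfs-based feature query of the kind Johannes describes,
sketched for device-dax; this assumes the presence of the dax class/bus
directory is a reliable indicator, which may vary by kernel version:

    # feature check instead of version check: is device-dax available?
    if [ ! -d /sys/class/dax ] && [ ! -d /sys/bus/dax ]; then
            echo "kernel lacks device-dax support"
            exit 77
    fi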
On 03/29/2017 04:30 PM, Dan Williams wrote:
> On Wed, Mar 29, 2017 at 1:19 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
[..]
>>> KVER="4.11.0" make check
>>
>> This, of course, breaks completely with distro kernels.
>
> Why does this break distro kernels? The KVER variable overrides "uname -r"

FYI - dax-errors.sh doesn't look at KVER.

-- ljk
On Thu, Mar 30, 2017 at 9:08 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
> On 03/29/2017 04:30 PM, Dan Williams wrote:
[..]
> FYI - dax-errors.sh doesn't look at KVER.

Patches welcome :).
On 3/30/2017 12:56 PM, Dan Williams wrote:
> On Thu, Mar 30, 2017 at 9:08 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
>> On 03/29/2017 04:30 PM, Dan Williams wrote:
[..]
>> FYI - dax-errors.sh doesn't look at KVER.
>
> Patches welcome :).

You won't like my patch for that because I agree with Jeff. :-)

Right now I'm more interested in seeing if I can modify the tests to not
require nfit_test. I've only looked at btt-check.sh but so far, it
doesn't look that hard.

-- ljk
On Thu, Mar 30, 2017 at 10:06 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
> On 3/30/2017 12:56 PM, Dan Williams wrote:
[..]
>> Patches welcome :).
>
> You won't like my patch for that because I agree with Jeff. :-)
>
> Right now I'm more interested in seeing if I can modify the tests to not
> require nfit_test. I've only looked at btt-check.sh but so far, it
> doesn't look that hard.

The point of nfit_test is that you can run them without worrying about
risks to real data. So I don't want to see patches moving existing
nfit_test tests to something else.
On 03/30/2017 01:12 PM, Dan Williams wrote:
> On Thu, Mar 30, 2017 at 10:06 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
[..]
>> Right now I'm more interested in seeing if I can modify the tests to not
>> require nfit_test. I've only looked at btt-check.sh but so far, it
>> doesn't look that hard.
>
> The point of nfit_test is that you can run them without worrying about
> risks to real data. So I don't want to see patches moving existing
> nfit_test tests to something else.

I'd like to test on an unmodified kernel using real hardware with a real
nfit. As long as it's clear that the test needs a scratch device, why is
that bad?

Maybe other tests are more difficult but the btt-check test looks pretty
straightforward.

-- ljk
On Thu, Mar 30, 2017 at 10:19 AM, Linda Knippers <linda.knippers@hpe.com> wrote:
> On 03/30/2017 01:12 PM, Dan Williams wrote:
[..]
> I'd like to test on an unmodified kernel using real hardware with a real
> nfit. As long as it's clear that the test needs a scratch device, why is
> that bad?
>
> Maybe other tests are more difficult but the btt-check test looks pretty
> straightforward.

Sure, but I don't see a need to carry that in upstream ndctl. The goal
with nfit_test is to be able to do unit checkout of the libnvdimm
sub-system without any platform dependencies.
On Thu, Mar 30, 2017 at 1:09 AM, Eryu Guan <eguan@redhat.com> wrote:
> On Thu, Mar 30, 2017 at 02:16:00PM +0800, Xiong Zhou wrote:
>> Just a thought.
>>
>> How about:
>> 0, Adding infrastructure to teach xfstests querying known issues
>>    before reporting pass or fail.
>> 1, Formatted known issue files. xfstests may maintain some in tree,
>>    while people can have their own in hand.
>
> I've posted similar RFC patch to fstests before.
>
> [RFC PATCH] fstests: add known issue support
> https://www.spinics.net/lists/fstests/msg02344.html
>
> And Dave rejected it:
> https://www.spinics.net/lists/fstests/msg02345.html
>
> So the short answer is: xfstests should only control whether a test
> should be run or not, the known issues information should be maintained
> outside the test harness.
>
> And I tend to agree with Dave now :)

I read that thread as agreeing with me :). We should maintain a custom
expunge list per the environment. How about a Fedora xfstests package
that keeps an expunge list up to date relative to the latest available
versions of the kernel+utilities in the distribution?
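In practice such a packaged expunge list would plug into the existing
options of the check script; a sketch, assuming xfstests' -E flag for
an external exclude file and a hypothetical package path:

    # run the auto group, skipping known failures shipped by the distro
    cd /var/lib/xfstests
    ./check -E /usr/share/xfstests/expunge.fedora -g auto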
diff --git a/test/Makefile.am b/test/Makefile.am
index 98f444231306..969fe055b35e 100644
--- a/test/Makefile.am
+++ b/test/Makefile.am
@@ -26,6 +26,7 @@ TESTS +=\
 	dax-dev \
 	dax.sh \
 	device-dax \
+	device-dax-fio.sh \
 	mmap.sh
 
 check_PROGRAMS +=\
diff --git a/test/device-dax-fio.sh b/test/device-dax-fio.sh
new file mode 100755
index 000000000000..ab620b67027f
--- /dev/null
+++ b/test/device-dax-fio.sh
@@ -0,0 +1,74 @@
+#!/bin/bash
+NDCTL="../ndctl/ndctl"
+rc=77
+
+set -e
+
+err() {
+	echo "test/device-dax-fio.sh: failed at line $1"
+	exit $rc
+}
+
+check_min_kver()
+{
+	local ver="$1"
+	: "${KVER:=$(uname -r)}"
+
+	[ -n "$ver" ] || return 1
+	[[ "$ver" == "$(echo -e "$ver\n$KVER" | sort -V | head -1)" ]]
+}
+
+check_min_kver "4.11" || { echo "kernel $KVER may lack latest device-dax fixes"; exit $rc; }
+
+set -e
+trap 'err $LINENO' ERR
+
+if ! fio --enghelp | grep -q "dev-dax"; then
+	echo "fio lacks dev-dax engine"
+	exit 77
+fi
+
+dev=$(./dax-dev)
+for align in 4k 2m 1g
+do
+	json=$($NDCTL create-namespace -m dax -a $align -f -e $dev)
+	chardev=$(echo $json | jq -r ". | select(.mode == \"dax\") | .daxregion.devices[0].chardev")
+	if [ "$align" = "1g" ]; then
+		bs="1g"
+	else
+		bs="2m"
+	fi
+
+	cat > fio.job <<- EOF
+		[global]
+		ioengine=dev-dax
+		direct=0
+		filename=/dev/${chardev}
+		verify=crc32c
+		bs=${bs}
+
+		[write]
+		rw=write
+		runtime=5
+
+		[read]
+		stonewall
+		rw=read
+		runtime=5
+	EOF
+
+	rc=1
+	fio fio.job 2>&1 | tee fio.log
+
+	if grep -q "fio.*got signal" fio.log; then
+		echo "test/device-dax-fio.sh: failed with align: $align"
+		exit 1
+	fi
+
+	# revert namespace to raw mode
+	json=$($NDCTL create-namespace -m raw -f -e $dev)
+	mode=$(echo $json | jq -r ".mode")
+	[ $mode != "memory" ] && echo "fail: $LINENO" && exit 1
+done
+
+exit 0
Jeff found that device-dax was broken with respect to falling back to
smaller fault granularities. Now that the fixes are upstream ([1], [2]),
add a test to backstop against future regressions and validate
backports.

Note that kernels without device-dax pud-mapping support will always
fail the alignment == 1GiB test. Kernels with the broken fallback
handling will fail the first alignment == 4KiB test.

The test requires an fio binary with support for the "dev-dax" ioengine.

[1]: commit 70b085b06c45 ("device-dax: fix pud fault fallback handling")
[2]: commit 0134ed4fb9e7 ("device-dax: fix pmd/pte fault fallback handling")

Cc: Dave Jiang <dave.jiang@intel.com>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 test/Makefile.am       |    1 +
 test/device-dax-fio.sh |   74 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+)
 create mode 100755 test/device-dax-fio.sh