Message ID | 1476615222-7804-3-git-send-email-amir73il@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sun, Oct 16, 2016 at 01:53:42PM +0300, Amir Goldstein wrote: > Try to run xfs_io for tests in group quick with command line > option -M which starts an idle thread before performing any io. > > The purpose of this idle thread is to test io from a multi threaded > process. With single threaded process, the file table is not shared > and file structs are not reference counted. > > In order to improve the chance of detecting file struct reference > leaks, we should run xfs_io commands with this option as much as > possible. > > Analysis of the effect of xfs_io -M on tests runtime showed that > it may lead to slightly longer run times in extreme cases (e.g +3s > for generic/132), but has a negligable effect on runtime of tests > among the 'quick' group (worst case +0.3s for generic/130). > > Therefore, we automatically add the -M flags only to tests in the > 'quick' group. > > Signed-off-by: Amir Goldstein <amir73il@gmail.com> > --- > check | 2 ++ > common/rc | 15 +++++++++++++++ > 2 files changed, 17 insertions(+) > > diff --git a/check b/check > index 69341d8..e568598 100755 > --- a/check > +++ b/check > @@ -574,6 +574,8 @@ for section in $HOST_OPTIONS_SECTIONS; do > > mkdir -p $RESULT_DIR > > + export TEST_GROUPS=`grep $(basename $seqnum) "$SRC_DIR/$(dirname $seqnum)/group"` > + Do we really want to go down this path of changing behaviours of utilities based on what group they belong to? It means we can no longer look at the test code and recreate the command line without having to work out what group context the test is running under. And how do we tell that it's correct and we don't inadvertantly break it? I ask this because..... > echo -n "$seqnum" > > if $showme; then > diff --git a/common/rc b/common/rc > index a838750..7c478cf 100644 > --- a/common/rc > +++ b/common/rc > @@ -3799,6 +3799,21 @@ init_rc() > $XFS_IO_PROG -c stat $TEST_DIR 2>&1 | grep -q "is not on an XFS filesystem" && \ > export XFS_IO_PROG="$XFS_IO_PROG -F" > > + if echo $TEST_GROUPS | grep -q quick; then > + # xfs_io -M flag runs xfs_io as multi threaded process > + # in order to catch fdget/fdset reference leaks, because > + # file structs are not reference counted in a single threaded > + # process. > + # Because reference counted fdget/fdset may lead to slightly > + # longer run times in extreme cases (such as generic/132), > + # we limit the use of -M flags to tests with short runtime, > + # where the effect of the flag is negligable. > + # > + # Figure out if xfs_io supports the -M option > + $XFS_IO_PROG -M -c quit 2>/dev/null && \ > + export XFS_IO_PROG="$XFS_IO_PROG -M" > + fi > + .... I can't see how this works - init_rc is called and sets $XFS_IO_PROG long before TEST_GROUPS is set and exported. How did you verify that the correct xfs_io command lines were being issued? IMO, either turn it on for everything if it is supported, or make it an explicit command line option to enable. A couple of seconds of extra runtime here and there means nothing for a typical auto test run which can take hours to run.... Cheers, Dave.
On Mon, Oct 17, 2016 at 12:46 AM, Dave Chinner <david@fromorbit.com> wrote: > On Sun, Oct 16, 2016 at 01:53:42PM +0300, Amir Goldstein wrote: >> Try to run xfs_io for tests in group quick with command line >> option -M which starts an idle thread before performing any io. >> >> The purpose of this idle thread is to test io from a multi threaded >> process. With single threaded process, the file table is not shared >> and file structs are not reference counted. >> >> In order to improve the chance of detecting file struct reference >> leaks, we should run xfs_io commands with this option as much as >> possible. >> >> Analysis of the effect of xfs_io -M on tests runtime showed that >> it may lead to slightly longer run times in extreme cases (e.g +3s >> for generic/132), but has a negligable effect on runtime of tests >> among the 'quick' group (worst case +0.3s for generic/130). >> >> Therefore, we automatically add the -M flags only to tests in the >> 'quick' group. >> >> Signed-off-by: Amir Goldstein <amir73il@gmail.com> >> --- >> check | 2 ++ >> common/rc | 15 +++++++++++++++ >> 2 files changed, 17 insertions(+) >> >> diff --git a/check b/check >> index 69341d8..e568598 100755 >> --- a/check >> +++ b/check >> @@ -574,6 +574,8 @@ for section in $HOST_OPTIONS_SECTIONS; do >> >> mkdir -p $RESULT_DIR >> >> + export TEST_GROUPS=`grep $(basename $seqnum) "$SRC_DIR/$(dirname $seqnum)/group"` >> + > > Do we really want to go down this path of changing behaviours > of utilities based on what group they belong to? It means we can no > longer look at the test code and recreate the command line without > having to work out what group context the test is running under. And > how do we tell that it's correct and we don't inadvertantly break > it? I ask this because..... > >> echo -n "$seqnum" >> >> if $showme; then >> diff --git a/common/rc b/common/rc >> index a838750..7c478cf 100644 >> --- a/common/rc >> +++ b/common/rc >> @@ -3799,6 +3799,21 @@ init_rc() >> $XFS_IO_PROG -c stat $TEST_DIR 2>&1 | grep -q "is not on an XFS filesystem" && \ >> export XFS_IO_PROG="$XFS_IO_PROG -F" >> >> + if echo $TEST_GROUPS | grep -q quick; then >> + # xfs_io -M flag runs xfs_io as multi threaded process >> + # in order to catch fdget/fdset reference leaks, because >> + # file structs are not reference counted in a single threaded >> + # process. >> + # Because reference counted fdget/fdset may lead to slightly >> + # longer run times in extreme cases (such as generic/132), >> + # we limit the use of -M flags to tests with short runtime, >> + # where the effect of the flag is negligable. >> + # >> + # Figure out if xfs_io supports the -M option >> + $XFS_IO_PROG -M -c quit 2>/dev/null && \ >> + export XFS_IO_PROG="$XFS_IO_PROG -M" >> + fi >> + > > .... I can't see how this works - init_rc is > called and sets $XFS_IO_PROG long before TEST_GROUPS is set and > exported. Correct, but common/rc is then included again from every single test, so XFS_IO_PROG is adjusted to add the -M/i flag per test after TEST_GROUPS is set for a specific test in ./check. I saw the -F flag code and figured that's a good place to add -M/i as well. With a hunch, I would say that if -F where to be added, it where to be added twice. > How did you verify that the correct xfs_io command lines > were being issued? > Both by adding debug print and by monitoring ps > IMO, either turn it on for everything if it is supported, or make it > an explicit command line option to enable. If I make the default off, then not enough people will test for reference leaks. Since leaks may be hard to find, because they may only be on rare error path, the motivation for a test suit IMO is to enforce this option as much as possible. OTOH, if everyone, always run with -i, then the fast path fdget is never tested with xfs_io. fast path will get tested with other tools used by tests though, so maybe that is not such a big concern. > A couple of seconds of > extra runtime here and there means nothing for a typical auto test > run which can take hours to run.... > In that case, I think we should turn the flag always if supported. After all, even it tests take longer to run that's exactly the sort of penalty that should be acceptable to pay for running with extra debugging options enabled. As an extra, I suggested we could turn the flag off for people running the test with non debug kernel, or say, without CONFIG_PROVE_LOCKING. If there are people doing that, and I imagine there are, they may care more about test runtime and less about detecting leaks? > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Oct 17, 2016 at 10:01 AM, Amir Goldstein <amir73il@gmail.com> wrote: > On Mon, Oct 17, 2016 at 12:46 AM, Dave Chinner <david@fromorbit.com> wrote: >> On Sun, Oct 16, 2016 at 01:53:42PM +0300, Amir Goldstein wrote: >>> Try to run xfs_io for tests in group quick with command line >>> option -M which starts an idle thread before performing any io. >>> >>> The purpose of this idle thread is to test io from a multi threaded >>> process. With single threaded process, the file table is not shared >>> and file structs are not reference counted. >>> >>> In order to improve the chance of detecting file struct reference >>> leaks, we should run xfs_io commands with this option as much as >>> possible. >>> >>> Analysis of the effect of xfs_io -M on tests runtime showed that >>> it may lead to slightly longer run times in extreme cases (e.g +3s >>> for generic/132), but has a negligable effect on runtime of tests >>> among the 'quick' group (worst case +0.3s for generic/130). >>> >>> Therefore, we automatically add the -M flags only to tests in the >>> 'quick' group. >>> >>> Signed-off-by: Amir Goldstein <amir73il@gmail.com> >>> --- ... >> A couple of seconds of >> extra runtime here and there means nothing for a typical auto test >> run which can take hours to run.... >> > Just to follow up on the extra runtime. I did one auto test run at 1h42m without -i flag and then auto test run at 1h45m with -i flag I did not repeat several runs to reduce noise. If anyone thinks that difference is too expensive please shout. Amir. -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Oct 17, 2016 at 10:01:19AM +0300, Amir Goldstein wrote: > On Mon, Oct 17, 2016 at 12:46 AM, Dave Chinner <david@fromorbit.com> wrote: > > On Sun, Oct 16, 2016 at 01:53:42PM +0300, Amir Goldstein wrote: > >> Try to run xfs_io for tests in group quick with command line > >> option -M which starts an idle thread before performing any io. > >> > >> The purpose of this idle thread is to test io from a multi threaded > >> process. With single threaded process, the file table is not shared > >> and file structs are not reference counted. > >> > >> In order to improve the chance of detecting file struct reference > >> leaks, we should run xfs_io commands with this option as much as > >> possible. > >> > >> Analysis of the effect of xfs_io -M on tests runtime showed that > >> it may lead to slightly longer run times in extreme cases (e.g +3s > >> for generic/132), but has a negligable effect on runtime of tests > >> among the 'quick' group (worst case +0.3s for generic/130). > >> > >> Therefore, we automatically add the -M flags only to tests in the > >> 'quick' group. > >> > >> Signed-off-by: Amir Goldstein <amir73il@gmail.com> > >> --- > >> check | 2 ++ > >> common/rc | 15 +++++++++++++++ > >> 2 files changed, 17 insertions(+) > >> > >> diff --git a/check b/check > >> index 69341d8..e568598 100755 > >> --- a/check > >> +++ b/check > >> @@ -574,6 +574,8 @@ for section in $HOST_OPTIONS_SECTIONS; do > >> > >> mkdir -p $RESULT_DIR > >> > >> + export TEST_GROUPS=`grep $(basename $seqnum) "$SRC_DIR/$(dirname $seqnum)/group"` > >> + > > > > Do we really want to go down this path of changing behaviours > > of utilities based on what group they belong to? It means we can no > > longer look at the test code and recreate the command line without > > having to work out what group context the test is running under. And > > how do we tell that it's correct and we don't inadvertantly break > > it? I ask this because..... > > > >> echo -n "$seqnum" > >> > >> if $showme; then > >> diff --git a/common/rc b/common/rc > >> index a838750..7c478cf 100644 > >> --- a/common/rc > >> +++ b/common/rc > >> @@ -3799,6 +3799,21 @@ init_rc() > >> $XFS_IO_PROG -c stat $TEST_DIR 2>&1 | grep -q "is not on an XFS filesystem" && \ > >> export XFS_IO_PROG="$XFS_IO_PROG -F" > >> > >> + if echo $TEST_GROUPS | grep -q quick; then > >> + # xfs_io -M flag runs xfs_io as multi threaded process > >> + # in order to catch fdget/fdset reference leaks, because > >> + # file structs are not reference counted in a single threaded > >> + # process. > >> + # Because reference counted fdget/fdset may lead to slightly > >> + # longer run times in extreme cases (such as generic/132), > >> + # we limit the use of -M flags to tests with short runtime, > >> + # where the effect of the flag is negligable. > >> + # > >> + # Figure out if xfs_io supports the -M option > >> + $XFS_IO_PROG -M -c quit 2>/dev/null && \ > >> + export XFS_IO_PROG="$XFS_IO_PROG -M" > >> + fi > >> + > > > > .... I can't see how this works - init_rc is > > called and sets $XFS_IO_PROG long before TEST_GROUPS is set and > > exported. > > Correct, but common/rc is then included again from every single test, > so XFS_IO_PROG is adjusted to add the -M/i flag per test after > TEST_GROUPS is set for a specific test in ./check. > I saw the -F flag code and figured that's a good place to add -M/i as well. > With a hunch, I would say that if -F where to be added, it where to be > added twice. > > > How did you verify that the correct xfs_io command lines > > were being issued? > > > > Both by adding debug print and by monitoring ps > > > IMO, either turn it on for everything if it is supported, or make it > > an explicit command line option to enable. > > If I make the default off, then not enough people will test for reference leaks. > Since leaks may be hard to find, because they may only be on rare error path, > the motivation for a test suit IMO is to enforce this option as much > as possible. > > OTOH, if everyone, always run with -i, then the fast path fdget is > never tested with > xfs_io. fast path will get tested with other tools used by tests > though, so maybe > that is not such a big concern. [Sorry for the late response, I'm travelling this week and the network connection is not as good as I expected.] I'm still having concerns about losing test coverage by enabling "-i" by default. How about adding an command line option to disable it? So at least we could have a way to turn it off. Or is it completely impossible to lose any test coverage? Thanks, Eryu -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 20, 2016 at 5:25 PM, Eryu Guan <eguan@redhat.com> wrote: > On Mon, Oct 17, 2016 at 10:01:19AM +0300, Amir Goldstein wrote: ... > > I'm still having concerns about losing test coverage by enabling "-i" by > default. How about adding an command line option to disable it? I could take the easy way and add a command line option to satisfy concerns, but we all know that 99% of the time, nobody is going to use it, so we will be just silencing out concerns instead of addressing them. > So at > least we could have a way to turn it off. Or is it completely impossible > to lose any test coverage? I would not argue that it is completely impossible to loose test coverage but I would argue that there is low probability of loosing *interesting* test coverage. Here is the argument I am making: A file reference leak can be anywhere, (e.g in EXDEV error path of clone file range ioctl) so we SHOULD have -i test coverage for as many APIs as possible with as many arguments as possible. OTOH, the difference between slowpath and fastpath of fdget()/fdput() for single/multi threaded process is quite invariant to the specific API, so I do not see the value in test coverage of fastpath to all APIs. Anyway, I'd be happy to add a command line option or env var if that is the preference of folks, but then I would really appreciate a suggestion for option name or var name, because it is going to be quite hard to find an intuitive name for this test variant. Thanks, Amir. -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 20, 2016 at 09:27:03PM +0300, Amir Goldstein wrote: > On Thu, Oct 20, 2016 at 5:25 PM, Eryu Guan <eguan@redhat.com> wrote: > > On Mon, Oct 17, 2016 at 10:01:19AM +0300, Amir Goldstein wrote: > > ... > > > > > I'm still having concerns about losing test coverage by enabling "-i" by > > default. How about adding an command line option to disable it? > > I could take the easy way and add a command line option to satisfy > concerns, but we all know that 99% of the time, nobody is going to use it, > so we will be just silencing out concerns instead of addressing them. > > > So at > > least we could have a way to turn it off. Or is it completely impossible > > to lose any test coverage? > > I would not argue that it is completely impossible to loose test coverage > but I would argue that there is low probability of loosing *interesting* test > coverage. > > Here is the argument I am making: > A file reference leak can be anywhere, (e.g in EXDEV error path of clone > file range ioctl) so we SHOULD have -i test coverage for as many APIs > as possible with as many arguments as possible. > OTOH, the difference between slowpath and fastpath of fdget()/fdput() > for single/multi threaded process is quite invariant to the specific API, > so I do not see the value in test coverage of fastpath to all APIs. Then I'll take it as it is, and we can always add a new option to turn it off when we find it necessary :) Thanks! Eryu -- To unsubscribe from this list: send the line "unsubscribe fstests" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/check b/check index 69341d8..e568598 100755 --- a/check +++ b/check @@ -574,6 +574,8 @@ for section in $HOST_OPTIONS_SECTIONS; do mkdir -p $RESULT_DIR + export TEST_GROUPS=`grep $(basename $seqnum) "$SRC_DIR/$(dirname $seqnum)/group"` + echo -n "$seqnum" if $showme; then diff --git a/common/rc b/common/rc index a838750..7c478cf 100644 --- a/common/rc +++ b/common/rc @@ -3799,6 +3799,21 @@ init_rc() $XFS_IO_PROG -c stat $TEST_DIR 2>&1 | grep -q "is not on an XFS filesystem" && \ export XFS_IO_PROG="$XFS_IO_PROG -F" + if echo $TEST_GROUPS | grep -q quick; then + # xfs_io -M flag runs xfs_io as multi threaded process + # in order to catch fdget/fdset reference leaks, because + # file structs are not reference counted in a single threaded + # process. + # Because reference counted fdget/fdset may lead to slightly + # longer run times in extreme cases (such as generic/132), + # we limit the use of -M flags to tests with short runtime, + # where the effect of the flag is negligable. + # + # Figure out if xfs_io supports the -M option + $XFS_IO_PROG -M -c quit 2>/dev/null && \ + export XFS_IO_PROG="$XFS_IO_PROG -M" + fi + # xfs_copy doesn't work on v5 xfs yet without -d option if [ "$FSTYP" == "xfs" ] && [[ $MKFS_OPTIONS =~ crc=1 ]]; then export XFS_COPY_PROG="$XFS_COPY_PROG -d"
Try to run xfs_io for tests in group quick with command line option -M which starts an idle thread before performing any io. The purpose of this idle thread is to test io from a multi threaded process. With single threaded process, the file table is not shared and file structs are not reference counted. In order to improve the chance of detecting file struct reference leaks, we should run xfs_io commands with this option as much as possible. Analysis of the effect of xfs_io -M on tests runtime showed that it may lead to slightly longer run times in extreme cases (e.g +3s for generic/132), but has a negligable effect on runtime of tests among the 'quick' group (worst case +0.3s for generic/130). Therefore, we automatically add the -M flags only to tests in the 'quick' group. Signed-off-by: Amir Goldstein <amir73il@gmail.com> --- check | 2 ++ common/rc | 15 +++++++++++++++ 2 files changed, 17 insertions(+)