diff mbox series

[v2,2/2] t1006: ensure cat-file info isn't buffered by default

Message ID 20240618213041.M462972@dcvr (mailing list archive)
State Accepted
Commit 75daa42ddf5ec4755eaec47909f30cbd84719e88
Headers show
Series None | expand

Commit Message

Eric Wong June 18, 2024, 9:30 p.m. UTC
Junio C Hamano <gitster@pobox.com> wrote:
> Eric Wong <e@80x24.org> writes:
> 
> > Buffering by default breaks some 3rd-party Perl scripts using
> > cat-file, but this breakage was not detected anywhere in our
> > test suite.  The easiest place to test this behavior is with
> > Git.pm, since (AFAIK) other equivalent way to test this behavior
> > from Bourne shell and/or awk would require racy sleeps,
> > non-portable FIFOs or tedious C code.
> 
> Yes, using Perl is a good substitute for writing it in C in this
> case.  I however question the choice to use t9700/test.pl here,
> which is clearly stated that its purpose is to "test perl interface
> which is Git.pm", and added tests are not testing anything in Git.pm
> at all.
> 
> Using t9700/test.pl only because it happens to use "perl -MTest::More"
> sounds a bit eh, suboptimal.

*shrug*  I figure Test::More is common enough since it's part of
the Perl standard library; but I consider Perl a better scripting
language than sh by far and wish our whole test suite were Perl :>

> It seems that there are Perl snippets in other tests (including
> t1006 that is specifically about cat-file).  How involved would it
> be to implement these new tests without modifying unrelated test
> scripts?

> >  t/t9700/test.pl | 14 ++++++++++++++
> >  1 file changed, 14 insertions(+)

More code than that.  At least IPC::Open2 takes care of the nasty
portability bits, but getting the Perl quoting nested properly
inside sh was confusing :x

v2: moved test to t1006 to avoid Test::More,
    add select timeout in case a buffering bug does get introduced,
    updated commit message and clarified the bug it's supposed
    to guard against
    (I initially tried stdio buffering, but moved away from it for the
    patch I'm testing...)

----8<----
Subject: [PATCH] t1006: ensure cat-file info isn't buffered by default

While working on buffering changes to `git cat-file' in a
separate patch, I inadvertently made the output of --batch-check
and the `info' command of --batch-command buffered as if
opt->buffer_output is turned on by default.

Buffering by default breaks some 3rd-party Perl scripts using
cat-file, but this breakage was not detected anywhere in our
test suite.  Add a small Perl snippet to test this problem since
(AFAIK) other equivalent ways to test this behavior from Bourne
shell and/or awk would require racy sleeps, non-portable FIFOs
or tedious C code.

Signed-off-by: Eric Wong <e@80x24.org>
---
 t/t1006-cat-file.sh | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

Comments

Junio C Hamano June 18, 2024, 11:37 p.m. UTC | #1
Eric Wong <e@80x24.org> writes:

>> Yes, using Perl is a good substitute for writing it in C in this
>> case.  I however question the choice to use t9700/test.pl here,
>> which is clearly stated that its purpose is to "test perl interface
>> which is Git.pm", and added tests are not testing anything in Git.pm
>> at all.
>> 
>> Using t9700/test.pl only because it happens to use "perl -MTest::More"
>> sounds a bit eh, suboptimal.
>
> *shrug*  I figure Test::More is common enough since it's part of
> the Perl standard library; but I consider Perl a better scripting
> language than sh by far and wish our whole test suite were Perl :>

Oh, I think we (actually the author of t9700) considers it common
enough that we have PERL_TEST_MORE prerequisite to allow us to write
tests, assuming that it is available, and let us easily skip where
it is not available.  So I do not think I mind the dependency on
Test::More at all.  Moving the tests to t1006 and rewriting the
tests not to use Test::More are two separate and unrelated things,
and if you are more comfortable with Test::More (and more
importantly if it is natural to write Perl based tests using
Test::More), it is not necessary to switch away from it.
Eric Wong June 19, 2024, 5:56 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> wrote:
> Eric Wong <e@80x24.org> writes:
> 
> >> Yes, using Perl is a good substitute for writing it in C in this
> >> case.  I however question the choice to use t9700/test.pl here,
> >> which is clearly stated that its purpose is to "test perl interface
> >> which is Git.pm", and added tests are not testing anything in Git.pm
> >> at all.
> >> 
> >> Using t9700/test.pl only because it happens to use "perl -MTest::More"
> >> sounds a bit eh, suboptimal.
> >
> > *shrug*  I figure Test::More is common enough since it's part of
> > the Perl standard library; but I consider Perl a better scripting
> > language than sh by far and wish our whole test suite were Perl :>
> 
> Oh, I think we (actually the author of t9700) considers it common
> enough that we have PERL_TEST_MORE prerequisite to allow us to write
> tests, assuming that it is available, and let us easily skip where
> it is not available.  So I do not think I mind the dependency on
> Test::More at all.  Moving the tests to t1006 and rewriting the
> tests not to use Test::More are two separate and unrelated things,
> and if you are more comfortable with Test::More (and more
> importantly if it is natural to write Perl based tests using
> Test::More), it is not necessary to switch away from it.

OK, fair enough.  Given t1006 is mostly sh, I prefer keeping v2
as-is because the Test::More->builder munging of test numbers in
t9700/test.pl is nasty too and I wouldn't enjoy duplicating
those bits in a hypothetical t1006/test.pl, either.

It would be nice to have first class support for Test::More in
our suite so we could just have t/t0006-cat-file.t and
t/t9700-perl-git.t implemented in Perl without sh at all, but
that's a separate discussion.
Junio C Hamano June 20, 2024, 5:28 p.m. UTC | #3
Eric Wong <e@80x24.org> writes:

> OK, fair enough.  Given t1006 is mostly sh, I prefer keeping v2
> as-is because the Test::More->builder munging of test numbers in
> t9700/test.pl is nasty too and I wouldn't enjoy duplicating
> those bits in a hypothetical t1006/test.pl, either.

That is quite sensible.
Jeff King June 21, 2024, 7:16 a.m. UTC | #4
On Tue, Jun 18, 2024 at 09:30:41PM +0000, Eric Wong wrote:

> +script='
> +use warnings;
> +use strict;
> +use IPC::Open2;
> +my ($opt, $oid, $expect, @pfx) = @ARGV;
> +my @cmd = (qw(git cat-file), $opt);
> +my $pid = open2(my $out, my $in, @cmd) or die "open2: @cmd";
> +print $in @pfx, $oid, "\n" or die "print $!";
> +my $rvec = "";
> +vec($rvec, fileno($out), 1) = 1;
> +select($rvec, undef, undef, 30) or die "no response to `@pfx $oid` from @cmd";
> +my $info = <$out>;
> +chop($info) eq "\n" or die "no LF";
> +$info eq $expect or die "`$info` != `$expect`";
> +close $in or die "close in $!";
> +close $out or die "close out $!";
> +waitpid $pid, 0;
> +$? == 0 or die "\$?=$?";
> +'
> +
> +expect="$hello_oid blob $hello_size"
> +
> +test_expect_success PERL '--batch-check is unbuffered by default' '
> +	perl -e "$script" -- --batch-check $hello_oid "$expect"
> +'

We often use "perl -e" for one-liners, etc, but this is pretty big.
Maybe:

  cat >foo.pl <<-\EOF
  ...
  EOF
  perl foo.pl -- ...

would be more readable? To be clear I don't think there's anything
incorrect about your usage, but it would match the style of our suite a
bit better.

Likewise, it would be usual in our suite for the helper to do the
minimum that needs to be in perl, and use our normal functions for
things like comparing output (rather than taking its own "expect"
argument).

So maybe:

diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index e12b221972..929d7a7579 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -1294,4 +1294,33 @@ test_expect_success 'batch-command flush without --buffer' '
 	grep "^fatal:.*flush is only for --buffer mode.*" err
 '
 
+# Copy a single line from stdin to the program specified
+# by @ARGV, and then wait for a response _without_ closing
+# the pipe.
+cat >run-and-wait.pl <<-\EOF
+use IPC::Open2;
+open2(my $out, my $in, @ARGV) or die "open2: @ARGV";
+print $in scalar(<STDIN>) or die "print $!";
+
+my $rvec = "";
+vec($rvec, fileno($out), 1) = 1;
+select($rvec, undef, undef, 30) or die "no response after 30 seconds";
+
+print scalar(<$out>);
+EOF
+
+test_expect_success PERL '--batch-check is unbuffered by default' '
+	echo "$hello_oid" |
+	perl run-and-wait.pl git cat-file --batch-check >out &&
+	echo "$hello_oid blob $hello_size" >expect &&
+	test_cmp expect out
+'
+
+test_expect_success PERL '--batch-command info is unbuffered by default' '
+	echo "info $hello_oid" |
+	perl run-and-wait.pl git cat-file --batch-command >out &&
+	echo "$hello_oid blob $hello_size" >expect &&
+	test_cmp expect out
+'
+
 test_done

I went for brevity above. Notably missing are:

  - the use of strict/warnings. I think we've shied away from these in
    the test suite because we want to run on any version of perl. In my
    experience most strict/warnings output is actually telling you about
    obvious garbage, but not always. IIRC perl got more strict about
    "()" around lists in some contexts a few years back, and code which
    used to be OK started generating warnings. OTOH, those warnings were
    probably a sign of problems-to-come, anyway. Without "FATAL",
    though, I think "use warnings" is not doing much good (nobody is
    ever going to see its output if the test isn't failing).

  - I dropped the close/waitpid. I guess maybe it is valuable to confirm
    that cat-file did not barf, but IMHO the important thing here is
    testing that it produced the single line of output we expected.

-Peff
Eric Wong June 21, 2024, 8 p.m. UTC | #5
Jeff King <peff@peff.net> wrote:
> On Tue, Jun 18, 2024 at 09:30:41PM +0000, Eric Wong wrote:
> 
> > +script='

<snip>

> > +expect="$hello_oid blob $hello_size"
> > +
> > +test_expect_success PERL '--batch-check is unbuffered by default' '
> > +	perl -e "$script" -- --batch-check $hello_oid "$expect"
> > +'
> 
> We often use "perl -e" for one-liners, etc, but this is pretty big.
> Maybe:
> 
>   cat >foo.pl <<-\EOF
>   ...
>   EOF
>   perl foo.pl -- ...
> 
> would be more readable? To be clear I don't think there's anything
> incorrect about your usage, but it would match the style of our suite a
> bit better.

*shrug*  It doesn't save the nested quoting/expansion confusion;
but it's Junio's call.  I don't think a v3 is worth the effort.

> Likewise, it would be usual in our suite for the helper to do the
> minimum that needs to be in perl, and use our normal functions for
> things like comparing output (rather than taking its own "expect"
> argument).

<snip>

> +test_expect_success PERL '--batch-check is unbuffered by default' '
> +	echo "$hello_oid" |
> +	perl run-and-wait.pl git cat-file --batch-check >out &&
> +	echo "$hello_oid blob $hello_size" >expect &&
> +	test_cmp expect out

I prefer to avoid process spawning overhead from test_cmp;
but that's a small drop in a big bucket.

> I went for brevity above. Notably missing are:
> 
>   - the use of strict/warnings. I think we've shied away from these in
>     the test suite because we want to run on any version of perl. In my
>     experience most strict/warnings output is actually telling you about
>     obvious garbage, but not always. IIRC perl got more strict about
>     "()" around lists in some contexts a few years back, and code which
>     used to be OK started generating warnings. OTOH, those warnings were
>     probably a sign of problems-to-come, anyway. Without "FATAL",
>     though, I think "use warnings" is not doing much good (nobody is
>     ever going to see its output if the test isn't failing).

It may make problems easier to find if there are failures,
so I think the potential benefits outweight any downsides.

>   - I dropped the close/waitpid. I guess maybe it is valuable to confirm
>     that cat-file did not barf, but IMHO the important thing here is
>     testing that it produced the single line of output we expected.

I've found some unexpected bugs through excessive error checking
in the past, so much preferred to keep them.
Jeff King June 24, 2024, 3:19 p.m. UTC | #6
On Fri, Jun 21, 2024 at 08:00:02PM +0000, Eric Wong wrote:

> > We often use "perl -e" for one-liners, etc, but this is pretty big.
> > Maybe:
> > 
> >   cat >foo.pl <<-\EOF
> >   ...
> >   EOF
> >   perl foo.pl -- ...
> > 
> > would be more readable? To be clear I don't think there's anything
> > incorrect about your usage, but it would match the style of our suite a
> > bit better.
> 
> *shrug*  It doesn't save the nested quoting/expansion confusion;
> but it's Junio's call.  I don't think a v3 is worth the effort.

It does allow you to use single quotes in the script, though I think you
managed without it.

> > +test_expect_success PERL '--batch-check is unbuffered by default' '
> > +	echo "$hello_oid" |
> > +	perl run-and-wait.pl git cat-file --batch-check >out &&
> > +	echo "$hello_oid blob $hello_size" >expect &&
> > +	test_cmp expect out
> 
> I prefer to avoid process spawning overhead from test_cmp;
> but that's a small drop in a big bucket.

If we care about that, I'd rather see us make test_cmp zero-process with
a shell helper than come up with ad-hoc solutions. I've tried to measure
something like that before, but couldn't come up with any conclusive
improvements (my findings were mostly that running Git itself accounts
for most of the process overhead).

-Peff
diff mbox series

Patch

diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index e12b221972..ff9bf213aa 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -1294,4 +1294,34 @@  test_expect_success 'batch-command flush without --buffer' '
 	grep "^fatal:.*flush is only for --buffer mode.*" err
 '
 
+script='
+use warnings;
+use strict;
+use IPC::Open2;
+my ($opt, $oid, $expect, @pfx) = @ARGV;
+my @cmd = (qw(git cat-file), $opt);
+my $pid = open2(my $out, my $in, @cmd) or die "open2: @cmd";
+print $in @pfx, $oid, "\n" or die "print $!";
+my $rvec = "";
+vec($rvec, fileno($out), 1) = 1;
+select($rvec, undef, undef, 30) or die "no response to `@pfx $oid` from @cmd";
+my $info = <$out>;
+chop($info) eq "\n" or die "no LF";
+$info eq $expect or die "`$info` != `$expect`";
+close $in or die "close in $!";
+close $out or die "close out $!";
+waitpid $pid, 0;
+$? == 0 or die "\$?=$?";
+'
+
+expect="$hello_oid blob $hello_size"
+
+test_expect_success PERL '--batch-check is unbuffered by default' '
+	perl -e "$script" -- --batch-check $hello_oid "$expect"
+'
+
+test_expect_success PERL '--batch-command info is unbuffered by default' '
+	perl -e "$script" -- --batch-command $hello_oid "$expect" "info "
+'
+
 test_done